specified computer, then you have that same probability of taking over any other specified computer (before knowing whether or not you took over the first computer). Of course, these are not independent probabilities, in the sense that knowing you took over one computer means you are less likely to have taken over other computers, because you just used up a lot of darts. However, the solution means that in the range where it is probable that you have taken over a particular computer, the correlations are negligible. Therefore, you are likely to have taken over a large number of computers by then.

Qualitatively, what seems to happen is that you have a negligible chance of taking over any computers until you hit a threshold, at which point you rapidly take over all the computers in the network. This threshold is roughly proportional to the number of computers in the network. If you let the correlations come out in the wash for a qualitative analysis, the probability of taking over any single computer is roughly M * the probability of taking over a specific computer. However, at around N = 1/2 the "critical" threshold, your probability scales roughly as (1/2)^P of the probability at the critical threshold. If P = log M, then (1/2)^P = 1/M, which cancels the factor of M you gain if you only want to take over any machine at all.

What this means, qualitatively, is that there is a threshold, which is some small, slowly-growing factor (like 10 or 20 or so) times M. If N is less than, say, 1/2 of this threshold, you are unlikely to take over *any* computer. When N hits 1/2 the threshold, you have a decent probability of taking over some random computer. As you increase N from 1/2 the threshold to the threshold, you rapidly take over most of the computers in the system. (At N = 10*M, you've just put an average of 10 machines in each bin, so it's not surprising that you've taken over the system at that point.)

Of course, that threshold is pretty high to begin with. However, when it starts to fail, it pretty much all fails at once. Still, this is somewhat ideal behavior, in the sense that to take over something specific you have to have enough firepower to pretty much take over the whole thing.

----- End forwarded message -----
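The forwarded analysis never states its model explicitly, but one consistent reading is: an attacker throws N node IDs ("darts") uniformly at random over M positions ("bins", one per target computer), and a computer counts as taken over once P = log2(M) darts land in its bin. Under that assumption -- a reading of the text, not something the original author spells out -- a minimal Monte Carlo sketch in Python shows the sharp threshold described:

# Monte Carlo sketch of the threshold effect described above.
# Assumed model (a reading of the analysis, not stated in the original
# message): N darts land uniformly at random in M bins, and a computer is
# "taken over" once P = log2(M) darts land in its bin.
import math
import random

def trial(n_darts, n_bins, p_needed, rng):
    """One experiment: returns (specific_bin_taken, total_bins_taken)."""
    counts = [0] * n_bins
    for _ in range(n_darts):
        counts[rng.randrange(n_bins)] += 1
    taken = sum(1 for c in counts if c >= p_needed)
    return counts[0] >= p_needed, taken

def sweep(n_bins=1000, trials=100, seed=1):
    rng = random.Random(seed)
    p_needed = max(1, round(math.log2(n_bins)))   # P = log M
    for factor in (1, 2, 5, 10, 15, 20, 30):      # N = factor * M
        specific_hits, total_taken = 0, 0
        for _ in range(trials):
            specific, taken = trial(factor * n_bins, n_bins, p_needed, rng)
            specific_hits += specific
            total_taken += taken
        print(f"N = {factor:2d} * M:  P(take specific computer) ~ "
              f"{specific_hits / trials:.2f},  avg computers taken = "
              f"{total_taken / trials:.1f} of {n_bins}")

if __name__ == "__main__":
    sweep()

With M = 1000 and P = 10, both the probability of owning the one specific computer and the average fraction of the network owned jump from near zero to near one as N grows from a few times M to a couple of tens of times M, matching the "small slowly-growing factor times M" threshold described above.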
------------------------------------------------------------------------

necessary to some extent to have true privacy). It's similar to source routing in many instances; however, from a privacy standpoint it has the property that each forwarding node knows only its own bit of the message's path, so that in order to see both the source and the destination, all hops along the path must be owned by the adversary. This can be invaluable in many situations.

Chris Connelly

----- Original Message -----
From: "Lists"
To:
Sent: Tuesday, March 26, 2002 6:39 PM
Subject: RE: [p2p-hackers] P2P Onion Routing?

> What's the advantage of onion routing? Onion routing (where the _messages_ hold their own routing information) and gnutella routing (where the _nodes_ hold routing tables tracking the messages) are both unable to recover from the loss of a single node in the routing chain.
>
> It seemed to me like JXTA had addressed this through their hybrid 'endpoint routing protocol' where a message could hold a list of desired hops, or it could simply hold its desired destination, and any node along the way could ask for additional route choices from a peer router.
>
> Wouldn't onion routing be a step backwards for JXTA?
>
> Mike.

------------------------------------------------------------------------

International Computer Science Institute (ICSI) in Berkeley. From 1995 to 1999 he was a Professor of Computer Science and Adjunct Professor of Molecular Biotechnology at the University of Washington. In 1999 he returned to ICSI and Berkeley, where he is a University Professor with appointments in Computer Science, Mathematics and Bioengineering.

The unifying theme in Karp's work has been the study of combinatorial algorithms. His 1972 paper "Reducibility Among Combinatorial Problems" showed that many of the most commonly studied combinatorial problems are NP-complete, and hence likely to be intractable. Much of his subsequent work has concerned parallel algorithms, the probabilistic analysis of combinatorial optimization algorithms, and the construction of randomized algorithms for combinatorial problems. His current activities center around algorithmic methods in genomics and computer networking.

Karp has received the National Medal of Science, the Turing Award (ACM), the Fulkerson Prize (AMS and Math. Programming Society), the von Neumann Theory Prize (ORSA-TIMS), the Lanchester Prize (ORSA), the von Neumann Lectureship (SIAM), the Harvey Prize (Technion), the Centennial Medal (Harvard) and the Distinguished Teaching Award (Berkeley). He is a member of the National Academy of Sciences, the National Academy of Engineering and the American Philosophical Society, and a Fellow of the American Academy of Arts and Sciences and the American Association for the Advancement of Science. He has been awarded five honorary degrees.

Contact information:
Richard Karp
ICSI Center for Internet Research and University of California, Berkeley
karp@ICSI.Berkeley.EDU

+----------------------------------------------------------------------------+
| This message was sent via the Stanford Computer Science Department          |
| colloquium mailing list. To be added to this list send an arbitrary         |
| message to colloq-subscribe@cs.stanford.edu. To be removed from this list,  |
| send a message to colloq-unsubscribe@cs.stanford.edu. For more information, |
| send an arbitrary message to colloq-request@cs.stanford.edu. For directions |
| to Stanford, check out http://www-forum.stanford.edu                        |
+----------------------------------------------------------------------------+

------------------------------------------------------------------------

--mike

> If I take the message digest of all the IP addresses (all of which are 32 bits long and cover the whole 2^32 space), will the resulting 160-bit message digests produced by SHA-1 be uniformly distributed in the 2^160 space? SHA-1 documentation does not make any such claims. If the message digests produced are not uniformly distributed, what can we say about their distribution?
>
> ~nikhil
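Strictly speaking, 2^32 inputs can only ever land on a vanishing fraction of the 2^160 output space, so the digests cannot be "uniform over 2^160"; the meaningful question is whether they are spread as if drawn uniformly -- for instance, whether any fixed slice of the digest is close to uniformly distributed. SHA-1's specification indeed makes no such claim, but the function is designed to approximate a random function, and an empirical spot check is easy to run. A rough sketch (the sampling scheme and the chi-square check on the first digest byte are illustrative choices, not anything from the thread):

# Hash a spread-out sample of IPv4 addresses with SHA-1 and check whether a
# fixed 8-bit slice of the digest looks uniform. A sanity check, not a proof.
import hashlib
import struct
from collections import Counter

def digest_first_byte(ip_int):
    """SHA-1 the 4-byte big-endian encoding of an IPv4 address; return byte 0."""
    return hashlib.sha1(struct.pack(">I", ip_int)).digest()[0]

def check(sample_size=1_000_000, step=4099):
    counts = Counter(digest_first_byte((i * step) % 2**32)
                     for i in range(sample_size))
    expected = sample_size / 256
    chi2 = sum((c - expected) ** 2 / expected for c in counts.values())
    # With 255 degrees of freedom, chi2 should land near 255 if the byte is
    # uniform; wildly larger values would indicate visible structure.
    print(f"chi-square over the first digest byte: {chi2:.1f} (expect ~255)")

if __name__ == "__main__":
    check()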
------------------------------------------------------------------------

overlays to mobile ad hoc networking, the ultimate success of a peer-to-peer system rests on the twin pillars of scalable and robust system design and alignment of economic interests among the participating peers. The Workshop on Economics of Peer-to-Peer Systems will bring together for the first time researchers and practitioners from multiple disciplines to discuss the economic characteristics of P2P systems, the application of economic theories to P2P system design, and future directions and challenges in this area.

Topics of interest include, but are not limited to:

- incentives and disincentives for cooperation
- distributed algorithmic mechanism design
- reputation and trust
- reliability, identity, and attack resistance
- network externalities and scale economies
- public goods and club formation
- accounting and settlement mechanisms
- payment and currency systems
- user behavior and system performance
- measurement studies
- leveraging heterogeneity without compromising anonymity
- economic impact on network providers
- interconnection of P2P networks

The program of the workshop will be a combination of invited talks, paper presentations, and discussion. Workshop attendance will be limited to ensure a productive environment. Each potential participant should submit a position paper that expresses a novel or interesting problem, offers a specific solution, reports on actual experience, or advances a research agenda. Participants will be invited based on the originality, technical merit and topical relevance of their submissions, as well as the likelihood that the ideas expressed in their submissions will lead to insightful discussions at the workshop. Accepted papers will be published on the workshop website.

Submission guidelines: Submissions of position papers are due March 27, 2003, and should not exceed 5 pages. Two-column papers are acceptable, but the font size should be no smaller than 11pt. Papers must be submitted electronically in postscript or PDF format to .

Important Dates:

Submission due: March 27
Notification of acceptance: April 25
Revised version due: May 22
Workshop: June 5-6

Program Committee:

John Chuang, UC Berkeley (chair)
Roger Dingledine, The Free Haven Project
Ian Foster, University of Chicago and Argonne National Lab
Bernardo Huberman, HP Labs
Ramayya Krishnan, CMU (co-chair)
H.T. Kung, Harvard University
David Parkes, Harvard University
Paul Resnick, University of Michigan
Scott Shenker, ICSI
Michael D. Smith, CMU
Hal Varian, UC Berkeley

------------------------------------------------------------------------

in tiny parts or big parts, as a separate transaction or alongside your content downloads. If anyone gives you bogus info, it's instantly recognizable as being inconsistent with the root value. The THEX interchange format is just one way to fill in the details.

> They appear to propose that the recipient would download log(n) chunks from a tree server (by asking for offsets and lengths in the tree file, presumably using HTTP keep-alive and offset/length features of HTTP). However this has significant request overhead when the file hashes are 16-20 bytes (the request would be larger than the hash); it also involves at least two connections: one for authentication info, another for downloading chunks.
I wouldn't recommend that at all. The THEX data format is really best for grabbing a whole subset of the internal tree values, from the top/root on down, in one gulp. Yes, that top-down format includes redundant info, and you could grab the data from lots of different people, but it's so small compared to the content you're getting, why not just get the whole thing from any one arbitrary peer who has it handy? (If you were going to try any random access, I think you'd aim for one whole generation of the tree at your desired resolution, because you can calculate everything else from that.)

For example, the full tree to verify a 1GB file at a resolution of 64KB chunks is only (1G/64K) * 2 * 24 bytes (Tiger) = 768K, or less than 1/10th of 1% of the total data being verified. So I'd say, just get it from anyone who offers it, verify it's consistent with the desired root, and keep it around -- nothing fancy.

Now, you could imagine a system which includes a custom minimal proof that "paints in" any transferred data segment with the neighboring internal tree nodes, so that every range-get includes its own standalone verification-up-to-root. I think that'd be neat, but it'd be tougher to code, and it introduces its own redundancies, unless you make the protocol for requesting partial trees even more intricate. Still, it'd be an approach compatible with the same tree hash calculation method.

> They also mention their format is suitable for serial download (as well as the random access download I described in the above paragraph). Here I presume (though it is not stated) that the user would be expected to download either the entire set of leaf nodes (1/2 the full tree size), or some subset of the leaf nodes plus enough other nodes to verify that the leaf nodes were correct. (To avoid being jammed during download from the tree server.) Again, none of this is explicitly stated but would be minimally necessary to avoid jamming.

Yes, you'd get the whole tree or a whole generation to your desired resolution.
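To make the tree hash calculation concrete, here is a minimal sketch of the scheme under discussion: a binary Merkle tree over 1 KB leaf segments, with leaf nodes prefixed by 0x00 and internal nodes by 0x01 as in THEX. SHA-256 stands in for Tiger only because Python's hashlib has no Tiger implementation, and the helper names are illustrative, not part of any spec:

# Sketch of the tree hash calculation under discussion: a binary Merkle tree
# over 1 KB leaf segments, leaves prefixed with 0x00 and internal nodes with
# 0x01 (the THEX conventions). SHA-256 stands in for Tiger only because
# Python's hashlib has no Tiger.
import hashlib

SEGMENT = 1024

def _h(data):
    return hashlib.sha256(data).digest()

def leaf_hashes(blob):
    segments = [blob[i:i + SEGMENT] for i in range(0, len(blob), SEGMENT)] or [b""]
    return [_h(b"\x00" + s) for s in segments]

def next_generation(nodes):
    """Combine nodes pairwise; an unpaired node is promoted up unchanged."""
    out = [_h(b"\x01" + nodes[i] + nodes[i + 1])
           for i in range(0, len(nodes) - 1, 2)]
    if len(nodes) % 2:
        out.append(nodes[-1])
    return out

def root_and_generations(blob):
    gen = leaf_hashes(blob)
    generations = [gen]
    while len(gen) > 1:
        gen = next_generation(gen)
        generations.append(gen)
    return gen[0], generations

if __name__ == "__main__":
    root, gens = root_and_generations(b"x" * (1 << 20))  # 1 MB of dummy data
    coarse = gens[6]          # every 6th generation ~ 64 KB resolution
    while len(coarse) > 1:    # recompute the root from that one generation
        coarse = next_generation(coarse)
    print(len(gens[0]), "leaves; root recomputed from 64KB generation:",
          coarse[0] == root)

As the final check shows, any single generation is enough to recompute (and so verify against) the root, which is the point about grabbing one whole generation at your desired resolution. The 768K figure works the same way: a 1GB file has 1G/64K = 16,384 nodes at 64KB resolution, a complete binary tree over them roughly doubles that node count, and each node is a 24-byte Tiger digest.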
> A simpler and more efficient approach is as follows (I presume a 128-bit (16-byte) output hash function such as MD5, or truncated SHA1; I also presume each node has the whole file):
>
> If the file is <= 1KB, download the file and compare to the master hash.
>
> If the file is > 1KB and <= 64KB, hash separately each 1KB chunk of the file; call the concatenation of those hashes the 2nd level hash set, and call the hash of the 2nd level hash set the master hash. To download, first download the 2nd level hash set (a 1KB file) and check that it hashes to the master hash. Then download each 1KB chunk of the file (in random order from multiple servers) and check that each 1KB chunk matches the corresponding chunk in the 2nd level hash set.
>
> If the file is > 64KB and <= 4MB, hash separately each of the 1KB chunks of the file; call the concatenation of those hashes the 2nd level hash set. The 2nd level hash set will be up to 64KB in size. Hash separately each of the up to 64 1KB chunks of the 2nd level hash set; call the concatenation of those hashes the 3rd level hash set. Call the hash of the 3rd level hash set the master hash. Download and verification is an obvious extension of the 2-level case.
>
> Repeat for as many levels as necessary to match the file size. Bandwidth efficiency is optimal, there is a single compact file authenticator (the master hash: the hash of the 2nd level hash set), and immediate authentication is provided on each 1KB file chunk.

On what criteria is this more simple or efficient? It appears to be essentially the same as the THEX approach, but instead of concatenating intermediate values with their neighbors in pairs (a simple binary tree) you're doing it in groups of up to 64. You still need the exact same amount of "bottom generation" data to do complete verification at any given resolution.

And if you ever wanted to do minimally compact proofs, they're larger with your tree's larger node degree. To verify that a single 1KB segment fits inside a 1MB file with a desired root, using the THEX calculation technique, takes 10 intermediate hash values, say 200 bytes using SHA1 as the internal hash. To verify a single 1KB segment using your scheme, you'd need the individual hashes of its 63 neighbors, and then the individual hashes of the 31 next-level neighbors: at least 94 intermediate hash values, say 1.8K using SHA1.

So there are things the binary tree construction can do that the degree-64 tree construction cannot, but there's nothing the degree-64 tree can do that the binary tree can't. (Remember that every 6th generation of the binary tree is almost exactly analogous to the degree-64 tree: it includes segment summary values covering the exact same amount of source data.)

> To avoid the slow-start problem (can't download and verify from multiple servers until the 2nd level hash set has been downloaded), download of the 2nd level hash set chunk could be started from multiple servers (to discover the fastest one), and/or speculative download of 3rd level chunks or content chunks could be started and verification deferred until the 2nd level hash set chunk is complete.

Given that the verification tree info is such a tiny proportion of the total data to be transferred, and that many (if not all) peers will be able to provide the tree info on demand for any file they're also providing, I can't imagine slow-start being much of a problem in practice.

- Gordon @ Bitzi

____________________
Gordon Mohr, Bitzi CTO . . . describe and discover files of every kind.
_ http://bitzi.com _ . . . Bitzi knows bits -- because you teach it!
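To put numbers on the proof-size comparison above, here is a sketch (same 0x00/0x01 prefix convention as the earlier one; SHA-1 chosen only to match the 20-byte figure used in the message, and all function names illustrative) that builds the binary tree over a 1MB file, extracts the audit path for one 1KB segment, and verifies it against the root:

# Sketch of the compact-proof point above: with a binary tree, proving that
# one 1 KB segment belongs to a 1 MB file under a given root takes
# log2(1024) = 10 sibling hashes. Same 0x00/0x01 prefixes as the earlier
# sketch; SHA-1 is used only to match the 20-byte figure in the message.
import hashlib

SEGMENT = 1024

def _h(data):
    return hashlib.sha1(data).digest()

def build_levels(blob):
    """All generations of the tree, from the leaf hashes up to the root."""
    level = [_h(b"\x00" + blob[i:i + SEGMENT])
             for i in range(0, len(blob), SEGMENT)]
    levels = [level]
    while len(level) > 1:
        nxt = [_h(b"\x01" + level[i] + level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
        levels.append(level)
    return levels

def audit_path(levels, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            path.append((sibling % 2, level[sibling]))  # (sibling_is_right, hash)
        index //= 2
    return path

def verify(segment, path, root):
    node = _h(b"\x00" + segment)
    for sibling_is_right, sibling in path:
        node = _h(b"\x01" + (node + sibling if sibling_is_right else sibling + node))
    return node == root

if __name__ == "__main__":
    blob = bytes(range(256)) * 4096            # a 1 MB test file
    levels = build_levels(blob)
    root = levels[-1][0]
    idx = 123                                  # prove one arbitrary 1 KB segment
    path = audit_path(levels, idx)
    print(len(path), "sibling hashes,", sum(len(h) for _, h in path), "bytes")
    print("verifies:", verify(blob[idx * SEGMENT:(idx + 1) * SEGMENT], path, root))

This prints 10 sibling hashes totalling 200 bytes. Under the degree-64 grouping, the comparable proof has to ship whole 1KB groups of hashes rather than one sibling per level, which is where the much larger figure above comes from.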