specified computer, then you have that same probability of taking over any other specified computer (before knowing whether or not you took over the first computer). Of course, these are not independent probabilities, in the sense that knowing you took over one computer means you are less likely to have taken over other computers, because you just used up a lot of darts. However, the solution means that in the range where it is probable that you have taken over a particular computer, the correlations are negligible. Therefore, you are likely to have taken over a large number of computers by then.

Qualitatively, what seems to happen is that you have a negligible chance of taking over any computers until you hit a threshold, at which point you rapidly take over all the computers in the network. This threshold is roughly proportional to the number of computers in the network. If you let the correlations come out in the wash for a qualitative analysis, the probability of taking over any single computer is roughly M * the probability of taking over a specific computer. However, at around N = 1/2 the "critical" threshold, your probability scales roughly as (1/2)^P of the probability at the critical threshold. If P = log M, then (1/2)^P = 1/M, which cancels the factor of M you gain if you only want to take over any machine at all.

What this means, qualitatively, is that there is a threshold, which is some small, slowly-growing factor (like 10 or 20 or so) times M. If N is less than, say, 1/2 of this threshold, you are unlikely to take over *any* computer. When N hits 1/2 the threshold, you have a decent probability of taking over some random computer. As you increase N from 1/2 the threshold to the threshold, you rapidly take over most of the computers in the system. (At N = 10*M, you've just put an average of 10 machines in each bin, so it's not surprising that you've taken over the system at that point.)

Of course, that threshold is pretty high to begin with. However, when it starts to fail, it pretty much all fails at once. Still, this is somewhat ideal behavior, in the sense that to take over something specific you have to have enough firepower to pretty much take over the whole thing.

----- End forwarded message -----
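The forwarded analysis never states its model explicitly, but one consistent reading is: an attacker throws N node IDs ("darts") uniformly at random over M positions ("bins", one per target computer), and a computer counts as taken over once P = log2(M) darts land in its bin. Under that assumption -- a reading of the text, not something the original author spells out -- a minimal Monte Carlo sketch in Python shows the sharp threshold described:

# Monte Carlo sketch of the threshold effect described above.
# Assumed model (a reading of the analysis, not stated in the original
# message): N darts land uniformly at random in M bins, and a computer is
# "taken over" once P = log2(M) darts land in its bin.
import math
import random

def trial(n_darts, n_bins, p_needed, rng):
    """One experiment: returns (specific_bin_taken, total_bins_taken)."""
    counts = [0] * n_bins
    for _ in range(n_darts):
        counts[rng.randrange(n_bins)] += 1
    taken = sum(1 for c in counts if c >= p_needed)
    return counts[0] >= p_needed, taken

def sweep(n_bins=1000, trials=100, seed=1):
    rng = random.Random(seed)
    p_needed = max(1, round(math.log2(n_bins)))   # P = log M
    for factor in (1, 2, 5, 10, 15, 20, 30):      # N = factor * M
        specific_hits, total_taken = 0, 0
        for _ in range(trials):
            specific, taken = trial(factor * n_bins, n_bins, p_needed, rng)
            specific_hits += specific
            total_taken += taken
        print(f"N = {factor:2d} * M:  P(take specific computer) ~ "
              f"{specific_hits / trials:.2f},  avg computers taken = "
              f"{total_taken / trials:.1f} of {n_bins}")

if __name__ == "__main__":
    sweep()

With M = 1000 and P = 10, both the probability of owning the one specific computer and the average fraction of the network owned jump from near zero to near one as N grows from a few times M to a couple of tens of times M, matching the "small slowly-growing factor times M" threshold described above.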
------------------------------------------------------------------------

necessary to some extent to have true privacy). It's similar to source routing in many instances; however, from a privacy standpoint it has the property that each forwarding node knows only its own bit of the message's path, so that in order to see both the source and the destination, all hops along the path must be owned by the adversary. This can be invaluable in many situations.

Chris Connelly

----- Original Message -----
From: "Lists"
To:
Sent: Tuesday, March 26, 2002 6:39 PM
Subject: RE: [p2p-hackers] P2P Onion Routing?

> What's the advantage of onion routing? Onion routing (where the _messages_ hold their own routing information) and gnutella routing (where the _nodes_ hold routing tables tracking the messages) are both unable to recover from the loss of a single node in the routing chain.
>
> It seemed to me like JXTA had addressed this through their hybrid 'endpoint routing protocol' where a message could hold a list of desired hops, or it could simply hold its desired destination, and any node along the way could ask for additional route choices from a peer router.
>
> Wouldn't onion routing be a step backwards for JXTA?
>
> Mike.

------------------------------------------------------------------------

International Computer Science Institute (ICSI) in Berkeley. From 1995 to 1999 he was a Professor of Computer Science and Adjunct Professor of Molecular Biotechnology at the University of Washington. In 1999 he returned to ICSI and Berkeley, where he is a University Professor with appointments in Computer Science, Mathematics and Bioengineering.

The unifying theme in Karp's work has been the study of combinatorial algorithms. His 1972 paper "Reducibility Among Combinatorial Problems" showed that many of the most commonly studied combinatorial problems are NP-complete, and hence likely to be intractable. Much of his subsequent work has concerned parallel algorithms, the probabilistic analysis of combinatorial optimization algorithms, and the construction of randomized algorithms for combinatorial problems. His current activities center around algorithmic methods in genomics and computer networking.

Karp has received the National Medal of Science, the Turing Award (ACM), the Fulkerson Prize (AMS and Math. Programming Society), the von Neumann Theory Prize (ORSA-TIMS), the Lanchester Prize (ORSA), the von Neumann Lectureship (SIAM), the Harvey Prize (Technion), the Centennial Medal (Harvard) and the Distinguished Teaching Award (Berkeley). He is a member of the National Academy of Sciences, the National Academy of Engineering and the American Philosophical Society, and a Fellow of the American Academy of Arts and Sciences and the American Association for the Advancement of Science. He has been awarded five honorary degrees.

Contact information:
Richard Karp
ICSI Center for Internet Research and University of California, Berkeley
karp@ICSI.Berkeley.EDU

+----------------------------------------------------------------------------+
| This message was sent via the Stanford Computer Science Department          |
| colloquium mailing list. To be added to this list send an arbitrary         |
| message to colloq-subscribe@cs.stanford.edu. To be removed from this list,  |
| send a message to colloq-unsubscribe@cs.stanford.edu. For more information, |
| send an arbitrary message to colloq-request@cs.stanford.edu. For directions |
| to Stanford, check out http://www-forum.stanford.edu                        |
+----------------------------------------------------------------------------+

------------------------------------------------------------------------

--mike

> If I take the message digest of all the IP addresses (all of which are 32 bits long and cover the whole 2^32 space), will the resulting 160-bit message digests produced by SHA-1 be uniformly distributed in the 2^160 space? SHA-1 documentation does not make any such claims. If the message digests produced are not uniformly distributed, what can we say about their distribution?
>
> ~nikhil
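Strictly speaking, 2^32 inputs can only ever land on a vanishing fraction of the 2^160 output space, so the digests cannot be "uniform over 2^160"; the meaningful question is whether they are spread as if drawn uniformly -- for instance, whether any fixed slice of the digest is close to uniformly distributed. SHA-1's specification indeed makes no such claim, but the function is designed to approximate a random function, and an empirical spot check is easy to run. A rough sketch (the sampling scheme and the chi-square check on the first digest byte are illustrative choices, not anything from the thread):

# Hash a spread-out sample of IPv4 addresses with SHA-1 and check whether a
# fixed 8-bit slice of the digest looks uniform. A sanity check, not a proof.
import hashlib
import struct
from collections import Counter

def digest_first_byte(ip_int):
    """SHA-1 the 4-byte big-endian encoding of an IPv4 address; return byte 0."""
    return hashlib.sha1(struct.pack(">I", ip_int)).digest()[0]

def check(sample_size=1_000_000, step=4099):
    counts = Counter(digest_first_byte((i * step) % 2**32)
                     for i in range(sample_size))
    expected = sample_size / 256
    chi2 = sum((c - expected) ** 2 / expected for c in counts.values())
    # With 255 degrees of freedom, chi2 should land near 255 if the byte is
    # uniform; wildly larger values would indicate visible structure.
    print(f"chi-square over the first digest byte: {chi2:.1f} (expect ~255)")

if __name__ == "__main__":
    check()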
------------------------------------------------------------------------

overlays to mobile ad hoc networking, the ultimate success of a peer-to-peer system rests on the twin pillars of scalable and robust system design and alignment of economic interests among the participating peers. The Workshop on Economics of Peer-to-Peer Systems will bring together for the first time researchers and practitioners from multiple disciplines to discuss the economic characteristics of P2P systems, the application of economic theories to P2P system design, and future directions and challenges in this area.

Topics of interest include, but are not limited to:

- incentives and disincentives for cooperation
- distributed algorithmic mechanism design
- reputation and trust
- reliability, identity, and attack resistance
- network externalities and scale economies
- public goods and club formation
- accounting and settlement mechanisms
- payment and currency systems
- user behavior and system performance
- measurement studies
- leveraging heterogeneity without compromising anonymity
- economic impact on network providers
- interconnection of P2P networks

The program of the workshop will be a combination of invited talks, paper presentations, and discussion. Workshop attendance will be limited to ensure a productive environment. Each potential participant should submit a position paper that expresses a novel or interesting problem, offers a specific solution, reports on actual experience, or advances a research agenda. Participants will be invited based on the originality, technical merit and topical relevance of their submissions, as well as the likelihood that the ideas expressed in their submissions will lead to insightful discussions at the workshop. Accepted papers will be published on the workshop website.

Submission guidelines: Submissions of position papers are due March 27, 2003, and should not exceed 5 pages. Two-column papers are acceptable, but the font size should be no smaller than 11pt. Papers must be submitted electronically in postscript or PDF format to .

Important Dates:

Submission due: March 27
Notification of acceptance: April 25
Revised version due: May 22
Workshop: June 5-6

Program Committee:

John Chuang, UC Berkeley (chair)
Roger Dingledine, The Free Haven Project
Ian Foster, University of Chicago and Argonne National Lab
Bernardo Huberman, HP Labs
Ramayya Krishnan, CMU (co-chair)
H.T. Kung, Harvard University
David Parkes, Harvard University
Paul Resnick, University of Michigan
Scott Shenker, ICSI
Michael D. Smith, CMU
Hal Varian, UC Berkeley

------------------------------------------------------------------------

in tiny parts or big parts, as a separate transaction or alongside your content downloads. If anyone gives you bogus info, it's instantly recognizable as being inconsistent with the root value. The THEX interchange format is just one way to fill in the details.

> They appear to propose that the recipient would download log(n) chunks from a tree server (by asking for offsets and lengths in the tree file, presumably using HTTP keep-alive and offset/length features of HTTP). However this has significant request overhead when the file hashes are 16-20 bytes (the request would be larger than the hash); it also involves at least two connections: one for authentication info, another for downloading chunks.
I wouldn't recommend that at all. The THEX data format is really best for grabbing a whole subset of the internal tree values, from the top/root on down, in one gulp. Yes, that top-down format includes redundant info, and you could grab the data from lots of different people, but it's so small compared to the content you're getting, why not just get the whole thing from any one arbitrary peer who has it handy? (If you were going to try any random access, I think you'd aim for one whole generation of the tree at your desired resolution, because you can calculate everything else from that.)

For example, the full tree to verify a 1GB file at a resolution of 64KB chunks is only (1G/64K) * 2 * 24 bytes (Tiger) = 768K, or less than 1/10th of 1% of the total data being verified. So I'd say, just get it from anyone who offers it, verify it's consistent with the desired root, and keep it around -- nothing fancy.

Now, you could imagine a system which includes a custom minimal proof that "paints in" any transferred data segment with the neighboring internal tree nodes, so that every range-get includes its own standalone verification-up-to-root. I think that'd be neat, but it'd be tougher to code, and it introduces its own redundancies, unless you make the protocol for requesting partial trees even more intricate. Still, it'd be an approach compatible with the same tree hash calculation method.

> They also mention their format is suitable for serial download (as well as the random access download I described in the above paragraph). Here I presume (though it is not stated) that the user would be expected to download either the entire set of leaf nodes (1/2 the full tree size), or some subset of the leaf nodes plus enough other nodes to verify that the leaf nodes were correct. (To avoid being jammed during download from the tree server.) Again, none of this is explicitly stated but would be minimally necessary to avoid jamming.

Yes, you'd get the whole tree or a whole generation to your desired resolution.
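To make the tree hash calculation concrete, here is a minimal sketch of the scheme under discussion: a binary Merkle tree over 1 KB leaf segments, with leaf nodes prefixed by 0x00 and internal nodes by 0x01 as in THEX. SHA-256 stands in for Tiger only because Python's hashlib has no Tiger implementation, and the helper names are illustrative, not part of any spec:

# Sketch of the tree hash calculation under discussion: a binary Merkle tree
# over 1 KB leaf segments, leaves prefixed with 0x00 and internal nodes with
# 0x01 (the THEX conventions). SHA-256 stands in for Tiger only because
# Python's hashlib has no Tiger.
import hashlib

SEGMENT = 1024

def _h(data):
    return hashlib.sha256(data).digest()

def leaf_hashes(blob):
    segments = [blob[i:i + SEGMENT] for i in range(0, len(blob), SEGMENT)] or [b""]
    return [_h(b"\x00" + s) for s in segments]

def next_generation(nodes):
    """Combine nodes pairwise; an unpaired node is promoted up unchanged."""
    out = [_h(b"\x01" + nodes[i] + nodes[i + 1])
           for i in range(0, len(nodes) - 1, 2)]
    if len(nodes) % 2:
        out.append(nodes[-1])
    return out

def root_and_generations(blob):
    gen = leaf_hashes(blob)
    generations = [gen]
    while len(gen) > 1:
        gen = next_generation(gen)
        generations.append(gen)
    return gen[0], generations

if __name__ == "__main__":
    root, gens = root_and_generations(b"x" * (1 << 20))  # 1 MB of dummy data
    coarse = gens[6]          # every 6th generation ~ 64 KB resolution
    while len(coarse) > 1:    # recompute the root from that one generation
        coarse = next_generation(coarse)
    print(len(gens[0]), "leaves; root recomputed from 64KB generation:",
          coarse[0] == root)

As the final check shows, any single generation is enough to recompute (and so verify against) the root, which is the point about grabbing one whole generation at your desired resolution. The 768K figure works the same way: a 1GB file has 1G/64K = 16,384 nodes at 64KB resolution, a complete binary tree over them roughly doubles that node count, and each node is a 24-byte Tiger digest.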
> A simpler and more efficient approach is as follows (I presume a 128-bit (16-byte) output hash function such as MD5, or truncated SHA1; I also presume each node has the whole file):
>
> If the file is <= 1KB, download the file and compare to the master hash.
>
> If the file is > 1KB and <= 64KB, hash separately each 1KB chunk of the file; call the concatenation of those hashes the 2nd level hash set, and call the hash of the 2nd level hash set the master hash. To download, first download the 2nd level hash set (a 1KB file) and check that it hashes to the master hash. Then download each 1KB chunk of the file (in random order from multiple servers) and check that each 1KB chunk matches the corresponding chunk in the 2nd level hash set.
>
> If the file is > 64KB and <= 4MB, hash separately each of the 1KB chunks of the file; call the concatenation of those hashes the 2nd level hash set. The 2nd level hash set will be up to 64KB in size. Hash separately each of the up to 64 1KB chunks of the 2nd level hash set; call the concatenation of those hashes the 3rd level hash set. Call the hash of the 3rd level hash set the master hash. Download and verification is an obvious extension of the 2-level case.
>
> Repeat for as many levels as necessary to match the file size. Bandwidth efficiency is optimal, there is a single compact file authenticator (the master hash: the hash of the 2nd level hash set), and immediate authentication is provided on each 1KB file chunk.

On what criteria is this more simple or efficient? It appears to be essentially the same as the THEX approach, but instead of concatenating intermediate values with their neighbors in pairs (a simple binary tree) you're doing it in groups of up to 64. You still need the exact same amount of "bottom generation" data to do complete verification at any given resolution.

And if you ever wanted to do minimally compact proofs, they're larger with your tree's larger node degree. To verify that a single 1KB segment fits inside a 1MB file with a desired root, using the THEX calculation technique, takes 10 intermediate hash values, say 200 bytes using SHA1 as the internal hash. To verify a single 1KB segment using your scheme, you'd need the individual hashes of its 63 neighbors, and then the individual hashes of the 31 next-level neighbors: at least 94 intermediate hash values, say 1.8K using SHA1.

So there are things the binary tree construction can do that the degree-64 tree construction cannot, but there's nothing the degree-64 tree can do that the binary tree can't. (Remember that every 6th generation of the binary tree is almost exactly analogous to the degree-64 tree: it includes segment summary values covering the exact same amount of source data.)

> To avoid the slow-start problem (can't download and verify from multiple servers until the 2nd level hash set has been downloaded), download of the 2nd level hash set chunk could be started from multiple servers (to discover the fastest one), and/or speculative download of 3rd level chunks or content chunks could be started and verification deferred until the 2nd level hash set chunk is complete.

Given that the verification tree info is such a tiny proportion of the total data to be transferred, and that many (if not all) peers will be able to provide the tree info on demand for any file they're also providing, I can't imagine slow-start being much of a problem in practice.

- Gordon @ Bitzi

____________________
Gordon Mohr, Bitzi CTO . . . describe and discover files of every kind.
_ http://bitzi.com _ . . . Bitzi knows bits -- because you teach it!
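To put numbers on the proof-size comparison above, here is a sketch (same 0x00/0x01 prefix convention as the earlier one; SHA-1 chosen only to match the 20-byte figure used in the message, and all function names illustrative) that builds the binary tree over a 1MB file, extracts the audit path for one 1KB segment, and verifies it against the root:

# Sketch of the compact-proof point above: with a binary tree, proving that
# one 1 KB segment belongs to a 1 MB file under a given root takes
# log2(1024) = 10 sibling hashes. Same 0x00/0x01 prefixes as the earlier
# sketch; SHA-1 is used only to match the 20-byte figure in the message.
import hashlib

SEGMENT = 1024

def _h(data):
    return hashlib.sha1(data).digest()

def build_levels(blob):
    """All generations of the tree, from the leaf hashes up to the root."""
    level = [_h(b"\x00" + blob[i:i + SEGMENT])
             for i in range(0, len(blob), SEGMENT)]
    levels = [level]
    while len(level) > 1:
        nxt = [_h(b"\x01" + level[i] + level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
        levels.append(level)
    return levels

def audit_path(levels, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            path.append((sibling % 2, level[sibling]))  # (sibling_is_right, hash)
        index //= 2
    return path

def verify(segment, path, root):
    node = _h(b"\x00" + segment)
    for sibling_is_right, sibling in path:
        node = _h(b"\x01" + (node + sibling if sibling_is_right else sibling + node))
    return node == root

if __name__ == "__main__":
    blob = bytes(range(256)) * 4096            # a 1 MB test file
    levels = build_levels(blob)
    root = levels[-1][0]
    idx = 123                                  # prove one arbitrary 1 KB segment
    path = audit_path(levels, idx)
    print(len(path), "sibling hashes,", sum(len(h) for _, h in path), "bytes")
    print("verifies:", verify(blob[idx * SEGMENT:(idx + 1) * SEGMENT], path, root))

This prints 10 sibling hashes totalling 200 bytes. Under the degree-64 grouping, the comparable proof has to ship whole 1KB groups of hashes rather than one sibling per level, which is where the much larger figure above comes from.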