From dmarti at zgp.org Fri Feb 23 14:55:02 2001 From: dmarti at zgp.org (Don Marti) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] (no subject) Message-ID: <20010223225408.E5E703FC21@capsicum.zgp.org> Fri Feb 23 14:54:08 PST 2001 From bram at gawth.com Mon Feb 26 20:49:01 2001 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Hello p2p hackers Message-ID: Hello everyone, I wanted to check who's subscribed to this list so far. I'm Bram, one of the people working on Mojo Nation, which I will happily chew everyone's ear off with just a little prompting. Who else is on here? -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From zooko at zooko.com Mon Feb 26 21:11:01 2001 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Hello p2p hackers In-Reply-To: Message from Bram Cohen of "Mon, 26 Feb 2001 20:48:36 PST." References: Message-ID: > I'm Bram, one of the people working on Mojo Nation, which I will happily > chew everyone's ear off with just a little prompting. Heh heh heh... Okay... What was that you were saying about a new method of doing replay attack prevention on IRC today? I remain enamoured of my own method (which provides full scale extensible fail-safe behaviour[1] as defined by Li Gong[2] and Paul Syverson), so I would like to hear about alternatives. Regards, Zooko [1] "Fail-Stop Protocols: An Approach to Designing Secure Protocols" http://citeseer.nj.nec.com/49099.html [2] Fun footnote for p2p fans: Li Gong was the chief security architect for Java, and is now leading Sun's nebulous p2p tech platform that Bill Joy talked about at the O'Reilly conference. 
From wesley at felter.org Mon Feb 26 22:11:01 2001 From: wesley at felter.org (Wesley Felter) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Hello p2p hackers In-Reply-To: Message-ID: On Mon, 26 Feb 2001 zooko@zooko.com wrote: > [2] Fun footnote for p2p fans: Li Gong was the chief security architect > for Java, and is now leading Sun's nebulous p2p tech platform that Bill > Joy talked about at the O'Reilly conference. And considering that Sun apparently isn't planning to answer any of their email about Jxta until the spec/code is released, I don't have a lot of faith in its security model. Maybe the third time will be the charm, though. While I'm at it, I'll pipeline in an introduction: I'm not working on a P2P system; instead I read the docs and protocols for as many of the existing ones as I can and try to learn some lessons from them. Then I try to convince other people to learn those lessons, too. It's somewhat annoying to have to correct the To: line since the Reply-To is not the list... Wesley Felter - wesley@felter.org - http://felter.org/wesley/ From md98-osa at nada.kth.se Tue Feb 27 07:30:02 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) In-Reply-To: <200102271059.KAA04195@longitude.doc.ic.ac.uk> Message-ID: (I'm crossposting this into the p2p-hackers list, since it isn't freenet-dev stuff.) On Tue, 27 Feb 2001, Theodore Hong wrote: > Oskar Sandberg wrote: > > I didn't think the Oceanstore paper was very interesting. Like you say, > > the naming scheme is like ours (and that is pretty much the obvious way > > of doing it). 
The interesting part is the paper they reference for their > > global routing system (which I first thought was just a hypercube mesh, > > but which is actually a lot more complicated): > > > > http://www.cs.utexas.edu/users/plaxton/ps/1999/tocs.ps > > yeah, what you told me at the conference didn't sound that great, but it is > more sophisticated -- from what I gather, the network is covered by a large > number of overlapping trees. Each tree, which corresponds to some object > GUID, covers all the nodes, but with different orderings. To find an > object, you traverse the appropriate tree upwards to its root, and then > downwards to the location of the object. Along the way, however, if you > encounter a downwards reference to the location of the object, go straight > there. Thus the root can be corrupt, but it doesn't matter -- the > important thing is that requests will converge towards the root and > hopefully intersect a storage reference. Actually, it doesn't seem that > dissimilar to Freenet, if you substitute "epicenter" for "root". I need to > go actually read the Plaxton paper, though, since they didn't lay out that > many details. Upon further thought, the actual protocol is not very different from what I told you. It is basically a hypercube-type search, though they modify it to allow for arbitrary dimensions (not hard) and the ability to decrease the number of hops by increasing the size of the table at every node (I don't know how to describe that geometrically). The further complication - going downward in the tree once the root for an object is found - is there, as far as I can tell, to satisfy their objective of minimizing a certain cost function in the final transfer - something that we do not deal with. I'm not sure that their model is at all fault tolerant to the roots of objects falling out.
And to the extent that it is, it has the problem that it is easy to identify the level of "rootiness" of any node for a piece of data - making targeted attacks easy. Also, for all of Oceanstore's big words about mobility of data, this offers little or no such thing as far as I can tell (which the Oceanstore people get around by adding the second Bloom filter level, but large parts of the web are being served by Bloom-filter-using Squid caches already - that hardly makes the data mobile). It is a nice model, but the routing within such a system feels uncomfortably rigid to me. I do have to look at it in more detail before passing any final judgement. > They also had a reference to some type of searching in encrypted data, > without revealing the search string? Presumably I guess you present some > encrypted string, and the algorithm tells you whether the string is present > in the data without decrypting either? That could be useful. I read the paper they reference some time ago, and it is interesting but 100% useless. Basically it is just encrypting every word as a separate block and having the searcher encrypt the search terms with the same key (but modified heavily so as to not suffer from the million holes in that version). The only application it might be useful for is ASP-like systems that keep the data secret from the ASP itself (which would be a cool thing, though not very related to what we are doing). In fact, that they presented this as a workable alternative for searching REALLY did a disservice to the Oceanstore paper in my eyes (they obviously had not really considered it).
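The scheme being dismissed here - deterministically encrypting each word as its own block, so that a searcher holding the same key can match blocks without decrypting anything - can be caricatured in a few lines. This is a sketch only: a keyed PRF stands in for the per-word encryption, and every name in it is invented.

```python
import hmac
import hashlib

KEY = b"shared secret key"  # hypothetical key known to owner and searcher

def index_words(text):
    # Deterministically transform each word into its own opaque block.
    return {hmac.new(KEY, w.encode(), hashlib.sha256).digest()
            for w in text.split()}

def contains(index, word):
    # The searcher applies the same keyed transform to the query and
    # matches blocks directly - nothing is ever decrypted.
    return hmac.new(KEY, word.encode(), hashlib.sha256).digest() in index

idx = index_words("peer to peer routing on overlapping trees")
print(contains(idx, "routing"))  # True
print(contains(idx, "freenet"))  # False
```

The determinism is also the weakness: identical words produce identical blocks, so word frequencies leak to the storage host - one of the "million holes" the hardened version of the scheme has to patch.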
> > theo > > > _______________________________________________ > Devl mailing list > Devl@freenetproject.org > http://www.uprizer.com/mailman/listinfo/devl > From md98-osa at nada.kth.se Tue Feb 27 07:50:01 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Re: [freenet-devl] Alpine, ELF In-Reply-To: <200102271104.LAA04215@longitude.doc.ic.ac.uk> Message-ID: (Also crossposted from freenet-dev into p2p-hackers.) On Tue, 27 Feb 2001, Theodore Hong wrote: > Peter Todd wrote: > > > Alpine is horribly inefficient. Being a all-to-all network topology > > where every search request is sent to *every* machine on the network > > it's bandwidth useage for any single node is n where n is the number > > of nodes in the network. Therefore the bandwidth usage for all of > > nodes is n^2, obviously horribly inefficient. > > Well, that's the point -- they claim they can do it: "The low overhead of a > DTCP connection means hundreds of thousands of concurrent connections can > be used by an application for direct communication with a large number of > peers." Can that be true? If capacity grows with n and the amount of messages with n^2, then it is easy enough to figure out how many nodes can be supported. If you want to support 1000 nodes, the amount of capacity added by each node must be 1000 times greater than the amount of messages generated by each node (per unit time) times the size of the message. If search messages are only 1 kB or so, and nodes in the network generate an average of 10 new searches per hour, then each node must make 10,000 kB of transfer capacity available per hour to handle that traffic. That is a comfortable background level for people with broadband connections (if not for the ISPs serving them).
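The arithmetic above is easy to check mechanically. A small sketch (the 1 kB message and 10 searches per hour are the stated assumptions; taking a "megabit" as 2^20 bits reproduces the 46,000-node figure quoted next):

```python
def required_capacity_kb_per_hour(n_nodes, searches_per_hour=10, message_kb=1):
    # Every search is flooded to every node, so the per-node load
    # grows linearly with the network size n.
    return n_nodes * searches_per_hour * message_kb

# 1000 nodes -> 10,000 kB of transfer per node per hour, as in the text.
print(required_capacity_kb_per_hour(1000))  # 10000

# How many nodes saturate a 1 megabit link?
link_kb_per_hour = (2**20 / 8 / 1024) * 3600           # 460,800 kB/hour
max_nodes = link_kb_per_hour / required_capacity_kb_per_hour(1)
print(int(max_nodes))  # 46080
```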
Continuing up, at 46,000 nodes you are saturating a 1 megabit connection - which would indicate (given that my numbers of 1 kB and 10 searches per hour were probably conservative) that the Alpine people are spouting turkey excrement. If you turn it the other way of course - even if your search horizon contains only 1000 people, that is certainly enough for many uses of P2P (including filesharing). I think that if the people working on Gnutella clones would just get their acts together, do the math, and code their systems with recognition that the network cannot scale, but that the horizon can still be large enough to satisfy most users, we would have a viable decentralized Napster alternative today... > > theo From orasis at acm.cs.umn.edu Tue Feb 27 08:05:01 2001 From: orasis at acm.cs.umn.edu (Justin Chapweske) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) In-Reply-To: ; from md98-osa@nada.kth.se on Tue, Feb 27, 2001 at 04:28:53PM +0100 References: <200102271059.KAA04195@longitude.doc.ic.ac.uk> Message-ID: <20010227100356.D16140@go.cs.umn.edu> > > > They also had a reference to some type of searching in encrypted data, > > without revealing the search string? Presumably I guess you present some > > encrypted string, and the algorithm tells you whether the string is present > > in the data without decrypting either? That could be useful. > > I read the paper they reference some time ago, and it is interesting > but 100% useless. Basically it is just encrypting every word as a > seperate block and having the searcher encrypt the search terms with > the same key (but modified heavily so as to not suffer from the > million holes in that version).
The only application it might be > useful for are ASP like systems that keep the data secret from the > ASP itself (which would be a cool thing, though not very related to > what we are doing). > Are you referring to Schneier's "Clueless Agents" paper (http://www.counterpane.com/clueless-agents.html)? You might also want to check out http://www.islandnet.com/~mskala/limdiff.html but I wouldn't trust it because it requires its own S-box construction. -- Justin Chapweske, Lead Swarmcast Developer, OpenCola Inc. http://www.sourceforge.net/projects/swarmcast/ From orasis at acm.cs.umn.edu Tue Feb 27 10:23:01 2001 From: orasis at acm.cs.umn.edu (Justin Chapweske) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] range bitmap compression and set operations Message-ID: <20010227122223.F16140@go.cs.umn.edu> Hey guys, I figure that some of you must have run into similar situations as I have and was wondering if anyone had some insights before I totally dive into this: I need to send various bitmaps across the network with the size being up to 65536 bits. The nice thing about these bitmaps is that they typically contain very nice ranges of 1's and 0's so they are very amenable to simple RLE-style encoding. For example a typical bitmap will simply have 1's for bits 0-1024,32768-34816 which could be encoded quite simply in XML using exactly that range format. Here is the interesting part: I want to be able to find an encoding of these bitmaps that allows me to VERY efficiently compute unions of two bitmap sets. I could of course expand the bitmaps and execute a simple AND between them, but I don't want to waste the memory. So does anyone know of or have any pointers to an RLE-style encoding that allows simple/efficient set operations on the encoded form????? -- Justin Chapweske, Lead Swarmcast Developer, OpenCola Inc.
http://www.sourceforge.net/projects/swarmcast/ From hal at finney.org Tue Feb 27 10:33:01 2001 From: hal at finney.org (hal@finney.org) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) Message-ID: <200102271829.KAA12263@finney.org> The Plaxton routing method used in OceanStore is interesting but it's not quite as described here. > > > http://www.cs.utexas.edu/users/plaxton/ps/1999/tocs.ps OceanStore is at http://oceanstore.cs.berkeley.edu/. As described in the OceanStore papers, every node gets a number. It appears to be important that no two nodes have the same number, and that the numbering be relatively "dense" - that is, there should not be too many numbers that don't have nodes. (So you can't just let each node pick a 160 bit random number, for example.) The routing scheme then goes from one node to another by number. It can be used for looking up data if the data item has a "home" on the node whose number corresponds to a (truncated) hash of the data item. Each node contains pointers to other nodes which have similar numbers. The example in the OceanStore paper uses base 16. Each node has a set of pointers to certain other nodes in the net. These pointers are organized into levels. Each node has 16 k-level pointers, to the 16 closest neighbors which match in the low order k digits. ("Closest" is in terms of ping time.) For example, node 0325 has 16 level-0 pointers to nodes of ___0, ___1, ___2, ... ___f, where the _ represent "don't care" digits. The pointers are to nearby nodes which match this pattern. So it might actually point to 19a0, 07f1, 6932, ..., 4cbf. Only the last digit matters. Then for the level 1 pointers, these will point to the 16 closest nodes which match in the lower digit: __05, __15, __25, ... __f5. Again, the first two digits don't matter and are based on whichever nodes are closest that satisfy this. It might point to a305, 2915, 0325, ..., 80f5. 
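The pointer table described above can be sketched concretely. A toy illustration (names invented; "closest" is faked with a numeric distance function, where a real node would measure ping times):

```python
def build_pointers(my_label, others, distance, digits="0123456789abcdef"):
    """Level-k slot for digit d points to the closest node whose label
    ends with: digit d followed by my label's low-order k digits."""
    table = {}
    for k in range(len(my_label)):
        low = my_label[len(my_label) - k:]  # my low-order k digits ("" at k=0)
        for d in digits:
            matches = [n for n in others if n.endswith(d + low)]
            if matches:
                table[(k, d)] = min(matches,
                                    key=lambda n: distance(my_label, n))
    return table

# Node 0325's level-0 slot for digit '8' points to some nearby ___8 node,
# and its level-1 slot for digit '0' to some nearby __05 node.
others = ["b4f8", "19a0", "07f1", "4cbf", "a305", "2915", "80f5"]
fake_distance = lambda a, b: abs(int(a, 16) - int(b, 16))  # stand-in for ping
table = build_pointers("0325", others, fake_distance)
print(table[(0, "8")])  # b4f8
print(table[(1, "0")])  # a305
```

Routing then just walks this table: at each hop, follow the level-k slot whose digit matches the target's (k+1)-th digit from the right, fixing one more low-order digit per hop.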
Then there are level 2 pointers of the form _025, _125, _225, ... and level 3 pointers of the form 0325, 1325, 2325, 3325, .... I'm not clear how this system of numbers and pointers is set up and maintained, particularly in a dynamic network where nodes are constantly joining and leaving. I need to read the Plaxton paper to learn more. Once you have the pointers, though, the routing is easy. Simply move through the network, setting the digits from right to left. The example in the paper goes from 0325 to 4598. The first step follows a level 0 pointer from 0325 to ___8, so that the right digit will be correct. It happens to go to b4f8. From this node we will follow a level 1 link to __98, which actually goes to 9098. From here we take a level 2 link to _598, which happens to go to 7598. And from here we can take a level 3 link directly to 4598. To the extent that we view this as a tree traversal, I see it as one where the root of the tree is the destination node, 4598. Its first-level children are those reachable via level-3 pointers: 0598, 1598, 2598, 3598, etc. The children of these nodes are the ones reachable to them via level-2 pointers. The children of 7598 are __98, one of which happens to be 9098. And the children of 9098 are the ones reachable to it by level-1 pointers, one of which is b4f8. Finally, the children of b4f8 are those reachable to it by level-0 pointers, one of which is 0325. In this view, what we did was walk straight up the tree, from leaf node to parent. At each step we got one more digit right. And the single data structure can be looked at as a different tree rooted at each different node. In this way it is similar to hypercube routing, as Oskar noted earlier. Hypercube routing usually is done base 2, but the Plaxton tree could be done that way as well, just substitute 2 for 16 above. The difference is that in the hypercube, we point to the nodes which differ from us in exactly one bit. 
If our address is 1010101, we point to 1010100 and 1010111 and 1010001, etc. But in a binary Plaxton tree we point to ______0, and to _____11, and to ____001, etc. It is like a "loose" hypercube, in that a number of the bit positions are unspecified, so we can pick a closer node. But the same basic routing algorithm (get 1 bit at a time right) is used. I think the main advantage you get from this looseness in the Plaxton system is that you can pick a closer node. Hypercube routing gets you there in a small number of steps, but each step may be long in physical space. With the Plaxton tree, the steps are closer, at least the first ones. (The last step is the same as the hypercube so it may not be particularly close.) That's because with Plaxton you choose your neighbors to be close nodes. You can get something of the same effect with a hypercube if you can map it to the geometry of your network, but since the surface of the earth is topologically 2-dimensional it is impossible to map a 20-dimensional hypercube onto it and maintain distance closeness. So the Plaxton tree should be faster than hypercube routing in practice. However it still seems like it should share some of the inefficiency, particularly in the last couple of steps where you won't have any physically nearby nodes that match yours so closely. You may end up hopping from Chicago to Timbuktu to London on your last three steps, even if the first 10 steps stayed in the U.S. Hal From hal at finney.org Tue Feb 27 11:23:01 2001 From: hal at finney.org (hal@finney.org) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] range bitmap compression and set operations Message-ID: <200102271919.LAA12444@finney.org> Justin Chapweske writes: > I need to send various bitmaps across the network with the size being up > to 65536 bits. The nice thing about these bitmaps is that they typically > contain very nice ranges of 1's and 0's so they are very ammenable to > simple RLE style encoding. 
For example a typical bitmap will simply have > 1's for bits 0-1024,32768-34816 which could be encoded quite simply in XML > using exactly that range format. > > Here is the interesting part: > > I want to be able to find an encoding of these bitmaps that allows me to > VERY efficiently compute unions of two bitmap sets. I could of course > expand the bitmaps and execute a simple AND between them, but I don't want > to waste the memory. > > So does anyone know of or have any pointers to an RLE-style encoding that > allows simple/efficient set operations on the encoded form????? It seems like your proposed encoding is well suited for unions. Have a list of start and end points, and do something like:

    for( ; ; ) {
        if( b->start < a->start ) {
            // Swap a and b
            t = a; a = b; b = t;
        }
        // Now a->start <= b->start
        o->start = a->start;
        for( ; ; ) {
            if( a->end < b->start ) {
                // No overlap: close the segment and consume a
                o->end = a->end;
                o++;
                ++a;
                break; // out of inner loop, back to top of outer loop
            }
            // Overlap
            if( b->end <= a->end ) {
                // b segment enclosed in a
                ++b;
                continue; // back to inner loop
            }
            // b extends beyond a
            // advance a, swap a and b, back to inner loop
            ++a;
            t = a; a = b; b = t;
        }
    }

This copies a union b into o, using fields ->start and ->end as the start and end of the ranges of 1's. It needs to be enhanced to detect the end of the data, but the basic idea is simple. We look at the next segment and see if it overlaps the current one.
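The same sweep is compact over inclusive (start, end) pairs in a higher-level language. A hedged sketch, with an intersection routine included as well since the encoding supports it just as directly:

```python
def union_ranges(a, b):
    """Union of two sorted lists of (start, end) inclusive ranges."""
    merged = []
    for start, end in sorted(a + b):
        if merged and start <= merged[-1][1] + 1:   # overlaps or touches
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(r) for r in merged]

def intersect_ranges(a, b):
    """Intersection, walking both sorted lists in step."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        # Advance whichever range ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

print(union_ranges([(0, 1024)], [(1000, 2000), (32768, 34816)]))
# [(0, 2000), (32768, 34816)]
print(intersect_ranges([(0, 1024), (32768, 34816)], [(500, 40000)]))
# [(500, 1024), (32768, 34816)]
```

Both run in a single pass over the encoded form, so neither ever expands the 65536-bit bitmap.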
Here is another way to do it, without swapping a and b, same idea:

    insegment = 0;
    for( ; ; ) {
        if( !insegment ) {
            if( a->start <= b->start ) {
                o->start = a->start;
                if( a->end < b->start ) {
                    o++->end = a++->end;
                } else
                    insegment = 1;
            } else {
                o->start = b->start;
                if( b->end < a->start ) {
                    o++->end = b++->end;
                } else
                    insegment = 1;
            }
        }
        else if( a->end < b->start ) {
            o++->end = a++->end;
            insegment = 0;
        }
        else if( b->end < a->start ) {
            o++->end = b++->end;
            insegment = 0;
        }
        else if( a->end < b->end )
            ++a;
        else
            ++b;
    }

I haven't tested any of this of course, it's just to show the general idea. I doubt you will come up with a data structure or algorithm that's much faster. Hal From bram at gawth.com Tue Feb 27 12:03:01 2001 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Hello p2p hackers In-Reply-To: Message-ID: On Mon, 26 Feb 2001 zooko@zooko.com wrote: > > > What was that you were saying about a new method of doing replay attack > prevention on IRC today? It has to do with moving towards connection-awareness in our communications. It's a bit involved to go into here, but I think there's a simple lesson - don't worry about the security in the first version of your system too much, you'll probably want to change it around later anyway. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From alk at pobox.com Tue Feb 27 12:30:01 2001 From: alk at pobox.com (Tony Kimball) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Re: [freenet-devl] Alpine, ELF References: <200102271104.LAA04215@longitude.doc.ic.ac.uk> Message-ID: <15004.3599.450174.566258@spanky.love.edu> Quoth Oskar Sandberg on Tuesday, 27 February: : : I think that if the people working on : Gnutella clones would just get there acts together, do the math, and : code their systems with recognition that the network cannot scale, : but that the horizon can still be large enough to satisfy most users...
If the population space is large enough and persistent enough, this will happen by annealing, inevitably: Interest pockets will form, consisting of servents with a lot of interest overlap. The preconditions for such an outcome aren't really there, though, in the current gnutella software. From orasis at acm.cs.umn.edu Tue Feb 27 12:58:01 2001 From: orasis at acm.cs.umn.edu (Justin Chapweske) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] range bitmap compression and set operations In-Reply-To: <200102271919.LAA12444@finney.org>; from hal@finney.org on Tue, Feb 27, 2001 at 11:19:20AM -0800 References: <200102271919.LAA12444@finney.org> Message-ID: <20010227145727.G16140@go.cs.umn.edu> > > > > Here is the interesting part: > > > > I want to be able to find an encoding of these bitmaps that allows me to > > VERY efficiently compute unions of two bitmap sets. I could of course > > expand the bitmaps and execute a simple AND between them, but I don't want > > to waste the memory. > > Shit! I meant intersect! Although quick unions are important as well for adding new ranges to the set so I much appreciate the feedback. Right now the approach I'm taking is to ensure that all lists are ordered and as compact as possible (no entries like "0,0-10,9-15") and then perform the operation, but for some reason I tend to think that this sort/compress phase may be unnecessary. -- Justin Chapweske, Lead Swarmcast Developer, OpenCola Inc. http://www.sourceforge.net/projects/swarmcast/ From orasis at acm.cs.umn.edu Tue Feb 27 13:06:01 2001 From: orasis at acm.cs.umn.edu (Justin Chapweske) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] range bitmap compression and set operations In-Reply-To: <20010227145727.G16140@go.cs.umn.edu>; from orasis@acm.cs.umn.edu on Tue, Feb 27, 2001 at 02:57:28PM +0000 References: <200102271919.LAA12444@finney.org> <20010227145727.G16140@go.cs.umn.edu> Message-ID: <20010227150506.H16140@go.cs.umn.edu> I found my answer in Perl (of course).
Set::IntSpan is exactly what I was looking for: http://www.infoboard.com/perldoc/modules/Set/IntSpan.html Sorry if some folks may find this off-topic, but I think that we will find various algorithms and data structures commonly appearing in our different systems. -- Justin Chapweske, Lead Swarmcast Developer, OpenCola Inc. http://www.sourceforge.net/projects/swarmcast/ From bram at gawth.com Tue Feb 27 13:14:01 2001 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] range bitmap compression and set operations In-Reply-To: <20010227150506.H16140@go.cs.umn.edu> Message-ID: On Tue, 27 Feb 2001, Justin Chapweske wrote: > I found my answer in Perl (of course). Set::IntSpan is exactly what I was > looking for: > > http://www.infoboard.com/perldoc/modules/Set/IntSpan.html > > Sorry if some folks may find this off-topic, but I think that we will find > various algorithms and data structures commonly appearing in our different > systems. This isn't coderpunks - you won't get flamed for talking about code here. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From md98-osa at nada.kth.se Tue Feb 27 17:24:01 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) In-Reply-To: <20010227100356.D16140@go.cs.umn.edu>; from orasis@acm.cs.umn.edu on Tue, Feb 27, 2001 at 10:03:56AM +0000 References: <200102271059.KAA04195@longitude.doc.ic.ac.uk> <20010227100356.D16140@go.cs.umn.edu> Message-ID: <20010228022531.A1131@hobbex.localdomain> On Tue, Feb 27, 2001 at 10:03:56AM +0000, Justin Chapweske wrote: > > I read the paper they reference some time ago, and it is interesting > > but 100% useless.
Basically it is just encrypting every word as a > > seperate block and having the searcher encrypt the search terms with > > the same key (but modified heavily so as to not suffer from the > > million holes in that version). The only application it might be > > useful for are ASP like systems that keep the data secret from the > > ASP itself (which would be a cool thing, though not very related to > > what we are doing). > > > > Are you referring to Schneier's "Clueless Agents" paper > (http://www.counterpane.com/clueless-agents.html)? You might also want to > check out http://www.islandnet.com/~mskala/limdiff.html but I wouldn't > trust it because it requires its own S-box construction. Neither, I'm referring to the paper regarding search on encrypted data that they reference from the Oceanstore paper. You can read it here: http://paris.cs.berkeley.edu/~dawnsong/papers/se.ps -- 'DeCSS would be fine. Where is it?' 'Here,' Montag touched his head. 'Ah,' Granger smiled and nodded. Oskar Sandberg md98-osa@nada.kth.se From md98-osa at nada.kth.se Wed Feb 28 09:03:01 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) In-Reply-To: <200102271829.KAA12263@finney.org>; from hal@finney.org on Tue, Feb 27, 2001 at 10:29:51AM -0800 References: <200102271829.KAA12263@finney.org> Message-ID: <20010228173753.B642@hobbex.localdomain> On Tue, Feb 27, 2001 at 10:29:51AM -0800, hal@finney.org wrote: > The Plaxton routing method used in OceanStore is interesting > but it's not quite as described here. > > > > http://www.cs.utexas.edu/users/plaxton/ps/1999/tocs.ps > > OceanStore is at http://oceanstore.cs.berkeley.edu/. < snip description > It seems to me that Oceanstore is using the system described in Plaxton's paper pretty much straight off. They claim to add more redundant links, but that isn't really a big change.
> I'm not clear how this system of numbers and pointers is set up and > maintained, particularly in a dynamic network where nodes are constantly > joining and leaving. I need to read the Plaxton paper to learn more. The above paper doesn't deal with adding nodes to the network at all. The Oceanstore paper says: "While existing work on Plaxton-like data structures did not include algorithms for online creation and maintenance of the global mesh, we have produced recursive node insertion and removal algorithms." The problem of giving a new node links at each level should be pretty trivial by just following the primary neighbor sequence of its ID from any node (finding the closest ones may be harder - though often you could just start from a node at your POP on the physical network and therefore get it). Giving other nodes links to a new node is probably more difficult, and definitely impossible in an environment where the new node could have less than honest intentions... <> > I think the main advantage you get from this looseness in the Plaxton > system is that you can pick a closer node. Hypercube routing gets > you there in a small number of steps, but each step may be long in > physical space. With the Plaxton tree, the steps are closer, at least > the first ones. (The last step is the same as the hypercube so it may > not be particularly close.) That's because with Plaxton you choose your > neighbors to be close nodes. > > You can get something of the same effect with a hypercube if you can map > it to the geometry of your network, but since the surface of the earth > is topologically 2-dimensional it is impossible to map a 20-dimensional > hypercube onto it and maintain distance closeness. So the Plaxton tree > should be faster than hypercube routing in practice.
> > However it still seems like it should share some of the inefficiency, > particularly in the last couple of steps where you won't have any > physically nearby nodes that match yours so closely. You may end up > hopping from Chicago to Timbuktu to London on your last three steps, > even if the first 10 steps stayed in the U.S. In a way the Plaxton system deals with this. It assumes that there is some person somewhere that is sharing the data. This person does an Insert, which places the data on each step following the routing from their location (in Plaxton's language, the primary neighbor sequence of the inserter). Then, when searching, a node does not just have the primary neighbors for each value and level, it also has a set of secondary neighbors, being some number d of the other nodes that could have been the primary neighbors but weren't closest. At each step in a read, it checks the secondary neighbors as well as the primary to see if they have the data (ie, if they were in the primary neighbor sequence of the inserter). If going through the root node (where the sequences from the inserter and the reader are guaranteed to intersect) is not the closest path between the inserter and reader, then the chances are high that it would have been found along the earlier path (don't trust me, most of the Plaxton paper deals with proving this). -- 'DeCSS would be fine. Where is it?' 'Here,' Montag touched his head. 'Ah,' Granger smiled and nodded. Oskar Sandberg md98-osa@nada.kth.se From hal at finney.org Wed Feb 28 10:21:01 2001 From: hal at finney.org (hal@finney.org) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) Message-ID: <200102281817.KAA16135@finney.org> Oskar writes: > The above paper doesn't deal with adding nodes to the network at all.
The > Oceanstore paper says: > > "While existing work on Plaxton-like data structures did not include > algorithms for online creation and maintenance of the global mesh, we have > produced recursive node insertion and removal algorithms." Of course "recursive" tells us nothing about the actual cost of such algorithms. I have read the Plaxton paper (the part about the algorithms, not the proofs!) now, and there are some subtleties in the Plaxton storage that look to me like they require global knowledge. > The problem of giving a new node links at each level should be pretty > trivial by just following the primary neighbor sequence of it's ID from > any node (finding the closest ones may be harder - though often you could > just start with from a node at your POP on the physical network and > therefore get it). I don't think this will work (starting with a nearby node) because the local node will have a different label than yours. If you are looking for __34 and his number is 4321, he won't have anything that matches. He'll forward to ___4, which will forward to __34, but this is now two jumps away and is not necessarily the closest matching node to you. That is, the returned __34 is the closest such node to the ___4 node, not to 4321 which is where you are. With higher level neighbors there are more intervening hops and maintaining closeness is even more questionable. If you just accept this as being "close enough" then this sloppiness will grow with each node insertion. It's also important that nodes not share labels, or at least that they find out if any other nodes have the same label. This can largely be done by simply querying the network for a given label, but it is vulnerable to race conditions and it's not clear what happens then. > Giving other nodes links to a new node is probably more > difficult, and definitely impossible in an environment where the new node > could have less than honest intentions... 
Yes, Wei Dai pointed out on the bluesky list that the system was highly vulnerable to data-erasing attacks, where a node is able to choose its own label which matches the ID of some document it wants to erase. It then gets that document assigned to it and is able to keep it off the network. Regarding inefficiency of routing: > In a way the Plaxton system deals with this. It assumes that there is some > person somewhere that is sharing the data. This person does an Insert, > which places the data on each step following the routing from their > location (in Plaxton's language, the primary neighbor sequence of the inserter). Yes, I see that Plaxton short-circuits the routing by spreading around pointers to the data. This makes it less likely that the last few hops (which are the expensive ones) will be needed to find the target node. The typical search involves hopping around in the low order parts of the tree, where nodes are physically close, until we find the pointer to the actual data and go there directly. Hence they are able to prove that they are within a constant bound of optimal. OceanStore appears to extend this by spreading the data around, not just references to it, but I need to read that part more carefully. Hal From markm at caplet.com Wed Feb 28 11:42:01 2001 From: markm at caplet.com (Mark S. Miller) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Re: Welcome to the "p2p-hackers" mailing list In-Reply-To: <20010228081920.0DBC93FCA5@capsicum.zgp.org> Message-ID: <5.0.2.1.2.20010228113411.04fa1cf0@shell9.ba.best.com> At 12:19 AM Wednesday 2/28/01, p2p-hackers-request@zgp.org wrote: >So here is a mailing list which I hope will continue the noble >tradition of fraternization among p2p hackers. In the noble tradition of openness, we should also make the archives visible to non-subscribers. That way, the archives serve as a valuable public record accessible to search engines, and for people to link into. 
I've been doing this on e-lang from the beginning, and have been very happy with the results. Cheers, --MarkM From zooko at zooko.com Wed Feb 28 11:42:02 2001 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] proposal re: scalability of block publication and fetching Message-ID: I was mildly stung by Oskar's assertion that Mojo Nation architecture is inherently non-scalable with respect to fetching a block whose id you know. (Which Freenet people apparently call "routing", I guess because they are thinking of app-level communication routing.) I am thinking about a change to the MN architecture which would be very very simple to deploy (for you Mojo Hackers, it would be simply a new handicapper plug-in). I will describe it in abstract terms, glossing over at least four implementation details that you don't need to know in order to tell me if this is scalable or not. Suppose that you have a network with nodes which hold blocks of data indexed by the SHA1 hash of the block. A node, `A', wants to publish a block of data, and then later a different node `B', who already knows the unique id of that block, wants to fetch the block. `A' knows the "phonebook info" for N other nodes, where the "phonebook info" consists of the public key and other information sufficient to communicate with that node. My proposal, which I call "MaskMatchingHandicapper", is that `A' chooses the log(N) nodes whose public key ids have the highest "mask match" with the id of the block. A "mask match" is currently the number of contiguous leading bits which all match, but any scalar comparison like Hamming distance or integer difference should work as well. `A' then sends the block to those log(N) nodes. Now later `B' wishes to fetch the block, whose id `B' already knows. `B' already knows M other nodes. `B' queries the top log(M) nodes based on mask match. 
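To make the selection rule concrete, here is a minimal sketch in Python (an illustration only, not actual Mojo Nation code; it assumes ids are 160-bit SHA1 values handled as integers, and the function names are made up):

```python
def mask_match(a: int, b: int, bits: int = 160) -> int:
    """Number of contiguous leading bits shared by the two ids."""
    x = a ^ b
    if x == 0:
        return bits
    # The highest set bit of the XOR marks the first disagreement.
    return bits - x.bit_length()

def pick_targets(block_id: int, phonebook: dict, count: int) -> list:
    """Rank the known nodes by mask match against the block id and
    return the top `count` of them.  `A' would publish to these nodes;
    `B' later queries its own top matches computed the same way."""
    ranked = sorted(phonebook,
                    key=lambda node_id: mask_match(block_id, node_id),
                    reverse=True)
    return ranked[:count]
```

In practice `count` would be roughly K * log(N) for the small constant K discussed below.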
(Note that you can consider the fact that `A' and `B' know possibly different sets of counterparties to be either a consequence of one or both of them having an incomplete knowledge of the net, or of `B' operating at a later time than `A', in which case some nodes will have come and gone.) Now what is the chance of success? It is the chance that at least one of the log(N) nodes published-to by `A' is also among the log(M) nodes queried by `B'. The key phrase in there is "at least one", and given some weak assumptions about the chance of an arbitrary node being reachable by one of the counterparties, we can easily gain a high confidence that at least one will be reachable by both simply by changing our "log(X)" to "K * log(X)" for some small constant K. (This has implications for the bandwidth usage and the performance, which is one of those issues that I'm glossing over, although we do have a solution to this already implemented and deployed in Mojo Nation. Actually four solutions, two of which we are grandfathering-out at this point... ;-)) There are (at least) two particular failure modes: 1. All log(N) of the nodes that `A' published to are unknown to `B'. 2. All log(M) of the nodes queried by `B' were unknown to `A'. Also combinations of the two. Note that the latter case (#2) could happen if the size of the network had ballooned dramatically between `A' publishing and `B' querying. But the network would have to actually _square_ in size before _all_ of the log(M) nodes queried by `B' were newbies. Now is this scalable? It seems obvious to me that it is, although it leaves open questions of practical performance (you do not want to publish log(N) times the size of your data), and the big question of how `B' learned the unique id of the block. (Both of these issues are already solved, of course, on Mojo Nation, but those solutions might not scale.) Regards, Zooko P.S. 
The idea of MaskMatchingHandicapper was inspired by an idea that Raph Levien posted to advogato concerning doing the same thing, but for plaintext meta-data instead of for blocks. From md98-osa at nada.kth.se Wed Feb 28 12:35:02 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] proposal re: scalability of block publication and fetching In-Reply-To: ; from zooko@zooko.com on Wed, Feb 28, 2001 at 11:39:16AM -0800 References: Message-ID: <20010228213550.B1736@hobbex.localdomain> On Wed, Feb 28, 2001 at 11:39:16AM -0800, zooko@zooko.com wrote: > > I was mildly stung by Oskar's assertion that Mojo Nation architecture > is inherently non-scalable with respect to fetching a block whose id > you know. Good :-). > (Which Freenet people apparently call "routing", I guess > because they are thinking of app-level communication routing.) Fetching (Freenet: Requesting Data) a piece of data from a known id (Freenet: key) is not what I refer to as routing, but a node deciding where to send a Request is. Since your system assumes that every node has global knowledge of the network, you don't have to route as such, but that is exactly what I'm saying is wrong. <> > Suppose that you have a network with nodes which hold blocks of data > indexed by the SHA1 hash of the block. > > > A node, `A', wants to publish a block of data and then later a > different node `B', who already knows the unique id of that block wants > to fetch the block > > > `A' knows the "phonebook info" for N other nodes, where the "phonebook > info" consists of the public key and other information sufficient to > communicate with that node. The question is, how large is this "phonebook" (freenet: ReferenceStore or Routing table)? For the network to be truly scalable, you don't want the size of the routing table at any node to grow faster than O(log N) where N is the number of nodes on the entire network. 
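As a quick numeric sanity check of how rarely two small, independently chosen phonebooks overlap (a sketch added for illustration, not from the original thread; `overlap_probability` is a made-up name):

```python
import math

def overlap_probability(n: int, k: int) -> float:
    """Chance that two independently, randomly chosen routing tables of
    k nodes each (out of n total) share at least one node.
    Approximated as 1 - (1 - k/n)**k, treating the k picks as
    independent, which is close enough for large n."""
    return 1.0 - (1.0 - k / n) ** k

# With k = log(n) tables, the overlap chance shrinks as the network grows.
for n in (10**3, 10**5, 10**7):
    k = int(math.log(n))
    print(n, overlap_probability(n, k))
```

Running this shows the probability heading toward zero as n grows, which is the point being argued below.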
If you have two parties (A and B) that selected their routing tables independently and randomly, then the chance that they will share at least one node is the chance that A chose a node that B also chose, or: 1-(1-k/n)^k, where k is the routing table size and n the number of nodes on the network (it's not really an exponent since A will only choose a given node once, but as the numbers increase that will stop mattering). If you want the size of the routing tables to grow only with O(log n) then this becomes: 1-(1-O(log n)/O(n))^O(log n) You can try plotting that in Matlab or something, and you'll see that it dives towards zero pretty soon. <> > Now is this scalable? It seems obvious to me that it is, although it > leaves open questions of practical performance (you do not want to > publish log(N) times the size of your data), and the big question of > how `B' learned the unique id of the block. (Both of these issues are > already solved, of course, on Mojo Nation, but those solutions might > not scale.) Not unless I am misunderstanding you in some respect regarding how the nodes' "phonebooks" are gathered. I would assert that it is impossible to make a scalable way of finding data on a network that does not work through some sort of sorting and then a hill-climbing search when trying to locate the data. In fact, I would _really_ recommend that you take a look at the Plaxton scheme that Hal and I were discussing here. While I have issues with its resistance to coordinated attacks, anonymity, and node operator control over the routing, these are things that you guys don't seem to put much weight on, and that your current scheme isn't any better suited for anyway. > > > Regards, > > Zooko > > P.S. The idea of MaskMatchingHandicapper was inspired by an idea that > Raph Levien posted to advogato concerning doing the same thing, but for > plaintext meta-data instead of for blocks. > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers -- 'DeCSS would be fine. 
Where is it?' 'Here,' Montag touched his head. 'Ah,' Granger smiled and nodded. Oskar Sandberg md98-osa@nada.kth.se From bram at gawth.com Wed Feb 28 12:48:02 2001 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] proposal re: scalability of block publication and fetching In-Reply-To: <20010228213550.B1736@hobbex.localdomain> Message-ID: On Wed, 28 Feb 2001, Oskar Sandberg wrote: > > `A' knows the "phonebook info" for N other nodes, where the "phonebook > > info" consists of the public key and other information sufficient to > > communicate with that node. > > The question is, how large is this "phonebook" (freenet: ReferenceStore or > Routing table)? For the network to be truly scalable, you don't want the > size of the routing table at any node have to grow faster than O(log N) > where N is the number of nodes on the entire network. Until we get up to around 10,000 counterparties the size of the phone book won't be more than a few megs. When we get to that point we'll start worrying about how to make the system scale more. DNS has the same scaling problem. It doesn't seem to be melting. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From hal at finney.org Wed Feb 28 12:56:01 2001 From: hal at finney.org (hal@finney.org) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Bluesky list Message-ID: <200102282053.MAA16966@finney.org> A[nother] new mailing list is getting started to discuss general issues of peer-to-peer style file-sharing systems like Freenet, MojoNation, Publius, Gnutella and the like. Information is at: http://www.transarc.com/~ota/bluesky/index.html. Their charter: The purpose of the mailing list is to foster discussion of design and implementation issues related to the development of scalable, decentralized storage systems of literally global scope. 
The emphasis should be on technical descriptions and critique of mechanisms providing efficiency, reliability, security, and similar properties. Discussion of goals and semantics is also desirable, while acknowledging that a diversity of systems will be built and evaluated. Messages with primarily political, legal or philosophical content are discouraged. It's got some smart people signed up although the traffic level has been pretty low. P2P hackers might want to take a look. Hal From orasis at acm.cs.umn.edu Wed Feb 28 12:59:01 2001 From: orasis at acm.cs.umn.edu (Justin Chapweske) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Scalability vs Network Stability In-Reply-To: ; from zooko@zooko.com on Wed, Feb 28, 2001 at 11:39:16AM -0800 References: Message-ID: <20010228145824.L16140@go.cs.umn.edu> One thing that seems to be missing from these conversations on routing and lookup is some quantifiable measure of network stability. A lot of these routing systems being proposed seem nice from a pure scalability perspective but I doubt that most of them will work in any sort of unstable network environment. Does anyone have any good quantifiable definition of network stability that we can use to benchmark our systems against? It seems to me that most of the algorithms in Freenet depend on a relatively high degree of network stability... But what happens when Napster goes down and everyone starts using Espra? Here is my current gut feeling on the order of network stability levels required for various systems to succeed: 1) Oceanstore, Publius 2) Freenet, OpenCola Folders 3) MojoNation 4) Gnutella 5) Swarmcast (Nodes actively shut themselves down after a period of time) Now what I want is to assign some numbers to each of these systems....any guesses how? -- Justin Chapweske, Lead Swarmcast Developer, OpenCola Inc. 
http://www.sourceforge.net/projects/swarmcast/ From md98-osa at nada.kth.se Wed Feb 28 16:12:01 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] proposal re: scalability of block publication and fetching In-Reply-To: ; from bram@gawth.com on Wed, Feb 28, 2001 at 01:58:33PM -0800 References: <20010228213550.B1736@hobbex.localdomain> Message-ID: <20010301011350.A2677@hobbex.localdomain> On Wed, Feb 28, 2001 at 01:58:33PM -0800, Bram Cohen wrote: > On Wed, 28 Feb 2001, Oskar Sandberg wrote: > > > > `A' knows the "phonebook info" for N other nodes, where the "phonebook > > > info" consists of the public key and other information sufficient to > > > communicate with that node. > > > > The question is, how large is this "phonebook" (freenet: ReferenceStore or > > Routing table)? For the network to be truly scalable, you don't want the > > size of the routing table at any node to grow faster than O(log N) > > where N is the number of nodes on the entire network. > > Until we get up to around 10,000 counterparties the size of the phone book > won't be more than a few megs. When we get to that point we'll start > worrying about how to make the system scale more. Until then you can't claim to have a scalable architecture. > DNS has the same scaling problem. It doesn't seem to be melting. I think taking one's cues from DNS is just about the last thing somebody trying to build a decentralized P2P system should do. And remember that systems like Napster and ICQ already have namespaces considerably larger than the DNS domain names - by burdening MN with a simplistic routing model you are severely undershooting its potential. > -Bram Cohen > > "Markets can remain irrational longer than you can remain solvent" > -- John Maynard Keynes > -- 'DeCSS would be fine. Where is it?' 'Here,' Montag touched his head. 'Ah,' Granger smiled and nodded. 
Oskar Sandberg md98-osa@nada.kth.se From md98-osa at nada.kth.se Wed Feb 28 16:35:02 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) In-Reply-To: <200102281817.KAA16135@finney.org>; from hal@finney.org on Wed, Feb 28, 2001 at 10:17:40AM -0800 References: <200102281817.KAA16135@finney.org> Message-ID: <20010301013605.B2677@hobbex.localdomain> On Wed, Feb 28, 2001 at 10:17:40AM -0800, hal@finney.org wrote: > Oskar writes: > > The above paper doesn't deal with adding nodes to the network at all. The > > Oceanstore paper says: > > > > "While existing work on Plaxton-like data structures did not include > > algorithms for online creation and maintenance of the global mesh, we have > > produced recursive node insertion and removal algorithms." > > Of course "recursive" tells us nothing about the actual cost of > such algorithms. I have read the Plaxton paper (the part about the > algorithms, not the proofs!) now, and there are some subtleties in the > Plaxton storage that look to me like they require global knowledge. I guess. Plaxton's system had parent links, so shouldn't the node be able to find the root for its id, and then walk backwards up each branch of the tree to find all the options at every level (which would be a high-cost operation, of course)? <> > It's also important that nodes not share labels, or at least that they > find out if any other nodes have the same label. This can largely be done > by simply querying the network for a given label, but it is vulnerable > to race conditions and it's not clear what happens then. Plaxton seems to solve ties by simply invoking an order function on the network (his beta function), so I figure that should be possible to apply here as well. -- 'DeCSS would be fine. Where is it?' 'Here,' Montag touched his head. 'Ah,' Granger smiled and nodded. 
Oskar Sandberg md98-osa@nada.kth.se From hal at finney.org Wed Feb 28 17:55:01 2001 From: hal at finney.org (hal@finney.org) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) Message-ID: <200103010152.RAA18066@finney.org> Oskar writes: > On Wed, Feb 28, 2001 at 10:17:40AM -0800, hal@finney.org wrote: > > Of course "recursive" tells us nothing about the actual cost of > > such algorithms. I have read the Plaxton paper (the part about the > > algorithms, not the proofs!) now, and there are some subtleties in the > > Plaxton storage that look to me like they require global knowledge. > > I guess. Plaxton's system had parent links, so shouldn't the node be able > to find the root for its id, and then walk backwards up each branch of > the tree to find all the options at every level (which would be a high-cost > operation, of course)? I don't follow how this would work. Say we're using base 4, and my new node has randomly chosen the label 0123. Now here's a neighboring node with label 3210. For my level-0 links, I need to find the closest nodes matching ___0, ___1, ___2, and ___3. I can get these from 3210 just fine. For my level-1 links, I need close nodes matching __03, __13, __23, __33. I can't get any of these from 3210 directly. He can follow his ___3 link and I can get nodes from that 2nd link, but they may not be closest to me. Likewise for the level-2 links I need _023, _123, _223, _323. For these I can take the ___3 neighbor of 3210, and his __23 neighbor, and use his level-2 links. But now these are three steps away. And last I need 0123, 1123, 2123 and 3123, which I can get by following the ___3, __23, _123 path and asking for his neighbors (cost is not an issue for the highest-level neighbors in Plaxton). 
Possibly a good compromise would be to somehow identify a bunch of physically-nearby nodes participating in the network, and perform this algorithm with each of them to get a good assortment of candidates for each one, and then to take the closest. > > It's also important that nodes not share labels, or at least that they > > find out if any other nodes have the same label. This can largely be done > > by simply querying the network for a given label, but it is vulnerable > > to race conditions and it's not clear what happens then. > > Plaxton seems to solve ties by simply invoking an order function on the > network (his beta function) so I figure that should be possible to apply > here as well. Right, and you could use IP address or some such to break the ties, but you have to recognize that ties exist first, which is global information. There's a special rule for the last-level neighbors; you don't take the closest one, you take the one highest in beta (assuming there is more than one). So in the example above when I go for 1123, 2123, & 3123, I need to query whether there is more than one of these and pick the one with the highest beta value. Now, I suppose I can just copy anyone's links because they also were supposed to use the highest beta. But if a new node has just been added with label 2123, and it happens to have a higher beta, I need to know about it, and so does everybody else who's pointing at 2123. (And for that matter, all the data at the old 2123 has to get sent over to the new one, since that's where people will look.) Similar problems happen when a node leaves. I'd feel better if OceanStore had a claim that adding/removing nodes took log n or log^2 n operations, or some such. Given that nodes will be entering and leaving all the time, if these operations are costly it could be a significant load. Hal