From threelions0916 at yahoo.com.cn Thu Dec 1 06:34:37 2005
From: threelions0916 at yahoo.com.cn (Michael Liu)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] P2P in SFC
References: <438B8E20.3030903@quinthar.com><20051129011547.34494.qmail@web53605.mail.yahoo.com> <40513.67.188.193.83.1133228692.squirrel@webmail.redswoosh.net>
Message-ID: <002301c5f641$4ee31820$3118080a@cnc.intra>

you guys are so lucky, I want to attend the meeting too, but I am in china :-(

Michael

----- Original Message -----
From: "Travis Kalanick" <travis@redswoosh.net>
To: "Peer-to-peer development." <p2p-hackers@zgp.org>
Sent: Tuesday, November 29, 2005 9:44 AM
Subject: Re: [p2p-hackers] P2P in SFC

> Lemon, I agree with you.  Since most people seem to be able to make the
> Wednesday time, maybe we should finalize on that.
>
> David, do we have a consensus?
>
> T
>
> Lemon Obrien said:
> > I can come anytime...I think this would be neat; i don't know about you
> > guys; but i own my own company; coming out with a product soon. I know
> > david has iGlance...which is not in my space, but believe meeting other
> > like minded people who know "peer" is the next big thing...sorry my
> > friends; but the web is played out.
> >
> >   el
> >
> > David Barrett <dbarrett@quinthar.com> wrote:
> >   What day/time would you propose?
> >
> > Serguei Osokine wrote:
> >>> Maybe, Wednesday, 9pm?
> >>
> >> Sorry - I'm busy on Wednesday...
> >>
> >> -----Original Message-----
> >> From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> >> Behalf Of David Barrett
> >> Sent: Monday, November 28, 2005 1:40 PM
> >> To: Peer-to-peer development.
> >> Subject: [p2p-hackers] P2P in SFC
> >>
> >> So looks like there's a decent showing of P2P guys in San Francisco --
> >> six by my count. How about sushi and beer this week at, say Ryoko?
> >>
> >> http://tinyurl.com/bkk5d
> >>
> >> Maybe, Wednesday, 9pm? Any objections or affirmations?
> >>
> >> -david
> >
> > You don't get no juice unless you squeeze
> > Lemon Obrien, the Third.
>
> Travis Kalanick
> Red Swoosh, Inc.
> Founder, Chairman
> travis@redswoosh.net
> (v) 310.666.1429
> (f) 253.322.9478
> AIM: ScourTrav123

__________________________________________________
Do You Yahoo!?
http://cn.mail.yahoo.com/?id=77071

From gbildson at limepeer.com Thu Dec 1 16:36:14 2005
From: gbildson at limepeer.com (Greg Bildson)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p framework
In-Reply-To: <71b79fa90511301016q54c57883se119ee54ea01212c@mail.gmail.com>
Message-ID:

You could build off the limewire.org open source code (currently down) or
off of the gtk-gnutella or gnucleus source.  You would want to form a
separate network since current vendors would consider alternate uses of the
existing network as pollution.

If you envision services similar to those that already exist, or extensions
to the existing services, then this might make sense.  If you want something
as a basis for a new clean protocol, I might not recommend it since some
aspects of the protocol are a little ugly underneath the covers.  However,
extension mechanisms exist for most message types.

Thanks
-greg

> -----Original Message-----
> From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> Behalf Of Davide Carboni
> Sent: Wednesday, November 30, 2005 1:17 PM
> To: Peer-to-peer development.
> Subject: Re: [p2p-hackers] p2p framework
>
>
> On 11/29/05, Greg Bildson wrote:
> > Do you mean Gnutella's use as a framework or otherwise?
> >
>
> Yes I do. My question is: are there some implementation of gnutella
> that can be used to build upon new applications and to develop new
> services (beyond simple file sharing) ?
>
> D.
From unixsmaxer at hotmail.com Thu Dec 1 18:28:16 2005
From: unixsmaxer at hotmail.com (Salem Mark)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p framework
In-Reply-To:
Message-ID:

>From: "Greg Bildson"
>You could build off the limewire.org open source code (currently down) or
>off of the gtk-gnutella or gnucleus source.  You would want to form a
>separate network since current vendors would consider alternate uses of the
>existing network as pollution.

Could you please elaborate on forming a separate network under Gnutella?

I was thinking of using the Echomine-Muse gnutella API, which facilitates
sending custom messages in Gnutella, as a technique for Jabber Servers to
collaborate and achieve global service discovery.

Thanks.

- Salem

>If you envisioned similar services as exist or extensions to the existing
>services then this might make sense.  If you want something as a basis for
>a new clean protocol, I might not recommend it since some aspects of the
>protocol are a little ugly underneath the covers.  However, extension
>mechanisms do for most message types.
>
>Thanks
>-greg
>
> > -----Original Message-----
> > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> > Behalf Of Davide Carboni
> > Sent: Wednesday, November 30, 2005 1:17 PM
> > To: Peer-to-peer development.
> > Subject: Re: [p2p-hackers] p2p framework
> >
> > On 11/29/05, Greg Bildson wrote:
> > > Do you mean Gnutella's use as a framework or otherwise?
> > >
> > Yes I do. My question is: are there some implementation of gnutella
> > that can be used to build upon new applications and to develop new
> > services (beyond simple file sharing) ?
> >
> > D.

_________________________________________________________________
No masks required! Use MSN Messenger to chat with friends and family.
http://go.msnserver.com/HK/25382.asp

From unixsmaxer at hotmail.com Thu Dec 1 18:38:46 2005
From: unixsmaxer at hotmail.com (Salem Mark)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] DHTs in highly-transient networks
Message-ID:

Hello,

I have read in several papers that it is unlikely that the integrity of the
DHT can be maintained where there is a high node or link failure rate
without significant message transmission overhead.  In other words, it is
mentioned that, in "highly transient networks", where the rate of nodes
appearing and disappearing is very high, maintaining the DHT becomes hard
and introduces considerable overhead.

I am trying to find out what exactly "highly-transient" means.
A file sharing network like Gnutella seems to be highly transient, where
peers join/leave the network frequently.  Could somebody elaborate on this?
Is there a node departure/arrival/failure rate (per sec? per min?) that
identifies "highly-transient" networks?

Thanks

- Salem

_________________________________________________________________
FREE English Booklet! Improve your English.
http://www.linguaphonenet.com/BannerTrack.asp?EMSCode=MSN03-08ETFJ-0211E

From gbildson at limepeer.com Thu Dec 1 18:44:00 2005
From: gbildson at limepeer.com (Greg Bildson)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p framework
In-Reply-To:
Message-ID:

There is a standard connect string in Gnutella.  Something like "Gnutella
Connect" / "Gnutella OK".  Change that.  Set up your own Gwebcache or UHC
(UDP host cache) or include your own gnutella.net ip:ports file and you
should be able to bootstrap your own network.

I'm not aware of any mainstream users of the Echomine-Muse libraries.  They
may or may not work.  I expect that they are primitive compared to the
LimeWire and gtk-gnutella code.  However, they may work for your purposes.

Thanks
-greg
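[Illustration: a minimal sketch of the connect-string change described above,
written against a Gnutella 0.6-style three-step handshake.  The protocol
token "MYNET", the header values, and the bootstrap address are invented for
the example and do not belong to any real client or network.]

# private_handshake.py - sketch of isolating a private Gnutella-style network
# by changing the connect string, so that stock clients refuse to peer with it.
import socket

BOOTSTRAP_PEERS = [("peer1.example.org", 6346)]  # hypothetical gnutella.net-style seed list

def open_private_connection(host, port, timeout=10):
    sock = socket.create_connection((host, port), timeout=timeout)
    handshake = (
        "MYNET CONNECT/0.1\r\n"        # custom token instead of "GNUTELLA CONNECT/0.6"
        "User-Agent: MyNet/0.1\r\n"
        "X-Ultrapeer: False\r\n"
        "\r\n"
    )
    sock.sendall(handshake.encode("ascii"))
    reply = sock.recv(4096).decode("ascii", errors="replace")
    # Only peers speaking the private token are accepted; everything else is dropped.
    if not reply.startswith("MYNET/0.1 200"):
        sock.close()
        raise ConnectionError("peer did not speak the private protocol: %r" % reply[:40])
    sock.sendall(b"MYNET/0.1 200 OK\r\n\r\n")  # third step completes the handshake
    return sock

if __name__ == "__main__":
    for host, port in BOOTSTRAP_PEERS:
        try:
            conn = open_private_connection(host, port)
            print("connected to private network via", host)
            conn.close()
        except OSError as exc:
            print("could not join via", host, "-", exc)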
> -----Original Message-----
> From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> Behalf Of Salem Mark
> Sent: Thursday, December 01, 2005 1:28 PM
> To: p2p-hackers@zgp.org
> Subject: RE: [p2p-hackers] p2p framework
>
> >From: "Greg Bildson"
> >You could build off the limewire.org open source code (currently down) or
> >off of the gtk-gnutella or gnucleus source.  You would want to form a
> >separate network since current vendors would consider alternate uses of the
> >existing network as pollution.
>
> Could you please elaborate on forming a separate network under Gnutella?
>
> I was thinking of using the Echomine-Muse gnutella API, which facilitates
> sending custom messages in Gnutella, as a technique for Jabber Servers to
> collorabote and achieve global service discovery.
>
> Thanks.
>
> - Salem
>
> >If you envisioned similar services as exist or extensions to the existing
> >services then this might make sense.  If you want something as a basis for
> >a new clean protocol, I might not recommend it since some aspects of the
> >protocol are a little ugly underneath the covers.  However, extension
> >mechanisms do for most message types.
> >
> >Thanks
> >-greg
> >
> > > -----Original Message-----
> > > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> > > Behalf Of Davide Carboni
> > > Sent: Wednesday, November 30, 2005 1:17 PM
> > > To: Peer-to-peer development.
> > > Subject: Re: [p2p-hackers] p2p framework
> > >
> > > On 11/29/05, Greg Bildson wrote:
> > > > Do you mean Gnutella's use as a framework or otherwise?
> > > >
> > > Yes I do. My question is: are there some implementation of gnutella
> > > that can be used to build upon new applications and to develop new
> > > services (beyond simple file sharing) ?
> > >
> > > D.

From rrrw at neofonie.de Thu Dec 1 20:48:45 2005
From: rrrw at neofonie.de (Ronald Wertlen)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <20051130223956.C99CC3FEE8@capsicum.zgp.org>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org>
Message-ID: <438F61AD.80406@neofonie.de>

Hi Adam,

perhaps you have not understood my message because you have not noticed the
focus on "precision and recall" (i.e. search), not the old Distributed DB
vs. own DB debate.  You have also pigeon-holed my email with the DHT crowd
(*grin*), it couldn't be further from it!

I was arguing in the other direction - which coderman thankfully picked up.
Gnutella doesn't structure enough, that's all.  Sure Gnutella beats DHTs on
search - I base that observation on a project I finished last year - a
public prototype that used JXTA and was honed for search using super-peers
[DFN S2S http://s2s.neofonie.de/ (German site) - we've moved on some since
then ;) ].

Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows practically
anyone to elevate to super-peer, which results in a random (power-law
distribution) network.  Such a network is not going to perform very well as
far as recall and precision are concerned, past a certain point.  I would be
interested to calculate that exact point (but doubting I'll get to it some
time soon :-/).

HTH.

Best regards, Ron

PS. seems this thread has driven the original author to reformulate his
statement... :-)

PPS.
In fact, the network is not going to be completely random - it will follow
the contours of the internet (distribution of servers, broadband
connections, users, etc. is not random).  I am not sure if that destroys or
supports my argument.  Back to the drawing board!

We actually need a better internet.  [oops there I go getting unspecific
again, sorry!! ;-) ]
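[For reference, the standard information-retrieval definitions of the two
terms used above - this is textbook material, not something defined in the
thread.  For a given query, let R be the set of relevant documents present
in the network and A the set of documents the search actually returns:

    precision = |R ∩ A| / |A|   (fraction of returned results that are relevant)
    recall    = |R ∩ A| / |R|   (fraction of the relevant documents that were found)

A search that visits only part of the network - a bounded random walk or a
TTL-limited super-peer query - typically gives up recall first.]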
> Message: 4
> Date: Wed, 30 Nov 2005 16:42:39 -0500
> From: Adam Fisk
> Subject: Re: [p2p-hackers] Re: scalability
> To: "Peer-to-peer development."
> Message-ID: <438E1CCF.4010907@speedymail.org>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> I don't understand your post.  When you say "critical", I assume you're
> talking about life and death situations?  Are you talking about anything
> specifically?  DHTs have failure rates.  Ad hoc and mesh networks can
> become useful in emergency situations where conventional infrastructures
> break down, but the centralized/p2p/structured/unstructured questions
> here are far from obvious.
>
> On the "obsessive science types" issue, this completely misses the
> point.  It's a very non "obsessive science type" statement.  There are
> strong reasons for using the massive indexing/random walk approach above
> DHTs -- reasons that have nothing to do with scalability.  In
> particulary, DHTs are, well, hash tables.  Hash tables don't work well
> for metadata queries.  They do fine for keywords (hotspots are a
> problem, but they can be solved), but they aren't as nice a fit for
> metadata.  RDF and DHTs are tough to squeeze together, for example.  The
> massive indexing (mutual index caching to use Serguei's term)/random
> walk approach can get around these issues more easily.  They are also
> not nearly as brittle as DHTs.  Sure, DHTs repair themselves after node
> joins and leaves, but node transience generally has a much greater
> effect on DHTs than it does on massive indexing networks.
>
> I also think you're underestimating the efficiency of massive indexing
> and random walks.  Sure, these networks don't scale logarithmically, but
> they do pretty darn well.
>
> I encourage everyone to stay specific with their posts.
>
> All the Best,
>
> Adam

From agthorr at cs.uoregon.edu Thu Dec 1 20:52:16 2005
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <438F61AD.80406@neofonie.de>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org> <438F61AD.80406@neofonie.de>
Message-ID: <20051201205215.GF5300@cs.uoregon.edu>

On Thu, Dec 01, 2005 at 09:48:45PM +0100, Ronald Wertlen wrote:
> Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows
> practically anyone to elevate to super-peer, which results in a random
> (power-law distribtion) network.

Gnutella is not a power-law network.  See my paper on the graph properties
of Gnutella, presented at the Internet Measurement Conference earlier this
year:

http://www.usenix.org/events/imc05/tech/stutzbach.html

> Such a network is not going to perform very well as far as recall
> and precision are concerned, past a certain point.  I would be
> interested to calculate that exact point (but doubting I'll get to
> it some time soon :-/).

Could you rigorously define recall and precision for me?  I'm not sure what
you mean by these terms.

--
Daniel Stutzbach
Computer Science Ph.D Student
http://www.barsoom.org/~agthorr
University of Oregon

From afisk at speedymail.org Thu Dec 1 21:09:22 2005
From: afisk at speedymail.org (Adam Fisk)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <438F61AD.80406@neofonie.de>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org> <438F61AD.80406@neofonie.de>
Message-ID: <438F6682.6070806@speedymail.org>

Hi Ron-

Apologies for the DHT pigeon-holing.  I had this nagging feeling in my
stomach that you may come more from the land of small world and power law
networks, but I successfully suppressed it!  I agree with Daniel that
Gnutella's not actually a power law network, although I can't remember what
led me to decide that (several years ago now).  If I recall correctly, it's
that degrees between nodes are quite fixed and uniform.

How would you prefer superpeers get elected?
Superpeer election on Gnutella is fairly simple primarily because there's a scarcity of non-firewalled/NATted machines to fill their roles, so you have to sort of take what you can get. Are you referring more to which superpeers to *select* over the course of a search and not the original choice of superpeers? On the Gnutella 0.6/0.7 issue, that's really just the version of the specification for connection headers -- a frequent source of confusion. Gnutella has rightfully evolved into a family of protocols that themselves have version numbers -- everything from superpeers to dynamic querying to bloom filter exchange and mesh downloading. All of these evolve largely independently from one another, giving the protocol family much more flexibility and agility. All the Best, Adam Ronald Wertlen wrote: > Hi Adam, > > perhaps you have not understood my message because you have not > noticed the focus on "precision and recall" (i.e. search) not the old > Distributed DB vs. own DB debate. You have also pigeon-holed my email > with the DHT crowd (*grin*), it couldn't be further from it! > > I was arguing in the other direction - which coderman thankfully > picked up. Gnutella doesn't structure enough, that's all. Sure > Gnutella beats DHTs on search - I base that observation on a project I > finished last year - a public prototype that used JXTA and was honed > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > (German site) - we've moved on some since them ;) ]. > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > practically anyone to elevate to super-peer, which results in a random > (power-law distribtion) network. Such a network is not going to > perform very well as far as recall and precision are concerned, past a > certain point. I would be interested to calculate that exact point > (but doubting I'll get to it some time soon :-/). > > HTH. > > Best regards, Ron > > PS. seems this thread has driven the original author to reformulate > his statment... :-) > > PPS. > In fact, the network is not going to be completely random - it will > follow the contours of the internet (distribution of servers, > broadband connections, users, etc. is not random). I am not sure if > that destroys or supports my argument. Back to the drawing board! > > We actually need a better internet. [oops there I go getting > unspecific again, sorry!! ;-) ] > > >> Message: 4 >> Date: Wed, 30 Nov 2005 16:42:39 -0500 >> From: Adam Fisk >> Subject: Re: [p2p-hackers] Re: scalability To: "Peer-to-peer >> development." >> Message-ID: <438E1CCF.4010907@speedymail.org> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >> >> I don't understand your post. When you say "critical", I assume >> you're talking about life and death situations? Are you talking >> about anything specifically? DHTs have failure rates. Ad hoc and >> mesh networks can become useful in emergency situations where >> conventional infrastructures break down, but the >> centralized/p2p/structured/unstructured questions here are far from >> obvious. >> >> On the "obsessive science types" issue, this completely misses the >> point. It's a very non "obsessive science type" statement. There >> are strong reasons for using the massive indexing/random walk >> approach above DHTs -- reasons that have nothing to do with >> scalability. In particulary, DHTs are, well, hash tables. Hash >> tables don't work well for metadata queries. 
They do fine for >> keywords (hotspots are a problem, but they can be solved), but they >> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze >> together, for example. The massive indexing (mutual index caching to >> use Serguei's term)/random walk approach can get around these issues >> more easily. They are also not nearly as brittle as DHTs. Sure, >> DHTs repair themselves after node joins and leaves, but node >> transience generally has a much greater effect on DHTs than it does >> on massive indexing networks. >> >> I also think you're underestimating the efficiency of massive >> indexing and random walks. Sure, these networks don't scale >> logarithmically, but they do pretty darn well. >> I encourage everyone to stay specific with their posts. >> >> All the Best, >> >> Adam > > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From srhea at cs.berkeley.edu Thu Dec 1 21:11:02 2005 From: srhea at cs.berkeley.edu (Sean Rhea) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] DHTs in highly-transient networks In-Reply-To: References: Message-ID: <6F218011-BE6D-4C78-A86E-EFF47546C18D@cs.berkeley.edu> On Dec 1, 2005, at 1:38 PM, Salem Mark wrote: > I have read in several papers that it is unlikely that the > integrity of the DHT can be maintained where there is a high node > or link failure rate without significant message transmission > overhead. In other words, it is mentioned that, in "highly > transient networks", where the number of nodes appearing and > disappearing are very high, maintaining the DHT becomes hard and > introduces considerable overhead. > > I am trying to find out what exactly "highly-transient" means. A > file sharing network like Gnutella, seems to be highly transient, > where peers join/leave the network frequently. Could somebody > elaborate on this? is there a node departure/arrival/failure rate > (per sec? per min?) that identifies "highly-transient" networks ? > In the Bamboo USENIX paper, we talked about the average time a node was connected to the network before disconnecting. Bamboo and Chord are definitely resilient (at a routing level) even when that period is a short as a few minutes: http://srhea.net/papers/bamboo-usenix.pdf Other DHTs may be this resilient as well, but I don't have data for them. Sean -- There is no end to the fragility of our democracy. -- Ralph Nader -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051201/f244438a/PGP.pgp From agthorr at cs.uoregon.edu Thu Dec 1 21:15:12 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <438F6682.6070806@speedymail.org> References: <20051130223956.C99CC3FEE8@capsicum.zgp.org> <438F61AD.80406@neofonie.de> <438F6682.6070806@speedymail.org> Message-ID: <20051201211511.GH5300@cs.uoregon.edu> On Thu, Dec 01, 2005 at 04:09:22PM -0500, Adam Fisk wrote: > On the Gnutella 0.6/0.7 issue, that's really just the version of the > specification for connection headers -- a frequent source of confusion. 
> Gnutella has rightfully evolved into a family of protocols that
> themselves have version numbers -- everything from superpeers to dynamic
> querying to bloom filter exchange and mesh downloading.  All of these
> evolve largely independently from one another, giving the protocol
> family much more flexibility and agility.

I suggest adding text similar to this to the GDF Wiki main page (which
apparently cannot be edited by normal wiki users), and changing
"RFC-Gnutella 0.6" to "Gnutella Protocol Family" or the like.

--
Daniel Stutzbach
Computer Science Ph.D Student
http://www.barsoom.org/~agthorr
University of Oregon

From m.rogers at cs.ucl.ac.uk Thu Dec 1 22:53:24 2005
From: m.rogers at cs.ucl.ac.uk (Michael Rogers)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] DHTs in highly-transient networks
In-Reply-To: <6F218011-BE6D-4C78-A86E-EFF47546C18D@cs.berkeley.edu>
References: <6F218011-BE6D-4C78-A86E-EFF47546C18D@cs.berkeley.edu>
Message-ID: <438F7EE4.9030209@cs.ucl.ac.uk>

Sean Rhea wrote:
> In the Bamboo USENIX paper, we talked about the average time a node was
> connected to the network before disconnecting.  Bamboo and Chord are
> definitely resilient (at a routing level) even when that period is a
> short as a few minutes:

To what extent does this depend on the distribution of session times as
well as the mean?  Kademlia assumes that old nodes will outlive new nodes,
and Daniel's paper shows that Gnutella contains an emergent core of
long-lived nodes - how well do Bamboo and Chord survive under non-uniform
churn?

Cheers,
Michael

From srhea at cs.berkeley.edu Thu Dec 1 23:01:51 2005
From: srhea at cs.berkeley.edu (Sean Rhea)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] DHTs in highly-transient networks
In-Reply-To: <438F7EE4.9030209@cs.ucl.ac.uk>
References: <6F218011-BE6D-4C78-A86E-EFF47546C18D@cs.berkeley.edu> <438F7EE4.9030209@cs.ucl.ac.uk>
Message-ID:

On Dec 1, 2005, at 5:53 PM, Michael Rogers wrote:
> To what extent does this depend on the distribution of session
> times as well as the mean?  Kademlia assumes that old nodes will
> outlive new nodes, and Daniel's paper shows that Gnutella contains
> an emergent core of long-lived nodes - how well do Bamboo and Chord
> survive under non-uniform churn?

We used exponentially-distributed node lifetimes, so old nodes do not
generally outlive new ones.  However, I _think_ that choice only makes the
problem harder.  In particular, I would suspect that Bamboo/Chord would do
just as well if old nodes lived longer than new ones, and possibly better.
They won't take advantage of it like Kademlia does, but it shouldn't hurt
them either.  (At least that's my guess; I don't have data to prove it.)

Sean
--
When I see the price that you pay / I don't wanna grow up -- Tom Waits
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 186 bytes
Desc: This is a digitally signed message part
Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051201/326f82d2/PGP.pgp
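[A toy illustration of the distribution-vs-mean question raised above; the
numbers (5-minute mean session time, 30-second maintenance interval) and the
choice of a Pareto tail are invented for the example and are not taken from
the Bamboo paper.]

# churn_toy.py - hold the mean session time fixed and compare how many
# sessions end within one routing-table maintenance interval under
# exponential vs. heavy-tailed (Pareto) lifetimes.
import random

MEAN_SESSION = 300.0   # seconds; hypothetical mean node lifetime
MAINTENANCE = 30.0     # seconds between neighbor liveness checks
N = 100000

def exp_lifetimes(mean, n=N):
    return [random.expovariate(1.0 / mean) for _ in range(n)]

def pareto_lifetimes(mean, alpha=1.5, n=N):
    # Pareto with shape alpha and scale xm has mean xm * alpha / (alpha - 1).
    xm = mean * (alpha - 1) / alpha
    return [xm * random.paretovariate(alpha) for _ in range(n)]

def frac_shorter_than(lifetimes, t):
    return sum(1 for x in lifetimes if x < t) / len(lifetimes)

if __name__ == "__main__":
    random.seed(1)
    for name, sample in [("exponential", exp_lifetimes(MEAN_SESSION)),
                         ("pareto(1.5)", pareto_lifetimes(MEAN_SESSION))]:
        mean = sum(sample) / len(sample)
        short = 100.0 * frac_shorter_than(sample, MAINTENANCE)
        print("%-12s  mean=%6.1fs  ended within one maintenance period: %5.1f%%"
              % (name, mean, short))
    # Same mean, very different short-session behaviour: the failure rate a
    # routing table sees between maintenance rounds depends on the shape of
    # the session-time distribution, not just on its mean.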
From john.casey at gmail.com Fri Dec 2 00:07:56 2005
From: john.casey at gmail.com (John Casey)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability (was: p2p framework)
In-Reply-To: <438E067B.2040408@neofonie.de>
References: <20051130095529.6CAE83FEB8@capsicum.zgp.org> <438E067B.2040408@neofonie.de>
Message-ID:

On 12/1/05, Ronald Wertlen wrote:
> Hi,
>
> Gnutella-bashing certainly may be fun, the truth is, it is tremendously
> well-adapted for its purpose (I think Serguei's said the relevant stuff).
>
> However, I also believe it is pretty clear that from a search point of
> view, a random super-peer based network does not scale - it is never
> going to get the kind of precision and recall that we would call
> intelligent. It would be too slow or too inaccurate.

But if you index everything in some sort of distributed inverted index on
top of a DHT, a lot of document postings and related metadata still have to
be exported to the network, which isn't such a great solution either.  The
worst thing is that semantically close terms and documents are going to be
scattered to random, remote locations in the network for indexing.

Personally, what I think is needed here is a slightly coarser indexing
structure, so that instead of publishing 1000s of term->document pointers,
or at the other extreme a few term->peer pointers as with PlanetP, there is
some sort of middle ground such as term->cluster-id which is better able to
direct a search to sensible peers.  The difficulty with this approach, of
course, is that it isn't that easy to construct sensible global clusters
from local cluster definitions, as different local document databases will
index different terms and the like.
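[A minimal sketch of the term->cluster-id middle ground described above.
Everything here is hypothetical: the clusters are whatever a peer's local
clustering step produces, and a plain dict stands in for the DHT or overlay
that would actually hold the shared index.]

# coarse_index.py - publish one posting per (term, cluster) instead of one
# per (term, document); queries resolve to (peer, cluster) targets that then
# run the real search locally.
from collections import defaultdict

dht = defaultdict(set)  # term -> set of (peer_id, cluster_id); stand-in for the overlay

def publish_local_index(peer_id, clusters):
    """clusters: {cluster_id: {doc_id: set_of_terms}} from any local clustering step."""
    for cluster_id, docs in clusters.items():
        cluster_terms = set().union(*docs.values())
        for term in cluster_terms:
            dht[term].add((peer_id, cluster_id))   # one posting per term and cluster

def route_query(terms):
    """Return candidate (peer_id, cluster_id) pairs matching every query term."""
    postings = [dht.get(t, set()) for t in terms]
    if not postings or not all(postings):
        return set()
    return set.intersection(*postings)

if __name__ == "__main__":
    publish_local_index("peerA", {
        "music":  {"doc1": {"jazz", "miles", "davis"}, "doc2": {"jazz", "coltrane"}},
        "papers": {"doc3": {"dht", "churn", "routing"}},
    })
    publish_local_index("peerB", {
        "papers": {"doc9": {"dht", "gnutella", "routing"}},
    })
    # Both peers' "papers" clusters match; the query never sees individual documents.
    print(route_query({"dht", "routing"}))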
From baoguai2000 at gmail.com Fri Dec 2 03:06:21 2005
From: baoguai2000 at gmail.com (zheng j)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] help:is anyone interested in living streaming or vod over p2p?
Message-ID:

Hi, I am now doing research on living streaming and Vod over p2p, but
I don't know who can I discuss my idea with, you know, without idea
exchange, I feel very confused and annoyed. Who can tell me which
website I can find someone interested in it? And, if you are
interested in it, please contact me.

From joaquin.keller at francetelecom.com Fri Dec 2 04:05:36 2005
From: joaquin.keller at francetelecom.com (KELLER Joaquin RD-MAPS-ISS)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] help:is anyone interested in living streaming or vod over p2p?
In-Reply-To:
References:
Message-ID:

Hi Zheng,

We are working on that live streaming (not on VoD):
http://pulse.netofpeers.net/

-- Joaquin

On 12/1/05, zheng j wrote:
>
> Hi, I am now doing research on living streaming and Vod over p2p, but
> I don't know who can I discuss my idea with, you know, without idea
> exchange, I feel very confused and annoyed. Who can tell me which
> website I can find someone interested in it? And, if you are
> interested in it, please contact me.
>
--
___________________________________________________________
Joaquin Keller
MAPS/MMC - France Telecom - Division R&D
38-40, rue du General Leclerc
92794 Issy Moulineaux Cedex 9
Tel: +33 (0)1 45 29 52 86
Fax: +33 (0)1 45 29 52 94
joaquin.keller@rd.francetelecom.com
http://solipsis.netofpeers.net/

From redist-p2p-hackers at lothar.com Fri Dec 2 07:54:31 2005
From: redist-p2p-hackers at lothar.com (Brian Warner)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: P2P in SFC
Message-ID: <20051201.235431.65614928.warner@lothar.com>

> Regardless, let's wait for the final guest list before deciding if we
> switch locales.

I'll be there too.  Thanks for setting this up!

 -Brian

From aloeser at cs.tu-berlin.de Fri Dec 2 09:28:49 2005
From: aloeser at cs.tu-berlin.de (Alexander Löser)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <438F61AD.80406@neofonie.de>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org> <438F61AD.80406@neofonie.de>
Message-ID: <439013D1.7020803@cs.tu-berlin.de>

Hi Adam,

originally there was a certain type of clustering in the beginnings of
Gnutella (late 90ies).  People communicated IDs by word of mouth or via
email or deja news, so in most cases you got IDs from people who had at
least similar interests, or from people where you expected some interesting
files.  Later, due to the overwhelming attractiveness of the gnutella
application they introduced the gtk and other bootstrapping alternatives,
giving you a number of starting pointers.  However, these starting points
are chosen 'randomly', so there is no longer any clustering by interests.

We (Berlin and Karlsruhe) developed a new protocol (INGA, Interest-based
Node Grouping Algorithm [1][2]) that reclusters the network based on the
interests of the peers, without any DHT, using only an unstructured
network.  Similar to freenet, the network topology evolves over a while
into a so-called small-world topology, where people with similar interests
are clustered together.  In addition, to further speed up the clustering
process, peers also keep other peers that are 'HUBs' in the network (e.g.
having a high in and out degree) in a local index structure.  Our
experiments show that we significantly outperform Gnutella-style approaches
in message overhead even in highly volatile networks.

Best's Alex

[1] Searching Dynamic Communities with Personal Indexes.  Löser, Tempich
et al.  3rd International Semantic Web Conference, Galway.  Springer 2005.
http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf
[2] Remindin': Semantic query routing in peer-to-peer networks based on
social metaphors.  Tempich et al.  WWW 2004, New York.  ACM 2004.
http://www.aifb.uni-karlsruhe.de/Publikationen/showPublikation?publ_id=447

Ronald Wertlen wrote:
> Hi Adam,
>
> perhaps you have not understood my message because you have not
> noticed the focus on "precision and recall" (i.e. search) not the old
> Distributed DB vs. own DB debate. You have also pigeon-holed my email
> with the DHT crowd (*grin*), it couldn't be further from it!
>
> I was arguing in the other direction - which coderman thankfully
> picked up. Gnutella doesn't structure enough, that's all.
Sure > Gnutella beats DHTs on search - I base that observation on a project I > finished last year - a public prototype that used JXTA and was honed > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > (German site) - we've moved on some since them ;) ]. > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > practically anyone to elevate to super-peer, which results in a random > (power-law distribtion) network. Such a network is not going to > perform very well as far as recall and precision are concerned, past a > certain point. I would be interested to calculate that exact point > (but doubting I'll get to it some time soon :-/). > > HTH. > > Best regards, Ron > > PS. seems this thread has driven the original author to reformulate > his statment... :-) > > PPS. > In fact, the network is not going to be completely random - it will > follow the contours of the internet (distribution of servers, > broadband connections, users, etc. is not random). I am not sure if > that destroys or supports my argument. Back to the drawing board! > > We actually need a better internet. [oops there I go getting > unspecific again, sorry!! ;-) ] > > >> Message: 4 >> Date: Wed, 30 Nov 2005 16:42:39 -0500 >> From: Adam Fisk >> Subject: Re: [p2p-hackers] Re: scalability To: "Peer-to-peer >> development." >> Message-ID: <438E1CCF.4010907@speedymail.org> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >> >> I don't understand your post. When you say "critical", I assume >> you're talking about life and death situations? Are you talking >> about anything specifically? DHTs have failure rates. Ad hoc and >> mesh networks can become useful in emergency situations where >> conventional infrastructures break down, but the >> centralized/p2p/structured/unstructured questions here are far from >> obvious. >> >> On the "obsessive science types" issue, this completely misses the >> point. It's a very non "obsessive science type" statement. There >> are strong reasons for using the massive indexing/random walk >> approach above DHTs -- reasons that have nothing to do with >> scalability. In particulary, DHTs are, well, hash tables. Hash >> tables don't work well for metadata queries. They do fine for >> keywords (hotspots are a problem, but they can be solved), but they >> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze >> together, for example. The massive indexing (mutual index caching to >> use Serguei's term)/random walk approach can get around these issues >> more easily. They are also not nearly as brittle as DHTs. Sure, >> DHTs repair themselves after node joins and leaves, but node >> transience generally has a much greater effect on DHTs than it does >> on massive indexing networks. >> >> I also think you're underestimating the efficiency of massive >> indexing and random walks. Sure, these networks don't scale >> logarithmically, but they do pretty darn well. >> I encourage everyone to stay specific with their posts. >> >> All the Best, >> >> Adam > > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > -- ___________________________________________________________ Dr. Alexander L?ser, Technische Universit?t Berlin, CIS, Sekr. 
EN 7, Einsteinufer 17, 10587 Berlin, GERMANY
office: +49-30-314-25556   fax: +49-30-314-21601
web: http://cis.cs.tu-berlin.de/~aloeser/
___________________________________________________________

From gwendal.simon at francetelecom.com Fri Dec 2 09:38:14 2005
From: gwendal.simon at francetelecom.com (SIMON Gwendal RD-MAPS-ISS)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
Message-ID:

Hi Alexander,

This work is close to the one we are doing for Maay [1].  As we have just
begun to implement it, it would be great if you could participate in the
early protocol discussion on the mailing-list.

The current Maay implementation [2] is very open.  We are developing a basic
indexer that communicates through XML-RPC with the "Maay node".  The "Maay
node" manages communication and the SQL database.  It can be controlled
through a web interface.

Have fun !

-- Gwendal

[1]: MAAY: a decentralized personalized search system, F. Dang Ngoc,
J. Keller, G. Simon.  SAINT'2006.
http://maay.netofpeers.net/documentation/maay_SAINT2006.pdf
[2]: http://maay.netofpeers.net

> -----Original Message-----
> From: p2p-hackers-bounces@zgp.org
> [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Alexander Löser
> Sent: Friday, December 2, 2005 10:29
> To: Peer-to-peer development.
> Subject: Re: [p2p-hackers] Re: scalability
>
> Hi Adam,
> originally there was a certain type of clustering in the beginnings of
> Gnutella (late 90ies) . People communicate its ids mouth to mouth or via
> Email or deja news to other people. So in most cases you got Ids from
> people which had at least similar interests, or from people where you
> expected some interesting files. Later, due to the overwhelming
> attractiveness of the gnutella application they introduced the gtk and
> other bootstrapping alternatives, given you a number of starting
> pointers. However, this starting points a chosen 'randomly', so there is
> no longer any clustering by interests.
>
> We (Berlin and Karlsruhe) developed a new protocol (INGA Interest based
> Node Grouping Algorithm [1][2]) , that reclusters the network based on
> the interests of the peers, without any DHT, only using on an
> unstructured network. Similar to freenet, the network topology evolves
> over a while to a so called small world topology, where people with
> similar interests are clustered together. In addition, to further speed
> up the clustering process, peers also keep in a local index structures
> other peers, that are 'HUBs' in the network, e.g. having a high in and
> out degree. Our experiments show, that we significantly outperform
> Gnutella style approaches in messages even in highly volatile networks.
>
> Best's Alex
>
> [1] Searching Dynamic Communities with Personal Indexes. Löser, Tempich
> et.al 3rd. International Semantic Web Conference, Galway. Springer 2005
> http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf
> [2] Remindin': Semantic query routing in peer-to-peer networks based on
> social metaphors. Tempich et.al. WWW 2004, New York. ACM 2004
> http://www.aifb.uni-karlsruhe.de/Publikationen/showPublikation?publ_id=447
>
> Ronald Wertlen wrote:
> > Hi Adam,
> >
> > perhaps you have not understood my message because you have not
> > noticed the focus on "precision and recall" (i.e. search) not the old
> > Distributed DB vs. own DB debate. You have also pigeon-holed my email
> > with the DHT crowd (*grin*), it couldn't be further from it!
> > > > I was arguing in the other direction - which coderman thankfully > > picked up. Gnutella doesn't structure enough, that's all. Sure > > Gnutella beats DHTs on search - I base that observation on > a project I > > finished last year - a public prototype that used JXTA and > was honed > > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > > (German site) - we've moved on some since them ;) ]. > > > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > > practically anyone to elevate to super-peer, which results > in a random > > (power-law distribtion) network. Such a network is not going to > > perform very well as far as recall and precision are > concerned, past a > > certain point. I would be interested to calculate that exact point > > (but doubting I'll get to it some time soon :-/). > > > > HTH. > > > > Best regards, Ron > > > > PS. seems this thread has driven the original author to reformulate > > his statment... :-) > > > > PPS. > > In fact, the network is not going to be completely random - it will > > follow the contours of the internet (distribution of servers, > > broadband connections, users, etc. is not random). I am not sure if > > that destroys or supports my argument. Back to the drawing board! > > > > We actually need a better internet. [oops there I go getting > > unspecific again, sorry!! ;-) ] > > > > > >> Message: 4 > >> Date: Wed, 30 Nov 2005 16:42:39 -0500 > >> From: Adam Fisk > >> Subject: Re: [p2p-hackers] Re: scalability To: "Peer-to-peer > >> development." > >> Message-ID: <438E1CCF.4010907@speedymail.org> > >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed > >> > >> I don't understand your post. When you say "critical", I assume > >> you're talking about life and death situations? Are you talking > >> about anything specifically? DHTs have failure rates. Ad hoc and > >> mesh networks can become useful in emergency situations where > >> conventional infrastructures break down, but the > >> centralized/p2p/structured/unstructured questions here are > far from > >> obvious. > >> > >> On the "obsessive science types" issue, this completely misses the > >> point. It's a very non "obsessive science type" statement. There > >> are strong reasons for using the massive indexing/random walk > >> approach above DHTs -- reasons that have nothing to do with > >> scalability. In particulary, DHTs are, well, hash tables. Hash > >> tables don't work well for metadata queries. They do fine for > >> keywords (hotspots are a problem, but they can be solved), > but they > >> aren't as nice a fit for metadata. RDF and DHTs are tough > to squeeze > >> together, for example. The massive indexing (mutual index > caching to > >> use Serguei's term)/random walk approach can get around > these issues > >> more easily. They are also not nearly as brittle as DHTs. Sure, > >> DHTs repair themselves after node joins and leaves, but node > >> transience generally has a much greater effect on DHTs > than it does > >> on massive indexing networks. > >> > >> I also think you're underestimating the efficiency of massive > >> indexing and random walks. Sure, these networks don't scale > >> logarithmically, but they do pretty darn well. > >> I encourage everyone to stay specific with their posts. 
> >> > >> All the Best, > >> > >> Adam > > > > > > > > _______________________________________________ > > p2p-hackers mailing list > > p2p-hackers@zgp.org > > http://zgp.org/mailman/listinfo/p2p-hackers > > _______________________________________________ > > Here is a web page listing P2P Conferences: > > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > > > > -- > ___________________________________________________________ > > Dr. Alexander L?ser, > Technische Universit?t Berlin, > CIS, Sekr. EN 7, Einsteinufer 17, 10587 Berlin, GERMANY > office: +49- 30-314-25556 fax: +49- 30-314-21601 > web: http://cis.cs.tu-berlin.de/~aloeser/ > ___________________________________________________________ > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From ian.clarke at gmail.com Fri Dec 2 12:07:32 2005 From: ian.clarke at gmail.com (Ian Clarke) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <20051129140314.046DD698@yumyum.zooko.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> Message-ID: <823242bd0512020407i252b84c4u@mail.gmail.com> On 29/11/05, zooko@zooko.com wrote: > However, the media seems to have started using the word "Darknet" to mean a > friend-to-friend net and/or a blacknet [7, 8], thus simultaneously making it > harder for people to think about blacknets which are based on other than > friend-to-friend architectures and making it harder for people to think about > friend-to-friend networks which are used for other than illegal information > sharing. > > I place some of the blame for this development on the Freenet folks, who may be > the first to promulgate this munging, and if they aren't the first they're > certainly the most effective. As Michael Rogers pointed out, I am not sure this is as clear-cut as you suggest, the goal for Freenet 0.7 is very close to the idea outlined in the caption for Fig. 3 of the Microsoft Darknet paper, which is a friend-to-friend network. That paper may be the first common usage of the term "darknet", but so far as I can see, it contains no concise definition of what a "darknet" is. I would therefore say that there is no authorative basis on which to invalidate any particular definition of the term that is broadly within the area of P2P networks which conceal user activity. As such, defining the term "darknet" as a f2f network that is designed to conceal the activities of its participants (this being, so far as I have seen, one of the main motivations for building an f2f network), is as valid a definition as any other I have seen (and more useful than most). As a side-point, I think it is somewhat pejorative to say that any technology is "designed" for illegal usage, just because it conceals user activity and therefore may be capable of illegal usage. There are many legal reasons why people might wish to preserve their anonymity and privacy. Ian. 
From adam at cypherspace.org Fri Dec 2 13:35:16 2005 From: adam at cypherspace.org (Adam Back) Date: Sat Dec 9 22:13:05 2006 Subject: idealized content network properties (Re: [p2p-hackers] darknet) In-Reply-To: <823242bd0512020407i252b84c4u@mail.gmail.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> Message-ID: <20051202133516.GA15480@bitchcake.off.net> I think an ideal www2 network should: 1. have any content searchable by anyone (the contents are public) 2. make it hard to determine who the author of content is 3. make it hard for people other than the author to remove content 4. make it hard for people to observe what other people are downloading 5. make it hard for anyone to change content (new version and navigating by version should be the way to "change") It seems to me that this network can provide any of these subset classifications trivially. removing 1 makes a eg "friend-to-friend" network -- that just means you encrypt the searchable tags and content with a shared key. removing 2 you just sign the content. and so forth. (Making it hard for people other than the author to remove content technically probably involves things like redundancy, transience of service, opaque content to its current server location, indirection etc) (The author also should be able to arrange that he himself can't remove the content, by intentionally discarding whatever keys give him the technical means to remove or change the content). > As a side-point, I think it is somewhat pejorative to say that any > technology is "designed" for illegal usage, just because it conceals > user activity and therefore may be capable of illegal usage. There > are many legal reasons why people might wish to preserve their > anonymity and privacy. Yeah. I think my feature set at the top should be the default/base set of properties exhibited by the www2 (next gen web). Any voluntary restrictions on these should be entered into by policy. Say content X is illegal in jurisdiction Y, then Y should publish a blacklist identifying content X and the legal system in jurisdiction Y should if it chooses make it illegal to not consult the blacklist. I mean illegality is not even consistent, there are things which are legally required in Y that are illegal in Z. There is and can be no globally acceptable policy, so we must robustly technologically prevent global enforcement. Adam From m.rogers at cs.ucl.ac.uk Fri Dec 2 14:19:29 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:05 2006 Subject: idealized content network properties (Re: [p2p-hackers] darknet) In-Reply-To: <20051202133516.GA15480@bitchcake.off.net> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202133516.GA15480@bitchcake.off.net> Message-ID: <439057F1.2060208@cs.ucl.ac.uk> Adam Back wrote: > removing 1 makes a eg "friend-to-friend" network -- that just means > you encrypt the searchable tags and content with a shared key. Not sure about this one - I think the use of group keys is orthogonal to the use of a friend-to-friend topology. For example Groove uses group keys without f2f, Freenet 0.7 will use f2f without group keys, and WASTE uses neither (but still fits under the "darknet" umbrella because it's invitation-only). 
Cheers, Michael From zooko at zooko.com Fri Dec 2 15:45:57 2005 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <823242bd0512020407i252b84c4u@mail.gmail.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> Message-ID: <20051202154557.E559F191C@yumyum.zooko.com> Ian, p2p-hackers: It's not my goal to quibble about etymology (except inasmuch as it is useful to preserve the historical record). My goals are: 1. Avoid ambiguity -- where some people think that word X denotes concept 1, and others think that word X denotes concept 2. Especially if concepts 1 and 2 are related but not identical. Especially if one of them is politically incendiary. 2. Make sure we have names for our useful concepts. However, before I get to that I am going to go through the history one last time in order to cast light on the current problem. I turned up some interesting details. Let's start with a Venn diagram: _______ _______ / \ / \ / \ / \ / \/ \ / /\ \ / / \ \ | | | | | 1 |1^2 | 2 | | | | | | | | | \ \ / / \ \/ / \ /\ / \ / \ / \_______/ \_______/ Let 1 be the set of networks which are used for illegal transmission of information, and 2 be the set of networks which are built on f2f connections, and 1^2 be the intersection -- the set of networks which are used for illegal transmission of information and which are built on f2f connections. [bepw2002] introduces "darknet" to mean concept 1. In their words darknet is "a collection of networks and technologies used to share digital content", and they use it consistently within that meaning. They refer to concept 2, starting in section 2.1, using the term "small-world nets", and they clearly distinguish between what they call "small-world darknets" and "non-small-world darknets". However nowadays some people in the mass media seem to think that a "darknet" means primarily a network which is "invitation-only", i.e. a "small-world" or "f2f" net [globe]. When did the meaning shift? Ooh -- how interesting to examine the evolution of this word on [wikipedia]! The original definition on wikipedia was written on 2004-09-30. It read in full: "Darknet is a broad term to denote the networks and technologies that enable users to copy and share digital material. The term was coined in a paper from four Microsoft Research authors.". The next change was that two months later someone redirected the "Darknet" page to just be a link to the "Filesharing page", with the comment "Just another word for filesharing". The next change was that on 2005-04-14 someone from IP 81.178.83.245 wrote a definition beginning with this sentence: "A Darknet is a private file sharing network where users only connect to people they trust.". By the way, I should point out that I have a personal interest in this history because between 2001 and 2003 I tried to promulgate concept 2, using Lucas Gonze's coinage: "friendnet" [zooko2001, zooko2002, zooko2003, gonze2002]. I would like to know for my own satisfaction if my ideas were a direct inspiration for some of this modern stuff, such as the Freenet v0.7 design. So much for etymology. Now the problem is that in the current parlance of the media, the word "darknet" is used to mean vaguely 1 or 2 or 1^2. 
The reason that this is a problem isn't that it breaks with some etymological tradition, but that it is ambiguous and that it deprives us of useful words to refer to 1 or 2 specifically. The ambiguity has nasty political consequences -- see for example these f2f network operators struggling to persuade newspaper readers that they are not primarily for illegal purposes: [globe]. My proposal to rectify the lack-of-words problem is to use "blacknet" to refer to 1 specifically and "f2f net" to refer to 2 specifically. I don't know if there is any way to rectify the ambiguity problem. Ian wrote: > > ... > defining the term "darknet" as a f2f network that is designed > to conceal the activities of its participants (this being, so far as I > have seen, one of the main motivations for building an f2f network), So you think of "darknet" as meaning 1^2. That's an interesting remark -- that you regard concealment as one of the main motivations. I personally regard concealment as one of the lesser motivations -- I'm more interested in attack resistance (resisting attacks such as subversion or denial-of-service, rather than attacks such as surveillance), scalability, and other properties. Although I'm interested in the concealment properties as well. Regards, Zooko P.S. Here's some obligatory link juice for Gonze's latest sly neologism: lightnet! [bepw2002] "The darknet and the future of content distribution" Biddle, England, Peinado, Willman (Microsoft Corporation) http://crypto.stanford.edu/DRM2002/darknet5.doc http://www.dklevine.com/archive/darknet.pdf (The .doc version crashes my OpenOffice.org app when I try to read it. Does this mean something? The .pdf version has screwed up images when I view it in evince.) [wikipedia] http://en.wikipedia.org/wiki/Darknet [zooko2001] "Attack Resistant Sharing of Metadata" Zooko and Raph Levien presentation, First O'Reilly Peer-to-Peer conference, 2001 http://conferences.oreillynet.com/cs/p2p2001/view/e_sess/1200 [zooko2002] http://zooko.com/log-2002-12.html#d2002-12-14-the_human_context_and_the_future_of_Mnet [zooko2003] http://www.zooko.com/log-2003-01.html#d2003-01-23-trust_is_just_another_topology [gonze2002] http://www.oreillynet.com/pub/wlg/2428 [globe] "Darknets: The invitation-only Internet" globeandmail.com 2005-11-24 http://www.globetechnology.com/servlet/story/RTGAM.20051007.gtdarknetoct7/BNStory/Technology/ [lightnet] http://gonze.com/weblog/story/lightnet From m.rogers at cs.ucl.ac.uk Fri Dec 2 16:02:07 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <20051202154557.E559F191C@yumyum.zooko.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> Message-ID: <43906FFF.3070302@cs.ucl.ac.uk> zooko@zooko.com wrote: > However nowadays some people in the mass media seem to think that a "darknet" > means primarily a network which is "invitation-only", i.e. a "small-world" or > "f2f" net [globe]. Sorry to split an already frayed hair, but invitation-only isn't the same as f2f. Invitation-only implies that you must know some member of the network, whereas f2f implies that you must know the members you connect to. For example Groove and WASTE are invitation-only but not f2f. 
Cheers, Michael From mccoy at mad-scientist.com Fri Dec 2 17:32:53 2005 From: mccoy at mad-scientist.com (Jim McCoy) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: P2P in SFC In-Reply-To: <20051201.235431.65614928.warner@lothar.com> References: <20051201.235431.65614928.warner@lothar.com> Message-ID: <56D8091C-1D45-48C2-975C-5F6A1D47059B@mad-scientist.com> > Regardless, let's wait for the final guest list before deciding if we > switch locales. I will be there. Jim From zooko at zooko.com Fri Dec 2 17:20:47 2005 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet.pdf Message-ID: <20051202172047.64212339@yumyum.zooko.com> Thanks to anonymous contributor. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pdf Size: 246474 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051202/62b9c4af/attachment.pdf From coderman at gmail.com Fri Dec 2 18:08:32 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] help:is anyone interested in living streaming or vod over p2p? In-Reply-To: References: Message-ID: <4ef5fec60512021008v64987949xb6880691dd2fceec@mail.gmail.com> On 12/1/05, zheng j wrote: > Hi, I am now doing research on living streaming and Vod over p2p... wireless is a natural fit for p2p streaming / broadcast distribution From Serguei.Osokine at efi.com Fri Dec 2 18:23:24 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42751@fcexmb04.efi.internal> On Friday, December 02, 2005 Alexander L?ser wrote: > originally there was a certain type of clustering in the beginnings > of Gnutella (late 90ies) . People communicate its ids mouth to mouth > or via Email or deja news to other people. So in most cases you got > Ids from people which had at least similar interests, or from > people where you expected some interesting files. I'm sorry to contradict you, but I think this is all a myth. First, there was no Gnutella in late 90ies. It was released in March of 2000. Second, I remember looking at the connection stability just a few months later (June/July, maybe?), and the churn was quite high - a client tended to replace all its connections within an hour or so. Now if you remember how the connections were replaced, the client was trying the IPs that it received from PONGs, which were essentially the random network IPs, because the network was just a few thousand nodes and every client could see the pongs from pretty much everyone. So in an hour or so your initial connection point stopped being relevant and you found yourself at a random place in the network. After that, all your subsequent sessions used the IP list stored on disk by a previous session to connect to the network, and the address given to you by your friends was no longer important. To be precise, this latest part (about the IP list) was the behaviour of the Gnutella clients that I worked with (I think these were Gnutella v.056 and GNUT). Maybe there were some clients that required to enter an IP at every session start. I don't know. There was also a notion of locality based on the unusually good and stable connections - as soon as the two machines on my desktop would find each other on the network as a result of this random process, they would stay connected for quite a while (as long as I did not stop the clients). 
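As a rough, minimal sketch of that reconnection process -- not something from the thread itself; the 1,000-node size, four connections per node, and uniformly random replacement are assumptions picked purely for illustration -- one can watch how quickly the original word-of-mouth links get replaced:

import random

def churn_simulation(n=1000, degree=4, steps=5000, seed=1):
    """Start from a 'string' of nodes wired to their nearest neighbours,
    then repeatedly replace one end of a random connection with a
    uniformly random node (roughly what PONG-based reconnection did)."""
    random.seed(seed)
    neigh = {v: set() for v in range(n)}
    for v in range(n):
        for d in range(1, degree // 2 + 1):
            u = (v + d) % n
            neigh[v].add(u)
            neigh[u].add(v)
    original = {v: set(neigh[v]) for v in range(n)}
    for _ in range(steps):
        v = random.randrange(n)
        if not neigh[v]:
            continue
        old = random.choice(sorted(neigh[v]))
        new = random.randrange(n)
        if new == v:
            continue
        neigh[v].discard(old)
        neigh[old].discard(v)
        neigh[v].add(new)
        neigh[new].add(v)
    kept = sum(len(neigh[v] & original[v]) for v in range(n))
    total = sum(len(original[v]) for v in range(n))
    return kept / total

print("fraction of original links remaining:", churn_simulation())

Under these assumptions only a small fraction of the original links survives a few thousand replacements, which is the "found yourself at a random place in the network" effect described above.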
But even these considerations are not important, because the early Gnutella (until the meltdown of July 2000) was fully visible, and every query more or less reached every node (in the absence of the flow control, this is exactly what caused the meltdown - TTL was too high to limit the query propagation). Of course, some queries might have been missing some nodes, but generally there was no chance for any clustering - I simply cannot see how it could possibly exist in such a network. > We (Berlin and Karlsruhe) developed a new protocol (INGA Interest > based Node Grouping Algorithm [1][2]) , that reclusters the network > based on the interests of the peers, without any DHT, only using on > an unstructured network. Which is cool, and maybe it is a great protocol - as long as you won't justify its existence by myths. I'm sure there are plenty of legitimate reasons that make this protocol useful ;-) Best wishes - S.Osokine. 2 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Alexander L?ser Sent: Friday, December 02, 2005 1:29 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] Re: scalability Hi Adam, originally there was a certain type of clustering in the beginnings of Gnutella (late 90ies) . People communicate its ids mouth to mouth or via Email or deja news to other people. So in most cases you got Ids from people which had at least similar interests, or from people where you expected some interesting files. Later, due to the overwhelming attractiveness of the gnutella application they introduced the gtk and other bootstrapping alternatives, given you a number of starting pointers. However, this starting points a chosen 'randomly', so there is no longer any clustering by interests. We (Berlin and Karlsruhe) developed a new protocol (INGA Interest based Node Grouping Algorithm [1][2]) , that reclusters the network based on the interests of the peers, without any DHT, only using on an unstructured network. Similar to freenet, the network topology evolves over a while to a so called small world topology, where people with similar interests are clustered together. In addition, to further speed up the clustering process, peers also keep in a local index structures other peers, that are 'HUBs' in the network, e.g. having a high in and out degree. Our experiments show, that we significantly outperform Gnutella style approaches in messages even in highly volatile networks. Best's Alex [1] Searching Dynamic Communities with Personal Indexes. L?ser, Tempich et.al 3rd. International Semantic Web Conference, Galway. Springer 2005 http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf [2] Remindin': Semantic query routing in peer-to-peer networks based on social metaphors. Tempich et.al. WWW 2004, New York. ACM 2004 http://**www.aifb.uni-karlsruhe.de/ Publikationen/showPublikation?publ_id=447 Ronald Wertlen schrieb: > Hi Adam, > > perhaps you have not understood my message because you have not > noticed the focus on "precision and recall" (i.e. search) not the old > Distributed DB vs. own DB debate. You have also pigeon-holed my email > with the DHT crowd (*grin*), it couldn't be further from it! > > I was arguing in the other direction - which coderman thankfully > picked up. Gnutella doesn't structure enough, that's all. 
Sure > Gnutella beats DHTs on search - I base that observation on a project I > finished last year - a public prototype that used JXTA and was honed > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > (German site) - we've moved on some since them ;) ]. > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > practically anyone to elevate to super-peer, which results in a random > (power-law distribtion) network. Such a network is not going to > perform very well as far as recall and precision are concerned, past a > certain point. I would be interested to calculate that exact point > (but doubting I'll get to it some time soon :-/). > > HTH. > > Best regards, Ron > > PS. seems this thread has driven the original author to reformulate > his statment... :-) > > PPS. > In fact, the network is not going to be completely random - it will > follow the contours of the internet (distribution of servers, > broadband connections, users, etc. is not random). I am not sure if > that destroys or supports my argument. Back to the drawing board! > > We actually need a better internet. [oops there I go getting > unspecific again, sorry!! ;-) ] > > >> Message: 4 >> Date: Wed, 30 Nov 2005 16:42:39 -0500 >> From: Adam Fisk >> Subject: Re: [p2p-hackers] Re: scalability To: "Peer-to-peer >> development." >> Message-ID: <438E1CCF.4010907@speedymail.org> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >> >> I don't understand your post. When you say "critical", I assume >> you're talking about life and death situations? Are you talking >> about anything specifically? DHTs have failure rates. Ad hoc and >> mesh networks can become useful in emergency situations where >> conventional infrastructures break down, but the >> centralized/p2p/structured/unstructured questions here are far from >> obvious. >> >> On the "obsessive science types" issue, this completely misses the >> point. It's a very non "obsessive science type" statement. There >> are strong reasons for using the massive indexing/random walk >> approach above DHTs -- reasons that have nothing to do with >> scalability. In particulary, DHTs are, well, hash tables. Hash >> tables don't work well for metadata queries. They do fine for >> keywords (hotspots are a problem, but they can be solved), but they >> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze >> together, for example. The massive indexing (mutual index caching to >> use Serguei's term)/random walk approach can get around these issues >> more easily. They are also not nearly as brittle as DHTs. Sure, >> DHTs repair themselves after node joins and leaves, but node >> transience generally has a much greater effect on DHTs than it does >> on massive indexing networks. >> >> I also think you're underestimating the efficiency of massive >> indexing and random walks. Sure, these networks don't scale >> logarithmically, but they do pretty darn well. >> I encourage everyone to stay specific with their posts. >> >> All the Best, >> >> Adam > > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > -- ___________________________________________________________ Dr. Alexander L?ser, Technische Universit?t Berlin, CIS, Sekr. 
EN 7, Einsteinufer 17, 10587 Berlin, GERMANY office: +49- 30-314-25556 fax: +49- 30-314-21601 web: http://cis.cs.tu-berlin.de/~aloeser/ ___________________________________________________________ _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From coderman at gmail.com Fri Dec 2 18:30:03 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland (OR) Message-ID: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> ping From don at dhoffman.net Fri Dec 2 18:45:12 2005 From: don at dhoffman.net (Donald Hoffman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland (OR) In-Reply-To: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> Message-ID: <0C7F13F7-D31C-47EB-90D0-17289D97ECAF@dhoffman.net> Pong. Also (live) in Portland. (Actually in Montana right now. Anyone there?) Don On Dec 2, 2005, at 11:30 AM, coderman wrote: > ping > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From agthorr at cs.uoregon.edu Fri Dec 2 18:51:37 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland (OR) In-Reply-To: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> Message-ID: <20051202185136.GB2604@cs.uoregon.edu> On Fri, Dec 02, 2005 at 10:30:03AM -0800, coderman wrote: > ping I'm in Eugene. I'd be willing to drive up for a get-together if we have a big enough group to make it interesting. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From coderman at gmail.com Fri Dec 2 19:13:33 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland (OR) In-Reply-To: <20051202185136.GB2604@cs.uoregon.edu> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> Message-ID: <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> On 12/2/05, Daniel Stutzbach wrote: > I'm in Eugene. I'd be willing to drive up for a get-together if we > have a big enough group to make it interesting. i'd be happy to travel to eugene if more of the group is located there as well. weekends would be best in that case. From gbildson at limepeer.com Fri Dec 2 19:22:32 2005 From: gbildson at limepeer.com (Greg Bildson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42751@fcexmb04.efi.internal> Message-ID: The only locality that I can think of that may have occurred back in that early timeframe would be based on the stringiness of the network. I have a feeling that pre-centralized hostcache, the network was more of a long string with some clumps as it went along. So, its possible that the network diameter at its longest point was much larger than max-TTL. 
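To put rough numbers on the string-versus-TTL point, here is a back-of-the-envelope sketch; the TTL of 7 and the four connections per node are assumptions chosen for illustration, not measurements of the early network:

def reach_on_string(ttl):
    # a flood on a long chain reaches at most ttl hops in each direction
    return 2 * ttl + 1

def reach_on_tree(ttl, connections=4):
    # idealised loop-free network where every node keeps `connections`
    # links: the origin fans out to `connections` neighbours and each of
    # those forwards to (connections - 1) new nodes per further hop
    total, frontier = 1, connections
    for _ in range(ttl):
        total += frontier
        frontier *= connections - 1
    return total

print(reach_on_string(7))   # 15 nodes
print(reach_on_tree(7))     # a few thousand nodes

The same TTL that would comfortably cover a well-mixed network of a few thousand nodes reaches only a couple of dozen nodes on a chain, so if the early network really was string-shaped, most of it would indeed have been beyond query range.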
Then, the introduction of centralized hostcaches helped create a massive cluster and exacerbated the early modem bandwidth barrier. This appeared to be what Gene Kan thought I believe. Its was only months later with the introduction of clients with keepalive pings and flow control that the clogged spots got freed up. If ToadNode was correct in that they had millions of downloads in those early days then thats the only way that I could see the modem bandwidth barrier not getting hit very quickly. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Serguei Osokine > Sent: Friday, December 02, 2005 1:23 PM > To: Peer-to-peer development. > Subject: RE: [p2p-hackers] Re: scalability > > > On Friday, December 02, 2005 Alexander L?ser wrote: > > originally there was a certain type of clustering in the beginnings > > of Gnutella (late 90ies) . People communicate its ids mouth to mouth > > or via Email or deja news to other people. So in most cases you got > > Ids from people which had at least similar interests, or from > > people where you expected some interesting files. > > I'm sorry to contradict you, but I think this is all a myth. > > First, there was no Gnutella in late 90ies. It was released in > March of 2000. Second, I remember looking at the connection stability > just a few months later (June/July, maybe?), and the churn was quite > high - a client tended to replace all its connections within an hour > or so. > > Now if you remember how the connections were replaced, the > client was trying the IPs that it received from PONGs, which were > essentially the random network IPs, because the network was just > a few thousand nodes and every client could see the pongs from > pretty much everyone. So in an hour or so your initial connection > point stopped being relevant and you found yourself at a random > place in the network. After that, all your subsequent sessions used > the IP list stored on disk by a previous session to connect to the > network, and the address given to you by your friends was no longer > important. > > To be precise, this latest part (about the IP list) was the > behaviour of the Gnutella clients that I worked with (I think these > were Gnutella v.056 and GNUT). Maybe there were some clients that > required to enter an IP at every session start. I don't know. There > was also a notion of locality based on the unusually good and stable > connections - as soon as the two machines on my desktop would find > each other on the network as a result of this random process, they > would stay connected for quite a while (as long as I did not stop > the clients). > > But even these considerations are not important, because the > early Gnutella (until the meltdown of July 2000) was fully visible, > and every query more or less reached every node (in the absence of > the flow control, this is exactly what caused the meltdown - TTL was > too high to limit the query propagation). > > Of course, some queries might have been missing some nodes, but > generally there was no chance for any clustering - I simply cannot see > how it could possibly exist in such a network. > > > We (Berlin and Karlsruhe) developed a new protocol (INGA Interest > > based Node Grouping Algorithm [1][2]) , that reclusters the network > > based on the interests of the peers, without any DHT, only using on > > an unstructured network. 
> > Which is cool, and maybe it is a great protocol - as long as > you won't justify its existence by myths. I'm sure there are plenty > of legitimate reasons that make this protocol useful ;-) > > Best wishes - > S.Osokine. > 2 Dec 2005. > > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Alexander L?ser > Sent: Friday, December 02, 2005 1:29 AM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] Re: scalability > > > Hi Adam, > originally there was a certain type of clustering in the beginnings of > Gnutella (late 90ies) . People communicate its ids mouth to mouth or via > Email or deja news to other people. So in most cases you got Ids from > people which had at least similar interests, or from people where you > expected some interesting files. Later, due to the overwhelming > attractiveness of the gnutella application they introduced the gtk and > other bootstrapping alternatives, given you a number of starting > pointers. However, this starting points a chosen 'randomly', so there is > no longer any clustering by interests. > > We (Berlin and Karlsruhe) developed a new protocol (INGA Interest based > Node Grouping Algorithm [1][2]) , that reclusters the network based on > the interests of the peers, without any DHT, only using on an > unstructured network. Similar to freenet, the network topology evolves > over a while to a so called small world topology, where people with > similar interests are clustered together. In addition, to further speed > up the clustering process, peers also keep in a local index structures > other peers, that are 'HUBs' in the network, e.g. having a high in and > out degree. Our experiments show, that we significantly outperform > Gnutella style approaches in messages even in highly volatile networks. > > Best's Alex > > [1] Searching Dynamic Communities with Personal Indexes. L?ser, Tempich > et.al 3rd. International Semantic Web Conference, Galway. Springer 2005 > http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf > [2] Remindin': Semantic query routing in peer-to-peer networks based on > social metaphors. Tempich et.al. WWW 2004, New York. ACM 2004 > http://**www.aifb.uni-karlsruhe.de/ > Publikationen/showPublikation?publ_id=447 > > Ronald Wertlen schrieb: > > > Hi Adam, > > > > perhaps you have not understood my message because you have not > > noticed the focus on "precision and recall" (i.e. search) not the old > > Distributed DB vs. own DB debate. You have also pigeon-holed my email > > with the DHT crowd (*grin*), it couldn't be further from it! > > > > I was arguing in the other direction - which coderman thankfully > > picked up. Gnutella doesn't structure enough, that's all. Sure > > Gnutella beats DHTs on search - I base that observation on a project I > > finished last year - a public prototype that used JXTA and was honed > > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > > (German site) - we've moved on some since them ;) ]. > > > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > > practically anyone to elevate to super-peer, which results in a random > > (power-law distribtion) network. Such a network is not going to > > perform very well as far as recall and precision are concerned, past a > > certain point. I would be interested to calculate that exact point > > (but doubting I'll get to it some time soon :-/). > > > > HTH. > > > > Best regards, Ron > > > > PS. 
seems this thread has driven the original author to reformulate > > his statment... :-) > > > > PPS. > > In fact, the network is not going to be completely random - it will > > follow the contours of the internet (distribution of servers, > > broadband connections, users, etc. is not random). I am not sure if > > that destroys or supports my argument. Back to the drawing board! > > > > We actually need a better internet. [oops there I go getting > > unspecific again, sorry!! ;-) ] > > > > > >> Message: 4 > >> Date: Wed, 30 Nov 2005 16:42:39 -0500 > >> From: Adam Fisk > >> Subject: Re: [p2p-hackers] Re: scalability To: "Peer-to-peer > >> development." > >> Message-ID: <438E1CCF.4010907@speedymail.org> > >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed > >> > >> I don't understand your post. When you say "critical", I assume > >> you're talking about life and death situations? Are you talking > >> about anything specifically? DHTs have failure rates. Ad hoc and > >> mesh networks can become useful in emergency situations where > >> conventional infrastructures break down, but the > >> centralized/p2p/structured/unstructured questions here are far from > >> obvious. > >> > >> On the "obsessive science types" issue, this completely misses the > >> point. It's a very non "obsessive science type" statement. There > >> are strong reasons for using the massive indexing/random walk > >> approach above DHTs -- reasons that have nothing to do with > >> scalability. In particulary, DHTs are, well, hash tables. Hash > >> tables don't work well for metadata queries. They do fine for > >> keywords (hotspots are a problem, but they can be solved), but they > >> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze > >> together, for example. The massive indexing (mutual index caching to > >> use Serguei's term)/random walk approach can get around these issues > >> more easily. They are also not nearly as brittle as DHTs. Sure, > >> DHTs repair themselves after node joins and leaves, but node > >> transience generally has a much greater effect on DHTs than it does > >> on massive indexing networks. > >> > >> I also think you're underestimating the efficiency of massive > >> indexing and random walks. Sure, these networks don't scale > >> logarithmically, but they do pretty darn well. > >> I encourage everyone to stay specific with their posts. > >> > >> All the Best, > >> > >> Adam > > > > > > > > _______________________________________________ > > p2p-hackers mailing list > > p2p-hackers@zgp.org > > http://zgp.org/mailman/listinfo/p2p-hackers > > _______________________________________________ > > Here is a web page listing P2P Conferences: > > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > > > > -- > ___________________________________________________________ > > Dr. Alexander L?ser, > Technische Universit?t Berlin, > CIS, Sekr. 
EN 7, Einsteinufer 17, 10587 Berlin, GERMANY > office: +49- 30-314-25556 fax: +49- 30-314-21601 > web: http://cis.cs.tu-berlin.de/~aloeser/ > ___________________________________________________________ > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From eugen at leitl.org Fri Dec 2 19:38:33 2005 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland (OR) In-Reply-To: <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> Message-ID: <20051202193833.GD2249@leitl.org> On Fri, Dec 02, 2005 at 11:13:33AM -0800, coderman wrote: > On 12/2/05, Daniel Stutzbach wrote: > > I'm in Eugene. I'd be willing to drive up for a get-together if we > > have a big enough group to make it interesting. > > i'd be happy to travel to eugene if more of the group is located there > as well. weekends would be best in that case. Allright! I'm game. ;) -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051202/9879a674/attachment.pgp From Serguei.Osokine at efi.com Fri Dec 2 19:38:41 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42756@fcexmb04.efi.internal> On Friday, December 02, 2005 Greg Bildson wrote: > I have a feeling that pre-centralized hostcache, the network was > more of a long string with some clumps as it went along. So what kept this string from fully clumping as the connections were broken and reestablished? Default was four connections, not two. How is it possible not to fold this string onto itself about one thousand times after the first 1,000 connections will be reestablished - which would take 10-15 minutes in a 1,000-node network, and would happen instantly in a one-million one? > If ToadNode was correct in that they had millions of downloads in > those early days then thats the only way that I could see the modem > bandwidth barrier not getting hit very quickly. Between people not using the downloaded code, an error in ToadNode stats, a miracle, and the network preserving its 'linear' graph topology for any noticeable time, my vote will be for any one of the first three - the last one is too improbable. Best wishes - S.Osokine. 2 Dec 2005. 
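A minimal sketch of the folding argument, with arbitrary assumptions (1,000 nodes, four links each, 1,000 random reconnections), estimating the average hop distance by breadth-first search before and after:

import random
from collections import deque

def avg_path_length(neigh, samples=30, seed=2):
    # mean shortest-path length from a few sampled sources
    random.seed(seed)
    total, count = 0, 0
    for src in random.sample(list(neigh), samples):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for u in neigh[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count

def string_topology(n=1000, degree=4):
    neigh = {v: set() for v in range(n)}
    for v in range(n):
        for d in range(1, degree // 2 + 1):
            neigh[v].add((v + d) % n)
            neigh[(v + d) % n].add(v)
    return neigh

g = string_topology()
print("string-like:", round(avg_path_length(g), 1))    # on the order of a hundred hops
random.seed(3)
for _ in range(1000):                                  # random reconnections
    v = random.randrange(1000)
    if g[v]:
        old = random.choice(sorted(g[v]))
        new = random.randrange(1000)
        if new != v:
            g[v].discard(old); g[old].discard(v)
            g[v].add(new); g[new].add(v)
print("after folding:", round(avg_path_length(g), 1))  # a handful of hops

Even rewiring only part of the links collapses the average distance from something like n/8 hops to a handful, which is the folding effect in question.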
-----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Greg Bildson Sent: Friday, December 02, 2005 11:23 AM To: Peer-to-peer development. Subject: RE: [p2p-hackers] Re: scalability The only locality that I can think of that may have occurred back in that early timeframe would be based on the stringiness of the network. I have a feeling that pre-centralized hostcache, the network was more of a long string with some clumps as it went along. So, its possible that the network diameter at its longest point was much larger than max-TTL. Then, the introduction of centralized hostcaches helped create a massive cluster and exacerbated the early modem bandwidth barrier. This appeared to be what Gene Kan thought I believe. Its was only months later with the introduction of clients with keepalive pings and flow control that the clogged spots got freed up. If ToadNode was correct in that they had millions of downloads in those early days then thats the only way that I could see the modem bandwidth barrier not getting hit very quickly. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Serguei Osokine > Sent: Friday, December 02, 2005 1:23 PM > To: Peer-to-peer development. > Subject: RE: [p2p-hackers] Re: scalability > > > On Friday, December 02, 2005 Alexander L?ser wrote: > > originally there was a certain type of clustering in the beginnings > > of Gnutella (late 90ies) . People communicate its ids mouth to mouth > > or via Email or deja news to other people. So in most cases you got > > Ids from people which had at least similar interests, or from > > people where you expected some interesting files. > > I'm sorry to contradict you, but I think this is all a myth. > > First, there was no Gnutella in late 90ies. It was released in > March of 2000. Second, I remember looking at the connection stability > just a few months later (June/July, maybe?), and the churn was quite > high - a client tended to replace all its connections within an hour > or so. > > Now if you remember how the connections were replaced, the > client was trying the IPs that it received from PONGs, which were > essentially the random network IPs, because the network was just > a few thousand nodes and every client could see the pongs from > pretty much everyone. So in an hour or so your initial connection > point stopped being relevant and you found yourself at a random > place in the network. After that, all your subsequent sessions used > the IP list stored on disk by a previous session to connect to the > network, and the address given to you by your friends was no longer > important. > > To be precise, this latest part (about the IP list) was the > behaviour of the Gnutella clients that I worked with (I think these > were Gnutella v.056 and GNUT). Maybe there were some clients that > required to enter an IP at every session start. I don't know. There > was also a notion of locality based on the unusually good and stable > connections - as soon as the two machines on my desktop would find > each other on the network as a result of this random process, they > would stay connected for quite a while (as long as I did not stop > the clients). 
> > But even these considerations are not important, because the > early Gnutella (until the meltdown of July 2000) was fully visible, > and every query more or less reached every node (in the absence of > the flow control, this is exactly what caused the meltdown - TTL was > too high to limit the query propagation). > > Of course, some queries might have been missing some nodes, but > generally there was no chance for any clustering - I simply cannot see > how it could possibly exist in such a network. > > > We (Berlin and Karlsruhe) developed a new protocol (INGA Interest > > based Node Grouping Algorithm [1][2]) , that reclusters the network > > based on the interests of the peers, without any DHT, only using on > > an unstructured network. > > Which is cool, and maybe it is a great protocol - as long as > you won't justify its existence by myths. I'm sure there are plenty > of legitimate reasons that make this protocol useful ;-) > > Best wishes - > S.Osokine. > 2 Dec 2005. > > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Alexander L?ser > Sent: Friday, December 02, 2005 1:29 AM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] Re: scalability > > > Hi Adam, > originally there was a certain type of clustering in the beginnings of > Gnutella (late 90ies) . People communicate its ids mouth to mouth or via > Email or deja news to other people. So in most cases you got Ids from > people which had at least similar interests, or from people where you > expected some interesting files. Later, due to the overwhelming > attractiveness of the gnutella application they introduced the gtk and > other bootstrapping alternatives, given you a number of starting > pointers. However, this starting points a chosen 'randomly', so there is > no longer any clustering by interests. > > We (Berlin and Karlsruhe) developed a new protocol (INGA Interest based > Node Grouping Algorithm [1][2]) , that reclusters the network based on > the interests of the peers, without any DHT, only using on an > unstructured network. Similar to freenet, the network topology evolves > over a while to a so called small world topology, where people with > similar interests are clustered together. In addition, to further speed > up the clustering process, peers also keep in a local index structures > other peers, that are 'HUBs' in the network, e.g. having a high in and > out degree. Our experiments show, that we significantly outperform > Gnutella style approaches in messages even in highly volatile networks. > > Best's Alex > > [1] Searching Dynamic Communities with Personal Indexes. L?ser, Tempich > et.al 3rd. International Semantic Web Conference, Galway. Springer 2005 > http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf > [2] Remindin': Semantic query routing in peer-to-peer networks based on > social metaphors. Tempich et.al. WWW 2004, New York. ACM 2004 > http://**www.aifb.uni-karlsruhe.de/ > Publikationen/showPublikation?publ_id=447 > > Ronald Wertlen schrieb: > > > Hi Adam, > > > > perhaps you have not understood my message because you have not > > noticed the focus on "precision and recall" (i.e. search) not the old > > Distributed DB vs. own DB debate. You have also pigeon-holed my email > > with the DHT crowd (*grin*), it couldn't be further from it! > > > > I was arguing in the other direction - which coderman thankfully > > picked up. Gnutella doesn't structure enough, that's all. 
Sure > > Gnutella beats DHTs on search - I base that observation on a project I > > finished last year - a public prototype that used JXTA and was honed > > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > > (German site) - we've moved on some since them ;) ]. > > > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > > practically anyone to elevate to super-peer, which results in a random > > (power-law distribtion) network. Such a network is not going to > > perform very well as far as recall and precision are concerned, past a > > certain point. I would be interested to calculate that exact point > > (but doubting I'll get to it some time soon :-/). > > > > HTH. > > > > Best regards, Ron > > > > PS. seems this thread has driven the original author to reformulate > > his statment... :-) > > > > PPS. > > In fact, the network is not going to be completely random - it will > > follow the contours of the internet (distribution of servers, > > broadband connections, users, etc. is not random). I am not sure if > > that destroys or supports my argument. Back to the drawing board! > > > > We actually need a better internet. [oops there I go getting > > unspecific again, sorry!! ;-) ] > > > > > >> Message: 4 > >> Date: Wed, 30 Nov 2005 16:42:39 -0500 > >> From: Adam Fisk > >> Subject: Re: [p2p-hackers] Re: scalability To: "Peer-to-peer > >> development." > >> Message-ID: <438E1CCF.4010907@speedymail.org> > >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed > >> > >> I don't understand your post. When you say "critical", I assume > >> you're talking about life and death situations? Are you talking > >> about anything specifically? DHTs have failure rates. Ad hoc and > >> mesh networks can become useful in emergency situations where > >> conventional infrastructures break down, but the > >> centralized/p2p/structured/unstructured questions here are far from > >> obvious. > >> > >> On the "obsessive science types" issue, this completely misses the > >> point. It's a very non "obsessive science type" statement. There > >> are strong reasons for using the massive indexing/random walk > >> approach above DHTs -- reasons that have nothing to do with > >> scalability. In particulary, DHTs are, well, hash tables. Hash > >> tables don't work well for metadata queries. They do fine for > >> keywords (hotspots are a problem, but they can be solved), but they > >> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze > >> together, for example. The massive indexing (mutual index caching to > >> use Serguei's term)/random walk approach can get around these issues > >> more easily. They are also not nearly as brittle as DHTs. Sure, > >> DHTs repair themselves after node joins and leaves, but node > >> transience generally has a much greater effect on DHTs than it does > >> on massive indexing networks. > >> > >> I also think you're underestimating the efficiency of massive > >> indexing and random walks. Sure, these networks don't scale > >> logarithmically, but they do pretty darn well. > >> I encourage everyone to stay specific with their posts. 
> >> > >> All the Best, > >> > >> Adam > > > > > > > > _______________________________________________ > > p2p-hackers mailing list > > p2p-hackers@zgp.org > > http://zgp.org/mailman/listinfo/p2p-hackers > > _______________________________________________ > > Here is a web page listing P2P Conferences: > > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > > > > -- > ___________________________________________________________ > > Dr. Alexander L?ser, > Technische Universit?t Berlin, > CIS, Sekr. EN 7, Einsteinufer 17, 10587 Berlin, GERMANY > office: +49- 30-314-25556 fax: +49- 30-314-21601 > web: http://cis.cs.tu-berlin.de/~aloeser/ > ___________________________________________________________ > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From bryan.turner at pobox.com Fri Dec 2 20:15:45 2005 From: bryan.turner at pobox.com (Bryan Turner) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <20051201205215.GF5300@cs.uoregon.edu> Message-ID: <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com> My $.02 on Gnutella, The Gnutella network will scale fine to 2B nodes. However, I believe without interest clustering or intelligent peer selection, it will become increasingly difficult to find the data you are interested in. IE: I feel the current architecture misses the 'long tail'. (Note that I am not well versed on Gnutella architecture, this opinion is based on papers modeling the math behind Gnutella) I like to find the orthogonal axis in a design, P2P has lots of interesting scalability axis: 1 Scalability in # of nodes 2 Scalability in # of objects 3 Scalability in size of objects 4 Scalability in interest for an object (hot spots) 5 Scalability in bandwidth (protocol overhead, efficiency) etc. BitTorrent captures all but #2, as multiple torrents may require redundant connections to a peer, and torrents that share files cannot also share swarms (not to mention BitTorrent isn't a content search network). Gnutella (I believe) doesn't meet #2,3 and partially #4,5: #2 because it does not cluster related data it will eventually be overwhelmed with content. #3 because it performs full-file transfers instead of block exchanges or partial file transfers #4/5 because clients don't immediately offer partial downloads, thus hot spots have a congestion delay measured in full-file-transfer increments rather than in block increments (an order of 2 for typical MP3s, easily reaching multiple days of congestion). A vision for a network that scales along all axis would be Gnutella with some structure to improve domain-specific searches, with BitTorrent as the data transfer mechanism. 
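A toy model of the hot-spot point in #4/5, comparing store-and-forward full-file relaying with block-level swarming; the 4 MB file, roughly 30 kbit/s of usable upload, 256 kB blocks and the doubling-per-transfer spread are invented assumptions, and this says nothing about what any particular client actually implements:

import math

def fullfile_time(n_peers, file_mb=4, upload_mbps=0.03):
    # store-and-forward: a peer can only serve the file once it has all
    # of it, so in the best case the number of complete copies doubles
    # once per full-file transfer time
    t_file = file_mb * 8 / upload_mbps             # seconds per full copy
    generations = math.ceil(math.log2(n_peers + 1))
    return generations * t_file

def block_time(n_peers, file_mb=4, upload_mbps=0.03, block_kb=256):
    # swarming: a peer can re-serve a block as soon as it arrives, so the
    # doubling interval shrinks from one file to one block
    t_file = file_mb * 8 / upload_mbps
    t_block = t_file * (block_kb / 1024) / file_mb
    generations = math.ceil(math.log2(n_peers + 1))
    return t_file + generations * t_block

print(round(fullfile_time(1000) / 3600, 1), "hours, full-file relay")   # about 3 hours
print(round(block_time(1000) / 60, 1), "minutes, block exchange")       # about half an hour

The only point of the model is that block-level exchange shrinks the replication interval from one full transfer to one block, so a flash crowd drains in little more than a single transfer time.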
Please educate me if I've missed some facet of Gnutella! --Bryan bryan.turner@pobox.com -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Daniel Stutzbach Sent: Thursday, December 01, 2005 3:52 PM To: p2p-hackers@zgp.org Subject: Re: [p2p-hackers] Re: scalability On Thu, Dec 01, 2005 at 09:48:45PM +0100, Ronald Wertlen wrote: > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > practically anyone to elevate to super-peer, which results in a random > (power-law distribtion) network. Gnutella is not a power-law network. See my paper on the graph properties of Gnutella, presented at the Internet Measurement Conference earlier this year: http://www.usenix.org/events/imc05/tech/stutzbach.html > Such a network is not going to perform very well as far as recall and > precision are concerned, past a certain point. I would be interested > to calculate that exact point (but doubting I'll get to it some time > soon :-/). Could you rigorously define recall and precision for me? I'm not sure what you mean by these terms. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From agthorr at cs.uoregon.edu Fri Dec 2 20:22:23 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com> References: <20051201205215.GF5300@cs.uoregon.edu> <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com> Message-ID: <20051202202223.GC2604@cs.uoregon.edu> On Fri, Dec 02, 2005 at 03:15:45PM -0500, Bryan Turner wrote: > Gnutella (I believe) doesn't meet #2,3 and partially #4,5: > #2 because it does not cluster related data it will eventually > be overwhelmed with content. > #3 because it performs full-file transfers instead of block > exchanges or partial file transfers > #4/5 because clients don't immediately offer partial downloads, > thus hot spots have a congestion delay measured in > full-file-transfer increments rather than in block > increments (an order of 2 for typical MP3s, easily > reaching multiple days of congestion). If I am not mistaken, Gnutella has been doing partial file transfers for two or three years now. The eDonkey/eMule network does this too. BitTorrent does not have a monopoly on this feature. :-) The relevant spec (if it can be called a spec) for Gnutella is here: http://www.the-gdf.org/wiki/index.php?title=Partial_File_Sharing_Protocol -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From bryan.turner at pobox.com Fri Dec 2 20:25:59 2005 From: bryan.turner at pobox.com (Bryan Turner) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] DHTs in highly-transient networks In-Reply-To: Message-ID: <200512022026.jB2KPx4X026153@rtp-core-1.cisco.com> The DHTs that I've studied behave well in high-churn environments. The problem is network migration events; large swings of population in a short time. Chord is the worst for this, as its rigid structure quickly buckles when you lose a large chunk of the network. Kademlia survives pretty well; maintaining connections with long-lived nodes is a definite win, as is maintaining connectivity to hubs/supernodes. All of them get screwed when large populations join. The network turns to chaos for a while until things settle down. Kademlia is better off (lookups continue to work). 
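For reference, the bias toward long-lived nodes mentioned above follows from Kademlia's bucket-maintenance rule; a minimal sketch of that rule is below (the class shape and the ping callback are illustrative placeholders, not any particular implementation's API):

from collections import OrderedDict

class KBucket:
    """Sketch of Kademlia's contact-replacement rule: a responsive old
    contact is never evicted in favour of a newly seen one, so buckets
    end up dominated by nodes that have already proven they stay up."""
    def __init__(self, k=20):
        self.k = k
        self.contacts = OrderedDict()   # node_id -> address, oldest first

    def seen(self, node_id, address, ping):
        if node_id in self.contacts:
            self.contacts.move_to_end(node_id)        # refresh recency
        elif len(self.contacts) < self.k:
            self.contacts[node_id] = address
        else:
            oldest_id, oldest_addr = next(iter(self.contacts.items()))
            if ping(oldest_id, oldest_addr):
                self.contacts.move_to_end(oldest_id)  # keep the old-timer
                # the newly seen node is dropped
            else:
                del self.contacts[oldest_id]
                self.contacts[node_id] = address      # replace the dead one

Because live old contacts always win, the routing table keeps the nodes that have demonstrated long uptime, which is what cushions the lookups against churn and join storms.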
The largest problem is the sudden lack of bandwidth due to all the key-transfers between the nodes. In my implementations I had to add a 'slop' factor that was larger than my largest expected node-join event. During a lookup, if the 'ultimate' node didn't have the data, he passed the request through the oldest couple of nodes in the slop region. This allowed one last chance to find the right owner. It worked well in practice, but I still believe there's a better way. --Bryan bryan.turner@pobox.com -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Sean Rhea Sent: Thursday, December 01, 2005 6:02 PM To: Peer-to-peer development. Subject: Re: [p2p-hackers] DHTs in highly-transient networks On Dec 1, 2005, at 5:53 PM, Michael Rogers wrote: > To what extent does this depend on the distribution of session times > as well as the mean? Kademlia assumes that old nodes will outlive new > nodes, and Daniel's paper shows that Gnutella contains an emergent > core of long-lived nodes - how well do Bamboo and Chord survive under > non-uniform churn? We used exponentially-distributed node lifetimes, so old nodes do not generally outlive new ones. However, I _think_ that choice only makes the problem harder, though. In particular, I would suspect that Bamboo/Chord would do just as well if old nodes lived longer than new ones, and possibly better. They won't take advantage of it like Kademlia does, but it shouldn't hurt them either. (At least that's my guess; I don't have data to prove it.) Sean -- When I see the price that you pay / I don't wanna grow up -- Tom Waits From coderman at gmail.com Fri Dec 2 20:30:11 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com> References: <20051201205215.GF5300@cs.uoregon.edu> <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com> Message-ID: <4ef5fec60512021230t2408884ew2e6002cc61fa6d92@mail.gmail.com> On 12/2/05, Bryan Turner wrote: > ... > 4 Scalability in interest for an object (hot spots) >... > A vision for a network that scales along all axis would be Gnutella > with some structure to improve domain-specific searches, with BitTorrent as > the data transfer mechanism. finding obscure / rare / unpopular resources is the flip side of the interest coin. in alpine all discovery was done using distinct peer groups dedicated to a single domain of resource discovery (specific subjects / applications had distinct groups). peer lists were ordered within each group according to a relative quality attribute associated with that group only. the goal was to make decentralized search efficient for very obscure resources when a centralized (or partially centralized) index search was usually required for completeness to make it effective. the problem with this approach is that it is very hard to model in a meaningful way due to inherent dependence on relative metrics associated with human behavior. 
(or perhaps it will be simple(r) if a large real world network can be observed and studied) alpine also used a pluggable module system (dlopen with c++ derived handlers) to handle arbitrary metadata associated with queries (different groups may require different search criteria and taxonomy) and integrate various transport mechanisms (a simple TCP stream transfer was provided as an example of this ability) being able to offload such transfers to a system optimized for the purpose, like bittorrent, was a design goal and definitely makes sense in any project where cooperative content distribution is useful. From gbildson at limepeer.com Fri Dec 2 20:35:11 2005 From: gbildson at limepeer.com (Greg Bildson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com> Message-ID: Those suppositions are fairly misplaced as is most academic work on Gnutella. I wouldn't believe any (other than Daniel Stutzbach's) academic papers describing Gnutella. Partial file sharing is active by default. Download meshes are in place. Download chunking (pseudo-random) is in place - not rarest first but sufficient in many cases. Many improvements have been made to increase the awareness and allocation of resources but improvements can still be made. You are correct that rare file/topic searches are still not great but are much better than historically and likely better than similar networks. Dynamic querying does a good job of satisfying popular requests at low cost and reserving more horsepower for rarer searches. Efficiency is pretty good. Bittorrent is a tad verbose in some respects. The only important things that are not in place in Gnutella are rarest first and tit for tat. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Bryan Turner > Sent: Friday, December 02, 2005 3:16 PM > To: 'Peer-to-peer development.' > Subject: RE: [p2p-hackers] Re: scalability > > > My $.02 on Gnutella, > > The Gnutella network will scale fine to 2B nodes. However, I > believe without interest clustering or intelligent peer selection, it will > become increasingly difficult to find the data you are interested > in. IE: I > feel the current architecture misses the 'long tail'. (Note that I am not > well versed on Gnutella architecture, this opinion is based on papers > modeling the math behind Gnutella) > > I like to find the orthogonal axis in a design, P2P has lots of > interesting scalability axis: > 1 Scalability in # of nodes > 2 Scalability in # of objects > 3 Scalability in size of objects > 4 Scalability in interest for an object (hot spots) > 5 Scalability in bandwidth (protocol overhead, efficiency) > etc. > > BitTorrent captures all but #2, as multiple torrents may require > redundant connections to a peer, and torrents that share files cannot also > share swarms (not to mention BitTorrent isn't a content search network). > > Gnutella (I believe) doesn't meet #2,3 and partially #4,5: > #2 because it does not cluster related data it will eventually > be overwhelmed with content. > #3 because it performs full-file transfers instead of block > exchanges or partial file transfers > #4/5 because clients don't immediately offer partial downloads, > thus hot spots have a congestion delay measured in > full-file-transfer increments rather than in block > increments (an order of 2 for typical MP3s, easily > reaching multiple days of congestion). 
> > A vision for a network that scales along all axis would be Gnutella > with some structure to improve domain-specific searches, with > BitTorrent as > the data transfer mechanism. > > Please educate me if I've missed some facet of Gnutella! > --Bryan > bryan.turner@pobox.com > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On > Behalf Of Daniel Stutzbach > Sent: Thursday, December 01, 2005 3:52 PM > To: p2p-hackers@zgp.org > Subject: Re: [p2p-hackers] Re: scalability > > On Thu, Dec 01, 2005 at 09:48:45PM +0100, Ronald Wertlen wrote: > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > > practically anyone to elevate to super-peer, which results in a random > > (power-law distribtion) network. > > Gnutella is not a power-law network. See my paper on the graph properties > of Gnutella, presented at the Internet Measurement Conference earlier this > year: > > http://www.usenix.org/events/imc05/tech/stutzbach.html > > > Such a network is not going to perform very well as far as recall and > > precision are concerned, past a certain point. I would be interested > > to calculate that exact point (but doubting I'll get to it some time > > soon :-/). > > Could you rigorously define recall and precision for me? I'm not > sure what > you mean by these terms. > > -- > Daniel Stutzbach Computer Science Ph.D Student > http://www.barsoom.org/~agthorr University of Oregon > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From bryan.turner at pobox.com Fri Dec 2 20:40:59 2005 From: bryan.turner at pobox.com (Bryan Turner) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <20051202202223.GC2604@cs.uoregon.edu> Message-ID: <200512022041.jB2Kex4X028301@rtp-core-1.cisco.com> Ah, this is news to me :) Thanks for the link. I notice that this partial file transfer feature is only a footnote on the main protocol.. How wide spread is the partial file transfer feature among clients? --Bryan bryan.turner@pobox.com -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Daniel Stutzbach Sent: Friday, December 02, 2005 3:22 PM To: 'Peer-to-peer development.' Subject: Re: [p2p-hackers] Re: scalability On Fri, Dec 02, 2005 at 03:15:45PM -0500, Bryan Turner wrote: > Gnutella (I believe) doesn't meet #2,3 and partially #4,5: > #2 because it does not cluster related data it will eventually > be overwhelmed with content. > #3 because it performs full-file transfers instead of block > exchanges or partial file transfers > #4/5 because clients don't immediately offer partial downloads, > thus hot spots have a congestion delay measured in > full-file-transfer increments rather than in block > increments (an order of 2 for typical MP3s, easily > reaching multiple days of congestion). If I am not mistaken, Gnutella has been doing partial file transfers for two or three years now. The eDonkey/eMule network does this too. BitTorrent does not have a monopoly on this feature. 
:-) The relevant spec (if it can be called a spec) for Gnutella is here: http://www.the-gdf.org/wiki/index.php?title=Partial_File_Sharing_Protocol -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From sberlin at gmail.com Fri Dec 2 21:11:53 2005 From: sberlin at gmail.com (Sam Berlin) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <200512022041.jB2Kex4X028301@rtp-core-1.cisco.com> References: <20051202202223.GC2604@cs.uoregon.edu> <200512022041.jB2Kex4X028301@rtp-core-1.cisco.com> Message-ID: <19196d860512021311u2344d593v77d26fd98a642e90@mail.gmail.com> The protocol, as described nearly anywhere, isn't Gnutella. Gnutella, as others have said, really isn't a protocol (0.6 or any number) anymore. It's a hodgepodge of a lot of features, all implemented by various Gnutella clients. Partial file sharing has been in use by mainstream clients for around 1-2 years. As Greg mentioned, academic papers tend to describe Gnutella as it was designed by Justin Frankel, and a few will include the addition of ultrapeers. It's nearly impossible to find a paper that accurately describes the current state of the network (as it exists through mainstream clients) though. It'd likely be a fascinating subject for researchers to study & write papers on. I know I'd be interested. Sam On 12/2/05, Bryan Turner wrote: > Ah, this is news to me :) Thanks for the link. I notice that this > partial file transfer feature is only a footnote on the main protocol.. How > wide spread is the partial file transfer feature among clients? > > --Bryan > bryan.turner@pobox.com > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On > Behalf Of Daniel Stutzbach > Sent: Friday, December 02, 2005 3:22 PM > To: 'Peer-to-peer development.' > Subject: Re: [p2p-hackers] Re: scalability > > On Fri, Dec 02, 2005 at 03:15:45PM -0500, Bryan Turner wrote: > > Gnutella (I believe) doesn't meet #2,3 and partially #4,5: > > #2 because it does not cluster related data it will eventually > > be overwhelmed with content. > > #3 because it performs full-file transfers instead of block > > exchanges or partial file transfers > > #4/5 because clients don't immediately offer partial downloads, > > thus hot spots have a congestion delay measured in > > full-file-transfer increments rather than in block > > increments (an order of 2 for typical MP3s, easily > > reaching multiple days of congestion). > > If I am not mistaken, Gnutella has been doing partial file transfers for two > or three years now. The eDonkey/eMule network does this too. > > BitTorrent does not have a monopoly on this feature. 
:-) > > The relevant spec (if it can be called a spec) for Gnutella is here: > > http://www.the-gdf.org/wiki/index.php?title=Partial_File_Sharing_Protocol > > -- > Daniel Stutzbach Computer Science Ph.D Student > http://www.barsoom.org/~agthorr University of Oregon > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From Serguei.Osokine at efi.com Fri Dec 2 21:26:00 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42759@fcexmb04.efi.internal> On Friday, December 02, 2005 Sam Berlin wrote: > It'd likely be a fascinating subject for researchers to study & write > papers on. I know I'd be interested. Yeah, well, O'Reilly wasn't :-) I submitted a proposal to the ETC two or three years ago, where I was going to talk about Gnutella being the first P2P network that is not only deployed and developed, but is also *designed* in a fully decentralized fashion. Like you say, basically - there is some common protocol framework, but within this framework vendors are free to develop, publish, and deploy their own protocol extensions, and to implement only those extensions of the others that they like. Survival of the fittest proposals in the field, so to speak. Design without an architectural committee, voting, or any kind of central authority or even consensus on half of the issues. This is a first and only example of such development, as far as I know. But for some reason O'Reilly was not impressed. Though I'm not much of a speaker in any case :-) Best wishes - S.Osokine. 2 Nov 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Sam Berlin Sent: Friday, December 02, 2005 1:12 PM To: Bryan Turner; Peer-to-peer development. Subject: Re: [p2p-hackers] Re: scalability The protocol, as described nearly anywhere, isn't Gnutella. Gnutella, as others have said, really isn't a protocol (0.6 or any number) anymore. It's a hodgepodge of a lot of features, all implemented by various Gnutella clients. Partial file sharing has been in use by mainstream clients for around 1-2 years. As Greg mentioned, academic papers tend to describe Gnutella as it was designed by Justin Frankel, and a few will include the addition of ultrapeers. It's nearly impossible to find a paper that accurately describes the current state of the network (as it exists through mainstream clients) though. It'd likely be a fascinating subject for researchers to study & write papers on. I know I'd be interested. Sam On 12/2/05, Bryan Turner wrote: > Ah, this is news to me :) Thanks for the link. I notice that this > partial file transfer feature is only a footnote on the main protocol.. How > wide spread is the partial file transfer feature among clients? 
> > --Bryan > bryan.turner@pobox.com > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On > Behalf Of Daniel Stutzbach > Sent: Friday, December 02, 2005 3:22 PM > To: 'Peer-to-peer development.' > Subject: Re: [p2p-hackers] Re: scalability > > On Fri, Dec 02, 2005 at 03:15:45PM -0500, Bryan Turner wrote: > > Gnutella (I believe) doesn't meet #2,3 and partially #4,5: > > #2 because it does not cluster related data it will eventually > > be overwhelmed with content. > > #3 because it performs full-file transfers instead of block > > exchanges or partial file transfers > > #4/5 because clients don't immediately offer partial downloads, > > thus hot spots have a congestion delay measured in > > full-file-transfer increments rather than in block > > increments (an order of 2 for typical MP3s, easily > > reaching multiple days of congestion). > > If I am not mistaken, Gnutella has been doing partial file transfers for two > or three years now. The eDonkey/eMule network does this too. > > BitTorrent does not have a monopoly on this feature. :-) > > The relevant spec (if it can be called a spec) for Gnutella is here: > > http://www.the-gdf.org/wiki/index.php?title=Partial_File_Sharing_Protocol > > -- > Daniel Stutzbach Computer Science Ph.D Student > http://www.barsoom.org/~agthorr University of Oregon > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From agthorr at cs.uoregon.edu Fri Dec 2 21:30:52 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <19196d860512021311u2344d593v77d26fd98a642e90@mail.gmail.com> References: <20051202202223.GC2604@cs.uoregon.edu> <200512022041.jB2Kex4X028301@rtp-core-1.cisco.com> <19196d860512021311u2344d593v77d26fd98a642e90@mail.gmail.com> Message-ID: <20051202213051.GF2604@cs.uoregon.edu> Perhaps we should take a cue from TCP/IP and start referring to the "Gnutella protocol suite". On Fri, Dec 02, 2005 at 04:11:53PM -0500, Sam Berlin wrote: > The protocol, as described nearly anywhere, isn't Gnutella. Gnutella, > as others have said, really isn't a protocol (0.6 or any number) > anymore. It's a hodgepodge of a lot of features, all implemented by > various Gnutella clients. Partial file sharing has been in use by > mainstream clients for around 1-2 years. > > As Greg mentioned, academic papers tend to describe Gnutella as it was > designed by Justin Frankel, and a few will include the addition of > ultrapeers. It's nearly impossible to find a paper that accurately > describes the current state of the network (as it exists through > mainstream clients) though. 
> > It'd likely be a fascinating subject for researchers to study & write > papers on. I know I'd be interested. > > Sam > > On 12/2/05, Bryan Turner wrote: > > Ah, this is news to me :) Thanks for the link. I notice that this > > partial file transfer feature is only a footnote on the main protocol.. How > > wide spread is the partial file transfer feature among clients? > > > > bryan.turner@pobox.com > > > > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On > > Behalf Of Daniel Stutzbach > > Sent: Friday, December 02, 2005 3:22 PM > > To: 'Peer-to-peer development.' > > Subject: Re: [p2p-hackers] Re: scalability > > > > On Fri, Dec 02, 2005 at 03:15:45PM -0500, Bryan Turner wrote: > > > Gnutella (I believe) doesn't meet #2,3 and partially #4,5: > > > #2 because it does not cluster related data it will eventually > > > be overwhelmed with content. > > > #3 because it performs full-file transfers instead of block > > > exchanges or partial file transfers > > > #4/5 because clients don't immediately offer partial downloads, > > > thus hot spots have a congestion delay measured in > > > full-file-transfer increments rather than in block > > > increments (an order of 2 for typical MP3s, easily > > > reaching multiple days of congestion). > > > > If I am not mistaken, Gnutella has been doing partial file transfers for two > > or three years now. The eDonkey/eMule network does this too. > > > > BitTorrent does not have a monopoly on this feature. :-) > > > > The relevant spec (if it can be called a spec) for Gnutella is here: > > > > http://www.the-gdf.org/wiki/index.php?title=Partial_File_Sharing_Protocol > > > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From srhea at cs.berkeley.edu Fri Dec 2 21:33:23 2005 From: srhea at cs.berkeley.edu (Sean Rhea) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] DHTs in highly-transient networks In-Reply-To: <200512022026.jB2KPx4X026153@rtp-core-1.cisco.com> References: <200512022026.jB2KPx4X026153@rtp-core-1.cisco.com> Message-ID: <1CBA368C-F6E1-49D9-B7C8-E024A7F556A7@cs.berkeley.edu> On Dec 2, 2005, at 3:25 PM, Bryan Turner wrote: > The DHTs that I've studied behave well in high-churn environments. > The problem is network migration events; large swings of population > in a > short time. Chord is the worst for this, as its rigid structure > quickly > buckles when you lose a large chunk of the network. Kademlia survives > pretty well; maintaining connections with long-lived nodes is a > definite > win, as is maintaining connectivity to hubs/supernodes. How massive is massive? In some earlier experiments we ran, we tested Bamboo with massive joins and failures of groups composing around 20% of the total network size. It works fine. You get a little blip where the average lookup time goes up by a factor of two or so, but that's all. If I recall correctly, the MIT Chord implementation, at least, did pretty well in such scenarios as well. You just have to recover periodically, rather than reactively, to join and failure events, as described in the Bamboo USENIX paper I referenced earlier. Sean -- Everyone chooses his or her own instrument for rebellion. 
I don't know what my son's will be, but my only hope for him is this: That by sharing my passions with him, I have planted the seeds of defiance that will someday be turned against me. -- Soo Lee Young -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051202/0a68b5e9/PGP.pgp From coderman at gmail.com Fri Dec 2 21:46:10 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42759@fcexmb04.efi.internal> References: <4A60C83D027E224BAA4550FB1A2B120EC42759@fcexmb04.efi.internal> Message-ID: <4ef5fec60512021346x537d09b9o6260407ef16d3cf8@mail.gmail.com> On 12/2/05, Serguei Osokine wrote: > ... Gnutella being the first P2P network that > is not only deployed and developed, but is also *designed* in a fully > decentralized fashion. Like you say, basically - there is some common > protocol framework, but within this framework vendors are free to > develop, publish, and deploy their own protocol extensions, and to > implement only those extensions of the others that they like. i'd say IRC falls into this category and definitely predates the current gnutella cabal. (i may be a bit biased as i met my wife on irc-2.mit.edu way back when... :) > Survival of the fittest proposals in the field, so to speak. > Design without an architectural committee, voting, or any kind of > central authority or even consensus on half of the issues. my favorite kind of design. groupthink is braindead! UML sucks! vi forevar! etc, etc. *grin* From ian.clarke at gmail.com Sat Dec 3 09:49:51 2005 From: ian.clarke at gmail.com (Ian Clarke) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <20051202154557.E559F191C@yumyum.zooko.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> Message-ID: <823242bd0512030149t3e6a18d2x@mail.gmail.com> On 02/12/05, zooko@zooko.com wrote: > Let 1 be the set of networks which are used for illegal transmission of > information, I do wish you would refer to these networks as those which allow the covert transmission of information, rather than those which are used for the illegal transmission of information - since I am not aware of any networks that are specifically designed for the illegal transmission of information. I think this would help alleviate the political problem you raise later in your email. > and 2 be the set of networks which are built on f2f connections, > and 1^2 be the intersection -- the set of networks which are used for illegal > transmission of information and which are built on f2f connections. If you broaden your definition of set 1 to be networks which are used for the covert transmission of information (I think this is a more useful definition for the set as not all covert activity is illegal), then I am not sure, in practice, how many networks will fall into set 2 that aren't also members of set 1, in fact, I can't think of any non-contrived situations where one would create a f2f network motivated by something other than a desire to be covert in some way. > [bepw2002] introduces "darknet" to mean concept 1. 
I'm not going to spend time dissecting their paper to determine exactly what BEPW's intention was for the term "darknet", certainly they could have been much more explicit about this if they wanted to, and they use the term in contradictory ways throughout their paper. For example, they refer to "the darknet" as if there is only one, but subsequenly refer to "darknets". Given this vagueness, I can't imagine that is was their goal to provide an authorative definition for the term. While we can debate what BEPW intended the term to mean when they used it in their paper, this is ultimately irrelevant. Software engineers often seem to forget that English isn't like a programming language where a designer specifies an unambigous definition at the outset (Richard Stallman is particularly guilty of this). The meaning of words in English is a consensus that is arrived at over time, and eventually finds its way into a dictionary (long) after that consensus is stable. The BEPW paper is one early voice in that consensus-forming process. Mine is another, yours is another still. > By the way, I should point out that I have a personal interest in this history > because between 2001 and 2003 I tried to promulgate concept 2, using Lucas > Gonze's coinage: "friendnet" [zooko2001, zooko2002, zooko2003, gonze2002]. > I would like to know for my own satisfaction if my ideas were a direct > inspiration for some of this modern stuff, such as the Freenet v0.7 design. I am not sure that they were a direct inspiration. We (Freenet) have been concerned about the fact that Freenet was harvestable for several years now. Around spring this year I made the observation that if human relationships form a small world network, it should be possible to assign locations to people such that we form a Kleinberg-style small world network, and thus we could make the network routable. Oskar Sandberg then suggested a way to do this, and we set about validating the concept using simulations. > Now the problem is that in the current parlance of the media, the word > "darknet" is used to mean vaguely 1 or 2 or 1^2. The reason that this is a > problem isn't that it breaks with some etymological tradition, but that it is > ambiguous and that it deprives us of useful words to refer to 1 or 2 > specifically. The ambiguity has nasty political consequences -- see for > example these f2f network operators struggling to persuade newspaper readers > that they are not primarily for illegal purposes: [globe]. I think a much better way to avoid this nasty political consequence is to stop describing set 1 in terms of illegal activity, but rather describe such networks as being "covert", or "anonymity preserving" - neither of which implies illegal activity (it is perfectly legal to be anonymous in most countries whose legal systems I am familiar with). > > defining the term "darknet" as a f2f network that is designed > > to conceal the activities of its participants (this being, so far as I > > have seen, one of the main motivations for building an f2f network), > > So you think of "darknet" as meaning 1^2. Or just 2, since I think the sets 1^2 and 2 are, in practical terms, virtually identical. > That's an interesting remark -- that you regard concealment as one of the main > motivations. I personally regard concealment as one of the lesser motivations > -- I'm more interested in attack resistance (resisting attacks such as > subversion or denial-of-service, rather than attacks such as surveillance), > scalability, and other properties. 
Although I'm interested in the concealment > properties as well. That is surprising. Are you aware of any current or proposed f2f networks for which concealment of user activity is not a goal? Ian. From adam at cypherspace.org Sat Dec 3 12:17:37 2005 From: adam at cypherspace.org (Adam Back) Date: Sat Dec 9 22:13:05 2006 Subject: idealized content network properties (Re: [p2p-hackers] darknet) In-Reply-To: <439057F1.2060208@cs.ucl.ac.uk> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202133516.GA15480@bitchcake.off.net> <439057F1.2060208@cs.ucl.ac.uk> Message-ID: <20051203121737.GA3572@bitchcake.off.net> Sure I just mean if you make it invitation only, thats the same network but with something preventing other people subscribing. That could be encryption keys (all encrypted), authentication keys/passwords (required to join network), obscurity (don't advertise IP/port), or network control (current entity requires to connect to you to join you). It seems that it would not be hard to add this restriction to a network without this restriction (but with the other features I mentioned). I'd say that darknet term specifically implies some opaqueness to outside observers -- likely encryption no? (but f2f would not necessarily, its just a invite only collaboration group network). Adam On Fri, Dec 02, 2005 at 02:19:29PM +0000, Michael Rogers wrote: > Adam Back wrote: > >removing 1 makes a eg "friend-to-friend" network -- that just means > >you encrypt the searchable tags and content with a shared key. > > Not sure about this one - I think the use of group keys is orthogonal to > the use of a friend-to-friend topology. For example Groove uses group > keys without f2f, Freenet 0.7 will use f2f without group keys, and WASTE > uses neither (but still fits under the "darknet" umbrella because it's > invitation-only). From rrrw at neofonie.de Sat Dec 3 23:04:16 2005 From: rrrw at neofonie.de (Ronald Wertlen) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability Message-ID: <43922470.6030802@neofonie.de> Hi Daniel, these are basically benchmark domains (variables), that tell you how good your search is from, as I mentioned in my mail, the information retrieval field. http://en.wikipedia.org/wiki/Information_retrieval For instance Bloom Filters increase your scalability but reduce the precision of the search - so you get a lot of stuff you didn't want. A few years ago, a lot of papers in the p2p field that were working on stuff like topology, organisational methods, scalability, etc. concentrated on finding better ways of getting from object_id to the node (number of hops, number of lookups, etc.). The problem from an IR perspective is that not all objects are as "simple" as a mp3 file and not all searches are as simple as "coldplay", how do you get the onject_id in the first place. This becomes a severe problem the more complex the objects, their metadata and the queries (for instance Boolean, range, content proximity, queries). I've downloaded your paper, thanks for the refutation. I love results that seem counter-intuitive to me because they mean I have some learning to do. 
:-) Best regards, Ron > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On > Behalf Of Daniel Stutzbach > Sent: Thursday, December 01, 2005 3:52 PM > To: p2p-hackers@zgp.org > Subject: Re: [p2p-hackers] Re: scalability > > On Thu, Dec 01, 2005 at 09:48:45PM +0100, Ronald Wertlen wrote: > >>> Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows >>> practically anyone to elevate to super-peer, which results in a random >>> (power-law distribtion) network. > > > Gnutella is not a power-law network. See my paper on the graph properties > of Gnutella, presented at the Internet Measurement Conference earlier this > year: > > http://www.usenix.org/events/imc05/tech/stutzbach.html > >>> Such a network is not going to perform very well as far as recall and >>> precision are concerned, past a certain point. I would be interested >>> to calculate that exact point (but doubting I'll get to it some time >>> soon :-/). > > > Could you rigorously define recall and precision for me? I'm not sure what > you mean by these terms. > > -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From sberlin at gmail.com Sun Dec 4 00:03:08 2005 From: sberlin at gmail.com (Sam Berlin) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <43922470.6030802@neofonie.de> References: <43922470.6030802@neofonie.de> Message-ID: <19196d860512031603q2b1e3700jc72dada77e890e10@mail.gmail.com> > For instance Bloom Filters increase your scalability but reduce the > precision of the search - so you get a lot of stuff you didn't want. Bloom Filters can be used to reduce the amount of incoming queries (in Gnutella filters are passed from a "leaf" to its "ultrapeer", and composite filters are passed between neighboring ultrapeers to reduce last hop & second-to-last-hop traffic). Once the query passes the filter test, it can still be forwarded on to the ultimate host, and that host can make the decision on whether or not to send a reply. This eliminates "the stuff you didn't want" from replies while still keeping traffic low. Tthe filters in Gnutella reduce ~70% of query traffic on the second-to-last hop, and ~90% on the last hop (at least, it did when I last checked a year or so ago). > A few years ago, a lot of papers in the p2p field that were working on > stuff like topology, organisational methods, scalability, etc. > concentrated on finding better ways of getting from object_id to the > node (number of hops, number of lookups, etc.). The problem from an IR > perspective is that not all objects are as "simple" as a mp3 file and > not all searches are as simple as "coldplay", how do you get the > onject_id in the first place. This becomes a severe problem the more > complex the objects, their metadata and the queries (for instance > Boolean, range, content proximity, queries). Metadata is certainly difficult to search for, but it isn't impossible. It's vastly easier to search using metadata in a network such as Gnutella than in a DHT-based network, as you don't have to prepopulate the tables with all kinds of data. There's lots of active metadata searches going on (again, in Gnutella), including searches for file names, most-recently-downloaded, specific data in id3 tags, file's licenses, etc... IMHO, the less a network is structured (ie, doesn't have an organized topology), the easier it is to add arbitrary searches. 
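A small sketch may help pin down both points. In the usual information-retrieval sense, precision is |relevant AND retrieved| / |retrieved| and recall is |relevant AND retrieved| / |relevant|; a Bloom filter trades a little precision (false positives) for a lot of saved traffic, while never hurting recall (no false negatives). The code below is illustrative only: the filter size, hashing, and keyword handling are arbitrary, not Gnutella's actual query-routing tables.

    # Illustrative only: a tiny Bloom filter standing in for a leaf's keyword
    # table, plus the standard IR definitions of precision and recall.
    # Filter size and hashing are arbitrary; real Gnutella QRP tables differ.
    import hashlib

    class BloomFilter:
        def __init__(self, n_bits=1024, n_hashes=3):
            self.n_bits, self.n_hashes, self.bits = n_bits, n_hashes, 0

        def _positions(self, word):
            for i in range(self.n_hashes):
                digest = hashlib.sha1(f"{i}:{word}".encode()).hexdigest()
                yield int(digest, 16) % self.n_bits

        def add(self, word):
            for p in self._positions(word):
                self.bits |= 1 << p

        def might_contain(self, word):
            return all(self.bits >> p & 1 for p in self._positions(word))

    def precision_recall(retrieved, relevant):
        hit = len(retrieved & relevant)
        precision = hit / len(retrieved) if retrieved else 1.0
        recall = hit / len(relevant) if relevant else 1.0
        return precision, recall

    # The "leaf" shares some keywords; the ultrapeer forwards only queries
    # that pass the filter, so most non-matching queries never cross the
    # last hop.
    leaf = BloomFilter()
    for kw in ("coldplay", "clocks", "mp3"):
        leaf.add(kw)
    queries = {"coldplay", "madonna", "clocks", "zeppelin"}
    forwarded = {q for q in queries if leaf.might_contain(q)}
    print(precision_recall(retrieved=forwarded, relevant={"coldplay", "clocks"}))
    # False positives can let a non-matching query through (hurting precision
    # at the filter), but a matching query is never dropped, so recall stays 1.

None of this changes the broader point above: an unstructured topology makes it easy to bolt on new kinds of searches.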
This is because there's no need to add another overlay for a new kind of search -- the network can function as-is. Of course, certain topologies can help when some kinds of searches are predominant. Sam From lgonze at panix.com Sun Dec 4 01:02:39 2005 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <823242bd0512030149t3e6a18d2x@mail.gmail.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> <823242bd0512030149t3e6a18d2x@mail.gmail.com> Message-ID: <4392402F.303@panix.com> Ian Clarke wrote: >If you broaden your definition of set 1 to be networks which are used >for the covert transmission of information (I think this is a more >useful definition for the set as not all covert activity is illegal), >then I am not sure, in practice, how many networks will fall into set >2 that aren't also members of set 1, in fact, I can't think of any >non-contrived situations where one would create a f2f network >motivated by something other than a desire to be covert in some way. > > A private network allows participants to talk freely without every comment ending up in Google, and that allows you to have the kind of conversation which shouldn't be public. The application is to enable speech which isn't intended for global scale, usually about personal issues like sex, money, family, friendships, and gossip. I wouldn't call that covert, illegal, or contrived, just private. From kerry at vscape.com Sun Dec 4 05:36:21 2005 From: kerry at vscape.com (Kerry Bonin) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland (OR) In-Reply-To: <20051202193833.GD2249@leitl.org> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> <20051202193833.GD2249@leitl.org> Message-ID: <43928055.4080306@vscape.com> I'm more of a lurker on this list, but might be able to make meeting - I'm in Corvallis, so Portland or Eugene is possible some weekend evenings... Eugen Leitl wrote: >On Fri, Dec 02, 2005 at 11:13:33AM -0800, coderman wrote: > > >>On 12/2/05, Daniel Stutzbach wrote: >> >> >>>I'm in Eugene. I'd be willing to drive up for a get-together if we >>>have a big enough group to make it interesting. >>> >>> >>i'd be happy to travel to eugene if more of the group is located there >>as well. weekends would be best in that case. >> >> > >Allright! I'm game. > >;) > > > >------------------------------------------------------------------------ > >_______________________________________________ >p2p-hackers mailing list >p2p-hackers@zgp.org >http://zgp.org/mailman/listinfo/p2p-hackers >_______________________________________________ >Here is a web page listing P2P Conferences: >http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051203/51d9aaf9/attachment.html From lemonobrien at yahoo.com Sun Dec 4 06:10:58 2005 From: lemonobrien at yahoo.com (Lemon Obrien) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <4392402F.303@panix.com> Message-ID: <20051204061058.64313.qmail@web53602.mail.yahoo.com> here here....government just needs to let us be. 
Lucas Gonze wrote: Ian Clarke wrote: >If you broaden your definition of set 1 to be networks which are used >for the covert transmission of information (I think this is a more >useful definition for the set as not all covert activity is illegal), >then I am not sure, in practice, how many networks will fall into set >2 that aren't also members of set 1, in fact, I can't think of any >non-contrived situations where one would create a f2f network >motivated by something other than a desire to be covert in some way. > > A private network allows participants to talk freely without every comment ending up in Google, and that allows you to have the kind of conversation which shouldn't be public. The application is to enable speech which isn't intended for global scale, usually about personal issues like sex, money, family, friendships, and gossip. I wouldn't call that covert, illegal, or contrived, just private. _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences You don't get no juice unless you squeeze Lemon Obrien, the Third. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051203/6ca537fe/attachment.htm From m.rogers at cs.ucl.ac.uk Sun Dec 4 15:32:25 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:05 2006 Subject: idealized content network properties (Re: [p2p-hackers] darknet) In-Reply-To: <20051203121737.GA3572@bitchcake.off.net> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202133516.GA15480@bitchcake.off.net> <439057F1.2060208@cs.ucl.ac.uk> <20051203121737.GA3572@bitchcake.off.net> Message-ID: <43930C09.5000006@cs.ucl.ac.uk> Adam Back wrote: > Sure I just mean if you make it invitation only, thats the same > network but with something preventing other people subscribing. Agreed - you could argue that WASTE is just Gnutella without host caches. ;-) > I'd say that darknet term specifically implies some opaqueness to > outside observers -- likely encryption no? (but f2f would not > necessarily, its just a invite only collaboration group network). F2F means more than invitation-only. Invitation-only means you need to know some member of the network in order to join, but it doesn't say anything about who you can see once you've joined. F2F means you can only see the people you know. A house party is invitation-only but not F2F; a drug distribution network is F2F. The difference is important because an invitation-only non-F2F network loses privacy as it grows, whereas an F2F network doesn't. Cheers, Michael From p2phackerslist at rhesusb.dk Sun Dec 4 18:56:07 2005 From: p2phackerslist at rhesusb.dk (DanielEKFA) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Looking for litterature on file sharing networks... Message-ID: <200512041956.09896.p2phackerslist@rhesusb.dk> Hi there :) I'm study computer science and I'm writing a synopsis on peer-to-peer, more specifically file sharing networks. I want to focus on the protocols/network structures used in different file sharing programs. I know the internet is my best friend, but for something more concrete (schools like books), do you guys have any books to recommend? 
Good URLs are very welcome, too :) Thanks in advance, Daniel From trep at cs.ucr.edu Sun Dec 4 19:27:11 2005 From: trep at cs.ucr.edu (Thomas Repantis) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Looking for litterature on file sharing networks... In-Reply-To: <200512041956.09896.p2phackerslist@rhesusb.dk> References: <200512041956.09896.p2phackerslist@rhesusb.dk> Message-ID: <20051204192711.GA85169@angeldust.chaos> Hi, You may want to take a look at: E.K. Lua, J. Crowcroft, M. Pias, R. Sharma, and S. Lim. A survey and comparison of peer-to-peer overlay network schemes. IEEE Communications Surveys & Tutorials, 7(2):72­93, Second Quarter 2005. J. Risson and T. Moors. Survey of research towards robust peer-to-peer networks: Search methods. Technical Report UNSW-EE-P2P-1-1, University of New South Wales, Sydney, Australia, September 2004. D. Milojicic, V. Kalogeraki, R. Lukose, K. Nagaraja, J. Pruyne, B. Richard, S. Rollins, and Z. Xu. Peer-to-Peer Computing. Technical Report HPL-2002-57, HP Labs, 2003. Cheers, Thomas On Sun, Dec 04, 2005 at 07:56:07PM +0100, DanielEKFA wrote: > Hi there :) > > I'm study computer science and I'm writing a synopsis on peer-to-peer, more > specifically file sharing networks. I want to focus on the protocols/network > structures used in different file sharing programs. I know the internet is my > best friend, but for something more concrete (schools like books), do you > guys have any books to recommend? Good URLs are very welcome, too :) > > Thanks in advance, > Daniel -- http://www.cs.ucr.edu/~trep -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051204/4043dc84/attachment.pgp From coderman at gmail.com Mon Dec 5 02:37:12 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland^H^H^Heugene (OR) ; scheduling Message-ID: <4ef5fec60512041837i7c631756rff3a9c01761c6dce@mail.gmail.com> to attempt a meeting this year (however futile) we have the following options: sat/sun dates: 10/11 17/18 2--hehe who am i kidding... 31/1st? 10th or 17th or jan. would be my preference. happy holidays On 12/3/05, Kerry Bonin wrote: > I'm more of a lurker on this list, but might be able to make meeting - I'm > in Corvallis, so Portland or Eugene is possible some weekend evenings... From arachnid at notdot.net Mon Dec 5 03:28:03 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? Message-ID: <4393B3C3.5040905@notdot.net> Systems like BitTorrent have a rather annoying failure mode - the last 'seed' goes offline while there are still several 'peers' (without the complete file) online. Attempts by the peers to reconstruct the original file are rarely successful, as the chances of every single block being present on one of the seeds are generally very low - it's likely that at least one block is missing. However, what if one were to precode files to be distributed using a standard error correcting code such as a reed-solomon code? By generating 10% check blocks, and treating the composite file the same as you would the original (with the exception that you can stop downloading when you reach 90%, and reconstruct using check blocks from there), you can reduce the chance of the last departing seed ensuring nobody can complete the file. 
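Whether 10% of check blocks is actually enough is easy to estimate with a quick simulation. The sketch below is illustrative only: it assumes each remaining peer holds a uniformly random subset of the coded blocks (which a real rarest-first swarm will not match exactly) and an ideal code in which any 1000 of the 1100 coded blocks rebuild the file, as a Reed-Solomon style code would provide.

    # Monte Carlo sketch of the "last seed leaves" scenario, under the
    # assumptions stated above (random holdings, ideal erasure code).
    import random

    def swarm_can_finish(n_data, n_check, n_peers, blocks_per_peer):
        total = n_data + n_check
        held = set()
        for _ in range(n_peers):
            held.update(random.sample(range(total), blocks_per_peer))
        # With n_check = 0 this reduces to "every original block survives".
        return len(held) >= n_data

    def success_rate(n_check, trials=2000):
        wins = sum(swarm_can_finish(1000, n_check, 4, 500) for _ in range(trials))
        return wins / trials

    print("no check blocks :", success_rate(0))    # essentially 0.0
    print("10% check blocks:", success_rate(100))  # succeeds more often than not

With these particular numbers the margin is thin (the expected shortfall is close to the 100-block cushion), so the check blocks help a great deal but are not a guarantee.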
If we assume there are 4 peers left on the network, each with 50% of the file remaining, on average they will be able to reconstruct 50% + 25% + 12.5% + 6.25% = 93.75% of the file, which exceeds the threshold required to reconstruct with check blocks. So, a couple of questions: 1) How common is this failure mode? Does it occur often enough to justify the extra complexity? 2) Do peers generally have enough pieces between them to reach or exceed the 90% threshold? -Nick Johnson From coderman at gmail.com Mon Dec 5 03:39:48 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4393B3C3.5040905@notdot.net> References: <4393B3C3.5040905@notdot.net> Message-ID: <4ef5fec60512041939q18355de7n499ce3c2d8c41b7a@mail.gmail.com> On 12/4/05, Nick Johnson wrote: > Systems like BitTorrent have a rather annoying failure mode - the last > 'seed' goes offline while there are still several 'peers' (without the > complete file) online. any system which makes a partial resource available has this problem, even one using error correcting codes (seed goes offline before requisite X of coded blocks are sent). > However, what if [.. doing stuff to mix data ..], you > can reduce the chance of the last departing seed ensuring nobody can > complete the file. no; you've just made it less likely that the end of the file will always be the part missing if a peer terminates distribution prematurely. From arachnid at notdot.net Mon Dec 5 03:45:54 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4ef5fec60512041939q18355de7n499ce3c2d8c41b7a@mail.gmail.com> References: <4393B3C3.5040905@notdot.net> <4ef5fec60512041939q18355de7n499ce3c2d8c41b7a@mail.gmail.com> Message-ID: <4393B7F2.3010300@notdot.net> coderman wrote: >On 12/4/05, Nick Johnson wrote: > > >>Systems like BitTorrent have a rather annoying failure mode - the last >>'seed' goes offline while there are still several 'peers' (without the >>complete file) online. >> >> > >any system which makes a partial resource available has this problem, >even one using error correcting codes (seed goes offline before >requisite X of coded blocks are sent). > > But statistically, if n different peers each have random subsets of the data, the chances of them having 90% of the file between them are much, much higher than the chances of them having 100%. >>However, what if [.. doing stuff to mix data ..], you >>can reduce the chance of the last departing seed ensuring nobody can >>complete the file. >> >> > >no; you've just made it less likely that the end of the file will >always be the part missing if a peer terminates distribution >prematurely. > > BitTorrent distributes chunks semi-randomly, not sequentially, so you're no more likely to have the beginning than the end of the file. -Nick Johnson From coderman at gmail.com Mon Dec 5 04:06:37 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4393B7F2.3010300@notdot.net> References: <4393B3C3.5040905@notdot.net> <4ef5fec60512041939q18355de7n499ce3c2d8c41b7a@mail.gmail.com> <4393B7F2.3010300@notdot.net> Message-ID: <4ef5fec60512042006j60238e48q8003449fe1c60878@mail.gmail.com> On 12/4/05, Nick Johnson wrote: > ... 
> But statistically, if n different peers each have random subsets of the > data, the chances of them having 90% of the file between them are much, > much higher than the chances of them having 100%. you are assuming there was at least one complete distribution. in the situation you describe (last seed leaves) some of the remaining peers do then become seeds as they obtain requisite missing chunks to complete the torrent, if the remaining peers have the blocks required to complete what is missing. i don't see how error codes would be an improvement (considering coding overhead / expansion), unless the distribution of blocks using the current bittorrent algorithm was heavily weighted somehow. (is it?) if a complete copy has not been distributed within the group then it doesn't matter what encoding mechanism you use, and in my experience this has been the usual cause of partial failures. From agthorr at cs.uoregon.edu Mon Dec 5 04:36:02 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4ef5fec60512042006j60238e48q8003449fe1c60878@mail.gmail.com> References: <4393B3C3.5040905@notdot.net> <4ef5fec60512041939q18355de7n499ce3c2d8c41b7a@mail.gmail.com> <4393B7F2.3010300@notdot.net> <4ef5fec60512042006j60238e48q8003449fe1c60878@mail.gmail.com> Message-ID: <20051205043601.GA3088@cs.uoregon.edu> On Sun, Dec 04, 2005 at 08:06:37PM -0800, coderman wrote: > if a complete copy has not been distributed within the group then it > doesn't matter what encoding mechanism you use, and in my experience > this has been the usual cause of partial failures. Out of curiosity, what's your dataset and how have you established that the original seed failed to distribute at least one copy? -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From arachnid at notdot.net Mon Dec 5 04:55:02 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4ef5fec60512042006j60238e48q8003449fe1c60878@mail.gmail.com> References: <4393B3C3.5040905@notdot.net> <4ef5fec60512041939q18355de7n499ce3c2d8c41b7a@mail.gmail.com> <4393B7F2.3010300@notdot.net> <4ef5fec60512042006j60238e48q8003449fe1c60878@mail.gmail.com> Message-ID: <4393C826.1070500@notdot.net> coderman wrote: > you are assuming there was at least one complete distribution. > >in the situation you describe (last seed leaves) some of the remaining >peers do then become seeds as they obtain requisite missing chunks to >complete the torrent, if the remaining peers have the blocks required >to complete what is missing. i don't see how error codes would be an >improvement (considering coding overhead / expansion), unless the >distribution of blocks using the current bittorrent algorithm was >heavily weighted somehow. (is it?) > >if a complete copy has not been distributed within the group then it >doesn't matter what encoding mechanism you use, and in my experience >this has been the usual cause of partial failures. > > No, the point behind using ECC is that you don't need a complete distribution, only 90%. Here's some stats: Assume a file is distributed in 1000 blocks. The last seed goes offline, leaving 4 peers, each with an average of 500 blocks. Between them, they will, on average, have 1 - (500/1000)^4 percent of the blocks - 93.75%. 
The lieklihood of them having the entire file is (1 - 0.5^4) ^ 1000 - a very, very small number (approximately 10^-30). However, if we precode this file into 1100 blocks, of which only 1000 are required, assuming the same 500 blocks per peer, they have on average 1 - (600/1100)^4 percent of the blocks - 91.1%. Since they only require 10/11 (90.9%), they will usually have enough to reconstruct the original file. Unfortunately, I can't recall the neccessary stats to calculate the exact chance of success. -Nick Johnson From Arnaud.Legout at sophia.inria.fr Mon Dec 5 08:49:13 2005 From: Arnaud.Legout at sophia.inria.fr (Arnaud Legout) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4393B3C3.5040905@notdot.net> References: <4393B3C3.5040905@notdot.net> Message-ID: <4393FF09.6030302@sophia.inria.fr> Hi, Nick Johnson wrote: > Systems like BitTorrent have a rather annoying failure mode - the last > 'seed' goes offline while there are still several 'peers' (without the > complete file) online. Attempts by the peers to reconstruct the > original file are rarely successful, as the chances of every single > block being present on one of the seeds are generally very low - it's > likely that at least one block is missing. from my point of view this is pure myth. I often see such claims that bittorrent suffers from last pieces problem; that if there is no seed, the torrent is dead; etc. From all the experiments I performed, the reality is very different. Rarest first does a very good job at replicating the rarest pieces in a torrent, so that the probability to have a piece that is not replicated at all is very low. Of course, one can always build toys model that show problems in extreme cases. But, it is clear that BitTorrent is not a one fit all solution, and BitTorrent is very successful for its targeted applications: large scale replication for medium to large files. Outside this target, it makes sense to design other classes of applications. However, I am not convinced that error correcting code are the solution. Such codes are terribly sexy, but when it comes to real applications, things are far less sexy. It is hard to tune such codes as their relevance really comes from the context. If the context is a moving target, then the problem becomes very complex. Consider the simple case of a improving reliability of a satellite link. It is not trivial at all to find the good tradeoff between reliability and overhead. When it comes to a distributed system with heterogeneous clients, the problem is several order of magnitude more complex. Regards, Arnaud. From arachnid at notdot.net Mon Dec 5 10:34:55 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4393FF09.6030302@sophia.inria.fr> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> Message-ID: <439417CF.6060800@notdot.net> Arnaud Legout wrote: > Hi, > > Nick Johnson wrote: > >> Systems like BitTorrent have a rather annoying failure mode - the >> last 'seed' goes offline while there are still several 'peers' >> (without the complete file) online. Attempts by the peers to >> reconstruct the original file are rarely successful, as the chances >> of every single block being present on one of the seeds are generally >> very low - it's likely that at least one block is missing. 
> > from my point of view this is pure myth. > I often see such claims that bittorrent suffers from last pieces > problem; that if there is no seed, the torrent is dead; etc. > From all the experiments I performed, the reality is very different. > Rarest first does a very good job at replicating the rarest pieces in > a torrent, so that the probability to have a piece that is not > replicated at all is very low. This is why I'm after stats, not guesses - I'm of the opinion that even with rarest first, the chances of getting every single block are very low (remember, if you have 1000 blocks, and you're 99% likely to have each block, that's still only a 0.004% chance you'll have them all). However, that's just my guess, and this is just yours - only stats will show it one way or the other, really. > > Outside this target, it makes sense to design other classes of > applications. However, I am not convinced that error correcting code > are the > solution. Such codes are terribly sexy, but when it comes to real > applications, things are far less sexy. > It is hard to tune such codes as their relevance really comes from the > context. If the context is a moving target, then the problem becomes > very complex. In this case, the progression of expected percentage of blocks (50% + 25% + 12.5% + ...) gives us a very good idea how large an ECC is required - after only 4 generations (4 peers with average 50%), the amount of data expected is over 90%, so for most purposes 10% check blocks will be sufficient. Since more check blocks don't require any more transmission, the overhead is pretty low, and it's quite possible to set the threshold higher if desired. Any amount should give benifits, however. -Nick Johnson From Arnaud.Legout at sophia.inria.fr Mon Dec 5 11:31:22 2005 From: Arnaud.Legout at sophia.inria.fr (Arnaud Legout) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <439417CF.6060800@notdot.net> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> <439417CF.6060800@notdot.net> Message-ID: <4394250A.9050704@sophia.inria.fr> Hi, Nick Johnson wrote: > > This is why I'm after stats, not guesses - I'm of the opinion that > even with rarest first, the chances of getting every single block are > very low (remember, if you have 1000 blocks, and you're 99% likely to > have each block, that's still only a 0.004% chance you'll have them > all). However, that's just my guess, and this is just yours - only > stats will show it one way or the other, really. > and we have stats. You can have a look at (section IV-B): http://hal.inria.fr/inria-00000156/en for an experimental evaluation of rarest first. We are still working on this paper and more results are to come. However, they all show that rarest first increases the entropy of the pieces in a way that renders more complex piece management pointless in all the torrents we monitored. In particular, rarest first increases very fast the rarest pieces in your peer set so that the probability to have rare pieces in your peer set decreases fast with time. Therefore, even if some peers leave the peer set, the chance to have missing pieces is low. Of course, you have transient period of time during which some pieces may disappear from the torrent. But in a typical torrent, this is unlikely. Do not hesitate to comment or ask questions on our paper, I would be pleased to answer. Regards, Arnaud. 
From p2phackerslist at rhesusb.dk Mon Dec 5 12:18:27 2005 From: p2phackerslist at rhesusb.dk (DanielEKFA) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Looking for litterature on file sharing networks... In-Reply-To: <200512041956.09896.p2phackerslist@rhesusb.dk> References: <200512041956.09896.p2phackerslist@rhesusb.dk> Message-ID: <200512051318.27717.p2phackerslist@rhesusb.dk> To Thomas and Bram: Those resources are great! Perfect with technical documents, and great with Bram's list of different networks. Thanks again, both of you! :) On Sunday 2005-12-04 19:56, DanielEKFA wrote: > Hi there :) > > I'm study computer science and I'm writing a synopsis on peer-to-peer, more > specifically file sharing networks. I want to focus on the > protocols/network structures used in different file sharing programs. I > know the internet is my best friend, but for something more concrete > (schools like books), do you guys have any books to recommend? Good URLs > are very welcome, too :) > > Thanks in advance, > Daniel > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From arachnid at notdot.net Mon Dec 5 19:52:08 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4394250A.9050704@sophia.inria.fr> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> <439417CF.6060800@notdot.net> <4394250A.9050704@sophia.inria.fr> Message-ID: <43949A68.4090306@notdot.net> Arnaud Legout wrote: > Hi, > > Nick Johnson wrote: > >> >> This is why I'm after stats, not guesses - I'm of the opinion that >> even with rarest first, the chances of getting every single block are >> very low (remember, if you have 1000 blocks, and you're 99% likely to >> have each block, that's still only a 0.004% chance you'll have them >> all). However, that's just my guess, and this is just yours - only >> stats will show it one way or the other, really. >> > and we have stats. You can have a look at (section IV-B): > http://hal.inria.fr/inria-00000156/en > > for an experimental evaluation of rarest first. We are still working > on this paper and more results are to come. Excellent - this is exactly what I was looking for. However, I'm a little confused - first you say "Fig. 9 represents the evolution of the number of copies of pieces in the peer set with time", then you say "Fig. 12 represents the evolution of the number of copies of pieces in the peer set with time. We see some major differences compared to Fig. 9". What's the difference between what you're graphing in the two graphs? -Nick Johnson From lemonobrien at yahoo.com Mon Dec 5 20:14:08 2005 From: lemonobrien at yahoo.com (Lemon Obrien) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] P2P in SFC In-Reply-To: Message-ID: <20051205201408.64907.qmail@web53606.mail.yahoo.com> I need confirmation, time and place...so I can make sure I'm there...and on time to get a good seat :) lemon You don't get no juice unless you squeeze Lemon Obrien, the Third. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051205/ee4b480d/attachment.htm From sleety at gmail.com Tue Dec 6 08:20:41 2005 From: sleety at gmail.com (Mr Iceman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] I need a ASP security code Message-ID: <917b56f0512060020o4c37581xaa0a227f84c1f70d@mail.gmail.com> Hello. I need a ASP security code for login pages and save the information in Data base. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051206/aca61240/attachment.html From enzomich at gmail.com Tue Dec 6 08:37:36 2005 From: enzomich at gmail.com (Enzo Michelangeli) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failurein BitTorrent like systems? References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> Message-ID: <02c601c5fa40$57fbd520$0200a8c0@em.noip.com> ----- Original Message ----- From: "Arnaud Legout" Sent: Monday, December 05, 2005 4:49 PM [...] > from my point of view this is pure myth. > I often see such claims that bittorrent suffers from last pieces > problem; that if there is no seed, the torrent is dead; etc. > From all the experiments I performed, the reality is very different. > Rarest first does a very good job at replicating the rarest pieces > in a torrent, so that the probability to have a piece that is not > replicated at all is very low. Sometimes particular blocks are not missing, but mangled in transit by NAT routers too smart for their own good (and their owners', among which, at one time, myself). See: http://azureus.aelitis.com/wiki/index.php/NinetyNine The existence of such routers is doubted at: http://www.plugndial.com/draft-jennings-midcom-stun-results-02.txt [...] Some NATs were rumored to exist that looked in arbitrary packets for either the NATs' external IP address or for the internal host IP address - either in binary or dotted decimal form - and rewrote it to something else. STUN could be extended to test for exactly this type of behavior by echoing arbitrary client data and the mapped address but sending the bits inverted so these evil NATs did not mess with them. NATs that do this will break integrity detection on payloads. ...but I can testify that a Sercom IP706ST once in my possession did perform such "blind payload patch" for packets sent to the DMZ host. Needless to say, it's been demoted to paperweight :-) Enzo From dbarrett at quinthar.com Tue Dec 6 08:43:01 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] P2P in SFC - Last call Message-ID: <43954F15.7080008@quinthar.com> Looks like we'll have a good showing at the P2P event. Despite the size, I'm sticking to my guns: Ryoko's Sushi - Wednesday, 12/7 at 9pm - URL: http://tinyurl.com/bkk5d - Phone: (415) 775-1028 - Address: 619 Taylor St, San Francisco, CA 94102 To keep the logistics simple, I'll bring a jar into which everyone can toss money, and I'll keep ordering sushi until the jar runs dry. If you have any preferences, do let me know. As for drinks, I spoke with the pretty girls there and they'll take care of you at the bar. Any lurkers who haven't spoken up feel free to come, and all those who have please let me know if you won't. Until then, see ya'll soon! -david From sg266 at cornell.edu Tue Dec 6 09:05:30 2005 From: sg266 at cornell.edu (Saikat Guha) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failurein BitTorrent like systems? 
In-Reply-To: <02c601c5fa40$57fbd520$0200a8c0@em.noip.com> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> <02c601c5fa40$57fbd520$0200a8c0@em.noip.com> Message-ID: <1133859930.3099.20.camel@himalaya.cs.cornell.edu> On Tue, 2005-12-06 at 16:37 +0800, Enzo Michelangeli wrote: > Some NATs were rumored to exist that looked in arbitrary packets for > either the NATs' external IP address or for the internal host IP > address - either in binary or dotted decimal form - and rewrote it to > something else. [...] > > ...but I can testify that a Sercom IP706ST once in my possession did > perform such "blind payload patch" for packets sent to the DMZ host. > Needless to say, it's been demoted to paperweight :-) FWIW, we looked for such behavior in NATs w.r.t TCP packets (http://nutss.net/stunt-results.php). We couldn't find any evidence of TCP data mangling in the 120 or so NATs that we tested. Would it be possible to run the STUNT NAT test from behind your paperweight? :-D cheers, -- Saikat -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051206/9b078982/attachment.pgp From Arnaud.Legout at sophia.inria.fr Tue Dec 6 09:34:31 2005 From: Arnaud.Legout at sophia.inria.fr (Arnaud Legout) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <43949A68.4090306@notdot.net> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> <439417CF.6060800@notdot.net> <4394250A.9050704@sophia.inria.fr> <43949A68.4090306@notdot.net> Message-ID: <43955B27.30202@sophia.inria.fr> Hi, Nick Johnson wrote: > > Excellent - this is exactly what I was looking for. I happy to see it is useful to you > However, I'm a little confused - first you say "Fig. 9 represents the > evolution of the number of copies of pieces in the peer set with > time", then you say "Fig. 12 represents the evolution of the number of > copies of pieces in the peer set with time. We see some major > differences compared to Fig. 9". What's the difference between what > you're graphing in the two graphs? This is not the same torrent. Fig. 9 is for torrent 7 (see Table 1), and Fig. 12 is for torrent 11. Torrent 9 is a typical torrent. We see that the number of copies in your peer set is well bounded. Torrent 11 is a torrent with only one seed for most of the monitoring. In this case there are many pieces with only one copy (only on the seed because the torrent is just starting), but we see that even in this case the mean number of copies increases and that the rarest pieces are replicated fast. This torrent shows that even with only one source, the pieces are efficiently replicated (Fig. 14 and Fig.15 leads to this conclusion). Arnaud. From Arnaud.Legout at sophia.inria.fr Tue Dec 6 09:46:13 2005 From: Arnaud.Legout at sophia.inria.fr (Arnaud Legout) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failurein BitTorrent like systems? 
In-Reply-To: <02c601c5fa40$57fbd520$0200a8c0@em.noip.com> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> <02c601c5fa40$57fbd520$0200a8c0@em.noip.com> Message-ID: <43955DE5.7090808@sophia.inria.fr> Hi, Enzo Michelangeli wrote: > Sometimes particular blocks are not missing, but mangled in transit by NAT > routers too smart for their own good (and their owners', among which, at > one time, myself). You do not target the same problem. It is far more easy to define your redundancy parameters to correct x% of corruption than to solve a distributed piece selection problem. Moreover, corrupted pieces cannot be replicated in a torrent because your have a hash for each piece. Therefore, I do not see how you can stop at 99% of the download because some pieces are corrupted. In this case, the piece is simply retransmitted, which increases slightly the download time. By no way it should compromise a torrent or create a last pieces problem. Arnaud. From sg266 at cornell.edu Tue Dec 6 10:37:18 2005 From: sg266 at cornell.edu (Saikat Guha) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] DHTs in highly-transient networks In-Reply-To: References: Message-ID: <1133865438.3099.65.camel@himalaya.cs.cornell.edu> On Thu, 2005-12-01 at 18:38 +0000, Salem Mark wrote: > in "highly transient networks", where the number of nodes > appearing and disappearing are very high, maintaining the DHT becomes hard > and introduces considerable overhead. > > I am trying to find out what exactly "highly-transient" means. A file > sharing network like Gnutella, seems to be highly transient, where peers > join/leave the network frequently. True; the answer depends on the particular application and protocol. In Gnutella (without ultra-peers), activity of all clients would affect the network equally. With an intelligently chosen subset of nodes, (ultrapeers in Gnutella, supernodes in Kazaa and Skype) the effects of churn can be mitigated. This relies on the assumption that this subset of nodes is more stable than the rest. The assumption appears to be borne out in Gnutella (Daniel's paper), and in Skype [1]. [1] An Experimental Study of the Skype Peer-to-Peer VoIP System http://www.guha.cc/~saikat/pub/cucs05-skype-abstract.php > Could somebody elaborate on this? is > there a node departure/arrival/failure rate (per sec? per min?) that > identifies "highly-transient" networks ? FWIW, in [1], we found that the supernode turnover is typically less than 5% / 30min. Median supernode session time is 5.5 hours; session time is heavy-tailed (Pareto) and not exponential. Supernodes are much more stable than regular nodes. Btw, if anyone wants a copy of [1], please email me directly. It has data on Skype supernode lifetimes, churn rates, comparison between skype supernodes and regular nodes, Skype VoIP and file-transfer workload characterization, etc. In short, we found that Skype differs considerably from filesharing networks (different usage model, much higher median lifetimes etc). cheers, -- Saikat -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051206/26715c26/attachment.pgp From enzomich at gmail.com Tue Dec 6 11:15:20 2005 From: enzomich at gmail.com (Enzo Michelangeli) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes toprevent failurein BitTorrent like systems? 
References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr><02c601c5fa40$57fbd520$0200a8c0@em.noip.com> <43955DE5.7090808@sophia.inria.fr> Message-ID: <035101c5fa5a$0abebc40$0200a8c0@em.noip.com> ----- Original Message ----- From: "Arnaud Legout" Sent: Tuesday, December 06, 2005 5:46 PM > You do not target the same problem. It is far more easy to define > your redundancy parameters to > correct x% of corruption than to solve a distributed piece > selection problem. > Moreover, corrupted pieces cannot be replicated in a torrent because > your have a hash for each piece. > Therefore, I do not see how you can stop at 99% of the download > because some pieces are corrupted. In this case, > the piece is simply retransmitted, which increases slightly the > download time. By no way it should compromise a torrent > or create a last pieces problem. Why should retransmission solve the problem? If I'm behind a "mangling router", trying to download a file a piece of which has a 4-byte sequence that is mistaken by the router as an IP address that needs translation, every time I get that sequence the data will be corrupted. Enzo From enzomich at gmail.com Tue Dec 6 11:44:37 2005 From: enzomich at gmail.com (Enzo Michelangeli) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to preventfailurein BitTorrent like systems? References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr><02c601c5fa40$57fbd520$0200a8c0@em.noip.com> <1133859930.3099.20.camel@himalaya.cs.cornell.edu> Message-ID: <035601c5fa5a$7c2f4340$0200a8c0@em.noip.com> ----- Original Message ----- From: "Saikat Guha" Sent: Tuesday, December 06, 2005 5:05 PM On Tue, 2005-12-06 at 16:37 +0800, Enzo Michelangeli wrote: [...] >> ...but I can testify that a Sercom IP706ST once in my possession did >> perform such "blind payload patch" for packets sent to the DMZ host. >> Needless to say, it's been demoted to paperweight :-) > > FWIW, we looked for such behavior in NATs w.r.t TCP packets > (http://nutss.net/stunt-results.php). We couldn't find any evidence of > TCP data mangling in the 120 or so NATs that we tested. Would it be > possible to run the STUNT NAT test from behind your paperweight? :-D It'll take a few days... But anyway the mangling I observed happened on UDP packets, rather than TCP. (I cannot exclude that TCP was affected as well, though). Enzo From stelian at axigenmail.com Tue Dec 6 12:59:54 2005 From: stelian at axigenmail.com (Stelian) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4393B3C3.5040905@notdot.net> References: <4393B3C3.5040905@notdot.net> Message-ID: <43958B4A.8000902@axigenmail.com> Nick Johnson wrote: > If we assume there are 4 peers left on the network, each with 50% of > the file remaining, on average they will be able to reconstruct 50% + > 25% + 12.5% + 6.25% = 93.75% of the file, which exceeds the threshold > required to reconstruct with check blocks. Your concern over the disappearing seed is obviously relevant in case of a slow seed, otherwise the probability that the seed has not uploaded all the blocks at least once before departing is practically very low. So let's assume a slow seed, a seed so slow that once it has finished uploading a block, all the lechers will share the new available block instantly among them. 
Assuming the seed has time to upload only 90% of the original file, then it will have time to upload only 81.8% of the new file (10% larger because of the error correction) - which is of course insufficient to reconstitute the file, thereby negating any apparent gain. From Arnaud.Legout at sophia.inria.fr Tue Dec 6 13:08:39 2005 From: Arnaud.Legout at sophia.inria.fr (Arnaud Legout) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes toprevent failurein BitTorrent like systems? In-Reply-To: <035101c5fa5a$0abebc40$0200a8c0@em.noip.com> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr><02c601c5fa40$57fbd520$0200a8c0@em.noip.com> <43955DE5.7090808@sophia.inria.fr> <035101c5fa5a$0abebc40$0200a8c0@em.noip.com> Message-ID: <43958D57.3030305@sophia.inria.fr> Hi, Enzo Michelangeli wrote: > Why should retransmission solve the problem? If I'm behind a "mangling > router", trying to download a file a piece of which has a 4-byte sequence > that is mistaken by the router as an IP address that needs translation, > every time I get that sequence the data will be corrupted. > I did not understand that you were referring to this kind of problem. I was arguing about random corruption of a given amount of packets. Regards, Arnaud. From xiangsong.hou at gmail.com Tue Dec 6 15:58:41 2005 From: xiangsong.hou at gmail.com (xiangsong hou) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] published key change frequency in DHT Message-ID: hi all: as we know, a DHT can deal with nodes joining and leaving frequently. I want to know whether a DHT can also deal with published keys that change frequently. For example, in grid computing resource discovery over a DHT, the published key (representing CPU or memory) changes very frequently, so the node it is assigned to changes frequently. How is this situation handled in a DHT? Are there papers about this? HOUXS -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051206/3b9f6362/attachment.htm From ian.clarke at gmail.com Tue Dec 6 17:47:40 2005 From: ian.clarke at gmail.com (Ian Clarke) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <4392402F.303@panix.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> <823242bd0512030149t3e6a18d2x@mail.gmail.com> <4392402F.303@panix.com> Message-ID: <823242bd0512060947y3e4d1bd9n@mail.gmail.com> On 03/12/05, Lucas Gonze wrote: > A private network allows participants to talk freely without every > comment ending up in Google, and that allows you to have the kind of > conversation which shouldn't be public. The application is to enable > speech which isn't intended for global scale, usually about personal > issues like sex, money, family, friendships, and gossip. I wouldn't > call that covert, illegal, or contrived, just private. I think that would fall under my definition of "covert". Ian. From nigini at gmail.com Wed Dec 7 00:08:48 2005 From: nigini at gmail.com (Nigini Oliveira) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: Error correcting codes to prevent failure in BitTorrent like systems? Message-ID: <94fec2490512061608m14cc56f7vdfe4a423c95e4635@mail.gmail.com> Hi ALL!
Don't know specifically about BitTorrent, but I've found a lot of results talking about the gains of using ECC for improving the availability of data in p2p networks: http://www.citeulike.org/user/nigini/article/274016 (not read yet) These days I'm working on the analysis of a paper that talks about using "Erasure Codes" to do that (maybe one day I can finish my model on it): http://www.citeulike.org/user/nigini/article/307402 (this is pretty hard to understand) Searching now I've found this one-page work exposing some "interesting" data: http://dmi.ensica.fr/IMG/pdf/347.pdf After I began to study ECC this year I can't stop finding examples of their use in the network/computing world. But as Arnaud said: "Such codes are terribly sexy, but when it comes to real applications, things are far less sexy." But since I haven't yet concluded that for BitTorrent "less sexy" means "not needed", maybe the above work can inspire some answer. "Até mais!" -- Nigini Abilio Oliveira Mestrando em Computação UFCG - DSC - COPIN www.nigini.com.br nigini@gmail.com nigini@dsc.ufcg.edu.br -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051206/76ec1011/attachment.html From nigini at gmail.com Wed Dec 7 00:32:14 2005 From: nigini at gmail.com (Nigini Oliveira) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <94fec2490512061608m14cc56f7vdfe4a423c95e4635@mail.gmail.com> References: <94fec2490512061608m14cc56f7vdfe4a423c95e4635@mail.gmail.com> Message-ID: <94fec2490512061632i6801bc90ne763e5372aceec79@mail.gmail.com> Just found this text going right to the point... I don't know, but it appears that this guy is connected with BitTorrent development... http://www.livejournal.com/users/bramcohen/1416.html On 12/6/05, Nigini Oliveira wrote: > > Hi ALL! > > Don't know specifically about BitTorrent, but I've found a lot of results > talking about the gains of using ECC for improving the availability of data > in p2p networks: > http://www.citeulike.org/user/nigini/article/274016 (not read yet) > > These days I'm working on the analysis of a paper that talks about using > "Erasure Codes" to do that (maybe one day I can finish my model on it): > http://www.citeulike.org/user/nigini/article/307402 (this is pretty hard > to understand) > > Searching now I've found this one-page work exposing some "interesting" > data: > http://dmi.ensica.fr/IMG/pdf/347.pdf > > After I began to study ECC this year I can't stop finding examples > of their use in the network/computing world. But as Arnaud said: "Such codes > are terribly sexy, but when it comes to real applications, things are far > less sexy." But since I haven't yet concluded that for BitTorrent "less sexy" > means "not needed", maybe the above work can inspire some answer. > > "Até mais!" > > -- > Nigini Abilio Oliveira > Mestrando em Computação > UFCG - DSC - COPIN > www.nigini.com.br > nigini@gmail.com > nigini@dsc.ufcg.edu.br -- Nigini Abilio Oliveira Mestrando em Computação UFCG - DSC - COPIN www.nigini.com.br nigini@gmail.com nigini@dsc.ufcg.edu.br -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051206/f39d5c5a/attachment.htm From bneijt at gmail.com Wed Dec 7 01:31:01 2005 From: bneijt at gmail.com (Bram Neijt) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Any body know this kind of network?
Message-ID: <46c2f4ab0512061731m59b6ebder89c62a15aa5f9fd1@mail.gmail.com> Hi. I'm writing up some documentation on P2P systems, and I've tried to make an overview of all kinds of networks. From simple http client-server networks to anonymous P2P, in the hope I could predict the next step. One thing I thought up was a highly unusable network, which some university project might have tried out. Maybe some of you can point me to an "approximate example" of that network: Clients constantly receive data from the network and push it back to other hosts they are connected to, inserting their own requests, filling requests with data and picking out their own data. They don't have any identification of where the data came from (not even an anonymous ID) and simply pick out the "right" data. If you know of a system that comes close, I would like to be able to point to it in my documentation. Greetings, Bram PS The documents I'm working on can be found here: http://www.ai.rug.nl/~bneijt/doc/networks/levels.html From m.rogers at cs.ucl.ac.uk Wed Dec 7 10:46:29 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Any body know this kind of network? In-Reply-To: <46c2f4ab0512061731m59b6ebder89c62a15aa5f9fd1@mail.gmail.com> References: <46c2f4ab0512061731m59b6ebder89c62a15aa5f9fd1@mail.gmail.com> Message-ID: <4396BD85.5000509@cs.ucl.ac.uk> Hi Bram, A few systems you might be interested in: P5 (http://www.cs.umd.edu/projects/p5/p5-extended.pdf), Cashmere (http://www.cs.ucsb.edu/~ravenben/publications/pdf/cashmere-nsdi05.pdf), Herbivore (http://www.cs.cornell.edu/People/egs/papers/herbivore-tr.pdf) and XOR trees (ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-54.ps.gz). Herbivore and XOR trees are based on the dining cryptographers protocol (http://world.std.com/~franl/crypto/dining-cryptographers.txt). [vapourware] I'm also working on an anonymous communication system where there are no end-to-end node IDs, but nodes can use link-local flow identifiers to recognise packets that are part of the same flow, and anonymous delivery receipts to work out which flows are being routed in the right direction. Trial and error can be used to find good routes to a destination without knowing its address. Hopefully this will be more efficient than flooding without sacrificing anonymity. [/vapourware] Cheers, Michael Bram Neijt wrote: > Hi. > > I'm writing up some documentation on P2P systems, and I've tried to > make an overview of all kinds of networks. From simple http > client-server networks to anonymous P2P, in the hope I could predict > the next step. > > One thing I thought up was a highly unusable network, which some > university project might have tried out. Maybe some of you can point > me to an "approximate example" of that network: > > Clients constantly receive data from the network and push it back to > other hosts they are connected to, inserting their own requests, > filling requests with data and picking out their own data. They don't > have any identification of where the data came from (not even an > anonymous ID) and simply pick out the "right" data. > > If you know of a system that comes close, I would like to be able to > point to it in my documentation.
> > Greetings, > Bram > PS The documents I'm working on can be found here: > http://www.ai.rug.nl/~bneijt/doc/networks/levels.html > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From bneijt at gmail.com Wed Dec 7 13:16:24 2005 From: bneijt at gmail.com (Bram Neijt) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Any body know this kind of network? In-Reply-To: <4396BD85.5000509@cs.ucl.ac.uk> References: <46c2f4ab0512061731m59b6ebder89c62a15aa5f9fd1@mail.gmail.com> <4396BD85.5000509@cs.ucl.ac.uk> Message-ID: <46c2f4ab0512070516w3453b5bn3fe1a044f7349306@mail.gmail.com> Thanks to Gun and Michael, I'm going to take some time reading the papers and sites you guys pointed to, so it will take some time before they are in the documentation. And now that this level has been identified, I'm off to complete the list a bit more and hopefully think of yet another kind of system along the way ;-) Thanks for the quick replies! Bram From ludovic.courtes at laas.fr Wed Dec 7 16:18:46 2005 From: ludovic.courtes at laas.fr (Ludovic =?iso-8859-1?Q?Court=E8s?=) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: <438C9F88.2050803@pdos.lcs.mit.edu> (Jeremy Stribling's message of "Tue, 29 Nov 2005 13:35:52 -0500") References: <200511291414.35852.01771@iha.dk> <20051129141713.6A9CB698@yumyum.zooko.com> <20051129142151.8E1A035E4@yumyum.zooko.com> <438C9F88.2050803@pdos.lcs.mit.edu> Message-ID: <87lkywg9sp.fsf_-_@laas.fr> Hi, Jeremy Stribling writes: > Working on it. Should have something public within a few months: > > http://pdos.csail.mit.edu/papers/overcite:iptps05/index.html Indeed, that seems very promising! Similarly, are there people working on decentralized web indexing and search engines? To paraphrase Zooko, it would be nice to decentralize Google before it is too late... Thanks, Ludovic. From gwendal.simon at francetelecom.com Wed Dec 7 16:36:01 2005 From: gwendal.simon at francetelecom.com (SIMON Gwendal RD-MAPS-ISS) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines Message-ID: In comparison with traditional filesharing approaches, a decentralized search for the web should take into account the words inside the documents. As previously said, we are working on a system named Maay which aims at performing a decentralized and personalized search on a distributed set of textual documents. http://maay.netofpeers.net Each node (i.e. a computer) can publish a set of documents. This information space does not initially contain the web. Our idea is to consider that the cache (or history) of the web browser should be, by default, included in the published set of documents. So, every page that has been visited by at least one person in the last x days will be available in the network. Obviously, the more popular a page is, the more available it is. By the way, one first challenge is the implementation of a nice crawler for owned documents : an indexer. This indexer should be able to scan and retrieve words from various documents (.html, .doc, .pdf, ...). It should be light and run in idle time and, if possible, be cross-platform. If you know a good open-source indexer, please let us know.
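To make the requirement concrete, the core of such an indexer is a small inverted index built from local files. The sketch below is only an illustration, not Maay code: it is Python, it assumes a hypothetical ~/Documents corpus, and it handles only plain text and HTML, whereas formats such as .doc and .pdf would need a dedicated extractor (several are suggested later in this thread).

import os, re
from collections import defaultdict
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Keeps the text content of an HTML page and drops the tags."""
    def __init__(self):
        super().__init__()
        self.chunks = []
    def handle_data(self, data):
        self.chunks.append(data)

def extract_text(path):
    # Plain text and HTML only; .doc/.pdf would need real format extractors.
    raw = open(path, encoding="utf-8", errors="ignore").read()
    if path.endswith((".html", ".htm")):
        parser = TextOnly()
        parser.feed(raw)
        raw = " ".join(parser.chunks)
    return raw

def build_index(root):
    index = defaultdict(set)               # word -> set of document paths
    for dirpath, _, names in os.walk(root):
        for name in names:
            if name.endswith((".txt", ".html", ".htm")):
                path = os.path.join(dirpath, name)
                for word in re.findall(r"[a-z0-9]+", extract_text(path).lower()):
                    index[word].add(path)
    return index

if __name__ == "__main__":
    idx = build_index(os.path.expanduser("~/Documents"))  # hypothetical corpus location
    print(sorted(idx["peer"]))                             # documents containing "peer"

A real indexer would additionally need incremental updates and idle-time scheduling, which is where the existing desktop indexers mentioned below have an edge.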
-- Gwendal > -----Message d'origine----- > De : p2p-hackers-bounces@zgp.org > [mailto:p2p-hackers-bounces@zgp.org] De la part de Ludovic Court?s > Envoy? : mercredi 7 d?cembre 2005 17:19 > ? : strib@MIT.EDU > Cc : Peer-to-peer development.; zooko@zooko.com > Objet : [p2p-hackers] Decentralized search engines > > Hi, > > Jeremy Stribling writes: > > > Working on it. Should have something public within a few months: > > > > http://pdos.csail.mit.edu/papers/overcite:iptps05/index.html > > Indeed, that seems very promising! > > Similarly, are there people working on decentralized web indexing and > search engines? To paraphrase Zooko, it would be nice to decentralize > Google before it is too late... > > Thanks, > Ludovic. > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From solipsis at pitrou.net Wed Dec 7 17:17:49 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: References: Message-ID: <1133975869.5662.5.camel@fsol> Hi Gwendal :) Le mercredi 07 d?cembre 2005 ? 17:36 +0100, SIMON Gwendal RD-MAPS-ISS a ?crit : > By the way, one first challenge is the implementation of a nice > crawler for owned documents : an indexer. This indexer should be able > to scan and retrieve words from various documents > (.html, .doc, .pdf, ...). It should be light and run in idle time and, > if possible, be cross-platform. If you know a good open-source > indexer, please let us know. You can look at the techniques used by Beagle : http://beaglewiki.org/ or Kat : http://kat.mandriva.com/ or the Gnome Deskbar applet : http://live.gnome.org/DeskbarApplet http://raphael.slinckx.net/deskbar/ Regards Antoine. From agthorr at cs.uoregon.edu Wed Dec 7 17:27:26 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: <87lkywg9sp.fsf_-_@laas.fr> References: <200511291414.35852.01771@iha.dk> <20051129141713.6A9CB698@yumyum.zooko.com> <20051129142151.8E1A035E4@yumyum.zooko.com> <438C9F88.2050803@pdos.lcs.mit.edu> <87lkywg9sp.fsf_-_@laas.fr> Message-ID: <20051207172725.GG5812@cs.uoregon.edu> On Wed, Dec 07, 2005 at 05:18:46PM +0100, Ludovic Court?s wrote: > Similarly, are there people working on decentralized web indexing and > search engines? To paraphrase Zooko, it would be nice to decentralize > Google before it is too late... For what purpose do you want to "decentralize Google"? Is it for some technical reason where you believe a decentralized index will provide better end-user performance? Or is it because you don't think any single organization should have that much control over information? -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From gwendal.simon at francetelecom.com Wed Dec 7 17:49:07 2005 From: gwendal.simon at francetelecom.com (SIMON Gwendal RD-MAPS-ISS) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines Message-ID: > from Daniel Stutzbach > > For what purpose do you want to "decentralize Google"? > > Is it for some technical reason where you believe a decentralized > index will provide better end-user performance? Yes. Crawler-based systems are not up-to-date. 
It is especially bad in the current context of dynamic webpages : news, posts, comments... Moreover, more contents can be indexed : my photos, my music... > Or is it because you don't think any single organization should have > that much control over information? Yes. By the way, an organization can be sued and shutdown (eg. napster) -- Gwendal > > -- > Daniel Stutzbach Computer Science > Ph.D Student > http://www.barsoom.org/~agthorr > University of Oregon > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From hannes.tschofenig at siemens.com Wed Dec 7 17:45:23 2005 From: hannes.tschofenig at siemens.com (Tschofenig, Hannes) Date: Sat Dec 9 22:13:05 2006 Subject: AW: [p2p-hackers] Decentralized search engines Message-ID: hi daniel, google is, in some sense, already using a decentralized solution. they are using more than 160.000 machines for scalability, cost and performance reasons. i wonder whether there is actually detailed information available how their system works. ciao hannes > -----Urspr?ngliche Nachricht----- > Von: p2p-hackers-bounces@zgp.org > [mailto:p2p-hackers-bounces@zgp.org] Im Auftrag von Daniel Stutzbach > Gesendet: Mittwoch, 7. Dezember 2005 18:27 > An: Peer-to-peer development. > Cc: strib@MIT.EDU; zooko@zooko.com > Betreff: Re: [p2p-hackers] Decentralized search engines > > On Wed, Dec 07, 2005 at 05:18:46PM +0100, Ludovic Court?s wrote: > > Similarly, are there people working on decentralized web > indexing and > > search engines? To paraphrase Zooko, it would be nice to > decentralize > > Google before it is too late... > > For what purpose do you want to "decentralize Google"? > > Is it for some technical reason where you believe a decentralized > index will provide better end-user performance? > > Or is it because you don't think any single organization should have > that much control over information? > > -- > Daniel Stutzbach Computer Science > Ph.D Student > http://www.barsoom.org/~agthorr > University of Oregon > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From mgp at ucla.edu Wed Dec 7 19:27:48 2005 From: mgp at ucla.edu (Michael Parker) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: References: Message-ID: <20051207112748.1ptnql9c004s4oko@mail.ucla.edu> The first step of indexing is the actual keyword extraction itself. From what I have heard, libextractor is a good open-source solution: http://gnunet.org/libextractor/ - Mike Parker Quoting SIMON Gwendal RD-MAPS-ISS : > By the way, one first challenge is the implementation of a nice > crawler for owned documents : an indexer. This indexer should be able > to scan and retrieve words from various documents (.html, .doc, .pdf, > ...). It should be light and run in idle time and, if possible, be > cross-platform. If you know a good open-source indexer, please let us > know. 
> From coderman at gmail.com Wed Dec 7 19:40:50 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: References: Message-ID: <4ef5fec60512071140u70e2213dk29e31c47d4b205b2@mail.gmail.com> On 12/7/05, Tschofenig, Hannes wrote: > ... > google is, in some sense, already using a decentralized solution. they are using more than 160.000 machines for scalability, cost and performance reasons. distributed != decentralized. googlefs and map reduce papers describe some of their internals. (regarding how it works) http://labs.google.com/papers/index.html From zooko at zooko.com Thu Dec 8 14:48:25 2005 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <823242bd0512030149t3e6a18d2x@mail.gmail.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> <823242bd0512030149t3e6a18d2x@mail.gmail.com> Message-ID: <20051208144825.A210014E74@yumyum.zooko.com> Ian Clarke wrote: > > I do wish you would refer to these networks as those which allow the > covert transmission of information, rather than those which are used > for the illegal transmission of information - since I am not aware of > any networks that are specifically designed for the illegal > transmission of information. I think this would help alleviate the > political problem you raise later in your email. The concept of a networking technology or a network which is specifically used for illegal information is an interesting concept, for example Tim May "blacknet" [1, 2, 3] and Biddle, et al. "darknet" [4]. If you would like to use "darknet" to mean something else then I can't stop you, but I would like to talk about that concept so I need a word for it. Regards, Zooko P.S. The most salient difference between blacknet [1] and darknet [2] in my opinion is that blacknet is a market, in which participants are motivated by economic gain, and darknet is a more general concept, in which the motivations of participants may be various -- including but not limited to friendship. [1] http://www.privacyexchange.org/iss/confpro/cfpuntraceable.html [2] http://www-personal.umich.edu/~ludlow/worries.txt [3] http://cypherpunks.venona.com/date/1993/08/msg00538.html [4] http://zgp.org/pipermail/p2p-hackers/2005-December/003245.html From zooko at zooko.com Thu Dec 8 15:08:39 2005 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] f2f for purposes other than privacy In-Reply-To: <823242bd0512030149t3e6a18d2x@mail.gmail.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> <823242bd0512030149t3e6a18d2x@mail.gmail.com> Message-ID: <20051208150840.01E8A14E74@yumyum.zooko.com> Ian Clarke wrote: > > We (Freenet) have > been concerned about the fact that Freenet was harvestable for several > years now. Around spring this year I made the observation that if > human relationships form a small world network, it should be possible > to assign locations to people such that we form a Kleinberg-style > small world network, and thus we could make the network routable. > Oskar Sandberg then suggested a way to do this, and we set about > validating the concept using simulations. I would love to learn more. 
Is there a white-paper or design document beyond these slides from DefCon [1]? > Are you aware of any current or proposed f2f > networks for which concealment of user activity is not a goal? Well, I think of the links between two friends in f2f to be not solely communication channels but also to have other meaning. For example, if friends transmit music files to one another, then in addition to any privacy properties that the network may have, it also serves as a decentralized, attack-resistant recommendation engine for music. Honestly, this area of research is ripe for exploration, but I can give you at least a couple of examples. Doceur set it up with a claimed general negative result in "The Sybil Attack" in 2002 [2]. But his general negative result isn't quite true, as disproven by e.g. Advogato, 2000 [3, 4, 5]. Recently George Danezis, Chris Lesniewski-Laas, M. Frans Kaashoek, and Ross Anderson smashed these two ideas together and mixed in some DHT routing: [6]. [6] is an excellent paper, which proposes a concrete DHT design and which really nails the fact that the introduction graph or "bootstrap graph" contains information which can defeat the allegedly undefeatable Sybil Attack. [6] references some related work which looks interesting, but I haven't followed those links yet myself. I guess [6] is somewhat relevant to the Freenet v0.7 design. So, uh, anyway, this shows that there is interest in the notion of using friendship networks for purposes other than privacy, namely attack resistance of DHT routing and attack resistance of metadata [7 (self-citation)]. I think there's a lot more value to be mined from this concept, and I'm really glad that it has finally gotten the attention of some p2p researchers. Oh, and here's another perspective on this idea -- a post I wrote to my blog a few years ago suggesting that all sorts of DHT innovations which were intended to improve network performance could be applied to attack resistance: "trust is just another topology" [8]. Regards, Zooko [1] http://freenetproject.org/papers/vegas1_dc.pdf [2] http://citeseer.ist.psu.edu/douceur02sybil.html [3] http://www.advogato.org/trust-metric.html [4] http://www.levien.com/thesis/compact.pdf [5] http://www.levien.com/free/tmetric-HOWTO.html [6] http://pdos.csail.mit.edu/cgi-bin/pubs-date.cgi?match=Sybil-resistant+DHT+routing [7] http://conferences.oreillynet.com/cs/p2p2001/view/e_sess/1200 [8] http://www.zooko.com/log-2003-01.html#d2003-01-23-trust_is_just_another_topology From zooko at zooko.com Thu Dec 8 15:28:58 2005 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] DHT generalization for purposes other than network performance In-Reply-To: <20051208150840.01E8A14E74@yumyum.zooko.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> <823242bd0512030149t3e6a18d2x@mail.gmail.com> <20051208150840.01E8A14E74@yumyum.zooko.com> Message-ID: <20051208152858.0F5FC14E74@yumyum.zooko.com> I'm going to paraphrase a blog entry I wrote some years ago and then mention some newer research that is related. In January 2003 [1] I wrote something like: > Thanks to Peter Marbach for the discussion that prompted this insight. > > A network is defined on top of an underlying network. 
The first emergent > networks (Chord), assumed that the underlying network was (a) fully > connected and (b) homogeneous in the sense that any hop was considered to be > just as expensive as any other hop. The most important contribution of > Pastry (and then of Kademlia) is to treat the underlying network as > heterogeneous, in the sense that some hops are considered more expensive > that others. For Pastry, they chose to make these costs reflect network > performance (i.e. latency or throughput) so that Pastry would optimize for > faster routing (e.g. don't send packets through Japan when they are on their > way from Canada to USA). For Kademlia, they chose to make these costs > reflect uptime of peers in order to optimize for stability. So my big > realization is: > > *** Trust (or vulnerability, or exposure) can also be modelled in the same > way, as costs on the links of the underlying network. > > In addition, the underlying network may be incompletely connected, either > because of (a) trust disconnects, (b) firewalls, NATs, censorship, > terrorism, (c) the underlying network doesn't have complete routing e.g. > wireless ad hoc networks. > > This encourages me a lot: the fact that mainstream emergent network > researchers like Project IRIS might develop techniques for overlay networks > to work on more general underlying networks (especially non-fully- > connected), and that these techniques can then be applied to trust networks. The recent research that I wanted to cite was Michael Freedman et al. analyzing practical details of how current DHTs work atop non-fully-connected underlay networks: [2]. [2] doesn't propose any good general solution, and indeed it speculates that a fresh new DHT designed to handle non-fully-connected underlays may be needed. Regards, Zooko [1] http://www.zooko.com/log-2003-01.html#d2003-01-23-trust_is_just_another_topology [2] "Non-Transitive Connectivity and DHTs" http://www.scs.cs.nyu.edu/~mfreed/publications/ From Serguei.Osokine at efi.com Thu Dec 8 18:11:04 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] P2P in SFC - Last call Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42783@fcexmb04.efi.internal> > Ryoko's Sushi > - Wednesday, 12/7 at 9pm Thanks, David, for organizing that! Great crowd, great conversations. And I'm adding this place to my own short list of selected restaurants - I do not believe I've ever seen Kirin on tap anywhere else around here... Best wishes - S.Osokine. 8 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of David Barrett Sent: Tuesday, December 06, 2005 12:43 AM To: Peer-to-peer development. Subject: [p2p-hackers] P2P in SFC - Last call Looks like we'll have a good showing at the P2P event. Despite the size, I'm sticking to my guns: Ryoko's Sushi - Wednesday, 12/7 at 9pm - URL: http://tinyurl.com/bkk5d - Phone: (415) 775-1028 - Address: 619 Taylor St, San Francisco, CA 94102 To keep the logistics simple, I'll bring a jar into which everyone can toss money, and I'll keep ordering sushi until the jar runs dry. If you have any preferences, do let me know. As for drinks, I spoke with the pretty girls there and they'll take care of you at the bar. Any lurkers who haven't spoken up feel free to come, and all those who have please let me know if you won't. Until then, see ya'll soon! 
-david _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From don at dhoffman.net Thu Dec 8 19:21:45 2005 From: don at dhoffman.net (Donald Hoffman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] P2P in SFC - Last call In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42783@fcexmb04.efi.internal> References: <4A60C83D027E224BAA4550FB1A2B120EC42783@fcexmb04.efi.internal> Message-ID: I second Serguei's comment. A great evening. Thanks or organizing, David. On Dec 8, 2005, at 10:11 AM, Serguei Osokine wrote: >> Ryoko's Sushi >> - Wednesday, 12/7 at 9pm > > Thanks, David, for organizing that! > > Great crowd, great conversations. > > And I'm adding this place to my own short list of selected restaurants > - I do not believe I've ever seen Kirin on tap anywhere else around > here... > > Best wishes - > S.Osokine. > 8 Dec 2005. > > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers- > bounces@zgp.org]On > Behalf Of David Barrett > Sent: Tuesday, December 06, 2005 12:43 AM > To: Peer-to-peer development. > Subject: [p2p-hackers] P2P in SFC - Last call > > > Looks like we'll have a good showing at the P2P event. Despite the > size, I'm sticking to my guns: > > Ryoko's Sushi > - Wednesday, 12/7 at 9pm > - URL: http://tinyurl.com/bkk5d > - Phone: (415) 775-1028 > - Address: 619 Taylor St, San Francisco, CA 94102 > > To keep the logistics simple, I'll bring a jar into which everyone can > toss money, and I'll keep ordering sushi until the jar runs dry. > If you > have any preferences, do let me know. As for drinks, I spoke with the > pretty girls there and they'll take care of you at the bar. > > Any lurkers who haven't spoken up feel free to come, and all those who > have please let me know if you won't. Until then, see ya'll soon! > > -david > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From dbarrett at quinthar.com Fri Dec 9 02:15:06 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] P2P in SFC - The Aftermath Message-ID: <4398E8AA.2050002@quinthar.com> Thanks to everyone for attending. A good crowd and great conversation all around. And astonishingly, only a single beer spilt. Final menu: 4 hamachi nigiri 2 tekka maki 2 sake maki 2 maguro nigiri 2 sake nigiri 2 teriyaki chicken 2 toro nigiri 2 saba nigiri 1 Spanish mackerel nigiri 1 halibut nigiri 2 asparagus rolls 2 California rolls 2 rainbow rolls 1 spicy tuna roll 3 tempura Nothing too fancy, but covered all the basics. Not a single uni request, even though I know there was at least one fanatic in the crowd. So thanks again for everyone who came, and thanks to the rest for enduring my endless emails. 
Also, thanks to Travis Kalanick from Red Swoosh for picking up the slack between what we ordered and what money was dropped into the bucket. For those who are interested, there's also a "superhappydevhouse" event (http://superhappydevhouse.com/) this Saturday in the hills above San Mateo. Come show your stuff in the "P2P Hour of Power" -- it's a great, informal way to get 15 minutes of fame in front of a surprisingly large crowd of surprisingly social geeks. I'll be there and will likely demo iGlance; feel free to send me any questions if you're interested in participating. This concludes our test of the P2P Sushifest broadcast system, now back to your originally scheduled programming. -david From dbarrett at quinthar.com Fri Dec 9 02:32:06 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: References: Message-ID: <4398ECA6.2090301@quinthar.com> SIMON Gwendal RD-MAPS-ISS wrote: > This > information space does not initially contain the web. Our idea is to > consider that the cache (or history) of the web browser should be, by > default, included in the published set of documents. I assume you have a good answer for this, but how will you prevent (for example) cached copies of Hotmail from ending up in your system? Also, is there any way to correlate an actual web URL with content in your system? For example, could you do a search in your system, it finds a cached webpage, and then offer a "(www)" link that points back to the original page? Finally, is there any way to create a "private" subset of the network, so (for example) everyone in my company can use this to get quick access to everyone else's documents, but nobody outside my company can use it to get in? Regardless, this looks fantastic. If you're in a US-centric business frame of mind, you might consider using this to ensure Sarbanes-Oxley conformance. Especially if it were scriptable -- have a series of "kill words" that should never appear in any document anywhere in a company (including a specific customer name, or a case number, or whatever). Then have a server loop through the kill list every night and raise a red flag if it finds a document that shouldn't exist. -david From lgonze at panix.com Fri Dec 9 03:34:14 2005 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] P2P in SFC - The Aftermath In-Reply-To: <4398E8AA.2050002@quinthar.com> References: <4398E8AA.2050002@quinthar.com> Message-ID: <4398FB36.4080503@panix.com> And let the record show that the aloha chapter of p2p-hackers attended a talk on the Fortress language at University of Hawaii, followed by pizza (with pineapple, naturally) and a design patterns talk by Sam Joseph. Sushi was not had because we get way too much of it as it is, however at least one member of the chapter did have spam musubi following the get together. From fis at wiwi.hu-berlin.de Fri Dec 9 12:16:26 2005 From: fis at wiwi.hu-berlin.de (Matthias Fischmann) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <43928055.4080306@vscape.com> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> <20051202193833.GD2249@leitl.org> <43928055.4080306@vscape.com> Message-ID: <20051209121626.GB22875@localhost.localdomain> hi, i hope this isn't too rude. 
i am not a big contributer here, so i don't feel i have the right to make any demands. but anyway here you go: i feel those threads on getting together somewhere are only interesting to the small subset of geographically affected subscribers, so i would like to suggest that these are moved to a yahoo group or a different mailing list as soon as a few people get interested. for instance, the SFC group might have reached this momentum. just a suggestion. thanks, matthias -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051209/4dfbeba8/attachment.pgp From stewbagz at gmail.com Fri Dec 9 16:45:45 2005 From: stewbagz at gmail.com (stew "stewbagz" mercer) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <20051209121626.GB22875@localhost.localdomain> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> <20051202193833.GD2249@leitl.org> <43928055.4080306@vscape.com> <20051209121626.GB22875@localhost.localdomain> Message-ID: <3b4626760512090845y40995fc6h@mail.gmail.com> On 09/12/05, Matthias Fischmann wrote: > > [snipped] > > i feel those threads on getting together somewhere are only > interesting to the small subset of geographically affected > subscribers, so i would like to suggest that these are moved to a > yahoo group or a different mailing list as soon as a few people get > interested. for instance, the SFC group might have reached this > momentum. > [snipped] I'm just jealous that they live on the warm, sunny west coast of america, and I'm stuck here in grey, windswept London. Sushi ? That's something Homer ate once wasn't it ? Fugu ? :) Kind regs Stew From jbj at forbidden.co.uk Fri Dec 9 17:19:01 2005 From: jbj at forbidden.co.uk (Jeremy James) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <3b4626760512090845y40995fc6h@mail.gmail.com> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> <20051202193833.GD2249@leitl.org> <43928055.4080306@vscape.com> <20051209121626.GB22875@localhost.localdomain> <3b4626760512090845y40995fc6h@mail.gmail.com> Message-ID: <4399BC85.6010001@forbidden.co.uk> stew "stewbagz" mercer wrote: > >> [double snipped] > > [snipped] > > I'm just jealous that they live on the warm, sunny west coast of > america, and I'm stuck here in grey, windswept London. Sushi ? That's > something Homer ate once wasn't it ? Fugu ? > Maybe we should have our own p2p meeting in London to celebrate just how damn blustery it is. And none of this sushi malarky - proper pub meal and pints of ale all round, I think. Best wishes, Jeremy From matthewsp at avaya.com Fri Dec 9 19:12:37 2005 From: matthewsp at avaya.com (Matthews, Philip (Philip)) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Where do bright minds discuss p2p technology? Message-ID: I am also in Ottawa. We have been doing some P2P work here at the former Nimcat Networks (now part of Avaya) and I would be interested in getting together and discussing P2P research. 
- Philip Matthews > -----Original Message----- > From: p2p-hackers-bounces@zgp.org > [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Iles, Michael > Sent: November 29, 2005 06:44 > To: Peer-to-peer development. > Subject: RE: [p2p-hackers] Where do bright minds discuss p2p > technology? > > +1 for an Ottawa meeting, and +1 for sushi and beer :) > > Mike. > > (Ottawa, the land of real winters and overpriced sushi.) > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] > On Behalf Of Roop Mukherjee > Sent: November 28, 2005 5:48 PM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] Where do bright minds discuss p2p > technology? > > Looks like the SFC folks will meet soon. For the rest of us with real > winters;)- any p2p folks in the neighborhood of Ottawa, ON Canada, > interested in having a similar meeting? > > - roop > ______________________________________ > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > This message may contain privileged and/or > confidential information. If you have received this e-mail > in error or are not the intended recipient, you may not use, > copy, disseminate or distribute it; do not open any > attachments, delete it immediately from your system and > notify the sender promptly by e-mail that you have done so. > Thank you. > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From lemonobrien at yahoo.com Sat Dec 10 03:38:27 2005 From: lemonobrien at yahoo.com (Lemon Obrien) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <4399BC85.6010001@forbidden.co.uk> Message-ID: <20051210033827.7573.qmail@web53606.mail.yahoo.com> yeah, the weather here is nic for winter....a little rain... the place we emt was a class act. It had real flair/style... we had 20 people there. we should make it a monthly/quarterly event. lime Jeremy James wrote: stew "stewbagz" mercer wrote: > >> [double snipped] > > [snipped] > > I'm just jealous that they live on the warm, sunny west coast of > america, and I'm stuck here in grey, windswept London. Sushi ? That's > something Homer ate once wasn't it ? Fugu ? > Maybe we should have our own p2p meeting in London to celebrate just how damn blustery it is. And none of this sushi malarky - proper pub meal and pints of ale all round, I think. Best wishes, Jeremy _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences You don't get no juice unless you squeeze Lemon Obrien, the Third. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051209/47dd99ee/attachment.htm From wolfgang.mueller at wiai.uni-bamberg.de Sat Dec 10 15:14:33 2005 From: wolfgang.mueller at wiai.uni-bamberg.de (Wolfgang Mueller) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <20051210033827.7573.qmail@web53606.mail.yahoo.com> References: <4399BC85.6010001@forbidden.co.uk> <20051210033827.7573.qmail@web53606.mail.yahoo.com> Message-ID: <20051210151433.GA26619@portos.uni-bamberg.de> Dear SF peers, Would it be possible to spice these reports up by publishing not only a writeup of the menu but also of your topics? BTW I am munching typically German Christmas cookies here... And don't denigrate places that enjoy a winter recognizable as such :-D . And: imagine a Bavarian beer-to-beer meeting. People here confuse b,p,d and t, anyway, so let's use this ambiguity :-D Cheers, Wolfgang -- Dr. Wolfgang Mueller LS Medieninformatik Universitaet Bamberg From travis at redswoosh.net Sun Dec 11 00:04:40 2005 From: travis at redswoosh.net (Travis Kalanick) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Last call for SuperHappyDevHouse Message-ID: <200512110006.jBB06ZZF011954@be9.noc0.redswoosh.com> Hey all, As I had mentioned to a number of people at the SFC get together this week, a pretty cool hacker event is going on this evening at the superhappydevhouse See - http://superhappydevhouse.com/ It starts tonight at 7pm and goes until 7am. We have a few slots for presenters (around midnight) to show off their P2P apps and warez to around 100 bay area geek-devs Presenters have 10+ minutes to show off their stuff, and ideally have technology that can be used by other people in their own projects. Send me an email if you're interested in stopping by, and/or presenting. Thanks, Travis From travis at redswoosh.net Mon Dec 12 02:34:57 2005 From: travis at redswoosh.net (Travis Kalanick) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <20051210151433.GA26619@portos.uni-bamberg.de> Message-ID: <200512120237.jBC2avZF028114@be9.noc0.redswoosh.com> One of the more interesting topics that came up from our sushi-get-together was a fairly rigorous discussion about the merits (or lack thereof) of Proactive Caching. Let's define Proactive Caching as a mechanism where a P2P network sends content to a user's machine for the sole purpose of improving network performance and availability. For instance, imagine that a given network proactively caches "long tail" content to improve availability, or alternatively, proactively caches content during a sudden surge of demand for a particular file. Deep in this discussion at the dinner (this took place in the hours after most folks left) was Sergei, David Barrett, myself, and others, and I thought it would be good to bring this topic to the list. So far, in my opinion, proactive caching on open p2p networks would provide little temporal benefit in availability and performance, given the inherent costs of such a scheme, and given the availability of high-performance, high-reliability p2p architectures. Travis -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Wolfgang Mueller Sent: Saturday, December 10, 2005 7:15 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] p2p in some place or other Dear SF peers, Would it be possible to spice these reports up by publishing not only a writeup of the menu but also of your topics? 
BTW I am munching typically German Christmas cookies here... And don't denigrate places that enjoy a winter recognizable as such :-D . And: imagine a Bavarian beer-to-beer meeting. People here confuse b,p,d and t, anyway, so let's use this ambiguity :-D Cheers, Wolfgang -- Dr. Wolfgang Mueller LS Medieninformatik Universitaet Bamberg _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From osokin at osokin.com Mon Dec 12 03:17:22 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <200512120237.jBC2avZF028114@be9.noc0.redswoosh.com> Message-ID: On Sunday, December 11, 2005 Travis Kalanick wrote: > So far, in my opinion, proactive caching on open p2p networks would > provide little temporal benefit in availability and performance, > given the inherent costs of such a scheme, and given the availability > of high-performance, high-reliability p2p architectures. ...and I was on the other side of this debate. My point was that most of the content in open P2P networks is trapped in the long tail; see, for example: [1] http://p2pecon.berkeley.edu/pub/CWC-EC05.pdf and [2] http://www.mpi-sws.mpg.de/~gummadi/papers/p118-gummadi.pdf - the number of copies of the average file is truly pathetic. As a result, the average download experience is slow and unreliable. For example, the study [2] suggests that in 2002 as many as two thirds of "transactions" (HTTP requests for a single data chunk) used to fail in Kazaa. [Presumably due to the source host overload - Oso] This situation is widely replicated all over P2P space, being observed in some form in Kazaa, Gnutella, eDonkey, etc. The overload of the uploaders was discussed in Gnutella for almost as long as I can remember. The suggested countermeasures include download queues, try-later responses and such, but they all have one thing in common: they suck. The user experience stays pathetic, as can be attested by anyone trying to hunt down something other than a popular file. And the reason for this is quite understandable - if most of the content exists in just one or two copies, what good are the swarm downloaders and other marvelous instruments of progress? This single copy that you need might be on a single host behind the modem in Albania, the host might go off-line at any moment, and to make it more fun, it might be trying to upload five other files (different files, mind you) to five other people at the same time. What is important here is that the de facto statistical distribution of content on the open P2P nets shown in [1] and [2] tells us that this is happening not just for something that we tend to dismiss as "rare files". No - these files represent a huge share of the network content, and as much as 50-70% of the user download attempts are for these "rare files". And all this experience sucks, no matter what fancy download queues are introduced into the system. Simply because there's not enough uplink bandwidth and content sources. So I'm not saying that proactive caching is easy to implement, does not require any resources, or even that it will work well in practice.
I'm just saying that without it the user experience will continue to suck, and proactive caching is the single mechanism that I can see which is potentially able to fix this situation. Unless this rare file is proactively replicated to 5-10 other nodes, I do not see how the adequate download speeds can be achieved. Best wishes - S.Osokine. 11 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Travis Kalanick Sent: Sunday, December 11, 2005 6:35 PM To: 'Peer-to-peer development.' Subject: RE: [p2p-hackers] p2p in some place or other One of the more interesting topics that came up from our sushi-get-together was a fairly rigorous discussion about the merits (or lack thereof) of Proactive Caching. Let's define Proactive Caching as a mechanism where a P2P network sends content to a user's machine for the sole purpose of improving network performance and availability. For instance, imagine that a given network proactively caches "long tail" content to improve availability, or alternatively, proactively caches content during a sudden surge of demand for a particular file. Deep in this discussion at the dinner (this took place in the hours after most folks left) was Sergei, David Barrett, myself, and others, and I thought it would be good to bring this topic to the list. So far, in my opinion, proactive caching on open p2p networks would provide little temporal benefit in availability and performance, given the inherent costs of such a scheme, and given the availability of high-performance, high-reliability p2p architectures. Travis -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Wolfgang Mueller Sent: Saturday, December 10, 2005 7:15 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] p2p in some place or other Dear SF peers, Would it be possible to spice these reports up by publishing not only a writeup of the menu but also of your topics? BTW I am munching typically German Christmas cookies here... And don't denigrate places that enjoy a winter recognizable as such :-D . And: imagine a Bavarian beer-to-beer meeting. People here confuse b,p,d and t, anyway, so let's use this ambiguity :-D Cheers, Wolfgang -- Dr. Wolfgang Mueller LS Medieninformatik Universitaet Bamberg _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From dbarrett at quinthar.com Mon Dec 12 03:47:48 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: Message-ID: <439CF2E4.40808@quinthar.com> Serguei Osokine wrote: > On Sunday, December 11, 2005 Travis Kalanick wrote: > >>So far, in my opinion, proactive caching on open p2p networks would >>provide little temporal benefit in availability and performance, >>given the inherent costs of such a scheme, and given the availability >>of high-performance, high-reliability p2p architectures. 
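For concreteness, here's a throwaway sketch of that three-node setup (the unit rate and the even split of upload capacity across concurrent transfers are my own assumptions, not part of the scenario above); it just reports when, if ever, the downloader ends up with the whole file under each option:

    # Toy model of the three-node network above: one uploader with one file,
    # one downloader that wants it, one bystander that never will.
    # Assumed: unit transfer rate; upload capacity split evenly across
    # concurrent transfers.
    FILE_SIZE = 1.0
    UPLOAD_RATE = 1.0

    def downloader_finish_time(recipients):
        """When the downloader has the whole file, or None if it never does."""
        if "downloader" not in recipients:
            return None
        return FILE_SIZE / (UPLOAD_RATE / len(recipients))

    options = {
        "a) downloader only": ["downloader"],
        "b) bystander only": ["bystander"],
        "c) both": ["downloader", "bystander"],
    }
    for name, recipients in options.items():
        print(name, "->", downloader_finish_time(recipients))
    # a) -> 1.0, b) -> None, c) -> 2.0: with no future requests possible, the
    # extra copy only delays the one node that actually wants the file.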
> > ...and I was on the other side of this debate. ... and I'm somewhere in between. I suspect proactive caching is useful for some configurations of files, uploaders, and would-be-downloaders, but I'm not sure if that configuration exists in the real world to such a degree that it's worth worrying about. Furthermore, I suspect the "real world" in all its glory is far too complicated to get agreement upon quickly, so I think we should first start with simplified worlds and then work up to the real one. For starters, assume "the network" consists of: 1) A single "uploader" with exactly one file 2) A "downloader" that wants the file (but doesn't have it) 3) An "innocent bystander" that neither has nor wants the file (Further assume that there will never be any more files, more nodes, and the innocent bystander will never want the file. Also, assume all three nodes have identical, equal upload/download speeds, unlimited storage, and have been and will be online for eternity.) Thus the uploader can either choose to: a) Send the file only to the downloader b) Only to the innocent bystander c) To both I'd define "proactive caching" as options (b) and (c). And in this specific configuration, I don't see it as useful. I'll define "success" as: - Transfers the maximum number of files to those who want them - In the shortest possible time (Note, I'm explicitly not valuing conservation of bandwidth or storage in order to simplify the case for proactive caching.) Thus for this absolute most basic network, I'd say option (a) is clearly the right choice. Can we agree on that much? If so, what is the *smallest* way this network must change in order for proactive caching to begin offering value? -david From mgp at ucla.edu Mon Dec 12 04:23:26 2005 From: mgp at ucla.edu (Michael Parker) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <200512120237.jBC2avZF028114@be9.noc0.redswoosh.com> References: <200512120237.jBC2avZF028114@be9.noc0.redswoosh.com> Message-ID: <20051211202326.kxu6sg8tw88ss040@mail.ucla.edu> Hmmm, Emin Sirer was on this list talking about Beehive [1] not too long ago. I read the paper, from NSDI 04, and it seems to fit your description very well. They use an analytical model to derive how hard to push replication of a key-value pair such that it can be found in a (configurable) constant number of hops. It assumes that the data set has a zipf-like distribution. You could also try to borrow ideas from something like Glacier [2], from NSDI 05, which replicates to maintain high fault-tolerance, and try to exploit the replication for performance gains instead. (From what I can remember, data-survivability is the first priority of Glacier, not performance... As implied by the name.) Just my two cents. - Mike [1] http://www.cs.cornell.edu/People/egs/papers/beehive.pdf [2] http://www.cs.rice.edu/~druschel/publications/Glacier-NSDI.pdf Quoting Travis Kalanick : > One of the more interesting topics that came up from our sushi-get-together > was a fairly rigorous discussion about the merits (or lack thereof) of > Proactive Caching. > > Let's define Proactive Caching as a mechanism where a P2P network sends > content to a user's machine for the sole purpose of improving network > performance and availability. For instance, imagine that a given network > proactively caches "long tail" content to improve availability, or > alternatively, proactively caches content during a sudden surge of demand > for a particular file. 
> > Deep in this discussion at the dinner (this took place in the hours after > most folks left) was Sergei, David Barrett, myself, and others, and I > thought it would be good to bring this topic to the list. > > So far, in my opinion, proactive caching on open p2p networks would provide > little temporal benefit in availability and performance, given the inherent > costs of such a scheme, and given the availability of high-performance, > high-reliability p2p architectures. > > Travis > From osokin at osokin.com Mon Dec 12 06:06:49 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <439CF2E4.40808@quinthar.com> Message-ID: On Sunday, December 11, 2005 David Barrett wrote: > Thus the uploader can either choose to: > a) Send the file only to the downloader > b) Only to the innocent bystander > c) To both Add to this: d) send the file to bystander in advance, even before the downloader asks for it - and then you'll cover pretty much every possible case of proactive caching, because e) send the file to both bystander and downloader in advance is essentially just an extreme case of "d" :-) Note that Dijjer, for example, seems to use more or less "c" - but I'm not sure if its usage model can be squeezed to fit into your simplified scenario. In real life there might be subsequent requests for the cached file, but since you're saying that the bystander won't ever need the file, there's no one to generate such requests in your model. So even if Dijjer benefits from its caching, your model will miss it. Best wishes - S.Osokine. 11 Dec 2005. -----Original Message----- From: David Barrett [mailto:dbarrett@quinthar.com] Sent: Sunday, December 11, 2005 7:48 PM To: osokin@osokin.com; Peer-to-peer development. Subject: Re: [p2p-hackers] p2p in some place or other Serguei Osokine wrote: > On Sunday, December 11, 2005 Travis Kalanick wrote: > >>So far, in my opinion, proactive caching on open p2p networks would >>provide little temporal benefit in availability and performance, >>given the inherent costs of such a scheme, and given the availability >>of high-performance, high-reliability p2p architectures. > > ...and I was on the other side of this debate. ... and I'm somewhere in between. I suspect proactive caching is useful for some configurations of files, uploaders, and would-be-downloaders, but I'm not sure if that configuration exists in the real world to such a degree that it's worth worrying about. Furthermore, I suspect the "real world" in all its glory is far too complicated to get agreement upon quickly, so I think we should first start with simplified worlds and then work up to the real one. For starters, assume "the network" consists of: 1) A single "uploader" with exactly one file 2) A "downloader" that wants the file (but doesn't have it) 3) An "innocent bystander" that neither has nor wants the file (Further assume that there will never be any more files, more nodes, and the innocent bystander will never want the file. Also, assume all three nodes have identical, equal upload/download speeds, unlimited storage, and have been and will be online for eternity.) Thus the uploader can either choose to: a) Send the file only to the downloader b) Only to the innocent bystander c) To both I'd define "proactive caching" as options (b) and (c). And in this specific configuration, I don't see it as useful. 
I'll define "success" as: - Transfers the maximum number of files to those who want them - In the shortest possible time (Note, I'm explicitly not valuing conservation of bandwidth or storage in order to simplify the case for proactive caching.) Thus for this absolute most basic network, I'd say option (a) is clearly the right choice. Can we agree on that much? If so, what is the *smallest* way this network must change in order for proactive caching to begin offering value? -david From osokin at osokin.com Mon Dec 12 06:19:40 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <20051211202326.kxu6sg8tw88ss040@mail.ucla.edu> Message-ID: On Sunday, December 11, 2005 Michael Parker wrote: > They use an analytical model to derive how hard to push replication > of a key-value pair such that it can be found in a (configurable) > constant number of hops. Right. But Travis is not convinced that it has to be done in the first place, so I would imagine that the optimal nature of such replication would be of secondary importance to him :-) > It assumes that the data set has a zipf-like distribution. This makes me a bit uneasy about this model, by the way. Even if the content distribution in P2P nets would be Zipf (and it isn't), still I would be reluctant to implement anything that rigidly relies on any predetermined distribution. Real functioning systems tend to have some sort of a feedback loop and adapt to the changing situation. In this case, it should cache the proper amount of content regardless of how exactly it is distributed. But I just scanned the paper. Maybe I missed the adaptive part. Best wishes - S.Osokine. 11 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Michael Parker Sent: Sunday, December 11, 2005 8:23 PM To: Peer-to-peer development.; Travis Kalanick Cc: 'Peer-to-peer development.' Subject: RE: [p2p-hackers] p2p in some place or other Hmmm, Emin Sirer was on this list talking about Beehive [1] not too long ago. I read the paper, from NSDI 04, and it seems to fit your description very well. They use an analytical model to derive how hard to push replication of a key-value pair such that it can be found in a (configurable) constant number of hops. It assumes that the data set has a zipf-like distribution. You could also try to borrow ideas from something like Glacier [2], from NSDI 05, which replicates to maintain high fault-tolerance, and try to exploit the replication for performance gains instead. (From what I can remember, data-survivability is the first priority of Glacier, not performance... As implied by the name.) Just my two cents. - Mike [1] http://www.cs.cornell.edu/People/egs/papers/beehive.pdf [2] http://www.cs.rice.edu/~druschel/publications/Glacier-NSDI.pdf Quoting Travis Kalanick : > One of the more interesting topics that came up from our sushi-get-together > was a fairly rigorous discussion about the merits (or lack thereof) of > Proactive Caching. > > Let's define Proactive Caching as a mechanism where a P2P network sends > content to a user's machine for the sole purpose of improving network > performance and availability. For instance, imagine that a given network > proactively caches "long tail" content to improve availability, or > alternatively, proactively caches content during a sudden surge of demand > for a particular file. 
> > Deep in this discussion at the dinner (this took place in the hours after > most folks left) was Sergei, David Barrett, myself, and others, and I > thought it would be good to bring this topic to the list. > > So far, in my opinion, proactive caching on open p2p networks would provide > little temporal benefit in availability and performance, given the inherent > costs of such a scheme, and given the availability of high-performance, > high-reliability p2p architectures. > > Travis > _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From dbarrett at quinthar.com Mon Dec 12 07:00:54 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: Message-ID: <439D2026.6090409@quinthar.com> Serguei Osokine wrote: > On Sunday, December 11, 2005 David Barrett wrote: > >>Thus the uploader can either choose to: >>a) Send the file only to the downloader >>b) Only to the innocent bystander >>c) To both > > Add to this: > > d) send the file to bystander in advance, even before the downloader > asks for it Ah, ok, so it sounds like your saying that proactive caching becomes valuable when the bystander gets the file before the downloader makes its request. (This seems obvious in retrospect, but I was thinking along the lines of "just in time" proactive caching -- sending the file to more than just who requested it to somehow improve the experience for the requester. I couldn't see any way to make this work, though I'd be happy to be wrong.) So really, it sounds like proactive caching sets a "minimum replication" target for every file with one or more requests. Thus the policy is to keep making copies until that target is achieved. If anyone goes offline (and thus reduces a file's cache count below the minimum threshold), a new cache is made. How much of this is motivated by a desire for "high availability" versus "high performance"? In other words, if you had a "guaranteed seeder" for the file that you knew would never go offline, would proactive caching still be worth the trouble? Again, my initial thought around proactive caching was to use it to improve download performance by making the long tail download as fast as a well-deployed file. (And I still don't see how to make that happen without simply making every file "well deployed" through massive proactive caching.) But I can see how very limited proactive caching can be used to improve availability, and thus ensure the interesting subset of the long tail be downloaded *at all*. -david From gbildson at limepeer.com Mon Dec 12 17:16:53 2005 From: gbildson at limepeer.com (Greg Bildson) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: Message-ID: There are an infinite number of rare files so caching those without future knowledge about anyone's interest would be costly and infeasible. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Serguei Osokine > Sent: Sunday, December 11, 2005 10:17 PM > To: Peer-to-peer development. 
> Subject: RE: [p2p-hackers] p2p in some place or other > > > On Sunday, December 11, 2005 Travis Kalanick wrote: > > So far, in my opinion, proactive caching on open p2p networks would > > provide little temporal benefit in availability and performance, > > given the inherent costs of such a scheme, and given the availability > > of high-performance, high-reliability p2p architectures. > > ...and I was on the other side of this debate. > > My point was that most of content in open P2P networks is > trapped in the long tail; see, for example: > > [1] http://p2pecon.berkeley.edu/pub/CWC-EC05.pdf > > and > > [2] http://www.mpi-sws.mpg.de/~gummadi/papers/p118-gummadi.pdf > > - the number of copies of the average file is truly pathetic. As a > result, the average download experience is slow and unreliable. For > example, the study [2] suggests than in 2002 as many as two thirds of > "transactions" (HTTP requests for a single data chunk) used to fail > in Kazaa. [Presumably due to the source host overload - Oso] > From matthew at matthew.at Mon Dec 12 17:31:01 2005 From: matthew at matthew.at (Matthew Kaufman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: Message-ID: <200512121731.jBCHVEU88770@where.matthew.at> Greg Bildson: > There are an infinite number of rare files so caching those > without future knowledge about anyone's interest would be > costly and infeasible. On any given actual file sharing network, I believe that's not actually true. In fact, it "probably" isn't even true for the known universe of computers :) The number *can be* very large however, so as has been pointed out before, how much sense this makes really depends upon the total number of files and their distribution. Consider, for instance, the cost of having every file stored on exactly ONE more node than actually cares about the file at present. If almost all files are, as we fear, on a very long tail of height one, then this approximately doubles the storage requirements network-wide (and communiaction required for that replication scales equivalently). However, that is the worst-case... Real distributions may not look like this at all, especially for things like commercial file distribution networks or corporate intranet applications. The real trick for the general case is probably selling users on the idea that the overhead they experience (disk space, bandwidth requirements, number of times their house is raided in a search for illicit bits) is worth the performance gains (if any) that they see. Matthew Kaufman matthew@matthew.at www.amicima.com From gbildson at limepeer.com Mon Dec 12 17:47:47 2005 From: gbildson at limepeer.com (Greg Bildson) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <200512121731.jBCHVEU88770@where.matthew.at> Message-ID: Given the number of people sharing unique personal files (photos, etc), partial downloads and accidentally sharing entire drives, it's closer to reality then you may believe. I don't really mean infinite of course but large. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Matthew Kaufman > Sent: Monday, December 12, 2005 12:31 PM > To: 'Peer-to-peer development.' > Subject: RE: [p2p-hackers] p2p in some place or other > > > Greg Bildson: > > There are an infinite number of rare files so caching those > > without future knowledge about anyone's interest would be > > costly and infeasible. 
> > On any given actual file sharing network, I believe that's not actually > true. In fact, it "probably" isn't even true for the known universe of > computers :) > From alenlpeacock at gmail.com Mon Dec 12 17:49:41 2005 From: alenlpeacock at gmail.com (Alen Peacock) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: Message-ID: On 12/12/05, Greg Bildson wrote: > There are an infinite number of rare files so caching those without future > knowledge about anyone's interest would be costly and infeasible. I'd add: what is the self-interested motivation for a node to agree to cache the content in the first place? If proactive caching were turned on by default in my p2p filesharing client, don't I have a very real incentive to turn this off in my own node to preserve bandwidth, disk space, and perhaps limit any legal liability? If the implemented client doesn't have the option to turn this feature off, isn't there a very real incentive to use a different client to get better performance / less risk? The beauty of "a) Send the file only to the downloader" is that self-interest is leveraged to get the downloader to share. Is there some incentive or mechanism to enforce fairness with proactive caching? If there is, then it seems like you've still got to overcome Greg's argument against, which is similar to many of the arguments made against pre-fetching in traditional caching literature: how do you ensure that you prefetch the right content, especially when the cost of prefetching the wrong content is very high? Alen (Hoping I'm not re-hashing conversations you had over sushi -- posting anyway). From Serguei.Osokine at efi.com Mon Dec 12 18:01:05 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42792@fcexmb04.efi.internal> On Sunday, December 11, 2005 David Barrett wrote: > How much of this is motivated by a desire for "high availability" > versus "high performance"? Not sure how to separate those. They are closely related. For example, in Gnutella the average host session time is about an hour and a half. So if you try to get something from this host, and it will give you only 1 KB/s (a frequent occurence, because content tends to be concentrated on relatively few nodes), then you'll be able to receive only about 5 MB during the average host session. In fact, the average transferred volume before this host goes off-line will be half of that, or about 2.5 MB. Not enough to get even a single song. So availability and performance are closely related. > In other words, if you had a "guaranteed seeder" for the file that > you knew would never go offline, would proactive caching still be > worth the trouble? That does solve the availability problem; not sure about the performance one. Personally, I wouldn't like to download movies at 1 KB/s and wait several days for the download to finish. But people use eDonkey all the time - and its speed is not much better. So the answer depends on whether you want to make your users happy or you're fine with someone else doing that, I presume :-) Best wishes - S.Osokine. 12 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of David Barrett Sent: Sunday, December 11, 2005 11:01 PM To: Peer-to-peer development. 
Subject: Re: [p2p-hackers] p2p in some place or other Serguei Osokine wrote: > On Sunday, December 11, 2005 David Barrett wrote: > >>Thus the uploader can either choose to: >>a) Send the file only to the downloader >>b) Only to the innocent bystander >>c) To both > > Add to this: > > d) send the file to bystander in advance, even before the downloader > asks for it Ah, ok, so it sounds like your saying that proactive caching becomes valuable when the bystander gets the file before the downloader makes its request. (This seems obvious in retrospect, but I was thinking along the lines of "just in time" proactive caching -- sending the file to more than just who requested it to somehow improve the experience for the requester. I couldn't see any way to make this work, though I'd be happy to be wrong.) So really, it sounds like proactive caching sets a "minimum replication" target for every file with one or more requests. Thus the policy is to keep making copies until that target is achieved. If anyone goes offline (and thus reduces a file's cache count below the minimum threshold), a new cache is made. How much of this is motivated by a desire for "high availability" versus "high performance"? In other words, if you had a "guaranteed seeder" for the file that you knew would never go offline, would proactive caching still be worth the trouble? Again, my initial thought around proactive caching was to use it to improve download performance by making the long tail download as fast as a well-deployed file. (And I still don't see how to make that happen without simply making every file "well deployed" through massive proactive caching.) But I can see how very limited proactive caching can be used to improve availability, and thus ensure the interesting subset of the long tail be downloaded *at all*. -david _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From matthew at matthew.at Mon Dec 12 18:13:49 2005 From: matthew at matthew.at (Matthew Kaufman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: Message-ID: <200512121814.jBCIE1U88941@where.matthew.at> Alen Peacock: > > I'd add: what is the self-interested motivation for a node > to agree to cache the content in the first place? This could be some external motivation like "I want anonymously-posted files about certain political views to be available for all to see" or "my corporate IT department says that we have to use this distributed collaboration tool" > If proactive caching were turned on by default in my p2p > filesharing client, don't I have a very real incentive to > turn this off in my own node to preserve bandwidth, disk > space, and perhaps limit any legal liability? In the general "filesharing" case? Absolutely. But that's not the only use for P2P technology or even P2P file transfer. > ...which is similar to many of the arguments made against > pre-fetching in traditional caching literature: how do you > ensure that you prefetch the right content, especially when > the cost of prefetching the wrong content is very high? Actually, if you're replicating content to other nodes in order to ensure availability or create more downloadable nodes in order to speed future downloaders, it is more like the RAID arguments than the cache arguments. 
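To put a rough number on that RAID-style view (illustrative only; the per-node availability below is an assumption, not a measurement): a file with k independently cached copies, each on a node that is online with probability p, is reachable with probability 1-(1-p)^k.

    # Rough reachability of a file given k independent replicas, each on a
    # node that is online with probability p (numbers are illustrative only).
    def reachable(p, k):
        return 1.0 - (1.0 - p) ** k

    for k in (1, 2, 4, 8):
        print(k, "copies ->", round(reachable(0.3, k), 3))
    # 1 -> 0.3, 2 -> 0.51, 4 -> 0.76, 8 -> 0.942: each extra copy buys
    # availability much the way an extra disk buys redundancy in RAID.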
The real question is, IF you had a high-availability file sharing system, what files would you want to make available on it? (The answer is probably *not* the long tail of all files ever seen on generic file sharing services) Matthew Kaufman matthew@matthew.at www.amicima.com From Serguei.Osokine at efi.com Mon Dec 12 18:16:58 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42793@fcexmb04.efi.internal> On Monday, December 12, 2005 Greg Bildson wrote: > Given the number of people sharing unique personal files (photos, > etc), partial downloads and accidentally sharing entire drives, > it's closer to reality then you may believe. These things are just shared. Gummadi's research talks about the things that were actually downloaded. So no, the accidentally shared entire drives are not the reason why the average number of copies per unique title is not much higher than one. Best wishes - S.Osokine. 12 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Greg Bildson Sent: Monday, December 12, 2005 9:48 AM To: Peer-to-peer development. Subject: RE: [p2p-hackers] p2p in some place or other Given the number of people sharing unique personal files (photos, etc), partial downloads and accidentally sharing entire drives, it's closer to reality then you may believe. I don't really mean infinite of course but large. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Matthew Kaufman > Sent: Monday, December 12, 2005 12:31 PM > To: 'Peer-to-peer development.' > Subject: RE: [p2p-hackers] p2p in some place or other > > > Greg Bildson: > > There are an infinite number of rare files so caching those > > without future knowledge about anyone's interest would be > > costly and infeasible. > > On any given actual file sharing network, I believe that's not actually > true. In fact, it "probably" isn't even true for the known universe of > computers :) > _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From nazareno at dsc.ufcg.edu.br Mon Dec 12 18:21:44 2005 From: nazareno at dsc.ufcg.edu.br (Nazareno Andrade) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <200512121814.jBCIE1U88941@where.matthew.at> References: <200512121814.jBCIE1U88941@where.matthew.at> Message-ID: <439DBFB8.2060901@dsc.ufcg.edu.br> Hi there. A nice paper which you may find useful in this thread: High Availability, Scalable Storage, Dynamic Peer Networks: Pick Two (HotOS XI) Peer-to-peer storage aims to build large-scale, reliable and available storage from many small-scale unreliable, low-availability distributed hosts. Data redundancy is the key to any data guarantees. However, preserving redundancy in the face of highly dynamic membership is costly. We use a simple resource usage model to measured behavior from the Gnutella file-sharing network to argue that large-scale cooperative storage is limited by likely dynamics and cross-system bandwidth - not by local disk space. 
We examine some bandwidth optimization strategies like delayed response to failures, admission control, and load-shifting and find that they do not alter the basic problem. We conclude that when redundancy, data scale, and dynamics are all high, the needed cross-system bandwidth is unreasonable. http://pmg.csail.mit.edu/~rodrigo/p2p-scl.pdf regards, Nazareno Matthew Kaufman wrote: > Alen Peacock: > >> I'd add: what is the self-interested motivation for a node >>to agree to cache the content in the first place? > > > This could be some external motivation like "I want anonymously-posted files > about certain political views to be available for all to see" or "my > corporate IT department says that we have to use this distributed > collaboration tool" > > >>If proactive caching were turned on by default in my p2p >>filesharing client, don't I have a very real incentive to >>turn this off in my own node to preserve bandwidth, disk >>space, and perhaps limit any legal liability? > > > In the general "filesharing" case? Absolutely. But that's not the only use > for P2P technology or even P2P file transfer. > > >>...which is similar to many of the arguments made against >>pre-fetching in traditional caching literature: how do you >>ensure that you prefetch the right content, especially when >>the cost of prefetching the wrong content is very high? > > > Actually, if you're replicating content to other nodes in order to ensure > availability or create more downloadable nodes in order to speed future > downloaders, it is more like the RAID arguments than the cache arguments. > > The real question is, IF you had a high-availability file sharing system, > what files would you want to make available on it? (The answer is probably > *not* the long tail of all files ever seen on generic file sharing services) > > Matthew Kaufman > matthew@matthew.at > www.amicima.com > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > -- Nazareno. ======================================== Nazareno Andrade LSD - DSC/UFCG Campina Grande - Brazil http://lsd.dsc.ufcg.edu.br/~nazareno/ OurGrid project http://www.ourgrid.org ======================================== From coderman at gmail.com Mon Dec 12 18:34:22 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: <20051211202326.kxu6sg8tw88ss040@mail.ucla.edu> Message-ID: <4ef5fec60512121034n5c5c8aedpa15d1eb9077fc74d@mail.gmail.com> On 12/11/05, Serguei Osokine wrote: > > It assumes that the data set has a zipf-like distribution. > > This makes me a bit uneasy about this model, by the way. Even > if the content distribution in P2P nets would be Zipf (and it isn't), > still I would be reluctant to implement anything that rigidly relies > on any predetermined distribution. Real functioning systems tend to > have some sort of a feedback loop and adapt to the changing situation. > In this case, it should cache the proper amount of content regardless > of how exactly it is distributed. 
i mentioned feedbackfs a few threads earlier and this was exactly what it intended to do: observe relevance and utility directly and implicitly (through filesystem interaction) so that recommendation (search) and caching (distribution) could be optimized. i think a poor / unintelligent caching mechanism would actually create more problems as it would be vulnerable to abuse - how do you prevent malicious or irrelevant peers from filling your cache with crap and consuming even more of your limited bandwidth to useless ends. reputation / trust needs to be addressed as well; the question about motivation for caching is a good one. that said, it seems clear to me that caching is a big win for performance. if you look at various papers and experiments using this technique everything from search to distribution is greatly enhanced with a well designed caching mechanism. akamai still seems to be doing well. ;) so perhaps for caching to work well you need a few prerequisites: - reputation / trust between peers to prevent abuse - feedback loop to ensure relevance / utility of cached content i haven't seen any p2p network/app which tries to address both points. does anyone know of such a beast? (i suppose mnet would fit somewhat. the agorics model prevents abuse of resources but i'm not sure how the feedback loop is applied) From Serguei.Osokine at efi.com Mon Dec 12 19:24:33 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42794@fcexmb04.efi.internal> On Monday, December 12, 2005 Nazareno Andrade wrote: > A nice paper which you may find useful in this thread: > > High Availability, Scalable Storage, Dynamic Peer Networks: Pick > Two Yes, it is an interesting approach - thank you! However, I'm not sure if their results directly apply to P2P nets. They are talking about six nines and replication factor of 20 to 80. They would likely commit suicide if they would try to actually use Gnutella for rare content. Any improvement would be nice - and forget about six nines. Also, despite introducing an interesting approach, this article results are very hard to verify and to reproduce, which is absolutely necessary if one would want to repeat their calculations with some different assumptions about the system requirements. For example, much of their conclusions are based on the Gnutella trace from April of 2003. Back then Gnutella was more than an order of magnitude smaller, and it would be interesting to repeat the calculations for today's situation. But the properties of this trace are not explicitly listed anywhere, being hidden in multiple charts and obscure statements like "only 5,000 of the 33,000 Gnutella hosts were usually available" (This, by the way, is a total mystery to me, since in April of 2003 Slyck's stats archive lists Gnutella at about 90,000 simultaneous nodes, so I have no idea where these 5,000 or 33,000 came from and what their meaning might have been.) To put it shortly, they have an interesting methodology, but I do not trust any one of their conclusions, as far as the caching in P2P file-sharing network is concerned. All their reasonings should be repeated for the reliable network statistical data, and with the set of requirements that reflects the needs of P2P users, not the need for a six nines-reliable data storage. I suspect that then the conclusions might prove to be a bit different. Best wishes - S.Osokine. 12 Dec 2005. 
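P.S. For a sense of scale on the six nines / replication factor 20-80 point, here is a quick back-of-the-envelope calculation (the 30% host availability below is my own assumed number for a flaky open network, not a figure from the paper or from any trace):

    import math

    # Smallest replication factor k with 1-(1-p)^k >= target, assuming
    # independent replicas on hosts that are each online with probability p.
    def replicas_needed(p, target):
        return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))

    p = 0.3  # assumed host availability
    for target in (0.9, 0.99, 0.999999):
        print(target, "->", replicas_needed(p, target))
    # 0.9 -> 7, 0.99 -> 13, 0.999999 -> ~39: chasing six nines on
    # low-availability hosts is what drives the replication factor into the
    # tens, while a merely decent target for file-sharing needs far fewer
    # copies.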
-----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Nazareno Andrade Sent: Monday, December 12, 2005 10:22 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] p2p in some place or other Hi there. A nice paper which you may find useful in this thread: High Availability, Scalable Storage, Dynamic Peer Networks: Pick Two (HotOS XI) Peer-to-peer storage aims to build large-scale, reliable and available storage from many small-scale unreliable, low-availability distributed hosts. Data redundancy is the key to any data guarantees. However, preserving redundancy in the face of highly dynamic membership is costly. We use a simple resource usage model to measured behavior from the Gnutella file-sharing network to argue that large-scale cooperative storage is limited by likely dynamics and cross-system bandwidth - not by local disk space. We examine some bandwidth optimization strategies like delayed response to failures, admission control, and load-shifting and find that they do not alter the basic problem. We conclude that when redundancy, data scale, and dynamics are all high, the needed cross-system bandwidth is unreasonable. http://pmg.csail.mit.edu/~rodrigo/p2p-scl.pdf regards, Nazareno Matthew Kaufman wrote: > Alen Peacock: > >> I'd add: what is the self-interested motivation for a node >>to agree to cache the content in the first place? > > > This could be some external motivation like "I want anonymously-posted files > about certain political views to be available for all to see" or "my > corporate IT department says that we have to use this distributed > collaboration tool" > > >>If proactive caching were turned on by default in my p2p >>filesharing client, don't I have a very real incentive to >>turn this off in my own node to preserve bandwidth, disk >>space, and perhaps limit any legal liability? > > > In the general "filesharing" case? Absolutely. But that's not the only use > for P2P technology or even P2P file transfer. > > >>...which is similar to many of the arguments made against >>pre-fetching in traditional caching literature: how do you >>ensure that you prefetch the right content, especially when >>the cost of prefetching the wrong content is very high? > > > Actually, if you're replicating content to other nodes in order to ensure > availability or create more downloadable nodes in order to speed future > downloaders, it is more like the RAID arguments than the cache arguments. > > The real question is, IF you had a high-availability file sharing system, > what files would you want to make available on it? (The answer is probably > *not* the long tail of all files ever seen on generic file sharing services) > > Matthew Kaufman > matthew@matthew.at > www.amicima.com > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > -- Nazareno. 
======================================== Nazareno Andrade LSD - DSC/UFCG Campina Grande - Brazil http://lsd.dsc.ufcg.edu.br/~nazareno/ OurGrid project http://www.ourgrid.org ======================================== _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From alenlpeacock at gmail.com Mon Dec 12 19:35:37 2005 From: alenlpeacock at gmail.com (Alen Peacock) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <200512121814.jBCIE1U88941@where.matthew.at> References: <200512121814.jBCIE1U88941@where.matthew.at> Message-ID: On 12/12/05, Matthew Kaufman wrote: > > This could be some external motivation like "I want anonymously-posted files > about certain political views to be available for all to see" or "my > corporate IT department says that we have to use this distributed > collaboration tool" External motivation is good, but is it sufficient to provide some sort of equilibria? If not, it's just the prisoner's dilemma; the vast majority of nodes disable caching because it is locally optimal, regardless of the fact that this produces a globally non-optimal solution. In fact, it might be even worse: the local cache could be exploited by malicious nodes to store data to the network. For example, instead of sharing my files from my own box, I just push them all out to the cache and stop local sharing altogether. > > If proactive caching were turned on by default in my p2p > > filesharing client, don't I have a very real incentive to > > turn this off in my own node to preserve bandwidth, disk > > space, and perhaps limit any legal liability? > > In the general "filesharing" case? Absolutely. But that's not the only use > for P2P technology or even P2P file transfer. Ah, but it doesn't matter if it is filesharing or not -- if the system can arbitrarily push data to my cache, my [bandwidth|disk|legal] resources are being consumed, regardless of whether the application layer is doing filesharing, chat, video, email, etc. And if I can prevent access to these extra resources, or if I can download an alternate client which promises better local performance and less legal liability, why wouldn't I? I'll admit that maybe I'm just obsessing over this point for purely academic reasons; maybe the majority of users simply accept the system defaults and innocently engage in altruistic behavior that ends up optimizing global performance. Maybe they all just turn their caches on because it is 'the right thing to do.' Maybe no one writes malicious software that takes advantage of [for example] a proactive cache. Maybe we shouldn't worry about it at all. But, isn't it more interesting to think about building systems that have some fairness guarantees than building ones that don't? Building a proactive cache that isn't susceptible to these abuses might require a trust/reputation sytem, which in turn requires a strong identity system, etc. -- but isn't that where the real fun is anyway? :) Alen From coderman at gmail.com Mon Dec 12 19:42:29 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Peer Identity Management [was: p2p in some place or other] Message-ID: <4ef5fec60512121142h1cef2711mb1b90765e31c8ed3@mail.gmail.com> On 12/12/05, Alen Peacock wrote: > ... 
> But, isn't it more interesting to think about building systems that > have some fairness guarantees than building ones that don't? Building > a proactive cache that isn't susceptible to these abuses might require > a trust/reputation sytem, which in turn requires a strong identity > system, etc. -- but isn't that where the real fun is anyway? :) i often wonder why identity management is not a more active topic on these lists. it is the cornerstone of a useful reputation/trust metric, which in turn provides a foundation for many advanced and resilient features like proactive caching, agorics, recommender systems, etc. is the lack of interest due to the overhead in security and complexity associated with digital identities in large, ad-hoc peer groups? the lack of consensus (single sign on?) preventing any critical mass? i'm curious... From lally at vt.edu Tue Dec 13 01:28:36 2005 From: lally at vt.edu (Lally Singh) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: <200512121814.jBCIE1U88941@where.matthew.at> Message-ID: <20051213012836.30212@smtp.vt.edu> >On 12/12/05, Matthew Kaufman wrote: > I'll admit that maybe I'm just obsessing over this point for purely >academic reasons; maybe the majority of users simply accept the system >defaults and innocently engage in altruistic behavior that ends up >optimizing global performance. Maybe they all just turn their caches >on because it is 'the right thing to do.' Maybe no one writes >malicious software that takes advantage of [for example] a proactive >cache. Maybe we shouldn't worry about it at all. > > But, isn't it more interesting to think about building systems that >have some fairness guarantees than building ones that don't? Building >a proactive cache that isn't susceptible to these abuses might require >a trust/reputation sytem, which in turn requires a strong identity >system, etc. -- but isn't that where the real fun is anyway? :) It's not hard to imagine an ISP shipping some software to disable caching on P2P network clients to all their clients (say with the installer of free anti-spyware or anti-virus software), without the clients ever having the chance to be altruistic. IMHO, anonymity's pretty important to keep. If there's going to be an identity system, let's make sure it doesn't attach to real people directly. Ebay user IDs, which you can burn at any time, but become valuable due to good feedback, are nice. -- H. Lally Singh Ph.D. Candidate, Computer Science Virginia Tech From coderman at gmail.com Tue Dec 13 07:05:12 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <20051213012836.30212@smtp.vt.edu> References: <200512121814.jBCIE1U88941@where.matthew.at> <20051213012836.30212@smtp.vt.edu> Message-ID: <4ef5fec60512122305n69f38437ke1996a87212b6cfa@mail.gmail.com> On 12/12/05, Lally Singh wrote: > ... > IMHO, anonymity's pretty important to keep. If there's going to > be an identity system, let's make sure it doesn't attach to real > people directly. Ebay user IDs, which you can burn at any time, > but become valuable due to good feedback, are nice. agreed; anonymity and pseudonymity are important. an ideal identity management system would function like Ian Goldberg's nymity slider and allow me to specify exactly how much information is revealed about my person during any interactions with peers. 
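a minimal sketch of what such a slider could look like in code (the level names and the fields revealed at each level are made up for illustration, not a spec):

    from enum import Enum, auto

    class Nymity(Enum):
        ANONYMOUS = auto()     # reveal nothing linkable
        PSEUDONYMOUS = auto()  # reveal a persistent nym/key, no real identity
        VERINYMOUS = auto()    # reveal a verified real-world identity

    def visible_profile(profile, level):
        # filter what a peer learns about us for one interaction,
        # according to where the slider is set
        if level is Nymity.ANONYMOUS:
            return {}
        if level is Nymity.PSEUDONYMOUS:
            return {k: v for k, v in profile.items()
                    if k in ("nym", "pubkey", "reputation")}
        return dict(profile)  # VERINYMOUS: everything, real name included

    me = {"nym": "peer1234", "pubkey": "<key>", "reputation": 0.9,
          "real_name": "<name>"}
    print(visible_profile(me, Nymity.PSEUDONYMOUS))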
anonymous interactions would be useful for self certifying resources, pull based operations, and public broadcasts for example. psuedonymity for weakly trusted interactions, reputation attached to recommendations or other meta data. and strong identity for trusted relationships between friends / associates where non trivial resources may be exchanged or formal agreements negotiated. likewise, the protocols used to communicate between peers would need to take these nymity levels into account, and constrain or protect communication accordingly. i have to second Matthew Kaufman in that a lot of fun is to be had in these areas; so much ties into these mechanisms (user interfaces, protocols, social interactions, information security) that provides fertile ground for experimentation and discovery across a diverse range of interests. trying to make such systems work in a fully or partially decentralized manner makes it even more challenging (and fun :) From ludovic.courtes at laas.fr Tue Dec 13 08:43:04 2005 From: ludovic.courtes at laas.fr (Ludovic =?iso-8859-1?Q?Court=E8s?=) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: <20051207172725.GG5812@cs.uoregon.edu> (Daniel Stutzbach's message of "Wed, 7 Dec 2005 09:27:26 -0800") References: <200511291414.35852.01771@iha.dk> <20051129141713.6A9CB698@yumyum.zooko.com> <20051129142151.8E1A035E4@yumyum.zooko.com> <438C9F88.2050803@pdos.lcs.mit.edu> <87lkywg9sp.fsf_-_@laas.fr> <20051207172725.GG5812@cs.uoregon.edu> Message-ID: <87u0ddo09z.fsf@laas.fr> Hi, Daniel Stutzbach writes: > For what purpose do you want to "decentralize Google"? > > Is it for some technical reason where you believe a decentralized > index will provide better end-user performance? > > Or is it because you don't think any single organization should have > that much control over information? Essentially for this reason. Because search engines have become "entry points" to the Internet. Typically, people tend to no longer use bookmarks and the likes: Google can always find the data they're looking for. Therefore, I think it's a reasonable goal to try to remove that single point of trust/failure. Thanks, Ludovic. From ludovic.courtes at laas.fr Tue Dec 13 09:04:51 2005 From: ludovic.courtes at laas.fr (Ludovic =?iso-8859-1?Q?Court=E8s?=) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: <20051207112748.1ptnql9c004s4oko@mail.ucla.edu> (Michael Parker's message of "Wed, 07 Dec 2005 11:27:48 -0800") References: <20051207112748.1ptnql9c004s4oko@mail.ucla.edu> Message-ID: <87fyoxmkp8.fsf@laas.fr> Michael Parker writes: > The first step of indexing is the actual keyword extraction itself. > From what I have heard, libextractor is a good open-source solution: > http://gnunet.org/libextractor/ Its author (Christian Grothoff) also used it to implement Doodle, a document indexing and search tool similar to Beagle: http://gnunet.org/doodle/ . Thanks, Ludovic. From ludovic.courtes at laas.fr Tue Dec 13 09:08:57 2005 From: ludovic.courtes at laas.fr (Ludovic =?iso-8859-1?Q?Court=E8s?=) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: (SIMON Gwendal's message of "Wed, 7 Dec 2005 17:36:01 +0100") References: Message-ID: <87zmn5l5xy.fsf@laas.fr> "SIMON Gwendal RD-MAPS-ISS" writes: > As previously said, we are working on a system namely Maay which aims at performing a decentralized and personalized search on a distributed set of textual documents. 
> > http://maay.netofpeers.net This sounds nice! What licence is it available under (I couldn't find it on the website)? Is anybody working on a Debian package? ;-) Thanks, Ludovic. From solipsis at pitrou.net Tue Dec 13 10:54:36 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: <87zmn5l5xy.fsf@laas.fr> References: <87zmn5l5xy.fsf@laas.fr> Message-ID: <1134471276.5631.1.camel@fsol> Le mardi 13 d?cembre 2005 ? 10:08 +0100, Ludovic Court?s a ?crit : > > As previously said, we are working on a system namely Maay which aims at performing a decentralized and personalized search on a distributed set of textual documents. > > > > http://maay.netofpeers.net > > This sounds nice! What licence is it available under (I couldn't find > it on the website)? >From the bottom of the home page: ? Maay is licensed under the GNU General Public License ? ;) Regards Antoine. From m.rogers at cs.ucl.ac.uk Tue Dec 13 11:07:11 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: <200512121814.jBCIE1U88941@where.matthew.at> Message-ID: <439EAB5F.3030800@cs.ucl.ac.uk> Alen Peacock wrote: > External motivation is good, but is it sufficient to provide some > sort of equilibria? If not, it's just the prisoner's dilemma; the > vast majority of nodes disable caching because it is locally optimal, > regardless of the fact that this produces a globally non-optimal > solution. Then how about internal motivation: the faster you upload, the faster you can download, and the more files you share, the more likely you are to be able to upload. I've come up with a half-baked incentive mechanism for Gnutella based on these principles: http://www.cs.ucl.ac.uk/staff/M.Rogers/gnutella-incentives.html No identity mechanism required I'm afraid ;-) > But, isn't it more interesting to think about building systems that > have some fairness guarantees than building ones that don't? Define fairness :-) I'm more interested in mutual benefit. Cheers, Michael From m.rogers at cs.ucl.ac.uk Tue Dec 13 11:24:53 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: Message-ID: <439EAF85.2040704@cs.ucl.ac.uk> Serguei Osokine wrote: > And the reason for this is quite understandable - if most of > the content exists in just one or two copies, what good are the swarm > downloaders and other marvelous instruments of progress? This single > copy that you need might be on a single host behind the modem in > Albania, the host might go off-line at any moment, and to make it > more fun, it might be trying to upload five other files (different > files, mind you) to five other people at the same time. I believe eMule allows the uploader to assign different priorities to different files - I'd like to be able to do this in Gnutella, to make the rarer (or better) content on my node easier to find, almost like a recommendation system. 
Cheers, Michael From alenlpeacock at gmail.com Tue Dec 13 15:41:56 2005 From: alenlpeacock at gmail.com (Alen Peacock) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <439EAB5F.3030800@cs.ucl.ac.uk> References: <200512121814.jBCIE1U88941@where.matthew.at> <439EAB5F.3030800@cs.ucl.ac.uk> Message-ID: On 12/13/05, Michael Rogers wrote: > > Then how about internal motivation: the faster you upload, the faster > you can download, and the more files you share, the more likely you are > to be able to upload. I've come up with a half-baked incentive mechanism > for Gnutella based on these principles: > > http://www.cs.ucl.ac.uk/staff/M.Rogers/gnutella-incentives.html > > No identity mechanism required I'm afraid ;-) Neat ideas. Like you, I'm a big believer in incentive-based decisions. I just peaked at your "Cooperation in Decentralized Networks" paper, and I notice that you do require exchange of public keys, authentication with those keys, and some sort of history of reciprocation, no? This is what I'm talking about when I say 'identity' and 'trust'. Each node has to be able to positively certify the identities of other nodes, and what you seem to be building is essentially a trust system built on top of those strong identities. Without the ability to certify node identities, you'd have a system that was very susceptible to imposter nodes leeching resources (in the form of reciprocation) that they hadn't earned, right? Perhaps I confused the issue by using the word 'identity,' which in some circles is used only to talk about the concept of linking a virtual presence to a meatspace entity. That isn't what I intended. What I meant was exactly what you describe: use of assymetric keys to establish and prove peer IDs, use of those IDs to learn something about the behavior of other agents in the network, and use of that knowledge to make appropriate incentive-based decisions. > > But, isn't it more interesting to think about building systems that > > have some fairness guarantees than building ones that don't? > > Define fairness :-) I'm more interested in mutual benefit. Well, I don't know if my semantics are standard, but the concept of 'fairness' I was thinking of was one that was purposely broad -- an umbrella under which 'mutual benefit' is certainly an essential piece. Alen From m.rogers at cs.ucl.ac.uk Tue Dec 13 17:03:05 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: <200512121814.jBCIE1U88941@where.matthew.at> <439EAB5F.3030800@cs.ucl.ac.uk> Message-ID: <439EFEC9.7090209@cs.ucl.ac.uk> Alen Peacock wrote: > I just peaked at your "Cooperation in Decentralized Networks" paper, > and I notice that you do require exchange of public keys, > authentication with those keys, and some sort of history of > reciprocation, no? This is what I'm talking about when I say > 'identity' and 'trust'. Good point - I was thinking of identities that can be communicated to third parties, as in a reputation or recommendation system, but you're right that local (non-transitive?) identities are needed. In the context of Gnutella you can use IP addresses and port numbers. > Well, I don't know if my semantics are standard, but the concept of > 'fairness' I was thinking of was one that was purposely broad -- an > umbrella under which 'mutual benefit' is certainly an essential piece. Sorry for the knee-jerk reaction. 
Fairness seems to be one of those words that cause more arguments than they solve - some people say "it's not fair to exclude those who can't contribute", while others say "it's not fair to consume resources if you don't contribute". :-) Cheers, Michael From rabbi at abditum.com Tue Dec 13 19:48:31 2005 From: rabbi at abditum.com (Len Sassaman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] CodeCon submission deadline reminder Message-ID: Here's a reminder that the deadline for submissions to CodeCon 2006 is this week. Feel free to forward this to project developers who might not otherwise see it. --Len. -- CodeCon 2006 February 10-12, 2006 San Francisco CA, USA www.codecon.org Call For Papers CodeCon is the premier showcase of cutting edge software development. It is an excellent opportunity for programmers to demonstrate their work and keep abreast of what's going on in their community. All presentations must include working demonstrations, ideally accompanied by source code. Presentations must be done by one of the active developers of the code in question. We emphasize that demonstrations be of *working* code. We hereby solicit papers and demonstrations. * Papers and proposals due: December 15, 2005 * Authors notified: January 1, 2006 Possible topics include, but are by no means restricted to: * community-based web sites - forums, weblogs, personals * development tools - languages, debuggers, version control * file sharing systems - swarming distribution, distributed search * security products - mail encryption, intrusion detection, firewalls Presentations will be 45 minutes long, with 15 minutes allocated for Q&A. Overruns will be truncated. Submission details: Submissions are being accepted immediately. Acceptance dates are November 15, and December 15. After the first acceptance date, submissions will be either accepted, rejected, or deferred to the second acceptance date. The conference language is English. Ideally, demonstrations should be usable by attendees with 802.11b connected devices either via a web interface, or locally on Windows, UNIX-like, or MacOS platforms. Cross-platform applications are most desirable. Our venue will be 21+. To submit, send mail to submissions-2006 at codecon.org including the following information: * Project name * url of project home page * tagline - one sentence or less summing up what the project does * names of presenter(s) and urls of their home pages, if they have any * one-paragraph bios of presenters, optional, under 100 words each * project history, under 150 words * what will be done in the project demo, under 200 words * slides to be shown during the presentation, if applicable * future plans General Chair: Jonathan Moore Program Chair: Len Sassaman Program Committee: * Bram Cohen, BitTorrent, USA * Jered Floyd, Permabit, USA * Ian Goldberg, Zero-Knowledge Systems, CA * Dan Kaminsky, Avaya, USA * Ben Laurie, The Bunker Secure Hosting, UK * Nick Mathewson, The Free Haven Project, USA * David Molnar, University of California, Berkeley, USA * Jonathan Moore, Mosuki, USA * Meredith L. Patterson, University of Iowa, USA * Len Sassaman, Katholieke Universiteit Leuven, BE Sponsorship: If your organization is interested in sponsoring CodeCon, we would love to hear from you. In particular, we are looking for sponsors for social meals and parties on any of the three days of the conference, as well as sponsors of the conference as a whole and donors of door prizes. 
If you might be interested in sponsoring any of these aspects, please contact the conference organizers at codecon-admin at codecon.org. Press policy: CodeCon provides a limited number of passes to qualifying press. Complimentary press passes will be evaluated on request. Everyone is welcome to pay the low registration fee to attend without an official press credential. Questions: If you have questions about CodeCon, or would like to contact the organizers, please mail codecon-admin at codecon.org. Please note this address is only for questions and administrative requests, and not for workshop presentation submissions. From Serguei.Osokine at efi.com Tue Dec 13 19:50:31 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42797@fcexmb04.efi.internal> On Tuesday, December 13, 2005 Michael Rogers wrote: > I believe eMule allows the uploader to assign different priorities > to different files - I'd like to be able to do this in Gnutella, to > make the rarer (or better) content on my node easier to find... Tha is more like "easier to download", but I see what you're saying. Yes, at some point I used to place hight hopes on this method, basically thinking that the transfer rates for the rare content can be improved at the expense of the popular one. Popular content can be found at lots of places anyway, so penalizing it should not hurt all that much; for me the goal was to equalize the download rates for all content regardless of its popularity. So if improving the rare content download speed would make the widely distributed content transfers a bit slower (because the systemwide cumulative uplink bandwidth is a scarce resource, after all), so be it. Unfortunately the statistical research of the P2P systems (the one that I've already quoted in this thread) shows that from the uploader standpoint the prioritization of rare vs popular content does not cover a very significant percentage of all upload situations. The typical upload scenario is not only "some popular, some rare, so give the rare more bandwidth". Just as widespread is "many rare uploads from one node", in which case changing their relative priorities is pointless, and also "rare upload from a single node", in which case no matter what this node does, the speed is going to be substandard. And let me reemphasize this again - these scenarios seem to be very common. Essentially the download speed for the rare content is limited by the uplink rates of the nodes with rare content, even if all the nodes are always on and spend just a small percantage of their online time downloading. For popular content, you can have very fast downloads in such a case; you can even saturate your downlink if you wish. But for rare content, you're still stuck with whatever is the uplink rate of a single node that has this file. As the nodes start spending more time on line, this disparity becomes more and more pronounced no matter how you prioritize the uploads. And seeing this causes the user frustration on a significant percentage of all downloads (on everything that is in the long tail). Best wishes - S.Osokine. 13 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Michael Rogers Sent: Tuesday, December 13, 2005 3:25 AM To: Peer-to-peer development. 
Subject: Re: [p2p-hackers] p2p in some place or other Serguei Osokine wrote: > And the reason for this is quite understandable - if most of > the content exists in just one or two copies, what good are the swarm > downloaders and other marvelous instruments of progress? This single > copy that you need might be on a single host behind the modem in > Albania, the host might go off-line at any moment, and to make it > more fun, it might be trying to upload five other files (different > files, mind you) to five other people at the same time. I believe eMule allows the uploader to assign different priorities to different files - I'd like to be able to do this in Gnutella, to make the rarer (or better) content on my node easier to find, almost like a recommendation system. Cheers, Michael _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From olau at cs.aau.dk Wed Dec 14 21:56:45 2005 From: olau at cs.aau.dk (Ole Laursen) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] published key chenge frequency in DHT In-Reply-To: References: Message-ID: xiangsong hou writes: > as we know,DHT can deal with node join/leave frequently. > i want to know if DHT can deal with publishde key change frequently. > for example,in grid computing resouce dicovery use DHT,the published key > (represent cpu or memeory) is change very frequently,so assigned node is > change frequently. > how to deal with this situation in DHT? I'm not totally sure what you are referring to, but I wrote my master's thesis together with two other guys on a design that used a DHT for distributing jobs for mass/grid computing. The DHT stored the jobs and indexed them based on keywords. The main problem was load balancing the index. We spent quite some time studying relevant literature and reviewed some of it in the thesis. You can find it here - we called the system U.P.: http://www.cs.aau.dk/~olau/writings/ Unfortunately, we never had time to optimize the load-balancing algorithm properly (we only had one semester). -- Ole Laursen http://www.cs.aau.dk/~olau/ From shashi.mit at gmail.com Thu Dec 15 16:53:28 2005 From: shashi.mit at gmail.com (Shashi (MIT)) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Opinions on JXTA2 - Message-ID: <4d19a3630512150853w7fcb2b7fice4ead23224460ca@mail.gmail.com> Hi all I was curious as to your thoughts on the JXTA platform. I am working in designing a P2P application and I have heard some conflicting thoughts on JXTA. The contrarian viewpoint is that it is too complex and way too much work to get something like a P2P app working. 'The solution being more complex than the problem' While there are simpler P2P frameworks available e.g. DirectConnect. The pro viewpoint is JXTA's comprehensiveness and the one-stop platform. What are your thoughts? thanks, Shashi From garyjefferson123 at yahoo.com Thu Dec 15 16:55:04 2005 From: garyjefferson123 at yahoo.com (Gary Jefferson) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] kademlia bucket spliting Message-ID: <20051215165504.7363.qmail@web35703.mail.mud.yahoo.com> I'm having a hard time following one detail in the Kademlia paper (long version, http://citeseer.ist.psu.edu/sex02sex.html) and was hoping someone could clarify. 
From section 2.4, we know that when a bucket gets full and a new node can be added to it, we only split the bucket if it contains our own node ID. Otherwise, we either discard the new node's contact info or replace an old entry with it, depending on whether the LRU entry is still alive or not. But then we read: "One complication arises in highly unbalanced trees. Suppose node u joins the system and is the only node whose ID begins 000. Suppose further that the system already has more than k nodes with prefix 001. Every node with prefix 001 would have an empty k-bucket into which u should be inserted, yet u's bucket refresh would only notify k of the nodes. To avoid this problem, Kademlia nodes keep all valid contacts in a subtree of size at least k nodes, even if this requires splitting buckets in which the node's own ID does not reside. Figure 5 illustrates these additional splits. When u refreshes the split buckets, all nodes with prefix 001 will learn about it." So when do we split a bucket? From the above, it sounds as if we always split buckets when they get full and new nodes can be added, regardless of whether the bucket contains our own node ID. But doesn't this mean we have an essentially limitless number of nodes that we can add to our buckets (and a corresponding memory issue as the network gets large)? I'm sure I'm missing something here, but I just can't make it out. I'm running into some pathological cases where I can't converge to the correct node unless I do split every bucket... Thanks, Gary --------------------------------- Yahoo! Shopping Find Great Deals on Holiday Gifts at Yahoo! Shopping -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051215/f34f2f1e/attachment.html From tcuag at t-online.de Thu Dec 15 20:23:12 2005 From: tcuag at t-online.de (tcuag@t-online.de) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] UDP Packet size... In-Reply-To: <20051215200003.B84C53FDA6@capsicum.zgp.org> Message-ID: <006d01c601b5$5f6c4fc0$67a2a8c0@namepc> Hi! We have to make a decision. Shall we send 64Kb udp pakets or 40 times the 1,4K pakets, which fits into any MTU of any ISP? Is it true, that nowadays most router do not allow udp fragmentation by default? How many do allow it after configuration? (those configuration is mostly very difficult for average user, right?) Any experience? Some experts say: Never send more than MTU, some projects say that they work with 60K UDP??? For us (python-project) the sending of small pakets means serious trouble (CPU/Socket, timing), so before tuning our algorythms, we want to be sure, that sending small pakets is the only solution. Thx for your help GKL From matthew at matthew.at Thu Dec 15 20:51:30 2005 From: matthew at matthew.at (Matthew Kaufman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] UDP Packet size... In-Reply-To: <006d01c601b5$5f6c4fc0$67a2a8c0@namepc> Message-ID: <200512152049.jBFKnWU99767@where.matthew.at> If 39 of 40 1.4k packets arrive, can you do anything with those, or do all 39 need to be thrown out because the 40th didn't get there? (I'll wait for the answer before going into more detail about what I think) Matthew Kaufman matthew@matthew.at www.amicima.com > -----Original Message----- > From: p2p-hackers-bounces@zgp.org > [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of tcuag@t-online.de > Sent: Thursday, December 15, 2005 12:23 PM > To: p2p-hackers@zgp.org > Subject: [p2p-hackers] UDP Packet size... > > Hi! 
> We have to make a decision. Shall we send 64Kb udp pakets or > 40 times the 1,4K pakets, which fits into any MTU of any ISP? > > Is it true, that nowadays most router do not allow udp > fragmentation by default? > How many do allow it after configuration? (those > configuration is mostly very difficult for average user, > right?) Any experience? Some experts say: Never send more > than MTU, some projects say that they work with 60K UDP??? > > For us (python-project) the sending of small pakets means > serious trouble (CPU/Socket, timing), so before tuning our > algorythms, we want to be sure, that sending small pakets is > the only solution. > > Thx for your help > GKL > > From agthorr at cs.uoregon.edu Thu Dec 15 21:04:07 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] UDP Packet size... In-Reply-To: <006d01c601b5$5f6c4fc0$67a2a8c0@namepc> References: <20051215200003.B84C53FDA6@capsicum.zgp.org> <006d01c601b5$5f6c4fc0$67a2a8c0@namepc> Message-ID: <20051215210406.GF5108@cs.uoregon.edu> On Thu, Dec 15, 2005 at 09:23:12PM +0100, tcuag@t-online.de wrote: > We have to make a decision. Shall we send 64Kb udp pakets or 40 times the > 1,4K pakets, which fits into any MTU of any ISP? > > Is it true, that nowadays most router do not allow udp fragmentation by > default? This is not really the right forum for the question, as it's a general networking question and not related to peer-to-peer. I suggest finding a introductory networking list or, better still, a good TCP/IP book. I'm fond of TCP/IP Illustrated, Vol. 1. Nevertheless, I'll answer: First, there is no such thing as "UDP Fragmentation". Your UDP datagram is encapsulated in an IP packet, which routers will fragment as necessary down to their MTU (this is called "IP Fragmentation"). The receiving host will reassemble the IP fragments into the full IP packet and then pass the packet to UDP on the host. The problem is that if any of the IP fragments is lost, then the entire UDP datagram is lost and must be retransmitted. This is very wasteful. It's much better to make your packets fit within the Path-MTU, so that if one MTU-sized packet is lost, then only one MTU-sized packet must be retransmitted. To be completely robust, you need to do Path-MTU discovery to dynamically adapt if the Path-MTU isn't what you expect it to be. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From coderman at gmail.com Thu Dec 15 21:33:10 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] UDP Packet size... In-Reply-To: <006d01c601b5$5f6c4fc0$67a2a8c0@namepc> References: <20051215200003.B84C53FDA6@capsicum.zgp.org> <006d01c601b5$5f6c4fc0$67a2a8c0@namepc> Message-ID: <4ef5fec60512151333l5f848710yd9c63a7581475a65@mail.gmail.com> On 12/15/05, tcuag@t-online.de wrote: > ... Shall we send 64Kb udp pakets or 40 times the > 1,4K pakets, which fits into any MTU of any ISP? use ~1400byte packets. be aware of tcp friendly congestion control. > Is it true, that nowadays most router do not allow udp fragmentation by > default? i've had the most problem with UDP NAPT's dropping fragmented datagrams. most routers are fine. > Any experience? the 'Never send more than MTU' suggestion is a good one. > For us (python-project) the sending of small pakets means serious trouble > (CPU/Socket, timing) you could always support both and use large packet support when it works well. 
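Putting the MTU advice in this thread into code, here is a minimal Python sketch of sender-side chunking (plain sockets, not any particular project's protocol; the 4-byte sequence header and the 1400-byte figure are illustrative assumptions, and loss recovery and congestion control are omitted):

    import socket

    MTU_PAYLOAD = 1400   # conservative per-datagram payload; Path-MTU discovery could refine it

    def send_chunked(sock, data, addr):
        # Slice the payload into independent datagrams that fit a typical MTU,
        # instead of one huge datagram that the IP layer would fragment.
        # A 4-byte sequence number lets the receiver reorder and spot gaps.
        for seq, offset in enumerate(range(0, len(data), MTU_PAYLOAD)):
            chunk = data[offset:offset + MTU_PAYLOAD]
            sock.sendto(seq.to_bytes(4, "big") + chunk, addr)

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send_chunked(sock, b"x" * 60000, ("127.0.0.1", 9999))   # ~43 small datagrams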
if you are trying to do bulk transfer use TCP instead :) From bneijt at gmail.com Fri Dec 16 11:33:40 2005 From: bneijt at gmail.com (Bram Neijt) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Google releases something P2P Message-ID: <46c2f4ab0512160333u6d5e326er308b13c36d0a0ad0@mail.gmail.com> Hi. I havn't been able to take a good look at the system yet, but Google has released LibJingle (Google Talk library) which contains a P2P implementation. I don't think it's a "collaborate" implementation, like Skype, but it might be intresting code for people wanting to do firewall and NAT transversal: [The P2P component] "Negotiates, establishes, and maintains peer-to-peer connections through almost any network configuration regardless of NAT devices and firewalls. The p2p component understands the Jingle spec to initiate the session and then provides a sockets-like interface for sending and receiving data that is used by the session component to add functionality." More on that, here: http://code.google.com/apis/talk/about.html (probably my last post before Christmas, so) Happy christmas and hacking everyone! Bram Neijt From eunsoo at research.panasonic.com Fri Dec 16 22:40:02 2005 From: eunsoo at research.panasonic.com (Eunsoo Shim) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia Message-ID: <43A34242.3070203@research.panasonic.com> Hi, I am wondering whether Kad Network (based on Kademlia) has a hierarchical architecture where supernodes are distinguished from non-supernodes. Your kind information will be appreciated. Thanks. Eunsoo From agthorr at cs.uoregon.edu Fri Dec 16 22:44:28 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <43A34242.3070203@research.panasonic.com> References: <43A34242.3070203@research.panasonic.com> Message-ID: <20051216224426.GC3060@cs.uoregon.edu> On Fri, Dec 16, 2005 at 05:40:02PM -0500, Eunsoo Shim wrote: > I am wondering whether Kad Network (based on Kademlia) has a > hierarchical architecture where supernodes are distinguished from > non-supernodes. No, it does not. However, it does distinguish non-firewalled peers (which form the DHT) from firewalled peers (which can query the DHT but do not form part of the routing structure). -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From lemonobrien at yahoo.com Fri Dec 16 23:41:31 2005 From: lemonobrien at yahoo.com (Lemon Obrien) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <20051216224426.GC3060@cs.uoregon.edu> Message-ID: <20051216234131.80784.qmail@web53605.mail.yahoo.com> this is the same thing...almost; its the same thing if you only use udp. Daniel Stutzbach wrote: On Fri, Dec 16, 2005 at 05:40:02PM -0500, Eunsoo Shim wrote: > I am wondering whether Kad Network (based on Kademlia) has a > hierarchical architecture where supernodes are distinguished from > non-supernodes. No, it does not. However, it does distinguish non-firewalled peers (which form the DHT) from firewalled peers (which can query the DHT but do not form part of the routing structure). 
-- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences You don't get no juice unless you squeeze Lemon Obrien, the Third. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051216/affdc9ec/attachment.htm From matthew at matthew.at Sat Dec 17 07:04:52 2005 From: matthew at matthew.at (Matthew Kaufman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades Message-ID: <200512170702.jBH72uU04975@where.matthew.at> We've been busy here at amicima, and thought you'd want to know about some recent improvements we've made: 1. We upgraded the MFP protocol to be resistant to a potential but unlikely denial-of-service attack in cases where there's no cryptography or the session key is the same in each direction. Specifically: an attacker who intercepts traffic from one end, modifies the session identifier to match the one sent by the other end, and plays the traffic back might be able in some cases to erroneously start flows or in extreme cases cause a denial of service through the IP mobility mechanism. This is fixed by adding explicit directionality flagging to the MFP packet header, and the protocol spec and our implementation have been upgraded. The revised protocol spec (version 1.2) can be found at: http://www.amicima.com/developers/documentation.html 2. We've significantly upgraded the "MFP defcrypto" default cryptographic plug-in. The new version is INCOMPATIBLE will all previous versions, but we hope our improvements mean that's the only time we'll have to say that. The previous version supported RSA for public-key crypto and AES128 for symmetric crypto, and while the key material was generated at both ends (thanks to suggestions here to make that improvement), the transmission of keying material was of a fixed length, the combination was identical at each end (XOR) (so both directions used the same session key), and there was no provision for any options to be sent between the cryptographic plug-ins at each end. The new version has replaced the fixed-length encrypted key data sent in the Initial Keying packets with a "micro-packet" of data that is exchanged between each end (and which is protected by the signatures present in the Initiator Initial Keying and Responder Initial Keying packets, so the data can't be tampered with). These "micro-packets" can contain variable-length option information for future cryptosystem upgrades, like changes to AES256 or the addition of HMAC, in such a way that backwards compatibility may be retained, as well as the necessary keying data (also of variable length, and which we now combine asymmetrically, such that both ends contribute to the session keys that are used, but a different session key is used in each direction now). This brings us to the next new feature... By popular request, and because we now have the ability to negotiate such options, we now have optional HMAC-SHA1 in the default crypto plug-in. The HMAC wraps the encrypted packet in order to detect any corruption or tampering before it is even decrypted at the far end and with much more certainty than the internal post-decryption 16-bit checksum. 
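For readers unfamiliar with the pattern, the HMAC wrapping described above follows the usual encrypt-then-MAC arrangement. The sketch below is not MFP's wire format or keying exchange; it only shows, with Python's standard hmac module and placeholder keys and ciphertext, the property the announcement relies on: the receiver authenticates the ciphertext and silently drops a tampered packet before any decryption happens.

    import hmac, hashlib

    TAG_LEN = 20   # HMAC-SHA1 produces a 20-byte tag

    def wrap(mac_key, ciphertext):
        # Encrypt-then-MAC: the tag covers the already-encrypted packet.
        return ciphertext + hmac.new(mac_key, ciphertext, hashlib.sha1).digest()

    def unwrap(mac_key, packet):
        ciphertext, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
        expected = hmac.new(mac_key, ciphertext, hashlib.sha1).digest()
        if not hmac.compare_digest(tag, expected):
            return None            # tampered or corrupted: drop before decrypting
        return ciphertext          # only now hand the packet to the cipher

    key = b"\x01" * 20                         # placeholder session MAC key
    pkt = wrap(key, b"<encrypted bytes>")
    assert unwrap(key, pkt) == b"<encrypted bytes>"
    bad = pkt[:-1] + bytes([pkt[-1] ^ 1])      # flip one bit of the tag
    assert unwrap(key, bad) is None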
There is an API to set transmission (always send, only if requested by the other end, never send) and reception (require (and request) that it be sent, request (but not require) that it be sent, verify (but neither request nor require) if sent, and ignore completely) options, and MFPNet has been upgraded to provide access to the HMAC API as well. Once HMAC has been negotiated, any packet with the wrong HMAC (or from which the HMAC has been deleted) will be ignored. We always said that "if you don't like it, you can plug in a new cryptographic plug-in", but that doesn't necessarily provide a good backward-compatible solution for upgrades to running systems with large numbers of existing peers. We're pretty sure that this does (as would any other cryptographic plug-in that borrowed these enhancements), but only the future will tell us if we're right. The new releases of MObj, MFP, and MFPNet are available on our downloads page: http://www.amicima.com/developers/downloads.html And details of the default cryptographic plug-in are provided in the MFP release's README file, available separately here: http://www.amicima.com/downloads/mfp/README.txt 3. And finally, because we've rolled out an incompatible (but much better) default cryptographic plug-in, we've released a new version of amiciPhone, our demo application that does P2P VOIP calling, user presence, text messaging, and photo and file sending, you can get the Windows XP version from our website, and the Macintosh OS X version is coming along nicely and should be out before too much longer. The application download is here: http://www.amicima.com/applications/ Download a copy and try it out! (For a good time, try calling "7@test.amicima.com") Thanks for the support and feedback from the list and privately, it has helped make our protocols and implementations better, and we try to return the favor through the open-source publication of our protocol implementations. Matthew Kaufman matthew@matthew.at matthew@amicima.com http://www.amicima.com From eunsoo at research.panasonic.com Sat Dec 17 07:33:47 2005 From: eunsoo at research.panasonic.com (Eunsoo Shim) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <20051216224426.GC3060@cs.uoregon.edu> References: <43A34242.3070203@research.panasonic.com> <20051216224426.GC3060@cs.uoregon.edu> Message-ID: <43A3BF5B.5070102@research.panasonic.com> I see. Is there any statistics information about the average number of non-firewalled peers in Kad Network? Thanks a lot. Eunsoo Daniel Stutzbach wrote: >On Fri, Dec 16, 2005 at 05:40:02PM -0500, Eunsoo Shim wrote: > > >>I am wondering whether Kad Network (based on Kademlia) has a >>hierarchical architecture where supernodes are distinguished from >>non-supernodes. >> >> > >No, it does not. > >However, it does distinguish non-firewalled peers (which form the DHT) >from firewalled peers (which can query the DHT but do not form part of >the routing structure). > > > From agthorr at cs.uoregon.edu Sat Dec 17 07:41:40 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <43A3BF5B.5070102@research.panasonic.com> References: <43A34242.3070203@research.panasonic.com> <20051216224426.GC3060@cs.uoregon.edu> <43A3BF5B.5070102@research.panasonic.com> Message-ID: <20051217074138.GF3060@cs.uoregon.edu> On Sat, Dec 17, 2005 at 02:33:47AM -0500, Eunsoo Shim wrote: > I see. 
> Is there any statistics information about the average number of > non-firewalled peers in Kad Network? I measured it to be around a million non-firewalled peers, although that was a few months back. Or did you mean as a percentage of all peers? (which I'm not sure of) Why do you ask, by the way? -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From eunsoo at research.panasonic.com Sat Dec 17 08:40:53 2005 From: eunsoo at research.panasonic.com (Eunsoo Shim) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <20051217074138.GF3060@cs.uoregon.edu> References: <43A34242.3070203@research.panasonic.com> <43A3BF5B.5070102@research.panasonic.com> <20051217074138.GF3060@cs.uoregon.edu> Message-ID: <43A3CF15.7030503@research.panasonic.com> Daniel Stutzbach wrote: >On Sat, Dec 17, 2005 at 02:33:47AM -0500, Eunsoo Shim wrote: > > >>I see. >>Is there any statistics information about the average number of >>non-firewalled peers in Kad Network? >> >> > >I measured it to be around a million non-firewalled peers, although >that was a few months back. > >Or did you mean as a percentage of all peers? (which I'm not sure of) > >Why do you ask, by the way? > > > Wow, you know a lot about Kad Network. I asked about it because I was interested in scalability of DHTs. I looked for cases of large scale DHT deployment and so far found only Kad Network based on Kademlia. According to Wikipedia, there are 3.5 - 5.1 million concurrent online users in Kad Network. http://en.wikipedia.org/wiki/Kad_Network A million non-firewalled peers...It is impressive again. I thought most computers were working behind firewalls or NAT these days. Do you know any other large scale DHT deployment? Thanks. Eunsoo From agthorr at cs.uoregon.edu Sat Dec 17 17:09:25 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <43A3CF15.7030503@research.panasonic.com> References: <43A34242.3070203@research.panasonic.com> <43A3BF5B.5070102@research.panasonic.com> <20051217074138.GF3060@cs.uoregon.edu> <43A3CF15.7030503@research.panasonic.com> Message-ID: <20051217170924.GA1288@cs.uoregon.edu> On Sat, Dec 17, 2005 at 03:40:53AM -0500, Eunsoo Shim wrote: > A million non-firewalled peers...It is impressive again. I thought most > computers were working behind firewalls or NAT these days. > > Do you know any other large scale DHT deployment? eDonkey's Overnet is also Kademlia based, as well as the new "trackerless" feature of BitTorrent. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From dbarrett at quinthar.com Sat Dec 17 20:21:43 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <43A3CF15.7030503@research.panasonic.com> References: <43A34242.3070203@research.panasonic.com> <43A3CF15.7030503@research.panasonic.com> Message-ID: <1134850905.F9FF157@dl11.dngr.org> On Sat, 17 Dec 2005 1:43 am, Eunsoo Shim wrote: > Daniel Stutzbach wrote: >> >> I measured it to be around a million non-firewalled peers, although >> that was a few months back. > > A million non-firewalled peers...It is impressive again. I thought most > computers were working behind firewalls or NAT these days. 
Daniel, by "non-firewalled" do you mean truly, those that aren't behind a firewall, or rather those for which NAT/firewall traversal doesn't work? -david From coderman at gmail.com Sat Dec 17 21:33:45 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <200512170702.jBH72uU04975@where.matthew.at> References: <200512170702.jBH72uU04975@where.matthew.at> Message-ID: <4ef5fec60512171333m2dcdeb04h5dabd339d41fc576@mail.gmail.com> On 12/17/05, Matthew Kaufman wrote: > ... > 2. We've significantly upgraded the "MFP defcrypto" default cryptographic > plug-in. i forgot to mention this previously but it is always a good idea to lock memory pages where key material and cipher state resides. the 'mlock' function can do this on unix systems (not sure what the equivalent is for win32 api). this does require root privilege which can make application coding a little more complicated. (i.e. handling setuid and dropping privs, etc). i've also seen some systems use unix IPC shared memory to keep memory from paging out to swap, etc. if you use an encrypted swap partition this might be somewhat less of a concern. best regards, From dbarrett at quinthar.com Sat Dec 17 23:00:40 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <4ef5fec60512171333m2dcdeb04h5dabd339d41fc576@mail.gmail.com> References: <200512170702.jBH72uU04975@where.matthew.at> <4ef5fec60512171333m2dcdeb04h5dabd339d41fc576@mail.gmail.com> Message-ID: <1134860444.CA368C6@di12.dngr.org> On Sat, 17 Dec 2005 1:58 pm, coderman wrote: > On 12/17/05, Matthew Kaufman wrote: >> ... >> 2. We've significantly upgraded the "MFP defcrypto" default >> cryptographic >> plug-in. > > i forgot to mention this previously but it is always a good idea to > lock memory pages where key material and cipher state resides. I'm not sure I follow how this helps: who is it protecting against? If you don't want the user to get access to cipher info, requiring root access isn't much of a barrier (any hacker will have root on his own box). And one user can't access the memory of another user's processes. I'm not disputing the technique, I just don't understand when to apply it. For example, just the other day I was interviewing a candidate (did I mention we are hiring?) who aggregates poker stats on other players. Despite all sorts of clever on-the-wire encryption, he just figured out where all the stats are kept in plaintext in memory and tapped into that. Doh! Ultimately, it's never a good idea to send data to a client that you don't want to fall into the wrong hands. Memory protection might stop a non-root user from accessing his own memory, but this seems like a boundary case (unless I'm misunderstanding it). -david From agthorr at cs.uoregon.edu Sat Dec 17 22:59:27 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <1134850905.F9FF157@dl11.dngr.org> References: <43A34242.3070203@research.panasonic.com> <43A3CF15.7030503@research.panasonic.com> <1134850905.F9FF157@dl11.dngr.org> Message-ID: <20051217225926.GB2876@cs.uoregon.edu> On Sat, Dec 17, 2005 at 12:21:43PM -0800, David Barrett wrote: > >Daniel Stutzbach wrote: > >>I measured it to be around a million non-firewalled peers, although > >>that was a few months back. 
> > Daniel, by "non-firewalled" do you mean truly, those that aren't behind > a firewall, or rather those for which NAT/firewall traversal doesn't > work? I mean those that can receive unsolicited TCP and UDP packets on the Kad/eMule ports. Either they must not be firewalled/NATed or the user must manually punch a whole to redirect those ports from the firewall device. Kad uses "iterative" DHT routing. If I'm a client and want to do a lookup, I query some of my contacts to get their next hop for my target, then I query that peer for it's next hop, etc. Therefore, it's important that any host participating in Kad's DHT routing structure be able to receive unsolicited packets. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From eunsoo at research.panasonic.com Sat Dec 17 23:43:30 2005 From: eunsoo at research.panasonic.com (Eunsoo Shim) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <20051217225926.GB2876@cs.uoregon.edu> References: <43A34242.3070203@research.panasonic.com> <1134850905.F9FF157@dl11.dngr.org> <20051217225926.GB2876@cs.uoregon.edu> Message-ID: <43A4A2A2.6080207@research.panasonic.com> >>>>I measured it to be around a million non-firewalled peers, although >>>>that was a few months back. >>>> >>>> >>Daniel, by "non-firewalled" do you mean truly, those that aren't behind >>a firewall, or rather those for which NAT/firewall traversal doesn't >>work? >> >> > >I mean those that can receive unsolicited TCP and UDP packets on the >Kad/eMule ports. Either they must not be firewalled/NATed or the user >must manually punch a whole to redirect those ports from the firewall >device. > > So port 80 or 443 is NOT used at all for Kad Network? >Kad uses "iterative" DHT routing. If I'm a client and want to do a >lookup, I query some of my contacts to get their next hop for my >target, then I query that peer for it's next hop, etc. Therefore, >it's important that any host participating in Kad's DHT routing >structure be able to receive unsolicited packets. > > > "Iterative" DHT routing is inefficient compared to "recursive" one. Is "iterative" routing used because of a concern about DoS attacks? Thanks. Eunsoo From agthorr at cs.uoregon.edu Sun Dec 18 00:00:26 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <43A4A2A2.6080207@research.panasonic.com> References: <43A34242.3070203@research.panasonic.com> <1134850905.F9FF157@dl11.dngr.org> <20051217225926.GB2876@cs.uoregon.edu> <43A4A2A2.6080207@research.panasonic.com> Message-ID: <20051218000025.GC2876@cs.uoregon.edu> On Sat, Dec 17, 2005 at 06:43:30PM -0500, Eunsoo Shim wrote: > >I mean those that can receive unsolicited TCP and UDP packets on the > >Kad/eMule ports. Either they must not be firewalled/NATed or the user > >must manually punch a whole to redirect those ports from the firewall > >device. > > > So port 80 or 443 is NOT used at all for Kad Network? Not normally, no. eMule lets the user configure their peer to use ports other than the default, so they could use any port they want in that case. But the vast majority of peers do not use port 80 or 443 at all. > "Iterative" DHT routing is inefficient compared to "recursive" one. > Is "iterative" routing used because of a concern about DoS attacks? Kademlia is an inherently iterative DHT. 
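A compact sketch of the iterative lookup pattern just described (node IDs are bare integers and query_for_closer stands in for one network round trip; this shows the general Kademlia-style procedure, not Kad's or eMule's actual message handling):

    def iterative_lookup(target, my_contacts, query_for_closer, k=8, alpha=3):
        # Iterative (Kademlia-style) lookup: the querying node itself asks the
        # closest contacts it knows for even closer ones, until nothing new
        # and closer turns up. The querier, not intermediate nodes, steers
        # the whole search.
        def distance(node_id):
            return node_id ^ target                  # Kademlia's XOR metric

        shortlist = sorted(my_contacts, key=distance)[:k]
        queried = set()
        while True:
            pending = [p for p in shortlist if p not in queried]
            if not pending:
                return shortlist                     # converged
            for peer in pending[:alpha]:             # a few queries "in flight"
                queried.add(peer)
                shortlist.extend(query_for_closer(peer, target))
            shortlist = sorted(set(shortlist), key=distance)[:k]

    # Toy demo: a static "network" in which every node knows every other node.
    all_nodes = list(range(0, 256, 5))
    rpc = lambda peer, target: sorted(all_nodes, key=lambda n: n ^ target)[:8]
    print(iterative_lookup(123, all_nodes[:3], rpc))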
I suspect the Kad developers used iterative routing simply because they chose Kademlia as a starting point. I'm not one of the Kad developers though, so I can only guess at the reasons behind their design decisions. I'd observe, though, that iterative routing is much easier to debug. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From matthew at matthew.at Sun Dec 18 00:49:02 2005 From: matthew at matthew.at (Matthew Kaufman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <4ef5fec60512171333m2dcdeb04h5dabd339d41fc576@mail.gmail.com> Message-ID: <200512180047.jBI0l7U10135@where.matthew.at> Coderman: > i forgot to mention this previously but it is always a good > idea to lock memory pages where key material and cipher state > resides. Not "always". The right answer is "sometimes". If you wish to do so, on some systems you can use memory region locking to prevent the cryptographic material from being paged out to permanent storage media, in theory (see below). This makes it harder to take a machine, after the fact, and attempt to analyze its permanent media (hard disk) for any state that might have been left behind. However, that's just one of several possible attacks you might want to guard against, or other requirements you might have. > the 'mlock' function can do this on unix systems > (not sure what the equivalent is for win32 api). > this does require root privilege which can make application > coding a little more complicated. (i.e. handling setuid and > dropping privs, etc). Here, for example, is where some of those other requirements and limitations come in. mlock() is not only not available on Win32 (though there are calls used by drivers, and some directX calls that can do locking that might be applicable, though I couldn't in a quick check verify that non-paging is guaranteed), but has different requirements and functionality on different systems that DO support the call... Some Linux systems and MacOS X, for instance, allow a limited amount of mlock() by the user, but on FreeBSD, the call is unavailable except to the superuser. In some implementations, the calls nest (MacOS X), in others (Solaris) an inadvertant call to munlock() (or munmap(), even, which you might be using for other reasons) by other code can unlock pages that the cryptographic parts of your program think are still locked. And finally, and most important, *all* that POSIX guarantees from mlock() is that the page *is* in memory for you, *not* that it *isn't* also copied to swap. Whether or not that's the case is implementation-dependent. And, having an application run as setuid root, even if briefly, also increases the risk that it would be used as a vector to run other code as root, and clearly once an attacker can run code as root, they'd have access to all of physical memory, not just what's stored on the swap device. That's also true if the attacker uses some other method to get root access, or is running on a machine where the equivalent access to physical memory is easier to get (typical Win32 machines, for instance). In summary, "it depends". 
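For anyone who wants to experiment with the page-locking approach discussed in this thread on a POSIX system, a minimal Python ctypes sketch follows. It inherits every caveat above (per-user locking limits, superuser-only mlock on some BSDs, and no general guarantee that a locked page was never written to swap), and the buffer handling is illustrative rather than a hardened key-storage scheme.

    import ctypes, ctypes.util

    libc = ctypes.CDLL(ctypes.util.find_library("c") or None, use_errno=True)

    def lock_in_memory(buf):
        # Best-effort mlock() of a writable buffer (e.g. a bytearray holding
        # key material). Raises OSError if the pages cannot be locked, e.g.
        # because RLIMIT_MEMLOCK is exceeded or the platform restricts mlock
        # to the superuser.
        addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
        if libc.mlock(ctypes.c_void_p(addr), ctypes.c_size_t(len(buf))) != 0:
            raise OSError(ctypes.get_errno(), "mlock failed")
        return addr

    key = bytearray(32)            # session key would be generated into this buffer
    addr = lock_in_memory(key)
    # ... use the key ...
    for i in range(len(key)):      # zero it before unlocking
        key[i] = 0
    libc.munlock(ctypes.c_void_p(addr), ctypes.c_size_t(len(key)))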
In our case, since the MFP default cryptographic plugin also uses the services of OpenSSL libraries for its crypto operations, changes to ensure that state both in our plugin and the external libraries (DLLs on Win32, shared system libraries on MacOS X and other platforms where OpenSSL is standard, or static OpenSSL libraries elsewhere) are all non-pagable (and more important, that "non-pagable" *also* means "never copied to disk") is not a trivial change, especially to ensure that that's the case on every platform we support. But as we've said before, for applications where this is an attack you wished to guard against specifically, there's nothing stopping you from modifying our plugin or writing your own that implements exactly what you want. Matthew Kaufman matthew@matthew.at http://www.amicima.com From osokin at osokin.com Sun Dec 18 03:17:20 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <1134860444.CA368C6@di12.dngr.org> Message-ID: On Saturday, December 17, 2005 David Barrett wrote: > For example, just the other day I was interviewing a candidate (did > I mention we are hiring?) who aggregates poker stats on other players. Sounds like you're finally switching your development into areas that can actually bring heaps of money. I always thought that cheating in poker should be more profitable than P2P content delivery - and now your hiring approach seems to validate that. Good luck! Best wishes - S.Osokine. 17 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of David Barrett Sent: Saturday, December 17, 2005 3:01 PM To: Peer-to-peer development. Subject: Re: [p2p-hackers] amicima MFP and crypto upgrades On Sat, 17 Dec 2005 1:58 pm, coderman wrote: > On 12/17/05, Matthew Kaufman wrote: >> ... >> 2. We've significantly upgraded the "MFP defcrypto" default >> cryptographic >> plug-in. > > i forgot to mention this previously but it is always a good idea to > lock memory pages where key material and cipher state resides. I'm not sure I follow how this helps: who is it protecting against? If you don't want the user to get access to cipher info, requiring root access isn't much of a barrier (any hacker will have root on his own box). And one user can't access the memory of another user's processes. I'm not disputing the technique, I just don't understand when to apply it. For example, just the other day I was interviewing a candidate (did I mention we are hiring?) who aggregates poker stats on other players. Despite all sorts of clever on-the-wire encryption, he just figured out where all the stats are kept in plaintext in memory and tapped into that. Doh! Ultimately, it's never a good idea to send data to a client that you don't want to fall into the wrong hands. Memory protection might stop a non-root user from accessing his own memory, but this seems like a boundary case (unless I'm misunderstanding it). 
-david _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From dbarrett at quinthar.com Sun Dec 18 05:49:45 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: References: Message-ID: <43A4F879.8050709@quinthar.com> Well the real money is in bulk counterfeiting. If only I had access to a frickin' huge printer... Serguei Osokine wrote: > On Saturday, December 17, 2005 David Barrett wrote: > >>For example, just the other day I was interviewing a candidate (did >>I mention we are hiring?) who aggregates poker stats on other players. > > > Sounds like you're finally switching your development into areas > that can actually bring heaps of money. I always thought that cheating > in poker should be more profitable than P2P content delivery - and now > your hiring approach seems to validate that. Good luck! > > Best wishes - > S.Osokine. > 17 Dec 2005. > > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of David Barrett > Sent: Saturday, December 17, 2005 3:01 PM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] amicima MFP and crypto upgrades > > > On Sat, 17 Dec 2005 1:58 pm, coderman wrote: > >>On 12/17/05, Matthew Kaufman wrote: >> >>> ... >>> 2. We've significantly upgraded the "MFP defcrypto" default >>>cryptographic >>> plug-in. >> >>i forgot to mention this previously but it is always a good idea to >>lock memory pages where key material and cipher state resides. > > > I'm not sure I follow how this helps: who is it protecting against? If > you don't want the user to get access to cipher info, requiring root > access isn't much of a barrier (any hacker will have root on his own > box). And one user can't access the memory of another user's > processes. I'm not disputing the technique, I just don't understand > when to apply it. > > For example, just the other day I was interviewing a candidate (did I > mention we are hiring?) who aggregates poker stats on other players. > Despite all sorts of clever on-the-wire encryption, he just figured out > where all the stats are kept in plaintext in memory and tapped into > that. Doh! > > Ultimately, it's never a good idea to send data to a client that you > don't want to fall into the wrong hands. Memory protection might stop a > non-root user from accessing his own memory, but this seems like a > boundary case (unless I'm misunderstanding it). 
> > -david > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > From lemonobrien at yahoo.com Sun Dec 18 08:35:42 2005 From: lemonobrien at yahoo.com (Lemon Obrien) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <20051218000025.GC2876@cs.uoregon.edu> Message-ID: <20051218083542.13692.qmail@web53609.mail.yahoo.com> lots of peer to peer networks never support NAT tranversal cause it always encryption and privacy....users of eDonkey configure their router for its use...port changing should be stanard practice for any internet application; and know < 1024 is taken as a general rule. Daniel Stutzbach wrote: On Sat, Dec 17, 2005 at 06:43:30PM -0500, Eunsoo Shim wrote: > >I mean those that can receive unsolicited TCP and UDP packets on the > >Kad/eMule ports. Either they must not be firewalled/NATed or the user > >must manually punch a whole to redirect those ports from the firewall > >device. > > > So port 80 or 443 is NOT used at all for Kad Network? Not normally, no. eMule lets the user configure their peer to use ports other than the default, so they could use any port they want in that case. But the vast majority of peers do not use port 80 or 443 at all. > "Iterative" DHT routing is inefficient compared to "recursive" one. > Is "iterative" routing used because of a concern about DoS attacks? Kademlia is an inherently iterative DHT. I suspect the Kad developers used iterative routing simply because they chose Kademlia as a starting point. I'm not one of the Kad developers though, so I can only guess at the reasons behind their design decisions. I'd observe, though, that iterative routing is much easier to debug. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences You don't get no juice unless you squeeze Lemon Obrien, the Third. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051218/483e6dc8/attachment.html From eunsoo at research.panasonic.com Sun Dec 18 13:39:54 2005 From: eunsoo at research.panasonic.com (Eunsoo Shim) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <20051218000025.GC2876@cs.uoregon.edu> References: <43A34242.3070203@research.panasonic.com> <20051217225926.GB2876@cs.uoregon.edu> <43A4A2A2.6080207@research.panasonic.com> <20051218000025.GC2876@cs.uoregon.edu> Message-ID: <43A566AA.6030501@research.panasonic.com> Thanks a lot, Daniel. Your information helped me a lot. Eunsoo Daniel Stutzbach wrote: >On Sat, Dec 17, 2005 at 06:43:30PM -0500, Eunsoo Shim wrote: > > >>>I mean those that can receive unsolicited TCP and UDP packets on the >>>Kad/eMule ports. Either they must not be firewalled/NATed or the user >>>must manually punch a whole to redirect those ports from the firewall >>>device. >>> >>> >>> >>So port 80 or 443 is NOT used at all for Kad Network? >> >> > >Not normally, no. 
eMule lets the user configure their peer to use >ports other than the default, so they could use any port they want in >that case. But the vast majority of peers do not use port 80 or 443 >at all. > > > >>"Iterative" DHT routing is inefficient compared to "recursive" one. >>Is "iterative" routing used because of a concern about DoS attacks? >> >> > >Kademlia is an inherently iterative DHT. I suspect the Kad developers >used iterative routing simply because they chose Kademlia as a >starting point. I'm not one of the Kad developers though, so I can >only guess at the reasons behind their design decisions. > >I'd observe, though, that iterative routing is much easier to debug. > > > From osokin at osokin.com Sun Dec 18 18:30:35 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <43A4F879.8050709@quinthar.com> Message-ID: > If only I had access to a frickin' huge printer... You're in luck! Just two days ago we finally managed to remove a few remaining bottlenecks and now are doing stable 2,000 pages per minute. Your main problem will be paper, actually - you'll need a lot... Best wishes - S.Osokine. 18 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of David Barrett Sent: Saturday, December 17, 2005 9:50 PM To: Peer-to-peer development. Subject: Re: [p2p-hackers] amicima MFP and crypto upgrades Well the real money is in bulk counterfeiting. If only I had access to a frickin' huge printer... Serguei Osokine wrote: > On Saturday, December 17, 2005 David Barrett wrote: > >>For example, just the other day I was interviewing a candidate (did >>I mention we are hiring?) who aggregates poker stats on other players. > > > Sounds like you're finally switching your development into areas > that can actually bring heaps of money. I always thought that cheating > in poker should be more profitable than P2P content delivery - and now > your hiring approach seems to validate that. Good luck! > > Best wishes - > S.Osokine. > 17 Dec 2005. > > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of David Barrett > Sent: Saturday, December 17, 2005 3:01 PM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] amicima MFP and crypto upgrades > > > On Sat, 17 Dec 2005 1:58 pm, coderman wrote: > >>On 12/17/05, Matthew Kaufman wrote: >> >>> ... >>> 2. We've significantly upgraded the "MFP defcrypto" default >>>cryptographic >>> plug-in. >> >>i forgot to mention this previously but it is always a good idea to >>lock memory pages where key material and cipher state resides. > > > I'm not sure I follow how this helps: who is it protecting against? If > you don't want the user to get access to cipher info, requiring root > access isn't much of a barrier (any hacker will have root on his own > box). And one user can't access the memory of another user's > processes. I'm not disputing the technique, I just don't understand > when to apply it. > > For example, just the other day I was interviewing a candidate (did I > mention we are hiring?) who aggregates poker stats on other players. > Despite all sorts of clever on-the-wire encryption, he just figured out > where all the stats are kept in plaintext in memory and tapped into > that. Doh! > > Ultimately, it's never a good idea to send data to a client that you > don't want to fall into the wrong hands. 
Memory protection might stop a > non-root user from accessing his own memory, but this seems like a > boundary case (unless I'm misunderstanding it). > > -david > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From coderman at gmail.com Sun Dec 18 19:37:46 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <200512180047.jBI0l7U10135@where.matthew.at> References: <4ef5fec60512171333m2dcdeb04h5dabd339d41fc576@mail.gmail.com> <200512180047.jBI0l7U10135@where.matthew.at> Message-ID: <4ef5fec60512181137y334669e6qc38f475cb9fc298d@mail.gmail.com> On 12/17/05, Matthew Kaufman wrote: > ... > Not "always". The right answer is "sometimes". > > If you wish to do so, on some systems you can use memory region locking to > prevent the cryptographic material from being paged out to permanent storage > media, in theory (see below). ... > > However, that's just one of several possible attacks you might want to guard > against, or other requirements you might have. very true. i suppose if you are this concerned about key secrecy you'd also want to ensure other side channels / application security is as well protected. are patches for MF* accepted in general? is copyright assignment required? thanks for the detailed response. From matthew at matthew.at Sun Dec 18 22:39:22 2005 From: matthew at matthew.at (Matthew Kaufman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <4ef5fec60512181137y334669e6qc38f475cb9fc298d@mail.gmail.com> Message-ID: <200512182237.jBIMbSU13180@where.matthew.at> coderman: > > very true. i suppose if you are this concerned about key > secrecy you'd also want to ensure other side channels / > application security is as well protected. Exactly so. Remembering, of course, that the attacker is always looking for the path of least resistance. Encrypt everything on the disk with a strong passphrase? Better make sure there's no keylogger installed, Encrypting your VOIP chat? Better make sure there's no bug glued to the bottom of your desk. Etc. > are patches for MF* accepted in general? is copyright > assignment required? The code is under GPL. Self-published patches that modify the code are of course just fine, and that keeps complete control of the patch in your hands as long as any distribution of the patched code you do complies with the GPL. If you want patches rolled back into our distributed code, copyright assignment is required since we not only need to try to keep compatibility with them (and so we might "patch a patch", and don't want our GPL-publication-right of that getting confusing), but we have commercial licensees who we need to grant rights to. Exceptions might be made in exceptional circumstances where a GPL/non-GPL fork really makes sense. We're also open to suggestions for changes... 
Just ask, and we might write it into the next release for you :) One thing I know will be in the next release as a response to a request, for instance, is a change to the 'extern "C"' handling in the headers, to make life easier for C++ programmers, particularly on win32 where there's C vs C++ system include file issues. Matthew Kaufman matthew@matthew.at http://www.amicima.com From coderman at gmail.com Mon Dec 19 04:23:00 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <200512182237.jBIMbSU13180@where.matthew.at> References: <4ef5fec60512181137y334669e6qc38f475cb9fc298d@mail.gmail.com> <200512182237.jBIMbSU13180@where.matthew.at> Message-ID: <4ef5fec60512182023k49f46bb1u755a0b214a3a271c@mail.gmail.com> On 12/18/05, Matthew Kaufman wrote: > ... If you want patches rolled back into our distributed code, copyright > assignment is required since we not only need to try to keep compatibility > with them (and so we might "patch a patch", and don't want our > GPL-publication-right of that getting confusing), but we have commercial > licensees who we need to grant rights to. Exceptions might be made in > exceptional circumstances where a GPL/non-GPL fork really makes sense. yeah, that is common and makes sense. i just hadn't seen this explicitly stated so i was curious. > One thing I know will be in the next release as a response to a request, for > instance, is a change to the 'extern "C"' handling in the headers, to make > life easier for C++ programmers, particularly on win32 where there's C vs > C++ system include file issues. that would be handy; i'd like to use some of this framework in a c++ project in the near future. thanks for update... From john.casey at gmail.com Mon Dec 19 10:45:52 2005 From: john.casey at gmail.com (John Casey) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: References: Message-ID: Hi Simon, have you thought of using the Apache groups lucene search engine and crawler ?? http://lucene.apache.org/java/docs/index.html On 12/8/05, SIMON Gwendal RD-MAPS-ISS wrote: > In comparison with traditional filesharing approaches, a decentralized search for the web should take into account words inside the documents. > > As previously said, we are working on a system namely Maay which aims at performing a decentralized and personalized search on a distributed set of textual documents. > > http://maay.netofpeers.net > > Each node (said computer) can publish a set of documents. This information space does not initially contain the web. Our idea is to consider that the cache (or history) of the web browser should be, by default, included in the published set of documents. So, every page that has been visited by at least one people since x days will be available in the network. Obviously, more popular a page is, more available it is. > > By the way, one first challenge is the implementation of a nice crawler for owned documents : an indexer. This indexer should be able to scan and retrieve words from various documents (.html, .doc, .pdf, ...). It should be light and run in idle time and, if possible, be cross-platform. If you know a good open-source indexer, please let us know. > > > -- Gwendal > > > > > > > > > > > > > > -----Message d'origine----- > > De : p2p-hackers-bounces@zgp.org > > [mailto:p2p-hackers-bounces@zgp.org] De la part de Ludovic Court?s > > Envoy? : mercredi 7 d?cembre 2005 17:19 > > ? 
: strib@MIT.EDU > > Cc : Peer-to-peer development.; zooko@zooko.com > > Objet : [p2p-hackers] Decentralized search engines > > > > Hi, > > > > Jeremy Stribling writes: > > > > > Working on it. Should have something public within a few months: > > > > > > http://pdos.csail.mit.edu/papers/overcite:iptps05/index.html > > > > Indeed, that seems very promising! > > > > Similarly, are there people working on decentralized web indexing and > > search engines? To paraphrase Zooko, it would be nice to decentralize > > Google before it is too late... > > > > Thanks, > > Ludovic. > > _______________________________________________ > > p2p-hackers mailing list > > p2p-hackers@zgp.org > > http://zgp.org/mailman/listinfo/p2p-hackers > > _______________________________________________ > > Here is a web page listing P2P Conferences: > > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From ap at hamachi.cc Mon Dec 19 19:26:56 2005 From: ap at hamachi.cc (Alex Pankratov) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Google releases something P2P In-Reply-To: <46c2f4ab0512160333u6d5e326er308b13c36d0a0ad0@mail.gmail.com> References: <46c2f4ab0512160333u6d5e326er308b13c36d0a0ad0@mail.gmail.com> Message-ID: <43A70980.70109@hamachi.cc> Bram Neijt wrote: > > [The P2P component] "Negotiates, establishes, and maintains > peer-to-peer connections through almost any network configuration > regardless of NAT devices and firewalls. The p2p component understands > the Jingle spec to initiate the session and then provides a > sockets-like interface for sending and receiving data that is used by > the session component to add functionality." > Tunneling method summary (based on what's in the actual code) - STUN-based NAT traversal complimented by an option of relaying data through 3rd node for cases that STUN cannot handle. Alex From threelions0916 at yahoo.com.cn Tue Dec 27 01:54:15 2005 From: threelions0916 at yahoo.com.cn (Michael Liu) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia References: <43A34242.3070203@research.panasonic.com> <1134850905.F9FF157@dl11.dngr.org><20051217225926.GB2876@cs.uoregon.edu> <43A4A2A2.6080207@research.panasonic.com> Message-ID: <005c01c60a88$88580e20$1d18080a@cnc.intra> ----- Original Message ----- From: "Eunsoo Shim" To: "Peer-to-peer development." Sent: Sunday, December 18, 2005 7:43 AM Subject: Re: [p2p-hackers] Kad Network with Kademlia > > >>>>I measured it to be around a million non-firewalled peers, although > >>>>that was a few months back. > >>>> > >>>> > >>Daniel, by "non-firewalled" do you mean truly, those that aren't behind > >>a firewall, or rather those for which NAT/firewall traversal doesn't > >>work? > >> > >> > > > >I mean those that can receive unsolicited TCP and UDP packets on the > >Kad/eMule ports. Either they must not be firewalled/NATed or the user > >must manually punch a whole to redirect those ports from the firewall > >device. > > > > > So port 80 or 443 is NOT used at all for Kad Network? > > >Kad uses "iterative" DHT routing. If I'm a client and want to do a > >lookup, I query some of my contacts to get their next hop for my > >target, then I query that peer for it's next hop, etc. 
Therefore, > >it's important that any host participating in Kad's DHT routing > >structure be able to receive unsolicited packets. > > > > > > > "Iterative" DHT routing is inefficient compared to "recursive" one. > Is "iterative" routing used because of a concern about DoS attacks? > Thanks. I wonder why 'Iterative' DHT routing is less efficient than 'recursive' one? it seems the efficiency are same ... I am also astonished to find DHT can support millions of active nodes, it's fantastic . > > Eunsoo > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From gojomo at bitzi.com Thu Dec 29 22:41:33 2005 From: gojomo at bitzi.com (Gordon Mohr) Date: Sat Dec 9 22:13:06 2006 Subject: Google 'Safe Browsing' vs. RESTful authorization Re: [p2p-hackers] Re: [rest-discuss] Re: RESTful authorization In-Reply-To: <5691356f050929103656e4f30f@mail.gmail.com> References: <5691356f05092710332a623de2@mail.gmail.com> <1127850423.3A956345@bd12.dngr.org> <1127901049.15818.47.camel@p-dvsi-418-1.rd.francetelecom.fr> <5691356f0509280755501a1c1d@mail.gmail.com> <1127922062.15818.82.camel@p-dvsi-418-1.rd.francetelecom.fr> <5691356f050929103656e4f30f@mail.gmail.com> Message-ID: <43B4661D.5030301@bitzi.com> I like Tyler's notion of SSL-passed 'capability URLs', and had occasion to think about them again when reading the following: Two Things That Bother Me About Google?s New Firefox Extension http://www.oreillynet.com/pub/wlg/8760 "1) Every request is transmitted to Google over HTTP, i.e. in clear-text. This is not good. Here is why: Consider a web application that uses SSL to encrypt the session. If this web application were to submit private information about you via a GET request (i.e in the URL, such as a credit card number), this will now be transmitted to http://www.google.com/safebrowsing/lookup in clear-text, allowing someone on your network segment, or any router in between yourself and google.com to sniff the information off the wire." Asking a trusted third party their opinion of an URL seems a reasonable anti-phishing measure. But if that "trusted" third party is careless in its handling of HTTPS URLs, as it appears Google has been in this design, the prerequisite URL secrecy required for capability URLs will be often and casually violated. Of course, any security measure can be thwarted with sufficient carelessness, and in this case the onus should be on Google to fix this oversight, and respect the privacy of HTTPS URL requests. But it's been 2 weeks since this problem was highlighted, and it remains unfixed and mostly undiscussed. That suggests to me that people's expectations of HTTPS URL secrecy -- and of the standards that toolbar/extension makers like Google should be held to in protecting user secrets -- are pretty low. Perhaps so low that capability URLs would only be usable by the hyper-conscientious, who generally do fine under any system. - Gordon Tyler Close wrote: > Hi Antoine, > > On 9/28/05, Antoine Pitrou wrote: >>> On 9/28/05, Antoine Pitrou wrote: >>>> I'm curious as to how "capability URLs" can't be stolen and re-used by a >>>> malicious piece of Javascript like other URLs can. >>> Simply because a capability URL is unguessable. >> It is permanent too, > > No, the lifetime of the URL is up to the application designer. 
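To make "unguessable" concrete: a capability URL needs a secret component with enough entropy that it cannot be brute-forced, for example 128 bits from the operating system's CSPRNG. A minimal sketch of minting one is below; the host name and path prefix are invented for illustration, and, per the point above, such a URL only stays secret if it always travels over HTTPS and is never leaked to third-party lookup services.

/* Mint an unguessable capability URL: a 128-bit random token,
 * hex-encoded into the path.  Host and path are illustrative. */
#include <stdio.h>

int main(void)
{
    unsigned char token[16];
    char hex[2 * sizeof token + 1];
    FILE *rnd = fopen("/dev/urandom", "rb");

    if (rnd == NULL || fread(token, 1, sizeof token, rnd) != sizeof token) {
        perror("/dev/urandom");
        return 1;
    }
    fclose(rnd);

    for (size_t i = 0; i < sizeof token; i++)
        sprintf(hex + 2 * i, "%02x", token[i]);

    /* The secret lives in the URL itself, so serve it over HTTPS only. */
    printf("https://example.com/resource/%s\n", hex);
    return 0;
}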
Using > capability URLs does not place any duration restrictions on how you > define the lifetime of your resources. > >> And you have to keep this URL somewhere... Given that it's full of >> random ascii garbage, you can't keep it in your head (contrast this with >> a properly chosen password), and you don't want to copy it by hand >> either. So it /will/ end up in electronic clickable form somewhere: for >> example in your bookmarks. > > I think keeping capability URLs in your bookmarks is a perfectly > sensible thing to do, providing you then protect your bookmarks. I run > OS X, so my entire filesystem is encrypted, including my bookmarks > file. > >> >From your own explanation on the REST mailing-list : ? The user just >> *clicks on hyperlinks*, without ever needing to be aware of the resource >> password. ? Those hyperlinks have to be somewhere... >> >> (and of course, this totally mandates HTTPS, which is impossible for >> most Web sites for reasons I already explained) > > This argument is a little out of date. You can get affordable HTTPS > hosting from providers like GoDaddy and 1and1. Even before the advent > of shared hosting for HTTPS, colocation was already an affordable > option and likely required anyways in order to get the performance > characteristics that you want for your web application. For very small > scale projects, running Apache on your home machine with a dyndns > hostname and a 7.99 SSL certificate is also doable. > >> As a mix proposal, it would be more interesting if a new URL was >> generated everytime the user identifies (with login/password). More >> interesting again, it could be generated client side in Javascript using >> a formula like "HASH(HASH(password) + challenge)" where the challenge is >> a temporary value generated by the server for this very session (thus >> with an expiration time). Which means: >> - the URL is temporary (it expires with the challenge) >> - this URL does not need to be recorded anywhere on the client since >> it's generated at every new login >> - in plain non-encrypted HTTPS, the data which goes over the wire only >> gives temporary access to the resource > > When I attend DefCon, I am always amazed that people are surprised by > the Wall of Sheep, people who know that network snooping is possible. > I guess you just have to experience the efficiency of live network > snooping in order to truly appreciate it. > > With the rise of ubiquitous WiFi, passing secrets, even temporary > ones, over the network in the clear is asking for trouble. Your 15 > minute session timeout is an eon on the timescale of a script watching > the network for your protocol and exploiting it on the fly. > > SSL has finally come into a somewhat reasonable price range. We should > go ahead and exploit it. We don't need to mess around with dodgy > timeout based designs. > > Thanks again for the questions. > > Tyler > > -- > The web-calculus is the union of REST and capability-based security: > http://www.waterken.com/dev/Web/ > > Name your trusted sites to distinguish them from phishing sites. 
> https://addons.mozilla.org/extensions/moreinfo.php?id=957 > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From marco at bice.it Fri Dec 30 09:20:28 2005 From: marco at bice.it (marco@bice.it) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kademlia and Java Message-ID: <20051230102028.9bg4nvglf1b4cw40@webmail.bice.it> I'm looking at a Java implementation of the Kademlia API. http://kademlia.scs.cs.nyu.edu doesn't work, so I was wondering if someone have experienced the Plan X 0.4.12 library. There is something called "org.planx.xmlstore.routing.Kademlia" mentioned in the Javadoc, that could be useful. Is there someone who can help me? Where to find the library and is it useful? Is there something else implementing Kademlia? Thank you very much. Marco
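Tying Marco's question back to the earlier Kad discussion in this thread: whichever implementation one picks, the core of Kademlia is the XOR distance between 160-bit IDs, and a lookup is an iterative loop that keeps asking the closest known contacts for contacts even closer to the target. The sketch below shows the metric, with the lookup loop outlined in a comment; it is not taken from Plan X, eMule/Kad, or any other implementation, and all names in it are invented for illustration.

/* Minimal sketch of Kademlia's XOR metric, with the iterative
 * FIND_NODE loop outlined in a comment.  Illustrative only. */
#include <stdio.h>

#define ID_BYTES 20                      /* 160-bit node IDs */

typedef struct { unsigned char b[ID_BYTES]; } kad_id;

/* Returns <0, 0 or >0 depending on whether `a` is closer to, as close
 * to, or farther from `target` than `b`, under the XOR metric. */
static int xor_closer(const kad_id *a, const kad_id *b, const kad_id *target)
{
    for (int i = 0; i < ID_BYTES; i++) {
        unsigned char da = a->b[i] ^ target->b[i];
        unsigned char db = b->b[i] ^ target->b[i];
        if (da != db)
            return (da < db) ? -1 : 1;
    }
    return 0;
}

int main(void)
{
    kad_id target = {{0}}, n1 = {{0}}, n2 = {{0}};
    n1.b[0] = 0x01;                      /* XOR distance 0x01... from target */
    n2.b[0] = 0x10;                      /* XOR distance 0x10... from target */

    printf("n1 is %s to the target than n2\n",
           xor_closer(&n1, &n2, &target) < 0 ? "closer" : "not closer");

    /* Iterative lookup, in outline:
     *   start with the k closest contacts from the local routing table;
     *   repeatedly send FIND_NODE(target) to the alpha closest
     *   not-yet-queried contacts, merge the contacts they return into
     *   the shortlist, and stop when a round yields nothing closer.
     * Every hop is a round trip made by the querying node itself, which
     * is why each participant must accept unsolicited packets. */
    return 0;
}

On the iterative-versus-recursive question raised above: the hop counts are comparable, but in iterative routing every hop is a full round trip back to the querying node, whereas recursive routing forwards the query along the overlay, so an iterative lookup generally costs more round trips and more work at the initiator; in exchange it is easier to debug and lets the initiator route around unresponsive nodes.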