From threelions0916 at yahoo.com.cn Thu Dec 1 06:34:37 2005
From: threelions0916 at yahoo.com.cn (Michael Liu)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] P2P in SFC
References: <438B8E20.3030903@quinthar.com><20051129011547.34494.qmail@web53605.mail.yahoo.com> <40513.67.188.193.83.1133228692.squirrel@webmail.redswoosh.net>
Message-ID: <002301c5f641$4ee31820$3118080a@cnc.intra>

you guys are so lucky, I want to attend the meeting too, but I am in china :-(

Michael

----- Original Message -----
From: "Travis Kalanick" <travis@redswoosh.net>
To: "Peer-to-peer development." <p2p-hackers@zgp.org>
Sent: Tuesday, November 29, 2005 9:44 AM
Subject: Re: [p2p-hackers] P2P in SFC

> Lemon, I agree with you.  Since most people seem to be able to make the
> Wednesday time, maybe we should finalize on that.
>
> David, do we have a consensus?
>
> T
>
> Lemon Obrien said:
> > I can come anytime...I think this would be neat; i don't know about you
> > guys; but i own my own company; coming out with a product soon. I know
> > david has iGlance...which is not in my space, but believe meeting other
> > like minded people who know "peer" is the next big thing...sorry my
> > friends; but the web is played out.
> >
> >   el
> >
> > David Barrett <dbarrett@quinthar.com> wrote:
> >   What day/time would you propose?
> >
> > Serguei Osokine wrote:
> >>> Maybe, Wednesday, 9pm?
> >>
> >> Sorry - I'm busy on Wednesday...
> >>
> >> -----Original Message-----
> >> From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> >> Behalf Of David Barrett
> >> Sent: Monday, November 28, 2005 1:40 PM
> >> To: Peer-to-peer development.
> >> Subject: [p2p-hackers] P2P in SFC
> >>
> >> So looks like there's a decent showing of P2P guys in San Francisco --
> >> six by my count. How about sushi and beer this week at, say Ryoko?
> >>
> >> http://tinyurl.com/bkk5d
> >>
> >> Maybe, Wednesday, 9pm? Any objections or affirmations?
> >>
> >> -david
> >
> > You don't get no juice unless you squeeze
> > Lemon Obrien, the Third.
>
> Travis Kalanick
> Red Swoosh, Inc.
> Founder, Chairman
> travis@redswoosh.net
> (v) 310.666.1429
> (f) 253.322.9478
> AIM: ScourTrav123

__________________________________________________
Do You Yahoo!?
http://cn.mail.yahoo.com/?id=77071

From gbildson at limepeer.com Thu Dec 1 16:36:14 2005
From: gbildson at limepeer.com (Greg Bildson)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p framework
In-Reply-To: <71b79fa90511301016q54c57883se119ee54ea01212c@mail.gmail.com>
Message-ID:

You could build off the limewire.org open source code (currently down) or
off of the gtk-gnutella or gnucleus source.  You would want to form a
separate network since current vendors would consider alternate uses of the
existing network as pollution.

If you envision services similar to those that already exist, or extensions
to the existing services, then this might make sense.  If you want something
as a basis for a new clean protocol, I might not recommend it since some
aspects of the protocol are a little ugly underneath the covers.  However,
extension mechanisms exist for most message types.

Thanks
-greg

> -----Original Message-----
> From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> Behalf Of Davide Carboni
> Sent: Wednesday, November 30, 2005 1:17 PM
> To: Peer-to-peer development.
> Subject: Re: [p2p-hackers] p2p framework
>
>
> On 11/29/05, Greg Bildson wrote:
> > Do you mean Gnutella's use as a framework or otherwise?
> >
>
> Yes I do. My question is: are there some implementation of gnutella
> that can be used to build upon new applications and to develop new
> services (beyond simple file sharing) ?
>
> D.
From unixsmaxer at hotmail.com Thu Dec 1 18:28:16 2005
From: unixsmaxer at hotmail.com (Salem Mark)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p framework
In-Reply-To:
Message-ID:

>From: "Greg Bildson"
>You could build off the limewire.org open source code (currently down) or
>off of the gtk-gnutella or gnucleus source.  You would want to form a
>separate network since current vendors would consider alternate uses of the
>existing network as pollution.

Could you please elaborate on forming a separate network under Gnutella?

I was thinking of using the Echomine-Muse gnutella API, which facilitates
sending custom messages in Gnutella, as a technique for Jabber Servers to
collaborate and achieve global service discovery.

Thanks.

- Salem

>If you envisioned similar services as exist or extensions to the existing
>services then this might make sense.  If you want something as a basis for
>a new clean protocol, I might not recommend it since some aspects of the
>protocol are a little ugly underneath the covers.  However, extension
>mechanisms do for most message types.
>
>Thanks
>-greg
>
> > -----Original Message-----
> > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> > Behalf Of Davide Carboni
> > Sent: Wednesday, November 30, 2005 1:17 PM
> > To: Peer-to-peer development.
> > Subject: Re: [p2p-hackers] p2p framework
> >
> > On 11/29/05, Greg Bildson wrote:
> > > Do you mean Gnutella's use as a framework or otherwise?
> > >
> > Yes I do. My question is: are there some implementation of gnutella
> > that can be used to build upon new applications and to develop new
> > services (beyond simple file sharing) ?
> >
> > D.

_________________________________________________________________
No masks required! Use MSN Messenger to chat with friends and family.
http://go.msnserver.com/HK/25382.asp

From unixsmaxer at hotmail.com Thu Dec 1 18:38:46 2005
From: unixsmaxer at hotmail.com (Salem Mark)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] DHTs in highly-transient networks
Message-ID:

Hello,

I have read in several papers that it is unlikely that the integrity of the
DHT can be maintained where there is a high node or link failure rate
without significant message transmission overhead.  In other words, it is
mentioned that, in "highly transient networks", where the rate of nodes
appearing and disappearing is very high, maintaining the DHT becomes hard
and introduces considerable overhead.

I am trying to find out what exactly "highly-transient" means.
A file sharing network like Gnutella seems to be highly transient, where
peers join/leave the network frequently.  Could somebody elaborate on this?
Is there a node departure/arrival/failure rate (per sec? per min?) that
identifies "highly-transient" networks?

Thanks

- Salem

_________________________________________________________________
FREE English Booklet! Improve your English.
http://www.linguaphonenet.com/BannerTrack.asp?EMSCode=MSN03-08ETFJ-0211E

From gbildson at limepeer.com Thu Dec 1 18:44:00 2005
From: gbildson at limepeer.com (Greg Bildson)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p framework
In-Reply-To:
Message-ID:

There is a standard connect string in Gnutella.  Something like "Gnutella
Connect" / "Gnutella OK".  Change that.  Set up your own Gwebcache or UHC
(UDP host cache) or include your own gnutella.net ip:ports file and you
should be able to bootstrap your own network.

I'm not aware of any mainstream users of the Echomine-Muse libraries.  They
may or may not work.  I expect that they are primitive compared to the
LimeWire and gtk-gnutella code.  However, they may work for your purposes.

Thanks
-greg
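[Illustration: a minimal sketch of the connect-string change described above,
written against a Gnutella 0.6-style three-step handshake.  The protocol
token "MYNET", the header values, and the bootstrap address are invented for
the example and do not belong to any real client or network.]

# private_handshake.py - sketch of isolating a private Gnutella-style network
# by changing the connect string, so that stock clients refuse to peer with it.
import socket

BOOTSTRAP_PEERS = [("peer1.example.org", 6346)]  # hypothetical gnutella.net-style seed list

def open_private_connection(host, port, timeout=10):
    sock = socket.create_connection((host, port), timeout=timeout)
    handshake = (
        "MYNET CONNECT/0.1\r\n"        # custom token instead of "GNUTELLA CONNECT/0.6"
        "User-Agent: MyNet/0.1\r\n"
        "X-Ultrapeer: False\r\n"
        "\r\n"
    )
    sock.sendall(handshake.encode("ascii"))
    reply = sock.recv(4096).decode("ascii", errors="replace")
    # Only peers speaking the private token are accepted; everything else is dropped.
    if not reply.startswith("MYNET/0.1 200"):
        sock.close()
        raise ConnectionError("peer did not speak the private protocol: %r" % reply[:40])
    sock.sendall(b"MYNET/0.1 200 OK\r\n\r\n")  # third step completes the handshake
    return sock

if __name__ == "__main__":
    for host, port in BOOTSTRAP_PEERS:
        try:
            conn = open_private_connection(host, port)
            print("connected to private network via", host)
            conn.close()
        except OSError as exc:
            print("could not join via", host, "-", exc)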
> -----Original Message-----
> From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> Behalf Of Salem Mark
> Sent: Thursday, December 01, 2005 1:28 PM
> To: p2p-hackers@zgp.org
> Subject: RE: [p2p-hackers] p2p framework
>
> >From: "Greg Bildson"
> >You could build off the limewire.org open source code (currently down) or
> >off of the gtk-gnutella or gnucleus source.  You would want to form a
> >separate network since current vendors would consider alternate uses of the
> >existing network as pollution.
>
> Could you please elaborate on forming a separate network under Gnutella?
>
> I was thinking of using the Echomine-Muse gnutella API, which facilitates
> sending custom messages in Gnutella, as a technique for Jabber Servers to
> collorabote and achieve global service discovery.
>
> Thanks.
>
> - Salem
>
> >If you envisioned similar services as exist or extensions to the existing
> >services then this might make sense.  If you want something as a basis for
> >a new clean protocol, I might not recommend it since some aspects of the
> >protocol are a little ugly underneath the covers.  However, extension
> >mechanisms do for most message types.
> >
> >Thanks
> >-greg
> >
> > > -----Original Message-----
> > > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> > > Behalf Of Davide Carboni
> > > Sent: Wednesday, November 30, 2005 1:17 PM
> > > To: Peer-to-peer development.
> > > Subject: Re: [p2p-hackers] p2p framework
> > >
> > > On 11/29/05, Greg Bildson wrote:
> > > > Do you mean Gnutella's use as a framework or otherwise?
> > > >
> > > Yes I do. My question is: are there some implementation of gnutella
> > > that can be used to build upon new applications and to develop new
> > > services (beyond simple file sharing) ?
> > >
> > > D.

From rrrw at neofonie.de Thu Dec 1 20:48:45 2005
From: rrrw at neofonie.de (Ronald Wertlen)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <20051130223956.C99CC3FEE8@capsicum.zgp.org>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org>
Message-ID: <438F61AD.80406@neofonie.de>

Hi Adam,

perhaps you have not understood my message because you have not noticed the
focus on "precision and recall" (i.e. search), not the old Distributed DB
vs. own DB debate.  You have also pigeon-holed my email with the DHT crowd
(*grin*), it couldn't be further from it!

I was arguing in the other direction - which coderman thankfully picked up.
Gnutella doesn't structure enough, that's all.  Sure Gnutella beats DHTs on
search - I base that observation on a project I finished last year - a
public prototype that used JXTA and was honed for search using super-peers
[DFN S2S http://s2s.neofonie.de/ (German site) - we've moved on some since
then ;) ].

Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows practically
anyone to elevate to super-peer, which results in a random (power-law
distribution) network.  Such a network is not going to perform very well as
far as recall and precision are concerned, past a certain point.  I would be
interested to calculate that exact point (but doubting I'll get to it some
time soon :-/).

HTH.

Best regards, Ron

PS. seems this thread has driven the original author to reformulate his
statement... :-)

PPS.
In fact, the network is not going to be completely random - it will follow
the contours of the internet (distribution of servers, broadband
connections, users, etc. is not random).  I am not sure if that destroys or
supports my argument.  Back to the drawing board!

We actually need a better internet.  [oops there I go getting unspecific
again, sorry!! ;-) ]
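[For reference, the standard information-retrieval definitions of the two
terms used above - this is textbook material, not something defined in the
thread.  For a given query, let R be the set of relevant documents present
in the network and A the set of documents the search actually returns:

    precision = |R ∩ A| / |A|   (fraction of returned results that are relevant)
    recall    = |R ∩ A| / |R|   (fraction of the relevant documents that were found)

A search that visits only part of the network - a bounded random walk or a
TTL-limited super-peer query - typically gives up recall first.]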
> Message: 4
> Date: Wed, 30 Nov 2005 16:42:39 -0500
> From: Adam Fisk
> Subject: Re: [p2p-hackers] Re: scalability
> To: "Peer-to-peer development."
> Message-ID: <438E1CCF.4010907@speedymail.org>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> I don't understand your post.  When you say "critical", I assume you're
> talking about life and death situations?  Are you talking about anything
> specifically?  DHTs have failure rates.  Ad hoc and mesh networks can
> become useful in emergency situations where conventional infrastructures
> break down, but the centralized/p2p/structured/unstructured questions
> here are far from obvious.
>
> On the "obsessive science types" issue, this completely misses the
> point.  It's a very non "obsessive science type" statement.  There are
> strong reasons for using the massive indexing/random walk approach above
> DHTs -- reasons that have nothing to do with scalability.  In
> particulary, DHTs are, well, hash tables.  Hash tables don't work well
> for metadata queries.  They do fine for keywords (hotspots are a
> problem, but they can be solved), but they aren't as nice a fit for
> metadata.  RDF and DHTs are tough to squeeze together, for example.  The
> massive indexing (mutual index caching to use Serguei's term)/random
> walk approach can get around these issues more easily.  They are also
> not nearly as brittle as DHTs.  Sure, DHTs repair themselves after node
> joins and leaves, but node transience generally has a much greater
> effect on DHTs than it does on massive indexing networks.
>
> I also think you're underestimating the efficiency of massive indexing
> and random walks.  Sure, these networks don't scale logarithmically, but
> they do pretty darn well.
>
> I encourage everyone to stay specific with their posts.
>
> All the Best,
>
> Adam

From agthorr at cs.uoregon.edu Thu Dec 1 20:52:16 2005
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <438F61AD.80406@neofonie.de>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org> <438F61AD.80406@neofonie.de>
Message-ID: <20051201205215.GF5300@cs.uoregon.edu>

On Thu, Dec 01, 2005 at 09:48:45PM +0100, Ronald Wertlen wrote:
> Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows
> practically anyone to elevate to super-peer, which results in a random
> (power-law distribtion) network.

Gnutella is not a power-law network.  See my paper on the graph properties
of Gnutella, presented at the Internet Measurement Conference earlier this
year:

http://www.usenix.org/events/imc05/tech/stutzbach.html

> Such a network is not going to perform very well as far as recall
> and precision are concerned, past a certain point.  I would be
> interested to calculate that exact point (but doubting I'll get to
> it some time soon :-/).

Could you rigorously define recall and precision for me?  I'm not sure what
you mean by these terms.

--
Daniel Stutzbach
Computer Science Ph.D Student
http://www.barsoom.org/~agthorr
University of Oregon

From afisk at speedymail.org Thu Dec 1 21:09:22 2005
From: afisk at speedymail.org (Adam Fisk)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <438F61AD.80406@neofonie.de>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org> <438F61AD.80406@neofonie.de>
Message-ID: <438F6682.6070806@speedymail.org>

Hi Ron-

Apologies for the DHT pigeon-holing.  I had this nagging feeling in my
stomach that you may come more from the land of small world and power law
networks, but I successfully suppressed it!  I agree with Daniel that
Gnutella's not actually a power law network, although I can't remember what
led me to decide that (several years ago now).  If I recall correctly, it's
that degrees between nodes are quite fixed and uniform.

How would you prefer superpeers get elected?
Superpeer election on Gnutella is fairly simple primarily because there's a scarcity of non-firewalled/NATted machines to fill their roles, so you have to sort of take what you can get. Are you referring more to which superpeers to *select* over the course of a search and not the original choice of superpeers? On the Gnutella 0.6/0.7 issue, that's really just the version of the specification for connection headers -- a frequent source of confusion. Gnutella has rightfully evolved into a family of protocols that themselves have version numbers -- everything from superpeers to dynamic querying to bloom filter exchange and mesh downloading. All of these evolve largely independently from one another, giving the protocol family much more flexibility and agility. All the Best, Adam Ronald Wertlen wrote: > Hi Adam, > > perhaps you have not understood my message because you have not > noticed the focus on "precision and recall" (i.e. search) not the old > Distributed DB vs. own DB debate. You have also pigeon-holed my email > with the DHT crowd (*grin*), it couldn't be further from it! > > I was arguing in the other direction - which coderman thankfully > picked up. Gnutella doesn't structure enough, that's all. Sure > Gnutella beats DHTs on search - I base that observation on a project I > finished last year - a public prototype that used JXTA and was honed > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > (German site) - we've moved on some since them ;) ]. > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > practically anyone to elevate to super-peer, which results in a random > (power-law distribtion) network. Such a network is not going to > perform very well as far as recall and precision are concerned, past a > certain point. I would be interested to calculate that exact point > (but doubting I'll get to it some time soon :-/). > > HTH. > > Best regards, Ron > > PS. seems this thread has driven the original author to reformulate > his statment... :-) > > PPS. > In fact, the network is not going to be completely random - it will > follow the contours of the internet (distribution of servers, > broadband connections, users, etc. is not random). I am not sure if > that destroys or supports my argument. Back to the drawing board! > > We actually need a better internet. [oops there I go getting > unspecific again, sorry!! ;-) ] > > >> Message: 4 >> Date: Wed, 30 Nov 2005 16:42:39 -0500 >> From: Adam Fisk >> Subject: Re: [p2p-hackers] Re: scalability To: "Peer-to-peer >> development." >> Message-ID: <438E1CCF.4010907@speedymail.org> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >> >> I don't understand your post. When you say "critical", I assume >> you're talking about life and death situations? Are you talking >> about anything specifically? DHTs have failure rates. Ad hoc and >> mesh networks can become useful in emergency situations where >> conventional infrastructures break down, but the >> centralized/p2p/structured/unstructured questions here are far from >> obvious. >> >> On the "obsessive science types" issue, this completely misses the >> point. It's a very non "obsessive science type" statement. There >> are strong reasons for using the massive indexing/random walk >> approach above DHTs -- reasons that have nothing to do with >> scalability. In particulary, DHTs are, well, hash tables. Hash >> tables don't work well for metadata queries. 
They do fine for >> keywords (hotspots are a problem, but they can be solved), but they >> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze >> together, for example. The massive indexing (mutual index caching to >> use Serguei's term)/random walk approach can get around these issues >> more easily. They are also not nearly as brittle as DHTs. Sure, >> DHTs repair themselves after node joins and leaves, but node >> transience generally has a much greater effect on DHTs than it does >> on massive indexing networks. >> >> I also think you're underestimating the efficiency of massive >> indexing and random walks. Sure, these networks don't scale >> logarithmically, but they do pretty darn well. >> I encourage everyone to stay specific with their posts. >> >> All the Best, >> >> Adam > > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From srhea at cs.berkeley.edu Thu Dec 1 21:11:02 2005 From: srhea at cs.berkeley.edu (Sean Rhea) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] DHTs in highly-transient networks In-Reply-To: References: Message-ID: <6F218011-BE6D-4C78-A86E-EFF47546C18D@cs.berkeley.edu> On Dec 1, 2005, at 1:38 PM, Salem Mark wrote: > I have read in several papers that it is unlikely that the > integrity of the DHT can be maintained where there is a high node > or link failure rate without significant message transmission > overhead. In other words, it is mentioned that, in "highly > transient networks", where the number of nodes appearing and > disappearing are very high, maintaining the DHT becomes hard and > introduces considerable overhead. > > I am trying to find out what exactly "highly-transient" means. A > file sharing network like Gnutella, seems to be highly transient, > where peers join/leave the network frequently. Could somebody > elaborate on this? is there a node departure/arrival/failure rate > (per sec? per min?) that identifies "highly-transient" networks ? > In the Bamboo USENIX paper, we talked about the average time a node was connected to the network before disconnecting. Bamboo and Chord are definitely resilient (at a routing level) even when that period is a short as a few minutes: http://srhea.net/papers/bamboo-usenix.pdf Other DHTs may be this resilient as well, but I don't have data for them. Sean -- There is no end to the fragility of our democracy. -- Ralph Nader -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051201/f244438a/PGP.pgp From agthorr at cs.uoregon.edu Thu Dec 1 21:15:12 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <438F6682.6070806@speedymail.org> References: <20051130223956.C99CC3FEE8@capsicum.zgp.org> <438F61AD.80406@neofonie.de> <438F6682.6070806@speedymail.org> Message-ID: <20051201211511.GH5300@cs.uoregon.edu> On Thu, Dec 01, 2005 at 04:09:22PM -0500, Adam Fisk wrote: > On the Gnutella 0.6/0.7 issue, that's really just the version of the > specification for connection headers -- a frequent source of confusion. 
> Gnutella has rightfully evolved into a family of protocols that
> themselves have version numbers -- everything from superpeers to dynamic
> querying to bloom filter exchange and mesh downloading.  All of these
> evolve largely independently from one another, giving the protocol
> family much more flexibility and agility.

I suggest adding text similar to this to the GDF Wiki main page (which
apparently cannot be edited by normal wiki users), and changing
"RFC-Gnutella 0.6" to "Gnutella Protocol Family" or the like.

--
Daniel Stutzbach
Computer Science Ph.D Student
http://www.barsoom.org/~agthorr
University of Oregon

From m.rogers at cs.ucl.ac.uk Thu Dec 1 22:53:24 2005
From: m.rogers at cs.ucl.ac.uk (Michael Rogers)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] DHTs in highly-transient networks
In-Reply-To: <6F218011-BE6D-4C78-A86E-EFF47546C18D@cs.berkeley.edu>
References: <6F218011-BE6D-4C78-A86E-EFF47546C18D@cs.berkeley.edu>
Message-ID: <438F7EE4.9030209@cs.ucl.ac.uk>

Sean Rhea wrote:
> In the Bamboo USENIX paper, we talked about the average time a node was
> connected to the network before disconnecting.  Bamboo and Chord are
> definitely resilient (at a routing level) even when that period is a
> short as a few minutes:

To what extent does this depend on the distribution of session times as
well as the mean?  Kademlia assumes that old nodes will outlive new nodes,
and Daniel's paper shows that Gnutella contains an emergent core of
long-lived nodes - how well do Bamboo and Chord survive under non-uniform
churn?

Cheers,
Michael

From srhea at cs.berkeley.edu Thu Dec 1 23:01:51 2005
From: srhea at cs.berkeley.edu (Sean Rhea)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] DHTs in highly-transient networks
In-Reply-To: <438F7EE4.9030209@cs.ucl.ac.uk>
References: <6F218011-BE6D-4C78-A86E-EFF47546C18D@cs.berkeley.edu> <438F7EE4.9030209@cs.ucl.ac.uk>
Message-ID:

On Dec 1, 2005, at 5:53 PM, Michael Rogers wrote:
> To what extent does this depend on the distribution of session
> times as well as the mean?  Kademlia assumes that old nodes will
> outlive new nodes, and Daniel's paper shows that Gnutella contains
> an emergent core of long-lived nodes - how well do Bamboo and Chord
> survive under non-uniform churn?

We used exponentially-distributed node lifetimes, so old nodes do not
generally outlive new ones.  However, I _think_ that choice only makes the
problem harder.  In particular, I would suspect that Bamboo/Chord would do
just as well if old nodes lived longer than new ones, and possibly better.
They won't take advantage of it like Kademlia does, but it shouldn't hurt
them either.  (At least that's my guess; I don't have data to prove it.)

Sean
--
When I see the price that you pay / I don't wanna grow up -- Tom Waits
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 186 bytes
Desc: This is a digitally signed message part
Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051201/326f82d2/PGP.pgp
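[A toy illustration of the distribution-vs-mean question raised above; the
numbers (5-minute mean session time, 30-second maintenance interval) and the
choice of a Pareto tail are invented for the example and are not taken from
the Bamboo paper.]

# churn_toy.py - hold the mean session time fixed and compare how many
# sessions end within one routing-table maintenance interval under
# exponential vs. heavy-tailed (Pareto) lifetimes.
import random

MEAN_SESSION = 300.0   # seconds; hypothetical mean node lifetime
MAINTENANCE = 30.0     # seconds between neighbor liveness checks
N = 100000

def exp_lifetimes(mean, n=N):
    return [random.expovariate(1.0 / mean) for _ in range(n)]

def pareto_lifetimes(mean, alpha=1.5, n=N):
    # Pareto with shape alpha and scale xm has mean xm * alpha / (alpha - 1).
    xm = mean * (alpha - 1) / alpha
    return [xm * random.paretovariate(alpha) for _ in range(n)]

def frac_shorter_than(lifetimes, t):
    return sum(1 for x in lifetimes if x < t) / len(lifetimes)

if __name__ == "__main__":
    random.seed(1)
    for name, sample in [("exponential", exp_lifetimes(MEAN_SESSION)),
                         ("pareto(1.5)", pareto_lifetimes(MEAN_SESSION))]:
        mean = sum(sample) / len(sample)
        short = 100.0 * frac_shorter_than(sample, MAINTENANCE)
        print("%-12s  mean=%6.1fs  ended within one maintenance period: %5.1f%%"
              % (name, mean, short))
    # Same mean, very different short-session behaviour: the failure rate a
    # routing table sees between maintenance rounds depends on the shape of
    # the session-time distribution, not just on its mean.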
From john.casey at gmail.com Fri Dec 2 00:07:56 2005
From: john.casey at gmail.com (John Casey)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability (was: p2p framework)
In-Reply-To: <438E067B.2040408@neofonie.de>
References: <20051130095529.6CAE83FEB8@capsicum.zgp.org> <438E067B.2040408@neofonie.de>
Message-ID:

On 12/1/05, Ronald Wertlen wrote:
> Hi,
>
> Gnutella-bashing certainly may be fun, the truth is, it is tremendously
> well-adapted for its purpose (I think Serguei's said the relevant stuff).
>
> However, I also believe it is pretty clear that from a search point of
> view, a random super-peer based network does not scale - it is never
> going to get the kind of precision and recall that we would call
> intelligent. It would be too slow or too inaccurate.

But if you index everything in some sort of distributed inverted index on
top of a DHT, a lot of document postings and related metadata still have to
be exported to the network, which isn't such a great solution either.  The
worst thing is that semantically close terms and documents are going to be
scattered to random, remote locations in the network for indexing.

Personally, what I think is needed here is a slightly coarser indexing
structure, so that instead of publishing 1000s of term->document pointers,
or at the other extreme a few term->peer pointers as with PlanetP, there is
some sort of middle ground such as term->cluster-id which is better able to
direct a search to sensible peers.  The difficulty with this approach, of
course, is that it isn't that easy to construct sensible global clusters
from local cluster definitions, as different local document databases will
index different terms and the like.
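[A minimal sketch of the term->cluster-id middle ground described above.
Everything here is hypothetical: the clusters are whatever a peer's local
clustering step produces, and a plain dict stands in for the DHT or overlay
that would actually hold the shared index.]

# coarse_index.py - publish one posting per (term, cluster) instead of one
# per (term, document); queries resolve to (peer, cluster) targets that then
# run the real search locally.
from collections import defaultdict

dht = defaultdict(set)  # term -> set of (peer_id, cluster_id); stand-in for the overlay

def publish_local_index(peer_id, clusters):
    """clusters: {cluster_id: {doc_id: set_of_terms}} from any local clustering step."""
    for cluster_id, docs in clusters.items():
        cluster_terms = set().union(*docs.values())
        for term in cluster_terms:
            dht[term].add((peer_id, cluster_id))   # one posting per term and cluster

def route_query(terms):
    """Return candidate (peer_id, cluster_id) pairs matching every query term."""
    postings = [dht.get(t, set()) for t in terms]
    if not postings or not all(postings):
        return set()
    return set.intersection(*postings)

if __name__ == "__main__":
    publish_local_index("peerA", {
        "music":  {"doc1": {"jazz", "miles", "davis"}, "doc2": {"jazz", "coltrane"}},
        "papers": {"doc3": {"dht", "churn", "routing"}},
    })
    publish_local_index("peerB", {
        "papers": {"doc9": {"dht", "gnutella", "routing"}},
    })
    # Both peers' "papers" clusters match; the query never sees individual documents.
    print(route_query({"dht", "routing"}))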
From baoguai2000 at gmail.com Fri Dec 2 03:06:21 2005
From: baoguai2000 at gmail.com (zheng j)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] help:is anyone interested in living streaming or vod over p2p?
Message-ID:

Hi, I am now doing research on living streaming and Vod over p2p, but
I don't know who can I discuss my idea with, you know, without idea
exchange, I feel very confused and annoyed. Who can tell me which
website I can find someone interested in it? And, if you are
interested in it, please contact me.

From joaquin.keller at francetelecom.com Fri Dec 2 04:05:36 2005
From: joaquin.keller at francetelecom.com (KELLER Joaquin RD-MAPS-ISS)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] help:is anyone interested in living streaming or vod over p2p?
In-Reply-To:
References:
Message-ID:

Hi Zheng,

We are working on that live streaming (not on VoD):
http://pulse.netofpeers.net/

-- Joaquin

On 12/1/05, zheng j wrote:
>
> Hi, I am now doing research on living streaming and Vod over p2p, but
> I don't know who can I discuss my idea with, you know, without idea
> exchange, I feel very confused and annoyed. Who can tell me which
> website I can find someone interested in it? And, if you are
> interested in it, please contact me.
>
--
___________________________________________________________
Joaquin Keller
MAPS/MMC - France Telecom - Division R&D
38-40, rue du General Leclerc
92794 Issy Moulineaux Cedex 9
Tel: +33 (0)1 45 29 52 86
Fax: +33 (0)1 45 29 52 94
joaquin.keller@rd.francetelecom.com
http://solipsis.netofpeers.net/

From redist-p2p-hackers at lothar.com Fri Dec 2 07:54:31 2005
From: redist-p2p-hackers at lothar.com (Brian Warner)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: P2P in SFC
Message-ID: <20051201.235431.65614928.warner@lothar.com>

> Regardless, let's wait for the final guest list before deciding if we
> switch locales.

I'll be there too.  Thanks for setting this up!

 -Brian

From aloeser at cs.tu-berlin.de Fri Dec 2 09:28:49 2005
From: aloeser at cs.tu-berlin.de (Alexander Löser)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <438F61AD.80406@neofonie.de>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org> <438F61AD.80406@neofonie.de>
Message-ID: <439013D1.7020803@cs.tu-berlin.de>

Hi Adam,

originally there was a certain type of clustering in the beginnings of
Gnutella (late 90ies).  People communicated IDs by word of mouth or via
email or deja news, so in most cases you got IDs from people who had at
least similar interests, or from people where you expected some interesting
files.  Later, due to the overwhelming attractiveness of the gnutella
application they introduced the gtk and other bootstrapping alternatives,
giving you a number of starting pointers.  However, these starting points
are chosen 'randomly', so there is no longer any clustering by interests.

We (Berlin and Karlsruhe) developed a new protocol (INGA, Interest-based
Node Grouping Algorithm [1][2]) that reclusters the network based on the
interests of the peers, without any DHT, using only an unstructured
network.  Similar to freenet, the network topology evolves over a while
into a so-called small-world topology, where people with similar interests
are clustered together.  In addition, to further speed up the clustering
process, peers also keep other peers that are 'HUBs' in the network (e.g.
having a high in and out degree) in a local index structure.  Our
experiments show that we significantly outperform Gnutella-style approaches
in message overhead even in highly volatile networks.

Best's Alex

[1] Searching Dynamic Communities with Personal Indexes.  Löser, Tempich
et al.  3rd International Semantic Web Conference, Galway.  Springer 2005.
http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf
[2] Remindin': Semantic query routing in peer-to-peer networks based on
social metaphors.  Tempich et al.  WWW 2004, New York.  ACM 2004.
http://www.aifb.uni-karlsruhe.de/Publikationen/showPublikation?publ_id=447

Ronald Wertlen wrote:
> Hi Adam,
>
> perhaps you have not understood my message because you have not
> noticed the focus on "precision and recall" (i.e. search) not the old
> Distributed DB vs. own DB debate. You have also pigeon-holed my email
> with the DHT crowd (*grin*), it couldn't be further from it!
>
> I was arguing in the other direction - which coderman thankfully
> picked up. Gnutella doesn't structure enough, that's all.
Sure > Gnutella beats DHTs on search - I base that observation on a project I > finished last year - a public prototype that used JXTA and was honed > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > (German site) - we've moved on some since them ;) ]. > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > practically anyone to elevate to super-peer, which results in a random > (power-law distribtion) network. Such a network is not going to > perform very well as far as recall and precision are concerned, past a > certain point. I would be interested to calculate that exact point > (but doubting I'll get to it some time soon :-/). > > HTH. > > Best regards, Ron > > PS. seems this thread has driven the original author to reformulate > his statment... :-) > > PPS. > In fact, the network is not going to be completely random - it will > follow the contours of the internet (distribution of servers, > broadband connections, users, etc. is not random). I am not sure if > that destroys or supports my argument. Back to the drawing board! > > We actually need a better internet. [oops there I go getting > unspecific again, sorry!! ;-) ] > > >> Message: 4 >> Date: Wed, 30 Nov 2005 16:42:39 -0500 >> From: Adam Fisk >> Subject: Re: [p2p-hackers] Re: scalability To: "Peer-to-peer >> development." >> Message-ID: <438E1CCF.4010907@speedymail.org> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >> >> I don't understand your post. When you say "critical", I assume >> you're talking about life and death situations? Are you talking >> about anything specifically? DHTs have failure rates. Ad hoc and >> mesh networks can become useful in emergency situations where >> conventional infrastructures break down, but the >> centralized/p2p/structured/unstructured questions here are far from >> obvious. >> >> On the "obsessive science types" issue, this completely misses the >> point. It's a very non "obsessive science type" statement. There >> are strong reasons for using the massive indexing/random walk >> approach above DHTs -- reasons that have nothing to do with >> scalability. In particulary, DHTs are, well, hash tables. Hash >> tables don't work well for metadata queries. They do fine for >> keywords (hotspots are a problem, but they can be solved), but they >> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze >> together, for example. The massive indexing (mutual index caching to >> use Serguei's term)/random walk approach can get around these issues >> more easily. They are also not nearly as brittle as DHTs. Sure, >> DHTs repair themselves after node joins and leaves, but node >> transience generally has a much greater effect on DHTs than it does >> on massive indexing networks. >> >> I also think you're underestimating the efficiency of massive >> indexing and random walks. Sure, these networks don't scale >> logarithmically, but they do pretty darn well. >> I encourage everyone to stay specific with their posts. >> >> All the Best, >> >> Adam > > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > -- ___________________________________________________________ Dr. Alexander L?ser, Technische Universit?t Berlin, CIS, Sekr. 
EN 7, Einsteinufer 17, 10587 Berlin, GERMANY
office: +49-30-314-25556   fax: +49-30-314-21601
web: http://cis.cs.tu-berlin.de/~aloeser/
___________________________________________________________

From gwendal.simon at francetelecom.com Fri Dec 2 09:38:14 2005
From: gwendal.simon at francetelecom.com (SIMON Gwendal RD-MAPS-ISS)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
Message-ID:

Hi Alexander,

This work is close to the one we are doing for Maay [1].  As we have just
begun to implement it, it would be great if you could participate in the
early protocol discussion on the mailing-list.

The current Maay implementation [2] is very open.  We are developing a basic
indexer that communicates through XML-RPC with the "Maay node".  The "Maay
node" manages communication and the SQL database.  It can be controlled
through a web interface.

Have fun !

-- Gwendal

[1]: MAAY: a decentralized personalized search system, F. Dang Ngoc,
J. Keller, G. Simon.  SAINT'2006.
http://maay.netofpeers.net/documentation/maay_SAINT2006.pdf
[2]: http://maay.netofpeers.net

> -----Original Message-----
> From: p2p-hackers-bounces@zgp.org
> [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Alexander Löser
> Sent: Friday, December 2, 2005 10:29
> To: Peer-to-peer development.
> Subject: Re: [p2p-hackers] Re: scalability
>
> Hi Adam,
> originally there was a certain type of clustering in the beginnings of
> Gnutella (late 90ies) . People communicate its ids mouth to mouth or via
> Email or deja news to other people. So in most cases you got Ids from
> people which had at least similar interests, or from people where you
> expected some interesting files. Later, due to the overwhelming
> attractiveness of the gnutella application they introduced the gtk and
> other bootstrapping alternatives, given you a number of starting
> pointers. However, this starting points a chosen 'randomly', so there is
> no longer any clustering by interests.
>
> We (Berlin and Karlsruhe) developed a new protocol (INGA Interest based
> Node Grouping Algorithm [1][2]) , that reclusters the network based on
> the interests of the peers, without any DHT, only using on an
> unstructured network. Similar to freenet, the network topology evolves
> over a while to a so called small world topology, where people with
> similar interests are clustered together. In addition, to further speed
> up the clustering process, peers also keep in a local index structures
> other peers, that are 'HUBs' in the network, e.g. having a high in and
> out degree. Our experiments show, that we significantly outperform
> Gnutella style approaches in messages even in highly volatile networks.
>
> Best's Alex
>
> [1] Searching Dynamic Communities with Personal Indexes. Löser, Tempich
> et.al 3rd. International Semantic Web Conference, Galway. Springer 2005
> http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf
> [2] Remindin': Semantic query routing in peer-to-peer networks based on
> social metaphors. Tempich et.al. WWW 2004, New York. ACM 2004
> http://www.aifb.uni-karlsruhe.de/Publikationen/showPublikation?publ_id=447
>
> Ronald Wertlen wrote:
> > Hi Adam,
> >
> > perhaps you have not understood my message because you have not
> > noticed the focus on "precision and recall" (i.e. search) not the old
> > Distributed DB vs. own DB debate. You have also pigeon-holed my email
> > with the DHT crowd (*grin*), it couldn't be further from it!
> > > > I was arguing in the other direction - which coderman thankfully > > picked up. Gnutella doesn't structure enough, that's all. Sure > > Gnutella beats DHTs on search - I base that observation on > a project I > > finished last year - a public prototype that used JXTA and > was honed > > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > > (German site) - we've moved on some since them ;) ]. > > > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > > practically anyone to elevate to super-peer, which results > in a random > > (power-law distribtion) network. Such a network is not going to > > perform very well as far as recall and precision are > concerned, past a > > certain point. I would be interested to calculate that exact point > > (but doubting I'll get to it some time soon :-/). > > > > HTH. > > > > Best regards, Ron > > > > PS. seems this thread has driven the original author to reformulate > > his statment... :-) > > > > PPS. > > In fact, the network is not going to be completely random - it will > > follow the contours of the internet (distribution of servers, > > broadband connections, users, etc. is not random). I am not sure if > > that destroys or supports my argument. Back to the drawing board! > > > > We actually need a better internet. [oops there I go getting > > unspecific again, sorry!! ;-) ] > > > > > >> Message: 4 > >> Date: Wed, 30 Nov 2005 16:42:39 -0500 > >> From: Adam Fisk > >> Subject: Re: [p2p-hackers] Re: scalability To: "Peer-to-peer > >> development." > >> Message-ID: <438E1CCF.4010907@speedymail.org> > >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed > >> > >> I don't understand your post. When you say "critical", I assume > >> you're talking about life and death situations? Are you talking > >> about anything specifically? DHTs have failure rates. Ad hoc and > >> mesh networks can become useful in emergency situations where > >> conventional infrastructures break down, but the > >> centralized/p2p/structured/unstructured questions here are > far from > >> obvious. > >> > >> On the "obsessive science types" issue, this completely misses the > >> point. It's a very non "obsessive science type" statement. There > >> are strong reasons for using the massive indexing/random walk > >> approach above DHTs -- reasons that have nothing to do with > >> scalability. In particulary, DHTs are, well, hash tables. Hash > >> tables don't work well for metadata queries. They do fine for > >> keywords (hotspots are a problem, but they can be solved), > but they > >> aren't as nice a fit for metadata. RDF and DHTs are tough > to squeeze > >> together, for example. The massive indexing (mutual index > caching to > >> use Serguei's term)/random walk approach can get around > these issues > >> more easily. They are also not nearly as brittle as DHTs. Sure, > >> DHTs repair themselves after node joins and leaves, but node > >> transience generally has a much greater effect on DHTs > than it does > >> on massive indexing networks. > >> > >> I also think you're underestimating the efficiency of massive > >> indexing and random walks. Sure, these networks don't scale > >> logarithmically, but they do pretty darn well. > >> I encourage everyone to stay specific with their posts. 
> >> > >> All the Best, > >> > >> Adam > > > > > > > > _______________________________________________ > > p2p-hackers mailing list > > p2p-hackers@zgp.org > > http://zgp.org/mailman/listinfo/p2p-hackers > > _______________________________________________ > > Here is a web page listing P2P Conferences: > > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > > > > -- > ___________________________________________________________ > > Dr. Alexander L?ser, > Technische Universit?t Berlin, > CIS, Sekr. EN 7, Einsteinufer 17, 10587 Berlin, GERMANY > office: +49- 30-314-25556 fax: +49- 30-314-21601 > web: http://cis.cs.tu-berlin.de/~aloeser/ > ___________________________________________________________ > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From ian.clarke at gmail.com Fri Dec 2 12:07:32 2005 From: ian.clarke at gmail.com (Ian Clarke) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <20051129140314.046DD698@yumyum.zooko.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> Message-ID: <823242bd0512020407i252b84c4u@mail.gmail.com> On 29/11/05, zooko@zooko.com wrote: > However, the media seems to have started using the word "Darknet" to mean a > friend-to-friend net and/or a blacknet [7, 8], thus simultaneously making it > harder for people to think about blacknets which are based on other than > friend-to-friend architectures and making it harder for people to think about > friend-to-friend networks which are used for other than illegal information > sharing. > > I place some of the blame for this development on the Freenet folks, who may be > the first to promulgate this munging, and if they aren't the first they're > certainly the most effective. As Michael Rogers pointed out, I am not sure this is as clear-cut as you suggest, the goal for Freenet 0.7 is very close to the idea outlined in the caption for Fig. 3 of the Microsoft Darknet paper, which is a friend-to-friend network. That paper may be the first common usage of the term "darknet", but so far as I can see, it contains no concise definition of what a "darknet" is. I would therefore say that there is no authorative basis on which to invalidate any particular definition of the term that is broadly within the area of P2P networks which conceal user activity. As such, defining the term "darknet" as a f2f network that is designed to conceal the activities of its participants (this being, so far as I have seen, one of the main motivations for building an f2f network), is as valid a definition as any other I have seen (and more useful than most). As a side-point, I think it is somewhat pejorative to say that any technology is "designed" for illegal usage, just because it conceals user activity and therefore may be capable of illegal usage. There are many legal reasons why people might wish to preserve their anonymity and privacy. Ian. 
From adam at cypherspace.org Fri Dec 2 13:35:16 2005 From: adam at cypherspace.org (Adam Back) Date: Sat Dec 9 22:13:05 2006 Subject: idealized content network properties (Re: [p2p-hackers] darknet) In-Reply-To: <823242bd0512020407i252b84c4u@mail.gmail.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> Message-ID: <20051202133516.GA15480@bitchcake.off.net> I think an ideal www2 network should: 1. have any content searchable by anyone (the contents are public) 2. make it hard to determine who the author of content is 3. make it hard for people other than the author to remove content 4. make it hard for people to observe what other people are downloading 5. make it hard for anyone to change content (new version and navigating by version should be the way to "change") It seems to me that this network can provide any of these subset classifications trivially. removing 1 makes a eg "friend-to-friend" network -- that just means you encrypt the searchable tags and content with a shared key. removing 2 you just sign the content. and so forth. (Making it hard for people other than the author to remove content technically probably involves things like redundancy, transience of service, opaque content to its current server location, indirection etc) (The author also should be able to arrange that he himself can't remove the content, by intentionally discarding whatever keys give him the technical means to remove or change the content). > As a side-point, I think it is somewhat pejorative to say that any > technology is "designed" for illegal usage, just because it conceals > user activity and therefore may be capable of illegal usage. There > are many legal reasons why people might wish to preserve their > anonymity and privacy. Yeah. I think my feature set at the top should be the default/base set of properties exhibited by the www2 (next gen web). Any voluntary restrictions on these should be entered into by policy. Say content X is illegal in jurisdiction Y, then Y should publish a blacklist identifying content X and the legal system in jurisdiction Y should if it chooses make it illegal to not consult the blacklist. I mean illegality is not even consistent, there are things which are legally required in Y that are illegal in Z. There is and can be no globally acceptable policy, so we must robustly technologically prevent global enforcement. Adam From m.rogers at cs.ucl.ac.uk Fri Dec 2 14:19:29 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:05 2006 Subject: idealized content network properties (Re: [p2p-hackers] darknet) In-Reply-To: <20051202133516.GA15480@bitchcake.off.net> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202133516.GA15480@bitchcake.off.net> Message-ID: <439057F1.2060208@cs.ucl.ac.uk> Adam Back wrote: > removing 1 makes a eg "friend-to-friend" network -- that just means > you encrypt the searchable tags and content with a shared key. Not sure about this one - I think the use of group keys is orthogonal to the use of a friend-to-friend topology. For example Groove uses group keys without f2f, Freenet 0.7 will use f2f without group keys, and WASTE uses neither (but still fits under the "darknet" umbrella because it's invitation-only). 
Cheers, Michael From zooko at zooko.com Fri Dec 2 15:45:57 2005 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <823242bd0512020407i252b84c4u@mail.gmail.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> Message-ID: <20051202154557.E559F191C@yumyum.zooko.com> Ian, p2p-hackers: It's not my goal to quibble about etymology (except inasmuch as it is useful to preserve the historical record). My goals are: 1. Avoid ambiguity -- where some people think that word X denotes concept 1, and others think that word X denotes concept 2. Especially if concepts 1 and 2 are related but not identical. Especially if one of them is politically incendiary. 2. Make sure we have names for our useful concepts. However, before I get to that I am going to go through the history one last time in order to cast light on the current problem. I turned up some interesting details. Let's start with a Venn diagram: _______ _______ / \ / \ / \ / \ / \/ \ / /\ \ / / \ \ | | | | | 1 |1^2 | 2 | | | | | | | | | \ \ / / \ \/ / \ /\ / \ / \ / \_______/ \_______/ Let 1 be the set of networks which are used for illegal transmission of information, and 2 be the set of networks which are built on f2f connections, and 1^2 be the intersection -- the set of networks which are used for illegal transmission of information and which are built on f2f connections. [bepw2002] introduces "darknet" to mean concept 1. In their words darknet is "a collection of networks and technologies used to share digital content", and they use it consistently within that meaning. They refer to concept 2, starting in section 2.1, using the term "small-world nets", and they clearly distinguish between what they call "small-world darknets" and "non-small-world darknets". However nowadays some people in the mass media seem to think that a "darknet" means primarily a network which is "invitation-only", i.e. a "small-world" or "f2f" net [globe]. When did the meaning shift? Ooh -- how interesting to examine the evolution of this word on [wikipedia]! The original definition on wikipedia was written on 2004-09-30. It read in full: "Darknet is a broad term to denote the networks and technologies that enable users to copy and share digital material. The term was coined in a paper from four Microsoft Research authors.". The next change was that two months later someone redirected the "Darknet" page to just be a link to the "Filesharing page", with the comment "Just another word for filesharing". The next change was that on 2005-04-14 someone from IP 81.178.83.245 wrote a definition beginning with this sentence: "A Darknet is a private file sharing network where users only connect to people they trust.". By the way, I should point out that I have a personal interest in this history because between 2001 and 2003 I tried to promulgate concept 2, using Lucas Gonze's coinage: "friendnet" [zooko2001, zooko2002, zooko2003, gonze2002]. I would like to know for my own satisfaction if my ideas were a direct inspiration for some of this modern stuff, such as the Freenet v0.7 design. So much for etymology. Now the problem is that in the current parlance of the media, the word "darknet" is used to mean vaguely 1 or 2 or 1^2. 
The reason that this is a problem isn't that it breaks with some etymological tradition, but that it is ambiguous and that it deprives us of useful words to refer to 1 or 2 specifically. The ambiguity has nasty political consequences -- see for example these f2f network operators struggling to persuade newspaper readers that they are not primarily for illegal purposes: [globe]. My proposal to rectify the lack-of-words problem is to use "blacknet" to refer to 1 specifically and "f2f net" to refer to 2 specifically. I don't know if there is any way to rectify the ambiguity problem. Ian wrote: > > ... > defining the term "darknet" as a f2f network that is designed > to conceal the activities of its participants (this being, so far as I > have seen, one of the main motivations for building an f2f network), So you think of "darknet" as meaning 1^2. That's an interesting remark -- that you regard concealment as one of the main motivations. I personally regard concealment as one of the lesser motivations -- I'm more interested in attack resistance (resisting attacks such as subversion or denial-of-service, rather than attacks such as surveillance), scalability, and other properties. Although I'm interested in the concealment properties as well. Regards, Zooko P.S. Here's some obligatory link juice for Gonze's latest sly neologism: lightnet! [bepw2002] "The darknet and the future of content distribution" Biddle, England, Peinado, Willman (Microsoft Corporation) http://crypto.stanford.edu/DRM2002/darknet5.doc http://www.dklevine.com/archive/darknet.pdf (The .doc version crashes my OpenOffice.org app when I try to read it. Does this mean something? The .pdf version has screwed up images when I view it in evince.) [wikipedia] http://en.wikipedia.org/wiki/Darknet [zooko2001] "Attack Resistant Sharing of Metadata" Zooko and Raph Levien presentation, First O'Reilly Peer-to-Peer conference, 2001 http://conferences.oreillynet.com/cs/p2p2001/view/e_sess/1200 [zooko2002] http://zooko.com/log-2002-12.html#d2002-12-14-the_human_context_and_the_future_of_Mnet [zooko2003] http://www.zooko.com/log-2003-01.html#d2003-01-23-trust_is_just_another_topology [gonze2002] http://www.oreillynet.com/pub/wlg/2428 [globe] "Darknets: The invitation-only Internet" globeandmail.com 2005-11-24 http://www.globetechnology.com/servlet/story/RTGAM.20051007.gtdarknetoct7/BNStory/Technology/ [lightnet] http://gonze.com/weblog/story/lightnet From m.rogers at cs.ucl.ac.uk Fri Dec 2 16:02:07 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <20051202154557.E559F191C@yumyum.zooko.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> Message-ID: <43906FFF.3070302@cs.ucl.ac.uk> zooko@zooko.com wrote: > However nowadays some people in the mass media seem to think that a "darknet" > means primarily a network which is "invitation-only", i.e. a "small-world" or > "f2f" net [globe]. Sorry to split an already frayed hair, but invitation-only isn't the same as f2f. Invitation-only implies that you must know some member of the network, whereas f2f implies that you must know the members you connect to. For example Groove and WASTE are invitation-only but not f2f. 
Cheers, Michael From mccoy at mad-scientist.com Fri Dec 2 17:32:53 2005 From: mccoy at mad-scientist.com (Jim McCoy) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: P2P in SFC In-Reply-To: <20051201.235431.65614928.warner@lothar.com> References: <20051201.235431.65614928.warner@lothar.com> Message-ID: <56D8091C-1D45-48C2-975C-5F6A1D47059B@mad-scientist.com> > Regardless, let's wait for the final guest list before deciding if we > switch locales. I will be there. Jim From zooko at zooko.com Fri Dec 2 17:20:47 2005 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet.pdf Message-ID: <20051202172047.64212339@yumyum.zooko.com> Thanks to anonymous contributor. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pdf Size: 246474 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051202/62b9c4af/attachment.pdf From coderman at gmail.com Fri Dec 2 18:08:32 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] help:is anyone interested in living streaming or vod over p2p? In-Reply-To: References: Message-ID: <4ef5fec60512021008v64987949xb6880691dd2fceec@mail.gmail.com> On 12/1/05, zheng j wrote: > Hi, I am now doing research on living streaming and Vod over p2p... wireless is a natural fit for p2p streaming / broadcast distribution From Serguei.Osokine at efi.com Fri Dec 2 18:23:24 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42751@fcexmb04.efi.internal> On Friday, December 02, 2005 Alexander L?ser wrote: > originally there was a certain type of clustering in the beginnings > of Gnutella (late 90ies) . People communicate its ids mouth to mouth > or via Email or deja news to other people. So in most cases you got > Ids from people which had at least similar interests, or from > people where you expected some interesting files. I'm sorry to contradict you, but I think this is all a myth. First, there was no Gnutella in late 90ies. It was released in March of 2000. Second, I remember looking at the connection stability just a few months later (June/July, maybe?), and the churn was quite high - a client tended to replace all its connections within an hour or so. Now if you remember how the connections were replaced, the client was trying the IPs that it received from PONGs, which were essentially the random network IPs, because the network was just a few thousand nodes and every client could see the pongs from pretty much everyone. So in an hour or so your initial connection point stopped being relevant and you found yourself at a random place in the network. After that, all your subsequent sessions used the IP list stored on disk by a previous session to connect to the network, and the address given to you by your friends was no longer important. To be precise, this latest part (about the IP list) was the behaviour of the Gnutella clients that I worked with (I think these were Gnutella v.056 and GNUT). Maybe there were some clients that required to enter an IP at every session start. I don't know. There was also a notion of locality based on the unusually good and stable connections - as soon as the two machines on my desktop would find each other on the network as a result of this random process, they would stay connected for quite a while (as long as I did not stop the clients). 
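As a rough, minimal sketch of that reconnection process -- not something from the thread itself; the 1,000-node size, four connections per node, and uniformly random replacement are assumptions picked purely for illustration -- one can watch how quickly the original word-of-mouth links get replaced:

import random

def churn_simulation(n=1000, degree=4, steps=5000, seed=1):
    """Start from a 'string' of nodes wired to their nearest neighbours,
    then repeatedly replace one end of a random connection with a
    uniformly random node (roughly what PONG-based reconnection did)."""
    random.seed(seed)
    neigh = {v: set() for v in range(n)}
    for v in range(n):
        for d in range(1, degree // 2 + 1):
            u = (v + d) % n
            neigh[v].add(u)
            neigh[u].add(v)
    original = {v: set(neigh[v]) for v in range(n)}
    for _ in range(steps):
        v = random.randrange(n)
        if not neigh[v]:
            continue
        old = random.choice(sorted(neigh[v]))
        new = random.randrange(n)
        if new == v:
            continue
        neigh[v].discard(old)
        neigh[old].discard(v)
        neigh[v].add(new)
        neigh[new].add(v)
    kept = sum(len(neigh[v] & original[v]) for v in range(n))
    total = sum(len(original[v]) for v in range(n))
    return kept / total

print("fraction of original links remaining:", churn_simulation())

Under these assumptions only a small fraction of the original links survives a few thousand replacements, which is the "found yourself at a random place in the network" effect described above.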
But even these considerations are not important, because the early Gnutella (until the meltdown of July 2000) was fully visible, and every query more or less reached every node (in the absence of the flow control, this is exactly what caused the meltdown - TTL was too high to limit the query propagation). Of course, some queries might have been missing some nodes, but generally there was no chance for any clustering - I simply cannot see how it could possibly exist in such a network. > We (Berlin and Karlsruhe) developed a new protocol (INGA Interest > based Node Grouping Algorithm [1][2]) , that reclusters the network > based on the interests of the peers, without any DHT, only using on > an unstructured network. Which is cool, and maybe it is a great protocol - as long as you won't justify its existence by myths. I'm sure there are plenty of legitimate reasons that make this protocol useful ;-) Best wishes - S.Osokine. 2 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Alexander L?ser Sent: Friday, December 02, 2005 1:29 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] Re: scalability Hi Adam, originally there was a certain type of clustering in the beginnings of Gnutella (late 90ies) . People communicate its ids mouth to mouth or via Email or deja news to other people. So in most cases you got Ids from people which had at least similar interests, or from people where you expected some interesting files. Later, due to the overwhelming attractiveness of the gnutella application they introduced the gtk and other bootstrapping alternatives, given you a number of starting pointers. However, this starting points a chosen 'randomly', so there is no longer any clustering by interests. We (Berlin and Karlsruhe) developed a new protocol (INGA Interest based Node Grouping Algorithm [1][2]) , that reclusters the network based on the interests of the peers, without any DHT, only using on an unstructured network. Similar to freenet, the network topology evolves over a while to a so called small world topology, where people with similar interests are clustered together. In addition, to further speed up the clustering process, peers also keep in a local index structures other peers, that are 'HUBs' in the network, e.g. having a high in and out degree. Our experiments show, that we significantly outperform Gnutella style approaches in messages even in highly volatile networks. Best's Alex [1] Searching Dynamic Communities with Personal Indexes. L?ser, Tempich et.al 3rd. International Semantic Web Conference, Galway. Springer 2005 http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf [2] Remindin': Semantic query routing in peer-to-peer networks based on social metaphors. Tempich et.al. WWW 2004, New York. ACM 2004 http://**www.aifb.uni-karlsruhe.de/ Publikationen/showPublikation?publ_id=447 Ronald Wertlen schrieb: > Hi Adam, > > perhaps you have not understood my message because you have not > noticed the focus on "precision and recall" (i.e. search) not the old > Distributed DB vs. own DB debate. You have also pigeon-holed my email > with the DHT crowd (*grin*), it couldn't be further from it! > > I was arguing in the other direction - which coderman thankfully > picked up. Gnutella doesn't structure enough, that's all. 
Sure > Gnutella beats DHTs on search - I base that observation on a project I > finished last year - a public prototype that used JXTA and was honed > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > (German site) - we've moved on some since them ;) ]. > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > practically anyone to elevate to super-peer, which results in a random > (power-law distribtion) network. Such a network is not going to > perform very well as far as recall and precision are concerned, past a > certain point. I would be interested to calculate that exact point > (but doubting I'll get to it some time soon :-/). > > HTH. > > Best regards, Ron > > PS. seems this thread has driven the original author to reformulate > his statment... :-) > > PPS. > In fact, the network is not going to be completely random - it will > follow the contours of the internet (distribution of servers, > broadband connections, users, etc. is not random). I am not sure if > that destroys or supports my argument. Back to the drawing board! > > We actually need a better internet. [oops there I go getting > unspecific again, sorry!! ;-) ] > > >> Message: 4 >> Date: Wed, 30 Nov 2005 16:42:39 -0500 >> From: Adam Fisk >> Subject: Re: [p2p-hackers] Re: scalability To: "Peer-to-peer >> development." >> Message-ID: <438E1CCF.4010907@speedymail.org> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >> >> I don't understand your post. When you say "critical", I assume >> you're talking about life and death situations? Are you talking >> about anything specifically? DHTs have failure rates. Ad hoc and >> mesh networks can become useful in emergency situations where >> conventional infrastructures break down, but the >> centralized/p2p/structured/unstructured questions here are far from >> obvious. >> >> On the "obsessive science types" issue, this completely misses the >> point. It's a very non "obsessive science type" statement. There >> are strong reasons for using the massive indexing/random walk >> approach above DHTs -- reasons that have nothing to do with >> scalability. In particulary, DHTs are, well, hash tables. Hash >> tables don't work well for metadata queries. They do fine for >> keywords (hotspots are a problem, but they can be solved), but they >> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze >> together, for example. The massive indexing (mutual index caching to >> use Serguei's term)/random walk approach can get around these issues >> more easily. They are also not nearly as brittle as DHTs. Sure, >> DHTs repair themselves after node joins and leaves, but node >> transience generally has a much greater effect on DHTs than it does >> on massive indexing networks. >> >> I also think you're underestimating the efficiency of massive >> indexing and random walks. Sure, these networks don't scale >> logarithmically, but they do pretty darn well. >> I encourage everyone to stay specific with their posts. >> >> All the Best, >> >> Adam > > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > -- ___________________________________________________________ Dr. Alexander L?ser, Technische Universit?t Berlin, CIS, Sekr. 
EN 7, Einsteinufer 17, 10587 Berlin, GERMANY office: +49- 30-314-25556 fax: +49- 30-314-21601 web: http://cis.cs.tu-berlin.de/~aloeser/ ___________________________________________________________ _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From coderman at gmail.com Fri Dec 2 18:30:03 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland (OR) Message-ID: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> ping From don at dhoffman.net Fri Dec 2 18:45:12 2005 From: don at dhoffman.net (Donald Hoffman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland (OR) In-Reply-To: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> Message-ID: <0C7F13F7-D31C-47EB-90D0-17289D97ECAF@dhoffman.net> Pong. Also (live) in Portland. (Actually in Montana right now. Anyone there?) Don On Dec 2, 2005, at 11:30 AM, coderman wrote: > ping > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From agthorr at cs.uoregon.edu Fri Dec 2 18:51:37 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland (OR) In-Reply-To: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> Message-ID: <20051202185136.GB2604@cs.uoregon.edu> On Fri, Dec 02, 2005 at 10:30:03AM -0800, coderman wrote: > ping I'm in Eugene. I'd be willing to drive up for a get-together if we have a big enough group to make it interesting. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From coderman at gmail.com Fri Dec 2 19:13:33 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland (OR) In-Reply-To: <20051202185136.GB2604@cs.uoregon.edu> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> Message-ID: <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> On 12/2/05, Daniel Stutzbach wrote: > I'm in Eugene. I'd be willing to drive up for a get-together if we > have a big enough group to make it interesting. i'd be happy to travel to eugene if more of the group is located there as well. weekends would be best in that case. From gbildson at limepeer.com Fri Dec 2 19:22:32 2005 From: gbildson at limepeer.com (Greg Bildson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42751@fcexmb04.efi.internal> Message-ID: The only locality that I can think of that may have occurred back in that early timeframe would be based on the stringiness of the network. I have a feeling that pre-centralized hostcache, the network was more of a long string with some clumps as it went along. So, its possible that the network diameter at its longest point was much larger than max-TTL. 
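To put rough numbers on the string-versus-TTL point, here is a back-of-the-envelope sketch; the TTL of 7 and the four connections per node are assumptions chosen for illustration, not measurements of the early network:

def reach_on_string(ttl):
    # a flood on a long chain reaches at most ttl hops in each direction
    return 2 * ttl + 1

def reach_on_tree(ttl, connections=4):
    # idealised loop-free network where every node keeps `connections`
    # links: the origin fans out to `connections` neighbours and each of
    # those forwards to (connections - 1) new nodes per further hop
    total, frontier = 1, connections
    for _ in range(ttl):
        total += frontier
        frontier *= connections - 1
    return total

print(reach_on_string(7))   # 15 nodes
print(reach_on_tree(7))     # a few thousand nodes

The same TTL that would comfortably cover a well-mixed network of a few thousand nodes reaches only a couple of dozen nodes on a chain, so if the early network really was string-shaped, most of it would indeed have been beyond query range.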
Then, the introduction of centralized hostcaches helped create a massive cluster and exacerbated the early modem bandwidth barrier. This appeared to be what Gene Kan thought I believe. Its was only months later with the introduction of clients with keepalive pings and flow control that the clogged spots got freed up. If ToadNode was correct in that they had millions of downloads in those early days then thats the only way that I could see the modem bandwidth barrier not getting hit very quickly. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Serguei Osokine > Sent: Friday, December 02, 2005 1:23 PM > To: Peer-to-peer development. > Subject: RE: [p2p-hackers] Re: scalability > > > On Friday, December 02, 2005 Alexander L?ser wrote: > > originally there was a certain type of clustering in the beginnings > > of Gnutella (late 90ies) . People communicate its ids mouth to mouth > > or via Email or deja news to other people. So in most cases you got > > Ids from people which had at least similar interests, or from > > people where you expected some interesting files. > > I'm sorry to contradict you, but I think this is all a myth. > > First, there was no Gnutella in late 90ies. It was released in > March of 2000. Second, I remember looking at the connection stability > just a few months later (June/July, maybe?), and the churn was quite > high - a client tended to replace all its connections within an hour > or so. > > Now if you remember how the connections were replaced, the > client was trying the IPs that it received from PONGs, which were > essentially the random network IPs, because the network was just > a few thousand nodes and every client could see the pongs from > pretty much everyone. So in an hour or so your initial connection > point stopped being relevant and you found yourself at a random > place in the network. After that, all your subsequent sessions used > the IP list stored on disk by a previous session to connect to the > network, and the address given to you by your friends was no longer > important. > > To be precise, this latest part (about the IP list) was the > behaviour of the Gnutella clients that I worked with (I think these > were Gnutella v.056 and GNUT). Maybe there were some clients that > required to enter an IP at every session start. I don't know. There > was also a notion of locality based on the unusually good and stable > connections - as soon as the two machines on my desktop would find > each other on the network as a result of this random process, they > would stay connected for quite a while (as long as I did not stop > the clients). > > But even these considerations are not important, because the > early Gnutella (until the meltdown of July 2000) was fully visible, > and every query more or less reached every node (in the absence of > the flow control, this is exactly what caused the meltdown - TTL was > too high to limit the query propagation). > > Of course, some queries might have been missing some nodes, but > generally there was no chance for any clustering - I simply cannot see > how it could possibly exist in such a network. > > > We (Berlin and Karlsruhe) developed a new protocol (INGA Interest > > based Node Grouping Algorithm [1][2]) , that reclusters the network > > based on the interests of the peers, without any DHT, only using on > > an unstructured network. 
> > Which is cool, and maybe it is a great protocol - as long as > you won't justify its existence by myths. I'm sure there are plenty > of legitimate reasons that make this protocol useful ;-) > > Best wishes - > S.Osokine. > 2 Dec 2005. > > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Alexander L?ser > Sent: Friday, December 02, 2005 1:29 AM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] Re: scalability > > > Hi Adam, > originally there was a certain type of clustering in the beginnings of > Gnutella (late 90ies) . People communicate its ids mouth to mouth or via > Email or deja news to other people. So in most cases you got Ids from > people which had at least similar interests, or from people where you > expected some interesting files. Later, due to the overwhelming > attractiveness of the gnutella application they introduced the gtk and > other bootstrapping alternatives, given you a number of starting > pointers. However, this starting points a chosen 'randomly', so there is > no longer any clustering by interests. > > We (Berlin and Karlsruhe) developed a new protocol (INGA Interest based > Node Grouping Algorithm [1][2]) , that reclusters the network based on > the interests of the peers, without any DHT, only using on an > unstructured network. Similar to freenet, the network topology evolves > over a while to a so called small world topology, where people with > similar interests are clustered together. In addition, to further speed > up the clustering process, peers also keep in a local index structures > other peers, that are 'HUBs' in the network, e.g. having a high in and > out degree. Our experiments show, that we significantly outperform > Gnutella style approaches in messages even in highly volatile networks. > > Best's Alex > > [1] Searching Dynamic Communities with Personal Indexes. L?ser, Tempich > et.al 3rd. International Semantic Web Conference, Galway. Springer 2005 > http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf > [2] Remindin': Semantic query routing in peer-to-peer networks based on > social metaphors. Tempich et.al. WWW 2004, New York. ACM 2004 > http://**www.aifb.uni-karlsruhe.de/ > Publikationen/showPublikation?publ_id=447 > > Ronald Wertlen schrieb: > > > Hi Adam, > > > > perhaps you have not understood my message because you have not > > noticed the focus on "precision and recall" (i.e. search) not the old > > Distributed DB vs. own DB debate. You have also pigeon-holed my email > > with the DHT crowd (*grin*), it couldn't be further from it! > > > > I was arguing in the other direction - which coderman thankfully > > picked up. Gnutella doesn't structure enough, that's all. Sure > > Gnutella beats DHTs on search - I base that observation on a project I > > finished last year - a public prototype that used JXTA and was honed > > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > > (German site) - we've moved on some since them ;) ]. > > > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > > practically anyone to elevate to super-peer, which results in a random > > (power-law distribtion) network. Such a network is not going to > > perform very well as far as recall and precision are concerned, past a > > certain point. I would be interested to calculate that exact point > > (but doubting I'll get to it some time soon :-/). > > > > HTH. > > > > Best regards, Ron > > > > PS. 
seems this thread has driven the original author to reformulate > > his statment... :-) > > > > PPS. > > In fact, the network is not going to be completely random - it will > > follow the contours of the internet (distribution of servers, > > broadband connections, users, etc. is not random). I am not sure if > > that destroys or supports my argument. Back to the drawing board! > > > > We actually need a better internet. [oops there I go getting > > unspecific again, sorry!! ;-) ] > > > > > >> Message: 4 > >> Date: Wed, 30 Nov 2005 16:42:39 -0500 > >> From: Adam Fisk > >> Subject: Re: [p2p-hackers] Re: scalability To: "Peer-to-peer > >> development." > >> Message-ID: <438E1CCF.4010907@speedymail.org> > >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed > >> > >> I don't understand your post. When you say "critical", I assume > >> you're talking about life and death situations? Are you talking > >> about anything specifically? DHTs have failure rates. Ad hoc and > >> mesh networks can become useful in emergency situations where > >> conventional infrastructures break down, but the > >> centralized/p2p/structured/unstructured questions here are far from > >> obvious. > >> > >> On the "obsessive science types" issue, this completely misses the > >> point. It's a very non "obsessive science type" statement. There > >> are strong reasons for using the massive indexing/random walk > >> approach above DHTs -- reasons that have nothing to do with > >> scalability. In particulary, DHTs are, well, hash tables. Hash > >> tables don't work well for metadata queries. They do fine for > >> keywords (hotspots are a problem, but they can be solved), but they > >> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze > >> together, for example. The massive indexing (mutual index caching to > >> use Serguei's term)/random walk approach can get around these issues > >> more easily. They are also not nearly as brittle as DHTs. Sure, > >> DHTs repair themselves after node joins and leaves, but node > >> transience generally has a much greater effect on DHTs than it does > >> on massive indexing networks. > >> > >> I also think you're underestimating the efficiency of massive > >> indexing and random walks. Sure, these networks don't scale > >> logarithmically, but they do pretty darn well. > >> I encourage everyone to stay specific with their posts. > >> > >> All the Best, > >> > >> Adam > > > > > > > > _______________________________________________ > > p2p-hackers mailing list > > p2p-hackers@zgp.org > > http://zgp.org/mailman/listinfo/p2p-hackers > > _______________________________________________ > > Here is a web page listing P2P Conferences: > > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > > > > -- > ___________________________________________________________ > > Dr. Alexander L?ser, > Technische Universit?t Berlin, > CIS, Sekr. 
EN 7, Einsteinufer 17, 10587 Berlin, GERMANY > office: +49- 30-314-25556 fax: +49- 30-314-21601 > web: http://cis.cs.tu-berlin.de/~aloeser/ > ___________________________________________________________ > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From eugen at leitl.org Fri Dec 2 19:38:33 2005 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland (OR) In-Reply-To: <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> Message-ID: <20051202193833.GD2249@leitl.org> On Fri, Dec 02, 2005 at 11:13:33AM -0800, coderman wrote: > On 12/2/05, Daniel Stutzbach wrote: > > I'm in Eugene. I'd be willing to drive up for a get-together if we > > have a big enough group to make it interesting. > > i'd be happy to travel to eugene if more of the group is located there > as well. weekends would be best in that case. Allright! I'm game. ;) -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051202/9879a674/attachment.pgp From Serguei.Osokine at efi.com Fri Dec 2 19:38:41 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42756@fcexmb04.efi.internal> On Friday, December 02, 2005 Greg Bildson wrote: > I have a feeling that pre-centralized hostcache, the network was > more of a long string with some clumps as it went along. So what kept this string from fully clumping as the connections were broken and reestablished? Default was four connections, not two. How is it possible not to fold this string onto itself about one thousand times after the first 1,000 connections will be reestablished - which would take 10-15 minutes in a 1,000-node network, and would happen instantly in a one-million one? > If ToadNode was correct in that they had millions of downloads in > those early days then thats the only way that I could see the modem > bandwidth barrier not getting hit very quickly. Between people not using the downloaded code, an error in ToadNode stats, a miracle, and the network preserving its 'linear' graph topology for any noticeable time, my vote will be for any one of the first three - the last one is too improbable. Best wishes - S.Osokine. 2 Dec 2005. 
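A minimal sketch of the folding argument, with arbitrary assumptions (1,000 nodes, four links each, 1,000 random reconnections), estimating the average hop distance by breadth-first search before and after:

import random
from collections import deque

def avg_path_length(neigh, samples=30, seed=2):
    # mean shortest-path length from a few sampled sources
    random.seed(seed)
    total, count = 0, 0
    for src in random.sample(list(neigh), samples):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for u in neigh[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count

def string_topology(n=1000, degree=4):
    neigh = {v: set() for v in range(n)}
    for v in range(n):
        for d in range(1, degree // 2 + 1):
            neigh[v].add((v + d) % n)
            neigh[(v + d) % n].add(v)
    return neigh

g = string_topology()
print("string-like:", round(avg_path_length(g), 1))    # on the order of a hundred hops
random.seed(3)
for _ in range(1000):                                  # random reconnections
    v = random.randrange(1000)
    if g[v]:
        old = random.choice(sorted(g[v]))
        new = random.randrange(1000)
        if new != v:
            g[v].discard(old); g[old].discard(v)
            g[v].add(new); g[new].add(v)
print("after folding:", round(avg_path_length(g), 1))  # a handful of hops

Even rewiring only part of the links collapses the average distance from something like n/8 hops to a handful, which is the folding effect in question.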
-----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Greg Bildson Sent: Friday, December 02, 2005 11:23 AM To: Peer-to-peer development. Subject: RE: [p2p-hackers] Re: scalability The only locality that I can think of that may have occurred back in that early timeframe would be based on the stringiness of the network. I have a feeling that pre-centralized hostcache, the network was more of a long string with some clumps as it went along. So, its possible that the network diameter at its longest point was much larger than max-TTL. Then, the introduction of centralized hostcaches helped create a massive cluster and exacerbated the early modem bandwidth barrier. This appeared to be what Gene Kan thought I believe. Its was only months later with the introduction of clients with keepalive pings and flow control that the clogged spots got freed up. If ToadNode was correct in that they had millions of downloads in those early days then thats the only way that I could see the modem bandwidth barrier not getting hit very quickly. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Serguei Osokine > Sent: Friday, December 02, 2005 1:23 PM > To: Peer-to-peer development. > Subject: RE: [p2p-hackers] Re: scalability > > > On Friday, December 02, 2005 Alexander L?ser wrote: > > originally there was a certain type of clustering in the beginnings > > of Gnutella (late 90ies) . People communicate its ids mouth to mouth > > or via Email or deja news to other people. So in most cases you got > > Ids from people which had at least similar interests, or from > > people where you expected some interesting files. > > I'm sorry to contradict you, but I think this is all a myth. > > First, there was no Gnutella in late 90ies. It was released in > March of 2000. Second, I remember looking at the connection stability > just a few months later (June/July, maybe?), and the churn was quite > high - a client tended to replace all its connections within an hour > or so. > > Now if you remember how the connections were replaced, the > client was trying the IPs that it received from PONGs, which were > essentially the random network IPs, because the network was just > a few thousand nodes and every client could see the pongs from > pretty much everyone. So in an hour or so your initial connection > point stopped being relevant and you found yourself at a random > place in the network. After that, all your subsequent sessions used > the IP list stored on disk by a previous session to connect to the > network, and the address given to you by your friends was no longer > important. > > To be precise, this latest part (about the IP list) was the > behaviour of the Gnutella clients that I worked with (I think these > were Gnutella v.056 and GNUT). Maybe there were some clients that > required to enter an IP at every session start. I don't know. There > was also a notion of locality based on the unusually good and stable > connections - as soon as the two machines on my desktop would find > each other on the network as a result of this random process, they > would stay connected for quite a while (as long as I did not stop > the clients). 
> > But even these considerations are not important, because the > early Gnutella (until the meltdown of July 2000) was fully visible, > and every query more or less reached every node (in the absence of > the flow control, this is exactly what caused the meltdown - TTL was > too high to limit the query propagation). > > Of course, some queries might have been missing some nodes, but > generally there was no chance for any clustering - I simply cannot see > how it could possibly exist in such a network. > > > We (Berlin and Karlsruhe) developed a new protocol (INGA Interest > > based Node Grouping Algorithm [1][2]) , that reclusters the network > > based on the interests of the peers, without any DHT, only using on > > an unstructured network. > > Which is cool, and maybe it is a great protocol - as long as > you won't justify its existence by myths. I'm sure there are plenty > of legitimate reasons that make this protocol useful ;-) > > Best wishes - > S.Osokine. > 2 Dec 2005. > > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Alexander L?ser > Sent: Friday, December 02, 2005 1:29 AM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] Re: scalability > > > Hi Adam, > originally there was a certain type of clustering in the beginnings of > Gnutella (late 90ies) . People communicate its ids mouth to mouth or via > Email or deja news to other people. So in most cases you got Ids from > people which had at least similar interests, or from people where you > expected some interesting files. Later, due to the overwhelming > attractiveness of the gnutella application they introduced the gtk and > other bootstrapping alternatives, given you a number of starting > pointers. However, this starting points a chosen 'randomly', so there is > no longer any clustering by interests. > > We (Berlin and Karlsruhe) developed a new protocol (INGA Interest based > Node Grouping Algorithm [1][2]) , that reclusters the network based on > the interests of the peers, without any DHT, only using on an > unstructured network. Similar to freenet, the network topology evolves > over a while to a so called small world topology, where people with > similar interests are clustered together. In addition, to further speed > up the clustering process, peers also keep in a local index structures > other peers, that are 'HUBs' in the network, e.g. having a high in and > out degree. Our experiments show, that we significantly outperform > Gnutella style approaches in messages even in highly volatile networks. > > Best's Alex > > [1] Searching Dynamic Communities with Personal Indexes. L?ser, Tempich > et.al 3rd. International Semantic Web Conference, Galway. Springer 2005 > http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf > [2] Remindin': Semantic query routing in peer-to-peer networks based on > social metaphors. Tempich et.al. WWW 2004, New York. ACM 2004 > http://**www.aifb.uni-karlsruhe.de/ > Publikationen/showPublikation?publ_id=447 > > Ronald Wertlen schrieb: > > > Hi Adam, > > > > perhaps you have not understood my message because you have not > > noticed the focus on "precision and recall" (i.e. search) not the old > > Distributed DB vs. own DB debate. You have also pigeon-holed my email > > with the DHT crowd (*grin*), it couldn't be further from it! > > > > I was arguing in the other direction - which coderman thankfully > > picked up. Gnutella doesn't structure enough, that's all. 
Sure > > Gnutella beats DHTs on search - I base that observation on a project I > > finished last year - a public prototype that used JXTA and was honed > > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > > (German site) - we've moved on some since them ;) ]. > > > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > > practically anyone to elevate to super-peer, which results in a random > > (power-law distribtion) network. Such a network is not going to > > perform very well as far as recall and precision are concerned, past a > > certain point. I would be interested to calculate that exact point > > (but doubting I'll get to it some time soon :-/). > > > > HTH. > > > > Best regards, Ron > > > > PS. seems this thread has driven the original author to reformulate > > his statment... :-) > > > > PPS. > > In fact, the network is not going to be completely random - it will > > follow the contours of the internet (distribution of servers, > > broadband connections, users, etc. is not random). I am not sure if > > that destroys or supports my argument. Back to the drawing board! > > > > We actually need a better internet. [oops there I go getting > > unspecific again, sorry!! ;-) ] > > > > > >> Message: 4 > >> Date: Wed, 30 Nov 2005 16:42:39 -0500 > >> From: Adam Fisk > >> Subject: Re: [p2p-hackers] Re: scalability To: "Peer-to-peer > >> development." > >> Message-ID: <438E1CCF.4010907@speedymail.org> > >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed > >> > >> I don't understand your post. When you say "critical", I assume > >> you're talking about life and death situations? Are you talking > >> about anything specifically? DHTs have failure rates. Ad hoc and > >> mesh networks can become useful in emergency situations where > >> conventional infrastructures break down, but the > >> centralized/p2p/structured/unstructured questions here are far from > >> obvious. > >> > >> On the "obsessive science types" issue, this completely misses the > >> point. It's a very non "obsessive science type" statement. There > >> are strong reasons for using the massive indexing/random walk > >> approach above DHTs -- reasons that have nothing to do with > >> scalability. In particulary, DHTs are, well, hash tables. Hash > >> tables don't work well for metadata queries. They do fine for > >> keywords (hotspots are a problem, but they can be solved), but they > >> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze > >> together, for example. The massive indexing (mutual index caching to > >> use Serguei's term)/random walk approach can get around these issues > >> more easily. They are also not nearly as brittle as DHTs. Sure, > >> DHTs repair themselves after node joins and leaves, but node > >> transience generally has a much greater effect on DHTs than it does > >> on massive indexing networks. > >> > >> I also think you're underestimating the efficiency of massive > >> indexing and random walks. Sure, these networks don't scale > >> logarithmically, but they do pretty darn well. > >> I encourage everyone to stay specific with their posts. 
> >> > >> All the Best, > >> > >> Adam > > > > > > > > _______________________________________________ > > p2p-hackers mailing list > > p2p-hackers@zgp.org > > http://zgp.org/mailman/listinfo/p2p-hackers > > _______________________________________________ > > Here is a web page listing P2P Conferences: > > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > > > > -- > ___________________________________________________________ > > Dr. Alexander L?ser, > Technische Universit?t Berlin, > CIS, Sekr. EN 7, Einsteinufer 17, 10587 Berlin, GERMANY > office: +49- 30-314-25556 fax: +49- 30-314-21601 > web: http://cis.cs.tu-berlin.de/~aloeser/ > ___________________________________________________________ > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From bryan.turner at pobox.com Fri Dec 2 20:15:45 2005 From: bryan.turner at pobox.com (Bryan Turner) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <20051201205215.GF5300@cs.uoregon.edu> Message-ID: <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com> My $.02 on Gnutella, The Gnutella network will scale fine to 2B nodes. However, I believe without interest clustering or intelligent peer selection, it will become increasingly difficult to find the data you are interested in. IE: I feel the current architecture misses the 'long tail'. (Note that I am not well versed on Gnutella architecture, this opinion is based on papers modeling the math behind Gnutella) I like to find the orthogonal axis in a design, P2P has lots of interesting scalability axis: 1 Scalability in # of nodes 2 Scalability in # of objects 3 Scalability in size of objects 4 Scalability in interest for an object (hot spots) 5 Scalability in bandwidth (protocol overhead, efficiency) etc. BitTorrent captures all but #2, as multiple torrents may require redundant connections to a peer, and torrents that share files cannot also share swarms (not to mention BitTorrent isn't a content search network). Gnutella (I believe) doesn't meet #2,3 and partially #4,5: #2 because it does not cluster related data it will eventually be overwhelmed with content. #3 because it performs full-file transfers instead of block exchanges or partial file transfers #4/5 because clients don't immediately offer partial downloads, thus hot spots have a congestion delay measured in full-file-transfer increments rather than in block increments (an order of 2 for typical MP3s, easily reaching multiple days of congestion). A vision for a network that scales along all axis would be Gnutella with some structure to improve domain-specific searches, with BitTorrent as the data transfer mechanism. 
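A toy model of the hot-spot point in #4/5, comparing store-and-forward full-file relaying with block-level swarming; the 4 MB file, roughly 30 kbit/s of usable upload, 256 kB blocks and the doubling-per-transfer spread are invented assumptions, and this says nothing about what any particular client actually implements:

import math

def fullfile_time(n_peers, file_mb=4, upload_mbps=0.03):
    # store-and-forward: a peer can only serve the file once it has all
    # of it, so in the best case the number of complete copies doubles
    # once per full-file transfer time
    t_file = file_mb * 8 / upload_mbps             # seconds per full copy
    generations = math.ceil(math.log2(n_peers + 1))
    return generations * t_file

def block_time(n_peers, file_mb=4, upload_mbps=0.03, block_kb=256):
    # swarming: a peer can re-serve a block as soon as it arrives, so the
    # doubling interval shrinks from one file to one block
    t_file = file_mb * 8 / upload_mbps
    t_block = t_file * (block_kb / 1024) / file_mb
    generations = math.ceil(math.log2(n_peers + 1))
    return t_file + generations * t_block

print(round(fullfile_time(1000) / 3600, 1), "hours, full-file relay")   # about 3 hours
print(round(block_time(1000) / 60, 1), "minutes, block exchange")       # about half an hour

The only point of the model is that block-level exchange shrinks the replication interval from one full transfer to one block, so a flash crowd drains in little more than a single transfer time.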
Please educate me if I've missed some facet of Gnutella! --Bryan bryan.turner@pobox.com -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Daniel Stutzbach Sent: Thursday, December 01, 2005 3:52 PM To: p2p-hackers@zgp.org Subject: Re: [p2p-hackers] Re: scalability On Thu, Dec 01, 2005 at 09:48:45PM +0100, Ronald Wertlen wrote: > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > practically anyone to elevate to super-peer, which results in a random > (power-law distribtion) network. Gnutella is not a power-law network. See my paper on the graph properties of Gnutella, presented at the Internet Measurement Conference earlier this year: http://www.usenix.org/events/imc05/tech/stutzbach.html > Such a network is not going to perform very well as far as recall and > precision are concerned, past a certain point. I would be interested > to calculate that exact point (but doubting I'll get to it some time > soon :-/). Could you rigorously define recall and precision for me? I'm not sure what you mean by these terms. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From agthorr at cs.uoregon.edu Fri Dec 2 20:22:23 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com> References: <20051201205215.GF5300@cs.uoregon.edu> <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com> Message-ID: <20051202202223.GC2604@cs.uoregon.edu> On Fri, Dec 02, 2005 at 03:15:45PM -0500, Bryan Turner wrote: > Gnutella (I believe) doesn't meet #2,3 and partially #4,5: > #2 because it does not cluster related data it will eventually > be overwhelmed with content. > #3 because it performs full-file transfers instead of block > exchanges or partial file transfers > #4/5 because clients don't immediately offer partial downloads, > thus hot spots have a congestion delay measured in > full-file-transfer increments rather than in block > increments (an order of 2 for typical MP3s, easily > reaching multiple days of congestion). If I am not mistaken, Gnutella has been doing partial file transfers for two or three years now. The eDonkey/eMule network does this too. BitTorrent does not have a monopoly on this feature. :-) The relevant spec (if it can be called a spec) for Gnutella is here: http://www.the-gdf.org/wiki/index.php?title=Partial_File_Sharing_Protocol -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From bryan.turner at pobox.com Fri Dec 2 20:25:59 2005 From: bryan.turner at pobox.com (Bryan Turner) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] DHTs in highly-transient networks In-Reply-To: Message-ID: <200512022026.jB2KPx4X026153@rtp-core-1.cisco.com> The DHTs that I've studied behave well in high-churn environments. The problem is network migration events; large swings of population in a short time. Chord is the worst for this, as its rigid structure quickly buckles when you lose a large chunk of the network. Kademlia survives pretty well; maintaining connections with long-lived nodes is a definite win, as is maintaining connectivity to hubs/supernodes. All of them get screwed when large populations join. The network turns to chaos for a while until things settle down. Kademlia is better off (lookups continue to work). 
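For reference, the bias toward long-lived nodes mentioned above follows from Kademlia's bucket-maintenance rule; a minimal sketch of that rule is below (the class shape and the ping callback are illustrative placeholders, not any particular implementation's API):

from collections import OrderedDict

class KBucket:
    """Sketch of Kademlia's contact-replacement rule: a responsive old
    contact is never evicted in favour of a newly seen one, so buckets
    end up dominated by nodes that have already proven they stay up."""
    def __init__(self, k=20):
        self.k = k
        self.contacts = OrderedDict()   # node_id -> address, oldest first

    def seen(self, node_id, address, ping):
        if node_id in self.contacts:
            self.contacts.move_to_end(node_id)        # refresh recency
        elif len(self.contacts) < self.k:
            self.contacts[node_id] = address
        else:
            oldest_id, oldest_addr = next(iter(self.contacts.items()))
            if ping(oldest_id, oldest_addr):
                self.contacts.move_to_end(oldest_id)  # keep the old-timer
                # the newly seen node is dropped
            else:
                del self.contacts[oldest_id]
                self.contacts[node_id] = address      # replace the dead one

Because live old contacts always win, the routing table keeps the nodes that have demonstrated long uptime, which is what cushions the lookups against churn and join storms.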
The largest problem is the sudden lack of bandwidth due to all the key-transfers between the nodes. In my implementations I had to add a 'slop' factor that was larger than my largest expected node-join event. During a lookup, if the 'ultimate' node didn't have the data, he passed the request through the oldest couple of nodes in the slop region. This allowed one last chance to find the right owner. It worked well in practice, but I still believe there's a better way. --Bryan bryan.turner@pobox.com -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Sean Rhea Sent: Thursday, December 01, 2005 6:02 PM To: Peer-to-peer development. Subject: Re: [p2p-hackers] DHTs in highly-transient networks On Dec 1, 2005, at 5:53 PM, Michael Rogers wrote: > To what extent does this depend on the distribution of session times > as well as the mean? Kademlia assumes that old nodes will outlive new > nodes, and Daniel's paper shows that Gnutella contains an emergent > core of long-lived nodes - how well do Bamboo and Chord survive under > non-uniform churn? We used exponentially-distributed node lifetimes, so old nodes do not generally outlive new ones. However, I _think_ that choice only makes the problem harder, though. In particular, I would suspect that Bamboo/Chord would do just as well if old nodes lived longer than new ones, and possibly better. They won't take advantage of it like Kademlia does, but it shouldn't hurt them either. (At least that's my guess; I don't have data to prove it.) Sean -- When I see the price that you pay / I don't wanna grow up -- Tom Waits From coderman at gmail.com Fri Dec 2 20:30:11 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com> References: <20051201205215.GF5300@cs.uoregon.edu> <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com> Message-ID: <4ef5fec60512021230t2408884ew2e6002cc61fa6d92@mail.gmail.com> On 12/2/05, Bryan Turner wrote: > ... > 4 Scalability in interest for an object (hot spots) >... > A vision for a network that scales along all axis would be Gnutella > with some structure to improve domain-specific searches, with BitTorrent as > the data transfer mechanism. finding obscure / rare / unpopular resources is the flip side of the interest coin. in alpine all discovery was done using distinct peer groups dedicated to a single domain of resource discovery (specific subjects / applications had distinct groups). peer lists were ordered within each group according to a relative quality attribute associated with that group only. the goal was to make decentralized search efficient for very obscure resources when a centralized (or partially centralized) index search was usually required for completeness to make it effective. the problem with this approach is that it is very hard to model in a meaningful way due to inherent dependence on relative metrics associated with human behavior. 
(or perhaps it will be simple(r) if a large real world network can be observed and studied) alpine also used a pluggable module system (dlopen with c++ derived handlers) to handle arbitrary metadata associated with queries (different groups may require different search criteria and taxonomy) and integrate various transport mechanisms (a simple TCP stream transfer was provided as an example of this ability) being able to offload such transfers to a system optimized for the purpose, like bittorrent, was a design goal and definitely makes sense in any project where cooperative content distribution is useful. From gbildson at limepeer.com Fri Dec 2 20:35:11 2005 From: gbildson at limepeer.com (Greg Bildson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com> Message-ID: Those suppositions are fairly misplaced as is most academic work on Gnutella. I wouldn't believe any (other than Daniel Stutzbach's) academic papers describing Gnutella. Partial file sharing is active by default. Download meshes are in place. Download chunking (pseudo-random) is in place - not rarest first but sufficient in many cases. Many improvements have been made to increase the awareness and allocation of resources but improvements can still be made. You are correct that rare file/topic searches are still not great but are much better than historically and likely better than similar networks. Dynamic querying does a good job of satisfying popular requests at low cost and reserving more horsepower for rarer searches. Efficiency is pretty good. Bittorrent is a tad verbose in some respects. The only important things that are not in place in Gnutella are rarest first and tit for tat. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Bryan Turner > Sent: Friday, December 02, 2005 3:16 PM > To: 'Peer-to-peer development.' > Subject: RE: [p2p-hackers] Re: scalability > > > My $.02 on Gnutella, > > The Gnutella network will scale fine to 2B nodes. However, I > believe without interest clustering or intelligent peer selection, it will > become increasingly difficult to find the data you are interested > in. IE: I > feel the current architecture misses the 'long tail'. (Note that I am not > well versed on Gnutella architecture, this opinion is based on papers > modeling the math behind Gnutella) > > I like to find the orthogonal axis in a design, P2P has lots of > interesting scalability axis: > 1 Scalability in # of nodes > 2 Scalability in # of objects > 3 Scalability in size of objects > 4 Scalability in interest for an object (hot spots) > 5 Scalability in bandwidth (protocol overhead, efficiency) > etc. > > BitTorrent captures all but #2, as multiple torrents may require > redundant connections to a peer, and torrents that share files cannot also > share swarms (not to mention BitTorrent isn't a content search network). > > Gnutella (I believe) doesn't meet #2,3 and partially #4,5: > #2 because it does not cluster related data it will eventually > be overwhelmed with content. > #3 because it performs full-file transfers instead of block > exchanges or partial file transfers > #4/5 because clients don't immediately offer partial downloads, > thus hot spots have a congestion delay measured in > full-file-transfer increments rather than in block > increments (an order of 2 for typical MP3s, easily > reaching multiple days of congestion). 
> > A vision for a network that scales along all axis would be Gnutella > with some structure to improve domain-specific searches, with > BitTorrent as > the data transfer mechanism. > > Please educate me if I've missed some facet of Gnutella! > --Bryan > bryan.turner@pobox.com > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On > Behalf Of Daniel Stutzbach > Sent: Thursday, December 01, 2005 3:52 PM > To: p2p-hackers@zgp.org > Subject: Re: [p2p-hackers] Re: scalability > > On Thu, Dec 01, 2005 at 09:48:45PM +0100, Ronald Wertlen wrote: > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > > practically anyone to elevate to super-peer, which results in a random > > (power-law distribtion) network. > > Gnutella is not a power-law network. See my paper on the graph properties > of Gnutella, presented at the Internet Measurement Conference earlier this > year: > > http://www.usenix.org/events/imc05/tech/stutzbach.html > > > Such a network is not going to perform very well as far as recall and > > precision are concerned, past a certain point. I would be interested > > to calculate that exact point (but doubting I'll get to it some time > > soon :-/). > > Could you rigorously define recall and precision for me? I'm not > sure what > you mean by these terms. > > -- > Daniel Stutzbach Computer Science Ph.D Student > http://www.barsoom.org/~agthorr University of Oregon > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From bryan.turner at pobox.com Fri Dec 2 20:40:59 2005 From: bryan.turner at pobox.com (Bryan Turner) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <20051202202223.GC2604@cs.uoregon.edu> Message-ID: <200512022041.jB2Kex4X028301@rtp-core-1.cisco.com> Ah, this is news to me :) Thanks for the link. I notice that this partial file transfer feature is only a footnote on the main protocol.. How wide spread is the partial file transfer feature among clients? --Bryan bryan.turner@pobox.com -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Daniel Stutzbach Sent: Friday, December 02, 2005 3:22 PM To: 'Peer-to-peer development.' Subject: Re: [p2p-hackers] Re: scalability On Fri, Dec 02, 2005 at 03:15:45PM -0500, Bryan Turner wrote: > Gnutella (I believe) doesn't meet #2,3 and partially #4,5: > #2 because it does not cluster related data it will eventually > be overwhelmed with content. > #3 because it performs full-file transfers instead of block > exchanges or partial file transfers > #4/5 because clients don't immediately offer partial downloads, > thus hot spots have a congestion delay measured in > full-file-transfer increments rather than in block > increments (an order of 2 for typical MP3s, easily > reaching multiple days of congestion). If I am not mistaken, Gnutella has been doing partial file transfers for two or three years now. The eDonkey/eMule network does this too. BitTorrent does not have a monopoly on this feature. 
:-) The relevant spec (if it can be called a spec) for Gnutella is here: http://www.the-gdf.org/wiki/index.php?title=Partial_File_Sharing_Protocol -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From sberlin at gmail.com Fri Dec 2 21:11:53 2005 From: sberlin at gmail.com (Sam Berlin) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <200512022041.jB2Kex4X028301@rtp-core-1.cisco.com> References: <20051202202223.GC2604@cs.uoregon.edu> <200512022041.jB2Kex4X028301@rtp-core-1.cisco.com> Message-ID: <19196d860512021311u2344d593v77d26fd98a642e90@mail.gmail.com> The protocol, as described nearly anywhere, isn't Gnutella. Gnutella, as others have said, really isn't a protocol (0.6 or any number) anymore. It's a hodgepodge of a lot of features, all implemented by various Gnutella clients. Partial file sharing has been in use by mainstream clients for around 1-2 years. As Greg mentioned, academic papers tend to describe Gnutella as it was designed by Justin Frankel, and a few will include the addition of ultrapeers. It's nearly impossible to find a paper that accurately describes the current state of the network (as it exists through mainstream clients) though. It'd likely be a fascinating subject for researchers to study & write papers on. I know I'd be interested. Sam On 12/2/05, Bryan Turner wrote: > Ah, this is news to me :) Thanks for the link. I notice that this > partial file transfer feature is only a footnote on the main protocol.. How > wide spread is the partial file transfer feature among clients? > > --Bryan > bryan.turner@pobox.com > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On > Behalf Of Daniel Stutzbach > Sent: Friday, December 02, 2005 3:22 PM > To: 'Peer-to-peer development.' > Subject: Re: [p2p-hackers] Re: scalability > > On Fri, Dec 02, 2005 at 03:15:45PM -0500, Bryan Turner wrote: > > Gnutella (I believe) doesn't meet #2,3 and partially #4,5: > > #2 because it does not cluster related data it will eventually > > be overwhelmed with content. > > #3 because it performs full-file transfers instead of block > > exchanges or partial file transfers > > #4/5 because clients don't immediately offer partial downloads, > > thus hot spots have a congestion delay measured in > > full-file-transfer increments rather than in block > > increments (an order of 2 for typical MP3s, easily > > reaching multiple days of congestion). > > If I am not mistaken, Gnutella has been doing partial file transfers for two > or three years now. The eDonkey/eMule network does this too. > > BitTorrent does not have a monopoly on this feature. 
:-) > > The relevant spec (if it can be called a spec) for Gnutella is here: > > http://www.the-gdf.org/wiki/index.php?title=Partial_File_Sharing_Protocol > > -- > Daniel Stutzbach Computer Science Ph.D Student > http://www.barsoom.org/~agthorr University of Oregon > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From Serguei.Osokine at efi.com Fri Dec 2 21:26:00 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42759@fcexmb04.efi.internal> On Friday, December 02, 2005 Sam Berlin wrote: > It'd likely be a fascinating subject for researchers to study & write > papers on. I know I'd be interested. Yeah, well, O'Reilly wasn't :-) I submitted a proposal to the ETC two or three years ago, where I was going to talk about Gnutella being the first P2P network that is not only deployed and developed, but is also *designed* in a fully decentralized fashion. Like you say, basically - there is some common protocol framework, but within this framework vendors are free to develop, publish, and deploy their own protocol extensions, and to implement only those extensions of the others that they like. Survival of the fittest proposals in the field, so to speak. Design without an architectural committee, voting, or any kind of central authority or even consensus on half of the issues. This is a first and only example of such development, as far as I know. But for some reason O'Reilly was not impressed. Though I'm not much of a speaker in any case :-) Best wishes - S.Osokine. 2 Nov 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Sam Berlin Sent: Friday, December 02, 2005 1:12 PM To: Bryan Turner; Peer-to-peer development. Subject: Re: [p2p-hackers] Re: scalability The protocol, as described nearly anywhere, isn't Gnutella. Gnutella, as others have said, really isn't a protocol (0.6 or any number) anymore. It's a hodgepodge of a lot of features, all implemented by various Gnutella clients. Partial file sharing has been in use by mainstream clients for around 1-2 years. As Greg mentioned, academic papers tend to describe Gnutella as it was designed by Justin Frankel, and a few will include the addition of ultrapeers. It's nearly impossible to find a paper that accurately describes the current state of the network (as it exists through mainstream clients) though. It'd likely be a fascinating subject for researchers to study & write papers on. I know I'd be interested. Sam On 12/2/05, Bryan Turner wrote: > Ah, this is news to me :) Thanks for the link. I notice that this > partial file transfer feature is only a footnote on the main protocol.. How > wide spread is the partial file transfer feature among clients? 
> > --Bryan > bryan.turner@pobox.com > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On > Behalf Of Daniel Stutzbach > Sent: Friday, December 02, 2005 3:22 PM > To: 'Peer-to-peer development.' > Subject: Re: [p2p-hackers] Re: scalability > > On Fri, Dec 02, 2005 at 03:15:45PM -0500, Bryan Turner wrote: > > Gnutella (I believe) doesn't meet #2,3 and partially #4,5: > > #2 because it does not cluster related data it will eventually > > be overwhelmed with content. > > #3 because it performs full-file transfers instead of block > > exchanges or partial file transfers > > #4/5 because clients don't immediately offer partial downloads, > > thus hot spots have a congestion delay measured in > > full-file-transfer increments rather than in block > > increments (an order of 2 for typical MP3s, easily > > reaching multiple days of congestion). > > If I am not mistaken, Gnutella has been doing partial file transfers for two > or three years now. The eDonkey/eMule network does this too. > > BitTorrent does not have a monopoly on this feature. :-) > > The relevant spec (if it can be called a spec) for Gnutella is here: > > http://www.the-gdf.org/wiki/index.php?title=Partial_File_Sharing_Protocol > > -- > Daniel Stutzbach Computer Science Ph.D Student > http://www.barsoom.org/~agthorr University of Oregon > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From agthorr at cs.uoregon.edu Fri Dec 2 21:30:52 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <19196d860512021311u2344d593v77d26fd98a642e90@mail.gmail.com> References: <20051202202223.GC2604@cs.uoregon.edu> <200512022041.jB2Kex4X028301@rtp-core-1.cisco.com> <19196d860512021311u2344d593v77d26fd98a642e90@mail.gmail.com> Message-ID: <20051202213051.GF2604@cs.uoregon.edu> Perhaps we should take a cue from TCP/IP and start referring to the "Gnutella protocol suite". On Fri, Dec 02, 2005 at 04:11:53PM -0500, Sam Berlin wrote: > The protocol, as described nearly anywhere, isn't Gnutella. Gnutella, > as others have said, really isn't a protocol (0.6 or any number) > anymore. It's a hodgepodge of a lot of features, all implemented by > various Gnutella clients. Partial file sharing has been in use by > mainstream clients for around 1-2 years. > > As Greg mentioned, academic papers tend to describe Gnutella as it was > designed by Justin Frankel, and a few will include the addition of > ultrapeers. It's nearly impossible to find a paper that accurately > describes the current state of the network (as it exists through > mainstream clients) though. 
> > It'd likely be a fascinating subject for researchers to study & write > papers on. I know I'd be interested. > > Sam > > On 12/2/05, Bryan Turner wrote: > > Ah, this is news to me :) Thanks for the link. I notice that this > > partial file transfer feature is only a footnote on the main protocol.. How > > wide spread is the partial file transfer feature among clients? > > > > bryan.turner@pobox.com > > > > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On > > Behalf Of Daniel Stutzbach > > Sent: Friday, December 02, 2005 3:22 PM > > To: 'Peer-to-peer development.' > > Subject: Re: [p2p-hackers] Re: scalability > > > > On Fri, Dec 02, 2005 at 03:15:45PM -0500, Bryan Turner wrote: > > > Gnutella (I believe) doesn't meet #2,3 and partially #4,5: > > > #2 because it does not cluster related data it will eventually > > > be overwhelmed with content. > > > #3 because it performs full-file transfers instead of block > > > exchanges or partial file transfers > > > #4/5 because clients don't immediately offer partial downloads, > > > thus hot spots have a congestion delay measured in > > > full-file-transfer increments rather than in block > > > increments (an order of 2 for typical MP3s, easily > > > reaching multiple days of congestion). > > > > If I am not mistaken, Gnutella has been doing partial file transfers for two > > or three years now. The eDonkey/eMule network does this too. > > > > BitTorrent does not have a monopoly on this feature. :-) > > > > The relevant spec (if it can be called a spec) for Gnutella is here: > > > > http://www.the-gdf.org/wiki/index.php?title=Partial_File_Sharing_Protocol > > > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From srhea at cs.berkeley.edu Fri Dec 2 21:33:23 2005 From: srhea at cs.berkeley.edu (Sean Rhea) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] DHTs in highly-transient networks In-Reply-To: <200512022026.jB2KPx4X026153@rtp-core-1.cisco.com> References: <200512022026.jB2KPx4X026153@rtp-core-1.cisco.com> Message-ID: <1CBA368C-F6E1-49D9-B7C8-E024A7F556A7@cs.berkeley.edu> On Dec 2, 2005, at 3:25 PM, Bryan Turner wrote: > The DHTs that I've studied behave well in high-churn environments. > The problem is network migration events; large swings of population > in a > short time. Chord is the worst for this, as its rigid structure > quickly > buckles when you lose a large chunk of the network. Kademlia survives > pretty well; maintaining connections with long-lived nodes is a > definite > win, as is maintaining connectivity to hubs/supernodes. How massive is massive? In some earlier experiments we ran, we tested Bamboo with massive joins and failures of groups composing around 20% of the total network size. It works fine. You get a little blip where the average lookup time goes up by a factor of two or so, but that's all. If I recall correctly, the MIT Chord implementation, at least, did pretty well in such scenarios as well. You just have to recover periodically, rather than reactively, to join and failure events, as described in the Bamboo USENIX paper I referenced earlier. Sean -- Everyone chooses his or her own instrument for rebellion. 
I don't know what my son's will be, but my only hope for him is this: That by sharing my passions with him, I have planted the seeds of defiance that will someday be turned against me. -- Soo Lee Young -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051202/0a68b5e9/PGP.pgp From coderman at gmail.com Fri Dec 2 21:46:10 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42759@fcexmb04.efi.internal> References: <4A60C83D027E224BAA4550FB1A2B120EC42759@fcexmb04.efi.internal> Message-ID: <4ef5fec60512021346x537d09b9o6260407ef16d3cf8@mail.gmail.com> On 12/2/05, Serguei Osokine wrote: > ... Gnutella being the first P2P network that > is not only deployed and developed, but is also *designed* in a fully > decentralized fashion. Like you say, basically - there is some common > protocol framework, but within this framework vendors are free to > develop, publish, and deploy their own protocol extensions, and to > implement only those extensions of the others that they like. i'd say IRC falls into this category and definitely predates the current gnutella cabal. (i may be a bit biased as i met my wife on irc-2.mit.edu way back when... :) > Survival of the fittest proposals in the field, so to speak. > Design without an architectural committee, voting, or any kind of > central authority or even consensus on half of the issues. my favorite kind of design. groupthink is braindead! UML sucks! vi forevar! etc, etc. *grin* From ian.clarke at gmail.com Sat Dec 3 09:49:51 2005 From: ian.clarke at gmail.com (Ian Clarke) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <20051202154557.E559F191C@yumyum.zooko.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> Message-ID: <823242bd0512030149t3e6a18d2x@mail.gmail.com> On 02/12/05, zooko@zooko.com wrote: > Let 1 be the set of networks which are used for illegal transmission of > information, I do wish you would refer to these networks as those which allow the covert transmission of information, rather than those which are used for the illegal transmission of information - since I am not aware of any networks that are specifically designed for the illegal transmission of information. I think this would help alleviate the political problem you raise later in your email. > and 2 be the set of networks which are built on f2f connections, > and 1^2 be the intersection -- the set of networks which are used for illegal > transmission of information and which are built on f2f connections. If you broaden your definition of set 1 to be networks which are used for the covert transmission of information (I think this is a more useful definition for the set as not all covert activity is illegal), then I am not sure, in practice, how many networks will fall into set 2 that aren't also members of set 1, in fact, I can't think of any non-contrived situations where one would create a f2f network motivated by something other than a desire to be covert in some way. > [bepw2002] introduces "darknet" to mean concept 1. 
I'm not going to spend time dissecting their paper to determine exactly what BEPW's intention was for the term "darknet", certainly they could have been much more explicit about this if they wanted to, and they use the term in contradictory ways throughout their paper. For example, they refer to "the darknet" as if there is only one, but subsequenly refer to "darknets". Given this vagueness, I can't imagine that is was their goal to provide an authorative definition for the term. While we can debate what BEPW intended the term to mean when they used it in their paper, this is ultimately irrelevant. Software engineers often seem to forget that English isn't like a programming language where a designer specifies an unambigous definition at the outset (Richard Stallman is particularly guilty of this). The meaning of words in English is a consensus that is arrived at over time, and eventually finds its way into a dictionary (long) after that consensus is stable. The BEPW paper is one early voice in that consensus-forming process. Mine is another, yours is another still. > By the way, I should point out that I have a personal interest in this history > because between 2001 and 2003 I tried to promulgate concept 2, using Lucas > Gonze's coinage: "friendnet" [zooko2001, zooko2002, zooko2003, gonze2002]. > I would like to know for my own satisfaction if my ideas were a direct > inspiration for some of this modern stuff, such as the Freenet v0.7 design. I am not sure that they were a direct inspiration. We (Freenet) have been concerned about the fact that Freenet was harvestable for several years now. Around spring this year I made the observation that if human relationships form a small world network, it should be possible to assign locations to people such that we form a Kleinberg-style small world network, and thus we could make the network routable. Oskar Sandberg then suggested a way to do this, and we set about validating the concept using simulations. > Now the problem is that in the current parlance of the media, the word > "darknet" is used to mean vaguely 1 or 2 or 1^2. The reason that this is a > problem isn't that it breaks with some etymological tradition, but that it is > ambiguous and that it deprives us of useful words to refer to 1 or 2 > specifically. The ambiguity has nasty political consequences -- see for > example these f2f network operators struggling to persuade newspaper readers > that they are not primarily for illegal purposes: [globe]. I think a much better way to avoid this nasty political consequence is to stop describing set 1 in terms of illegal activity, but rather describe such networks as being "covert", or "anonymity preserving" - neither of which implies illegal activity (it is perfectly legal to be anonymous in most countries whose legal systems I am familiar with). > > defining the term "darknet" as a f2f network that is designed > > to conceal the activities of its participants (this being, so far as I > > have seen, one of the main motivations for building an f2f network), > > So you think of "darknet" as meaning 1^2. Or just 2, since I think the sets 1^2 and 2 are, in practical terms, virtually identical. > That's an interesting remark -- that you regard concealment as one of the main > motivations. I personally regard concealment as one of the lesser motivations > -- I'm more interested in attack resistance (resisting attacks such as > subversion or denial-of-service, rather than attacks such as surveillance), > scalability, and other properties. 
Although I'm interested in the concealment > properties as well. That is surprising. Are you aware of any current or proposed f2f networks for which concealment of user activity is not a goal? Ian. From adam at cypherspace.org Sat Dec 3 12:17:37 2005 From: adam at cypherspace.org (Adam Back) Date: Sat Dec 9 22:13:05 2006 Subject: idealized content network properties (Re: [p2p-hackers] darknet) In-Reply-To: <439057F1.2060208@cs.ucl.ac.uk> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202133516.GA15480@bitchcake.off.net> <439057F1.2060208@cs.ucl.ac.uk> Message-ID: <20051203121737.GA3572@bitchcake.off.net> Sure I just mean if you make it invitation only, thats the same network but with something preventing other people subscribing. That could be encryption keys (all encrypted), authentication keys/passwords (required to join network), obscurity (don't advertise IP/port), or network control (current entity requires to connect to you to join you). It seems that it would not be hard to add this restriction to a network without this restriction (but with the other features I mentioned). I'd say that darknet term specifically implies some opaqueness to outside observers -- likely encryption no? (but f2f would not necessarily, its just a invite only collaboration group network). Adam On Fri, Dec 02, 2005 at 02:19:29PM +0000, Michael Rogers wrote: > Adam Back wrote: > >removing 1 makes a eg "friend-to-friend" network -- that just means > >you encrypt the searchable tags and content with a shared key. > > Not sure about this one - I think the use of group keys is orthogonal to > the use of a friend-to-friend topology. For example Groove uses group > keys without f2f, Freenet 0.7 will use f2f without group keys, and WASTE > uses neither (but still fits under the "darknet" umbrella because it's > invitation-only). From rrrw at neofonie.de Sat Dec 3 23:04:16 2005 From: rrrw at neofonie.de (Ronald Wertlen) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability Message-ID: <43922470.6030802@neofonie.de> Hi Daniel, these are basically benchmark domains (variables), that tell you how good your search is from, as I mentioned in my mail, the information retrieval field. http://en.wikipedia.org/wiki/Information_retrieval For instance Bloom Filters increase your scalability but reduce the precision of the search - so you get a lot of stuff you didn't want. A few years ago, a lot of papers in the p2p field that were working on stuff like topology, organisational methods, scalability, etc. concentrated on finding better ways of getting from object_id to the node (number of hops, number of lookups, etc.). The problem from an IR perspective is that not all objects are as "simple" as a mp3 file and not all searches are as simple as "coldplay", how do you get the onject_id in the first place. This becomes a severe problem the more complex the objects, their metadata and the queries (for instance Boolean, range, content proximity, queries). I've downloaded your paper, thanks for the refutation. I love results that seem counter-intuitive to me because they mean I have some learning to do. 
:-) Best regards, Ron > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On > Behalf Of Daniel Stutzbach > Sent: Thursday, December 01, 2005 3:52 PM > To: p2p-hackers@zgp.org > Subject: Re: [p2p-hackers] Re: scalability > > On Thu, Dec 01, 2005 at 09:48:45PM +0100, Ronald Wertlen wrote: > >>> Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows >>> practically anyone to elevate to super-peer, which results in a random >>> (power-law distribtion) network. > > > Gnutella is not a power-law network. See my paper on the graph properties > of Gnutella, presented at the Internet Measurement Conference earlier this > year: > > http://www.usenix.org/events/imc05/tech/stutzbach.html > >>> Such a network is not going to perform very well as far as recall and >>> precision are concerned, past a certain point. I would be interested >>> to calculate that exact point (but doubting I'll get to it some time >>> soon :-/). > > > Could you rigorously define recall and precision for me? I'm not sure what > you mean by these terms. > > -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From sberlin at gmail.com Sun Dec 4 00:03:08 2005 From: sberlin at gmail.com (Sam Berlin) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: scalability In-Reply-To: <43922470.6030802@neofonie.de> References: <43922470.6030802@neofonie.de> Message-ID: <19196d860512031603q2b1e3700jc72dada77e890e10@mail.gmail.com> > For instance Bloom Filters increase your scalability but reduce the > precision of the search - so you get a lot of stuff you didn't want. Bloom Filters can be used to reduce the amount of incoming queries (in Gnutella filters are passed from a "leaf" to its "ultrapeer", and composite filters are passed between neighboring ultrapeers to reduce last hop & second-to-last-hop traffic). Once the query passes the filter test, it can still be forwarded on to the ultimate host, and that host can make the decision on whether or not to send a reply. This eliminates "the stuff you didn't want" from replies while still keeping traffic low. Tthe filters in Gnutella reduce ~70% of query traffic on the second-to-last hop, and ~90% on the last hop (at least, it did when I last checked a year or so ago). > A few years ago, a lot of papers in the p2p field that were working on > stuff like topology, organisational methods, scalability, etc. > concentrated on finding better ways of getting from object_id to the > node (number of hops, number of lookups, etc.). The problem from an IR > perspective is that not all objects are as "simple" as a mp3 file and > not all searches are as simple as "coldplay", how do you get the > onject_id in the first place. This becomes a severe problem the more > complex the objects, their metadata and the queries (for instance > Boolean, range, content proximity, queries). Metadata is certainly difficult to search for, but it isn't impossible. It's vastly easier to search using metadata in a network such as Gnutella than in a DHT-based network, as you don't have to prepopulate the tables with all kinds of data. There's lots of active metadata searches going on (again, in Gnutella), including searches for file names, most-recently-downloaded, specific data in id3 tags, file's licenses, etc... IMHO, the less a network is structured (ie, doesn't have an organized topology), the easier it is to add arbitrary searches. 
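A small sketch may help pin down both points. In the usual information-retrieval sense, precision is |relevant AND retrieved| / |retrieved| and recall is |relevant AND retrieved| / |relevant|; a Bloom filter trades a little precision (false positives) for a lot of saved traffic, while never hurting recall (no false negatives). The code below is illustrative only: the filter size, hashing, and keyword handling are arbitrary, not Gnutella's actual query-routing tables.

    # Illustrative only: a tiny Bloom filter standing in for a leaf's keyword
    # table, plus the standard IR definitions of precision and recall.
    # Filter size and hashing are arbitrary; real Gnutella QRP tables differ.
    import hashlib

    class BloomFilter:
        def __init__(self, n_bits=1024, n_hashes=3):
            self.n_bits, self.n_hashes, self.bits = n_bits, n_hashes, 0

        def _positions(self, word):
            for i in range(self.n_hashes):
                digest = hashlib.sha1(f"{i}:{word}".encode()).hexdigest()
                yield int(digest, 16) % self.n_bits

        def add(self, word):
            for p in self._positions(word):
                self.bits |= 1 << p

        def might_contain(self, word):
            return all(self.bits >> p & 1 for p in self._positions(word))

    def precision_recall(retrieved, relevant):
        hit = len(retrieved & relevant)
        precision = hit / len(retrieved) if retrieved else 1.0
        recall = hit / len(relevant) if relevant else 1.0
        return precision, recall

    # The "leaf" shares some keywords; the ultrapeer forwards only queries
    # that pass the filter, so most non-matching queries never cross the
    # last hop.
    leaf = BloomFilter()
    for kw in ("coldplay", "clocks", "mp3"):
        leaf.add(kw)
    queries = {"coldplay", "madonna", "clocks", "zeppelin"}
    forwarded = {q for q in queries if leaf.might_contain(q)}
    print(precision_recall(retrieved=forwarded, relevant={"coldplay", "clocks"}))
    # False positives can let a non-matching query through (hurting precision
    # at the filter), but a matching query is never dropped, so recall stays 1.

None of this changes the broader point above: an unstructured topology makes it easy to bolt on new kinds of searches.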
This is because there's no need to add another overlay for a new kind of search -- the network can function as-is. Of course, certain topologies can help when some kinds of searches are predominant. Sam From lgonze at panix.com Sun Dec 4 01:02:39 2005 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <823242bd0512030149t3e6a18d2x@mail.gmail.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> <823242bd0512030149t3e6a18d2x@mail.gmail.com> Message-ID: <4392402F.303@panix.com> Ian Clarke wrote: >If you broaden your definition of set 1 to be networks which are used >for the covert transmission of information (I think this is a more >useful definition for the set as not all covert activity is illegal), >then I am not sure, in practice, how many networks will fall into set >2 that aren't also members of set 1, in fact, I can't think of any >non-contrived situations where one would create a f2f network >motivated by something other than a desire to be covert in some way. > > A private network allows participants to talk freely without every comment ending up in Google, and that allows you to have the kind of conversation which shouldn't be public. The application is to enable speech which isn't intended for global scale, usually about personal issues like sex, money, family, friendships, and gossip. I wouldn't call that covert, illegal, or contrived, just private. From kerry at vscape.com Sun Dec 4 05:36:21 2005 From: kerry at vscape.com (Kerry Bonin) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland (OR) In-Reply-To: <20051202193833.GD2249@leitl.org> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> <20051202193833.GD2249@leitl.org> Message-ID: <43928055.4080306@vscape.com> I'm more of a lurker on this list, but might be able to make meeting - I'm in Corvallis, so Portland or Eugene is possible some weekend evenings... Eugen Leitl wrote: >On Fri, Dec 02, 2005 at 11:13:33AM -0800, coderman wrote: > > >>On 12/2/05, Daniel Stutzbach wrote: >> >> >>>I'm in Eugene. I'd be willing to drive up for a get-together if we >>>have a big enough group to make it interesting. >>> >>> >>i'd be happy to travel to eugene if more of the group is located there >>as well. weekends would be best in that case. >> >> > >Allright! I'm game. > >;) > > > >------------------------------------------------------------------------ > >_______________________________________________ >p2p-hackers mailing list >p2p-hackers@zgp.org >http://zgp.org/mailman/listinfo/p2p-hackers >_______________________________________________ >Here is a web page listing P2P Conferences: >http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051203/51d9aaf9/attachment.html From lemonobrien at yahoo.com Sun Dec 4 06:10:58 2005 From: lemonobrien at yahoo.com (Lemon Obrien) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <4392402F.303@panix.com> Message-ID: <20051204061058.64313.qmail@web53602.mail.yahoo.com> here here....government just needs to let us be. 
Lucas Gonze wrote: Ian Clarke wrote: >If you broaden your definition of set 1 to be networks which are used >for the covert transmission of information (I think this is a more >useful definition for the set as not all covert activity is illegal), >then I am not sure, in practice, how many networks will fall into set >2 that aren't also members of set 1, in fact, I can't think of any >non-contrived situations where one would create a f2f network >motivated by something other than a desire to be covert in some way. > > A private network allows participants to talk freely without every comment ending up in Google, and that allows you to have the kind of conversation which shouldn't be public. The application is to enable speech which isn't intended for global scale, usually about personal issues like sex, money, family, friendships, and gossip. I wouldn't call that covert, illegal, or contrived, just private. _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences You don't get no juice unless you squeeze Lemon Obrien, the Third. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051203/6ca537fe/attachment.htm From m.rogers at cs.ucl.ac.uk Sun Dec 4 15:32:25 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:05 2006 Subject: idealized content network properties (Re: [p2p-hackers] darknet) In-Reply-To: <20051203121737.GA3572@bitchcake.off.net> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202133516.GA15480@bitchcake.off.net> <439057F1.2060208@cs.ucl.ac.uk> <20051203121737.GA3572@bitchcake.off.net> Message-ID: <43930C09.5000006@cs.ucl.ac.uk> Adam Back wrote: > Sure I just mean if you make it invitation only, thats the same > network but with something preventing other people subscribing. Agreed - you could argue that WASTE is just Gnutella without host caches. ;-) > I'd say that darknet term specifically implies some opaqueness to > outside observers -- likely encryption no? (but f2f would not > necessarily, its just a invite only collaboration group network). F2F means more than invitation-only. Invitation-only means you need to know some member of the network in order to join, but it doesn't say anything about who you can see once you've joined. F2F means you can only see the people you know. A house party is invitation-only but not F2F; a drug distribution network is F2F. The difference is important because an invitation-only non-F2F network loses privacy as it grows, whereas an F2F network doesn't. Cheers, Michael From p2phackerslist at rhesusb.dk Sun Dec 4 18:56:07 2005 From: p2phackerslist at rhesusb.dk (DanielEKFA) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Looking for litterature on file sharing networks... Message-ID: <200512041956.09896.p2phackerslist@rhesusb.dk> Hi there :) I'm study computer science and I'm writing a synopsis on peer-to-peer, more specifically file sharing networks. I want to focus on the protocols/network structures used in different file sharing programs. I know the internet is my best friend, but for something more concrete (schools like books), do you guys have any books to recommend? 
Good URLs are very welcome, too :) Thanks in advance, Daniel From trep at cs.ucr.edu Sun Dec 4 19:27:11 2005 From: trep at cs.ucr.edu (Thomas Repantis) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Looking for litterature on file sharing networks... In-Reply-To: <200512041956.09896.p2phackerslist@rhesusb.dk> References: <200512041956.09896.p2phackerslist@rhesusb.dk> Message-ID: <20051204192711.GA85169@angeldust.chaos> Hi, You may want to take a look at: E.K. Lua, J. Crowcroft, M. Pias, R. Sharma, and S. Lim. A survey and comparison of peer-to-peer overlay network schemes. IEEE Communications Surveys & Tutorials, 7(2):72­93, Second Quarter 2005. J. Risson and T. Moors. Survey of research towards robust peer-to-peer networks: Search methods. Technical Report UNSW-EE-P2P-1-1, University of New South Wales, Sydney, Australia, September 2004. D. Milojicic, V. Kalogeraki, R. Lukose, K. Nagaraja, J. Pruyne, B. Richard, S. Rollins, and Z. Xu. Peer-to-Peer Computing. Technical Report HPL-2002-57, HP Labs, 2003. Cheers, Thomas On Sun, Dec 04, 2005 at 07:56:07PM +0100, DanielEKFA wrote: > Hi there :) > > I'm study computer science and I'm writing a synopsis on peer-to-peer, more > specifically file sharing networks. I want to focus on the protocols/network > structures used in different file sharing programs. I know the internet is my > best friend, but for something more concrete (schools like books), do you > guys have any books to recommend? Good URLs are very welcome, too :) > > Thanks in advance, > Daniel -- http://www.cs.ucr.edu/~trep -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051204/4043dc84/attachment.pgp From coderman at gmail.com Mon Dec 5 02:37:12 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] p2p portland^H^H^Heugene (OR) ; scheduling Message-ID: <4ef5fec60512041837i7c631756rff3a9c01761c6dce@mail.gmail.com> to attempt a meeting this year (however futile) we have the following options: sat/sun dates: 10/11 17/18 2--hehe who am i kidding... 31/1st? 10th or 17th or jan. would be my preference. happy holidays On 12/3/05, Kerry Bonin wrote: > I'm more of a lurker on this list, but might be able to make meeting - I'm > in Corvallis, so Portland or Eugene is possible some weekend evenings... From arachnid at notdot.net Mon Dec 5 03:28:03 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? Message-ID: <4393B3C3.5040905@notdot.net> Systems like BitTorrent have a rather annoying failure mode - the last 'seed' goes offline while there are still several 'peers' (without the complete file) online. Attempts by the peers to reconstruct the original file are rarely successful, as the chances of every single block being present on one of the seeds are generally very low - it's likely that at least one block is missing. However, what if one were to precode files to be distributed using a standard error correcting code such as a reed-solomon code? By generating 10% check blocks, and treating the composite file the same as you would the original (with the exception that you can stop downloading when you reach 90%, and reconstruct using check blocks from there), you can reduce the chance of the last departing seed ensuring nobody can complete the file. 
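Whether 10% of check blocks is actually enough is easy to estimate with a quick simulation. The sketch below is illustrative only: it assumes each remaining peer holds a uniformly random subset of the coded blocks (which a real rarest-first swarm will not match exactly) and an ideal code in which any 1000 of the 1100 coded blocks rebuild the file, as a Reed-Solomon style code would provide.

    # Monte Carlo sketch of the "last seed leaves" scenario, under the
    # assumptions stated above (random holdings, ideal erasure code).
    import random

    def swarm_can_finish(n_data, n_check, n_peers, blocks_per_peer):
        total = n_data + n_check
        held = set()
        for _ in range(n_peers):
            held.update(random.sample(range(total), blocks_per_peer))
        # With n_check = 0 this reduces to "every original block survives".
        return len(held) >= n_data

    def success_rate(n_check, trials=2000):
        wins = sum(swarm_can_finish(1000, n_check, 4, 500) for _ in range(trials))
        return wins / trials

    print("no check blocks :", success_rate(0))    # essentially 0.0
    print("10% check blocks:", success_rate(100))  # succeeds more often than not

With these particular numbers the margin is thin (the expected shortfall is close to the 100-block cushion), so the check blocks help a great deal but are not a guarantee.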
If we assume there are 4 peers left on the network, each with 50% of the file remaining, on average they will be able to reconstruct 50% + 25% + 12.5% + 6.25% = 93.75% of the file, which exceeds the threshold required to reconstruct with check blocks. So, a couple of questions: 1) How common is this failure mode? Does it occur often enough to justify the extra complexity? 2) Do peers generally have enough pieces between them to reach or exceed the 90% threshold? -Nick Johnson From coderman at gmail.com Mon Dec 5 03:39:48 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4393B3C3.5040905@notdot.net> References: <4393B3C3.5040905@notdot.net> Message-ID: <4ef5fec60512041939q18355de7n499ce3c2d8c41b7a@mail.gmail.com> On 12/4/05, Nick Johnson wrote: > Systems like BitTorrent have a rather annoying failure mode - the last > 'seed' goes offline while there are still several 'peers' (without the > complete file) online. any system which makes a partial resource available has this problem, even one using error correcting codes (seed goes offline before requisite X of coded blocks are sent). > However, what if [.. doing stuff to mix data ..], you > can reduce the chance of the last departing seed ensuring nobody can > complete the file. no; you've just made it less likely that the end of the file will always be the part missing if a peer terminates distribution prematurely. From arachnid at notdot.net Mon Dec 5 03:45:54 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4ef5fec60512041939q18355de7n499ce3c2d8c41b7a@mail.gmail.com> References: <4393B3C3.5040905@notdot.net> <4ef5fec60512041939q18355de7n499ce3c2d8c41b7a@mail.gmail.com> Message-ID: <4393B7F2.3010300@notdot.net> coderman wrote: >On 12/4/05, Nick Johnson wrote: > > >>Systems like BitTorrent have a rather annoying failure mode - the last >>'seed' goes offline while there are still several 'peers' (without the >>complete file) online. >> >> > >any system which makes a partial resource available has this problem, >even one using error correcting codes (seed goes offline before >requisite X of coded blocks are sent). > > But statistically, if n different peers each have random subsets of the data, the chances of them having 90% of the file between them are much, much higher than the chances of them having 100%. >>However, what if [.. doing stuff to mix data ..], you >>can reduce the chance of the last departing seed ensuring nobody can >>complete the file. >> >> > >no; you've just made it less likely that the end of the file will >always be the part missing if a peer terminates distribution >prematurely. > > BitTorrent distributes chunks semi-randomly, not sequentially, so you're no more likely to have the beginning than the end of the file. -Nick Johnson From coderman at gmail.com Mon Dec 5 04:06:37 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4393B7F2.3010300@notdot.net> References: <4393B3C3.5040905@notdot.net> <4ef5fec60512041939q18355de7n499ce3c2d8c41b7a@mail.gmail.com> <4393B7F2.3010300@notdot.net> Message-ID: <4ef5fec60512042006j60238e48q8003449fe1c60878@mail.gmail.com> On 12/4/05, Nick Johnson wrote: > ... 
> But statistically, if n different peers each have random subsets of the > data, the chances of them having 90% of the file between them are much, > much higher than the chances of them having 100%. you are assuming there was at least one complete distribution. in the situation you describe (last seed leaves) some of the remaining peers do then become seeds as they obtain requisite missing chunks to complete the torrent, if the remaining peers have the blocks required to complete what is missing. i don't see how error codes would be an improvement (considering coding overhead / expansion), unless the distribution of blocks using the current bittorrent algorithm was heavily weighted somehow. (is it?) if a complete copy has not been distributed within the group then it doesn't matter what encoding mechanism you use, and in my experience this has been the usual cause of partial failures. From agthorr at cs.uoregon.edu Mon Dec 5 04:36:02 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4ef5fec60512042006j60238e48q8003449fe1c60878@mail.gmail.com> References: <4393B3C3.5040905@notdot.net> <4ef5fec60512041939q18355de7n499ce3c2d8c41b7a@mail.gmail.com> <4393B7F2.3010300@notdot.net> <4ef5fec60512042006j60238e48q8003449fe1c60878@mail.gmail.com> Message-ID: <20051205043601.GA3088@cs.uoregon.edu> On Sun, Dec 04, 2005 at 08:06:37PM -0800, coderman wrote: > if a complete copy has not been distributed within the group then it > doesn't matter what encoding mechanism you use, and in my experience > this has been the usual cause of partial failures. Out of curiosity, what's your dataset and how have you established that the original seed failed to distribute at least one copy? -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From arachnid at notdot.net Mon Dec 5 04:55:02 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4ef5fec60512042006j60238e48q8003449fe1c60878@mail.gmail.com> References: <4393B3C3.5040905@notdot.net> <4ef5fec60512041939q18355de7n499ce3c2d8c41b7a@mail.gmail.com> <4393B7F2.3010300@notdot.net> <4ef5fec60512042006j60238e48q8003449fe1c60878@mail.gmail.com> Message-ID: <4393C826.1070500@notdot.net> coderman wrote: > you are assuming there was at least one complete distribution. > >in the situation you describe (last seed leaves) some of the remaining >peers do then become seeds as they obtain requisite missing chunks to >complete the torrent, if the remaining peers have the blocks required >to complete what is missing. i don't see how error codes would be an >improvement (considering coding overhead / expansion), unless the >distribution of blocks using the current bittorrent algorithm was >heavily weighted somehow. (is it?) > >if a complete copy has not been distributed within the group then it >doesn't matter what encoding mechanism you use, and in my experience >this has been the usual cause of partial failures. > > No, the point behind using ECC is that you don't need a complete distribution, only 90%. Here's some stats: Assume a file is distributed in 1000 blocks. The last seed goes offline, leaving 4 peers, each with an average of 500 blocks. Between them, they will, on average, have 1 - (500/1000)^4 percent of the blocks - 93.75%. 
The lieklihood of them having the entire file is (1 - 0.5^4) ^ 1000 - a very, very small number (approximately 10^-30). However, if we precode this file into 1100 blocks, of which only 1000 are required, assuming the same 500 blocks per peer, they have on average 1 - (600/1100)^4 percent of the blocks - 91.1%. Since they only require 10/11 (90.9%), they will usually have enough to reconstruct the original file. Unfortunately, I can't recall the neccessary stats to calculate the exact chance of success. -Nick Johnson From Arnaud.Legout at sophia.inria.fr Mon Dec 5 08:49:13 2005 From: Arnaud.Legout at sophia.inria.fr (Arnaud Legout) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4393B3C3.5040905@notdot.net> References: <4393B3C3.5040905@notdot.net> Message-ID: <4393FF09.6030302@sophia.inria.fr> Hi, Nick Johnson wrote: > Systems like BitTorrent have a rather annoying failure mode - the last > 'seed' goes offline while there are still several 'peers' (without the > complete file) online. Attempts by the peers to reconstruct the > original file are rarely successful, as the chances of every single > block being present on one of the seeds are generally very low - it's > likely that at least one block is missing. from my point of view this is pure myth. I often see such claims that bittorrent suffers from last pieces problem; that if there is no seed, the torrent is dead; etc. From all the experiments I performed, the reality is very different. Rarest first does a very good job at replicating the rarest pieces in a torrent, so that the probability to have a piece that is not replicated at all is very low. Of course, one can always build toys model that show problems in extreme cases. But, it is clear that BitTorrent is not a one fit all solution, and BitTorrent is very successful for its targeted applications: large scale replication for medium to large files. Outside this target, it makes sense to design other classes of applications. However, I am not convinced that error correcting code are the solution. Such codes are terribly sexy, but when it comes to real applications, things are far less sexy. It is hard to tune such codes as their relevance really comes from the context. If the context is a moving target, then the problem becomes very complex. Consider the simple case of a improving reliability of a satellite link. It is not trivial at all to find the good tradeoff between reliability and overhead. When it comes to a distributed system with heterogeneous clients, the problem is several order of magnitude more complex. Regards, Arnaud. From arachnid at notdot.net Mon Dec 5 10:34:55 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4393FF09.6030302@sophia.inria.fr> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> Message-ID: <439417CF.6060800@notdot.net> Arnaud Legout wrote: > Hi, > > Nick Johnson wrote: > >> Systems like BitTorrent have a rather annoying failure mode - the >> last 'seed' goes offline while there are still several 'peers' >> (without the complete file) online. Attempts by the peers to >> reconstruct the original file are rarely successful, as the chances >> of every single block being present on one of the seeds are generally >> very low - it's likely that at least one block is missing. 
> > from my point of view this is pure myth. > I often see such claims that bittorrent suffers from last pieces > problem; that if there is no seed, the torrent is dead; etc. > From all the experiments I performed, the reality is very different. > Rarest first does a very good job at replicating the rarest pieces in > a torrent, so that the probability to have a piece that is not > replicated at all is very low. This is why I'm after stats, not guesses - I'm of the opinion that even with rarest first, the chances of getting every single block are very low (remember, if you have 1000 blocks, and you're 99% likely to have each block, that's still only a 0.004% chance you'll have them all). However, that's just my guess, and this is just yours - only stats will show it one way or the other, really. > > Outside this target, it makes sense to design other classes of > applications. However, I am not convinced that error correcting code > are the > solution. Such codes are terribly sexy, but when it comes to real > applications, things are far less sexy. > It is hard to tune such codes as their relevance really comes from the > context. If the context is a moving target, then the problem becomes > very complex. In this case, the progression of expected percentage of blocks (50% + 25% + 12.5% + ...) gives us a very good idea how large an ECC is required - after only 4 generations (4 peers with average 50%), the amount of data expected is over 90%, so for most purposes 10% check blocks will be sufficient. Since more check blocks don't require any more transmission, the overhead is pretty low, and it's quite possible to set the threshold higher if desired. Any amount should give benifits, however. -Nick Johnson From Arnaud.Legout at sophia.inria.fr Mon Dec 5 11:31:22 2005 From: Arnaud.Legout at sophia.inria.fr (Arnaud Legout) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <439417CF.6060800@notdot.net> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> <439417CF.6060800@notdot.net> Message-ID: <4394250A.9050704@sophia.inria.fr> Hi, Nick Johnson wrote: > > This is why I'm after stats, not guesses - I'm of the opinion that > even with rarest first, the chances of getting every single block are > very low (remember, if you have 1000 blocks, and you're 99% likely to > have each block, that's still only a 0.004% chance you'll have them > all). However, that's just my guess, and this is just yours - only > stats will show it one way or the other, really. > and we have stats. You can have a look at (section IV-B): http://hal.inria.fr/inria-00000156/en for an experimental evaluation of rarest first. We are still working on this paper and more results are to come. However, they all show that rarest first increases the entropy of the pieces in a way that renders more complex piece management pointless in all the torrents we monitored. In particular, rarest first increases very fast the rarest pieces in your peer set so that the probability to have rare pieces in your peer set decreases fast with time. Therefore, even if some peers leave the peer set, the chance to have missing pieces is low. Of course, you have transient period of time during which some pieces may disappear from the torrent. But in a typical torrent, this is unlikely. Do not hesitate to comment or ask questions on our paper, I would be pleased to answer. Regards, Arnaud. 
From p2phackerslist at rhesusb.dk Mon Dec 5 12:18:27 2005 From: p2phackerslist at rhesusb.dk (DanielEKFA) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Looking for litterature on file sharing networks... In-Reply-To: <200512041956.09896.p2phackerslist@rhesusb.dk> References: <200512041956.09896.p2phackerslist@rhesusb.dk> Message-ID: <200512051318.27717.p2phackerslist@rhesusb.dk> To Thomas and Bram: Those resources are great! Perfect with technical documents, and great with Bram's list of different networks. Thanks again, both of you! :) On Sunday 2005-12-04 19:56, DanielEKFA wrote: > Hi there :) > > I'm study computer science and I'm writing a synopsis on peer-to-peer, more > specifically file sharing networks. I want to focus on the > protocols/network structures used in different file sharing programs. I > know the internet is my best friend, but for something more concrete > (schools like books), do you guys have any books to recommend? Good URLs > are very welcome, too :) > > Thanks in advance, > Daniel > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From arachnid at notdot.net Mon Dec 5 19:52:08 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4394250A.9050704@sophia.inria.fr> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> <439417CF.6060800@notdot.net> <4394250A.9050704@sophia.inria.fr> Message-ID: <43949A68.4090306@notdot.net> Arnaud Legout wrote: > Hi, > > Nick Johnson wrote: > >> >> This is why I'm after stats, not guesses - I'm of the opinion that >> even with rarest first, the chances of getting every single block are >> very low (remember, if you have 1000 blocks, and you're 99% likely to >> have each block, that's still only a 0.004% chance you'll have them >> all). However, that's just my guess, and this is just yours - only >> stats will show it one way or the other, really. >> > and we have stats. You can have a look at (section IV-B): > http://hal.inria.fr/inria-00000156/en > > for an experimental evaluation of rarest first. We are still working > on this paper and more results are to come. Excellent - this is exactly what I was looking for. However, I'm a little confused - first you say "Fig. 9 represents the evolution of the number of copies of pieces in the peer set with time", then you say "Fig. 12 represents the evolution of the number of copies of pieces in the peer set with time. We see some major differences compared to Fig. 9". What's the difference between what you're graphing in the two graphs? -Nick Johnson From lemonobrien at yahoo.com Mon Dec 5 20:14:08 2005 From: lemonobrien at yahoo.com (Lemon Obrien) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] P2P in SFC In-Reply-To: Message-ID: <20051205201408.64907.qmail@web53606.mail.yahoo.com> I need confirmation, time and place...so I can make sure I'm there...and on time to get a good seat :) lemon You don't get no juice unless you squeeze Lemon Obrien, the Third. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051205/ee4b480d/attachment.htm From sleety at gmail.com Tue Dec 6 08:20:41 2005 From: sleety at gmail.com (Mr Iceman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] I need a ASP security code Message-ID: <917b56f0512060020o4c37581xaa0a227f84c1f70d@mail.gmail.com> Hello. I need a ASP security code for login pages and save the information in Data base. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051206/aca61240/attachment.html From enzomich at gmail.com Tue Dec 6 08:37:36 2005 From: enzomich at gmail.com (Enzo Michelangeli) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failurein BitTorrent like systems? References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> Message-ID: <02c601c5fa40$57fbd520$0200a8c0@em.noip.com> ----- Original Message ----- From: "Arnaud Legout" Sent: Monday, December 05, 2005 4:49 PM [...] > from my point of view this is pure myth. > I often see such claims that bittorrent suffers from last pieces > problem; that if there is no seed, the torrent is dead; etc. > From all the experiments I performed, the reality is very different. > Rarest first does a very good job at replicating the rarest pieces > in a torrent, so that the probability to have a piece that is not > replicated at all is very low. Sometimes particular blocks are not missing, but mangled in transit by NAT routers too smart for their own good (and their owners', among which, at one time, myself). See: http://azureus.aelitis.com/wiki/index.php/NinetyNine The existence of such routers is doubted at: http://www.plugndial.com/draft-jennings-midcom-stun-results-02.txt [...] Some NATs were rumored to exist that looked in arbitrary packets for either the NATs' external IP address or for the internal host IP address - either in binary or dotted decimal form - and rewrote it to something else. STUN could be extended to test for exactly this type of behavior by echoing arbitrary client data and the mapped address but sending the bits inverted so these evil NATs did not mess with them. NATs that do this will break integrity detection on payloads. ...but I can testify that a Sercom IP706ST once in my possession did perform such "blind payload patch" for packets sent to the DMZ host. Needless to say, it's been demoted to paperweight :-) Enzo From dbarrett at quinthar.com Tue Dec 6 08:43:01 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] P2P in SFC - Last call Message-ID: <43954F15.7080008@quinthar.com> Looks like we'll have a good showing at the P2P event. Despite the size, I'm sticking to my guns: Ryoko's Sushi - Wednesday, 12/7 at 9pm - URL: http://tinyurl.com/bkk5d - Phone: (415) 775-1028 - Address: 619 Taylor St, San Francisco, CA 94102 To keep the logistics simple, I'll bring a jar into which everyone can toss money, and I'll keep ordering sushi until the jar runs dry. If you have any preferences, do let me know. As for drinks, I spoke with the pretty girls there and they'll take care of you at the bar. Any lurkers who haven't spoken up feel free to come, and all those who have please let me know if you won't. Until then, see ya'll soon! -david From sg266 at cornell.edu Tue Dec 6 09:05:30 2005 From: sg266 at cornell.edu (Saikat Guha) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failurein BitTorrent like systems? 
In-Reply-To: <02c601c5fa40$57fbd520$0200a8c0@em.noip.com> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> <02c601c5fa40$57fbd520$0200a8c0@em.noip.com> Message-ID: <1133859930.3099.20.camel@himalaya.cs.cornell.edu> On Tue, 2005-12-06 at 16:37 +0800, Enzo Michelangeli wrote: > Some NATs were rumored to exist that looked in arbitrary packets for > either the NATs' external IP address or for the internal host IP > address - either in binary or dotted decimal form - and rewrote it to > something else. [...] > > ...but I can testify that a Sercom IP706ST once in my possession did > perform such "blind payload patch" for packets sent to the DMZ host. > Needless to say, it's been demoted to paperweight :-) FWIW, we looked for such behavior in NATs w.r.t TCP packets (http://nutss.net/stunt-results.php). We couldn't find any evidence of TCP data mangling in the 120 or so NATs that we tested. Would it be possible to run the STUNT NAT test from behind your paperweight? :-D cheers, -- Saikat -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051206/9b078982/attachment.pgp From Arnaud.Legout at sophia.inria.fr Tue Dec 6 09:34:31 2005 From: Arnaud.Legout at sophia.inria.fr (Arnaud Legout) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <43949A68.4090306@notdot.net> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> <439417CF.6060800@notdot.net> <4394250A.9050704@sophia.inria.fr> <43949A68.4090306@notdot.net> Message-ID: <43955B27.30202@sophia.inria.fr> Hi, Nick Johnson wrote: > > Excellent - this is exactly what I was looking for. I happy to see it is useful to you > However, I'm a little confused - first you say "Fig. 9 represents the > evolution of the number of copies of pieces in the peer set with > time", then you say "Fig. 12 represents the evolution of the number of > copies of pieces in the peer set with time. We see some major > differences compared to Fig. 9". What's the difference between what > you're graphing in the two graphs? This is not the same torrent. Fig. 9 is for torrent 7 (see Table 1), and Fig. 12 is for torrent 11. Torrent 9 is a typical torrent. We see that the number of copies in your peer set is well bounded. Torrent 11 is a torrent with only one seed for most of the monitoring. In this case there are many pieces with only one copy (only on the seed because the torrent is just starting), but we see that even in this case the mean number of copies increases and that the rarest pieces are replicated fast. This torrent shows that even with only one source, the pieces are efficiently replicated (Fig. 14 and Fig.15 leads to this conclusion). Arnaud. From Arnaud.Legout at sophia.inria.fr Tue Dec 6 09:46:13 2005 From: Arnaud.Legout at sophia.inria.fr (Arnaud Legout) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failurein BitTorrent like systems? 
In-Reply-To: <02c601c5fa40$57fbd520$0200a8c0@em.noip.com> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr> <02c601c5fa40$57fbd520$0200a8c0@em.noip.com> Message-ID: <43955DE5.7090808@sophia.inria.fr> Hi, Enzo Michelangeli wrote: > Sometimes particular blocks are not missing, but mangled in transit by NAT > routers too smart for their own good (and their owners', among which, at > one time, myself). You do not target the same problem. It is far more easy to define your redundancy parameters to correct x% of corruption than to solve a distributed piece selection problem. Moreover, corrupted pieces cannot be replicated in a torrent because your have a hash for each piece. Therefore, I do not see how you can stop at 99% of the download because some pieces are corrupted. In this case, the piece is simply retransmitted, which increases slightly the download time. By no way it should compromise a torrent or create a last pieces problem. Arnaud. From sg266 at cornell.edu Tue Dec 6 10:37:18 2005 From: sg266 at cornell.edu (Saikat Guha) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] DHTs in highly-transient networks In-Reply-To: References: Message-ID: <1133865438.3099.65.camel@himalaya.cs.cornell.edu> On Thu, 2005-12-01 at 18:38 +0000, Salem Mark wrote: > in "highly transient networks", where the number of nodes > appearing and disappearing are very high, maintaining the DHT becomes hard > and introduces considerable overhead. > > I am trying to find out what exactly "highly-transient" means. A file > sharing network like Gnutella, seems to be highly transient, where peers > join/leave the network frequently. True; the answer depends on the particular application and protocol. In Gnutella (without ultra-peers), activity of all clients would affect the network equally. With an intelligently chosen subset of nodes, (ultrapeers in Gnutella, supernodes in Kazaa and Skype) the effects of churn can be mitigated. This relies on the assumption that this subset of nodes is more stable than the rest. The assumption appears to be borne out in Gnutella (Daniel's paper), and in Skype [1]. [1] An Experimental Study of the Skype Peer-to-Peer VoIP System http://www.guha.cc/~saikat/pub/cucs05-skype-abstract.php > Could somebody elaborate on this? is > there a node departure/arrival/failure rate (per sec? per min?) that > identifies "highly-transient" networks ? FWIW, in [1], we found that the supernode turnover is typically less than 5% / 30min. Median supernode session time is 5.5 hours; session time is heavy-tailed (Pareto) and not exponential. Supernodes are much more stable than regular nodes. Btw, if anyone wants a copy of [1], please email me directly. It has data on Skype supernode lifetimes, churn rates, comparison between skype supernodes and regular nodes, Skype VoIP and file-transfer workload characterization, etc. In short, we found that Skype differs considerably from filesharing networks (different usage model, much higher median lifetimes etc). cheers, -- Saikat -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051206/26715c26/attachment.pgp From enzomich at gmail.com Tue Dec 6 11:15:20 2005 From: enzomich at gmail.com (Enzo Michelangeli) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes toprevent failurein BitTorrent like systems? 
References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr><02c601c5fa40$57fbd520$0200a8c0@em.noip.com> <43955DE5.7090808@sophia.inria.fr> Message-ID: <035101c5fa5a$0abebc40$0200a8c0@em.noip.com> ----- Original Message ----- From: "Arnaud Legout" Sent: Tuesday, December 06, 2005 5:46 PM > You do not target the same problem. It is far more easy to define > your redundancy parameters to > correct x% of corruption than to solve a distributed piece > selection problem. > Moreover, corrupted pieces cannot be replicated in a torrent because > your have a hash for each piece. > Therefore, I do not see how you can stop at 99% of the download > because some pieces are corrupted. In this case, > the piece is simply retransmitted, which increases slightly the > download time. By no way it should compromise a torrent > or create a last pieces problem. Why should retransmission solve the problem? If I'm behind a "mangling router", trying to download a file a piece of which has a 4-byte sequence that is mistaken by the router as an IP address that needs translation, every time I get that sequence the data will be corrupted. Enzo From enzomich at gmail.com Tue Dec 6 11:44:37 2005 From: enzomich at gmail.com (Enzo Michelangeli) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to preventfailurein BitTorrent like systems? References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr><02c601c5fa40$57fbd520$0200a8c0@em.noip.com> <1133859930.3099.20.camel@himalaya.cs.cornell.edu> Message-ID: <035601c5fa5a$7c2f4340$0200a8c0@em.noip.com> ----- Original Message ----- From: "Saikat Guha" Sent: Tuesday, December 06, 2005 5:05 PM On Tue, 2005-12-06 at 16:37 +0800, Enzo Michelangeli wrote: [...] >> ...but I can testify that a Sercom IP706ST once in my possession did >> perform such "blind payload patch" for packets sent to the DMZ host. >> Needless to say, it's been demoted to paperweight :-) > > FWIW, we looked for such behavior in NATs w.r.t TCP packets > (http://nutss.net/stunt-results.php). We couldn't find any evidence of > TCP data mangling in the 120 or so NATs that we tested. Would it be > possible to run the STUNT NAT test from behind your paperweight? :-D It'll take a few days... But anyway the mangling I observed happened on UDP packets, rather than TCP. (I cannot exclude that TCP was affected as well, though). Enzo From stelian at axigenmail.com Tue Dec 6 12:59:54 2005 From: stelian at axigenmail.com (Stelian) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <4393B3C3.5040905@notdot.net> References: <4393B3C3.5040905@notdot.net> Message-ID: <43958B4A.8000902@axigenmail.com> Nick Johnson wrote: > If we assume there are 4 peers left on the network, each with 50% of > the file remaining, on average they will be able to reconstruct 50% + > 25% + 12.5% + 6.25% = 93.75% of the file, which exceeds the threshold > required to reconstruct with check blocks. Your concern over the disappearing seed is obviously relevant in case of a slow seed, otherwise the probability that the seed has not uploaded all the blocks at least once before departing is practically very low. So let's assume a slow seed, a seed so slow that once it has finished uploading a block, all the lechers will share the new available block instantly among them. 
Assuming the seed has time to upload only 90% of the original file, then it will have time to upload only 81.8% of the new file (10% larger because of the error correction) - which is of course insufficient to reconstitute the file, thereby negating any apparent gain. From Arnaud.Legout at sophia.inria.fr Tue Dec 6 13:08:39 2005 From: Arnaud.Legout at sophia.inria.fr (Arnaud Legout) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Error correcting codes toprevent failurein BitTorrent like systems? In-Reply-To: <035101c5fa5a$0abebc40$0200a8c0@em.noip.com> References: <4393B3C3.5040905@notdot.net> <4393FF09.6030302@sophia.inria.fr><02c601c5fa40$57fbd520$0200a8c0@em.noip.com> <43955DE5.7090808@sophia.inria.fr> <035101c5fa5a$0abebc40$0200a8c0@em.noip.com> Message-ID: <43958D57.3030305@sophia.inria.fr> Hi, Enzo Michelangeli wrote: > Why should retransmission solve the problem? If I'm behind a "mangling > router", trying to download a file a piece of which has a 4-byte sequence > that is mistaken by the router as an IP address that needs translation, > every time I get that sequence the data will be corrupted. > I did not understand that you were referring to this kind of problem. I was arguing about random corruption of a given amount of packets. Regards, Arnaud. From xiangsong.hou at gmail.com Tue Dec 6 15:58:41 2005 From: xiangsong.hou at gmail.com (xiangsong hou) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] published key change frequency in DHT Message-ID: hi all: as we know, a DHT can deal with nodes joining and leaving frequently. I want to know whether a DHT can also deal with published keys that change frequently. For example, in grid computing resource discovery over a DHT, the published key (representing CPU or memory) changes very frequently, so the node it is assigned to changes frequently. How is this situation handled in a DHT? Are there papers about this? HOUXS -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051206/3b9f6362/attachment.htm From ian.clarke at gmail.com Tue Dec 6 17:47:40 2005 From: ian.clarke at gmail.com (Ian Clarke) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <4392402F.303@panix.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> <823242bd0512030149t3e6a18d2x@mail.gmail.com> <4392402F.303@panix.com> Message-ID: <823242bd0512060947y3e4d1bd9n@mail.gmail.com> On 03/12/05, Lucas Gonze wrote: > A private network allows participants to talk freely without every > comment ending up in Google, and that allows you to have the kind of > conversation which shouldn't be public. The application is to enable > speech which isn't intended for global scale, usually about personal > issues like sex, money, family, friendships, and gossip. I wouldn't > call that covert, illegal, or contrived, just private. I think that would fall under my definition of "covert". Ian. From nigini at gmail.com Wed Dec 7 00:08:48 2005 From: nigini at gmail.com (Nigini Oliveira) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: Error correcting codes to prevent failure in BitTorrent like systems? Message-ID: <94fec2490512061608m14cc56f7vdfe4a423c95e4635@mail.gmail.com> Hi ALL!
Don't know specifically about BitTorrent, but I've found a lot of results talking about the gains of using ECC for improving the availability of data in p2p networks: http://www.citeulike.org/user/nigini/article/274016 (not read yet) These days I'm working on the analysis of a paper that talks about using "Erasure Codes" to do that (maybe one day I can finish my model on it): http://www.citeulike.org/user/nigini/article/307402 (this is pretty hard to understand) Searching now I've found this one-page work exposing some "interesting" data: http://dmi.ensica.fr/IMG/pdf/347.pdf After I began to study ECC this year I can't stop finding examples of their use in the network/computing world. But as Arnaud said: "Such codes are terribly sexy, but when it comes to real applications, things are far less sexy." But since I haven't yet concluded that for BitTorrent "less sexy" means "not needed", maybe the above work can inspire some answer. "Até mais!" -- Nigini Abilio Oliveira Mestrando em Computação UFCG - DSC - COPIN www.nigini.com.br nigini@gmail.com nigini@dsc.ufcg.edu.br -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051206/76ec1011/attachment.html From nigini at gmail.com Wed Dec 7 00:32:14 2005 From: nigini at gmail.com (Nigini Oliveira) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Re: Error correcting codes to prevent failure in BitTorrent like systems? In-Reply-To: <94fec2490512061608m14cc56f7vdfe4a423c95e4635@mail.gmail.com> References: <94fec2490512061608m14cc56f7vdfe4a423c95e4635@mail.gmail.com> Message-ID: <94fec2490512061632i6801bc90ne763e5372aceec79@mail.gmail.com> Just found this text going right to the point... I don't know, but it appears that this guy is connected with BitTorrent development... http://www.livejournal.com/users/bramcohen/1416.html On 12/6/05, Nigini Oliveira wrote: > > Hi ALL! > > Don't know specifically about BitTorrent, but I've found a lot of results > talking about the gains of using ECC for improving the availability of data > in p2p networks: > http://www.citeulike.org/user/nigini/article/274016 (not read yet) > > These days I'm working on the analysis of a paper that talks about using > "Erasure Codes" to do that (maybe one day I can finish my model on it): > http://www.citeulike.org/user/nigini/article/307402 (this is pretty hard > to understand) > > Searching now I've found this one-page work exposing some "interesting" > data: > http://dmi.ensica.fr/IMG/pdf/347.pdf > > After I began to study ECC this year I can't stop finding examples > of their use in the network/computing world. But as Arnaud said: "Such codes > are terribly sexy, but when it comes to real applications, things are far > less sexy." But since I haven't yet concluded that for BitTorrent "less sexy" > means "not needed", maybe the above work can inspire some answer. > > "Até mais!" > > -- > Nigini Abilio Oliveira > Mestrando em Computação > UFCG - DSC - COPIN > www.nigini.com.br > nigini@gmail.com > nigini@dsc.ufcg.edu.br -- Nigini Abilio Oliveira Mestrando em Computação UFCG - DSC - COPIN www.nigini.com.br nigini@gmail.com nigini@dsc.ufcg.edu.br -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051206/f39d5c5a/attachment.htm From bneijt at gmail.com Wed Dec 7 01:31:01 2005 From: bneijt at gmail.com (Bram Neijt) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Any body know this kind of network?
Message-ID: <46c2f4ab0512061731m59b6ebder89c62a15aa5f9fd1@mail.gmail.com> Hi. I'm writing up some documentation on P2P systems, and I've tried to make an overview of all kinds of networks. From simple http client-server networks to anonymous P2P, in the hope I could predict the next step. One thing I thought up was a highly unusable network, which some university project might have tried out. Maybe some of you can point me to an "approximate example" of that network: Clients constantly receive data from the network and push it back to other hosts they are connected to, inserting their own requests, filling requests with data and picking out their own data. They don't have any identification of where the data came from (not even an anonymous ID) and simply pick out the "right" data. If you know of a system that comes close, I would like to be able to point to it in my documentation. Greetings, Bram PS The documents I'm working on can be found here: http://www.ai.rug.nl/~bneijt/doc/networks/levels.html From m.rogers at cs.ucl.ac.uk Wed Dec 7 10:46:29 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Any body know this kind of network? In-Reply-To: <46c2f4ab0512061731m59b6ebder89c62a15aa5f9fd1@mail.gmail.com> References: <46c2f4ab0512061731m59b6ebder89c62a15aa5f9fd1@mail.gmail.com> Message-ID: <4396BD85.5000509@cs.ucl.ac.uk> Hi Bram, A few systems you might be interested in: P5 (http://www.cs.umd.edu/projects/p5/p5-extended.pdf), Cashmere (http://www.cs.ucsb.edu/~ravenben/publications/pdf/cashmere-nsdi05.pdf), Herbivore (http://www.cs.cornell.edu/People/egs/papers/herbivore-tr.pdf) and XOR trees (ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-54.ps.gz). Herbivore and XOR trees are based on the dining cryptographers protocol (http://world.std.com/~franl/crypto/dining-cryptographers.txt). [vapourware] I'm also working on an anonymous communication system where there are no end-to-end node IDs, but nodes can use link-local flow identifiers to recognise packets that are part of the same flow, and anonymous delivery receipts to work out which flows are being routed in the right direction. Trial and error can be used to find good routes to a destination without knowing its address. Hopefully this will be more efficient than flooding without sacrificing anonymity. [/vapourware] Cheers, Michael Bram Neijt wrote: > Hi. > > I'm writing up some documentation on P2P systems, and I've tried to > make an overview of all kinds of networks. From simple http > client-server networks to anonymous P2P, in the hope I could predict > the next step. > > One thing I thought up was a highly unusable network, which some > university project might have tried out. Maybe some of you can point > me to an "approximate example" of that network: > > Clients constantly receive data from the network and push it back to > other hosts they are connected to, inserting their own requests, > filling requests with data and picking out their own data. They don't > have any identification of where the data came from (not even an > anonymous ID) and simply pick out the "right" data. > > If you know of a system that comes close, I would like to be able to > point to it in my documentation.
> > Greetings, > Bram > PS The documents I'm working on can be found here: > http://www.ai.rug.nl/~bneijt/doc/networks/levels.html > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From bneijt at gmail.com Wed Dec 7 13:16:24 2005 From: bneijt at gmail.com (Bram Neijt) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Any body know this kind of network? In-Reply-To: <4396BD85.5000509@cs.ucl.ac.uk> References: <46c2f4ab0512061731m59b6ebder89c62a15aa5f9fd1@mail.gmail.com> <4396BD85.5000509@cs.ucl.ac.uk> Message-ID: <46c2f4ab0512070516w3453b5bn3fe1a044f7349306@mail.gmail.com> Thanks to Gun and Michael, I'm going to take some time reading the papers and sites you guys pointed to, so it will take some time before they are in the documentation. And now that this level has been identified, I'm off to complete the list a bit more and hopefully think of yet another kind of system along the way ;-) Thanks for the quick replies! Bram From ludovic.courtes at laas.fr Wed Dec 7 16:18:46 2005 From: ludovic.courtes at laas.fr (Ludovic =?iso-8859-1?Q?Court=E8s?=) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: <438C9F88.2050803@pdos.lcs.mit.edu> (Jeremy Stribling's message of "Tue, 29 Nov 2005 13:35:52 -0500") References: <200511291414.35852.01771@iha.dk> <20051129141713.6A9CB698@yumyum.zooko.com> <20051129142151.8E1A035E4@yumyum.zooko.com> <438C9F88.2050803@pdos.lcs.mit.edu> Message-ID: <87lkywg9sp.fsf_-_@laas.fr> Hi, Jeremy Stribling writes: > Working on it. Should have something public within a few months: > > http://pdos.csail.mit.edu/papers/overcite:iptps05/index.html Indeed, that seems very promising! Similarly, are there people working on decentralized web indexing and search engines? To paraphrase Zooko, it would be nice to decentralize Google before it is too late... Thanks, Ludovic. From gwendal.simon at francetelecom.com Wed Dec 7 16:36:01 2005 From: gwendal.simon at francetelecom.com (SIMON Gwendal RD-MAPS-ISS) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines Message-ID: In comparison with traditional filesharing approaches, a decentralized search for the web should take into account the words inside the documents. As previously said, we are working on a system named Maay which aims at performing a decentralized and personalized search on a distributed set of textual documents. http://maay.netofpeers.net Each node (i.e. a computer) can publish a set of documents. This information space does not initially contain the web. Our idea is to consider that the cache (or history) of the web browser should be, by default, included in the published set of documents. So, every page that has been visited by at least one person in the last x days will be available in the network. Obviously, the more popular a page is, the more available it is. By the way, one first challenge is the implementation of a nice crawler for owned documents : an indexer. This indexer should be able to scan and retrieve words from various documents (.html, .doc, .pdf, ...). It should be light and run in idle time and, if possible, be cross-platform. If you know a good open-source indexer, please let us know.
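To make the requirement concrete, the core of such an indexer is a small inverted index built from local files. The sketch below is only an illustration, not Maay code: it is Python, it assumes a hypothetical ~/Documents corpus, and it handles only plain text and HTML, whereas formats such as .doc and .pdf would need a dedicated extractor (several are suggested later in this thread).

import os, re
from collections import defaultdict
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Keeps the text content of an HTML page and drops the tags."""
    def __init__(self):
        super().__init__()
        self.chunks = []
    def handle_data(self, data):
        self.chunks.append(data)

def extract_text(path):
    # Plain text and HTML only; .doc/.pdf would need real format extractors.
    raw = open(path, encoding="utf-8", errors="ignore").read()
    if path.endswith((".html", ".htm")):
        parser = TextOnly()
        parser.feed(raw)
        raw = " ".join(parser.chunks)
    return raw

def build_index(root):
    index = defaultdict(set)               # word -> set of document paths
    for dirpath, _, names in os.walk(root):
        for name in names:
            if name.endswith((".txt", ".html", ".htm")):
                path = os.path.join(dirpath, name)
                for word in re.findall(r"[a-z0-9]+", extract_text(path).lower()):
                    index[word].add(path)
    return index

if __name__ == "__main__":
    idx = build_index(os.path.expanduser("~/Documents"))  # hypothetical corpus location
    print(sorted(idx["peer"]))                             # documents containing "peer"

A real indexer would additionally need incremental updates and idle-time scheduling, which is where the existing desktop indexers mentioned below have an edge.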
-- Gwendal > -----Message d'origine----- > De : p2p-hackers-bounces@zgp.org > [mailto:p2p-hackers-bounces@zgp.org] De la part de Ludovic Court?s > Envoy? : mercredi 7 d?cembre 2005 17:19 > ? : strib@MIT.EDU > Cc : Peer-to-peer development.; zooko@zooko.com > Objet : [p2p-hackers] Decentralized search engines > > Hi, > > Jeremy Stribling writes: > > > Working on it. Should have something public within a few months: > > > > http://pdos.csail.mit.edu/papers/overcite:iptps05/index.html > > Indeed, that seems very promising! > > Similarly, are there people working on decentralized web indexing and > search engines? To paraphrase Zooko, it would be nice to decentralize > Google before it is too late... > > Thanks, > Ludovic. > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From solipsis at pitrou.net Wed Dec 7 17:17:49 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: References: Message-ID: <1133975869.5662.5.camel@fsol> Hi Gwendal :) Le mercredi 07 d?cembre 2005 ? 17:36 +0100, SIMON Gwendal RD-MAPS-ISS a ?crit : > By the way, one first challenge is the implementation of a nice > crawler for owned documents : an indexer. This indexer should be able > to scan and retrieve words from various documents > (.html, .doc, .pdf, ...). It should be light and run in idle time and, > if possible, be cross-platform. If you know a good open-source > indexer, please let us know. You can look at the techniques used by Beagle : http://beaglewiki.org/ or Kat : http://kat.mandriva.com/ or the Gnome Deskbar applet : http://live.gnome.org/DeskbarApplet http://raphael.slinckx.net/deskbar/ Regards Antoine. From agthorr at cs.uoregon.edu Wed Dec 7 17:27:26 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: <87lkywg9sp.fsf_-_@laas.fr> References: <200511291414.35852.01771@iha.dk> <20051129141713.6A9CB698@yumyum.zooko.com> <20051129142151.8E1A035E4@yumyum.zooko.com> <438C9F88.2050803@pdos.lcs.mit.edu> <87lkywg9sp.fsf_-_@laas.fr> Message-ID: <20051207172725.GG5812@cs.uoregon.edu> On Wed, Dec 07, 2005 at 05:18:46PM +0100, Ludovic Court?s wrote: > Similarly, are there people working on decentralized web indexing and > search engines? To paraphrase Zooko, it would be nice to decentralize > Google before it is too late... For what purpose do you want to "decentralize Google"? Is it for some technical reason where you believe a decentralized index will provide better end-user performance? Or is it because you don't think any single organization should have that much control over information? -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From gwendal.simon at francetelecom.com Wed Dec 7 17:49:07 2005 From: gwendal.simon at francetelecom.com (SIMON Gwendal RD-MAPS-ISS) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines Message-ID: > from Daniel Stutzbach > > For what purpose do you want to "decentralize Google"? > > Is it for some technical reason where you believe a decentralized > index will provide better end-user performance? Yes. Crawler-based systems are not up-to-date. 
It is especially bad in the current context of dynamic webpages : news, posts, comments... Moreover, more contents can be indexed : my photos, my music... > Or is it because you don't think any single organization should have > that much control over information? Yes. By the way, an organization can be sued and shutdown (eg. napster) -- Gwendal > > -- > Daniel Stutzbach Computer Science > Ph.D Student > http://www.barsoom.org/~agthorr > University of Oregon > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From hannes.tschofenig at siemens.com Wed Dec 7 17:45:23 2005 From: hannes.tschofenig at siemens.com (Tschofenig, Hannes) Date: Sat Dec 9 22:13:05 2006 Subject: AW: [p2p-hackers] Decentralized search engines Message-ID: hi daniel, google is, in some sense, already using a decentralized solution. they are using more than 160.000 machines for scalability, cost and performance reasons. i wonder whether there is actually detailed information available how their system works. ciao hannes > -----Urspr?ngliche Nachricht----- > Von: p2p-hackers-bounces@zgp.org > [mailto:p2p-hackers-bounces@zgp.org] Im Auftrag von Daniel Stutzbach > Gesendet: Mittwoch, 7. Dezember 2005 18:27 > An: Peer-to-peer development. > Cc: strib@MIT.EDU; zooko@zooko.com > Betreff: Re: [p2p-hackers] Decentralized search engines > > On Wed, Dec 07, 2005 at 05:18:46PM +0100, Ludovic Court?s wrote: > > Similarly, are there people working on decentralized web > indexing and > > search engines? To paraphrase Zooko, it would be nice to > decentralize > > Google before it is too late... > > For what purpose do you want to "decentralize Google"? > > Is it for some technical reason where you believe a decentralized > index will provide better end-user performance? > > Or is it because you don't think any single organization should have > that much control over information? > > -- > Daniel Stutzbach Computer Science > Ph.D Student > http://www.barsoom.org/~agthorr > University of Oregon > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From mgp at ucla.edu Wed Dec 7 19:27:48 2005 From: mgp at ucla.edu (Michael Parker) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: References: Message-ID: <20051207112748.1ptnql9c004s4oko@mail.ucla.edu> The first step of indexing is the actual keyword extraction itself. From what I have heard, libextractor is a good open-source solution: http://gnunet.org/libextractor/ - Mike Parker Quoting SIMON Gwendal RD-MAPS-ISS : > By the way, one first challenge is the implementation of a nice > crawler for owned documents : an indexer. This indexer should be able > to scan and retrieve words from various documents (.html, .doc, .pdf, > ...). It should be light and run in idle time and, if possible, be > cross-platform. If you know a good open-source indexer, please let us > know. 
> From coderman at gmail.com Wed Dec 7 19:40:50 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: References: Message-ID: <4ef5fec60512071140u70e2213dk29e31c47d4b205b2@mail.gmail.com> On 12/7/05, Tschofenig, Hannes wrote: > ... > google is, in some sense, already using a decentralized solution. they are using more than 160.000 machines for scalability, cost and performance reasons. distributed != decentralized. googlefs and map reduce papers describe some of their internals. (regarding how it works) http://labs.google.com/papers/index.html From zooko at zooko.com Thu Dec 8 14:48:25 2005 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] darknet ~= (blacknet, f2f net) In-Reply-To: <823242bd0512030149t3e6a18d2x@mail.gmail.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> <823242bd0512030149t3e6a18d2x@mail.gmail.com> Message-ID: <20051208144825.A210014E74@yumyum.zooko.com> Ian Clarke wrote: > > I do wish you would refer to these networks as those which allow the > covert transmission of information, rather than those which are used > for the illegal transmission of information - since I am not aware of > any networks that are specifically designed for the illegal > transmission of information. I think this would help alleviate the > political problem you raise later in your email. The concept of a networking technology or a network which is specifically used for illegal information is an interesting concept, for example Tim May "blacknet" [1, 2, 3] and Biddle, et al. "darknet" [4]. If you would like to use "darknet" to mean something else then I can't stop you, but I would like to talk about that concept so I need a word for it. Regards, Zooko P.S. The most salient difference between blacknet [1] and darknet [2] in my opinion is that blacknet is a market, in which participants are motivated by economic gain, and darknet is a more general concept, in which the motivations of participants may be various -- including but not limited to friendship. [1] http://www.privacyexchange.org/iss/confpro/cfpuntraceable.html [2] http://www-personal.umich.edu/~ludlow/worries.txt [3] http://cypherpunks.venona.com/date/1993/08/msg00538.html [4] http://zgp.org/pipermail/p2p-hackers/2005-December/003245.html From zooko at zooko.com Thu Dec 8 15:08:39 2005 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:13:05 2006 Subject: [p2p-hackers] f2f for purposes other than privacy In-Reply-To: <823242bd0512030149t3e6a18d2x@mail.gmail.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> <823242bd0512030149t3e6a18d2x@mail.gmail.com> Message-ID: <20051208150840.01E8A14E74@yumyum.zooko.com> Ian Clarke wrote: > > We (Freenet) have > been concerned about the fact that Freenet was harvestable for several > years now. Around spring this year I made the observation that if > human relationships form a small world network, it should be possible > to assign locations to people such that we form a Kleinberg-style > small world network, and thus we could make the network routable. > Oskar Sandberg then suggested a way to do this, and we set about > validating the concept using simulations. I would love to learn more. 
Is there a white-paper or design document beyond these slides from DefCon [1]? > Are you aware of any current or proposed f2f > networks for which concealment of user activity is not a goal? Well, I think of the links between two friends in f2f to be not solely communication channels but also to have other meaning. For example, if friends transmit music files to one another, then in addition to any privacy properties that the network may have, it also serves as a decentralized, attack-resistant recommendation engine for music. Honestly, this area of research is ripe for exploration, but I can give you at least a couple of examples. Doceur set it up with a claimed general negative result in "The Sybil Attack" in 2002 [2]. But his general negative result isn't quite true, as disproven by e.g. Advogato, 2000 [3, 4, 5]. Recently George Danezis, Chris Lesniewski-Laas, M. Frans Kaashoek, and Ross Anderson smashed these two ideas together and mixed in some DHT routing: [6]. [6] is an excellent paper, which proposes a concrete DHT design and which really nails the fact that the introduction graph or "bootstrap graph" contains information which can defeat the allegedly undefeatable Sybil Attack. [6] references some related work which looks interesting, but I haven't followed those links yet myself. I guess [6] is somewhat relevant to the Freenet v0.7 design. So, uh, anyway, this shows that there is interest in the notion of using friendship networks for purposes other than privacy, namely attack resistance of DHT routing and attack resistance of metadata [7 (self-citation)]. I think there's a lot more value to be mined from this concept, and I'm really glad that it has finally gotten the attention of some p2p researchers. Oh, and here's another perspective on this idea -- a post I wrote to my blog a few years ago suggesting that all sorts of DHT innovations which were intended to improve network performance could be applied to attack resistance: "trust is just another topology" [8]. Regards, Zooko [1] http://freenetproject.org/papers/vegas1_dc.pdf [2] http://citeseer.ist.psu.edu/douceur02sybil.html [3] http://www.advogato.org/trust-metric.html [4] http://www.levien.com/thesis/compact.pdf [5] http://www.levien.com/free/tmetric-HOWTO.html [6] http://pdos.csail.mit.edu/cgi-bin/pubs-date.cgi?match=Sybil-resistant+DHT+routing [7] http://conferences.oreillynet.com/cs/p2p2001/view/e_sess/1200 [8] http://www.zooko.com/log-2003-01.html#d2003-01-23-trust_is_just_another_topology From zooko at zooko.com Thu Dec 8 15:28:58 2005 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] DHT generalization for purposes other than network performance In-Reply-To: <20051208150840.01E8A14E74@yumyum.zooko.com> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com> <823242bd0512030149t3e6a18d2x@mail.gmail.com> <20051208150840.01E8A14E74@yumyum.zooko.com> Message-ID: <20051208152858.0F5FC14E74@yumyum.zooko.com> I'm going to paraphrase a blog entry I wrote some years ago and then mention some newer research that is related. In January 2003 [1] I wrote something like: > Thanks to Peter Marbach for the discussion that prompted this insight. > > A network is defined on top of an underlying network. 
The first emergent > networks (Chord), assumed that the underlying network was (a) fully > connected and (b) homogeneous in the sense that any hop was considered to be > just as expensive as any other hop. The most important contribution of > Pastry (and then of Kademlia) is to treat the underlying network as > heterogeneous, in the sense that some hops are considered more expensive > that others. For Pastry, they chose to make these costs reflect network > performance (i.e. latency or throughput) so that Pastry would optimize for > faster routing (e.g. don't send packets through Japan when they are on their > way from Canada to USA). For Kademlia, they chose to make these costs > reflect uptime of peers in order to optimize for stability. So my big > realization is: > > *** Trust (or vulnerability, or exposure) can also be modelled in the same > way, as costs on the links of the underlying network. > > In addition, the underlying network may be incompletely connected, either > because of (a) trust disconnects, (b) firewalls, NATs, censorship, > terrorism, (c) the underlying network doesn't have complete routing e.g. > wireless ad hoc networks. > > This encourages me a lot: the fact that mainstream emergent network > researchers like Project IRIS might develop techniques for overlay networks > to work on more general underlying networks (especially non-fully- > connected), and that these techniques can then be applied to trust networks. The recent research that I wanted to cite was Michael Freedman et al. analyzing practical details of how current DHTs work atop non-fully-connected underlay networks: [2]. [2] doesn't propose any good general solution, and indeed it speculates that a fresh new DHT designed to handle non-fully-connected underlays may be needed. Regards, Zooko [1] http://www.zooko.com/log-2003-01.html#d2003-01-23-trust_is_just_another_topology [2] "Non-Transitive Connectivity and DHTs" http://www.scs.cs.nyu.edu/~mfreed/publications/ From Serguei.Osokine at efi.com Thu Dec 8 18:11:04 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] P2P in SFC - Last call Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42783@fcexmb04.efi.internal> > Ryoko's Sushi > - Wednesday, 12/7 at 9pm Thanks, David, for organizing that! Great crowd, great conversations. And I'm adding this place to my own short list of selected restaurants - I do not believe I've ever seen Kirin on tap anywhere else around here... Best wishes - S.Osokine. 8 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of David Barrett Sent: Tuesday, December 06, 2005 12:43 AM To: Peer-to-peer development. Subject: [p2p-hackers] P2P in SFC - Last call Looks like we'll have a good showing at the P2P event. Despite the size, I'm sticking to my guns: Ryoko's Sushi - Wednesday, 12/7 at 9pm - URL: http://tinyurl.com/bkk5d - Phone: (415) 775-1028 - Address: 619 Taylor St, San Francisco, CA 94102 To keep the logistics simple, I'll bring a jar into which everyone can toss money, and I'll keep ordering sushi until the jar runs dry. If you have any preferences, do let me know. As for drinks, I spoke with the pretty girls there and they'll take care of you at the bar. Any lurkers who haven't spoken up feel free to come, and all those who have please let me know if you won't. Until then, see ya'll soon! 
-david _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From don at dhoffman.net Thu Dec 8 19:21:45 2005 From: don at dhoffman.net (Donald Hoffman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] P2P in SFC - Last call In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42783@fcexmb04.efi.internal> References: <4A60C83D027E224BAA4550FB1A2B120EC42783@fcexmb04.efi.internal> Message-ID: I second Serguei's comment. A great evening. Thanks or organizing, David. On Dec 8, 2005, at 10:11 AM, Serguei Osokine wrote: >> Ryoko's Sushi >> - Wednesday, 12/7 at 9pm > > Thanks, David, for organizing that! > > Great crowd, great conversations. > > And I'm adding this place to my own short list of selected restaurants > - I do not believe I've ever seen Kirin on tap anywhere else around > here... > > Best wishes - > S.Osokine. > 8 Dec 2005. > > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers- > bounces@zgp.org]On > Behalf Of David Barrett > Sent: Tuesday, December 06, 2005 12:43 AM > To: Peer-to-peer development. > Subject: [p2p-hackers] P2P in SFC - Last call > > > Looks like we'll have a good showing at the P2P event. Despite the > size, I'm sticking to my guns: > > Ryoko's Sushi > - Wednesday, 12/7 at 9pm > - URL: http://tinyurl.com/bkk5d > - Phone: (415) 775-1028 > - Address: 619 Taylor St, San Francisco, CA 94102 > > To keep the logistics simple, I'll bring a jar into which everyone can > toss money, and I'll keep ordering sushi until the jar runs dry. > If you > have any preferences, do let me know. As for drinks, I spoke with the > pretty girls there and they'll take care of you at the bar. > > Any lurkers who haven't spoken up feel free to come, and all those who > have please let me know if you won't. Until then, see ya'll soon! > > -david > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From dbarrett at quinthar.com Fri Dec 9 02:15:06 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] P2P in SFC - The Aftermath Message-ID: <4398E8AA.2050002@quinthar.com> Thanks to everyone for attending. A good crowd and great conversation all around. And astonishingly, only a single beer spilt. Final menu: 4 hamachi nigiri 2 tekka maki 2 sake maki 2 maguro nigiri 2 sake nigiri 2 teriyaki chicken 2 toro nigiri 2 saba nigiri 1 Spanish mackerel nigiri 1 halibut nigiri 2 asparagus rolls 2 California rolls 2 rainbow rolls 1 spicy tuna roll 3 tempura Nothing too fancy, but covered all the basics. Not a single uni request, even though I know there was at least one fanatic in the crowd. So thanks again for everyone who came, and thanks to the rest for enduring my endless emails. 
Also, thanks to Travis Kalanick from Red Swoosh for picking up the slack between what we ordered and what money was dropped into the bucket. For those who are interested, there's also a "superhappydevhouse" event (http://superhappydevhouse.com/) this Saturday in the hills above San Mateo. Come show your stuff in the "P2P Hour of Power" -- it's a great, informal way to get 15 minutes of fame in front of a surprisingly large crowd of surprisingly social geeks. I'll be there and will likely demo iGlance; feel free to send me any questions if you're interested in participating. This concludes our test of the P2P Sushifest broadcast system, now back to your originally scheduled programming. -david From dbarrett at quinthar.com Fri Dec 9 02:32:06 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: References: Message-ID: <4398ECA6.2090301@quinthar.com> SIMON Gwendal RD-MAPS-ISS wrote: > This > information space does not initially contain the web. Our idea is to > consider that the cache (or history) of the web browser should be, by > default, included in the published set of documents. I assume you have a good answer for this, but how will you prevent (for example) cached copies of Hotmail from ending up in your system? Also, is there any way to correlate an actual web URL with content in your system? For example, could you do a search in your system, it finds a cached webpage, and then offer a "(www)" link that points back to the original page? Finally, is there any way to create a "private" subset of the network, so (for example) everyone in my company can use this to get quick access to everyone else's documents, but nobody outside my company can use it to get in? Regardless, this looks fantastic. If you're in a US-centric business frame of mind, you might consider using this to ensure Sarbanes-Oxley conformance. Especially if it were scriptable -- have a series of "kill words" that should never appear in any document anywhere in a company (including a specific customer name, or a case number, or whatever). Then have a server loop through the kill list every night and raise a red flag if it finds a document that shouldn't exist. -david From lgonze at panix.com Fri Dec 9 03:34:14 2005 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] P2P in SFC - The Aftermath In-Reply-To: <4398E8AA.2050002@quinthar.com> References: <4398E8AA.2050002@quinthar.com> Message-ID: <4398FB36.4080503@panix.com> And let the record show that the aloha chapter of p2p-hackers attended a talk on the Fortress language at University of Hawaii, followed by pizza (with pineapple, naturally) and a design patterns talk by Sam Joseph. Sushi was not had because we get way too much of it as it is, however at least one member of the chapter did have spam musubi following the get together. From fis at wiwi.hu-berlin.de Fri Dec 9 12:16:26 2005 From: fis at wiwi.hu-berlin.de (Matthias Fischmann) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <43928055.4080306@vscape.com> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> <20051202193833.GD2249@leitl.org> <43928055.4080306@vscape.com> Message-ID: <20051209121626.GB22875@localhost.localdomain> hi, i hope this isn't too rude. 
i am not a big contributer here, so i don't feel i have the right to make any demands. but anyway here you go: i feel those threads on getting together somewhere are only interesting to the small subset of geographically affected subscribers, so i would like to suggest that these are moved to a yahoo group or a different mailing list as soon as a few people get interested. for instance, the SFC group might have reached this momentum. just a suggestion. thanks, matthias -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051209/4dfbeba8/attachment.pgp From stewbagz at gmail.com Fri Dec 9 16:45:45 2005 From: stewbagz at gmail.com (stew "stewbagz" mercer) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <20051209121626.GB22875@localhost.localdomain> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> <20051202193833.GD2249@leitl.org> <43928055.4080306@vscape.com> <20051209121626.GB22875@localhost.localdomain> Message-ID: <3b4626760512090845y40995fc6h@mail.gmail.com> On 09/12/05, Matthias Fischmann wrote: > > [snipped] > > i feel those threads on getting together somewhere are only > interesting to the small subset of geographically affected > subscribers, so i would like to suggest that these are moved to a > yahoo group or a different mailing list as soon as a few people get > interested. for instance, the SFC group might have reached this > momentum. > [snipped] I'm just jealous that they live on the warm, sunny west coast of america, and I'm stuck here in grey, windswept London. Sushi ? That's something Homer ate once wasn't it ? Fugu ? :) Kind regs Stew From jbj at forbidden.co.uk Fri Dec 9 17:19:01 2005 From: jbj at forbidden.co.uk (Jeremy James) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <3b4626760512090845y40995fc6h@mail.gmail.com> References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com> <20051202193833.GD2249@leitl.org> <43928055.4080306@vscape.com> <20051209121626.GB22875@localhost.localdomain> <3b4626760512090845y40995fc6h@mail.gmail.com> Message-ID: <4399BC85.6010001@forbidden.co.uk> stew "stewbagz" mercer wrote: > >> [double snipped] > > [snipped] > > I'm just jealous that they live on the warm, sunny west coast of > america, and I'm stuck here in grey, windswept London. Sushi ? That's > something Homer ate once wasn't it ? Fugu ? > Maybe we should have our own p2p meeting in London to celebrate just how damn blustery it is. And none of this sushi malarky - proper pub meal and pints of ale all round, I think. Best wishes, Jeremy From matthewsp at avaya.com Fri Dec 9 19:12:37 2005 From: matthewsp at avaya.com (Matthews, Philip (Philip)) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Where do bright minds discuss p2p technology? Message-ID: I am also in Ottawa. We have been doing some P2P work here at the former Nimcat Networks (now part of Avaya) and I would be interested in getting together and discussing P2P research. 
- Philip Matthews > -----Original Message----- > From: p2p-hackers-bounces@zgp.org > [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Iles, Michael > Sent: November 29, 2005 06:44 > To: Peer-to-peer development. > Subject: RE: [p2p-hackers] Where do bright minds discuss p2p > technology? > > +1 for an Ottawa meeting, and +1 for sushi and beer :) > > Mike. > > (Ottawa, the land of real winters and overpriced sushi.) > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] > On Behalf Of Roop Mukherjee > Sent: November 28, 2005 5:48 PM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] Where do bright minds discuss p2p > technology? > > Looks like the SFC folks will meet soon. For the rest of us with real > winters;)- any p2p folks in the neighborhood of Ottawa, ON Canada, > interested in having a similar meeting? > > - roop > ______________________________________ > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > This message may contain privileged and/or > confidential information. If you have received this e-mail > in error or are not the intended recipient, you may not use, > copy, disseminate or distribute it; do not open any > attachments, delete it immediately from your system and > notify the sender promptly by e-mail that you have done so. > Thank you. > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From lemonobrien at yahoo.com Sat Dec 10 03:38:27 2005 From: lemonobrien at yahoo.com (Lemon Obrien) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <4399BC85.6010001@forbidden.co.uk> Message-ID: <20051210033827.7573.qmail@web53606.mail.yahoo.com> yeah, the weather here is nic for winter....a little rain... the place we emt was a class act. It had real flair/style... we had 20 people there. we should make it a monthly/quarterly event. lime Jeremy James wrote: stew "stewbagz" mercer wrote: > >> [double snipped] > > [snipped] > > I'm just jealous that they live on the warm, sunny west coast of > america, and I'm stuck here in grey, windswept London. Sushi ? That's > something Homer ate once wasn't it ? Fugu ? > Maybe we should have our own p2p meeting in London to celebrate just how damn blustery it is. And none of this sushi malarky - proper pub meal and pints of ale all round, I think. Best wishes, Jeremy _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences You don't get no juice unless you squeeze Lemon Obrien, the Third. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051209/47dd99ee/attachment.htm From wolfgang.mueller at wiai.uni-bamberg.de Sat Dec 10 15:14:33 2005 From: wolfgang.mueller at wiai.uni-bamberg.de (Wolfgang Mueller) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <20051210033827.7573.qmail@web53606.mail.yahoo.com> References: <4399BC85.6010001@forbidden.co.uk> <20051210033827.7573.qmail@web53606.mail.yahoo.com> Message-ID: <20051210151433.GA26619@portos.uni-bamberg.de> Dear SF peers, Would it be possible to spice these reports up by publishing not only a writeup of the menu but also of your topics? BTW I am munching typically German Christmas cookies here... And don't denigrate places that enjoy a winter recognizable as such :-D . And: imagine a Bavarian beer-to-beer meeting. People here confuse b,p,d and t, anyway, so let's use this ambiguity :-D Cheers, Wolfgang -- Dr. Wolfgang Mueller LS Medieninformatik Universitaet Bamberg From travis at redswoosh.net Sun Dec 11 00:04:40 2005 From: travis at redswoosh.net (Travis Kalanick) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Last call for SuperHappyDevHouse Message-ID: <200512110006.jBB06ZZF011954@be9.noc0.redswoosh.com> Hey all, As I had mentioned to a number of people at the SFC get together this week, a pretty cool hacker event is going on this evening at the superhappydevhouse See - http://superhappydevhouse.com/ It starts tonight at 7pm and goes until 7am. We have a few slots for presenters (around midnight) to show off their P2P apps and warez to around 100 bay area geek-devs Presenters have 10+ minutes to show off their stuff, and ideally have technology that can be used by other people in their own projects. Send me an email if you're interested in stopping by, and/or presenting. Thanks, Travis From travis at redswoosh.net Mon Dec 12 02:34:57 2005 From: travis at redswoosh.net (Travis Kalanick) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <20051210151433.GA26619@portos.uni-bamberg.de> Message-ID: <200512120237.jBC2avZF028114@be9.noc0.redswoosh.com> One of the more interesting topics that came up from our sushi-get-together was a fairly rigorous discussion about the merits (or lack thereof) of Proactive Caching. Let's define Proactive Caching as a mechanism where a P2P network sends content to a user's machine for the sole purpose of improving network performance and availability. For instance, imagine that a given network proactively caches "long tail" content to improve availability, or alternatively, proactively caches content during a sudden surge of demand for a particular file. Deep in this discussion at the dinner (this took place in the hours after most folks left) was Sergei, David Barrett, myself, and others, and I thought it would be good to bring this topic to the list. So far, in my opinion, proactive caching on open p2p networks would provide little temporal benefit in availability and performance, given the inherent costs of such a scheme, and given the availability of high-performance, high-reliability p2p architectures. Travis -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Wolfgang Mueller Sent: Saturday, December 10, 2005 7:15 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] p2p in some place or other Dear SF peers, Would it be possible to spice these reports up by publishing not only a writeup of the menu but also of your topics? 
BTW I am munching typically German Christmas cookies here... And don't denigrate places that enjoy a winter recognizable as such :-D . And: imagine a Bavarian beer-to-beer meeting. People here confuse b,p,d and t, anyway, so let's use this ambiguity :-D Cheers, Wolfgang -- Dr. Wolfgang Mueller LS Medieninformatik Universitaet Bamberg _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From osokin at osokin.com Mon Dec 12 03:17:22 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <200512120237.jBC2avZF028114@be9.noc0.redswoosh.com> Message-ID: On Sunday, December 11, 2005 Travis Kalanick wrote: > So far, in my opinion, proactive caching on open p2p networks would > provide little temporal benefit in availability and performance, > given the inherent costs of such a scheme, and given the availability > of high-performance, high-reliability p2p architectures. ...and I was on the other side of this debate. My point was that most of the content in open P2P networks is trapped in the long tail; see, for example: [1] http://p2pecon.berkeley.edu/pub/CWC-EC05.pdf and [2] http://www.mpi-sws.mpg.de/~gummadi/papers/p118-gummadi.pdf - the number of copies of the average file is truly pathetic. As a result, the average download experience is slow and unreliable. For example, the study [2] suggests that in 2002 as many as two thirds of "transactions" (HTTP requests for a single data chunk) used to fail in Kazaa. [Presumably due to the source host overload - Oso] This situation is widely replicated all over P2P space, being observed in some form in Kazaa, Gnutella, eDonkey, etc. The overload of the uploaders was discussed in Gnutella for almost as long as I can remember. The suggested countermeasures include download queues, try-later responses and such, but they all have one thing in common: they suck. The user experience stays pathetic, as can be attested by anyone trying to hunt down something other than a popular file. And the reason for this is quite understandable - if most of the content exists in just one or two copies, what good are the swarm downloaders and other marvelous instruments of progress? This single copy that you need might be on a single host behind the modem in Albania, the host might go off-line at any moment, and to make it more fun, it might be trying to upload five other files (different files, mind you) to five other people at the same time. What is important here is that the de facto statistical distribution of content on the open P2P nets shown in [1] and [2] tells us that this is happening not just for something that we tend to dismiss as "rare files". No - these files represent a huge share of the network content, and as much as 50-70% of the user download attempts are for these "rare files". And all this experience sucks, no matter what fancy download queues are introduced into the system. Simply because there's not enough uplink bandwidth and content sources. So I'm not saying that proactive caching is easy to implement, does not require any resources, or even that it will work well in practice.
I'm just saying that without it the user experience will continue to suck, and proactive caching is the single mechanism that I can see which is potentially able to fix this situation. Unless this rare file is proactively replicated to 5-10 other nodes, I do not see how the adequate download speeds can be achieved. Best wishes - S.Osokine. 11 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Travis Kalanick Sent: Sunday, December 11, 2005 6:35 PM To: 'Peer-to-peer development.' Subject: RE: [p2p-hackers] p2p in some place or other One of the more interesting topics that came up from our sushi-get-together was a fairly rigorous discussion about the merits (or lack thereof) of Proactive Caching. Let's define Proactive Caching as a mechanism where a P2P network sends content to a user's machine for the sole purpose of improving network performance and availability. For instance, imagine that a given network proactively caches "long tail" content to improve availability, or alternatively, proactively caches content during a sudden surge of demand for a particular file. Deep in this discussion at the dinner (this took place in the hours after most folks left) was Sergei, David Barrett, myself, and others, and I thought it would be good to bring this topic to the list. So far, in my opinion, proactive caching on open p2p networks would provide little temporal benefit in availability and performance, given the inherent costs of such a scheme, and given the availability of high-performance, high-reliability p2p architectures. Travis -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Wolfgang Mueller Sent: Saturday, December 10, 2005 7:15 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] p2p in some place or other Dear SF peers, Would it be possible to spice these reports up by publishing not only a writeup of the menu but also of your topics? BTW I am munching typically German Christmas cookies here... And don't denigrate places that enjoy a winter recognizable as such :-D . And: imagine a Bavarian beer-to-beer meeting. People here confuse b,p,d and t, anyway, so let's use this ambiguity :-D Cheers, Wolfgang -- Dr. Wolfgang Mueller LS Medieninformatik Universitaet Bamberg _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From dbarrett at quinthar.com Mon Dec 12 03:47:48 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: Message-ID: <439CF2E4.40808@quinthar.com> Serguei Osokine wrote: > On Sunday, December 11, 2005 Travis Kalanick wrote: > >>So far, in my opinion, proactive caching on open p2p networks would >>provide little temporal benefit in availability and performance, >>given the inherent costs of such a scheme, and given the availability >>of high-performance, high-reliability p2p architectures. 
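For concreteness, here's a throwaway sketch of that three-node setup (the unit rate and the even split of upload capacity across concurrent transfers are my own assumptions, not part of the scenario above); it just reports when, if ever, the downloader ends up with the whole file under each option:

    # Toy model of the three-node network above: one uploader with one file,
    # one downloader that wants it, one bystander that never will.
    # Assumed: unit transfer rate; upload capacity split evenly across
    # concurrent transfers.
    FILE_SIZE = 1.0
    UPLOAD_RATE = 1.0

    def downloader_finish_time(recipients):
        """When the downloader has the whole file, or None if it never does."""
        if "downloader" not in recipients:
            return None
        return FILE_SIZE / (UPLOAD_RATE / len(recipients))

    options = {
        "a) downloader only": ["downloader"],
        "b) bystander only": ["bystander"],
        "c) both": ["downloader", "bystander"],
    }
    for name, recipients in options.items():
        print(name, "->", downloader_finish_time(recipients))
    # a) -> 1.0, b) -> None, c) -> 2.0: with no future requests possible, the
    # extra copy only delays the one node that actually wants the file.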
> > ...and I was on the other side of this debate. ... and I'm somewhere in between. I suspect proactive caching is useful for some configurations of files, uploaders, and would-be-downloaders, but I'm not sure if that configuration exists in the real world to such a degree that it's worth worrying about. Furthermore, I suspect the "real world" in all its glory is far too complicated to get agreement upon quickly, so I think we should first start with simplified worlds and then work up to the real one. For starters, assume "the network" consists of: 1) A single "uploader" with exactly one file 2) A "downloader" that wants the file (but doesn't have it) 3) An "innocent bystander" that neither has nor wants the file (Further assume that there will never be any more files, more nodes, and the innocent bystander will never want the file. Also, assume all three nodes have identical, equal upload/download speeds, unlimited storage, and have been and will be online for eternity.) Thus the uploader can either choose to: a) Send the file only to the downloader b) Only to the innocent bystander c) To both I'd define "proactive caching" as options (b) and (c). And in this specific configuration, I don't see it as useful. I'll define "success" as: - Transfers the maximum number of files to those who want them - In the shortest possible time (Note, I'm explicitly not valuing conservation of bandwidth or storage in order to simplify the case for proactive caching.) Thus for this absolute most basic network, I'd say option (a) is clearly the right choice. Can we agree on that much? If so, what is the *smallest* way this network must change in order for proactive caching to begin offering value? -david From mgp at ucla.edu Mon Dec 12 04:23:26 2005 From: mgp at ucla.edu (Michael Parker) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <200512120237.jBC2avZF028114@be9.noc0.redswoosh.com> References: <200512120237.jBC2avZF028114@be9.noc0.redswoosh.com> Message-ID: <20051211202326.kxu6sg8tw88ss040@mail.ucla.edu> Hmmm, Emin Sirer was on this list talking about Beehive [1] not too long ago. I read the paper, from NSDI 04, and it seems to fit your description very well. They use an analytical model to derive how hard to push replication of a key-value pair such that it can be found in a (configurable) constant number of hops. It assumes that the data set has a zipf-like distribution. You could also try to borrow ideas from something like Glacier [2], from NSDI 05, which replicates to maintain high fault-tolerance, and try to exploit the replication for performance gains instead. (From what I can remember, data-survivability is the first priority of Glacier, not performance... As implied by the name.) Just my two cents. - Mike [1] http://www.cs.cornell.edu/People/egs/papers/beehive.pdf [2] http://www.cs.rice.edu/~druschel/publications/Glacier-NSDI.pdf Quoting Travis Kalanick : > One of the more interesting topics that came up from our sushi-get-together > was a fairly rigorous discussion about the merits (or lack thereof) of > Proactive Caching. > > Let's define Proactive Caching as a mechanism where a P2P network sends > content to a user's machine for the sole purpose of improving network > performance and availability. For instance, imagine that a given network > proactively caches "long tail" content to improve availability, or > alternatively, proactively caches content during a sudden surge of demand > for a particular file. 
> > Deep in this discussion at the dinner (this took place in the hours after > most folks left) was Sergei, David Barrett, myself, and others, and I > thought it would be good to bring this topic to the list. > > So far, in my opinion, proactive caching on open p2p networks would provide > little temporal benefit in availability and performance, given the inherent > costs of such a scheme, and given the availability of high-performance, > high-reliability p2p architectures. > > Travis > From osokin at osokin.com Mon Dec 12 06:06:49 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <439CF2E4.40808@quinthar.com> Message-ID: On Sunday, December 11, 2005 David Barrett wrote: > Thus the uploader can either choose to: > a) Send the file only to the downloader > b) Only to the innocent bystander > c) To both Add to this: d) send the file to bystander in advance, even before the downloader asks for it - and then you'll cover pretty much every possible case of proactive caching, because e) send the file to both bystander and downloader in advance is essentially just an extreme case of "d" :-) Note that Dijjer, for example, seems to use more or less "c" - but I'm not sure if its usage model can be squeezed to fit into your simplified scenario. In real life there might be subsequent requests for the cached file, but since you're saying that the bystander won't ever need the file, there's no one to generate such requests in your model. So even if Dijjer benefits from its caching, your model will miss it. Best wishes - S.Osokine. 11 Dec 2005. -----Original Message----- From: David Barrett [mailto:dbarrett@quinthar.com] Sent: Sunday, December 11, 2005 7:48 PM To: osokin@osokin.com; Peer-to-peer development. Subject: Re: [p2p-hackers] p2p in some place or other Serguei Osokine wrote: > On Sunday, December 11, 2005 Travis Kalanick wrote: > >>So far, in my opinion, proactive caching on open p2p networks would >>provide little temporal benefit in availability and performance, >>given the inherent costs of such a scheme, and given the availability >>of high-performance, high-reliability p2p architectures. > > ...and I was on the other side of this debate. ... and I'm somewhere in between. I suspect proactive caching is useful for some configurations of files, uploaders, and would-be-downloaders, but I'm not sure if that configuration exists in the real world to such a degree that it's worth worrying about. Furthermore, I suspect the "real world" in all its glory is far too complicated to get agreement upon quickly, so I think we should first start with simplified worlds and then work up to the real one. For starters, assume "the network" consists of: 1) A single "uploader" with exactly one file 2) A "downloader" that wants the file (but doesn't have it) 3) An "innocent bystander" that neither has nor wants the file (Further assume that there will never be any more files, more nodes, and the innocent bystander will never want the file. Also, assume all three nodes have identical, equal upload/download speeds, unlimited storage, and have been and will be online for eternity.) Thus the uploader can either choose to: a) Send the file only to the downloader b) Only to the innocent bystander c) To both I'd define "proactive caching" as options (b) and (c). And in this specific configuration, I don't see it as useful. 
I'll define "success" as: - Transfers the maximum number of files to those who want them - In the shortest possible time (Note, I'm explicitly not valuing conservation of bandwidth or storage in order to simplify the case for proactive caching.) Thus for this absolute most basic network, I'd say option (a) is clearly the right choice. Can we agree on that much? If so, what is the *smallest* way this network must change in order for proactive caching to begin offering value? -david From osokin at osokin.com Mon Dec 12 06:19:40 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <20051211202326.kxu6sg8tw88ss040@mail.ucla.edu> Message-ID: On Sunday, December 11, 2005 Michael Parker wrote: > They use an analytical model to derive how hard to push replication > of a key-value pair such that it can be found in a (configurable) > constant number of hops. Right. But Travis is not convinced that it has to be done in the first place, so I would imagine that the optimal nature of such replication would be of secondary importance to him :-) > It assumes that the data set has a zipf-like distribution. This makes me a bit uneasy about this model, by the way. Even if the content distribution in P2P nets would be Zipf (and it isn't), still I would be reluctant to implement anything that rigidly relies on any predetermined distribution. Real functioning systems tend to have some sort of a feedback loop and adapt to the changing situation. In this case, it should cache the proper amount of content regardless of how exactly it is distributed. But I just scanned the paper. Maybe I missed the adaptive part. Best wishes - S.Osokine. 11 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Michael Parker Sent: Sunday, December 11, 2005 8:23 PM To: Peer-to-peer development.; Travis Kalanick Cc: 'Peer-to-peer development.' Subject: RE: [p2p-hackers] p2p in some place or other Hmmm, Emin Sirer was on this list talking about Beehive [1] not too long ago. I read the paper, from NSDI 04, and it seems to fit your description very well. They use an analytical model to derive how hard to push replication of a key-value pair such that it can be found in a (configurable) constant number of hops. It assumes that the data set has a zipf-like distribution. You could also try to borrow ideas from something like Glacier [2], from NSDI 05, which replicates to maintain high fault-tolerance, and try to exploit the replication for performance gains instead. (From what I can remember, data-survivability is the first priority of Glacier, not performance... As implied by the name.) Just my two cents. - Mike [1] http://www.cs.cornell.edu/People/egs/papers/beehive.pdf [2] http://www.cs.rice.edu/~druschel/publications/Glacier-NSDI.pdf Quoting Travis Kalanick : > One of the more interesting topics that came up from our sushi-get-together > was a fairly rigorous discussion about the merits (or lack thereof) of > Proactive Caching. > > Let's define Proactive Caching as a mechanism where a P2P network sends > content to a user's machine for the sole purpose of improving network > performance and availability. For instance, imagine that a given network > proactively caches "long tail" content to improve availability, or > alternatively, proactively caches content during a sudden surge of demand > for a particular file. 
> > Deep in this discussion at the dinner (this took place in the hours after > most folks left) was Sergei, David Barrett, myself, and others, and I > thought it would be good to bring this topic to the list. > > So far, in my opinion, proactive caching on open p2p networks would provide > little temporal benefit in availability and performance, given the inherent > costs of such a scheme, and given the availability of high-performance, > high-reliability p2p architectures. > > Travis > _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From dbarrett at quinthar.com Mon Dec 12 07:00:54 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: Message-ID: <439D2026.6090409@quinthar.com> Serguei Osokine wrote: > On Sunday, December 11, 2005 David Barrett wrote: > >>Thus the uploader can either choose to: >>a) Send the file only to the downloader >>b) Only to the innocent bystander >>c) To both > > Add to this: > > d) send the file to bystander in advance, even before the downloader > asks for it Ah, ok, so it sounds like your saying that proactive caching becomes valuable when the bystander gets the file before the downloader makes its request. (This seems obvious in retrospect, but I was thinking along the lines of "just in time" proactive caching -- sending the file to more than just who requested it to somehow improve the experience for the requester. I couldn't see any way to make this work, though I'd be happy to be wrong.) So really, it sounds like proactive caching sets a "minimum replication" target for every file with one or more requests. Thus the policy is to keep making copies until that target is achieved. If anyone goes offline (and thus reduces a file's cache count below the minimum threshold), a new cache is made. How much of this is motivated by a desire for "high availability" versus "high performance"? In other words, if you had a "guaranteed seeder" for the file that you knew would never go offline, would proactive caching still be worth the trouble? Again, my initial thought around proactive caching was to use it to improve download performance by making the long tail download as fast as a well-deployed file. (And I still don't see how to make that happen without simply making every file "well deployed" through massive proactive caching.) But I can see how very limited proactive caching can be used to improve availability, and thus ensure the interesting subset of the long tail be downloaded *at all*. -david From gbildson at limepeer.com Mon Dec 12 17:16:53 2005 From: gbildson at limepeer.com (Greg Bildson) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: Message-ID: There are an infinite number of rare files so caching those without future knowledge about anyone's interest would be costly and infeasible. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Serguei Osokine > Sent: Sunday, December 11, 2005 10:17 PM > To: Peer-to-peer development. 
> Subject: RE: [p2p-hackers] p2p in some place or other > > > On Sunday, December 11, 2005 Travis Kalanick wrote: > > So far, in my opinion, proactive caching on open p2p networks would > > provide little temporal benefit in availability and performance, > > given the inherent costs of such a scheme, and given the availability > > of high-performance, high-reliability p2p architectures. > > ...and I was on the other side of this debate. > > My point was that most of content in open P2P networks is > trapped in the long tail; see, for example: > > [1] http://p2pecon.berkeley.edu/pub/CWC-EC05.pdf > > and > > [2] http://www.mpi-sws.mpg.de/~gummadi/papers/p118-gummadi.pdf > > - the number of copies of the average file is truly pathetic. As a > result, the average download experience is slow and unreliable. For > example, the study [2] suggests than in 2002 as many as two thirds of > "transactions" (HTTP requests for a single data chunk) used to fail > in Kazaa. [Presumably due to the source host overload - Oso] > From matthew at matthew.at Mon Dec 12 17:31:01 2005 From: matthew at matthew.at (Matthew Kaufman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: Message-ID: <200512121731.jBCHVEU88770@where.matthew.at> Greg Bildson: > There are an infinite number of rare files so caching those > without future knowledge about anyone's interest would be > costly and infeasible. On any given actual file sharing network, I believe that's not actually true. In fact, it "probably" isn't even true for the known universe of computers :) The number *can be* very large however, so as has been pointed out before, how much sense this makes really depends upon the total number of files and their distribution. Consider, for instance, the cost of having every file stored on exactly ONE more node than actually cares about the file at present. If almost all files are, as we fear, on a very long tail of height one, then this approximately doubles the storage requirements network-wide (and communiaction required for that replication scales equivalently). However, that is the worst-case... Real distributions may not look like this at all, especially for things like commercial file distribution networks or corporate intranet applications. The real trick for the general case is probably selling users on the idea that the overhead they experience (disk space, bandwidth requirements, number of times their house is raided in a search for illicit bits) is worth the performance gains (if any) that they see. Matthew Kaufman matthew@matthew.at www.amicima.com From gbildson at limepeer.com Mon Dec 12 17:47:47 2005 From: gbildson at limepeer.com (Greg Bildson) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <200512121731.jBCHVEU88770@where.matthew.at> Message-ID: Given the number of people sharing unique personal files (photos, etc), partial downloads and accidentally sharing entire drives, it's closer to reality then you may believe. I don't really mean infinite of course but large. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Matthew Kaufman > Sent: Monday, December 12, 2005 12:31 PM > To: 'Peer-to-peer development.' > Subject: RE: [p2p-hackers] p2p in some place or other > > > Greg Bildson: > > There are an infinite number of rare files so caching those > > without future knowledge about anyone's interest would be > > costly and infeasible. 
> > On any given actual file sharing network, I believe that's not actually > true. In fact, it "probably" isn't even true for the known universe of > computers :) > From alenlpeacock at gmail.com Mon Dec 12 17:49:41 2005 From: alenlpeacock at gmail.com (Alen Peacock) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: Message-ID: On 12/12/05, Greg Bildson wrote: > There are an infinite number of rare files so caching those without future > knowledge about anyone's interest would be costly and infeasible. I'd add: what is the self-interested motivation for a node to agree to cache the content in the first place? If proactive caching were turned on by default in my p2p filesharing client, don't I have a very real incentive to turn this off in my own node to preserve bandwidth, disk space, and perhaps limit any legal liability? If the implemented client doesn't have the option to turn this feature off, isn't there a very real incentive to use a different client to get better performance / less risk? The beauty of "a) Send the file only to the downloader" is that self-interest is leveraged to get the downloader to share. Is there some incentive or mechanism to enforce fairness with proactive caching? If there is, then it seems like you've still got to overcome Greg's argument against, which is similar to many of the arguments made against pre-fetching in traditional caching literature: how do you ensure that you prefetch the right content, especially when the cost of prefetching the wrong content is very high? Alen (Hoping I'm not re-hashing conversations you had over sushi -- posting anyway). From Serguei.Osokine at efi.com Mon Dec 12 18:01:05 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42792@fcexmb04.efi.internal> On Sunday, December 11, 2005 David Barrett wrote: > How much of this is motivated by a desire for "high availability" > versus "high performance"? Not sure how to separate those. They are closely related. For example, in Gnutella the average host session time is about an hour and a half. So if you try to get something from this host, and it will give you only 1 KB/s (a frequent occurence, because content tends to be concentrated on relatively few nodes), then you'll be able to receive only about 5 MB during the average host session. In fact, the average transferred volume before this host goes off-line will be half of that, or about 2.5 MB. Not enough to get even a single song. So availability and performance are closely related. > In other words, if you had a "guaranteed seeder" for the file that > you knew would never go offline, would proactive caching still be > worth the trouble? That does solve the availability problem; not sure about the performance one. Personally, I wouldn't like to download movies at 1 KB/s and wait several days for the download to finish. But people use eDonkey all the time - and its speed is not much better. So the answer depends on whether you want to make your users happy or you're fine with someone else doing that, I presume :-) Best wishes - S.Osokine. 12 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of David Barrett Sent: Sunday, December 11, 2005 11:01 PM To: Peer-to-peer development. 
Subject: Re: [p2p-hackers] p2p in some place or other Serguei Osokine wrote: > On Sunday, December 11, 2005 David Barrett wrote: > >>Thus the uploader can either choose to: >>a) Send the file only to the downloader >>b) Only to the innocent bystander >>c) To both > > Add to this: > > d) send the file to bystander in advance, even before the downloader > asks for it Ah, ok, so it sounds like your saying that proactive caching becomes valuable when the bystander gets the file before the downloader makes its request. (This seems obvious in retrospect, but I was thinking along the lines of "just in time" proactive caching -- sending the file to more than just who requested it to somehow improve the experience for the requester. I couldn't see any way to make this work, though I'd be happy to be wrong.) So really, it sounds like proactive caching sets a "minimum replication" target for every file with one or more requests. Thus the policy is to keep making copies until that target is achieved. If anyone goes offline (and thus reduces a file's cache count below the minimum threshold), a new cache is made. How much of this is motivated by a desire for "high availability" versus "high performance"? In other words, if you had a "guaranteed seeder" for the file that you knew would never go offline, would proactive caching still be worth the trouble? Again, my initial thought around proactive caching was to use it to improve download performance by making the long tail download as fast as a well-deployed file. (And I still don't see how to make that happen without simply making every file "well deployed" through massive proactive caching.) But I can see how very limited proactive caching can be used to improve availability, and thus ensure the interesting subset of the long tail be downloaded *at all*. -david _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From matthew at matthew.at Mon Dec 12 18:13:49 2005 From: matthew at matthew.at (Matthew Kaufman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: Message-ID: <200512121814.jBCIE1U88941@where.matthew.at> Alen Peacock: > > I'd add: what is the self-interested motivation for a node > to agree to cache the content in the first place? This could be some external motivation like "I want anonymously-posted files about certain political views to be available for all to see" or "my corporate IT department says that we have to use this distributed collaboration tool" > If proactive caching were turned on by default in my p2p > filesharing client, don't I have a very real incentive to > turn this off in my own node to preserve bandwidth, disk > space, and perhaps limit any legal liability? In the general "filesharing" case? Absolutely. But that's not the only use for P2P technology or even P2P file transfer. > ...which is similar to many of the arguments made against > pre-fetching in traditional caching literature: how do you > ensure that you prefetch the right content, especially when > the cost of prefetching the wrong content is very high? Actually, if you're replicating content to other nodes in order to ensure availability or create more downloadable nodes in order to speed future downloaders, it is more like the RAID arguments than the cache arguments. 
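To put a rough number on that RAID-style view (illustrative only; the per-node availability below is an assumption, not a measurement): a file with k independently cached copies, each on a node that is online with probability p, is reachable with probability 1-(1-p)^k.

    # Rough reachability of a file given k independent replicas, each on a
    # node that is online with probability p (numbers are illustrative only).
    def reachable(p, k):
        return 1.0 - (1.0 - p) ** k

    for k in (1, 2, 4, 8):
        print(k, "copies ->", round(reachable(0.3, k), 3))
    # 1 -> 0.3, 2 -> 0.51, 4 -> 0.76, 8 -> 0.942: each extra copy buys
    # availability much the way an extra disk buys redundancy in RAID.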
The real question is, IF you had a high-availability file sharing system, what files would you want to make available on it? (The answer is probably *not* the long tail of all files ever seen on generic file sharing services) Matthew Kaufman matthew@matthew.at www.amicima.com From Serguei.Osokine at efi.com Mon Dec 12 18:16:58 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42793@fcexmb04.efi.internal> On Monday, December 12, 2005 Greg Bildson wrote: > Given the number of people sharing unique personal files (photos, > etc), partial downloads and accidentally sharing entire drives, > it's closer to reality then you may believe. These things are just shared. Gummadi's research talks about the things that were actually downloaded. So no, the accidentally shared entire drives are not the reason why the average number of copies per unique title is not much higher than one. Best wishes - S.Osokine. 12 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Greg Bildson Sent: Monday, December 12, 2005 9:48 AM To: Peer-to-peer development. Subject: RE: [p2p-hackers] p2p in some place or other Given the number of people sharing unique personal files (photos, etc), partial downloads and accidentally sharing entire drives, it's closer to reality then you may believe. I don't really mean infinite of course but large. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Matthew Kaufman > Sent: Monday, December 12, 2005 12:31 PM > To: 'Peer-to-peer development.' > Subject: RE: [p2p-hackers] p2p in some place or other > > > Greg Bildson: > > There are an infinite number of rare files so caching those > > without future knowledge about anyone's interest would be > > costly and infeasible. > > On any given actual file sharing network, I believe that's not actually > true. In fact, it "probably" isn't even true for the known universe of > computers :) > _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From nazareno at dsc.ufcg.edu.br Mon Dec 12 18:21:44 2005 From: nazareno at dsc.ufcg.edu.br (Nazareno Andrade) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <200512121814.jBCIE1U88941@where.matthew.at> References: <200512121814.jBCIE1U88941@where.matthew.at> Message-ID: <439DBFB8.2060901@dsc.ufcg.edu.br> Hi there. A nice paper which you may find useful in this thread: High Availability, Scalable Storage, Dynamic Peer Networks: Pick Two (HotOS XI) Peer-to-peer storage aims to build large-scale, reliable and available storage from many small-scale unreliable, low-availability distributed hosts. Data redundancy is the key to any data guarantees. However, preserving redundancy in the face of highly dynamic membership is costly. We use a simple resource usage model to measured behavior from the Gnutella file-sharing network to argue that large-scale cooperative storage is limited by likely dynamics and cross-system bandwidth - not by local disk space. 
We examine some bandwidth optimization strategies like delayed response to failures, admission control, and load-shifting and find that they do not alter the basic problem. We conclude that when redundancy, data scale, and dynamics are all high, the needed cross-system bandwidth is unreasonable. http://pmg.csail.mit.edu/~rodrigo/p2p-scl.pdf regards, Nazareno Matthew Kaufman wrote: > Alen Peacock: > >> I'd add: what is the self-interested motivation for a node >>to agree to cache the content in the first place? > > > This could be some external motivation like "I want anonymously-posted files > about certain political views to be available for all to see" or "my > corporate IT department says that we have to use this distributed > collaboration tool" > > >>If proactive caching were turned on by default in my p2p >>filesharing client, don't I have a very real incentive to >>turn this off in my own node to preserve bandwidth, disk >>space, and perhaps limit any legal liability? > > > In the general "filesharing" case? Absolutely. But that's not the only use > for P2P technology or even P2P file transfer. > > >>...which is similar to many of the arguments made against >>pre-fetching in traditional caching literature: how do you >>ensure that you prefetch the right content, especially when >>the cost of prefetching the wrong content is very high? > > > Actually, if you're replicating content to other nodes in order to ensure > availability or create more downloadable nodes in order to speed future > downloaders, it is more like the RAID arguments than the cache arguments. > > The real question is, IF you had a high-availability file sharing system, > what files would you want to make available on it? (The answer is probably > *not* the long tail of all files ever seen on generic file sharing services) > > Matthew Kaufman > matthew@matthew.at > www.amicima.com > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > -- Nazareno. ======================================== Nazareno Andrade LSD - DSC/UFCG Campina Grande - Brazil http://lsd.dsc.ufcg.edu.br/~nazareno/ OurGrid project http://www.ourgrid.org ======================================== From coderman at gmail.com Mon Dec 12 18:34:22 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: <20051211202326.kxu6sg8tw88ss040@mail.ucla.edu> Message-ID: <4ef5fec60512121034n5c5c8aedpa15d1eb9077fc74d@mail.gmail.com> On 12/11/05, Serguei Osokine wrote: > > It assumes that the data set has a zipf-like distribution. > > This makes me a bit uneasy about this model, by the way. Even > if the content distribution in P2P nets would be Zipf (and it isn't), > still I would be reluctant to implement anything that rigidly relies > on any predetermined distribution. Real functioning systems tend to > have some sort of a feedback loop and adapt to the changing situation. > In this case, it should cache the proper amount of content regardless > of how exactly it is distributed. 
i mentioned feedbackfs a few threads earlier and this was exactly what it intended to do: observe relevance and utility directly and implicitly (through filesystem interaction) so that recommendation (search) and caching (distribution) could be optimized. i think a poor / unintelligent caching mechanism would actually create more problems as it would be vulnerable to abuse - how do you prevent malicious or irrelevant peers from filling your cache with crap and consuming even more of your limited bandwidth to useless ends. reputation / trust needs to be addressed as well; the question about motivation for caching is a good one. that said, it seems clear to me that caching is a big win for performance. if you look at various papers and experiments using this technique everything from search to distribution is greatly enhanced with a well designed caching mechanism. akamai still seems to be doing well. ;) so perhaps for caching to work well you need a few prerequisites: - reputation / trust between peers to prevent abuse - feedback loop to ensure relevance / utility of cached content i haven't seen any p2p network/app which tries to address both points. does anyone know of such a beast? (i suppose mnet would fit somewhat. the agorics model prevents abuse of resources but i'm not sure how the feedback loop is applied) From Serguei.Osokine at efi.com Mon Dec 12 19:24:33 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42794@fcexmb04.efi.internal> On Monday, December 12, 2005 Nazareno Andrade wrote: > A nice paper which you may find useful in this thread: > > High Availability, Scalable Storage, Dynamic Peer Networks: Pick > Two Yes, it is an interesting approach - thank you! However, I'm not sure if their results directly apply to P2P nets. They are talking about six nines and replication factor of 20 to 80. They would likely commit suicide if they would try to actually use Gnutella for rare content. Any improvement would be nice - and forget about six nines. Also, despite introducing an interesting approach, this article results are very hard to verify and to reproduce, which is absolutely necessary if one would want to repeat their calculations with some different assumptions about the system requirements. For example, much of their conclusions are based on the Gnutella trace from April of 2003. Back then Gnutella was more than an order of magnitude smaller, and it would be interesting to repeat the calculations for today's situation. But the properties of this trace are not explicitly listed anywhere, being hidden in multiple charts and obscure statements like "only 5,000 of the 33,000 Gnutella hosts were usually available" (This, by the way, is a total mystery to me, since in April of 2003 Slyck's stats archive lists Gnutella at about 90,000 simultaneous nodes, so I have no idea where these 5,000 or 33,000 came from and what their meaning might have been.) To put it shortly, they have an interesting methodology, but I do not trust any one of their conclusions, as far as the caching in P2P file-sharing network is concerned. All their reasonings should be repeated for the reliable network statistical data, and with the set of requirements that reflects the needs of P2P users, not the need for a six nines-reliable data storage. I suspect that then the conclusions might prove to be a bit different. Best wishes - S.Osokine. 12 Dec 2005. 
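P.S. For a sense of scale on the six nines / replication factor 20-80 point, here is a quick back-of-the-envelope calculation (the 30% host availability below is my own assumed number for a flaky open network, not a figure from the paper or from any trace):

    import math

    # Smallest replication factor k with 1-(1-p)^k >= target, assuming
    # independent replicas on hosts that are each online with probability p.
    def replicas_needed(p, target):
        return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))

    p = 0.3  # assumed host availability
    for target in (0.9, 0.99, 0.999999):
        print(target, "->", replicas_needed(p, target))
    # 0.9 -> 7, 0.99 -> 13, 0.999999 -> ~39: chasing six nines on
    # low-availability hosts is what drives the replication factor into the
    # tens, while a merely decent target for file-sharing needs far fewer
    # copies.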
-----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Nazareno Andrade Sent: Monday, December 12, 2005 10:22 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] p2p in some place or other Hi there. A nice paper which you may find useful in this thread: High Availability, Scalable Storage, Dynamic Peer Networks: Pick Two (HotOS XI) Peer-to-peer storage aims to build large-scale, reliable and available storage from many small-scale unreliable, low-availability distributed hosts. Data redundancy is the key to any data guarantees. However, preserving redundancy in the face of highly dynamic membership is costly. We use a simple resource usage model to measured behavior from the Gnutella file-sharing network to argue that large-scale cooperative storage is limited by likely dynamics and cross-system bandwidth - not by local disk space. We examine some bandwidth optimization strategies like delayed response to failures, admission control, and load-shifting and find that they do not alter the basic problem. We conclude that when redundancy, data scale, and dynamics are all high, the needed cross-system bandwidth is unreasonable. http://pmg.csail.mit.edu/~rodrigo/p2p-scl.pdf regards, Nazareno Matthew Kaufman wrote: > Alen Peacock: > >> I'd add: what is the self-interested motivation for a node >>to agree to cache the content in the first place? > > > This could be some external motivation like "I want anonymously-posted files > about certain political views to be available for all to see" or "my > corporate IT department says that we have to use this distributed > collaboration tool" > > >>If proactive caching were turned on by default in my p2p >>filesharing client, don't I have a very real incentive to >>turn this off in my own node to preserve bandwidth, disk >>space, and perhaps limit any legal liability? > > > In the general "filesharing" case? Absolutely. But that's not the only use > for P2P technology or even P2P file transfer. > > >>...which is similar to many of the arguments made against >>pre-fetching in traditional caching literature: how do you >>ensure that you prefetch the right content, especially when >>the cost of prefetching the wrong content is very high? > > > Actually, if you're replicating content to other nodes in order to ensure > availability or create more downloadable nodes in order to speed future > downloaders, it is more like the RAID arguments than the cache arguments. > > The real question is, IF you had a high-availability file sharing system, > what files would you want to make available on it? (The answer is probably > *not* the long tail of all files ever seen on generic file sharing services) > > Matthew Kaufman > matthew@matthew.at > www.amicima.com > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > -- Nazareno. 
======================================== Nazareno Andrade LSD - DSC/UFCG Campina Grande - Brazil http://lsd.dsc.ufcg.edu.br/~nazareno/ OurGrid project http://www.ourgrid.org ======================================== _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From alenlpeacock at gmail.com Mon Dec 12 19:35:37 2005 From: alenlpeacock at gmail.com (Alen Peacock) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <200512121814.jBCIE1U88941@where.matthew.at> References: <200512121814.jBCIE1U88941@where.matthew.at> Message-ID: On 12/12/05, Matthew Kaufman wrote: > > This could be some external motivation like "I want anonymously-posted files > about certain political views to be available for all to see" or "my > corporate IT department says that we have to use this distributed > collaboration tool" External motivation is good, but is it sufficient to provide some sort of equilibria? If not, it's just the prisoner's dilemma; the vast majority of nodes disable caching because it is locally optimal, regardless of the fact that this produces a globally non-optimal solution. In fact, it might be even worse: the local cache could be exploited by malicious nodes to store data to the network. For example, instead of sharing my files from my own box, I just push them all out to the cache and stop local sharing altogether. > > If proactive caching were turned on by default in my p2p > > filesharing client, don't I have a very real incentive to > > turn this off in my own node to preserve bandwidth, disk > > space, and perhaps limit any legal liability? > > In the general "filesharing" case? Absolutely. But that's not the only use > for P2P technology or even P2P file transfer. Ah, but it doesn't matter if it is filesharing or not -- if the system can arbitrarily push data to my cache, my [bandwidth|disk|legal] resources are being consumed, regardless of whether the application layer is doing filesharing, chat, video, email, etc. And if I can prevent access to these extra resources, or if I can download an alternate client which promises better local performance and less legal liability, why wouldn't I? I'll admit that maybe I'm just obsessing over this point for purely academic reasons; maybe the majority of users simply accept the system defaults and innocently engage in altruistic behavior that ends up optimizing global performance. Maybe they all just turn their caches on because it is 'the right thing to do.' Maybe no one writes malicious software that takes advantage of [for example] a proactive cache. Maybe we shouldn't worry about it at all. But, isn't it more interesting to think about building systems that have some fairness guarantees than building ones that don't? Building a proactive cache that isn't susceptible to these abuses might require a trust/reputation sytem, which in turn requires a strong identity system, etc. -- but isn't that where the real fun is anyway? :) Alen From coderman at gmail.com Mon Dec 12 19:42:29 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Peer Identity Management [was: p2p in some place or other] Message-ID: <4ef5fec60512121142h1cef2711mb1b90765e31c8ed3@mail.gmail.com> On 12/12/05, Alen Peacock wrote: > ... 
> But, isn't it more interesting to think about building systems that > have some fairness guarantees than building ones that don't? Building > a proactive cache that isn't susceptible to these abuses might require > a trust/reputation sytem, which in turn requires a strong identity > system, etc. -- but isn't that where the real fun is anyway? :) i often wonder why identity management is not a more active topic on these lists. it is the cornerstone of a useful reputation/trust metric, which in turn provides a foundation for many advanced and resilient features like proactive caching, agorics, recommender systems, etc. is the lack of interest due to the overhead in security and complexity associated with digital identities in large, ad-hoc peer groups? the lack of consensus (single sign on?) preventing any critical mass? i'm curious... From lally at vt.edu Tue Dec 13 01:28:36 2005 From: lally at vt.edu (Lally Singh) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: <200512121814.jBCIE1U88941@where.matthew.at> Message-ID: <20051213012836.30212@smtp.vt.edu> >On 12/12/05, Matthew Kaufman wrote: > I'll admit that maybe I'm just obsessing over this point for purely >academic reasons; maybe the majority of users simply accept the system >defaults and innocently engage in altruistic behavior that ends up >optimizing global performance. Maybe they all just turn their caches >on because it is 'the right thing to do.' Maybe no one writes >malicious software that takes advantage of [for example] a proactive >cache. Maybe we shouldn't worry about it at all. > > But, isn't it more interesting to think about building systems that >have some fairness guarantees than building ones that don't? Building >a proactive cache that isn't susceptible to these abuses might require >a trust/reputation sytem, which in turn requires a strong identity >system, etc. -- but isn't that where the real fun is anyway? :) It's not hard to imagine an ISP shipping some software to disable caching on P2P network clients to all their clients (say with the installer of free anti-spyware or anti-virus software), without the clients ever having the chance to be altruistic. IMHO, anonymity's pretty important to keep. If there's going to be an identity system, let's make sure it doesn't attach to real people directly. Ebay user IDs, which you can burn at any time, but become valuable due to good feedback, are nice. -- H. Lally Singh Ph.D. Candidate, Computer Science Virginia Tech From coderman at gmail.com Tue Dec 13 07:05:12 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <20051213012836.30212@smtp.vt.edu> References: <200512121814.jBCIE1U88941@where.matthew.at> <20051213012836.30212@smtp.vt.edu> Message-ID: <4ef5fec60512122305n69f38437ke1996a87212b6cfa@mail.gmail.com> On 12/12/05, Lally Singh wrote: > ... > IMHO, anonymity's pretty important to keep. If there's going to > be an identity system, let's make sure it doesn't attach to real > people directly. Ebay user IDs, which you can burn at any time, > but become valuable due to good feedback, are nice. agreed; anonymity and pseudonymity are important. an ideal identity management system would function like Ian Goldberg's nymity slider and allow me to specify exactly how much information is revealed about my person during any interactions with peers. 
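a minimal sketch of what such a slider could look like in code (the level names and the fields revealed at each level are made up for illustration, not a spec):

    from enum import Enum, auto

    class Nymity(Enum):
        ANONYMOUS = auto()     # reveal nothing linkable
        PSEUDONYMOUS = auto()  # reveal a persistent nym/key, no real identity
        VERINYMOUS = auto()    # reveal a verified real-world identity

    def visible_profile(profile, level):
        # filter what a peer learns about us for one interaction,
        # according to where the slider is set
        if level is Nymity.ANONYMOUS:
            return {}
        if level is Nymity.PSEUDONYMOUS:
            return {k: v for k, v in profile.items()
                    if k in ("nym", "pubkey", "reputation")}
        return dict(profile)  # VERINYMOUS: everything, real name included

    me = {"nym": "peer1234", "pubkey": "<key>", "reputation": 0.9,
          "real_name": "<name>"}
    print(visible_profile(me, Nymity.PSEUDONYMOUS))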
anonymous interactions would be useful for self certifying resources, pull based operations, and public broadcasts for example. psuedonymity for weakly trusted interactions, reputation attached to recommendations or other meta data. and strong identity for trusted relationships between friends / associates where non trivial resources may be exchanged or formal agreements negotiated. likewise, the protocols used to communicate between peers would need to take these nymity levels into account, and constrain or protect communication accordingly. i have to second Matthew Kaufman in that a lot of fun is to be had in these areas; so much ties into these mechanisms (user interfaces, protocols, social interactions, information security) that provides fertile ground for experimentation and discovery across a diverse range of interests. trying to make such systems work in a fully or partially decentralized manner makes it even more challenging (and fun :) From ludovic.courtes at laas.fr Tue Dec 13 08:43:04 2005 From: ludovic.courtes at laas.fr (Ludovic =?iso-8859-1?Q?Court=E8s?=) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: <20051207172725.GG5812@cs.uoregon.edu> (Daniel Stutzbach's message of "Wed, 7 Dec 2005 09:27:26 -0800") References: <200511291414.35852.01771@iha.dk> <20051129141713.6A9CB698@yumyum.zooko.com> <20051129142151.8E1A035E4@yumyum.zooko.com> <438C9F88.2050803@pdos.lcs.mit.edu> <87lkywg9sp.fsf_-_@laas.fr> <20051207172725.GG5812@cs.uoregon.edu> Message-ID: <87u0ddo09z.fsf@laas.fr> Hi, Daniel Stutzbach writes: > For what purpose do you want to "decentralize Google"? > > Is it for some technical reason where you believe a decentralized > index will provide better end-user performance? > > Or is it because you don't think any single organization should have > that much control over information? Essentially for this reason. Because search engines have become "entry points" to the Internet. Typically, people tend to no longer use bookmarks and the likes: Google can always find the data they're looking for. Therefore, I think it's a reasonable goal to try to remove that single point of trust/failure. Thanks, Ludovic. From ludovic.courtes at laas.fr Tue Dec 13 09:04:51 2005 From: ludovic.courtes at laas.fr (Ludovic =?iso-8859-1?Q?Court=E8s?=) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: <20051207112748.1ptnql9c004s4oko@mail.ucla.edu> (Michael Parker's message of "Wed, 07 Dec 2005 11:27:48 -0800") References: <20051207112748.1ptnql9c004s4oko@mail.ucla.edu> Message-ID: <87fyoxmkp8.fsf@laas.fr> Michael Parker writes: > The first step of indexing is the actual keyword extraction itself. > From what I have heard, libextractor is a good open-source solution: > http://gnunet.org/libextractor/ Its author (Christian Grothoff) also used it to implement Doodle, a document indexing and search tool similar to Beagle: http://gnunet.org/doodle/ . Thanks, Ludovic. From ludovic.courtes at laas.fr Tue Dec 13 09:08:57 2005 From: ludovic.courtes at laas.fr (Ludovic =?iso-8859-1?Q?Court=E8s?=) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: (SIMON Gwendal's message of "Wed, 7 Dec 2005 17:36:01 +0100") References: Message-ID: <87zmn5l5xy.fsf@laas.fr> "SIMON Gwendal RD-MAPS-ISS" writes: > As previously said, we are working on a system namely Maay which aims at performing a decentralized and personalized search on a distributed set of textual documents. 
> > http://maay.netofpeers.net This sounds nice! What licence is it available under (I couldn't find it on the website)? Is anybody working on a Debian package? ;-) Thanks, Ludovic. From solipsis at pitrou.net Tue Dec 13 10:54:36 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: <87zmn5l5xy.fsf@laas.fr> References: <87zmn5l5xy.fsf@laas.fr> Message-ID: <1134471276.5631.1.camel@fsol> Le mardi 13 d?cembre 2005 ? 10:08 +0100, Ludovic Court?s a ?crit : > > As previously said, we are working on a system namely Maay which aims at performing a decentralized and personalized search on a distributed set of textual documents. > > > > http://maay.netofpeers.net > > This sounds nice! What licence is it available under (I couldn't find > it on the website)? >From the bottom of the home page: ? Maay is licensed under the GNU General Public License ? ;) Regards Antoine. From m.rogers at cs.ucl.ac.uk Tue Dec 13 11:07:11 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: <200512121814.jBCIE1U88941@where.matthew.at> Message-ID: <439EAB5F.3030800@cs.ucl.ac.uk> Alen Peacock wrote: > External motivation is good, but is it sufficient to provide some > sort of equilibria? If not, it's just the prisoner's dilemma; the > vast majority of nodes disable caching because it is locally optimal, > regardless of the fact that this produces a globally non-optimal > solution. Then how about internal motivation: the faster you upload, the faster you can download, and the more files you share, the more likely you are to be able to upload. I've come up with a half-baked incentive mechanism for Gnutella based on these principles: http://www.cs.ucl.ac.uk/staff/M.Rogers/gnutella-incentives.html No identity mechanism required I'm afraid ;-) > But, isn't it more interesting to think about building systems that > have some fairness guarantees than building ones that don't? Define fairness :-) I'm more interested in mutual benefit. Cheers, Michael From m.rogers at cs.ucl.ac.uk Tue Dec 13 11:24:53 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: Message-ID: <439EAF85.2040704@cs.ucl.ac.uk> Serguei Osokine wrote: > And the reason for this is quite understandable - if most of > the content exists in just one or two copies, what good are the swarm > downloaders and other marvelous instruments of progress? This single > copy that you need might be on a single host behind the modem in > Albania, the host might go off-line at any moment, and to make it > more fun, it might be trying to upload five other files (different > files, mind you) to five other people at the same time. I believe eMule allows the uploader to assign different priorities to different files - I'd like to be able to do this in Gnutella, to make the rarer (or better) content on my node easier to find, almost like a recommendation system. 
Cheers, Michael From alenlpeacock at gmail.com Tue Dec 13 15:41:56 2005 From: alenlpeacock at gmail.com (Alen Peacock) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: <439EAB5F.3030800@cs.ucl.ac.uk> References: <200512121814.jBCIE1U88941@where.matthew.at> <439EAB5F.3030800@cs.ucl.ac.uk> Message-ID: On 12/13/05, Michael Rogers wrote: > > Then how about internal motivation: the faster you upload, the faster > you can download, and the more files you share, the more likely you are > to be able to upload. I've come up with a half-baked incentive mechanism > for Gnutella based on these principles: > > http://www.cs.ucl.ac.uk/staff/M.Rogers/gnutella-incentives.html > > No identity mechanism required I'm afraid ;-) Neat ideas. Like you, I'm a big believer in incentive-based decisions. I just peaked at your "Cooperation in Decentralized Networks" paper, and I notice that you do require exchange of public keys, authentication with those keys, and some sort of history of reciprocation, no? This is what I'm talking about when I say 'identity' and 'trust'. Each node has to be able to positively certify the identities of other nodes, and what you seem to be building is essentially a trust system built on top of those strong identities. Without the ability to certify node identities, you'd have a system that was very susceptible to imposter nodes leeching resources (in the form of reciprocation) that they hadn't earned, right? Perhaps I confused the issue by using the word 'identity,' which in some circles is used only to talk about the concept of linking a virtual presence to a meatspace entity. That isn't what I intended. What I meant was exactly what you describe: use of assymetric keys to establish and prove peer IDs, use of those IDs to learn something about the behavior of other agents in the network, and use of that knowledge to make appropriate incentive-based decisions. > > But, isn't it more interesting to think about building systems that > > have some fairness guarantees than building ones that don't? > > Define fairness :-) I'm more interested in mutual benefit. Well, I don't know if my semantics are standard, but the concept of 'fairness' I was thinking of was one that was purposely broad -- an umbrella under which 'mutual benefit' is certainly an essential piece. Alen From m.rogers at cs.ucl.ac.uk Tue Dec 13 17:03:05 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other In-Reply-To: References: <200512121814.jBCIE1U88941@where.matthew.at> <439EAB5F.3030800@cs.ucl.ac.uk> Message-ID: <439EFEC9.7090209@cs.ucl.ac.uk> Alen Peacock wrote: > I just peaked at your "Cooperation in Decentralized Networks" paper, > and I notice that you do require exchange of public keys, > authentication with those keys, and some sort of history of > reciprocation, no? This is what I'm talking about when I say > 'identity' and 'trust'. Good point - I was thinking of identities that can be communicated to third parties, as in a reputation or recommendation system, but you're right that local (non-transitive?) identities are needed. In the context of Gnutella you can use IP addresses and port numbers. > Well, I don't know if my semantics are standard, but the concept of > 'fairness' I was thinking of was one that was purposely broad -- an > umbrella under which 'mutual benefit' is certainly an essential piece. Sorry for the knee-jerk reaction. 
Fairness seems to be one of those words that cause more arguments than they solve - some people say "it's not fair to exclude those who can't contribute", while others say "it's not fair to consume resources if you don't contribute". :-) Cheers, Michael From rabbi at abditum.com Tue Dec 13 19:48:31 2005 From: rabbi at abditum.com (Len Sassaman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] CodeCon submission deadline reminder Message-ID: Here's a reminder that the deadline for submissions to CodeCon 2006 is this week. Feel free to forward this to project developers who might not otherwise see it. --Len. -- CodeCon 2006 February 10-12, 2006 San Francisco CA, USA www.codecon.org Call For Papers CodeCon is the premier showcase of cutting edge software development. It is an excellent opportunity for programmers to demonstrate their work and keep abreast of what's going on in their community. All presentations must include working demonstrations, ideally accompanied by source code. Presentations must be done by one of the active developers of the code in question. We emphasize that demonstrations be of *working* code. We hereby solicit papers and demonstrations. * Papers and proposals due: December 15, 2005 * Authors notified: January 1, 2006 Possible topics include, but are by no means restricted to: * community-based web sites - forums, weblogs, personals * development tools - languages, debuggers, version control * file sharing systems - swarming distribution, distributed search * security products - mail encryption, intrusion detection, firewalls Presentations will be 45 minutes long, with 15 minutes allocated for Q&A. Overruns will be truncated. Submission details: Submissions are being accepted immediately. Acceptance dates are November 15, and December 15. After the first acceptance date, submissions will be either accepted, rejected, or deferred to the second acceptance date. The conference language is English. Ideally, demonstrations should be usable by attendees with 802.11b connected devices either via a web interface, or locally on Windows, UNIX-like, or MacOS platforms. Cross-platform applications are most desirable. Our venue will be 21+. To submit, send mail to submissions-2006 at codecon.org including the following information: * Project name * url of project home page * tagline - one sentence or less summing up what the project does * names of presenter(s) and urls of their home pages, if they have any * one-paragraph bios of presenters, optional, under 100 words each * project history, under 150 words * what will be done in the project demo, under 200 words * slides to be shown during the presentation, if applicable * future plans General Chair: Jonathan Moore Program Chair: Len Sassaman Program Committee: * Bram Cohen, BitTorrent, USA * Jered Floyd, Permabit, USA * Ian Goldberg, Zero-Knowledge Systems, CA * Dan Kaminsky, Avaya, USA * Ben Laurie, The Bunker Secure Hosting, UK * Nick Mathewson, The Free Haven Project, USA * David Molnar, University of California, Berkeley, USA * Jonathan Moore, Mosuki, USA * Meredith L. Patterson, University of Iowa, USA * Len Sassaman, Katholieke Universiteit Leuven, BE Sponsorship: If your organization is interested in sponsoring CodeCon, we would love to hear from you. In particular, we are looking for sponsors for social meals and parties on any of the three days of the conference, as well as sponsors of the conference as a whole and donors of door prizes. 
If you might be interested in sponsoring any of these aspects, please contact the conference organizers at codecon-admin at codecon.org. Press policy: CodeCon provides a limited number of passes to qualifying press. Complimentary press passes will be evaluated on request. Everyone is welcome to pay the low registration fee to attend without an official press credential. Questions: If you have questions about CodeCon, or would like to contact the organizers, please mail codecon-admin at codecon.org. Please note this address is only for questions and administrative requests, and not for workshop presentation submissions. From Serguei.Osokine at efi.com Tue Dec 13 19:50:31 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] p2p in some place or other Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42797@fcexmb04.efi.internal> On Tuesday, December 13, 2005 Michael Rogers wrote: > I believe eMule allows the uploader to assign different priorities > to different files - I'd like to be able to do this in Gnutella, to > make the rarer (or better) content on my node easier to find... Tha is more like "easier to download", but I see what you're saying. Yes, at some point I used to place hight hopes on this method, basically thinking that the transfer rates for the rare content can be improved at the expense of the popular one. Popular content can be found at lots of places anyway, so penalizing it should not hurt all that much; for me the goal was to equalize the download rates for all content regardless of its popularity. So if improving the rare content download speed would make the widely distributed content transfers a bit slower (because the systemwide cumulative uplink bandwidth is a scarce resource, after all), so be it. Unfortunately the statistical research of the P2P systems (the one that I've already quoted in this thread) shows that from the uploader standpoint the prioritization of rare vs popular content does not cover a very significant percentage of all upload situations. The typical upload scenario is not only "some popular, some rare, so give the rare more bandwidth". Just as widespread is "many rare uploads from one node", in which case changing their relative priorities is pointless, and also "rare upload from a single node", in which case no matter what this node does, the speed is going to be substandard. And let me reemphasize this again - these scenarios seem to be very common. Essentially the download speed for the rare content is limited by the uplink rates of the nodes with rare content, even if all the nodes are always on and spend just a small percantage of their online time downloading. For popular content, you can have very fast downloads in such a case; you can even saturate your downlink if you wish. But for rare content, you're still stuck with whatever is the uplink rate of a single node that has this file. As the nodes start spending more time on line, this disparity becomes more and more pronounced no matter how you prioritize the uploads. And seeing this causes the user frustration on a significant percentage of all downloads (on everything that is in the long tail). Best wishes - S.Osokine. 13 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Michael Rogers Sent: Tuesday, December 13, 2005 3:25 AM To: Peer-to-peer development. 
Subject: Re: [p2p-hackers] p2p in some place or other Serguei Osokine wrote: > And the reason for this is quite understandable - if most of > the content exists in just one or two copies, what good are the swarm > downloaders and other marvelous instruments of progress? This single > copy that you need might be on a single host behind the modem in > Albania, the host might go off-line at any moment, and to make it > more fun, it might be trying to upload five other files (different > files, mind you) to five other people at the same time. I believe eMule allows the uploader to assign different priorities to different files - I'd like to be able to do this in Gnutella, to make the rarer (or better) content on my node easier to find, almost like a recommendation system. Cheers, Michael _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From olau at cs.aau.dk Wed Dec 14 21:56:45 2005 From: olau at cs.aau.dk (Ole Laursen) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] published key chenge frequency in DHT In-Reply-To: References: Message-ID: xiangsong hou writes: > as we know,DHT can deal with node join/leave frequently. > i want to know if DHT can deal with publishde key change frequently. > for example,in grid computing resouce dicovery use DHT,the published key > (represent cpu or memeory) is change very frequently,so assigned node is > change frequently. > how to deal with this situation in DHT? I'm not totally sure what you are referring to, but I wrote my master's thesis together with two other guys on a design that used a DHT for distributing jobs for mass/grid computing. The DHT stored the jobs and indexed them based on keywords. The main problem was load balancing the index. We spent quite some time studying relevant literature and reviewed some of it in the thesis. You can find it here - we called the system U.P.: http://www.cs.aau.dk/~olau/writings/ Unfortunately, we never had time to optimize the load-balancing algorithm properly (we only had one semester). -- Ole Laursen http://www.cs.aau.dk/~olau/ From shashi.mit at gmail.com Thu Dec 15 16:53:28 2005 From: shashi.mit at gmail.com (Shashi (MIT)) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Opinions on JXTA2 - Message-ID: <4d19a3630512150853w7fcb2b7fice4ead23224460ca@mail.gmail.com> Hi all I was curious as to your thoughts on the JXTA platform. I am working in designing a P2P application and I have heard some conflicting thoughts on JXTA. The contrarian viewpoint is that it is too complex and way too much work to get something like a P2P app working. 'The solution being more complex than the problem' While there are simpler P2P frameworks available e.g. DirectConnect. The pro viewpoint is JXTA's comprehensiveness and the one-stop platform. What are your thoughts? thanks, Shashi From garyjefferson123 at yahoo.com Thu Dec 15 16:55:04 2005 From: garyjefferson123 at yahoo.com (Gary Jefferson) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] kademlia bucket spliting Message-ID: <20051215165504.7363.qmail@web35703.mail.mud.yahoo.com> I'm having a hard time following one detail in the Kademlia paper (long version, http://citeseer.ist.psu.edu/sex02sex.html) and was hoping someone could clarify. 
From section 2.4, we know that when a bucket gets full and a new node can be added to it, we only split the bucket if it contains our own node ID. Otherwise, we either discard the new node's contact info or replace an old entry with it, depending on whether the LRU entry is still alive or not. But then we read: "One complication arises in highly unbalanced trees. Suppose node u joins the system and is the only node whose ID begins 000. Suppose further that the system already has more than k nodes with prefix 001. Every node with prefix 001 would have an empty k-bucket into which u should be inserted, yet u's bucket refresh would only notify k of the nodes. To avoid this problem, Kademlia nodes keep all valid contacts in a subtree of size at least k nodes, even if this requires splitting buckets in which the node's own ID does not reside. Figure 5 illustrates these additional splits. When u refreshes the split buckets, all nodes with prefix 001 will learn about it." So when do we split a bucket? From the above, it sounds as if we always split buckets when they get full and new nodes can be added, regardless of whether the bucket contains our own node ID. But doesn't this mean we have an essentially limitless number of nodes that we can add to our buckets (and a corresponding memory issue as the network gets large)? I'm sure I'm missing something here, but I just can't make it out. I'm running into some pathological cases where I can't converge to the correct node unless I do split every bucket... Thanks, Gary --------------------------------- Yahoo! Shopping Find Great Deals on Holiday Gifts at Yahoo! Shopping -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051215/f34f2f1e/attachment.html From tcuag at t-online.de Thu Dec 15 20:23:12 2005 From: tcuag at t-online.de (tcuag@t-online.de) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] UDP Packet size... In-Reply-To: <20051215200003.B84C53FDA6@capsicum.zgp.org> Message-ID: <006d01c601b5$5f6c4fc0$67a2a8c0@namepc> Hi! We have to make a decision. Shall we send 64Kb udp pakets or 40 times the 1,4K pakets, which fits into any MTU of any ISP? Is it true, that nowadays most router do not allow udp fragmentation by default? How many do allow it after configuration? (those configuration is mostly very difficult for average user, right?) Any experience? Some experts say: Never send more than MTU, some projects say that they work with 60K UDP??? For us (python-project) the sending of small pakets means serious trouble (CPU/Socket, timing), so before tuning our algorythms, we want to be sure, that sending small pakets is the only solution. Thx for your help GKL From matthew at matthew.at Thu Dec 15 20:51:30 2005 From: matthew at matthew.at (Matthew Kaufman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] UDP Packet size... In-Reply-To: <006d01c601b5$5f6c4fc0$67a2a8c0@namepc> Message-ID: <200512152049.jBFKnWU99767@where.matthew.at> If 39 of 40 1.4k packets arrive, can you do anything with those, or do all 39 need to be thrown out because the 40th didn't get there? (I'll wait for the answer before going into more detail about what I think) Matthew Kaufman matthew@matthew.at www.amicima.com > -----Original Message----- > From: p2p-hackers-bounces@zgp.org > [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of tcuag@t-online.de > Sent: Thursday, December 15, 2005 12:23 PM > To: p2p-hackers@zgp.org > Subject: [p2p-hackers] UDP Packet size... > > Hi! 
> We have to make a decision. Shall we send 64Kb udp pakets or > 40 times the 1,4K pakets, which fits into any MTU of any ISP? > > Is it true, that nowadays most router do not allow udp > fragmentation by default? > How many do allow it after configuration? (those > configuration is mostly very difficult for average user, > right?) Any experience? Some experts say: Never send more > than MTU, some projects say that they work with 60K UDP??? > > For us (python-project) the sending of small pakets means > serious trouble (CPU/Socket, timing), so before tuning our > algorythms, we want to be sure, that sending small pakets is > the only solution. > > Thx for your help > GKL > > From agthorr at cs.uoregon.edu Thu Dec 15 21:04:07 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] UDP Packet size... In-Reply-To: <006d01c601b5$5f6c4fc0$67a2a8c0@namepc> References: <20051215200003.B84C53FDA6@capsicum.zgp.org> <006d01c601b5$5f6c4fc0$67a2a8c0@namepc> Message-ID: <20051215210406.GF5108@cs.uoregon.edu> On Thu, Dec 15, 2005 at 09:23:12PM +0100, tcuag@t-online.de wrote: > We have to make a decision. Shall we send 64Kb udp pakets or 40 times the > 1,4K pakets, which fits into any MTU of any ISP? > > Is it true, that nowadays most router do not allow udp fragmentation by > default? This is not really the right forum for the question, as it's a general networking question and not related to peer-to-peer. I suggest finding a introductory networking list or, better still, a good TCP/IP book. I'm fond of TCP/IP Illustrated, Vol. 1. Nevertheless, I'll answer: First, there is no such thing as "UDP Fragmentation". Your UDP datagram is encapsulated in an IP packet, which routers will fragment as necessary down to their MTU (this is called "IP Fragmentation"). The receiving host will reassemble the IP fragments into the full IP packet and then pass the packet to UDP on the host. The problem is that if any of the IP fragments is lost, then the entire UDP datagram is lost and must be retransmitted. This is very wasteful. It's much better to make your packets fit within the Path-MTU, so that if one MTU-sized packet is lost, then only one MTU-sized packet must be retransmitted. To be completely robust, you need to do Path-MTU discovery to dynamically adapt if the Path-MTU isn't what you expect it to be. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From coderman at gmail.com Thu Dec 15 21:33:10 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] UDP Packet size... In-Reply-To: <006d01c601b5$5f6c4fc0$67a2a8c0@namepc> References: <20051215200003.B84C53FDA6@capsicum.zgp.org> <006d01c601b5$5f6c4fc0$67a2a8c0@namepc> Message-ID: <4ef5fec60512151333l5f848710yd9c63a7581475a65@mail.gmail.com> On 12/15/05, tcuag@t-online.de wrote: > ... Shall we send 64Kb udp pakets or 40 times the > 1,4K pakets, which fits into any MTU of any ISP? use ~1400byte packets. be aware of tcp friendly congestion control. > Is it true, that nowadays most router do not allow udp fragmentation by > default? i've had the most problem with UDP NAPT's dropping fragmented datagrams. most routers are fine. > Any experience? the 'Never send more than MTU' suggestion is a good one. > For us (python-project) the sending of small pakets means serious trouble > (CPU/Socket, timing) you could always support both and use large packet support when it works well. 
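Putting the MTU advice in this thread into code, here is a minimal Python sketch of sender-side chunking (plain sockets, not any particular project's protocol; the 4-byte sequence header and the 1400-byte figure are illustrative assumptions, and loss recovery and congestion control are omitted):

    import socket

    MTU_PAYLOAD = 1400   # conservative per-datagram payload; Path-MTU discovery could refine it

    def send_chunked(sock, data, addr):
        # Slice the payload into independent datagrams that fit a typical MTU,
        # instead of one huge datagram that the IP layer would fragment.
        # A 4-byte sequence number lets the receiver reorder and spot gaps.
        for seq, offset in enumerate(range(0, len(data), MTU_PAYLOAD)):
            chunk = data[offset:offset + MTU_PAYLOAD]
            sock.sendto(seq.to_bytes(4, "big") + chunk, addr)

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send_chunked(sock, b"x" * 60000, ("127.0.0.1", 9999))   # ~43 small datagrams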
if you are trying to do bulk transfer use TCP instead :) From bneijt at gmail.com Fri Dec 16 11:33:40 2005 From: bneijt at gmail.com (Bram Neijt) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Google releases something P2P Message-ID: <46c2f4ab0512160333u6d5e326er308b13c36d0a0ad0@mail.gmail.com> Hi. I havn't been able to take a good look at the system yet, but Google has released LibJingle (Google Talk library) which contains a P2P implementation. I don't think it's a "collaborate" implementation, like Skype, but it might be intresting code for people wanting to do firewall and NAT transversal: [The P2P component] "Negotiates, establishes, and maintains peer-to-peer connections through almost any network configuration regardless of NAT devices and firewalls. The p2p component understands the Jingle spec to initiate the session and then provides a sockets-like interface for sending and receiving data that is used by the session component to add functionality." More on that, here: http://code.google.com/apis/talk/about.html (probably my last post before Christmas, so) Happy christmas and hacking everyone! Bram Neijt From eunsoo at research.panasonic.com Fri Dec 16 22:40:02 2005 From: eunsoo at research.panasonic.com (Eunsoo Shim) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia Message-ID: <43A34242.3070203@research.panasonic.com> Hi, I am wondering whether Kad Network (based on Kademlia) has a hierarchical architecture where supernodes are distinguished from non-supernodes. Your kind information will be appreciated. Thanks. Eunsoo From agthorr at cs.uoregon.edu Fri Dec 16 22:44:28 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <43A34242.3070203@research.panasonic.com> References: <43A34242.3070203@research.panasonic.com> Message-ID: <20051216224426.GC3060@cs.uoregon.edu> On Fri, Dec 16, 2005 at 05:40:02PM -0500, Eunsoo Shim wrote: > I am wondering whether Kad Network (based on Kademlia) has a > hierarchical architecture where supernodes are distinguished from > non-supernodes. No, it does not. However, it does distinguish non-firewalled peers (which form the DHT) from firewalled peers (which can query the DHT but do not form part of the routing structure). -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From lemonobrien at yahoo.com Fri Dec 16 23:41:31 2005 From: lemonobrien at yahoo.com (Lemon Obrien) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <20051216224426.GC3060@cs.uoregon.edu> Message-ID: <20051216234131.80784.qmail@web53605.mail.yahoo.com> this is the same thing...almost; its the same thing if you only use udp. Daniel Stutzbach wrote: On Fri, Dec 16, 2005 at 05:40:02PM -0500, Eunsoo Shim wrote: > I am wondering whether Kad Network (based on Kademlia) has a > hierarchical architecture where supernodes are distinguished from > non-supernodes. No, it does not. However, it does distinguish non-firewalled peers (which form the DHT) from firewalled peers (which can query the DHT but do not form part of the routing structure). 
-- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences You don't get no juice unless you squeeze Lemon Obrien, the Third. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051216/affdc9ec/attachment.htm From matthew at matthew.at Sat Dec 17 07:04:52 2005 From: matthew at matthew.at (Matthew Kaufman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades Message-ID: <200512170702.jBH72uU04975@where.matthew.at> We've been busy here at amicima, and thought you'd want to know about some recent improvements we've made: 1. We upgraded the MFP protocol to be resistant to a potential but unlikely denial-of-service attack in cases where there's no cryptography or the session key is the same in each direction. Specifically: an attacker who intercepts traffic from one end, modifies the session identifier to match the one sent by the other end, and plays the traffic back might be able in some cases to erroneously start flows or in extreme cases cause a denial of service through the IP mobility mechanism. This is fixed by adding explicit directionality flagging to the MFP packet header, and the protocol spec and our implementation have been upgraded. The revised protocol spec (version 1.2) can be found at: http://www.amicima.com/developers/documentation.html 2. We've significantly upgraded the "MFP defcrypto" default cryptographic plug-in. The new version is INCOMPATIBLE will all previous versions, but we hope our improvements mean that's the only time we'll have to say that. The previous version supported RSA for public-key crypto and AES128 for symmetric crypto, and while the key material was generated at both ends (thanks to suggestions here to make that improvement), the transmission of keying material was of a fixed length, the combination was identical at each end (XOR) (so both directions used the same session key), and there was no provision for any options to be sent between the cryptographic plug-ins at each end. The new version has replaced the fixed-length encrypted key data sent in the Initial Keying packets with a "micro-packet" of data that is exchanged between each end (and which is protected by the signatures present in the Initiator Initial Keying and Responder Initial Keying packets, so the data can't be tampered with). These "micro-packets" can contain variable-length option information for future cryptosystem upgrades, like changes to AES256 or the addition of HMAC, in such a way that backwards compatibility may be retained, as well as the necessary keying data (also of variable length, and which we now combine asymmetrically, such that both ends contribute to the session keys that are used, but a different session key is used in each direction now). This brings us to the next new feature... By popular request, and because we now have the ability to negotiate such options, we now have optional HMAC-SHA1 in the default crypto plug-in. The HMAC wraps the encrypted packet in order to detect any corruption or tampering before it is even decrypted at the far end and with much more certainty than the internal post-decryption 16-bit checksum. 
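For readers unfamiliar with the pattern, the HMAC wrapping described above follows the usual encrypt-then-MAC arrangement. The sketch below is not MFP's wire format or keying exchange; it only shows, with Python's standard hmac module and placeholder keys and ciphertext, the property the announcement relies on: the receiver authenticates the ciphertext and silently drops a tampered packet before any decryption happens.

    import hmac, hashlib

    TAG_LEN = 20   # HMAC-SHA1 produces a 20-byte tag

    def wrap(mac_key, ciphertext):
        # Encrypt-then-MAC: the tag covers the already-encrypted packet.
        return ciphertext + hmac.new(mac_key, ciphertext, hashlib.sha1).digest()

    def unwrap(mac_key, packet):
        ciphertext, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
        expected = hmac.new(mac_key, ciphertext, hashlib.sha1).digest()
        if not hmac.compare_digest(tag, expected):
            return None            # tampered or corrupted: drop before decrypting
        return ciphertext          # only now hand the packet to the cipher

    key = b"\x01" * 20                         # placeholder session MAC key
    pkt = wrap(key, b"<encrypted bytes>")
    assert unwrap(key, pkt) == b"<encrypted bytes>"
    bad = pkt[:-1] + bytes([pkt[-1] ^ 1])      # flip one bit of the tag
    assert unwrap(key, bad) is None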
There is an API to set transmission (always send, only if requested by the other end, never send) and reception (require (and request) that it be sent, request (but not require) that it be sent, verify (but neither request nor require) if sent, and ignore completely) options, and MFPNet has been upgraded to provide access to the HMAC API as well. Once HMAC has been negotiated, any packet with the wrong HMAC (or from which the HMAC has been deleted) will be ignored. We always said that "if you don't like it, you can plug in a new cryptographic plug-in", but that doesn't necessarily provide a good backward-compatible solution for upgrades to running systems with large numbers of existing peers. We're pretty sure that this does (as would any other cryptographic plug-in that borrowed these enhancements), but only the future will tell us if we're right. The new releases of MObj, MFP, and MFPNet are available on our downloads page: http://www.amicima.com/developers/downloads.html And details of the default cryptographic plug-in are provided in the MFP release's README file, available separately here: http://www.amicima.com/downloads/mfp/README.txt 3. And finally, because we've rolled out an incompatible (but much better) default cryptographic plug-in, we've released a new version of amiciPhone, our demo application that does P2P VOIP calling, user presence, text messaging, and photo and file sending, you can get the Windows XP version from our website, and the Macintosh OS X version is coming along nicely and should be out before too much longer. The application download is here: http://www.amicima.com/applications/ Download a copy and try it out! (For a good time, try calling "7@test.amicima.com") Thanks for the support and feedback from the list and privately, it has helped make our protocols and implementations better, and we try to return the favor through the open-source publication of our protocol implementations. Matthew Kaufman matthew@matthew.at matthew@amicima.com http://www.amicima.com From eunsoo at research.panasonic.com Sat Dec 17 07:33:47 2005 From: eunsoo at research.panasonic.com (Eunsoo Shim) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <20051216224426.GC3060@cs.uoregon.edu> References: <43A34242.3070203@research.panasonic.com> <20051216224426.GC3060@cs.uoregon.edu> Message-ID: <43A3BF5B.5070102@research.panasonic.com> I see. Is there any statistics information about the average number of non-firewalled peers in Kad Network? Thanks a lot. Eunsoo Daniel Stutzbach wrote: >On Fri, Dec 16, 2005 at 05:40:02PM -0500, Eunsoo Shim wrote: > > >>I am wondering whether Kad Network (based on Kademlia) has a >>hierarchical architecture where supernodes are distinguished from >>non-supernodes. >> >> > >No, it does not. > >However, it does distinguish non-firewalled peers (which form the DHT) >from firewalled peers (which can query the DHT but do not form part of >the routing structure). > > > From agthorr at cs.uoregon.edu Sat Dec 17 07:41:40 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <43A3BF5B.5070102@research.panasonic.com> References: <43A34242.3070203@research.panasonic.com> <20051216224426.GC3060@cs.uoregon.edu> <43A3BF5B.5070102@research.panasonic.com> Message-ID: <20051217074138.GF3060@cs.uoregon.edu> On Sat, Dec 17, 2005 at 02:33:47AM -0500, Eunsoo Shim wrote: > I see. 
> Is there any statistics information about the average number of > non-firewalled peers in Kad Network? I measured it to be around a million non-firewalled peers, although that was a few months back. Or did you mean as a percentage of all peers? (which I'm not sure of) Why do you ask, by the way? -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From eunsoo at research.panasonic.com Sat Dec 17 08:40:53 2005 From: eunsoo at research.panasonic.com (Eunsoo Shim) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <20051217074138.GF3060@cs.uoregon.edu> References: <43A34242.3070203@research.panasonic.com> <43A3BF5B.5070102@research.panasonic.com> <20051217074138.GF3060@cs.uoregon.edu> Message-ID: <43A3CF15.7030503@research.panasonic.com> Daniel Stutzbach wrote: >On Sat, Dec 17, 2005 at 02:33:47AM -0500, Eunsoo Shim wrote: > > >>I see. >>Is there any statistics information about the average number of >>non-firewalled peers in Kad Network? >> >> > >I measured it to be around a million non-firewalled peers, although >that was a few months back. > >Or did you mean as a percentage of all peers? (which I'm not sure of) > >Why do you ask, by the way? > > > Wow, you know a lot about Kad Network. I asked about it because I was interested in scalability of DHTs. I looked for cases of large scale DHT deployment and so far found only Kad Network based on Kademlia. According to Wikipedia, there are 3.5 - 5.1 million concurrent online users in Kad Network. http://en.wikipedia.org/wiki/Kad_Network A million non-firewalled peers...It is impressive again. I thought most computers were working behind firewalls or NAT these days. Do you know any other large scale DHT deployment? Thanks. Eunsoo From agthorr at cs.uoregon.edu Sat Dec 17 17:09:25 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <43A3CF15.7030503@research.panasonic.com> References: <43A34242.3070203@research.panasonic.com> <43A3BF5B.5070102@research.panasonic.com> <20051217074138.GF3060@cs.uoregon.edu> <43A3CF15.7030503@research.panasonic.com> Message-ID: <20051217170924.GA1288@cs.uoregon.edu> On Sat, Dec 17, 2005 at 03:40:53AM -0500, Eunsoo Shim wrote: > A million non-firewalled peers...It is impressive again. I thought most > computers were working behind firewalls or NAT these days. > > Do you know any other large scale DHT deployment? eDonkey's Overnet is also Kademlia based, as well as the new "trackerless" feature of BitTorrent. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From dbarrett at quinthar.com Sat Dec 17 20:21:43 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <43A3CF15.7030503@research.panasonic.com> References: <43A34242.3070203@research.panasonic.com> <43A3CF15.7030503@research.panasonic.com> Message-ID: <1134850905.F9FF157@dl11.dngr.org> On Sat, 17 Dec 2005 1:43 am, Eunsoo Shim wrote: > Daniel Stutzbach wrote: >> >> I measured it to be around a million non-firewalled peers, although >> that was a few months back. > > A million non-firewalled peers...It is impressive again. I thought most > computers were working behind firewalls or NAT these days. 
Daniel, by "non-firewalled" do you mean truly, those that aren't behind a firewall, or rather those for which NAT/firewall traversal doesn't work? -david From coderman at gmail.com Sat Dec 17 21:33:45 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <200512170702.jBH72uU04975@where.matthew.at> References: <200512170702.jBH72uU04975@where.matthew.at> Message-ID: <4ef5fec60512171333m2dcdeb04h5dabd339d41fc576@mail.gmail.com> On 12/17/05, Matthew Kaufman wrote: > ... > 2. We've significantly upgraded the "MFP defcrypto" default cryptographic > plug-in. i forgot to mention this previously but it is always a good idea to lock memory pages where key material and cipher state resides. the 'mlock' function can do this on unix systems (not sure what the equivalent is for win32 api). this does require root privilege which can make application coding a little more complicated. (i.e. handling setuid and dropping privs, etc). i've also seen some systems use unix IPC shared memory to keep memory from paging out to swap, etc. if you use an encrypted swap partition this might be somewhat less of a concern. best regards, From dbarrett at quinthar.com Sat Dec 17 23:00:40 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <4ef5fec60512171333m2dcdeb04h5dabd339d41fc576@mail.gmail.com> References: <200512170702.jBH72uU04975@where.matthew.at> <4ef5fec60512171333m2dcdeb04h5dabd339d41fc576@mail.gmail.com> Message-ID: <1134860444.CA368C6@di12.dngr.org> On Sat, 17 Dec 2005 1:58 pm, coderman wrote: > On 12/17/05, Matthew Kaufman wrote: >> ... >> 2. We've significantly upgraded the "MFP defcrypto" default >> cryptographic >> plug-in. > > i forgot to mention this previously but it is always a good idea to > lock memory pages where key material and cipher state resides. I'm not sure I follow how this helps: who is it protecting against? If you don't want the user to get access to cipher info, requiring root access isn't much of a barrier (any hacker will have root on his own box). And one user can't access the memory of another user's processes. I'm not disputing the technique, I just don't understand when to apply it. For example, just the other day I was interviewing a candidate (did I mention we are hiring?) who aggregates poker stats on other players. Despite all sorts of clever on-the-wire encryption, he just figured out where all the stats are kept in plaintext in memory and tapped into that. Doh! Ultimately, it's never a good idea to send data to a client that you don't want to fall into the wrong hands. Memory protection might stop a non-root user from accessing his own memory, but this seems like a boundary case (unless I'm misunderstanding it). -david From agthorr at cs.uoregon.edu Sat Dec 17 22:59:27 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <1134850905.F9FF157@dl11.dngr.org> References: <43A34242.3070203@research.panasonic.com> <43A3CF15.7030503@research.panasonic.com> <1134850905.F9FF157@dl11.dngr.org> Message-ID: <20051217225926.GB2876@cs.uoregon.edu> On Sat, Dec 17, 2005 at 12:21:43PM -0800, David Barrett wrote: > >Daniel Stutzbach wrote: > >>I measured it to be around a million non-firewalled peers, although > >>that was a few months back. 
> > Daniel, by "non-firewalled" do you mean truly, those that aren't behind > a firewall, or rather those for which NAT/firewall traversal doesn't > work? I mean those that can receive unsolicited TCP and UDP packets on the Kad/eMule ports. Either they must not be firewalled/NATed or the user must manually punch a whole to redirect those ports from the firewall device. Kad uses "iterative" DHT routing. If I'm a client and want to do a lookup, I query some of my contacts to get their next hop for my target, then I query that peer for it's next hop, etc. Therefore, it's important that any host participating in Kad's DHT routing structure be able to receive unsolicited packets. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From eunsoo at research.panasonic.com Sat Dec 17 23:43:30 2005 From: eunsoo at research.panasonic.com (Eunsoo Shim) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <20051217225926.GB2876@cs.uoregon.edu> References: <43A34242.3070203@research.panasonic.com> <1134850905.F9FF157@dl11.dngr.org> <20051217225926.GB2876@cs.uoregon.edu> Message-ID: <43A4A2A2.6080207@research.panasonic.com> >>>>I measured it to be around a million non-firewalled peers, although >>>>that was a few months back. >>>> >>>> >>Daniel, by "non-firewalled" do you mean truly, those that aren't behind >>a firewall, or rather those for which NAT/firewall traversal doesn't >>work? >> >> > >I mean those that can receive unsolicited TCP and UDP packets on the >Kad/eMule ports. Either they must not be firewalled/NATed or the user >must manually punch a whole to redirect those ports from the firewall >device. > > So port 80 or 443 is NOT used at all for Kad Network? >Kad uses "iterative" DHT routing. If I'm a client and want to do a >lookup, I query some of my contacts to get their next hop for my >target, then I query that peer for it's next hop, etc. Therefore, >it's important that any host participating in Kad's DHT routing >structure be able to receive unsolicited packets. > > > "Iterative" DHT routing is inefficient compared to "recursive" one. Is "iterative" routing used because of a concern about DoS attacks? Thanks. Eunsoo From agthorr at cs.uoregon.edu Sun Dec 18 00:00:26 2005 From: agthorr at cs.uoregon.edu (Daniel Stutzbach) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <43A4A2A2.6080207@research.panasonic.com> References: <43A34242.3070203@research.panasonic.com> <1134850905.F9FF157@dl11.dngr.org> <20051217225926.GB2876@cs.uoregon.edu> <43A4A2A2.6080207@research.panasonic.com> Message-ID: <20051218000025.GC2876@cs.uoregon.edu> On Sat, Dec 17, 2005 at 06:43:30PM -0500, Eunsoo Shim wrote: > >I mean those that can receive unsolicited TCP and UDP packets on the > >Kad/eMule ports. Either they must not be firewalled/NATed or the user > >must manually punch a whole to redirect those ports from the firewall > >device. > > > So port 80 or 443 is NOT used at all for Kad Network? Not normally, no. eMule lets the user configure their peer to use ports other than the default, so they could use any port they want in that case. But the vast majority of peers do not use port 80 or 443 at all. > "Iterative" DHT routing is inefficient compared to "recursive" one. > Is "iterative" routing used because of a concern about DoS attacks? Kademlia is an inherently iterative DHT. 
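A compact sketch of the iterative lookup pattern just described (node IDs are bare integers and query_for_closer stands in for one network round trip; this shows the general Kademlia-style procedure, not Kad's or eMule's actual message handling):

    def iterative_lookup(target, my_contacts, query_for_closer, k=8, alpha=3):
        # Iterative (Kademlia-style) lookup: the querying node itself asks the
        # closest contacts it knows for even closer ones, until nothing new
        # and closer turns up. The querier, not intermediate nodes, steers
        # the whole search.
        def distance(node_id):
            return node_id ^ target                  # Kademlia's XOR metric

        shortlist = sorted(my_contacts, key=distance)[:k]
        queried = set()
        while True:
            pending = [p for p in shortlist if p not in queried]
            if not pending:
                return shortlist                     # converged
            for peer in pending[:alpha]:             # a few queries "in flight"
                queried.add(peer)
                shortlist.extend(query_for_closer(peer, target))
            shortlist = sorted(set(shortlist), key=distance)[:k]

    # Toy demo: a static "network" in which every node knows every other node.
    all_nodes = list(range(0, 256, 5))
    rpc = lambda peer, target: sorted(all_nodes, key=lambda n: n ^ target)[:8]
    print(iterative_lookup(123, all_nodes[:3], rpc))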
I suspect the Kad developers used iterative routing simply because they chose Kademlia as a starting point. I'm not one of the Kad developers though, so I can only guess at the reasons behind their design decisions. I'd observe, though, that iterative routing is much easier to debug. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon From matthew at matthew.at Sun Dec 18 00:49:02 2005 From: matthew at matthew.at (Matthew Kaufman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <4ef5fec60512171333m2dcdeb04h5dabd339d41fc576@mail.gmail.com> Message-ID: <200512180047.jBI0l7U10135@where.matthew.at> Coderman: > i forgot to mention this previously but it is always a good > idea to lock memory pages where key material and cipher state > resides. Not "always". The right answer is "sometimes". If you wish to do so, on some systems you can use memory region locking to prevent the cryptographic material from being paged out to permanent storage media, in theory (see below). This makes it harder to take a machine, after the fact, and attempt to analyze its permanent media (hard disk) for any state that might have been left behind. However, that's just one of several possible attacks you might want to guard against, or other requirements you might have. > the 'mlock' function can do this on unix systems > (not sure what the equivalent is for win32 api). > this does require root privilege which can make application > coding a little more complicated. (i.e. handling setuid and > dropping privs, etc). Here, for example, is where some of those other requirements and limitations come in. mlock() is not only not available on Win32 (though there are calls used by drivers, and some directX calls that can do locking that might be applicable, though I couldn't in a quick check verify that non-paging is guaranteed), but has different requirements and functionality on different systems that DO support the call... Some Linux systems and MacOS X, for instance, allow a limited amount of mlock() by the user, but on FreeBSD, the call is unavailable except to the superuser. In some implementations, the calls nest (MacOS X), in others (Solaris) an inadvertant call to munlock() (or munmap(), even, which you might be using for other reasons) by other code can unlock pages that the cryptographic parts of your program think are still locked. And finally, and most important, *all* that POSIX guarantees from mlock() is that the page *is* in memory for you, *not* that it *isn't* also copied to swap. Whether or not that's the case is implementation-dependent. And, having an application run as setuid root, even if briefly, also increases the risk that it would be used as a vector to run other code as root, and clearly once an attacker can run code as root, they'd have access to all of physical memory, not just what's stored on the swap device. That's also true if the attacker uses some other method to get root access, or is running on a machine where the equivalent access to physical memory is easier to get (typical Win32 machines, for instance). In summary, "it depends". 
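For anyone who wants to experiment with the page-locking approach discussed in this thread on a POSIX system, a minimal Python ctypes sketch follows. It inherits every caveat above (per-user locking limits, superuser-only mlock on some BSDs, and no general guarantee that a locked page was never written to swap), and the buffer handling is illustrative rather than a hardened key-storage scheme.

    import ctypes, ctypes.util

    libc = ctypes.CDLL(ctypes.util.find_library("c") or None, use_errno=True)

    def lock_in_memory(buf):
        # Best-effort mlock() of a writable buffer (e.g. a bytearray holding
        # key material). Raises OSError if the pages cannot be locked, e.g.
        # because RLIMIT_MEMLOCK is exceeded or the platform restricts mlock
        # to the superuser.
        addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
        if libc.mlock(ctypes.c_void_p(addr), ctypes.c_size_t(len(buf))) != 0:
            raise OSError(ctypes.get_errno(), "mlock failed")
        return addr

    key = bytearray(32)            # session key would be generated into this buffer
    addr = lock_in_memory(key)
    # ... use the key ...
    for i in range(len(key)):      # zero it before unlocking
        key[i] = 0
    libc.munlock(ctypes.c_void_p(addr), ctypes.c_size_t(len(key)))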
In our case, since the MFP default cryptographic plugin also uses the services of OpenSSL libraries for its crypto operations, changes to ensure that state both in our plugin and the external libraries (DLLs on Win32, shared system libraries on MacOS X and other platforms where OpenSSL is standard, or static OpenSSL libraries elsewhere) are all non-pagable (and more important, that "non-pagable" *also* means "never copied to disk") is not a trivial change, especially to ensure that that's the case on every platform we support. But as we've said before, for applications where this is an attack you wished to guard against specifically, there's nothing stopping you from modifying our plugin or writing your own that implements exactly what you want. Matthew Kaufman matthew@matthew.at http://www.amicima.com From osokin at osokin.com Sun Dec 18 03:17:20 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <1134860444.CA368C6@di12.dngr.org> Message-ID: On Saturday, December 17, 2005 David Barrett wrote: > For example, just the other day I was interviewing a candidate (did > I mention we are hiring?) who aggregates poker stats on other players. Sounds like you're finally switching your development into areas that can actually bring heaps of money. I always thought that cheating in poker should be more profitable than P2P content delivery - and now your hiring approach seems to validate that. Good luck! Best wishes - S.Osokine. 17 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of David Barrett Sent: Saturday, December 17, 2005 3:01 PM To: Peer-to-peer development. Subject: Re: [p2p-hackers] amicima MFP and crypto upgrades On Sat, 17 Dec 2005 1:58 pm, coderman wrote: > On 12/17/05, Matthew Kaufman wrote: >> ... >> 2. We've significantly upgraded the "MFP defcrypto" default >> cryptographic >> plug-in. > > i forgot to mention this previously but it is always a good idea to > lock memory pages where key material and cipher state resides. I'm not sure I follow how this helps: who is it protecting against? If you don't want the user to get access to cipher info, requiring root access isn't much of a barrier (any hacker will have root on his own box). And one user can't access the memory of another user's processes. I'm not disputing the technique, I just don't understand when to apply it. For example, just the other day I was interviewing a candidate (did I mention we are hiring?) who aggregates poker stats on other players. Despite all sorts of clever on-the-wire encryption, he just figured out where all the stats are kept in plaintext in memory and tapped into that. Doh! Ultimately, it's never a good idea to send data to a client that you don't want to fall into the wrong hands. Memory protection might stop a non-root user from accessing his own memory, but this seems like a boundary case (unless I'm misunderstanding it). 
-david _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From dbarrett at quinthar.com Sun Dec 18 05:49:45 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: References: Message-ID: <43A4F879.8050709@quinthar.com> Well the real money is in bulk counterfeiting. If only I had access to a frickin' huge printer... Serguei Osokine wrote: > On Saturday, December 17, 2005 David Barrett wrote: > >>For example, just the other day I was interviewing a candidate (did >>I mention we are hiring?) who aggregates poker stats on other players. > > > Sounds like you're finally switching your development into areas > that can actually bring heaps of money. I always thought that cheating > in poker should be more profitable than P2P content delivery - and now > your hiring approach seems to validate that. Good luck! > > Best wishes - > S.Osokine. > 17 Dec 2005. > > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of David Barrett > Sent: Saturday, December 17, 2005 3:01 PM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] amicima MFP and crypto upgrades > > > On Sat, 17 Dec 2005 1:58 pm, coderman wrote: > >>On 12/17/05, Matthew Kaufman wrote: >> >>> ... >>> 2. We've significantly upgraded the "MFP defcrypto" default >>>cryptographic >>> plug-in. >> >>i forgot to mention this previously but it is always a good idea to >>lock memory pages where key material and cipher state resides. > > > I'm not sure I follow how this helps: who is it protecting against? If > you don't want the user to get access to cipher info, requiring root > access isn't much of a barrier (any hacker will have root on his own > box). And one user can't access the memory of another user's > processes. I'm not disputing the technique, I just don't understand > when to apply it. > > For example, just the other day I was interviewing a candidate (did I > mention we are hiring?) who aggregates poker stats on other players. > Despite all sorts of clever on-the-wire encryption, he just figured out > where all the stats are kept in plaintext in memory and tapped into > that. Doh! > > Ultimately, it's never a good idea to send data to a client that you > don't want to fall into the wrong hands. Memory protection might stop a > non-root user from accessing his own memory, but this seems like a > boundary case (unless I'm misunderstanding it). 
> > -david > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > From lemonobrien at yahoo.com Sun Dec 18 08:35:42 2005 From: lemonobrien at yahoo.com (Lemon Obrien) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <20051218000025.GC2876@cs.uoregon.edu> Message-ID: <20051218083542.13692.qmail@web53609.mail.yahoo.com> lots of peer to peer networks never support NAT tranversal cause it always encryption and privacy....users of eDonkey configure their router for its use...port changing should be stanard practice for any internet application; and know < 1024 is taken as a general rule. Daniel Stutzbach wrote: On Sat, Dec 17, 2005 at 06:43:30PM -0500, Eunsoo Shim wrote: > >I mean those that can receive unsolicited TCP and UDP packets on the > >Kad/eMule ports. Either they must not be firewalled/NATed or the user > >must manually punch a whole to redirect those ports from the firewall > >device. > > > So port 80 or 443 is NOT used at all for Kad Network? Not normally, no. eMule lets the user configure their peer to use ports other than the default, so they could use any port they want in that case. But the vast majority of peers do not use port 80 or 443 at all. > "Iterative" DHT routing is inefficient compared to "recursive" one. > Is "iterative" routing used because of a concern about DoS attacks? Kademlia is an inherently iterative DHT. I suspect the Kad developers used iterative routing simply because they chose Kademlia as a starting point. I'm not one of the Kad developers though, so I can only guess at the reasons behind their design decisions. I'd observe, though, that iterative routing is much easier to debug. -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences You don't get no juice unless you squeeze Lemon Obrien, the Third. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20051218/483e6dc8/attachment.html From eunsoo at research.panasonic.com Sun Dec 18 13:39:54 2005 From: eunsoo at research.panasonic.com (Eunsoo Shim) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia In-Reply-To: <20051218000025.GC2876@cs.uoregon.edu> References: <43A34242.3070203@research.panasonic.com> <20051217225926.GB2876@cs.uoregon.edu> <43A4A2A2.6080207@research.panasonic.com> <20051218000025.GC2876@cs.uoregon.edu> Message-ID: <43A566AA.6030501@research.panasonic.com> Thanks a lot, Daniel. Your information helped me a lot. Eunsoo Daniel Stutzbach wrote: >On Sat, Dec 17, 2005 at 06:43:30PM -0500, Eunsoo Shim wrote: > > >>>I mean those that can receive unsolicited TCP and UDP packets on the >>>Kad/eMule ports. Either they must not be firewalled/NATed or the user >>>must manually punch a whole to redirect those ports from the firewall >>>device. >>> >>> >>> >>So port 80 or 443 is NOT used at all for Kad Network? >> >> > >Not normally, no. 
eMule lets the user configure their peer to use >ports other than the default, so they could use any port they want in >that case. But the vast majority of peers do not use port 80 or 443 >at all. > > > >>"Iterative" DHT routing is inefficient compared to "recursive" one. >>Is "iterative" routing used because of a concern about DoS attacks? >> >> > >Kademlia is an inherently iterative DHT. I suspect the Kad developers >used iterative routing simply because they chose Kademlia as a >starting point. I'm not one of the Kad developers though, so I can >only guess at the reasons behind their design decisions. > >I'd observe, though, that iterative routing is much easier to debug. > > > From osokin at osokin.com Sun Dec 18 18:30:35 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <43A4F879.8050709@quinthar.com> Message-ID: > If only I had access to a frickin' huge printer... You're in luck! Just two days ago we finally managed to remove a few remaining bottlenecks and now are doing stable 2,000 pages per minute. Your main problem will be paper, actually - you'll need a lot... Best wishes - S.Osokine. 18 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of David Barrett Sent: Saturday, December 17, 2005 9:50 PM To: Peer-to-peer development. Subject: Re: [p2p-hackers] amicima MFP and crypto upgrades Well the real money is in bulk counterfeiting. If only I had access to a frickin' huge printer... Serguei Osokine wrote: > On Saturday, December 17, 2005 David Barrett wrote: > >>For example, just the other day I was interviewing a candidate (did >>I mention we are hiring?) who aggregates poker stats on other players. > > > Sounds like you're finally switching your development into areas > that can actually bring heaps of money. I always thought that cheating > in poker should be more profitable than P2P content delivery - and now > your hiring approach seems to validate that. Good luck! > > Best wishes - > S.Osokine. > 17 Dec 2005. > > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of David Barrett > Sent: Saturday, December 17, 2005 3:01 PM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] amicima MFP and crypto upgrades > > > On Sat, 17 Dec 2005 1:58 pm, coderman wrote: > >>On 12/17/05, Matthew Kaufman wrote: >> >>> ... >>> 2. We've significantly upgraded the "MFP defcrypto" default >>>cryptographic >>> plug-in. >> >>i forgot to mention this previously but it is always a good idea to >>lock memory pages where key material and cipher state resides. > > > I'm not sure I follow how this helps: who is it protecting against? If > you don't want the user to get access to cipher info, requiring root > access isn't much of a barrier (any hacker will have root on his own > box). And one user can't access the memory of another user's > processes. I'm not disputing the technique, I just don't understand > when to apply it. > > For example, just the other day I was interviewing a candidate (did I > mention we are hiring?) who aggregates poker stats on other players. > Despite all sorts of clever on-the-wire encryption, he just figured out > where all the stats are kept in plaintext in memory and tapped into > that. Doh! > > Ultimately, it's never a good idea to send data to a client that you > don't want to fall into the wrong hands. 
Memory protection might stop a > non-root user from accessing his own memory, but this seems like a > boundary case (unless I'm misunderstanding it). > > -david > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From coderman at gmail.com Sun Dec 18 19:37:46 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <200512180047.jBI0l7U10135@where.matthew.at> References: <4ef5fec60512171333m2dcdeb04h5dabd339d41fc576@mail.gmail.com> <200512180047.jBI0l7U10135@where.matthew.at> Message-ID: <4ef5fec60512181137y334669e6qc38f475cb9fc298d@mail.gmail.com> On 12/17/05, Matthew Kaufman wrote: > ... > Not "always". The right answer is "sometimes". > > If you wish to do so, on some systems you can use memory region locking to > prevent the cryptographic material from being paged out to permanent storage > media, in theory (see below). ... > > However, that's just one of several possible attacks you might want to guard > against, or other requirements you might have. very true. i suppose if you are this concerned about key secrecy you'd also want to ensure other side channels / application security is as well protected. are patches for MF* accepted in general? is copyright assignment required? thanks for the detailed response. From matthew at matthew.at Sun Dec 18 22:39:22 2005 From: matthew at matthew.at (Matthew Kaufman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <4ef5fec60512181137y334669e6qc38f475cb9fc298d@mail.gmail.com> Message-ID: <200512182237.jBIMbSU13180@where.matthew.at> coderman: > > very true. i suppose if you are this concerned about key > secrecy you'd also want to ensure other side channels / > application security is as well protected. Exactly so. Remembering, of course, that the attacker is always looking for the path of least resistance. Encrypt everything on the disk with a strong passphrase? Better make sure there's no keylogger installed, Encrypting your VOIP chat? Better make sure there's no bug glued to the bottom of your desk. Etc. > are patches for MF* accepted in general? is copyright > assignment required? The code is under GPL. Self-published patches that modify the code are of course just fine, and that keeps complete control of the patch in your hands as long as any distribution of the patched code you do complies with the GPL. If you want patches rolled back into our distributed code, copyright assignment is required since we not only need to try to keep compatibility with them (and so we might "patch a patch", and don't want our GPL-publication-right of that getting confusing), but we have commercial licensees who we need to grant rights to. Exceptions might be made in exceptional circumstances where a GPL/non-GPL fork really makes sense. We're also open to suggestions for changes... 
Just ask, and we might write it into the next release for you :) One thing I know will be in the next release as a response to a request, for instance, is a change to the 'extern "C"' handling in the headers, to make life easier for C++ programmers, particularly on win32 where there's C vs C++ system include file issues. Matthew Kaufman matthew@matthew.at http://www.amicima.com From coderman at gmail.com Mon Dec 19 04:23:00 2005 From: coderman at gmail.com (coderman) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] amicima MFP and crypto upgrades In-Reply-To: <200512182237.jBIMbSU13180@where.matthew.at> References: <4ef5fec60512181137y334669e6qc38f475cb9fc298d@mail.gmail.com> <200512182237.jBIMbSU13180@where.matthew.at> Message-ID: <4ef5fec60512182023k49f46bb1u755a0b214a3a271c@mail.gmail.com> On 12/18/05, Matthew Kaufman wrote: > ... If you want patches rolled back into our distributed code, copyright > assignment is required since we not only need to try to keep compatibility > with them (and so we might "patch a patch", and don't want our > GPL-publication-right of that getting confusing), but we have commercial > licensees who we need to grant rights to. Exceptions might be made in > exceptional circumstances where a GPL/non-GPL fork really makes sense. yeah, that is common and makes sense. i just hadn't seen this explicitly stated so i was curious. > One thing I know will be in the next release as a response to a request, for > instance, is a change to the 'extern "C"' handling in the headers, to make > life easier for C++ programmers, particularly on win32 where there's C vs > C++ system include file issues. that would be handy; i'd like to use some of this framework in a c++ project in the near future. thanks for update... From john.casey at gmail.com Mon Dec 19 10:45:52 2005 From: john.casey at gmail.com (John Casey) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Decentralized search engines In-Reply-To: References: Message-ID: Hi Simon, have you thought of using the Apache groups lucene search engine and crawler ?? http://lucene.apache.org/java/docs/index.html On 12/8/05, SIMON Gwendal RD-MAPS-ISS wrote: > In comparison with traditional filesharing approaches, a decentralized search for the web should take into account words inside the documents. > > As previously said, we are working on a system namely Maay which aims at performing a decentralized and personalized search on a distributed set of textual documents. > > http://maay.netofpeers.net > > Each node (said computer) can publish a set of documents. This information space does not initially contain the web. Our idea is to consider that the cache (or history) of the web browser should be, by default, included in the published set of documents. So, every page that has been visited by at least one people since x days will be available in the network. Obviously, more popular a page is, more available it is. > > By the way, one first challenge is the implementation of a nice crawler for owned documents : an indexer. This indexer should be able to scan and retrieve words from various documents (.html, .doc, .pdf, ...). It should be light and run in idle time and, if possible, be cross-platform. If you know a good open-source indexer, please let us know. > > > -- Gwendal > > > > > > > > > > > > > > -----Message d'origine----- > > De : p2p-hackers-bounces@zgp.org > > [mailto:p2p-hackers-bounces@zgp.org] De la part de Ludovic Court?s > > Envoy? : mercredi 7 d?cembre 2005 17:19 > > ? 
: strib@MIT.EDU > > Cc : Peer-to-peer development.; zooko@zooko.com > > Objet : [p2p-hackers] Decentralized search engines > > > > Hi, > > > > Jeremy Stribling writes: > > > > > Working on it. Should have something public within a few months: > > > > > > http://pdos.csail.mit.edu/papers/overcite:iptps05/index.html > > > > Indeed, that seems very promising! > > > > Similarly, are there people working on decentralized web indexing and > > search engines? To paraphrase Zooko, it would be nice to decentralize > > Google before it is too late... > > > > Thanks, > > Ludovic. > > _______________________________________________ > > p2p-hackers mailing list > > p2p-hackers@zgp.org > > http://zgp.org/mailman/listinfo/p2p-hackers > > _______________________________________________ > > Here is a web page listing P2P Conferences: > > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From ap at hamachi.cc Mon Dec 19 19:26:56 2005 From: ap at hamachi.cc (Alex Pankratov) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Google releases something P2P In-Reply-To: <46c2f4ab0512160333u6d5e326er308b13c36d0a0ad0@mail.gmail.com> References: <46c2f4ab0512160333u6d5e326er308b13c36d0a0ad0@mail.gmail.com> Message-ID: <43A70980.70109@hamachi.cc> Bram Neijt wrote: > > [The P2P component] "Negotiates, establishes, and maintains > peer-to-peer connections through almost any network configuration > regardless of NAT devices and firewalls. The p2p component understands > the Jingle spec to initiate the session and then provides a > sockets-like interface for sending and receiving data that is used by > the session component to add functionality." > Tunneling method summary (based on what's in the actual code) - STUN-based NAT traversal complimented by an option of relaying data through 3rd node for cases that STUN cannot handle. Alex From threelions0916 at yahoo.com.cn Tue Dec 27 01:54:15 2005 From: threelions0916 at yahoo.com.cn (Michael Liu) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kad Network with Kademlia References: <43A34242.3070203@research.panasonic.com> <1134850905.F9FF157@dl11.dngr.org><20051217225926.GB2876@cs.uoregon.edu> <43A4A2A2.6080207@research.panasonic.com> Message-ID: <005c01c60a88$88580e20$1d18080a@cnc.intra> ----- Original Message ----- From: "Eunsoo Shim" To: "Peer-to-peer development." Sent: Sunday, December 18, 2005 7:43 AM Subject: Re: [p2p-hackers] Kad Network with Kademlia > > >>>>I measured it to be around a million non-firewalled peers, although > >>>>that was a few months back. > >>>> > >>>> > >>Daniel, by "non-firewalled" do you mean truly, those that aren't behind > >>a firewall, or rather those for which NAT/firewall traversal doesn't > >>work? > >> > >> > > > >I mean those that can receive unsolicited TCP and UDP packets on the > >Kad/eMule ports. Either they must not be firewalled/NATed or the user > >must manually punch a whole to redirect those ports from the firewall > >device. > > > > > So port 80 or 443 is NOT used at all for Kad Network? > > >Kad uses "iterative" DHT routing. If I'm a client and want to do a > >lookup, I query some of my contacts to get their next hop for my > >target, then I query that peer for it's next hop, etc. 
Therefore, > >it's important that any host participating in Kad's DHT routing > >structure be able to receive unsolicited packets. > > > > > > > "Iterative" DHT routing is inefficient compared to "recursive" one. > Is "iterative" routing used because of a concern about DoS attacks? > Thanks. I wonder why 'Iterative' DHT routing is less efficient than 'recursive' one? it seems the efficiency are same ... I am also astonished to find DHT can support millions of active nodes, it's fantastic . > > Eunsoo > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From gojomo at bitzi.com Thu Dec 29 22:41:33 2005 From: gojomo at bitzi.com (Gordon Mohr) Date: Sat Dec 9 22:13:06 2006 Subject: Google 'Safe Browsing' vs. RESTful authorization Re: [p2p-hackers] Re: [rest-discuss] Re: RESTful authorization In-Reply-To: <5691356f050929103656e4f30f@mail.gmail.com> References: <5691356f05092710332a623de2@mail.gmail.com> <1127850423.3A956345@bd12.dngr.org> <1127901049.15818.47.camel@p-dvsi-418-1.rd.francetelecom.fr> <5691356f0509280755501a1c1d@mail.gmail.com> <1127922062.15818.82.camel@p-dvsi-418-1.rd.francetelecom.fr> <5691356f050929103656e4f30f@mail.gmail.com> Message-ID: <43B4661D.5030301@bitzi.com> I like Tyler's notion of SSL-passed 'capability URLs', and had occasion to think about them again when reading the following: Two Things That Bother Me About Google?s New Firefox Extension http://www.oreillynet.com/pub/wlg/8760 "1) Every request is transmitted to Google over HTTP, i.e. in clear-text. This is not good. Here is why: Consider a web application that uses SSL to encrypt the session. If this web application were to submit private information about you via a GET request (i.e in the URL, such as a credit card number), this will now be transmitted to http://www.google.com/safebrowsing/lookup in clear-text, allowing someone on your network segment, or any router in between yourself and google.com to sniff the information off the wire." Asking a trusted third party their opinion of an URL seems a reasonable anti-phishing measure. But if that "trusted" third party is careless in its handling of HTTPS URLs, as it appears Google has been in this design, the prerequisite URL secrecy required for capability URLs will be often and casually violated. Of course, any security measure can be thwarted with sufficient carelessness, and in this case the onus should be on Google to fix this oversight, and respect the privacy of HTTPS URL requests. But it's been 2 weeks since this problem was highlighted, and it remains unfixed and mostly undiscussed. That suggests to me that people's expectations of HTTPS URL secrecy -- and of the standards that toolbar/extension makers like Google should be held to in protecting user secrets -- are pretty low. Perhaps so low that capability URLs would only be usable by the hyper-conscientious, who generally do fine under any system. - Gordon Tyler Close wrote: > Hi Antoine, > > On 9/28/05, Antoine Pitrou wrote: >>> On 9/28/05, Antoine Pitrou wrote: >>>> I'm curious as to how "capability URLs" can't be stolen and re-used by a >>>> malicious piece of Javascript like other URLs can. >>> Simply because a capability URL is unguessable. >> It is permanent too, > > No, the lifetime of the URL is up to the application designer. 
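To make "unguessable" concrete: a capability URL needs a secret component with enough entropy that it cannot be brute-forced, for example 128 bits from the operating system's CSPRNG. A minimal sketch of minting one is below; the host name and path prefix are invented for illustration, and, per the point above, such a URL only stays secret if it always travels over HTTPS and is never leaked to third-party lookup services.

/* Mint an unguessable capability URL: a 128-bit random token,
 * hex-encoded into the path.  Host and path are illustrative. */
#include <stdio.h>

int main(void)
{
    unsigned char token[16];
    char hex[2 * sizeof token + 1];
    FILE *rnd = fopen("/dev/urandom", "rb");

    if (rnd == NULL || fread(token, 1, sizeof token, rnd) != sizeof token) {
        perror("/dev/urandom");
        return 1;
    }
    fclose(rnd);

    for (size_t i = 0; i < sizeof token; i++)
        sprintf(hex + 2 * i, "%02x", token[i]);

    /* The secret lives in the URL itself, so serve it over HTTPS only. */
    printf("https://example.com/resource/%s\n", hex);
    return 0;
}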
Using > capability URLs does not place any duration restrictions on how you > define the lifetime of your resources. > >> And you have to keep this URL somewhere... Given that it's full of >> random ascii garbage, you can't keep it in your head (contrast this with >> a properly chosen password), and you don't want to copy it by hand >> either. So it /will/ end up in electronic clickable form somewhere: for >> example in your bookmarks. > > I think keeping capability URLs in your bookmarks is a perfectly > sensible thing to do, providing you then protect your bookmarks. I run > OS X, so my entire filesystem is encrypted, including my bookmarks > file. > >> >From your own explanation on the REST mailing-list : ? The user just >> *clicks on hyperlinks*, without ever needing to be aware of the resource >> password. ? Those hyperlinks have to be somewhere... >> >> (and of course, this totally mandates HTTPS, which is impossible for >> most Web sites for reasons I already explained) > > This argument is a little out of date. You can get affordable HTTPS > hosting from providers like GoDaddy and 1and1. Even before the advent > of shared hosting for HTTPS, colocation was already an affordable > option and likely required anyways in order to get the performance > characteristics that you want for your web application. For very small > scale projects, running Apache on your home machine with a dyndns > hostname and a 7.99 SSL certificate is also doable. > >> As a mix proposal, it would be more interesting if a new URL was >> generated everytime the user identifies (with login/password). More >> interesting again, it could be generated client side in Javascript using >> a formula like "HASH(HASH(password) + challenge)" where the challenge is >> a temporary value generated by the server for this very session (thus >> with an expiration time). Which means: >> - the URL is temporary (it expires with the challenge) >> - this URL does not need to be recorded anywhere on the client since >> it's generated at every new login >> - in plain non-encrypted HTTPS, the data which goes over the wire only >> gives temporary access to the resource > > When I attend DefCon, I am always amazed that people are surprised by > the Wall of Sheep, people who know that network snooping is possible. > I guess you just have to experience the efficiency of live network > snooping in order to truly appreciate it. > > With the rise of ubiquitous WiFi, passing secrets, even temporary > ones, over the network in the clear is asking for trouble. Your 15 > minute session timeout is an eon on the timescale of a script watching > the network for your protocol and exploiting it on the fly. > > SSL has finally come into a somewhat reasonable price range. We should > go ahead and exploit it. We don't need to mess around with dodgy > timeout based designs. > > Thanks again for the questions. > > Tyler > > -- > The web-calculus is the union of REST and capability-based security: > http://www.waterken.com/dev/Web/ > > Name your trusted sites to distinguish them from phishing sites. 
> https://addons.mozilla.org/extensions/moreinfo.php?id=957 > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From marco at bice.it Fri Dec 30 09:20:28 2005 From: marco at bice.it (marco@bice.it) Date: Sat Dec 9 22:13:06 2006 Subject: [p2p-hackers] Kademlia and Java Message-ID: <20051230102028.9bg4nvglf1b4cw40@webmail.bice.it> I'm looking at a Java implementation of the Kademlia API. http://kademlia.scs.cs.nyu.edu doesn't work, so I was wondering if someone have experienced the Plan X 0.4.12 library. There is something called "org.planx.xmlstore.routing.Kademlia" mentioned in the Javadoc, that could be useful. Is there someone who can help me? Where to find the library and is it useful? Is there something else implementing Kademlia? Thank you very much. Marco
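Tying Marco's question back to the earlier Kad discussion in this thread: whichever implementation one picks, the core of Kademlia is the XOR distance between 160-bit IDs, and a lookup is an iterative loop that keeps asking the closest known contacts for contacts even closer to the target. The sketch below shows the metric, with the lookup loop outlined in a comment; it is not taken from Plan X, eMule/Kad, or any other implementation, and all names in it are invented for illustration.

/* Minimal sketch of Kademlia's XOR metric, with the iterative
 * FIND_NODE loop outlined in a comment.  Illustrative only. */
#include <stdio.h>

#define ID_BYTES 20                      /* 160-bit node IDs */

typedef struct { unsigned char b[ID_BYTES]; } kad_id;

/* Returns <0, 0 or >0 depending on whether `a` is closer to, as close
 * to, or farther from `target` than `b`, under the XOR metric. */
static int xor_closer(const kad_id *a, const kad_id *b, const kad_id *target)
{
    for (int i = 0; i < ID_BYTES; i++) {
        unsigned char da = a->b[i] ^ target->b[i];
        unsigned char db = b->b[i] ^ target->b[i];
        if (da != db)
            return (da < db) ? -1 : 1;
    }
    return 0;
}

int main(void)
{
    kad_id target = {{0}}, n1 = {{0}}, n2 = {{0}};
    n1.b[0] = 0x01;                      /* XOR distance 0x01... from target */
    n2.b[0] = 0x10;                      /* XOR distance 0x10... from target */

    printf("n1 is %s to the target than n2\n",
           xor_closer(&n1, &n2, &target) < 0 ? "closer" : "not closer");

    /* Iterative lookup, in outline:
     *   start with the k closest contacts from the local routing table;
     *   repeatedly send FIND_NODE(target) to the alpha closest
     *   not-yet-queried contacts, merge the contacts they return into
     *   the shortlist, and stop when a round yields nothing closer.
     * Every hop is a round trip made by the querying node itself, which
     * is why each participant must accept unsolicited packets. */
    return 0;
}

On the iterative-versus-recursive question raised above: the hop counts are comparable, but in iterative routing every hop is a full round trip back to the querying node, whereas recursive routing forwards the query along the overlay, so an iterative lookup generally costs more round trips and more work at the initiator; in exchange it is easier to debug and lets the initiator route around unresponsive nodes.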