From dcarboni at gmail.com Tue Feb 1 17:22:58 2005 From: dcarboni at gmail.com (Davide Carboni) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] simulator for p2p Message-ID: <71b79fa9050201092273a5f7ba@mail.gmail.com> Hi, is there any way to simulate a p2p network using a single PC? I know ns2, but it seems to be a very "low-level" simulation. I'd like something that simulates a network of peers while abstracting away the serialization of messages. For instance, I'd like to model peers as objects in memory that exchange messages by invoking each other's methods, while taking into account variables like bandwidth, latency, and so forth. Bye, Davide From srhea at cs.berkeley.edu Tue Feb 1 19:19:59 2005 From: srhea at cs.berkeley.edu (Sean C. Rhea) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] simulator for p2p In-Reply-To: <71b79fa9050201092273a5f7ba@mail.gmail.com> References: <71b79fa9050201092273a5f7ba@mail.gmail.com> Message-ID: On Feb 1, 2005, at 9:22 AM, Davide Carboni wrote: > is there any way to simulate a p2p network using a single PC? I know > ns2, but it seems to be a very "low-level" simulation. I'd like something > that simulates a network of peers while abstracting away the serialization > of messages. For instance, I'd like to model peers as objects in memory > that exchange messages by invoking each other's methods, while taking > into account variables like bandwidth, latency, and so forth. Bamboo (bamboo-dht.org) comes with a simple simulator that models latency based on real measurements (the data is from here: http://www.pdos.lcs.mit.edu/p2psim/kingdata/). It's a pretty simple event-driven simulator written in Java; the nice thing about it is that you can use the same code under simulation that you use on the real net. To use it, download the latest Bamboo CVS snapshot and try this:

cd bamboo/src/bamboo/sim
./make-startup-test.pl
../../../bin/run-java bamboo.sim.Simulator /tmp/startup-test.exp

It will start up 29 Bamboo nodes that will then form a Bamboo network.
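[Along the lines Davide describes, the peers-as-objects approach needs surprisingly little machinery. Below is a minimal sketch — not Bamboo's actual simulator, and all names are invented for illustration — of an event-driven simulator in Python that delivers method-call "messages" between peer objects after a per-link latency:]

```python
import heapq

class Simulator:
    """Toy event-driven simulator: events are (time, seq, callback, args)."""
    def __init__(self):
        self.now = 0.0
        self._seq = 0       # tie-breaker so heapq never compares callbacks
        self._queue = []

    def send(self, latency, callback, *args):
        # Schedule callback(*args) to fire `latency` ms from now.
        self._seq += 1
        heapq.heappush(self._queue, (self.now + latency, self._seq, callback, args))

    def run(self):
        while self._queue:
            self.now, _, callback, args = heapq.heappop(self._queue)
            callback(*args)

class Peer:
    def __init__(self, sim, name, latency_ms):
        self.sim, self.name, self.latency = sim, name, latency_ms

    def ping(self, sender):
        print("%6.1f ms: %s got ping from %s" % (self.sim.now, self.name, sender.name))
        self.sim.send(sender.latency, sender.pong, self)

    def pong(self, sender):
        print("%6.1f ms: %s got pong from %s" % (self.sim.now, self.name, sender.name))

sim = Simulator()
a, b = Peer(sim, "A", 40.0), Peer(sim, "B", 75.0)
sim.send(b.latency, b.ping, a)   # A pings B over B's 75 ms link
sim.run()                        # ping delivered at 75.0 ms, pong back at 115.0 ms
```

[Real simulators like Bamboo's replace the per-peer latency constant with measured all-pairs data such as the King dataset linked above, and modeling bandwidth and queuing on top of this is exactly where scaling gets hard.]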
It's a pretty simple example, but it should give you the idea. The PDOS group at MIT also has a simulator. It's at http://www.pdos.lcs.mit.edu/p2psim/. It uses threads instead of events, and C++ instead of Java. It also models only latency. Both of these simulators should be able to simulate 200-1000 nodes, depending on how much core memory your machine has. Modeling bandwidth is hard to do at scale. (It's one of the reasons NS2 doesn't scale too well.) Sean -- We are all in the gutter, but some of us are looking at the stars. -- Oscar Wilde -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050201/781b2a03/PGP.pgp From davidopp at cs.berkeley.edu Tue Feb 1 19:27:47 2005 From: davidopp at cs.berkeley.edu (David L. Oppenheimer) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] simulator for p2p In-Reply-To: Message-ID: <200502011927.LAA10387@mindbender.davido.com> > Bamboo (bamboo-dht.org) comes with a simple simulator that models > latency based on real measurements (the data is from here: > http://www.pdos.lcs.mit.edu/p2psim/kingdata/). It's a pretty simple > event-driven simulator written in Java; the nice thing about > it is that > you can use the same code under simulation that you use on the real > net. And because you can run the same code on the "real net," you can run the same code under emulation on a cluster to study bandwidth effects. David From srhea at cs.berkeley.edu Tue Feb 1 19:51:51 2005 From: srhea at cs.berkeley.edu (Sean C. Rhea) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] simulator for p2p In-Reply-To: <200502011927.LAA10387@mindbender.davido.com> References: <200502011927.LAA10387@mindbender.davido.com> Message-ID: <1f6262437486549bbe1834eb3149f490@cs.berkeley.edu> On Feb 1, 2005, at 11:27 AM, David L. 
Oppenheimer wrote: >> Bamboo (bamboo-dht.org) comes with a simple simulator that models >> latency based on real measurements (the data is from here: >> http://www.pdos.lcs.mit.edu/p2psim/kingdata/). It's a pretty simple >> event-driven simulator written in Java; the nice thing about >> it is that you can use the same code under simulation that you use on >> the real net. > > And because you can run the same code on the "real net," you can run > the > same code under emulation on a cluster to study bandwidth effects. That's a good point. We run the same code under the Bamboo simulator, on a local cluster using ModelNet (http://issg.cs.duke.edu/modelnet.html) to provide wide-area-like latency and bandwidth restrictions, and on PlanetLab (http://planet-lab.org/). Sean -- An atheist doesn't have to be someone who thinks he has a proof that there can't be a god. He only has to be someone who believes that the evidence on the God question is at a similar level to the evidence on the were-wolf question. -- John McCarthy -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050201/5ed10b82/PGP.pgp From hopper at omnifarious.org Wed Feb 2 03:17:28 2005 From: hopper at omnifarious.org (Eric M. Hopper) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] simulator for p2p In-Reply-To: <71b79fa9050201092273a5f7ba@mail.gmail.com> References: <71b79fa9050201092273a5f7ba@mail.gmail.com> Message-ID: <1107314248.25868.59.camel@bats.omnifarious.org> On Tue, 2005-02-01 at 18:22 +0100, Davide Carboni wrote: > Hi, > is there any way to simulate a p2p network using a single PC? I know > ns2 but it seems very "low-level" simulation. I'd like something to > simulate a network of peers abstracting from the serialization of > messages. 
For instance, I'd like to model peers like objects in memory > which exchange messages invoking methods each other but taking into > account variables like the bandwidth, the latency and so forth. You could probably write a replacement for SocketModule in my StreamModule framework (http://www.omnifarious.org/StrMod/) that could simulate some of the latency characteristics of a network connection. If you wrote the code to use StreamModule, you could then put in real SocketModules instead and it would work over a real network with no other changes. Have fun (if at all possible), -- Eric Hopper (hopper@omnifarious.org http://www.omnifarious.org/~hopper) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 185 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050201/aa1d07bf/attachment.pgp From sdaswani at gmail.com Wed Feb 2 06:35:12 2005 From: sdaswani at gmail.com (Susheel Daswani) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Altnet Patent Message-ID: <1cd056b905020122353cd6ad68@mail.gmail.com> Hey Folks, I'm not sure how everyone is handling the Altnet patent threat, but in my studies I've come across some salient points regarding patent infringement: "For an accused product to literally infringe a patent, EVERY element contained in the patent claim must also be present in the accused product or device. If a claimed apparatus has five parts, or 'elements', and the allegedly infringing apparatus has only four of those five, it does not literally infringe. This is true even though the defendant may have copied the four elements exactly, and regardless of how significant or insignificant the missing element is." 'Intellectual Property in the New Technological Age', 3rd Edition, page 230 This may already be known, but I thought I'd put it out there. 
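[The "all elements" rule quoted above reduces to a simple set check — a cartoon, of course; real claim construction is far subtler — but it captures both quotations at once:]

```python
def literally_infringes(claim_elements, product_features):
    """All-elements rule: literal infringement only if EVERY claim element
    appears in the product; extra product features make no difference."""
    return set(claim_elements) <= set(product_features)

# Hypothetical claim elements, invented for illustration.
claim = {"hash-id files", "lookup by hash", "redirect to peer"}

# A product missing even one element does not literally infringe...
print(literally_infringes(claim, {"hash-id files", "lookup by hash"}))           # False
# ...and adding new features cannot help a defendant escape infringement:
print(literally_infringes(claim, claim | {"caching", "ranking", "encryption"}))  # True
```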
So everyone should analyse their hashing systems to see how they compare to Altnet's patent elements. If you don't do everything they do, you can ignore their dinky letter :). I'm going to analyse their claims soon and compare them to the systems I know. Some more interesting information, which is probably obvious: "[I]t does not matter [if] a defendant has ADDED several new elements -- adding new features cannot help a defendant escape infringement." Susheel From samnospam at bcgreen.com Wed Feb 2 09:06:42 2005 From: samnospam at bcgreen.com (Stephen Samuel (leave the email alone)) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Altnet Patent (Prior art) In-Reply-To: <1cd056b905020122353cd6ad68@mail.gmail.com> References: <1cd056b905020122353cd6ad68@mail.gmail.com> Message-ID: <42009822.1030904@bcgreen.com> I'm thinking that one well-documented example of prior art for the Altnet patent might be the PGP keyserver network, which identifies and distributes PGP keys by their hash IDs. In the case of pgp.net, there are actually a couple of lengths of hash keys: short, long, and fingerprint. Susheel Daswani wrote: > Hey Folks, > I'm not sure how everyone is handling the Altnet patent threat, but in > my studies I've come across some salient points regarding patent > infringement: -- Stephen Samuel +1(604)876-0426 samnospam@bcgreen.com http://www.bcgreen.com/ Powerful committed communication. Transformation touching the jewel within each person and bringing it to light. From aloeser at cs.tu-berlin.de Thu Feb 3 10:32:18 2005 From: aloeser at cs.tu-berlin.de (Alexander =?iso-8859-1?Q?L=F6ser?=) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? Message-ID: <4201FDB1.6F607C0C@cs.tu-berlin.de> Hey all, structured overlay networks based on DHTs, such as Pastry and Chord among others, have been investigated in the past as a way to construct scalable and performance-oriented peer-to-peer networks.
However, unstructured networks, such as Gnutella or Kazaa, are still widely used among the file-sharing community. Recently, researchers proposed extensions to unstructured networks based on the small-world idea: peers dynamically create shortcuts to other peers based on their interests. Over time, peers with the same interests become direct neighbors through these shortcuts and form interest-based clusters. Hence peers no longer flood messages but instead partly route their queries via an interest-based/semantic overlay. Examples are described in [1] [2] among others. Comparing small-world and DHT approaches is a difficult task, since simulations usually differ in scenarios, data sets or simulation methodology. I'm interested in scenarios and arguments PRO small-world overlays for unstructured networks. Does anybody know of actual theoretical or practical work that compares both approaches in different scenarios (high churn, no super peers, keyword-based search, metadata-based search)? Which scenarios or arguments support small-world approaches for unstructured networks? Alex [1] Gia - Making Gnutella like P2P Systems Scalable http://berkeley.intel-research.net/sylvia/1103-chawathe.pdf http://seattle.intel-research.net/people/yatin/publications/talks/sigcomm2003-gia.ppt [2] Efficient Content Location Using Interest Based Locality in Peer-to-Peer Systems http://www.ieee-infocom.org/2003/papers/53_01.PDF -- ___________________________________________________________ Alexander Löser Technische Universitaet Berlin hp: http://cis.cs.tu-berlin.de/~aloeser/ office: +49- 30-314-25551 fax : +49- 30-314-21601 ___________________________________________________________ From zooko at zooko.com Thu Feb 3 12:43:26 2005 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Re: TCP thru' double NAT?
In-Reply-To: <4201DC25.6070508@ucla.edu> References: <409EC974.9000007@vaste.mine.nu> <409ECB89.8010408@locut.us> <20040512113330.GA2606@bitchcake.off.net> <4201DC25.6070508@ucla.edu> Message-ID: <90cc48fc65f4f090eb9e558264a311db@zooko.com> [responding on-list to off-list query] > I know this p2p-hackers message is from loooong ago, but I had a quick > question -- does the TCP relay currently implemented in Mnet use the > technique described in Section 3.5 of that document? At the end it > says that "Unfortunately, this trick may be even more fragile and > timing-sensitive than the UDP port number prediction trick described > above... Applications that require efficient, direct peer-to-peer > communication over existing NATs should use UDP." It doesn't sound > like a technique to get good results with, although you report success > -- so I was just curious. Hi Michael: The Mnet hack is low-tech. A node which is not behind NAT or firewall volunteers to be a relay server. It receives msgs from node A via TCP and sends them to node B via TCP, all in user-land. There are plenty of obvious drawbacks, but it works for Mnet's purposes. I believe Skype does something similar, when Skype's more efficient alternatives fail. Regards, Zooko --- Please excuse terse writing -- there is a baby in my arms. From Bernard.Traversat at Sun.COM Thu Feb 3 14:04:31 2005 From: Bernard.Traversat at Sun.COM (Bernard Traversat) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? In-Reply-To: <4201FDB1.6F607C0C@cs.tu-berlin.de> References: <4201FDB1.6F607C0C@cs.tu-berlin.de> Message-ID: <42022F6F.1040707@Sun.COM> You may want to look at JXTA (www.jxta.org) which provides an hybrid architecture allowing you to deploy both structured and ad hoc unstructured P2P network overlays. Cheers, B. 
Alexander L?ser wrote: > Hey all, > structured overlay networks based on DHT's, such as Pastry and Chord > among others, have been investigated in the past to construct scalable > and performance orientated peer-to-peer networks. However, unstructured > networks, such as Gnutella or Kazaa, are still widely used among the > file sharing community. Recently researchers proposed extensions to > unstructured networks networks based on the small world idea: peers > dynamically create shortcuts to other peers based on their interests. > Over a while peers with the same interests became direct neighbors > through its shortcuts and build interest based clusters. Hence peers > no longer flood messages but partly route it's queries via a interested > based/semantic overlay. Examples are described in [1] [2] among > others. > > Comparing small world and DHT approaches is a difficult task, since > simulations usually differ in scenarios, data sets or simulation > methodology. I'm interested in scenarios and arguments PRO small > world overlays for unstructured networks. Does anybody now actual > theoretic or practical work that compares both approaches in different > scenarios (high churn, no super peers, key word based search, meta data > based search)? Which scenarios or arguments support small world > approaches for unstructured networks? 
> > Alex > > > > > [1] Gia - Making Gnutella like P2P Systems Scalable > http://berkeley.intel-research.net/sylvia/1103-chawathe.pdf > http://seattle.intel-research.net/people/yatin/publications/talks/sigcomm2003-gia.ppt > > [2] Efficient Content Location Using Interest Based Locality in > Peer-to-Peer Systems > http://www.ieee-infocom.org/2003/papers/53_01.PDF > -- > ___________________________________________________________ > > Alexander L?ser > Technische Universitaet Berlin > hp: http://cis.cs.tu-berlin.de/~aloeser/ > office: +49- 30-314-25551 > fax : +49- 30-314-21601 > ___________________________________________________________ > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From gbildson at limepeer.com Thu Feb 3 15:26:25 2005 From: gbildson at limepeer.com (Greg Bildson) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? In-Reply-To: <4201FDB1.6F607C0C@cs.tu-berlin.de> Message-ID: I'd just like to point out that Gnutella does not use pure flooding anymore and you are unlikely to find P2P networks that don't have something akin to supernodes. Gnutella uses bloom filter based keyword index replication and dynamic querying (selectively sending out queries until a result limit is reached) to reduce the overhead of flooding for popular queries and to route all queries on the last hop. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Alexander L?ser > Sent: Thursday, February 03, 2005 5:32 AM > To: p2p-hackers@zgp.org > Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? 
> > > Hey all, > structured overlay networks based on DHT's, such as Pastry and Chord > among others, have been investigated in the past to construct scalable > and performance orientated peer-to-peer networks. However, unstructured > networks, such as Gnutella or Kazaa, are still widely used among the > file sharing community. Recently researchers proposed extensions to > unstructured networks networks based on the small world idea: peers > dynamically create shortcuts to other peers based on their interests. > Over a while peers with the same interests became direct neighbors > through its shortcuts and build interest based clusters. Hence peers > no longer flood messages but partly route it's queries via a interested > based/semantic overlay. Examples are described in [1] [2] among > others. > > Comparing small world and DHT approaches is a difficult task, since > simulations usually differ in scenarios, data sets or simulation > methodology. I'm interested in scenarios and arguments PRO small > world overlays for unstructured networks. Does anybody now actual > theoretic or practical work that compares both approaches in different > scenarios (high churn, no super peers, key word based search, meta data > based search)? Which scenarios or arguments support small world > approaches for unstructured networks? 
> > Alex > > > > > [1] Gia - Making Gnutella like P2P Systems Scalable > http://berkeley.intel-research.net/sylvia/1103-chawathe.pdf > http://seattle.intel-research.net/people/yatin/publications/talks/ > sigcomm2003-gia.ppt > > [2] Efficient Content Location Using Interest Based Locality in > Peer-to-Peer Systems > http://www.ieee-infocom.org/2003/papers/53_01.PDF > -- > ___________________________________________________________ > > Alexander L?ser > Technische Universitaet Berlin > hp: http://cis.cs.tu-berlin.de/~aloeser/ > office: +49- 30-314-25551 > fax : +49- 30-314-21601 > ___________________________________________________________ > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From gwendal.simon at francetelecom.com Thu Feb 3 15:49:10 2005 From: gwendal.simon at francetelecom.com (SIMON Gwendal RD-MAPS-ISS) Date: Sat Dec 9 22:12:50 2006 Subject: TR: [p2p-hackers] Paradigma Question: DHT's or Small World? Message-ID: Hi, Here are two assumptions that advocate for small-world. The first one, related to the human language, has been partially established by several studies [1,2] since the pioneering work of [3]. The graph of word interactions is constructed by linking two words when they co-occur in a sentence (a fortiori in a file). The study of the properties of these graphs shows they exhibit the small world effect and a scale-free distribution of degrees. The second assumption follows the observations you cite and some others [4,5,6]. The data-sharing graph is constructed by linking two users when they share a same file. Observations on several real traces show that this graph exhibits also the small-world effect and the scale-free distribution of degrees. 
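[The data-sharing graph Gwendal describes is easy to construct from a trace. Here is a toy illustration in Python — the trace is invented; real studies use crawls of Gnutella/Kazaa or web-cache logs — linking two users whenever they share the same file:]

```python
from collections import defaultdict
from itertools import combinations

# Invented trace: user -> set of files she stores.
library = {
    "ana":   {"f1", "f2", "f3"},
    "bob":   {"f2", "f4"},
    "carla": {"f3", "f4", "f5"},
    "dan":   {"f6"},
}

# Data-sharing graph: an edge links two users who share at least one file.
edges = {(u, v) for u, v in combinations(sorted(library), 2)
         if library[u] & library[v]}

degree = defaultdict(int)
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

print(sorted(edges))   # [('ana', 'bob'), ('ana', 'carla'), ('bob', 'carla')]
print(dict(degree))    # dan shares nothing, so he has no edges at all
```

[On real traces, the degree distribution of this graph comes out heavy-tailed (scale-free) and the graph shows high clustering with short paths (small-world), which is what motivates interest-based shortcuts in the first place.]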
Besides, it is known that the lexicon of a human contains a few thousand words. This lexicon and the words contained in the documents which have been produced and downloaded by a user define her "semantic profile". Through the preceding assumptions, we naturally infer that the graph generated by linking users whose semantic profiles overlap is also small-world and scale-free. That is, if we consider that users emit requests on keywords chosen within their profile, we can expect that almost *all* files of interest to a user are stored by a small set of "friends". Moreover, these "friends" are already known to the user thanks to previous successful queries. Therefore, it is possible to limit the search to a subspace of the information space without hurting the quality of responses. On the contrary, it is probable that these responses are more relevant from the requester's point of view. For instance, a fan of "Fiona Apple" will discover MP3s of Fiona Apple and not information on Apple Inc. or webpages on "apple pie" cooking. And a European querying for information on "football" will not receive pages on the NFL. By the way, another related concern is the publication of a file. In Gnutella-like systems, peers just have to put their files in their "shared directory" in order to make them available to any node in the system. On the contrary, the task of publication in a DHT-based overlay requires reaching as many peers as there are words describing the published document. Indeed, the published file has to be known by the peers that are responsible for all the *relevant* words of the document. This is clearly an issue for keyword-based search in DHTs. If you want to design a search engine indexing *all* words in the document, this task becomes unrealistic. -------------------- Gwendal Simon France Telecom R&D http://solipsis.netofpeers.net [1] D. Watts. Six Degrees. [2] A. Barabasi. Linked: the New Science of Networks. [3] R. Ferrer i Cancho and R. Sole.
The Small World of Human Language. [4] J. Keller, D. Stern and F. Dang Ngoc. MAAY: A Self-Adaptive Peer Network for Efficient Document Search. [5] V. Cholvi, P. Felber, and E.W. Biersack. Efficient Search in Unstructured Peer-to-Peer Networks. [6] Adriana Iamnitchi, Matei Ripeanu and Ian Foster, Small-World File-Sharing Communities. > -----Message d'origine----- > De : p2p-hackers-bounces@zgp.org > [mailto:p2p-hackers-bounces@zgp.org] De la part de Alexander L?ser > Envoy? : jeudi 3 f?vrier 2005 11:32 ? : p2p-hackers@zgp.org Objet : > [p2p-hackers] Paradigma Question: DHT's or Small World? > > Hey all, > structured overlay networks based on DHT's, such as Pastry and Chord > among others, have been investigated in the past to construct scalable > and performance orientated peer-to-peer networks. However, > unstructured networks, such as Gnutella or Kazaa, are still widely > used among the file sharing community. Recently researchers proposed > extensions to unstructured networks networks based on the small world > idea: peers dynamically create shortcuts to other peers based on their > interests. > Over a while peers with the same interests became direct neighbors > through its shortcuts and build interest based clusters. Hence peers > no longer flood messages but partly route it's queries via a > interested based/semantic overlay. Examples are described in [1] [2] > among others. > > Comparing small world and DHT approaches is a difficult task, since > simulations usually differ in scenarios, data sets or simulation > methodology. I'm interested in scenarios and arguments PRO small > world overlays for unstructured networks. Does anybody now actual > theoretic or practical work that compares both approaches in different > scenarios (high churn, no super peers, key word based search, meta > data based search)? Which scenarios or arguments support small world > approaches for unstructured networks? 
> > Alex > > > > > [1] Gia - Making Gnutella like P2P Systems Scalable > http://berkeley.intel-research.net/sylvia/1103-chawathe.pdf > http://seattle.intel-research.net/people/yatin/publications/ta > lks/sigcomm2003-gia.ppt > > [2] Efficient Content Location Using Interest Based Locality in > Peer-to-Peer Systems http://www.ieee-infocom.org/2003/papers/53_01.PDF > -- > ___________________________________________________________ > > Alexander L?ser > Technische Universitaet Berlin > hp: http://cis.cs.tu-berlin.de/~aloeser/ > office: +49- 30-314-25551 > fax : +49- 30-314-21601 > ___________________________________________________________ > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From bryan.turner at pobox.com Thu Feb 3 16:35:02 2005 From: bryan.turner at pobox.com (Bryan Turner) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? In-Reply-To: Message-ID: <200502031635.j13GZ3jZ020887@rtp-core-1.cisco.com> Regarding Small World vs DHT; Pedantically, there is no difference.. you can map a DHT to Small World by viewing the domain of the DHT (it's keyspace) to be the semantic information sought by the peers. Thus, peers which seek nearby points in the keyspace are linked by Small World links, while those which seek distant points are only occasionally referenced. The difference that is being argued is what the USER is interested in versus what the PEER is interested in. In a DHT, the peer is required to be interested in keys which conform to some dynamic metric based on the specific model of DHT being used, while there is no model for the user's interests. 
I'm not arguing for or against Small World - simply that the models are equally expressive and thus equally capable of implementing each other's features. Just something to keep in mind. And to keep things on track: Gwendal, I like your explanation of a user's semantic profile, it's very crisp and approachable. It's been difficult to explain to colleagues in the past, next time I'll use your words. ;) In the following by "Gnutella", I mean "Gnutella-like systems". Please do not be offended by my mis-representation of the specific features supported by Gnutella. I see publishing between the two mediums in a different light. While it seems simpler to publish under Gnutella, there are tradeoffs that you haven't pointed out. For instance, single-word queries and exact-file searches are significantly more difficult under the Gnutella model exactly because your query must reach all your 'friends' - and return from all of them! In effect you get worst-case performance for every query. DHTs achieve best-case performance for this type of query, but are burdened by a more complex publishing process. I would also like to argue that full-text indexing on all documents is equally difficult for *both* models. My reasoning follows from the processing requirements (in any model) to index/query a full document: 1. Process a document to produce an index. 2. Store the index for future retrieval. 3. Provide query capability to a client. 4. Discover relevant indexes to a query. 5. Search the indexes for query terms. 6. Return results It should be clear that #1, #3, and #6 are essentially the same between the two models, as some entity must perform the same amount of work for these steps regardless of how it is handled "under the covers". #2 differs only in the location where the index is stored - locally or distributed. And in the amount of work done (Gnutella;less, DHT;more). #4 differs again in the location and work, but here I argue the amount of work has reversed from #2. 
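[A back-of-envelope calculation makes the asymmetry in #2 and #4 concrete. All numbers below are invented for illustration; the per-keyword publish cost and the logarithmic lookup cost are the standard stylized assumptions, not measurements of any real system:]

```python
import math

# Invented workload parameters.
peers = 10_000             # network size
words_per_doc = 120        # distinct keywords indexed per published document
reached_fraction = 0.3     # fraction of peers a flooded/walked query reaches
hops = math.log2(peers)    # stylized cost of one DHT lookup (~log N hops)

# Publishing one document:
gnutella_publish = 0                       # drop the file in the shared directory
dht_publish = words_per_doc * hops         # one lookup per distinct keyword

# One keyword query:
gnutella_query = peers * reached_fraction  # every reached peer checks its local index
dht_query = hops                           # route straight to the responsible peer

print("publish cost: gnutella=%d  dht=%d" % (gnutella_publish, dht_publish))
print("query cost:   gnutella=%d  dht=%d" % (gnutella_query, dht_query))
```

[The point, as in the thread, is that the work moves rather than disappears: a DHT pays at publish time, per keyword, while a Gnutella-like system pays at query time, per reached peer; which total is larger depends on the ratio of publishes to queries in the workload.]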
Gnutella requires *many* peers to perform complex queries against their complex indexes, which constitutes a great deal of work. OtoH, a DHT implicitly knows which peers to address, and which queries to perform (in fact, the very act of addressing a peer is effectively performing the query). #5 again differs, although I argue that the total amount of work performed is essentially the same. Given my arguments above, the total work performed by the "system" to achieve a query is roughly equivalent between the two models. There isn't any one area in which one of the systems is burdened by an order of magnitude over the other. --Bryan bryan.turner@pobox.com -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of SIMON Gwendal RD-MAPS-ISS Sent: Thursday, February 03, 2005 10:49 AM To: Peer-to-peer development. Subject: TR: [p2p-hackers] Paradigma Question: DHT's or Small World? Hi, Here are two assumptions that advocate for small-world. The first one, related to the human language, has been partially established by several studies [1,2] since the pioneering work of [3]. The graph of word interactions is constructed by linking two words when they co-occur in a sentence (a fortiori in a file). The study of the properties of these graphs shows they exhibit the small world effect and a scale-free distribution of degrees. The second assumption follows the observations you cite and some others [4,5,6]. The data-sharing graph is constructed by linking two users when they share a same file. Observations on several real traces show that this graph exhibits also the small-world effect and the scale-free distribution of degrees. Besides, it is known that the lexicon of an human contains few thousands of words. This lexicon and the words contained in the documents which have been produced and dowloaded by an user define her "semantic profile". 
Through the preceeding assumptions, we naturally infer that the graph generated by linking users when their semantic profile overlap is also small-world and scale-free. That is, if we consider that users emit requests on keywords chosen within their profile, we can expect that almost *all* files of interest for an user are stored by a small set of "friends". Moreover, these "friends" are already known by the user thanks to previous successfull queries. Therefore, it is possible to limit the search to a subspace of the information space without preventing the quality of responses. On the contrary, it is probable that these responses are more relevant for the requester point of view. For instance, a fan of "Fiona Apple" will discover mp3 of Fiona Apple and not informations on Apple Inc. or webpages for "apple pie" cooking. Or, an European querying informations on "football" will not receive pages on NFL. By the way, another related concern is the publication of a file. In a gnutella-like systems, peers just have to put their files in their "shared directory" in order to make them available by any node in the system. On the contrary, the task of publication in a DHT-based overlay requires to reach as many peers as the number of words describing the published document. Indeed, the published file has to be known by the peers that are responsible of all the *relevant* words of the document. This is clearly an issue for keyword-based search in DHTs. If you want to design a search engine indexing *all* words in the document, this task becomes unrealistic. -------------------- Gwendal Simon France Telecom R&D http://solipsis.netofpeers.net [1] D. Watts. Six Degrees. [2] A. Barabasi. Linked: the New Science of Networks. [3] R. Ferrer i Canco and R. Sole. The Small World of Human Language. [4] J. Keller, D. Stern and F. Dang Ngoc. MAAY: A Self-Adaptive Peer Network for Efficient Document Search. [5] V. Cholvi, P. Felber, and E.W. Biersack. 
Efficient Search in Unstructured Peer-to-Peer Networks. [6] Adriana Iamnitchi, Matei Ripeanu and Ian Foster, Small-World File-Sharing Communities. From Serguei.Osokine at efi.com Thu Feb 3 18:12:31 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? Message-ID: <4A60C83D027E224BAA4550FB1A2B120E0DC32E@fcexmb04.efi.internal> On Thursday, February 03, 2005 Bryan Turner wrote: > Given my arguments above, the total work performed by the "system" > to achieve a query is roughly equivalent between the two models. Uh, looks to me that given your arguments above the models are logically equivalent, which says nothing about whether the work is the same or not. In fact, I can easily imagine the situations where the load would be orders of magnitude different for Gnutella and DHTs. Best wishes - S.Osokine. 3 Feb 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Bryan Turner Sent: Thursday, February 03, 2005 8:35 AM To: 'Peer-to-peer development.' Subject: RE: [p2p-hackers] Paradigma Question: DHT's or Small World? Regarding Small World vs DHT; Pedantically, there is no difference.. you can map a DHT to Small World by viewing the domain of the DHT (it's keyspace) to be the semantic information sought by the peers. Thus, peers which seek nearby points in the keyspace are linked by Small World links, while those which seek distant points are only occasionally referenced. The difference that is being argued is what the USER is interested in versus what the PEER is interested in. In a DHT, the peer is required to be interested in keys which conform to some dynamic metric based on the specific model of DHT being used, while there is no model for the user's interests. I'm not arguing for or against Small World - simply that the models are equally expressive and thus equally capable of implementing each other's features. 
Just something to keep in mind. And to keep things on track: Gwendal, I like your explanation of a user's semantic profile; it's very crisp and approachable. It's been difficult to explain to colleagues in the past; next time I'll use your words. ;) In the following, by "Gnutella" I mean "Gnutella-like systems". Please do not be offended by my misrepresentation of the specific features supported by Gnutella. I see publishing between the two models in a different light. While it seems simpler to publish under Gnutella, there are tradeoffs that you haven't pointed out. For instance, single-word queries and exact-file searches are significantly more difficult under the Gnutella model exactly because your query must reach all your 'friends' - and return from all of them! In effect you get worst-case performance for every query. DHTs achieve best-case performance for this type of query, but are burdened by a more complex publishing process. I would also like to argue that full-text indexing on all documents is equally difficult for *both* models. My reasoning follows from the processing requirements (in any model) to index/query a full document:

1. Process a document to produce an index.
2. Store the index for future retrieval.
3. Provide query capability to a client.
4. Discover the indexes relevant to a query.
5. Search the indexes for query terms.
6. Return results.

It should be clear that #1, #3, and #6 are essentially the same between the two models, as some entity must perform the same amount of work for these steps regardless of how it is handled "under the covers". #2 differs only in the location where the index is stored - locally or distributed - and in the amount of work done (Gnutella: less; DHT: more). #4 differs again in location and work, but here I argue the amount of work has reversed from #2. Gnutella requires *many* peers to perform complex queries against their complex indexes, which constitutes a great deal of work.
OtoH, a DHT implicitly knows which peers to address, and which queries to perform (in fact, the very act of addressing a peer is effectively performing the query). #5 again differs, although I argue that the total amount of work performed is essentially the same. Given my arguments above, the total work performed by the "system" to achieve a query is roughly equivalent between the two models. There isn't any one area in which one of the systems is burdened by an order of magnitude over the other. --Bryan bryan.turner@pobox.com -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of SIMON Gwendal RD-MAPS-ISS Sent: Thursday, February 03, 2005 10:49 AM To: Peer-to-peer development. Subject: TR: [p2p-hackers] Paradigma Question: DHT's or Small World? Hi, Here are two assumptions that advocate for small-world. The first one, related to human language, has been partially established by several studies [1,2] since the pioneering work of [3]. The graph of word interactions is constructed by linking two words when they co-occur in a sentence (a fortiori in a file). The study of the properties of these graphs shows they exhibit the small-world effect and a scale-free distribution of degrees. The second assumption follows the observations you cite and some others [4,5,6]. The data-sharing graph is constructed by linking two users when they share the same file. Observations on several real traces show that this graph also exhibits the small-world effect and the scale-free distribution of degrees. Besides, it is known that the lexicon of a human contains a few thousand words. This lexicon and the words contained in the documents which have been produced and downloaded by a user define her "semantic profile". From the preceding assumptions, we naturally infer that the graph generated by linking users whose semantic profiles overlap is also small-world and scale-free. That is, if we consider that users emit requests on keywords chosen from within their profile, we can expect that almost *all* files of interest to a user are stored by a small set of "friends". Moreover, these "friends" are already known to the user thanks to previous successful queries. Therefore, it is possible to limit the search to a subspace of the information space without degrading the quality of responses. On the contrary, these responses are probably more relevant from the requester's point of view. For instance, a fan of "Fiona Apple" will discover MP3s of Fiona Apple, not information about Apple Inc. or web pages about "apple pie" cooking. Likewise, a European querying for "football" will not receive pages about the NFL. Another related concern is the publication of a file. In Gnutella-like systems, peers just have to put their files in their "shared directory" to make them available to any node in the system. In a DHT-based overlay, by contrast, publication requires reaching as many peers as there are words describing the published document. Indeed, the published file has to be known by the peers that are responsible for all the *relevant* words of the document. This is clearly an issue for keyword-based search in DHTs; if you want to design a search engine indexing *all* words in a document, the task becomes unrealistic. -------------------- Gwendal Simon France Telecom R&D http://solipsis.netofpeers.net [1] D. Watts. Six Degrees. [2] A. Barabasi. Linked: The New Science of Networks. [3] R. Ferrer i Cancho and R. Sole. The Small World of Human Language. [4] J. Keller, D. Stern and F. Dang Ngoc. MAAY: A Self-Adaptive Peer Network for Efficient Document Search. [5] V. Cholvi, P. Felber, and E.W. Biersack. Efficient Search in Unstructured Peer-to-Peer Networks. [6] Adriana Iamnitchi, Matei Ripeanu and Ian Foster, Small-World File-Sharing Communities.
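The two graph constructions described in the message above (linking words that co-occur in a sentence, and linking users whose semantic profiles overlap) can be sketched in a few lines. This is an illustrative toy with made-up data, not code from any of the cited studies:

```python
# Illustrative sketch of the two graph constructions described above.
# Toy data only; the real studies use large corpora and file-sharing traces.
from itertools import combinations

def cooccurrence_graph(sentences):
    """Link two words whenever they co-occur in a sentence."""
    edges = set()
    for s in sentences:
        words = set(s.lower().split())
        edges |= {frozenset(p) for p in combinations(words, 2)}
    return edges

def profile_overlap_graph(profiles, min_overlap=1):
    """Link two users whenever their semantic profiles share
    at least `min_overlap` words."""
    edges = set()
    for (u, pu), (v, pv) in combinations(profiles.items(), 2):
        if len(pu & pv) >= min_overlap:
            edges.add(frozenset((u, v)))
    return edges

sentences = ["fiona apple sings", "apple pie recipe"]
g = cooccurrence_graph(sentences)
print(frozenset({"fiona", "apple"}) in g)  # True

profiles = {"alice": {"fiona", "apple"}, "bob": {"apple", "pie"}, "carol": {"football"}}
print(profile_overlap_graph(profiles))  # only alice and bob are linked (via "apple")
```

On real traces the interesting part is measuring the degree distribution and clustering of these graphs, which is where the small-world and scale-free claims come from; this sketch only shows the constructions themselves.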
_______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From rita at comet.columbia.edu Thu Feb 3 18:59:39 2005 From: rita at comet.columbia.edu (Rita H. Wouhaybi) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? Message-ID: <008f01c50a22$837a66d0$9e433b80@comet.columbia.edu> Alexander Löser wrote: > Hey all, > structured overlay networks based on DHTs, such as Pastry and Chord > among others, have been investigated in the past to construct scalable > and performance-oriented peer-to-peer networks. However, unstructured > networks, such as Gnutella or Kazaa, are still widely used among the > file-sharing community. Recently, researchers have proposed extensions to > unstructured networks based on the small-world idea: peers > dynamically create shortcuts to other peers based on their interests. > Over time, peers with the same interests become direct neighbors > through these shortcuts and build interest-based clusters. Hence peers > no longer flood messages but partly route their queries via an > interest-based/semantic overlay. Examples are described in [1] [2] among > others. > > Comparing small-world and DHT approaches is a difficult task, since > simulations usually differ in scenarios, data sets or simulation > methodology. I'm interested in scenarios and arguments PRO small-world > overlays for unstructured networks. Does anybody know of actual > theoretical or practical work that compares both approaches in different > scenarios (high churn, no super-peers, keyword-based search, meta-data-based > search)? Which scenarios or arguments support small-world > approaches for unstructured networks?
> > Alex > > > > > [1] Gia - Making Gnutella-like P2P Systems Scalable > http://berkeley.intel-research.net/sylvia/1103-chawathe.pdf > http://seattle.intel-research.net/people/yatin/publications/talks/sigcomm2003-gia.ppt > > [2] Efficient Content Location Using Interest-Based Locality in > Peer-to-Peer Systems > http://www.ieee-infocom.org/2003/papers/53_01.PDF > -- > ___________________________________________________________ > > Alexander Löser > Technische Universitaet Berlin > hp: http://cis.cs.tu-berlin.de/~aloeser/ > office: +49-30-314-25551 > fax: +49-30-314-21601 > ___________________________________________________________ Interesting discussion, Alex. From the practical and system challenges that have faced researchers working on DHTs (the long time for the network to become stable, updates and maintenance as nodes join and leave, the high cost of messaging when adding an object to the network, ...), it has become the norm to think about the application when deciding whether to use structured (DHT) or unstructured (Gnutella-like) p2p topologies. That is probably one of the reasons why people have not compared both structures in an analysis similar to what you are asking for. Thus, small-world and power-law networks have emerged to bridge the gap between a totally random network and a "rigid" DHT. Note that super-peers in Kazaa and Gnutella do actually help the network become more like a small world. We have also worked in this area and created a power-law P2P network that might interest you: - Rita H. Wouhaybi and Andrew T. Campbell, "Phenix: Supporting Resilient Low-Diameter Peer-to-Peer Topologies", IEEE INFOCOM'2004, Hong Kong, China, March 7-11, 2004. Rita H. Wouhaybi rita@comet.columbia.edu http://comet.columbia.edu/~rita/ -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://zgp.org/pipermail/p2p-hackers/attachments/20050203/c06a78ef/attachment.html From Serguei.Osokine at efi.com Thu Feb 3 19:53:22 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? Message-ID: <4A60C83D027E224BAA4550FB1A2B120E0DC32F@fcexmb04.efi.internal> On Thursday, February 03, 2005 Rita H. Wouhaybi wrote: > Note that super-peers in Kazaa and Gnutella do actually help > the network become more like a small-world. Not necessarily. Or at least, to a much smaller extent than intuitive thinking would suggest. Superpeers do make the network smaller in terms of node numbers, but at the same time they increase the traffic on the intra-ultrapeer links in exactly the same proportion, making it more difficult to route anything to the remote nodes. So the actual query reach (the degree of 'small-worldness', so to speak) is improved only due to the better-than-average super-peer bandwidth: http://www.grouter.net/gnutella/search.htm#PlainSuperpeerNetwork http://www.grouter.net/gnutella/search.htm#Eq25 Basically, if you cannot reach all hosts in a 'flat' network (without super-peers), chances are pretty high that the introduction of super-peers won't change this situation unless the original flat network was already pretty close to being a 'small world' (fully reachable) one. The search reach in super-peered nets like Kazaa really is better, but it comes, first, from higher-than-average superpeer bandwidth and, second, from the proactive index replication that naturally happens when a leaf connects to several superpeers at once (three or so in the Kazaa case, I believe).
This one tends to be viewed as just something done to improve a connection reliability through redundancy, whereas in fact it also improves the query reach in direct proportion to the number of redundant links: http://www.grouter.net/gnutella/search.htm#RedundantSuperpeerClusters I think this effect was first noted by the Stanford P2P research group, which named it 'k-redundancy': http://www-db.stanford.edu/~byang/pubs/superpeer.pdf Best wishes - S.Osokine. 3 Feb 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Rita H. Wouhaybi Sent: Thursday, February 03, 2005 11:00 AM To: p2p-hackers@zgp.org; aloeser@cs.tu-berlin.de Subject: Re:[p2p-hackers] Paradigma Question: DHT's or Small World? Alexander L?ser wrote: > Hey all, > structured overlay networks based on DHT's, such as Pastry and Chord > among others, have been investigated in the past to construct scalable > and performance orientated peer-to-peer networks. However, unstructured > networks, such as Gnutella or Kazaa, are still widely used among the > file sharing community. Recently researchers proposed extensions to > unstructured networks networks based on the small world idea: peers > dynamically create shortcuts to other peers based on their interests. > Over a while peers with the same interests became direct neighbors > through its shortcuts and build interest based clusters. Hence peers > no longer flood messages but partly route it's queries via a interested > based/semantic overlay. Examples are described in [1] [2] among > others. > > Comparing small world and DHT approaches is a difficult task, since > simulations usually differ in scenarios, data sets or simulation > methodology. I'm interested in scenarios and arguments PRO small > world overlays for unstructured networks. 
Does anybody now actual > theoretic or practical work that compares both approaches in different > scenarios (high churn, no super peers, key word based search, meta data > based search)? Which scenarios or arguments support small world > approaches for unstructured networks? > > Alex > > > > > [1] Gia - Making Gnutella like P2P Systems Scalable > http://berkeley.intel-research.net/sylvia/1103-chawathe.pdf > http://seattle.intel-research.net/people/yatin/publications/talks/sigcomm2003 -gia.ppt > > [2] Efficient Content Location Using Interest Based Locality in > Peer-to-Peer Systems > http://www.ieee-infocom.org/2003/papers/53_01.PDF > -- > ___________________________________________________________ > > Alexander L?ser > Technische Universitaet Berlin > hp: http://cis.cs.tu-berlin.de/~aloeser/ > office: +49- 30-314-25551 > fax : +49- 30-314-21601 > ___________________________________________________________ > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers Interesting discussion Alex. >From the practical and system challenges that faced researchers working on DHTs (long time for the network to become stable, updates and maintenance for nodes join and leave, high cost of messaging when adding an object to the network, ..), it has become the norm to think about the application when trying to decide to use structured (DHTs) or unstructured (gnutella-like) p2p topologies. That is probably one of the reasons why people did not compare both structures in an analysis similar to what you are asking for. Thus, small world and power-law have emerged to bridge the gap between a total random network and a "rigid" DHT. Note that super-peers in Kazaa and Gnutella do actually help the network become more like a small-world. We also have worked in this area and created a power-law distribution P2P network that might interest you: - Rita H. Wouhaybi, and Andrew T. 
Campbell, "Phenix: Supporting Resilient Low-Diameter Peer-to-Peer Topologies", IEEE INFOCOM'2004, Hong Kong, China, March 7-11, 2004. Rita H. Wouhaybi rita@comet.columbia.edu http://comet.columbia.edu/~rita/ From aloeser at cs.tu-berlin.de Fri Feb 4 12:57:50 2005 From: aloeser at cs.tu-berlin.de (Alexander Löser) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? References: <4201FDB1.6F607C0C@cs.tu-berlin.de> Message-ID: <4203714E.E1EA93CD@cs.tu-berlin.de> Thank you very much for sharing this discussion! You gave me very valuable comments on the design question of choosing either small world or DHTs. If I understood your arguments right, small world should be the preferred paradigm if the system design requires the following (hard or soft) features: (Hard features) Churn: The system should support a high churn rate of peers/high churn rate of objects. By the way, since these hypotheses are intuitive but unproved, does anybody know of theoretical or experimental work that proved them? Furthermore, maybe this question is a bit naive, but what exactly is high? Complex queries: The system allows a user to pose complex queries, e.g. several keywords or, for meta-data-annotated documents, more than one (semantic) predicate per query. (Soft features) Profile locality: One peer maps to one user. Probably a user is not interested in or willing to transfer their local profile to a global index but likes to keep it locally, e.g. for anonymity or to delete entries. Popularity: If most searches go for popular objects, small world may be the first choice. For example, this is the case for most music-sharing networks. Community search: Depending on the shortcut-creation strategies between friends in a small-world network, the small-world paradigm supports the data-sharing graph between people with similar interests. By the way: does it also support similar semantics?
What kind of application scenario suits these requirements? I think of a networked desktop search application. Similar to Gnutella, some people publish some of their documents; most don't. Some of them are annotated with meta data, probably with the same vocabulary or within the same ontology; some are not. Users pose keyword queries, as in a single desktop search engine. Queries match either the document's filename, folder or (if any) its meta data. Would the small-world paradigm support such a system? Alex -- ___________________________________________________________ Alexander Löser Technische Universität Berlin http://cis.cs.tu-berlin.de/~aloeser/ office : +49- 30-314-25551 fax : +49- 30-314-21601 skype : hallo.alex ___________________________________________________________ From gwendal.simon at francetelecom.com Fri Feb 4 13:26:21 2005 From: gwendal.simon at francetelecom.com (SIMON Gwendal RD-MAPS-ISS) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? Message-ID: Hi, > What kind of application scenario suits to this requirements? > I think of a networked desktop search application. Similar to > Gnutella, some people publish some of its documents, most > don't. Some of them are annotated by meta data, probably with > the same vocabulary or within the same ontology, some not. > Users pose keyword queries, similar in a single desktop > search engine. Queries either match the documents filename, > folder or (if any) documents meta data. Why do you want to restrict search to meta-data? Google doesn't! It must be possible to perform full-text search... Besides, how would one define a common world ontology that could fit all future needs?
-------------------- Gwendal Simon France Telecom R&D http://solipsis.netofpeers.net From aloeser at cs.tu-berlin.de Fri Feb 4 13:43:11 2005 From: aloeser at cs.tu-berlin.de (Alexander =?iso-8859-1?Q?L=F6ser?=) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? References: Message-ID: <42037BEF.8E228624@cs.tu-berlin.de> SIMON Gwendal RD-MAPS-ISS wrote: > Hi, > > > What kind of application scenario suits to this requirements? > > I think of a networked desktop search application. Similar to > > Gnutella, some people publish some of its documents, most > > don't. Some of them are annotated by meta data, probably with > > the same vocabulary or within the same ontology, some not. > > Users pose keyword queries, similar in a single desktop > > search engine. Queries either match the documents filename, > > folder or (if any) documents meta data. > > Why do you want to restrict search to meta-data ? Google don't ! It must > be possible to perform full-text search... I assume a system where its possible to search full text. Probably for a first try, within the filename and directory structure only, later in the document itself. > > Besides, how to define a world common ontology that could fit all future > needs ? However, if the document contains any valuable meta data, the system should consider this information as well. I think of documents classified by an enterprise wide topic hierarchy or research docs classified within the ACM topic hierarchy or the documents within the google/dmoz project. Or possible doctors that exchange documents classified within a medical taxonomy. Please correct me, if my assumptions are wrong. 
Cheers Alex -- ___________________________________________________________ Alexander L?ser Technische Universit?t Berlin http://cis.cs.tu-berlin.de/~aloeser/ office : +49- 30-314-25551 fax : +49- 30-314-21601 skype : hallo.alex ___________________________________________________________ From hopper at omnifarious.org Fri Feb 4 15:03:42 2005 From: hopper at omnifarious.org (Eric M. Hopper) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? In-Reply-To: <42037BEF.8E228624@cs.tu-berlin.de> References: <42037BEF.8E228624@cs.tu-berlin.de> Message-ID: <1107529422.6165.27.camel@bats.omnifarious.org> On Fri, 2005-02-04 at 14:43 +0100, Alexander L?ser wrote: > However, if the document contains any valuable meta data, the system should > consider this information as well. I think of documents classified by an > enterprise wide topic hierarchy or research docs classified within the ACM > topic hierarchy or the documents within the google/dmoz project. Or possible > doctors that exchange documents classified within a medical taxonomy. > > Please correct me, if my assumptions are wrong. Well, one thing any search system has to deal with is being gamed. Meta-data is too easy to game. It's data for the computer, not for people, so it can be used to trick computers into giving people information they're not actually interested in. Computers, as much as possible, have to base their searching on what people will actually look at. Now, your idea of trying to automatically get people with similar interests to group together might provide a way for computers to take advantage of knowledge of those relationships to let people sort of vet documents for one another. And that could be an interesting approach. I think one of the primary problems there is the same one google has to deal with. Party crashers. People who try to become part of a community largely in order to sow disinformation, usually for commercial gain. 
Have fun (if at all possible), -- The best we can hope for concerning the people at large is that they be properly armed. -- Alexander Hamilton -- Eric Hopper (hopper@omnifarious.org http://www.omnifarious.org/~hopper) -- -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 185 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050204/c3bf8deb/attachment.pgp From bryan.turner at pobox.com Fri Feb 4 18:50:25 2005 From: bryan.turner at pobox.com (Bryan Turner) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? In-Reply-To: <4203714E.E1EA93CD@cs.tu-berlin.de> Message-ID: <200502041850.j14IoPjZ016336@rtp-core-1.cisco.com> Alex, > Churn: The system should support a high churn rate of peers/high churn rate of > objects. By the way, since these hypotheses are intuitive but unproved, does > anybody know a theoretical or experimental work, that proofed them? Furthermore, > maybe this question is a bit naive, but what exactly is high? See [1,2] for some discussion of churn and the "half-life" of a network. These models were built from Chord, but the results are useful to both systems. To answer your question more directly: "high" means close to the half-life of your network. The half-life is the time it takes for half the nodes in the network to cycle off it. If your churn rate is higher than this, you effectively cannot keep the network together, as it is outpacing your stabilization protocol. If your churn is lower, then you get a stable network. So a "high" churn rate is just under your network's half-life. > Profile locality: One peer maps to one user. Probably a user is not interested > or willing to transfer it's local profile to a global index but likes to keep > it locally, e.g. for anonymity or to delete entries.
Depending on system design, anonymity may be improved if a 'peer' is actually a darknet of users. This provides k-anonymity within the group. See [3,4] for such protocols. Probably not relevant to your request, but it's fascinating research anyway... > Popularity: If most of the searches go for popular objects, small world may > be the first choice. For example, this is the case for most music sharing networks. The greater practical concern for popularity is resolving "flash crowds" gracefully in the system. Neither the DHT nor the Small World model defines the behavior for this case. You should review some of the various solutions to this problem (too many to reference, but see [5], Section 3, and [6], Section III, for an example). > What kind of application scenario suits to this requirements? Any form of data repository where the primary user is an individual. For instance: Phone Book, Restaurant Guide, News Portal, Product Catalog, Wiki, etc. Hope that helps! --Bryan bryan.turner@pobox.com [1] Observations on the Dynamic Evolution of Peer-to-Peer Networks David Liben-Nowell, et al. http://citeseer.ist.psu.edu/liben-nowell02observations.html [2] Analysis of the Evolution of Peer-to-Peer Systems David Liben-Nowell, et al. http://citeseer.ist.psu.edu/liben-nowell02analysis.html [3] k-Anonymous Message Transmission, Luis von Ahn, et al. http://www-2.cs.cmu.edu/~abortz/work/k-anon-final.html [4] A New k-Anonymous Message Transmission Protocol Gang Yao, Dengguo Feng http://dasan.sejong.ac.kr/~wisa04/ppt/9A2.pdf [5] Novel Architectures for P2P Applications: The Continuous-Discrete Approach Moni Naor, Udi Wieder http://citeseer.ist.psu.edu/554254.html [6] Small World Overlay P2P Networks, Ken Y. K. Hui, et al. http://www.cse.cuhk.edu.hk/~cslui/PUBLICATION/iwqos2004_small_world.pdf -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Alexander Löser Sent: Friday, February 04, 2005 7:58 AM To: Peer-to-peer development.
Subject: Re: [p2p-hackers] Paradigma Question: DHT's or Small World? Thank you very much in sharing this discussion!! You gave me very valuable comments on the design question to choose either small world or DHT's. If I understood your arguments right, small world should be the preferred paradigm, if the system design requires the following (hard or soft) features: (Hard features) Churn: The system should support a high churn rate of peers/high churn rate of objects: By the way, since these hypotheses are intuitive but unproved, does anybody know a theoretical or experimental work, that proofed them? Furthermore, maybe this question is a bit naive, but what exactly is high? Complex queries: The system allows a user to pose complex queries, e.g. several keywords, or if I speak about meta data annotated documents more than one (semantic) predicate per query. (Soft features) Profile locality: One peer maps to one user. Probably a user is not interested or willing to transfer it's local profile to a global index but likes to keep it locally, e.g. for anonymity or to delete entries. Popularity: If most of the searches go for popular objects, small world may be the first choice. For example, this is the case for most music sharing networks. Community search: Depending on the shortcut creation strategies between friends on a small world network, the small world paradigm supports the data sharing graph between people with similar interests. By the way: Does it also support similar semantics? What kind of application scenario suits to this requirements? I think of a networked desktop search application. Similar to Gnutella, some people publish some of its documents, most don't. Some of them are annotated by meta data, probably with the same vocabulary or within the same ontology, some not. Users pose keyword queries, similar in a single desktop search engine. Queries either match the documents filename, folder or (if any) documents meta data. 
Would the small-world paradigm support such a system? Alex -- ___________________________________________________________ Alexander Löser Technische Universität Berlin http://cis.cs.tu-berlin.de/~aloeser/ office : +49- 30-314-25551 fax : +49- 30-314-21601 skype : hallo.alex ___________________________________________________________ From john.casey at gmail.com Mon Feb 7 08:22:30 2005 From: john.casey at gmail.com (John Casey) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] gossiping in a DHT Message-ID: Hi All, I have been thinking about developing a gossip information dissemination algorithm to work across a DHT. Does anyone have any links to any must-read papers on this topic? Conceptually, the process seems similar to that of gossip in an unstructured network. Just wondering if there was any prior work I should take a look at, thanks. :) From davidopp at cs.berkeley.edu Mon Feb 7 16:36:09 2005 From: davidopp at cs.berkeley.edu (David L. Oppenheimer) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] gossiping in a DHT In-Reply-To: Message-ID: <200502071635.IAA27418@mindbender.davido.com> You might want to take a look at Kelips http://citeseer.ist.psu.edu/570786.html David > Hi All, I have been thinking about developing a gossip information > dissemenation algorithm to work across a DHT. Does any one have any > links to any must read papers on this topic? Conceptually, the process > seems similar to that of gossip in an unstructured DHT. Just wondering > if there was any prior work I should take a look at thanks.
:) From paul at ref.nmedia.net Tue Feb 8 13:31:56 2005 From: paul at ref.nmedia.net (Paul Campbell) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] gossiping in a DHT In-Reply-To: References: Message-ID: <20050208133156.GA11916@ref.nmedia.net> On Mon, Feb 07, 2005 at 07:22:30PM +1100, John Casey wrote: > Hi All, I have been thinking about developing a gossip information > dissemenation algorithm to work across a DHT. Does any one have any > links to any must read papers on this topic? Conceptually, the process > seems similar to that of gossip in an unstructured DHT. Just wondering > if there was any prior work I should take a look at thanks. :) Gossipping has to overcome the unstructured nature of the underlying network. In a DHT, this is not necessary since it is easy to set up a real broadcast. Look for protocols dealing with broadcasting on a DHT. For instance, one could propagate a message around the ring until it gets back to the source. This would take N-1 messages (if the originator is listed in the message) and N-1 rounds. A faster way is to use the DHT structure where some nodes broadcast multiple messages. For instance, the source could conceptually break the DHT ring up into arcs and broadcast a message to a node residing on each arc along with the arc length. In turn, the next layer of nodes can broadcast the message across their respective arcs, subdividing the problem by another level. With log(N) known neighbors, it should take log(N) rounds to reach every node and again, N-1 messages. Contrast this with N*log(N) messages in an unstructured gossipping system with log(N) rounds. Thus, without structure, the load is much higher. 
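The arc-splitting broadcast just described can be simulated in a few lines (an illustrative sketch using binary arc splits; all names are made up, not taken from any real DHT implementation). Each node holding the message hands the far half of its arc to one other node per round, and the simulation confirms the N-1 messages and roughly log2(N) rounds mentioned above:

```python
# Sketch of the arc-splitting broadcast described above (illustrative
# only, binary splits): the source covers the whole ring, and each node
# that has the message repeatedly hands the far half of its arc to a
# node inside that half. Returns the message and round counts.

def broadcast(n):
    """Simulate the broadcast on a ring of n nodes; return (messages, rounds)."""
    messages = 0
    rounds = 0
    # Each (node, arc) pair means `node` is responsible for the arc of
    # `arc` consecutive nodes starting at itself (clockwise).
    frontier = [(0, n)]
    delivered = {0}
    while any(arc > 1 for _, arc in frontier):
        rounds += 1
        nxt = []
        for node, arc in frontier:
            if arc == 1:
                nxt.append((node, arc))
                continue
            half = arc // 2
            target = (node + half) % n  # hand off the far half of the arc
            messages += 1
            delivered.add(target)
            nxt.append((node, half))          # keep the near half
            nxt.append((target, arc - half))  # target covers the far half
        frontier = nxt
    assert len(delivered) == n  # every node got the message exactly once
    return messages, rounds

print(broadcast(64))  # (63, 6): N-1 messages in log2(N) rounds
```

Each delivered node receives the message exactly once, so the message count is always N-1, matching the claim above; the round count is the ceiling of log2(N) because every arc halves each round.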
From anwitaman at hotmail.com Wed Feb 9 12:00:44 2005 From: anwitaman at hotmail.com (Anwitaman Datta) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] RE: p2p-hackers Digest, Vol 19, Issue 7 In-Reply-To: <20050208200004.84AD13FD25@capsicum.zgp.org> Message-ID: There are several DHT-based broadcasting mechanisms in the literature, which may also interest you. The first that I came across was "structella": http://nms.lcs.mit.edu/HotNets-II/papers/structella.pdf Also, we use such a scheme for range queries in P-Grid: http://www.p-grid.org/Papers/TR-IC-2004-111.pdf as is also used in the prefix hash tree http://berkeley.intel-research.net/sylvia/pht.pdf - Anwitaman Today's Topics: 1. Re: gossiping in a DHT (Paul Campbell) A faster way is to use the DHT structure where some nodes broadcast multiple messages. For instance, the source could conceptually break the DHT ring up into arcs and broadcast a message to a node residing on each arc along with the arc length. In turn, the next layer of nodes can broadcast the message across their respective arcs, subdividing the problem by another level. With log(N) known neighbors, it should take log(N) rounds to reach every node and again, N-1 messages. Contrast this with N*log(N) messages in an unstructured gossipping system with log(N) rounds. Thus, without structure, the load is much higher. From john.casey at gmail.com Thu Feb 10 04:40:09 2005 From: john.casey at gmail.com (John Casey) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] gossiping in a DHT In-Reply-To: <20050208133156.GA11916@ref.nmedia.net> References: <20050208133156.GA11916@ref.nmedia.net> Message-ID: thanks guys. I've just been reading and digesting the papers you have given me.
The Structella paper, and the pointers to the broadcasting papers it has, are very useful :) On Tue, 8 Feb 2005 05:31:56 -0800, Paul Campbell wrote: > On Mon, Feb 07, 2005 at 07:22:30PM +1100, John Casey wrote: > > Hi All, I have been thinking about developing a gossip information > > dissemenation algorithm to work across a DHT. Does any one have any > > links to any must read papers on this topic? Conceptually, the process > > seems similar to that of gossip in an unstructured DHT. Just wondering > > if there was any prior work I should take a look at thanks. :) > > Gossipping has to overcome the unstructured nature of the underlying > network. In a DHT, this is not necessary since it is easy to set up a > real broadcast. Look for protocols dealing with broadcasting on a DHT. > > For instance, one could propagate a message around the ring until it gets > back to the source. This would take N-1 messages (if the originator is > listed in the message) and N-1 rounds. > > A faster way is to use the DHT structure where some nodes broadcast multiple > messages. For instance, the source could conceptually break the DHT ring up > into arcs and broadcast a message to a node residing on each arc along with > the arc length. In turn, the next layer of nodes can broadcast the message > across their respective arcs, subdividing the problem by another level. With > log(N) known neighbors, it should take log(N) rounds to reach every node and > again, N-1 messages. Contrast this with N*log(N) messages in an unstructured > gossipping system with log(N) rounds. Thus, without structure, the load is > much higher. From rabbi at abditum.com Thu Feb 10 08:01:01 2005 From: rabbi at abditum.com (Len Sassaman) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] CodeCon Reminder Message-ID: We'd like to remind those of you planning to attend this year's event that CodeCon is fast approaching. CodeCon is the premier event in 2005 for the application developer community.
It is a workshop for developers of real-world applications with working code and active development projects. Past presentations at CodeCon have included the file distribution software BitTorrent; the Peek-A-Booty anti-censorship application; the email encryption system PGP Universal; and Audacity, a powerful audio editing tool. Some of this year's highlights include Off-The-Record Messaging, a privacy-enhancing encryption protocol for instant-message systems; SciTools, a web-based toolkit for genetic design and analysis; and Incoherence, a novel stereo sound visualization tool. CodeCon registration is discounted this year: $80 for cash-at-the-door registrations. Registration will be available every day of the conference, though tickets are limited, and attendees are encouraged to register on the first day to secure admission. CodeCon will be held February 11-13, noon-6pm, at Club NV (525 Howard Street) in San Francisco. For more information, please visit http://www.codecon.org. From aloeser at cs.tu-berlin.de Thu Feb 10 08:57:00 2005 From: aloeser at cs.tu-berlin.de (Alexander Löser) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] gossiping in a DHT References: <20050208133156.GA11916@ref.nmedia.net> Message-ID: <420B21DC.4A2E1625@cs.tu-berlin.de> Hi John, probably you should look at the HyperCuP topology [1], which permits broadcasting in a structured overlay. In Edutella [2] we use the broadcast mechanism to broadcast complex queries. Due to the combination of the broadcast with routing indices and a super-peer network, we are able to focus the broadcast on a subset of peers. Alex [1] http://projekte.learninglab.uni-hannover.de/pub/bscw.cgi/d7825/HyperCuP%20-%20Hypercubes,%20Ontologies%20and%20Efficient%20Search%20on%20P2P%20Networks [2] http://www.kbs.uni-hannover.de/Arbeiten/Publikationen/2002/www2003_superpeer.pdf John Casey wrote: > thanks guys. I've just been reading digesting the papers you have > given me.
The structella, and the pointers to the broadcasting papers > it have are very useful :) > > On Tue, 8 Feb 2005 05:31:56 -0800, Paul Campbell wrote: > > On Mon, Feb 07, 2005 at 07:22:30PM +1100, John Casey wrote: > > > Hi All, I have been thinking about developing a gossip information > > > dissemenation algorithm to work across a DHT. Does any one have any > > > links to any must read papers on this topic? Conceptually, the process > > > seems similar to that of gossip in an unstructured DHT. Just wondering > > > if there was any prior work I should take a look at thanks. :) > > > > Gossipping has to overcome the unstructured nature of the underlying > > network. In a DHT, this is not necessary since it is easy to set up a > > real broadcast. Look for protocols dealing with broadcasting on a DHT. > > > > For instance, one could propagate a message around the ring until it gets > > back to the source. This would take N-1 messages (if the originator is > > listed in the message) and N-1 rounds. > > > > A faster way is to use the DHT structure where some nodes broadcast multiple > > messages. For instance, the source could conceptually break the DHT ring up > > into arcs and broadcast a message to a node residing on each arc along with > > the arc length. In turn, the next layer of nodes can broadcast the message > > across their respective arcs, subdividing the problem by another level. With > > log(N) known neighbors, it should take log(N) rounds to reach every node and > > again, N-1 messages. Contrast this with N*log(N) messages in an unstructured > > gossipping system with log(N) rounds. Thus, without structure, the load is > > much higher. 
> _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences -- ___________________________________________________________ Alexander Löser Technische Universität Berlin http://cis.cs.tu-berlin.de/~aloeser/ office : +49- 30-314-25551 fax : +49- 30-314-21601 ___________________________________________________________ From telecontrol at t-online.de Thu Feb 10 10:46:29 2005 From: telecontrol at t-online.de (Telecontrol) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] We need some help for our project TV-Sharing over P2P (www.cybertelly.com) Message-ID: <003001c50f5d$c71b56c0$69a2a8c0@namepc> Skipped content of type multipart/alternative-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 12199 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050210/fd77db14/attachment.gif From telecontrol at t-online.de Thu Feb 10 10:56:38 2005 From: telecontrol at t-online.de (Telecontrol) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] We need some help for our project TV-Sharing over P2P Message-ID: <003c01c50f5f$31c5e610$69a2a8c0@namepc> Please use the email address telecontrol@t-online.de if you want to support the project, thank you! From sszukala at runbox.com Thu Feb 10 20:39:14 2005 From: sszukala at runbox.com (Shannon Alexander Szukala) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Re: p2p-hackers Digest, Vol 19, Issue 10 In-Reply-To: <20050210200004.046783FD65@capsicum.zgp.org> References: <20050210200004.046783FD65@capsicum.zgp.org> Message-ID: Hey, I want to help out. Let me know what you are looking for.
> Send p2p-hackers mailing list submissions to > p2p-hackers@zgp.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://zgp.org/mailman/listinfo/p2p-hackers > or, via email, send a message with subject or body 'help' to > p2p-hackers-request@zgp.org > > You can reach the person managing the list at > p2p-hackers-owner@zgp.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of p2p-hackers digest..." > > > Today's Topics: > > 1. We need some help for our project TV-Sharing over P2P > (Telecontrol) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 10 Feb 2005 11:56:38 +0100 > From: "Telecontrol" > Subject: [p2p-hackers] We need some help for our project TV-Sharing > over P2P > To: > Message-ID: <003c01c50f5f$31c5e610$69a2a8c0@namepc> > Content-Type: text/plain; charset="us-ascii" > > Please use the email adress telecontrol@t-online.de if you want to > support the project , Thank you !! > > > > ------------------------------ > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > > > End of p2p-hackers Digest, Vol 19, Issue 10 > ******************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20050210/8dda94c0/attachment.html From trep at cs.ucr.edu Fri Feb 11 21:11:14 2005 From: trep at cs.ucr.edu (Thomas Repantis) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Bloom Filters in Gnutella (Was: Re: Paradigma Question: DHT's or Small World?) In-Reply-To: References: <4201FDB1.6F607C0C@cs.tu-berlin.de> Message-ID: <20050211211114.GA673@angeldust.chaos> Hi Greg, interesting what you wrote, that Gnutella uses Bloom Filters. I thought that simple hash tables were exchanged. How are the Bloom Filters propagated? Just from every leaf to its ultrapeer? 
Or do ultrapeers also exchange Bloom Filters? Let me know if you have any pointers on this. I'm only aware of: http://rfc-gnutella.sourceforge.net/src/Ultrapeers_1.0.html and http://www.limewire.com/developer/query_routing/keyword%20routing.htm I've also done some work on Bloom Filters and their propagation (the first paper on: http://www.cs.ucr.edu/~trep/publications.html ) Cheers, Thomas On Thu, Feb 03, 2005 at 10:26:25AM -0500, Greg Bildson wrote: > I'd just like to point out that Gnutella does not use pure flooding anymore > and you are unlikely to find P2P networks that don't have something akin to > supernodes. Gnutella uses bloom filter based keyword index replication and > dynamic querying (selectively sending out queries until a result limit is > reached) to reduce the overhead of flooding for popular queries and to route > all queries on the last hop. > > Thanks > -greg > -- http://www.cs.ucr.edu/~trep -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050211/e324ccd7/attachment.pgp From mgp at ucla.edu Tue Feb 15 09:52:41 2005 From: mgp at ucla.edu (Michael Parker) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Online Codes Message-ID: <4211C669.3080200@ucla.edu> Hi all, Does anyone know what happened to the "Online Codes" Sourceforge project, listed at http://sourceforge.net/projects/onlinecodes? I'm asking here for two reasons: First, because Online Codes [1, 2] would be a great tool in peer-to-peer applications, so I thought someone here might have followed the project while it was still active. Second, I've written a solid library implementation of the Online Codes encoding/decoding algorithm described in the aforementioned papers. 
Alas, only after I implemented it did I find out that the authors' company, Rateless, had patented it (or, so they allude to on their web site www.rateless.com, Digital Fountain owned the IP). I was thinking of releasing it under the GPL, but now that I've discovered patents are involved that seems like a very bad idea. So I was wondering if the Online Codes project broke up because of this, and whether I would get sued into oblivion if I ever made this code available? IANAL, but is it illegal to write such code and distribute it as a library on the net (after all, it is straight from their papers) to elucidate how the algorithm works, or only illegal to include the library in any working software program? Regards, Michael Parker [1] http://www.rateless.com/oncodes.pdf [2] http://www.rateless.com/msd.ps From stewbagz at gmail.com Tue Feb 15 10:00:42 2005 From: stewbagz at gmail.com (stew "stewbagz" mercer) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <4211C669.3080200@ucla.edu> References: <4211C669.3080200@ucla.edu> Message-ID: <3b462676050215020043ee0d5a@mail.gmail.com> I was wondering about this as well. It appears that there was a build of rateless-copy and rateless-tunnel that was done with the cygwin tool kit, and that appears to have caused some complications. if you go to http://www.rateless.com/download_copy.html you can see the links to the binaries, but I've not been able to download anything from it. They were supposedly writing some RFCs for it too, but there is no sign of them either ... On Tue, 15 Feb 2005 01:52:41 -0800, Michael Parker wrote: > Hi all, > > Does anyone know what happened to the "Online Codes" Sourceforge > project, listed at http://sourceforge.net/projects/onlinecodes? I'm > asking here for two reasons: First, because Online Codes [1, 2] would be > a great tool in peer-to-peer applications, so I thought someone here > might have followed the project while it was still active. 
Second, I've > written a solid library implementation of the Online Codes > encoding/decoding algorithm described in the aforementioned papers. > Alas, only after I implemented it did I find out that the authors' > company, Rateless, had patented it (or, so they allude to on their web > site www.rateless.com, Digital Fountain owned the IP). I was thinking of > releasing it under the GPL, but now that I've discovered patents are > involved that seems like a very bad idea. So I was wondering if the > Online Codes project broke up because of this, and whether I would get > sued into oblivion if I ever made this code available? IANAL, but is it > illegal to write such code and distribute it as a library on the net > (after all, it is straight from their papers) to elucidate how the > algorithm works, or only illegal to include the library in any working > software program? > > Regards, > Michael Parker > > [1] http://www.rateless.com/oncodes.pdf > [2] http://www.rateless.com/msd.ps > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From solipsis at pitrou.net Tue Feb 15 10:15:04 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <4211C669.3080200@ucla.edu> References: <4211C669.3080200@ucla.edu> Message-ID: <1108462504.7938.25.camel@p-dhcp-333-72.rd.francetelecom.fr> > I was thinking of > releasing it under the GPL, but now that I've discovered patents are > involved that seems like a very bad idea. So I was wondering if the > Online Codes project broke up because of this, and whether I would get > sued into oblivion if I ever made this code available? 
IANAL, but is it > illegal to write such code and distribute it as a library on the net > (after all, it is straight from their papers) to elucidate how the > algorithm works, or only illegal to include the library in any working > software program? If you are European then it's still legal ;) (given your e-mail address I guess you are not...) On the other hand, if software patents are valid in your country, then you can't distribute any code that infringes the patent without a license for that patent, even if you are doing it for research purposes, etc. Indeed, one of the problems with patents is that they are not subject to the traditional limits of copyright (fair use, etc.). Regards Antoine. -- http://solipsis.netofpeers.net/ From paul at ref.nmedia.net Tue Feb 15 19:58:23 2005 From: paul at ref.nmedia.net (Paul Campbell) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <4211C669.3080200@ucla.edu> References: <4211C669.3080200@ucla.edu> Message-ID: <20050215195823.GB25409@ref.nmedia.net> On Tue, Feb 15, 2005 at 01:52:41AM -0800, Michael Parker wrote: > Does anyone know what happened to the "Online Codes" Sourceforge > project, listed at http://sourceforge.net/projects/onlinecodes? I'm > asking here for two reasons: First, because Online Codes [1, 2] would be > a great tool in peer-to-peer applications, so I thought someone here > might have followed the project while it was still active. Second, I've > written a solid library implementation of the Online Codes > encoding/decoding algorithm described in the aforementioned papers. > Alas, only after I implemented it did I find out that the authors' > company, Rateless, had patented it (or, so they allude to on their web > site www.rateless.com, Digital Fountain owned the IP). I was thinking of > releasing it under the GPL, but now that I've discovered patents are > involved that seems like a very bad idea. There are additional papers out there. 
There are essentially two implementations of the idea. First, there's the "LT Codes" and "Raptor Codes". Second, there's the "Online Codes". Both are very similar in a lot of ways. There are also some fundamental problems. See this one: http://citeseer.ist.psu.edu/695965.html I didn't know that Online codes have now been patented. However, if you consider the code, you've got essentially two pieces. First, there's the LDPC cipher being used in erasure-handling only. Second, there's the inner error correction cipher. The inner cipher is what makes the fundamental difference between LT Codes and Online Codes. However, there is absolutely nothing to say that you can't use say a punctured rate-1 outer code (repitition-style codes) with a suitable scrambler, or vary the inner code with something that gives equivalent performance (even a BCH code). Patents only work as long as you implement ALL the features of the patent. From gojomo at bitzi.com Wed Feb 16 05:41:05 2005 From: gojomo at bitzi.com (Gordon Mohr (@ Bitzi)) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: <4212DCF1.1070909@bitzi.com> Via Slashdot, as reported by Bruce Schneier: http://www.schneier.com/blog/archives/2005/02/sha1_broken.html Schneier writes: # SHA-1 Broken # # SHA-1 has been broken. Not a reduced-round version. Not a # simplified version. The real thing. # # The research team of Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu # (mostly from Shandong University in China) have been quietly # circulating a paper announcing their results: # # * collisions in the the full SHA-1 in 2**69 hash operations, # much less than the brute-force attack of 2**80 operations # based on the hash length. # # * collisions in SHA-0 in 2**39 operations. # # * collisions in 58-round SHA-1 in 2**33 operations. # # This attack builds on previous attacks on SHA-0 and SHA-1, and # is a major, major cryptanalytic result. 
It pretty much puts a # bullet into SHA-1 as a hash function for digital signatures # (although it doesn't affect applications such as HMAC where # collisions aren't important). # # The paper isn't generally available yet. At this point I can't # tell if the attack is real, but the paper looks good and this # is a reputable research team. # # More details when I have them. - Gordon @ Bitzi From jeffh at cs.rice.edu Wed Feb 16 06:51:45 2005 From: jeffh at cs.rice.edu (Jeff Hoye) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <4212DCF1.1070909@bitzi.com> References: <4212DCF1.1070909@bitzi.com> Message-ID: <4212ED81.4030606@cs.rice.edu> Let's wait for a real report. But it's cool if it's true. -Jeff Gordon Mohr (@ Bitzi) wrote: > Via Slashdot, as reported by Bruce Schneier: > > http://www.schneier.com/blog/archives/2005/02/sha1_broken.html > > Schneier writes: > > # SHA-1 Broken > # > # SHA-1 has been broken. Not a reduced-round version. Not a > # simplified version. The real thing. > # > # The research team of Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu > # (mostly from Shandong University in China) have been quietly > # circulating a paper announcing their results: > # > # * collisions in the the full SHA-1 in 2**69 hash operations, > # much less than the brute-force attack of 2**80 operations > # based on the hash length. > # > # * collisions in SHA-0 in 2**39 operations. > # > # * collisions in 58-round SHA-1 in 2**33 operations. > # > # This attack builds on previous attacks on SHA-0 and SHA-1, and > # is a major, major cryptanalytic result. It pretty much puts a > # bullet into SHA-1 as a hash function for digital signatures > # (although it doesn't affect applications such as HMAC where > # collisions aren't important). > # > # The paper isn't generally available yet. At this point I can't > # tell if the attack is real, but the paper looks good and this > # is a reputable research team. > # > # More details when I have them. 
> > - Gordon @ Bitzi > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From osokin at osokin.com Wed Feb 16 08:11:07 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <4212DCF1.1070909@bitzi.com> Message-ID: > # * collisions in the the full SHA-1 in 2**69 hash operations, > # much less than the brute-force attack of 2**80 operations... Okay, so the effective SHA-1 length is 138 bits instead of full 160 - so what's the big deal? It is still way more than, say, MD5 length. And MD5 is still widely used for stuff like content id'ing in various systems, because even 128 bits is quite a lot, never mind 138 bits. Best wishes - S.Osokine. 16 Feb 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Gordon Mohr (@ Bitzi) Sent: Tuesday, February 15, 2005 9:41 PM To: p2p-hackers Subject: [p2p-hackers] SHA1 broken? Via Slashdot, as reported by Bruce Schneier: http://www.schneier.com/blog/archives/2005/02/sha1_broken.html Schneier writes: # SHA-1 Broken # # SHA-1 has been broken. Not a reduced-round version. Not a # simplified version. The real thing. # # The research team of Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu # (mostly from Shandong University in China) have been quietly # circulating a paper announcing their results: # # * collisions in the the full SHA-1 in 2**69 hash operations, # much less than the brute-force attack of 2**80 operations # based on the hash length. # # * collisions in SHA-0 in 2**39 operations. # # * collisions in 58-round SHA-1 in 2**33 operations. 
# # This attack builds on previous attacks on SHA-0 and SHA-1, and # is a major, major cryptanalytic result. It pretty much puts a # bullet into SHA-1 as a hash function for digital signatures # (although it doesn't affect applications such as HMAC where # collisions aren't important). # # The paper isn't generally available yet. At this point I can't # tell if the attack is real, but the paper looks good and this # is a reputable research team. # # More details when I have them. - Gordon @ Bitzi _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From gojomo at bitzi.com Wed Feb 16 09:10:13 2005 From: gojomo at bitzi.com (Gordon Mohr (@ Bitzi)) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: References: Message-ID: <42130DF5.3020708@bitzi.com> Serguei Osokine wrote: >># * collisions in the the full SHA-1 in 2**69 hash operations, >># much less than the brute-force attack of 2**80 operations... > > > Okay, so the effective SHA-1 length is 138 bits instead of full > 160 - so what's the big deal? If the results hold up: SHA1 is not as strong as it was designed to be, and its effective strength is being sent in the wrong direction, rather than being confirmed, by new research. Even while maintaining that SHA1 was unbroken and likely to remain so just last week, NIST was still recommending that SHA1 be phased out of government use by 2010: http://www.fcw.com/fcw/articles/2005/0207/web-hash-02-07-05.asp One more paper from a group of precocious researchers anywhere in the world, or unpublished result exploited in secret, could topple SHA1 from practical use entirely. 
Of course, that's remotely possible with any hash, but the pattern of recent results suggest that a further break is now more likely with SHA1 (and related hashes) than others. So the big deal would be: don't rely on SHA1 in any applications you intend to have a long effective life. > It is still way more than, say, MD5 > length. And MD5 is still widely used for stuff like content id'ing > in various systems, because even 128 bits is quite a lot, never > mind 138 bits. Just because it's widely used doesn't mean it's a good idea. MD5 should not be used for content identification, given the ability to create content pairs with the same MD5, with one version being (and appearing and acquiring a reputation for being) innocuous, and the other version malicious. - Gordon @ Bitzi From paul at ref.nmedia.net Wed Feb 16 13:15:36 2005 From: paul at ref.nmedia.net (Paul Campbell) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <4212DCF1.1070909@bitzi.com> References: <4212DCF1.1070909@bitzi.com> Message-ID: <20050216131536.GA27730@ref.nmedia.net> On Tue, Feb 15, 2005 at 09:41:05PM -0800, Gordon Mohr (@ Bitzi) wrote: > Via Slashdot, as reported by Bruce Schneier: > > http://www.schneier.com/blog/archives/2005/02/sha1_broken.html > > Schneier writes: > > # SHA-1 Broken I saw this a few months ago. It's not just SHA-1. All ciphers based on the MD-5 S-box design are apparently vulnerable. At this point, it appears that there are two options for the future: 1. Go to something with a larger internal state (256-bit state), and that is NOT just an extended version of the original (as the extended SHA standards attempt to do). 2. Go to a completely different type of cipher. The choices right now are either digital signatures via elliptic curves, or else using one of the stream cipher designs. Since neither one is really optimized for hashing-type operations, they are essentially no-go's for most P2P uses (e.g. DHT's). 
When I say "optimized", by that I mean very SLOW by the way. From ap at hamachi.cc Wed Feb 16 16:03:47 2005 From: ap at hamachi.cc (Alex Pankratov) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <20050216131536.GA27730@ref.nmedia.net> References: <4212DCF1.1070909@bitzi.com> <20050216131536.GA27730@ref.nmedia.net> Message-ID: <42136EE3.4000001@hamachi.cc> Paul Campbell wrote: > 2. Go to a completely different type of cipher. The choices right now are > either digital signatures via elliptic curves, ... By the way - is ECC patented ? I heard Sun had some activity around ECC patents, Certicom has patents for a curve selection algorithms, but is core ECC patented ? Or rather - is it in public domain or not ? I am seriously considering ECDSA as a replacement for RSA as it seems to be significantly faster for the same crypto strength. From Serguei.Osokine at efi.com Wed Feb 16 16:37:31 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: <4A60C83D027E224BAA4550FB1A2B120E0DC35B@fcexmb04.efi.internal> On Wednesday, February 16, 2005 Gordon Mohr wrote: > MD5 should not be used for content identification, given the > ability to create content pairs with the same MD5, with one > version being (and appearing and acquiring a reputation for > being) innocuous, and the other version malicious. Right. So let's go and try to find something with the same MD5 as this letter of mine, shall we? :-) For any practical purpose that I can imagine in a content identification field, MD5 is just fine. And SHA-1 is even more fine. There are plenty more simple ways to attack the CDN nets than MD5 collisions. Way more simple. And abandoning MD5 for SHA1, then SHA1 for Tiger, and then abandoning Tiger for some newer hash when some researcher finds that it is really twenty bits weaker than you thought - it is all just a huge waste of development effort, as far as I'm concerned. 
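The "effective length" arithmetic used earlier in the thread follows from the birthday bound: an ideal n-bit hash costs about 2**(n/2) work to collide, so a collision attack costing 2**69 operations makes SHA-1 behave like a 138-bit hash. A few lines of Python make the comparison explicit (illustrative only; `effective_bits` is an invented name, not from any library):

```python
def effective_bits(log2_collision_work):
    """Birthday bound: collision work of 2**(n/2) implies n = 2 * log2(work)."""
    return 2 * log2_collision_work

assert effective_bits(80) == 160   # SHA-1 as designed (2**80 collision work)
assert effective_bits(69) == 138   # SHA-1 after the reported attack
assert effective_bits(64) == 128   # MD5 as designed (its own breaks not counted)

# The point being argued: even weakened SHA-1 exceeds MD5's design strength
print(effective_bits(69) - effective_bits(64))   # -> 10
```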
It sure is nice to know that the human mind can find collisions in a 160-bit hash, but I have a feeling that the practical meaning of this result in the content identification area is precisely zero. Probably the biggest effect will be that the more advanced of the marketing types will start saying with a knowing look: "ah, but SHA1 was compromised - shouldn't we use something more secure?" Which is a plenty effect by itself, I'll grant you that. It will be way easier to switch to a newer hash than to explain to these guys that this is all a load of bull. But this is a Chicken Little effect, which is of a psychological rather than of a technical nature, and I'd expect to find the concerns about SHA1 weakness on some marketing forum rather than here. (All of the above is only about the content identification in the P2P nets, of course. Security/authentication is a different story. But saying that MD5 should not be used for the content identification does seem like a bit of an overstatement to me. I mean, imagine yourself a Gnutella network - so its biggest, major, noticeable, or even existing concern is a collision in the content hashes? Are you kidding? :-) Best wishes - S.Osokine. 16 Feb 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Gordon Mohr (@ Bitzi) Sent: Wednesday, February 16, 2005 1:10 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] SHA1 broken? Serguei Osokine wrote: >># * collisions in the the full SHA-1 in 2**69 hash operations, >># much less than the brute-force attack of 2**80 operations... > > > Okay, so the effective SHA-1 length is 138 bits instead of full > 160 - so what's the big deal? If the results hold up: SHA1 is not as strong as it was designed to be, and its effective strength is being sent in the wrong direction, rather than being confirmed, by new research. 
Even while maintaining that SHA1 was unbroken and likely to remain so just last week, NIST was still recommending that SHA1 be phased out of government use by 2010: http://www.fcw.com/fcw/articles/2005/0207/web-hash-02-07-05.asp One more paper from a group of precocious researchers anywhere in the world, or an unpublished result exploited in secret, could topple SHA1 from practical use entirely. Of course, that's remotely possible with any hash, but the pattern of recent results suggests that a further break is now more likely with SHA1 (and related hashes) than others. So the big deal would be: don't rely on SHA1 in any applications you intend to have a long effective life. > It is still way more than, say, MD5 > length. And MD5 is still widely used for stuff like content id'ing > in various systems, because even 128 bits is quite a lot, never > mind 138 bits. Just because it's widely used doesn't mean it's a good idea. MD5 should not be used for content identification, given the ability to create content pairs with the same MD5, with one version being (and appearing and acquiring a reputation for being) innocuous, and the other version malicious. - Gordon @ Bitzi _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From lloyd at randombit.net Wed Feb 16 22:05:17 2005 From: lloyd at randombit.net (Jack Lloyd) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken?
In-Reply-To: <20050216131536.GA27730@ref.nmedia.net> References: <4212DCF1.1070909@bitzi.com> <20050216131536.GA27730@ref.nmedia.net> Message-ID: <20050216220516.GC29536@randombit.net> On Wed, Feb 16, 2005 at 05:15:36AM -0800, Paul Campbell wrote: > On Tue, Feb 15, 2005 at 09:41:05PM -0800, Gordon Mohr (@ Bitzi) wrote: > > Via Slashdot, as reported by Bruce Schneier: > > > > http://www.schneier.com/blog/archives/2005/02/sha1_broken.html > > > > Schneier writes: > > > > # SHA-1 Broken > > I saw this a few months ago. It's not just SHA-1. All ciphers based on the > MD-5 S-box design are apparently vulnerable. At this point, it appears that > there are two options for the future: No, there were no major results against the full 80-round SHA-1 until this. There were collisions with ~50 of the 80 rounds for SHA-1, and Joux found a collision for SHA-0 around the same time Wang et al. produced the collisions for MD4/MD5/RIPEMD/HAVAL-128 last summer. BTW, MD5 does not use S-boxes in any form. > 1. Go to something with a larger internal state (256-bit state), and that is > NOT just an extended version of the original (as the extended SHA standards > attempt to do). Currently Whirlpool is looking like the best bet. Tiger is still out there, and is both reasonably fast on 32-bit machines and very fast on 64-bit, but it never saw much analysis, as the designers expected the 64-bit revolution about 8 years too early. Both are quite unlike the MDx designs, which is both good (possibly less likely to fall to whatever methods Wang and crew have) and bad (less analysis has been done). A major issue is that currently the details of the attacks haven't been published. All we really have right now are a set of collisions for various hashes, which proves that there are weaknesses, but until we know the details there is no way to say that they will or won't apply to Whirlpool/Tiger/SHA-2/etc.
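An editorial aside, not part of the original exchange: the "larger internal state" option Jack mentions is what the SHA-2 family (standardized by 2005) provides. A quick sketch with Python's hashlib shows the digest sizes of the hashes under discussion; the sample input is arbitrary:

```python
import hashlib

# Editor's sketch: output sizes of the hash functions discussed in this
# thread, as reported by Python's hashlib.
data = b"hello p2p-hackers"
for name in ("md5", "sha1", "sha256", "sha512"):
    h = hashlib.new(name, data)
    print(f"{name:>6}: {h.digest_size * 8:3d} bits  {h.hexdigest()[:16]}...")
```

A larger digest raises the generic birthday bound (2^64 for MD5's 128 bits, 2^128 for SHA-256's 256 bits), which is separate from the structural weaknesses the thread is about.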
Fortunately the 2^69 worklimit on SHA-1 is currently theoretical for everyone but the TLAs, so the paper will have to explain the attack in sufficient detail to verify the results, from which people more competent than me can see if the attacks do (or might) apply to the latest generation of hash functions. The real key is not just to upgrade, but to provide a smooth upgrade path in the future. Before SHA-1, the average security lifetime of a hash was about 5 years. I suspect we're seeing a return to that level of cycling; for the most part analysis of hash functions is not nearly as developed as that for block ciphers. > > 2. Go to a completely different type of cipher. The choices right now are > either digital signatures via elliptic curves, or else using one of the ECDSA and ECNR still use conventional hash functions; you don't reduce the impact of an attack on SHA-1 by using either of those as compared to DSA or RSA. > stream cipher designs. I am not aware of any methods of hashing with just a stream cipher; are you referring to Panama? Panama's stream cipher mode is still secure AFAIK, but the Panama transform has been shown insecure for hashing (IIRC with 2^80 operations, versus the expected 2^128). Regards, Jack From gojomo at bitzi.com Thu Feb 17 04:12:18 2005 From: gojomo at bitzi.com (Gordon Mohr (@ Bitzi)) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120E0DC35B@fcexmb04.efi.internal> References: <4A60C83D027E224BAA4550FB1A2B120E0DC35B@fcexmb04.efi.internal> Message-ID: <421419A2.80307@bitzi.com> Serguei Osokine wrote: > On Wednesday, February 16, 2005 Gordon Mohr wrote: > >>MD5 should not be used for content identification, given the >>ability to create content pairs with the same MD5, with one >>version being (and appearing and acquiring a reputation for >>being) innocuous, and the other version malicious. > > > Right.
So let's go and try to find something with the same > MD5 as this letter of mine, shall we? :-) I can't -- but you could have made a collision, very easily, if you composed your initial message with the intent of also composing an MD5 twin at the same time. That means for content identification MD5 is fatally flawed. For any file whose contents I think I know and trust, perhaps based on analysis and history of the file, there could be another dangerous file with the same MD5. MD5 cannot be used to distinguish between the two, but that's the whole point of using a secure hash for content identification. Dan Kaminsky runs over a number of potential attacks that are relevant to P2P -- see: http://paketto.doxpara.com Don't be fooled by the title of his analysis, "MD to be considered harmful someday" -- the attacks mentioned are possible now, and could trick people and software in subtle ways different from other threats to P2P nets. Here's another example from the cryptography list that convinced a doubter that the attacks on MD5 were of more than purely theoretical interest: two long binary strings, one a prime number, one not: http://lists.virus.org/cryptography-0412/msg00102.html Consider source code or executables which work fine with the primes, s-boxes, and other initialization vectors initially examined -- but have exploitable flaws when those values are perturbed in a manner that leaves the MD5 the same. You need to use a different, stronger content check to prevent such mischief -- making the use of MD5 redundant and even dangerous for the false sense of security it gives. > For any practical purpose that I can imagine in a content > identification field, MD5 is just fine. And SHA-1 is even more > fine. If you can't imagine exploits, perhaps it's just a failure of your imagination. Prudent engineering would assume some attackers have better imaginations than you, when it comes to exploiting hashes that don't work as originally intended. 
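An editorial aside, not part of the original exchange: the "different, stronger content check" Gordon describes amounts to comparing a file against the digest it was advertised under, using a hash with no known collision attack. A minimal sketch, in which the function name and interface are invented for illustration and stand in for no real client's code:

```python
import hashlib

def verify_content(data: bytes, expected_hex: str, algo: str = "sha256") -> bool:
    """Check received content against the digest it was advertised under.

    Editor's sketch of hash-based content identification; the name and
    interface are hypothetical.
    """
    return hashlib.new(algo, data).hexdigest() == expected_hex

payload = b"some shared file"
good_id = hashlib.sha256(payload).hexdigest()
assert verify_content(payload, good_id)             # matching content passes
assert not verify_content(payload + b"x", good_id)  # any change fails
```

The check is only as strong as the hash behind it: with MD5, an attacker who crafted both twins of a colliding pair passes this test with either file, which is exactly the failure mode Gordon describes.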
> There are plenty more simple ways to attack the CDN nets > than MD5 collisions. Way more simple. And abandoning MD5 for > SHA1, then SHA1 for Tiger, and then abandoning Tiger for some > newer hash when some researcher finds that it is really twenty > bits weaker than you thought - it is all just a huge waste of > development effort, as far as I'm concerned. Depends on the kinds of attacks you're worried about. There are more simple ways to disrupt P2P nets, sure. But are there more simple ways to trick conscientious, hash-checking users into running malware? And since when did the ease of other attacks become an excuse for ignoring more complicated and subtle (and thus perhaps more valuable) attacks? If you need a secure hash's properties in your software, you should use an uncompromised secure hash. (Results as early as 1996 suggested MD5 should not be used in applications where collision-resistance is important.) If you're stuck with a legacy hash, fine, analyze the situation and if you're confident the weakness has no effect on current usage, rationalize using it a while longer. But get ready for the potential need to switch hashes quickly in the presence of further discoveries. Or better yet: design with the idea in mind that no hash function lives forever. - Gordon @ Bitzi From osokin at osokin.com Thu Feb 17 07:37:55 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <421419A2.80307@bitzi.com> Message-ID: On Wednesday, February 16, 2005 Gordon Mohr wrote: > Dan Kaminsky runs over a number of potential attacks that > are relevant to P2P -- see: > > http://paketto.doxpara.com > ... > Here's another example from the cryptography list that convinced > a doubter... Certainly looks cute. 
Now correct me if I'm not getting something here - but isn't it true that in order to mount an attack one has to replace the "good" code (content, whatever) by the "bad" code, and the absolutely necessary condition is that the "good" code also has to be created by an attacker? So an attacker creates "good" code, gives it to security experts for verification, and then after they are done, replaces it with "bad code", right? Isn't it a bit far-fetched? Do we have a somewhat more realistic attack scenario? I just cannot imagine all this happening in real life. Real-life breakdowns always tend to be way simpler than their theoretical scenarios (and totally unexpected, too). > But are there more simple ways to trick conscientious, hash-checking > users into running malware? Users typically don't give a damn about hash-checking; they expect the system to do that for them. And a few users that do give a damn typically can defend themselves from pretty much anything no matter what you throw at them. So the fate of this "expert" group (consisting of about ten people for any given P2P system) does not really worry me, whereas for the rest of the user population there are plenty of ways to trick them into running the malware - *all* the current ways of doing so are simpler than fiddling with hashes. Which brings me back to my question above: do we have a realistic scenario where a network like Gnutella would be harmed by using MD5? (Not that I give a damn about MD5, and no one in Gnutella probably uses it anyway; my interest is largely theoretical here, and the same issues might be relevant for the other hashes, either.) > And since when did the ease of other attacks become an excuse > for ignoring more complicated and subtle (and thus perhaps > more valuable) attacks? Why, every time you do not have infinite development resources, of course. 
You always have to juggle priorities, and subtle attacks typically are not anywhere close to the head of the development priority list for P2P networks... > Or better yet: design with the idea in mind that no hash function > lives forever. Sure; but that's orthogonal: > If you're stuck with a legacy hash, fine, analyze the situation > and if you're confident the weakness has no effect on current > usage, rationalize using it a while longer. My point exactly. The issue is whether one should consider the deployed legacy codebase unsecure after every new discovery is made in the hash collision research or not. My personal approach would be to disregard the possible collision issues until there is a problem serious enough to be noticed by CNN. (So far I still cannot see any *realistic* attack scenario; maybe your next letter will convince me that I'm wrong :-) But everyone has a personal "worry threshold", I guess. Mine is pretty low... Best wishes - S.Osokine. 16 Feb 2005. From em at em.no-ip.com Thu Feb 17 11:11:13 2005 From: em at em.no-ip.com (Enzo Michelangeli) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? References: <4212DCF1.1070909@bitzi.com><20050216131536.GA27730@ref.nmedia.net> <42136EE3.4000001@hamachi.cc> Message-ID: <005b01c514e1$6dcee580$0200a8c0@em.noip.com> ----- Original Message ----- From: "Alex Pankratov" To: "Peer-to-peer development." Sent: Thursday, February 17, 2005 12:03 AM Subject: Re: [p2p-hackers] SHA1 broken? [...]
> By the way - is ECC patented ? I heard Sun had some activity around > ECC patents, Certicom has patents for a curve selection algorithms, > but is core ECC patented ? Or rather - is it in public domain or not ? Answers to patent-related questions are not Turing computable ;-) Anyway, several years ago the IEEE made an effort to collect statements and claims about intellectual property on PK encryption algorithms: http://grouper.ieee.org/groups/1363/P1363/patents.html Several of the letters collected refer to EC-related areas (Nyberg-Rueppel signatures, point compression techniques, etc.) Enzo From gojomo at bitzi.com Thu Feb 17 18:23:51 2005 From: gojomo at bitzi.com (Gordon Mohr (@ Bitzi)) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: References: Message-ID: <4214E137.8000109@bitzi.com> Serguei Osokine wrote: > On Wednesday, February 16, 2005 Gordon Mohr wrote: > >>Dan Kaminsky runs over a number of potential attacks that >>are relevant to P2P -- see: >> >> http://paketto.doxpara.com >>... >>Here's another example from the cryptography list that convinced >>a doubter... > > > Certainly looks cute. Now correct me if I'm not getting something > here - but isn't it true that in order to mount an attack one has to > replace the "good" code (content, whatever) by the "bad" code, and the > absolutely necessary condition is that the "good" code also has to be > created by an attacker? So an attacker creates "good" code, gives it > to security experts for verification, and then after they are done, > replaces it with "bad code", right? Yes. > Isn't it a bit far-fetched? Do we have a somewhat more realistic > attack scenario? I just cannot imagine all this happening in real > life. Real-life breakdowns always tend to be way simpler than their > theoretical scenarios (and totally unexpected, too). It's possible. It's not that hard. 
It would offer rewards to an attacker that are different and possibly larger than those offered by the simple tricks that reel in easy marks. So it doesn't seem that far-fetched to me. >>But are there more simple ways to trick conscientious, hash-checking >>users into running malware? > > > Users typically don't give a damn about hash-checking; they > expect the system to do that for them. And a few users that do give > a damn typically can defend themselves from pretty much anything no > matter what you throw at them. So the fate of this "expert" group > (consisting of about ten people for any given P2P system) does not > really worry me, whereas for the rest of the user population there > are plenty of ways to trick them into running the malware - *all* > the current ways of doing so are simpler than fiddling with hashes. If your attack is just to get someone, somewhere to run your malware, sure. But the average/mass user is not the only interesting case. If you want to get onto other, higher-valued machines, you have to get around the real practices of many users, of various sophistication, who do care about hashes of received content matching expected values. For such people, to get them to settle for MD5, you either convince them not to worry about the potential attack -- making them potential victims -- or you lose them as users, because they realize that the hashes used for content-identification on your network do not offer the guarantee they seek. That's not a good result. I want P2P+CDN that delivers content that I and other sophisticated users can trust, and I want the unsophisticated users on the same network, too: I gain from their presence as peers/ seeds, and they can gain from my insistence on rigorous content identification. > Which brings me back to my question above: do we have a > realistic scenario where a network like Gnutella would be harmed by > using MD5? 
Having installers like the fire.exe/ice.exe described by Kaminsky, which have the same MD5 but install different software, could quickly undermine confidence in an MD5-only P2P network for most kinds of content delivery. Telling average users (or businesses considering P2P delivery), "but that's only when the attacker gets to create both files", is noise to them. (And for pro users, telling them that they have to trust the original creator of the file not to have created twins is tantamount to requiring the content to be separately digitally signed to prove origination -- an additional step rendering the plain standalone MD5 for content identification superfluous.) > (Not that I give a damn about MD5, and no one in Gnutella probably > uses it anyway; my interest is largely theoretical here, and the same > issues might be relevant for the other hashes, either.) > > >>And since when did the ease of other attacks become an excuse >>for ignoring more complicated and subtle (and thus perhaps >>more valuable) attacks? > > > Why, every time you do not have infinite development resources, > of course. You always have to juggle priorities, and subtle attacks > typically are not anywhere close to the head of the development > priority list for P2P networks... Of course work has to be prioritized in context. But the priority list is not a single-file line, where a few frontmost entries prevent consideration of everything else. In particular, I would guess the "head of the development priority list" for most commercial P2P networks is dominated by user satisfaction issues. But these are only remedied incrementally, with research and trial and error. The risk of delay is incremental competitive decay, and the work is never really "done". At the same time, developers can be addressing other specific flaws -- failures of the software and chosen algorithms to deliver the functionality intended. Such flaws can't be ignored forever. 
They may be easy to fix with a discrete amount of effort. And since transitioning hash functions requires lead time, the groundwork should be laid before any change is urgent. >>Or better yet: design with the idea in mind that no hash function >>lives forever. > > > Sure; but that's orthogonal: > > >>If you're stuck with a legacy hash, fine, analyze the situation >>and if you're confident the weakness has no effect on current >>usage, rationalize using it a while longer. > > > My point exactly. The issue is whether one should consider the > deployed legacy codebase unsecure after every new discovery is made > in the hash collision research or not. My personal approach would be > to disregard the possible collision issues until there is a problem > serious enough to be noticed by CNN. (So far I still cannot see any > *realistic* attack scenario; maybe your next letter will convince me > that I'm wrong :-) But everyone has a personal "worry threshold", > I guess. Mine is pretty low... I suppose it depends on how high your ambitions for P2P are. Clearly, you can have a very popular network with a very weak hash for quite a while -- witness ED2K, using MD4, a hash "broken" for over a decade. But over time, users have become more aware of the importance of hash-based content-verification, and users have generally migrated in the direction of more-rigorous hash-using networks -- though not to the *most* rigorous networks. If P2P is just a leisure-time lark for credulous, casual users who have many other unhygienic computing practices, then you can be lackadaisical in your use of hash algorithms. If you want it to also be a platform stable for long-term use by more discriminating users and commercial endeavors, you should take the strength of your hashes seriously. If you wait until someone is hurt enough that the damage is reported on CNN, that's too long. - Gordon @ Bitzi > Best wishes - > S.Osokine. > 16 Feb 2005.
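An editorial aside, not part of the original exchange: Gordon's "design with the idea in mind that no hash function lives forever" can be sketched as algorithm-tagged content identifiers, so a network can retire a broken hash without changing its identifier format. The "algo:hexdigest" layout below is invented for illustration (the later multihash format follows the same idea):

```python
import hashlib

# Editor's sketch of hash agility: every identifier names the algorithm
# that produced it, so deprecating a hash is a policy change, not a
# wire-format change.
def make_id(data: bytes, algo: str = "sha256") -> str:
    return f"{algo}:{hashlib.new(algo, data).hexdigest()}"

def check_id(data: bytes, content_id: str, deprecated=("md5", "sha1")) -> bool:
    algo, _, digest = content_id.partition(":")
    if algo in deprecated:
        return False  # refuse identifiers minted with a retired hash
    return hashlib.new(algo, data).hexdigest() == digest

blob = b"payload"
assert check_id(blob, make_id(blob))              # current hash accepted
assert not check_id(blob, make_id(blob, "sha1"))  # retired hash rejected
```

Laying this groundwork in advance is the "lead time" point: once identifiers are self-describing, switching hashes after a break is an incremental migration rather than a flag day.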
From Serguei.Osokine at efi.com Thu Feb 17 18:37:34 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: <4A60C83D027E224BAA4550FB1A2B120E0DC368@fcexmb04.efi.internal> On Thursday, February 17, 2005 Gordon Mohr wrote: > I want P2P+CDN that delivers content that I and other sophisticated > users can trust, and I want the unsophisticated users on the same > network, too... > ... > If P2P is just a leisure-time lark for credulous, casual users who > have many other unhygienic computing practices, then you can be > lackadaisical in your use of hash algorithms. If you want it to > also be a platform stable for long-term use by more discriminating > users and commercial endeavors, you should take the strength of > your hashes seriously. Fair enough.
So how do you prevent the DNS-hijacking of Bitzi? Or - way more importantly - how do you prevent the fake .torrent files from being submitted to any number of torrent aggregator sites? That one would be a method of choice for me, if I'd be in the mood to distribute some malicious code to many machines at once, and would not be in the mood to use a virus for that purpose. Best wishes - S.Osokine. 17 Feb 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Gordon Mohr (@ Bitzi) Sent: Thursday, February 17, 2005 10:24 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] SHA1 broken? Serguei Osokine wrote: > On Wednesday, February 16, 2005 Gordon Mohr wrote: > >>Dan Kaminsky runs over a number of potential attacks that >>are relevant to P2P -- see: >> >> http://paketto.doxpara.com >>... >>Here's another example from the cryptography list that convinced >>a doubter... > > > Certainly looks cute. Now correct me if I'm not getting something > here - but isn't it true that in order to mount an attack one has to > replace the "good" code (content, whatever) by the "bad" code, and the > absolutely necessary condition is that the "good" code also has to be > created by an attacker? So an attacker creates "good" code, gives it > to security experts for verification, and then after they are done, > replaces it with "bad code", right? Yes. > Isn't it a bit far-fetched? Do we have a somewhat more realistic > attack scenario? I just cannot imagine all this happening in real > life. Real-life breakdowns always tend to be way simpler than their > theoretical scenarios (and totally unexpected, too). It's possible. It's not that hard. It would offer rewards to an attacker that are different and possibly larger than those offered by the simple tricks that reel in easy marks. So it doesn't seem that far-fetched to me. >>But are there more simple ways to trick conscientious, hash-checking >>users into running malware? 
> > > Users typically don't give a damn about hash-checking; they > expect the system to do that for them. And a few users that do give > a damn typically can defend themselves from pretty much anything no > matter what you throw at them. So the fate of this "expert" group > (consisting of about ten people for any given P2P system) does not > really worry me, whereas for the rest of the user population there > are plenty of ways to trick them into running the malware - *all* > the current ways of doing so are simpler than fiddling with hashes. If your attack is just to get someone, somewhere to run your malware, sure. But the average/mass user is not the only interesting case. If you want to get onto other, higher-valued machines, you have to get around the real practices of many users, of various sophistication, who do care about hashes of received content matching expected values. For such people, to get them to settle for MD5, you either convince them not to worry about the potential attack -- making them potential victims -- or you lose them as users, because they realize that the hashes used for content-identification on your network do not offer the guarantee they seek. That's not a good result. I want P2P+CDN that delivers content that I and other sophisticated users can trust, and I want the unsophisticated users on the same network, too: I gain from their presence as peers/ seeds, and they can gain from my insistence on rigorous content identification. > Which brings me back to my question above: do we have a > realistic scenario where a network like Gnutella would be harmed by > using MD5? Having installers like the fire.exe/ice.exe described by Kaminsky, which have the same MD5 but install different software, could quickly undermine confidence in an MD5-only P2P network for most kinds of content delivery. Telling average users (or businesses considering P2P delivery), "but that's only when the attacker gets to create both files", is noise to them. 
(And for pro users, telling them that they have to trust the original creator of the file not to have created twins is tantamount to requiring the content to be separately digitally signed to prove origination -- an additional step rendering the plain standalone MD5 for content identification superfluous.) > (Not that I give a damn about MD5, and no one in Gnutella probably > uses it anyway; my interest is largely theoretical here, and the same > issues might be relevant for the other hashes, either.) > > >>And since when did the ease of other attacks become an excuse >>for ignoring more complicated and subtle (and thus perhaps >>more valuable) attacks? > > > Why, every time you do not have infinite development resources, > of course. You always have to juggle priorities, and subtle attacks > typically are not anywhere close to the head of the development > priority list for P2P networks... Of course work has to be prioritized in context. But the priority list is not a single-file line, where a few frontmost entries prevent consideration of everything else. In particular, I would guess the "head of the development priority list" for most commercial P2P networks is dominated by user satisfaction issues. But these are only remedied incrementally, with research and trial and error. The risk of delay is incremental competitive decay, and the work is never really "done". At the same time, developers can be addressing other specific flaws -- failures of the software and chosen algorithms to deliver the functionality intended. Such flaws can't be ignored forever. They may be easy to fix with a discrete amount of effort. And since transitioning hash functions requires lead time, the groundwork should be laid before any change is urgent. >>Or better yet: design with the idea in mind that no hash function >>lives forever. 
> > > Sure; but that's orthogonal: > > >>If you're stuck with a legacy hash, fine, analyze the situation >>and if you're confident the weakness has no effect on current >>usage, rationalize using it a while longer. > > > My point exactly. The issue is whether one should consider the > deployed legacy codebase unsecure after every new discovery is made > in the hash collision research or not. My personal approach would be > to disregard the possible collision issues until there is a problem > serious enough to be noticed by CNN. (So far I still cannot see any > *realistic* attack scenario; maybe your next letter will convince me > that I'm wrong :-) But everyone has a personal "worry threshold", > I guess. Mine is pretty low... I suppose it depends on how high your ambitions for P2P are. Clearly, you can have a very popular network with a very weak hash for quite a while -- witness ED2K, using MD4, a hash "broken" for over a decade. But over time, users have become more aware of the importance of hash-based content-verification, and users have generally migrated in the direction of more-rigorous hash-using networks -- though not to the *most* rigorous networks. If P2P is just a leisure-time lark for credulous, casual users who have many other unhygienic computing practices, then you can be lackadaisical in your use of hash algorithms. If you want it to also be a stable platform for long-term use by more discriminating users and commercial endeavors, you should take the strength of your hashes seriously. If you wait until someone is hurt enough that the damage is reported on CNN, that's too long. - Gordon @ Bitzi > Best wishes - > S.Osokine. > 16 Feb 2005. > > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Gordon Mohr (@ Bitzi) > Sent: Wednesday, February 16, 2005 8:12 PM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] SHA1 broken?
> > > Serguei Osokine wrote: > >>On Wednesday, February 16, 2005 Gordon Mohr wrote: >> >> >>>MD5 should not be used for content identification, given the >>>ability to create content pairs with the same MD5, with one >>>version being (and appearing and acquiring a reputation for >>>being) innocuous, and the other version malicious. >> >> >> Right. So let's go and try to find something with the same >>MD5 as this letter of mine, shall we? :-) > > > I can't -- but you could have made a collision, very easily, if > you composed your initial message with the intent of also composing > an MD5 twin at the same time. > > That means for content identification MD5 is fatally flawed. For > any file whose contents I think I know and trust, perhaps based > on analysis and history of the file, there could be another > dangerous file with the same MD5. MD5 cannot be used to distinguish > between the two, but that's the whole point of using a secure > hash for content identification. > > Dan Kaminsky runs over a number of potential attacks that > are relevant to P2P -- see: > > http://paketto.doxpara.com > > Don't be fooled by the title of his analysis, "MD to be considered > harmful someday" -- the attacks mentioned are possible now, and > could trick people and software in subtle ways different from > other threats to P2P nets. > > Here's another example from the cryptography list that convinced > a doubter that the attacks on MD5 were of more than purely > theoretical interest: two long binary strings, one a prime number, > one not: > > http://lists.virus.org/cryptography-0412/msg00102.html > > Consider source code or executables which work fine with the > primes, s-boxes, and other initialization vectors initially > examined -- but have exploitable flaws when those values are > perturbed in a manner that leaves the MD5 the same. 
You need > to use a different, stronger content check to prevent such > mischief -- making the use of MD5 redundant and even dangerous > for the false sense of security it gives. > > >> For any practical purpose that I can imagine in a content >>identification field, MD5 is just fine. And SHA-1 is even more >>fine. > > > If you can't imagine exploits, perhaps it's just a failure of > your imagination. Prudent engineering would assume some attackers > have better imaginations than you, when it comes to exploiting > hashes that don't work as originally intended. > > >>There are plenty more simple ways to attack the CDN nets >>than MD5 collisions. Way more simple. And abandoning MD5 for >>SHA1, then SHA1 for Tiger, and then abandoning Tiger for some >>newer hash when some researcher finds that it is really twenty >>bits weaker than you thought - it is all just a huge waste of >>development effort, as far as I'm concerned. > > > Depends on the kinds of attacks you're worried about. There > are more simple ways to disrupt P2P nets, sure. But are there > more simple ways to trick conscientious, hash-checking users > into running malware? > > And since when did the ease of other attacks become an excuse > for ignoring more complicated and subtle (and thus perhaps > more valuable) attacks? > > If you need a secure hash's properties in your software, you > should use an uncompromised secure hash. (Results as early as > 1996 suggested MD5 should not be used in applications where > collision-resistance is important.) > > If you're stuck with a legacy hash, fine, analyze the situation > and if you're confident the weakness has no effect on current > usage, rationalize using it a while longer. But get ready for > the potential need to switch hashes quickly in the presence of > further discoveries. Or better yet: design with the idea in mind > that no hash function lives forever. 
> > - Gordon @ Bitzi _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From zooko at zooko.com Thu Feb 17 20:30:55 2005 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120E0DC368@fcexmb04.efi.internal> References: <4A60C83D027E224BAA4550FB1A2B120E0DC368@fcexmb04.efi.internal> Message-ID: This topic -- whether collision-resistance is or is not necessary for secure identification of content -- has been discussed extensively on the cryptography@metzdowd mailing list recently. Ben Laurie started it with a post entitled "The pointlessness of MD5 attacks". Here is my contribution to that discussion: http://thread.gmane.org/gmane.comp.encryption.general/5717 This note I posted alludes to this kind of situation: Bob, the honest and noble software maintainer, writes a good piece of software, S1, and then asks Charles the Malicious Multimedia Master to give him an icon to include in the package.
Charles writes some malicious software S2, and then finds an icon I1 and another icon I2 such that MD5(B1) == MD5(B2), where B1 is the binary package resulting from packaging software S1 and icon I1, and B2 is the binary package resulting from packaging software S2 and icon I2. Charles then gives I1 to Bob, who compiles B1 himself. Bob generates T1 == MD5(B1) and distributes B1, telling Alice "Please verify that the binary package you download and run matches the hash T1." Charles then sends Alice a copy of the binary software package B2; she verifies that MD5(B2) == T1, and then trusts the binary package as though it were a package that Bob wrote. Now to be clear: I don't know if the current attacks on MD5 and SHA1 enable Charles to do this! Because I don't know if those attacks can be used when there is a fixed IV or a fixed part of the message which is chosen by someone (Bob) other than the attacker (Charles). However, I do know that if a hash is collision-resistant then the situation outlined above cannot occur, but that if a hash is non-collision-resistant, then the situation outlined above *might* be possible, even if the hash is second-preimage resistant. I guess the challenge presented to Charles in the situation outlined above occupies a sort of middle ground between collision-resistance and second-preimage-resistance. The HMAC challenge occupies another niche in that middle ground -- in the situation described above, Charles is given a fixed IV or a fixed part-of-the-message. In the HMAC situation, Charles is faced with an IV which is random and unknown to him. Regards, Zooko From clodoaldo_gouveia at yahoo.com.br Thu Feb 17 20:34:56 2005 From: clodoaldo_gouveia at yahoo.com.br (Clodoaldo Gouveia) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Changes in Pastry 1.3.2 Message-ID: <20050217203456.76779.qmail@web41127.mail.yahoo.com> Hi There...
We are working on a peer-to-peer project in Brazil that uses the DHT implementation FreePastry 1.3.1, and we now want to move to the newer release, FreePastry 1.3.2. I'd like to know whether anyone here has upgraded an application from FreePastry 1.3.1 to 1.3.2, and what the main differences between the two are. In particular, I'd like to know what changed in message routing: after the upgrade, some errors appeared in our application in the class rice.pastry.messaging.MessageDispatch! I would be very glad for any help, and would like to know what I have to change in my application to fix this problem. Thanks in advance... Clodoaldo Gouveia and Hermano Toscano --------------------------------- Yahoo! Acesso Grátis - Internet rápida e grátis. Instale o discador do Yahoo! agora. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20050217/7394484d/attachment.html From zooko at zooko.com Thu Feb 17 20:39:27 2005 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: References: <4A60C83D027E224BAA4550FB1A2B120E0DC368@fcexmb04.efi.internal> Message-ID: <248c7321e5782025ad5dd664665dbbff@zooko.com> Following up to my own post to correct and add URLs. I wrote: > This topic -- whether collision-resistance is or is not necessary for > secure identification of content -- has been discussed extensively on > the cryptography@metzdowd mailing list recently. Ben Laurie started > it with a post entitled "The pointlessness of MD5 attacks". Here is > my contribution to that discussion: > > http://thread.gmane.org/gmane.comp.encryption.general/5717 ^-- actually, that's the URL to Ben Laurie's original post that started the discussion.
Here's the URL I intended to give -- the URL to my own post about Alice the user, Bob the software maintainer, and Charles the Malicious Multimedia Master: http://article.gmane.org/gmane.comp.encryption.general/5789 Here's the URL to Adam Back's post which suggested the technique which could lead to this bad situation without violating the second-preimage-resistance of the hash function: http://article.gmane.org/gmane.comp.encryption.general/5729 Regards, Zooko From nlothian at educationau.edu.au Thu Feb 17 21:47:36 2005 From: nlothian at educationau.edu.au (Nick Lothian) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: > > Dan Kaminsky runs over a number of potential attacks that > are relevant > > to P2P -- see: > > > > http://paketto.doxpara.com > > ... > > Here's another example from the cryptography list that convinced a > > doubter... > > Certainly looks cute. Now correct me if I'm not getting > something here - but isn't it true that in order to mount an > attack one has to replace the "good" code (content, whatever) > by the "bad" code, and the absolutely necessary condition is > that the "good" code also has to be created by an attacker? > So an attacker creates "good" code, gives it to security > experts for verification, and then after they are done, > replaces it with "bad code", right? > > Isn't it a bit far-fetched? Do we have a somewhat more > realistic attack scenario? I just cannot imagine all this > happening in real life. Real-life breakdowns always tend to > be way simpler than their theoretical scenarios (and totally > unexpected, too). > According to some reports, some anti-spyware tools currently use MD5 hashes to find known-bad software (see http://malektips.com/microsoft_antispyware_0007.html). It's not hard to imagine spyware manufacturers modifying common open-source applications (e.g. p2p software) so they include spyware and yet still have the same hash.
Nick IMPORTANT: This e-mail, including any attachments, may contain private or confidential information. If you think you may not be the intended recipient, or if you have received this e-mail in error, please contact the sender immediately and delete all copies of this e-mail. If you are not the intended recipient, you must not reproduce any part of this e-mail or disclose its contents to any other party. This email represents the views of the individual sender, which does not necessarily reflect those of education.au limited except where the sender expressly states otherwise. It is your responsibility to scan this email and any files transmitted with it for viruses or any other defects. education.au limited will not be liable for any loss, damage or consequence caused directly or indirectly by this e-mail. From Serguei.Osokine at efi.com Thu Feb 17 22:11:28 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: <4A60C83D027E224BAA4550FB1A2B120E0DC36B@fcexmb04.efi.internal> On Thursday, February 17, 2005 Nick Lothian wrote: > It's not hard to imagine spyware manufactures modifying common > opensource applications (eg: p2p software) so they include spyware > and yet still have the same hash. Sure, but then they would have to find some innocently looking way to include something like this into the open source app: d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c 2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89 55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a 08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6 dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0 e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70 - which is no big deal, could be a bitmap. 
However, after that they would have to modify the application to use the text above as a jump table to malicious code, which would be dormant in the application until the data is changed to: d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c 2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89 55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a 08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6 dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0 e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70 If they can pull all of this off without raising any suspicion (which is not a huge problem if no one reads CVS diffs or sources), then they might as well just jump to this malicious code, or jump to it on some moderately obfuscated condition, since no one would notice this code or the jump to begin with if no one monitors the sources. Using an MD5 collision to do that seems like a particularly convoluted way to achieve a goal that can be achieved far more simply without it. Of course, if one is a Rube Goldberg fan, this is something he might want to do as a matter of principle... :-) Best wishes - S.Osokine. 17 Feb 2005.
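The two 128-byte blocks Osokine quotes above are the published Wang et al. MD5 collision pair, so the claim that they hash identically -- and Zooko's earlier point that a random, unknown IV (as in HMAC) defeats the trick -- can both be checked directly. A minimal sketch in Python's standard library, with the hex transcribed from the message above:

```python
import hashlib
import hmac

block_a = bytes.fromhex(
    "d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f89"
    "55ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5b"
    "d8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0"
    "e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70")
block_b = bytes.fromhex(
    "d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f89"
    "55ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5b"
    "d8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0"
    "e99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70")

# The inputs differ (in 6 bytes), yet their MD5 digests are identical.
assert block_a != block_b
assert hashlib.md5(block_a).digest() == hashlib.md5(block_b).digest()

# Zooko's HMAC point: the published collision only works from MD5's fixed
# standard IV. Keying the hash (any key will do for this demonstration)
# changes the internal state the blocks are absorbed into, so the MACs differ.
key = b"any secret key"
mac_a = hmac.new(key, block_a, hashlib.md5).digest()
mac_b = hmac.new(key, block_b, hashlib.md5).digest()
assert mac_a != mac_b
```

Because both blocks also leave MD5's internal chaining state identical, the collision survives appending any common suffix -- which is what makes embedding such blocks in larger files plausible at all.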
From hal at finney.org Thu Feb 17 22:25:36 2005 From: hal at finney.org (Hal Finney) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: <20050217222536.6174957EBA@finney.org> The problem with the attack scenario where two versions of a program are created with the same hash is that, from what little we know of the new attacks, they aren't powerful enough to do this. All of the collisions they have shown have the property where the two alternatives start with the same initial value for the hash; they then have one or two blocks which are very carefully selected, with a few bits differing between the two blocks; and at the end, they are back to a common value for the hash. It is known that their techniques are not sensitive to this initial value. They actually made a mistake when they published their MD5 collision, because they had the wrong initial values due to a typo in Schneier's book. When people gave them the correct initial values, they were able to come up with new collisions within a matter of hours. If you look at their MD5 collision in detail, it was two blocks long. Each block was almost the same as the other, with just a few bits different. They start with the common initial value. Then they run the first blocks through. Amazingly, this has only a small impact on the intermediate value after this first block. Only a relatively few bits are different. If you or I tried to take two blocks with a few bits different and feed them to MD5, we would get totally different outputs. Changing even one bit will normally change half the output bits.
The fact that they are able to change several bits and get only a small difference in the output is the first miracle. But then they do an even better trick. They now go on and do the second pair of blocks. The initial values for these blocks (which are the outputs from the previous stage) are close but not quite the same. And amazingly, these second blocks not only keep things from getting worse, they manage to heal the differences. They precisely compensate for the changes and bring the values back together. This is the second miracle and it is even greater. Now, it would be a big leap from this to being able to take two arbitrary different initial values and bring them together to a common output. That is what would be necessary to mount the code fraud attack. But as we can see by inspection of the collisions produced by the researchers (who are keeping their methodology secret for now), they don't seem to have that power. Instead, they are able to introduce a very carefully controlled difference between the two blocks, and then cancel it. Being able to cancel a huge difference between blocks would be a problem of an entirely different magnitude. Now, there is this other idea which Zooko alludes to, from Dan Kaminsky, www.doxpara.com, which could exploit the power of the new attacks to do something malicious. Let us grant that the only ability we have is that we can create slightly different pairs of blocks that collide. We can't meaningfully control the contents of these blocks, and they will differ in only a few bits. And these blocks have to be inserted into a program being distributed, which will have two versions that are *exactly the same* except for the few bits of difference between the blocks. This way the two versions will have the same hash, and this is the power which the current attacks seem to have. Kaminsky shows that you could still have "good" and "bad" versions of such a program. 
You'd have to write a program which tested a bit in the colliding blocks, and behaved "good" if the bit was set, and "bad" if the bit was clear. When someone reviewed this program, they'd see the potential bad behavior, but they'd also see that the behavior was not enabled because the bit that enabled it was not set. Maybe the bad behavior could be a back door used during debugging, and there is some flag bit that turns off the debugging mode. So the reviewer might assume that the program was OK despite this somewhat questionable code, because he builds it and makes sure to sign or validate the hash when built in the mode when the bad features are turned off. But what he doesn't know is, Kaminsky has another block of data prepared which has that flag bit in the opposite state, and which he can substitute without changing the hash. That will cause the program to behave in its "bad" mode, even though the only change was a few bits in this block of random data. So this way he can distribute a malicious build and it has the hash which was approved by the reviewer. And as Zooko points out, this doesn't have to be the main developer who is doing this, anyone who is doing some work on creating the final package might be able to do so. On the other hand, this attack is pretty blatant once you know it is possible. The lesson is that a reviewer should be suspicious of code whose security properties depend on the detailed contents of blocks of random-looking data. One problem with this is that there are some circumstances where it could be hard to tell. Zooko links to the example of a crypto key which could have weak and strong versions. The strong version could be approved and then the weak version substituted. There are also some crypto algorithms that use random-looking blocks of data which could have weak and strong versions. So it's not always as easy as it sounds. 
But most code will not have these problems, and for those programs it would be pretty conspicuous to implement Kaminsky's attacks. At present, that looks to be the best someone could do with SHA-1 or even MD5. Hal Finney From nlothian at educationau.edu.au Thu Feb 17 22:31:16 2005 From: nlothian at educationau.edu.au (Nick Lothian) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: > > On Thursday, February 17, 2005 Nick Lothian wrote: > > It's not hard to imagine spyware manufactures modifying common > > opensource applications (eg: p2p software) so they include > spyware and > > yet still have the same hash. > > Sure, but then they would have to find some innocently > looking way to include something like this into the open source app: > No - they just release the built .exe without the source (or, even better, hack the original download site and replace the original version with their malicious one; if the hashes of the apps matched, this could be pretty hard to detect). Nick
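Hal's description of the flag-bit trick can be made concrete with the collision pair quoted earlier in the thread: because MD5 is a Merkle-Damgard construction and both 128-byte blocks leave its internal state identical, appending any common suffix preserves the collision, and a program can branch on one of the few bits that differ. A toy Python sketch -- the "program" bytes and the flag semantics are hypothetical, purely for illustration:

```python
import hashlib

# Wang et al. colliding pair, as quoted earlier in the thread (6 bytes differ).
COLLIDING = [bytes.fromhex(h) for h in (
    "d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f89"
    "55ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5b"
    "d8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0"
    "e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70",
    "d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f89"
    "55ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5b"
    "d8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0"
    "e99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70",
)]

# The rest of the "package" is byte-for-byte identical in both versions;
# any common suffix after the colliding blocks preserves the collision.
program = b"...identical program text that branches on the flag below..."
pkg_reviewed, pkg_shipped = (blk + program for blk in COLLIDING)

def debug_backdoor_enabled(package: bytes) -> bool:
    # Hypothetical flag check: byte 19 of the embedded data differs
    # between the two blocks (0x87 vs 0x07), so its top bit flips.
    return not (package[19] & 0x80)

same_md5 = hashlib.md5(pkg_reviewed).digest() == hashlib.md5(pkg_shipped).digest()
print(same_md5)                              # True: identical MD5 digests
print(debug_backdoor_enabled(pkg_reviewed))  # False: reviewer sees it disabled
print(debug_backdoor_enabled(pkg_shipped))   # True: shipped copy enables it
```

The reviewer approves the hash of `pkg_reviewed`; the attacker ships `pkg_shipped`, which matches that hash but takes the other branch -- exactly the scenario Hal outlines.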
From mccoy at mad-scientist.com Thu Feb 17 22:31:43 2005 From: mccoy at mad-scientist.com (Jim McCoy) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120E0DC36B@fcexmb04.efi.internal> References: <4A60C83D027E224BAA4550FB1A2B120E0DC36B@fcexmb04.efi.internal> Message-ID: On Feb 17, 2005, at 2:11 PM, Serguei Osokine wrote: > On Thursday, February 17, 2005 Nick Lothian wrote: >> It's not hard to imagine spyware manufactures modifying common >> opensource applications (eg: p2p software) so they include spyware >> and yet still have the same hash. > > Sure, but then they would have to find some innocently looking > way to include something like this into the open source app: > [collision data A] > - which is no big deal, could be a bitmap. However, after that they > would have to modify the application to use the text above as a jump > table to a malicious code, which would be dormant in the application > until the data is changed to: > [collision data B] So tell me, when was the last time you ran your SSL library through a debugger to determine with complete confidence that the moduli being used were not insecure ones? With this attack I could distribute a copy of a crypto library that seemed to match the hash it was supposed to have, but which was in fact opening you up to certain crypto attacks. As was pointed out on the crypto list thread Zooko referenced, this is probably the only practical attack that can be made in this fashion right now. I can replace your crypto modulus, some RNG seeds, and other bits of data that are used by applications (and not just displayed, like the bitmap you suggest) which are of the "big, random-looking number" format. Jim From Serguei.Osokine at efi.com Thu Feb 17 23:23:27 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken?
Message-ID: <4A60C83D027E224BAA4550FB1A2B120E0DC36C@fcexmb04.efi.internal> On Thursday, February 17, 2005 Jim McCoy wrote: > With this attack I could distribute a copy of a crypto library that > seemed to match the hash it was supposed to have, but which was in > fact opening you up to certain crypto attacks. And on Thursday, February 17, 2005 Nick Lothian wrote: > ...they just release the built .exe without the source (or even > better - hack the original download site and replace the original > version with their malicious version. If the hashes of the apps > matched this could be pretty hard to detect). Yes to both; but only if it would be your library to begin with, because it is essential that the *original* crypto library should have "collision data A" - without it, this attack is impossible. And not only that - the *original* crypto library would also have to have a) the malicious code prepared to be launched (say, granting root access or something), and b) the jump to this code that would not be executed with "data A", but would - with "data B". And all of this should be already present and ready for launch in an original library, the one that would be used by everyone for years and would not do anything visibly improper. RSA could do this, for sure. Heck, anyone who owns some cryptolib could do that - who scrutinizes cryptolib sources anyway? And it would be even simpler if only the binaries are distributed. But if you own a widely used cryptolib, you have more simple ways to include a backdoor into your code and to activate it on an innocently looking external event - especially if you do not show anyone the sources and distribute only the binaries. For anyone *but* the original code author, however, achieving a malicious collision this way would be impossible. So the Bad Charlie Webmaster from Zooko is pretty much out of luck - he'd have to conspire with an honest programmer Bob to do any harm. 
And an innocent programmer Bob is quite capable of doing plenty of harm even without any help and without knowing anything about hash properties, if he only pretends to be honest long enough. Why would he want to bring Charlie into his scam? Best wishes - S.Osokine. 17 Feb 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Jim McCoy Sent: Thursday, February 17, 2005 2:32 PM To: Peer-to-peer development. Subject: Re: [p2p-hackers] SHA1 broken? On Feb 17, 2005, at 2:11 PM, Serguei Osokine wrote: > On Thursday, February 17, 2005 Nick Lothian wrote: >> It's not hard to imagine spyware manufactures modifying common >> opensource applications (eg: p2p software) so they include spyware >> and yet still have the same hash. > > Sure, but then they would have to find some innocently looking > way to include something like this into the open source app: > [collision data A] > - which is no big deal, could be a bitmap. However, after that they > would have to modify the application to use the text above as a jump > table to a malicious code, which would be dormant in the application > until the data is changed to: > [collision data B] So tell me, when was the last time you ran your SSL library through a debugger to determine with complete confidence that the modulii being used were not insecure ones? With this attack I could distribute a copy of a crypto library that seemed to match the hash it was supposed to have, but which was in fact opening you up to certain crypto attacks. As was pointed out on the crypto list thread zooko referenced, this is probably the only practical attack that can be made in this fashion right now. I can replace your crypto modulus, some RNG seeds, and other bits of data that are used by applications (and not just displayed, like the bitmap you suggest) which are of the "big, random-looking number" format. 
Jim _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From gojomo at bitzi.com Fri Feb 18 05:54:15 2005 From: gojomo at bitzi.com (Gordon Mohr (@ Bitzi)) Date: Sat Dec 9 22:12:51 2006 Subject: Other P2P attacks (DNS, fake torrents, etc) Re: [p2p-hackers] SHA1 broken? In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120E0DC368@fcexmb04.efi.internal> References: <4A60C83D027E224BAA4550FB1A2B120E0DC368@fcexmb04.efi.internal> Message-ID: <42158307.60008@bitzi.com> Serguei Osokine wrote: > On Thursday, February 17, 2005 Gordon Mohr wrote: > >>I want P2P+CDN that delivers content that I and other sophisticated >>users can trust, and I want the unsophisticated users on the same >>network, too... >>... >>If P2P is just a leisure-time lark for credulous, casual users who >>have many other unhygienic computing practices, then you can be >>lackadaisical in your use of hash algorithms. If you want it to >>also be a platform stable for long-term use by more discriminating >>users and commercial endeavors, you should take the strength of >>your hashes seriously. > > > Fair enough. So how do you prevent the DNS-hijacking of Bitzi? Good question. There's no protection yet. I've assumed that when the budget and interest allow, we'd (1) offer signed versions of our XML metadata tickets; and (2) offer https service for some or all users. Other ideas welcome. > Or - way more importantly - how do you prevent the fake .torrent files > from being submitted to any number of torrent aggregator sites? I assume the torrent aggregators have some way of vetting submissions, whether by the reputation of the submitter, by early testing and reviews from the most adventurous users, and so forth. I'm currently not immersed in those communities, so I don't know for sure.
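Idea (1) above - signed metadata tickets - might be sketched as follows. The ticket body, field names, and key are invented for illustration, and an HMAC stands in for the detached public-key signature a real service would publish (clients could not actually hold a shared secret, so this is the shape of the check, not a deployable design):

```python
# Sketch of "signed metadata tickets": serve each ticket with a detached
# signature so a DNS hijacker serving forged tickets fails verification.
# SERVICE_KEY, the ticket XML, and the field names are all hypothetical;
# HMAC stands in for a real public-key signature scheme.
import hashlib
import hmac

SERVICE_KEY = b"bitzi-signing-key (hypothetical)"

def sign_ticket(ticket_xml: bytes) -> bytes:
    return hmac.new(SERVICE_KEY, ticket_xml, hashlib.sha256).hexdigest().encode()

def verify_ticket(ticket_xml: bytes, sig: bytes) -> bool:
    # Constant-time comparison avoids leaking where the forgery differs.
    return hmac.compare_digest(sign_ticket(ticket_xml), sig)

ticket = b"<ticket><urn>urn:sha1:EXAMPLE</urn><rating>good</rating></ticket>"
sig = sign_ticket(ticket)
assert verify_ticket(ticket, sig)
# A hijacker tampering with the ticket cannot keep the signature valid:
assert not verify_ticket(ticket.replace(b"good", b"spyware"), sig)
```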
Anyone else want to chime in on how torrent aggregator sites manage to tend toward quality torrents over time? - Gordon @ Bitzi From eugen at leitl.org Fri Feb 18 12:41:02 2005 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] [i2p] 0.5 is available (fwd from jrandom@i2p.net) Message-ID: <20050218124102.GI1404@leitl.org> ----- Forwarded message from jrandom ----- From: jrandom Date: Fri, 18 Feb 2005 03:39:24 -0800 To: i2p@i2p.net Subject: [i2p] 0.5 is available -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi y'all, After 6 months of work on the 0.4 series, we've implemented and deployed the new streaming library, integrated and tested bittorrent, mail, and naming apps, fixed a bunch of bugs, and learned as much as we could from real-world users. We now have a new 0.5 release which reworks the tunnel routing algorithms, improving security and anonymity while giving the user more control over their own performance-related tradeoffs. In addition, we've bundled susi23's susimail client, upgraded to the latest Jetty (allowing both symlinks and CGI), and a whole lot more. This new release is not backwards compatible - you must upgrade to get anything useful done. There has been a lot of work going on since 0.4.2.6 a month and a half ago, with contributions by smeghead, duck, Jhor, cervantes, Ragnarok, Sugadude, and the rest of the rabid testers in #i2p and #i2p-chat. I could write for pages describing what's up, but instead I'll just direct you to the change log at http://dev.i2p.net/cgi-bin/cvsweb.cgi/i2p/history.txt?rev=HEAD For the impatient, please review the install and update instructions up at http://www.i2p.net/download Please note that since this new release updates the classpath, the update process will require you to start up the router again after it finishes. Any local modifications to the wrapper.config will be lost when updating, so please be sure to back it up.
In addition, even though this new release includes the latest Jetty (5.1.2), if you want to enable CGI support, you will need to edit your ./eepsite/jetty.xml to include:

  /cgi-bin/*
  ./eepsite/cgi-bin
  Common Gateway Interface
  /
  org.mortbay.servlet.CGI
  /usr/local/bin:/usr/ucb:/bin:/usr/bin

adjusting the Path as necessary for your OS/distro/tastes. New users have it easy - all of this is done for them. While the docs on the website haven't been updated to reflect the new tunnel routing and crypto changes yet, the nitty gritty is up at http://dev.i2p.net/cgi-bin/cvsweb.cgi/i2p/router/doc/tunnel-alt.html?rev=HEAD There will be another release in the 0.5 series beyond this one, including more options for allowing the user to control the impact of predecessor attacks on their anonymity. There will certainly be performance and load balancing improvements as well, using the feedback we get deploying the new tunnel code on a wider network. Until the UDP transport is added in 0.6, we will want to continue to be fairly low key, as we've already run into the default limits on some braindead OSes (*cough*98*cough*). There is much we can improve upon while the network is small though, and while I know we all want to go out and show the world what I2P can do, another two months waiting won't hurt. Anyway, that's that. The new net is up and running, squid.i2p and other services should be up, you know where to get the goods, so get goin'!
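(The jetty.xml fragment quoted above lost its angle-bracket tags when the message was archived as HTML; only the values survived. A reconstruction in Jetty 5's XmlConfiguration style might look like the following - the element names here are guesses, so check the I2P docs rather than copying this verbatim.)

```xml
<!-- Hypothetical reconstruction: tag names are assumed, values are taken
     from the message above. The leftover values "./eepsite/cgi-bin" and
     "/" are likely the CGI resource base and context path. -->
<Call name="addServlet">
  <Arg>Common Gateway Interface</Arg>
  <Arg>/cgi-bin/*</Arg>
  <Arg>org.mortbay.servlet.CGI</Arg>
  <Put name="Path">/usr/local/bin:/usr/ucb:/bin:/usr/bin</Put>
</Call>
```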
=jr -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFCFc3OGnFL2th344YRAszOAKCfTh/OOAAyonRmKoRF/iw5BwRkZACgpGp4 qHMJkSo2mzjHTHRf98fsvdM= =Vfl3 -----END PGP SIGNATURE----- _______________________________________________ i2p mailing list i2p@i2p.net http://i2p.dnsalias.net/mailman/listinfo/i2p ----- End forwarded message ----- -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050218/64d6b1e1/attachment.pgp From cefn.hoile at bt.com Fri Feb 18 14:14:57 2005 From: cefn.hoile at bt.com (cefn.hoile@bt.com) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Internship in P2P technologies with British Telecommunications Message-ID: <21DA6754A9238B48B92F39637EF307FD05B1A127@i2km41-ukdy.domain1.systemhost.net> The job opening linked below may be of interest to some people on this list. Apologies for cross-posting. The Pervasive Systems Laboratory has a 12-month internship available as part of a research project in large-scale resilient networks. The project is focused on pervasive and adaptive networked systems. This position is part of a new and expanding project to develop cutting-edge applications and novel solutions for dependable and robust ICT networks. The position offers a challenging and varied research environment and would be suitable for applicants who may, for example, be at the final stages of their MSc or PhD studies.
...more information can be found at http://cefn.com/cefn/?PICTJob Cefn http://cefn.com

From dbarrett at quinthar.com Fri Feb 18 20:20:18 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] [i2p] 0.5 is available (fwd from jrandom@i2p.net) In-Reply-To: <20050218124102.GI1404@leitl.org> References: <20050218124102.GI1404@leitl.org> Message-ID: <1108758022.2299409D@bf12.dngr.org> Just curious, what are the "default limits" you ran into under Win 98? Indeed, if you could summarize the top five lessons learned from your real-world users, what would they be? I'm sure we'd all like to learn them, too. -david On Fri, 18 Feb 2005 5:01 am, Eugen Leitl wrote: > [full 0.5 announcement quoted above - trimmed]

From eugen at leitl.org Fri Feb 18 22:31:26 2005 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] [i2p] 0.5 is available (fwd from jrandom@i2p.net) (fwd from dbarrett@quinthar.com) (fwd from nickpro79@mail.ru) Message-ID: <20050218223126.GG1404@leitl.org> ----- Forwarded message from Nikita Proskourine ----- From: Nikita Proskourine Date: Fri, 18 Feb 2005 16:54:19 -0500 To: Eugen Leitl Cc: i2p@i2p.net Subject: Re: [p2p-hackers] [i2p] 0.5 is available (fwd from jrandom@i2p.net) (fwd from dbarrett@quinthar.com) X-Mailer: Evolution 2.0.2 I am guessing that the default limit in question is one of the things mentioned on http://support.microsoft.com/kb/q158474/ (MaxConnections). It defaults to 100 and can be raised, but on Win98 I believe the actual limit is min(MaxConnections, 256). Nick. On Fri, 2005-02-18 at 21:31 +0100, Eugen Leitl wrote: > ----- Forwarded message from David Barrett ----- > > From: David Barrett > Date: Fri, 18 Feb 2005 12:20:18 -0800 > To: "Peer-to-peer development." > Subject: Re: [p2p-hackers] [i2p] 0.5 is available (fwd from jrandom@i2p.net) > X-Mailer: Danger Service > Reply-To: David Barrett , > "Peer-to-peer development."
> > Just curious, what are the "default limits" you ran into under Win 98? > Indeed, if you could summarize the top five lessons learned from your > real-world users, what would they be? I'm sure we'd all like to learn > them, too. > > -david > > On Fri, 18 Feb 2005 5:01 am, Eugen Leitl wrote: > >[full 0.5 announcement quoted above - trimmed]

----- End forwarded message ----- -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050218/f1208cda/attachment.pgp From baford at mit.edu Fri Feb 18 23:44:43 2005 From: baford at mit.edu (Bryan Ford) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Final version of "P2P over NAT" paper available Message-ID: <200502190044.43425.baford@mit.edu> Hi folks, For those interested in P2P-over-NAT issues, I just wanted to announce that the final version of the following paper, to appear in USENIX '05, is now available: Peer-to-Peer Communication Across Network Address Translators, Bryan Ford, Pyda Srisuresh, and Dan Kegel. USENIX Annual Technical Conference, April 2005. (PDF) http://www.brynosaurus.com/pub/net/p2pnat.pdf (HTML) http://www.brynosaurus.com/pub/net/p2pnat/ An earlier draft of this paper was announced on this list a few months ago. The final version includes, among other minor revisions, new "NAT Check" testing results based on almost twice the number of data points as the original draft. Cheers, Bryan --- Abstract: Network Address Translation (NAT) causes well-known difficulties for peer-to-peer (P2P) communication, since the peers involved may not be reachable at any globally valid IP address. Several NAT traversal techniques are known, but their documentation is slim, and data about their robustness or relative merits is slimmer. This paper documents and analyzes one of the simplest but most robust and practical NAT traversal techniques, commonly known as ``hole punching.'' Hole punching is moderately well-understood for UDP communication, but we show how it can be reliably used to set up peer-to-peer TCP streams as well. After gathering data on the reliability of this technique on a wide variety of deployed NATs, we find that about 82% of the NATs tested support hole punching for UDP, and about 64% support hole punching for TCP streams. 
As NAT vendors become increasingly conscious of the needs of important P2P applications such as Voice over IP and online gaming protocols, support for hole punching is likely to increase in the future.

From ap at hamachi.cc Sat Feb 19 07:04:10 2005 From: ap at hamachi.cc (Alex Pankratov) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Final version of "P2P over NAT" paper available In-Reply-To: <200502190044.43425.baford@mit.edu> References: <200502190044.43425.baford@mit.edu> Message-ID: <4216E4EA.2030704@hamachi.cc> Bryan, Quoting your paper - > .. we find that about 82% of the NATs tested support hole punching > for UDP. > .. > The NAT Check data we gathered consists of 380 reported data points > .. I happen to have statistics for more than 16000 data points, and check this out - the rate of 'identity preserving' NAT devices suitable for hole punching works out to be 82.2%. *UDP* hole punching, that is. Alex Bryan Ford wrote: > [announcement and abstract quoted above - trimmed]

From dbarrett at quinthar.com Sat Feb 19 08:03:33 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Final version of "P2P over NAT" paper available In-Reply-To: <4216E4EA.2030704@hamachi.cc> Message-ID: <20050219080338.212D33FC27@capsicum.zgp.org> Heh, great validation of the results. So what are the latest values for the following chart?

                       NAT'd     Firewalled
                      +---------+-------------
 % Able to hole punch | 82.2%   | 50-60% *
 % of total internet  | ??      | ??
                      +---------+-------------
 % Benefiting         | ??      | ??

 * http://zgp.org/pipermail/p2p-hackers/2004-December/002215.html

Basically, I'd like to get a better understanding of what fraction of all internet users might benefit from these techniques, estimated as the product of the above rows.
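The hole-punching sequence the paper analyzes - each peer learns the other's public endpoint from a rendezvous server, then both fire datagrams at roughly the same time so each NAT sees outbound traffic before the inbound probe arrives - can be sketched as follows. This demo runs both peers on loopback, so no NAT is actually traversed and the rendezvous exchange is replaced by handing each socket the other's address; it shows only the mechanics, not a working traversal.

```python
# Minimal sketch of UDP hole-punching mechanics. On the real Internet the
# peers would first learn each other's public (IP, port) pair via a
# rendezvous server; here both endpoints live on loopback and the
# addresses are handed over directly, so no NAT is involved.
import socket

def make_peer():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))   # ephemeral port, standing in for a NAT mapping
    s.settimeout(2.0)
    return s

a, b = make_peer(), make_peer()
addr_a, addr_b = a.getsockname(), b.getsockname()

# Step 1: both sides send an outbound probe to the other's endpoint.
# Against real NATs, this outbound traffic is what opens ("punches")
# the mapping that will let the peer's inbound packet through.
a.sendto(b"punch", addr_b)
b.sendto(b"punch", addr_a)

# Step 2: each side receives the other's probe; the path is now open
# in both directions and ordinary traffic can flow.
msg_at_b, src = b.recvfrom(1500)
msg_at_a, _ = a.recvfrom(1500)
a.sendto(b"hello from a", addr_b)
payload, _ = b.recvfrom(1500)
a.close()
b.close()
```

The ~82% figure in the thread is about whether a given NAT preserves the same public mapping across destinations; when it does not (symmetric NATs), the address learned at the rendezvous server is wrong and this simple exchange fails.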
-david

> -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On > Behalf Of Alex Pankratov > Sent: Friday, February 18, 2005 11:04 PM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] Final version of "P2P over NAT" paper available > > [Alex's reply and the quoted announcement trimmed - see above]

From Ian.Wiles at blueyonder.co.uk Sat Feb 19 15:14:32 2005 From: Ian.Wiles at blueyonder.co.uk (Ian Wiles) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Learning how to build a P2P system Message-ID: Hello, I am pondering creating my own P2P system, but am having a bit of difficulty finding enough technical information. I am looking for technical documentation on p2p protocols and more high-level material on the topologies used. Would anyone recommend O'Reilly's Peer to Peer and/or Ian Taylor's From P2P to Web Services and Grids: Peers in a Client/Server World?
I'm hoping that someone could point me in the right direction for web-based information, as obtaining these books depends on the library at the moment. I should also probably give a brief explanation of the system I'd like to develop. Basically it's a system to pass around text messages in a forum/usenet type of setup. However, I would like the news spool to be distributed across all nodes; presumably this would require some sort of mirroring/backup as nodes go offline. This is why I would like to find some information on the techniques used, as I'm sure they'd be better than the solutions I would come up with. I would like to do this using Java as it's the language I'm most familiar with at the moment. The question is, should I bother starting from scratch or is that a recipe for disaster? Should I use something such as JXTA instead? I'm not too keen on using XML messages, but I'm easily persuaded. Originally I had planned on using binary packets. Thanks. -- Ian

From jdd at dixons.org Sat Feb 19 19:28:11 2005 From: jdd at dixons.org (Jim Dixon) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Learning how to build a P2P system In-Reply-To: References: Message-ID: On Sat, 19 Feb 2005, Ian Wiles wrote: > I should also probably give a brief explanation of the system I'd like > to develop. Basically it's a system to pass around text messages in a > forum/usenet type of setup. However I would like the news spool to be > distributed across all nodes, presumably this would require some sort of > mirroring/backup as well as each node goes offline. You do understand that Usenet News _is_ a p2p system, right? One that works very well, despite immense and rapidly fluctuating loads, the sporadic loss of peers, legal threats, hordes of utterly clueless users, and sometimes uhm less clueful sys admins. Source code freely available. See for example http://www.isc.org/index.pl?/innd The design is ancient and barbaric, but it works.
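The replication Ian asks about is what the structured (DHT) camp typically handles with consistent hashing: each article's message-id maps to a point on a ring of nodes, and the first k distinct nodes clockwise hold replicas, so the spool survives individual nodes going offline. A minimal sketch - the node names and replica count are made up, and this is a generic technique rather than how INN/Usenet works (INN simply floods every article to every peer):

```python
# Generic consistent-hashing replica placement, the DHT-style answer to
# "mirror the spool as nodes go offline". Node names, virtual-node count,
# and k=3 replicas are all illustrative, not from any particular system.
import hashlib
from bisect import bisect_right

class Ring:
    def __init__(self, nodes, vnodes=64):
        # Each physical node gets many virtual points for smoother balance.
        self.ring = sorted(
            (self._h(f"{n}#{v}"), n) for n in nodes for v in range(vnodes)
        )

    @staticmethod
    def _h(key: str) -> int:
        return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")

    def replicas(self, key: str, k: int = 3):
        """First k distinct nodes clockwise from the key's ring position."""
        i = bisect_right(self.ring, (self._h(key), ""))
        out = []
        while len(out) < k:
            node = self.ring[i % len(self.ring)][1]
            if node not in out:
                out.append(node)
            i += 1
        return out

ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"])
homes = ring.replicas("msg-id:<1234@example>")
print(homes)  # three distinct nodes; any one of them can serve the article
```

When a node leaves, only the keys whose replica sets included it need to be re-placed, which is the property that makes this attractive for a churning P2P network. UsenetDHT, mentioned below in the thread, builds on the same idea.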
> This is why I would > like to find some information on the techniques used, as I'm sure they'd > be better than the solutions I would come up with. I would like to do > this using Java as it's the language I'm most familiar with at the > moment. Java is a practical tool for this sort of project. -- Jim Dixon jdd@dixons.org tel +44 117 982 0786 mobile +44 797 373 7881 http://xlattice.sourceforge.net p2p communications infrastructure

From davidopp at cs.berkeley.edu Sat Feb 19 19:43:24 2005 From: davidopp at cs.berkeley.edu (David L. Oppenheimer) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Learning how to build a P2P system In-Reply-To: Message-ID: <200502191942.LAA04550@mindbender.davido.com> And if your tastes are more in the structured P2P camp than unstructured, check out UsenetDHT http://project-iris.net/irisbib/papers/usenetdht:iptps04/paper.pdf David > [Jim Dixon's reply quoted above - trimmed]

From ap at hamachi.cc Sat Feb 19 20:13:24 2005 From: ap at hamachi.cc (Alex Pankratov) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Final version of "P2P over NAT" paper available In-Reply-To: <20050219080338.212D33FC27@capsicum.zgp.org> References: <20050219080338.212D33FC27@capsicum.zgp.org> Message-ID: <42179DE3.10309@hamachi.cc> Well, based on the same stats it looks like 'hole punching' as described in the p2pnat paper succeeds in ~84% of the cases. Our proggy is a bit more complex than that, so our success rate is about 97%. Alex David Barrett wrote: > Heh, great validation of the results. > > So what are the latest values for the following chart?

>                        NAT'd     Firewalled
>                       +---------+-------------
>  % Able to hole punch | 82.2%   | 50-60% *
>  % of total internet  | ??      | ??
>                       +---------+-------------
>  % Benefiting         | ??      | ??

> > * http://zgp.org/pipermail/p2p-hackers/2004-December/002215.html > > Basically, I'd like to get a better understanding of what fraction of all > internet users might benefit from these techniques, estimated as the product
> -david
>
>> -----Original Message-----
>> From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]
>> On Behalf Of Alex Pankratov
>> Sent: Friday, February 18, 2005 11:04 PM
>> To: Peer-to-peer development.
>> Subject: Re: [p2p-hackers] Final version of "P2P over NAT" paper available
>>
>> Bryan,
>>
>> Quoting your paper -
>>
>> > .. we find that about 82% of the NATs tested support hole punching
>> > for UDP.
>> > ..
>> > The NAT Check data we gathered consists of 380 reported data points
>> > ..
>>
>> I happened to have statistics for more than 16000 'data points', and
>> check this out - the rate of 'identity preserving' NAT devices suitable
>> for hole punching works out to be 82.2%. *UDP* hole punching, that is.
>>
>> Alex
>>
>> Bryan Ford wrote:
>>
>>> Hi folks,
>>>
>>> For those interested in P2P-over-NAT issues, I just wanted to announce
>>> that the final version of the following paper, to appear in USENIX '05,
>>> is now available:
>>>
>>> Peer-to-Peer Communication Across Network Address Translators, Bryan
>>> Ford, Pyda Srisuresh, and Dan Kegel. USENIX Annual Technical
>>> Conference, April 2005.
>>> (PDF) http://www.brynosaurus.com/pub/net/p2pnat.pdf
>>> (HTML) http://www.brynosaurus.com/pub/net/p2pnat/
>>>
>>> An earlier draft of this paper was announced on this list a few months
>>> ago. The final version includes, among other minor revisions, new "NAT
>>> Check" testing results based on almost twice the number of data points
>>> as the original draft.
>>>
>>> Cheers,
>>> Bryan
>>>
>>> ---
>>>
>>> Abstract:
>>>
>>> Network Address Translation (NAT) causes well-known difficulties for
>>> peer-to-peer (P2P) communication, since the peers involved may not be
>>> reachable at any globally valid IP address. Several NAT traversal
>>> techniques are known, but their documentation is slim, and data about
>>> their robustness or relative merits is slimmer. This paper documents
>>> and analyzes one of the simplest but most robust and practical NAT
>>> traversal techniques, commonly known as ``hole punching.'' Hole
>>> punching is moderately well-understood for UDP communication, but we
>>> show how it can be reliably used to set up peer-to-peer TCP streams as
>>> well. After gathering data on the reliability of this technique on a
>>> wide variety of deployed NATs, we find that about 82% of the NATs
>>> tested support hole punching for UDP, and about 64% support hole
>>> punching for TCP streams. As NAT vendors become increasingly conscious
>>> of the needs of important P2P applications such as Voice over IP and
>>> online gaming protocols, support for hole punching is likely to
>>> increase in the future.

From dbarrett at quinthar.com Sat Feb 19 21:43:12 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Final version of "P2P over NAT" paper available In-Reply-To:
<42179DE3.10309@hamachi.cc> References: <20050219080338.212D33FC27@capsicum.zgp.org> <42179DE3.10309@hamachi.cc> Message-ID: <1108849400.7734B6B@dk12.dngr.org>

I'm sorry, I didn't make my question clear. Given that you can hole punch for 82-97% of NAT'd users, how many users are behind NATs in the first place? For example, if only 1% of users are behind a NAT, then hole punching doesn't much matter. But if it's 25%, 50%, or 75%, it becomes critical.

Does this question make sense?

Likewise, I'm interested in a similar stat for firewalls.

Sorry for not being clear the first time.

-david

On Sat, 19 Feb 2005 12:32 pm, Alex Pankratov wrote:
> Well, based on same stats it looks like 'hole punching' as it's
> described in p2pnat paper succeeds in ~84% of the cases. Our
> proggy is a bit more complex than that so our success rate is
> about 97%.
>
> Alex

From Ian.Wiles at blueyonder.co.uk Sat Feb 19 22:56:08 2005 From: Ian.Wiles at blueyonder.co.uk (Ian Wiles) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Learning how to
build a P2P system In-Reply-To: References: Message-ID: In message, Jim Dixon writes:

> You do understand that Usenet News _is_ a p2p system, right? One that
> works very well, despite immense and rapidly fluctuating loads, the
> sporadic loss of peers, legal threats, hordes of utterly clueless users,
> and sometimes uhm less clueful sys admins.
>
> Source code freely available. See for example
> http://www.isc.org/index.pl?/innd
>
> The design is ancient and barbaric, but it works.

Yep, I'm aware of this. I used to run a dnews server for some groups. I've also seen some P2P usenet-based systems, such as Mynews etc. The difference with my idea was to distribute the spool, which doesn't happen with NNTP servers; they more or less copy the spool amongst their peers. As I would expect each node to be a home PC (I should've mentioned that...), if they each propagated 100% of the spool that would be pretty bad for scaling. For example, most usenet servers now move about 1 TB of traffic a day (although this mostly accounts for binaries). Even text only would probably amount to 5 GB or so a day. Also, a lot of usenet providers are now charging for access on top of ISP fees (a lot of ISP-provided usenet servers are quite poor in my experience).

Cheers.
--
Ian Wiles

From ap at hamachi.cc Sun Feb 20 00:26:28 2005 From: ap at hamachi.cc (Alex Pankratov) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Final version of "P2P over NAT" paper available In-Reply-To: <1108849400.7734B6B@dk12.dngr.org> References: <20050219080338.212D33FC27@capsicum.zgp.org> <42179DE3.10309@hamachi.cc> <1108849400.7734B6B@dk12.dngr.org> Message-ID: <4217D934.401@hamachi.cc>

David Barrett wrote:
> I'm sorry, I didn't make my question clear. Given that you can hole
> punch for 82-97% of NAT'd users, how many users are behind NATs in the
> first place?

Around 70%, but keep in mind that the 16000+ samples we have at the moment are far from being representative.
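For readers who haven't seen the p2pnat paper, the basic UDP hole-punching exchange being measured here can be sketched roughly as follows. This is a hypothetical toy that runs over loopback, so no real NAT is traversed, and the rendezvous server is replaced by sharing addresses directly; the function names are illustrative only.

```python
# Toy sketch of the UDP hole-punching message flow (loopback only;
# the rendezvous step is faked by exchanging addresses in-process).
import socket

def make_peer():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))   # OS picks a port, like a NAT mapping would
    s.settimeout(2.0)
    return s

a, b = make_peer(), make_peer()
addr_a, addr_b = a.getsockname(), b.getsockname()

# Step 1 (normally done via a rendezvous server): each peer learns the
# other's observed (IP, port) endpoint.
# Step 2: both peers send a datagram to the other's endpoint. Behind
# real NATs these outbound packets open ("punch") the mappings that
# let the inbound packets through.
a.sendto(b"punch from A", addr_b)
b.sendto(b"punch from B", addr_a)

# Step 3: each side receives the other's packet through the opened path.
msg_at_b, _ = b.recvfrom(1024)
msg_at_a, _ = a.recvfrom(1024)
a.close()
b.close()
```

On a real network, success depends on both NATs reusing the same mapping for the new destination, which appears to be what Alex calls 'identity preserving' behaviour.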
> For example, if only 1% of users are behind a NAT then hole punching
> doesn't much matter. But if it's 25%, 50%, or 75%, it becomes critical.
>
> Does this question make sense?

Yes, let me clarify the numbers I gave earlier - 82% is the ratio of 'identity preserving' NATs among all NAT'ed clients we saw. 97% is the fraction of user pairs (including routable clients) that we were able to successfully connect. If we were only to use the technique suggested in the p2pnat paper, 97% would've become 84%.

> Likewise, I'm interested in a similar stat for firewalls.

The stat for firewalls that allow outbound UDP is 100%, i.e. you can always connect two peers behind two separate _stateful_ firewalls that allow unrestricted outbound UDP. The % of those behind locked-down firewalls is negligible, as is the % of those behind the same firewall that doesn't allow for hairpinning.

> Sorry for not being clear the first time.

Not a problem.

> -david

From arachnid at notdot.net Mon Feb 21 01:01:02 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Efficient Decoding of Online / LT / Raptor codes Message-ID:

After reading papers on Digital Fountain codes, it occurs to me that the decoding process can be more efficient than described:
All the papers describe decoding by looking for encoded nodes that consist of a single source node, marking that node as decoded, and subtracting it from all the other nodes that include it. However, it occurs to me that it should be possible to be more efficient than this: The order of encoded nodes can be decreased without starting with nodes consisting of a single source node. As an example, if I have nodes consisting of: 1: A+B+C 2: A+B+D 3: C+D+E Assuming my operator is commutative (eg, XOR), I can add 1 and 2 to get C+D, and add that to 3 to recover E. Obviously this only works with much efficiency if the operator you use to subtract is the same as the one you use to add, such as XOR. Unfortunately, I don't know enough stats to figure out if this is worth it - is anyone able to show if it's worth the extra effort, or if the cases in which this is useful are rare? -Nick Johnson From arachnid at notdot.net Mon Feb 21 03:23:12 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <4211C669.3080200@ucla.edu> References: <4211C669.3080200@ucla.edu> Message-ID: On 15/02/2005, at 10:52 PM, Michael Parker wrote: > Hi all, > > Does anyone know what happened to the "Online Codes" Sourceforge > project, listed at http://sourceforge.net/projects/onlinecodes? I'm > asking here for two reasons: First, because Online Codes [1, 2] would > be a great tool in peer-to-peer applications, so I thought someone > here might have followed the project while it was still active. > Second, I've written a solid library implementation of the Online > Codes encoding/decoding algorithm described in the aforementioned > papers. Alas, only after I implemented it did I find out that the > authors' company, Rateless, had patented it (or, so they allude to on > their web site www.rateless.com, Digital Fountain owned the IP). I don't see it - where do they allude to it? 
The only mention of patents Google finds on the site is in the copy of the GPL they have hosted there.

-Nick Johnson

From mgp at ucla.edu Mon Feb 21 05:57:25 2005 From: mgp at ucla.edu (mgp@ucla.edu) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes Message-ID: <200502210557.j1L5vPX11295@webmail.my.ucla.edu>

Hi Nick,

First, I think your optimization to rateless codes is interesting, although I'm worried that it might not be computationally feasible. I haven't given it much thought yet, but at first glimpse it seems like it would take exponential or factorial time to come up with such shortcuts from XORing the check blocks. I'll give it more thought, but if you or anyone else can show it can be done efficiently, I will be pleasantly surprised :)

About your second e-mail... On the web site www.rateless.com, if you go to Library, you will see that under the "Online Codes" paper it says:

"This paper marked the beginning of Rateless Research. The codes described in it fall within the scope of patents owned by Digital Fountain, Inc. In order to prevent intellectual property overlap, we have developed and use a new class of practical rateless codes, based on decoders that don't use chain-reaction, message passing or belief-propagation techniques."

Which, I think, means that the rateless codes method described in the paper is not the one they are currently using, for the one in the paper is covered by the patents of Digital Fountain, Inc. Also, in the excellent paper "Digital Fountains: A Survey and Look Forward" [1], under "Barriers to Adoption -- Patent Protection" you will find references to 10 (!) or so patents in the bibliography. Again, most of these I believe belong to Digital Fountain, Inc.
Regards, Michael Parker [1] www.eecs.harvard.edu/~michaelm/postscripts/itw2004.pdf On Mon, 21 Feb 2005 16:23:12 +1300 Nick Johnson wrote: > On 15/02/2005, at 10:52 PM, Michael Parker wrote: > > > Hi all, > > > > Does anyone know what happened to the "Online Codes" Sourceforge > > project, listed at http://sourceforge.net/projects/onlinecodes? I'm > > asking here for two reasons: First, because Online Codes [1, 2] would > > be a great tool in peer-to-peer applications, so I thought someone > > here might have followed the project while it was still active. > > Second, I've written a solid library implementation of the Online > > Codes encoding/decoding algorithm described in the aforementioned > > papers. Alas, only after I implemented it did I find out that the > > authors' company, Rateless, had patented it (or, so they allude to on > > their web site www.rateless.com, Digital Fountain owned the IP). > > I don't see it - where do they allude to it? > The only mention of patents google finds on the site is in the copy of > the GPL they have hosted there. > > -Nick Johnson > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From arachnid at notdot.net Mon Feb 21 06:07:09 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <200502210557.j1L5vPX11295@webmail.my.ucla.edu> References: <200502210557.j1L5vPX11295@webmail.my.ucla.edu> Message-ID: <42197A8D.9020400@notdot.net> mgp@ucla.edu wrote: >Hi Nick, > >First, I think your optimization to rateless codes is interesting, >although I'm worried that it might not be >computationally feasible. 
I haven't given it much thought yet, but at
>first glimpse it seems like it would take exponential or factorial time
>to come up with such shortcuts from XORing the check blocks. I'll give
>it more thought, but if you or anyone else can show it can be done
>efficiently, I will be pleasantly surprised :)

As far as I can tell (though I can't prove it), it should be possible to reduce the magnitude as much as is possible by simply checking each received block against the ones already received. If the number of source blocks they don't share is fewer than the order of whichever of the two blocks has higher order, replace the higher order block with the xor of the two. At first glance this would require O(n) time per block, and hence O(n^2) time for all blocks, but it could probably be reduced by creating tree 'indexes' to the blocks that contain each source block, reducing the complexity to less than O(n) for each block.

>About your second e-mail... On the web site www.rateless.com, if you go
>to Library, you will see that under the "Online Codes" paper it says:
>
>"This paper marked the beginning of Rateless Research. The codes
>described in it fall within the scope of patents owned by Digital
>Fountain, Inc. In order to prevent intellectual property overlap, we
>have developed and use a new class of practical rateless codes, based on
>decoders that don't use chain-reaction, message passing or
>belief-propagation techniques."

So Online codes should be unencumbered, while Torrent / LT codes aren't?
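The combination step under discussion (XORing two received check blocks so that their shared source blocks cancel) can be shown with a tiny sketch of the A/B/C/D/E example from earlier in the thread. The representation is hypothetical, made up purely for illustration, not taken from any actual fountain-code implementation:

```python
# Toy model of check-block combination in an XOR fountain code.
# A check block is (frozenset of source ids, XOR of their payloads).
def bxor(x, y):
    return bytes(p ^ q for p, q in zip(x, y))

src = {"A": b"\x01", "B": b"\x02", "C": b"\x04", "D": b"\x08", "E": b"\x10"}

def encode(ids):
    payload = b"\x00"
    for i in ids:
        payload = bxor(payload, src[i])
    return (frozenset(ids), payload)

def combine(p, q):
    # XORing two check blocks yields a block covering the symmetric
    # difference of their source sets: shared sources cancel out.
    return (p[0] ^ q[0], bxor(p[1], q[1]))

b1 = encode("ABC")      # block 1: A+B+C
b2 = encode("ABD")      # block 2: A+B+D
b3 = encode("CDE")      # block 3: C+D+E

cd = combine(b1, b2)    # degree drops to 2: C+D
e = combine(cd, b3)     # degree 1: source block E recovered
print(sorted(e[0]), e[1])   # -> ['E'] b'\x10'
```

The combination itself is cheap; the feasibility concern raised above is about finding profitable pairs cheaply among thousands of received blocks.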
-Nick Johnson From bryan.turner at pobox.com Mon Feb 21 15:57:11 2005 From: bryan.turner at pobox.com (Bryan Turner) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <42197A8D.9020400@notdot.net> Message-ID: <200502211557.j1LFvB1j028908@rtp-core-1.cisco.com> Nick, > If the number of source blocks they don't share is fewer than the order > of whichever of the two blocks has higher order, replace the higher order > block with the xor of the two. You are correct, this will work for codes which use XOR. I share Michael's caution about the efficiency of this scheme, as most packets will be sourced from dozens of blocks. With indexing, this could be sub-linear per application - but is an application per source block, or per packet? My guess is that even the best indexing would still require per-source-block lookup. So you have dozens of sub-linear applications per packet, which will probably end up being super-linear if your big-O constants aren't very small. If you could keep the DAG of all source blocks, received packets, etc, in memory then you could reduce it to O(M) per application where M is the depth of the DAG. But there's some recursion that's not accounted for at the next layer so the total cost is more than O(nM), it would be like O(n*CM) where C is the recursion constant. > So Online codes should be unencumbered, while Torrent / LT codes aren't? As far as I am aware, Digital Fountain owns all the patents to rateless codes currently in the literature. This is dangerous territory for open-source. That makes Rateless.com's claims of an unencumbered rateless code even more interesting.. their product documentation uses the words "Mixed Acyclic Decoder", which I cannot find any other references to online. My guess is that they have generalized the connectivity DAG for the source packets ("acyclic"), and used a two-part code ("mixed") to regenerate the DAG at the other end. 
Most of the Digital Fountain IP is predicated on the small per-packet ID-tag which identifies the connectivity of the packet in the source DAG. Digital Fountain has been working to optimize the source DAG to reduce memory consumption and improve throughput (basically you have to keep the entire file in memory to run a Digital Fountain of it). Rateless has probably come up with a new ID-tag which does not read on the Digital Fountain IP, but transmits the same information. --Bryan bryan.turner@pobox.com From arachnid at notdot.net Mon Feb 21 20:16:57 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <200502211557.j1LFvB1j028908@rtp-core-1.cisco.com> References: <200502211557.j1LFvB1j028908@rtp-core-1.cisco.com> Message-ID: <273a8d5eb249de6348b5c145301b9ab2@notdot.net> On 22/02/2005, at 4:57 AM, Bryan Turner wrote: > >> So Online codes should be unencumbered, while Torrent / LT codes >> aren't? > > As far as I am aware, Digital Fountain owns all the patents to > rateless codes currently in the literature. This is dangerous > territory for > open-source. That makes Rateless.com's claims of an unencumbered > rateless > code even more interesting.. their product documentation uses the words > "Mixed Acyclic Decoder", which I cannot find any other references to > online. Since infringing a patent requires meeting all the claims, it seems likely to me that Online Codes won't infringe Digital Fountain's patents. However, IANAPL. On the plus side, I live in New Zealand, which doesn't recognize Software Patents (to the best of my knowledge). It'd be nice to be able to distribute anything I write to people in the US, though. 
-Nick Johnson From agthorr at cs.uoregon.edu Mon Feb 21 20:29:36 2005 From: agthorr at cs.uoregon.edu (agthorr@cs.uoregon.edu) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <273a8d5eb249de6348b5c145301b9ab2@notdot.net> References: <200502211557.j1LFvB1j028908@rtp-core-1.cisco.com> <273a8d5eb249de6348b5c145301b9ab2@notdot.net> Message-ID: <20050221202935.GB16576@barsoom.org> On Tue, Feb 22, 2005 at 09:16:57AM +1300, Nick Johnson wrote: > Since infringing a patent requires meeting all the claims, it seems > likely to me that Online Codes won't infringe Digital Fountain's > patents. However, IANAPL. IANAL either, but it was my understanding that infringing a patent occurs when you infringe any of the claims, though a court may dismiss some claims as being overly broad. Thus, most patents have a series of increasingly more specific claims, where the most-broad cover the most scope, and the most-specific are most likely to hold up. This is what I recall from taking an intellectual property course several years ago, and appears to be true doing some quick search on Google. But, again, IANAL. :) From b.fallenstein at gmx.de Mon Feb 21 23:53:30 2005 From: b.fallenstein at gmx.de (b.fallenstein@gmx.de) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes Message-ID: <28921.1109030010@www19.gmx.net> On Mon, 21 Feb 2005 12:29:36 -0800, agthorr@cs.uoregon.edu wrote: > On Tue, Feb 22, 2005 at 09:16:57AM +1300, Nick Johnson wrote: > > Since infringing a patent requires meeting all the claims, it seems > > likely to me that Online Codes won't infringe Digital Fountain's > > patents. However, IANAPL. > > IANAL either, but it was my understanding that infringing a patent > occurs when you infringe any of the claims, though a court may dismiss > some claims as being overly broad. You're correct. What Nick was thinking about is that infringing a patent requires infringing all *steps* in one particular claim. 
So if you have a patent with claims like

1. A method for quenching thirst, comprising of opening at least one container of juice, pouring at least some of the juice in said container into a glass, and drinking said juice from said glass.
2. The method of claim 1, where the container is a pack.
3. The method of claim 1, where the container is a can.
4. A method for quenching thirst, comprising of opening at least one container of juice, putting a straw in said container, and drinking at least some of the juice in said container through said straw.

then you infringe the patent if you drink juice from a glass, even though you only infringe on claim 1 and not claim 4; however, if you open a can of juice and then drink the juice directly from the can, you do not infringe on the patent, because you don't perform all of the steps in any one claim.

(There's a loophole there; the patent holder may argue that what you did was "equivalent" to one of the steps in the patent. In the example, say you're drinking from a cup instead of a glass; the patent holder may argue that the cup is 'equivalent' to a glass. However, I believe courts are generally quite strict about that; if the patent holder meant either a cup or a glass, why did they say 'glass' specifically?)

IANAL, but I do believe I'm correct here, from what I've read. (Unfortunately the URI of the source escapes my mind :-( )

- Benja

From sdaswani at gmail.com Tue Feb 22 01:15:08 2005 From: sdaswani at gmail.com (Susheel Daswani) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <28921.1109030010@www19.gmx.net> References: <28921.1109030010@www19.gmx.net> Message-ID: <1cd056b90502211715a073fd@mail.gmail.com>

Folks, it is not the case that you infringe on a patent if you infringe ANY of the claims. You must infringe all. I am not a lawyer, but I am training to be one :).
Here is the content of a message I sent a few months back: -------------- I'm not sure how everyone is handling the Altnet patent threat, but in my studies I've come across some salient points regarding patent infringement: "For an accused product to literally infringe a patent, EVERY element contained in the patent claim must also be present in the accused product or device. If a claimed apparatus has five parts, or 'elements', and the allegedly infringing apparatus has only four of those five, it does not literally infringe. This is true even though the defendant may have copied the four elements exactly, and regardless of how significant or insignificant the missing element is." 'Intellectual Property in the New Technological Age', 3rd Edition, page 230 This may already be known, but I thought I'd put it out there. So everyone should analyse their hashing systems to see how they compare to Altnet's patent elements. If you don't do everything they do, you can ignore their dinky letter :). I'm going to analyse their claims soon and compare to the systems I know. Some more interesting information, which is probably obvious: "[I]t does not matter [if] a defendant has ADDED several new elements -- adding new features cannot help a defendant escape infringement." -------------- Susheel On Tue, 22 Feb 2005 00:53:30 +0100 (MET), b.fallenstein@gmx.de wrote: > On Mon, 21 Feb 2005 12:29:36 -0800, agthorr@cs.uoregon.edu > wrote: > > On Tue, Feb 22, 2005 at 09:16:57AM +1300, Nick Johnson wrote: > > > Since infringing a patent requires meeting all the claims, it seems > > > likely to me that Online Codes won't infringe Digital Fountain's > > > patents. However, IANAPL. > > > > IANAL either, but it was my understanding that infringing a patent > > occurs when you infringe any of the claims, though a court may dismiss > > some claims as being overly broad. > > You're correct. 
What Nick was thinking about is that infringing a > patent requires infringing all *steps* in one particular claim. So if > you have a patent with claims like > From arachnid at notdot.net Tue Feb 22 02:37:54 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <1cd056b90502211715a073fd@mail.gmail.com> References: <28921.1109030010@www19.gmx.net> <1cd056b90502211715a073fd@mail.gmail.com> Message-ID: On 22/02/2005, at 2:15 PM, Susheel Daswani wrote: > "For an accused product to literally infringe a patent, EVERY element > contained in the patent claim must also be present in the accused > product or device. If a claimed apparatus has five parts, or > 'elements', and the allegedly infringing apparatus has only four of > those five, it does not literally infringe. This is true even though > the defendant may have copied the four elements exactly, and > regardless of how significant or insignificant the missing element > is." > 'Intellectual Property in the New Technological Age', 3rd Edition, > page 230 This seems to be exactly what b.fallenstein was saying - all the elements in a particular claim must match, but only one claim needs to completely match for there to be a violation: >>> IANAL either, but it was my understanding that infringing a patent >>> occurs when you infringe any of the claims, though a court may >>> dismiss >>> some claims as being overly broad. >> >> You're correct. What Nick was thinking about is that infringing a >> patent requires infringing all *steps* in one particular claim. So if >> you have a patent with claims like From szabo at szabo.best.vwh.net Tue Feb 22 03:09:41 2005 From: szabo at szabo.best.vwh.net (Nick Szabo) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] no subject (file transmission) Message-ID: <20050222030941.24981.qmail@szabo.best.vwh.net> Disclaimer -- IANAL and the following are personal not legal opinions. 
The "found valid by a jury" business is FUD -- it's meant to intimidate those receiving the cease & desist letters into thinking they will lose, but the jury finding has no direct legal effect on future cases. To have such an effect it would have to satisfy the criteria for issue preclusion. If an issue is precluded for a future case that would mean the finding in the previous case stands -- it can't be re-argued or decided differently in the future case. The party against whom issue preclusion is asserted must have been present at the original trial and had a full and fair opportunity to litigate. Here this only applies to Akamai itself. In other words, if Akamai infringed again, and they had lost the first trial, the validity issue wouldn't be retried in the second trial. But other accused infringers can re-challenge the validity from scratch. (I suspect, but I'm not certain, that the jury result can't even be introduced as evidence to sway future juries, though it looks like nothing stops them from putting the results in cease & desist letters). Here the jury finding couldn't preclude re-arguing the validity even in a future suit against Akamai, since it's not necessary to the result of the first case. This kind of thing has been going on in IP for many decades -- see Electrical Fittings v. Thomas Betts, 307 U.S. 241 (1939) at http://caselaw.lp.findlaw.com/scripts/getcase.pl?navby=search&court=US&case=/us/307/241.html. According to the result in that case Akamai here could have appealed the jury finding (even though it won the case), but since the issue isn't precluded Akamai had no incentive to appeal, and didn't. If the roles were turned around the issue could be precluded -- if the jury had said the patent was invalid (and Digital Island/C&W/Altnet couldn't overturn the jury on appeal), and Altnet for that reason lost the case, Altnet would be precluded from raising that issue again -- the patent would remain invalid for subsequent lawsuits.
If juries find conflicting results there probably won't ever be any preclusion, so the issue of the validity of this Altnet patent probably won't ever be precluded, though it would be an interesting case to argue. So the jury finding in Akamai might have this indirect legal effect on future cases in _preventing_ preclusion of the validity issue in favor of defendants. As for the probabilities of subsequent juries going the same way as the first, I suspect there's not much correlation. Juries in patent cases are all over the map, and given that there is more publicity in the p2p world this time around, future defendants will probably find much better prior art to challenge the Altnet patents with. > "Also, a decentralized scheme such as in > Kazaa has no availability problems but lacks integrity, since Kazaa is > plagued with many fake files. Clearly, decentralization is an unsolved > issue that needs further research." Perhaps this is ironic, but a good solution is protocol-enforced property rights. Specifically, we should treat names crossing trust boundaries as property, and securely agree across trust boundaries on who owns what names, as described in "Secure Property Titles with Owner Authority", http://szabo.best.vwh.net/securetitle.html Nick Szabo From szabo at szabo.best.vwh.net Tue Feb 22 03:22:24 2005 From: szabo at szabo.best.vwh.net (Nick Szabo) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Fully Message-ID: <20050222032224.44521.qmail@szabo.best.vwh.net> > "Also, a decentralized scheme such as in > Kazaa has no availability problems but lacks integrity, since Kazaa is > plagued with many fake files. Clearly, decentralization is an unsolved > issue that needs further research." Perhaps this is ironic, but a good solution is protocol-enforced property rights. 
Specifically, we should treat names crossing trust boundaries as property, and securely agree across trust boundaries on who owns what names, as described in "Secure Property Titles with Owner Authority", http://szabo.best.vwh.net/securetitle.html Nick Szabo From ian at locut.us Tue Feb 22 11:39:52 2005 From: ian at locut.us (Ian Clarke) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] no subject (file transmission) In-Reply-To: <20050222030941.24981.qmail@szabo.best.vwh.net> References: <20050222030941.24981.qmail@szabo.best.vwh.net> Message-ID: On 22 Feb 2005, at 03:09, Nick Szabo wrote: > Disclaimer -- IANAL and the following are personal not legal opinions. > > The "found valid by a jury" business is FUD -- it's meant to > intimidate those receiving the cease & desist letters into thinking > they will lose, but the jury finding has no direct legal effect on > future cases. IANAL either, but an IP lawyer told me that juries determine matters of fact, whereas the validity of a patent is a matter of law, and therefore a jury is incapable of finding that a patent is valid. Ian. -- Founder, The Freenet Project http://freenetproject.org/ CEO, Cematics Ltd http://cematics.com/ Personal Blog http://locut.us/~ian/blog/ From alwaysakid at 163.com Tue Feb 22 18:36:56 2005 From: alwaysakid at 163.com (Chris) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Dynamic IP In-Reply-To: <20050222032224.44521.qmail@szabo.best.vwh.net> Message-ID: <20050222183708.E5CD83FD48@capsicum.zgp.org> Reading thru B.Ford's paper, it just occurred to me that if there's somewhere a server that's willing to map a user name to a udp ip:port tuple, many p2p apps will not need things like Dyn2Go to resolve a dynamic ip server. Is there any such thing?
From bkn3 at columbia.edu Tue Feb 22 20:09:50 2005 From: bkn3 at columbia.edu (Brad Neuberg) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Byzantine Quorum Systems In-Reply-To: <20050222032224.44521.qmail@szabo.best.vwh.net> References: <20050222032224.44521.qmail@szabo.best.vwh.net> Message-ID: <6.2.1.2.2.20050222115546.02598a80@pop.mail.yahoo.com> Nick, I've been reading over the papers on your website and want to know your opinion of how well byzantine quorum systems work under the following conditions: * A majority of the nodes have a half-life of about an hour (i.e. every hour half the nodes in the p2p system leave). A large number of nodes never rejoin the system again. * The system has high latency (i.e. it is running over the public Internet) * A majority of the nodes are NATed or firewalled, depending on other nodes to relay requests. My understanding of byzantine quorum algorithms is that they break down under the kinds of conditions found in large-scale P2P systems, which have very high node churn and high latency, described above. They seem to be focused on very stable or LAN-type networks. Is this a correct assumption? If it is, it seems that byzantine quorum algorithms need to be refocused on the kinds of networks that we are dealing with today, rather than LAN-centric networks or networks of very stable servers on the public Internet. Best, Brad Neuberg At 07:22 PM 2/21/2005, you wrote: > > "Also, a decentralized scheme such as in > > Kazaa has no availability problems but lacks integrity, since Kazaa is > > plagued with many fake files. Clearly, decentralization is an unsolved > > issue that needs further research." > >Perhaps this is ironic, but a good solution is protocol-enforced property >rights.
Specifically, we should treat names crossing trust boundaries as >property, and securely agree across trust boundaries on who owns what names, >as described in > >"Secure Property Titles with Owner Authority", >http://szabo.best.vwh.net/securetitle.html > >Nick Szabo >_______________________________________________ >p2p-hackers mailing list >p2p-hackers@zgp.org >http://zgp.org/mailman/listinfo/p2p-hackers >_______________________________________________ >Here is a web page listing P2P Conferences: >http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences Brad Neuberg, bkn3@columbia.edu Senior Software Engineer, Rojo Networks Weblog: http://www.codinginparadise.org ===================================================================== Check out Rojo, an RSS and Atom news aggregator that I work on. Visit http://rojo.com for more info. Feel free to ask me for an invite! Rojo is Hiring! If you're interested in RSS, Weblogs, Social Networking, Java, Open Source, etc... then come work with us at Rojo. If you recommend someone and we hire them you'll get a free iPod! See http://www.rojonetworks.com/JobsAtRojo.html. From wesley at felter.org Tue Feb 22 20:30:41 2005 From: wesley at felter.org (Wes Felter) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Byzantine Quorum Systems In-Reply-To: <6.2.1.2.2.20050222115546.02598a80@pop.mail.yahoo.com> References: <20050222032224.44521.qmail@szabo.best.vwh.net> <6.2.1.2.2.20050222115546.02598a80@pop.mail.yahoo.com> Message-ID: <421B9671.30404@felter.org> Brad Neuberg wrote: > My understanding of byzantine quorum algorithms are that they break down > under the kinds of conditions found in large-scale, P2P systems, which > have very high node churn and high latency, described above. They seem > to be focused on very stable or LAN type networks. Is this a correct > assumption? 
If it is, it seems that byzantine quorum algorithms need to > be refocused on the kinds of networks that we are dealing with today, > rather than LAN centric networks or networks of very stable servers on > the public Internet. OceanStore solves this problem by using a byzantine quorum protocol only between supernodes. IIRC, performance of this protocol over the Internet was not too bad. Unfortunately, I didn't understand Nick's recent messages (and thus the context for this discussion) at all, so I don't know if this is relevant. Wes Felter - wesley@felter.org From bkn3 at columbia.edu Tue Feb 22 21:52:17 2005 From: bkn3 at columbia.edu (Brad Neuberg) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Byzantine Quorum Systems In-Reply-To: <421B9671.30404@felter.org> References: <20050222032224.44521.qmail@szabo.best.vwh.net> <6.2.1.2.2.20050222115546.02598a80@pop.mail.yahoo.com> <421B9671.30404@felter.org> Message-ID: <6.2.1.2.2.20050222134908.025b1140@pop.mail.yahoo.com> At 12:30 PM 2/22/2005, you wrote: >Brad Neuberg wrote: > >>My understanding of byzantine quorum algorithms are that they break down >>under the kinds of conditions found in large-scale, P2P systems, which >>have very high node churn and high latency, described above. They seem >>to be focused on very stable or LAN type networks. Is this a correct >>assumption? If it is, it seems that byzantine quorum algorithms need to >>be refocused on the kinds of networks that we are dealing with today, >>rather than LAN centric networks or networks of very stable servers on >>the public Internet. > >OceanStore solves this problem by using a byzantine quorum protocol only >between supernodes. IIRC, performance of this protocol over the Internet >was not too bad. Unfortunately, I didn't understand Nick's recent messages >(and thus the context for this discussion) at all, so I don't know if this >is relevant. I should provide more context. 
I'm reading the following two papers by Nick: * "Secure Property Titles with Owner Authority" - http://szabo.best.vwh.net/securetitle.html * "Advances in Distributed Security" - http://szabo.best.vwh.net/distributed.html Both posit that greater advances in things like distributed naming over p2p networks are possible due to things like byzantine quorum systems. I've always felt that byzantine quorum systems are too fragile for unreliable p2p networks on the wider Internet, though I'd love to be proven wrong. Brad >Wes Felter - wesley@felter.org > >_______________________________________________ >p2p-hackers mailing list >p2p-hackers@zgp.org >http://zgp.org/mailman/listinfo/p2p-hackers >_______________________________________________ >Here is a web page listing P2P Conferences: >http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences Brad Neuberg, bkn3@columbia.edu Senior Software Engineer, Rojo Networks Weblog: http://www.codinginparadise.org ===================================================================== Check out Rojo, an RSS and Atom news aggregator that I work on. Visit http://rojo.com for more info. Feel free to ask me for an invite! Rojo is Hiring! If you're interested in RSS, Weblogs, Social Networking, Java, Open Source, etc... then come work with us at Rojo. If you recommend someone and we hire them you'll get a free iPod! See http://www.rojonetworks.com/JobsAtRojo.html. From paul at ref.nmedia.net Tue Feb 22 23:33:49 2005 From: paul at ref.nmedia.net (Paul Campbell) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] no subject (file transmission) In-Reply-To: References: <20050222030941.24981.qmail@szabo.best.vwh.net> Message-ID: <20050222233349.GA2001@ref.nmedia.net> On Tue, Feb 22, 2005 at 11:39:52AM +0000, Ian Clarke wrote: > > On 22 Feb 2005, at 03:09, Nick Szabo wrote: > > >Disclaimer -- IANAL and the following are personal not legal opinions. 
> > > >The "found valid by a jury" business is FUD -- it's meant to > >intimidate those receiving the cease & desist letters into thinking > >they will lose, but the jury finding has no direct legal effect on > >future cases. > > IANAL either, but an IP lawyer told me that jurys determine matters of > fact, where as the validity of a patent is a matter of law, and > therefore a jury is incapable of finding that a patent is valid. False. Juries can decide both the facts and the law. Judges often refuse to allow them to do so unless they are instructed to do otherwise. There is a huge body of information about this. Specifically, there is the issue of jury nullification. In essence, a jury can decide to totally ignore the law and decide for innocence in a criminal case or decide for the defendant in a civil case in spite of whatever legal paperwork exists. This was most prominent during prohibition when it was very difficult or impossible to get a conviction simply because the power of jury nullification was raised in jury instructions and the resulting juries simply refused to convict. From paul at ref.nmedia.net Tue Feb 22 23:39:53 2005 From: paul at ref.nmedia.net (Paul Campbell) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Dynamic IP In-Reply-To: <20050222183708.E5CD83FD48@capsicum.zgp.org> References: <20050222032224.44521.qmail@szabo.best.vwh.net> <20050222183708.E5CD83FD48@capsicum.zgp.org> Message-ID: <20050222233953.GB2001@ref.nmedia.net> On Wed, Feb 23, 2005 at 02:36:56AM +0800, Chris wrote: > Reading thru B.Ford's paper, it just occurs to me that if there's somewhere > a server who's willing map a user name to a udp ip:port tuple, many p2p app > will not need things like Dyn2Go to resolve a dynamic ip server. > Is there any such thing ? First off, it is not necessary anyways. 
If the very same p2p network that maintains the distributed database of files/data also maintains a distributed database of usernames (perhaps mapping to a public key as a signature scheme), then there is no need for the service you mentioned. Most of them have this capability to one degree or another. The other place where it becomes "necessary" is for a "first contact point" or initial entry into the network. This is already essentially solved by maintaining a sufficiently large list of contact points (say a list of ip:port pairs for 1000 known P2P contacts) that there is zero chance of not finding an initial entry point. It is still a sticky issue for "new" entries...someone who downloads the software needs something for an initial contact point that has enough permanence to withstand months or years of no updates. This can be provided at a relatively low-bandwidth level by a web server. It also provides a fallback for more or less guaranteed first contacts. From szabo at szabo.best.vwh.net Wed Feb 23 01:32:05 2005 From: szabo at szabo.best.vwh.net (Nick Szabo) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Re: Byzantine Quorum Systems In-Reply-To: <6.2.1.2.2.20050222115546.02598a80@pop.mail.yahoo.com> from "Brad Neuberg" at Feb 22, 2005 12:09:50 PM Message-ID: <20050223013206.42103.qmail@szabo.best.vwh.net> My proposal for a fully decentralized p2p directory to protect name integrity was somewhat vague; hopefully this will clarify things a bit: http://szabo.best.vwh.net/nameintegrity.html > Nick, I've been reading over the papers on your website and want to know > your opinion of how well byzantine quorum systems work under the following > conditions: I'm by no means an expert on the network conditions you describe, but I'll do my best. > * A majority of the nodes have a half-life of about an hour (i.e. every > hour half the nodes in the p2p system leave). A large number of nodes > never rejoin the system again.
This is where Byzantine fault tolerance shines. The main security and reliability property holds per transaction. Byzantine fault-tolerant systems are vastly superior to any reputation- or history-based precaution in this regard. The most common transaction for the proposed directory is the propagation of a new (filename, hash, owner) tuple. The Byzantine attacker tries to overwhelm the directory with new, separate-looking nodes which forge messages, corrupting the directory for everybody. If the attacker can usurp more than a certain fraction (1/4 for some kinds of quorum systems, 1/3 for traditional slow Byzantine fault tolerant systems) of the separate-looking nodes during a given transaction, they can compromise the integrity of that transaction. The fraction is far smaller for non-Byzantine tolerant directories -- in a typical p2p system, you can't demonstrate that the directory is safe against even a very small number of message-forging nodes. > * The system has high latency (i.e. it is running over the public Internet) For the directory, we just want short tuples like (file name, file hash, owner name, logo, digital signature) to propagate about as fast as the large files they refer to. How much of a problem slowing down overall file+directory propagation poses depends on the relative values users and uploaders place on propagation time, uploader reputation, and file name integrity. The larger number of messages required for a Byzantine quorum system is probably tolerable, as it is for the supernodes in OceanStore. > * A majority of the nodes are NATed or firewalled, depending on other nodes > to relay requests. All kinds of directories would benefit from cryptography to prevent substitution and replay attacks by intermediaries like these and others, to be sure. This reduces the intermediaries' attack to making all the nodes behind them simply fail, and fail-stop is a simpler problem to solve than Byzantine faults (message forging & the like).
If the inside nodes fail and outside nodes can't reroute their messages to the failed nodes through another NAT or firewall, the correct nodes cut the failed nodes out of the directory, and the integrity of the directory is not harmed. > My understanding of byzantine quorum algorithms are that they break down > under the kinds of conditions found in large-scale, P2P systems, which have > very high node churn and high latency, described above. They seem to be > focused on very stable or LAN type networks. Is this a correct > assumption? If it is, it seems that byzantine quorum algorithms need to be > refocused on the kinds of networks that we are dealing with today... Actually, the original focus of the research in the 70's was even more fixed -- it was about ensuring the reliability of computers with multiple redundant CPUs. The best way to prove such reliability was to make a very open-ended assumption that CPUs would have not just statistical errors but arbitrary, even malicious errors -- so the models became even more pertinent to the security and reliability of LANs, and more pertinent still to the security and reliability of wide-area distributed systems. The CPUs (and LAN nodes) were also deemed to be highly unstable -- there was a separate fail-stop model where the idea was to simply tolerate high numbers of failures that could (unlike Byzantine attacks) be detected and ignored as simple failures by other nodes. Both models are very well studied, and chances are the combination of Byzantine and fail-stop models is well-studied as well, though I haven't personally researched that angle. I heartily agree that more adaptations and improvements to an unstable environment should be explored. For example, a combination of fail-stop (to deal with very large numbers of detectable failures from catastrophic exit or blockage of many p2p nodes at once) and Byzantine fault tolerance (to deal with a smaller number of simultaneous message-forging nodes) could be explored.
Such exploration should not, however, come at the expense of losing the provable security and reliability properties Byzantine fault-tolerance discipline makes possible for fully decentralized directories. Nick Szabo From lgonze at panix.com Wed Feb 23 01:56:34 2005 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Re: Byzantine Quorum Systems In-Reply-To: <20050223013206.42103.qmail@szabo.best.vwh.net> References: <20050223013206.42103.qmail@szabo.best.vwh.net> Message-ID: While we're on the topic of quorum systems for naming, it strikes me that there's nothing about Zooko's triangle which requires systems to tolerate rapid churn. A network of highly stable nodes, for example ones hosted at university comp sci departments, would be fine as long as it was large enough and allowed open membership. From arachnid at notdot.net Wed Feb 23 08:18:36 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question Message-ID: <421C3C5C.4040408@notdot.net> If I can be forgiven a stupid question: I'm reading in detail the paper (http://www.scs.cs.nyu.edu/~mfreed/docs/authcodes-ieee04.pdf) on homomorphic hash functions for use with Digital Fountain codes in preparation for implementing it. The problem I'm coming up against is in the description of the modifications to the Fountain code described on page 5. With their example settings, 256 bit long sub-blocks are now added modulo a 257 bit prime. This makes sense - what I don't get is how to encode the result in 256 bits! What is one supposed to do if the sum of the selected blocks overflows 256 bits? Can anyone enlighten me? 
-Nick Johnson From gwenchlan at fr.fm Wed Feb 23 14:23:21 2005 From: gwenchlan at fr.fm (Gwenchlan) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm Message-ID: <421C91D9.6070908@fr.fm> Hi all, I am looking for papers or information about node counting on overlays, but so far without success. In a distributed fashion, a node would be able to start a process to estimate the overlay size. Has anyone here seen something like this recently? Any clues about that? I think I will have to use random walkers. Thanks! From agthorr at barsoom.org Wed Feb 23 14:33:17 2005 From: agthorr at barsoom.org (Daniel Stutzbach) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm In-Reply-To: <421C91D9.6070908@fr.fm> References: <421C91D9.6070908@fr.fm> Message-ID: <20050223143317.GG3549@barsoom.org> On Wed, Feb 23, 2005 at 03:23:21PM +0100, Gwenchlan wrote: > I am looking for papers or information about node counting on overlays, > but so far without success. > In a distributed fashion, a node would be able to start a process to > estimate the overlay size. > Has anyone here seen something like this recently? > Any clues about that? > I think I will have to use random walkers. Are you looking for an algorithm that will estimate the overlay size for use by the overlay, or are you looking for measurement techniques? -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon
"The Sybil Attack" by John Douceur puts the question in the p2p context: http://citeseer.ist.psu.edu/douceur02sybil.html This paper can be lossily compressed as: "Your scheme can handle up to K malicious nodes. My attacker can bring K+1 malicious nodes to the party.". It is an excellent paper because it is very general and it requires p2p designers to think explicitly about the issue. The argument in "The Sybil Attack" is valid -- if you accept its premises, then you must accept its conclusion. However, there is one premise which is implicit in this paper and in most related research which ought to be challenged. This implicit premise is that a connection between two nodes arises ex nihilo. That is: for any three nodes A, B, and C, A has (at the start) no information about how B differs from C. This assumption is obviously key to the whole issue. It is also obviously wrong! In practice the opposite is often true: for any three nodes A, B, and C, A often has information distinguishing B from C. This is because A has been introduced to B somehow, and that introduction gave A information. (Likewise with A's introduction to C.) In sum, The Sybil Attack beats Byzantine techniques, but fortunately that doesn't matter because both of those ideas are set in an idealized world in which an unbounded number of indistinguishable peers are introduced to one another ex nihilo, with the introductions conveying no information. Rather than struggle vainly to overcome The Sybil Attack in that idealized world, I suggest designing distributed systems that are secure only when bootstrapped with useful introductions. I don't know whether that kind of design can accommodate "The Napster Setting", in which a large number of strangers want to be automatically introduced in order to trade files.
I suspect that it *can*, but I also think that this isn't the only interesting setting for p2p designs, and other settings may be even more amenable to designs which use the information from introductions. Regards, Zooko From gwenchlan at fr.fm Wed Feb 23 14:48:43 2005 From: gwenchlan at fr.fm (Gwenchlan) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm Message-ID: <421C97CB.2050205@fr.fm> On Wednesday, 23 February 2005 at 06:33 -0800, Daniel Stutzbach wrote: >On Wed, Feb 23, 2005 at 03:23:21PM +0100, Gwenchlan wrote: >> I am looking for papers or information about node counting on overlays, >> but so far without success. >> In a distributed fashion, a node would be able to start a process to >> estimate the overlay size. >> Has anyone here seen something like this recently? >> Any clues about that? >> I think I will have to use random walkers. > >Are you looking for an algorithm that will estimate the overlay size >for use by the overlay, or are you looking for measurement techniques? > > Hi Daniel, probably the first option, in order to launch "on demand" measurement for overlay maintenance (by the overlay itself), for example. The request initiator would be waiting for a more or less precise estimation, depending on the dynamicity and the expected response time. From mgp at ucla.edu Wed Feb 23 18:27:41 2005 From: mgp at ucla.edu (Michael Parker) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm In-Reply-To: <421C97CB.2050205@fr.fm> References: <421C97CB.2050205@fr.fm> Message-ID: <421CCB1D.1030907@ucla.edu> Interestingly enough, the topology of some DHTs makes this not too difficult to calculate. In Pastry/Bamboo, for example: If your leaf set overlaps, it's just the number of entries in your leaf set. If your leaf set does not overlap, divide the size of the ring (e.g.
2^128, 2^160) by the span of your leaf set (i.e., the farthest clockwise node minus the farthest counterclockwise node, modulo the ring size), and multiply by the size of your leaf set. Basically, what this means is if your leaf set is size L, and it spans a fraction x of the node identifier space, the size of the network is approximately L * x^-1. To improve accuracy, contact the two farthest nodes in your leaf set and ask them for their leaf sets, merging them into yours before calculating. That way, you have a larger effective L. The same can be done for Chord using its successor list and predecessor. Alternatively, in these two networks, since the number of nodes in your routing table is log_2 N, you can estimate the size of your network as 2^k, where k is the number of filled rows in your routing table. Finally, although I'm no Kademlia expert, I think you can estimate the size of the network by 2^k, where k is the number of buckets in your routing table. - Michael Parker Gwenchlan wrote: > On Wednesday, 23 February 2005 at 06:33 -0800, Daniel Stutzbach wrote: > >> On Wed, Feb 23, 2005 at 03:23:21PM +0100, Gwenchlan wrote: >> >>> I am looking for papers or information about node counting on >>> overlays, but so far without success. >>> In a distributed fashion, a node would be able to start a process to >>> estimate the overlay size. >>> Has anyone here seen something like this recently? >>> Any clues about that? >>> I think I will have to use random walkers. >> >> >> Are you looking for an algorithm that will estimate the overlay size >> for use by the overlay, or are you looking for measurement techniques? >> >> > Hi Daniel, > probably the first option, in order to launch "on demand" measurement > for overlay maintenance (by the overlay itself), for example. > The request initiator would be waiting for a more or less precise > estimation, depending on the dynamicity and the expected response time.
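[Archive editor's sketch: the leaf-set estimate Michael Parker describes above boils down to N ~= L / x. The following toy code is illustrative only -- the function name and example numbers are hypothetical, not taken from Bamboo or Pastry code.]

```python
# Toy sketch of the leaf-set size estimate described above for a
# Pastry/Bamboo-style ring. RING matches a 160-bit (SHA-1) identifier
# space; the example numbers are hypothetical.

RING = 2 ** 160

def estimate_size(leaf_set_size, span):
    """Estimate the network size as N ~= L / x, where x = span / RING
    is the fraction of the identifier ring the leaf set covers."""
    x = span / RING
    return leaf_set_size / x

# A leaf set of 8 nodes spanning 1/128 of the ring suggests about
# 8 * 128 = 1024 nodes in the whole network.
print(estimate_size(8, RING // 128))  # -> 1024.0
```

In a real node the span would be computed from node identifiers, as the message says: farthest clockwise member minus farthest counterclockwise member, modulo RING.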
> _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From em at em.no-ip.com Thu Feb 24 03:13:57 2005 From: em at em.no-ip.com (Enzo Michelangeli) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm References: <421C97CB.2050205@fr.fm> <421CCB1D.1030907@ucla.edu> Message-ID: <00d801c51a1e$f3813740$0200a8c0@em.noip.com> ----- Original Message ----- From: "Michael Parker" To: "Peer-to-peer development." Sent: Thursday, February 24, 2005 2:27 AM Subject: Re: [p2p-hackers] Node counting algorithm > Interestingly enough, the topology of some DHTs makes this not too > difficult to calculate. In Pastry/Bamboo, for example: [...] > Finally, although I'm no Kademlia expert, I think you can estimate the > size of the network by 2^k, where k is the number of buckets in your > routing table. More accurately, by the number of nodes contained in each k-bucket. Each k-bucket gets full after k nodes, and that clips its contribution to k; however, partially-filled k-buckets do contribute useful information. See e.g. my posting archived at http://zgp.org/pipermail/p2p-hackers/2004-June/001991.html , so far with no followup. Enzo From srhea at cs.berkeley.edu Thu Feb 24 05:32:49 2005 From: srhea at cs.berkeley.edu (Sean C. Rhea) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm In-Reply-To: <421CCB1D.1030907@ucla.edu> References: <421C97CB.2050205@fr.fm> <421CCB1D.1030907@ucla.edu> Message-ID: <3ab998d35d23fedc340ce0364a89aa8c@cs.berkeley.edu> On Feb 23, 2005, at 10:27 AM, Michael Parker wrote: > If your leaf set overlaps, it's just the number of entries in your > leaf set. > If your leaf set does not overlap, divide the size of the ring (e.g. 
> 2^128, 2^160) by the span of your leaf set (i.e., the farthest > clockwise node minus the farthest counterclockwise node, modulo the > ring size), and multiply by the size of your leaf set. Basically, what > this means is if your leaf set is size L, and it spans a percentage x > of the node identifier space, the size of the network is approximately > L * x^-1. To improve accuracy, ask the two farthest nodes in your leaf > set and ask them for their leaf sets, merging them into yours before > calculating. That way, you have a larger effective L. This technique gives estimates that, on average, overestimate the size of the network if you pick node identifiers uniformly at random (UAR). The reason is that UAR doesn't mean evenly distributed; some nodes' leaf sets cover much more than others. If you have one node whose leaf set covers a larger portion of the key space, that node underestimates the size of the ring, but a lot of other nodes end up covering less of the key space (to make room for the larger one) and end up overestimating the network size. When you average them all, the few nodes that underestimate don't make up for all the rest that overestimate it. IANAM (I am not a mathematician), but this is what happens when you simulate it at least, with identifiers drawn using SHA. References for more mathematical explanations and a better algorithm are described in this paper: http://iptps05.cs.cornell.edu/PDFs/CameraReady_174.pdf Sean -- Give a man a fish and he will eat for a day. Teach him how to fish, and he will sit in a boat and drink beer all day. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050223/45d8b186/PGP.pgp From gwenchlan at fr.fm Thu Feb 24 08:37:35 2005 From: gwenchlan at fr.fm (Gwenchlan) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm Message-ID: <421D924F.20403@fr.fm> On Wednesday, February 23, 2005 at 21:32 -0800, Sean C. Rhea wrote: >On Feb 23, 2005, at 10:27 AM, Michael Parker wrote: >> If your leaf set overlaps, it's just the number of entries in your >> leaf set. >> If your leaf set does not overlap, divide the size of the ring (e.g. >> 2^128, 2^160) by the span of your leaf set (i.e., the farthest >> clockwise node minus the farthest counterclockwise node, modulo the >> ring size), and multiply by the size of your leaf set. Basically, what >> this means is if your leaf set is size L, and it spans a fraction x >> of the node identifier space, the size of the network is approximately >> L * x^-1. To improve accuracy, contact the two farthest nodes in your leaf >> set and ask them for their leaf sets, merging them into yours before >> calculating. That way, you have a larger effective L. > >This technique gives estimates that, on average, overestimate the size >of the network if you pick node identifiers uniformly at random (UAR). >The reason is that UAR doesn't mean evenly distributed; some nodes' >leaf sets cover much more than others. If you have one node whose leaf >set covers a larger portion of the key space, that node underestimates >the size of the ring, but a lot of other nodes end up covering less of >the key space (to make room for the larger one) and end up >overestimating the network size. When you average them all, the few >nodes that underestimate don't make up for all the rest that >overestimate it. > >IANAM (I am not a mathematician), but this is what happens when you >simulate it at least, with identifiers drawn using SHA.
References for >more mathematical explanations and a better algorithm are described in >this paper: > > http://iptps05.cs.cornell.edu/PDFs/CameraReady_174.pdf > >Sean > Thanks for these DHT heuristics. I omitted to specify that I was looking for tricks that apply to unstructured networks (probably random graphs), so we cannot exploit identifier density here. I was thinking about using a flexible method like random walks, but K random walkers launched by the initiator raise problems of scalability and accuracy, as the overlay may (more or less) vary during the counting itself, a phenomenon amplified by network size... From eugen at leitl.org Thu Feb 24 12:19:40 2005 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] [IP] New paper on poisoning and pollution of P2P networks (fwd from dave@farber.net) Message-ID: <20050224121940.GD1404@leitl.org> ----- Forwarded message from David Farber ----- From: David Farber Date: Thu, 24 Feb 2005 06:09:58 -0500 To: Ip Subject: [IP] New paper on poisoning and pollution of P2P networks User-Agent: Microsoft-Entourage/11.1.0.040913 Reply-To: dave@farber.net ------ Forwarded Message From: Joseph Lorenzo Hall Reply-To: Date: Wed, 23 Feb 2005 23:20:22 -0800 To: Dave Farber , Declan McCullagh Subject: New paper on poisoning and pollution of P2P networks Hi Dave, Declan... I thought you two might enjoy this paper. -Joe ---- ## New paper on poisoning and pollution of P2P networks ## http://groups.sims.berkeley.edu/pam-p2p/index.php?p=40 [Nicolas Christin][1] has just put the finishing touches on a new paper authored with [Andreas Weigend][2] and SIMS professor [John Chuang][3], ["Content Availability, Pollution and Poisoning in File Sharing Peer-to-Peer Networks"][4] that will be presented at [ACM's Conference on Electronic Commerce][5] this summer in Vancouver, Canada.
Here is the abstract: [1]: http://www.sims.berkeley.edu/~christin/ [2]: http://www.weigend.com/ [3]: http://www.sims.berkeley.edu/~chuang/ [4]: http://p2pecon.berkeley.edu/pub/CWC-EC05.pdf [5]: http://www.acm.org/sigs/sigecom/ec05/ > Copyright holders have been investigating technological solutions to prevent distribution of copyrighted materials in peer-to-peer file sharing networks. A particularly popular technique consists in poisoning a specific item (movie, song, or software title) by injecting a massive number of decoys into the peer-to-peer network, to reduce the availability of the targeted item. In addition to poisoning, pollution, that is, the accidental injection of unusable copies of files in the network, also decreases content availability. In this paper, we attempt to provide a first step toward understanding the differences between pollution and poisoning, and their respective impact on content availability in peer-to-peer file sharing networks. To that effect, we conduct a measurement study of content availability in the four most popular peer-to-peer file sharing networks, in the absence of poisoning, and then simulate different poisoning strategies on the measured data to evaluate their potential impact. We exhibit a strong correlation between content availability and topological properties of the underlying peer-to-peer network, and show that the injection of a small number of decoys can seriously impact the users' perception of content availability. This is a really interesting paper. They measure a number of P2P network metrics - query response time, temporal stability, spatial stability and download completion time - using a widely distributed set of PCs on the [PlanetLab network][6] running scripted P2P software. This is a clever way to simultaneously study the characteristics of different P2P networks (notably eDonkey, eDonkey/Overnet, FastTrack and Gnutella) as well as quantitatively illustrate differences in the underlying network algorithms.
The really nifty part of this paper, in my opinion, involves measuring the effects of various content poisoning and pollution strategies. Their results show that fairly simple strategies are fairly simply defeated while more sophisticated and hybrid strategies aimed at mucking-up-the-net are difficult to detect and thwart. [6]: http://www.planet-lab.org/ -- Joseph Lorenzo Hall UC Berkeley, SIMS PhD Student http://pobox.com/~joehall/ blog: http://pobox.com/~joehall/nqb2/ ------ End of Forwarded Message ------------------------------------- You are subscribed as eugen@leitl.org To manage your subscription, go to http://v2.listbox.com/member/?listname=ip Archives at: http://www.interesting-people.org/archives/interesting-people/ ----- End forwarded message ----- -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050224/6bf3d531/attachment.pgp From sam at neurogrid.com Fri Feb 25 22:19:17 2005 From: sam at neurogrid.com (Sam Joseph) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Agents and P2P Workshop - submission form live Message-ID: <421FA465.6030909@neurogrid.com> *** our apologies if you receive multiple copies of this e-mail *** Call for Papers for the Fourth International Workshop on Agents and Peer-to-Peer Computing (AP2PC 2005) http://p2p.ingce.unibo.it/ held in AAMAS 2005 International Conference on Autonomous Agents and MultiAgent Systems Utrecht University, Netherlands. from 25 July - 29 July 2005. 
[SUBMISSION FORM NOW AVAILABLE] https://msrcmt.research.microsoft.com/AP2PC2005/CallForPapers.aspx [see below for more details] CALL FOR PAPERS: Peer-to-peer (P2P) computing has attracted enormous media attention, initially spurred by the popularity of file sharing systems such as Napster, Gnutella, and Morpheus. More recently, systems like BitTorrent and eDonkey have continued to sustain that attention. New techniques such as distributed hash-tables (DHTs), semantic routing, and Plaxton Meshes are being combined with traditional concepts such as Hypercubes, Trust Metrics and caching techniques to pool together the untapped computing power at the "edges" of the internet. These new techniques and possibilities have generated a lot of interest in many industrial organizations, and have resulted in the creation of a P2P working group on standardization in this area. (http://www.irtf.org/charters/p2prg.html). In P2P computing, peers and services forgo central coordination and dynamically organise themselves to support knowledge sharing and collaboration, in both cooperative and non-cooperative environments. The success of P2P systems strongly depends on a number of factors. First, the ability to ensure equitable distribution of content and services. Economic and business models which rely on incentive mechanisms to supply contributions to the system are being developed, along with methods for controlling the "free riding" issue. Second, the ability to enforce provision of trusted services. Reputation-based P2P trust management models are becoming a focus of the research community as a viable solution. The trust models must balance both constraints imposed by the environment (e.g. scalability) and the unique properties of trust as a social and psychological phenomenon. Recently, we are also witnessing a move of the P2P paradigm to embrace mobile computing in an attempt to achieve even greater ubiquity.
The possibility of services related to physical location and the relation with agents in physical proximity could introduce new opportunities and also new technical challenges. Although researchers working on distributed computing, MultiAgent Systems, databases and networks have been using similar concepts for a long time, it is only fairly recently that papers motivated by the current P2P paradigm have started appearing in high-quality conferences and workshops. Research in agent systems in particular appears to be most relevant because, since their inception, MultiAgent Systems have always been thought of as collections of peers. The MultiAgent paradigm can thus be superimposed on the P2P architecture, where agents embody the description of the task environments, the decision-support capabilities, the collective behavior, and the interaction protocols of each peer. The emphasis in this context on decentralization, user autonomy, dynamic growth and other advantages of P2P, also leads to significant potential problems. Most prominent among these problems are coordination: the ability of an agent to make decisions on its own actions in the context of activities of other agents, and scalability: the value of P2P systems lies in how well they scale along several dimensions, including complexity, heterogeneity of peers, robustness, traffic redistribution, and so forth. It is important to scale up coordination strategies along multiple dimensions to enhance their tractability and viability, and thereby to widen potential application domains. These two problems are common to many large-scale applications. Without coordination, agents may waste their efforts, squander resources, and fail to achieve their objectives in situations requiring collective effort. This workshop will bring together researchers working on agent systems and P2P computing with the intention of strengthening this connection.
Researchers from other related areas such as distributed systems, networks and database systems will also be welcome (and, in our opinion, have a lot to contribute). We seek high-quality and original contributions on the general theme of "Agents and P2P Computing". The following is a non-exhaustive list of topics of special interest: - Intelligent agent techniques for P2P computing - P2P computing techniques for MultiAgent Systems - The Semantic Web, Semantic Coordination Mechanisms and P2P systems - Scalability, coordination, robustness and adaptability in P2P systems - Self-organization and emergent behavior in P2P networks - E-commerce and P2P computing - Participation and Contract Incentive Mechanisms in P2P Systems - Computational Models of Trust and Reputation - Community of interest building and regulation, and behavioral norms - Intellectual property rights in P2P systems - P2P architectures - Scalable Data Structures for P2P systems - Services in P2P systems (service definition languages, service discovery, filtering and composition etc.) - Knowledge Discovery and P2P Data Mining Agents - P2P oriented information systems - Information ecosystems and P2P systems - Security issues in P2P networks - Pervasive computing based on P2P architectures (ad-hoc networks, wireless communication devices and mobile systems) - Grid computing solutions based on agents and P2P paradigms - Legal issues in P2P networks PANEL The theme of the panel will be Decentralised Trust in P2P and MultiAgent Systems. As P2P and MultiAgent systems become larger and more diverse, the risks of interacting with malicious peers become increasingly problematic. The panel will address how computational trust issues can be addressed in P2P and MultiAgent systems. The panel will involve short presentations by the panelists followed by a discussion session involving the audience.
IMPORTANT DATES Paper submission: 14th March 2005 Acceptance notification: 18th April 2005 Workshop: 25-26th July 2005 Camera ready for post-proceedings: 20th September 2005 REGISTRATION Accommodation and workshop registration will be handled by the AAMAS 2005 organization along with the main conference registration. SUBMISSION INSTRUCTIONS Previously unpublished papers should be formatted according to the LNCS/LNAI author instructions for proceedings and they should not be longer than 12 pages (about 5000 words including figures, tables, references, etc.). Please submit your papers through the Microsoft conference management system: https://msrcmt.research.microsoft.com/AP2PC2005/CallForPapers.aspx In addition, please carefully consider the issues our reviewers will be weighing; some of these can be seen in this form: http://www.neurogrid.net/ap2pc2005/review-form.html At the very least we would encourage all authors to read the abstracts of the papers submitted to previous workshops - available from the links below: http://p2p.ingce.unibo.it/2002/ http://www.springeronline.com/sgw/cda/frontpage/0,11855,5-40109-22-2991818-0,00.html http://p2p.ingce.unibo.it/2003/ http://www.springeronline.com/sgw/cda/frontpage/0,11855,5-40109-22-37060961-0,00.html http://p2p.ingce.unibo.it/2004/ Particular preference will be given to both novel approaches and those papers that build upon the contributions of papers presented at previous AP2PC workshops. PUBLICATION Accepted papers will be distributed to the workshop participants as workshop notes. As in previous years, post-proceedings of the revised papers (namely accepted papers presented at the workshop) will be submitted for publication to Springer in the Lecture Notes in Computer Science series.
ORGANIZING COMMITTEE Program Co-chairs Zoran Despotovic School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne (EPFL) CH-1015 Lausanne, Switzerland E-mail: zoran.despotovic@epfl.ch Sam Joseph (main contact) Dept. of Information and Computer Science, University of Hawaii at Manoa, USA 1680 East-West Road, POST 309, Honolulu, HI 96822 E-mail: srjoseph@hawaii.edu Claudio Sartori Dept. of Electronics, Computer Science and Systems, University of Bologna, Italy Viale Risorgimento, 2 - 40136 Bologna Italy E-mail: claudio.sartori@unibo.it Panel Chair Omer Rana School of Computer Science, Cardiff University, UK Queen's Buildings, Newport Road, Cardiff CF24 3AA, UK E-mail: o.f.rana@cs.cardiff.ac.uk PROGRAM COMMITTEE Karl Aberer, EPFL, Lausanne, Switzerland Alessandro Agostini, ITC-IRST, Trento, Italy Djamal Benslimane, Université Claude Bernard, France Sonia Bergamaschi, University of Modena and Reggio-Emilia, Italy M. Brian Blake, Georgetown University, USA Rajkumar Buyya, University of Melbourne, Australia Paolo Ciancarini, University of Bologna, Italy Costas Courcoubetis, Athens University of Economics and Business, Greece Yogesh Deshpande, University of Western Sydney, Australia Asuman Dogac, Middle East Technical University, Turkey Boi V. Faltings, EPFL, Lausanne, Switzerland Maria Gini, University of Minnesota, USA Dina Q.
Goldin, University of Connecticut, USA Chihab Hanachi, University of Toulouse, France Mark Klein, Massachusetts Institute of Technology, USA Matthias Klusch, DFKI, Saarbrucken, Germany Tan Kian Lee, National University of Singapore, Singapore Zakaria Maamar, Zayed University, UAE Wolfgang Mayer, University of South Australia, Australia Dejan Milojicic, Hewlett Packard Labs, USA Alberto Montresor, University of Bologna, Italy Luc Moreau, University of Southampton, UK Jean-Henry Morin, University of Geneve, Switzerland Andrea Omicini, University of Bologna, Italy Maria Orlowska, University of Queensland, Australia Aris. M. Ouksel, University of Illinois at Chicago, USA Mike Papazoglou, Tilburg University, Netherlands Paolo Petta, Austrian Research Institute for AI, Austria, Jeremy Pitt, Imperial College, UK Dimitris Plexousakis, Institute of Computer Science, FORTH, Greece Martin Purvis, University of Otago, New Zealand Omer F. Rana, Cardiff University, UK Douglas S. Reeves, North Carolina State University, USA Thomas Risse, Fraunhofer IPSI, Darmstadt, Germany Pierangela Samarati, University of Milan, Italy Christophe Silbertin-Blanc, University of Toulouse, France Maarten van Steen, Vrije Universiteit, Netherlands Katia Sycara, Robotics Institute, Carnegie Mellon University, USA Peter Triantafillou, Technical University of Crete, Greece Anand Tripathi, University of Minnesota, USA Vijay K. 
Vaishnavi, Georgia State University, USA Francisco Valverde-Albacete, Universidad Carlos III de Madrid, Spain Maurizio Vincini, University of Modena and Reggio-Emilia, Italy Fang Wang, BTexact Technologies, UK Gerhard Weiss, Technische Universitaet, Germany Bin Yu, North Carolina State University, USA Franco Zambonelli, University of Modena and Reggio-Emilia, Italy From hal at finney.org Fri Feb 25 23:30:50 2005 From: hal at finney.org (Hal Finney) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question Message-ID: <20050225233050.C0EAD57EBA@finney.org> Nick Johnson writes: > If I can be forgiven a stupid question: > I'm reading in detail the paper > (http://www.scs.cs.nyu.edu/~mfreed/docs/authcodes-ieee04.pdf) on > homomorphic hash functions for use with Digital Fountain codes in > preparation for implementing it. The problem I'm coming up against is in > the description of the modifications to the Fountain code described on > page 5. With their example settings, 256 bit long sub-blocks are now > added modulo a 257 bit prime. This makes sense - what I don't get is how > to encode the result in 256 bits! What is one supposed to do if the sum > of the selected blocks overflows 256 bits? I think... you would have to plan on allocating 257 bits for the output of this process. Blocks would be 257 bits long. I don't know if that is a show stopper for an implementation. There is a possible workaround. You could choose a prime q which was just barely, barely, barely 257 bits long. Let it be 2^257 plus some number less than 2^170 or so. In other words, let the prime q start with 1000000... for 80+ bits of zeros. Now the chance of a random Z_q value happening to be > 256 bits will be vanishingly small. Unfortunately, files are not random. Someone could choose a file which had special values that would overflow 256 bits. You could fix that by first pre-randomizing the file in some reversible way. 
A suitable cryptographic primitive is called an All Or Nothing Transform. See http://theory.lcs.mit.edu/~boyko/aont-oaep.html for an example. So you'd first AONT transform the file, which would randomize the values; then you'd use this q for the coding, which could overflow in principle but not in practice. In the end, after reconstructing the file, you'd reverse the AONT transform. In your sci.crypt posting, you also asked: > 1) The algorithm requires two primes q and p. These primes are known to > both the publisher and the verifiers. Will security be reduced if the > same primes are used for all publishers, or can a single pair of primes > be used globally? You should be able to use the same primes globally, if the security of the size of your primes is adequate. > 2) What level of security does this algorithm provide with p and q > being 1024 bit and 257 bit, respectively? Eg, how many operations or > how much computing time would be required to compute a collision? How > does this fare with reduced lengths of p and q? It is a little hard to quantify. The security will be the minimum of the security levels of p and q against the discrete log problem. For q it is easy, it is half the size of q, or about 128 bits (i.e. 2^128 work to break it). That should be more than enough. For p it is harder. I think most people would agree that a p of 1024 bits corresponds to a security of perhaps 80-90 bits for discrete logs. This is somewhat marginal. I would suggest a p of more like 2048 bits, with a 256 bit q. There have been a number of proposals for theoretical factoring machines, most of which could be adapted at somewhat greater cost to finding discrete logs. Many people today are worried that 1024 bit keys can no longer be considered extremely safe. Particularly if you use the same p throughout the system, it would be wise to use something a little bigger. 
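[For readers following along, the hash construction being discussed can be demonstrated with deliberately tiny toy parameters. This is only an illustrative sketch: the primes, generators, and block values below are made up, nothing like the 1024-bit p and 257-bit q above, and a real implementation picks its generators independently at random:]

```python
# Toy sketch of the homomorphic hash under discussion: a block is split
# into sub-blocks t_i and hashed as prod_i g_i^t_i mod p, with the g_i
# lying in the order-q subgroup of Z_p^*. All parameters here are tiny
# illustrative values, nothing like the 1024/257-bit sizes above.

p = 1019  # toy safe prime: p = 2*q + 1
q = 509   # order of the subgroup the g_i live in
g_i = [pow(2, 2 * (k + 1), p) for k in range(4)]  # even powers of 2 are
# quadratic residues mod p, hence of order q (made-up toy generators)

def homomorphic_hash(subblocks):
    h = 1
    for g, t in zip(g_i, subblocks):
        h = (h * pow(g, t % q, p)) % p
    return h

# The property that matters for fountain codes: the hash of a
# sub-block-wise sum of two blocks equals the product of their hashes,
# so a verifier can check encoded (summed) blocks against the original
# blocks' hashes alone.
b1 = [3, 1, 4, 1]
b2 = [5, 9, 2, 6]
summed = [(x + y) % q for x, y in zip(b1, b2)]
assert homomorphic_hash(summed) == \
       (homomorphic_hash(b1) * homomorphic_hash(b2)) % p
```

[The final assertion is the homomorphic property itself; forging a block that matches a given hash reduces to computing discrete logs mod p, which is why the security estimates above are framed in discrete-log terms for both p and q.]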
Hal Finney From hal at finney.org Fri Feb 25 23:35:07 2005 From: hal at finney.org (Hal Finney) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question Message-ID: <20050225233507.0BA9557EBA@finney.org> Quick correction: > There is a possible workaround. You could choose a prime q which was > just barely, barely, barely 257 bits long. Let it be 2^257 plus some > number less than 2^170 or so. I should have said, let it be 2^256 plus some number less than... In other words, a 257 bit prime that is just barely bigger than 2^256. Hal From arachnid at notdot.net Sat Feb 26 10:45:56 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question In-Reply-To: <20050225233050.C0EAD57EBA@finney.org> References: <20050225233050.C0EAD57EBA@finney.org> Message-ID: <42205364.5040301@notdot.net> Hal Finney wrote: >So you'd first AONT transform the file, which would randomize the values; >then you'd use this q for the coding, which could overflow in principle >but not in practice. In the end, after reconstructing the file, you'd >reverse the AONT transform. > > Interesting idea - thanks for suggesting it. I'd have to consider further to decide if the extra complexity in the protocol is worth the size and computation relieved by not having to store the 257th bit. >I would suggest a p of more like 2048 bits, with a 256 bit q. There have >been a number of proposals for theoretical factoring machines, most >of which could be adapted at somewhat greater cost to finding discrete >logs. Many people today are worried that 1024 bit keys can no longer be >considered extremely safe. Particularly if you use the same p throughout >the system, it would be wise to use something a little bigger. > > This could be a problem: 1024 bit hashes are already pretty large, 2048 bit would be substantially worse. 
I was, in fact, hoping it would be practical to use a smaller prime! Should someone successfully break the 1024 bit hash, what would the consequences be? Could they compute collisions for a single block (a break that is meaningless, since using the per-publisher model the only one setting blocks is the publisher, who can compute collisions with ease anyway - a backup file-wide SHA-1 hash will act to prevent this), could they conduct a preimage attack against a single block, or could they create collisions and preimage attacks for any file published with that K? Worse, could they compute collisions for every file published using that p? Thanks, Nick Johnson From hal at finney.org Sat Feb 26 18:55:29 2005 From: hal at finney.org (Hal Finney) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question Message-ID: <20050226185529.46E7157EBA@finney.org> Nick Johnson writes: > Should someone successfully break the 1024 bit hash, what would the > consequences be? Could they compute collisions for a single block (a > break that is meaningless, since using the per-publisher model the only > one setting blocks is the publisher, who can compute collisions with > ease anyway - a backup file-wide SHA-1 hash will act to prevent this), > could they conduct a preimage attack against a single block, or could > they create collisions and preimage attacks for any file published with > that K? Worse, could they compute collisions for every file published > using that p? As I understand it, the hash on a block is computed by breaking it into 256 bit pieces t_i, and using pre-defined constants g_i, computing the product over i of g_i^t_i. If someone did the work to break discrete logs mod p, they could compute the discrete logs of the g_i relative to each other, and take discrete logs of hashes. This would allow them to create a block that would match the hash of any given block. 
They could compute preimages for any hash, and they could compute collisions for any file hash using that p. The nature of the discrete log attack based on p is that it is expensive to mount, but once you have done the work, you can easily find more discrete logs of other values using that same modulus p. However, on the other hand, these hypothetical machines are really expensive, something like a billion dollars in today's money. Some people speculate that it might be down to a hundred million in a few years. The main concern with 1024 bit moduli has been cryptographic, where "they" could factor your key. Since it's so easy to move to bigger keys, many people are doing it, just to make sure that no government or anyone else who has a billion dollars to spend could read their messages. I guess everyone feels like they are special enough that their secrets might be worth that much. In your case, maybe this level of paranoia is unnecessary. Nobody is going to spend a billion dollars to break this. You could think about how much the opponents of this system would be willing to spend, look at Moore's law over the time frame during which you would envision this system being used, and estimate a desired security level from that. There's also the point that if someone did have the money to build such a machine, they would probably rather factor RSA moduli secretly than publicly start messing with your hashes. As soon as someone noticed a matching hash it would give away the existence of the machine. Then people would switch to a larger hash, making the machine useless. All that money spent would be wasted after forging a single hash. Using it to break encryption keys allows the machine to be kept secret, making it far more valuable. http://mathworld.wolfram.com/RSANumber.html has a nice chart showing the progress over the years in factoring RSA moduli, which are within an order of magnitude of difficulty of finding discrete logs. 
The largest RSA modulus factored is 576 bits, although looking at the chart I expect 640 will probably fall within a year or so. One point is that you should try to design the system to allow the hash size to be upgradeable if and when it became necessary. Then, maybe you could even get away with something a little smaller than 1024, with the idea of upgrading in five years or so, when faster networks and cheaper disk storage will make a larger hash more palatable. This will also let you recover from a surprise breakthrough. Another point is that if you used a different p for every file, it would require a billion dollars per file break rather than a billion to break the whole system. In that case I'd feel much safer about using a smaller p, even maybe 768 bits if you accept that one or two files might be cracked by say 2010. Hal From mfreed at cs.nyu.edu Sun Feb 27 05:45:50 2005 From: mfreed at cs.nyu.edu (Michael J. Freedman) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question (fwd) Message-ID: [Looks like Max was getting his posts rejected from the mailing list, so I'm forwarding his response. --mike] ---------- Forwarded message ---------- Date: Sat, 26 Feb 2005 12:53:31 -0500 From: Maxwell Krohn To: Hal Finney Cc: Michael J Freedman , p2p-hackers@zgp.org Subject: Re: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question (fwd) This is the response I sent to Nick personally about the one-bit overflow problem. This was the simplest solution we came up with. It makes all encoded blocks 1/256 bigger, where 256 is the number of bits per sub-block. Maxwell Krohn (krohn@mit.edu) wrote: > Hi, > > There might be 1 overflow bit per subblock, making 512 bits per > block, and 16 bytes per block of overflow bits. 
> > Our implementation of the Codes is here by the way, available by > anonymous CVS: > > http://cvs.pdos.lcs.mit.edu/cvs/codes1/ > > In memory, for the encoders and decoders, we store subblocks as big > integers, which can grow to 257 bits long. However, for sending over > the network (and storing on disk, perhaps) we want a denser packing. So > what we do is we shear off the top bit of each subblock and pack them > together at the beginning of every block. In our implementation, it's > called the "carrybitmap_t" object, which you can grep for in our code. > > Hope this helps, > > Max From arachnid at notdot.net Mon Feb 28 10:53:22 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SSM for Java? Message-ID: <4222F822.8020207@notdot.net> Has anyone come across a single-source-multicast library for Java, possibly using JNI? I've seen tantalizing hints of one, but no actual code. -Nick Johnson