From dcarboni at gmail.com Tue Feb 1 17:22:58 2005 From: dcarboni at gmail.com (Davide Carboni) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] simulator for p2p Message-ID: <71b79fa9050201092273a5f7ba@mail.gmail.com> Hi, is there any way to simulate a p2p network using a single PC? I know ns2, but it seems to be a very "low-level" simulation. I'd like something that simulates a network of peers while abstracting away the serialization of messages. For instance, I'd like to model peers as objects in memory that exchange messages by invoking each other's methods, while taking into account variables like bandwidth, latency, and so forth. Bye, Davide From srhea at cs.berkeley.edu Tue Feb 1 19:19:59 2005 From: srhea at cs.berkeley.edu (Sean C. Rhea) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] simulator for p2p In-Reply-To: <71b79fa9050201092273a5f7ba@mail.gmail.com> References: <71b79fa9050201092273a5f7ba@mail.gmail.com> Message-ID: On Feb 1, 2005, at 9:22 AM, Davide Carboni wrote: > is there any way to simulate a p2p network using a single PC? I know > ns2, but it seems to be a very "low-level" simulation. I'd like something > that simulates a network of peers while abstracting away the serialization > of messages. For instance, I'd like to model peers as objects in memory > that exchange messages by invoking each other's methods, while taking > into account variables like bandwidth, latency, and so forth. Bamboo (bamboo-dht.org) comes with a simple simulator that models latency based on real measurements (the data is from here: http://www.pdos.lcs.mit.edu/p2psim/kingdata/). It's a pretty simple event-driven simulator written in Java; the nice thing about it is that you can use the same code under simulation that you use on the real net. To use it, download the latest Bamboo CVS snapshot and try this:

cd bamboo/src/bamboo/sim
./make-startup-test.pl
../../../bin/run-java bamboo.sim.Simulator /tmp/startup-test.exp

It will start up 29 Bamboo nodes that will then form a Bamboo network.
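[Along the lines Davide describes, the peers-as-objects approach needs surprisingly little machinery. Below is a minimal sketch — not Bamboo's actual simulator, and all names are invented for illustration — of an event-driven simulator in Python that delivers method-call "messages" between peer objects after a per-link latency:]

```python
import heapq

class Simulator:
    """Toy event-driven simulator: events are (time, seq, callback, args)."""
    def __init__(self):
        self.now = 0.0
        self._seq = 0       # tie-breaker so heapq never compares callbacks
        self._queue = []

    def send(self, latency, callback, *args):
        # Schedule callback(*args) to fire `latency` ms from now.
        self._seq += 1
        heapq.heappush(self._queue, (self.now + latency, self._seq, callback, args))

    def run(self):
        while self._queue:
            self.now, _, callback, args = heapq.heappop(self._queue)
            callback(*args)

class Peer:
    def __init__(self, sim, name, latency_ms):
        self.sim, self.name, self.latency = sim, name, latency_ms

    def ping(self, sender):
        print("%6.1f ms: %s got ping from %s" % (self.sim.now, self.name, sender.name))
        self.sim.send(sender.latency, sender.pong, self)

    def pong(self, sender):
        print("%6.1f ms: %s got pong from %s" % (self.sim.now, self.name, sender.name))

sim = Simulator()
a, b = Peer(sim, "A", 40.0), Peer(sim, "B", 75.0)
sim.send(b.latency, b.ping, a)   # A pings B over B's 75 ms link
sim.run()                        # ping delivered at 75.0 ms, pong back at 115.0 ms
```

[Real simulators like Bamboo's replace the per-peer latency constant with measured all-pairs data such as the King dataset linked above, and modeling bandwidth and queuing on top of this is exactly where scaling gets hard.]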
It's a pretty simple example, but it should give you the idea. The PDOS group at MIT also has a simulator. It's at http://www.pdos.lcs.mit.edu/p2psim/. It uses threads instead of events, and C++ instead of Java. It also models only latency. Both of these simulators should be able to simulate 200-1000 nodes, depending on how much core memory your machine has. Modeling bandwidth is hard to do at scale. (It's one of the reasons NS2 doesn't scale too well.) Sean -- We are all in the gutter, but some of us are looking at the stars. -- Oscar Wilde -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050201/781b2a03/PGP.pgp From davidopp at cs.berkeley.edu Tue Feb 1 19:27:47 2005 From: davidopp at cs.berkeley.edu (David L. Oppenheimer) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] simulator for p2p In-Reply-To: Message-ID: <200502011927.LAA10387@mindbender.davido.com> > Bamboo (bamboo-dht.org) comes with a simple simulator that models > latency based on real measurements (the data is from here: > http://www.pdos.lcs.mit.edu/p2psim/kingdata/). It's a pretty simple > event-driven simulator written in Java; the nice thing about > it is that > you can use the same code under simulation that you use on the real > net. And because you can run the same code on the "real net," you can run the same code under emulation on a cluster to study bandwidth effects. David From srhea at cs.berkeley.edu Tue Feb 1 19:51:51 2005 From: srhea at cs.berkeley.edu (Sean C. Rhea) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] simulator for p2p In-Reply-To: <200502011927.LAA10387@mindbender.davido.com> References: <200502011927.LAA10387@mindbender.davido.com> Message-ID: <1f6262437486549bbe1834eb3149f490@cs.berkeley.edu> On Feb 1, 2005, at 11:27 AM, David L. 
Oppenheimer wrote: >> Bamboo (bamboo-dht.org) comes with a simple simulator that models >> latency based on real measurements (the data is from here: >> http://www.pdos.lcs.mit.edu/p2psim/kingdata/). It's a pretty simple >> event-driven simulator written in Java; the nice thing about >> it is that you can use the same code under simulation that you use on >> the real net. > > And because you can run the same code on the "real net," you can run > the > same code under emulation on a cluster to study bandwidth effects. That's a good point. We run the same code under the Bamboo simulator, on a local cluster using ModelNet (http://issg.cs.duke.edu/modelnet.html) to provide wide-area-like latency and bandwidth restrictions, and on PlanetLab (http://planet-lab.org/). Sean -- An atheist doesn't have to be someone who thinks he has a proof that there can't be a god. He only has to be someone who believes that the evidence on the God question is at a similar level to the evidence on the were-wolf question. -- John McCarthy -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050201/5ed10b82/PGP.pgp From hopper at omnifarious.org Wed Feb 2 03:17:28 2005 From: hopper at omnifarious.org (Eric M. Hopper) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] simulator for p2p In-Reply-To: <71b79fa9050201092273a5f7ba@mail.gmail.com> References: <71b79fa9050201092273a5f7ba@mail.gmail.com> Message-ID: <1107314248.25868.59.camel@bats.omnifarious.org> On Tue, 2005-02-01 at 18:22 +0100, Davide Carboni wrote: > Hi, > is there any way to simulate a p2p network using a single PC? I know > ns2 but it seems very "low-level" simulation. I'd like something to > simulate a network of peers abstracting from the serialization of > messages. 
For instance, I'd like to model peers like objects in memory > which exchange messages invoking methods each other but taking into > account variables like the bandwidth, the latency and so forth. You could probably write a replacement for SocketModule in my StreamModule framework (http://www.omnifarious.org/StrMod/) that could simulate some of the latency characteristics of a network connection. If you wrote the code to use StreamModule, you could then put in real SocketModules instead and it would work over a real network with no other changes. Have fun (if at all possible), -- Eric Hopper (hopper@omnifarious.org http://www.omnifarious.org/~hopper) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 185 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050201/aa1d07bf/attachment.pgp From sdaswani at gmail.com Wed Feb 2 06:35:12 2005 From: sdaswani at gmail.com (Susheel Daswani) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Altnet Patent Message-ID: <1cd056b905020122353cd6ad68@mail.gmail.com> Hey Folks, I'm not sure how everyone is handling the Altnet patent threat, but in my studies I've come across some salient points regarding patent infringement: "For an accused product to literally infringe a patent, EVERY element contained in the patent claim must also be present in the accused product or device. If a claimed apparatus has five parts, or 'elements', and the allegedly infringing apparatus has only four of those five, it does not literally infringe. This is true even though the defendant may have copied the four elements exactly, and regardless of how significant or insignificant the missing element is." 'Intellectual Property in the New Technological Age', 3rd Edition, page 230 This may already be known, but I thought I'd put it out there. 
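[The "all elements" rule quoted above reduces to a simple set check — a cartoon, of course; real claim construction is far subtler — but it captures both quotations at once:]

```python
def literally_infringes(claim_elements, product_features):
    """All-elements rule: literal infringement only if EVERY claim element
    appears in the product; extra product features make no difference."""
    return set(claim_elements) <= set(product_features)

# Hypothetical claim elements, invented for illustration.
claim = {"hash-id files", "lookup by hash", "redirect to peer"}

# A product missing even one element does not literally infringe...
print(literally_infringes(claim, {"hash-id files", "lookup by hash"}))           # False
# ...and adding new features cannot help a defendant escape infringement:
print(literally_infringes(claim, claim | {"caching", "ranking", "encryption"}))  # True
```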
So everyone should analyse their hashing systems to see how they compare to Altnet's patent elements. If you don't do everything they do, you can ignore their dinky letter :). I'm going to analyse their claims soon and compare them to the systems I know. Some more interesting information, which is probably obvious: "[I]t does not matter [if] a defendant has ADDED several new elements -- adding new features cannot help a defendant escape infringement." Susheel From samnospam at bcgreen.com Wed Feb 2 09:06:42 2005 From: samnospam at bcgreen.com (Stephen Samuel (leave the email alone)) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Altnet Patent (Prior art) In-Reply-To: <1cd056b905020122353cd6ad68@mail.gmail.com> References: <1cd056b905020122353cd6ad68@mail.gmail.com> Message-ID: <42009822.1030904@bcgreen.com> I'm thinking that one well-documented example of prior art for the Altnet patent might be the PGP keyserver network, which identifies and distributes PGP keys by their hash IDs. In the case of pgp.net, there are actually a couple of lengths of hash keys: short, long, and fingerprint. Susheel Daswani wrote: > Hey Folks, > I'm not sure how everyone is handling the Altnet patent threat, but in > my studies I've come across some salient points regarding patent > infringement: -- Stephen Samuel +1(604)876-0426 samnospam@bcgreen.com http://www.bcgreen.com/ Powerful committed communication. Transformation touching the jewel within each person and bringing it to light. From aloeser at cs.tu-berlin.de Thu Feb 3 10:32:18 2005 From: aloeser at cs.tu-berlin.de (Alexander =?iso-8859-1?Q?L=F6ser?=) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? Message-ID: <4201FDB1.6F607C0C@cs.tu-berlin.de> Hey all, structured overlay networks based on DHTs, such as Pastry and Chord among others, have been investigated in the past as a way to construct scalable and performance-oriented peer-to-peer networks.
However, unstructured networks, such as Gnutella or Kazaa, are still widely used among the file-sharing community. Recently, researchers proposed extensions to unstructured networks based on the small-world idea: peers dynamically create shortcuts to other peers based on their interests. Over time, peers with the same interests become direct neighbors through these shortcuts and form interest-based clusters. Hence peers no longer flood messages but instead partly route their queries via an interest-based/semantic overlay. Examples are described in [1] [2] among others. Comparing small-world and DHT approaches is a difficult task, since simulations usually differ in scenarios, data sets or simulation methodology. I'm interested in scenarios and arguments PRO small-world overlays for unstructured networks. Does anybody know of actual theoretical or practical work that compares both approaches in different scenarios (high churn, no super peers, keyword-based search, metadata-based search)? Which scenarios or arguments support small-world approaches for unstructured networks? Alex [1] Gia - Making Gnutella like P2P Systems Scalable http://berkeley.intel-research.net/sylvia/1103-chawathe.pdf http://seattle.intel-research.net/people/yatin/publications/talks/sigcomm2003-gia.ppt [2] Efficient Content Location Using Interest Based Locality in Peer-to-Peer Systems http://www.ieee-infocom.org/2003/papers/53_01.PDF -- ___________________________________________________________ Alexander Löser Technische Universitaet Berlin hp: http://cis.cs.tu-berlin.de/~aloeser/ office: +49- 30-314-25551 fax : +49- 30-314-21601 ___________________________________________________________ From zooko at zooko.com Thu Feb 3 12:43:26 2005 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Re: TCP thru' double NAT?
In-Reply-To: <4201DC25.6070508@ucla.edu> References: <409EC974.9000007@vaste.mine.nu> <409ECB89.8010408@locut.us> <20040512113330.GA2606@bitchcake.off.net> <4201DC25.6070508@ucla.edu> Message-ID: <90cc48fc65f4f090eb9e558264a311db@zooko.com> [responding on-list to off-list query] > I know this p2p-hackers message is from loooong ago, but I had a quick > question -- does the TCP relay currently implemented in Mnet use the > technique described in Section 3.5 of that document? At the end it > says that "Unfortunately, this trick may be even more fragile and > timing-sensitive than the UDP port number prediction trick described > above... Applications that require efficient, direct peer-to-peer > communication over existing NATs should use UDP." It doesn't sound > like a technique to get good results with, although you report success > -- so I was just curious. Hi Michael: The Mnet hack is low-tech. A node which is not behind NAT or firewall volunteers to be a relay server. It receives msgs from node A via TCP and sends them to node B via TCP, all in user-land. There are plenty of obvious drawbacks, but it works for Mnet's purposes. I believe Skype does something similar, when Skype's more efficient alternatives fail. Regards, Zooko --- Please excuse terse writing -- there is a baby in my arms. From Bernard.Traversat at Sun.COM Thu Feb 3 14:04:31 2005 From: Bernard.Traversat at Sun.COM (Bernard Traversat) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? In-Reply-To: <4201FDB1.6F607C0C@cs.tu-berlin.de> References: <4201FDB1.6F607C0C@cs.tu-berlin.de> Message-ID: <42022F6F.1040707@Sun.COM> You may want to look at JXTA (www.jxta.org) which provides an hybrid architecture allowing you to deploy both structured and ad hoc unstructured P2P network overlays. Cheers, B. 
Alexander L?ser wrote: > Hey all, > structured overlay networks based on DHT's, such as Pastry and Chord > among others, have been investigated in the past to construct scalable > and performance orientated peer-to-peer networks. However, unstructured > networks, such as Gnutella or Kazaa, are still widely used among the > file sharing community. Recently researchers proposed extensions to > unstructured networks networks based on the small world idea: peers > dynamically create shortcuts to other peers based on their interests. > Over a while peers with the same interests became direct neighbors > through its shortcuts and build interest based clusters. Hence peers > no longer flood messages but partly route it's queries via a interested > based/semantic overlay. Examples are described in [1] [2] among > others. > > Comparing small world and DHT approaches is a difficult task, since > simulations usually differ in scenarios, data sets or simulation > methodology. I'm interested in scenarios and arguments PRO small > world overlays for unstructured networks. Does anybody now actual > theoretic or practical work that compares both approaches in different > scenarios (high churn, no super peers, key word based search, meta data > based search)? Which scenarios or arguments support small world > approaches for unstructured networks? 
> > Alex > > > > > [1] Gia - Making Gnutella like P2P Systems Scalable > http://berkeley.intel-research.net/sylvia/1103-chawathe.pdf > http://seattle.intel-research.net/people/yatin/publications/talks/sigcomm2003-gia.ppt > > [2] Efficient Content Location Using Interest Based Locality in > Peer-to-Peer Systems > http://www.ieee-infocom.org/2003/papers/53_01.PDF > -- > ___________________________________________________________ > > Alexander L?ser > Technische Universitaet Berlin > hp: http://cis.cs.tu-berlin.de/~aloeser/ > office: +49- 30-314-25551 > fax : +49- 30-314-21601 > ___________________________________________________________ > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From gbildson at limepeer.com Thu Feb 3 15:26:25 2005 From: gbildson at limepeer.com (Greg Bildson) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? In-Reply-To: <4201FDB1.6F607C0C@cs.tu-berlin.de> Message-ID: I'd just like to point out that Gnutella does not use pure flooding anymore and you are unlikely to find P2P networks that don't have something akin to supernodes. Gnutella uses bloom filter based keyword index replication and dynamic querying (selectively sending out queries until a result limit is reached) to reduce the overhead of flooding for popular queries and to route all queries on the last hop. Thanks -greg > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Alexander L?ser > Sent: Thursday, February 03, 2005 5:32 AM > To: p2p-hackers@zgp.org > Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? 
> > > Hey all, > structured overlay networks based on DHT's, such as Pastry and Chord > among others, have been investigated in the past to construct scalable > and performance orientated peer-to-peer networks. However, unstructured > networks, such as Gnutella or Kazaa, are still widely used among the > file sharing community. Recently researchers proposed extensions to > unstructured networks networks based on the small world idea: peers > dynamically create shortcuts to other peers based on their interests. > Over a while peers with the same interests became direct neighbors > through its shortcuts and build interest based clusters. Hence peers > no longer flood messages but partly route it's queries via a interested > based/semantic overlay. Examples are described in [1] [2] among > others. > > Comparing small world and DHT approaches is a difficult task, since > simulations usually differ in scenarios, data sets or simulation > methodology. I'm interested in scenarios and arguments PRO small > world overlays for unstructured networks. Does anybody now actual > theoretic or practical work that compares both approaches in different > scenarios (high churn, no super peers, key word based search, meta data > based search)? Which scenarios or arguments support small world > approaches for unstructured networks? 
> > Alex > > > > > [1] Gia - Making Gnutella like P2P Systems Scalable > http://berkeley.intel-research.net/sylvia/1103-chawathe.pdf > http://seattle.intel-research.net/people/yatin/publications/talks/ > sigcomm2003-gia.ppt > > [2] Efficient Content Location Using Interest Based Locality in > Peer-to-Peer Systems > http://www.ieee-infocom.org/2003/papers/53_01.PDF > -- > ___________________________________________________________ > > Alexander L?ser > Technische Universitaet Berlin > hp: http://cis.cs.tu-berlin.de/~aloeser/ > office: +49- 30-314-25551 > fax : +49- 30-314-21601 > ___________________________________________________________ > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From gwendal.simon at francetelecom.com Thu Feb 3 15:49:10 2005 From: gwendal.simon at francetelecom.com (SIMON Gwendal RD-MAPS-ISS) Date: Sat Dec 9 22:12:50 2006 Subject: TR: [p2p-hackers] Paradigma Question: DHT's or Small World? Message-ID: Hi, Here are two assumptions that advocate for small-world. The first one, related to the human language, has been partially established by several studies [1,2] since the pioneering work of [3]. The graph of word interactions is constructed by linking two words when they co-occur in a sentence (a fortiori in a file). The study of the properties of these graphs shows they exhibit the small world effect and a scale-free distribution of degrees. The second assumption follows the observations you cite and some others [4,5,6]. The data-sharing graph is constructed by linking two users when they share a same file. Observations on several real traces show that this graph exhibits also the small-world effect and the scale-free distribution of degrees. 
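[The data-sharing graph Gwendal describes is easy to construct from a trace. Here is a toy illustration in Python — the trace is invented; real studies use crawls of Gnutella/Kazaa or web-cache logs — linking two users whenever they share the same file:]

```python
from collections import defaultdict
from itertools import combinations

# Invented trace: user -> set of files she stores.
library = {
    "ana":   {"f1", "f2", "f3"},
    "bob":   {"f2", "f4"},
    "carla": {"f3", "f4", "f5"},
    "dan":   {"f6"},
}

# Data-sharing graph: an edge links two users who share at least one file.
edges = {(u, v) for u, v in combinations(sorted(library), 2)
         if library[u] & library[v]}

degree = defaultdict(int)
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

print(sorted(edges))   # [('ana', 'bob'), ('ana', 'carla'), ('bob', 'carla')]
print(dict(degree))    # dan shares nothing, so he has no edges at all
```

[On real traces, the degree distribution of this graph comes out heavy-tailed (scale-free) and the graph shows high clustering with short paths (small-world), which is what motivates interest-based shortcuts in the first place.]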
Besides, it is known that the lexicon of a human contains a few thousand words. This lexicon and the words contained in the documents which have been produced and downloaded by a user define her "semantic profile". Through the preceding assumptions, we naturally infer that the graph generated by linking users whose semantic profiles overlap is also small-world and scale-free. That is, if we consider that users emit requests on keywords chosen within their profile, we can expect that almost *all* files of interest to a user are stored by a small set of "friends". Moreover, these "friends" are already known to the user thanks to previous successful queries. Therefore, it is possible to limit the search to a subspace of the information space without hurting the quality of responses. On the contrary, it is probable that these responses are more relevant from the requester's point of view. For instance, a fan of "Fiona Apple" will discover MP3s of Fiona Apple and not information on Apple Inc. or webpages on "apple pie" cooking. And a European querying for information on "football" will not receive pages on the NFL. By the way, another related concern is the publication of a file. In Gnutella-like systems, peers just have to put their files in their "shared directory" in order to make them available to any node in the system. On the contrary, the task of publication in a DHT-based overlay requires reaching as many peers as there are words describing the published document. Indeed, the published file has to be known by the peers that are responsible for all the *relevant* words of the document. This is clearly an issue for keyword-based search in DHTs. If you want to design a search engine indexing *all* words in the document, this task becomes unrealistic. -------------------- Gwendal Simon France Telecom R&D http://solipsis.netofpeers.net [1] D. Watts. Six Degrees. [2] A. Barabasi. Linked: the New Science of Networks. [3] R. Ferrer i Cancho and R. Sole.
The Small World of Human Language. [4] J. Keller, D. Stern and F. Dang Ngoc. MAAY: A Self-Adaptive Peer Network for Efficient Document Search. [5] V. Cholvi, P. Felber, and E.W. Biersack. Efficient Search in Unstructured Peer-to-Peer Networks. [6] Adriana Iamnitchi, Matei Ripeanu and Ian Foster, Small-World File-Sharing Communities. > -----Message d'origine----- > De : p2p-hackers-bounces@zgp.org > [mailto:p2p-hackers-bounces@zgp.org] De la part de Alexander L?ser > Envoy? : jeudi 3 f?vrier 2005 11:32 ? : p2p-hackers@zgp.org Objet : > [p2p-hackers] Paradigma Question: DHT's or Small World? > > Hey all, > structured overlay networks based on DHT's, such as Pastry and Chord > among others, have been investigated in the past to construct scalable > and performance orientated peer-to-peer networks. However, > unstructured networks, such as Gnutella or Kazaa, are still widely > used among the file sharing community. Recently researchers proposed > extensions to unstructured networks networks based on the small world > idea: peers dynamically create shortcuts to other peers based on their > interests. > Over a while peers with the same interests became direct neighbors > through its shortcuts and build interest based clusters. Hence peers > no longer flood messages but partly route it's queries via a > interested based/semantic overlay. Examples are described in [1] [2] > among others. > > Comparing small world and DHT approaches is a difficult task, since > simulations usually differ in scenarios, data sets or simulation > methodology. I'm interested in scenarios and arguments PRO small > world overlays for unstructured networks. Does anybody now actual > theoretic or practical work that compares both approaches in different > scenarios (high churn, no super peers, key word based search, meta > data based search)? Which scenarios or arguments support small world > approaches for unstructured networks? 
> > Alex > > > > > [1] Gia - Making Gnutella like P2P Systems Scalable > http://berkeley.intel-research.net/sylvia/1103-chawathe.pdf > http://seattle.intel-research.net/people/yatin/publications/ta > lks/sigcomm2003-gia.ppt > > [2] Efficient Content Location Using Interest Based Locality in > Peer-to-Peer Systems http://www.ieee-infocom.org/2003/papers/53_01.PDF > -- > ___________________________________________________________ > > Alexander L?ser > Technische Universitaet Berlin > hp: http://cis.cs.tu-berlin.de/~aloeser/ > office: +49- 30-314-25551 > fax : +49- 30-314-21601 > ___________________________________________________________ > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From bryan.turner at pobox.com Thu Feb 3 16:35:02 2005 From: bryan.turner at pobox.com (Bryan Turner) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? In-Reply-To: Message-ID: <200502031635.j13GZ3jZ020887@rtp-core-1.cisco.com> Regarding Small World vs DHT; Pedantically, there is no difference.. you can map a DHT to Small World by viewing the domain of the DHT (it's keyspace) to be the semantic information sought by the peers. Thus, peers which seek nearby points in the keyspace are linked by Small World links, while those which seek distant points are only occasionally referenced. The difference that is being argued is what the USER is interested in versus what the PEER is interested in. In a DHT, the peer is required to be interested in keys which conform to some dynamic metric based on the specific model of DHT being used, while there is no model for the user's interests. 
I'm not arguing for or against Small World - simply that the models are equally expressive and thus equally capable of implementing each other's features. Just something to keep in mind. And to keep things on track: Gwendal, I like your explanation of a user's semantic profile, it's very crisp and approachable. It's been difficult to explain to colleagues in the past, next time I'll use your words. ;) In the following by "Gnutella", I mean "Gnutella-like systems". Please do not be offended by my mis-representation of the specific features supported by Gnutella. I see publishing between the two mediums in a different light. While it seems simpler to publish under Gnutella, there are tradeoffs that you haven't pointed out. For instance, single-word queries and exact-file searches are significantly more difficult under the Gnutella model exactly because your query must reach all your 'friends' - and return from all of them! In effect you get worst-case performance for every query. DHTs achieve best-case performance for this type of query, but are burdened by a more complex publishing process. I would also like to argue that full-text indexing on all documents is equally difficult for *both* models. My reasoning follows from the processing requirements (in any model) to index/query a full document: 1. Process a document to produce an index. 2. Store the index for future retrieval. 3. Provide query capability to a client. 4. Discover relevant indexes to a query. 5. Search the indexes for query terms. 6. Return results It should be clear that #1, #3, and #6 are essentially the same between the two models, as some entity must perform the same amount of work for these steps regardless of how it is handled "under the covers". #2 differs only in the location where the index is stored - locally or distributed. And in the amount of work done (Gnutella;less, DHT;more). #4 differs again in the location and work, but here I argue the amount of work has reversed from #2. 
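[A back-of-envelope calculation makes the asymmetry in #2 and #4 concrete. All numbers below are invented for illustration; the per-keyword publish cost and the logarithmic lookup cost are the standard stylized assumptions, not measurements of any real system:]

```python
import math

# Invented workload parameters.
peers = 10_000             # network size
words_per_doc = 120        # distinct keywords indexed per published document
reached_fraction = 0.3     # fraction of peers a flooded/walked query reaches
hops = math.log2(peers)    # stylized cost of one DHT lookup (~log N hops)

# Publishing one document:
gnutella_publish = 0                       # drop the file in the shared directory
dht_publish = words_per_doc * hops         # one lookup per distinct keyword

# One keyword query:
gnutella_query = peers * reached_fraction  # every reached peer checks its local index
dht_query = hops                           # route straight to the responsible peer

print("publish cost: gnutella=%d  dht=%d" % (gnutella_publish, dht_publish))
print("query cost:   gnutella=%d  dht=%d" % (gnutella_query, dht_query))
```

[The point, as in the thread, is that the work moves rather than disappears: a DHT pays at publish time, per keyword, while a Gnutella-like system pays at query time, per reached peer; which total is larger depends on the ratio of publishes to queries in the workload.]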
Gnutella requires *many* peers to perform complex queries against their complex indexes, which constitutes a great deal of work. OtoH, a DHT implicitly knows which peers to address, and which queries to perform (in fact, the very act of addressing a peer is effectively performing the query). #5 again differs, although I argue that the total amount of work performed is essentially the same. Given my arguments above, the total work performed by the "system" to achieve a query is roughly equivalent between the two models. There isn't any one area in which one of the systems is burdened by an order of magnitude over the other. --Bryan bryan.turner@pobox.com -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of SIMON Gwendal RD-MAPS-ISS Sent: Thursday, February 03, 2005 10:49 AM To: Peer-to-peer development. Subject: TR: [p2p-hackers] Paradigma Question: DHT's or Small World? Hi, Here are two assumptions that advocate for small-world. The first one, related to the human language, has been partially established by several studies [1,2] since the pioneering work of [3]. The graph of word interactions is constructed by linking two words when they co-occur in a sentence (a fortiori in a file). The study of the properties of these graphs shows they exhibit the small world effect and a scale-free distribution of degrees. The second assumption follows the observations you cite and some others [4,5,6]. The data-sharing graph is constructed by linking two users when they share a same file. Observations on several real traces show that this graph exhibits also the small-world effect and the scale-free distribution of degrees. Besides, it is known that the lexicon of an human contains few thousands of words. This lexicon and the words contained in the documents which have been produced and dowloaded by an user define her "semantic profile". 
Through the preceeding assumptions, we naturally infer that the graph generated by linking users when their semantic profile overlap is also small-world and scale-free. That is, if we consider that users emit requests on keywords chosen within their profile, we can expect that almost *all* files of interest for an user are stored by a small set of "friends". Moreover, these "friends" are already known by the user thanks to previous successfull queries. Therefore, it is possible to limit the search to a subspace of the information space without preventing the quality of responses. On the contrary, it is probable that these responses are more relevant for the requester point of view. For instance, a fan of "Fiona Apple" will discover mp3 of Fiona Apple and not informations on Apple Inc. or webpages for "apple pie" cooking. Or, an European querying informations on "football" will not receive pages on NFL. By the way, another related concern is the publication of a file. In a gnutella-like systems, peers just have to put their files in their "shared directory" in order to make them available by any node in the system. On the contrary, the task of publication in a DHT-based overlay requires to reach as many peers as the number of words describing the published document. Indeed, the published file has to be known by the peers that are responsible of all the *relevant* words of the document. This is clearly an issue for keyword-based search in DHTs. If you want to design a search engine indexing *all* words in the document, this task becomes unrealistic. -------------------- Gwendal Simon France Telecom R&D http://solipsis.netofpeers.net [1] D. Watts. Six Degrees. [2] A. Barabasi. Linked: the New Science of Networks. [3] R. Ferrer i Canco and R. Sole. The Small World of Human Language. [4] J. Keller, D. Stern and F. Dang Ngoc. MAAY: A Self-Adaptive Peer Network for Efficient Document Search. [5] V. Cholvi, P. Felber, and E.W. Biersack. 
Efficient Search in Unstructured Peer-to-Peer Networks. [6] Adriana Iamnitchi, Matei Ripeanu and Ian Foster, Small-World File-Sharing Communities. From Serguei.Osokine at efi.com Thu Feb 3 18:12:31 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? Message-ID: <4A60C83D027E224BAA4550FB1A2B120E0DC32E@fcexmb04.efi.internal> On Thursday, February 03, 2005 Bryan Turner wrote: > Given my arguments above, the total work performed by the "system" > to achieve a query is roughly equivalent between the two models. Uh, looks to me that given your arguments above the models are logically equivalent, which says nothing about whether the work is the same or not. In fact, I can easily imagine the situations where the load would be orders of magnitude different for Gnutella and DHTs. Best wishes - S.Osokine. 3 Feb 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Bryan Turner Sent: Thursday, February 03, 2005 8:35 AM To: 'Peer-to-peer development.' Subject: RE: [p2p-hackers] Paradigma Question: DHT's or Small World? Regarding Small World vs DHT; Pedantically, there is no difference.. you can map a DHT to Small World by viewing the domain of the DHT (it's keyspace) to be the semantic information sought by the peers. Thus, peers which seek nearby points in the keyspace are linked by Small World links, while those which seek distant points are only occasionally referenced. The difference that is being argued is what the USER is interested in versus what the PEER is interested in. In a DHT, the peer is required to be interested in keys which conform to some dynamic metric based on the specific model of DHT being used, while there is no model for the user's interests. I'm not arguing for or against Small World - simply that the models are equally expressive and thus equally capable of implementing each other's features. 
Just something to keep in mind. And to keep things on track: Gwendal, I like your explanation of a user's semantic profile; it's very crisp and approachable. It's been difficult to explain to colleagues in the past; next time I'll use your words. ;) In the following, by "Gnutella" I mean "Gnutella-like systems". Please do not be offended by my misrepresentation of the specific features supported by Gnutella. I see publishing between the two models in a different light. While it seems simpler to publish under Gnutella, there are tradeoffs that you haven't pointed out. For instance, single-word queries and exact-file searches are significantly more difficult under the Gnutella model exactly because your query must reach all your 'friends' - and return from all of them! In effect you get worst-case performance for every query. DHTs achieve best-case performance for this type of query, but are burdened by a more complex publishing process. I would also like to argue that full-text indexing on all documents is equally difficult for *both* models. My reasoning follows from the processing requirements (in any model) to index/query a full document:

1. Process a document to produce an index.
2. Store the index for future retrieval.
3. Provide query capability to a client.
4. Discover the indexes relevant to a query.
5. Search the indexes for query terms.
6. Return results.

It should be clear that #1, #3, and #6 are essentially the same between the two models, as some entity must perform the same amount of work for these steps regardless of how it is handled "under the covers". #2 differs only in the location where the index is stored - locally or distributed - and in the amount of work done (Gnutella: less; DHT: more). #4 differs again in location and work, but here I argue the amount of work has reversed from #2. Gnutella requires *many* peers to perform complex queries against their complex indexes, which constitutes a great deal of work.
OtoH, a DHT implicitly knows which peers to address, and which queries to perform (in fact, the very act of addressing a peer is effectively performing the query). #5 again differs, although I argue that the total amount of work performed is essentially the same. Given my arguments above, the total work performed by the "system" to achieve a query is roughly equivalent between the two models. There isn't any one area in which one of the systems is burdened by an order of magnitude over the other. --Bryan bryan.turner@pobox.com -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of SIMON Gwendal RD-MAPS-ISS Sent: Thursday, February 03, 2005 10:49 AM To: Peer-to-peer development. Subject: TR: [p2p-hackers] Paradigma Question: DHT's or Small World? Hi, Here are two assumptions that advocate for small-world. The first one, related to human language, has been partially established by several studies [1,2] since the pioneering work of [3]. The graph of word interactions is constructed by linking two words when they co-occur in a sentence (a fortiori in a file). The study of the properties of these graphs shows they exhibit the small-world effect and a scale-free distribution of degrees. The second assumption follows the observations you cite and some others [4,5,6]. The data-sharing graph is constructed by linking two users when they share the same file. Observations on several real traces show that this graph also exhibits the small-world effect and the scale-free distribution of degrees. Besides, it is known that the lexicon of a human contains a few thousand words. This lexicon and the words contained in the documents which have been produced and downloaded by a user define her "semantic profile". From the preceding assumptions, we naturally infer that the graph generated by linking users whose semantic profiles overlap is also small-world and scale-free. That is, if we consider that users emit requests on keywords chosen from within their profile, we can expect that almost *all* files of interest to a user are stored by a small set of "friends". Moreover, these "friends" are already known to the user thanks to previous successful queries. Therefore, it is possible to limit the search to a subspace of the information space without degrading the quality of responses. On the contrary, these responses are probably more relevant from the requester's point of view. For instance, a fan of "Fiona Apple" will discover MP3s of Fiona Apple, not information about Apple Inc. or web pages about "apple pie" cooking. Likewise, a European querying for "football" will not receive pages about the NFL. Another related concern is the publication of a file. In Gnutella-like systems, peers just have to put their files in their "shared directory" to make them available to any node in the system. In a DHT-based overlay, by contrast, publication requires reaching as many peers as there are words describing the published document. Indeed, the published file has to be known by the peers that are responsible for all the *relevant* words of the document. This is clearly an issue for keyword-based search in DHTs; if you want to design a search engine indexing *all* words in a document, the task becomes unrealistic. -------------------- Gwendal Simon France Telecom R&D http://solipsis.netofpeers.net [1] D. Watts. Six Degrees. [2] A. Barabasi. Linked: The New Science of Networks. [3] R. Ferrer i Cancho and R. Sole. The Small World of Human Language. [4] J. Keller, D. Stern and F. Dang Ngoc. MAAY: A Self-Adaptive Peer Network for Efficient Document Search. [5] V. Cholvi, P. Felber, and E.W. Biersack. Efficient Search in Unstructured Peer-to-Peer Networks. [6] Adriana Iamnitchi, Matei Ripeanu and Ian Foster, Small-World File-Sharing Communities.
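The two graph constructions described in the message above (linking words that co-occur in a sentence, and linking users whose semantic profiles overlap) can be sketched in a few lines. This is an illustrative toy with made-up data, not code from any of the cited studies:

```python
# Illustrative sketch of the two graph constructions described above.
# Toy data only; the real studies use large corpora and file-sharing traces.
from itertools import combinations

def cooccurrence_graph(sentences):
    """Link two words whenever they co-occur in a sentence."""
    edges = set()
    for s in sentences:
        words = set(s.lower().split())
        edges |= {frozenset(p) for p in combinations(words, 2)}
    return edges

def profile_overlap_graph(profiles, min_overlap=1):
    """Link two users whenever their semantic profiles share
    at least `min_overlap` words."""
    edges = set()
    for (u, pu), (v, pv) in combinations(profiles.items(), 2):
        if len(pu & pv) >= min_overlap:
            edges.add(frozenset((u, v)))
    return edges

sentences = ["fiona apple sings", "apple pie recipe"]
g = cooccurrence_graph(sentences)
print(frozenset({"fiona", "apple"}) in g)  # True

profiles = {"alice": {"fiona", "apple"}, "bob": {"apple", "pie"}, "carol": {"football"}}
print(profile_overlap_graph(profiles))  # only alice and bob are linked (via "apple")
```

On real traces the interesting part is measuring the degree distribution and clustering of these graphs, which is where the small-world and scale-free claims come from; this sketch only shows the constructions themselves.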
_______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From rita at comet.columbia.edu Thu Feb 3 18:59:39 2005 From: rita at comet.columbia.edu (Rita H. Wouhaybi) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? Message-ID: <008f01c50a22$837a66d0$9e433b80@comet.columbia.edu> Alexander Löser wrote: > Hey all, > structured overlay networks based on DHTs, such as Pastry and Chord > among others, have been investigated in the past to construct scalable > and performance-oriented peer-to-peer networks. However, unstructured > networks, such as Gnutella or Kazaa, are still widely used among the > file-sharing community. Recently, researchers have proposed extensions to > unstructured networks based on the small-world idea: peers > dynamically create shortcuts to other peers based on their interests. > Over time, peers with the same interests become direct neighbors > through these shortcuts and build interest-based clusters. Hence peers > no longer flood messages but partly route their queries via an > interest-based/semantic overlay. Examples are described in [1] [2] among > others. > > Comparing small-world and DHT approaches is a difficult task, since > simulations usually differ in scenarios, data sets or simulation > methodology. I'm interested in scenarios and arguments PRO small-world > overlays for unstructured networks. Does anybody know of actual > theoretical or practical work that compares both approaches in different > scenarios (high churn, no super-peers, keyword-based search, meta-data-based > search)? Which scenarios or arguments support small-world > approaches for unstructured networks?
> > Alex > > > > > [1] Gia - Making Gnutella-like P2P Systems Scalable > http://berkeley.intel-research.net/sylvia/1103-chawathe.pdf > http://seattle.intel-research.net/people/yatin/publications/talks/sigcomm2003-gia.ppt > > [2] Efficient Content Location Using Interest-Based Locality in > Peer-to-Peer Systems > http://www.ieee-infocom.org/2003/papers/53_01.PDF > -- > ___________________________________________________________ > > Alexander Löser > Technische Universitaet Berlin > hp: http://cis.cs.tu-berlin.de/~aloeser/ > office: +49-30-314-25551 > fax: +49-30-314-21601 > ___________________________________________________________ Interesting discussion, Alex. From the practical and system challenges that have faced researchers working on DHTs (the long time for the network to become stable, updates and maintenance as nodes join and leave, the high cost of messaging when adding an object to the network, ...), it has become the norm to think about the application when deciding whether to use structured (DHT) or unstructured (Gnutella-like) p2p topologies. That is probably one of the reasons why people have not compared both structures in an analysis similar to what you are asking for. Thus, small-world and power-law networks have emerged to bridge the gap between a totally random network and a "rigid" DHT. Note that super-peers in Kazaa and Gnutella do actually help the network become more like a small world. We have also worked in this area and created a power-law P2P network that might interest you: - Rita H. Wouhaybi and Andrew T. Campbell, "Phenix: Supporting Resilient Low-Diameter Peer-to-Peer Topologies", IEEE INFOCOM'2004, Hong Kong, China, March 7-11, 2004. Rita H. Wouhaybi rita@comet.columbia.edu http://comet.columbia.edu/~rita/ -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://zgp.org/pipermail/p2p-hackers/attachments/20050203/c06a78ef/attachment.html From Serguei.Osokine at efi.com Thu Feb 3 19:53:22 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? Message-ID: <4A60C83D027E224BAA4550FB1A2B120E0DC32F@fcexmb04.efi.internal> On Thursday, February 03, 2005 Rita H. Wouhaybi wrote: > Note that super-peers in Kazaa and Gnutella do actually help > the network become more like a small-world. Not necessarily. Or at least, to a much smaller extent than intuitive thinking would suggest. Superpeers do make the network smaller in terms of node numbers, but at the same time they increase the traffic on the intra-ultrapeer links in exactly the same proportion, making it more difficult to route anything to the remote nodes. So the actual query reach (the degree of 'small-worldness', so to speak) is improved only due to the better-than-average super-peer bandwidth: http://www.grouter.net/gnutella/search.htm#PlainSuperpeerNetwork http://www.grouter.net/gnutella/search.htm#Eq25 Basically, if you cannot reach all hosts in a 'flat' network (without super-peers), chances are pretty high that the introduction of super-peers won't change this situation unless the original flat network was already pretty close to being a 'small world' (fully reachable) one. The search reach in super-peered nets like Kazaa really is better, but it comes, first, from higher-than-average superpeer bandwidth and, second, from the proactive index replication that naturally happens when a leaf connects to several superpeers at once (three or so in the Kazaa case, I believe).
This one tends to be viewed as just something done to improve a connection reliability through redundancy, whereas in fact it also improves the query reach in direct proportion to the number of redundant links: http://www.grouter.net/gnutella/search.htm#RedundantSuperpeerClusters I think this effect was first noted by the Stanford P2P research group, which named it 'k-redundancy': http://www-db.stanford.edu/~byang/pubs/superpeer.pdf Best wishes - S.Osokine. 3 Feb 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Rita H. Wouhaybi Sent: Thursday, February 03, 2005 11:00 AM To: p2p-hackers@zgp.org; aloeser@cs.tu-berlin.de Subject: Re:[p2p-hackers] Paradigma Question: DHT's or Small World? Alexander L?ser wrote: > Hey all, > structured overlay networks based on DHT's, such as Pastry and Chord > among others, have been investigated in the past to construct scalable > and performance orientated peer-to-peer networks. However, unstructured > networks, such as Gnutella or Kazaa, are still widely used among the > file sharing community. Recently researchers proposed extensions to > unstructured networks networks based on the small world idea: peers > dynamically create shortcuts to other peers based on their interests. > Over a while peers with the same interests became direct neighbors > through its shortcuts and build interest based clusters. Hence peers > no longer flood messages but partly route it's queries via a interested > based/semantic overlay. Examples are described in [1] [2] among > others. > > Comparing small world and DHT approaches is a difficult task, since > simulations usually differ in scenarios, data sets or simulation > methodology. I'm interested in scenarios and arguments PRO small > world overlays for unstructured networks. 
Does anybody now actual > theoretic or practical work that compares both approaches in different > scenarios (high churn, no super peers, key word based search, meta data > based search)? Which scenarios or arguments support small world > approaches for unstructured networks? > > Alex > > > > > [1] Gia - Making Gnutella like P2P Systems Scalable > http://berkeley.intel-research.net/sylvia/1103-chawathe.pdf > http://seattle.intel-research.net/people/yatin/publications/talks/sigcomm2003 -gia.ppt > > [2] Efficient Content Location Using Interest Based Locality in > Peer-to-Peer Systems > http://www.ieee-infocom.org/2003/papers/53_01.PDF > -- > ___________________________________________________________ > > Alexander L?ser > Technische Universitaet Berlin > hp: http://cis.cs.tu-berlin.de/~aloeser/ > office: +49- 30-314-25551 > fax : +49- 30-314-21601 > ___________________________________________________________ > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers Interesting discussion Alex. >From the practical and system challenges that faced researchers working on DHTs (long time for the network to become stable, updates and maintenance for nodes join and leave, high cost of messaging when adding an object to the network, ..), it has become the norm to think about the application when trying to decide to use structured (DHTs) or unstructured (gnutella-like) p2p topologies. That is probably one of the reasons why people did not compare both structures in an analysis similar to what you are asking for. Thus, small world and power-law have emerged to bridge the gap between a total random network and a "rigid" DHT. Note that super-peers in Kazaa and Gnutella do actually help the network become more like a small-world. We also have worked in this area and created a power-law distribution P2P network that might interest you: - Rita H. Wouhaybi, and Andrew T. 
Campbell, "Phenix: Supporting Resilient Low-Diameter Peer-to-Peer Topologies", IEEE INFOCOM'2004, Hong Kong, China, March 7-11, 2004. Rita H. Wouhaybi rita@comet.columbia.edu http://comet.columbia.edu/~rita/ From aloeser at cs.tu-berlin.de Fri Feb 4 12:57:50 2005 From: aloeser at cs.tu-berlin.de (Alexander Löser) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? References: <4201FDB1.6F607C0C@cs.tu-berlin.de> Message-ID: <4203714E.E1EA93CD@cs.tu-berlin.de> Thank you very much for sharing this discussion! You gave me very valuable comments on the design question of choosing either small world or DHTs. If I understood your arguments right, small world should be the preferred paradigm if the system design requires the following (hard or soft) features: (Hard features) Churn: The system should support a high churn rate of peers/high churn rate of objects. By the way, since these hypotheses are intuitive but unproved, does anybody know of theoretical or experimental work that proved them? Furthermore, maybe this question is a bit naive, but what exactly is high? Complex queries: The system allows a user to pose complex queries, e.g. several keywords or, for meta-data-annotated documents, more than one (semantic) predicate per query. (Soft features) Profile locality: One peer maps to one user. Probably a user is not interested in or willing to transfer their local profile to a global index but likes to keep it locally, e.g. for anonymity or to delete entries. Popularity: If most searches go for popular objects, small world may be the first choice. For example, this is the case for most music-sharing networks. Community search: Depending on the shortcut-creation strategies between friends in a small-world network, the small-world paradigm supports the data-sharing graph between people with similar interests. By the way: does it also support similar semantics?
What kind of application scenario suits these requirements? I think of a networked desktop search application. Similar to Gnutella, some people publish some of their documents; most don't. Some of them are annotated with meta data, probably with the same vocabulary or within the same ontology; some are not. Users pose keyword queries, as in a single desktop search engine. Queries match either the document's filename, folder or (if any) its meta data. Would the small-world paradigm support such a system? Alex -- ___________________________________________________________ Alexander Löser Technische Universität Berlin http://cis.cs.tu-berlin.de/~aloeser/ office : +49- 30-314-25551 fax : +49- 30-314-21601 skype : hallo.alex ___________________________________________________________ From gwendal.simon at francetelecom.com Fri Feb 4 13:26:21 2005 From: gwendal.simon at francetelecom.com (SIMON Gwendal RD-MAPS-ISS) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? Message-ID: Hi, > What kind of application scenario suits to this requirements? > I think of a networked desktop search application. Similar to > Gnutella, some people publish some of its documents, most > don't. Some of them are annotated by meta data, probably with > the same vocabulary or within the same ontology, some not. > Users pose keyword queries, similar in a single desktop > search engine. Queries either match the documents filename, > folder or (if any) documents meta data. Why do you want to restrict search to meta-data? Google doesn't! It must be possible to perform full-text search... Besides, how would one define a common world ontology that could fit all future needs?
-------------------- Gwendal Simon France Telecom R&D http://solipsis.netofpeers.net From aloeser at cs.tu-berlin.de Fri Feb 4 13:43:11 2005 From: aloeser at cs.tu-berlin.de (Alexander =?iso-8859-1?Q?L=F6ser?=) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? References: Message-ID: <42037BEF.8E228624@cs.tu-berlin.de> SIMON Gwendal RD-MAPS-ISS wrote: > Hi, > > > What kind of application scenario suits to this requirements? > > I think of a networked desktop search application. Similar to > > Gnutella, some people publish some of its documents, most > > don't. Some of them are annotated by meta data, probably with > > the same vocabulary or within the same ontology, some not. > > Users pose keyword queries, similar in a single desktop > > search engine. Queries either match the documents filename, > > folder or (if any) documents meta data. > > Why do you want to restrict search to meta-data ? Google don't ! It must > be possible to perform full-text search... I assume a system where its possible to search full text. Probably for a first try, within the filename and directory structure only, later in the document itself. > > Besides, how to define a world common ontology that could fit all future > needs ? However, if the document contains any valuable meta data, the system should consider this information as well. I think of documents classified by an enterprise wide topic hierarchy or research docs classified within the ACM topic hierarchy or the documents within the google/dmoz project. Or possible doctors that exchange documents classified within a medical taxonomy. Please correct me, if my assumptions are wrong. 
Cheers Alex -- ___________________________________________________________ Alexander L?ser Technische Universit?t Berlin http://cis.cs.tu-berlin.de/~aloeser/ office : +49- 30-314-25551 fax : +49- 30-314-21601 skype : hallo.alex ___________________________________________________________ From hopper at omnifarious.org Fri Feb 4 15:03:42 2005 From: hopper at omnifarious.org (Eric M. Hopper) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? In-Reply-To: <42037BEF.8E228624@cs.tu-berlin.de> References: <42037BEF.8E228624@cs.tu-berlin.de> Message-ID: <1107529422.6165.27.camel@bats.omnifarious.org> On Fri, 2005-02-04 at 14:43 +0100, Alexander L?ser wrote: > However, if the document contains any valuable meta data, the system should > consider this information as well. I think of documents classified by an > enterprise wide topic hierarchy or research docs classified within the ACM > topic hierarchy or the documents within the google/dmoz project. Or possible > doctors that exchange documents classified within a medical taxonomy. > > Please correct me, if my assumptions are wrong. Well, one thing any search system has to deal with is being gamed. Meta-data is too easy to game. It's data for the computer, not for people, so it can be used to trick computers into giving people information they're not actually interested in. Computers, as much as possible, have to base their searching on what people will actually look at. Now, your idea of trying to automatically get people with similar interests to group together might provide a way for computers to take advantage of knowledge of those relationships to let people sort of vet documents for one another. And that could be an interesting approach. I think one of the primary problems there is the same one google has to deal with. Party crashers. People who try to become part of a community largely in order to sow disinformation, usually for commercial gain. 
Have fun (if at all possible), -- The best we can hope for concerning the people at large is that they be properly armed. -- Alexander Hamilton -- Eric Hopper (hopper@omnifarious.org http://www.omnifarious.org/~hopper) -- -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 185 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050204/c3bf8deb/attachment.pgp From bryan.turner at pobox.com Fri Feb 4 18:50:25 2005 From: bryan.turner at pobox.com (Bryan Turner) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Paradigma Question: DHT's or Small World? In-Reply-To: <4203714E.E1EA93CD@cs.tu-berlin.de> Message-ID: <200502041850.j14IoPjZ016336@rtp-core-1.cisco.com> Alex, > Churn: The system should support a high churn rate of peers/high churn rate of > objects. By the way, since these hypotheses are intuitive but unproved, does > anybody know a theoretical or experimental work, that proofed them? Furthermore, > maybe this question is a bit naive, but what exactly is high? See [1,2] for some discussion of churn and the "half-life" of a network. These models were built from Chord, but the results are useful to both systems. To answer your question more directly: "high" means close to the half-life of your network. The half-life is the time it takes for half the nodes in the network to cycle off it. If your churn rate is higher than this, you effectively cannot keep the network together, as it is outpacing your stabilization protocol. If your churn is lower, then you get a stable network. So a "high" churn rate is just under your network's half-life. > Profile locality: One peer maps to one user. Probably a user is not interested > or willing to transfer it's local profile to a global index but likes to keep > it locally, e.g. for anonymity or to delete entries.
Depending on system design, anonymity may be improved if a 'peer' is actually a darknet of users. This provides k-anonymity within the group. See [3,4] for such protocols. Probably not relevant to your request, but it's fascinating research anyway... > Popularity: If most of the searches go for popular objects, small world may > be the first choice. For example, this is the case for most music sharing networks. The greater practical concern for popularity is resolving "flash crowds" gracefully in the system. Neither the DHT nor the Small World model defines the behavior for this case. You should review some of the various solutions to this problem (too many to reference, but see [5], Section 3, and [6], Section III, for an example). > What kind of application scenario suits to this requirements? Any form of data repository where the primary user is an individual. For instance: Phone Book, Restaurant Guide, News Portal, Product Catalog, Wiki, etc. Hope that helps! --Bryan bryan.turner@pobox.com [1] Observations on the Dynamic Evolution of Peer-to-Peer Networks David Liben-Nowell, et al. http://citeseer.ist.psu.edu/liben-nowell02observations.html [2] Analysis of the Evolution of Peer-to-Peer Systems David Liben-Nowell, et al. http://citeseer.ist.psu.edu/liben-nowell02analysis.html [3] k-Anonymous Message Transmission, Luis von Ahn, et al. http://www-2.cs.cmu.edu/~abortz/work/k-anon-final.html [4] A New k-Anonymous Message Transmission Protocol Gang Yao, Dengguo Feng http://dasan.sejong.ac.kr/~wisa04/ppt/9A2.pdf [5] Novel Architectures for P2P Applications: The Continuous-Discrete Approach Moni Naor, Udi Wieder http://citeseer.ist.psu.edu/554254.html [6] Small World Overlay P2P Networks, Ken Y. K. Hui, et al. http://www.cse.cuhk.edu.hk/~cslui/PUBLICATION/iwqos2004_small_world.pdf -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Alexander Löser Sent: Friday, February 04, 2005 7:58 AM To: Peer-to-peer development.
Subject: Re: [p2p-hackers] Paradigma Question: DHT's or Small World? Thank you very much in sharing this discussion!! You gave me very valuable comments on the design question to choose either small world or DHT's. If I understood your arguments right, small world should be the preferred paradigm, if the system design requires the following (hard or soft) features: (Hard features) Churn: The system should support a high churn rate of peers/high churn rate of objects: By the way, since these hypotheses are intuitive but unproved, does anybody know a theoretical or experimental work, that proofed them? Furthermore, maybe this question is a bit naive, but what exactly is high? Complex queries: The system allows a user to pose complex queries, e.g. several keywords, or if I speak about meta data annotated documents more than one (semantic) predicate per query. (Soft features) Profile locality: One peer maps to one user. Probably a user is not interested or willing to transfer it's local profile to a global index but likes to keep it locally, e.g. for anonymity or to delete entries. Popularity: If most of the searches go for popular objects, small world may be the first choice. For example, this is the case for most music sharing networks. Community search: Depending on the shortcut creation strategies between friends on a small world network, the small world paradigm supports the data sharing graph between people with similar interests. By the way: Does it also support similar semantics? What kind of application scenario suits to this requirements? I think of a networked desktop search application. Similar to Gnutella, some people publish some of its documents, most don't. Some of them are annotated by meta data, probably with the same vocabulary or within the same ontology, some not. Users pose keyword queries, similar in a single desktop search engine. Queries either match the documents filename, folder or (if any) documents meta data. 
Would the small-world paradigm support such a system? Alex -- ___________________________________________________________ Alexander Löser Technische Universität Berlin http://cis.cs.tu-berlin.de/~aloeser/ office : +49- 30-314-25551 fax : +49- 30-314-21601 skype : hallo.alex ___________________________________________________________ From john.casey at gmail.com Mon Feb 7 08:22:30 2005 From: john.casey at gmail.com (John Casey) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] gossiping in a DHT Message-ID: Hi All, I have been thinking about developing a gossip information dissemination algorithm to work across a DHT. Does anyone have any links to any must-read papers on this topic? Conceptually, the process seems similar to that of gossip in an unstructured network. Just wondering if there was any prior work I should take a look at, thanks. :) From davidopp at cs.berkeley.edu Mon Feb 7 16:36:09 2005 From: davidopp at cs.berkeley.edu (David L. Oppenheimer) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] gossiping in a DHT In-Reply-To: Message-ID: <200502071635.IAA27418@mindbender.davido.com> You might want to take a look at Kelips http://citeseer.ist.psu.edu/570786.html David > Hi All, I have been thinking about developing a gossip information > dissemenation algorithm to work across a DHT. Does any one have any > links to any must read papers on this topic? Conceptually, the process > seems similar to that of gossip in an unstructured DHT. Just wondering > if there was any prior work I should take a look at thanks.
:) From paul at ref.nmedia.net Tue Feb 8 13:31:56 2005 From: paul at ref.nmedia.net (Paul Campbell) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] gossiping in a DHT In-Reply-To: References: Message-ID: <20050208133156.GA11916@ref.nmedia.net> On Mon, Feb 07, 2005 at 07:22:30PM +1100, John Casey wrote: > Hi All, I have been thinking about developing a gossip information > dissemenation algorithm to work across a DHT. Does any one have any > links to any must read papers on this topic? Conceptually, the process > seems similar to that of gossip in an unstructured DHT. Just wondering > if there was any prior work I should take a look at thanks. :) Gossipping has to overcome the unstructured nature of the underlying network. In a DHT, this is not necessary since it is easy to set up a real broadcast. Look for protocols dealing with broadcasting on a DHT. For instance, one could propagate a message around the ring until it gets back to the source. This would take N-1 messages (if the originator is listed in the message) and N-1 rounds. A faster way is to use the DHT structure where some nodes broadcast multiple messages. For instance, the source could conceptually break the DHT ring up into arcs and broadcast a message to a node residing on each arc along with the arc length. In turn, the next layer of nodes can broadcast the message across their respective arcs, subdividing the problem by another level. With log(N) known neighbors, it should take log(N) rounds to reach every node and again, N-1 messages. Contrast this with N*log(N) messages in an unstructured gossipping system with log(N) rounds. Thus, without structure, the load is much higher. 
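The arc-splitting broadcast just described can be simulated in a few lines (an illustrative sketch using binary arc splits; all names are made up, not taken from any real DHT implementation). Each node holding the message hands the far half of its arc to one other node per round, and the simulation confirms the N-1 messages and roughly log2(N) rounds mentioned above:

```python
# Sketch of the arc-splitting broadcast described above (illustrative
# only, binary splits): the source covers the whole ring, and each node
# that has the message repeatedly hands the far half of its arc to a
# node inside that half. Returns the message and round counts.

def broadcast(n):
    """Simulate the broadcast on a ring of n nodes; return (messages, rounds)."""
    messages = 0
    rounds = 0
    # Each (node, arc) pair means `node` is responsible for the arc of
    # `arc` consecutive nodes starting at itself (clockwise).
    frontier = [(0, n)]
    delivered = {0}
    while any(arc > 1 for _, arc in frontier):
        rounds += 1
        nxt = []
        for node, arc in frontier:
            if arc == 1:
                nxt.append((node, arc))
                continue
            half = arc // 2
            target = (node + half) % n  # hand off the far half of the arc
            messages += 1
            delivered.add(target)
            nxt.append((node, half))          # keep the near half
            nxt.append((target, arc - half))  # target covers the far half
        frontier = nxt
    assert len(delivered) == n  # every node got the message exactly once
    return messages, rounds

print(broadcast(64))  # (63, 6): N-1 messages in log2(N) rounds
```

Each delivered node receives the message exactly once, so the message count is always N-1, matching the claim above; the round count is the ceiling of log2(N) because every arc halves each round.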
From anwitaman at hotmail.com Wed Feb 9 12:00:44 2005 From: anwitaman at hotmail.com (Anwitaman Datta) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] RE: p2p-hackers Digest, Vol 19, Issue 7 In-Reply-To: <20050208200004.84AD13FD25@capsicum.zgp.org> Message-ID: There are several DHT-based broadcasting mechanisms in the literature, which may also interest you. The first that I came across was "structella": http://nms.lcs.mit.edu/HotNets-II/papers/structella.pdf Also, we use such a scheme for range queries in P-Grid: http://www.p-grid.org/Papers/TR-IC-2004-111.pdf as is also used in the prefix hash tree http://berkeley.intel-research.net/sylvia/pht.pdf - Anwitaman Today's Topics: 1. Re: gossiping in a DHT (Paul Campbell) A faster way is to use the DHT structure where some nodes broadcast multiple messages. For instance, the source could conceptually break the DHT ring up into arcs and broadcast a message to a node residing on each arc along with the arc length. In turn, the next layer of nodes can broadcast the message across their respective arcs, subdividing the problem by another level. With log(N) known neighbors, it should take log(N) rounds to reach every node and again, N-1 messages. Contrast this with N*log(N) messages in an unstructured gossipping system with log(N) rounds. Thus, without structure, the load is much higher. From john.casey at gmail.com Thu Feb 10 04:40:09 2005 From: john.casey at gmail.com (John Casey) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] gossiping in a DHT In-Reply-To: <20050208133156.GA11916@ref.nmedia.net> References: <20050208133156.GA11916@ref.nmedia.net> Message-ID: thanks guys. I've just been reading and digesting the papers you have given me.
The Structella paper, and the pointers to the broadcasting papers it has, are very useful :) On Tue, 8 Feb 2005 05:31:56 -0800, Paul Campbell wrote: > On Mon, Feb 07, 2005 at 07:22:30PM +1100, John Casey wrote: > > Hi All, I have been thinking about developing a gossip information > > dissemenation algorithm to work across a DHT. Does any one have any > > links to any must read papers on this topic? Conceptually, the process > > seems similar to that of gossip in an unstructured DHT. Just wondering > > if there was any prior work I should take a look at thanks. :) > > Gossipping has to overcome the unstructured nature of the underlying > network. In a DHT, this is not necessary since it is easy to set up a > real broadcast. Look for protocols dealing with broadcasting on a DHT. > > For instance, one could propagate a message around the ring until it gets > back to the source. This would take N-1 messages (if the originator is > listed in the message) and N-1 rounds. > > A faster way is to use the DHT structure where some nodes broadcast multiple > messages. For instance, the source could conceptually break the DHT ring up > into arcs and broadcast a message to a node residing on each arc along with > the arc length. In turn, the next layer of nodes can broadcast the message > across their respective arcs, subdividing the problem by another level. With > log(N) known neighbors, it should take log(N) rounds to reach every node and > again, N-1 messages. Contrast this with N*log(N) messages in an unstructured > gossipping system with log(N) rounds. Thus, without structure, the load is > much higher. From rabbi at abditum.com Thu Feb 10 08:01:01 2005 From: rabbi at abditum.com (Len Sassaman) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] CodeCon Reminder Message-ID: We'd like to remind those of you planning to attend this year's event that CodeCon is fast approaching. CodeCon is the premier event in 2005 for the application developer community.
It is a workshop for developers of real-world applications with working code and active development projects. Past presentations at CodeCon have included the file distribution software BitTorrent; the Peek-A-Booty anti-censorship application; the email encryption system PGP Universal; and Audacity, a powerful audio editing tool. Some of this year's highlights include Off-The-Record Messaging, a privacy-enhancing encryption protocol for instant-message systems; SciTools, a web-based toolkit for genetic design and analysis; and Incoherence, a novel stereo sound visualization tool. CodeCon registration is discounted this year: $80 for cash-at-the-door registrations. Registration will be available every day of the conference, though tickets are limited, and attendees are encouraged to register on the first day to secure admission. CodeCon will be held February 11-13, noon-6pm, at Club NV (525 Howard Street) in San Francisco. For more information, please visit http://www.codecon.org. From aloeser at cs.tu-berlin.de Thu Feb 10 08:57:00 2005 From: aloeser at cs.tu-berlin.de (Alexander Löser) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] gossiping in a DHT References: <20050208133156.GA11916@ref.nmedia.net> Message-ID: <420B21DC.4A2E1625@cs.tu-berlin.de> Hi John, probably you should look at the HyperCuP topology [1], which permits broadcasting in a structured overlay. In Edutella [2] we use the broadcast mechanism to broadcast complex queries. Due to the combination of the broadcast with routing indices and a super-peer network, we are able to focus the broadcast on a subset of peers. Alex [1] http://projekte.learninglab.uni-hannover.de/pub/bscw.cgi/d7825/HyperCuP%20-%20Hypercubes,%20Ontologies%20and%20Efficient%20Search%20on%20P2P%20Networks [2] http://www.kbs.uni-hannover.de/Arbeiten/Publikationen/2002/www2003_superpeer.pdf John Casey wrote: > thanks guys. I've just been reading digesting the papers you have > given me.
The structella, and the pointers to the broadcasting papers > it have are very useful :) > > On Tue, 8 Feb 2005 05:31:56 -0800, Paul Campbell wrote: > > On Mon, Feb 07, 2005 at 07:22:30PM +1100, John Casey wrote: > > > Hi All, I have been thinking about developing a gossip information > > > dissemenation algorithm to work across a DHT. Does any one have any > > > links to any must read papers on this topic? Conceptually, the process > > > seems similar to that of gossip in an unstructured DHT. Just wondering > > > if there was any prior work I should take a look at thanks. :) > > > > Gossipping has to overcome the unstructured nature of the underlying > > network. In a DHT, this is not necessary since it is easy to set up a > > real broadcast. Look for protocols dealing with broadcasting on a DHT. > > > > For instance, one could propagate a message around the ring until it gets > > back to the source. This would take N-1 messages (if the originator is > > listed in the message) and N-1 rounds. > > > > A faster way is to use the DHT structure where some nodes broadcast multiple > > messages. For instance, the source could conceptually break the DHT ring up > > into arcs and broadcast a message to a node residing on each arc along with > > the arc length. In turn, the next layer of nodes can broadcast the message > > across their respective arcs, subdividing the problem by another level. With > > log(N) known neighbors, it should take log(N) rounds to reach every node and > > again, N-1 messages. Contrast this with N*log(N) messages in an unstructured > > gossipping system with log(N) rounds. Thus, without structure, the load is > > much higher. 
> _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences -- ___________________________________________________________ Alexander Löser Technische Universität Berlin http://cis.cs.tu-berlin.de/~aloeser/ office : +49- 30-314-25551 fax : +49- 30-314-21601 ___________________________________________________________ From telecontrol at t-online.de Thu Feb 10 10:46:29 2005 From: telecontrol at t-online.de (Telecontrol) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] We need some help for our project TV-Sharing over P2P (www.cybertelly.com) Message-ID: <003001c50f5d$c71b56c0$69a2a8c0@namepc> Skipped content of type multipart/alternative-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 12199 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050210/fd77db14/attachment.gif From telecontrol at t-online.de Thu Feb 10 10:56:38 2005 From: telecontrol at t-online.de (Telecontrol) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] We need some help for our project TV-Sharing over P2P Message-ID: <003c01c50f5f$31c5e610$69a2a8c0@namepc> Please use the email address telecontrol@t-online.de if you want to support the project, thank you! From sszukala at runbox.com Thu Feb 10 20:39:14 2005 From: sszukala at runbox.com (Shannon Alexander Szukala) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Re: p2p-hackers Digest, Vol 19, Issue 10 In-Reply-To: <20050210200004.046783FD65@capsicum.zgp.org> References: <20050210200004.046783FD65@capsicum.zgp.org> Message-ID: Hey, I want to help out. Let me know what you are looking for.
> Send p2p-hackers mailing list submissions to > p2p-hackers@zgp.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://zgp.org/mailman/listinfo/p2p-hackers > or, via email, send a message with subject or body 'help' to > p2p-hackers-request@zgp.org > > You can reach the person managing the list at > p2p-hackers-owner@zgp.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of p2p-hackers digest..." > > > Today's Topics: > > 1. We need some help for our project TV-Sharing over P2P > (Telecontrol) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 10 Feb 2005 11:56:38 +0100 > From: "Telecontrol" > Subject: [p2p-hackers] We need some help for our project TV-Sharing > over P2P > To: > Message-ID: <003c01c50f5f$31c5e610$69a2a8c0@namepc> > Content-Type: text/plain; charset="us-ascii" > > Please use the email adress telecontrol@t-online.de if you want to > support the project , Thank you !! > > > > ------------------------------ > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > > > End of p2p-hackers Digest, Vol 19, Issue 10 > ******************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20050210/8dda94c0/attachment.html From trep at cs.ucr.edu Fri Feb 11 21:11:14 2005 From: trep at cs.ucr.edu (Thomas Repantis) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Bloom Filters in Gnutella (Was: Re: Paradigma Question: DHT's or Small World?) In-Reply-To: References: <4201FDB1.6F607C0C@cs.tu-berlin.de> Message-ID: <20050211211114.GA673@angeldust.chaos> Hi Greg, interesting what you wrote, that Gnutella uses Bloom Filters. I thought that simple hash tables were exchanged. How are the Bloom Filters propagated? Just from every leaf to its ultrapeer? 
Or do ultrapeers also exchange Bloom Filters? Let me know if you have any pointers on this. I'm only aware of: http://rfc-gnutella.sourceforge.net/src/Ultrapeers_1.0.html and http://www.limewire.com/developer/query_routing/keyword%20routing.htm I've also done some work on Bloom Filters and their propagation (the first paper on: http://www.cs.ucr.edu/~trep/publications.html ) Cheers, Thomas On Thu, Feb 03, 2005 at 10:26:25AM -0500, Greg Bildson wrote: > I'd just like to point out that Gnutella does not use pure flooding anymore > and you are unlikely to find P2P networks that don't have something akin to > supernodes. Gnutella uses bloom filter based keyword index replication and > dynamic querying (selectively sending out queries until a result limit is > reached) to reduce the overhead of flooding for popular queries and to route > all queries on the last hop. > > Thanks > -greg > -- http://www.cs.ucr.edu/~trep -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050211/e324ccd7/attachment.pgp From mgp at ucla.edu Tue Feb 15 09:52:41 2005 From: mgp at ucla.edu (Michael Parker) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Online Codes Message-ID: <4211C669.3080200@ucla.edu> Hi all, Does anyone know what happened to the "Online Codes" Sourceforge project, listed at http://sourceforge.net/projects/onlinecodes? I'm asking here for two reasons: First, because Online Codes [1, 2] would be a great tool in peer-to-peer applications, so I thought someone here might have followed the project while it was still active. Second, I've written a solid library implementation of the Online Codes encoding/decoding algorithm described in the aforementioned papers. 
Alas, only after I implemented it did I find out that the authors' company, Rateless, had patented it (or, so they allude to on their web site www.rateless.com, Digital Fountain owned the IP). I was thinking of releasing it under the GPL, but now that I've discovered patents are involved that seems like a very bad idea. So I was wondering if the Online Codes project broke up because of this, and whether I would get sued into oblivion if I ever made this code available? IANAL, but is it illegal to write such code and distribute it as a library on the net (after all, it is straight from their papers) to elucidate how the algorithm works, or only illegal to include the library in any working software program? Regards, Michael Parker [1] http://www.rateless.com/oncodes.pdf [2] http://www.rateless.com/msd.ps From stewbagz at gmail.com Tue Feb 15 10:00:42 2005 From: stewbagz at gmail.com (stew "stewbagz" mercer) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <4211C669.3080200@ucla.edu> References: <4211C669.3080200@ucla.edu> Message-ID: <3b462676050215020043ee0d5a@mail.gmail.com> I was wondering about this as well. It appears that there was a build of rateless-copy and rateless-tunnel that was done with the cygwin tool kit, and that appears to have caused some complications. if you go to http://www.rateless.com/download_copy.html you can see the links to the binaries, but I've not been able to download anything from it. They were supposedly writing some RFCs for it too, but there is no sign of them either ... On Tue, 15 Feb 2005 01:52:41 -0800, Michael Parker wrote: > Hi all, > > Does anyone know what happened to the "Online Codes" Sourceforge > project, listed at http://sourceforge.net/projects/onlinecodes? I'm > asking here for two reasons: First, because Online Codes [1, 2] would be > a great tool in peer-to-peer applications, so I thought someone here > might have followed the project while it was still active. 
Second, I've > written a solid library implementation of the Online Codes > encoding/decoding algorithm described in the aforementioned papers. > Alas, only after I implemented it did I find out that the authors' > company, Rateless, had patented it (or, so they allude to on their web > site www.rateless.com, Digital Fountain owned the IP). I was thinking of > releasing it under the GPL, but now that I've discovered patents are > involved that seems like a very bad idea. So I was wondering if the > Online Codes project broke up because of this, and whether I would get > sued into oblivion if I ever made this code available? IANAL, but is it > illegal to write such code and distribute it as a library on the net > (after all, it is straight from their papers) to elucidate how the > algorithm works, or only illegal to include the library in any working > software program? > > Regards, > Michael Parker > > [1] http://www.rateless.com/oncodes.pdf > [2] http://www.rateless.com/msd.ps > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From solipsis at pitrou.net Tue Feb 15 10:15:04 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <4211C669.3080200@ucla.edu> References: <4211C669.3080200@ucla.edu> Message-ID: <1108462504.7938.25.camel@p-dhcp-333-72.rd.francetelecom.fr> > I was thinking of > releasing it under the GPL, but now that I've discovered patents are > involved that seems like a very bad idea. So I was wondering if the > Online Codes project broke up because of this, and whether I would get > sued into oblivion if I ever made this code available? 
IANAL, but is it > illegal to write such code and distribute it as a library on the net > (after all, it is straight from their papers) to elucidate how the > algorithm works, or only illegal to include the library in any working > software program? If you are European then it's still legal ;) (given your e-mail address I guess you are not...) On the other hand, if software patents are valid in your country, then you can't distribute any code that infringes the patent without a license for that patent, even if you are doing it for research purposes, etc. Indeed, one of the problems with patents is that they are not subject to the traditional limits of copyright (fair use, etc.). Regards Antoine. -- http://solipsis.netofpeers.net/ From paul at ref.nmedia.net Tue Feb 15 19:58:23 2005 From: paul at ref.nmedia.net (Paul Campbell) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <4211C669.3080200@ucla.edu> References: <4211C669.3080200@ucla.edu> Message-ID: <20050215195823.GB25409@ref.nmedia.net> On Tue, Feb 15, 2005 at 01:52:41AM -0800, Michael Parker wrote: > Does anyone know what happened to the "Online Codes" Sourceforge > project, listed at http://sourceforge.net/projects/onlinecodes? I'm > asking here for two reasons: First, because Online Codes [1, 2] would be > a great tool in peer-to-peer applications, so I thought someone here > might have followed the project while it was still active. Second, I've > written a solid library implementation of the Online Codes > encoding/decoding algorithm described in the aforementioned papers. > Alas, only after I implemented it did I find out that the authors' > company, Rateless, had patented it (or, so they allude to on their web > site www.rateless.com, Digital Fountain owned the IP). I was thinking of > releasing it under the GPL, but now that I've discovered patents are > involved that seems like a very bad idea. There are additional papers out there. 
There are essentially two implementations of the idea. First, there's the "LT Codes" and "Raptor Codes". Second, there's the "Online Codes". Both are very similar in a lot of ways. There are also some fundamental problems. See this one: http://citeseer.ist.psu.edu/695965.html I didn't know that Online codes have now been patented. However, if you consider the code, you've got essentially two pieces. First, there's the LDPC cipher being used in erasure-handling only. Second, there's the inner error correction cipher. The inner cipher is what makes the fundamental difference between LT Codes and Online Codes. However, there is absolutely nothing to say that you can't use say a punctured rate-1 outer code (repitition-style codes) with a suitable scrambler, or vary the inner code with something that gives equivalent performance (even a BCH code). Patents only work as long as you implement ALL the features of the patent. From gojomo at bitzi.com Wed Feb 16 05:41:05 2005 From: gojomo at bitzi.com (Gordon Mohr (@ Bitzi)) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: <4212DCF1.1070909@bitzi.com> Via Slashdot, as reported by Bruce Schneier: http://www.schneier.com/blog/archives/2005/02/sha1_broken.html Schneier writes: # SHA-1 Broken # # SHA-1 has been broken. Not a reduced-round version. Not a # simplified version. The real thing. # # The research team of Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu # (mostly from Shandong University in China) have been quietly # circulating a paper announcing their results: # # * collisions in the the full SHA-1 in 2**69 hash operations, # much less than the brute-force attack of 2**80 operations # based on the hash length. # # * collisions in SHA-0 in 2**39 operations. # # * collisions in 58-round SHA-1 in 2**33 operations. # # This attack builds on previous attacks on SHA-0 and SHA-1, and # is a major, major cryptanalytic result. 
It pretty much puts a # bullet into SHA-1 as a hash function for digital signatures # (although it doesn't affect applications such as HMAC where # collisions aren't important). # # The paper isn't generally available yet. At this point I can't # tell if the attack is real, but the paper looks good and this # is a reputable research team. # # More details when I have them. - Gordon @ Bitzi From jeffh at cs.rice.edu Wed Feb 16 06:51:45 2005 From: jeffh at cs.rice.edu (Jeff Hoye) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <4212DCF1.1070909@bitzi.com> References: <4212DCF1.1070909@bitzi.com> Message-ID: <4212ED81.4030606@cs.rice.edu> Let's wait for a real report. But it's cool if it's true. -Jeff Gordon Mohr (@ Bitzi) wrote: > Via Slashdot, as reported by Bruce Schneier: > > http://www.schneier.com/blog/archives/2005/02/sha1_broken.html > > Schneier writes: > > # SHA-1 Broken > # > # SHA-1 has been broken. Not a reduced-round version. Not a > # simplified version. The real thing. > # > # The research team of Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu > # (mostly from Shandong University in China) have been quietly > # circulating a paper announcing their results: > # > # * collisions in the the full SHA-1 in 2**69 hash operations, > # much less than the brute-force attack of 2**80 operations > # based on the hash length. > # > # * collisions in SHA-0 in 2**39 operations. > # > # * collisions in 58-round SHA-1 in 2**33 operations. > # > # This attack builds on previous attacks on SHA-0 and SHA-1, and > # is a major, major cryptanalytic result. It pretty much puts a > # bullet into SHA-1 as a hash function for digital signatures > # (although it doesn't affect applications such as HMAC where > # collisions aren't important). > # > # The paper isn't generally available yet. At this point I can't > # tell if the attack is real, but the paper looks good and this > # is a reputable research team. > # > # More details when I have them. 
> > - Gordon @ Bitzi > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From osokin at osokin.com Wed Feb 16 08:11:07 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <4212DCF1.1070909@bitzi.com> Message-ID: > # * collisions in the the full SHA-1 in 2**69 hash operations, > # much less than the brute-force attack of 2**80 operations... Okay, so the effective SHA-1 length is 138 bits instead of full 160 - so what's the big deal? It is still way more than, say, MD5 length. And MD5 is still widely used for stuff like content id'ing in various systems, because even 128 bits is quite a lot, never mind 138 bits. Best wishes - S.Osokine. 16 Feb 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Gordon Mohr (@ Bitzi) Sent: Tuesday, February 15, 2005 9:41 PM To: p2p-hackers Subject: [p2p-hackers] SHA1 broken? Via Slashdot, as reported by Bruce Schneier: http://www.schneier.com/blog/archives/2005/02/sha1_broken.html Schneier writes: # SHA-1 Broken # # SHA-1 has been broken. Not a reduced-round version. Not a # simplified version. The real thing. # # The research team of Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu # (mostly from Shandong University in China) have been quietly # circulating a paper announcing their results: # # * collisions in the the full SHA-1 in 2**69 hash operations, # much less than the brute-force attack of 2**80 operations # based on the hash length. # # * collisions in SHA-0 in 2**39 operations. # # * collisions in 58-round SHA-1 in 2**33 operations. 
# # This attack builds on previous attacks on SHA-0 and SHA-1, and # is a major, major cryptanalytic result. It pretty much puts a # bullet into SHA-1 as a hash function for digital signatures # (although it doesn't affect applications such as HMAC where # collisions aren't important). # # The paper isn't generally available yet. At this point I can't # tell if the attack is real, but the paper looks good and this # is a reputable research team. # # More details when I have them. - Gordon @ Bitzi _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From gojomo at bitzi.com Wed Feb 16 09:10:13 2005 From: gojomo at bitzi.com (Gordon Mohr (@ Bitzi)) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: References: Message-ID: <42130DF5.3020708@bitzi.com> Serguei Osokine wrote: >># * collisions in the the full SHA-1 in 2**69 hash operations, >># much less than the brute-force attack of 2**80 operations... > > > Okay, so the effective SHA-1 length is 138 bits instead of full > 160 - so what's the big deal? If the results hold up: SHA1 is not as strong as it was designed to be, and its effective strength is being sent in the wrong direction, rather than being confirmed, by new research. Even while maintaining that SHA1 was unbroken and likely to remain so just last week, NIST was still recommending that SHA1 be phased out of government use by 2010: http://www.fcw.com/fcw/articles/2005/0207/web-hash-02-07-05.asp One more paper from a group of precocious researchers anywhere in the world, or unpublished result exploited in secret, could topple SHA1 from practical use entirely. 
Of course, that's remotely possible with any hash, but the pattern of recent results suggest that a further break is now more likely with SHA1 (and related hashes) than others. So the big deal would be: don't rely on SHA1 in any applications you intend to have a long effective life. > It is still way more than, say, MD5 > length. And MD5 is still widely used for stuff like content id'ing > in various systems, because even 128 bits is quite a lot, never > mind 138 bits. Just because it's widely used doesn't mean it's a good idea. MD5 should not be used for content identification, given the ability to create content pairs with the same MD5, with one version being (and appearing and acquiring a reputation for being) innocuous, and the other version malicious. - Gordon @ Bitzi From paul at ref.nmedia.net Wed Feb 16 13:15:36 2005 From: paul at ref.nmedia.net (Paul Campbell) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <4212DCF1.1070909@bitzi.com> References: <4212DCF1.1070909@bitzi.com> Message-ID: <20050216131536.GA27730@ref.nmedia.net> On Tue, Feb 15, 2005 at 09:41:05PM -0800, Gordon Mohr (@ Bitzi) wrote: > Via Slashdot, as reported by Bruce Schneier: > > http://www.schneier.com/blog/archives/2005/02/sha1_broken.html > > Schneier writes: > > # SHA-1 Broken I saw this a few months ago. It's not just SHA-1. All ciphers based on the MD-5 S-box design are apparently vulnerable. At this point, it appears that there are two options for the future: 1. Go to something with a larger internal state (256-bit state), and that is NOT just an extended version of the original (as the extended SHA standards attempt to do). 2. Go to a completely different type of cipher. The choices right now are either digital signatures via elliptic curves, or else using one of the stream cipher designs. Since neither one is really optimized for hashing-type operations, they are essentially no-go's for most P2P uses (e.g. DHT's). 
When I say "optimized", by that I mean very SLOW by the way. From ap at hamachi.cc Wed Feb 16 16:03:47 2005 From: ap at hamachi.cc (Alex Pankratov) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <20050216131536.GA27730@ref.nmedia.net> References: <4212DCF1.1070909@bitzi.com> <20050216131536.GA27730@ref.nmedia.net> Message-ID: <42136EE3.4000001@hamachi.cc> Paul Campbell wrote: > 2. Go to a completely different type of cipher. The choices right now are > either digital signatures via elliptic curves, ... By the way - is ECC patented ? I heard Sun had some activity around ECC patents, Certicom has patents for a curve selection algorithms, but is core ECC patented ? Or rather - is it in public domain or not ? I am seriously considering ECDSA as a replacement for RSA as it seems to be significantly faster for the same crypto strength. From Serguei.Osokine at efi.com Wed Feb 16 16:37:31 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: <4A60C83D027E224BAA4550FB1A2B120E0DC35B@fcexmb04.efi.internal> On Wednesday, February 16, 2005 Gordon Mohr wrote: > MD5 should not be used for content identification, given the > ability to create content pairs with the same MD5, with one > version being (and appearing and acquiring a reputation for > being) innocuous, and the other version malicious. Right. So let's go and try to find something with the same MD5 as this letter of mine, shall we? :-) For any practical purpose that I can imagine in a content identification field, MD5 is just fine. And SHA-1 is even more fine. There are plenty more simple ways to attack the CDN nets than MD5 collisions. Way more simple. And abandoning MD5 for SHA1, then SHA1 for Tiger, and then abandoning Tiger for some newer hash when some researcher finds that it is really twenty bits weaker than you thought - it is all just a huge waste of development effort, as far as I'm concerned. 
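The "effective length" arithmetic used earlier in the thread follows from the birthday bound: an ideal n-bit hash costs about 2**(n/2) work to collide, so a collision attack costing 2**69 operations makes SHA-1 behave like a 138-bit hash. A few lines of Python make the comparison explicit (illustrative only; `effective_bits` is an invented name, not from any library):

```python
def effective_bits(log2_collision_work):
    """Birthday bound: collision work of 2**(n/2) implies n = 2 * log2(work)."""
    return 2 * log2_collision_work

assert effective_bits(80) == 160   # SHA-1 as designed (2**80 collision work)
assert effective_bits(69) == 138   # SHA-1 after the reported attack
assert effective_bits(64) == 128   # MD5 as designed (its own breaks not counted)

# The point being argued: even weakened SHA-1 exceeds MD5's design strength
print(effective_bits(69) - effective_bits(64))   # -> 10
```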
It sure is nice to know that the human mind can find collisions in a 160-bit hash, but I have a feeling that the practical meaning of this result in the content identification area is precisely zero. Probably the biggest effect will be that the more advanced of the marketing types will start saying with a knowing look: "ah, but SHA1 was compromised - shouldn't we use something more secure?" Which is a plenty effect by itself, I'll grant you that. It will be way easier to switch to a newer hash than to explain to these guys that this is all a load of bull. But this is a Chicken Little effect, which is of a psychological rather than of a technical nature, and I'd expect to find the concerns about SHA1 weakness on some marketing forum rather than here. (All of the above is only about the content identification in the P2P nets, of course. Security/authentication is a different story. But saying that MD5 should not be used for the content identification does seem like a bit of an overstatement to me. I mean, imagine yourself a Gnutella network - so its biggest, major, noticeable, or even existing concern is a collision in the content hashes? Are you kidding? :-) Best wishes - S.Osokine. 16 Feb 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Gordon Mohr (@ Bitzi) Sent: Wednesday, February 16, 2005 1:10 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] SHA1 broken? Serguei Osokine wrote: >># * collisions in the the full SHA-1 in 2**69 hash operations, >># much less than the brute-force attack of 2**80 operations... > > > Okay, so the effective SHA-1 length is 138 bits instead of full > 160 - so what's the big deal? If the results hold up: SHA1 is not as strong as it was designed to be, and its effective strength is being sent in the wrong direction, rather than being confirmed, by new research. 
Even while maintaining that SHA1 was unbroken and likely to remain so just last week, NIST was still recommending that SHA1 be phased out of government use by 2010: http://www.fcw.com/fcw/articles/2005/0207/web-hash-02-07-05.asp One more paper from a group of precocious researchers anywhere in the world, or an unpublished result exploited in secret, could topple SHA1 from practical use entirely. Of course, that's remotely possible with any hash, but the pattern of recent results suggests that a further break is now more likely with SHA1 (and related hashes) than others. So the big deal would be: don't rely on SHA1 in any applications you intend to have a long effective life. > It is still way more than, say, MD5 > length. And MD5 is still widely used for stuff like content id'ing > in various systems, because even 128 bits is quite a lot, never > mind 138 bits. Just because it's widely used doesn't mean it's a good idea. MD5 should not be used for content identification, given the ability to create content pairs with the same MD5, with one version being (and appearing and acquiring a reputation for being) innocuous, and the other version malicious. - Gordon @ Bitzi _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From lloyd at randombit.net Wed Feb 16 22:05:17 2005 From: lloyd at randombit.net (Jack Lloyd) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken?
In-Reply-To: <20050216131536.GA27730@ref.nmedia.net> References: <4212DCF1.1070909@bitzi.com> <20050216131536.GA27730@ref.nmedia.net> Message-ID: <20050216220516.GC29536@randombit.net> On Wed, Feb 16, 2005 at 05:15:36AM -0800, Paul Campbell wrote: > On Tue, Feb 15, 2005 at 09:41:05PM -0800, Gordon Mohr (@ Bitzi) wrote: > > Via Slashdot, as reported by Bruce Schneier: > > > > http://www.schneier.com/blog/archives/2005/02/sha1_broken.html > > > > Schneier writes: > > > > # SHA-1 Broken > > I saw this a few months ago. It's not just SHA-1. All ciphers based on the > MD-5 S-box design are apparently vulnerable. At this point, it appears that > there are two options for the future: No, there were no major results against the full 80-round SHA-1 until this. There were collisions with ~50 of the 80 rounds for SHA-1, and Joux found a collision for SHA-0 around the same time Wang et al. produced the collisions for MD4/MD5/RIPEMD/HAVAL-128 last summer. BTW, MD5 does not use S-boxes in any form. > 1. Go to something with a larger internal state (256-bit state), and that is > NOT just an extended version of the original (as the extended SHA standards > attempt to do). Currently Whirlpool is looking like the best bet. Tiger is still out there, and is both reasonably fast on 32-bit machines and very fast on 64-bit, but it never saw much analysis, as the designers expected the 64-bit revolution about 8 years too early. Both are quite unlike the MDx designs, which is both good (possibly less likely to fall to whatever methods Wang and crew have) and bad (less analysis has been done). A major issue is that currently the details of the attacks haven't been published. All we really have right now are a set of collisions for various hashes, which proves that there are weaknesses, but until we know the details there is no way to say that they will or won't apply to Whirlpool/Tiger/SHA-2/etc.
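An editorial aside, not part of the original exchange: the "larger internal state" option Jack mentions is what the SHA-2 family (standardized by 2005) provides. A quick sketch with Python's hashlib shows the digest sizes of the hashes under discussion; the sample input is arbitrary:

```python
import hashlib

# Editor's sketch: output sizes of the hash functions discussed in this
# thread, as reported by Python's hashlib.
data = b"hello p2p-hackers"
for name in ("md5", "sha1", "sha256", "sha512"):
    h = hashlib.new(name, data)
    print(f"{name:>6}: {h.digest_size * 8:3d} bits  {h.hexdigest()[:16]}...")
```

A larger digest raises the generic birthday bound (2^64 for MD5's 128 bits, 2^128 for SHA-256's 256 bits), which is separate from the structural weaknesses the thread is about.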
Fortunately the 2^69 worklimit on SHA-1 is currently theoretical for everyone but the TLAs, so the paper will have to explain the attack in sufficient detail to verify the results, from which people more competent than me can see if the attacks do (or might) apply to the latest generation of hash functions. The real key is not just to upgrade, but to provide a smooth upgrade path in the future. Before SHA-1, the average security lifetime of a hash was about 5 years. I suspect we're seeing a return to that level of cycling; for the most part analysis of hash functions is not nearly as developed as that for block ciphers. > > 2. Go to a completely different type of cipher. The choices right now are > either digital signatures via elliptic curves, or else using one of the ECDSA and ECNR still use conventional hash functions; you don't reduce the impact of an attack on SHA-1 by using either of those as compared to DSA or RSA. > stream cipher designs. I am not aware of any methods of hashing with just a stream cipher; are you referring to Panama? Panama's stream cipher mode is still secure AFAIK, but the Panama transform has been shown insecure for hashing (IIRC with 2^80 operations, versus the expected 2^128). Regards, Jack From gojomo at bitzi.com Thu Feb 17 04:12:18 2005 From: gojomo at bitzi.com (Gordon Mohr (@ Bitzi)) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120E0DC35B@fcexmb04.efi.internal> References: <4A60C83D027E224BAA4550FB1A2B120E0DC35B@fcexmb04.efi.internal> Message-ID: <421419A2.80307@bitzi.com> Serguei Osokine wrote: > On Wednesday, February 16, 2005 Gordon Mohr wrote: > >>MD5 should not be used for content identification, given the >>ability to create content pairs with the same MD5, with one >>version being (and appearing and acquiring a reputation for >>being) innocuous, and the other version malicious. > > > Right.
So let's go and try to find something with the same > MD5 as this letter of mine, shall we? :-) I can't -- but you could have made a collision, very easily, if you composed your initial message with the intent of also composing an MD5 twin at the same time. That means for content identification MD5 is fatally flawed. For any file whose contents I think I know and trust, perhaps based on analysis and history of the file, there could be another dangerous file with the same MD5. MD5 cannot be used to distinguish between the two, but that's the whole point of using a secure hash for content identification. Dan Kaminsky runs over a number of potential attacks that are relevant to P2P -- see: http://paketto.doxpara.com Don't be fooled by the title of his analysis, "MD to be considered harmful someday" -- the attacks mentioned are possible now, and could trick people and software in subtle ways different from other threats to P2P nets. Here's another example from the cryptography list that convinced a doubter that the attacks on MD5 were of more than purely theoretical interest: two long binary strings, one a prime number, one not: http://lists.virus.org/cryptography-0412/msg00102.html Consider source code or executables which work fine with the primes, s-boxes, and other initialization vectors initially examined -- but have exploitable flaws when those values are perturbed in a manner that leaves the MD5 the same. You need to use a different, stronger content check to prevent such mischief -- making the use of MD5 redundant and even dangerous for the false sense of security it gives. > For any practical purpose that I can imagine in a content > identification field, MD5 is just fine. And SHA-1 is even more > fine. If you can't imagine exploits, perhaps it's just a failure of your imagination. Prudent engineering would assume some attackers have better imaginations than you, when it comes to exploiting hashes that don't work as originally intended. 
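An editorial aside, not part of the original exchange: the "different, stronger content check" Gordon describes amounts to comparing a file against the digest it was advertised under, using a hash with no known collision attack. A minimal sketch, in which the function name and interface are invented for illustration and stand in for no real client's code:

```python
import hashlib

def verify_content(data: bytes, expected_hex: str, algo: str = "sha256") -> bool:
    """Check received content against the digest it was advertised under.

    Editor's sketch of hash-based content identification; the name and
    interface are hypothetical.
    """
    return hashlib.new(algo, data).hexdigest() == expected_hex

payload = b"some shared file"
good_id = hashlib.sha256(payload).hexdigest()
assert verify_content(payload, good_id)             # matching content passes
assert not verify_content(payload + b"x", good_id)  # any change fails
```

The check is only as strong as the hash behind it: with MD5, an attacker who crafted both twins of a colliding pair passes this test with either file, which is exactly the failure mode Gordon describes.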
> There are plenty more simple ways to attack the CDN nets > than MD5 collisions. Way more simple. And abandoning MD5 for > SHA1, then SHA1 for Tiger, and then abandoning Tiger for some > newer hash when some researcher finds that it is really twenty > bits weaker than you thought - it is all just a huge waste of > development effort, as far as I'm concerned. Depends on the kinds of attacks you're worried about. There are more simple ways to disrupt P2P nets, sure. But are there more simple ways to trick conscientious, hash-checking users into running malware? And since when did the ease of other attacks become an excuse for ignoring more complicated and subtle (and thus perhaps more valuable) attacks? If you need a secure hash's properties in your software, you should use an uncompromised secure hash. (Results as early as 1996 suggested MD5 should not be used in applications where collision-resistance is important.) If you're stuck with a legacy hash, fine, analyze the situation and if you're confident the weakness has no effect on current usage, rationalize using it a while longer. But get ready for the potential need to switch hashes quickly in the presence of further discoveries. Or better yet: design with the idea in mind that no hash function lives forever. - Gordon @ Bitzi From osokin at osokin.com Thu Feb 17 07:37:55 2005 From: osokin at osokin.com (Serguei Osokine) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <421419A2.80307@bitzi.com> Message-ID: On Wednesday, February 16, 2005 Gordon Mohr wrote: > Dan Kaminsky runs over a number of potential attacks that > are relevant to P2P -- see: > > http://paketto.doxpara.com > ... > Here's another example from the cryptography list that convinced > a doubter... Certainly looks cute. 
Now correct me if I'm not getting something here - but isn't it true that in order to mount an attack one has to replace the "good" code (content, whatever) by the "bad" code, and the absolutely necessary condition is that the "good" code also has to be created by an attacker? So an attacker creates "good" code, gives it to security experts for verification, and then after they are done, replaces it with "bad code", right? Isn't it a bit far-fetched? Do we have a somewhat more realistic attack scenario? I just cannot imagine all this happening in real life. Real-life breakdowns always tend to be way simpler than their theoretical scenarios (and totally unexpected, too). > But are there more simple ways to trick conscientious, hash-checking > users into running malware? Users typically don't give a damn about hash-checking; they expect the system to do that for them. And a few users that do give a damn typically can defend themselves from pretty much anything no matter what you throw at them. So the fate of this "expert" group (consisting of about ten people for any given P2P system) does not really worry me, whereas for the rest of the user population there are plenty of ways to trick them into running the malware - *all* the current ways of doing so are simpler than fiddling with hashes. Which brings me back to my question above: do we have a realistic scenario where a network like Gnutella would be harmed by using MD5? (Not that I give a damn about MD5, and no one in Gnutella probably uses it anyway; my interest is largely theoretical here, and the same issues might be relevant for the other hashes, either.) > And since when did the ease of other attacks become an excuse > for ignoring more complicated and subtle (and thus perhaps > more valuable) attacks? Why, every time you do not have infinite development resources, of course. 
You always have to juggle priorities, and subtle attacks typically are not anywhere close to the head of the development priority list for P2P networks... > Or better yet: design with the idea in mind that no hash function > lives forever. Sure; but that's orthogonal: > If you're stuck with a legacy hash, fine, analyze the situation > and if you're confident the weakness has no effect on current > usage, rationalize using it a while longer. My point exactly. The issue is whether one should consider the deployed legacy codebase unsecure after every new discovery is made in the hash collision research or not. My personal approach would be to disregard the possible collision issues until there is a problem serious enough to be noticed by CNN. (So far I still cannot see any *realistic* attack scenario; maybe your next letter will convince me that I'm wrong :-) But everyone has a personal "worry threshold", I guess. Mine is pretty low... Best wishes - S.Osokine. 16 Feb 2005. From em at em.no-ip.com Thu Feb 17 11:11:13 2005 From: em at em.no-ip.com (Enzo Michelangeli) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? References: <4212DCF1.1070909@bitzi.com><20050216131536.GA27730@ref.nmedia.net> <42136EE3.4000001@hamachi.cc> Message-ID: <005b01c514e1$6dcee580$0200a8c0@em.noip.com> ----- Original Message ----- From: "Alex Pankratov" To: "Peer-to-peer development." Sent: Thursday, February 17, 2005 12:03 AM Subject: Re: [p2p-hackers] SHA1 broken? [...]
> By the way - is ECC patented ? I heard Sun had some activity around > ECC patents, Certicom has patents for a curve selection algorithms, > but is core ECC patented ? Or rather - is it in public domain or not ? Answers to patent-related questions are not Turing computable ;-) Anyway, several years ago the IEEE made an effort to collect statements and claims about intellectual property on PK encryption algorithms: http://grouper.ieee.org/groups/1363/P1363/patents.html Several of the letters collected refer to EC-related areas (Nyberg-Rueppel signatures, point compression techniques, etc.) Enzo From gojomo at bitzi.com Thu Feb 17 18:23:51 2005 From: gojomo at bitzi.com (Gordon Mohr (@ Bitzi)) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: References: Message-ID: <4214E137.8000109@bitzi.com> Serguei Osokine wrote: > On Wednesday, February 16, 2005 Gordon Mohr wrote: > >>Dan Kaminsky runs over a number of potential attacks that >>are relevant to P2P -- see: >> >> http://paketto.doxpara.com >>... >>Here's another example from the cryptography list that convinced >>a doubter... > > > Certainly looks cute. Now correct me if I'm not getting something > here - but isn't it true that in order to mount an attack one has to > replace the "good" code (content, whatever) by the "bad" code, and the > absolutely necessary condition is that the "good" code also has to be > created by an attacker? So an attacker creates "good" code, gives it > to security experts for verification, and then after they are done, > replaces it with "bad code", right? Yes. > Isn't it a bit far-fetched? Do we have a somewhat more realistic > attack scenario? I just cannot imagine all this happening in real > life. Real-life breakdowns always tend to be way simpler than their > theoretical scenarios (and totally unexpected, too). It's possible. It's not that hard. 
It would offer rewards to an attacker that are different and possibly larger than those offered by the simple tricks that reel in easy marks. So it doesn't seem that far-fetched to me. >>But are there more simple ways to trick conscientious, hash-checking >>users into running malware? > > > Users typically don't give a damn about hash-checking; they > expect the system to do that for them. And a few users that do give > a damn typically can defend themselves from pretty much anything no > matter what you throw at them. So the fate of this "expert" group > (consisting of about ten people for any given P2P system) does not > really worry me, whereas for the rest of the user population there > are plenty of ways to trick them into running the malware - *all* > the current ways of doing so are simpler than fiddling with hashes. If your attack is just to get someone, somewhere to run your malware, sure. But the average/mass user is not the only interesting case. If you want to get onto other, higher-valued machines, you have to get around the real practices of many users, of various sophistication, who do care about hashes of received content matching expected values. For such people, to get them to settle for MD5, you either convince them not to worry about the potential attack -- making them potential victims -- or you lose them as users, because they realize that the hashes used for content-identification on your network do not offer the guarantee they seek. That's not a good result. I want P2P+CDN that delivers content that I and other sophisticated users can trust, and I want the unsophisticated users on the same network, too: I gain from their presence as peers/ seeds, and they can gain from my insistence on rigorous content identification. > Which brings me back to my question above: do we have a > realistic scenario where a network like Gnutella would be harmed by > using MD5? 
Having installers like the fire.exe/ice.exe described by Kaminsky, which have the same MD5 but install different software, could quickly undermine confidence in an MD5-only P2P network for most kinds of content delivery. Telling average users (or businesses considering P2P delivery), "but that's only when the attacker gets to create both files", is noise to them. (And for pro users, telling them that they have to trust the original creator of the file not to have created twins is tantamount to requiring the content to be separately digitally signed to prove origination -- an additional step rendering the plain standalone MD5 for content identification superfluous.) > (Not that I give a damn about MD5, and no one in Gnutella probably > uses it anyway; my interest is largely theoretical here, and the same > issues might be relevant for the other hashes, either.) > > >>And since when did the ease of other attacks become an excuse >>for ignoring more complicated and subtle (and thus perhaps >>more valuable) attacks? > > > Why, every time you do not have infinite development resources, > of course. You always have to juggle priorities, and subtle attacks > typically are not anywhere close to the head of the development > priority list for P2P networks... Of course work has to be prioritized in context. But the priority list is not a single-file line, where a few frontmost entries prevent consideration of everything else. In particular, I would guess the "head of the development priority list" for most commercial P2P networks is dominated by user satisfaction issues. But these are only remedied incrementally, with research and trial and error. The risk of delay is incremental competitive decay, and the work is never really "done". At the same time, developers can be addressing other specific flaws -- failures of the software and chosen algorithms to deliver the functionality intended. Such flaws can't be ignored forever. 
They may be easy to fix with a discrete amount of effort. And since transitioning hash functions requires lead time, the groundwork should be laid before any change is urgent. >>Or better yet: design with the idea in mind that no hash function >>lives forever. > > > Sure; but that's orthogonal: > > >>If you're stuck with a legacy hash, fine, analyze the situation >>and if you're confident the weakness has no effect on current >>usage, rationalize using it a while longer. > > > My point exactly. The issue is whether one should consider the > deployed legacy codebase unsecure after every new discovery is made > in the hash collision research or not. My personal approach would be > to disregard the possible collision issues until there is a problem > serious enough to be noticed by CNN. (So far I still cannot see any > *realistic* attack scenario; maybe your next letter will convince me > that I'm wrong :-) But everyone has a personal "worry threshold", > I guess. Mine is pretty low... I suppose it depends on how high your ambitions for P2P are. Clearly, you can have a very popular network with a very weak hash for quite a while -- witness ED2K, using MD4, a hash "broken" for over a decade. But over time, users have become more aware of the importance of hash-based content-verification, and users have generally migrated in the direction of more-rigorous hash-using networks -- though not to the *most* rigorous networks. If P2P is just a leisure-time lark for credulous, casual users who have many other unhygienic computing practices, then you can be lackadaisical in your use of hash algorithms. If you want it to also be a platform stable for long-term use by more discriminating users and commercial endeavors, you should take the strength of your hashes seriously. If you wait until someone is hurt enough that the damage is reported on CNN, that's too long. - Gordon @ Bitzi > Best wishes - > S.Osokine. > 16 Feb 2005.
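An editorial aside, not part of the original exchange: Gordon's "design with the idea in mind that no hash function lives forever" can be sketched as algorithm-tagged content identifiers, so a network can retire a broken hash without changing its identifier format. The "algo:hexdigest" layout below is invented for illustration (the later multihash format follows the same idea):

```python
import hashlib

# Editor's sketch of hash agility: every identifier names the algorithm
# that produced it, so deprecating a hash is a policy change, not a
# wire-format change.
def make_id(data: bytes, algo: str = "sha256") -> str:
    return f"{algo}:{hashlib.new(algo, data).hexdigest()}"

def check_id(data: bytes, content_id: str, deprecated=("md5", "sha1")) -> bool:
    algo, _, digest = content_id.partition(":")
    if algo in deprecated:
        return False  # refuse identifiers minted with a retired hash
    return hashlib.new(algo, data).hexdigest() == digest

blob = b"payload"
assert check_id(blob, make_id(blob))              # current hash accepted
assert not check_id(blob, make_id(blob, "sha1"))  # retired hash rejected
```

Laying this groundwork in advance is the "lead time" point: once identifiers are self-describing, switching hashes after a break is an incremental migration rather than a flag day.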
From Serguei.Osokine at efi.com Thu Feb 17 18:37:34 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: <4A60C83D027E224BAA4550FB1A2B120E0DC368@fcexmb04.efi.internal> On Thursday, February 17, 2005 Gordon Mohr wrote: > I want P2P+CDN that delivers content that I and other sophisticated > users can trust, and I want the unsophisticated users on the same > network, too... > ... > If P2P is just a leisure-time lark for credulous, casual users who > have many other unhygienic computing practices, then you can be > lackadaisical in your use of hash algorithms. If you want it to > also be a platform stable for long-term use by more discriminating > users and commercial endeavors, you should take the strength of > your hashes seriously. Fair enough.
So how do you prevent the DNS-hijacking of Bitzi? Or - way more importantly - how do you prevent the fake .torrent files from being submitted to any number of torrent aggregator sites? That one would be a method of choice for me, if I'd be in the mood to distribute some malicious code to many machines at once, and would not be in the mood to use a virus for that purpose. Best wishes - S.Osokine. 17 Feb 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Gordon Mohr (@ Bitzi) Sent: Thursday, February 17, 2005 10:24 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] SHA1 broken? Serguei Osokine wrote: > On Wednesday, February 16, 2005 Gordon Mohr wrote: > >>Dan Kaminsky runs over a number of potential attacks that >>are relevant to P2P -- see: >> >> http://paketto.doxpara.com >>... >>Here's another example from the cryptography list that convinced >>a doubter... > > > Certainly looks cute. Now correct me if I'm not getting something > here - but isn't it true that in order to mount an attack one has to > replace the "good" code (content, whatever) by the "bad" code, and the > absolutely necessary condition is that the "good" code also has to be > created by an attacker? So an attacker creates "good" code, gives it > to security experts for verification, and then after they are done, > replaces it with "bad code", right? Yes. > Isn't it a bit far-fetched? Do we have a somewhat more realistic > attack scenario? I just cannot imagine all this happening in real > life. Real-life breakdowns always tend to be way simpler than their > theoretical scenarios (and totally unexpected, too). It's possible. It's not that hard. It would offer rewards to an attacker that are different and possibly larger than those offered by the simple tricks that reel in easy marks. So it doesn't seem that far-fetched to me. >>But are there more simple ways to trick conscientious, hash-checking >>users into running malware? 
> > > Users typically don't give a damn about hash-checking; they > expect the system to do that for them. And a few users that do give > a damn typically can defend themselves from pretty much anything no > matter what you throw at them. So the fate of this "expert" group > (consisting of about ten people for any given P2P system) does not > really worry me, whereas for the rest of the user population there > are plenty of ways to trick them into running the malware - *all* > the current ways of doing so are simpler than fiddling with hashes. If your attack is just to get someone, somewhere to run your malware, sure. But the average/mass user is not the only interesting case. If you want to get onto other, higher-valued machines, you have to get around the real practices of many users, of various sophistication, who do care about hashes of received content matching expected values. For such people, to get them to settle for MD5, you either convince them not to worry about the potential attack -- making them potential victims -- or you lose them as users, because they realize that the hashes used for content-identification on your network do not offer the guarantee they seek. That's not a good result. I want P2P+CDN that delivers content that I and other sophisticated users can trust, and I want the unsophisticated users on the same network, too: I gain from their presence as peers/ seeds, and they can gain from my insistence on rigorous content identification. > Which brings me back to my question above: do we have a > realistic scenario where a network like Gnutella would be harmed by > using MD5? Having installers like the fire.exe/ice.exe described by Kaminsky, which have the same MD5 but install different software, could quickly undermine confidence in an MD5-only P2P network for most kinds of content delivery. Telling average users (or businesses considering P2P delivery), "but that's only when the attacker gets to create both files", is noise to them. 
(And for pro users, telling them that they have to trust the original creator of the file not to have created twins is tantamount to requiring the content to be separately digitally signed to prove origination -- an additional step rendering the plain standalone MD5 for content identification superfluous.) > (Not that I give a damn about MD5, and no one in Gnutella probably > uses it anyway; my interest is largely theoretical here, and the same > issues might be relevant for the other hashes, either.) > > >>And since when did the ease of other attacks become an excuse >>for ignoring more complicated and subtle (and thus perhaps >>more valuable) attacks? > > > Why, every time you do not have infinite development resources, > of course. You always have to juggle priorities, and subtle attacks > typically are not anywhere close to the head of the development > priority list for P2P networks... Of course work has to be prioritized in context. But the priority list is not a single-file line, where a few frontmost entries prevent consideration of everything else. In particular, I would guess the "head of the development priority list" for most commercial P2P networks is dominated by user satisfaction issues. But these are only remedied incrementally, with research and trial and error. The risk of delay is incremental competitive decay, and the work is never really "done". At the same time, developers can be addressing other specific flaws -- failures of the software and chosen algorithms to deliver the functionality intended. Such flaws can't be ignored forever. They may be easy to fix with a discrete amount of effort. And since transitioning hash functions requires lead time, the groundwork should be laid before any change is urgent. >>Or better yet: design with the idea in mind that no hash function >>lives forever. 
> > > Sure; but that's orthogonal: > > >>If you're stuck with a legacy hash, fine, analyze the situation >>and if you're confident the weakness has no effect on current >>usage, rationalize using it a while longer. > > > My point exactly. The issue is whether one should consider the > deployed legacy codebase unsecure after every new discovery is made > in the hash collision research or not. My personal approach would be > to disregard the possible collision issues until there is a problem > serious enough to be noticed by CNN. (So far I still cannot see any > *realistic* attack scenario; maybe your next letter will convince me > that I'm wrong :-) But everyone has a personal "worry threshold", > I guess. Mine is pretty low... I suppose it depends on how high your ambitions for P2P are. Clearly, you can have a very popular network with a very weak hash for quite a while -- witness ED2K, using MD4, a hash "broken" for over a decade. But over time, users have become more aware of the importance of hash-based content-verification, and users have generally migrated in the direction of more-rigorous hash-using networks -- though not to the *most* rigorous networks. If P2P is just a leisure-time lark for credulous, casual users who have many other unhygienic computing practices, then you can be lackadaisical in your use of hash algorithms. If you want it to also be a stable platform for long-term use by more discriminating users and commercial endeavors, you should take the strength of your hashes seriously. If you wait until someone is hurt enough that the damage is reported on CNN, that's too long. - Gordon @ Bitzi > Best wishes - > S.Osokine. > 16 Feb 2005. > > > -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On > Behalf Of Gordon Mohr (@ Bitzi) > Sent: Wednesday, February 16, 2005 8:12 PM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] SHA1 broken?
> > > Serguei Osokine wrote: > >>On Wednesday, February 16, 2005 Gordon Mohr wrote: >> >> >>>MD5 should not be used for content identification, given the >>>ability to create content pairs with the same MD5, with one >>>version being (and appearing and acquiring a reputation for >>>being) innocuous, and the other version malicious. >> >> >> Right. So let's go and try to find something with the same >>MD5 as this letter of mine, shall we? :-) > > > I can't -- but you could have made a collision, very easily, if > you composed your initial message with the intent of also composing > an MD5 twin at the same time. > > That means for content identification MD5 is fatally flawed. For > any file whose contents I think I know and trust, perhaps based > on analysis and history of the file, there could be another > dangerous file with the same MD5. MD5 cannot be used to distinguish > between the two, but that's the whole point of using a secure > hash for content identification. > > Dan Kaminsky runs over a number of potential attacks that > are relevant to P2P -- see: > > http://paketto.doxpara.com > > Don't be fooled by the title of his analysis, "MD to be considered > harmful someday" -- the attacks mentioned are possible now, and > could trick people and software in subtle ways different from > other threats to P2P nets. > > Here's another example from the cryptography list that convinced > a doubter that the attacks on MD5 were of more than purely > theoretical interest: two long binary strings, one a prime number, > one not: > > http://lists.virus.org/cryptography-0412/msg00102.html > > Consider source code or executables which work fine with the > primes, s-boxes, and other initialization vectors initially > examined -- but have exploitable flaws when those values are > perturbed in a manner that leaves the MD5 the same. 
You need > to use a different, stronger content check to prevent such > mischief -- making the use of MD5 redundant and even dangerous > for the false sense of security it gives. > > >> For any practical purpose that I can imagine in a content >>identification field, MD5 is just fine. And SHA-1 is even more >>fine. > > > If you can't imagine exploits, perhaps it's just a failure of > your imagination. Prudent engineering would assume some attackers > have better imaginations than you, when it comes to exploiting > hashes that don't work as originally intended. > > >>There are plenty more simple ways to attack the CDN nets >>than MD5 collisions. Way more simple. And abandoning MD5 for >>SHA1, then SHA1 for Tiger, and then abandoning Tiger for some >>newer hash when some researcher finds that it is really twenty >>bits weaker than you thought - it is all just a huge waste of >>development effort, as far as I'm concerned. > > > Depends on the kinds of attacks you're worried about. There > are more simple ways to disrupt P2P nets, sure. But are there > more simple ways to trick conscientious, hash-checking users > into running malware? > > And since when did the ease of other attacks become an excuse > for ignoring more complicated and subtle (and thus perhaps > more valuable) attacks? > > If you need a secure hash's properties in your software, you > should use an uncompromised secure hash. (Results as early as > 1996 suggested MD5 should not be used in applications where > collision-resistance is important.) > > If you're stuck with a legacy hash, fine, analyze the situation > and if you're confident the weakness has no effect on current > usage, rationalize using it a while longer. But get ready for > the potential need to switch hashes quickly in the presence of > further discoveries. Or better yet: design with the idea in mind > that no hash function lives forever. 
> > - Gordon @ Bitzi _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From zooko at zooko.com Thu Feb 17 20:30:55 2005 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120E0DC368@fcexmb04.efi.internal> References: <4A60C83D027E224BAA4550FB1A2B120E0DC368@fcexmb04.efi.internal> Message-ID: This topic -- whether collision-resistance is or is not necessary for secure identification of content -- has been discussed extensively on the cryptography@metzdowd mailing list recently. Ben Laurie started it with a post entitled "The pointlessness of MD5 attacks". Here is my contribution to that discussion: http://thread.gmane.org/gmane.comp.encryption.general/5717 This note I posted alludes to this kind of situation: Bob, the honest and noble software maintainer, writes a good piece of software, S1, and then asks Charles the Malicious Multimedia Master to give him an icon to include in the package.
Charles writes some malicious software S2, and then finds an icon I1 and another icon I2 such that MD5(B1) == MD5(B2), where B1 is the binary package resulting from packaging software S1 and icon I1, and B2 is the binary package resulting from packaging software S2 and icon I2. Charles then gives I1 to Bob, who compiles B1 himself. Bob generates T1 == MD5(B1) and distributes B1, telling Alice "Please verify that the binary package you download and run matches the hash T1." Charles then sends Alice a copy of the binary software package B2; she verifies that MD5(B2) == T1, and then trusts the binary package as though it were a package that Bob wrote. Now to be clear: I don't know if the current attacks on MD5 and SHA1 enable Charles to do this! Because I don't know if those attacks can be used when there is a fixed IV or a fixed part of the message which is chosen by someone (Bob) other than the attacker (Charles). However, I do know that if a hash is collision-resistant then the situation outlined above cannot occur, but that if a hash is non-collision-resistant, then the situation outlined above *might* be possible, even if the hash is second-preimage resistant. I guess the challenge presented to Charles in the situation outlined above occupies a sort of middle ground between collision-resistance and second-preimage-resistance. The HMAC challenge occupies another niche in that middle ground -- in the situation described above, Charles is given a fixed IV or a fixed part-of-the-message. In the HMAC situation, Charles is faced with an IV which is random and unknown to him. Regards, Zooko From clodoaldo_gouveia at yahoo.com.br Thu Feb 17 20:34:56 2005 From: clodoaldo_gouveia at yahoo.com.br (Clodoaldo Gouveia) Date: Sat Dec 9 22:12:50 2006 Subject: [p2p-hackers] Changes in Pastry 1.3.2 Message-ID: <20050217203456.76779.qmail@web41127.mail.yahoo.com> Hi There...
We are working on a peer-to-peer project in Brazil that uses the DHT implementation FreePastry 1.3.1, and we now want to move to the newer release, FreePastry 1.3.2. I'd like to know whether anyone here has upgraded an application from FreePastry 1.3.1 to 1.3.2, and what the main differences between the two are. In particular, I'd like to know what changed in message routing: after the upgrade, some errors appeared in our application in the class rice.pastry.messaging.MessageDispatch! I would be very glad for any help, and would like to know what I have to change in my application to fix this problem. Thanks in advance... Clodoaldo Gouveia and Hermano Toscano --------------------------------- Yahoo! Acesso Grátis - Internet rápida e grátis. Instale o discador do Yahoo! agora. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20050217/7394484d/attachment.html From zooko at zooko.com Thu Feb 17 20:39:27 2005 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: References: <4A60C83D027E224BAA4550FB1A2B120E0DC368@fcexmb04.efi.internal> Message-ID: <248c7321e5782025ad5dd664665dbbff@zooko.com> Following up to my own post to correct and add URLs. I wrote: > This topic -- whether collision-resistance is or is not necessary for > secure identification of content -- has been discussed extensively on > the cryptography@metzdowd mailing list recently. Ben Laurie started > it with a post entitled "The pointlessness of MD5 attacks". Here is > my contribution to that discussion: > > http://thread.gmane.org/gmane.comp.encryption.general/5717 ^-- actually, that's the URL to Ben Laurie's original post that started the discussion.
Here's the URL I intended to give -- the URL to my own post about Alice the user, Bob the software maintainer, and Charles the Malicious Multimedia Master: http://article.gmane.org/gmane.comp.encryption.general/5789 Here's the URL to Adam Back's post which suggested the technique which could lead to this bad situation without violating the second-preimage-resistance of the hash function: http://article.gmane.org/gmane.comp.encryption.general/5729 Regards, Zooko From nlothian at educationau.edu.au Thu Feb 17 21:47:36 2005 From: nlothian at educationau.edu.au (Nick Lothian) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: > > Dan Kaminsky runs over a number of potential attacks that > are relevant > > to P2P -- see: > > > > http://paketto.doxpara.com > > ... > > Here's another example from the cryptography list that convinced a > > doubter... > > Certainly looks cute. Now correct me if I'm not getting > something here - but isn't it true that in order to mount an > attack one has to replace the "good" code (content, whatever) > by the "bad" code, and the absolutely necessary condition is > that the "good" code also has to be created by an attacker? > So an attacker creates "good" code, gives it to security > experts for verification, and then after they are done, > replaces it with "bad code", right? > > Isn't it a bit far-fetched? Do we have a somewhat more > realistic attack scenario? I just cannot imagine all this > happening in real life. Real-life breakdowns always tend to > be way simpler than their theoretical scenarios (and totally > unexpected, too). > According to some reports, some anti-spyware tools currently use MD5 hashes to find known-bad software (see http://malektips.com/microsoft_antispyware_0007.html). It's not hard to imagine spyware manufacturers modifying common open-source applications (e.g. p2p software) so they include spyware and yet still have the same hash.
Nick IMPORTANT: This e-mail, including any attachments, may contain private or confidential information. If you think you may not be the intended recipient, or if you have received this e-mail in error, please contact the sender immediately and delete all copies of this e-mail. If you are not the intended recipient, you must not reproduce any part of this e-mail or disclose its contents to any other party. This email represents the views of the individual sender, which does not necessarily reflect those of education.au limited except where the sender expressly states otherwise. It is your responsibility to scan this email and any files transmitted with it for viruses or any other defects. education.au limited will not be liable for any loss, damage or consequence caused directly or indirectly by this e-mail. From Serguei.Osokine at efi.com Thu Feb 17 22:11:28 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: <4A60C83D027E224BAA4550FB1A2B120E0DC36B@fcexmb04.efi.internal> On Thursday, February 17, 2005 Nick Lothian wrote: > It's not hard to imagine spyware manufactures modifying common > opensource applications (eg: p2p software) so they include spyware > and yet still have the same hash. Sure, but then they would have to find some innocently looking way to include something like this into the open source app: d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c 2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89 55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a 08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6 dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0 e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70 - which is no big deal, could be a bitmap. 
However, after that they would have to modify the application to use the text above as a jump table to malicious code, which would be dormant in the application until the data is changed to: d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c 2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89 55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a 08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6 dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0 e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70 If they can pull all of this off without raising any suspicion (which is not a huge problem if no one reads CVS diffs or sources), then they might as well just jump to this malicious code, or jump to it on some moderately obfuscated condition, since no one would notice this code or the jump to begin with if no one monitors the sources. Using an MD5 collision to do that seems like a particularly convoluted way to achieve a goal that can be achieved far more simply without it. Of course, if one is a Rube Goldberg fan, this is something he might want to do as a matter of principle... :-) Best wishes - S.Osokine. 17 Feb 2005.
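The two 128-byte blocks Osokine quotes above are the published Wang et al. MD5 collision pair, so the claim that they hash identically -- and Zooko's earlier point that a random, unknown IV (as in HMAC) defeats the trick -- can both be checked directly. A minimal sketch in Python's standard library, with the hex transcribed from the message above:

```python
import hashlib
import hmac

block_a = bytes.fromhex(
    "d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f89"
    "55ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5b"
    "d8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0"
    "e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70")
block_b = bytes.fromhex(
    "d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f89"
    "55ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5b"
    "d8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0"
    "e99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70")

# The inputs differ (in 6 bytes), yet their MD5 digests are identical.
assert block_a != block_b
assert hashlib.md5(block_a).digest() == hashlib.md5(block_b).digest()

# Zooko's HMAC point: the published collision only works from MD5's fixed
# standard IV. Keying the hash (any key will do for this demonstration)
# changes the internal state the blocks are absorbed into, so the MACs differ.
key = b"any secret key"
mac_a = hmac.new(key, block_a, hashlib.md5).digest()
mac_b = hmac.new(key, block_b, hashlib.md5).digest()
assert mac_a != mac_b
```

Because both blocks also leave MD5's internal chaining state identical, the collision survives appending any common suffix -- which is what makes embedding such blocks in larger files plausible at all.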
From hal at finney.org Thu Feb 17 22:25:36 2005 From: hal at finney.org (Hal Finney) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: <20050217222536.6174957EBA@finney.org> The problem with the attack scenario where two versions of a program are created with the same hash is that, from what little we know of the new attacks, they aren't powerful enough to do this. All of the collisions they have shown have the property where the two alternatives start with the same initial value for the hash; they then have one or two blocks which are very carefully selected, with a few bits differing between the two blocks; and at the end, they are back to a common value for the hash. It is known that their techniques are not sensitive to this initial value. They actually made a mistake when they published their MD5 collision, because they had the wrong initial values due to a typo in Schneier's book. When people gave them the correct initial values, they were able to come up with new collisions within a matter of hours. If you look at their MD5 collision in detail, it was two blocks long. Each block was almost the same as the other, with just a few bits different. They start with the common initial value. Then they run the first blocks through. Amazingly, this has only a small impact on the intermediate value after this first block. Only a relatively few bits are different. If you or I tried to take two blocks with a few bits different and feed them to MD5, we would get totally different outputs. Changing even one bit will normally change half the output bits.
The fact that they are able to change several bits and get only a small difference in the output is the first miracle. But then they do an even better trick. They now go on and do the second pair of blocks. The initial values for these blocks (which are the outputs from the previous stage) are close but not quite the same. And amazingly, these second blocks not only keep things from getting worse, they manage to heal the differences. They precisely compensate for the changes and bring the values back together. This is the second miracle and it is even greater. Now, it would be a big leap from this to being able to take two arbitrary different initial values and bring them together to a common output. That is what would be necessary to mount the code fraud attack. But as we can see by inspection of the collisions produced by the researchers (who are keeping their methodology secret for now), they don't seem to have that power. Instead, they are able to introduce a very carefully controlled difference between the two blocks, and then cancel it. Being able to cancel a huge difference between blocks would be a problem of an entirely different magnitude. Now, there is this other idea which Zooko alludes to, from Dan Kaminsky, www.doxpara.com, which could exploit the power of the new attacks to do something malicious. Let us grant that the only ability we have is that we can create slightly different pairs of blocks that collide. We can't meaningfully control the contents of these blocks, and they will differ in only a few bits. And these blocks have to be inserted into a program being distributed, which will have two versions that are *exactly the same* except for the few bits of difference between the blocks. This way the two versions will have the same hash, and this is the power which the current attacks seem to have. Kaminsky shows that you could still have "good" and "bad" versions of such a program. 
You'd have to write a program which tested a bit in the colliding blocks, and behaved "good" if the bit was set, and "bad" if the bit was clear. When someone reviewed this program, they'd see the potential bad behavior, but they'd also see that the behavior was not enabled because the bit that enabled it was not set. Maybe the bad behavior could be a back door used during debugging, and there is some flag bit that turns off the debugging mode. So the reviewer might assume that the program was OK despite this somewhat questionable code, because he builds it and makes sure to sign or validate the hash when built in the mode when the bad features are turned off. But what he doesn't know is, Kaminsky has another block of data prepared which has that flag bit in the opposite state, and which he can substitute without changing the hash. That will cause the program to behave in its "bad" mode, even though the only change was a few bits in this block of random data. So this way he can distribute a malicious build and it has the hash which was approved by the reviewer. And as Zooko points out, this doesn't have to be the main developer who is doing this, anyone who is doing some work on creating the final package might be able to do so. On the other hand, this attack is pretty blatant once you know it is possible. The lesson is that a reviewer should be suspicious of code whose security properties depend on the detailed contents of blocks of random-looking data. One problem with this is that there are some circumstances where it could be hard to tell. Zooko links to the example of a crypto key which could have weak and strong versions. The strong version could be approved and then the weak version substituted. There are also some crypto algorithms that use random-looking blocks of data which could have weak and strong versions. So it's not always as easy as it sounds. 
But most code will not have these problems, and for those programs it would be pretty conspicuous to implement Kaminsky's attacks. At present, that looks to be the best someone could do with SHA-1 or even MD5. Hal Finney From nlothian at educationau.edu.au Thu Feb 17 22:31:16 2005 From: nlothian at educationau.edu.au (Nick Lothian) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken? Message-ID: > > On Thursday, February 17, 2005 Nick Lothian wrote: > > It's not hard to imagine spyware manufactures modifying common > > opensource applications (eg: p2p software) so they include > spyware and > > yet still have the same hash. > > Sure, but then they would have to find some innocently > looking way to include something like this into the open source app: > No - they just release the built .exe without the source (or, even better, hack the original download site and replace the original version with their malicious one; if the hashes of the apps matched, this could be pretty hard to detect). Nick
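Hal's description of the flag-bit trick can be made concrete with the collision pair quoted earlier in the thread: because MD5 is a Merkle-Damgard construction and both 128-byte blocks leave its internal state identical, appending any common suffix preserves the collision, and a program can branch on one of the few bits that differ. A toy Python sketch -- the "program" bytes and the flag semantics are hypothetical, purely for illustration:

```python
import hashlib

# Wang et al. colliding pair, as quoted earlier in the thread (6 bytes differ).
COLLIDING = [bytes.fromhex(h) for h in (
    "d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f89"
    "55ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5b"
    "d8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0"
    "e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70",
    "d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f89"
    "55ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5b"
    "d8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0"
    "e99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70",
)]

# The rest of the "package" is byte-for-byte identical in both versions;
# any common suffix after the colliding blocks preserves the collision.
program = b"...identical program text that branches on the flag below..."
pkg_reviewed, pkg_shipped = (blk + program for blk in COLLIDING)

def debug_backdoor_enabled(package: bytes) -> bool:
    # Hypothetical flag check: byte 19 of the embedded data differs
    # between the two blocks (0x87 vs 0x07), so its top bit flips.
    return not (package[19] & 0x80)

same_md5 = hashlib.md5(pkg_reviewed).digest() == hashlib.md5(pkg_shipped).digest()
print(same_md5)                              # True: identical MD5 digests
print(debug_backdoor_enabled(pkg_reviewed))  # False: reviewer sees it disabled
print(debug_backdoor_enabled(pkg_shipped))   # True: shipped copy enables it
```

The reviewer approves the hash of `pkg_reviewed`; the attacker ships `pkg_shipped`, which matches that hash but takes the other branch -- exactly the scenario Hal outlines.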
From mccoy at mad-scientist.com Thu Feb 17 22:31:43 2005 From: mccoy at mad-scientist.com (Jim McCoy) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken? In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120E0DC36B@fcexmb04.efi.internal> References: <4A60C83D027E224BAA4550FB1A2B120E0DC36B@fcexmb04.efi.internal> Message-ID: On Feb 17, 2005, at 2:11 PM, Serguei Osokine wrote: > On Thursday, February 17, 2005 Nick Lothian wrote: >> It's not hard to imagine spyware manufactures modifying common >> opensource applications (eg: p2p software) so they include spyware >> and yet still have the same hash. > > Sure, but then they would have to find some innocently looking > way to include something like this into the open source app: > [collision data A] > - which is no big deal, could be a bitmap. However, after that they > would have to modify the application to use the text above as a jump > table to a malicious code, which would be dormant in the application > until the data is changed to: > [collision data B] So tell me, when was the last time you ran your SSL library through a debugger to determine with complete confidence that the moduli being used were not insecure ones? With this attack I could distribute a copy of a crypto library that seemed to match the hash it was supposed to have, but which was in fact opening you up to certain crypto attacks. As was pointed out on the crypto list thread Zooko referenced, this is probably the only practical attack that can be made in this fashion right now. I can replace your crypto modulus, some RNG seeds, and other bits of data that are used by applications (and not just displayed, like the bitmap you suggest) which are of the "big, random-looking number" format. Jim From Serguei.Osokine at efi.com Thu Feb 17 23:23:27 2005 From: Serguei.Osokine at efi.com (Serguei Osokine) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SHA1 broken?
Message-ID: <4A60C83D027E224BAA4550FB1A2B120E0DC36C@fcexmb04.efi.internal> On Thursday, February 17, 2005 Jim McCoy wrote: > With this attack I could distribute a copy of a crypto library that > seemed to match the hash it was supposed to have, but which was in > fact opening you up to certain crypto attacks. And on Thursday, February 17, 2005 Nick Lothian wrote: > ...they just release the built .exe without the source (or even > better - hack the original download site and replace the original > version with their malicious version. If the hashes of the apps > matched this could be pretty hard to detect). Yes to both; but only if it would be your library to begin with, because it is essential that the *original* crypto library should have "collision data A" - without it, this attack is impossible. And not only that - the *original* crypto library would also have to have a) the malicious code prepared to be launched (say, granting root access or something), and b) the jump to this code that would not be executed with "data A", but would - with "data B". And all of this should be already present and ready for launch in an original library, the one that would be used by everyone for years and would not do anything visibly improper. RSA could do this, for sure. Heck, anyone who owns some cryptolib could do that - who scrutinizes cryptolib sources anyway? And it would be even simpler if only the binaries are distributed. But if you own a widely used cryptolib, you have more simple ways to include a backdoor into your code and to activate it on an innocently looking external event - especially if you do not show anyone the sources and distribute only the binaries. For anyone *but* the original code author, however, achieving a malicious collision this way would be impossible. So the Bad Charlie Webmaster from Zooko is pretty much out of luck - he'd have to conspire with an honest programmer Bob to do any harm. 
And an innocent programmer Bob is quite capable of doing plenty of harm even without any help and without knowing anything about hash properties, if he only pretends to be honest long enough. Why would he want to bring Charlie into his scam? Best wishes - S.Osokine. 17 Feb 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Jim McCoy Sent: Thursday, February 17, 2005 2:32 PM To: Peer-to-peer development. Subject: Re: [p2p-hackers] SHA1 broken? On Feb 17, 2005, at 2:11 PM, Serguei Osokine wrote: > On Thursday, February 17, 2005 Nick Lothian wrote: >> It's not hard to imagine spyware manufactures modifying common >> opensource applications (eg: p2p software) so they include spyware >> and yet still have the same hash. > > Sure, but then they would have to find some innocently looking > way to include something like this into the open source app: > [collision data A] > - which is no big deal, could be a bitmap. However, after that they > would have to modify the application to use the text above as a jump > table to a malicious code, which would be dormant in the application > until the data is changed to: > [collision data B] So tell me, when was the last time you ran your SSL library through a debugger to determine with complete confidence that the modulii being used were not insecure ones? With this attack I could distribute a copy of a crypto library that seemed to match the hash it was supposed to have, but which was in fact opening you up to certain crypto attacks. As was pointed out on the crypto list thread zooko referenced, this is probably the only practical attack that can be made in this fashion right now. I can replace your crypto modulus, some RNG seeds, and other bits of data that are used by applications (and not just displayed, like the bitmap you suggest) which are of the "big, random-looking number" format. 
Jim _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From gojomo at bitzi.com Fri Feb 18 05:54:15 2005 From: gojomo at bitzi.com (Gordon Mohr (@ Bitzi)) Date: Sat Dec 9 22:12:51 2006 Subject: Other P2P attacks (DNS, fake torrents, etc) Re: [p2p-hackers] SHA1 broken? In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120E0DC368@fcexmb04.efi.internal> References: <4A60C83D027E224BAA4550FB1A2B120E0DC368@fcexmb04.efi.internal> Message-ID: <42158307.60008@bitzi.com> Serguei Osokine wrote: > On Thursday, February 17, 2005 Gordon Mohr wrote: > >>I want P2P+CDN that delivers content that I and other sophisticated >>users can trust, and I want the unsophisticated users on the same >>network, too... >>... >>If P2P is just a leisure-time lark for credulous, casual users who >>have many other unhygienic computing practices, then you can be >>lackadaisical in your use of hash algorithms. If you want it to >>also be a platform stable for long-term use by more discriminating >>users and commercial endeavors, you should take the strength of >>your hashes seriously. > > > Fair enough. So how do you prevent the DNS-hijacking of Bitzi? Good question. There's no protection yet. I've assumed that when the budget and interest allow, we'd (1) offer signed versions of our XML metadata tickets; and (2) offer https service for some or all users. Other ideas welcome. > Or - way more importantly - how do you prevent the fake .torrent files > from being submitted to any number of torrent aggregator sites? I assume the torrent aggregators have some way of vetting submissions, whether by the reputation of the submitter, by early testing and reviews from the most adventurous users, and so forth. I'm currently not immersed in those communities, so I don't know for sure.
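Idea (1) above - signed metadata tickets - might be sketched as follows. The ticket body, field names, and key are invented for illustration, and an HMAC stands in for the detached public-key signature a real service would publish (clients could not actually hold a shared secret, so this is the shape of the check, not a deployable design):

```python
# Sketch of "signed metadata tickets": serve each ticket with a detached
# signature so a DNS hijacker serving forged tickets fails verification.
# SERVICE_KEY, the ticket XML, and the field names are all hypothetical;
# HMAC stands in for a real public-key signature scheme.
import hashlib
import hmac

SERVICE_KEY = b"bitzi-signing-key (hypothetical)"

def sign_ticket(ticket_xml: bytes) -> bytes:
    return hmac.new(SERVICE_KEY, ticket_xml, hashlib.sha256).hexdigest().encode()

def verify_ticket(ticket_xml: bytes, sig: bytes) -> bool:
    # Constant-time comparison avoids leaking where the forgery differs.
    return hmac.compare_digest(sign_ticket(ticket_xml), sig)

ticket = b"<ticket><urn>urn:sha1:EXAMPLE</urn><rating>good</rating></ticket>"
sig = sign_ticket(ticket)
assert verify_ticket(ticket, sig)
# A hijacker tampering with the ticket cannot keep the signature valid:
assert not verify_ticket(ticket.replace(b"good", b"spyware"), sig)
```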
Anyone else want to chime in on how torrent aggregator sites manage to tend toward quality torrents over time? - Gordon @ Bitzi From eugen at leitl.org Fri Feb 18 12:41:02 2005 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] [i2p] 0.5 is available (fwd from jrandom@i2p.net) Message-ID: <20050218124102.GI1404@leitl.org> ----- Forwarded message from jrandom ----- From: jrandom Date: Fri, 18 Feb 2005 03:39:24 -0800 To: i2p@i2p.net Subject: [i2p] 0.5 is available -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi y'all, After 6 months of work on the 0.4 series, we've implemented and deployed the new streaming library, integrated and tested bittorrent, mail, and naming apps, fixed a bunch of bugs, and learned as much as we could from real-world users. We now have a new 0.5 release which reworks the tunnel routing algorithms, improving security and anonymity while giving the user more control over their own performance-related tradeoffs. In addition, we've bundled susi23's susimail client, upgraded to the latest Jetty (allowing both symlinks and CGI), and a whole lot more. This new release is not backwards compatible - you must upgrade to get anything useful done. There has been a lot of work going on since 0.4.2.6 a month and a half ago, with contributions by smeghead, duck, Jhor, cervantes, Ragnarok, Sugadude, and the rest of the rabid testers in #i2p and #i2p-chat. I could write for pages describing what's up, but instead I'll just direct you to the change log at http://dev.i2p.net/cgi-bin/cvsweb.cgi/i2p/history.txt?rev=HEAD For the impatient, please review the install and update instructions up at http://www.i2p.net/download Please note that since this new release updates the classpath, the update process will require you to start up the router again after it finishes. Any local modifications to the wrapper.config will be lost when updating, so please be sure to back it up.
In addition, even though this new release includes the latest Jetty (5.1.2), if you want to enable CGI support, you will need to edit your ./eepsite/jetty.xml to include:

  /cgi-bin/*
  ./eepsite/cgi-bin
  Common Gateway Interface
  /
  org.mortbay.servlet.CGI
  /usr/local/bin:/usr/ucb:/bin:/usr/bin

adjusting the Path as necessary for your OS/distro/tastes. New users have it easy - all of this is done for them. While the docs on the website haven't been updated to reflect the new tunnel routing and crypto changes yet, the nitty gritty is up at http://dev.i2p.net/cgi-bin/cvsweb.cgi/i2p/router/doc/tunnel-alt.html?rev=HEAD There will be another release in the 0.5 series beyond this one, including more options for allowing the user to control the impact of predecessor attacks on their anonymity. There will certainly be performance and load balancing improvements as well, using the feedback we get deploying the new tunnel code on a wider network. Until the UDP transport is added in 0.6, we will want to continue to be fairly low key, as we've already run into the default limits on some braindead OSes (*cough*98*cough*). There is much we can improve upon while the network is small though, and while I know we all want to go out and show the world what I2P can do, another two months waiting won't hurt. Anyway, that's that. The new net is up and running, squid.i2p and other services should be up, you know where to get the goods, so get goin'!
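(The jetty.xml fragment quoted above lost its angle-bracket tags when the message was archived as HTML; only the values survived. A reconstruction in Jetty 5's XmlConfiguration style might look like the following - the element names here are guesses, so check the I2P docs rather than copying this verbatim.)

```xml
<!-- Hypothetical reconstruction: tag names are assumed, values are taken
     from the message above. The leftover values "./eepsite/cgi-bin" and
     "/" are likely the CGI resource base and context path. -->
<Call name="addServlet">
  <Arg>Common Gateway Interface</Arg>
  <Arg>/cgi-bin/*</Arg>
  <Arg>org.mortbay.servlet.CGI</Arg>
  <Put name="Path">/usr/local/bin:/usr/ucb:/bin:/usr/bin</Put>
</Call>
```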
=jr -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFCFc3OGnFL2th344YRAszOAKCfTh/OOAAyonRmKoRF/iw5BwRkZACgpGp4 qHMJkSo2mzjHTHRf98fsvdM= =Vfl3 -----END PGP SIGNATURE----- _______________________________________________ i2p mailing list i2p@i2p.net http://i2p.dnsalias.net/mailman/listinfo/i2p ----- End forwarded message ----- -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050218/64d6b1e1/attachment.pgp From cefn.hoile at bt.com Fri Feb 18 14:14:57 2005 From: cefn.hoile at bt.com (cefn.hoile@bt.com) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Internship in P2P technologies with British Telecommunications Message-ID: <21DA6754A9238B48B92F39637EF307FD05B1A127@i2km41-ukdy.domain1.systemhost.net> The job opening linked below may be of interest to some people on this list. Apologies for cross-posting. The Pervasive Systems Laboratory has a 12-month internship available as part of a research project in large-scale resilient networks. The project is focused on pervasive and adaptive networked systems. This position is part of a new and expanding project to develop cutting-edge applications and novel solutions for dependable and robust ICT networks. The position offers a challenging and varied research environment and would be suitable for applicants who may, for example, be at the final stages of their MSc or PhD studies.
...more information can be found at http://cefn.com/cefn/?PICTJob Cefn http://cefn.com

From dbarrett at quinthar.com Fri Feb 18 20:20:18 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] [i2p] 0.5 is available (fwd from jrandom@i2p.net) In-Reply-To: <20050218124102.GI1404@leitl.org> References: <20050218124102.GI1404@leitl.org> Message-ID: <1108758022.2299409D@bf12.dngr.org> Just curious, what are the "default limits" you ran into under Win 98? Indeed, if you could summarize the top five lessons learned from your real-world users, what would they be? I'm sure we'd all like to learn them, too. -david On Fri, 18 Feb 2005 5:01 am, Eugen Leitl wrote: > [full 0.5 announcement quoted above - trimmed]

From eugen at leitl.org Fri Feb 18 22:31:26 2005 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] [i2p] 0.5 is available (fwd from jrandom@i2p.net) (fwd from dbarrett@quinthar.com) (fwd from nickpro79@mail.ru) Message-ID: <20050218223126.GG1404@leitl.org> ----- Forwarded message from Nikita Proskourine ----- From: Nikita Proskourine Date: Fri, 18 Feb 2005 16:54:19 -0500 To: Eugen Leitl Cc: i2p@i2p.net Subject: Re: [p2p-hackers] [i2p] 0.5 is available (fwd from jrandom@i2p.net) (fwd from dbarrett@quinthar.com) X-Mailer: Evolution 2.0.2 I am guessing that the default limit in question is one of the things mentioned on http://support.microsoft.com/kb/q158474/ (MaxConnections). It defaults to 100 and can be raised, but on Win98 I believe the actual limit is min(MaxConnections, 256). Nick. On Fri, 2005-02-18 at 21:31 +0100, Eugen Leitl wrote: > ----- Forwarded message from David Barrett ----- > > From: David Barrett > Date: Fri, 18 Feb 2005 12:20:18 -0800 > To: "Peer-to-peer development." > Subject: Re: [p2p-hackers] [i2p] 0.5 is available (fwd from jrandom@i2p.net) > X-Mailer: Danger Service > Reply-To: David Barrett , > "Peer-to-peer development."
> > Just curious, what are the "default limits" you ran into under Win 98? > Indeed, if you could summarize the top five lessons learned from your > real-world users, what would they be? I'm sure we'd all like to learn > them, too. > > -david > > On Fri, 18 Feb 2005 5:01 am, Eugen Leitl wrote: > >[full 0.5 announcement quoted above - trimmed]

----- End forwarded message ----- -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050218/f1208cda/attachment.pgp From baford at mit.edu Fri Feb 18 23:44:43 2005 From: baford at mit.edu (Bryan Ford) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Final version of "P2P over NAT" paper available Message-ID: <200502190044.43425.baford@mit.edu> Hi folks, For those interested in P2P-over-NAT issues, I just wanted to announce that the final version of the following paper, to appear in USENIX '05, is now available: Peer-to-Peer Communication Across Network Address Translators, Bryan Ford, Pyda Srisuresh, and Dan Kegel. USENIX Annual Technical Conference, April 2005. (PDF) http://www.brynosaurus.com/pub/net/p2pnat.pdf (HTML) http://www.brynosaurus.com/pub/net/p2pnat/ An earlier draft of this paper was announced on this list a few months ago. The final version includes, among other minor revisions, new "NAT Check" testing results based on almost twice the number of data points as the original draft. Cheers, Bryan --- Abstract: Network Address Translation (NAT) causes well-known difficulties for peer-to-peer (P2P) communication, since the peers involved may not be reachable at any globally valid IP address. Several NAT traversal techniques are known, but their documentation is slim, and data about their robustness or relative merits is slimmer. This paper documents and analyzes one of the simplest but most robust and practical NAT traversal techniques, commonly known as ``hole punching.'' Hole punching is moderately well-understood for UDP communication, but we show how it can be reliably used to set up peer-to-peer TCP streams as well. After gathering data on the reliability of this technique on a wide variety of deployed NATs, we find that about 82% of the NATs tested support hole punching for UDP, and about 64% support hole punching for TCP streams. 
As NAT vendors become increasingly conscious of the needs of important P2P applications such as Voice over IP and online gaming protocols, support for hole punching is likely to increase in the future.

From ap at hamachi.cc Sat Feb 19 07:04:10 2005 From: ap at hamachi.cc (Alex Pankratov) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Final version of "P2P over NAT" paper available In-Reply-To: <200502190044.43425.baford@mit.edu> References: <200502190044.43425.baford@mit.edu> Message-ID: <4216E4EA.2030704@hamachi.cc> Bryan, Quoting your paper - > .. we find that about 82% of the NATs tested support hole punching > for UDP. > .. > The NAT Check data we gathered consists of 380 reported data points > .. I happen to have statistics for more than 16000 data points, and check this out - the rate of 'identity preserving' NAT devices suitable for hole punching works out to be 82.2%. *UDP* hole punching, that is. Alex Bryan Ford wrote: > [announcement and abstract quoted above - trimmed]

From dbarrett at quinthar.com Sat Feb 19 08:03:33 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Final version of "P2P over NAT" paper available In-Reply-To: <4216E4EA.2030704@hamachi.cc> Message-ID: <20050219080338.212D33FC27@capsicum.zgp.org> Heh, great validation of the results. So what are the latest values for the following chart?

                       NAT'd     Firewalled
                      +---------+-------------
 % Able to hole punch | 82.2%   | 50-60% *
 % of total internet  | ??      | ??
                      +---------+-------------
 % Benefiting         | ??      | ??

 * http://zgp.org/pipermail/p2p-hackers/2004-December/002215.html

Basically, I'd like to get a better understanding of what fraction of all internet users might benefit from these techniques, estimated as the product of the above rows.
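The hole-punching sequence the paper analyzes - each peer learns the other's public endpoint from a rendezvous server, then both fire datagrams at roughly the same time so each NAT sees outbound traffic before the inbound probe arrives - can be sketched as follows. This demo runs both peers on loopback, so no NAT is actually traversed and the rendezvous exchange is replaced by handing each socket the other's address; it shows only the mechanics, not a working traversal.

```python
# Minimal sketch of UDP hole-punching mechanics. On the real Internet the
# peers would first learn each other's public (IP, port) pair via a
# rendezvous server; here both endpoints live on loopback and the
# addresses are handed over directly, so no NAT is involved.
import socket

def make_peer():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))   # ephemeral port, standing in for a NAT mapping
    s.settimeout(2.0)
    return s

a, b = make_peer(), make_peer()
addr_a, addr_b = a.getsockname(), b.getsockname()

# Step 1: both sides send an outbound probe to the other's endpoint.
# Against real NATs, this outbound traffic is what opens ("punches")
# the mapping that will let the peer's inbound packet through.
a.sendto(b"punch", addr_b)
b.sendto(b"punch", addr_a)

# Step 2: each side receives the other's probe; the path is now open
# in both directions and ordinary traffic can flow.
msg_at_b, src = b.recvfrom(1500)
msg_at_a, _ = a.recvfrom(1500)
a.sendto(b"hello from a", addr_b)
payload, _ = b.recvfrom(1500)
a.close()
b.close()
```

The ~82% figure in the thread is about whether a given NAT preserves the same public mapping across destinations; when it does not (symmetric NATs), the address learned at the rendezvous server is wrong and this simple exchange fails.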
-david

> -----Original Message----- > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On > Behalf Of Alex Pankratov > Sent: Friday, February 18, 2005 11:04 PM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] Final version of "P2P over NAT" paper available > > [Alex's reply and the quoted announcement trimmed - see above]

From Ian.Wiles at blueyonder.co.uk Sat Feb 19 15:14:32 2005 From: Ian.Wiles at blueyonder.co.uk (Ian Wiles) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Learning how to build a P2P system Message-ID: Hello, I am pondering creating my own P2P system, but am having a bit of difficulty finding enough technical information. I am looking for technical documentation on p2p protocols and more high-level material on the topologies used. Would anyone recommend O'Reilly's Peer to Peer and/or Ian Taylor's From P2P to Web Services and Grids: Peers in a Client/Server World?
I'm hoping that someone could point me in the right direction for web-based information, as obtaining these books depends on the library at the moment. I should also probably give a brief explanation of the system I'd like to develop. Basically it's a system to pass around text messages in a forum/usenet type of setup. However, I would like the news spool to be distributed across all nodes; presumably this would require some sort of mirroring/backup as nodes go offline. This is why I would like to find some information on the techniques used, as I'm sure they'd be better than the solutions I would come up with. I would like to do this using Java as it's the language I'm most familiar with at the moment. The question is, should I bother starting from scratch or is that a recipe for disaster? Should I use something such as JXTA instead? I'm not too keen on using XML messages, but I'm easily persuaded. Originally I had planned on using binary packets. Thanks. -- Ian

From jdd at dixons.org Sat Feb 19 19:28:11 2005 From: jdd at dixons.org (Jim Dixon) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Learning how to build a P2P system In-Reply-To: References: Message-ID: On Sat, 19 Feb 2005, Ian Wiles wrote: > I should also probably give a brief explanation of the system I'd like > to develop. Basically it's a system to pass around text messages in a > forum/usenet type of setup. However I would like the news spool to be > distributed across all nodes, presumably this would require some sort of > mirroring/backup as well as each node goes offline. You do understand that Usenet News _is_ a p2p system, right? One that works very well, despite immense and rapidly fluctuating loads, the sporadic loss of peers, legal threats, hordes of utterly clueless users, and sometimes uhm less clueful sys admins. Source code freely available. See for example http://www.isc.org/index.pl?/innd The design is ancient and barbaric, but it works.
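The replication Ian asks about is what the structured (DHT) camp typically handles with consistent hashing: each article's message-id maps to a point on a ring of nodes, and the first k distinct nodes clockwise hold replicas, so the spool survives individual nodes going offline. A minimal sketch - the node names and replica count are made up, and this is a generic technique rather than how INN/Usenet works (INN simply floods every article to every peer):

```python
# Generic consistent-hashing replica placement, the DHT-style answer to
# "mirror the spool as nodes go offline". Node names, virtual-node count,
# and k=3 replicas are all illustrative, not from any particular system.
import hashlib
from bisect import bisect_right

class Ring:
    def __init__(self, nodes, vnodes=64):
        # Each physical node gets many virtual points for smoother balance.
        self.ring = sorted(
            (self._h(f"{n}#{v}"), n) for n in nodes for v in range(vnodes)
        )

    @staticmethod
    def _h(key: str) -> int:
        return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")

    def replicas(self, key: str, k: int = 3):
        """First k distinct nodes clockwise from the key's ring position."""
        i = bisect_right(self.ring, (self._h(key), ""))
        out = []
        while len(out) < k:
            node = self.ring[i % len(self.ring)][1]
            if node not in out:
                out.append(node)
            i += 1
        return out

ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"])
homes = ring.replicas("msg-id:<1234@example>")
print(homes)  # three distinct nodes; any one of them can serve the article
```

When a node leaves, only the keys whose replica sets included it need to be re-placed, which is the property that makes this attractive for a churning P2P network. UsenetDHT, mentioned below in the thread, builds on the same idea.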
> This is why I would > like to find some information on the techniques used, as I'm sure they'd > be better than the solutions I would come up with. I would like to do > this using Java as it's the language I'm most familiar with at the > moment. Java is a practical tool for this sort of project. -- Jim Dixon jdd@dixons.org tel +44 117 982 0786 mobile +44 797 373 7881 http://xlattice.sourceforge.net p2p communications infrastructure

From davidopp at cs.berkeley.edu Sat Feb 19 19:43:24 2005 From: davidopp at cs.berkeley.edu (David L. Oppenheimer) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Learning how to build a P2P system In-Reply-To: Message-ID: <200502191942.LAA04550@mindbender.davido.com> And if your tastes are more in the structured P2P camp than unstructured, check out UsenetDHT http://project-iris.net/irisbib/papers/usenetdht:iptps04/paper.pdf David > [Jim Dixon's reply quoted above - trimmed]

From ap at hamachi.cc Sat Feb 19 20:13:24 2005 From: ap at hamachi.cc (Alex Pankratov) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Final version of "P2P over NAT" paper available In-Reply-To: <20050219080338.212D33FC27@capsicum.zgp.org> References: <20050219080338.212D33FC27@capsicum.zgp.org> Message-ID: <42179DE3.10309@hamachi.cc> Well, based on the same stats it looks like 'hole punching' as described in the p2pnat paper succeeds in ~84% of the cases. Our proggy is a bit more complex than that, so our success rate is about 97%. Alex David Barrett wrote: > Heh, great validation of the results. > > So what are the latest values for the following chart?

>                        NAT'd     Firewalled
>                       +---------+-------------
>  % Able to hole punch | 82.2%   | 50-60% *
>  % of total internet  | ??      | ??
>                       +---------+-------------
>  % Benefiting         | ??      | ??

> > * http://zgp.org/pipermail/p2p-hackers/2004-December/002215.html > > Basically, I'd like to get a better understanding of what fraction of all > internet users might benefit from these techniques, estimated as the product
> -david
>
>> -----Original Message-----
>> From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]
>> On Behalf Of Alex Pankratov
>> Sent: Friday, February 18, 2005 11:04 PM
>> To: Peer-to-peer development.
>> Subject: Re: [p2p-hackers] Final version of "P2P over NAT" paper available
>>
>> Bryan,
>>
>> Quoting your paper -
>>
>> > .. we find that about 82% of the NATs tested support hole punching
>> > for UDP.
>> > ..
>> > The NAT Check data we gathered consists of 380 reported data points
>> > ..
>>
>> I happened to have statistics for more than 16000 'data points', and
>> check this out - the rate of 'identity preserving' NAT devices suitable
>> for hole punching works out to be 82.2%. *UDP* hole punching, that is.
>>
>> Alex
>>
>> Bryan Ford wrote:
>>
>>> Hi folks,
>>>
>>> For those interested in P2P-over-NAT issues, I just wanted to announce
>>> that the final version of the following paper, to appear in USENIX '05,
>>> is now available:
>>>
>>> Peer-to-Peer Communication Across Network Address Translators, Bryan
>>> Ford, Pyda Srisuresh, and Dan Kegel. USENIX Annual Technical
>>> Conference, April 2005.
>>> (PDF) http://www.brynosaurus.com/pub/net/p2pnat.pdf
>>> (HTML) http://www.brynosaurus.com/pub/net/p2pnat/
>>>
>>> An earlier draft of this paper was announced on this list a few months
>>> ago. The final version includes, among other minor revisions, new "NAT
>>> Check" testing results based on almost twice the number of data points
>>> as the original draft.
>>>
>>> Cheers,
>>> Bryan
>>>
>>> ---
>>>
>>> Abstract:
>>>
>>> Network Address Translation (NAT) causes well-known difficulties for
>>> peer-to-peer (P2P) communication, since the peers involved may not be
>>> reachable at any globally valid IP address. Several NAT traversal
>>> techniques are known, but their documentation is slim, and data about
>>> their robustness or relative merits is slimmer. This paper documents
>>> and analyzes one of the simplest but most robust and practical NAT
>>> traversal techniques, commonly known as ``hole punching.'' Hole
>>> punching is moderately well-understood for UDP communication, but we
>>> show how it can be reliably used to set up peer-to-peer TCP streams as
>>> well. After gathering data on the reliability of this technique on a
>>> wide variety of deployed NATs, we find that about 82% of the NATs
>>> tested support hole punching for UDP, and about 64% support hole
>>> punching for TCP streams. As NAT vendors become increasingly conscious
>>> of the needs of important P2P applications such as Voice over IP and
>>> online gaming protocols, support for hole punching is likely to
>>> increase in the future.

From dbarrett at quinthar.com Sat Feb 19 21:43:12 2005 From: dbarrett at quinthar.com (David Barrett) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Final version of "P2P over NAT" paper available In-Reply-To:
<42179DE3.10309@hamachi.cc> References: <20050219080338.212D33FC27@capsicum.zgp.org> <42179DE3.10309@hamachi.cc> Message-ID: <1108849400.7734B6B@dk12.dngr.org>

I'm sorry, I didn't make my question clear. Given that you can hole punch for 82-97% of NAT'd users, how many users are behind NATs in the first place? For example, if only 1% of users are behind a NAT, then hole punching doesn't much matter. But if it's 25%, 50%, or 75%, it becomes critical.

Does this question make sense?

Likewise, I'm interested in a similar stat for firewalls.

Sorry for not being clear the first time.

-david

On Sat, 19 Feb 2005 12:32 pm, Alex Pankratov wrote:
> Well, based on same stats it looks like 'hole punching' as it's
> described in p2pnat paper succeeds in ~84% of the cases. Our
> proggy is a bit more complex than that so our success rate is
> about 97%.
>
> Alex

From Ian.Wiles at blueyonder.co.uk Sat Feb 19 22:56:08 2005 From: Ian.Wiles at blueyonder.co.uk (Ian Wiles) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Learning how to
build a P2P system In-Reply-To: References: Message-ID: In message, Jim Dixon writes:

> You do understand that Usenet News _is_ a p2p system, right? One that
> works very well, despite immense and rapidly fluctuating loads, the
> sporadic loss of peers, legal threats, hordes of utterly clueless users,
> and sometimes uhm less clueful sys admins.
>
> Source code freely available. See for example
> http://www.isc.org/index.pl?/innd
>
> The design is ancient and barbaric, but it works.

Yep, I'm aware of this. I used to run a dnews server for some groups. I've also seen some P2P usenet-based systems, such as Mynews etc. The difference with my idea was to distribute the spool, which doesn't happen with NNTP servers; they more or less copy the spool amongst their peers. As I would expect each node to be a home PC (I should've mentioned that...), if they each propagated 100% of the spool that would be pretty bad for scaling. For example, most usenet servers now move about 1 TB of traffic a day (although this mostly accounts for binaries). Even text only would probably amount to 5 GB or so a day. Also, a lot of usenet providers are now charging for access on top of ISP fees (a lot of ISP-provided usenet servers are quite poor in my experience).

Cheers.
--
Ian Wiles

From ap at hamachi.cc Sun Feb 20 00:26:28 2005 From: ap at hamachi.cc (Alex Pankratov) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Final version of "P2P over NAT" paper available In-Reply-To: <1108849400.7734B6B@dk12.dngr.org> References: <20050219080338.212D33FC27@capsicum.zgp.org> <42179DE3.10309@hamachi.cc> <1108849400.7734B6B@dk12.dngr.org> Message-ID: <4217D934.401@hamachi.cc>

David Barrett wrote:
> I'm sorry, I didn't make my question clear. Given that you can hole
> punch for 82-97% of NAT'd users, how many users are behind NATs in the
> first place?

Around 70%, but keep in mind that the 16000+ samples we have at the moment are far from being representative.
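For readers who haven't seen the p2pnat paper, the basic UDP hole-punching exchange being measured here can be sketched roughly as follows. This is a hypothetical toy that runs over loopback, so no real NAT is traversed, and the rendezvous server is replaced by sharing addresses directly; the function names are illustrative only.

```python
# Toy sketch of the UDP hole-punching message flow (loopback only;
# the rendezvous step is faked by exchanging addresses in-process).
import socket

def make_peer():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))   # OS picks a port, like a NAT mapping would
    s.settimeout(2.0)
    return s

a, b = make_peer(), make_peer()
addr_a, addr_b = a.getsockname(), b.getsockname()

# Step 1 (normally done via a rendezvous server): each peer learns the
# other's observed (IP, port) endpoint.
# Step 2: both peers send a datagram to the other's endpoint. Behind
# real NATs these outbound packets open ("punch") the mappings that
# let the inbound packets through.
a.sendto(b"punch from A", addr_b)
b.sendto(b"punch from B", addr_a)

# Step 3: each side receives the other's packet through the opened path.
msg_at_b, _ = b.recvfrom(1024)
msg_at_a, _ = a.recvfrom(1024)
a.close()
b.close()
```

On a real network, success depends on both NATs reusing the same mapping for the new destination, which appears to be what Alex calls 'identity preserving' behaviour.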
> For example, if only 1% of users are behind a NAT then hole punching
> doesn't much matter. But if it's 25%, 50%, or 75%, it becomes critical.
>
> Does this question make sense?

Yes, let me clarify the numbers I gave earlier - 82% is the ratio of 'identity preserving' NATs among all NAT'ed clients we saw. 97% is the fraction of user pairs (including routable clients) that we were able to successfully connect. If we were only to use the technique suggested in the p2pnat paper, 97% would've become 84%.

> Likewise, I'm interested in a similar stat for firewalls.

The stat for firewalls that allow outbound UDP is 100%, i.e. you can always connect two peers behind two separate _stateful_ firewalls that allow unrestricted outbound UDP. The % of those behind locked-down firewalls is negligible, as is the % of those behind the same firewall that doesn't allow for hairpinning.

> Sorry for not being clear the first time.

Not a problem.

> -david

From arachnid at notdot.net Mon Feb 21 01:01:02 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Efficient Decoding of Online / LT / Raptor codes Message-ID:

After reading papers on Digital Fountain codes, it occurs to me that the decoding process can be more efficient than described:
All the papers describe decoding by looking for encoded nodes that consist of a single source node, marking that node as decoded, and subtracting it from all the other nodes that include it. However, it occurs to me that it should be possible to be more efficient than this: The order of encoded nodes can be decreased without starting with nodes consisting of a single source node. As an example, if I have nodes consisting of: 1: A+B+C 2: A+B+D 3: C+D+E Assuming my operator is commutative (eg, XOR), I can add 1 and 2 to get C+D, and add that to 3 to recover E. Obviously this only works with much efficiency if the operator you use to subtract is the same as the one you use to add, such as XOR. Unfortunately, I don't know enough stats to figure out if this is worth it - is anyone able to show if it's worth the extra effort, or if the cases in which this is useful are rare? -Nick Johnson From arachnid at notdot.net Mon Feb 21 03:23:12 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <4211C669.3080200@ucla.edu> References: <4211C669.3080200@ucla.edu> Message-ID: On 15/02/2005, at 10:52 PM, Michael Parker wrote: > Hi all, > > Does anyone know what happened to the "Online Codes" Sourceforge > project, listed at http://sourceforge.net/projects/onlinecodes? I'm > asking here for two reasons: First, because Online Codes [1, 2] would > be a great tool in peer-to-peer applications, so I thought someone > here might have followed the project while it was still active. > Second, I've written a solid library implementation of the Online > Codes encoding/decoding algorithm described in the aforementioned > papers. Alas, only after I implemented it did I find out that the > authors' company, Rateless, had patented it (or, so they allude to on > their web site www.rateless.com, Digital Fountain owned the IP). I don't see it - where do they allude to it? 
The only mention of patents Google finds on the site is in the copy of the GPL they have hosted there.

-Nick Johnson

From mgp at ucla.edu Mon Feb 21 05:57:25 2005 From: mgp at ucla.edu (mgp@ucla.edu) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes Message-ID: <200502210557.j1L5vPX11295@webmail.my.ucla.edu>

Hi Nick,

First, I think your optimization to rateless codes is interesting, although I'm worried that it might not be computationally feasible. I haven't given it much thought yet, but at first glimpse it seems like it would take exponential or factorial time to come up with such shortcuts from XORing the check blocks. I'll give it more thought, but if you or anyone else can show it can be done efficiently, I will be pleasantly surprised :)

About your second e-mail... On the web site www.rateless.com, if you go to Library, you will see that under the "Online Codes" paper it says:

"This paper marked the beginning of Rateless Research. The codes described in it fall within the scope of patents owned by Digital Fountain, Inc. In order to prevent intellectual property overlap, we have developed and use a new class of practical rateless codes, based on decoders that don't use chain-reaction, message passing or belief-propagation techniques."

Which, I think, means that the rateless codes method described in the paper is not the one they are currently using, for the one in the paper is covered by the patents of Digital Fountain, Inc. Also, in the excellent paper "Digital Fountains: A Survey and Look Forward" [1], under "Barriers to Adoption -- Patent Protection" you will find references to 10 (!) or so patents in the bibliography. Again, most of these I believe belong to Digital Fountain, Inc.
Regards, Michael Parker [1] www.eecs.harvard.edu/~michaelm/postscripts/itw2004.pdf On Mon, 21 Feb 2005 16:23:12 +1300 Nick Johnson wrote: > On 15/02/2005, at 10:52 PM, Michael Parker wrote: > > > Hi all, > > > > Does anyone know what happened to the "Online Codes" Sourceforge > > project, listed at http://sourceforge.net/projects/onlinecodes? I'm > > asking here for two reasons: First, because Online Codes [1, 2] would > > be a great tool in peer-to-peer applications, so I thought someone > > here might have followed the project while it was still active. > > Second, I've written a solid library implementation of the Online > > Codes encoding/decoding algorithm described in the aforementioned > > papers. Alas, only after I implemented it did I find out that the > > authors' company, Rateless, had patented it (or, so they allude to on > > their web site www.rateless.com, Digital Fountain owned the IP). > > I don't see it - where do they allude to it? > The only mention of patents google finds on the site is in the copy of > the GPL they have hosted there. > > -Nick Johnson > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From arachnid at notdot.net Mon Feb 21 06:07:09 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <200502210557.j1L5vPX11295@webmail.my.ucla.edu> References: <200502210557.j1L5vPX11295@webmail.my.ucla.edu> Message-ID: <42197A8D.9020400@notdot.net> mgp@ucla.edu wrote: >Hi Nick, > >First, I think your optimization to rateless codes is interesting, >although I'm worried that it might not be >computationally feasible. 
I haven't given it much thought yet, but at
>first glimpse it seems like it would take exponential or factorial time
>to come up with such shortcuts from XORing the check blocks. I'll give
>it more thought, but if you or anyone else can show it can be done
>efficiently, I will be pleasantly surprised :)

As far as I can tell (though I can't prove it), it should be possible to reduce the magnitude as much as is possible by simply checking each received block against the ones already received. If the number of source blocks they don't share is fewer than the order of whichever of the two blocks has higher order, replace the higher order block with the xor of the two. At first glance this would require O(n) time per block, and hence O(n^2) time for all blocks, but it could probably be reduced by creating tree 'indexes' to the blocks that contain each source block, reducing the complexity to less than O(n) for each block.

>About your second e-mail... On the web site www.rateless.com, if you go
>to Library, you will see that under the "Online Codes" paper it says:
>
>"This paper marked the beginning of Rateless Research. The codes
>described in it fall within the scope of patents owned by Digital
>Fountain, Inc. In order to prevent intellectual property overlap, we
>have developed and use a new class of practical rateless codes, based on
>decoders that don't use chain-reaction, message passing or
>belief-propagation techniques."

So Online codes should be unencumbered, while Torrent / LT codes aren't?
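The combination step under discussion (XORing two received check blocks so that their shared source blocks cancel) can be shown with a tiny sketch of the A/B/C/D/E example from earlier in the thread. The representation is hypothetical, made up purely for illustration, not taken from any actual fountain-code implementation:

```python
# Toy model of check-block combination in an XOR fountain code.
# A check block is (frozenset of source ids, XOR of their payloads).
def bxor(x, y):
    return bytes(p ^ q for p, q in zip(x, y))

src = {"A": b"\x01", "B": b"\x02", "C": b"\x04", "D": b"\x08", "E": b"\x10"}

def encode(ids):
    payload = b"\x00"
    for i in ids:
        payload = bxor(payload, src[i])
    return (frozenset(ids), payload)

def combine(p, q):
    # XORing two check blocks yields a block covering the symmetric
    # difference of their source sets: shared sources cancel out.
    return (p[0] ^ q[0], bxor(p[1], q[1]))

b1 = encode("ABC")      # block 1: A+B+C
b2 = encode("ABD")      # block 2: A+B+D
b3 = encode("CDE")      # block 3: C+D+E

cd = combine(b1, b2)    # degree drops to 2: C+D
e = combine(cd, b3)     # degree 1: source block E recovered
print(sorted(e[0]), e[1])   # -> ['E'] b'\x10'
```

The combination itself is cheap; the feasibility concern raised above is about finding profitable pairs cheaply among thousands of received blocks.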
-Nick Johnson From bryan.turner at pobox.com Mon Feb 21 15:57:11 2005 From: bryan.turner at pobox.com (Bryan Turner) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <42197A8D.9020400@notdot.net> Message-ID: <200502211557.j1LFvB1j028908@rtp-core-1.cisco.com> Nick, > If the number of source blocks they don't share is fewer than the order > of whichever of the two blocks has higher order, replace the higher order > block with the xor of the two. You are correct, this will work for codes which use XOR. I share Michael's caution about the efficiency of this scheme, as most packets will be sourced from dozens of blocks. With indexing, this could be sub-linear per application - but is an application per source block, or per packet? My guess is that even the best indexing would still require per-source-block lookup. So you have dozens of sub-linear applications per packet, which will probably end up being super-linear if your big-O constants aren't very small. If you could keep the DAG of all source blocks, received packets, etc, in memory then you could reduce it to O(M) per application where M is the depth of the DAG. But there's some recursion that's not accounted for at the next layer so the total cost is more than O(nM), it would be like O(n*CM) where C is the recursion constant. > So Online codes should be unencumbered, while Torrent / LT codes aren't? As far as I am aware, Digital Fountain owns all the patents to rateless codes currently in the literature. This is dangerous territory for open-source. That makes Rateless.com's claims of an unencumbered rateless code even more interesting.. their product documentation uses the words "Mixed Acyclic Decoder", which I cannot find any other references to online. My guess is that they have generalized the connectivity DAG for the source packets ("acyclic"), and used a two-part code ("mixed") to regenerate the DAG at the other end. 
Most of the Digital Fountain IP is predicated on the small per-packet ID-tag which identifies the connectivity of the packet in the source DAG. Digital Fountain has been working to optimize the source DAG to reduce memory consumption and improve throughput (basically you have to keep the entire file in memory to run a Digital Fountain of it). Rateless has probably come up with a new ID-tag which does not read on the Digital Fountain IP, but transmits the same information. --Bryan bryan.turner@pobox.com From arachnid at notdot.net Mon Feb 21 20:16:57 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <200502211557.j1LFvB1j028908@rtp-core-1.cisco.com> References: <200502211557.j1LFvB1j028908@rtp-core-1.cisco.com> Message-ID: <273a8d5eb249de6348b5c145301b9ab2@notdot.net> On 22/02/2005, at 4:57 AM, Bryan Turner wrote: > >> So Online codes should be unencumbered, while Torrent / LT codes >> aren't? > > As far as I am aware, Digital Fountain owns all the patents to > rateless codes currently in the literature. This is dangerous > territory for > open-source. That makes Rateless.com's claims of an unencumbered > rateless > code even more interesting.. their product documentation uses the words > "Mixed Acyclic Decoder", which I cannot find any other references to > online. Since infringing a patent requires meeting all the claims, it seems likely to me that Online Codes won't infringe Digital Fountain's patents. However, IANAPL. On the plus side, I live in New Zealand, which doesn't recognize Software Patents (to the best of my knowledge). It'd be nice to be able to distribute anything I write to people in the US, though. 
-Nick Johnson From agthorr at cs.uoregon.edu Mon Feb 21 20:29:36 2005 From: agthorr at cs.uoregon.edu (agthorr@cs.uoregon.edu) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <273a8d5eb249de6348b5c145301b9ab2@notdot.net> References: <200502211557.j1LFvB1j028908@rtp-core-1.cisco.com> <273a8d5eb249de6348b5c145301b9ab2@notdot.net> Message-ID: <20050221202935.GB16576@barsoom.org> On Tue, Feb 22, 2005 at 09:16:57AM +1300, Nick Johnson wrote: > Since infringing a patent requires meeting all the claims, it seems > likely to me that Online Codes won't infringe Digital Fountain's > patents. However, IANAPL. IANAL either, but it was my understanding that infringing a patent occurs when you infringe any of the claims, though a court may dismiss some claims as being overly broad. Thus, most patents have a series of increasingly more specific claims, where the most-broad cover the most scope, and the most-specific are most likely to hold up. This is what I recall from taking an intellectual property course several years ago, and appears to be true doing some quick search on Google. But, again, IANAL. :) From b.fallenstein at gmx.de Mon Feb 21 23:53:30 2005 From: b.fallenstein at gmx.de (b.fallenstein@gmx.de) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes Message-ID: <28921.1109030010@www19.gmx.net> On Mon, 21 Feb 2005 12:29:36 -0800, agthorr@cs.uoregon.edu wrote: > On Tue, Feb 22, 2005 at 09:16:57AM +1300, Nick Johnson wrote: > > Since infringing a patent requires meeting all the claims, it seems > > likely to me that Online Codes won't infringe Digital Fountain's > > patents. However, IANAPL. > > IANAL either, but it was my understanding that infringing a patent > occurs when you infringe any of the claims, though a court may dismiss > some claims as being overly broad. You're correct. What Nick was thinking about is that infringing a patent requires infringing all *steps* in one particular claim. 
So if you have a patent with claims like

1. A method for quenching thirst, comprising of opening at least one container of juice, pouring at least some of the juice in said container into a glass, and drinking said juice from said glass.
2. The method of claim 1, where the container is a pack.
3. The method of claim 1, where the container is a can.
4. A method for quenching thirst, comprising of opening at least one container of juice, putting a straw in said container, and drinking at least some of the juice in said container through said straw.

then you infringe the patent if you drink juice from a glass, even though you only infringe on claim 1 and not claim 4; however, if you open a can of juice and then drink the juice directly from the can, you do not infringe on the patent, because you don't perform all of the steps in any one claim.

(There's a loophole there; the patent holder may argue that what you did was "equivalent" to one of the steps in the patent. In the example, say you're drinking from a cup instead of a glass; the patent holder may argue that the cup is 'equivalent' to a glass. However, I believe courts are generally quite strict about that; if the patent holder meant either a cup or a glass, why did they say 'glass' specifically?)

IANAL, but I do believe I'm correct here, from what I've read. (Unfortunately the URI of the source escapes my mind :-( )

- Benja

From sdaswani at gmail.com Tue Feb 22 01:15:08 2005 From: sdaswani at gmail.com (Susheel Daswani) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <28921.1109030010@www19.gmx.net> References: <28921.1109030010@www19.gmx.net> Message-ID: <1cd056b90502211715a073fd@mail.gmail.com>

Folks, it is not the case that you infringe on a patent if you infringe ANY of the claims. You must infringe all. I am not a lawyer, but I am training to be one :).
Here is the content of a message I sent a few months back: -------------- I'm not sure how everyone is handling the Altnet patent threat, but in my studies I've come across some salient points regarding patent infringement: "For an accused product to literally infringe a patent, EVERY element contained in the patent claim must also be present in the accused product or device. If a claimed apparatus has five parts, or 'elements', and the allegedly infringing apparatus has only four of those five, it does not literally infringe. This is true even though the defendant may have copied the four elements exactly, and regardless of how significant or insignificant the missing element is." 'Intellectual Property in the New Technological Age', 3rd Edition, page 230 This may already be known, but I thought I'd put it out there. So everyone should analyse their hashing systems to see how they compare to Altnet's patent elements. If you don't do everything they do, you can ignore their dinky letter :). I'm going to analyse their claims soon and compare to the systems I know. Some more interesting information, which is probably obvious: "[I]t does not matter [if] a defendant has ADDED several new elements -- adding new features cannot help a defendant escape infringement." -------------- Susheel On Tue, 22 Feb 2005 00:53:30 +0100 (MET), b.fallenstein@gmx.de wrote: > On Mon, 21 Feb 2005 12:29:36 -0800, agthorr@cs.uoregon.edu > wrote: > > On Tue, Feb 22, 2005 at 09:16:57AM +1300, Nick Johnson wrote: > > > Since infringing a patent requires meeting all the claims, it seems > > > likely to me that Online Codes won't infringe Digital Fountain's > > > patents. However, IANAPL. > > > > IANAL either, but it was my understanding that infringing a patent > > occurs when you infringe any of the claims, though a court may dismiss > > some claims as being overly broad. > > You're correct. 
What Nick was thinking about is that infringing a > patent requires infringing all *steps* in one particular claim. So if > you have a patent with claims like > From arachnid at notdot.net Tue Feb 22 02:37:54 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Online Codes In-Reply-To: <1cd056b90502211715a073fd@mail.gmail.com> References: <28921.1109030010@www19.gmx.net> <1cd056b90502211715a073fd@mail.gmail.com> Message-ID: On 22/02/2005, at 2:15 PM, Susheel Daswani wrote: > "For an accused product to literally infringe a patent, EVERY element > contained in the patent claim must also be present in the accused > product or device. If a claimed apparatus has five parts, or > 'elements', and the allegedly infringing apparatus has only four of > those five, it does not literally infringe. This is true even though > the defendant may have copied the four elements exactly, and > regardless of how significant or insignificant the missing element > is." > 'Intellectual Property in the New Technological Age', 3rd Edition, > page 230 This seems to be exactly what b.fallenstein was saying - all the elements in a particular claim must match, but only one claim needs to completely match for there to be a violation: >>> IANAL either, but it was my understanding that infringing a patent >>> occurs when you infringe any of the claims, though a court may >>> dismiss >>> some claims as being overly broad. >> >> You're correct. What Nick was thinking about is that infringing a >> patent requires infringing all *steps* in one particular claim. So if >> you have a patent with claims like From szabo at szabo.best.vwh.net Tue Feb 22 03:09:41 2005 From: szabo at szabo.best.vwh.net (Nick Szabo) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] no subject (file transmission) Message-ID: <20050222030941.24981.qmail@szabo.best.vwh.net> Disclaimer -- IANAL and the following are personal not legal opinions. 
The "found valid by a jury" business is FUD -- it's meant to intimidate those receiving the cease & desist letters into thinking they will lose, but the jury finding has no direct legal effect on future cases. To have such an effect it would have to satisfy the criteria for issue preclusion. If an issue is precluded for a future case that would mean the finding in the previous case stands -- it can't be re-argued or decided differently in the future case. The party against whom issue preclusion is asserted must have been present at the original trial and had a full and fair opportunity to litigate. Here this only applies to Akamai itself. In other words, if Akamai infringed again, and they had lost the first trial, the validity issue wouldn't be retried in the second trial. But other accused infringers can re-challenge the validity from scratch. (I suspect, but I'm not certain, that the jury result can't even be introduced as evidence to sway future juries, though it looks like nothing stops them from putting the results in cease & desist letters). Here the jury finding couldn't preclude re-arguing the validity even in a future suit against Akamai, since it's not necessary to the result of the first case. This kind of thing has been going on in IP for many decades -- see Electrical Fittings v. Thomas Betts, 307 U.S. 241 (1939) at http://caselaw.lp.findlaw.com/scripts/getcase.pl?navby=search&court=US&case=/us/307/241.html. According to the result in that case Akamai here could have appealed the jury finding (even though it won the case), but since the issue isn't precluded Akamai had no incentive to appeal, and didn't. If the roles were turned around the issue could be precluded -- if the jury had said the patent was invalid (and Digital Island/C&W/Altnet couldn't overturn the jury on appeal), and Altnet for that reason lost the case, Altnet would be precluded from raising that issue again -- the patent would remain invalid for subsequent lawsuits.
If juries find conflicting results there probably won't ever be any preclusion, so the issue of the validity of this Altnet patent probably won't ever be precluded, though it would be an interesting case to argue. So the jury finding in Akamai might have this indirect legal effect on future cases in _preventing_ preclusion of the validity issue in favor of defendants. As for the probabilities of subsequent juries going the same way as the first, I suspect there's not much correlation. Juries in patent cases are all over the map, and given that there is more publicity in the p2p world this time around, future defendants will probably find much better prior art to challenge the Altnet patents with. > "Also, a decentralized scheme such as in > Kazaa has no availability problems but lacks integrity, since Kazaa is > plagued with many fake files. Clearly, decentralization is an unsolved > issue that needs further research." Perhaps this is ironic, but a good solution is protocol-enforced property rights. Specifically, we should treat names crossing trust boundaries as property, and securely agree across trust boundaries on who owns what names, as described in "Secure Property Titles with Owner Authority", http://szabo.best.vwh.net/securetitle.html Nick Szabo From szabo at szabo.best.vwh.net Tue Feb 22 03:22:24 2005 From: szabo at szabo.best.vwh.net (Nick Szabo) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Fully Message-ID: <20050222032224.44521.qmail@szabo.best.vwh.net> > "Also, a decentralized scheme such as in > Kazaa has no availability problems but lacks integrity, since Kazaa is > plagued with many fake files. Clearly, decentralization is an unsolved > issue that needs further research." Perhaps this is ironic, but a good solution is protocol-enforced property rights. 
Specifically, we should treat names crossing trust boundaries as property, and securely agree across trust boundaries on who owns what names, as described in "Secure Property Titles with Owner Authority", http://szabo.best.vwh.net/securetitle.html Nick Szabo From ian at locut.us Tue Feb 22 11:39:52 2005 From: ian at locut.us (Ian Clarke) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] no subject (file transmission) In-Reply-To: <20050222030941.24981.qmail@szabo.best.vwh.net> References: <20050222030941.24981.qmail@szabo.best.vwh.net> Message-ID: On 22 Feb 2005, at 03:09, Nick Szabo wrote: > Disclaimer -- IANAL and the following are personal not legal opinions. > > The "found valid by a jury" business is FUD -- it's meant to > intimidate those receiving the cease & desist letters into thinking > they will lose, but the jury finding has no direct legal effect on > future cases. IANAL either, but an IP lawyer told me that juries determine matters of fact, whereas the validity of a patent is a matter of law, and therefore a jury is incapable of finding that a patent is valid. Ian. -- Founder, The Freenet Project http://freenetproject.org/ CEO, Cematics Ltd http://cematics.com/ Personal Blog http://locut.us/~ian/blog/ From alwaysakid at 163.com Tue Feb 22 18:36:56 2005 From: alwaysakid at 163.com (Chris) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Dynamic IP In-Reply-To: <20050222032224.44521.qmail@szabo.best.vwh.net> Message-ID: <20050222183708.E5CD83FD48@capsicum.zgp.org> Reading thru B.Ford's paper, it just occurred to me that if there's somewhere a server that's willing to map a user name to a udp ip:port tuple, many p2p apps will not need things like Dyn2Go to resolve a dynamic ip server. Is there any such thing?
From bkn3 at columbia.edu Tue Feb 22 20:09:50 2005 From: bkn3 at columbia.edu (Brad Neuberg) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Byzantine Quorum Systems In-Reply-To: <20050222032224.44521.qmail@szabo.best.vwh.net> References: <20050222032224.44521.qmail@szabo.best.vwh.net> Message-ID: <6.2.1.2.2.20050222115546.02598a80@pop.mail.yahoo.com> Nick, I've been reading over the papers on your website and want to know your opinion of how well byzantine quorum systems work under the following conditions: * A majority of the nodes have a half-life of about an hour (i.e. every hour half the nodes in the p2p system leave). A large number of nodes never rejoin the system again. * The system has high latency (i.e. it is running over the public Internet) * A majority of the nodes are NATed or firewalled, depending on other nodes to relay requests. My understanding of byzantine quorum algorithms is that they break down under the kinds of conditions found in large-scale P2P systems, which have very high node churn and high latency, described above. They seem to be focused on very stable or LAN-type networks. Is this a correct assumption? If it is, it seems that byzantine quorum algorithms need to be refocused on the kinds of networks that we are dealing with today, rather than LAN-centric networks or networks of very stable servers on the public Internet. Best, Brad Neuberg At 07:22 PM 2/21/2005, you wrote: > > "Also, a decentralized scheme such as in > > Kazaa has no availability problems but lacks integrity, since Kazaa is > > plagued with many fake files. Clearly, decentralization is an unsolved > > issue that needs further research." > >Perhaps this is ironic, but a good solution is protocol-enforced property >rights.
Specifically, we should treat names crossing trust boundaries as >property, and securely agree across trust boundaries on who owns what names, >as described in > >"Secure Property Titles with Owner Authority", >http://szabo.best.vwh.net/securetitle.html > >Nick Szabo >_______________________________________________ >p2p-hackers mailing list >p2p-hackers@zgp.org >http://zgp.org/mailman/listinfo/p2p-hackers >_______________________________________________ >Here is a web page listing P2P Conferences: >http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences Brad Neuberg, bkn3@columbia.edu Senior Software Engineer, Rojo Networks Weblog: http://www.codinginparadise.org ===================================================================== Check out Rojo, an RSS and Atom news aggregator that I work on. Visit http://rojo.com for more info. Feel free to ask me for an invite! Rojo is Hiring! If you're interested in RSS, Weblogs, Social Networking, Java, Open Source, etc... then come work with us at Rojo. If you recommend someone and we hire them you'll get a free iPod! See http://www.rojonetworks.com/JobsAtRojo.html. From wesley at felter.org Tue Feb 22 20:30:41 2005 From: wesley at felter.org (Wes Felter) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Byzantine Quorum Systems In-Reply-To: <6.2.1.2.2.20050222115546.02598a80@pop.mail.yahoo.com> References: <20050222032224.44521.qmail@szabo.best.vwh.net> <6.2.1.2.2.20050222115546.02598a80@pop.mail.yahoo.com> Message-ID: <421B9671.30404@felter.org> Brad Neuberg wrote: > My understanding of byzantine quorum algorithms are that they break down > under the kinds of conditions found in large-scale, P2P systems, which > have very high node churn and high latency, described above. They seem > to be focused on very stable or LAN type networks. Is this a correct > assumption? 
If it is, it seems that byzantine quorum algorithms need to > be refocused on the kinds of networks that we are dealing with today, > rather than LAN centric networks or networks of very stable servers on > the public Internet. OceanStore solves this problem by using a byzantine quorum protocol only between supernodes. IIRC, performance of this protocol over the Internet was not too bad. Unfortunately, I didn't understand Nick's recent messages (and thus the context for this discussion) at all, so I don't know if this is relevant. Wes Felter - wesley@felter.org From bkn3 at columbia.edu Tue Feb 22 21:52:17 2005 From: bkn3 at columbia.edu (Brad Neuberg) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Byzantine Quorum Systems In-Reply-To: <421B9671.30404@felter.org> References: <20050222032224.44521.qmail@szabo.best.vwh.net> <6.2.1.2.2.20050222115546.02598a80@pop.mail.yahoo.com> <421B9671.30404@felter.org> Message-ID: <6.2.1.2.2.20050222134908.025b1140@pop.mail.yahoo.com> At 12:30 PM 2/22/2005, you wrote: >Brad Neuberg wrote: > >>My understanding of byzantine quorum algorithms are that they break down >>under the kinds of conditions found in large-scale, P2P systems, which >>have very high node churn and high latency, described above. They seem >>to be focused on very stable or LAN type networks. Is this a correct >>assumption? If it is, it seems that byzantine quorum algorithms need to >>be refocused on the kinds of networks that we are dealing with today, >>rather than LAN centric networks or networks of very stable servers on >>the public Internet. > >OceanStore solves this problem by using a byzantine quorum protocol only >between supernodes. IIRC, performance of this protocol over the Internet >was not too bad. Unfortunately, I didn't understand Nick's recent messages >(and thus the context for this discussion) at all, so I don't know if this >is relevant. I should provide more context. 
I'm reading the following two papers by Nick: * "Secure Property Titles with Owner Authority" - http://szabo.best.vwh.net/securetitle.html * "Advances in Distributed Security" - http://szabo.best.vwh.net/distributed.html Both posit that greater advances in things like distributed naming over p2p networks are possible due to things like byzantine quorum systems. I've always felt that byzantine quorum systems are too fragile for unreliable p2p networks on the wider Internet, though I'd love to be proven wrong. Brad >Wes Felter - wesley@felter.org > >_______________________________________________ >p2p-hackers mailing list >p2p-hackers@zgp.org >http://zgp.org/mailman/listinfo/p2p-hackers >_______________________________________________ >Here is a web page listing P2P Conferences: >http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences Brad Neuberg, bkn3@columbia.edu Senior Software Engineer, Rojo Networks Weblog: http://www.codinginparadise.org ===================================================================== Check out Rojo, an RSS and Atom news aggregator that I work on. Visit http://rojo.com for more info. Feel free to ask me for an invite! Rojo is Hiring! If you're interested in RSS, Weblogs, Social Networking, Java, Open Source, etc... then come work with us at Rojo. If you recommend someone and we hire them you'll get a free iPod! See http://www.rojonetworks.com/JobsAtRojo.html. From paul at ref.nmedia.net Tue Feb 22 23:33:49 2005 From: paul at ref.nmedia.net (Paul Campbell) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] no subject (file transmission) In-Reply-To: References: <20050222030941.24981.qmail@szabo.best.vwh.net> Message-ID: <20050222233349.GA2001@ref.nmedia.net> On Tue, Feb 22, 2005 at 11:39:52AM +0000, Ian Clarke wrote: > > On 22 Feb 2005, at 03:09, Nick Szabo wrote: > > >Disclaimer -- IANAL and the following are personal not legal opinions. 
> > > >The "found valid by a jury" business is FUD -- it's meant to > >intimidate those receiving the cease & desist letters into thinking > >they will lose, but the jury finding has no direct legal effect on > >future cases. > > IANAL either, but an IP lawyer told me that jurys determine matters of > fact, where as the validity of a patent is a matter of law, and > therefore a jury is incapable of finding that a patent is valid. False. Juries can decide both the facts and the law. Judges often refuse to allow them to do so unless they are instructed to do otherwise. There is a huge body of information about this. Specifically, there is the issue of jury nullification. In essence, a jury can decide to totally ignore the law and decide for innocence in a criminal case or decide for the defendant in a civil case in spite of whatever legal paperwork exists. This was most prominent during prohibition when it was very difficult or impossible to get a conviction simply because the power of jury nullification was raised in jury instructions and the resulting juries simply refused to convict. From paul at ref.nmedia.net Tue Feb 22 23:39:53 2005 From: paul at ref.nmedia.net (Paul Campbell) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Dynamic IP In-Reply-To: <20050222183708.E5CD83FD48@capsicum.zgp.org> References: <20050222032224.44521.qmail@szabo.best.vwh.net> <20050222183708.E5CD83FD48@capsicum.zgp.org> Message-ID: <20050222233953.GB2001@ref.nmedia.net> On Wed, Feb 23, 2005 at 02:36:56AM +0800, Chris wrote: > Reading thru B.Ford's paper, it just occurs to me that if there's somewhere > a server who's willing map a user name to a udp ip:port tuple, many p2p app > will not need things like Dyn2Go to resolve a dynamic ip server. > Is there any such thing ? First off, it is not necessary anyways. 
If the very same p2p network that maintains the distributed database of files/data also maintains a distributed database of usernames (perhaps mapping to a public key as a signature scheme), then there is no need for the service you mentioned. Most of them have this capability to one degree or another. The other place where it becomes "necessary" is for a "first contact point" or initial entry into the network. This is already essentially solved by maintaining a sufficiently large list of contact points (say a list of ip:port pairs for 1000 known P2P contacts) that there is zero chance of not finding an initial entry point. It is still a sticky issue for "new" entries...someone who downloads the software needs something for an initial contact point that has enough permanence to withstand months or years of no updates. This can be provided at a relatively low-bandwidth level by a web server. It also provides a fallback for more or less guaranteed first contacts. From szabo at szabo.best.vwh.net Wed Feb 23 01:32:05 2005 From: szabo at szabo.best.vwh.net (Nick Szabo) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Re: Byzantine Quorum Systems In-Reply-To: <6.2.1.2.2.20050222115546.02598a80@pop.mail.yahoo.com> from "Brad Neuberg" at Feb 22, 2005 12:09:50 PM Message-ID: <20050223013206.42103.qmail@szabo.best.vwh.net> My proposal for a fully decentralized p2p directory to protect name integrity was somewhat vague; hopefully this will clarify things a bit: http://szabo.best.vwh.net/nameintegrity.html > Nick, I've been reading over the papers on your website and want to know > your opinion of how well byzantine quorum systems work under the following > conditions: I'm by no means an expert on the network conditions you describe, but I'll do my best. > * A majority of the nodes have a half-life of about an hour (i.e. every > hour half the nodes in the p2p system leave). A large number of nodes > never rejoin the system again.
This is where Byzantine fault tolerance shines. The main security and reliability property holds per transaction. Byzantine fault-tolerant systems are vastly superior to any reputation- or history-based precaution in this regard. The most common transaction for the proposed directory is the propagation of a new (filename, hash, owner) tuple. The Byzantine attacker tries to overwhelm the directory with new, separate-looking nodes which forge messages, corrupting the directory for everybody. If the attacker can usurp more than a certain fraction (1/4 for some kinds of quorum systems, 1/3 for traditional slow Byzantine fault tolerant systems) of the separate-looking nodes during a given transaction, they can compromise the integrity of that transaction. The fraction is far smaller for non-Byzantine tolerant directories -- in a typical p2p system, you can't demonstrate that the directory is safe against even a very small number of message-forging nodes. > * The system has high latency (i.e. it is running over the public Internet) For the directory, we just want short tuples like (file name, file hash, owner name, logo, digital signature) to propagate about as fast as the large files they refer to. How much of a problem slowing down overall file+directory propagation poses depends on the relative values users and uploaders place on propagation time, uploader reputation, and file name integrity. The larger number of messages required for a Byzantine quorum system is probably tolerable, as it is for the supernodes in OceanStore. > * A majority of the nodes are NATed or firewalled, depending on other nodes > to relay requests. All kinds of directories would benefit from cryptography to prevent substitution and replay attacks by intermediaries like these and others, to be sure. This reduces the intermediaries' attack to making all the nodes behind them simply fail, and fail-stop is a simpler problem to solve than Byzantine faults (message forging & the like).
If the inside nodes fail and outside nodes can't reroute their messages to the failed nodes through another NAT or firewall, the correct nodes cut the failed nodes out of the directory, and the integrity of the directory is not harmed. > My understanding of byzantine quorum algorithms are that they break down > under the kinds of conditions found in large-scale, P2P systems, which have > very high node churn and high latency, described above. They seem to be > focused on very stable or LAN type networks. Is this a correct > assumption? If it is, it seems that byzantine quorum algorithms need to be > refocused on the kinds of networks that we are dealing with today... Actually, the original focus of the research in the 70's was even more fixed -- it was about ensuring the reliability of computers with multiple redundant CPUs. The best way to prove such reliability was to make a very open-ended assumption that CPUs would have not just statistical errors but arbitrary, even malicious errors -- so the models became even more pertinent to the security and reliability of LANs, and more pertinent still to the security and reliability of wide-area distributed systems. The CPUs (and LAN nodes) were also deemed to be highly unstable -- there was a separate fail-stop model where the idea was to simply tolerate high numbers of failures that could (unlike Byzantine attacks) be detected and ignored as simple failures by other nodes. Both models are very well studied, and chances are the combination of Byzantine and fail-stop models is well-studied as well, though I haven't personally researched that angle. I heartily agree that more adaptations and improvements to an unstable environment should be explored. For example, a combination of fail-stop (to deal with very large numbers of detectable failures from catastrophic exit or blockage of many p2p nodes at once) and Byzantine fault tolerance (to deal with a smaller number of simultaneous message-forging nodes) could be explored.
Such exploration should not, however, come at the expense of losing the provable security and reliability properties Byzantine fault-tolerance discipline makes possible for fully decentralized directories. Nick Szabo From lgonze at panix.com Wed Feb 23 01:56:34 2005 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Re: Byzantine Quorum Systems In-Reply-To: <20050223013206.42103.qmail@szabo.best.vwh.net> References: <20050223013206.42103.qmail@szabo.best.vwh.net> Message-ID: While we're on the topic of quorum systems for naming, it strikes me that there's nothing about Zooko's triangle which requires systems to tolerate rapid churn. A network of highly stable nodes, for example ones hosted at university comp sci departments, would be fine as long as it was large enough and allowed open membership. From arachnid at notdot.net Wed Feb 23 08:18:36 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question Message-ID: <421C3C5C.4040408@notdot.net> If I can be forgiven a stupid question: I'm reading in detail the paper (http://www.scs.cs.nyu.edu/~mfreed/docs/authcodes-ieee04.pdf) on homomorphic hash functions for use with Digital Fountain codes in preparation for implementing it. The problem I'm coming up against is in the description of the modifications to the Fountain code described on page 5. With their example settings, 256 bit long sub-blocks are now added modulo a 257 bit prime. This makes sense - what I don't get is how to encode the result in 256 bits! What is one supposed to do if the sum of the selected blocks overflows 256 bits? Can anyone enlighten me? 
-Nick Johnson From gwenchlan at fr.fm Wed Feb 23 14:23:21 2005 From: gwenchlan at fr.fm (Gwenchlan) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm Message-ID: <421C91D9.6070908@fr.fm> Hi all, I am looking for papers or information about node counting on overlays, but so far without success. In a distributed fashion, a node would be able to start a process to estimate the overlay size. Has anyone here seen something like this recently? Any clues about that? I think I will have to use random walkers. Thanks! From agthorr at barsoom.org Wed Feb 23 14:33:17 2005 From: agthorr at barsoom.org (Daniel Stutzbach) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm In-Reply-To: <421C91D9.6070908@fr.fm> References: <421C91D9.6070908@fr.fm> Message-ID: <20050223143317.GG3549@barsoom.org> On Wed, Feb 23, 2005 at 03:23:21PM +0100, Gwenchlan wrote: > I am looking for papers or information about node counting on overlays, > but so far without success. > In a distributed fashion, a node would be able to start a process to > estimate the overlay size. > Has anyone here seen something like this recently? > Any clues about that? > I think I will have to use random walkers. Are you looking for an algorithm that will estimate the overlay size for use by the overlay, or are you looking for measurement techniques? -- Daniel Stutzbach Computer Science Ph.D Student http://www.barsoom.org/~agthorr University of Oregon
"The Sybil Attack" by John Douceur puts the question in the p2p context: http://citeseer.ist.psu.edu/douceur02sybil.html This paper can be lossily compressed as: "Your scheme can handle up to K malicious nodes. My attacker can bring K+1 malicious nodes to the party.". It is an excellent paper because it is very general and it requires p2p designers to think explicitly about the issue. The argument in "The Sybil Attack" is valid -- if you accept its premises, then you must accept its conclusion. However, there is one premise which is implicit in this paper and in most related research which ought to be challenged. This implicit premise is that a connection between two nodes arises ex nihilo. That is: for any three nodes A, B, and C, A has (at the start) no information about how B differs from C. This assumption is obviously key to the whole issue. It is also obviously wrong! In practice the opposite is often true: for any three nodes A, B, and C, A often has information distinguishing B from C. This is because A has been introduced to B somehow, and that introduction gave A information. (Likewise with A's introduction to C.) In sum, The Sybil Attack beats Byzantine techniques, but fortunately that doesn't matter because both of those ideas are set in an idealized world in which an unbounded number of indistinguishable peers are introduced to one another ex nihilo, with the introductions conveying no information. Rather than struggle vainly to overcome The Sybil Attack in that idealized world, I suggest designing distributed systems that are secure only when bootstrapped with useful introductions. I don't know whether that kind of design can accommodate "The Napster Setting", in which a large number of strangers want to be automatically introduced in order to trade files.
I suspect that it *can*, but I also think that this isn't the only interesting setting for p2p designs, and other settings may be even more amenable to designs which use the information from introductions. Regards, Zooko From gwenchlan at fr.fm Wed Feb 23 14:48:43 2005 From: gwenchlan at fr.fm (Gwenchlan) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm Message-ID: <421C97CB.2050205@fr.fm> On Wednesday, 23 February 2005 at 06:33 -0800, Daniel Stutzbach wrote: >On Wed, Feb 23, 2005 at 03:23:21PM +0100, Gwenchlan wrote: >> I am looking for papers or information about node counting on overlays, >> but so far without success. >> In a distributed fashion, a node would be able to start a process to >> estimate the overlay size. >> Has anyone here seen something like this recently? >> Any clues about that? >> I think I will have to use random walkers. > >Are you looking for an algorithm that will estimate the overlay size >for use by the overlay, or are you looking for measurement techniques? > > Hi Daniel, probably the first option, in order to launch "on demand" measurement for overlay maintenance (by the overlay itself), for example. The request initiator would be waiting for a more or less precise estimation, depending on the dynamicity and the expected response time. From mgp at ucla.edu Wed Feb 23 18:27:41 2005 From: mgp at ucla.edu (Michael Parker) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm In-Reply-To: <421C97CB.2050205@fr.fm> References: <421C97CB.2050205@fr.fm> Message-ID: <421CCB1D.1030907@ucla.edu> Interestingly enough, the topology of some DHTs makes this not too difficult to calculate. In Pastry/Bamboo, for example: If your leaf set overlaps, it's just the number of entries in your leaf set. If your leaf set does not overlap, divide the size of the ring (e.g.
2^128, 2^160) by the span of your leaf set (i.e., the farthest clockwise node minus the farthest counterclockwise node, modulo the ring size), and multiply by the size of your leaf set. Basically, what this means is if your leaf set is size L, and it spans a fraction x of the node identifier space, the size of the network is approximately L * x^-1. To improve accuracy, contact the two farthest nodes in your leaf set and ask them for their leaf sets, merging them into yours before calculating. That way, you have a larger effective L. The same can be done for Chord using its successor list and predecessor. Alternatively, in these two networks, since the number of nodes in your routing table is log_2 N, you can estimate the size of your network as 2^k, where k is the number of filled rows in your routing table. Finally, although I'm no Kademlia expert, I think you can estimate the size of the network by 2^k, where k is the number of buckets in your routing table. - Michael Parker Gwenchlan wrote: > On Wednesday, 23 February 2005 at 06:33 -0800, Daniel Stutzbach wrote: > >> On Wed, Feb 23, 2005 at 03:23:21PM +0100, Gwenchlan wrote: >> >>> I am looking for papers or information about node counting on >>> overlays, but so far without success. >>> In a distributed fashion, a node would be able to start a process to >>> estimate the overlay size. >>> Has anyone here seen something like this recently? >>> Any clues about that? >>> I think I will have to use random walkers. >> >> >> Are you looking for an algorithm that will estimate the overlay size >> for use by the overlay, or are you looking for measurement techniques? >> >> > Hi Daniel, > probably the first option, in order to launch "on demand" measurement > for overlay maintenance (by the overlay itself), for example. > The request initiator would be waiting for a more or less precise > estimation, depending on the dynamicity and the expected response time.
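[Archive editor's sketch: the leaf-set estimate Michael Parker describes above boils down to N ~= L / x. The following toy code is illustrative only -- the function name and example numbers are hypothetical, not taken from Bamboo or Pastry code.]

```python
# Toy sketch of the leaf-set size estimate described above for a
# Pastry/Bamboo-style ring. RING matches a 160-bit (SHA-1) identifier
# space; the example numbers are hypothetical.

RING = 2 ** 160

def estimate_size(leaf_set_size, span):
    """Estimate the network size as N ~= L / x, where x = span / RING
    is the fraction of the identifier ring the leaf set covers."""
    x = span / RING
    return leaf_set_size / x

# A leaf set of 8 nodes spanning 1/128 of the ring suggests about
# 8 * 128 = 1024 nodes in the whole network.
print(estimate_size(8, RING // 128))  # -> 1024.0
```

In a real node the span would be computed from node identifiers, as the message says: farthest clockwise member minus farthest counterclockwise member, modulo RING.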
> _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From em at em.no-ip.com Thu Feb 24 03:13:57 2005 From: em at em.no-ip.com (Enzo Michelangeli) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm References: <421C97CB.2050205@fr.fm> <421CCB1D.1030907@ucla.edu> Message-ID: <00d801c51a1e$f3813740$0200a8c0@em.noip.com> ----- Original Message ----- From: "Michael Parker" To: "Peer-to-peer development." Sent: Thursday, February 24, 2005 2:27 AM Subject: Re: [p2p-hackers] Node counting algorithm > Interestingly enough, the topology of some DHTs makes this not too > difficult to calculate. In Pastry/Bamboo, for example: [...] > Finally, although I'm no Kademlia expert, I think you can estimate the > size of the network by 2^k, where k is the number of buckets in your > routing table. More accurately, by the number of nodes contained in each k-bucket. Each k-bucket gets full after k nodes, and that clips its contribution to k; however, partially-filled k-buckets do contribute useful information. See e.g. my posting archived at http://zgp.org/pipermail/p2p-hackers/2004-June/001991.html , so far with no followup. Enzo From srhea at cs.berkeley.edu Thu Feb 24 05:32:49 2005 From: srhea at cs.berkeley.edu (Sean C. Rhea) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm In-Reply-To: <421CCB1D.1030907@ucla.edu> References: <421C97CB.2050205@fr.fm> <421CCB1D.1030907@ucla.edu> Message-ID: <3ab998d35d23fedc340ce0364a89aa8c@cs.berkeley.edu> On Feb 23, 2005, at 10:27 AM, Michael Parker wrote: > If your leaf set overlaps, it's just the number of entries in your > leaf set. > If your leaf set does not overlap, divide the size of the ring (e.g. 
> 2^128, 2^160) by the span of your leaf set (i.e., the farthest > clockwise node minus the farthest counterclockwise node, modulo the > ring size), and multiply by the size of your leaf set. Basically, what > this means is if your leaf set is size L, and it spans a percentage x > of the node identifier space, the size of the network is approximately > L * x^-1. To improve accuracy, ask the two farthest nodes in your leaf > set and ask them for their leaf sets, merging them into yours before > calculating. That way, you have a larger effective L. This technique gives estimates that, on average, overestimate the size of the network if you pick node identifiers uniformly at random (UAR). The reason is that UAR doesn't mean evenly distributed; some nodes' leaf sets cover much more than others. If you have one node whose leaf set covers a larger portion of the key space, that node underestimates the size of the ring, but a lot of other nodes end up covering less of the key space (to make room for the larger one) and end up overestimating the network size. When you average them all, the few nodes that underestimate don't make up for all the rest that overestimate it. IANAM (I am not a mathematician), but this is what happens when you simulate it at least, with identifiers drawn using SHA. References for more mathematical explanations and a better algorithm are described in this paper: http://iptps05.cs.cornell.edu/PDFs/CameraReady_174.pdf Sean -- Give a man a fish and he will eat for a day. Teach him how to fish, and he will sit in a boat and drink beer all day. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050223/45d8b186/PGP.pgp From gwenchlan at fr.fm Thu Feb 24 08:37:35 2005 From: gwenchlan at fr.fm (Gwenchlan) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Node counting algorithm Message-ID: <421D924F.20403@fr.fm> On Wednesday, February 23, 2005 at 21:32 -0800, Sean C. Rhea wrote: >On Feb 23, 2005, at 10:27 AM, Michael Parker wrote: >> If your leaf set overlaps, it's just the number of entries in your >> leaf set. >> If your leaf set does not overlap, divide the size of the ring (e.g. >> 2^128, 2^160) by the span of your leaf set (i.e., the farthest >> clockwise node minus the farthest counterclockwise node, modulo the >> ring size), and multiply by the size of your leaf set. Basically, what >> this means is if your leaf set is size L, and it spans a fraction x >> of the node identifier space, the size of the network is approximately >> L * x^-1. To improve accuracy, contact the two farthest nodes in your leaf >> set and ask them for their leaf sets, merging them into yours before >> calculating. That way, you have a larger effective L. > >This technique gives estimates that, on average, overestimate the size >of the network if you pick node identifiers uniformly at random (UAR). >The reason is that UAR doesn't mean evenly distributed; some nodes' >leaf sets cover much more than others. If you have one node whose leaf >set covers a larger portion of the key space, that node underestimates >the size of the ring, but a lot of other nodes end up covering less of >the key space (to make room for the larger one) and end up >overestimating the network size. When you average them all, the few >nodes that underestimate don't make up for all the rest that >overestimate it. > >IANAM (I am not a mathematician), but this is what happens when you >simulate it at least, with identifiers drawn using SHA.
References for >more mathematical explanations and a better algorithm are described in >this paper: > > http://iptps05.cs.cornell.edu/PDFs/CameraReady_174.pdf > >Sean > Thanks for these DHT heuristics. I omitted to specify that I was looking for tricks that apply to unstructured networks (probably random graphs), so we cannot exploit identifier density here. I was thinking about using a flexible method like random walks, but K random walkers launched by the initiator raise problems of scalability and accuracy, as the overlay may (more or less) vary during the counting itself, a phenomenon amplified by network size... From eugen at leitl.org Thu Feb 24 12:19:40 2005 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] [IP] New paper on poisoning and pollution of P2P networks (fwd from dave@farber.net) Message-ID: <20050224121940.GD1404@leitl.org> ----- Forwarded message from David Farber ----- From: David Farber Date: Thu, 24 Feb 2005 06:09:58 -0500 To: Ip Subject: [IP] New paper on poisoning and pollution of P2P networks User-Agent: Microsoft-Entourage/11.1.0.040913 Reply-To: dave@farber.net ------ Forwarded Message From: Joseph Lorenzo Hall Reply-To: Date: Wed, 23 Feb 2005 23:20:22 -0800 To: Dave Farber , Declan McCullagh Subject: New paper on poisoning and pollution of P2P networks Hi Dave, Declan... I thought you two might enjoy this paper. -Joe ---- ## New paper on poisoning and pollution of P2P networks ## http://groups.sims.berkeley.edu/pam-p2p/index.php?p=40 [Nicolas Christin][1] has just put the finishing touches on a new paper authored with [Andreas Weigend][2] and SIMS professor [John Chuang][3], ["Content Availability, Pollution and Poisoning in File Sharing Peer-to-Peer Networks"][4] that will be presented at [ACM's Conference on Electronic Commerce][5] this summer in Vancouver, Canada.
Here is the abstract: [1]: http://www.sims.berkeley.edu/~christin/ [2]: http://www.weigend.com/ [3]: http://www.sims.berkeley.edu/~chuang/ [4]: http://p2pecon.berkeley.edu/pub/CWC-EC05.pdf [5]: http://www.acm.org/sigs/sigecom/ec05/ > Copyright holders have been investigating technological solutions to prevent distribution of copyrighted materials in peer-to-peer file sharing networks. A particularly popular technique consists in poisoning a specific item (movie, song, or software title) by injecting a massive number of decoys into the peer-to-peer network, to reduce the availability of the targeted item. In addition to poisoning, pollution, that is, the accidental injection of unusable copies of files in the network, also decreases content availability. In this paper, we attempt to provide a first step toward understanding the differences between pollution and poisoning, and their respective impact on content availability in peer-to-peer file sharing networks. To that effect, we conduct a measurement study of content availability in the four most popular peer-to-peer file sharing networks, in the absence of poisoning, and then simulate different poisoning strategies on the measured data to evaluate their potential impact. We exhibit a strong correlation between content availability and topological properties of the underlying peer-to-peer network, and show that the injection of a small number of decoys can seriously impact the users' perception of content availability. This is a really interesting paper. They measure a number of P2P network metrics - query response time, temporal stability, spatial stability and download completion time - using a widely distributed set of PCs on the [PlanetLab network][6] running scripted P2P software. This is a clever way to simultaneously study the characteristics of different P2P networks (notably eDonkey, eDonkey/Overnet, FastTrack and Gnutella) as well as quantitatively illustrate differences in the underlying network algorithms.
The really nifty part of this paper, in my opinion, involves measuring the effects of various content poisoning and pollution strategies. Their results show that fairly simple strategies are fairly simply defeated while more sophisticated and hybrid strategies aimed at mucking-up-the-net are difficult to detect and thwart. [6]: http://www.planet-lab.org/ -- Joseph Lorenzo Hall UC Berkeley, SIMS PhD Student http://pobox.com/~joehall/ blog: http://pobox.com/~joehall/nqb2/ ------ End of Forwarded Message ------------------------------------- You are subscribed as eugen@leitl.org To manage your subscription, go to http://v2.listbox.com/member/?listname=ip Archives at: http://www.interesting-people.org/archives/interesting-people/ ----- End forwarded message ----- -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20050224/6bf3d531/attachment.pgp From sam at neurogrid.com Fri Feb 25 22:19:17 2005 From: sam at neurogrid.com (Sam Joseph) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Agents and P2P Workshop - submission form live Message-ID: <421FA465.6030909@neurogrid.com> *** our apologies if you receive multiple copies of this e-mail *** Call for Papers for the Fourth International Workshop on Agents and Peer-to-Peer Computing (AP2PC 2005) http://p2p.ingce.unibo.it/ held in AAMAS 2005 International Conference on Autonomous Agents and MultiAgent Systems Utrecht University, Netherlands. from 25 July - 29 July 2005. 
[SUBMISSION FORM NOW AVAILABLE] https://msrcmt.research.microsoft.com/AP2PC2005/CallForPapers.aspx [see below for more details] CALL FOR PAPERS: Peer-to-peer (P2P) computing has attracted enormous media attention, initially spurred by the popularity of file sharing systems such as Napster, Gnutella, and Morpheus. More recently, systems like BitTorrent and eDonkey have continued to sustain that attention. New techniques such as distributed hash-tables (DHTs), semantic routing, and Plaxton Meshes are being combined with traditional concepts such as Hypercubes, Trust Metrics and caching techniques to pool together the untapped computing power at the "edges" of the internet. These new techniques and possibilities have generated a lot of interest in many industrial organizations, and have resulted in the creation of a P2P working group on standardization in this area. (http://www.irtf.org/charters/p2prg.html). In P2P computing, peers and services forgo central coordination and dynamically organise themselves to support knowledge sharing and collaboration, in both cooperative and non-cooperative environments. The success of P2P systems strongly depends on a number of factors. First, the ability to ensure equitable distribution of content and services. Economic and business models which rely on incentive mechanisms to supply contributions to the system are being developed, along with methods for controlling the "free riding" issue. Second, the ability to enforce provision of trusted services. Reputation-based P2P trust management models are becoming a focus of the research community as a viable solution. The trust models must balance both constraints imposed by the environment (e.g. scalability) and the unique properties of trust as a social and psychological phenomenon. Recently, we are also witnessing a move of the P2P paradigm to embrace mobile computing in an attempt to achieve even greater ubiquity.
The possibility of services related to physical location and the relation with agents in physical proximity could introduce new opportunities and also new technical challenges. Although researchers working on distributed computing, MultiAgent Systems, databases and networks have been using similar concepts for a long time, it is only fairly recently that papers motivated by the current P2P paradigm have started appearing in high-quality conferences and workshops. Research in agent systems in particular appears to be most relevant because, since their inception, MultiAgent Systems have always been thought of as collections of peers. The MultiAgent paradigm can thus be superimposed on the P2P architecture, where agents embody the description of the task environments, the decision-support capabilities, the collective behavior, and the interaction protocols of each peer. The emphasis in this context on decentralization, user autonomy, dynamic growth and other advantages of P2P, also leads to significant potential problems. Most prominent among these problems are coordination: the ability of an agent to make decisions on its own actions in the context of activities of other agents, and scalability: the value of P2P systems lies in how well they scale along several dimensions, including complexity, heterogeneity of peers, robustness, traffic redistribution, and so forth. It is important to scale up coordination strategies along multiple dimensions to enhance their tractability and viability, and thereby to widen potential application domains. These two problems are common to many large-scale applications. Without coordination, agents may waste their efforts, squander resources, and fail to achieve their objectives in situations requiring collective effort. This workshop will bring together researchers working on agent systems and P2P computing with the intention of strengthening this connection.
Researchers from other related areas such as distributed systems, networks and database systems will also be welcome (and, in our opinion, have a lot to contribute). We seek high-quality and original contributions on the general theme of "Agents and P2P Computing". The following is a non-exhaustive list of topics of special interest: - Intelligent agent techniques for P2P computing - P2P computing techniques for MultiAgent Systems - The Semantic Web, Semantic Coordination Mechanisms and P2P systems - Scalability, coordination, robustness and adaptability in P2P systems - Self-organization and emergent behavior in P2P networks - E-commerce and P2P computing - Participation and Contract Incentive Mechanisms in P2P Systems - Computational Models of Trust and Reputation - Community of interest building and regulation, and behavioral norms - Intellectual property rights in P2P systems - P2P architectures - Scalable Data Structures for P2P systems - Services in P2P systems (service definition languages, service discovery, filtering and composition etc.) - Knowledge Discovery and P2P Data Mining Agents - P2P oriented information systems - Information ecosystems and P2P systems - Security issues in P2P networks - Pervasive computing based on P2P architectures (ad-hoc networks, wireless communication devices and mobile systems) - Grid computing solutions based on agents and P2P paradigms - Legal issues in P2P networks PANEL The theme of the panel will be Decentralised Trust in P2P and MultiAgent Systems. As P2P and MultiAgent systems become larger and more diverse, the risks of interacting with malicious peers become increasingly problematic. The panel will address how computational trust issues can be addressed in P2P and MultiAgent systems. The panel will involve short presentations by the panelists followed by a discussion session involving the audience.
IMPORTANT DATES Paper submission: 14th March 2005 Acceptance notification: 18th April 2005 Workshop: 25-26th July 2005 Camera ready for post-proceedings: 20th September 2005 REGISTRATION Accommodation and workshop registration will be handled by the AAMAS 2005 organization along with the main conference registration. SUBMISSION INSTRUCTIONS Previously unpublished papers should be formatted according to the LNCS/LNAI author instructions for proceedings and they should not be longer than 12 pages (about 5000 words including figures, tables, references, etc.). Please submit your papers through the Microsoft conference management system: https://msrcmt.research.microsoft.com/AP2PC2005/CallForPapers.aspx In addition, please carefully consider the issues our reviewers will be weighing; some of these can be seen in this form: http://www.neurogrid.net/ap2pc2005/review-form.html At the very least we would encourage all authors to read the abstracts of the papers submitted to previous workshops - available from the links below: http://p2p.ingce.unibo.it/2002/ http://www.springeronline.com/sgw/cda/frontpage/0,11855,5-40109-22-2991818-0,00.html http://p2p.ingce.unibo.it/2003/ http://www.springeronline.com/sgw/cda/frontpage/0,11855,5-40109-22-37060961-0,00.html http://p2p.ingce.unibo.it/2004/ Particular preference will be given to both novel approaches and those papers that build upon the contributions of papers presented at previous AP2PC workshops. PUBLICATION Accepted papers will be distributed to the workshop participants as workshop notes. As in previous years, post-proceedings of the revised papers (namely accepted papers presented at the workshop) will be submitted for publication to Springer in the Lecture Notes in Computer Science series.
ORGANIZING COMMITTEE Program Co-chairs Zoran Despotovic School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne (EPFL) CH-1015 Lausanne, Switzerland E-mail: zoran.despotovic@epfl.ch Sam Joseph (main contact) Dept. of Information and Computer Science, University of Hawaii at Manoa, USA 1680 East-West Road, POST 309, Honolulu, HI 96822 E-mail: srjoseph@hawaii.edu Claudio Sartori Dept. of Electronics, Computer Science and Systems, University of Bologna, Italy Viale Risorgimento, 2 - 40136 Bologna Italy E-mail: claudio.sartori@unibo.it Panel Chair Omer Rana School of Computer Science, Cardiff University, UK Queen's Buildings, Newport Road, Cardiff CF24 3AA, UK E-mail: o.f.rana@cs.cardiff.ac.uk PROGRAM COMMITTEE Karl Aberer, EPFL, Lausanne, Switzerland Alessandro Agostini, ITC-IRST, Trento, Italy Djamal Benslimane, Université Claude Bernard, France Sonia Bergamaschi, University of Modena and Reggio-Emilia, Italy M. Brian Blake, Georgetown University, USA Rajkumar Buyya, University of Melbourne, Australia Paolo Ciancarini, University of Bologna, Italy Costas Courcoubetis, Athens University of Economics and Business, Greece Yogesh Deshpande, University of Western Sydney, Australia Asuman Dogac, Middle East Technical University, Turkey Boi V. Faltings, EPFL, Lausanne, Switzerland Maria Gini, University of Minnesota, USA Dina Q.
Goldin, University of Connecticut, USA Chihab Hanachi, University of Toulouse, France Mark Klein, Massachusetts Institute of Technology, USA Matthias Klusch, DFKI, Saarbrucken, Germany Tan Kian Lee, National University of Singapore, Singapore Zakaria Maamar, Zayed University, UAE Wolfgang Mayer, University of South Australia, Australia Dejan Milojicic, Hewlett Packard Labs, USA Alberto Montresor, University of Bologna, Italy Luc Moreau, University of Southampton, UK Jean-Henry Morin, University of Geneve, Switzerland Andrea Omicini, University of Bologna, Italy Maria Orlowska, University of Queensland, Australia Aris. M. Ouksel, University of Illinois at Chicago, USA Mike Papazoglou, Tilburg University, Netherlands Paolo Petta, Austrian Research Institute for AI, Austria, Jeremy Pitt, Imperial College, UK Dimitris Plexousakis, Institute of Computer Science, FORTH, Greece Martin Purvis, University of Otago, New Zealand Omer F. Rana, Cardiff University, UK Douglas S. Reeves, North Carolina State University, USA Thomas Risse, Fraunhofer IPSI, Darmstadt, Germany Pierangela Samarati, University of Milan, Italy Christophe Silbertin-Blanc, University of Toulouse, France Maarten van Steen, Vrije Universiteit, Netherlands Katia Sycara, Robotics Institute, Carnegie Mellon University, USA Peter Triantafillou, Technical University of Crete, Greece Anand Tripathi, University of Minnesota, USA Vijay K. 
Vaishnavi, Georgia State University, USA Francisco Valverde-Albacete, Universidad Carlos III de Madrid, Spain Maurizio Vincini, University of Modena and Reggio-Emilia, Italy Fang Wang, BTexact Technologies, UK Gerhard Weiss, Technische Universitaet, Germany Bin Yu, North Carolina State University, USA Franco Zambonelli, University of Modena and Reggio-Emilia, Italy From hal at finney.org Fri Feb 25 23:30:50 2005 From: hal at finney.org (Hal Finney) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question Message-ID: <20050225233050.C0EAD57EBA@finney.org> Nick Johnson writes: > If I can be forgiven a stupid question: > I'm reading in detail the paper > (http://www.scs.cs.nyu.edu/~mfreed/docs/authcodes-ieee04.pdf) on > homomorphic hash functions for use with Digital Fountain codes in > preparation for implementing it. The problem I'm coming up against is in > the description of the modifications to the Fountain code described on > page 5. With their example settings, 256 bit long sub-blocks are now > added modulo a 257 bit prime. This makes sense - what I don't get is how > to encode the result in 256 bits! What is one supposed to do if the sum > of the selected blocks overflows 256 bits? I think... you would have to plan on allocating 257 bits for the output of this process. Blocks would be 257 bits long. I don't know if that is a show stopper for an implementation. There is a possible workaround. You could choose a prime q which was just barely, barely, barely 257 bits long. Let it be 2^257 plus some number less than 2^170 or so. In other words, let the prime q start with 1000000... for 80+ bits of zeros. Now the chance of a random Z_q value happening to be > 256 bits will be vanishingly small. Unfortunately, files are not random. Someone could choose a file which had special values that would overflow 256 bits. You could fix that by first pre-randomizing the file in some reversible way. 
A suitable cryptographic primitive is called an All Or Nothing Transform. See http://theory.lcs.mit.edu/~boyko/aont-oaep.html for an example. So you'd first AONT transform the file, which would randomize the values; then you'd use this q for the coding, which could overflow in principle but not in practice. In the end, after reconstructing the file, you'd reverse the AONT transform. In your sci.crypt posting, you also asked: > 1) The algorithm requires two primes q and p. These primes are known to > both the publisher and the verifiers. Will security be reduced if the > same primes are used for all publishers, or can a single pair of primes > be used globally? You should be able to use the same primes globally, if the security of the size of your primes is adequate. > 2) What level of security does this algorithm provide with p and q > being 1024 bit and 257 bit, respectively? Eg, how many operations or > how much computing time would be required to compute a collision? How > does this fare with reduced lengths of p and q? It is a little hard to quantify. The security will be the minimum of the security levels of p and q against the discrete log problem. For q it is easy, it is half the size of q, or about 128 bits (i.e. 2^128 work to break it). That should be more than enough. For p it is harder. I think most people would agree that a p of 1024 bits corresponds to a security of perhaps 80-90 bits for discrete logs. This is somewhat marginal. I would suggest a p of more like 2048 bits, with a 256 bit q. There have been a number of proposals for theoretical factoring machines, most of which could be adapted at somewhat greater cost to finding discrete logs. Many people today are worried that 1024 bit keys can no longer be considered extremely safe. Particularly if you use the same p throughout the system, it would be wise to use something a little bigger. 
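[For readers following along, the hash construction being discussed can be demonstrated with deliberately tiny toy parameters. This is only an illustrative sketch: the primes, generators, and block values below are made up, nothing like the 1024-bit p and 257-bit q above, and a real implementation picks its generators independently at random:]

```python
# Toy sketch of the homomorphic hash under discussion: a block is split
# into sub-blocks t_i and hashed as prod_i g_i^t_i mod p, with the g_i
# lying in the order-q subgroup of Z_p^*. All parameters here are tiny
# illustrative values, nothing like the 1024/257-bit sizes above.

p = 1019  # toy safe prime: p = 2*q + 1
q = 509   # order of the subgroup the g_i live in
g_i = [pow(2, 2 * (k + 1), p) for k in range(4)]  # even powers of 2 are
# quadratic residues mod p, hence of order q (made-up toy generators)

def homomorphic_hash(subblocks):
    h = 1
    for g, t in zip(g_i, subblocks):
        h = (h * pow(g, t % q, p)) % p
    return h

# The property that matters for fountain codes: the hash of a
# sub-block-wise sum of two blocks equals the product of their hashes,
# so a verifier can check encoded (summed) blocks against the original
# blocks' hashes alone.
b1 = [3, 1, 4, 1]
b2 = [5, 9, 2, 6]
summed = [(x + y) % q for x, y in zip(b1, b2)]
assert homomorphic_hash(summed) == \
       (homomorphic_hash(b1) * homomorphic_hash(b2)) % p
```

[The final assertion is the homomorphic property itself; forging a block that matches a given hash reduces to computing discrete logs mod p, which is why the security estimates above are framed in discrete-log terms for both p and q.]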
Hal Finney From hal at finney.org Fri Feb 25 23:35:07 2005 From: hal at finney.org (Hal Finney) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question Message-ID: <20050225233507.0BA9557EBA@finney.org> Quick correction: > There is a possible workaround. You could choose a prime q which was > just barely, barely, barely 257 bits long. Let it be 2^257 plus some > number less than 2^170 or so. I should have said, let it be 2^256 plus some number less than... In other words, a 257 bit prime that is just barely bigger than 2^256. Hal From arachnid at notdot.net Sat Feb 26 10:45:56 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question In-Reply-To: <20050225233050.C0EAD57EBA@finney.org> References: <20050225233050.C0EAD57EBA@finney.org> Message-ID: <42205364.5040301@notdot.net> Hal Finney wrote: >So you'd first AONT transform the file, which would randomize the values; >then you'd use this q for the coding, which could overflow in principle >but not in practice. In the end, after reconstructing the file, you'd >reverse the AONT transform. > > Interesting idea - thanks for suggesting it. I'd have to consider further to decide if the extra complexity in the protocol is worth the size and computation relieved by not having to store the 257th bit. >I would suggest a p of more like 2048 bits, with a 256 bit q. There have >been a number of proposals for theoretical factoring machines, most >of which could be adapted at somewhat greater cost to finding discrete >logs. Many people today are worried that 1024 bit keys can no longer be >considered extremely safe. Particularly if you use the same p throughout >the system, it would be wise to use something a little bigger. > > This could be a problem: 1024 bit hashes are already pretty large, 2048 bit would be substantially worse. 
I was, in fact, hoping it would be practical to use a smaller prime! Should someone successfully break the 1024 bit hash, what would the consequences be? Could they compute collisions for a single block (a break that is meaningless, since using the per-publisher model the only one setting blocks is the publisher, who can compute collisions with ease anyway - a backup file-wide SHA-1 hash will act to prevent this), could they conduct a preimage attack against a single block, or could they create collisions and preimage attacks for any file published with that K? Worse, could they compute collisions for every file published using that p? Thanks, Nick Johnson From hal at finney.org Sat Feb 26 18:55:29 2005 From: hal at finney.org (Hal Finney) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question Message-ID: <20050226185529.46E7157EBA@finney.org> Nick Johnson writes: > Should someone successfully break the 1024 bit hash, what would the > consequences be? Could they compute collisions for a single block (a > break that is meaningless, since using the per-publisher model the only > one setting blocks is the publisher, who can compute collisions with > ease anyway - a backup file-wide SHA-1 hash will act to prevent this), > could they conduct a preimage attack against a single block, or could > they create collisions and preimage attacks for any file published with > that K? Worse, could they compute collisions for every file published > using that p? As I understand it, the hash on a block is computed by breaking it into 256 bit pieces t_i, and using pre-defined constants g_i, computing the product over i of g_i^t_i. If someone did the work to break discrete logs mod p, they could compute the discrete logs of the g_i relative to each other, and take discrete logs of hashes. This would allow them to create a block that would match the hash of any given block. 
They could compute preimages for any hash, and they could compute collisions for any file hash using that p. The nature of the discrete log attack based on p is that it is expensive to mount, but once you have done the work, you can easily find more discrete logs of other values using that same modulus p. However, on the other hand, these hypothetical machines are really expensive, something like a billion dollars in today's money. Some people speculate that it might be down to a hundred million in a few years. The main concern with 1024 bit moduli has been cryptographic, where "they" could factor your key. Since it's so easy to move to bigger keys, many people are doing it, just to make sure that no government or anyone else who has a billion dollars to spend could read their messages. I guess everyone feels like they are special enough that their secrets might be worth that much. In your case, maybe this level of paranoia is unnecessary. Nobody is going to spend a billion dollars to break this. You could think about how much the opponents of this system would be willing to spend, look at Moore's law over the time frame during which you would envision this system being used, and estimate a desired security level from that. There's also the point that if someone did have the money to build such a machine, they would probably rather factor RSA moduli secretly than publicly start messing with your hashes. As soon as someone noticed a matching hash it would give away the existence of the machine. Then people would switch to a larger hash, making the machine useless. All that money spent would be wasted after forging a single hash. Using it to break encryption keys allows the machine to be kept secret, making it far more valuable. http://mathworld.wolfram.com/RSANumber.html has a nice chart showing the progress over the years in factoring RSA moduli, which are within an order of magnitude of difficulty of finding discrete logs. 
The largest RSA modulus factored is 576 bits, although looking at the chart I expect 640 will probably fall within a year or so. One point is that you should try to design the system to allow the hash size to be upgradeable if and when it became necessary. Then, maybe you could even get away with something a little smaller than 1024, with the idea of upgrading in five years or so, when faster networks and cheaper disk storage will make a larger hash more palatable. This will also let you recover from a surprise breakthrough. Another point is that if you used a different p for every file, it would require a billion dollars per file break rather than a billion to break the whole system. In that case I'd feel much safer about using a smaller p, even maybe 768 bits if you accept that one or two files might be cracked by say 2010. Hal From mfreed at cs.nyu.edu Sun Feb 27 05:45:50 2005 From: mfreed at cs.nyu.edu (Michael J. Freedman) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question (fwd) Message-ID: [Looks like Max was getting his posts rejected from the mailing list, so I'm forwarding his response. --mike] ---------- Forwarded message ---------- Date: Sat, 26 Feb 2005 12:53:31 -0500 From: Maxwell Krohn To: Hal Finney Cc: Michael J Freedman , p2p-hackers@zgp.org Subject: Re: [p2p-hackers] Homomorphic hashing and fountain codes - implementation question (fwd) This is the response I sent to Nick personally about the one-bit overflow problem. This was the simplest solution we came up with. It makes all encoded blocks 1/256 bigger, where 256 is the number of bits per sub-block. Maxwell Krohn (krohn@mit.edu) wrote: > Hi, > > There might be 1 overflow bit per subblock, making 512 bits per > block, and 16 bytes per block of overflow bits. 
> > Our implementation of the Codes is here by the way, available by > anonymous CVS: > > http://cvs.pdos.lcs.mit.edu/cvs/codes1/ > > In memory, for the encoders and decoders, we store subblocks as big > integers, which can grow to 257 bits long. However, for sending over > the network (and storing on disk, perhaps) we want a denser packing. So > what we do is we shear off the top bit of each subblock and pack them > together at the beginning of every block. In our implementation, it's > called the "carrybitmap_t" object, which you can grep for in our code. > > Hope this helps, > > Max From arachnid at notdot.net Mon Feb 28 10:53:22 2005 From: arachnid at notdot.net (Nick Johnson) Date: Sat Dec 9 22:12:51 2006 Subject: [p2p-hackers] SSM for Java? Message-ID: <4222F822.8020207@notdot.net> Has anyone come across a single-source-multicast library for Java, possibly using JNI? I've seen tantalizing hints of one, but no actual code. -Nick Johnson