From dnm at pobox.com Mon Oct 1 01:45:01 2001 From: dnm at pobox.com (Dan Moniz) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] Fwd: International Workshop on Global and Peer-to-Peer Computing Message-ID: [ Forwarded from the bluesky list, sans some irrelevant headers and footers. -- dnm ] >Date: Mon, 1 Oct 2001 03:43:04 -0400 >Subject: International Workshop on Global and Peer-to-Peer Computing >To: "Global-Scale Distributed Storage Systems" >From: "Franck Cappello" > >Dear Colleagues, >Please find below the call for papers for the >"International Workshop on Global and Peer-to-Peer Computing", which will >take place along with the CCGRID 2002 conference in Berlin, Germany, 21-24 >May 2002. >We take this opportunity to invite you to submit scientific papers to this >workshop. >Please forgive us if you receive several copies of this message. >Feel free to forward this message to your colleagues. >Best Regards. > > > International Workshop on > "Global and Peer-to-Peer Computing > On Large Scale Distributed Systems" > (http://www.lri.fr/~fci/GP2PC.htm) > > organized at the IEEE International Symposium > on Cluster Computing and the Grid 2002 > CCGRID 2002 > >In cooperation with the IEEE Task Force on Cluster Computing (TFCC) > > >SCOPE > >The wide spread of the World Wide Web, along with the availability >of increasingly powerful off-the-shelf hardware, gives rise to a new >infrastructure for distributed computing. Besides traditional grid >computing systems, it is now possible to run computations on a large >number of workstations, personal computers and servers using a large-scale >and loosely coupled system. > >This type of distributed computing, also referred to as Global >Computing, is currently used for a large variety of physics, mathematics >and biology applications, mostly following the master/slave paradigm. 
>Peer-to-Peer interaction models give the opportunity to enlarge the number >of users >and applications of Global Computing by allowing any resource to send >job and/or data requests, provide data and/or computation services, and >participate in maintaining the infrastructure itself. > >Because of their size and the high volatility of their resources, Global >Computing and Global Peer-to-Peer Computing platforms give researchers the >opportunity to revisit the major fields of distributed computing: >protocols, infrastructures, security, certification, fault tolerance, >scheduling, performance, etc. > >Authors are invited to submit original, unpublished work describing >current research in the area of Global and Peer-to-Peer Computing, >including design and analysis of computational infrastructures as well as >applications in science, technology, and commerce. > >TOPICS > >Topics of interest include, but are not limited to: >* Global Computing and Peer-to-Peer computing platforms, >* Autonomous, self-organizing and/or mobile distributed systems, >* Middleware, programming models, environments and toolkits, >* Protocols for resource management/discovery/reservation/scheduling, >* Economic considerations of resource usage (protocols, accounting), >* Storage in Global Computing Infrastructures (strategies, protocols), >* Performance monitoring, benchmarking, evaluation and modeling of > Global Computing and Peer-to-Peer systems and/or components thereof, >* Security, management and monitoring of resources, >* Result certification (detection/tolerance of wrong/corrupted results), >* Parallel computing on large scale distributed systems, >* Compute- or I/O-driven applications (scientific, engineering, >business), >* Global and Peer-to-Peer computing applications (programmed from >scratch, > ported from sequential or parallel versions, or adapted to fit a >global > computing environment) > >PAPER SUBMISSION > >Authors are encouraged to: > >Submit (a) a full paper (max: 6 pages in 
length, formatted to the IEEE format) >Submit (b) a research statement (max: 2 pages in length, formatted to the IEEE >format) >Use a minimum of a 10pt font, and ensure the paper is printable on A4. IEEE >guidelines can be found on the IEEE website. Please email your papers to fci@lri.fr or >lalis@ics.forth.gr; email is the preferred method of submission. > >Full papers (category (a)) will be reviewed by the program committee for >relevance, clarity and the novelty of their results. If accepted, full papers >will be published in the conference proceedings by the IEEE Computer Society. >Authors may purchase two additional pages. > >Short papers (category (b)) will be published in a separate section. This >is to encourage work that is not yet advanced enough for a full paper. > >We also encourage authors to present novel ideas, critiques of existing >work, and application examples which demonstrate how Global and >Peer-to-Peer Computing technology could be effectively deployed. We also >welcome practical work which applies Global and Peer-to-Peer Computing >technology in novel and interesting ways. 
> >IMPORTANT DATES > >Papers due: November 24, 2001 >Notification to authors: December 21, 2001 >Final version of papers due: February 15, 2002 > >PROGRAM COMMITTEE (still expanding) > >Mark Baker, DCS, University of Portsmouth, UK >Taisuke Boku, CCP, University of Tsukuba, Japan >Franck Cappello, CNRS, Paris-South University, France >Henri Casanova, SDSC, California, USA >Christian Huitema, Microsoft, USA >Spyros Lalis, FORTH, Greece >Serge Petiton, LIFL, Lille University, France >Avi Rubin, AT&T Labs - Research, USA >Mitsuhisa Sato, CCP, University of Tsukuba, Japan > > >SESSION CHAIRS > >For more information please contact: >Franck Cappello, Spyros Lalis, >CNRS, Institute of Computer Science >Universite Paris-Sud, Foundation for Research and Technology >Hellas >France, Greece, >fci@lri.fr lalis@ics.forth.gr -- Dan Moniz [http://www.pobox.com/~dnm/] From Franck.Cappello at lri.fr Mon Oct 1 03:49:01 2001 From: Franck.Cappello at lri.fr (Franck.Cappello@lri.fr) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] GP2PC International Scientific Workshop Message-ID: <1001921421.3bb81b8d5eaf5@www.lri.fr> Dear Colleagues, Please find below the call for papers for the "International Workshop on Global and Peer-to-Peer Computing", which will take place along with the CCGRID 2002 conference in Berlin, Germany, 21-24 May 2002. We take this opportunity to invite you to submit scientific papers to this workshop. Please excuse us if you receive several copies of this message. Feel free to forward this message to your colleagues. Best Regards. 
International Workshop on Global and Peer-to-Peer Computing On Large Scale Distributed Systems (http://www.lri.fr/~fci/GP2PC.htm) organized at the IEEE International Symposium on Cluster Computing and the Grid 2002 CCGRID 2002 In cooperation with the IEEE Task Force on Cluster Computing (TFCC) SCOPE The wide spread of the World Wide Web, along with the availability of increasingly powerful off-the-shelf hardware, gives rise to a new infrastructure for distributed computing. Besides traditional grid computing systems, it is now possible to run computations on a large number of workstations, personal computers and servers using a large-scale and loosely coupled system. This type of distributed computing, also referred to as Global Computing, is currently used for a large variety of physics, mathematics and biology applications, mostly following the master/slave paradigm. Peer-to-Peer interaction models give the opportunity to enlarge the number of users and applications of Global Computing by allowing any resource to send job and/or data requests, provide data and/or computation services, and participate in maintaining the infrastructure itself. Because of their size and the high volatility of their resources, Global Computing and Global Peer-to-Peer Computing platforms give researchers the opportunity to revisit the major fields of distributed computing: protocols, infrastructures, security, certification, fault tolerance, scheduling, performance, etc. Authors are invited to submit original, unpublished work describing current research in the area of Global and Peer-to-Peer Computing, including design and analysis of computational infrastructures as well as applications in science, technology, and commerce. 
TOPICS Topics of interest include, but are not limited to: * Global Computing and Peer-to-Peer computing platforms, * Autonomous, self-organizing and/or mobile distributed systems, * Middleware, programming models, environments and toolkits, * Protocols for resource management/discovery/reservation/scheduling, * Economic considerations of resource usage (protocols, accounting), * Storage in Global Computing Infrastructures (strategies, protocols), * Performance monitoring, benchmarking, evaluation and modeling of Global Computing and Peer-to-Peer systems and/or components thereof, * Security, management and monitoring of resources, * Result certification (detection/tolerance of wrong/corrupted results), * Parallel computing on large scale distributed systems, * Compute- or I/O-driven applications (scientific, engineering, business), * Global and Peer-to-Peer computing applications (programmed from scratch, ported from sequential or parallel versions, or adapted to fit a global computing environment) PAPER SUBMISSION Authors are encouraged to: Submit (a) a full paper (max: 6 pages in length, formatted to the IEEE format) Submit (b) a research statement (max: 2 pages in length, formatted to the IEEE format) Use a minimum of a 10pt font, and ensure the paper is printable on A4. IEEE guidelines can be found on the IEEE website. Please email your papers to fci@lri.fr or lalis@ics.forth.gr; email is the preferred method of submission. Full papers (category (a)) will be reviewed by the program committee for relevance, clarity and the novelty of their results. If accepted, full papers will be published in the conference proceedings by the IEEE Computer Society. Authors may purchase two additional pages. Short papers (category (b)) will be published in a separate section. This is to encourage work that is not yet advanced enough for a full paper. 
We also encourage authors to present novel ideas, critiques of existing work, and application examples which demonstrate how Global and Peer-to-Peer Computing technology could be effectively deployed. We also welcome practical work which applies Global and Peer-to-Peer Computing technology in novel and interesting ways. IMPORTANT DATES Papers due: November 24, 2001 Notification to authors: December 21, 2001 Final version of papers due: February 15, 2002 PROGRAM COMMITTEE (still expanding) Mark Baker, DCS, University of Portsmouth, UK Taisuke Boku, CCP, University of Tsukuba, Japan Franck Cappello, CNRS, Paris-South University, France Henri Casanova, SDSC, California, USA Christian Huitema, Microsoft, USA Spyros Lalis, FORTH, Greece Serge Petiton, LIFL, Lille University, France Avi Rubin, AT&T Labs - Research, USA Mitsuhisa Sato, CCP, University of Tsukuba, Japan SESSION CHAIRS For more information please contact: Franck Cappello, Spyros Lalis, CNRS, Institute of Computer Science Universite Paris-Sud, Foundation for Research and Technology Hellas France, Greece, fci@lri.fr lalis@ics.forth.gr --------------------------------------------------------- Franck Cappello fci@lri.fr www.lri.fr/~fci Researcher within CNRS, LRI, Université Paris Sud, France tel +33 1 69 15 70 91 fax +33 1 69 15 65 86 COE Research Fellow, CCP, Tsukuba University, Japan tel +81 2 98 53 64 83 fax +81 2 98 53 64 06 --------------------------------------------------------- From rob at eorbit.net Mon Oct 1 17:24:01 2001 From: rob at eorbit.net (Mayhem & Chaos) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] MusicBrainz RDF Data dump Message-ID: <1001979805.23542.111.camel@cranky> Hi! Since Brandon has been looking at RDF data dumps, I finally got off my butt to put together an RDF dump of the data from the MusicBrainz project. As some of you might know, MusicBrainz is a music metadata database that allows users to identify audio CDs and digital audio tracks like MP3s or Vorbis files. 
The server and client software is released under the GPL and the LGPL, respectively. The data is covered by the OpenContent license in order to avoid the CDDB fiasco. I won't go into the details of the project here -- please check out the project at http://www.musicbrainz.org if you're interested in finding out more. You can download the RDF data dump from here: ftp://ftp.musicbrainz.org/pub/musicbrainz/rdfdump-2001-9-25.rdf.bz2 Please note that the URLs for the resources in the RDF dump are live and available from the MusicBrainz server. However, I still haven't completed the full specification of the musicbrainz namespaces -- I'm working on getting that done asap. Any feedback regarding this data dump would be deeply appreciated, since this is my first large scale data dump. -- --ruaok Freezerburn! All else is only icing. -- Soul Coughing Robert Kaye -- rob@eorbit.net -- http://www.mayhem-chaos.net From antr at microsoft.com Wed Oct 3 07:35:01 2001 From: antr at microsoft.com (Ant Rowstron) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] CFP: The 1st International Workshop on Peer-to-Peer Systems (IPTPS'02) Message-ID: <4BBF5F3B80921D47BFEB9C136808A47402D8D7D0@red-msg-08.redmond.corp.microsoft.com> CALL FOR PARTICIPATION: IPTPS'02 The 1st International Workshop on Peer-to-Peer Systems (IPTPS'02) 7-8 March, 2002 MIT Faculty Club, Cambridge, MA, USA. http://www.cs.rice.edu/Conferences/IPTPS02/ Peer-to-peer has emerged as a promising new paradigm for distributed computing. The 1st International Workshop on Peer-to-Peer Systems (IPTPS'02) aims to provide a forum for researchers active in peer-to-peer computing to discuss the state-of-the-art and to identify key research challenges in peer-to-peer computing. The goal of the workshop is to examine peer-to-peer technologies, applications and systems, and also to identify key research issues and challenges that lie ahead. 
In the context of this workshop, peer-to-peer systems are characterized as being decentralized, self-organizing distributed systems, in which all or most communication is symmetric. Topics of interest include, but are not limited to: * novel peer-to-peer applications and systems * peer-to-peer infrastructure * security in peer-to-peer systems * anonymity and anti-censorship * performance of peer-to-peer systems * workload characterization for peer-to-peer systems The program of the workshop will be a combination of invited review talks, presentations of position papers, and discussions. To ensure a productive workshop environment, attendance will be limited to about 35 participants who are active in the field. Each potential participant should submit a position paper of 5 pages or less that exposes a new problem, advocates a specific solution, or reports on actual experience. Participants will be invited based on the originality, technical merit and topical relevance of their submissions, as well as the likelihood that the ideas expressed in their submissions will lead to insightful technical discussions at the workshop. Please do not submit abbreviated versions of journal or conference papers. Online copies of the position papers will be made available prior to the workshop. We are investigating the possibility of producing a printed proceedings, including a summary of the interactions at the workshop, which would be mailed to participants after the workshop. 
Steering committee: Peter Druschel, Rice University, USA Frans Kaashoek, MIT, USA Antony Rowstron, Microsoft Research, UK Scott Shenker, ACIRI, Berkeley, USA Ion Stoica, UC Berkeley, USA Organizing chairs: Frans Kaashoek, MIT, USA Antony Rowstron, Microsoft Research, UK Program Committee: Ross Anderson, Cambridge University, UK Roger Dingledine, Reputation Technologies, Inc., USA Peter Druschel, Rice University, USA (co-chair) Steve Gribble, University of Washington, USA David Karger, MIT, USA John Kubiatowicz, UC Berkeley, USA Robert Morris, MIT, USA Antony Rowstron, Microsoft Research, UK (co-chair) Avi Rubin, AT&T Labs - Research, USA Scott Shenker, ACIRI, Berkeley, USA Ion Stoica, UC Berkeley, USA Guidelines for Submission: To submit, authors should follow the instructions at http://www.cs.rice.edu/Conferences/IPTPS02/submit/. Papers must be submitted by 18:00 GMT, Monday, 3 December 2001. The length of the paper must not exceed 5 pages (11pt font, 1 inch margins). All submissions will be acknowledged by email within 24 hours of receipt. Important Dates: * 3 December 2001 : Submission of position papers * 4 February 2002 : Notification of Acceptance/Rejection * 1 March 2002 : Final copies of accepted papers * 7-8 March 2002 : IPTPS'02 From zooko at zooko.com Wed Oct 3 13:15:01 2001 From: zooko at zooko.com (Zooko) Date: Sat Dec 9 22:11:43 2006 Subject: please prefer base 32 over base 64 (was: Re: [p2p-hackers] Bitzi (was Various identifier choices)) In-Reply-To: Message from Gordon Mohr of "Wed, 19 Sep 2001 11:55:28 PDT." <00ca01c1413c$a7539340$0ea7fea9@golden> References: <01e801c135e2$5d03dda0$0ea7fea9@golden> <20010918204913.A80700@or.pair.com> <3BA7F46F.892E3989@mindspring.com> <3BA7F8D4.2090108@chapweske.com> <005601c14131$cbdc4d20$0ea7fea9@golden> <00ca01c1413c$a7539340$0ea7fea9@golden> Message-ID: I, Zooko, wrote the part prefixed with "> > ". Gojomo wrote: > > > I guess we just differ in our value judgements here. 
I value shorter ids for > > cut-and-paste purposes more than I value absence of "break" characters. > > Indeed, I can't really think of a motivating example for caring about "break" > > characters. Could you please suggest one? > > Again, Googling for identifiers. Other full-text searches for > fragments. Searching for the Base32 fragment 'B6THNJ' is always > a single word; searching for the Base64 fragment 'aS+w/e' might > be interpreted as 'as w e' and perhaps ignored completely. > > > Hm. I can't find a base-32 encoder in Python. Could someone who favors > > base-32, and thus presumably has an encoder handy, show the base-32 version of > > 40-byte, 30-byte, and 20-byte strings? Thanks! > > 20b -> 32 chars: 3KIZIJB64XP3NCXAE4ISQZT3QNCTF7VD > 30b -> 48 chars: 3KIZIJB64XP3NCXAE4ISQZT3QNCTF7VD8EJ2KEDCV3WQMMPF > 40b -> 64 chars: 3KIZIJB64XP3NCXAE4ISQZT3QNCTF7VD8EJ2KEDCV3WQMMPFWFJW6DCVPKXMZQIZ Hey, that' doesn't look too bad! I guess the four characters omitted are `0', `O', `1', and `l'? Hm. The only thing is that mojo ids look like this: http://localhost:4004/save_id/3KIZIJB64XP3NCXAE4ISQZT3QNCTF7VD8EJ2KEDCV3WQMMPFWFJW6DCVPKXMZQIZ so it is greater than 80 chars. Hmph. Of course, base-64 would also be greater than 80 chars: http://localhost:4004/save_id/Ftp3ZuSNvzDw6KmYkmTA81ZGVb-gLJ53qBoY945FO0qvR8pyzBWYBQ Hrm... I am just about convinced to switch to base-32. Regards, Zooko From greg at electricrain.com Mon Oct 15 22:16:01 2001 From: greg at electricrain.com (Gregory P. Smith) Date: Sat Dec 9 22:11:43 2006 Subject: please prefer base 32 over base 64 (was: Re: [p2p-hackers] Bitzi (was Various identifier choices)) In-Reply-To: ; from zooko@zooko.com on Wed, Oct 03, 2001 at 01:05:06PM -0700 References: <20010918204913.A80700@or.pair.com> <3BA7F46F.892E3989@mindspring.com> <3BA7F8D4.2090108@chapweske.com> <005601c14131$cbdc4d20$0ea7fea9@golden> <00ca01c1413c$a7539340$0ea7fea9@golden> Message-ID: <20011015221505.F27951@zot.electricrain.com> > Hm. 
The only thing is that mojo ids look like this: > > http://localhost:4004/save_id/3KIZIJB64XP3NCXAE4ISQZT3QNCTF7VD8EJ2KEDCV3WQMMPFWFJW6DCVPKXMZQIZ > > so it is greater than 80 chars. Hmph. how about mojo://3KIZIJB64XP3NCXAE4ISQZT3QNCTF7VD8EJ2KEDCV3WQMMPFWFJW6DCVPKXMZQIZ/ (or pick your favorite protocol prefix and write a protocol handler) -g From zooko at zooko.com Tue Oct 16 05:34:01 2001 From: zooko at zooko.com (Zooko) Date: Sat Dec 9 22:11:43 2006 Subject: please prefer base 32 over base 64 (was: Re: [p2p-hackers] Bitzi (was Various identifier choices)) In-Reply-To: Message from "Gregory P. Smith" of "Mon, 15 Oct 2001 22:15:05 PDT." <20011015221505.F27951@zot.electricrain.com> References: <20010918204913.A80700@or.pair.com> <3BA7F46F.892E3989@mindspring.com> <3BA7F8D4.2090108@chapweske.com> <005601c14131$cbdc4d20$0ea7fea9@golden> Message-ID: > (or pick your favorite protocol prefix and write a protocol handler) Yes, that's a good idea! mojo://3kizijb64xp3ncxae4isqzt3qnctf7vd8ej2kedcv3wqmmpfwfjw6dcvpkxmzqiz versus mojo://Ftp3ZuSNvzDw6KmYkmTA81ZGVb-gLJ53qBoY945FO0qvR8pyzBWYBQ I'm convinced that base32 is better. The drawback is 10 more chars, but the benefit is that it survives case-changes and it doesn't have troublesome non-alphanumeric characters. Regards, Zooko From david.hopwood at zetnet.co.uk Tue Oct 16 22:25:02 2001 From: david.hopwood at zetnet.co.uk (David Hopwood) Date: Sat Dec 9 22:11:43 2006 Subject: please prefer base 32 over base 64 (was: Re: [p2p-hackers] Bitzi (was Various identifier choices)) References: <20010918204913.A80700@or.pair.com> <3BA7F46F.892E3989@mindspring.com> <3BA7F8D4.2090108@chapweske.com> <005601c14131$cbdc4d20$0ea7fea9@golden> Message-ID: <3BCBADBA.2208D1BA@zetnet.co.uk> -----BEGIN PGP SIGNED MESSAGE----- Zooko wrote: > > (or pick your favorite protocol prefix and write a protocol handler) > > Yes, that's a good idea! 
> > mojo://3kizijb64xp3ncxae4isqzt3qnctf7vd8ej2kedcv3wqmmpfwfjw6dcvpkxmzqiz > > versus > > mojo://Ftp3ZuSNvzDw6KmYkmTA81ZGVb-gLJ53qBoY945FO0qvR8pyzBWYBQ "//" should only be used for hierarchical schemes (so not in this case). In general, anyone designing new URI schemes should have read RFC 2718, RFC 2717, RFC 2396, RFC 2732, and . - -- David Hopwood Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/ RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01 Nothing in this message is intended to be legally binding. If I revoke a public key but refuse to specify why, it is because the private key has been seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip -----BEGIN PGP SIGNATURE----- Version: 2.6.3i Charset: noconv iQEVAwUBO8utVTkCAxeYt5gVAQHsJggAx/i3q+O4X6Tmaoqi4Q+nKjs9orqBkBzD hHtmVJyFo7wvAOaS1w1itKwrx+eVWUVElsDA/hy1HLm14lMN3XnKu9ZII3jKOMYQ K2eC7ZUpxceih9uA07uNxoVtqWYPXKgDKa4JsvdWQLog6rNH+kg2D7DdgFRPYpg3 nJmkTM3XDxOPnyPPl2NGB3s2thZKuGa2W8EOM2gHdDXPGkPqf/CeaS99yLmtvLAE dN2K/sSImiLTKdX1B8q0HSjI13mO8Z882rJXTlj9k/byuDnrP3RlZ983cMIA3SQ/ dF8XqPuRQZ3LyELnqcbNJWjZIOBUDnUdyCAI8efhDeUhUb9iG3+IeQ== =X6p8 -----END PGP SIGNATURE----- From bram at gawth.com Mon Oct 22 19:22:02 2001 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] New release of BitTorrent is out! Message-ID: A new release of BitTorrent is out, get it here - http://bitconjurer.org/BitTorrent/ Since there hasn't been one in a while (over a month!) there's a ton of new stuff, including - * rewritten UI - now all graphical and works in mozilla/netscape under UNIX! * clean shutdown * monothreading - this produced a *huge* performance increase * several big bugs fixed - we're down to no known bugs! 
* the publisher now stores metadata in files, so it doesn't have to re-scan files every time it restarts * the tracker now stores publisher and downloader information persistently, so downloads start working again as soon as it restarts * oodles of other small improvements -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From justin at chapweske.com Thu Oct 25 05:30:01 2001 From: justin at chapweske.com (Justin Chapweske) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] The Content-Addressable Web Message-ID: <3BD7B24C.9030403@chapweske.com> I've just finished the first draft of "HTTP Extensions for a Content-Addressable Web". I believe that these simple extensions are a huge step forward in providing interoperability between P2P systems. I would like people to start brainstorming around this document. Pick it apart and make it stronger. The documents are located at http://onionnetworks.com/caw/caw.ps and http://onionnetworks.com/caw/caw.txt I've also attached those documents to this e-mail. Thanks, -- Justin Chapweske, Onion Networks http://onionnetworks.com/ -------------- next part -------------- HTTP Extensions for a Content-Addressable Web Justin Chapweske, Onion Networks (justin@onionnetworks.com) October 7, 2001 Abstract The goal of the Content-Addressable Web (CAW) is to create a URN-based Web that can be optimized for content distribution. The use of URNs allows advanced caching techniques to be employed, and sets the foundation for creating ad hoc Content Distribution Networks (CDNs). This document specifies HTTP extensions that bridge the current Location-Based Web with the Content-Addressable Web. 
Table of Contents 1 Introduction 2 Self-Verifying URNs 3 HTTP Extensions 3.1 X-Content-URN 3.2 X-Target-URN 3.3 X-URN-N2* 3.3.1 X-URN-N2R 3.3.2 X-URN-N2L and X-URN-N2Ls 4 An Example Application 5 Open Issues 6 Acknowledgments 1 Introduction The rise in popularity of Content Distribution Networks (CDNs), such as Akamai, has shown that significant improvements can be made in throughput, latency, and scalability when content is distributed throughout the network and delivered from the edge. The fact that companies such as Tucows, FilePlanet, and various Linux distributions force their users to manually select mirrors points to a hole in the existing web caching infrastructure. Standard web caching can provide significant benefits in certain situations, but suffers from a number of shortcomings: * It is ill-advised to retrieve content from an untrusted cache, because it can modify/corrupt the content at will. This severely limits the utility of cooperative caching systems. * URL-based naming causes the same object on different mirrors to look like different objects. This decreases the efficiency of caching and mirroring combinations. * There are few ways to discover optimal replicas of a given piece of content. There is no way for a browser to download a mirror list and automatically select an optimal mirror. To add to the burden, the Transient Web is steadily growing in size and importance. The Transient Web is embodied by peer-to-peer systems such as Gnutella, and is characterized by unreliable nodes and a high rate of nodes joining and leaving the network. URL-based addressing would be unacceptable for the Transient Web because there would be a high failure rate when retrieving objects. The solution to these problems is to create a Content-Addressable Web (CAW) that is URN-based rather than URL-based. 
A few proposals have been made to enable the practical use of URNs, such as RFC 2169 and RFC 2915, but little has been done with them due to lack of application demand. Recently, however, the growing importance of peer-to-peer systems and the desire to create ad hoc CDNs has created demand for the Content-Addressable Web. One of the more interesting applications of the Content-Addressable Web is the creation of ad hoc Content Distribution Networks. In such networks receivers can achieve tremendous throughput by downloading content from multiple hosts in parallel. Receivers can also crawl through the network searching for optimal replicas, and can even retrieve content from completely untrusted hosts while being assured that they are receiving the content intact. All of this is made possible by URNs. 2 Self-Verifying URNs While any kind of URN can be used within the Content-Addressable Web, there is a specific type of URN called a "Self-Verifying URN" that is particularly useful. These URNs have the property that the URN itself can be used to verify that the content has been received intact. It is RECOMMENDED that applications use cryptographically strong self-verifying URNs because hosts in ad hoc CDNs and the Transient Web are assumed to be untrusted. For instance, one could hash the content using the SHA-1 algorithm, and encode it using Base32 to produce the following URN: urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB * It is RECOMMENDED that implementations support SHA-1 URNs at minimum. * Receivers MUST verify self-verifying URNs if any part of the content is retrieved from a potentially untrusted source. A future version of this document will also specify a URN format for performing streaming and random-access verification using Merkle Hash Trees. 3 HTTP Extensions In order to provide a transparent bridge between the URL-based Web and the Content-Addressable Web, a few HTTP extensions must be introduced. 
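The self-verifying URN construction described in section 2 can be sketched with Python's standard library. This is a minimal illustration, not part of the specification: the helper name `sha1_urn` is invented here, and it assumes the standard (RFC 3548/4648) Base32 alphabet, which matches the alphabet of the example URN above.

```python
import base64
import hashlib

def sha1_urn(content: bytes) -> str:
    """Build a self-verifying urn:sha1 URN: SHA-1 the content, then
    Base32-encode the 20-byte digest."""
    digest = hashlib.sha1(content).digest()
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=")
    return "urn:sha1:" + encoded

print(sha1_urn(b"example content"))
```

Since a 20-byte SHA-1 digest is 160 bits and Base32 encodes 5 bits per character, the digest always encodes to exactly 32 characters with no padding, which is why the URNs in the base32 thread above are a fixed width.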
The nature of these extensions is that they need not be widely deployed in order to be useful. They are specifically designed to allow for proxying for hosts that are not CAW-aware. The following HTTP extensions are based on the conventions defined in RFC 2169. It is RECOMMENDED that implementers of this specification also implement RFC 2169. The HTTP headers defined in this specification are all response headers. No additional request headers are specified by this document. It is RECOMMENDED that implementers of this specification use an HTTP/1.1 implementation compliant with RFC 2616. This specification uses the "X-" header prefix convention to denote that these are not W3C/IETF standard headers. If and when this specification becomes a standard, the prefix will either be simply removed or replaced with an appropriate header extension mechanism. 3.1 X-Content-URN The X-Content-URN entity-header field provides a URN that uniquely identifies the entity-body. The URN is based on the content of the entity-body and any content-coding that has been applied, but not including any transfer-encoding applied to the message-body. For example: X-Content-URN: urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB 3.2 X-Target-URN The X-Target-URN entity-header field provides a URN that uniquely identifies the desired entity-body in the case of a redirect. For HTTP 3xx responses, the URN SHOULD indicate the server's preferred URN for automatic redirection to the resource. The X-Content-URN header is inappropriate in this case, because HTTP 3xx responses often still include a message-body that explains that a redirect is taking place. This header primarily exists to allow the creation of URN-aware proxies that provide URN information without modifying the original web server. This allows URN-aware user-agents to take advantage of the headers, while simply redirecting user-agents that don't understand the Content-Addressable Web. 
For Example: X-Target-URN: urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB 3.3 X-URN-N2* These headers specify locations of various resolution services for the URNs specified in the X-Content-URN and X-Target-URN headers. These headers provide various ways of locating other replicas of the content. They can be used to provide additional sources for a multiple-source download, or one can build an application that crawls across the resolution services searching for an optimal replica. These headers are based on conventions defined in RFC 2169 and include N2L, N2Ls, N2R, N2Rs, N2C, N2Cs, and N2Ns. These headers provide URIs at which the associated resolution services can be performed for the URNs specified in the X-Content-URN and X-Target-URN headers. It is not necessary for these URIs to conform to the "/uri-res/?" convention specified in RFC 2169. It is believed that N2R, N2L, and N2Ls will be the most useful services for the Content-Addressable Web, so we will cover examples of those explicitly. The rest of the N2* headers should be implemented using the conventions used for N2R, N2L, and N2Ls. Implementations are free to use any additional URN resolution mechanisms, such as RFC 2915 DNS-based URN resolution. It is RECOMMENDED that receivers assume that the URN resolver services are potentially untrusted and verify all content retrieved using a resolver's services. 3.3.1 X-URN-N2R This header specifies one or more URIs that perform the N2R (URN to Resource) resolution service for the URNs specified by the X-Content-URN or X-Target-URN headers. The N2R URIs directly specify mirrors for the content addressed by the URN and can be useful for multi-source downloads. For example: X-URN-N2R: http://urnresolver.com/uri-res/N2R?urn:sha1: or X-URN-N2R: http://untrustedmirror.com/pub/file.zip The key difference between this header and something like the Location header is that the URIs specified by this header should be assumed to be untrusted. 
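Because N2R mirrors are untrusted, a receiver must verify whatever bytes it fetches against the URN announced by the origin. A minimal sketch, assuming urn:sha1 URNs in the Base32 form shown in section 2 (`verify_against_urn` is a hypothetical helper name, not an API defined by the spec):

```python
import base64
import hashlib

def verify_against_urn(content: bytes, urn: str) -> bool:
    """Accept content fetched from an untrusted X-URN-N2R mirror only if it
    hashes to the self-verifying urn:sha1 URN announced by the origin."""
    if not urn.lower().startswith("urn:sha1:"):
        raise ValueError("this sketch only handles urn:sha1 URNs")
    expected = urn.rsplit(":", 1)[1].upper()  # URNs may arrive lowercased
    actual = base64.b32encode(hashlib.sha1(content).digest()).decode("ascii")
    return actual == expected
```

A downloader would call this once per completed fetch and discard (and stop trusting) any mirror whose bytes fail the check.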
3.3.2 X-URN-N2L and X-URN-N2Ls

These headers specify one or more URIs that perform the N2L (URN to URL) and N2Ls (URN to URLs) resolution services. These headers are used when other hosts provide URLs where the content is mirrored. This is most useful in ad hoc CDNs where mirrors may maintain lists of other mirrors. Browsers can simply crawl across the networks, recursively dereferencing N2L(s). For example:

X-URN-N2L: http://urnresolver.com/uri-res/N2L?urn:sha1:

and

X-URN-N2Ls: http://untrustedmirror.com/pub/file.zip-mirrors.list

For the N2Ls service, it is RECOMMENDED that the result conform to the text/uri-list media type specified in RFC 2169.

4 An Example Application

The above HTTP extensions are deceptively simple, and it may not be readily apparent how powerful they are. We will discuss an example application that takes advantage of a few of the features provided by the extensions. In this example we will look at how the CAW could help at the imaginary linuxiso.org, where ISO CD-ROM images of the various Linux distributions are kept. The first step is to issue a GET request for the content:

GET /pub/Redhat-7.1-i386-disc1.iso HTTP/1.1
Host: www.linuxiso.org

The abbreviated response:

HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 662072345
X-Content-URN: urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB
X-URN-N2R: http://www.linuxmirrors.com/pub/Redhat-7.1i386-disc1.iso
X-URN-N2R: http://123.24.24.21:8080/uri-res/N2R?urn:sha1:
X-URN-N2Ls: http://123.24.24.21:8080/uri-res/N2Ls?urn:sha1:

With this response, a CAW-aware browser can immediately begin downloading the content from www.linuxiso.org, linuxmirrors.com, and 123.24.24.21 all in parallel. At the same time the browser can be dereferencing the N2Ls service at 123.24.24.21 to discover more mirrors for the content.
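[Editor's illustration, not part of the spec: the browser's side of this exchange can be made concrete. The header list mirrors the example response above; the function name and return shape are invented for the sketch.]

```python
# Headers as a client library might expose the example response above.
headers = [
    ("X-Content-URN", "urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB"),
    ("X-URN-N2R", "http://www.linuxmirrors.com/pub/Redhat-7.1i386-disc1.iso"),
    ("X-URN-N2R", "http://123.24.24.21:8080/uri-res/N2R?urn:sha1:"),
    ("X-URN-N2Ls", "http://123.24.24.21:8080/uri-res/N2Ls?urn:sha1:"),
]

def collect_sources(headers, original_url):
    """Gather candidates for a parallel download: the original server
    plus every N2R mirror, all treated as untrusted until the
    reassembled file verifies against X-Content-URN."""
    urn = next(v for k, v in headers if k == "X-Content-URN")
    mirrors = [original_url] + [v for k, v in headers if k == "X-URN-N2R"]
    mirror_lists = [v for k, v in headers if k == "X-URN-N2Ls"]
    return urn, mirrors, mirror_lists

urn, mirrors, mirror_lists = collect_sources(
    headers, "http://www.linuxiso.org/pub/Redhat-7.1-i386-disc1.iso")
# A downloader would now fetch byte ranges from each mirror in parallel,
# and dereference each N2Ls URI (expecting text/uri-list) for more mirrors.
```

Verification against the X-Content-URN value is what lets the client treat every one of these sources as interchangeable and untrusted.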
The existence of the 123...21 host is meant to represent a member of an ad hoc CDN, perhaps the personal computer of a Linux advocate who just downloaded the ISO and wants to share their bandwidth with others. By dereferencing the N2Ls, even more ad hoc nodes could be discovered.

5 Open Issues

It is unclear how to deal with the mapping of X-URN-N2* headers in the presence of multiple X-Content-URN or X-Target-URN headers. This must be resolved.

6 Acknowledgments

Thanks to Gordon Mohr (gojomo@bitzi.com) for working on many of the concepts in this document within the Gnutella community. We also wish to thank Tony Kimball (alk@pobox.com) for his continued advocacy of RFC 2169.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: caw.ps
Type: application/postscript
Size: 88595 bytes
Desc: not available
Url : http://zgp.org/pipermail/p2p-hackers/attachments/20011025/4afe354b/caw.ps

From distobj at acm.org Thu Oct 25 08:35:01 2001
From: distobj at acm.org (Mark Baker)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
In-Reply-To: <3BD7B24C.9030403@chapweske.com> from "Justin Chapweske" at Oct 25, 2001 01:33:48 AM
Message-ID: <200110251500.LAA19254@markbaker.ca>

Justin,

> The goal of the Content-Addressable Web (CAW) is to create
> a URN-based Web that can be optimized for content distribution.

Ooooh, noooo, not URNs again. 8-)

> The use of URNs allows advanced caching techniques to be
> employed, and sets the foundation for creating ad hoc Content
> Distribution Networks (CDNs).

Untrue. The use of URIs allows these things. There is nothing special about URNs in this respect.

> Standard web caching can provide significant benefits in
> certain situations, but suffers from a number of shortcomings:
>
> * It is ill-advised to retrieve content from an untrusted
> cache, because it can modify/corrupt the content at will.
> This severely limits the utility of cooperative caching
> systems.
Not to my knowledge. Either you trust the provider of the cache, or you don't. If you do, you can share.

> * URL-based naming causes the same object on different mirrors
> to look like different objects.

Incorrect. It is current practice in mirroring that is at fault, not the URL mechanism. More specifically, mirroring is implemented with "copy" semantics, rather than "cache", necessitating the creation of new URIs.

> This decreases the efficiency
> of caching and mirroring combinations.

Mirroring practice decreases the efficiency of mirroring.

> * There are few ways to discover optimal replicas of a given
> piece of content. There is no way for a browser to download
> a mirror list and automatically select an optimal mirror.
>
> To add to the burden, the Transient Web is steadily growing
> in size and importance. The Transient Web is embodied by
> peer-to-peer systems such as Gnutella, and is characterized
> by unreliable nodes and a high rate of nodes joining and
> leaving the network. URL-based addressing would be unacceptable
> for the Transient Web because there would be a high failure
> rate of retrieving objects.

The Web doesn't deal in "nodes", it deals in resources identified by authorities. That Gnutella treats each node as a separate authority, thereby creating an unbounded number of identities for a single resource, is a problem of Gnutella's, not the Web.

> One of the more interesting applications of the Content-Addressable
> Web is the creation of ad hoc Content Distribution Networks.
> In such networks receivers can achieve tremendous throughput
> by downloading content from multiple hosts in parallel.
> Receivers can also crawl through the network searching for
> optimal replicas, and can even retrieve content from completely
> untrusted hosts but be assured that they are receiving the
> content intact. All of this is made possible by URNs.

It's made possible by URIs, not URNs.
> 2 Self-Verifying URNs > > While any kind of URN can be used within the Content-Addressable > Web, there is a specific type of URN called a "Self-Verifying > URN" that is particularly useful. These URNs have the > property that the URN itself can be used to verify that > the content has been received intact. It is RECOMMENDED > that applications use cryptographically strong self-verifying > URNs because hosts in ad hoc CDNs and the Transient Web > are assumed to be untrusted. For instance, one could hash > the content using the SHA-1 algorithm, and encode it using > Base32 to produce the following URN: > > urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB That's an invalid URN, AFAIK. There's no authority. All URIs need an authority to vouch for the identity. There's "urn:ietf:sha-1" that identifies the SHA-1 algorithm, but that namespace doesn't allow further qualification of the URN. Also, including a hash of the content in a URI is an extremely brittle way of identifying things, unless it is known that the content will be static for all time and space. If that URN identified a song in MP3 format, for example, then you'd need a new URN to identify the same song in a different format. Is that what you want? > 3.1 X-Content-URN > > The X-Content-URN entity-header field provides a URN that > uniquely identifies the entity-body. The URN is based on > the content of the entity-body and any content-coding that > has been applied, but not including any transfer-encoding > applied to the message-body. For example: > > X-Content-URN: urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB Content-Location should suffice. > 3.2 X-Target-URN > > The X-Target-URN entity-header field provides a URN that > uniquely identifies the desired entity-body in the case > of a redirect. For HTTP 3xx responses, the URN SHOULD indicate > the server's preferred URN for automatic redirection to > the resource. 
HTTP redirection allows an authority of a resource to specify that the resource is now found elsewhere. What could a client tell a server that it doesn't already know? What you're specifying here isn't redirection - I don't know what it is.

> This header primarily exists to allow the creation of URN-aware
> proxies that provide URN information without modifying the original
> web server. This allows URN-aware user-agents to take advantage
> of the headers, while simply redirecting user-agents that
> don't understand the Content-Addressable Web. For example:

Why not just convert URLs to URNs; http://www.markbaker.ca/foo/bar/baz -> urn:markbaker.ca:foo:bar:baz

> It is believed that N2R, N2L, and N2Ls will be the most useful
> services for the Content-Addressable Web, so we will cover
> examples of those explicitly. The rest of the N2* headers
> should be implemented using the conventions used for N2R,
> N2L, and N2Ls.

The N2* conventions run completely against the architecture of the Web. URIs are resource identifiers. URNs are one kind of URI. How many URIs does a resource need?

MB
--
Mark Baker, CSO, Planetfred. Ottawa, Ontario, CANADA. mbaker@planetfred.com

From gerald at impressive.net Thu Oct 25 11:35:02 2001
From: gerald at impressive.net (Gerald Oskoboiny)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: The Content-Addressable Web
In-Reply-To: <3BD7B24C.9030403@chapweske.com>; from justin@chapweske.com on Thu, Oct 25, 2001 at 01:33:48AM -0500
References: <3BD7B24C.9030403@chapweske.com>
Message-ID: <20011025140605.B20764@impressive.net>

On Thu, Oct 25, 2001 at 01:33:48AM -0500, Justin Chapweske wrote:
> I've just finished the first draft of "HTTP Extensions for a
> Content-Addressable Web". I believe that these simple extensions are a
> huge step forward in providing interoperability between P2P systems.
:
> http://onionnetworks.com/caw/caw.txt

Interesting stuff...
> HTTP Extensions for a Content-Addressable Web > Justin Chapweske, Onion Networks (justin@onionnetworks.com) > October 7, 2001 > > Abstract > > The goal of the Content-Addressable Web (CAW) is to create > a URN-based Web that can be optimized for content distribution. > The use of URNs allows advanced caching techniques to be > employed, and sets the foundation for creating ad hoc Content > Distribution Networks (CDNs). This document specifies HTTP > extensions that bridge the current Location-Based Web with > the Content-Addressable Web. The Web "is the universe of network-accessible information" [1], i.e. anything with a URI, including URIs that are not tied to a particular hostname. You might find this useful: URIs, URLs, and URNs: Clarifications and Recommendations 1.0 Report from the joint W3C/IETF URI Planning Interest Group W3C Note 21 September 2001 http://www.w3.org/TR/uri-clarification/ it attempts to clarify confusion about URIs, URLs, and URNs. > 2 Self-Verifying URNs > > While any kind of URN can be used within the Content-Addressable > Web, there is a specific type of URN called a "Self-Verifying > URN" that is particularly useful. These URNs have the > property that the URN itself can be used to verify that > the content has been received intact. It is RECOMMENDED > that applications use cryptographically strong self-verifying > URNs because hosts in ad hoc CDNs and the Transient Web > are assumed to be untrusted. For instance, one could hash > the content using the SHA-1 algorithm, and encode it using > Base32 to produce the following URN: > > urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB I think URIs based on sha-1 hashes are a fantastic way to identify resources in P2P systems, and I don't understand why most of the P2P systems I have used don't work like this already. (apparently, anyway; I haven't studied the protocols, but that's my impression as a user.) 
When I search a P2P system for a particular file, I think the search results should be a list of filenames, content-lengths, and sha-1 hash URIs, and maybe some other stuff.

Then the P2P client should present the results to me grouped by their sha-1 hashes, let me sort those results by size/filename/number-of-peers-with-that-hash, and when I pick one of them, it should start downloading ranges of the file from each of, say, 10 of the peers that claim to have files with that hash. Then it should continue downloading other ranges of the file from the peers that gave the highest throughput on the first transfer. (and terminate the really slow ones prematurely.)

Also, it might be good for P2P systems to be able to use an external resolver for these URIs, so I can have a general sha-1 URI resolver on my desktop that gets used by any P2P/Web clients I might use, and I can set its resolution strategy according to my preferences.

> 3 HTTP Extensions
>
> In order to provide a transparent bridge between the URL-based
> Web and the Content-Addressable Web, a few HTTP extensions
> must be introduced. The nature of these extensions is that
> they need not be widely deployed in order to be useful.
> They are specifically designed to allow for proxying for
> hosts that are not CAW-aware.

I haven't reviewed this section closely, but you might want to see: http://www.w3.org/Protocols/HTTP/ietf-http-ext/ for info on HTTP Extensions.
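[Editor's illustration: the grouping-and-ranking flow Gerald describes above, treating files with the same sha-1 URN as one candidate and ranking by the number of peers that hold it, might look like this sketch. The result tuples and truncated URNs are invented for illustration.]

```python
from collections import defaultdict

# Hypothetical search results: (filename, content_length, sha1_urn, peer).
# Three peers offer byte-identical files under slightly different names.
results = [
    ("myfavsong.mp3",     4100200, "urn:sha1:AAAA...", "peer-1"),
    ("myfavsong.mp3",     4100200, "urn:sha1:AAAA...", "peer-2"),
    ("myfavsong (1).mp3", 4100200, "urn:sha1:AAAA...", "peer-3"),
    ("myfavsong.mp3",     3988776, "urn:sha1:BBBB...", "peer-4"),
]

groups = defaultdict(lambda: {"names": set(), "peers": set(), "length": None})
for name, length, urn, peer in results:
    groups[urn]["names"].add(name)
    groups[urn]["peers"].add(peer)
    groups[urn]["length"] = length

# Rank candidates by how many peers offer the exact same bytes; ranges of
# the chosen file can then be fetched from several of those peers at once.
ranked = sorted(groups.items(), key=lambda kv: len(kv[1]["peers"]),
                reverse=True)
```

The key point is that the hash, not the filename, is the grouping key: differently named copies collapse into one downloadable object with many sources.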
[1] About the World Wide Web http://www.w3.org/WWW/
--
Gerald Oskoboiny http://impressive.net/people/gerald/

From gojomo at bitzi.com Thu Oct 25 12:47:01 2001
From: gojomo at bitzi.com (Gordon Mohr)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
References: <200110251500.LAA19254@markbaker.ca>
Message-ID: <00b001c15d8c$c66077a0$0ea7fea9@golden>

Mark Baker writes:
> > The use of URNs allows advanced caching techniques to be
> > employed, and sets the foundation for creating ad hoc Content
> > Distribution Networks (CDNs).
>
> Untrue. The use of URIs allows these things. There is nothing
> special about URNs in this respect.

Well, I believe the theory is that URNs have slightly stronger "identity" characteristics. However, the latest W3C "clarification" seems to simply confirm the practice of using some URIs as if they were URNs.

> > * It is ill-advised to retrieve content from an untrusted
> > cache, because it can modify/corrupt the content at will.
> > This severely limits the utility of cooperative caching
> > systems.
>
> Not to my knowledge. Either you trust the provider of the cache,
> or you don't. If you do, you can share.

If the names are self-verifying, as with secure hashes, you don't have to make a trust/no-trust decision about caches, at least not at the outset. You can make the trust/no-trust decision on what they give you. Transgressors are always caught.

> > 2 Self-Verifying URNs
> >
> > While any kind of URN can be used within the Content-Addressable
> > Web, there is a specific type of URN called a "Self-Verifying
> > URN" that is particularly useful. These URNs have the
> > property that the URN itself can be used to verify that
> > the content has been received intact. It is RECOMMENDED
> > that applications use cryptographically strong self-verifying
> > URNs because hosts in ad hoc CDNs and the Transient Web
> > are assumed to be untrusted.
For instance, one could hash > > the content using the SHA-1 algorithm, and encode it using > > Base32 to produce the following URN: > > > > urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB > > That's an invalid URN, AFAIK. There's no authority. All URIs > need an authority to vouch for the identity. URNs need not have authorities -- especially those which are strictly a function of the content they describe. (Now, it is true that the URN namespace "sha1" has not yet been formally documented and reserved in accordance with RFC-specified procedures, but the above *could* be a valid URN.) > Also, including a hash of the content in a URI is an extremely > brittle way of identifying things, unless it is known that > the content will be static for all time and space. If that URN > identified a song in MP3 format, for example, then you'd need > a new URN to identify the same song in a different format. > Is that what you want? For many applications, yes. For stitching together pieces of a single exact file from dozens of independent sources, absolutely! You don't want a "mostly acceptable" MP3 to have the same reliable name as the "official" or "consensus 'best'" version. > > It is believed that N2R, N2L, and N2Ls will be the most useful > > services for the Content-Addressable Web, so we will cover > > examples of those explicitly. The rest of the N2* headers > > should be implemented using the conventions used for N2R, > > N2L, and N2Ls. > > The N2* conventions run completely against the architecture of the Web. > URIs are resource identifiers. URNs are one kind of URI. How many > URIs does a resource need? I'd say you want one (a hash-based URN) to serve as the resource's unfalsifiable "true name". You might want several others (traditionally, URLs) to reflect the resource's current reachable locations, or its names within alternate delivery systems. 
- Gojomo ____________________ Gordon Mohr, gojomo@ bitzi.com, Bitzi CTO _ http://bitzi.com _ From zooko at zooko.com Thu Oct 25 13:52:01 2001 From: zooko at zooko.com (Zooko) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web In-Reply-To: Message from "Gordon Mohr" of "Thu, 25 Oct 2001 12:39:33 PDT." <00b001c15d8c$c66077a0$0ea7fea9@golden> References: <200110251500.LAA19254@markbaker.ca> <00b001c15d8c$c66077a0$0ea7fea9@golden> Message-ID: [Everyone please note that this thread is being crossposted to p2p-hackers, decentralization and bluesky. Only subscribers can post directly to p2p-hackers, but messages from non-subscribers get forwarded to me for approval. --Zooko http://zgp.org/mailman/listinfo/p2p-hackers/ ] Lines prepended with "> > > " were written by Justin Chapweske. Lines prepended with "> > " were written by Mark Baker. > > > While any kind of URN can be used within the Content-Addressable > > > Web, there is a specific type of URN called a "Self-Verifying > > > URN" that is particularly useful. These URNs have the > > > property that the URN itself can be used to verify that > > > the content has been received intact. It is RECOMMENDED > > > that applications use cryptographically strong self-verifying > > > URNs because hosts in ad hoc CDNs and the Transient Web > > > are assumed to be untrusted. For instance, one could hash > > > the content using the SHA-1 algorithm, and encode it using > > > Base32 to produce the following URN: > > > > > > urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB > > > > That's an invalid URN, AFAIK. There's no authority. All URIs > > need an authority to vouch for the identity. This is surely the most pernicious myth about naming: that it is impossible to verify the correctness of a mapping yourself and you are doomed to trust in some external authority who will tell you the answer. 
There are two counterexamples: names that are a deterministic function of the content (which Freenet calls "Content Hash Keys" or "CHKs") and names that include the ID of a public key in the name, so that you can check a digital signature on the content (which Freenet calls "Sub-Space Keys" or "SSKs", and the Self-Certifying File System[1] calls "names").

I think this is an extremely important point. IMO the only part of "p2p" which is really revolutionary is the potential for "cooperation without vulnerability" -- two agents on opposite sides of an unbridgeable trust boundary who are still able to interoperate and cooperate.

I'm going to say it again: The most important concept in the whole field of "p2p" or "decentralization" or whatever you call it is the concept of "cooperation without vulnerability".

The most important component of infrastructure that we lack right now in order to enable cooperation without vulnerability is a name service which uses self-authenticating keys so that no agent is ever vulnerable to deception with regard to what object a name should map to.

Regards,

Zooko http://zooko.com/

[1] http://fs.net/

From alk at pobox.com Thu Oct 25 13:54:01 2001
From: alk at pobox.com (Tony Kimball)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
References: <3BD7B24C.9030403@chapweske.com> <200110251500.LAA19254@markbaker.ca>
Message-ID: <15320.31589.458110.880405@gargle.gargle.HOWL>

:
: The N2* conventions run completely against the architecture of the Web.
: URIs are resource identifiers. URNs are one kind of URI. How many
: URIs does a resource need?

N2L doesn't provide URIs, it provides URLs. A resource needs at least as many URLs as it has addressable locations.
From alk at pobox.com Thu Oct 25 14:19:01 2001
From: alk at pobox.com (Tony Kimball)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
References: <00b001c15d8c$c66077a0$0ea7fea9@golden> <200110252117.RAA22943@markbaker.ca>
Message-ID: <15320.33141.378877.269651@gargle.gargle.HOWL>

Quoth Mark Baker on Thursday, 25 October:
:
: But if it's still really important for this system to have the
: hash in the URI, how about this;
:
: http://foobar.org/my_content?sha1hash=32452345ASDASDFASDFS

It's also really important for this system not to have a location in the URI.

: Do you have an example of a resource that is best named by its
: hash?

All resources are best named by their hash, for some value of "best".

From distobj at acm.org Thu Oct 25 14:28:01 2001
From: distobj at acm.org (Mark Baker)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
In-Reply-To: <00b001c15d8c$c66077a0$0ea7fea9@golden> from "Gordon Mohr" at Oct 25, 2001 12:39:33 PM
Message-ID: <200110252117.RAA22943@markbaker.ca>

Hi Gordon,

> > Untrue. The use of URIs allows these things. There is nothing
> > special about URNs in this respect.
>
> Well, I believe the theory is that URNs have slightly stronger
> "identity" characteristics.

s/theory/spiteful rumour/g 8-)

> However, the latest W3C "clarification"
> seems to simply confirm the practice of using some URIs as if
> they were URNs.

Even HTTP URIs.

Yes, that debate still exists. In general, REST proponents believe that any URI is only as persistent as the authority is willing to make it. The dependence of the HTTP URI scheme on DNS, and the fact that an authority doesn't "own" it, is often mentioned as a reason why HTTP URIs are less persistent. But that ignores the fact that for urn:<nid>:foo, the registrant doesn't own "nid" either. If IBM runs out to register "microsoft" as a NID, you can guarantee WIPO will eventually get involved.
It's just another central registry.

> If the names are self-verifying, as with secure hashes, you
> don't have to make a trust/no-trust decision about caches,
> at least not at the outset. You can make the trust/no-trust
> decision on what they give you. Transgressors are always
> caught.

I have an issue with the name "self-verifying". If I do a GET on urn:sha-1:234234KJASDFKAJFD, I don't know that I'm getting back a resource with that hash. I still have to run the hash and compare it to the URI, because the cache may be lying. So the onus of verification lies with the client, and therefore the URI isn't self-verifying.

But if it's still really important for this system to have the hash in the URI, how about this;

http://foobar.org/my_content?sha1hash=32452345ASDASDFASDFS

> > Also, including a hash of the content in a URI is an extremely
> > brittle way of identifying things, unless it is known that
> > the content will be static for all time and space. If that URN
> > identified a song in MP3 format, for example, then you'd need
> > a new URN to identify the same song in a different format.
> > Is that what you want?
>
> For many applications, yes. For stitching together pieces of a
> single exact file from dozens of independent sources, absolutely!

Ok, so how about my URI above?

> You don't want a "mostly acceptable" MP3 to have the same
> reliable name as the "official" or "consensus 'best'" version.

HTTP handles that with variants (different representations of the same resource). Variants can have their own URIs too. So you could have;

http://myfavband.org/song/myfavsong (the main URI)

plus these variants;

http://myfavband.org/song/myfavsong/mp3/256k
http://myfavband.org/song/myfavsong/mp3/128k
http://myfavband.org/song/myfavsong/wav/56k
http://myfavband.org/song/myfavsong/au/8k

> I'd say you want one (a hash-based URN) to serve as the resource's
> unfalsifiable "true name".
> You might want several others (traditionally,
> URLs) to reflect the resource's current reachable locations, or
> its names within alternate delivery systems.

Do you have an example of a resource that is best named by its hash?

MB
--
Mark Baker, CSO, Planetfred. Ottawa, Ontario, CANADA. mbaker@planetfred.com

From distobj at acm.org Thu Oct 25 14:39:01 2001
From: distobj at acm.org (Mark Baker)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
In-Reply-To: <15320.31589.458110.880405@gargle.gargle.HOWL> from "Tony Kimball" at Oct 25, 2001 03:51:49 PM
Message-ID: <200110252136.RAA23215@markbaker.ca>

> : The N2* conventions run completely against the architecture of the Web.
> : URIs are resource identifiers. URNs are one kind of URI. How many
> : URIs does a resource need?
>
> N2L doesn't provide URIs, it provides URLs.

URLs are URIs, so N2L does provide them.

> A resource needs at least
> as many URLs as it has addressable locations.

URLs are names. The only thing that makes them a locator is a convention that converts them to an (IP address, TCP port, local name) tuple. But that mapping isn't authoritative. Quite the opposite in fact. For example, chances are you have a cached version of http://www.yahoo.com in your browser cache. What is the URL for that document? It's http://www.yahoo.com.

Is "Tony Kimball" a locator? Of course not, it's a name, right? So what if we defined a convention that said that you could do a DNS lookup on tony.kimball.person, open a connection to port 81, do a GET, and see his homepage. Would that make "Tony Kimball" a locator?

MB
--
Mark Baker, CSO, Planetfred. Ottawa, Ontario, CANADA.
mbaker@planetfred.com

From justin at chapweske.com Thu Oct 25 15:13:01 2001
From: justin at chapweske.com (Justin Chapweske)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Content-Addressable Networks Thread
Message-ID: <3BD88D39.4060805@chapweske.com>

Could we move the CAW thread to decentralization? Sorry for the cross-posting.

-Justin

From gojomo at usa.net Thu Oct 25 16:18:01 2001
From: gojomo at usa.net (Gordon Mohr)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
References: <200110252117.RAA22943@markbaker.ca>
Message-ID: <015701c15dab$625960e0$0ea7fea9@golden>

[bluesky dropped, as it's not accepting my non-subscriber messages]

Mark Baker writes:
> > However, the latest W3C "clarification"
> > seems to simply confirm the practice of using some URIs as if
> > they were URNs.
>
> Even HTTP URIs.
>
> Yes, that debate still exists. In general, REST proponents
> believe that any URI is only as persistent as the authority is
> willing to make it. The dependence of the HTTP URI scheme on
> DNS, and that an authority doesn't "own" it, is often mentioned
> as a reason why HTTP URIs are less persistent. But that ignores
> the fact that for urn:<nid>:foo, the registrant doesn't own
> "nid" either. If IBM runs out to register "microsoft" as a NID,
> you can guarantee WIPO will eventually get involved. It's just
> another central registry.

But with identifiers which are inherent, rather than assigned, it doesn't matter who registers/"owns" something. Only the consensus definition of the process for creating the name from the content matters.

> > If the names are self-verifying, as with secure hashes, you
> > don't have to make a trust/no-trust decision about caches,
> > at least not at the outset. You can make the trust/no-trust
> > decision on what they give you. Transgressors are always
> > caught.
>
> I have an issue with the name "self-verifying".
>
> If I do a GET on urn:sha-1:234234KJASDFKAJFD, I don't know that
> I'm getting back a resource with that hash. I still have to
> run the hash and compare it to the URI, because the cache may
> be lying. So the onus of verification lies with the client,
> and therefore the URI isn't self-verifying.

Turn the problem around. You have a resource. You want to share it. Sure, you can give it an arbitrary name in the HTTP URL namespace, under some hostname you control. But then no one will know whether what you have is the same thing as what they have. Third parties looking for your exact file cannot tell, from just a legal HTTP URL, if what you're offering is what they seek.

You could start a convention for embedding a unique, location-independent identifier into your URLs -- as RFC2169 suggests, or you suggest later in your message. I believe that is a useful approach.

But then, you now have something else interesting to advertise -- the "true name" of the content. In fact, you don't care at all about the domain-name and request-URI stuff, you'll let those take any value, as long as the reliable name matches and checks out.

> But if it's still really important for this system to have the
> hash in the URI, how about this;
>
> http://foobar.org/my_content?sha1hash=32452345ASDASDFASDFS

The "N2R" resolution services discussed in RFC2169 and CAW are very much like you suggest. In the above, "sha1hash=32452345ASDASDFASDFS" is a unique, location-independent, durable name for the content. So why not call it an URN, and make its qualities explicit in the syntax? Like:

http://foobar.org/my_content?name=urn:sha1:32452345ASDASDFASDFS

And then, when you are completely indifferent about where you get the matching content -- indifferent about location, protocol, everything -- why should an application keep shuttling the "http://foobar.org/my_content?name=" portion around?
Just go with: *?name=urn:sha1:32452345ASDASDFASDFS or better yet urn:sha1:32452345ASDASDFASDFS > > You don't want a "mostly acceptable" MP3 to have the same > > reliable name as the "official" or "consensus 'best'" version. > > HTTP handles that with variants (different representations of the > same resource). Variants can have their own URIs too. So you > could have; > > http://myfavband.org/song/myfavsong (the main URI) > > plus these variants; > > http://myfavband.org/song/myfavsong/mp3/256k > http://myfavband.org/song/myfavsong/mp3/128k > http://myfavband.org/song/myfavsong/wav/56k > http://myfavband.org/song/myfavsong/au/8k These names still don't come close to uniquely identifying specific instances. (There are, for example, trillions of equally-valid MP3 encodings of a song.) When you want a specific digital file, one that is an exact copy of an original/official/recommended version, you want a precise name. > > I'd say you want one (a hash-based URN) to serve as the resource's > > unfalsifiable "true name". You might want several others (traditionally, > > URLs) to reflect the resource's current reachable locations, or > > its names within alternate delivery systems. > > Do you have an example of a resource that is best named by its > hash? 
- a recipe for a chemical process that could be dangerous if any of the steps or quantities are slightly altered
- a 2-hour video you want to grab as equal parts from 120 different sources, with no glitches
- a compiled executable for which versions with malicious code might be floating around

- Gordon

From clay at shirky.com Thu Oct 25 18:16:01 2001
From: clay at shirky.com (Clay Shirky)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
In-Reply-To: from "Zooko" at Oct 25, 2001 01:41:51 PM
Message-ID: <200110260048.f9Q0mh322670@mail.shirky.com>

> The most important concept in the whole field of the "p2p" or
> "decentralization" or whatever you call it is the concept of
> "cooperation without vulnerability".

You don't mean exactly that, I think, since the Visa system, as but one example, allows this. If I go into some little shop in Cancun to buy sunglasses, that merchant and I are all but guaranteed never to see one another again. It's a non-iterated game of PD, where defection can backfire, but it can never be punished.

But I have a Visa card, and he has a Visa sign in his door. Visa knows him, and will pimpslap him upside his silly little head if he runs up extra charges on my card. Visa knows me, and has a staff of trained knuckledraggers who will nail my head to the floor if I fail to pay up.

So what I think you mean is "P2P can allow cooperation without vulnerability _or_ brokering authority." And this only matters if you think such a system can operate at a lower cost (cash cost plus opportunity cost) than a system with a broker.

You and Todd think such a thing is possible. I'm not sure it's impossible, but I am certainly in the skeptics' camp on this one.

-clay

From bram at gawth.com Sun Oct 28 13:12:02 2001
From: bram at gawth.com (Bram Cohen)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] BitTorrent 2.5.1 is out!
Message-ID: BitTorrent 2.5.1 is out, check it out here - http://bitconjurer.org/BitTorrent/ Or, if you're running windows and just want a quick demo, go straight to the demo page - http://bitconjurer.org/BitTorrent/demo.html New in this release - timeouts and keepalives. This will probably be the last non-backwards compatible release for a while. my apologies to anyone who got a 404 in the last day or so - there was some release snafu. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes