From dnm at pobox.com Mon Oct 1 01:45:01 2001 From: dnm at pobox.com (Dan Moniz) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] Fwd: International Workshop on Global and Peer-to-Peer Computing Message-ID: [ Forwarded from the bluesky list, sans some irrelevant headers and footers. -- dnm ] >Date: Mon, 1 Oct 2001 03:43:04 -0400 >Subject: International Workshop on Global and Peer-to-Peer Computing >To: "Global-Scale Distributed Storage Systems" >From: "Franck Cappello" > >Dear Colleagues, >Please find below the call for papers for the >"International Workshop on Global and Peer-to-Peer Computing", which will >take place along with the CCGRID 2002 conference in Berlin, Germany, 21-24 >May 2002. >We take this opportunity to invite you to submit scientific papers to this >workshop. >Please forgive us if you receive several copies of this message. >Feel free to forward this message to your colleagues. >Best Regards. > > > International Workshop on > "Global and Peer-to-Peer Computing > On Large Scale Distributed Systems" > (http://www.lri.fr/~fci/GP2PC.htm) > > organized at the IEEE International Symposium > on Cluster Computing and the Grid 2002 > CCGRID 2002 > >In cooperation with the IEEE Task Force on Cluster Computing (TFCC) > > >SCOPE > >The wide spread of the World Wide Web, along with the availability >of increasingly powerful off-the-shelf hardware, gives rise to a new >infrastructure for distributed computing. Besides traditional grid >computing systems, it is now possible to run computations on a large >number of workstations, personal computers and servers using a large-scale >and loosely coupled system. > >This type of distributed computing, also referred to as Global >Computing, is currently used for a large variety of physics, mathematics >and biology applications, mostly following the master/slave paradigm. 
>Peer-to-Peer interaction models give the opportunity to enlarge the number >of users >and applications of Global Computing by allowing any resource to send >job and/or data requests, provide data and/or computation services, and >participate in maintaining the infrastructure itself. > >Because of their size and the high volatility of their resources, Global >Computing and Global Peer-to-Peer Computing platforms give researchers the >opportunity to revisit the major fields of distributed computing: >protocols, infrastructures, security, certification, fault tolerance, >scheduling, performance, etc. > >Authors are invited to submit original, unpublished work describing >current research in the area of Global and Peer-to-Peer Computing, >including design and analysis of computational infrastructures as well as >applications in science, technology, and commerce. > >TOPICS > >Topics of interest include, but are not limited to: >* Global Computing and Peer-to-Peer computing platforms, >* Autonomous, self-organizing and/or mobile distributed systems, >* Middleware, programming models, environments and toolkits, >* Protocols for resource management/discovery/reservation/scheduling, >* Economic considerations of resource usage (protocols, accounting), >* Storage in Global Computing Infrastructures (strategies, protocols), >* Performance monitoring, benchmarking, evaluation and modeling of > Global Computing and Peer-to-Peer systems and/or components thereof, >* Security, management and monitoring of resources, >* Result certification (detection/tolerance of wrong/corrupted results), >* Parallel computing on large scale distributed systems, >* Compute- or I/O-driven applications (scientific, engineering, >business), >* Global and Peer-to-Peer computing applications (programmed from >scratch, > ported from sequential or parallel versions, or adapted to fit a >global > computing environment) > >PAPER SUBMISSION > >Authors are encouraged to: > >Submit (a) a full paper (max: 6 pages in 
length, formatted to the IEEE format) >Submit (b) a research statement (max: 2 pages in length, formatted to the IEEE >format) >Use a minimum of a 10pt font, and ensure the paper is printable on A4. IEEE >guidelines can be found on the IEEE website. Please email your papers to fci@lri.fr or >lalis@ics.forth.gr; email is the preferred method of submission. > >Full papers (category (a)) will be reviewed by the program committee for >relevance, clarity and the novelty of their results. If accepted, full papers >will be published in the conference proceedings by the IEEE Computer Society. >Authors may purchase two additional pages. > >Short papers (category (b)) will be published in a separate section. This >is to encourage work that is not yet advanced enough for a full paper. > >We also encourage authors to present novel ideas, critiques of existing >work, and application examples which demonstrate how Global and >Peer-to-Peer Computing technology could be effectively deployed. We also >welcome practical work which applies Global and Peer-to-Peer Computing >technology in novel and interesting ways. 
> >IMPORTANT DATES > >Papers due: November 24, 2001 >Notification to authors: December 21, 2001 >Final version of papers due: February 15, 2002 > >PROGRAM COMMITTEE (still expanding) > >Mark Baker, DCS, University of Portsmouth, UK >Taisuke Boku, CCP, University of Tsukuba, Japan >Franck Cappello, CNRS, Paris-South University, France >Henri Casanova, SDSC, California, USA >Christian Huitema, Microsoft, USA >Spyros Lalis, FORTH, Greece >Serge Petiton, LIFL, Lille University, France >Avi Rubin, AT&T Labs - Research, USA >Mitsuhisa Sato, CCP, University of Tsukuba, Japan > > >SESSION CHAIRS > >For more information please contact: >Franck Cappello, Spyros Lalis, >CNRS, Institute of Computer Science >Universite Paris-Sud, Foundation for Research and Technology >Hellas >France, Greece, >fci@lri.fr lalis@ics.forth.gr -- Dan Moniz [http://www.pobox.com/~dnm/] From Franck.Cappello at lri.fr Mon Oct 1 03:49:01 2001 From: Franck.Cappello at lri.fr (Franck.Cappello@lri.fr) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] GP2PC International Scientific Workshop Message-ID: <1001921421.3bb81b8d5eaf5@www.lri.fr> Dear Colleagues, Please find below the call for papers for the "International Workshop on Global and Peer-to-Peer Computing", which will take place along with the CCGRID 2002 conference in Berlin, Germany, 21-24 May 2002. We take this opportunity to invite you to submit scientific papers to this workshop. Please excuse us if you receive several copies of this message. Feel free to forward this message to your colleagues. Best Regards. 
International Workshop on Global and Peer-to-Peer Computing On Large Scale Distributed Systems (http://www.lri.fr/~fci/GP2PC.htm) organized at the IEEE International Symposium on Cluster Computing and the Grid 2002 CCGRID 2002 In cooperation with the IEEE Task Force on Cluster Computing (TFCC) SCOPE The wide spread of the World Wide Web, along with the availability of increasingly powerful off-the-shelf hardware, gives rise to a new infrastructure for distributed computing. Besides traditional grid computing systems, it is now possible to run computations on a large number of workstations, personal computers and servers using a large-scale and loosely coupled system. This type of distributed computing, also referred to as Global Computing, is currently used for a large variety of physics, mathematics and biology applications, mostly following the master/slave paradigm. Peer-to-Peer interaction models give the opportunity to enlarge the number of users and applications of Global Computing by allowing any resource to send job and/or data requests, provide data and/or computation services, and participate in maintaining the infrastructure itself. Because of their size and the high volatility of their resources, Global Computing and Global Peer-to-Peer Computing platforms give researchers the opportunity to revisit the major fields of distributed computing: protocols, infrastructures, security, certification, fault tolerance, scheduling, performance, etc. Authors are invited to submit original, unpublished work describing current research in the area of Global and Peer-to-Peer Computing, including design and analysis of computational infrastructures as well as applications in science, technology, and commerce. 
TOPICS Topics of interest include, but are not limited to: * Global Computing and Peer-to-Peer computing platforms, * Autonomous, self-organizing and/or mobile distributed systems, * Middleware, programming models, environments and toolkits, * Protocols for resource management/discovery/reservation/scheduling, * Economic considerations of resource usage (protocols, accounting), * Storage in Global Computing Infrastructures (strategies, protocols), * Performance monitoring, benchmarking, evaluation and modeling of Global Computing and Peer-to-Peer systems and/or components thereof, * Security, management and monitoring of resources, * Result certification (detection/tolerance of wrong/corrupted results), * Parallel computing on large scale distributed systems, * Compute- or I/O-driven applications (scientific, engineering, business), * Global and Peer-to-Peer computing applications (programmed from scratch, ported from sequential or parallel versions, or adapted to fit a global computing environment) PAPER SUBMISSION Authors are encouraged to: Submit (a) a full paper (max: 6 pages in length, formatted to the IEEE format) Submit (b) a research statement (max: 2 pages in length, formatted to the IEEE format) Use a minimum of a 10pt font, and ensure the paper is printable on A4. IEEE guidelines can be found on the IEEE website. Please email your papers to fci@lri.fr or lalis@ics.forth.gr; email is the preferred method of submission. Full papers (category (a)) will be reviewed by the program committee for relevance, clarity and the novelty of their results. If accepted, full papers will be published in the conference proceedings by the IEEE Computer Society. Authors may purchase two additional pages. Short papers (category (b)) will be published in a separate section. This is to encourage work that is not yet advanced enough for a full paper. 
We also encourage authors to present novel ideas, critiques of existing work, and application examples which demonstrate how Global and Peer-to-Peer Computing technology could be effectively deployed. We also welcome practical work which applies Global and Peer-to-Peer Computing technology in novel and interesting ways. IMPORTANT DATES Papers due: November 24, 2001 Notification to authors: December 21, 2001 Final version of papers due: February 15, 2002 PROGRAM COMMITTEE (still expanding) Mark Baker, DCS, University of Portsmouth, UK Taisuke Boku, CCP, University of Tsukuba, Japan Franck Cappello, CNRS, Paris-South University, France Henri Casanova, SDSC, California, USA Christian Huitema, Microsoft, USA Spyros Lalis, FORTH, Greece Serge Petiton, LIFL, Lille University, France Avi Rubin, AT&T Labs - Research, USA Mitsuhisa Sato, CCP, University of Tsukuba, Japan SESSION CHAIRS For more information please contact: Franck Cappello, Spyros Lalis, CNRS, Institute of Computer Science Universite Paris-Sud, Foundation for Research and Technology Hellas France, Greece, fci@lri.fr lalis@ics.forth.gr --------------------------------------------------------- Franck Cappello fci@lri.fr www.lri.fr/~fci Researcher within CNRS, LRI, Université Paris Sud, France tel +33 1 69 15 70 91 fax +33 1 69 15 65 86 COE Research Fellow, CCP, Tsukuba University, Japan tel +81 2 98 53 64 83 fax +81 2 98 53 64 06 --------------------------------------------------------- From rob at eorbit.net Mon Oct 1 17:24:01 2001 From: rob at eorbit.net (Mayhem & Chaos) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] MusicBrainz RDF Data dump Message-ID: <1001979805.23542.111.camel@cranky> Hi! Since Brandon has been looking at RDF data dumps, I finally got off my butt to put together an RDF dump of the data from the MusicBrainz project. As some of you might know, MusicBrainz is a music metadata database that allows users to identify audio CDs and digital audio tracks like MP3s or Vorbis files. 
The server and client software is released under the GPL and the LGPL, respectively. The data is covered by the OpenContent license in order to avoid the CDDB fiasco. I won't go into the details of the project here -- please check out the project at http://www.musicbrainz.org if you're interested in finding out more. You can download the RDF data dump from here: ftp://ftp.musicbrainz.org/pub/musicbrainz/rdfdump-2001-9-25.rdf.bz2 Please note that the URLs for the resources in the RDF dump are live and available from the MusicBrainz server. However, I still haven't completed the full specification of the musicbrainz namespaces -- I'm working on getting that done asap. Any feedback regarding this data dump would be deeply appreciated, since this is my first large scale data dump. -- --ruaok Freezerburn! All else is only icing. -- Soul Coughing Robert Kaye -- rob@eorbit.net -- http://www.mayhem-chaos.net From antr at microsoft.com Wed Oct 3 07:35:01 2001 From: antr at microsoft.com (Ant Rowstron) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] CFP: The 1st International Workshop on Peer-to-Peer Systems (IPTPS'02) Message-ID: <4BBF5F3B80921D47BFEB9C136808A47402D8D7D0@red-msg-08.redmond.corp.microsoft.com> CALL FOR PARTICIPATION: IPTPS'02 The 1st International Workshop on Peer-to-Peer Systems (IPTPS'02) 7-8 March, 2002 MIT Faculty Club, Cambridge, MA, USA. http://www.cs.rice.edu/Conferences/IPTPS02/ Peer-to-peer has emerged as a promising new paradigm for distributed computing. The 1st International Workshop on Peer-to-Peer Systems (IPTPS'02) aims to provide a forum for researchers active in peer-to-peer computing to discuss the state-of-the-art and to identify key research challenges in peer-to-peer computing. The goal of the workshop is to examine peer-to-peer technologies, applications and systems, and also to identify key research issues and challenges that lie ahead. 
In the context of this workshop, peer-to-peer systems are characterized as being decentralized, self-organizing distributed systems, in which all or most communication is symmetric. Topics of interest include, but are not limited to: * novel peer-to-peer applications and systems * peer-to-peer infrastructure * security in peer-to-peer systems * anonymity and anti-censorship * performance of peer-to-peer systems * workload characterization for peer-to-peer systems The program of the workshop will be a combination of invited review talks, presentations of position papers, and discussions. To ensure a productive workshop environment, attendance will be limited to about 35 participants who are active in the field. Each potential participant should submit a position paper of 5 pages or less that exposes a new problem, advocates a specific solution, or reports on actual experience. Participants will be invited based on the originality, technical merit and topical relevance of their submissions, as well as the likelihood that the ideas expressed in their submissions will lead to insightful technical discussions at the workshop. Please do not submit abbreviated versions of journal or conference papers. Online copies of the position papers will be made available prior to the workshop. We are investigating the possibility of producing a printed proceedings, including a summary of the interactions at the workshop, which would be mailed to participants after the workshop. 
Steering committee: Peter Druschel, Rice University, USA Frans Kaashoek, MIT, USA Antony Rowstron, Microsoft Research, UK Scott Shenker, ACIRI, Berkeley, USA Ion Stoica, UC Berkeley, USA Organizing chairs: Frans Kaashoek, MIT, USA Antony Rowstron, Microsoft Research, UK Program Committee: Ross Anderson, Cambridge University, UK Roger Dingledine, Reputation Technologies, Inc., USA Peter Druschel, Rice University, USA (co-chair) Steve Gribble, University of Washington, USA David Karger, MIT, USA John Kubiatowicz, UC Berkeley, USA Robert Morris, MIT, USA Antony Rowstron, Microsoft Research, UK (co-chair) Avi Rubin, AT&T Labs - Research, USA Scott Shenker, ACIRI, Berkeley, USA Ion Stoica, UC Berkeley, USA Guidelines for Submission: To submit, authors should follow the instructions at http://www.cs.rice.edu/Conferences/IPTPS02/submit/. Papers must be submitted by 18:00 GMT, Monday, 3 December 2001. The length of the paper must not exceed 5 pages (11pt font, 1 inch margins). All submissions will be acknowledged by email within 24 hours of receipt. Important Dates: * 3 December 2001 : Submission of position papers * 4 February 2002 : Notification of Acceptance/Rejection * 1 March 2002 : Final copies of accepted papers * 7-8 March 2002 : IPTPS'02 From zooko at zooko.com Wed Oct 3 13:15:01 2001 From: zooko at zooko.com (Zooko) Date: Sat Dec 9 22:11:43 2006 Subject: please prefer base 32 over base 64 (was: Re: [p2p-hackers] Bitzi (was Various identifier choices)) In-Reply-To: Message from Gordon Mohr of "Wed, 19 Sep 2001 11:55:28 PDT." <00ca01c1413c$a7539340$0ea7fea9@golden> References: <01e801c135e2$5d03dda0$0ea7fea9@golden> <20010918204913.A80700@or.pair.com> <3BA7F46F.892E3989@mindspring.com> <3BA7F8D4.2090108@chapweske.com> <005601c14131$cbdc4d20$0ea7fea9@golden> <00ca01c1413c$a7539340$0ea7fea9@golden> Message-ID: I, Zooko, wrote the part prefixed with "> > ". Gojomo wrote: > > > I guess we just differ in our value judgements here. 
I value shorter ids for > > cut-and-paste purposes more than I value absence of "break" characters. > > Indeed, I can't really think of a motivating example for caring about "break" > > characters. Could you please suggest one? > > Again, Googling for identifiers. Other full-text searches for > fragments. Searching for the Base32 fragment 'B6THNJ' is always > a single word; searching for the Base64 fragment 'aS+w/e' might > be interpreted as 'as w e' and perhaps ignored completely. > > > Hm. I can't find a base-32 encoder in Python. Could someone who favors > > base-32, and thus presumably has an encoder handy, show the base-32 version of > > 40-byte, 30-byte, and 20-byte strings? Thanks! > > 20b -> 32 chars: 3KIZIJB64XP3NCXAE4ISQZT3QNCTF7VD > 30b -> 48 chars: 3KIZIJB64XP3NCXAE4ISQZT3QNCTF7VD8EJ2KEDCV3WQMMPF > 40b -> 64 chars: 3KIZIJB64XP3NCXAE4ISQZT3QNCTF7VD8EJ2KEDCV3WQMMPFWFJW6DCVPKXMZQIZ Hey, that' doesn't look too bad! I guess the four characters omitted are `0', `O', `1', and `l'? Hm. The only thing is that mojo ids look like this: http://localhost:4004/save_id/3KIZIJB64XP3NCXAE4ISQZT3QNCTF7VD8EJ2KEDCV3WQMMPFWFJW6DCVPKXMZQIZ so it is greater than 80 chars. Hmph. Of course, base-64 would also be greater than 80 chars: http://localhost:4004/save_id/Ftp3ZuSNvzDw6KmYkmTA81ZGVb-gLJ53qBoY945FO0qvR8pyzBWYBQ Hrm... I am just about convinced to switch to base-32. Regards, Zooko From greg at electricrain.com Mon Oct 15 22:16:01 2001 From: greg at electricrain.com (Gregory P. Smith) Date: Sat Dec 9 22:11:43 2006 Subject: please prefer base 32 over base 64 (was: Re: [p2p-hackers] Bitzi (was Various identifier choices)) In-Reply-To: ; from zooko@zooko.com on Wed, Oct 03, 2001 at 01:05:06PM -0700 References: <20010918204913.A80700@or.pair.com> <3BA7F46F.892E3989@mindspring.com> <3BA7F8D4.2090108@chapweske.com> <005601c14131$cbdc4d20$0ea7fea9@golden> <00ca01c1413c$a7539340$0ea7fea9@golden> Message-ID: <20011015221505.F27951@zot.electricrain.com> > Hm. 
The only thing is that mojo ids look like this: > > http://localhost:4004/save_id/3KIZIJB64XP3NCXAE4ISQZT3QNCTF7VD8EJ2KEDCV3WQMMPFWFJW6DCVPKXMZQIZ > > so it is greater than 80 chars. Hmph. how about mojo://3KIZIJB64XP3NCXAE4ISQZT3QNCTF7VD8EJ2KEDCV3WQMMPFWFJW6DCVPKXMZQIZ/ (or pick your favorite protocol prefix and write a protocol handler) -g From zooko at zooko.com Tue Oct 16 05:34:01 2001 From: zooko at zooko.com (Zooko) Date: Sat Dec 9 22:11:43 2006 Subject: please prefer base 32 over base 64 (was: Re: [p2p-hackers] Bitzi (was Various identifier choices)) In-Reply-To: Message from "Gregory P. Smith" of "Mon, 15 Oct 2001 22:15:05 PDT." <20011015221505.F27951@zot.electricrain.com> References: <20010918204913.A80700@or.pair.com> <3BA7F46F.892E3989@mindspring.com> <3BA7F8D4.2090108@chapweske.com> <005601c14131$cbdc4d20$0ea7fea9@golden> Message-ID: > (or pick your favorite protocol prefix and write a protocol handler) Yes, that's a good idea! mojo://3kizijb64xp3ncxae4isqzt3qnctf7vd8ej2kedcv3wqmmpfwfjw6dcvpkxmzqiz versus mojo://Ftp3ZuSNvzDw6KmYkmTA81ZGVb-gLJ53qBoY945FO0qvR8pyzBWYBQ I'm convinced that base32 is better. The drawback is 10 more chars, but the benefit is that it survives case-changes and it doesn't have troublesome non-alphanumeric characters. Regards, Zooko From david.hopwood at zetnet.co.uk Tue Oct 16 22:25:02 2001 From: david.hopwood at zetnet.co.uk (David Hopwood) Date: Sat Dec 9 22:11:43 2006 Subject: please prefer base 32 over base 64 (was: Re: [p2p-hackers] Bitzi (was Various identifier choices)) References: <20010918204913.A80700@or.pair.com> <3BA7F46F.892E3989@mindspring.com> <3BA7F8D4.2090108@chapweske.com> <005601c14131$cbdc4d20$0ea7fea9@golden> Message-ID: <3BCBADBA.2208D1BA@zetnet.co.uk> -----BEGIN PGP SIGNED MESSAGE----- Zooko wrote: > > (or pick your favorite protocol prefix and write a protocol handler) > > Yes, that's a good idea! 
> > mojo://3kizijb64xp3ncxae4isqzt3qnctf7vd8ej2kedcv3wqmmpfwfjw6dcvpkxmzqiz > > versus > > mojo://Ftp3ZuSNvzDw6KmYkmTA81ZGVb-gLJ53qBoY945FO0qvR8pyzBWYBQ "//" should only be used for hierarchical schemes (so not in this case). In general, anyone designing new URI schemes should have read RFC 2718, RFC 2717, RFC 2396, RFC 2732, and . - -- David Hopwood Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/ RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01 Nothing in this message is intended to be legally binding. If I revoke a public key but refuse to specify why, it is because the private key has been seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip -----BEGIN PGP SIGNATURE----- Version: 2.6.3i Charset: noconv iQEVAwUBO8utVTkCAxeYt5gVAQHsJggAx/i3q+O4X6Tmaoqi4Q+nKjs9orqBkBzD hHtmVJyFo7wvAOaS1w1itKwrx+eVWUVElsDA/hy1HLm14lMN3XnKu9ZII3jKOMYQ K2eC7ZUpxceih9uA07uNxoVtqWYPXKgDKa4JsvdWQLog6rNH+kg2D7DdgFRPYpg3 nJmkTM3XDxOPnyPPl2NGB3s2thZKuGa2W8EOM2gHdDXPGkPqf/CeaS99yLmtvLAE dN2K/sSImiLTKdX1B8q0HSjI13mO8Z882rJXTlj9k/byuDnrP3RlZ983cMIA3SQ/ dF8XqPuRQZ3LyELnqcbNJWjZIOBUDnUdyCAI8efhDeUhUb9iG3+IeQ== =X6p8 -----END PGP SIGNATURE----- From bram at gawth.com Mon Oct 22 19:22:02 2001 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] New release of BitTorrent is out! Message-ID: A new release of BitTorrent is out, get it here - http://bitconjurer.org/BitTorrent/ Since there hasn't been one in a while (over a month!) there's a ton of new stuff, including - * rewritten UI - now all graphical and works in mozilla/netscape under UNIX! * clean shutdown * monothreading - this produced a *huge* performance increase * several big bugs fixed - we're down to no known bugs! 
* the publisher now stores metadata in files, so it doesn't have to re-scan files every time it restarts * the tracker now stores publisher and downloader information persistently, so downloads start working again as soon as it restarts * oodles of other small improvements -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From justin at chapweske.com Thu Oct 25 05:30:01 2001 From: justin at chapweske.com (Justin Chapweske) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] The Content-Addressable Web Message-ID: <3BD7B24C.9030403@chapweske.com> I've just finished the first draft of "HTTP Extensions for a Content-Addressable Web". I believe that these simple extensions are a huge step forward in providing interoperability between P2P systems. I would like people to start brainstorming around this document. Pick it apart and make it stronger. The documents are located at http://onionnetworks.com/caw/caw.ps and http://onionnetworks.com/caw/caw.txt I've also attached those documents to this e-mail. Thanks, -- Justin Chapweske, Onion Networks http://onionnetworks.com/ -------------- next part -------------- HTTP Extensions for a Content-Addressable Web Justin Chapweske, Onion Networks (justin@onionnetworks.com) October 7, 2001 Abstract The goal of the Content-Addressable Web (CAW) is to create a URN-based Web that can be optimized for content distribution. The use of URNs allows advanced caching techniques to be employed, and sets the foundation for creating ad hoc Content Distribution Networks (CDNs). This document specifies HTTP extensions that bridge the current Location-Based Web with the Content-Addressable Web. 
Table of Contents 1 Introduction 2 Self-Verifying URNs 3 HTTP Extensions 3.1 X-Content-URN 3.2 X-Target-URN 3.3 X-URN-N2* 3.3.1 X-URN-N2R 3.3.2 X-URN-N2L and X-URN-N2Ls 4 An Example Application 5 Open Issues 6 Acknowledgments 1 Introduction The rise in popularity of Content Distribution Networks (CDNs), such as Akamai, has shown that significant improvements can be made in throughput, latency, and scalability when content is distributed throughout the network and delivered from the edge. The fact that companies such as Tucows, FilePlanet, and various Linux distributions force their users to manually select mirrors points to a hole in the existing web caching infrastructure. Standard web caching can provide significant benefits in certain situations, but suffers from a number of shortcomings: * It is ill-advised to retrieve content from an untrusted cache, because it can modify/corrupt the content at will. This severely limits the utility of cooperative caching systems. * URL-based naming causes the same object on different mirrors to look like different objects. This decreases the efficiency of caching and mirroring combinations. * There are few ways to discover optimal replicas of a given piece of content. There is no way for a browser to download a mirror list and automatically select an optimal mirror. To add to the burden, the Transient Web is steadily growing in size and importance. The Transient Web is embodied by peer-to-peer systems such as Gnutella, and is characterized by unreliable nodes and a high rate of nodes joining and leaving the network. URL-based addressing would be unacceptable for the Transient Web because there would be a high failure rate when retrieving objects. The solution to these problems is to create a Content-Addressable Web (CAW) that is URN-based rather than URL-based. 
A few proposals have been made to enable the practical use of URNs, such as RFC 2169 and RFC 2915, but little has been done with them due to lack of application demand. Recently, however, the growing importance of peer-to-peer systems and the desire to create ad hoc CDNs has created demand for the Content-Addressable Web. One of the more interesting applications of the Content-Addressable Web is the creation of ad hoc Content Distribution Networks. In such networks receivers can achieve tremendous throughput by downloading content from multiple hosts in parallel. Receivers can also crawl through the network searching for optimal replicas, and can even retrieve content from completely untrusted hosts while being assured that they are receiving the content intact. All of this is made possible by URNs. 2 Self-Verifying URNs While any kind of URN can be used within the Content-Addressable Web, there is a specific type of URN called a "Self-Verifying URN" that is particularly useful. These URNs have the property that the URN itself can be used to verify that the content has been received intact. It is RECOMMENDED that applications use cryptographically strong self-verifying URNs because hosts in ad hoc CDNs and the Transient Web are assumed to be untrusted. For instance, one could hash the content using the SHA-1 algorithm, and encode it using Base32 to produce the following URN: urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB * It is RECOMMENDED that implementations support SHA-1 URNs at minimum. * Receivers MUST verify self-verifying URNs if any part of the content is retrieved from a potentially untrusted source. A future version of this document will also specify a URN format for performing streaming and random-access verification using Merkle Hash Trees. 3 HTTP Extensions In order to provide a transparent bridge between the URL-based Web and the Content-Addressable Web, a few HTTP extensions must be introduced. 
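The self-verifying URN construction described in section 2 can be sketched with Python's standard library. This is a minimal illustration, not part of the specification: the helper name `sha1_urn` is invented here, and it assumes the standard (RFC 3548/4648) Base32 alphabet, which matches the alphabet of the example URN above.

```python
import base64
import hashlib

def sha1_urn(content: bytes) -> str:
    """Build a self-verifying urn:sha1 URN: SHA-1 the content, then
    Base32-encode the 20-byte digest."""
    digest = hashlib.sha1(content).digest()
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=")
    return "urn:sha1:" + encoded

print(sha1_urn(b"example content"))
```

Since a 20-byte SHA-1 digest is 160 bits and Base32 encodes 5 bits per character, the digest always encodes to exactly 32 characters with no padding, which is why the URNs in the base32 thread above are a fixed width.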
The nature of these extensions is that they need not be widely deployed in order to be useful. They are specifically designed to allow for proxying for hosts that are not CAW-aware. The following HTTP extensions are based on the conventions defined in RFC 2169. It is RECOMMENDED that implementers of this specification also implement RFC 2169. The HTTP headers defined in this specification are all response headers. No additional request headers are specified by this document. It is RECOMMENDED that implementers of this specification use an HTTP/1.1 implementation compliant with RFC 2616. This specification uses the "X-" header prefix convention to denote that these are not W3C/IETF standard headers. If and when this specification becomes a standard, the prefix will either be simply removed or replaced with an appropriate header extension mechanism. 3.1 X-Content-URN The X-Content-URN entity-header field provides a URN that uniquely identifies the entity-body. The URN is based on the content of the entity-body and any content-coding that has been applied, but not including any transfer-encoding applied to the message-body. For example: X-Content-URN: urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB 3.2 X-Target-URN The X-Target-URN entity-header field provides a URN that uniquely identifies the desired entity-body in the case of a redirect. For HTTP 3xx responses, the URN SHOULD indicate the server's preferred URN for automatic redirection to the resource. The X-Content-URN header is inappropriate in this case, because HTTP 3xx responses often still include a message-body that explains that a redirect is taking place. This header primarily exists to allow the creation of URN-aware proxies that provide URN information without modifying the original web server. This allows URN-aware user-agents to take advantage of the headers, while simply redirecting user-agents that don't understand the Content-Addressable Web. 
For Example: X-Target-URN: urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB 3.3 X-URN-N2* These headers specify locations of various resolution services for the URNs specified in the X-Content-URN and X-Target-URN headers. These headers provide various ways of locating other replicas of the content. They can be used to provide additional sources for a multiple-source download, or one can build an application that crawls across the resolution services searching for an optimal replica. These headers are based on conventions defined in RFC 2169 and include N2L, N2Ls, N2R, N2Rs, N2C, N2Cs, and N2Ns. These headers provide URIs at which the associated resolution services can be performed for the URNs specified in the X-Content-URN and X-Target-URN headers. It is not necessary for these URIs to conform to the "/uri-res/?" convention specified in RFC 2169. It is believed that N2R, N2L, and N2Ls will be the most useful services for the Content-Addressable Web, so we will cover examples of those explicitly. The rest of the N2* headers should be implemented using the conventions used for N2R, N2L, and N2Ls. Implementations are free to use any additional URN resolution mechanisms, such as RFC 2915 DNS-based URN resolution. It is RECOMMENDED that receivers assume that the URN resolver services are potentially untrusted and verify all content retrieved using a resolver's services. 3.3.1 X-URN-N2R This header specifies one or more URIs that perform the N2R (URN to Resource) resolution service for the URNs specified by the X-Content-URN or X-Target-URN headers. The N2R URIs directly specify mirrors for the content addressed by the URN and can be useful for multi-source downloads. For example: X-URN-N2R: http://urnresolver.com/uri-res/N2R?urn:sha1: or X-URN-N2R: http://untrustedmirror.com/pub/file.zip The key difference between this header and something like the Location header is that the URIs specified by this header should be assumed to be untrusted. 
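Because N2R mirrors are untrusted, a receiver must verify whatever bytes it fetches against the URN announced by the origin. A minimal sketch, assuming urn:sha1 URNs in the Base32 form shown in section 2 (`verify_against_urn` is a hypothetical helper name, not an API defined by the spec):

```python
import base64
import hashlib

def verify_against_urn(content: bytes, urn: str) -> bool:
    """Accept content fetched from an untrusted X-URN-N2R mirror only if it
    hashes to the self-verifying urn:sha1 URN announced by the origin."""
    if not urn.lower().startswith("urn:sha1:"):
        raise ValueError("this sketch only handles urn:sha1 URNs")
    expected = urn.rsplit(":", 1)[1].upper()  # URNs may arrive lowercased
    actual = base64.b32encode(hashlib.sha1(content).digest()).decode("ascii")
    return actual == expected
```

A downloader would call this once per completed fetch and discard (and stop trusting) any mirror whose bytes fail the check.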
3.3.2 X-URN-N2L and X-URN-N2Ls

These headers specify one or more URIs that perform the N2L (URN to URL) and N2Ls (URN to URLs) resolution services. These headers are used when other hosts provide URLs where the content is mirrored. This is most useful in ad hoc CDNs where mirrors may maintain lists of other mirrors. Browsers can simply crawl across the networks, recursively dereferencing N2L(s). For example:

X-URN-N2L: http://urnresolver.com/uri-res/N2L?urn:sha1:

and

X-URN-N2Ls: http://untrustedmirror.com/pub/file.zip-mirrors.list

For the N2Ls service, it is RECOMMENDED that the result conform to the text/uri-list media type specified in RFC 2169.

4 An Example Application

The above HTTP extensions are deceptively simple, and it may not be readily apparent how powerful they are. We will discuss an example application that takes advantage of a few of the features provided by the extensions. In this example we will look at how the CAW could help at the imaginary linuxiso.org, where ISO CD-ROM images of the various Linux distributions are kept. The first step is to issue a GET request for the content:

GET /pub/Redhat-7.1-i386-disc1.iso HTTP/1.1
Host: www.linuxiso.org

The abbreviated response:

HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 662072345
X-Content-URN: urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB
X-URN-N2R: http://www.linuxmirrors.com/pub/Redhat-7.1i386-disc1.iso
X-URN-N2R: http://123.24.24.21:8080/uri-res/N2R?urn:sha1:
X-URN-N2Ls: http://123.24.24.21:8080/uri-res/N2Ls?urn:sha1:

With this response, a CAW-aware browser can immediately begin downloading the content from www.linuxiso.org, linuxmirrors.com, and 123.24.24.21 all in parallel. At the same time the browser can be dereferencing the N2Ls service at 123.24.24.21 to discover more mirrors for the content.
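[Editor's illustration, not part of the spec: the browser's side of this exchange can be made concrete. The header list mirrors the example response above; the function name and return shape are invented for the sketch.]

```python
# Headers as a client library might expose the example response above.
headers = [
    ("X-Content-URN", "urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB"),
    ("X-URN-N2R", "http://www.linuxmirrors.com/pub/Redhat-7.1i386-disc1.iso"),
    ("X-URN-N2R", "http://123.24.24.21:8080/uri-res/N2R?urn:sha1:"),
    ("X-URN-N2Ls", "http://123.24.24.21:8080/uri-res/N2Ls?urn:sha1:"),
]

def collect_sources(headers, original_url):
    """Gather candidates for a parallel download: the original server
    plus every N2R mirror, all treated as untrusted until the
    reassembled file verifies against X-Content-URN."""
    urn = next(v for k, v in headers if k == "X-Content-URN")
    mirrors = [original_url] + [v for k, v in headers if k == "X-URN-N2R"]
    mirror_lists = [v for k, v in headers if k == "X-URN-N2Ls"]
    return urn, mirrors, mirror_lists

urn, mirrors, mirror_lists = collect_sources(
    headers, "http://www.linuxiso.org/pub/Redhat-7.1-i386-disc1.iso")
# A downloader would now fetch byte ranges from each mirror in parallel,
# and dereference each N2Ls URI (expecting text/uri-list) for more mirrors.
```

Verification against the X-Content-URN value is what lets the client treat every one of these sources as interchangeable and untrusted.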
The existence of the 123...21 host is meant to represent a member of an ad hoc CDN, perhaps the personal computer of a Linux advocate who just downloaded the ISO and wants to share their bandwidth with others. By dereferencing the N2Ls, even more ad hoc nodes could be discovered.

5 Open Issues

It is unclear how to deal with the mapping of X-URN-N2* headers in the presence of multiple X-Content-URN or X-Target-URN headers. This must be resolved.

6 Acknowledgments

Thanks to Gordon Mohr (gojomo@bitzi.com) for working on many of the concepts in this document within the Gnutella community. We also wish to thank Tony Kimball (alk@pobox.com) for his continued advocacy of RFC 2169.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: caw.ps
Type: application/postscript
Size: 88595 bytes
Desc: not available
Url : http://zgp.org/pipermail/p2p-hackers/attachments/20011025/4afe354b/caw.ps

From distobj at acm.org Thu Oct 25 08:35:01 2001
From: distobj at acm.org (Mark Baker)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
In-Reply-To: <3BD7B24C.9030403@chapweske.com> from "Justin Chapweske" at Oct 25, 2001 01:33:48 AM
Message-ID: <200110251500.LAA19254@markbaker.ca>

Justin,

> The goal of the Content-Addressable Web (CAW) is to create
> a URN-based Web that can be optimized for content distribution.

Ooooh, noooo, not URNs again. 8-)

> The use of URNs allows advanced caching techniques to be
> employed, and sets the foundation for creating ad hoc Content
> Distribution Networks (CDNs).

Untrue. The use of URIs allows these things. There is nothing special about URNs in this respect.

> Standard web caching can provide significant benefits in
> certain situations, but suffers from a number of shortcomings:
>
> * It is ill-advised to retrieve content from an untrusted
> cache, because it can modify/corrupt the content at will.
> This severely limits the utility of cooperative caching
> systems.
Not to my knowledge. Either you trust the provider of the cache, or you don't. If you do, you can share.

> * URL-based naming causes the same object on different mirrors
> to look like different objects.

Incorrect. It is current practice in mirroring that is at fault, not the URL mechanism. More specifically, mirroring is implemented with "copy" semantics, rather than "cache", necessitating the creation of new URIs.

> This decreases the efficiency
> of caching and mirroring combinations.

Mirroring practice decreases the efficiency of mirroring.

> * There are few ways to discover optimal replicas of a given
> piece of content. There is no way for a browser to download
> a mirror list and automatically select an optimal mirror.
>
> To add to the burden, the Transient Web is steadily growing
> in size and importance. The Transient Web is embodied by
> peer-to-peer systems such as Gnutella, and is characterized
> by unreliable nodes and a high rate of nodes joining and
> leaving the network. URL-based addressing would be unacceptable
> for the Transient Web because there would be a high failure
> rate of retrieving objects.

The Web doesn't deal in "nodes", it deals in resources identified by authorities. That Gnutella treats each node as a separate authority, thereby creating an unbounded number of identities for a single resource, is a problem of Gnutella's, not the Web.

> One of the more interesting applications of the Content-Addressable
> Web is the creation of ad hoc Content Distribution Networks.
> In such networks receivers can achieve tremendous throughput
> by downloading content from multiple hosts in parallel.
> Receivers can also crawl through the network searching for
> optimal replicas, and can even retrieve content from completely
> untrusted hosts but be assured that they are receiving the
> content intact. All of this is made possible by URNs.

It's made possible by URIs, not URNs.
> 2 Self-Verifying URNs > > While any kind of URN can be used within the Content-Addressable > Web, there is a specific type of URN called a "Self-Verifying > URN" that is particularly useful. These URNs have the > property that the URN itself can be used to verify that > the content has been received intact. It is RECOMMENDED > that applications use cryptographically strong self-verifying > URNs because hosts in ad hoc CDNs and the Transient Web > are assumed to be untrusted. For instance, one could hash > the content using the SHA-1 algorithm, and encode it using > Base32 to produce the following URN: > > urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB That's an invalid URN, AFAIK. There's no authority. All URIs need an authority to vouch for the identity. There's "urn:ietf:sha-1" that identifies the SHA-1 algorithm, but that namespace doesn't allow further qualification of the URN. Also, including a hash of the content in a URI is an extremely brittle way of identifying things, unless it is known that the content will be static for all time and space. If that URN identified a song in MP3 format, for example, then you'd need a new URN to identify the same song in a different format. Is that what you want? > 3.1 X-Content-URN > > The X-Content-URN entity-header field provides a URN that > uniquely identifies the entity-body. The URN is based on > the content of the entity-body and any content-coding that > has been applied, but not including any transfer-encoding > applied to the message-body. For example: > > X-Content-URN: urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB Content-Location should suffice. > 3.2 X-Target-URN > > The X-Target-URN entity-header field provides a URN that > uniquely identifies the desired entity-body in the case > of a redirect. For HTTP 3xx responses, the URN SHOULD indicate > the server's preferred URN for automatic redirection to > the resource. 
HTTP redirection allows an authority of a resource to specify that the resource is now found elsewhere. What could a client tell a server that it doesn't already know? What you're specifying here isn't redirection - I don't know what it is.

> This header primarily exists to allow the creation of URN-aware
> proxies that provide URN information without modifying the original
> web server. This allows URN-aware user-agents to take advantage
> of the headers, while simply redirecting user-agents that
> don't understand the Content-Addressable Web. For example:

Why not just convert URLs to URNs; http://www.markbaker.ca/foo/bar/baz -> urn:markbaker.ca:foo:bar:baz

> It is believed that N2R, N2L, and N2Ls will be the most useful
> services for the Content-Addressable Web, so we will cover
> examples of those explicitly. The rest of the N2* headers
> should be implemented using the conventions used for N2R,
> N2L, and N2Ls.

The N2* conventions run completely against the architecture of the Web. URIs are resource identifiers. URNs are one kind of URI. How many URIs does a resource need?

MB
--
Mark Baker, CSO, Planetfred. Ottawa, Ontario, CANADA. mbaker@planetfred.com

From gerald at impressive.net Thu Oct 25 11:35:02 2001
From: gerald at impressive.net (Gerald Oskoboiny)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: The Content-Addressable Web
In-Reply-To: <3BD7B24C.9030403@chapweske.com>; from justin@chapweske.com on Thu, Oct 25, 2001 at 01:33:48AM -0500
References: <3BD7B24C.9030403@chapweske.com>
Message-ID: <20011025140605.B20764@impressive.net>

On Thu, Oct 25, 2001 at 01:33:48AM -0500, Justin Chapweske wrote:
> I've just finished the first draft of "HTTP Extensions for a
> Content-Addressable Web". I believe that these simple extensions are a
> huge step forward in providing interoperability between P2P systems.
:
> http://onionnetworks.com/caw/caw.txt

Interesting stuff...
> HTTP Extensions for a Content-Addressable Web > Justin Chapweske, Onion Networks (justin@onionnetworks.com) > October 7, 2001 > > Abstract > > The goal of the Content-Addressable Web (CAW) is to create > a URN-based Web that can be optimized for content distribution. > The use of URNs allows advanced caching techniques to be > employed, and sets the foundation for creating ad hoc Content > Distribution Networks (CDNs). This document specifies HTTP > extensions that bridge the current Location-Based Web with > the Content-Addressable Web. The Web "is the universe of network-accessible information" [1], i.e. anything with a URI, including URIs that are not tied to a particular hostname. You might find this useful: URIs, URLs, and URNs: Clarifications and Recommendations 1.0 Report from the joint W3C/IETF URI Planning Interest Group W3C Note 21 September 2001 http://www.w3.org/TR/uri-clarification/ it attempts to clarify confusion about URIs, URLs, and URNs. > 2 Self-Verifying URNs > > While any kind of URN can be used within the Content-Addressable > Web, there is a specific type of URN called a "Self-Verifying > URN" that is particularly useful. These URNs have the > property that the URN itself can be used to verify that > the content has been received intact. It is RECOMMENDED > that applications use cryptographically strong self-verifying > URNs because hosts in ad hoc CDNs and the Transient Web > are assumed to be untrusted. For instance, one could hash > the content using the SHA-1 algorithm, and encode it using > Base32 to produce the following URN: > > urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB I think URIs based on sha-1 hashes are a fantastic way to identify resources in P2P systems, and I don't understand why most of the P2P systems I have used don't work like this already. (apparently, anyway; I haven't studied the protocols, but that's my impression as a user.) 
When I search a P2P system for a particular file, I think the search results should be a list of filenames, content-lengths, and sha-1 hash URIs, and maybe some other stuff.

Then the P2P client should present the results to me grouped by their sha-1 hashes, let me sort those results by size/filename/number-of-peers-with-that-hash, and when I pick one of them, it should start downloading ranges of the file from each of, say, 10 of the peers that claim to have files with that hash. Then it should continue downloading other ranges of the file from the peers that gave the highest throughput on the first transfer. (and terminate the really slow ones prematurely.)

Also, it might be good for P2P systems to be able to use an external resolver for these URIs, so I can have a general sha-1 URI resolver on my desktop that gets used by any P2P/Web clients I might use, and I can set its resolution strategy according to my preferences.

> 3 HTTP Extensions
>
> In order to provide a transparent bridge between the URL-based
> Web and the Content-Addressable Web, a few HTTP extensions
> must be introduced. The nature of these extensions is that
> they need not be widely deployed in order to be useful.
> They are specifically designed to allow for proxying for
> hosts that are not CAW-aware.

I haven't reviewed this section closely, but you might want to see: http://www.w3.org/Protocols/HTTP/ietf-http-ext/ for info on HTTP Extensions.
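[Editor's illustration: the grouping-and-ranking flow Gerald describes above, treating files with the same sha-1 URN as one candidate and ranking by the number of peers that hold it, might look like this sketch. The result tuples and truncated URNs are invented for illustration.]

```python
from collections import defaultdict

# Hypothetical search results: (filename, content_length, sha1_urn, peer).
# Three peers offer byte-identical files under slightly different names.
results = [
    ("myfavsong.mp3",     4100200, "urn:sha1:AAAA...", "peer-1"),
    ("myfavsong.mp3",     4100200, "urn:sha1:AAAA...", "peer-2"),
    ("myfavsong (1).mp3", 4100200, "urn:sha1:AAAA...", "peer-3"),
    ("myfavsong.mp3",     3988776, "urn:sha1:BBBB...", "peer-4"),
]

groups = defaultdict(lambda: {"names": set(), "peers": set(), "length": None})
for name, length, urn, peer in results:
    groups[urn]["names"].add(name)
    groups[urn]["peers"].add(peer)
    groups[urn]["length"] = length

# Rank candidates by how many peers offer the exact same bytes; ranges of
# the chosen file can then be fetched from several of those peers at once.
ranked = sorted(groups.items(), key=lambda kv: len(kv[1]["peers"]),
                reverse=True)
```

The key point is that the hash, not the filename, is the grouping key: differently named copies collapse into one downloadable object with many sources.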
[1] About the World Wide Web http://www.w3.org/WWW/
--
Gerald Oskoboiny http://impressive.net/people/gerald/

From gojomo at bitzi.com Thu Oct 25 12:47:01 2001
From: gojomo at bitzi.com (Gordon Mohr)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
References: <200110251500.LAA19254@markbaker.ca>
Message-ID: <00b001c15d8c$c66077a0$0ea7fea9@golden>

Mark Baker writes:
> > The use of URNs allows advanced caching techniques to be
> > employed, and sets the foundation for creating ad hoc Content
> > Distribution Networks (CDNs).
>
> Untrue. The use of URIs allows these things. There is nothing
> special about URNs in this respect.

Well, I believe the theory is that URNs have slightly stronger "identity" characteristics. However, the latest W3C "clarification" seems to simply confirm the practice of using some URIs as if they were URNs.

> > * It is ill-advised to retrieve content from an untrusted
> > cache, because it can modify/corrupt the content at will.
> > This severely limits the utility of cooperative caching
> > systems.
>
> Not to my knowledge. Either you trust the provider of the cache,
> or you don't. If you do, you can share.

If the names are self-verifying, as with secure hashes, you don't have to make a trust/no-trust decision about caches, at least not at the outset. You can make the trust/no-trust decision on what they give you. Transgressors are always caught.

> > 2 Self-Verifying URNs
> >
> > While any kind of URN can be used within the Content-Addressable
> > Web, there is a specific type of URN called a "Self-Verifying
> > URN" that is particularly useful. These URNs have the
> > property that the URN itself can be used to verify that
> > the content has been received intact. It is RECOMMENDED
> > that applications use cryptographically strong self-verifying
> > URNs because hosts in ad hoc CDNs and the Transient Web
> > are assumed to be untrusted.
For instance, one could hash > > the content using the SHA-1 algorithm, and encode it using > > Base32 to produce the following URN: > > > > urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB > > That's an invalid URN, AFAIK. There's no authority. All URIs > need an authority to vouch for the identity. URNs need not have authorities -- especially those which are strictly a function of the content they describe. (Now, it is true that the URN namespace "sha1" has not yet been formally documented and reserved in accordance with RFC-specified procedures, but the above *could* be a valid URN.) > Also, including a hash of the content in a URI is an extremely > brittle way of identifying things, unless it is known that > the content will be static for all time and space. If that URN > identified a song in MP3 format, for example, then you'd need > a new URN to identify the same song in a different format. > Is that what you want? For many applications, yes. For stitching together pieces of a single exact file from dozens of independent sources, absolutely! You don't want a "mostly acceptable" MP3 to have the same reliable name as the "official" or "consensus 'best'" version. > > It is believed that N2R, N2L, and N2Ls will be the most useful > > services for the Content-Addressable Web, so we will cover > > examples of those explicitly. The rest of the N2* headers > > should be implemented using the conventions used for N2R, > > N2L, and N2Ls. > > The N2* conventions run completely against the architecture of the Web. > URIs are resource identifiers. URNs are one kind of URI. How many > URIs does a resource need? I'd say you want one (a hash-based URN) to serve as the resource's unfalsifiable "true name". You might want several others (traditionally, URLs) to reflect the resource's current reachable locations, or its names within alternate delivery systems. 
- Gojomo ____________________ Gordon Mohr, gojomo@ bitzi.com, Bitzi CTO _ http://bitzi.com _ From zooko at zooko.com Thu Oct 25 13:52:01 2001 From: zooko at zooko.com (Zooko) Date: Sat Dec 9 22:11:43 2006 Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web In-Reply-To: Message from "Gordon Mohr" of "Thu, 25 Oct 2001 12:39:33 PDT." <00b001c15d8c$c66077a0$0ea7fea9@golden> References: <200110251500.LAA19254@markbaker.ca> <00b001c15d8c$c66077a0$0ea7fea9@golden> Message-ID: [Everyone please note that this thread is being crossposted to p2p-hackers, decentralization and bluesky. Only subscribers can post directly to p2p-hackers, but messages from non-subscribers get forwarded to me for approval. --Zooko http://zgp.org/mailman/listinfo/p2p-hackers/ ] Lines prepended with "> > > " were written by Justin Chapweske. Lines prepended with "> > " were written by Mark Baker. > > > While any kind of URN can be used within the Content-Addressable > > > Web, there is a specific type of URN called a "Self-Verifying > > > URN" that is particularly useful. These URNs have the > > > property that the URN itself can be used to verify that > > > the content has been received intact. It is RECOMMENDED > > > that applications use cryptographically strong self-verifying > > > URNs because hosts in ad hoc CDNs and the Transient Web > > > are assumed to be untrusted. For instance, one could hash > > > the content using the SHA-1 algorithm, and encode it using > > > Base32 to produce the following URN: > > > > > > urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB > > > > That's an invalid URN, AFAIK. There's no authority. All URIs > > need an authority to vouch for the identity. This is surely the most pernicious myth about naming: that it is impossible to verify the correctness of a mapping yourself and you are doomed to trust in some external authority who will tell you the answer. 
There are two counterexamples: names that are a deterministic function of the content (which Freenet calls "Content Hash Keys" or "CHKs") and names that include the ID of a public key in the name, so that you can check a digital signature on the content (which Freenet calls "Sub-Space Keys" or "SSKs", and the Self-Certifying File System[1] calls "names").

I think this is an extremely important point. IMO the only part of "p2p" which is really revolutionary is the potential for "cooperation without vulnerability" -- two agents on opposite sides of an unbridgeable trust boundary who are still able to interoperate and cooperate.

I'm going to say it again: The most important concept in the whole field of "p2p" or "decentralization" or whatever you call it is the concept of "cooperation without vulnerability".

The most important component of infrastructure that we lack right now in order to enable cooperation without vulnerability is a name service which uses self-authenticating keys so that no agent is ever vulnerable to deception with regard to what object a name should map to.

Regards,

Zooko http://zooko.com/

[1] http://fs.net/

From alk at pobox.com Thu Oct 25 13:54:01 2001
From: alk at pobox.com (Tony Kimball)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
References: <3BD7B24C.9030403@chapweske.com> <200110251500.LAA19254@markbaker.ca>
Message-ID: <15320.31589.458110.880405@gargle.gargle.HOWL>

:
: The N2* conventions run completely against the architecture of the Web.
: URIs are resource identifiers. URNs are one kind of URI. How many
: URIs does a resource need?

N2L doesn't provide URIs, it provides URLs. A resource needs at least as many URLs as it has addressable locations.
From alk at pobox.com Thu Oct 25 14:19:01 2001
From: alk at pobox.com (Tony Kimball)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
References: <00b001c15d8c$c66077a0$0ea7fea9@golden> <200110252117.RAA22943@markbaker.ca>
Message-ID: <15320.33141.378877.269651@gargle.gargle.HOWL>

Quoth Mark Baker on Thursday, 25 October:
:
: But if it's still really important for this system to have the
: hash in the URI, how about this;
:
: http://foobar.org/my_content?sha1hash=32452345ASDASDFASDFS

It's also really important for this system not to have a location in the URI.

: Do you have an example of a resource that is best named by its
: hash?

All resources are best named by their hash, for some value of "best".

From distobj at acm.org Thu Oct 25 14:28:01 2001
From: distobj at acm.org (Mark Baker)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
In-Reply-To: <00b001c15d8c$c66077a0$0ea7fea9@golden> from "Gordon Mohr" at Oct 25, 2001 12:39:33 PM
Message-ID: <200110252117.RAA22943@markbaker.ca>

Hi Gordon,

> > Untrue. The use of URIs allows these things. There is nothing
> > special about URNs in this respect.
>
> Well, I believe the theory is that URNs have slightly stronger
> "identity" characteristics.

s/theory/spiteful rumour/g 8-)

> However, the latest W3C "clarification"
> seems to simply confirm the practice of using some URIs as if
> they were URNs.

Even HTTP URIs.

Yes, that debate still exists. In general, REST proponents believe that any URI is only as persistent as the authority is willing to make it. The dependence of the HTTP URI scheme on DNS, and the fact that an authority doesn't "own" it, is often mentioned as a reason why HTTP URIs are less persistent. But that ignores the fact that for urn:<nid>:foo, the registrant doesn't own "nid" either. If IBM runs out to register "microsoft" as a NID, you can guarantee WIPO will eventually get involved.
It's just another central registry.

> If the names are self-verifying, as with secure hashes, you
> don't have to make a trust/no-trust decision about caches,
> at least not at the outset. You can make the trust/no-trust
> decision on what they give you. Transgressors are always
> caught.

I have an issue with the name "self-verifying". If I do a GET on urn:sha-1:234234KJASDFKAJFD, I don't know that I'm getting back a resource with that hash. I still have to run the hash and compare it to the URI, because the cache may be lying. So the onus of verification lies with the client, and therefore the URI isn't self-verifying.

But if it's still really important for this system to have the hash in the URI, how about this;

http://foobar.org/my_content?sha1hash=32452345ASDASDFASDFS

> > Also, including a hash of the content in a URI is an extremely
> > brittle way of identifying things, unless it is known that
> > the content will be static for all time and space. If that URN
> > identified a song in MP3 format, for example, then you'd need
> > a new URN to identify the same song in a different format.
> > Is that what you want?
>
> For many applications, yes. For stitching together pieces of a
> single exact file from dozens of independent sources, absolutely!

Ok, so how about my URI above?

> You don't want a "mostly acceptable" MP3 to have the same
> reliable name as the "official" or "consensus 'best'" version.

HTTP handles that with variants (different representations of the same resource). Variants can have their own URIs too. So you could have;

http://myfavband.org/song/myfavsong (the main URI)

plus these variants;

http://myfavband.org/song/myfavsong/mp3/256k
http://myfavband.org/song/myfavsong/mp3/128k
http://myfavband.org/song/myfavsong/wav/56k
http://myfavband.org/song/myfavsong/au/8k

> I'd say you want one (a hash-based URN) to serve as the resource's
> unfalsifiable "true name".
> You might want several others (traditionally,
> URLs) to reflect the resource's current reachable locations, or
> its names within alternate delivery systems.

Do you have an example of a resource that is best named by its hash?

MB
--
Mark Baker, CSO, Planetfred. Ottawa, Ontario, CANADA. mbaker@planetfred.com

From distobj at acm.org Thu Oct 25 14:39:01 2001
From: distobj at acm.org (Mark Baker)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
In-Reply-To: <15320.31589.458110.880405@gargle.gargle.HOWL> from "Tony Kimball" at Oct 25, 2001 03:51:49 PM
Message-ID: <200110252136.RAA23215@markbaker.ca>

> : The N2* conventions run completely against the architecture of the Web.
> : URIs are resource identifiers. URNs are one kind of URI. How many
> : URIs does a resource need?
>
> N2L doesn't provide URIs, it provides URLs.

URLs are URIs, so N2L does provide them.

> A resource needs at least
> as many URLs as it has addressable locations.

URLs are names. The only thing that makes them a locator is a convention that converts them to an (IP address, TCP port, local name) tuple. But that mapping isn't authoritative. Quite the opposite in fact. For example, chances are you have a cached version of http://www.yahoo.com in your browser cache. What is the URL for that document? It's http://www.yahoo.com.

Is "Tony Kimball" a locator? Of course not, it's a name, right? So what if we defined a convention that said that you could do a DNS lookup on tony.kimball.person, open a connection to port 81, do a GET, and see his homepage. Would that make "Tony Kimball" a locator?

MB
--
Mark Baker, CSO, Planetfred. Ottawa, Ontario, CANADA.
mbaker@planetfred.com

From justin at chapweske.com Thu Oct 25 15:13:01 2001
From: justin at chapweske.com (Justin Chapweske)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Content-Addressable Networks Thread
Message-ID: <3BD88D39.4060805@chapweske.com>

Could we move the CAW thread to decentralization? Sorry for the cross-posting.

-Justin

From gojomo at usa.net Thu Oct 25 16:18:01 2001
From: gojomo at usa.net (Gordon Mohr)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
References: <200110252117.RAA22943@markbaker.ca>
Message-ID: <015701c15dab$625960e0$0ea7fea9@golden>

[bluesky dropped, as it's not accepting my non-subscriber messages]

Mark Baker writes:
> > However, the latest W3C "clarification"
> > seems to simply confirm the practice of using some URIs as if
> > they were URNs.
>
> Even HTTP URIs.
>
> Yes, that debate still exists. In general, REST proponents
> believe that any URI is only as persistent as the authority is
> willing to make it. The dependence of the HTTP URI scheme on
> DNS, and that an authority doesn't "own" it, is often mentioned
> as a reason why HTTP URIs are less persistent. But that ignores
> the fact that for urn:<nid>:foo, the registrant doesn't own
> "nid" either. If IBM runs out to register "microsoft" as a NID,
> you can guarantee WIPO will eventually get involved. It's just
> another central registry.

But with identifiers which are inherent, rather than assigned, it doesn't matter who registers/"owns" something. Only the consensus definition of the process for creating the name from the content matters.

> > If the names are self-verifying, as with secure hashes, you
> > don't have to make a trust/no-trust decision about caches,
> > at least not at the outset. You can make the trust/no-trust
> > decision on what they give you. Transgressors are always
> > caught.
>
> I have an issue with the name "self-verifying".
>
> If I do a GET on urn:sha-1:234234KJASDFKAJFD, I don't know that
> I'm getting back a resource with that hash. I still have to
> run the hash and compare it to the URI, because the cache may
> be lying. So the onus of verification lies with the client,
> and therefore the URI isn't self-verifying.

Turn the problem around. You have a resource. You want to share it. Sure, you can give it an arbitrary name in the HTTP URL namespace, under some hostname you control. But then no one will know whether what you have is the same thing as what they have. Third parties looking for your exact file cannot tell, from just a legal HTTP URL, if what you're offering is what they seek.

You could start a convention for embedding a unique, location-independent identifier into your URLs -- as RFC2169 suggests, or you suggest later in your message. I believe that is a useful approach.

But then, you now have something else interesting to advertise -- the "true name" of the content. In fact, you don't care at all about the domain-name and request-URI stuff, you'll let those take any value, as long as the reliable name matches and checks out.

> But if it's still really important for this system to have the
> hash in the URI, how about this;
>
> http://foobar.org/my_content?sha1hash=32452345ASDASDFASDFS

The "N2R" resolution services discussed in RFC2169 and CAW are very much like you suggest. In the above, "sha1hash=32452345ASDASDFASDFS" is a unique, location-independent, durable name for the content. So why not call it an URN, and make its qualities explicit in the syntax? Like:

http://foobar.org/my_content?name=urn:sha1:32452345ASDASDFASDFS

And then, when you are completely indifferent about where you get the matching content -- indifferent about location, protocol, everything -- why should an application keep shuttling the "http://foobar.org/my_content?name=" portion around?
Just go with: *?name=urn:sha1:32452345ASDASDFASDFS or better yet urn:sha1:32452345ASDASDFASDFS > > You don't want a "mostly acceptable" MP3 to have the same > > reliable name as the "official" or "consensus 'best'" version. > > HTTP handles that with variants (different representations of the > same resource). Variants can have their own URIs too. So you > could have; > > http://myfavband.org/song/myfavsong (the main URI) > > plus these variants; > > http://myfavband.org/song/myfavsong/mp3/256k > http://myfavband.org/song/myfavsong/mp3/128k > http://myfavband.org/song/myfavsong/wav/56k > http://myfavband.org/song/myfavsong/au/8k These names still don't come close to uniquely identifying specific instances. (There are, for example, trillions of equally-valid MP3 encodings of a song.) When you want a specific digital file, one that is an exact copy of an original/official/recommended version, you want a precise name. > > I'd say you want one (a hash-based URN) to serve as the resource's > > unfalsifiable "true name". You might want several others (traditionally, > > URLs) to reflect the resource's current reachable locations, or > > its names within alternate delivery systems. > > Do you have an example of a resource that is best named by its > hash? 
- a recipe for a chemical process that could be dangerous if any of the steps or quantities are slightly altered
- a 2-hour video you want to grab as equal parts from 120 different sources, with no glitches
- a compiled executable for which versions with malicious code might be floating around

- Gordon

From clay at shirky.com Thu Oct 25 18:16:01 2001
From: clay at shirky.com (Clay Shirky)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] Re: [decentralization] The Content-Addressable Web
In-Reply-To: from "Zooko" at Oct 25, 2001 01:41:51 PM
Message-ID: <200110260048.f9Q0mh322670@mail.shirky.com>

> The most important concept in the whole field of the "p2p" or
> "decentralization" or whatever you call it is the concept of
> "cooperation without vulnerability".

You don't mean exactly that, I think, since the Visa system, as but one example, allows this. If I go into some little shop in Cancun to buy sunglasses, that merchant and I are all but guaranteed never to see one another again. It's a non-iterated game of PD, where defection can backfire, but it can never be punished.

But I have a Visa card, and he has a Visa sign in his door. Visa knows him, and will pimpslap him upside his silly little head if he runs up extra charges on my card. Visa knows me, and has a staff of trained knuckledraggers who will nail my head to the floor if I fail to pay up.

So what I think you mean is "P2P can allow cooperation without vulnerability _or_ brokering authority." And this only matters if you think such a system can operate at a lower cost (cash cost plus opportunity cost) than a system with a broker.

You and Todd think such a thing is possible. I'm not sure it's impossible, but I am certainly in the skeptics' camp on this one.

-clay

From bram at gawth.com Sun Oct 28 13:12:02 2001
From: bram at gawth.com (Bram Cohen)
Date: Sat Dec 9 22:11:43 2006
Subject: [p2p-hackers] BitTorrent 2.5.1 is out!
Message-ID: BitTorrent 2.5.1 is out, check it out here - http://bitconjurer.org/BitTorrent/ Or, if you're running windows and just want a quick demo, go straight to the demo page - http://bitconjurer.org/BitTorrent/demo.html New in this release - timeouts and keepalives. This will probably be the last non-backwards compatible release for a while. my apologies to anyone who got a 404 in the last day or so - there was some release snafu. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes