From david at pjort.com Sun Feb 1 09:19:00 2004 From: david at pjort.com (David =?iso-8859-1?Q?G=F6thberg?=) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] Re: P2P journal copyright In-Reply-To: <401B22BA.8010005@neurogrid.com> References: <20040131030211.GF20611@lycopodium> <401AF6B8.6050902@neurogrid.com> <20040131030211.GF20611@lycopodium> Message-ID: <5.0.2.1.1.20040131100728.00a3f650@pop.home.se> Sam Joseph, P2P journal wrote: >We are trying to adjust our copyright policy to make everyone happy. >We are in the process of trying to improve the wording of the policy, and >I don't understand why you don't want to help us. >In fact here is the latest version of the copyright policy >"P2PJ wants to find a balance between author's copyright and ensuring >articles in P2PJ are unique. P2PJ prefers original, quality articles. >Quality is more important than quantity. > >By submitting the article to P2PJ, the author grants P2PJ and its >affiliated organizations and entities, perpetual, royalty-free, worldwide >licenses for both electronic and print formats. Authors are granted rights >to reproduce their articles provided that they include a prominent >statement: 'The piece originally appeared in P2P Journal'. A clearly >visible link to http://p2pjournal.com must also be made. Hi everyone! Since it was I who started out this discussion I feel obliged to comment. Don Marti: Thanks for supporting my views. :) Sam Joseph, p2pjournal.com: Your new version of your copyright agreement is much more agreeable. However I think it does create a whole set of legal problems and uncertainties for both parties. So I have some suggestions. Stating that the p2pjournal gets a "license" and then that "Authors are granted rights to reproduce" makes it very unclear who owns what. The wording you have chosen for instance might make it illegal for any of the parties to resell the text! 
That is, your wording gives both parties the right to reproduce the text, but not to sell copies of it or resell the rights to it. I think it is better and easier to give both parties one complete copyright and ownership of the text. That is, to copy the copyright! :) Here's a rough translation (from memory) and adaptation of the copyright agreement we used for the paintings my mother bought for the book she wrote. Lawyers in Sweden thought this was a very nice idea and they could see no legal problem with it: ************************** "The author grants P2P Journal a "shared" or "copied" copyright to the document. This means that each party has the complete set of rights to the document as if they were two different documents. That means that both parties can do anything they want with their copy of the document. Both parties can reproduce, redistribute, license or even sell their rights to the document to other parties. This also means that both parties can make changes to the document and reuse any part of it as they see fit. That is, both parties fully own their copy of the document." "Both parties understand that if they sell their right to the document to some other party they should inform the buyer about the fact that this document is subject to a shared or copied copyright." ************************** A funny consequence of a copied copyright is that each party actually can sell (or give away) copied copyrights of the document! That is, theoretically in the long run there might be many owners each owning a copy of the document. :)) We had one additional "backup copy" paragraph that you probably don't want to use and don't need. But it was good since one usually doesn't have many backup copies of paintings... ************************** "If any of the parties loses his/her originals they have the right to request full size copies of the paintings from the other party at cost price. 
(Price of making a good copy in a print shop and postage and packaging.)" ************************** We didn't bother to add anything about how this "backup copy" paragraph should be handled if one party sold their right to the document to some other party. So it is unclear if a new owner that has not signed the original contract would be bound by it. So adding some obligation or rights between the two original parties can become very messy if one party wants to sell, reuse or change their copy. Your demand that the author must include the statement "The piece originally appeared in P2P Journal" causes a similar problem. What if the author greatly reworks the document? Or only reuses some small part of the document inside another document? And what if the author sells his right to the document? The easy way to avoid such problems is simply to not have any obligations between the two parties. I am aware that you wanted to prevent the authors from publishing their document in other journals etc. But to accomplish that you must be the sole owner of the document. But to OWN the document you should PAY the author for doing WORK for you. And note that much of the content in the articles might have been the result of work done while the author was paid by some other source. And you didn't want to pay anything anyway. But you could add the following paragraph: ************************** If / when the author reuses the published article, P2P Journal would like the following statement to be included: 'This piece originally appeared in P2P Journal' However this is only a request, not an obligation. ************************** I believe that most authors will be perfectly happy to "brag" in their paper that the paper has been published in a journal. But only making it a request gives the author full freedom. And this also solves the problem of what to do when only reusing a small part of the document (like a figure or so). 
I understand that you need to have the right to print and resell your journal and to reuse parts of your journal in other ways. But the authors also need to reuse the documents they write about their research. It would be silly if the authors would have to redraw the figures describing their p2p network structure. A "copied" copyright solves this problem completely, giving both parties full freedom. And most p2p researchers really like freedom a lot! Freedom of speech/expression/publication is one of the main reasons many of us do p2p research. Well, this was my five cents (my view) on the "problem". Greetings from snowy Gothenburg, Sweden, Northern Europe, .../David ----------------------------------------------------------- David Göthberg Email: david@pjort.com http://www.david.pjort.com ----------------------------------------------------------- From sam at neurogrid.com Mon Feb 2 01:04:18 2004 From: sam at neurogrid.com (Sam Joseph) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] P2PJ Call for Referees Message-ID: <401DA212.8080009@neurogrid.com> Hi all, I realise that this may not be the best timing to post this call for *referees*, since we are in the middle of a debate about copyright in P2PJ. However I am in the process of moving house, and so can't gather all the threads of the copyright thing together right now. I want to thank everyone who has mailed me and the list with input, and assure everyone that as soon as I have finished my house move I will work my hardest to get P2PJ set up with a copyright policy that addresses everyone's concerns. In the meantime I'm just mailing this call for referees because we really do need help reviewing papers, and I have already taken far too long to get it sorted out. 
Sincerely, Sam Joseph ------------------------------------------------------------------- CALL FOR REVIEWERS Peer-to-Peer Journal (http://p2pjournal.com) ------------------------------------------------------------------- Apologies if you receive this announcement more than once. The Peer-to-Peer Journal (P2PJ) is an electronic, refereed journal devoted to comprehensive coverage of Peer-to-Peer and Parallel computing topics. It is freely available, with no subscription, and is located at http://www.p2pjournal.com/. All articles submitted to P2PJ are evaluated by referees who comment on the article and make recommendations to the Editor. P2PJ is looking to expand the number of referees and would very much welcome volunteers from those interested in P2P and with specializations in relevant fields. When P2PJ receives articles, the editorial board selects referees to approach according to the overlap between the article itself and the expertise of the potential reviewers. An invitation email is then sent to the selected referees, with the anonymised article attached and a checklist of points to consider. P2PJ hopes to receive reviews within 10 to 14 days. The authors then receive an anonymised version of the referees' comments and suggestions for improvement of the article. A referee would not normally receive more than three papers a year to review, and most referees are asked to review only occasionally, when a paper in their particular field is submitted. While there are no direct benefits available to referees, reviewing does allow you to see the current state of the art, is of immense benefit to the peer-to-peer community generally, and is beneficial to the authors of the submission specifically. 
If you would like to become a referee for P2PJ, please send this email to sam@p2pjournal.com after completing the form below. ----------------------------------------------- Title: (e.g. Dr, Prof.) First name: Family Name: Affiliation/institution: Email address: Keywords describing your area of expertise and interest (please be as specific as possible, e.g. 'distributed hashtables', not 'p2p'). List up to 5 keywords or phrases, one per line: Do you have a doctorate (PhD)? If so, in what field: Number of papers you have published in refereed journals: Thank you. Your reply will be acknowledged. Invitations to referee will then arrive when appropriate papers are submitted to P2PJ. If you need to update your details, please notify us. Raymond F. Gao, Editor-in-Chief; Daniel Brookshier, Editor; Sam Joseph, Editor From b.fallenstein at gmx.de Tue Feb 3 14:12:33 2004 From: b.fallenstein at gmx.de (Benja Fallenstein) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? Message-ID: <401FAC51.6040000@gmx.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, does anybody know references for using cryptographic hashes as unique identifiers for files in very large repositories (think all of the Web)? The references I've found (e.g. Handbook of Applied Cryptography) don't talk explicitly about that, but only about applications in message authentication, and attacks related to that; of course that's related, but it would be nice to know whether there are references from cryptology talking explicitly about hashes as unique identifiers in very large collections of messages. 
Thanks, - - Benja -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFAH6xQUvR5J6wSKPMRAhRLAKCZUml8RrCWLTSuFR69uCLEvaLIYQCgqQP1 ZOpKsNoWrTxfuN/xu2PB5Dw= =Vy04 -----END PGP SIGNATURE----- From bert at web2peer.com Tue Feb 3 15:27:42 2004 From: bert at web2peer.com (Bert) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? Message-ID: <20040203072743.1242.h004.c001.wm@mail.web2peer.com.criticalpath.net> Maybe I'm not understanding the question, but isn't this exactly the idea of distributed hash tables? E.g. Chord, CAN, Pastry, etc... On Tue, 03 Feb 2004 16:12:33 +0200, Benja Fallenstein wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi, > > does anybody know references for using cryptographic hashes as unique > identifiers for files in very large repositories (think all of the > Web)? > The references I've found (e.g. Handbook of Applied Cryptography) don't > talk explicitly about that, but only about applications in message > authentication, and attacks related to that; of course that's related, > but it would be nice to know whether there are references from > cryptology talking explicitly about hashes as unique identifiers in > very > large collections of messages. 
> > Thanks, > - - Benja > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.2.4 (GNU/Linux) > Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org > > iD8DBQFAH6xQUvR5J6wSKPMRAhRLAKCZUml8RrCWLTSuFR69uCLEvaLIYQCgqQP1 > ZOpKsNoWrTxfuN/xu2PB5Dw= > =Vy04 > -----END PGP SIGNATURE----- > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From jdd at dixons.org Tue Feb 3 16:13:35 2004 From: jdd at dixons.org (Jim Dixon) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <401FAC51.6040000@gmx.de> Message-ID: <20040203160155.G78403-100000@localhost> On Tue, 3 Feb 2004, Benja Fallenstein wrote: > does anybody know references for using cryptographic hashes as unique > identifiers for files in very large repositories (think all of the Web)? > The references I've found (e.g. Handbook of Applied Cryptography) don't > talk explicitly about that, but only about applications in message > authentication, and attacks related to that; of course that's related, > but it would be nice to know whether there are references from > cryptology talking explicitly about hashes as unique identifiers in very > large collections of messages. No references really needed. You can use SHA digests to generate 160 bit/20 byte keys which can be used as unique identifiers. While it is theoretically possible that one message or other document could hash to the same digest as another, chances are approximately 10^16 against it, so we needn't worry in our lifetimes. As someone else has pointed out, the various DHT networks (Chord, Pastry, etc) as well as Freenet assume that SHA digests are unique identifiers. 
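[The scheme Jim describes, an SHA-1 digest used directly as a 160-bit content key, can be sketched in a few lines of Python; the `store` dict below is a hypothetical stand-in for a DHT or repository, purely for illustration:]

```python
import hashlib

def content_key(data: bytes) -> bytes:
    """Return the 20-byte (160-bit) SHA-1 digest of the data,
    used here as its unique identifier."""
    return hashlib.sha1(data).digest()

# Hypothetical stand-in for a DHT or document repository.
store = {}

doc = b"an example document"
store[content_key(doc)] = doc

# The key is deterministic: hashing the same content always
# yields the same 160-bit identifier, so lookups need no index.
assert len(content_key(doc)) == 20
assert store[content_key(doc)] == doc
```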
-- Jim Dixon jdd@dixons.org tel +44 117 982 0786 mobile +44 797 373 7881 http://jxcl.sourceforge.net Java unit test coverage http://xlattice.sourceforge.net p2p communications infrastructure From b.fallenstein at gmx.de Tue Feb 3 16:15:55 2004 From: b.fallenstein at gmx.de (Benja Fallenstein) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <20040203072743.1242.h004.c001.wm@mail.web2peer.com.criticalpath.net> References: <20040203072743.1242.h004.c001.wm@mail.web2peer.com.criticalpath.net> Message-ID: <401FC93B.6000508@gmx.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Bert, The question was whether there are any references in the cryptology literature that say that the idea is ok. I know it's true, I just wanted a reference so that I don't have to do the math to show it myself. I found none so far, so I'll have to do it myself ;) (Luckily, a more math-inclined friend showed me how that's not all that hard.) Thanks - - Benja Bert wrote: | Maybe I'm not understanding the question, but isn't this exactly the | idea of distributed hash tables? E.g. Chord, CAN, Pastry, etc... | | | | On Tue, 03 Feb 2004 16:12:33 +0200, Benja Fallenstein wrote: | | | Hi, | | does anybody know references for using cryptographic hashes as unique | identifiers for files in very large repositories (think all of the | Web)? | The references I've found (e.g. Handbook of Applied Cryptography) | |> don't | | talk explicitly about that, but only about applications in message | authentication, and attacks related to that; of course that's | |> related, | | but it would be nice to know whether there are references from | cryptology talking explicitly about hashes as unique identifiers in | very | large collections of messages. 
| | Thanks, | - Benja _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences | _______________________________________________ | p2p-hackers mailing list | p2p-hackers@zgp.org | http://zgp.org/mailman/listinfo/p2p-hackers | _______________________________________________ | Here is a web page listing P2P Conferences: | http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFAH8k6UvR5J6wSKPMRAsaXAKCewyF+NSbPpE2jDwv0P6W08gKrjwCdHYsG /7u2dWKVA4NOp+NK5IYm+Yc= =Ll48 -----END PGP SIGNATURE----- From zooko at zooko.com Tue Feb 3 16:22:42 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: Message from Jim Dixon of "Tue, 03 Feb 2004 16:13:35 GMT." <20040203160155.G78403-100000@localhost> References: <20040203160155.G78403-100000@localhost> Message-ID: Jim Dixon wrote: > > While it is > theoretically possible that one messages or other document could hash to > the same digest as another, changes are approximately 10^16 against it, so > we needn't worry in our lifetimes. You're right about the conclusion, Jim, but 2^10 ~= 10^3, so 2^160 ~= 10^48. This is 10^16: 10,000,000,000,000,000 This is 10^48: 1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 Regards, Zooko From hannes.tschofenig at siemens.com Tue Feb 3 16:46:19 2004 From: hannes.tschofenig at siemens.com (Tschofenig Hannes) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? 
Message-ID: <2A8DB02E3018D411901B009027FD3A3F03BC067B@mchp905a.mch.sbs.de> hi all, you would call it statistically unique: from the hip draft: The birthday paradox sets a bound for the expectation of collisions. It is based on the square root of the number of values. A 64-bit hash, then, would put the chances of a collision at 50-50 with 2^32 hosts (4 billion). A 1% chance of collision would occur in a population of 640M and a .001% collision chance in a 20M population. A 128 bit hash will have the same .001% collision chance in a 9x10^16 population. ciao hannes > -----Original Message----- > From: Jim Dixon [mailto:jdd@dixons.org] > Sent: Tuesday, February 03, 2004 5:14 PM > To: Peer-to-peer development. > Subject: Re: [p2p-hackers] References for using hashes as unique > identifiers? > > > On Tue, 3 Feb 2004, Benja Fallenstein wrote: > > > does anybody know references for using cryptographic hashes > as unique > > identifiers for files in very large repositories (think all > of the Web)? > > The references I've found (e.g. Handbook of Applied > Cryptography) don't > > talk explicitly about that, but only about applications in message > > authentication, and attacks related to that; of course > that's related, > > but it would be nice to know whether there are references from > > cryptology talking explicitly about hashes as unique > identifiers in very > > large collections of messages. > > No references really needed. You can use SHA digests to generate 160 > bit/20 byte keys which can be used as unique identifiers. While it is > theoretically possible that one messages or other document > could hash to > the same digest as another, changes are approximately 10^16 > against it, so > we needn't worry in our lifetimes. > > As someone else has pointed out, the various DHT networks > (Chord, Pastry, > etc) as well as Freenet assume that SHA digests are unique > identifiers. 
> > -- > Jim Dixon jdd@dixons.org tel +44 117 982 0786 mobile +44 > 797 373 7881 > http://jxcl.sourceforge.net Java unit > test coverage > http://xlattice.sourceforge.net p2p communications > infrastructure > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From b.fallenstein at gmx.de Tue Feb 3 17:03:34 2004 From: b.fallenstein at gmx.de (Benja Fallenstein) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: References: <20040203160155.G78403-100000@localhost> Message-ID: <401FD466.80101@gmx.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, Zooko O'Whielacronx wrote: | Jim Dixon wrote: | |>While it is |>theoretically possible that one messages or other document could hash to |>the same digest as another, changes are approximately 10^16 against it, so |>we needn't worry in our lifetimes. | | You're right about the conclusion, Jim, but 2^10 ~= 10^3, so 2^160 ~= 10^48. It's certainly not 2^160 against -- that would be the case if we had only two documents. If we have n documents, the probability of a collision is < n^2/2^160 though, which for n = 2^60 still gives 2^(-40) or about one trillion against. - - Benja -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFAH9RlUvR5J6wSKPMRAmSFAJ9kz8z7fqrEzW+AtEzMtRwQhQ2cAgCfeesw mBPUNb//x8d9i88y/kEMuZA= =mYPs -----END PGP SIGNATURE----- From jdd at dixons.org Tue Feb 3 17:13:23 2004 From: jdd at dixons.org (Jim Dixon) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? 
In-Reply-To: Message-ID: <20040203171223.B78403-100000@localhost> On 3 Feb 2004, Zooko O'Whielacronx wrote: > > While it is > > theoretically possible that one messages or other document could hash to > > the same digest as another, changes are approximately 10^16 against it, so > > we needn't worry in our lifetimes. > > You're right about the conclusion, Jim, but 2^10 ~= 10^3, so 2^160 ~= 10^48. Yep. Should have stayed in bed. > This is 10^16: > > 10,000,000,000,000,000 > > This is 10^48: > > 1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 Which makes the odds just a bit better. ;-) -- Jim Dixon jdd@dixons.org tel +44 117 982 0786 mobile +44 797 373 7881 http://jxcl.sourceforge.net Java unit test coverage http://xlattice.sourceforge.net p2p communications infrastructure From zooko at zooko.com Tue Feb 3 17:21:48 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: Message from Jim Dixon of "Tue, 03 Feb 2004 17:13:23 GMT." <20040203171223.B78403-100000@localhost> References: <20040203171223.B78403-100000@localhost> Message-ID: > > You're right about the conclusion, Jim, but 2^10 ~= 10^3, so 2^160 ~= 10^48. > > Yep. Should have stayed in bed. Me too, because I didn't mention the birthday surprise. But we don't have to worry about the birthday surprise for a specific document. The chance that anyone can come up with a second pre-image that matches this hash is 2^-160: 820550664cf296792b38d1647a4d8c0e1966af57 Regards, Zooko P.S. The hash above is hex encoded. Here it is base32 encoded in my own base32 alphabet: oeniy31c6km81k3a4f18wuccbacspm4z From gojomo at bitzi.com Tue Feb 3 17:33:08 2004 From: gojomo at bitzi.com (Gordon Mohr) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? 
In-Reply-To: References: <20040203171223.B78403-100000@localhost> Message-ID: <401FDB54.7040406@bitzi.com> Zooko O'Whielacronx wrote: > But we don't have to worry about the birthday surprise for a specific document. > The chance that anyone can come up with a second pre-image that matches this > hash is 2^-160: > > 820550664cf296792b38d1647a4d8c0e1966af57 > > Regards, > > Zooko > > P.S. The hash above is hex encoded. Here it is base32 encoded in my own > base32 alphabet: oeniy31c6km81k3a4f18wuccbacspm4z And using custom alphabets decreases the chances of inadvertent collisions even further: the exact same preimage and hash function will give different output! :) - Gordon From hal at finney.org Tue Feb 3 19:08:31 2004 From: hal at finney.org (Hal Finney) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? Message-ID: <200402031908.i13J8VT10926@finney.org> Hannes writes: > > The birthday paradox sets a bound for the expectation of collisions. It is > based on the square root of the number of values. A 64-bit hash, then, would > put the chances of a collision at 50-50 with 2^32 hosts (4 billion). A 1% > chance of collision would occur in a population of 640M and a .001% > collision chance in a 20M population. A 128 bit hash will have the same > .001% collision chance in a 9x10^16 population. So if the world population stabilizes at 10 billion, or 10^10, then each person can have 9 million addressable objects before the chance of a collision rises to .001%. At about a billion objects per person the chances hit 50-50 and the system breaks down. I don't think this is good enough. 9 million objects per person is rather limited, especially in a future 100 years from now which will probably be much more information-rich than today. I would suggest before we standardize on this that we use a larger hash than 160 bits. Newer hashes have sizes of 256, 384 and 512 bits. 
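[The birthday-bound figures quoted by Hannes and Hal can be checked with a short sketch; the approximation p ~ k^2 / 2^(n+1) is assumed here, which is standard but not spelled out in the thread:]

```python
import math

def population_for(p: float, bits: int) -> float:
    """Approximate number of random values k at which an n-bit hash
    reaches collision probability p, by the birthday approximation
    p ~ k^2 / 2^(n+1), i.e. k ~ sqrt(2 * p * 2^n)."""
    return math.sqrt(2 * p * 2.0 ** bits)

print(f"{population_for(0.5, 64):.2e}")    # 64-bit hash, 50-50 chance: ~4.3e9 ("4 billion")
print(f"{population_for(0.01, 64):.2e}")   # 1% chance: ~6.1e8 ("640M")
print(f"{population_for(1e-5, 64):.2e}")   # .001% chance: ~1.9e7 ("20M")
print(f"{population_for(1e-5, 128):.2e}")  # 128-bit, .001% chance: ~8.2e16 ("9x10^16")
```

[The approximation is accurate for small p; for p = 0.5 the exact 50-50 point is a constant factor (about 1.18) higher, which is why the thread's round numbers differ slightly from these.]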
Otherwise we'll be facing a Y2K-like problem 50 to 100 years from now. Hal From b.fallenstein at gmx.de Tue Feb 3 20:14:50 2004 From: b.fallenstein at gmx.de (Benja Fallenstein) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <200402031908.i13J8VT10926@finney.org> References: <200402031908.i13J8VT10926@finney.org> Message-ID: <4020013A.8050106@gmx.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Hal, Hal Finney wrote: | I don't think this is good enough. 9 million objects per person is | rather limited, especially in a future 100 years from now which will | probably be much more information-rich than today. I think it's a safe assumption that one hundred years from now, computers will be able to find second preimages for 160-bit hashes anyway -- and it's not too unlikely that the analysis of hashes will have advanced enough that SHA-1 or other hashes of today are broken. There was a paper that extrapolated trends in cryptography into the future, among other things predicting when breaking an n-bit hash would cost how much. I don't remember enough about the paper to find it right now, but maybe someone else remembers. Cheers, - - Benja -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFAH/+mUvR5J6wSKPMRAu8fAKDQb/VMdmJtxRAzjlhbjcl7GGxkQwCfXX4d kwpe9bkhJ/I226NvwL//11o= =gurT -----END PGP SIGNATURE----- From mllist at vaste.mine.nu Tue Feb 3 22:50:23 2004 From: mllist at vaste.mine.nu (Johan =?iso-8859-1?Q?F=E4nge?=) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <200402031908.i13J8VT10926@finney.org> References: <200402031908.i13J8VT10926@finney.org> Message-ID: <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> Has there been any research done on this, and how this can be handled? 
I realize that this is extremely unlikely, but it nags me that it's a problem. To fully compare two hashed pieces of data on two peers, one would need to transfer all of the data, correct? It is also not possible to detect "possible" collisions, without increasing the standard transfer rate. (E.g. the output of the hash-function.) However, this increase of transfer-rate is in some instances already taking place. E.g. when the collision is a top-level hash of a merkle-tree, but the lower levels are different. As soon as the lower parts of the tree are transferred (e.g. in a filesharing system where files are being transferred) the collision might be detected. Of course this only reduces the possibilities and there might still be undetectable collisions. A human might also detect that it's incorrect. The question then is whether it, except when the hash-function is cracked, really matters. If we really are unable to detect something wrong (and again, that it isn't intentional) then does this really matter? Won't what we get be good enough? Once a collision is detected, it's not a hard problem to do something about it. In a DHT, simply note on the node storing the hash with collisions that there are two versions, and the necessary data to distinguish them. After deciding which piece of data it is, then push one or both pieces of data onto some other node (by adding some data, e.g. "1", to the data hashed, and repeat until there's no longer a collision). I believe that for any serious candidate for an all-encompassing namespace, there should be some procedure as to what to do if a collision happens. /Vaste > Hannes writes: > > So if the world population stabilizes at 10 billion, or 10^10, then each > person can have 9 million addressable objects before the chance of a > collision rises to .001%. At about a billion objects per person the > chances > hit 50-50 and the system breaks down. > > I don't think this is good enough. 
9 million objects per person is > rather limited, especially in a future 100 years from now which will > probably be much more information-rich than today. > > I would suggest before we standardize on this that we use a larger > hash than 160 bits. Newer hashes have sizes of 256, 384 and 512 bits. > Otherwise we'll be facing a Y2K-like problem 50 to 100 years from now. > > Hal From justin at chapweske.com Tue Feb 3 23:08:52 2004 From: justin at chapweske.com (Justin Chapweske) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> Message-ID: <1075849732.3607.3240.camel@bog> I agree that this is the appropriate fix. If you are concerned about collisions, don't worry about using larger (and much slower) hash functions. Simply transfer enough levels of your Merkle tree (encoded using THEX of course :) to provide the level of robustness that you are looking for. > However, this increase of transfer-rate is in some instances already > taking place. E.g. when the collision is a top-level hash of a > merkle-tree, but the lower levels are different. As soon as the lower > parts of the tree are transferred (e.g. in a filesharing system where files > are being transferred) the collision might be detected. Of course this > only reduces the possibilities and there might still be undetectable > collisions. A human might also detect that it's incorrect. > From opoli at comcast.net Wed Feb 4 02:49:34 2004 From: opoli at comcast.net (Opoli) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? 
In-Reply-To: <4020013A.8050106@gmx.de> Message-ID: <000401c3eac9$8801c7d0$ae00a943@Willard> There was a paper that extrapolated trends in cryptography into the future, among other things predicting when breaking an n-bit hash would cost how much. I don't remember enough about the paper to find it right now, but maybe someone else remembers. -- The RSA site has links to a number of papers dealing with this - http://www.rsasecurity.com/rsalabs/technotes/bernstein.html The most recent paper is 2002 - there is probably something out there more up to date but this is what I had at hand. Glad to be a part of the list, Chris Camp Cheers, - - Benja -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFAH/+mUvR5J6wSKPMRAu8fAKDQb/VMdmJtxRAzjlhbjcl7GGxkQwCfXX4d kwpe9bkhJ/I226NvwL//11o= =gurT -----END PGP SIGNATURE----- _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers _______________________________________________ Here is a web page listing P2P Conferences: http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From p2p at kingprimate.com Wed Feb 4 03:48:10 2004 From: p2p at kingprimate.com (Jeremiah Rogers) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> Message-ID: <40206B7A.5070109@kingprimate.com> Johan Fänge wrote: > To fully compare two hashed pieces of data on two peers, one would need to > transfer all of the data, correct? It is also not possible to detect > "possible" collisions, without increasing the standard transfer rate. > (E.g. the output of the hash-function.) You could compare smaller segments of the data. 
The likelihood that both halves of two messages, as well as the entire messages, all have matching hashes would be extremely small (on the order of 1 in 2^(3n), though I'm not sure). I guess it really depends on how sure you want to be that the messages aren't the same. > Once a collision is detected it's not a hard problem to do something > about it. In a DHT, simply note on the node storing the hash with > collisions that there are two versions, and the necessary data to > distinguish them. After deciding which piece of data it is, then push one > or both pieces of data onto some other node (by adding some data, e.g. > "1", to the data hashed, and repeat until there's no longer a collision). > > I believe that for any serious candidate for an all-encompassing namespace, > there should be some procedure as to what to do if a collision happens. Being able to handle a collision without completely failing is important, but if you're going to have assertions made about database hashes then you need to have unique hashes to prevent refutable or misinterpreted statements. There are also zero-knowledge situations where (for example) the DHT stores merely the hash H, and although both A and B hash to H, you won't be able to tell that. There was an interesting solution proposed earlier on this list [1] that would allow for increased hash length in the future if collisions occurred while keeping current length (and network usage) lower. It involves taking a larger hash and shrinking it.
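The truncation idea referenced in [1] can be sketched in a few lines (an illustrative sketch only; the function names and parameters are mine, not from the thread): compute a wide hash once, publish only a truncated prefix of it as today's DHT key, and derive a longer key later without rehashing the content.

```python
import hashlib

def full_hash(data: bytes) -> bytes:
    """Compute the wide (512-bit) hash once and keep it around."""
    return hashlib.sha512(data).digest()

def dht_key(data: bytes, key_bits: int = 160) -> bytes:
    """Publish only a prefix of the wide hash as the current DHT key."""
    return full_hash(data)[: key_bits // 8]

data = b"example object"
key_now = dht_key(data)          # 20 bytes on the wire today
key_later = dht_key(data, 256)   # 32 bytes if the network raises the key length
# Old keys remain prefixes of new ones, so nothing has to be rehashed.
assert key_later.startswith(key_now)
```

Since every longer key is an extension of the shorter one, a network can grow its key length incrementally instead of flag-day rehashing everything.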
-jr [1] http://zgp.org/pipermail/p2p-hackers/2002-November/000977.html From bkn3 at columbia.edu Wed Feb 4 05:31:54 2004 From: bkn3 at columbia.edu (Brad Neuberg) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] Distributed Hashtables - Low Latency/Low Membership Overhead In-Reply-To: <40206B7A.5070109@kingprimate.com> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> <40206B7A.5070109@kingprimate.com> Message-ID: <6.0.1.1.2.20040203212724.01e31b38@pop.mail.yahoo.com> I'm looking for a DHT that has low latency (as few hops as possible) coupled with a design that can handle high churn (such as the average length of time for a node being 30 minutes). The membership/maintenance protocol should have low-overhead (i.e. the background traffic for establishing, maintaining, and breaking membership in the DHT should be a low-percentage of the total network traffic). Other factors can be "pessimized" to achieve these two goals, such as maintaining extra large routing tables in memory. I am aware of work establishing low latency for DHTs with Kelips and a modified Chord, but am not aware of work that also combines low-overhead for the membership protocol. Can folks point me to further work in this area? Thanks, Brad Neuberg bkn3@columbia.edu From paul at soniq.net Wed Feb 4 06:05:09 2004 From: paul at soniq.net (Paul Boehm) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <200402031908.i13J8VT10926@finney.org> References: <200402031908.i13J8VT10926@finney.org> Message-ID: <20040204060509.GA5303@soniq.net> On Tue, Feb 03, 2004 at 11:08:31AM -0800, Hal Finney wrote: > I would suggest before we standardize on this that we use a larger > hash than 160 bits. Newer hashes have sizes of 256, 384 and 512 bits. > Otherwise we'll be facing a Y2K-like problem 50 to 100 years from now.
i've been wondering if unique identifiers and data authentication really are the same problem. i think the unique identifier problem could be solved elegantly with universal hash functions, that allow for N to S mappings with arbitrary length for both N and S, and a provably non-biased distribution. for authentication of the data, the question remains whether trusting a cryptographic hash really is better than asking a trusted peer (assuming there won't be non-social filesharing in the future) that has the data, to MAC it for us. regards, paul From b.fallenstein at gmx.de Wed Feb 4 10:54:05 2004 From: b.fallenstein at gmx.de (Benja Fallenstein) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] Distributed Hashtables - Low Latency/Low Membership Overhead In-Reply-To: <6.0.1.1.2.20040203212724.01e31b38@pop.mail.yahoo.com> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> <40206B7A.5070109@kingprimate.com> <6.0.1.1.2.20040203212724.01e31b38@pop.mail.yahoo.com> Message-ID: <4020CF4D.8090004@gmx.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, For handling high churn at manageable network traffic, see Bamboo (http://bamboo-dht.org/). It's unfortunately not optimized for low latency, but maybe ideas from there can help. - - Benja Brad Neuberg wrote: | I'm looking for a DHT that has low latency (as few hops as possible) | coupled with a design that can handle high churn (such as the average | length of time for a node being 30 minutes). The membership/maintenance | protocol should have low-overhead (i.e. the background traffic for | establishing, maintaining, and breaking membership in the DHT should be | a low-percentage of the total network traffic). Other factors can be | "pessimized" to achieve these two goals, such as maintaining extra large | routing tables in memory.
I am aware of work establishing low latency | for DHTs with Kelips and a modified Chord, but am not aware of work that | also combines low-overhead for the membership protocol. Can folks point | me to further work in this area? | | Thanks, | Brad Neuberg | bkn3@columbia.edu | | _______________________________________________ | p2p-hackers mailing list | p2p-hackers@zgp.org | http://zgp.org/mailman/listinfo/p2p-hackers | _______________________________________________ | Here is a web page listing P2P Conferences: | http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences | | -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFAIM9CUvR5J6wSKPMRAhKlAJ92QE5oZRtXJ6uSGxWL64heuqrP7gCcDSf7 ZVt1Fq/XeStyWIqerjO8hyk= =YDM/ -----END PGP SIGNATURE----- From aloeser at cs.tu-berlin.de Wed Feb 4 11:16:35 2004 From: aloeser at cs.tu-berlin.de (Alexander =?iso-8859-1?Q?L=F6ser?=) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] Request for query load balancing approaches for the Chord protocol Message-ID: <4020D493.2420AE9C@cs.tu-berlin.de> Chord uses a consistent hash function which equally distributes keys to nodes. For a large number of keys, queries are somewhat loadbalanced. However, some keys are more popular than other keys, thus some nodes receive more queries, more messages are routed to those nodes, and more bandwidth is used. An interesting approach would be to create cache entries of keys and objects along the lookup paths of successful queries. Does anybody know of experiences with this approach? Which approaches for query loadbalancing are known to you? Any links to papers or working systems are welcome. Alex -- ___________________________________________________________ M.Sc., Dipl. Wi.-Inf.
Alexander Löser Technische Universitaet Berlin Fakultaet IV - CIS bmb+f-Projekt: "New Economy, Neue Medien in der Bildung" hp: http://cis.cs.tu-berlin.de/~aloeser/ office: +49- 30-314-25551 fax : +49- 30-314-21601 ___________________________________________________________ From bkn3 at columbia.edu Wed Feb 4 11:19:03 2004 From: bkn3 at columbia.edu (Brad Neuberg) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] Request for query load balancing approaches for the Chord protocol In-Reply-To: <4020D493.2420AE9C@cs.tu-berlin.de> References: <4020D493.2420AE9C@cs.tu-berlin.de> Message-ID: <6.0.1.1.2.20040204031846.01e4feb0@pop.mail.yahoo.com> Check out Coral at http://www.scs.cs.nyu.edu/coral/. At 03:16 AM 2/4/2004, Alexander Löser wrote: >Chord uses a consistent hash function which equally distributes keys to >nodes. For a large number of keys, queries are somewhat loadbalanced. >However, some keys are more popular than other keys, thus some nodes >receive more queries, more messages are routed to those nodes, and more >bandwidth is used. An interesting approach would be to create cache >entries of keys and objects along the lookup paths of successful >queries. > >Does anybody know of experiences with this approach? >Which approaches for query loadbalancing are known to you? > >Any links to papers or working systems are welcome. > >Alex >-- >___________________________________________________________ > > M.Sc., Dipl. Wi.-Inf.
Alexander Löser > Technische Universitaet Berlin Fakultaet IV - CIS > bmb+f-Projekt: "New Economy, Neue Medien in der Bildung" > hp: http://cis.cs.tu-berlin.de/~aloeser/ > office: +49- 30-314-25551 > fax : +49- 30-314-21601 >___________________________________________________________ > > >_______________________________________________ >p2p-hackers mailing list >p2p-hackers@zgp.org >http://zgp.org/mailman/listinfo/p2p-hackers >_______________________________________________ >Here is a web page listing P2P Conferences: >http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From eugen at leitl.org Wed Feb 4 11:26:44 2004 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] is muted horn good for you? Message-ID: <20040204112644.GI13816@leitl.org> Can someone knowledgeable comment on Mute's architecture? http://mute-net.sourceforge.net/index.shtml -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20040204/debc73d7/attachment.pgp From coderman at peertech.org Wed Feb 4 11:34:51 2004 From: coderman at peertech.org (coderman) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] Request for query load balancing approaches for the Chord protocol In-Reply-To: <6.0.1.1.2.20040204031846.01e4feb0@pop.mail.yahoo.com> References: <4020D493.2420AE9C@cs.tu-berlin.de> <6.0.1.1.2.20040204031846.01e4feb0@pop.mail.yahoo.com> Message-ID: <4020D8DB.4000102@peertech.org> Brad Neuberg wrote: > Check out Coral at http://www.scs.cs.nyu.edu/coral/.
From: http://www.scs.cs.nyu.edu/coral/download.html " Currently, only R/W access to Coral is available. Note that getting read/write access requires an account on the SCS file servers, so this route is probably open only to SCS members and affiliates." Any way to get a snapshot (nightly tarballs?) or read access? This sounds interesting but I'd like to dig a little deeper... Regards, From mllist at vaste.mine.nu Wed Feb 4 13:19:23 2004 From: mllist at vaste.mine.nu (Johan =?iso-8859-1?Q?F=E4nge?=) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <40206B7A.5070109@kingprimate.com> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> <40206B7A.5070109@kingprimate.com> Message-ID: <13027.81.227.49.57.1075900763.squirrel@Vaste_lp3.wired> Jeremiah Rogers wrote: > Johan Fänge wrote: > > > To fully compare two hashed pieces of data on two peers, one > > would need to transfer all of the data, correct? It is also > > not possible to detect "possible" collisions, without > > increasing the standard transfer rate. (E.g. the output of > > the hash-function.) > > You could compare smaller segments of the data. The likelihood that both > halves of two messages, as well as the entire messages, all have > matching hashes would be extremely small (on the order of 1 in 2^(3n), > though I'm not sure). I guess it really depends on how sure you want to be that > the messages aren't the same. Unless you already compare smaller segments this _is_ increasing the standard transfer rate. In filesharing this is sometimes already done to get finer grained hashes, so there it is not a problem. > > > Once a collision is detected it's not a hard problem to do > > something about it. In a DHT, simply note on the node storing > > the hash with collisions that there are two versions, and the > > necessary data to distinguish them.
After deciding which piece > > of data it is, then push one or both pieces of data onto some > > other node (by adding some data, e.g. "1", to the data hashed, > > and repeat until there's no longer a collision). > > > > I believe that for any serious candidate for an all-encompassing > > namespace, there should be some procedure as to what to do if a > > collision happens. > > Being able to handle a collision without completely failing is > important, but if you're going to have assertions made about database > hashes then you need to have unique hashes to prevent refutable or > misinterpreted statements. I'm not sure I understand what you mean here. The extra information is collected and used only when a collision is detected. It doesn't affect the rest of the system. The extra information means that the peer doing the hash must do a more thorough investigation, and this shouldn't present much of a problem. This extra information may be an additional hash. (data has hash x; data + "1" has hash y) > > There are also zero-knowledge situations where (for example) the DHT > stores merely the hash H, and although both A and B hash to H, you won't > be able to tell that. My earlier question was that if you really are _completely_ unable to detect the error, does it then matter? After all, there is _nothing_ that says something is wrong. It's a fairly hypothetical question. > > There was an interesting solution proposed earlier on this list [1] that > would allow for increased hash length in the future if collisions occurred > while keeping current length (and network usage) lower. It involves > taking a larger hash and shrinking it. Yes, that is one kind of additional information. /Vaste From mllist at vaste.mine.nu Wed Feb 4 13:31:45 2004 From: mllist at vaste.mine.nu (Johan =?iso-8859-1?Q?F=E4nge?=) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] is muted horn good for you?
In-Reply-To: <20040204112644.GI13816@leitl.org> References: <20040204112644.GI13816@leitl.org> Message-ID: <13035.81.227.49.57.1075901505.squirrel@Vaste_lp3.wired> > > Can someone knowledgeable comment on Mute's architecture? > > http://mute-net.sourceforge.net/index.shtml http://mute-net.sourceforge.net/howAnts.shtml Throwing a fairly quick look at this page, I'd say it's awful with regard to scaling. I must say that the ant-stuff is a pretty cool way to explain gnutella, and as far as I can tell that is just what it is, regarding structure. (But so far without any optimizations like supernodes.) It is different though, in that it also transfers data along the overlay (even slower, but better anonymity). /Vaste From justin at chapweske.com Wed Feb 4 15:06:26 2004 From: justin at chapweske.com (Justin Chapweske) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <13027.81.227.49.57.1075900763.squirrel@Vaste_lp3.wired> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> <40206B7A.5070109@kingprimate.com> <13027.81.227.49.57.1075900763.squirrel@Vaste_lp3.wired> Message-ID: <1075907185.3607.3259.camel@bog> Practically speaking, increasing the size of the hash is not a bandwidth issue. It is a CPU issue. So truncating a larger hash simply provides less useful information at the cost of higher CPU utilization. The rule of thumb that I go by is that your content verification should take no more than 5% of the CPU when the file is being verified at a rate equal to the maximum expected download rate. This is usually quite easy to attain for broadband connections, but starts getting difficult for high-speed networks - 45 Mbps+.
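Justin's 5% rule of thumb is easy to check on a given machine. The sketch below (illustrative only; the function name and constants are mine, not from the original post) times a hash over a test buffer and derives the largest download rate that could be verified while keeping hashing within the CPU budget:

```python
import hashlib
import time

def max_verifiable_mbps(hash_name: str = "sha1", cpu_budget: float = 0.05,
                        probe_mb: int = 8) -> float:
    """Largest download rate (MB/s) verifiable within the given CPU budget."""
    buf = b"\x00" * (probe_mb * 1024 * 1024)
    h = hashlib.new(hash_name)
    start = time.perf_counter()
    h.update(buf)                     # hash the probe buffer once
    elapsed = time.perf_counter() - start
    throughput = probe_mb / elapsed   # MB/s of pure hashing at 100% CPU
    return cpu_budget * throughput    # what fits inside the budget

if __name__ == "__main__":
    for name in ("sha1", "sha256", "sha512"):
        print(f"{name}: ~{max_verifiable_mbps(name):.0f} MB/s within 5% CPU")
```

On 2004-era hardware, the 45 Mbps+ region Justin mentions is roughly where SHA-1 at a 5% budget starts to pinch; on modern CPUs the numbers come out far higher.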
The nice thing about the approach of comparing the top N values of the hash tree is that, in order to generate the root hash, you have to generate those intermediate hashes anyway, so it causes no additional CPU load to verify as many hash bits as you like. -Justin > > > > There was an interesting solution proposed earlier on this list [1] that > > would allow for increased hash length in the future if collisions occurred > > while keeping current length (and network usage) lower. It involves > > taking a larger hash and shrinking it. > > Yes, that is one kind of additional information. From coderman at peertech.org Wed Feb 4 15:23:18 2004 From: coderman at peertech.org (coderman) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <1075907185.3607.3259.camel@bog> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> <40206B7A.5070109@kingprimate.com> <13027.81.227.49.57.1075900763.squirrel@Vaste_lp3.wired> <1075907185.3607.3259.camel@bog> Message-ID: <40210E66.8020806@peertech.org> Justin Chapweske wrote: >Practically speaking, increasing the size of the hash is not a bandwidth >issue. It is a CPU issue. So truncating a larger hash simply provides >less useful information at the cost of higher CPU utilization. > >The rule of thumb that I go by is that your content verification should >take no more than 5% of the CPU when the file is being verified at a >rate equal to the maximum expected download rate. This is usually quite >easy to attain for broadband connections, but starts getting >difficult for high-speed networks - 45 Mbps+. > > Another factor is bringing large amounts of existing content onto the network. I have a few 160G drives loaded with content, archives, software, etc, and building hash identifiers for that much data takes a while. This is one reason I have been a big fan of VIA's efforts to put crypto features on their processors.
The C5XL shipped with a high quality random number generator and the C5P with two RNGs and AES as well. The next revision will contain SHA on core, which will be directly useful for the types of high volume hashing needed for large archives or high bandwidth links. My hope is that the market reacts positively to these features, and other chip makers start to follow suit. [ Info about the C5XL rng: http://peertech.org/hardware/viarng/ developer guide: /www.via.com.tw/en/images/Products/eden/pdf/PadLock_RNG_prog_guide.pdf ] / From agthorr at barsoom.org Wed Feb 4 16:23:07 2004 From: agthorr at barsoom.org (Agthorr) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] Distributed Hashtables - Low Latency/Low Membership Overhead In-Reply-To: <6.0.1.1.2.20040203212724.01e31b38@pop.mail.yahoo.com> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> <40206B7A.5070109@kingprimate.com> <6.0.1.1.2.20040203212724.01e31b38@pop.mail.yahoo.com> Message-ID: <20040204162306.GB22096@barsoom.org> On Tue, Feb 03, 2004 at 09:31:54PM -0800, Brad Neuberg wrote: > The membership/maintenance protocol should have low-overhead > (i.e. the background traffic for establishing, maintaining, and > breaking membership in the DHT should be a low-percentage of the > total network traffic). Well, how much will your total network traffic be? How many nodes do you expect to have, at most? > Other factors can be "pessimized" to achieve these two goals, such > as maintaining extra large routing tables in memory. I think you have conflicting goals there. You can't have large routing tables *and* low churn overhead.
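The tension Agthorr points out can be made concrete with a back-of-envelope estimate (all constants below are illustrative assumptions, not numbers from the thread): with roughly 30-minute sessions, every routing-table entry has to be probed several times per expected neighbor lifetime to notice departures, so maintenance traffic grows linearly with table size.

```python
def maintenance_bytes_per_sec(table_size: int, probe_bytes: int = 100,
                              session_secs: float = 30 * 60,
                              probes_per_session: int = 10) -> float:
    """Rough per-node maintenance traffic: probe each routing-table entry
    often enough (here, every ~3 minutes) to notice departed neighbors."""
    probe_interval = session_secs / probes_per_session
    return table_size * probe_bytes / probe_interval

# A Chord-sized table vs. a one-hop "know everyone" table:
small = maintenance_bytes_per_sec(32)      # ~18 B/s -- negligible
large = maintenance_bytes_per_sec(50000)   # ~28 KB/s -- no longer negligible
```

Under these assumptions the overhead is strictly linear in table size, which is why "pessimizing" toward huge routing tables works directly against cheap churn handling.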
-- Agthorr From eugen at leitl.org Wed Feb 4 16:36:26 2004 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] Distributed Hashtables - Low Latency/Low Membership Overhead In-Reply-To: <20040204162306.GB22096@barsoom.org> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> <40206B7A.5070109@kingprimate.com> <6.0.1.1.2.20040203212724.01e31b38@pop.mail.yahoo.com> <20040204162306.GB22096@barsoom.org> Message-ID: <20040204163626.GK24465@leitl.org> On Wed, Feb 04, 2004 at 08:23:07AM -0800, Agthorr wrote: > I think you have conflicting goals there. You can't have large > routing tables *and* low churn overhead. If the node address space is densely populated (see plenty of p2p connection attempts on dynamically assigned DSL addresses) you can just assume it's a noisy high-dimensional grid, and just store diffs. The routing table is local, and gets refreshed when a delivery fails. -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20040204/13ccb57d/attachment.pgp From agthorr at barsoom.org Wed Feb 4 16:54:23 2004 From: agthorr at barsoom.org (Agthorr) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] Request for query load balancing approaches for the Chord protocol In-Reply-To: <4020D493.2420AE9C@cs.tu-berlin.de> References: <4020D493.2420AE9C@cs.tu-berlin.de> Message-ID: <20040204165423.GC22265@barsoom.org> On Wed, Feb 04, 2004 at 12:16:35PM +0100, Alexander Löser wrote: > Chord uses a consistent hash function which equally distributes keys to > nodes.
For a large number of keys, queries are somewhat loadbalanced. > However, some keys are more popular than other keys, thus some nodes > receive more queries, more messages are routed to those nodes, and more > bandwidth is used. An interesting approach would be to create cache > entries of keys and objects along the lookup paths of successful > queries. See the Caching and Load Balance sections in this paper: http://www.pdos.lcs.mit.edu/papers/cfs:sosp01/cfs_sosp.pdf -- Agthorr From zooko at zooko.com Wed Feb 4 17:28:01 2004 From: zooko at zooko.com (Bryce Wilcox-O'Hearn) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: Message from Jeremiah Rogers of "Tue, 03 Feb 2004 22:48:10 EST." <40206B7A.5070109@kingprimate.com> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> <40206B7A.5070109@kingprimate.com> Message-ID: Jeremiah Rogers wrote: > > There was an interesting solution proposed earlier on this list [1] that > would allow for increased hash length in the future if collisions occurred > while keeping current length (and network usage) lower. It involves > taking a larger hash and shrinking it. ... > [1] http://zgp.org/pipermail/p2p-hackers/2002-November/000977.html Thank you for saying that my idea was interesting. If you read [1], then you might also want to read the other messages in those threads, such as [2] in which I change my mind and decide to use normal old SHA-1. By the way, the final spec that resulted from that design process is [3], and it is now implemented [4] (modulo an erratum and an optional feature).
Regards, Zooko [2] http://zgp.org/pipermail/p2p-hackers/2002-November/000984.html [3] http://mnet.sourceforge.net/new_filesystem.html [4] http://cvs.sourceforge.net/viewcvs.py/mnet/mnet_new/mnetlib/filesystem/znff.py?view=markup From p2p at kingprimate.com Wed Feb 4 17:51:56 2004 From: p2p at kingprimate.com (Jeremiah Rogers) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <1075907185.3607.3259.camel@bog> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> <40206B7A.5070109@kingprimate.com> <13027.81.227.49.57.1075900763.squirrel@Vaste_lp3.wired> <1075907185.3607.3259.camel@bog> Message-ID: <4021313B.3040703@kingprimate.com> Justin Chapweske wrote: > Practically speaking, increasing the size of the hash is not a bandwidth > issue. It is a CPU issue. So truncating a larger hash simply provides > less useful information at the cost of higher CPU utilization. Sorry, I should have made it more clear that in the network I'm designing in my head (low latency zero-knowledge triple store) hash size has a fairly significant impact on the network. With larger files there is a smaller hash/actual data ratio so CPU usage is probably more important. -jr From b.fallenstein at gmx.de Wed Feb 4 18:02:34 2004 From: b.fallenstein at gmx.de (Benja Fallenstein) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers?
In-Reply-To: <4021313B.3040703@kingprimate.com> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> <40206B7A.5070109@kingprimate.com> <13027.81.227.49.57.1075900763.squirrel@Vaste_lp3.wired> <1075907185.3607.3259.camel@bog> <4021313B.3040703@kingprimate.com> Message-ID: <402133BA.60006@gmx.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I'm not following -- if you truncate a 512 bit hash to 160 bits, you only transfer 160 bits (that's the point of truncation), so where's the additional bandwidth compared to using a 160 bit hash in the first place? - - Benja Jeremiah Rogers wrote: | Justin Chapweske wrote: | |> Practically speaking, increasing the size of the hash is not a bandwidth |> issue. It is a CPU issue. So truncating a larger hash simply provides |> less useful information at the cost of higher CPU utilization. | | | Sorry, I should have made it more clear that in the network I'm | designing in my head (low latency zero-knowledge triple store) hash size | has a fairly significant impact on the network. With larger files there | is a smaller hash/actual data ratio so CPU usage is probably more | important. | | -jr | _______________________________________________ | p2p-hackers mailing list | p2p-hackers@zgp.org | http://zgp.org/mailman/listinfo/p2p-hackers | _______________________________________________ | Here is a web page listing P2P Conferences: | http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences | | -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFAITO5UvR5J6wSKPMRAlxAAJ9Nc4l0g8Puopq+FSR5LNNxcpCiAwCg0WY2 VLtgckFoCJzrG+kKfrWsd7w= =fuKR -----END PGP SIGNATURE----- From p2p at kingprimate.com Wed Feb 4 18:09:17 2004 From: p2p at kingprimate.com (Jeremiah Rogers) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers?
In-Reply-To: <13027.81.227.49.57.1075900763.squirrel@Vaste_lp3.wired> References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> <40206B7A.5070109@kingprimate.com> <13027.81.227.49.57.1075900763.squirrel@Vaste_lp3.wired> Message-ID: <4021354D.9090009@kingprimate.com> Johan Fänge wrote: >> > To fully compare two hashed pieces of data on two peers, one >> > would need to transfer all of the data, correct? It is also >> > not possible to detect "possible" collisions, without >> > increasing the standard transfer rate. (E.g. the output of >> > the hash-function.) >> >>You could compare smaller segments of the data. The likelihood that both >> halves of two messages, as well as the entire messages, all have >>matching hashes would be extremely small (on the order of 1 in 2^(3n), >>though I'm not sure). I guess it really depends on how sure you want to be that >>the messages aren't the same. > > Unless you already compare smaller segments this _is_ increasing the > standard transfer rate. In filesharing this is sometimes already done to > get finer grained hashes, so there it is not a problem. Sorry, I was answering your first question about how to compare files during suspected collisions. If we collect additional information upfront the chances of a collision drop from close to never to closer to never, which (you're right) doesn't really justify increasing the standard transfer rate by too far. >> > Once a collision is detected it's not a hard problem to do >> > something about it. In a DHT, simply note on the node storing >> > the hash with collisions that there are two versions, and the >> > necessary data to distinguish them. After deciding which piece >> > of data it is, then push one or both pieces of data onto some >> > other node (by adding some data, e.g. "1", to the data hashed, >> > and repeat until there's no longer a collision).
>> > >> > I believe that for any serious candidate for an all-encompassing >> > namespace, there should be some procedure as to what to do if a >> > collision happens. >> >>Being able to handle a collision without completely failing is >>important, but if you're going to have assertions made about database >>hashes then you need to have unique hashes to prevent refutable or >>misinterpreted statements. > > > I'm not sure I understand what you mean here. The extra information is > collected and used only when a collision is detected. It doesn't affect > the rest of the system. The extra information means that the peer doing > the hash must do a more thorough investigation, and this shouldn't present > much of a problem. This extra information may be an additional hash. (data > has hash x; data + "1" has hash y) Sorry, see comment above. I think I'm getting elements of the network I want to design confused with the (filesharing?) networks others seem to be talking about. Still, if you want to make (cryptographically signed) assertions about given hashes within the network you can't easily handle collisions without some weird protocols for re-establishing non-refutable signatures. But since the collision will never happen (*crosses fingers*) these protocols could indeed be strange and slow. >>There are also zero-knowledge situations where (for example) the DHT >>stores merely the hash H, and although both A and B hash to H, you won't >>be able to tell that. > > > My earlier question was that if you really are _completely_ unable to > detect the error, does it then matter? After all, there is _nothing_ that > says something is wrong. It's a fairly hypothetical question. If you can't detect it then it probably doesn't matter. It could lead to strange occurrences though, and if it happens that two very popular things hash to the same value the network may fall apart. So the important question for choosing hash length is what the problems are when you have a collision.
In a filesharing network it's a pain in the ass, but there are copies of the plaintexts available to compare; in a zero-knowledge network a collision probably goes undetected and can severely screw up the network.

-jr

From p2p at kingprimate.com  Wed Feb  4 18:26:15 2004
From: p2p at kingprimate.com (Jeremiah Rogers)
Date: Sat Dec 9 22:12:37 2006
Subject: [p2p-hackers] References for using hashes as unique identifiers?
In-Reply-To: <402133BA.60006@gmx.de>
References: <200402031908.i13J8VT10926@finney.org> <12327.81.227.49.57.1075848623.squirrel@Vaste_lp3.wired> <40206B7A.5070109@kingprimate.com> <13027.81.227.49.57.1075900763.squirrel@Vaste_lp3.wired> <1075907185.3607.3259.camel@bog> <4021313B.3040703@kingprimate.com> <402133BA.60006@gmx.de>
Message-ID: <40213947.8090908@kingprimate.com>

What I'm saying is that transferring a larger hash in this network is a bandwidth problem -- with this network design basically all you transfer is hashes (since it's zk). But if you want collision resistance for the future by being able to up the DHT-key hash length later, you could compute the 512 bit hash and then only use its 160-bit subset as the DHT-key. Later, if the network gets too close to collisions, you could use a larger subset as the key while not having to throw away all 160 bit keys.

Benja Fallenstein wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> I'm not following -- if you truncate a 512 bit hash to 160 bits, you
> only transfer 160 bits (that's the point of truncation), so where's the
> additional bandwidth compared to using a 160 bit hash in the first place?
>
> - - Benja
>
> Jeremiah Rogers wrote:
> | Justin Chapweske wrote:
> |
> |> Practically speaking, increasing the size of the hash is not a bandwidth
> |> issue. It is a CPU issue. So truncating a larger hash simply provides
> |> less useful information at the cost of higher CPU utilization.
> |
> | Sorry, I should have made it more clear that in the network I'm
> | designing in my head (low latency zero-knowledge triple store) hash size
> | has a fairly significant impact on the network. With larger files there
> | is a smaller hash/actual data ratio so CPU usage is probably more
> | important.
> |
> | -jr
> | _______________________________________________
> | p2p-hackers mailing list
> | p2p-hackers@zgp.org
> | http://zgp.org/mailman/listinfo/p2p-hackers
> | _______________________________________________
> | Here is a web page listing P2P Conferences:
> | http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
> |
> |
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.4 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
>
> iD8DBQFAITO5UvR5J6wSKPMRAlxAAJ9Nc4l0g8Puopq+FSR5LNNxcpCiAwCg0WY2
> VLtgckFoCJzrG+kKfrWsd7w=
> =fuKR
> -----END PGP SIGNATURE-----
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers@zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
>

From Paul.Harrison at infotech.monash.edu.au  Thu Feb  5 04:20:52 2004
From: Paul.Harrison at infotech.monash.edu.au (Paul Harrison)
Date: Sat Dec 9 22:12:37 2006
Subject: [p2p-hackers] Request for query load balancing approaches for the Chord protocoll
In-Reply-To: <4020D493.2420AE9C@cs.tu-berlin.de>
Message-ID:

On Wed, 4 Feb 2004, Alexander Löser wrote:

> Chord uses a consistent hash function which equally distributes keys to
> nodes. For a large number of keys, queries are somewhat load balanced.
> However, some keys are more popular than other keys, thus some nodes
> receive more queries, more messages are routed to those nodes, more
> bandwidth is used.
> An interesting approach would be to create cache
> entries of keys and objects along the lookup paths of successful
> queries.
>
> Does anybody know of experiences with this approach?
> Which approaches for query load balancing are known to you?

With a little work, you can probabilistically load balance common keys in a DHT. This is how Circle does it (Circle is a chord implementation, thecircle.org.au):

Each key in Circle consists of a 16 byte MD5 and an additional 4 random bytes. Thus to do a look-up you are not looking for a key so much as a very short span of the key space, i.e. [md5]00000000 through to [md5]ffffffff. This is no harder than a single-key look-up. Actual entries for a particular md5 will be scattered evenly along this span. This gives you the potential to split single keys across multiple nodes.

To take advantage of this, Circle chooses as node id some point in the middle of the span of one of the keys it published last session. So if a key is common, it is likely that several nodes will choose node ids in the middle of its span, thus splitting it across those nodes.

For each look-up you pick a random key within the span you want and work up and down the hashtable from there until you have enough results. This means each node within that span gets its fair share of the load.

Unlike caching schemes, this is fully scalable. A caching scheme still requires that at least one node knows of all the instances of a particular key.

Once you have this you can safely start adding entries for general concepts into your DHT, which is quite useful: "I am a person, you can talk to me", "I can cache data", "I'm willing to do distributed computing stuff", "I am part of some other specialized P2P network", etc.
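[Editor's note: the span-based scheme described above is easy to sketch. The following is a rough Python illustration of the idea only, not Circle's actual code: a published key is the 16-byte MD5 of the name plus 4 random bytes, and a lookup targets the whole [md5]00000000..[md5]ffffffff span.]

```python
import hashlib
import os

def publish_key(name):
    # Published key: the 16-byte MD5 of the name plus 4 random bytes,
    # so entries for the same name scatter evenly along a short span.
    return hashlib.md5(name.encode()).digest() + os.urandom(4)

def lookup_span(name):
    # A lookup asks for the whole span [md5]00000000 .. [md5]ffffffff
    # rather than a single point in the key space.
    md5 = hashlib.md5(name.encode()).digest()
    return md5 + b"\x00" * 4, md5 + b"\xff" * 4

lo, hi = lookup_span("britney")
# Every key ever published for "britney" falls inside the lookup span,
# whatever random trailing bytes each publisher chose.
assert all(lo <= publish_key("britney") <= hi for _ in range(100))
```

A node that wants to help carry a hot key would then pick its node id somewhere inside such a span, exactly as described above.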
cheers, Paul pfh@logarithmic.net | http://www.logarithmic.net/pfh Current cost to save one life: AU$300 / US$200 www.unicef.org www.oxfam.org From coderman at peertech.org Thu Feb 5 03:55:54 2004 From: coderman at peertech.org (coderman) Date: Sat Dec 9 22:12:37 2006 Subject: [p2p-hackers] Request for query load balancing approaches for the Chord protocoll In-Reply-To: References: Message-ID: <4021BECA.3040005@peertech.org> Paul Harrison wrote: >... >For each look-up you pick a random key within the span you want and work >up and down the hashtable from there until you have enough results. This >means each node within that span gets its fair share of the load. > >Unlike caching schemes, this is fully scalable. A caching scheme still >requires that at least one node knows of all the instances of a particular >key. > Maybe we are thinking of different kinds of loaded keys, but querying multiple nodes for a given entry would be worse than just querying the right node directly the first time. Very loaded keys (the search for the "britney" key as the canonical example) will overwhelm any individual node. It would be required to distribute the load across multiple nodes _but_ require only one hit to match. Traversing to multiple nodes in this situation only compounds the problem. Caching helps with the "only one hit required" portion of this requirement, but does suffer from the cache coherence problem you mention: you pay for management overhead associated with the caching. I'm not sure how this particular problem could be resolved in a DHT, although there are certainly techniques to minimize this as much as possible. 
From gcarreno at gcarreno.org  Thu Feb  5 04:31:54 2004
From: gcarreno at gcarreno.org (Gustavo Carreno)
Date: Sat Dec 9 22:12:37 2006
Subject: [p2p-hackers] Request for query load balancing approaches for the Chord protocoll
In-Reply-To: <4021BECA.3040005@peertech.org>
References: <4021BECA.3040005@peertech.org>
Message-ID: <82109911694.20040205043154@gcarreno.org>

Hello coderman,

Thursday, February 5, 2004, 3:55:54 AM, you wrote:

c> Paul Harrison wrote:
>>Unlike caching schemes, this is fully scalable. A caching scheme still
>>requires that at least one node knows of all the instances of a particular
>>key.
>>
c> Maybe we are thinking of different kinds of loaded keys, but querying
c> multiple nodes for a given entry would be worse than just querying the
c> right node directly the first time.

I'm just a bystander with not much knowledge on DHTs nor on advanced P2P schemes, but if someone tells me that Circle is doing a similar search as the Fast Sort implementation and the cached scheme is more like a bubble sort kind, I'm picking the Fast Sort implementation.

From the little that I've understood till now of any chord implementation, it's quite "node dropping" resistant, and a cached scheme will have the need for _THAT_ node to exist or to pass a _HUGE_ amount of data to the next node just before getting out, assuming that it's not getting bumped due to a network outage.

So, "node dropping" resistance and a similarity to Fast Search, splitting the search "universe" in halves: is it not better than one central cache?

Please keep in mind that I'm the least knowledgeable person in matters of hashing, P2P and the likes, so if I'm just tripping on my tongue, you have the obligation to correct me and slap me on the face :)

Gustavo Carreno
-=[ "When you know Slackware you know Linux.
When you know Red Hat, all you know is Red Hat" ]=-

From Paul.Harrison at infotech.monash.edu.au  Thu Feb  5 13:57:50 2004
From: Paul.Harrison at infotech.monash.edu.au (Paul Harrison)
Date: Sat Dec 9 22:12:37 2006
Subject: [p2p-hackers] Request for query load balancing approaches for the Chord protocoll
In-Reply-To: <4021BECA.3040005@peertech.org>
Message-ID:

On Wed, 4 Feb 2004, coderman wrote:

> Paul Harrison wrote:
>
> >...
> >For each look-up you pick a random key within the span you want and work
> >up and down the hashtable from there until you have enough results. This
> >means each node within that span gets its fair share of the load.

Gustavo, you compared this to bubble sort... reading it over, it is kind of unclear, I'm going to have to do a little braindump before I explain.

In a chord, each node has a random node id. Each node is responsible for the chunk of hashtable between that node id and the smallest node id greater than its node id on the network. So the hash-table is split across all the nodes in the network.

I'm assuming that each node knows how to contact its "neighbours". That is, the node with the largest node id less than its own node id, and the node with the smallest node id larger than its own node id. Nodes in a chord need to know their neighbours in order to maintain the network's structure.

With these links, it doesn't cost much to traverse the nodes in order. So once you've found one node that lies in the span you're looking for, it's cheap to explore the whole span by walking up and down it (like a linked list).

> >
> >Unlike caching schemes, this is fully scalable. A caching scheme still
> >requires that at least one node knows of all the instances of a particular
> >key.
> >
> Maybe we are thinking of different kinds of loaded keys, but querying
> multiple nodes for a given entry would be worse than just querying the
> right node directly the first time.
>

What I had in mind is a very common key, maybe even one that nearly everyone in the network publishes. So it's impractical for a single node to store every instance of it. But also, when you go to look it up, you might only need a few hundred hits to find what you wanted. If you're looking for Britney's latest hit, any one of the 1 billion available copies will do.

The fundamental concept is that a key need not be a single point. It can be made into a (very small) segment which is divisible into pieces. So there's no longer a single right node.

With the scheme I have in mind, you would hit one randomly chosen point in the span you want. For all but the most common keys a single node would be responsible for the whole span and things work exactly as in a normal chord. For common keys, it would still almost always give you the results you need, but if you want more you have the option of walking up and down the table to get the rest.

> Very loaded keys (the search for the "britney" key as the canonical
> example) will overwhelm any individual node. It would be required to
> distribute the load across multiple nodes _but_ require only one hit
> to match. Traversing to multiple nodes in this situation only
> compounds the problem.
>

I think we do have the same thing in mind.

> Caching helps with the "only one hit required" portion of this
> requirement, but does suffer from the cache coherence problem you
> mention: you pay for management overhead associated with the caching.
>
> I'm not sure how this particular problem could be resolved in a DHT,
> although there are certainly techniques to minimize this as much as
> possible.

You need some sort of redundancy in a DHT even without common keys; nodes can fail at any time. Publishing two versions of each key would be a start (maybe the md5 and the md5 with the top bit flipped).
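[Editor's note: the closing suggestion, publishing each entry under the md5 and under the md5 with its top bit flipped, amounts to placing two replicas half the key space apart on nodes that fail independently. A minimal sketch of that idea:]

```python
import hashlib

def replica_keys(name):
    # Two DHT keys for the same entry: the MD5 of the name, and the
    # same MD5 with its top bit flipped. The second key lands half the
    # key space away from the first, on an unrelated node.
    md5 = hashlib.md5(name.encode()).digest()
    flipped = bytes([md5[0] ^ 0x80]) + md5[1:]
    return md5, flipped

primary, backup = replica_keys("britney")
# The two keys agree everywhere except the single flipped bit.
assert primary[1:] == backup[1:]
assert primary[0] ^ backup[0] == 0x80
```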
cheers,
Paul

pfh@logarithmic.net | http://www.logarithmic.net/pfh
Current cost to save one life: AU$300 / US$200
www.unicef.org www.oxfam.org

From gcarreno at gcarreno.org  Thu Feb  5 16:55:46 2004
From: gcarreno at gcarreno.org (Gustavo Carreno)
Date: Sat Dec 9 22:12:37 2006
Subject: [p2p-hackers] Request for query load balancing approaches for the Chord protocoll
In-Reply-To:
References:
Message-ID: <24154544463.20040205165546@gcarreno.org>

Hello Paul,

Thursday, February 5, 2004, 1:57:50 PM, you wrote:

PH> Gustavo, you compared this to bubble sort... reading it over, it is kind
PH> of unclear, I'm going to have to do a little braindump before I explain.

Well, I did say that I'm completely ignorant on the matter, but from the reading on this thread that I've done, that was the feeling that I had:

- Chord could be compared to a FastSort.
- The cached system could be compared with a BubbleSort.

Of course this is a VERY simplistic way of looking at it.

Gustavo Carreno
-=[ "When you know Slackware you know Linux.
When you know Red Hat, all you know is Red Hat" ]=-

From nl at essential.com.au  Fri Feb  6 04:40:59 2004
From: nl at essential.com.au (Nick Lothian)
Date: Sat Dec 9 22:12:37 2006
Subject: [p2p-hackers] Re: P2P journal copyright
Message-ID:

> > Sam Joseph, P2P journal wrote:
> >We are trying to adjust our copyright policy to make everyone happy.
> >We are in the process of trying to improve the wording of the policy, and
> >I don't understand why you don't want to help us.
> >In fact here is the latest version of the copyright policy
> >"P2PJ wants to find a balance between author's copyright and ensuring
> >articles in P2PJ are unique. P2PJ prefers original, quality articles.
> >Quality is more important than quantity.
> >
> >By submitting the article to P2PJ, the author grants P2PJ and its
> >affiliated organizations and entities, perpetual, royalty-free, worldwide
> >licenses for both electronic and print formats.
> >Authors are granted rights
> >to reproduce their articles provided that they include a prominent
> >statement: 'The piece originally appeared in P2P Journal'. A clearly
> >visible link to http://p2pjournal.com must also be made.
>

A couple of interesting references on journal copyright (more directly related to problems with Elsevier publishing, but relevant nonetheless):

1) Donald Knuth: http://www-cs-faculty.stanford.edu/~knuth/joalet.pdf
2) Scientific Publishing: A Mathematician's Viewpoint: http://www.ams.org/notices/200007/forum-birman.pdf

Nick

From coderman at peertech.org  Fri Feb  6 06:08:53 2004
From: coderman at peertech.org (coderman)
Date: Sat Dec 9 22:12:37 2006
Subject: [p2p-hackers] Request for query load balancing approaches for the Chord protocoll
In-Reply-To:
References:
Message-ID: <40232F75.6070504@peertech.org>

Paul Harrison wrote:

>What I had in mind is a very common key, maybe even one that nearly
>everyone in the network publishes.
...
>
>The fundamental concept is that a key need not be a single point. It can
>be made into a (very small) segment which is divisible into pieces. So
>there's no longer a single right node.
...
>
>I think we do have the same thing in mind.
>

Yes, this handles higher loads nicely. And it brings up another good question: how do the various DHTs handle malicious nodes trying to "spam" or flood a given key?

The easier it is for various nodes to assist or join in caching or handling a specific section of the keyspace, the easier it is to spam or flood popular keys with potentially bogus information. If you are addressing content that is self-certifying this is somewhat avoided, but for other types of lookup this would be problematic.

Any ideas?
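[Editor's note: the "self certifying" escape hatch coderman mentions is worth spelling out. When the key is the hash of the content itself, a flooder cannot plant bogus data under someone else's key, because the check below fails on retrieval. A sketch, assuming SHA-1 content addressing:]

```python
import hashlib

def is_self_certifying(key, data):
    # A fetched block is only accepted if it hashes back to the key it
    # was stored under; spam stored under a popular key fails this test.
    return hashlib.sha1(data).digest() == key

data = b"some published content"
key = hashlib.sha1(data).digest()
assert is_self_certifying(key, data)
assert not is_self_certifying(key, b"bogus flood data")
```

For keyword-style lookups, where the key is not derived from the value, no such check exists, which is exactly the problematic case coderman points at.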
From jdd at dixons.org  Fri Feb  6 14:44:58 2004
From: jdd at dixons.org (Jim Dixon)
Date: Sat Dec 9 22:12:37 2006
Subject: [p2p-hackers] Request for query load balancing approaches for the Chord protocoll
In-Reply-To: <40232F75.6070504@peertech.org>
Message-ID: <20040206144123.F94769-100000@localhost>

On Thu, 5 Feb 2004, coderman wrote:

> How do the various DHT's handle malicious nodes trying to "spam" or flood
> a given key?

Most or all of the DHTs have well-defined user communities in which the spammer is readily identifiable. They can then be dealt with through administrative processes.

Typically the user has a digital certificate of some sort.

> The easier it is for various nodes to assist or join in caching or
> handling a specific section of the keyspace, the easier it is to spam
> or flood popular keys with potentially bogus information. If you are
> addressing content that is self certifying this is somewhat avoided,
> but for other types of lookup this would be problematic.

--
Jim Dixon  jdd@dixons.org  tel +44 117 982 0786  mobile +44 797 373 7881
http://jxcl.sourceforge.net       Java unit test coverage
http://xlattice.sourceforge.net   p2p communications infrastructure

From aloeser at cs.tu-berlin.de  Fri Feb  6 15:56:49 2004
From: aloeser at cs.tu-berlin.de (Alexander =?iso-8859-1?Q?L=F6ser?=)
Date: Sat Dec 9 22:12:37 2006
Subject: [p2p-hackers] Request for query load balancing approaches for the Chord protocoll
References: <20040206144123.F94769-100000@localhost>
Message-ID: <4023B941.2C7D1C7D@cs.tu-berlin.de>

Hi all,

thank you so far for your feedback on query load balancing strategies in CHORD. I'd like to point you to an additional issue in this area: what kind of load balancing strategies exist for conjunctive queries in distributed hash tables?
Consider the following two terms hashed in the Chord index:

$AB1234=SHA-1("Style:Bossa Nova")
$DF2388=SHA-1("Composer:Gilberto")

$AB1234 value: URI http://"Girl from Ipanema"
$DF2388 value: URI http://"Girl from Ipanema"

and a lookup for songs from Gilberto in Bossa Nova Style ->

Lookup ($DF2388 AND $AB1234)

I know only one approach from Datta/Aberer, where popular query combinations are indexed temporarily in the Chord table with a new key. Does anybody know other approaches?

Alex

Jim Dixon wrote:

> On Thu, 5 Feb 2004, coderman wrote:
>
> > How do the various DHT's handle malicious nodes trying to "spam" or flood
> > a given key?
>
> Most or all of the DHTs have well-defined user communities in which the
> spammer is readily identifiable. They can then be dealt with through
> administrative processes.
>
> Typically the user has a digital certificate of some sort.
>
> > The easier it is for various nodes to assist or join in caching or
> > handling a specific section of the keyspace, the easier it is to spam
> > or flood popular keys with potentially bogus information. If you are
> > addressing content that is self certifying this is somewhat avoided,
> > but for other types of lookup this would be problematic.
>
> --
> Jim Dixon jdd@dixons.org tel +44 117 982 0786 mobile +44 797 373 7881
> http://jxcl.sourceforge.net Java unit test coverage
> http://xlattice.sourceforge.net p2p communications infrastructure
>
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers@zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences

--
___________________________________________________________
M.Sc., Dipl. Wi.-Inf.
Alexander Löser
Technische Universitaet Berlin
Fakultaet IV - CIS
bmb+f-Projekt: "New Economy, Neue Medien in der Bildung"
hp: http://cis.cs.tu-berlin.de/~aloeser/
office: +49-30-314-25551
fax: +49-30-314-21601
___________________________________________________________

From anwitaman at hotmail.com  Mon Feb  9 11:30:46 2004
From: anwitaman at hotmail.com (Anwitaman Datta)
Date: Sat Dec 9 22:12:37 2006
Subject: [p2p-hackers] load-balancing/query-adaptive indexing/DHTs
Message-ID:

Hi Alex (and all),

Just to clarify one detail: our work was evaluated not for Chord, but for p-grid (www.p-grid.org); however, most of the ideas ought to be applicable to a wide variety of DHTs including Chord. Here are the two papers which Alex must be referring to, where we tackle some of the issues.

http://www.p-grid.org/Papers/TR-IC-2003-32.pdf
http://www.p-grid.org/Papers/TR-IC-2003-69.pdf

rgds,
A.

Message: 4
Date: Fri, 06 Feb 2004 16:56:49 +0100
From: Alexander Löser
Subject: Re: [p2p-hackers] Request for query load balancing approaches for the Chord protocoll
To: "Peer-to-peer development."
Message-ID: <4023B941.2C7D1C7D@cs.tu-berlin.de>
Content-Type: text/plain; charset=iso-8859-1

Hi all,

thank you so far for your feedback on query load balancing strategies in CHORD. I'd like to point you to an additional issue in this area: what kind of load balancing strategies exist for conjunctive queries in distributed hash tables?

Consider the following two terms hashed in the Chord index:

$AB1234=SHA-1("Style:Bossa Nova")
$DF2388=SHA-1("Composer:Gilberto")

$AB1234 value: URI http://"Girl from Ipanema"
$DF2388 value: URI http://"Girl from Ipanema"

and a lookup for songs from Gilberto in Bossa Nova Style ->

Lookup ($DF2388 AND $AB1234)

I know only one approach from Datta/Aberer, where popular query combinations are indexed temporarily in the Chord table with a new key. Does anybody know other approaches?
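[Editor's note: one way to read the Datta/Aberer idea cited above is that a popular conjunction gets its own derived index key. A hypothetical Python sketch; the sorting step, which makes the key independent of the order in which the terms are queried, is an assumption of this illustration, not something taken from the paper:]

```python
import hashlib

def term_key(term):
    # Ordinary single-term Chord key, e.g. SHA-1("Style:Bossa Nova").
    return hashlib.sha1(term.encode()).hexdigest()

def conjunction_key(*terms):
    # Temporary index key for a popular AND-combination: hash the sorted
    # single-term keys so that ($A AND $B) and ($B AND $A) collide.
    combined = "".join(sorted(term_key(t) for t in terms))
    return hashlib.sha1(combined.encode()).hexdigest()

# Both orderings of the Bossa Nova / Gilberto lookup resolve to one key:
k1 = conjunction_key("Style:Bossa Nova", "Composer:Gilberto")
k2 = conjunction_key("Composer:Gilberto", "Style:Bossa Nova")
assert k1 == k2
```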
Alex

From lgonze at panix.com  Wed Feb 11 18:38:54 2004
From: lgonze at panix.com (Lucas Gonze)
Date: Sat Dec 9 22:12:38 2006
Subject: [p2p-hackers] IFL
Message-ID: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com>

Dear Zooko,

The Google "I feel lucky" button is a counter-example to your conjecture that names cannot be all three of secure, decentralized and memorable.

Because "I feel lucky" is very accurate about the target object (for some names), it is secure (for some names). In real world usage of a web browser, the name "Dave Winer" most probably does identify the web site "http://scripting.com", and "I feel lucky" gets the answer to this question right.

Because "I feel lucky" does not require a centralized registry, it is decentralized. The centralized entity Google decides on name assertions like "Dave Winer"->http://scripting.com based on decentralized information sources. If other search engines also operated "I feel lucky" services, they would also probably come to the same conclusion about this name, so both the information sources and name resolution services are decentralized.

Because "I feel lucky" allows short bit strings to be used as names, it is memorable.

In this counter-example, I have added one concept to the three in your original essay; the idea of limiting which memorable names can be resolved in a way that is secure and decentralized. I hope you will agree that this is within the bounds of the problem.
best, sincerely, yours truly, etc,
Lucas

From sam at neurogrid.com  Wed Feb 11 20:57:40 2004
From: sam at neurogrid.com (Sam Joseph)
Date: Sat Dec 9 22:12:38 2006
Subject: [p2p-hackers] P2P workshops in Hawaii
Message-ID: <402A9744.3090005@neurogrid.com>

Hi All,

Pardon the cross-posting, but I just found some new P2P workshops in Hawaii with abstract deadlines coming up soon.

http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences

CHEERS> SAM

From mccoy at mad-scientist.com  Wed Feb 11 21:57:27 2004
From: mccoy at mad-scientist.com (Jim McCoy)
Date: Sat Dec 9 22:12:38 2006
Subject: [p2p-hackers] IFL
In-Reply-To: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com>
References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com>
Message-ID: <47D50CA7-5CDD-11D8-90BE-000393071F50@mad-scientist.com>

On Feb 11, 2004, at 10:38 AM, Lucas Gonze wrote:
>
> The Google "I feel lucky" button is a counter-example to [Zooko's]
> conjecture that names cannot be all three of secure, decentralized and
> memorable.

I think that your liberal use of "(for some names)" betrays the weakness of this example. Google-bombing is a simple counter-example to your suggestion. I would propose that the "consensus opinion" nature of the google page ranking mechanism weakens any suggestion that it might be secure. It is not secure, but it is decentralized (google is effectively the world's largest online reputation system) and it can support memorable identifier tags.

If I use the google IFL button today for a query it will return a particular result. In a secure system I should be able to state with complete confidence that a similar query made one year hence will return the exact same result (or a result that the current "owner" of that name has delegated to.) Only if I was feeling very, very lucky would I make such a claim. To be secure I need to be able to use a "name" as a reference that I can hand to someone else and know that they will get the same answer.
The top-rank in a google search is a very ephemeral condition and it is subject to change without notice at seemingly random intervals. This hardly qualifies as secure.

Jim McCoy

From lgonze at panix.com  Wed Feb 11 22:39:40 2004
From: lgonze at panix.com (Lucas Gonze)
Date: Sat Dec 9 22:12:38 2006
Subject: [p2p-hackers] IFL
In-Reply-To: <47D50CA7-5CDD-11D8-90BE-000393071F50@mad-scientist.com>
References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> <47D50CA7-5CDD-11D8-90BE-000393071F50@mad-scientist.com>
Message-ID:

On Wed, 11 Feb 2004, Jim McCoy wrote:

> I think that your liberal use of "(for some names)" betrays the
> weakness of this example.

"(for some names)" isn't a crutch, it's the meat of my idea.

> Google-bombing is a simple counter-example to your suggestion. I would
> propose that the "consensus opinion" nature of the google page ranking
> mechanism weakens any suggestion that it might be secure.

When Google-bombing is used to add a name for something, which is the usual situation, it doesn't necessarily destroy an old name, so doesn't affect security.

When it's used to switch a name from one target to another, the original target of the name had to have too little consensus behind it. You would have a really hard time convincing Google, the Internet Archive, Overture, and the MSN search engine that the name "Google" refers to "http://nytimes.com", for example.

The problem isn't with consensus in general, it is with which names you choose to use. This is up to the judgement of the user. If you're not so clever you'll use a bad consensus name like "John", if you are you'll use a good consensus name like "Blosxom."

> If I use the google IFL button today for a query it will return a
> particular result. In a secure system I should be able to state with
> complete confidence that a similar query made one year hence will
> return the exact same result (or a result that the current "owner" of
> that name has delegated to.)
> Only if I was feeling very, very lucky
> would I make such a claim. To be secure I need to be able to use a
> "name" as a reference that I can hand to someone else and know that
> they will get the same answer. The top-rank in a google search is a
> very ephemeral condition and it is subject to change without notice at
> seemingly random intervals. This hardly qualifies as secure.

Ok, so assume that you can only say with high certainty that a consensus name is true for today. Your name is then scoped to a specific date and can be looked up as the Google name for the day you used it.

That said, one-day-only names are not useful in the first place. You just shouldn't use them. But there are a lot of names likely to hold their place for a few years, long enough for a name to be dereferenced and converted to a long number.

What matters is how strong the consensus is. For some names there isn't enough, for others there is.

- Lucas

From bkn3 at columbia.edu  Wed Feb 11 23:21:14 2004
From: bkn3 at columbia.edu (Brad Neuberg)
Date: Sat Dec 9 22:12:38 2006
Subject: [p2p-hackers] IFL
In-Reply-To:
References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> <47D50CA7-5CDD-11D8-90BE-000393071F50@mad-scientist.com>
Message-ID: <6.0.1.1.2.20040211151942.01e63438@pop.mail.yahoo.com>

Remember, though, that names don't just identify web sites. They can also identify email addresses ("joeschmoe@somewhere.com"), instant messaging endpoints, and phone numbers. While your example of using Google to skirt around Zooko's Triangle for identifying web sites and web pages is interesting, it would break down if you are using it to identify messaging endpoints for a particular user or group.

Brad Neuberg
bkn3@columbia.edu

At 02:39 PM 2/11/2004, you wrote:

>On Wed, 11 Feb 2004, Jim McCoy wrote:
> > I think that your liberal use of "(for some names)" betrays the
> > weakness of this example.
>
>"(for some names)" isn't a crutch, it's the meat of my idea.
> > > Google-bombing is a simple counter-example to your suggestion. I would > > propose that the "consensus opinion" nature of the google page ranking > > mechanism weakens any suggestion that it might be secure. > >When Google-bombing is used to add a name for something, which is the >usual situation, it doesn't necessarily destroy an old name, so doesn't >affect security. > >When it's used to switch a name from one target to another, the original >target of the name had to have too little consensus behind it. You would >have a really hard time convincing Google, the Internet Archive, Overture, >and the MSN search engine that the name "Google" refers to >"http://nytimes.com", for example. > >The problem isn't with consensus in general, it is with which names you >choose to use. This is up to the judgement of the user. If you're not so >clever you'll use a bad consensus name like "John", if you are you'll use >a good consensus name like "Blosxom." > > > If I use the google IFL button today for a query it will returns a > > particular result. In a secure system I should be able to state with > > complete confidence that a similar query made in one year hence will > > return the exact same result (or a result that the current "owner" of > > that name has delegated to.) Only if I was feeling very, very lucky > > would I make such a claim. To be secure I need to be able to use a > > "name" as a reference that I can hand to someone else and know that > > they will get the same answer. The top-rank in a google search is a > > very ephemeral condition and it is subject to change without notice at > > seemingly random intervals. This hardly qualifies as secure. > >Ok, so assume that you can only say with high certainty that a consensus >name is true for today. Your name is then scoped to a specific date and >can be looked up as the Google name for the day you used it. > >That said, one-day only names are not useful in the first place. You just >shouldn't use them. 
But there are a lot of names likely to hold their >place for a few years, long enough for a name to be dereferenced and >converted to a long number. > >What matters is how strong the consensus is. For some names there isn't >enough, for others there is. > >- Lucas > > > >_______________________________________________ >p2p-hackers mailing list >p2p-hackers@zgp.org >http://zgp.org/mailman/listinfo/p2p-hackers >_______________________________________________ >Here is a web page listing P2P Conferences: >http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From lgonze at panix.com Thu Feb 12 00:32:47 2004 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: <6.0.1.1.2.20040211151942.01e63438@pop.mail.yahoo.com> References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> <47D50CA7-5CDD-11D8-90BE-000393071F50@mad-scientist.com> <6.0.1.1.2.20040211151942.01e63438@pop.mail.yahoo.com> Message-ID: On Wed, 11 Feb 2004, Brad Neuberg wrote: > Remember, though, that names don't just identify web-sites. They can also > identify email addresses ("joeschmoe@somewhere.com"), instant messaging > endpoints, and phone-numbers. While your example of using Google to skirt > around Zookos Triangle for identifying web sites and web pages is > interesting, it would break down if you are using it to identify messaging > endpoints to a particular user or group. Can you articulate why it would break down, Brad? The thing about "I feel lucky" sorts of names is not so much that they are only useful for web pages, or at least that's not my intention. My thought is that IFL is a simple and concrete counter example of a class of names that acts differently from either ICANN names or self-authenticating names. This class of names will have about the same properties as verbal name systems. They'll be probabilistic. Speakers will have the burden of choosing a less ambiguous name if they want to address within a larger scope. 
The scope of a name will be flexible. The context in which a name is used will affect the likelihood that one object rather than another is the target. Names will change hands over time. "I feel lucky" itself is not exactly a perfect name resolver. Google attempts to answer the question "what web page is this person thinking of." It might also attempt to answer other questions like "what SIP listener?" or "what MX server?" It might learn to take context into consideration, ending up with a hybrid between pet names and consensus names. - Lucas > > Brad Neuberg > bkn3@columbia.edu > > At 02:39 PM 2/11/2004, you wrote: > > >On Wed, 11 Feb 2004, Jim McCoy wrote: > > > I think that your liberal use of "(for some names)" betrays the > > > weakness of this example. > > > >"(for some names)" isn't a crutch, it's meat of my idea. > > > > > Google-bombing is a simple counter-example to your suggestion. I would > > > propose that the "consensus opinion" nature of the google page ranking > > > mechanism weakens any suggestion that it might be secure. > > > >When Google-bombing is used to add a name for something, which is the > >usual situation, it doesn't necessarily destroy an old name, so doesn't > >affect security. > > > >When it's used to switch a name from one target to another, the original > >target of the name had to have too little consensus behind it. You would > >have a really hard time convincing Google, the Internet Archive, Overture, > >and the MSN search engine that the name "Google" refers to > >"http://nytimes.com", for example. > > > >The problem isn't with consensus in general, it is with which names you > >choose to use. This is up to the judgement of the user. If you're not so > >clever you'll use a bad consensus name like "John", if you are you'll use > >a good consensus name like "Blosxom." > > > > > If I use the google IFL button today for a query it will returns a > > > particular result.
In a secure system I should be able to state with > > > complete confidence that a similar query made in one year hence will > > > return the exact same result (or a result that the current "owner" of > > > that name has delegated to.) Only if I was feeling very, very lucky > > > would I make such a claim. To be secure I need to be able to use a > > > "name" as a reference that I can hand to someone else and know that > > > they will get the same answer. The top-rank in a google search is a > > > very ephemeral condition and it is subject to change without notice at > > > seemingly random intervals. This hardly qualifies as secure. > > > >Ok, so assume that you can only say with high certainty that a consensus > >name is true for today. Your name is then scoped to a specific date and > >can be looked up as the Google name for the day you used it. > > > >That said, one-day only names are not useful in the first place. You just > >shouldn't use them. But there are a lot of names likely to hold their > >place for a few years, long enough for a name to be dereferenced and > >converted to a long number. > > > >What matters is how strong the consensus is. For some names there isn't > >enough, for others there is. 
> > > >- Lucas From mccoy at mad-scientist.com Thu Feb 12 01:11:14 2004 From: mccoy at mad-scientist.com (Jim McCoy) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> <47D50CA7-5CDD-11D8-90BE-000393071F50@mad-scientist.com> <6.0.1.1.2.20040211151942.01e63438@pop.mail.yahoo.com> Message-ID: <5A42F906-5CF8-11D8-90BE-000393071F50@mad-scientist.com> I guess I am having a hard time seeing how this differs from a collective "pet names" system (I am sure someone will fill in the ref to markm's original paper) with SDSI-like deferrals except for the fact that it introduces really horrible trust problems. The problem with a (SDSI-like) Google's IFL's "Dave Winer" is that the middle stage in this name chain is not very stable and so it is hard to place any real value on what is returned for the last key in the chain. While it is true that for some names this will end up being a semi-stable reference, for most others it fails to convey any context and leads to ambiguous answers. If we place any importance on this ephemeral link-weighting then it will simply raise the value of attacks upon it. How hard do you really think it would be to get the IFL result for Dave Winer to point to http://www.gonze.com?
Another problem with relying upon this sort of a power law system is that it degrades very quickly and dramatically once you get away from the big connectors. The more keywords/context we need to add to this "name" to disambiguate it, the less useful it becomes compared to using a real pet name. > "I feel lucky" itself is not exactly a perfect name resolver. Google > attempts to answer the question "what web page is this person thinking > of." No, Google is trying to answer the question "what web page has the highest rank in our system when linked to these specific keywords." Don't try to read any sort of AI-like intention into the PageRank algorithm; it just doesn't work that way. Want a simple example? Try this little IFL lookup: http://www.google.com/search?q=pet+names&btnI Jim From list at waterken.net Thu Feb 12 16:24:03 2004 From: list at waterken.net (Tyler Close) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> Message-ID: <200402120824.03646.list@waterken.net> On Wed February 11 2004 10:38 am, Lucas Gonze wrote: > The Google "I feel lucky" button is a counter-example to your > conjecture that names cannot be all three of secure, decentralized and > memorable. So if Google IFLs are used as identifiers, how does Google securely identify the result of an IFL lookup, without losing the decentralized property? Tyler -- The union of REST and capability-based security.
http://www.waterken.com/dev/Web/ From lgonze at panix.com Thu Feb 12 18:10:19 2004 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: <5A42F906-5CF8-11D8-90BE-000393071F50@mad-scientist.com> Message-ID: On Wed, 11 Feb 2004, Jim McCoy wrote: > I guess I am having a hard time seeing how this differs from a > collective "pet names" system (I am sure someone will fill in the ref > to markm's original paper) with SDSI-like deferrals except for the fact > that it introduces really horrible trust problems. The problem with a > (SDSI-like) Google's IFL's "Dave Winer" is that the middle stage in > this name chain is not very stable and so it is hard to place any real > value on what is returned for the last key in the chain. That's an unusually broad way of interpreting pet names. A collective pet names system with SDSI-like deferrals would normally not be considered pet names at all. That said, SDSI is a good point of reference, along with PageRank, the PGP web of trust, and FOAF. The PGP web of trust is a pretty good place to look for RMS' public key, for example, and for most applications I would be happy to address his public key using a URI like pgpwot:"Richard M. Stallman". And this doesn't strike me as a horrible trust problem in any way. > While it is true that for some names this will end up being a > semi-stable reference, for most others it fails to convey any context > and leads to ambiguous answers. Summarizing the conversation at this point, the question is not whether consensus names are decentralized, secure and memorable, but whether they are "complex, limited, and risky." Complexity: PageRank and related algorithms are very well-known ground at this point. Risky: the most stable consensus names might have a lifetime of ten years. (E.g., "ietf" will probably continue to map to "http://ietf.org" for another ten years). The least stable secure hashes might have a lifetime of ten years.
Limited: given that the number of consensus names with acceptable stability is low, how can all the other objects that need names get them? Use members of the decentralized, secure, and memorable namespace as tree roots. Whatever the number of root nodes, the namespace would be less centralized than namespaces administered by ICANN or Verisign. > If we place any importance on this ephemeral link-weighting then it > will simply raise the value of attacks upon it. How hard do you really > think it would be to get the IFL result for Dave Winer to point to > http://www.gonze.com? Pretty hard, and that's assuming mappings as informal as (Dave Winer, http://scripting.com) even need to be used. Google rankings are under heavy attack by well-funded entities every day; despite this, top ranked items like the New York Times have been stable for a long time. ... A meta comment: I wrote my original note around Google "I feel lucky" because it's a simple way of expressing the idea. If you read that in an overly literal way you will miss the point. - Lucas Gonze From lgonze at panix.com Thu Feb 12 18:16:54 2004 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: <200402120824.03646.list@waterken.net> Message-ID: On Thursday, Feb 12, 2004, at 11:24 America/New_York, Tyler Close wrote: > So if Google IFLs are used as identifiers, how does Google > securely identify the result of an IFL lookup, without losing the > decentralized property? You shouldn't use an IFL name that isn't consistent across search engines you trust. Any conscientious effort to use PageRank on the same well known set of crawl results (e.g. a snapshot of the web taken on May 5, 2003) should give the same result. If there is a problem with one name, let that name go. - Lucas From zooko at zooko.com Thu Feb 12 19:36:54 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? 
(was: IFL) In-Reply-To: Message from Lucas Gonze of "Wed, 11 Feb 2004 13:38:54 EST." <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> Message-ID: Wednesday morning I was thinking of writing an essay criticizing a widespread and sloppy use of the word "secure". Wednesday evening I saw, from Lucas Gonze's message to p2p-hackers, that I am eminently guilty of this sloppy usage. Sometimes people ask me "Is it possible to have a secure such-and-such?". The answer is always "It depends on what you mean by 'secure'.". It depends on what policy you wish to have securely enforced in this system. Two different people might ask for a "secure such-and-such", but have different and even mutually incompatible ideas of what policy will be enforced. It isn't surprising that people have different ideas about what is desirable in a system, but it is a little surprising when they assume that security implies their own desiderata. The word "security" is ripe with multiple possibilities. People sloppily use it to mean "the qualities that I desire are still present even if the system is under attack". They forget that other people desire other qualities. My essay "Names: Decentralized, Secure, Human-Meaningful: Choose Two" [1] blunders right into this confusion. In that essay I make a sweeping assertion about the realms of possibility without explaining clearly what I mean by "secure". In my defense, buried down in the fiddly bits at the bottom of the essay is a comment: """ The first step will be to specify what it *means* for the decentralized human-memorizable names to be secure, which is the same as specifying what universal policy should govern ownership of names. (For example, in the case of CHKs, to be secure means that you can't have a collision such that one CHK identifies two different bitstrings which, conveniently enough, is part of the definition of security for cryptographic hashes. 
In the case of SSKs, to be secure means that only the holder of the private key can change the mapping from a given SSK to its object, which is conveniently similar to the traditional notion of security for digital signatures.) """ However, I never defined what I meant by "secure" in general in that essay. I will now attempt to do so. By "a secure naming scheme" I mean that the scheme has referential integrity. A person, Alice, sends a message to another person, Bob. That message contains a name. Bob uses the naming system to de-reference that name, resulting in an object. "Referential integrity" means that nobody can cause the resulting object to be other than what Alice intended. There are a lot of implications of this which I understand only partly at this point. Anyway, I'll stop for now and send out this message. Regards, Zooko [1] http://zooko.com/distnames.html From zooko at zooko.com Thu Feb 12 20:21:20 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? (was: IFL) In-Reply-To: Message from "Zooko O'Whielacronx" of "12 Feb 2004 14:36:54 EST." References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> Message-ID: [following up to my own post] Jack Lloyd wrote to me privately to point out that my definition of "referential integrity" might inadvertently include availability. > "Referential integrity" means that nobody can cause the resulting object to be > other than what Alice intended. I suggest that we follow the tradition of computer security and separate "violation of referential integrity" -- substituting a bogus object in place of the object that Alice meant -- from "denial of service", i.e. preventing Bob from getting any object. Regards, Zooko
(was: IFL) In-Reply-To: Message-ID: <17952616-5D9A-11D8-8D19-000393455590@panix.com> On Thursday, Feb 12, 2004, at 14:36 America/New_York, Zooko O'Whielacronx wrote: > Wednesday morning I was thinking of writing an essay criticizing a > widespread > and sloppy use of the word "secure". Wednesday evening I saw, from > Lucas > Gonze's message to p2p-hackers, that I am eminently guilty of this > sloppy usage. Can you articulate how that relates to my message, Zooko? Thanks. - Lucas From bkn3 at columbia.edu Thu Feb 12 20:29:59 2004 From: bkn3 at columbia.edu (Brad Neuberg) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: <5A42F906-5CF8-11D8-90BE-000393071F50@mad-scientist.com> References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> <47D50CA7-5CDD-11D8-90BE-000393071F50@mad-scientist.com> <6.0.1.1.2.20040211151942.01e63438@pop.mail.yahoo.com> <5A42F906-5CF8-11D8-90BE-000393071F50@mad-scientist.com> Message-ID: <6.0.1.1.2.20040212122926.01e5c4c8@pop.mail.yahoo.com> What I don't understand is that this scheme was offered as a way to resolve Zooko's Triangle. How is this Google-based system secure? Brad From bkn3 at columbia.edu Thu Feb 12 20:31:21 2004 From: bkn3 at columbia.edu (Brad Neuberg) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: <5A42F906-5CF8-11D8-90BE-000393071F50@mad-scientist.com> References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> <47D50CA7-5CDD-11D8-90BE-000393071F50@mad-scientist.com> <6.0.1.1.2.20040211151942.01e63438@pop.mail.yahoo.com> <5A42F906-5CF8-11D8-90BE-000393071F50@mad-scientist.com> Message-ID: <6.0.1.1.2.20040212123017.01e80250@pop.mail.yahoo.com> I think this is different than a pet-names system because it is a global name-space.
Pet names suffer from the fact that I can't open a browser on anyone's machine and enter www.cnn.com and be taken to the same place; it depends on what you've labeled with that particular pet name, or what someone else who you trust has labeled. The original Google scheme seems to be a global namespace created from everyone's links. Brad From bkn3 at columbia.edu Thu Feb 12 20:51:37 2004 From: bkn3 at columbia.edu (Brad Neuberg) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] Zooko's Spectrum, not Zooko's Law Message-ID: <6.0.1.1.2.20040212123254.01ea7c78@pop.mail.yahoo.com> Zooko's Spectrum, not Zooko's Law --------------------------------------------------- A problem I have always had with Zooko's Law (decentralization, security, human-memorizable, pick two) is that he doesn't define what he means by those three terms. What does he mean by security? Security to me is a spectrum, from completely open to completely militarily locked down. What does he mean by human-memorizable? That goes all the way from extremely human friendly, such as "Brad Neuberg" to "brad@neuberg.com" to short identifiers like Compuserve used to have such as "234323432@compuserve.com" all the way to 128-bit hashes. That sure looks like a spectrum to me. Decentralization is also itself a spectrum. Systems such as Napster and BitTorrent are hybrid decentralized, while systems such as Gnutella are much more decentralized. Systems are a complex collection of pieces; some pieces can be centralized, while the rest are decentralized, as Napster and BitTorrent have shown. BitTorrent's trackers are relatively centralized, while the content streaming is decentralized. The goal is not to be religious on whether to centralize or decentralize, but to identify what your political, social, and business goals are in order to decentralize the bits that achieve these goals.
I agree that at their extreme, you can't have all three qualities, but that is an extreme statement. If each of these three qualities, decentralization, security, and human-friendly names, is a spectrum, then perhaps we can have all three if we slightly relax them. Call it Zooko's Spectrum, not Zooko's Law. You don't have to throw out all three, you just have to slightly relax one of them. So you can have human-friendly names and security, but you have to slightly relax the degree of decentralization in your system (but not throw it completely out). Or perhaps you can demand extreme decentralization and extreme security without throwing out human-friendliness, but slightly relax the human-friendly part (by having names that are short numerical GUIDs the length of phone-numbers but not the 128-bit GUIDs of FreeNet). The end result is you can have your cake and eat it too, if you decide to use carrot cake instead of flour. Decentralization, Security, Human-Friendly Names: a nuanced spectrum of choices that can't all be had 100% but can mostly be had if you slightly relax one of them. I've also posted this on my blog at http://www.codinginparadise.org if you have a response to this you want to post on your blog. Brad Neuberg bkn3@columbia.edu From jdd at dixons.org Thu Feb 12 20:58:46 2004 From: jdd at dixons.org (Jim Dixon) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? (was: IFL) In-Reply-To: Message-ID: <20040212204923.E29957-100000@localhost> On 12 Feb 2004, Zooko O'Whielacronx wrote: > > "Referential integrity" means that nobody can cause the resulting object to be > > other than what Alice intended. > > I suggest that we follow the tradition of computer security and separate > "violation of referential integrity" -- substituting a bogus object in place of > the object that Alice meant -- from "denial of service", i.e. preventing Bob > from getting any object. "Any object" is a bit strong.
This wording implies that if Bob is prevented from 'getting' _any_ object, you do not have referential integrity. I think that what you mean to say is "referential integrity obtains if a reference can be resolved only in one way", that is, all objects obtained by resolving the reference in any manner are identical. This does not necessarily imply that the reference can be resolved. -- Jim Dixon jdd@dixons.org tel +44 117 982 0786 mobile +44 797 373 7881 http://jxcl.sourceforge.net Java unit test coverage http://xlattice.sourceforge.net p2p communications infrastructure From bkn3 at columbia.edu Thu Feb 12 20:57:00 2004 From: bkn3 at columbia.edu (Brad Neuberg) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] Zooko's Spectrum, not Zooko's Law In-Reply-To: <6.0.1.1.2.20040212123254.01ea7c78@pop.mail.yahoo.com> References: <6.0.1.1.2.20040212123254.01ea7c78@pop.mail.yahoo.com> Message-ID: <6.0.1.1.2.20040212125633.01e5a590@pop.mail.yahoo.com> Posted this and realized that Zooko had described what he meant by security today. :) From lgonze at panix.com Thu Feb 12 21:05:59 2004 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: <6.0.1.1.2.20040212122926.01e5c4c8@pop.mail.yahoo.com> Message-ID: <42359519-5D9F-11D8-8D19-000393455590@panix.com> On Thursday, Feb 12, 2004, at 15:29 America/New_York, Brad Neuberg wrote: > What I don't understand is that this scheme was offered as a way to > resolve Zooko's Triangle. How is this Google-based system secure? You have to say what you mean by secure for me to answer well. Some terminology: there is a name, a target being named, a referrer supplying the name, and a dereferencer consuming the name. The question of whether an attacker can substitute a bogus object when the name is dereferenced depends on which name and which time span. Over a hundred-year time span probably all names have to be qualified by the date of reference. Over a single day there are quite a few names which don't have to be qualified. Whether there are enough names with enough stability over the required lifespan of a name depends on the application. It is trivial to think of applications that these names work for and applications that they don't work for. The question of denial of service is different. This seems unlikely, since all trustworthy resolvers would have to be taken down. (What portion of resolvers? That is quantifiable but would take real research.) Lastly, I want to make a much broader point not related at all to whether my scheme resolves Zooko's triangle.
The interesting stuff here is what the hell names are, how they work, and how we can write programs to support better name systems in the digital world. When I say names I mean real names, the things that we intuitively understand because they evolved along with knowledge and self consciousness. That type of naming is very decentralized, secure enough given endless kludges, and has optimal memorability. All the factors are balanced. To create an optimal name system in the digital world means to study and leverage the ways in which people already use names. I believe this is a doable job in the relatively short term. That is the substance of what I am proposing. - Lucas From zooko at zooko.com Thu Feb 12 21:09:08 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? (was: IFL) In-Reply-To: Message from Jim Dixon of "Thu, 12 Feb 2004 20:58:46 GMT." <20040212204923.E29957-100000@localhost> References: <20040212204923.E29957-100000@localhost> Message-ID: Jim, thank you for your comments. Jim Dixon wrote: > > > > "Referential integrity" means that nobody can cause the resulting object to be > > > other than what Alice intended. > > > > I suggest that we follow the tradition of computer security and separate > > "violation of referential integrity" -- substituting a bogus object in place of > > the object that Alice meant -- from "denial of service", i.e. preventing Bob > > from getting any object. > > "Any object" is a bit strong. This wording implies that if Bob is > prevented from 'getting' _any_ object, you do not have referential > integrity. I'm sorry -- I don't understand the objection. What I meant to say was simply that availability of the name service could be considered separately from correctness of the name service, where by correctness I mean that the resulting object is an object that Alice intended. 
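The CHK case from Zooko's essay excerpt makes this split between correctness and availability concrete: when the name is a cryptographic hash of the content, a substituted object is always detectable, while a withheld object is merely a denial of service. A minimal Python sketch of that check (the data and function names here are illustrative, not from any deployed system):

```python
import hashlib

def chk_name(content: bytes) -> str:
    # A CHK-style name: the name is a cryptographic hash of the content.
    return hashlib.sha256(content).hexdigest()

def dereference(name: str, fetched: bytes) -> bytes:
    # Bob accepts a fetched object only if it matches the name. An attacker
    # can withhold the object (denial of service) but cannot substitute a
    # bogus one without finding a hash collision.
    if hashlib.sha256(fetched).hexdigest() != name:
        raise ValueError("referential integrity violated")
    return fetched

doc = b"hello, p2p-hackers"
name = chk_name(doc)
assert dereference(name, doc) == doc  # the intended object resolves
```

A bogus object raises an error at dereference time, so violation of referential integrity and denial of service stay cleanly separated, as suggested above.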
> I think that what you mean to say is "referential integrity obtains if a > reference can be resolved only in one way", that is, all objects obtained > by resolving the reference in any manner are identical. This does not > necessarily imply that the reference can be resolved. Actually, that's too restrictive! Alice might want the name to resolve to a set of objects, where any one from that set is okay. For example, if a SIP URL resolves to a set of proxies, Bob should use whichever SIP proxy is currently available. However, Bob should *not* use a proxy inserted into the result by someone other than Alice. Also Alice might want the name to denote something that changes over time, so that if Bob resolves it once he gets one object, and if he resolves it again he might get another object. That can be tricky, because then denial-of-service can extend to "rollback attacks" where Bob is denied the new object and thus the name resolves to the old object. However in general a mapping from name to object which is time-variant or varies in other ways, or which is a one-to-many mapping, doesn't violate the principle of referential integrity. Regards, Zooko From bkn3 at columbia.edu Thu Feb 12 21:20:51 2004 From: bkn3 at columbia.edu (Brad Neuberg) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: <42359519-5D9F-11D8-8D19-000393455590@panix.com> References: <6.0.1.1.2.20040212122926.01e5c4c8@pop.mail.yahoo.com> <42359519-5D9F-11D8-8D19-000393455590@panix.com> Message-ID: <6.0.1.1.2.20040212131846.01ec13d0@pop.mail.yahoo.com> > >Lastly, I want to make a much broader point not related at all to whether >my scheme resolves Zooko's triangle. The interesting stuff here is what >the hell names are, how they work, and how we can write programs to >support better name systems in the digital world. When I say names I mean >real names, the things that we intuitively understand because they evolved >along with knowledge and self consciousness.
That type of naming is very >decentralized, secure enough given endless kludges, and has optimal >memorability. All the factors are balanced. > >To create an optimal name system in the digital world means to study and >leverage the ways in which people already use names. I believe this is a >doable job in the relatively short term. That is the substance of what I >am proposing. I think I see where you're going with this. The "real-world" has converged on a system like you've described, and you're saying we can use this as an inspiration to solve these naming issues. Did you get to see the post on Zooko's Spectrum? The real world has solved these naming issues by slightly "relaxing" some of the security constraints (i.e. two people can have the same name) and by slightly relaxing the human friendliness (our names don't really mean anything, compared to some possible names which might be descriptive such as "The guy with red hair"). Brad >- Lucas > >_______________________________________________ >p2p-hackers mailing list >p2p-hackers@zgp.org >http://zgp.org/mailman/listinfo/p2p-hackers >_______________________________________________ >Here is a web page listing P2P Conferences: >http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences From lgonze at panix.com Thu Feb 12 21:32:55 2004 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: <6.0.1.1.2.20040212131846.01ec13d0@pop.mail.yahoo.com> Message-ID: <05425DE5-5DA3-11D8-8D19-000393455590@panix.com> On Thursday, Feb 12, 2004, at 16:20 America/New_York, Brad Neuberg wrote: > I think I see where you're going with this. The "real-world" has > converged on a system like you've described, and you're saying we can > use this as an inspiration to solve these naming issues. Exactly! > Did you get to see the post on Zooko's Spectrum? The real world has > solved these naming issues by slightly "relaxing" some of the security > constraints (i.e. 
two people can have the same name) and by slightly > relaxing the human friendliness (our names don't really mean anything, > compared to some possible names which might be descriptive such as > "The guy with red hair"). Yup! :-) (I always considered it "Zooko's Koan" myself, because the truth of it is less interesting than all the meaty issues under the surface.) From lloyd at randombit.net Thu Feb 12 21:38:02 2004 From: lloyd at randombit.net (Jack Lloyd) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? (was: IFL) In-Reply-To: Message-ID: On 12 Feb 2004, Zooko O'Whielacronx wrote: > > I think that what you mean to say is "referential integrity obtains if a > > reference can be resolved only in one way", that is, all objects obtained > > by resolving the reference in any manner are identical. This does not > > necessarily imply that the reference can be resolved. > > Actually, that's too restrictive! Alice might want the name to resolve to a set > of objects, where any one from that set is okay. For example if a SIP URL > resolves to a set of proxies, and Bob should use whichever SIP proxy is > currently available. However, Bob should *not* use a proxy inserted into the > result by someone other than Alice. I think you could view the set as being a single object which is resolved, and let Bob pick whichever one from the set that he likes. Unless there is a situation where Bob should get any one element from the set but not any of the others, which I cannot think of offhand. In some cases it may make more sense for it to resolve to a single 'random' element for simplicity, of course. > Also Alice might want the name to denote something that changes over time, so > that if Bob resolves it once he gets one object, and if he resolves it again he > might get another object. 
That can be tricky, because then denial-of-service > can extend to "rollback attacks" where Bob is denied the new object and thus the > name resolves to the old object. However in general a mapping from name to > object which is time-variant or varies in other ways, or which is a one-to-many > mapping, doesn't violate the principle of referential integrity. Are there any systems that allow for this? I have been thinking of some cases where a (semi-)persistent name that points to time-dependent data would be useful. The only thing that comes to mind is generating a long random string, but that is not self-authenticating unless you include a signature or similar, unlike the more usual system of key = hash(content). And it allows for creating deliberate collisions, unlike the hash-based names which require you to break the hash first. I figure there has got to be a more elegant method. -J From zooko at zooko.com Thu Feb 12 21:44:13 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: Message from Lucas Gonze of "Thu, 12 Feb 2004 16:05:59 EST." <42359519-5D9F-11D8-8D19-000393455590@panix.com> References: <42359519-5D9F-11D8-8D19-000393455590@panix.com> Message-ID: Lucas Gonze wrote: > > Lastly, I want to make a much broader point not related at all to > whether my scheme resolves Zooko's triangle. The interesting stuff > here is what the hell names are, how they work, and how we can write > programs to support better name systems in the digital world. When I > say names I mean real names, the things that we intuitively understand > because they evolved along with knowledge and self consciousness. That > type of naming is very decentralized, secure enough given endless > kludges, and has optimal memorability. All the factors are balanced. Lucas: I still owe you a response to your original "IFL" message, but I wanted to make a comment about this much broader point. 
As Carl Ellison has argued, the kind of names that we used as our brains were evolving don't scale. Agriculture and urbanization arose about 10,000 years ago. Language probably evolved between 100,000 and 5,000,000 years ago. So for almost all of our evolutionary history we needed to remember only a handful of names for people. But I do strongly agree with the sentiment that a computer scientist attempting to devise a new tool should pay close attention to how the existing natural analog succeeds at its job. Mark Miller's "Pet Names" [1], and SDSI's "linked local namespaces" [2] are a beautiful hack to make a naming scheme that scales while allowing the use of names to be natural inasmuch as each individual user retains full control of his or her own local namespace. That is: I can say "Check out Dave's new movie!" and you can say "Check out Dave's new movie!" and we can mean different Daves and thus different movies. DNS and Google's "I Feel Lucky" both seem unnatural to me -- I have to say "Check out dsmithson954@aol.co.us's new movie!" ? Regards, Zooko [1] http://www.erights.org/elib/capability/pnml.html [2] http://citeseer.nj.nec.com/2379.html From zooko at zooko.com Thu Feb 12 21:52:40 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? (was: IFL) In-Reply-To: Message from Jack Lloyd of "Thu, 12 Feb 2004 16:38:02 EST." References: Message-ID: Jack Lloyd wrote: > > > Also Alice might want the name to denote something that changes over time, so > > that if Bob resolves it once he gets one object, and if he resolves it again he > > might get another object. ... > Are there any systems that allow for this? Yes. The Self-Certifying File System [1] and Freenet [2]. Probably others! Not Mnet [3] yet, alas. 
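[Editor's illustration] A rough sketch of how a mutable-yet-self-certifying name can work, loosely in the spirit of SFS/Freenet but not the actual design of either. Real systems bind the name to a public key and sign each version; to keep this sketch stdlib-only, an HMAC stands in for the signature (which means verification here needs the secret, unlike a real public-key scheme -- a deliberate simplification). The sequence number illustrates one defense against the rollback attack mentioned earlier in the thread:

```python
import hashlib, hmac

# Stand-in for a keypair; in a real system the name would embed a hash of
# a *public* key and updates would carry a public-key signature.
signing_key = b"alice-secret-key-material"
name = hashlib.sha256(signing_key).hexdigest()  # the name itself never changes

def publish(content: bytes, seq: int) -> dict:
    # Alice authenticates each new version under the same stable name.
    tag = hmac.new(signing_key, str(seq).encode() + content, hashlib.sha256)
    return {"name": name, "seq": seq, "content": content, "sig": tag.hexdigest()}

def verify(record: dict, last_seen_seq: int) -> bytes:
    tag = hmac.new(signing_key,
                   str(record["seq"]).encode() + record["content"],
                   hashlib.sha256).hexdigest()
    if record["name"] != name or not hmac.compare_digest(record["sig"], tag):
        raise ValueError("integrity violation: not a valid version of this name")
    if record["seq"] < last_seen_seq:
        # Sequence numbers let Bob notice a rollback: being fed an old
        # version of the name's referent as if it were current.
        raise ValueError("rollback: older version than one already seen")
    return record["content"]

v1 = publish(b"version 1", seq=1)
v2 = publish(b"version 2", seq=2)  # the same name now denotes a new object
```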
Regards, Zooko [1] http://www.fs.net/sfswww/ [2] http://freenet.sourceforge.net/ [3] http://mnet.sf.net/ From zooko at zooko.com Thu Feb 12 22:05:57 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: Message from Brad Neuberg of "Thu, 12 Feb 2004 12:31:21 PST." <6.0.1.1.2.20040212123017.01e80250@pop.mail.yahoo.com> References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> <47D50CA7-5CDD-11D8-90BE-000393071F50@mad-scientist.com> <6.0.1.1.2.20040211151942.01e63438@pop.mail.yahoo.com> <5A42F906-5CF8-11D8-90BE-000393071F50@mad-scientist.com> <6.0.1.1.2.20040212123017.01e80250@pop.mail.yahoo.com> Message-ID: Brad Neuberg wrote: > > I think this is different from a pet-names system because it is a global > name-space. Pet names suffer from the fact that I can't open a browser on > anyone's machine and enter www.cnn.com and be taken to the same place; it > depends on what you've labeled with that particular pet name, or what someone > else who you trust has labeled. I honestly consider that quality to be a benefit rather than a drawback. I know that most people disagree with me about this. I really wish that if a person wants to, he could make "cnn" map to: http://www.ce.unipr.it/pardis/CNN/cnn.html on his computer. But anyway, if you borrow his computer then maybe you can import your own namespace before you start browsing? Regards, Zooko From b.fallenstein at gmx.de Thu Feb 12 23:08:07 2004 From: b.fallenstein at gmx.de (Benja Fallenstein) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? (was: IFL) In-Reply-To: References: Message-ID: <402C0757.8000507@gmx.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Zooko O'Whielacronx wrote: | Jack Lloyd wrote: |>>Also Alice might want the name to denote something that changes over time, so |>>that if Bob resolves it once he gets one object, and if he resolves it again he |>>might get another object. | | ... 
| |>Are there any systems that allow for this? | | Yes. The Self-Certifying File System [1] and Freenet [2]. Probably others! | Not Mnet [3] yet, alas. I'm developing a replacement for HTTP-based addressing which resolves addresses through a P2P network. (We don't have a webpage up for it, shame on us.) In Storm, our system, we have something called a 'pointer,' which can have different 'current' versions over time, but old versions stay available as long as any peer keeps a copy. I keep my homepage in Storm; you can browse it at http://himalia.it.jyu.fi/~benja/, which will redirect you to an HTTP gateway to our system. The HTTP gateway automatically inserts a "History" link (as well as "This page is linked from...") in the upper-right corner of the page; the history is computed by retrieving all known versions of the pointer from the network. So that's kind of an example too -- the set of available versions can grow, and the 'current' version can change. Cheers, - - Benja -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFALAdWUvR5J6wSKPMRAkitAJ9krkeVMF+SqdfSRzCBBVQmOH3D+gCdGRa5 eHsvlgrmFKU3OO6/nB02nBI= =vLwJ -----END PGP SIGNATURE----- From zooko at zooko.com Thu Feb 12 23:23:59 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? (was: IFL) In-Reply-To: Message from "Zooko O'Whielacronx" of "12 Feb 2004 16:52:40 EST." References: Message-ID: [following up to my own post] > > > Also Alice might want the name to denote something that changes over time, so > > > that if Bob resolves it once he gets one object, and if he resolves it again he > > > might get another object. > ... > > Are there any systems that allow for this? > > Yes. The Self-Certifying File System [1] and Freenet [2]. Probably others! > Not Mnet [3] yet, alas. Oh, and I'm embarrassed that I forgot the new YURL scheme [4]. 
YURL is designed to fit into the World-Wide Web. It has the same integrity guarantees, based on the same sorts of cryptography, that Freenet and SFS provide. Regards, Zooko > [1] http://www.fs.net/sfswww/ > [2] http://freenet.sourceforge.net/ > [3] http://mnet.sf.net/ [4] http://www.waterken.com/dev/YURL/ From clausen at gnu.org Thu Feb 12 23:51:55 2004 From: clausen at gnu.org (Andrew Clausen) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: <6.0.1.1.2.20040212122926.01e5c4c8@pop.mail.yahoo.com> References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> <47D50CA7-5CDD-11D8-90BE-000393071F50@mad-scientist.com> <6.0.1.1.2.20040211151942.01e63438@pop.mail.yahoo.com> <5A42F906-5CF8-11D8-90BE-000393071F50@mad-scientist.com> <6.0.1.1.2.20040212122926.01e5c4c8@pop.mail.yahoo.com> Message-ID: <20040212235155.GA534@gnu.org> On Thu, Feb 12, 2004 at 12:29:59PM -0800, Brad Neuberg wrote: > What I don't understand is that this scheme was offered as a way to resolve > Zooko's Triangle. How is this Google-based system secure? My answer to this is: it has a high cost of attack. That is, you have to either purchase many domain names or convince many high PageRank people (who in turn purchased many domain names - i.e. recursion ;) to link to you. See my thesis: http://members.optusnet.com.au/clausen/reputation/rep-cost-attack.pdf Cheers, Andrew From clausen at gnu.org Fri Feb 13 00:11:16 2004 From: clausen at gnu.org (Andrew Clausen) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: <05425DE5-5DA3-11D8-8D19-000393455590@panix.com> References: <6.0.1.1.2.20040212131846.01ec13d0@pop.mail.yahoo.com> <05425DE5-5DA3-11D8-8D19-000393455590@panix.com> Message-ID: <20040213001115.GB534@gnu.org> On Thu, Feb 12, 2004 at 04:32:55PM -0500, Lucas Gonze wrote: > On Thursday, Feb 12, 2004, at 16:20 America/New_York, Brad Neuberg > wrote: > >I think I see where you're going with this. 
The "real-world" has > >converged on a system like you've described, and you're saying we can > >use this as an inspiration to solve these naming issues. > > Exactly! The real-world is very centralized, and isn't so inspiring, IMHO. When you trust Google's "I'm feeling lucky", you are trusting the domain name system. That is, that all domain names were purchased properly. Otherwise, you can Sybil-attack PageRank with a flood of "false" domain names at zero cost. An insider in Verisign might do this, say. So, I agree that it might be possible to use reputation to securely find objects with easy-to-remember names, but you need to bootstrap off something else you trust (eg: your friends' public keys). Cheers, Andrew From aloeser at cs.tu-berlin.de Fri Feb 13 14:30:40 2004 From: aloeser at cs.tu-berlin.de (Alexander =?iso-8859-1?Q?L=F6ser?=) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] Distributed Inverted List References: <6.0.1.1.2.20040212131846.01ec13d0@pop.mail.yahoo.com> <05425DE5-5DA3-11D8-8D19-000393455590@panix.com> <20040213001115.GB534@gnu.org> Message-ID: <402CDF90.F75628AD@cs.tu-berlin.de> Hi all, I am searching for approaches to storing large inverted lists. Consider the case where many object IDs exist for a hash value. The object values are sorted. E.g.: HASH Object values $1234 | 123, 456, 678, 1020, 100002 $4566 | 123, 8755, 78899, 10000276 I'd like to store this hashtable in the form of a distributed hashtable. Unfortunately each node can only store two object values, but not the whole list of objects for a hash key. What approaches exist to store such an inverted list in a distributed manner in a DHT? Alex PS: Just as background: In my approach I will perform a join between the object values of the keys $1234 AND $4566. In the example the result of the join would be 123. -- ___________________________________________________________ M.Sc., Dipl. Wi.-Inf. 
Alexander Löser Technische Universitaet Berlin Fakultaet IV - CIS bmb+f-Projekt: "New Economy, Neue Medien in der Bildung" hp: http://cis.cs.tu-berlin.de/~aloeser/ office: +49- 30-314-25551 fax : +49- 30-314-21601 ___________________________________________________________ From b.fallenstein at gmx.de Fri Feb 13 14:53:27 2004 From: b.fallenstein at gmx.de (Benja Fallenstein) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] Distributed Inverted List In-Reply-To: <402CDF90.F75628AD@cs.tu-berlin.de> References: <6.0.1.1.2.20040212131846.01ec13d0@pop.mail.yahoo.com> <05425DE5-5DA3-11D8-8D19-000393455590@panix.com> <20040213001115.GB534@gnu.org> <402CDF90.F75628AD@cs.tu-berlin.de> Message-ID: <402CE4E7.8060204@gmx.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Alex, I don't really understand the question yet: why can the whole list not be the object value? Alternatively, why can't the individual items be the object values (you say there can only be two per key, why?) and the client does the sorting? I guess that you have a reason, I just don't understand it yet. Cheers, - - Benja Alexander Löser wrote: | Hi all, | I am searching for approaches to storing large inverted lists. Consider the case where | many object IDs exist for a hash value. The object values are sorted. E.g.: | | | HASH Object values | $1234 | 123, 456, 678, 1020, 100002 | $4566 | 123, 8755, 78899, 10000276 | | I'd like to store this hashtable in the form of a distributed hashtable. | Unfortunately each node can only store two object values, but not the whole | list of objects for a hash key. What approaches exist to store such an | inverted list in a distributed manner in a DHT? | | Alex | | PS: Just as background: In my approach I will perform a join between the | object values of the keys $1234 AND $4566. In the example the result of | the join would be 123. | | -- | ___________________________________________________________ | | M.Sc., Dipl. Wi.-Inf. 
Alexander Löser | Technische Universitaet Berlin Fakultaet IV - CIS | bmb+f-Projekt: "New Economy, Neue Medien in der Bildung" | hp: http://cis.cs.tu-berlin.de/~aloeser/ | office: +49- 30-314-25551 | fax : +49- 30-314-21601 | ___________________________________________________________ | | | _______________________________________________ | p2p-hackers mailing list | p2p-hackers@zgp.org | http://zgp.org/mailman/listinfo/p2p-hackers | _______________________________________________ | Here is a web page listing P2P Conferences: | http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences | | -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFALOTnUvR5J6wSKPMRAkflAKCJUgYKZwrMKEhvgSEzirxz/SgKbgCeLP1c YfuxE5Z94u+rVESwJtonfnA= =SPiL -----END PGP SIGNATURE----- From aloeser at cs.tu-berlin.de Fri Feb 13 15:24:37 2004 From: aloeser at cs.tu-berlin.de (Alexander =?iso-8859-1?Q?L=F6ser?=) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] Distributed Inverted List References: <6.0.1.1.2.20040212131846.01ec13d0@pop.mail.yahoo.com> <05425DE5-5DA3-11D8-8D19-000393455590@panix.com> <20040213001115.GB534@gnu.org> <402CDF90.F75628AD@cs.tu-berlin.de> <402CE4E7.8060204@gmx.de> Message-ID: <402CEC35.37AB9983@cs.tu-berlin.de> Hi Benja, hi all, consider a number of documents, each document having a unique DocID. Further, each document is annotated with several metadata terms. I try to build a global index where you can look up each term and retrieve the matching documents. One example: DocID| Terms 123 | Bossa Nova 456 | Bossa Nova, Jobim 678 | Jobim 1020 | Bossa Nova 8755 | Jobim In my approach I hash each document term ("Bossa Nova" = $1234, "Jobim"=$4566 ) and store it in a DHT. 
Further, I invert the list above so I can look up terms and get the corresponding document IDs (by the way, this is very common in information retrieval and known as an inverted index or inverted list): HASHID | DocID $1234 | 123, 456, 1020 $4566 | 456, 678, 8755 Now I can query for documents containing either "Bossa Nova" (DocID= 123, 456, 1020) OR "Jobim" (DocID=456, 678, 8755) or, more interestingly, "Bossa Nova" AND "Jobim" (DocID=456). This works fine if each hash key and all DocIDs can be stored in a DHT on one physical node. However, if I try to store all 100 million documents related to Britney Spears, I can't possibly store all DocIDs at one physical node. My question is: What approaches are used to store a large number of DocIDs, belonging to the same hash key, on several physical nodes? Alex Benja Fallenstein wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi Alex, > > I don't really understand the question yet: why can the whole list not > be the object value? Alternatively, why can't the individual items be > the object values (you say there can only be two per key, why?) and the > client does the sorting? I guess that you have a reason, I just don't > understand it yet. > > Cheers, > - - Benja > > Alexander Löser wrote: > | Hi all, > | I am searching for approaches to storing large inverted lists. Consider the > case where > | many object IDs exist for a hash value. The object values are sorted. E.g.: > | > | > | HASH Object values > | $1234 | 123, 456, 678, 1020, 100002 > | $4566 | 123, 8755, 78899, 10000276 > | > | I'd like to store this hashtable in the form of a distributed hashtable. > | Unfortunately each node can only store two object values, but not the > whole > | list of objects for a hash key. What approaches exist to store such an > | inverted list in a distributed manner in a DHT? > | > | Alex > | > | PS: Just as background: In my approach I will perform a join > between the > | object values of the keys $1234 AND $4566. 
In the example the result of > | the join would be 123. > | > | -- > | ___________________________________________________________ > | > | M.Sc., Dipl. Wi.-Inf. Alexander Löser > | Technische Universitaet Berlin Fakultaet IV - CIS > | bmb+f-Projekt: "New Economy, Neue Medien in der Bildung" > | hp: http://cis.cs.tu-berlin.de/~aloeser/ > | office: +49- 30-314-25551 > | fax : +49- 30-314-21601 > | ___________________________________________________________ > | > | > | _______________________________________________ > | p2p-hackers mailing list > | p2p-hackers@zgp.org > | http://zgp.org/mailman/listinfo/p2p-hackers > | _______________________________________________ > | Here is a web page listing P2P Conferences: > | http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > | > | > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.2.4 (GNU/Linux) > Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org > > iD8DBQFALOTnUvR5J6wSKPMRAkflAKCJUgYKZwrMKEhvgSEzirxz/SgKbgCeLP1c > YfuxE5Z94u+rVESwJtonfnA= > =SPiL > -----END PGP SIGNATURE----- > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences -- ___________________________________________________________ M.Sc., Dipl. Wi.-Inf. Alexander Löser Technische Universitaet Berlin Fakultaet IV - CIS bmb+f-Projekt: "New Economy, Neue Medien in der Bildung" hp: http://cis.cs.tu-berlin.de/~aloeser/ office: +49- 30-314-25551 fax : +49- 30-314-21601 ___________________________________________________________ From list at waterken.net Fri Feb 13 15:35:36 2004 From: list at waterken.net (Tyler Close) Date: Sat Dec 9 22:12:38 2006 Subject: The y-property (Was: [p2p-hackers] what did I mean by "secure"? 
(was: IFL)) In-Reply-To: References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> Message-ID: <200402130735.36230.list@waterken.net> I think what you are trying to get at is what I've called the y-property. See: http://www.waterken.com/dev/YURL/Definition/#The_y-property "" Briefly stated, the y-property is: "The introducer determines the message target." The y-property means that only the introducer has the privilege of determining the recipient of a message sent to the introduced site and the processor of the sent message. The introducer is the site authorized to write to a communication channel read by a client site. The introducer uses the communication channel to provide a URL to the client site. The URL identifies the introduced site. The introduced site is a site selected by the introducer. The client site is the site that receives the URL and uses it to send a message to the introduced site. Receiving a message means having access to the plaintext of the message. Processing a message means producing a response message which the client site will accept as an authentic response to the sent message. The y-property is the result of applying the principle of least privilege to the fact that the introducer decides which site to introduce. "" Tyler On Thu February 12 2004 11:36 am, Zooko O'Whielacronx wrote: > However, I never defined what I meant by "secure" in general in that essay. > I will now attempt to do so. > > By "a secure naming scheme" I mean that the scheme has referential > integrity. > > A person, Alice, sends a message to another person, Bob. That message > contains a name. Bob uses the naming system to de-reference that name, > resulting in an object. > > "Referential integrity" means that nobody can cause the resulting object to > be other than what Alice intended. > > There are a lot of implications of this which I understand only partly at > this point. Anyway, I'll stop for now and send out this message. 
> > Regards, > > Zooko > > [1] http://zooko.com/distnames.html > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences -- The union of REST and capability-based security. http://www.waterken.com/dev/Web/ From b.fallenstein at gmx.de Fri Feb 13 15:48:33 2004 From: b.fallenstein at gmx.de (Benja Fallenstein) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] Distributed Inverted List In-Reply-To: <402CEC35.37AB9983@cs.tu-berlin.de> References: <6.0.1.1.2.20040212131846.01ec13d0@pop.mail.yahoo.com> <05425DE5-5DA3-11D8-8D19-000393455590@panix.com> <20040213001115.GB534@gnu.org> <402CDF90.F75628AD@cs.tu-berlin.de> <402CE4E7.8060204@gmx.de> <402CEC35.37AB9983@cs.tu-berlin.de> Message-ID: <402CF1D1.3070400@gmx.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Alex-- ok, I understand the problem better now. I don't have an answer for the scaling problem besides what has already been discussed on the list wrt storing large numbers of values for the same key. However, do remember that if you take that approach, you need to retrieve millions of values from the network to do your intersection and get the result list! "On the Feasibility of Peer-to-Peer Web Indexing and Search," by Jinyang Li, Boon Thau Loo, Joe Hellerstein, Frans Kaashoek, David R. Karger and Robert Morris, has some ideas about how to keep the bandwidth usage for intersection within bounds by using Bloom filters, but maybe you knew that already. Sorry for not being able to help with your actual question. 
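[Editor's illustration] The Bloom-filter idea can be sketched as follows. This is a toy illustration of the general technique, not the protocol or parameters from the Li et al. paper: one node summarizes its posting list in a small filter, the other node returns only the DocIDs that pass the filter, and a final exact check removes any false positives.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: an m-bit bitmap with k hash functions."""

    def __init__(self, m: int = 1024, k: int = 4):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, item: int):
        # Derive k bit positions for an item from salted SHA-256 digests.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: int):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: int) -> bool:
        # No false negatives; false positives possible.
        return all(self.bits >> pos & 1 for pos in self._positions(item))

# Node A holds the posting list for "Bossa Nova", node B the one for "Jobim".
bossa_nova = [123, 456, 1020]
jobim = [456, 678, 8755]

# A sends B a compact filter instead of the whole posting list.
f = BloomFilter()
for doc_id in bossa_nova:
    f.add(doc_id)

# B replies with only the candidates that pass the filter: the true
# intersection plus possible false positives.
candidates = [d for d in jobim if f.might_contain(d)]

# A removes false positives with a final exact check.
result = [d for d in candidates if d in bossa_nova]
print(result)  # [456]
```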
- - Benja -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFALPHRUvR5J6wSKPMRAtyrAJ9K/4Z+TVsiuIjZ20xCM2bzxEg4NgCgyERX fE3slifAjb0NLMdeSGzsQL0= =hC3l -----END PGP SIGNATURE----- From Wolfgang.Mueller2 at uni-bayreuth.de Fri Feb 13 15:48:58 2004 From: Wolfgang.Mueller2 at uni-bayreuth.de (Wolfgang =?iso-8859-1?q?M=FCller?=) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] Distributed Inverted List In-Reply-To: <402CDF90.F75628AD@cs.tu-berlin.de> References: <6.0.1.1.2.20040212131846.01ec13d0@pop.mail.yahoo.com> <20040213001115.GB534@gnu.org> <402CDF90.F75628AD@cs.tu-berlin.de> Message-ID: <200402131648.58275.wolfgang.mueller2@uni-bayreuth.de> Hi, Alex, There is a cute paper by Li et al. about the scalability of such approaches. http://citeseer.nj.nec.com/li03feasibility.html There are some others about replacing hot spots in "chords" by multiple nodes, but I do not have the references off the top of my head. Others in this list probably have. Cheers, Wolfgang From list at waterken.net Fri Feb 13 16:04:41 2004 From: list at waterken.net (Tyler Close) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: References: Message-ID: <200402130804.41436.list@waterken.net> That's not what I was getting at. I'll try to be clearer. The Google IFL system maps a name to a URL. The client then uses that URL to contact the target site. Currently, the returned URL is an http URL that is neither secure nor decentralized. The IFL query might also return an https URL, which may be considered secure, but is not decentralized. The only secure and decentralized identifier you propose is the IFL name itself, but you can't return that in response to an IFL query, or we get infinite recursion. The output of an IFL lookup must be an identifier that is itself secure and decentralized for the IFL system to be considered secure and decentralized. 
IFL requires a secure and decentralized identifier scheme in order to be a secure and decentralized naming system. Given a secure and decentralized identifier scheme, the need to make yet another one is less compelling. For economy of expression, let's use the term YURL for a secure and decentralized URI. I've defined this term at: http://www.waterken.com/dev/YURL/Definition/ Given the existence of a YURL scheme, the IFL system equates to what I've been calling a keyword service. A keyword service maps a human-memorable name to a YURL. The IFL system has the special property that the mapping it provides is not decided unilaterally, but by consensus. This property makes the IFL a useful form of keyword service. I propose that the particular use-cases you have been thinking about, eg: establishing car brands, are completely solved by keyword services and that we need not try to extend the IFL keyword service into a new kind of fragile YURL scheme. We could work through some example scenarios if you like. Tyler On Thu February 12 2004 10:16 am, Lucas Gonze wrote: > On Thursday, Feb 12, 2004, at 11:24 America/New_York, Tyler Close wrote: > > So if Google IFLs are used as identifiers, how does Google > > securely identify the result of an IFL lookup, without losing the > > decentralized property? > > You shouldn't use an IFL name that isn't consistent across search > engines you trust. Any conscientious effort to use PageRank on the > same well known set of crawl results (e.g. a snapshot of the web taken > on May 5, 2003) should give the same result. If there is a problem > with one name, let that name go. 
> > - Lucas > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences -- The union of REST and capability-based security. http://www.waterken.com/dev/Web/ From lgonze at panix.com Fri Feb 13 16:22:58 2004 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL In-Reply-To: <20040213001115.GB534@gnu.org> Message-ID: On Thursday, Feb 12, 2004, at 19:11 America/New_York, Andrew Clausen wrote: > The real-world is very centralized, and isn't so inspiring, IMHO. A not totally irrelevant digression: I'm not interested in real world naming only as an inspiration -- memorable program-generated names have to literally model real world naming. A computer model for handling names is exactly as good as it is similar to our cognitive processes. PicHunter*, for example, also works exactly as well as it matches human cognition. * http://www.pnylab.com/pny/papers/phj/main.html > When you trust Google's "I'm feeling lucky", you are trusting the > domain > name system. That is, that all domain names were purchased properly. > Otherwise, you can Sybil-attack PageRank with a flood of "false" domain > names at zero cost. An insider in Verisign might do this, say. > > So, I agree that it might be possible to use reputation to securely > find objects with easy-to-remember names, but you need to bootstrap > off something else you trust (eg: your friends' public keys). I'm going to think about this for a while before I have a comment. The first thing I'm going to think about is whether it's the same problem as Zooko's triangle or a different one. 
- Lucas From list at waterken.net Fri Feb 13 16:18:53 2004 From: list at waterken.net (Tyler Close) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? (was: IFL) In-Reply-To: References: Message-ID: <200402130818.53552.list@waterken.net> On Thu February 12 2004 03:23 pm, Zooko O'Whielacronx wrote: > [following up to my own post] > > > > > Also Alice might want the name to denote something that changes over > > > > time, so that if Bob resolves it once he gets one object, and if he > > > > resolves it again he might get another object. > > > > ... > > > > > Are there any systems that allow for this? > > > > Yes. The Self-Certifying File System [1] and Freenet [2]. Probably > > others! Not Mnet [3] yet, alas. > > Oh, and I'm embarrassed that I forgot the new YURL scheme [4]. YURL is > designed to fit into the World-Wide Web. It has the same integrity > guarantees, based on the same sorts of cryptography, that Freenet and SFS > provide. YURL is actually the term I coined to define the property provided by all of these URL schemes, the y-property. See: http://www.waterken.com/dev/YURL/Definition/ There is a list of all known YURL schemes at: http://www.waterken.com/dev/YURL/#YURL_schemes (I don't have Freenet there yet, because I can't find a link to the URL specifications. Anyone got one?) The YURL scheme I created for the WWW is httpsy. See: http://www.waterken.com/dev/YURL/httpsy/ The YURL concept was derived from the cap YURL scheme used by the E language. AFAICT, the cap YURL scheme predates all others and is the originator of this field. Tyler -- The union of REST and capability-based security. http://www.waterken.com/dev/Web/ From zooko at zooko.com Fri Feb 13 17:26:29 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? (was: IFL) In-Reply-To: Message from Tyler Close of "Fri, 13 Feb 2004 08:18:53 PST."
<200402130818.53552.list@waterken.net> References: <200402130818.53552.list@waterken.net> Message-ID: Tyler Close wrote: > > The YURL concept was derived from the cap YURL scheme used by the > E language. AFAICT, the cap YURL scheme predates all others and is > the originator of this field. Okay, now I'm even *more* embarrassed: in addition to not mentioning httpsy, I also didn't mention E: http://erights.org/ E is an object-oriented, garbage-collected programming language. References in E can refer to remote objects living on other computers across the network as well as to objects in the local virtual machine. E cryptographically enforces the rule of referential integrity: If Alice gives you a reference, then the resulting object is an object that was acceptable to Alice as the target of that reference. Like the other systems we've mentioned in this thread (except Mnet), E allows mutable content. The author of E is Mark Miller, who is also the author of the Pet Names Markup Language proposal, and who is also responsible for teaching me about these issues a few years ago. E references can be serialized in order to be stored, passed via e-mail, etc. The serialized form of an E reference is a non-human-meaningful string. (It is derived, like all of the systems mentioned, from the secure hash of a public key.) Regards, Zooko From lgonze at panix.com Fri Feb 13 18:10:39 2004 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL: memorability as overlay In-Reply-To: <200402130804.41436.list@waterken.net> Message-ID: Assume a (+secure +decentralized -memorable) namespace based on self-authenticating hashes, and also a decentralized body of hypertext documents with edges labeled according to the whim of the labeler.
In that case memorability can be added as an overlay without introducing centralization or insecurity, because PageRank applies equally to secure name schemes not backed by an authority: I sure do love the New York Times. A Sybil attack on PageRank by ICANN or Verisign would succeed in breaking names in their portion of the namespace, but names in the self-authenticating portion would continue to work. That addresses Andrew Clausen's point without introducing a second reputation network. If IFL names can resolve to (+secure +decentralized -memorable) names as well as (+secure -decentralized +memorable) names, that resolves Tyler Close's point without requiring IFL names to be recursive. I can now summarize my solution to Zooko's triangle: PageRank sometimes allows memorability to be added as an overlay on secure and decentralized but not memorable namespaces. When there is hypertext, it is possible for names to be all three of memorable, decentralized and secure. - Lucas From lgonze at panix.com Fri Feb 13 18:25:58 2004 From: lgonze at panix.com (Lucas Gonze) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL: memorability as overlay In-Reply-To: Message-ID: <11711810-5E52-11D8-9EC6-000393455590@panix.com> > because PageRank applies equally to secure name schemes not backed by > an authority: Per Clausen's thesis, the integrity of PageRank is backed by the cash price of a name, so self-authenticating names are out. From lintao.liu at asu.edu Fri Feb 13 18:57:24 2004 From: lintao.liu at asu.edu (Lintao Liu) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] Distributed Inverted List In-Reply-To: <200402131648.58275.wolfgang.mueller2@uni-bayreuth.de> Message-ID: <000001c3f263$388043c0$c3b2a995@LintaoLiu> Hi, I have thought about this problem and came up with a partial answer, which will appear in GP2PC. http://www.public.asu.edu/~kryu1/papers/KF-GP2PC-CR2.pdf The limitation of this method is: It is not designed for full text search.
The average number of keywords associated with each file shouldn't be too large, i.e. not more than 20. If each file has fewer than 10 keywords, this mechanism works great, based on our experiment. Here is some overhead analysis which is not included in the paper but should help to understand this limitation: For a single file f, K(f) is its keyword set. Let m = |K(f)| and n = |K(f) & Dictionary|. We assume that all synthetic keywords consist of only two prime keywords. (Based on the experiment, more than 90% of synthetic keywords follow this assumption.) The total number of replicas for this file is: Without dictionary: m (1) With dictionary: m - n + n(n-1)/2 = m + n(n-3)/2 (2) At first sight of (2), we can see that n cannot be too large. If we consider another interesting metric, n/m, we will find that m also plays an important role. Basically, n/m represents what percentage of total keyword occurrences would be removed. In other words, it shows how many generic keywords would benefit from our design. For example, suppose there are 100000 keyword occurrences in total and the top 10 keywords appear 30000 times. Putting these 10 keywords in the dictionary will make n/m = 30000/100000 = 0.3 (not exactly 0.3, but approximately). A larger n/m will help more generic keywords but generate more synthetic keywords. To some extent, n/m shows how much keyword fusion helps the system solve the imbalance problem. To keep the same ratio n/m, a larger m will mean a larger n, which is where the limitation comes from. Other overhead includes network traffic. Consider a stable system where a dictionary is already built and doesn't change much: the number of messages for each file is almost the same as the number of replicas. We can use the same mechanism for query processing, just replacing files with queries, and disk storage imbalance with network traffic imbalance.
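As a sanity check, formulas (1) and (2) can be sketched in a few lines of Python; the m and n values below are made-up examples for illustration, not numbers from our experiment:

```python
def replicas_without_dictionary(m):
    # Formula (1): one replica per keyword.
    return m

def replicas_with_dictionary(m, n):
    # Formula (2): the m - n non-dictionary keywords stay as-is, and the
    # n dictionary ("prime") keywords are fused into n(n-1)/2 synthetic
    # keywords, assuming each synthetic keyword pairs two primes.
    return m - n + n * (n - 1) // 2

# Small m and n keep the overhead modest...
print(replicas_with_dictionary(10, 3))   # 10 - 3 + 3 = 10
# ...but the replica count grows quadratically in n:
print(replicas_with_dictionary(25, 10))  # 15 + 45 = 60
```

Note that for n <= 3 formula (2) is no worse than formula (1), which matches the observation that n cannot be too large.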
We believe we can achieve the same goal but cause less overhead (because the popular keywords in queries may not be generic in files, which will generate fewer new replicas). Any comments are welcome. And we are doing new experiments to test our design. If you happen to have access to some query logs, we would really appreciate it if you could provide them. Best, Lintao -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Wolfgang Müller Sent: Friday, February 13, 2004 8:49 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] Distributed Inverted List Hi, Alex, There is a cute paper by Li et al. about the scalability of such approaches. http://citeseer.nj.nec.com/li03feasibility.html There are some others about replacing hot spots in "chords" by multiple nodes, but I do not have the references off the top of my head. Others in this list probably have. Cheers, Wolfgang From jdd at dixons.org Fri Feb 13 19:01:59 2004 From: jdd at dixons.org (Jim Dixon) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? (was: IFL) In-Reply-To: Message-ID: <20040213182808.M29957-100000@localhost> On 12 Feb 2004, Zooko O'Whielacronx wrote: > > > > "Referential integrity" means that nobody can cause the resulting object to be > > > > other than what Alice intended. > > > > > > I suggest that we follow the tradition of computer security and separate > > > "violation of referential integrity" -- substituting a bogus object in place of > > > the object that Alice meant -- from "denial of service", i.e. preventing Bob > > > from getting any object. > > > > "Any object" is a bit strong.
This wording implies that if Bob is > > prevented from 'getting' _any_ object, you do not have referential > > integrity. > > I'm sorry -- I don't understand the objection. What I meant to say was simply > that availability of the name service could be considered separately from > correctness of the name service, where by correctness I mean that the resulting > object is an object that Alice intended. Well, that's better, but I believe that Alice's intentions should not be considered. > > I think that what you mean to say is "referential integrity obtains if a > > reference can be resolved only in one way", that is, all objects obtained > > by resolving the reference in any manner are identical. This does not > > necessarily imply that the reference can be resolved. > > Actually, that's too restrictive! Alice might want the name to resolve to a set > of objects, where any one from that set is okay. For example, a SIP URL might > resolve to a set of proxies, and Bob should use whichever SIP proxy is > currently available. However, Bob should *not* use a proxy inserted into the > result by someone other than Alice. Reformulation: "referential integrity obtains if a reference can be resolved only in the manner determined by the algorithm under consideration", that is, any object obtained at any given time by resolving the reference in any manner belongs to a set of objects whose membership is determined by the algorithm under consideration and variables selected by that algorithm. The BBC used to (and probably still do) operate their domain name services in such a way that certain symbols would resolve differently depending upon when and where the question was being asked. If you were in Kansas, they wanted you to use their New York server farm, but if you were in Coventry, they directed you to London.
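That reformulation can be sketched as follows; the server addresses and regions here are invented, and the point is only that "correctness" is membership in the set the algorithm permits, not equality with any single intended value:

```python
# A resolver whose answers vary by client location, in the spirit of
# the BBC example; the addresses below are invented documentation IPs.
ALLOWED = {"bbc.example": {"198.51.100.1",    # New York server farm
                           "203.0.113.1"}}    # London server farm

def resolve(name, client_region):
    # The algorithm picks a farm based on where the question was asked.
    if name not in ALLOWED:
        return None
    return "198.51.100.1" if client_region == "US" else "203.0.113.1"

def has_referential_integrity(name, answer):
    # Integrity here is membership in the algorithm-determined set.
    return answer in ALLOWED.get(name, set())

# Kansas and Coventry get different answers, yet both have integrity.
assert has_referential_integrity("bbc.example", resolve("bbc.example", "US"))
assert has_referential_integrity("bbc.example", resolve("bbc.example", "UK"))
```

An answer outside the set (say, a proxy inserted by Mallory) fails the membership test even though it "resolves".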
> Also Alice might want the name to denote something that changes over time, so > that if Bob resolves it once he gets one object, and if he resolves it again he > might get another object. That can be tricky, because then denial-of-service > can extend to "rollback attacks" where Bob is denied the new object and thus the > name resolves to the old object. However in general a mapping from name to > object which is time-variant or varies in other ways, or which is a one-to-many > mapping, doesn't violate the principle of referential integrity. Consider a symbol x(t). Alice intends for this to be 1 during odd hours and 0 during even hours. Mallory blocks access to the symbol during even hours. Bob has no trouble resolving the symbol during odd hours, but during even hours he has to fall back on the last value he could resolve to. In other words, to Bob, the value of x is always 1. Your wording seems to imply that x(t) has referential integrity even though Bob's understanding of the symbol is wrong half the time. Alice has too much to drink one evening and alters the software on her server. In consequence x(t) is always 1, although she intended for it to be 1 during odd hours and 0 during even hours. In this case your wording implies that there is a lack of referential integrity, because the symbol doesn't resolve to what she intended. I submit that there isn't. I think that (a) on closer examination the notions of referential integrity and availability are quite hard to disentangle and (b) certainly the intentions of the designer should not be relevant to considerations of referential integrity. -- Jim Dixon jdd@dixons.org tel +44 117 982 0786 mobile +44 797 373 7881 http://jxcl.sourceforge.net Java unit test coverage http://xlattice.sourceforge.net p2p communications infrastructure From zooko at zooko.com Fri Feb 13 19:27:57 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? 
(was: IFL) In-Reply-To: Message from Jim Dixon of "Fri, 13 Feb 2004 19:01:59 GMT." <20040213182808.M29957-100000@localhost> References: <20040213182808.M29957-100000@localhost> Message-ID: Jim Dixon wrote: > > Your wording seems to imply that x(t) has referential integrity even > though Bob's understanding of the symbol is wrong half the time. You're right. My current definition of referential integrity does not enable Alice to require that the object changes at specific times, even though it does enable Alice to change the object. This may be surprising, but I chose it because it is not (as far as I know) possible to securely enforce the former, but it is possible to securely enforce the latter, modulo roll-back attacks. > I think that (a) on closer examination the notions of referential > integrity and availability are quite hard to disentangle and (b) certainly > the intentions of the designer should not be relevant to considerations of > referential integrity. As to (a), I agree. As to (b), I didn't mean the designer of the system, I meant the speaker -- the one who utters the name. The notion of referential integrity that I am promulgating states that this person has the sole authority to determine what object results from the dereferencing of the name. Regards, Zooko From jdd at dixons.org Sat Feb 14 05:35:34 2004 From: jdd at dixons.org (Jim Dixon) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? (was: IFL) In-Reply-To: Message-ID: <20040214044913.F29957-100000@localhost> On 13 Feb 2004, Zooko O'Whielacronx wrote: > > Your wording seems to imply that x(t) has referential integrity even > > though Bob's understanding of the symbol is wrong half the time. > > You're right. My current definition of referential integrity does not enable > Alice to require that the object changes at specific times, even though it does > enable Alice to change the object. 
This may be surprising, but I chose it > because it is not (as far as I know) possible to securely enforce the former, > but it is possible to securely enforce the latter, modulo roll-back attacks. Earlier in this thread you introduced the notions of time and caching. I added location to the mix. In the domain name system each of these -- time, caching, location -- is a factor. Bob connects to the Internet through a communications channel. He asks that a string (a.xyz.org) be resolved to an IP address. There may be an arbitrary number of xyz.org servers located at various points on the Internet. One of these will be authoritative. There will also be a number of caching name servers between Bob and the xyz.org servers. None of these is authoritative. When the string a.xyz.org is presented for resolution to the authoritative name server, it has two options: it can reply NULL (the string does not resolve), or it can reply with an IP address. Either might depend upon the time, the location of the query, any other factor known to the resolver (temperature, pressure, etc.), or be random to a degree. The resolver's reply includes suggestions as to how the resolution (the information returned) should be used, specifically time-to-live. This complex context should be used to test your idea of referential integrity. Certainly any resolver needs to return not just the value that the symbol resolves to, but also whether the value was cached and if so when. It also needs to say whether or not it is authoritative, although precisely what "authoritative" means needs some further examination. > > I think that (a) on closer examination the notions of referential > > integrity and availability are quite hard to disentangle and (b) certainly > > the intentions of the designer should not be relevant to considerations of > > referential integrity. > > As to (a), I agree. As to (b), I didn't mean the designer of the system, > I meant the speaker -- the one who utters the name.
The notion of referential > integrity that I am promulgating states that this person has the sole authority > to determine what object results from the dereferencing of the name. What person? What is the significance of "authority"? How is authority relevant to referential integrity? In my world, which is not at all unusual, I work behind a firewall. I control several domains. Names resolve differently depending upon which side of the firewall you are on. Certain names are undefined on one side of the firewall but defined on the other. Other names resolve to different IP addresses depending upon which side of the firewall you are on. The resolvers certainly resolve names differently at different times. When I configure my name servers incorrectly, those name servers are behaving correctly (they have 'referential integrity') when the object resulting from their dereferencing of a name is not what I intended. This is because it is the name server which is authoritative, not me. Any other interpretation of the situation leads only to confusion. -- Jim Dixon jdd@dixons.org tel +44 117 982 0786 mobile +44 797 373 7881 http://jxcl.sourceforge.net Java unit test coverage http://xlattice.sourceforge.net p2p communications infrastructure From list at waterken.net Sat Feb 14 06:48:55 2004 From: list at waterken.net (Tyler Close) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL: memorability as overlay In-Reply-To: <11711810-5E52-11D8-9EC6-000393455590@panix.com> References: <11711810-5E52-11D8-9EC6-000393455590@panix.com> Message-ID: <200402132248.55559.list@waterken.net> On Fri February 13 2004 10:25 am, Lucas Gonze wrote: > > because PageRank applies equally to secure name schemes not backed by > > an authority: > > Per Clausen's thesis, the integrity of PageRank is backed by the cash > price of a name, so self-authenticating names are out. Then so is your contradiction of Zooko's Triangle.
In your proof, PageRank got its decentralized property from the underlying self-authenticating pointers. Take away the self-authenticating pointers, and you take away decentralization. Although not itself centralized, PageRank depends upon a centralized infrastructure. In particular, PageRank depends upon the artificial scarcity, and resultant cost, of identity (ie: having a hostname). Tyler -- The union of REST and capability-based security. http://www.waterken.com/dev/Web/ From hopper at omnifarious.org Mon Feb 16 14:23:54 2004 From: hopper at omnifarious.org (Eric M. Hopper) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] An Analysis of Compare by Hash Message-ID: <1076941433.26007.6.camel@monster.omnifarious.org> Many people here might already be aware of this interesting paper, but I thought I'd post it anyway, since I hadn't seen any mention of it. http://www.usenix.org/events/hotos03/tech/full_papers/henson/henson_html/hash.html This is directly applicable to many of the projects people talk about here. Have fun (if at all possible), -- The best we can hope for concerning the people at large is that they be properly armed. -- Alexander Hamilton -- Eric Hopper (hopper@omnifarious.org http://www.omnifarious.org/~hopper) -- -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 185 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20040216/8740bdda/attachment.pgp From zooko at zooko.com Mon Feb 16 15:18:47 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] An Analysis of Compare by Hash In-Reply-To: Message from "Eric M. Hopper" of "Mon, 16 Feb 2004 06:23:54 PST." 
<1076941433.26007.6.camel@monster.omnifarious.org> References: <1076941433.26007.6.camel@monster.omnifarious.org> Message-ID: > http://www.usenix.org/events/hotos03/tech/full_papers/henson/henson_html/hash.html The paper is wrong on a factual count, plus I disagree with the engineering intuitions. I'll just point out the factual part here. "Users of compare-by-hash argue that this assumption is warranted because the chance of a hash collision between any two randomly generated blocks is estimated to be many orders of magnitude smaller than the chance of many kinds of hardware errors." There is no recourse to "randomly generated blocks" in the design and analysis of cryptographic hashes like SHA-1. The hypothetical adversary who seeks to cause collisions is allowed to generate the pre-images however he likes, including making them related, adaptively computing them, doing birthday-collision attacks, ad infinitum. SHA-1 and the other crypto hashes have been explicitly designed and evaluated under that assumption, *not* under some kind of "random inputs" assumption [1]. Henson's suggestion that perhaps one could find collisions in SHA-1 by using non-random inputs is pure speculation, and appears to have been written in ignorance of the relevant research. Graydon Hoare has written a more detailed response to Henson: http://www.venge.net/monotone/docs/Hash-Integrity.html#Hash%20Integrity Hopefully Graydon's note will be cited wherever Henson's paper is.
Regards, Zooko [1] As an example of how cryptographers have not allowed their work to rest on this extremely strong assumption about random distribution of inputs, consider that they have previously used a weaker assumption called "hash function balance", and that they have recently questioned even this assumption: "Hash Function Balance and its Impact on Birthday Attacks" (2002) Mihir Bellare, Tadayoshi Kohno http://citeseer.nj.nec.com/bellare02hash.html From hopper at omnifarious.org Mon Feb 16 21:36:22 2004 From: hopper at omnifarious.org (Eric Mathew Hopper) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] An Analysis of Compare by Hash In-Reply-To: References: <1076941433.26007.6.camel@monster.omnifarious.org> Message-ID: <20040216213622.GA21307@omnifarious.org> On Mon, Feb 16, 2004 at 10:18:47AM -0500, Zooko O'Whielacronx wrote: > > > http://www.usenix.org/events/hotos03/tech/full_papers/henson/henson_html/hash.html > > The paper is wrong on a factual count, plus I disagree with the > engineering intuitions. I'll just point out the factual part here. > > "Users of compare-by-hash argue that this assumption is warranted > because the chance of a hash collision between any two randomly > generated blocks is estimated to be many orders of magnitude smaller > than the chance of many kinds of hardware errors." > > There is no recourse to "randomly generated blocks" in the design and > analysis of cryptographic hashes like SHA-1. This, I totally agreed with, and it's a major flaw of that paper. It confused a lot of people in the forum that was talking about Monotone. I also agree that his engineering intuition is wrong. He seems to appreciate what exponents really mean, but then goes on to demonstrate that he doesn't. Though, my own back-of-the-envelope calculations, where I assume one document for every electron in the ocean, indicate that SHA-1 is possibly inadequate for the needs of a globally distributed filesystem for the indefinite future.
But they also indicate that SHA-256 is more than adequate (though it's also significantly less well proven). :-) > Graydon Hoare has written a more detailed response to Henson: > > http://www.venge.net/monotone/docs/Hash-Integrity.html#Hash%20Integrity > > Hopefully Graydon's note will be cited whereever Henson's paper is. It should be. Thanks for pointing it out. One thing that I think the first paper usefully points out though is that some systems are dependent in a very fundamental way on the evenness of the distribution of SHA-1 output. And it is good to keep in mind this dependency when designing them. Have fun (if at all possible), -- "It does me no injury for my neighbor to say there are twenty gods or no God. It neither picks my pocket nor breaks my leg." --- Thomas Jefferson "Go to Heaven for the climate, Hell for the company." -- Mark Twain -- Eric Hopper (hopper@omnifarious.org http://www.omnifarious.org/~hopper) -- -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20040216/2b429d8d/attachment.pgp From hopper at omnifarious.org Tue Feb 17 02:30:27 2004 From: hopper at omnifarious.org (Eric M. Hopper) Date: Sat Dec 9 22:12:38 2006 Subject: The y-property (Was: [p2p-hackers] what did I mean by "secure"? (was: IFL)) In-Reply-To: <200402130735.36230.list@waterken.net> References: <8B690778-5CC1-11D8-BA3F-000393455590@panix.com> <200402130735.36230.list@waterken.net> Message-ID: <1076985026.26007.21.camel@monster.omnifarious.org> On Fri, 2004-02-13 at 07:35, Tyler Close wrote: > I think what you are trying to get at is what I've called the > y-property. See: > > http://www.waterken.com/dev/YURL/Definition/#The_y-property > > "" > Briefly stated, the y-property is: "The introducer determines the message > target." Yes, Zooko's statement and this one are equivalent. -- Eric M. 
Hopper -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 185 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20040216/7523969d/attachment.pgp From hopper at omnifarious.org Tue Feb 17 02:44:48 2004 From: hopper at omnifarious.org (Eric M. Hopper) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] what did I mean by "secure"? (was: IFL) In-Reply-To: <20040213182808.M29957-100000@localhost> References: <20040213182808.M29957-100000@localhost> Message-ID: <1076985887.26007.27.camel@monster.omnifarious.org> On Fri, 2004-02-13 at 11:01, Jim Dixon wrote: > I think that (a) on closer examination the notions of referential > integrity and availability are quite hard to disentangle and (b) certainly > the intentions of the designer should not be relevant to considerations of > referential integrity. It's pretty easy for Bob to determine whether or not his information is 'live'. Perhaps that should be added as a phrase: "Any request for an object that returns a live value returns what Alice intended." I submit that determining whether or not Alice's intentions are 'real' (i.e. she's drunk or under coercion) is beyond the scope of the definition. Have fun (if at all possible), -- The best we can hope for concerning the people at large is that they be properly armed. -- Alexander Hamilton -- Eric Hopper (hopper@omnifarious.org http://www.omnifarious.org/~hopper) -- -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 185 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20040216/d434a499/attachment.pgp From hopper at omnifarious.org Tue Feb 17 02:51:14 2004 From: hopper at omnifarious.org (Eric M.
Hopper) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL: memorability as overlay In-Reply-To: <200402132248.55559.list@waterken.net> References: <11711810-5E52-11D8-9EC6-000393455590@panix.com> <200402132248.55559.list@waterken.net> Message-ID: <1076986274.26007.31.camel@monster.omnifarious.org> On Fri, 2004-02-13 at 22:48, Tyler Close wrote: > Although not itself centralized, PageRank depends upon a > centralized infrastructure. In particular, PageRank depends upon > the artificial scarcity, and resultant cost, of identity (ie: > having a hostname). No hostname is required, just an IP. From what I know, PageRank would still work if everybody used IP addresses instead of hostnames in URLs. Though, that would break half the websites out there, as many use virtual hosting. Though, I suppose you could say that having an IP that doesn't change over time is costly and therefore a scarce commodity. -- The best we can hope for concerning the people at large is that they be properly armed. -- Alexander Hamilton -- Eric Hopper (hopper@omnifarious.org http://www.omnifarious.org/~hopper) -- -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 185 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20040216/c7bf4aef/attachment.pgp From sam at neurogrid.com Tue Feb 17 03:03:05 2004 From: sam at neurogrid.com (Sam Joseph) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] Re: P2P journal copyright In-Reply-To: <5.0.2.1.1.20040131100728.00a3f650@pop.home.se> References: <20040131030211.GF20611@lycopodium> <401AF6B8.6050902@neurogrid.com> <20040131030211.GF20611@lycopodium> <5.0.2.1.1.20040131100728.00a3f650@pop.home.se> Message-ID: <40318469.1090508@neurogrid.com> Hi David, I've finally got things together to read all the mails and related documents - apologies for the delay.
David Göthberg wrote: > Sam Joseph, p2pjournal.com: Your new version of your copyright > agreement is much more agreeable. However I think it does create a > whole set of legal problems and uncertainties for both parties. So I > have some suggestions. > > Stating that the p2pjournal gets a "license" and then that "Authors > are granted rights to reproduce" makes it very unclear who owns what. > The wording you have chosen for instance might make it illegal for any > of the parties to resell the text! That is, your wording gives both > parties the right to reproduce the text, but not to sell copies of it > or resell the rights to it. > > I think it is better and easier to give both parties one complete > copyright and ownership of the text. That is, to copy the copyright! :) > > Here's a rough translation (from memory) and adaptation of the > copyright agreement we used for the paintings my mother bought for the > book she wrote. Lawyers in Sweden thought this was a very nice idea > and they could see no legal problem with it: Thanks for your input here David - although I have to say that since I am not a lawyer I don't really know whether our new wording or your new suggestion would lead to more or less legal complications ... I also had an email from Johan Fange suggesting that perhaps the copyright should be of limited duration, and of course we also had input to the list from Nick Lothian with links to some other journals that have had to deal with these issues. Many thanks to Nick for those links. Actually I think that the debate Nick is linking us to is more about the cost of journals than the copyright, and since P2PJournal is free, perhaps that debate is not so relevant. However through those links I did find the copyright approach of the American Mathematical Society http://www.ams.org/authors/ctp.html which seems to be the kind of agreement that Knuth and co were advocating.
Anyways, I think we have to update P2PJ's current pages on copyright in some fashion to stop Don flaming my next CFP, so I'm going to suggest we go with Ray's wording for the moment, and try and get some legal consultation about what would really achieve the two goals of satisfying all the authors' and the journal's requirements, because otherwise this is all a bit meaningless. Thanks to everyone for their input on this. CHEERS> SAM From clausen at gnu.org Tue Feb 17 03:55:12 2004 From: clausen at gnu.org (Andrew Clausen) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] IFL: memorability as overlay In-Reply-To: <1076986274.26007.31.camel@monster.omnifarious.org> References: <11711810-5E52-11D8-9EC6-000393455590@panix.com> <200402132248.55559.list@waterken.net> <1076986274.26007.31.camel@monster.omnifarious.org> Message-ID: <20040217035512.GC580@gnu.org> On Mon, Feb 16, 2004 at 06:51:14PM -0800, Eric M. Hopper wrote: > On Fri, 2004-02-13 at 22:48, Tyler Close wrote: > > Although not itself centralized, PageRank depends upon a > > centralized infrastructure. In particular, PageRank depends upon > > the artificial scarcity, and resultant cost, of identity (ie: > > having a hostname). > > No hostname is required, just an IP. From what I know, PageRank would > still work if everybody used IP addresses instead of hostnames in URLs. PageRank requires you to allocate "initial karma" to some subset of pages. You could use the domain system, IP addresses, the Open Directory, or any combination of these to seed this. I am not aware of any evidence of what Google actually does use. > Though, that would break half the websites out there as many use virtual > hosting. It wouldn't "break" them, it would just mean they wouldn't get as much "initial karma". > Though, I suppose you could say that having an IP that doesn't change > over time is costly and therefore a scarce commodity. Yeah, probably scarcer than domain names. But, IP address allocation is also centralized. 
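[Editorial sketch] The "initial karma" idea above can be illustrated with a personalized PageRank iteration, where teleport mass goes only to a trusted seed set. Everything here is an assumption for illustration: the toy link graph, the damping factor of 0.85, and the dangling-node policy are not claims about what Google does.

```python
# Hedged sketch of personalized PageRank: "initial karma" (teleport mass)
# is restricted to a seed set, so pages unreachable from the seeds get none.

def pagerank(links, seeds, damping=0.85, iters=50):
    """links: {node: [outgoing nodes]}; seeds: nodes receiving initial karma."""
    nodes = list(links)
    seed_mass = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(seed_mass)  # start with all karma on the seed set
    for _ in range(iters):
        nxt = {n: (1 - damping) * seed_mass[n] for n in nodes}  # teleport step
        for n in nodes:
            out = links[n] or nodes  # dangling node: spread evenly (one choice)
            share = damping * rank[n] / len(out)
            for m in out:
                nxt[m] += share
        rank = nxt
    return rank

# A cycle of three seeded pages plus a self-linking page nobody links to:
links = {"a": ["b"], "b": ["c"], "c": ["a"], "spam": ["spam"]}
r = pagerank(links, seeds={"a", "b", "c"})
# the self-linking "spam" node accumulates no rank, since no karma is ever
# teleported to it and nothing in the seeded component links to it
```

The point of the sketch is Andrew's: whoever controls the seed set controls where karma can flow, so the scarcity lives in the seeding mechanism, not in hostnames per se.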
Cheers, Andrew From hopper at omnifarious.org Tue Feb 17 08:20:22 2004 From: hopper at omnifarious.org (Eric M. Hopper) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <20040203160155.G78403-100000@localhost> References: <20040203160155.G78403-100000@localhost> Message-ID: <1077006022.26007.76.camel@monster.omnifarious.org> On Tue, 2004-02-03 at 08:13, Jim Dixon wrote: > On Tue, 3 Feb 2004, Benja Fallenstein wrote: > > > does anybody know references for using cryptographic hashes as unique > > identifiers for files in very large repositories (think all of the Web)? > > The references I've found (e.g. Handbook of Applied Cryptography) don't > > talk explicitly about that, but only about applications in message > > authentication, and attacks related to that; of course that's related, > > but it would be nice to know whether there are references from > > cryptology talking explicitly about hashes as unique identifiers in very > > large collections of messages. > > No references really needed. You can use SHA digests to generate 160 > bit/20 byte keys which can be used as unique identifiers. While it is > theoretically possible that one message or other document could hash to > the same digest as another, chances are approximately 10^16 against it, so > we needn't worry in our lifetimes. Actually let's redo this estimate and come up with a better number... Let's say there are approximately 275 billion documents on the web. That's close to 2^38. This means there are 2^76 relations between two different documents, each of which is a potential pair of documents with the same hash. Assuming SHA1 has a flat distribution or is 'balanced', there's an even chance for all 2^160 possible values. 2^160 / 2^76 = 2^84 This is a chance of about 1 in 2^84 (or 10^25) of two documents currently existing that have the same hash. This is extremely tiny. 
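[Editorial sketch] The scheme Jim describes and the odds Eric computes can be written down in a few lines. This is a minimal illustration using Python's standard hashlib; the document count and the balance assumption are the thread's own back-of-the-envelope figures, not measured facts.

```python
import hashlib
from math import log2

# Content-addressing sketch: a document's identifier is its SHA-1 digest,
# a 160-bit / 20-byte key, as Jim suggests.
def doc_id(document: bytes) -> str:
    return hashlib.sha1(document).hexdigest()

print(doc_id(b"hello p2p"))  # 40 hex characters = 160 bits

# Eric's estimate: ~275 billion documents is close to 2^38, giving about
# (2^38)^2 = 2^76 ordered pairs, each a chance at a collision out of
# 2^160 equally likely digests (assuming the hash is 'balanced').
n_docs = 2 ** 38
pairs = n_docs ** 2
odds = pairs / 2 ** 160
print(log2(odds))  # -84.0, i.e. about 1 in 2^84, roughly 1 in 10^25
```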
The chances of us being wiped out by an asteroid in the next decade are probably at least a trillion times higher. For every factor of 2 increase in the number of documents, there is a factor of 4 increase in the chance of two of those documents having the same hash. It all depends on the chances you're willing to accept. And, how willing you are to accept current assumptions about the flatness in SHA1's output. IMHO, cryptographic hash functions receive too little analysis given their importance. The cryptographic community doesn't seem to feel that it's as fun to break a hash function as it is to break a block cipher. I think cryptographic hash functions are actually more important than block ciphers. And I think the way people are starting to use them as identifiers makes assurances as to the flatness of their output even more important than the uses they've previously been put to in cryptography. I've only seen one paper so far [1] (thanks to Zooko for pointing it out to me) that even lays out an approach for analyzing the flatness (or balance) of cryptographic hash functions. I've seen at least 2-3 papers analyzing Rijndael in great detail. But, these are just random opinions I hold. :-) Have fun (if at all possible), -- The best we can hope for concerning the people at large is that they be properly armed. -- Alexander Hamilton -- Eric Hopper (hopper@omnifarious.org http://www.omnifarious.org/~hopper) -- [1] "Hash Function Balance and its Impact on Birthday Attacks" (2002) Mihir Bellare, Tadayoshi Kohno http://citeseer.nj.nec.com/bellare02hash.html -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 185 bytes Desc: This is a digitally signed message part Url : http://zgp.org/pipermail/p2p-hackers/attachments/20040217/11ed482d/attachment.pgp From b.fallenstein at gmx.de Tue Feb 17 10:51:43 2004 From: b.fallenstein at gmx.de (Benja Fallenstein) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: <1077006022.26007.76.camel@monster.omnifarious.org> References: <20040203160155.G78403-100000@localhost> <1077006022.26007.76.camel@monster.omnifarious.org> Message-ID: <4031F23F.6070503@gmx.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Eric M. Hopper wrote: | Let's say there are approximately 275 billion documents on the web. | | That's close to 2^38. This means there are 2^76 relations between two | different documents, each of which has the potential for two documents | with the same hash. | | Assuming SHA1 has a flat distribution or is 'balanced', there's an even | chance for all 2^160 possible values. | | 2^160 / 2^76 = 2^84 | | This is a chance of about 1 in 2^84 (or 10^25) of two documents | currently existing that have the same hash. I got confused by your way of putting it, although on closer inspection you are correct. The formula that I have in mind (which gives an upper bound to the probability of a hash collision) is ~ (collision probability) < (number of docs)^2 / (possible hashes) I.e., in the example, ~ probability < (2^38)^2 / 2^160 = 2^(-84) or one in 2^84. | This is extremely tiny. | The chances of us being wiped out by an asteroid in the next decade are | probably at least a trillion times higher. This is an example I have been using too! ;-) I also find comparing this to the risks we think acceptable when designing nuclear power plants a good point. A problem this line of argument suffers from is that for neither of the two points can I come up with an article that actually puts numbers on these risks. 
It would be more convincing if you could show how large these probabilities are compared to that of a hash collision. | For every factor of 2 increase in the number of documents, there is a | factor of 4 increase in the chance of two of those documents having the | same hash. (As an upper bound that is close enough to the actual value for the difference not to be interesting.) | IMHO, cryptographic hash functions receive too little analysis given | their importance. The cryptographic community doesn't seem to feel that | it's as fun to break a hash function as it is to break a block cipher. | I think cryptographic hash functions are actually more important than | block ciphers. And I think the way people are starting to use them as | identifiers makes assurances as to the flatness of their output even | more important than the uses they've previously been put to in | cryptography. | | I've only seen one paper so far [1] (thanks to Zooko for pointing it out | to me) that even lays out an approach for analyzing the flatness (or | balance) of cryptographic hash functions. I've seen at least 2-3 papers | analyzing Rijndael in great detail. I agree very much. I have not seen any literature about the assumptions behind the design of hash functions, either; papers just say, "Let there be a hash function, and the algorithm shall be as follows," but I have not yet seen any discussion of what combinations of steps make us believe that it is hard to break a function, and why. 
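[Editorial sketch] The quadratic scaling under discussion (double the documents, roughly quadruple the collision chance) can be checked empirically by weakening the hash until collisions become visible. Everything here is an arbitrary assumption for illustration: the 16-bit truncation of SHA-1, the trial count, and the fixed seed.

```python
import hashlib
import random

# Empirical check of the (number of docs)^2 / (possible hashes) birthday
# bound, using a deliberately tiny 16-bit hash (the first two bytes of
# SHA-1) so that collisions actually occur at observable rates.

def tiny_hash(doc: bytes) -> int:
    return int.from_bytes(hashlib.sha1(doc).digest()[:2], "big")  # 2^16 values

def collision_rate(n_docs: int, trials: int = 2000) -> float:
    """Fraction of trials in which n_docs random documents collide."""
    rng = random.Random(42)  # fixed seed so the experiment is repeatable
    hits = 0
    for _ in range(trials):
        seen = set()
        for _ in range(n_docs):
            h = tiny_hash(rng.randbytes(16))
            if h in seen:
                hits += 1
                break
            seen.add(h)
    return hits / trials

p_64 = collision_rate(64)    # upper bound: 64**2 / 2**16 ~ 0.0625
p_128 = collision_rate(128)  # upper bound: 128**2 / 2**16 = 0.25
# doubling the number of documents roughly quadruples the observed
# collision rate, and both rates stay below the quadratic upper bound
```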
It would be cool if a cryptography PhD student read this list and took your mail as an invitation ;-) - - Benja -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFAMfI/UvR5J6wSKPMRAi5DAJ0cHkX6eLHggj2wTcE+8Aesna43ewCfdxWK mDyT+OG6bRqtPcKNdXXl6pA= =H+Lo -----END PGP SIGNATURE----- From zooko at zooko.com Tue Feb 17 12:11:21 2004 From: zooko at zooko.com (Zooko O'Whielacronx) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] References for using hashes as unique identifiers? In-Reply-To: Message from Benja Fallenstein of "Tue, 17 Feb 2004 12:51:43 +0200." <4031F23F.6070503@gmx.de> References: <20040203160155.G78403-100000@localhost> <1077006022.26007.76.camel@monster.omnifarious.org> <4031F23F.6070503@gmx.de> Message-ID: Once upon a time there was a discussion about the risk of hash collision in Mojo Nation on the Mojo Nation devel mailing list. Hal Finney said that such extreme probabilities were beyond our ability to engineer. I agreed and said that there was a higher chance that our hash-collision-handling code would be buggy than that there would be a hash collision. Greg Smith wasn't satisfied with this and went ahead and implemented a simple routine to detect and clean up after a hash collision. Shortly thereafter we discovered that there was a sporadic bug in this routine which caused it to trigger sometimes even in the absence of a collision. Rather than fixing the bug, Greg removed the routine. Regards, Zooko From yo0ga at yahoo.com Tue Feb 17 17:48:40 2004 From: yo0ga at yahoo.com (yoga avidia sudarma) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] need a password Message-ID: <20040217174840.95803.qmail@web9405.mail.yahoo.com> 1. IDEALWINA@YAHOO.COM 2. CARENPM@YAHOO.COM --------------------------------- Do you Yahoo!? Yahoo! Finance: Get your refund fast by filing online -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://zgp.org/pipermail/p2p-hackers/attachments/20040217/4f7b26f7/attachment.html From yo0ga at yahoo.com Tue Feb 17 20:04:32 2004 From: yo0ga at yahoo.com (yoga avidia sudarma) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] Re: p2p-hackers Digest, Vol 7, Issue 19 In-Reply-To: <20040217200007.AD2603FD71@capsicum.zgp.org> Message-ID: <20040217200432.10743.qmail@web9404.mail.yahoo.com> need help....... email me the password of: 1. idealwina@yahoo.com 2. carenpm@yahoo.com --------------------------------- Do you Yahoo!? Yahoo! Finance: Get your refund fast by filing online -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20040217/ce33a7c5/attachment.htm From eugen at leitl.org Thu Feb 19 16:35:31 2004 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] [Twisted-Python] Twisted and P2P (re: p2p, gnutella) (fwd from rdrb@123.cl) Message-ID: <20040219163531.GQ26194@leitl.org> ----- Forwarded message from RITA Y/O RODRIGO DIAZ Y/O BENENSON ----- From: RITA Y/O RODRIGO DIAZ Y/O BENENSON Date: Wed, 18 Feb 2004 21:51:51 -0300 To: twisted-python@twistedmatrix.com Subject: [Twisted-Python] Twisted and P2P (re: p2p, gnutella) X-Mailer: iPlanet Messenger Express 5.2 HotFix 1.21 (built Sep 8 2003) Reply-To: twisted-python@twistedmatrix.com Hi, I'm working on the Twistification of http://thecircle.org.au/ and on repairing the broken http://khashmir.sf.net if you would like to help, contact me at the email myname at elo dot utfsm dot cl knowing that myname is rodrigob (I hate spam) The TheCircle work is just at its beginning (design is done, code is starting, slowly); the khashmir repair is going fine (but no estimate can be given for the debugging, because the number of critical bugs is unknown). rodrigob. 
> From: stephan > Subject: [Twisted-Python] p2p, gnutella > Reply-To: twisted-python@twistedmatrix.com > > > > I am looking to add p2p functionality to my app. Does > anybody know > what the current status of twisted's gnutella implementation is? > > There seems to have been a semi-finished implementation in > 1.0.2alpha4 which > I can't find in 1.1.1 anymore. > > I might also be willing to implement missing parts but I would need > to get > a short briefing on how things are. > > thanks, > > _stephan > _______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python ----- End forwarded message ----- -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20040219/706940d7/attachment.pgp From Ashish_Vashishta at baylor.edu Tue Feb 24 03:54:41 2004 From: Ashish_Vashishta at baylor.edu (Vashishta, Ashish) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] P2P in NS Message-ID: Hi all I have to implement the YOID multicast protocol in NS. I am not sure how to begin with this P2P stuff in NS. There are several issues related to P2P and NS in general that I am not able to resolve. Can anyone provide some pointers to where I can go and look for P2P stuff for NS? The NS mailing list archive is not quite helpful for P2P simulations. Thanks in advance Ashish -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://zgp.org/pipermail/p2p-hackers/attachments/20040223/964653da/attachment.htm From sam at neurogrid.com Tue Feb 24 04:05:22 2004 From: sam at neurogrid.com (Sam Joseph) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] P2P in NS In-Reply-To: References: Message-ID: <403ACD82.7040809@neurogrid.com> Hi Ashish, I think the thing you want to check out is the PacketLevel P2P simulator, which can run on top of NS. I reviewed it as part of my paper on P2P simulations: http://www.p2pjournal.com/issues/November03.pdf The PLP2P simulator can be found at the following link: http://www.cc.gatech.edu/computing/compass/gnutella/ CHEERS> SAM Vashishta, Ashish wrote: > Hi all > > I have to implement YOID multicast protocol in NS. I am not sure of > how to begin with this P2P stuff in NS. There are several issues > related to P2P and NS in general that I am not able to resolve. > > Can anyone provide some pointers where I can go and look for P2P stuff > for NS. The NS mailing list archive is not quite helpful for P2P > simulations. 
> > > > Thanks in advance > > Ashish > > > >------------------------------------------------------------------------ > >_______________________________________________ >p2p-hackers mailing list >p2p-hackers@zgp.org >http://zgp.org/mailman/listinfo/p2p-hackers >_______________________________________________ >Here is a web page listing P2P Conferences: >http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > > From sam at neurogrid.com Tue Feb 24 08:51:00 2004 From: sam at neurogrid.com (Sam Joseph) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] CFP: AP2PC 2004 Message-ID: <403B1074.5080605@neurogrid.com> *** our apologies if you receive multiple copies of this e-mail *** Preliminary call for papers Third International Workshop on Agents and Peer-to-Peer Computing (AP2PC 2004) http://p2p.ingce.unibo.it/ to be held at AAMAS 2004 Third International Joint Conference on Autonomous Agents and MultiAgent Systems New York City, USA July 19 or 20, 2004 ---------- Overview ---------- Peer-to-peer (P2P) computing is attracting enormous media attention, spurred by the popularity of file sharing systems such as Napster, Gnutella, and Morpheus. The peers are autonomous, or as some call them, first-class citizens. P2P networks are emerging as a new distributed computing paradigm for their potential to harness the computing power of the hosts composing the network and make their under-utilized resources available to others. This possibility has generated a lot of interest in many industrial organizations which have already launched important projects. In P2P systems, peer and web services in the role of resources become shared and combined to enable new capabilities greater than the sum of the parts. This means that services can be developed and treated as pools of methods that can be composed dynamically. 
The decentralized nature of P2P computing makes it also ideal for economic environments that foster knowledge sharing and collaboration as well as cooperative and non-cooperative behaviors in sharing resources. Business models are being developed, which rely on incentive mechanisms to supply contributions to the system and methods for controlling free riding. Clearly, the growth and the management of P2P networks must be regulated to ensure adequate compensation of content and/or service providers. At the same time, there is also a need to ensure equitable distribution of content and services. Although researchers working on distributed computing, MultiAgent Systems, databases and networks have been using similar concepts for a long time, it is only recently that papers motivated by the current P2P paradigm have started appearing in high quality conferences and workshops. Research in agent systems in particular appears to be most relevant because, since their inception, MultiAgent Systems have always been thought of as networks of peers. The MultiAgent paradigm can thus be superimposed on the P2P architecture, where agents embody the description of the task environments, the decision-support capabilities, the collective behavior, and the interaction protocols of each peer. The emphasis in this context on decentralization, user autonomy, ease and speed of growth that gives P2P its advantages, also leads to significant potential problems. Most prominent among these problems are coordination: the ability of an agent to make decisions on its own actions in the context of activities of other agents, and scalability: the value of the P2P systems lies in how well they scale along several dimensions, including complexity, heterogeneity of peers, robustness, traffic redistribution, and so on. It is important to scale up coordination strategies along multiple dimensions to enhance their tractability and viability, and thereby to widen the application domains. 
These two problems are common to many large-scale applications. Without coordination, agents may waste their efforts, squander resources, and fail to achieve their objectives in situations requiring collective effort. This workshop will bring together researchers working on agent systems and P2P computing with the intention of strengthening this connection. Researchers from other related areas such as distributed systems, networks and database systems will also be welcome (and, in our opinion, have a lot to contribute). We seek high-quality and original contributions on the general theme of "Agents and P2P Computing". The following is a non-exhaustive list of topics of special interest: * Intelligent agent techniques for P2P computing * P2P computing techniques for multi-agent systems * The Semantic Web, Semantic Coordination Mechanisms and P2P systems * Scalability, coordination, robustness and adaptability in P2P systems * Self-organization and emergent behavior in P2P networks * E-commerce and P2P computing * Participation and Contract Incentive Mechanisms in P2P Systems * Computational Models of Trust and Reputation * Community of interest building and regulation, and behavioral norms * Intellectual property rights in P2P systems * P2P architectures * Scalable Data Structures for P2P systems * Services in P2P systems (service definition languages, service discovery, filtering and composition etc.) * Knowledge Discovery and P2P Data Mining Agents * P2P data management * Information ecosystems and P2P systems * Security issues in P2P networks * Pervasive computing based on P2P architectures (ad-hoc networks, wireless communication devices and mobile systems) * Grid computing solutions based on agents and P2P paradigms * Legal issues in P2P networks ------- Panel ------- The theme of the panel will be Conducting Business via P2P. 
P2P computing has had some visible successes in applications such as file sharing, but many of these applications have had a consumer or hobbyist focus. This panel will discuss emerging "mission-critical" applications of P2P and the challenges that P2P technologies must surmount in order to best support such applications. These challenges include security, trust and reputation, representing business protocols, checking compliance, bootstrapping systems, and performance. The panel will involve 10-minute presentations by four panelists followed by a discussion session involving the audience. ------------------ Important dates ------------------ Abstract: 1st April 2004 (see submission instructions below) Paper submission: 6th April 2004 Acceptance notification: 1st May 2004 Workshop: 19th or 20th July 2004 Camera-ready for Post-proceedings: 31st August 2004 --------------- Registration --------------- Accommodation and workshop registration will be handled by the AAMAS 2004 organization along with the main conference registration. --------------------------- Submission instructions --------------------------- Unpublished papers should be formatted according to the LNCS/LNAI author instructions for proceedings and they should not be longer than 12 pages (about 5000 words including figures, tables, references, etc.). The abstract and then the paper should be submitted electronically HERE according to the deadlines mentioned above. In case of problems, submit the abstract and paper (pdf), according to the deadlines mentioned above, to submission@ingce.unibo.it by specifying in both emails: paper's author(s), title, contact author and at most 5 keywords/topics. ------------- Publication ------------- Accepted papers will be distributed to the workshop participants as workshop notes. 
Post-proceedings of the revised papers (namely accepted papers presented at the workshop) will be published by Springer - Lecture Notes in Computer Science series (LNCS) Here are the volumes of revised and invited papers of preceding editions: LNCS volume no. 2530 for AP2PC'2002 LNCS volume no. 2872 for AP2PC'2003 (publication in progress) ------------- Organizers ------------- Program Co-chairs Karl Aberer École Polytechnique Fédérale de Lausanne (EPFL) CH-1015 Lausanne, Switzerland Tel. +41-21-693 4679 - Fax +41-21-693 8115 E-mail: karl.aberer@epfl.ch Sonia Bergamaschi Dept. of Science Engineering, University of Modena and Reggio-Emilia, Italy via Vignolese, 905 - 41100 Modena Italy Tel. +39 059 2056132 - Fax +39 059 2056126 E-mail: bergamaschi.sonia@unimo.it Gianluca Moro (main contact) Dept. of Electronics, Computer Science and Systems, University of Bologna, Italy Via Venezia, 52 - I-47023 Cesena (FC) Tel. +39 0547 339237 - Fax +39 0547 339208 E-mail: gmoro@deis.unibo.it ------------- Panel Chair ------------- Munindar P. Singh Dept. of Computer Science, North Carolina State University, USA Venture I, Suite 110 / Box 7535 - Raleigh, NC 27695-7535 Tel. +1 919 515.5677 - Fax +1 919 515.7896 E-mail: mpsingh@eos.ncsu.edu ---------------------- Steering Committee ---------------------- Karl Aberer, EPFL, Lausanne, Switzerland Sonia Bergamaschi, University of Modena and Reggio-Emilia, Italy Manolis Koubarakis, Technical University of Crete Paul Marrow, Intelligent Systems Laboratory, BTexact Technologies, UK Gianluca Moro, University of Bologna, Cesena, Italy Aris M. Ouksel, University of Illinois at Chicago, USA Claudio Sartori, University of Bologna, Italy Munindar P. 
Singh, North Carolina State University, USA ---------------------------------- Web Master of Review System ---------------------------------- Sam Joseph Laboratory for Interactive Learning Technology (LILT), University of Hawaii E-mail: srjoseph@hawaii.edu --------------- Sponsorships --------------- Khaled Nagi Computer Science Dept., Alexandria University, E-mail: khaledn@acm.org ---------------------- Program committee ---------------------- Karl Aberer, EPFL, Lausanne, Switzerland Sonia Bergamaschi, University of Modena and Reggio-Emilia, Italy Jon Bing, University of Oslo, Norway M. Brian Blake, Georgetown University, USA Rajkumar Buyya, University of Melbourne, Australia Ooi Beng Chin, National University of Singapore, Singapore Paolo Ciancarini, University of Bologna, Italy Costas Courcoubetis, Athens University of Economics and Business, Greece Yogesh Deshpande, University of Western Sydney, Australia Asuman Dogac, Middle East Technical University, Turkey Boi V. Faltings, EPFL, Lausanne, Switzerland Maria Gini, University of Minnesota, USA Dina Q. Goldin, University of Connecticut, USA Chihab Hanachi, University of Toulouse, France Mark Klein, Massachusetts Institute of Technology, USA Matthias Klusch, DFKI, Saarbrucken, Germany Yannis Labrou, PowerMarket Inc., USA Tan Kian Lee, National University of Singapore, Singapore Dejan Milojicic, Hewlett Packard Labs, USA Alberto Montresor, University of Bologna, Italy Luc Moreau, University of Southampton, UK Jean-Henry Morin, University of Geneve, Switzerland John Mylopoulos, University of Toronto, Canada Andrea Omicini, University of Bologna, Italy Maria Orlowska, University of Queensland, Australia Aris. M. Ouksel, University of Illinois at Chicago, USA Mike Papazoglou, Tilburg University, Netherlands Terry R. 
Payne, University of Southampton, UK Paolo Petta, Austrian Research Institute for AI, Austria, Jeremy Pitt, Imperial College, UK Dimitris Plexousakis, Institute of Computer Science, FORTH, Greece Martin Purvis, University of Otago, New Zealand Omer F. Rana, Cardiff University, UK Katia Sycara, Robotics Institute, Carnegie Mellon University, USA Douglas S. Reeves, North Carolina State University, USA Thomas Risse, Fraunhofer IPSI, Darmstadt, Germany Pierangela Samarati, University of Milan, Italy Giovanni Sartor, CIRSFID, University of Bologna, Italy, Christophe Silbertin-Blanc, University of Toulouse, France Maarten van Steen, Vrije Universiteit, Netherlands Markus Stumptner, University of South Australia, Australia Peter Triantafillou, Technical University of Crete, Greece Anand Tripathi, University of Minnesota, USA Vijay K. Vaishnavi, Georgia State University, USA Francisco Valverde-Albacete, Universidad Carlos III de Madrid, Spain Maurizio Vincini, University of Modena and Reggio-Emilia, Italy Fang Wang, BTexact Technologies, UK Gerhard Weiss, Technische Universitaet, Germany Bin Yu, North Carolina State University, USA Franco Zambonelli, University of Modena and Reggio-Emilia, Italy From p2p at garethwestern.com Tue Feb 24 11:08:35 2004 From: p2p at garethwestern.com (p2p@garethwestern.com) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] Web Service Interface for P2P File Storage Message-ID: <1077620915.403b30b3e5818@www.garethwestern.com> Hi, I'm developing a web service interface for a p2p storage system. Is anyone aware of any existing work in this area? The XSpace implementation on xmethods.net seems to be similar to what I am looking for, only I am trying to use a distributed p2p storage space, such as PAST, instead. Also, which storage networks do you recommend? I am currently starting to experiment with PAST and Bamboo, with plans to also try an implementation of Chord as soon as the first ones are developed. Thanks for your advice! 
Cheers, Gareth From eugen at leitl.org Thu Feb 26 11:00:18 2004 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] FYI: [Twisted-Python] ANN: Twisted 1.2.0 (fwd from itamar@itamarst.org) Message-ID: <20040226110017.GI32608@leitl.org> In the (admittedly unlikely) case you were unaware of this excellent P2P framework. ----- Forwarded message from Itamar Shtull-Trauring ----- From: Itamar Shtull-Trauring Date: Wed, 25 Feb 2004 23:12:38 -0500 To: twisted-python@twistedmatrix.com Subject: [Twisted-Python] ANN: Twisted 1.2.0 Organization: http://itamarst.org X-Mailer: Ximian Evolution 1.4.5 Reply-To: twisted-python@twistedmatrix.com Twisted is an event-driven networking framework for server and client applications. For more information, visit http://www.twistedmatrix.com, join the list http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python or visit us on #twisted at irc.freenode.net. The Twisted from Scratch tutorial is a good starting point for learning Twisted: http://twistedmatrix.com/documents/howto/tutorial What's New in 1.2.0 =================== - SFTP server implementation for the SSH server. - Improved wxPython support. - IMAPv4 enhancements and bug fixes. - Allow disabling display of tracebacks in error web pages. - ident protocol implementation (client and server). - Support mapping arbitrary child FDs when running processes on POSIX. - Initial SOAP client support (using SOAPpy). - Partial download support for FTP client. - Web framework now supports different handlers for different methods (e.g. GET or POST). - Coverage support in the trial testing framework. - Bug fixes and documentation and feature enhancements. 
_______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python ----- End forwarded message ----- -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20040226/9e1b35d2/attachment.pgp From rohit_bhalla2002 at yahoo.com Sat Feb 28 06:12:43 2004 From: rohit_bhalla2002 at yahoo.com (Rohit Bhalla) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] NS and P2P Message-ID: <20040228061243.8410.qmail@web60207.mail.yahoo.com> Hello, I found in the ns-users archives some messages about implementing P2P stuff in NS. Has anyone already implemented one of these types of agents (Narada, Yoid, Gnutella) successfully? If yes, I would appreciate it very much if someone could send the code to me... Thanks for your help! Rohit __________________________________ Do you Yahoo!? Get better spam protection with Yahoo! Mail. http://antispam.yahoo.com/tools From ncbgroups at yahoo.com.br Sat Feb 28 06:15:28 2004 From: ncbgroups at yahoo.com.br (=?iso-8859-1?q?Nilton=20Braga?=) Date: Sat Dec 9 22:12:38 2006 Subject: [p2p-hackers] NS and P2P In-Reply-To: <20040228061243.8410.qmail@web60207.mail.yahoo.com> Message-ID: <20040228061528.77654.qmail@web40017.mail.yahoo.com> Hi! I'm also interested in this code, if someone knows where to find it. Thank you. 
--- Rohit Bhalla wrote:
> I found some messages in the ns-users archives about implementing P2P
> protocols in NS. Has anyone successfully implemented one of these types
> of agents (Narada, Yoid, Gnutella)? [...]

From bradneuberg at yahoo.com Sun Feb 29 04:30:50 2004
From: bradneuberg at yahoo.com (Brad Neuberg)
Date: Sat Dec 9 22:12:38 2006
Subject: [p2p-hackers] NS By P2P
In-Reply-To: <20040228061528.77654.qmail@web40017.mail.yahoo.com>
Message-ID: <20040229043050.84108.qmail@web60703.mail.yahoo.com>

What do you mean by NS? Do you mean Netscape and Mozilla browser integration? If that's what you are interested in, I can provide lots of tips on Mozilla-P2P integration (and a little about IE-P2P integration).

Brad

--- Nilton Braga wrote:
> I'm also interested in this code, if someone knows where to find it. [...]

From ncbgroups at yahoo.com.br Sun Feb 29 05:10:26 2004
From: ncbgroups at yahoo.com.br (Nilton Braga)
Date: Sat Dec 9 22:12:38 2006
Subject: [p2p-hackers] NS By P2P
In-Reply-To: <20040229043050.84108.qmail@web60703.mail.yahoo.com>
Message-ID: <20040229051026.2294.qmail@web40017.mail.yahoo.com>

Well, in fact, when I said NS I was referring to the ns-2 software (Network Simulator - www.isi.edu/nsnam/ns/). But if you have tips about P2P-browser integration, I'm really interested. Could you send those tips?

Thank you.

> What do you mean by NS? Do you mean Netscape and Mozilla browser
> integration? If that's what you are interested in, I can provide lots
> of tips on Mozilla-P2P integration (and a little about IE-P2P
> integration). [...]
From lujianming at software.ict.ac.cn Sun Feb 29 06:37:43 2004
From: lujianming at software.ict.ac.cn (Jimmy)
Date: Sat Dec 9 22:12:38 2006
Subject: [p2p-hackers] any opensource application or simulation based on CAN?
Message-ID: <20040229064030.5F41D3FC37@capsicum.zgp.org>

Hi everybody,

I am working on text retrieval over P2P and am going to use CAN as the underlying P2P overlay network. It seems hard to find an open-source application based on CAN, or a good open-source CAN simulator. Does anybody know of any open-source application or simulation based on CAN?

Good luck ^_^.

Jimmy.Lud.
lujianming@software.ict.ac.cn
2004-02-29
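[Editor's note: for readers who, like Jimmy, want to experiment before locating a full implementation: CAN (the Content-Addressable Network of Ratnasamy et al.) maps keys to points in a d-dimensional torus partitioned into zones, and routes greedily to the neighbor whose zone is closest to the key's point. Below is a toy sketch of that routing step over a static 2-d grid of zones; the class and function names are invented here, and real CAN splits zones dynamically as nodes join rather than using a fixed grid.]

```python
import math

class CanNode:
    """One zone in the CAN coordinate space, identified by its center.
    (Sketch only: real CAN splits zones dynamically as nodes join.)"""
    def __init__(self, center):
        self.center = center      # point in the unit 2-torus [0,1) x [0,1)
        self.neighbors = []       # nodes owning adjoining zones

def torus_dist(a, b):
    """Euclidean distance on the unit torus (each axis wraps around)."""
    return math.sqrt(sum(min(abs(x - y), 1.0 - abs(x - y)) ** 2
                         for x, y in zip(a, b)))

def route(start, key_point, max_hops=64):
    """Greedy CAN routing: hop to the neighbor closest to key_point
    until no neighbor is closer than the current node."""
    node, path = start, [start]
    for _ in range(max_hops):
        best = min([node] + node.neighbors,
                   key=lambda n: torus_dist(n.center, key_point))
        if best is node:          # our own zone is closest: the key lives here
            return path
        node = best
        path.append(node)
    raise RuntimeError("routing did not converge")

def make_grid(n):
    """A static n x n partition of the torus, each zone linked to its
    four grid neighbors (a stand-in for CAN's join protocol)."""
    grid = {(i, j): CanNode(((i + 0.5) / n, (j + 0.5) / n))
            for i in range(n) for j in range(n)}
    for (i, j), node in grid.items():
        node.neighbors = [grid[((i + di) % n, (j + dj) % n)]
                          for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))]
    return grid
```

On an n x n grid each hop strictly reduces the torus distance to the key, giving O(n) = O(sqrt(N)) hops for N zones, which matches CAN's O(d * N^(1/d)) routing bound at d = 2.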