From dmarti at zgp.org Fri Feb 23 14:55:02 2001 From: dmarti at zgp.org (Don Marti) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] (no subject) Message-ID: <20010223225408.E5E703FC21@capsicum.zgp.org> Fri Feb 23 14:54:08 PST 2001 From bram at gawth.com Mon Feb 26 20:49:01 2001 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Hello p2p hackers Message-ID: Hello everyone, I wanted to check who's subscribed to this list so far. I'm Bram, one of the people working on Mojo Nation, which I will happily chew everyone's ear off with just a little prompting. Who else is on here? -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From zooko at zooko.com Mon Feb 26 21:11:01 2001 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Hello p2p hackers In-Reply-To: Message from Bram Cohen of "Mon, 26 Feb 2001 20:48:36 PST." References: Message-ID: > I'm Bram, one of the people working on Mojo Nation, which I will happily > chew everyone's ear off with just a little prompting. Heh heh heh... Okay... What was that you were saying about a new method of doing replay attack prevention on IRC today? I remain enamoured of my own method (which provides full scale extensible fail-safe behaviour[1] as defined by Li Gong[2] and Paul Syverson), so I would like to hear about alternatives. Regards, Zooko [1] "Fail-Stop Protocols: An Approach to Designing Secure Protocols" http://citeseer.nj.nec.com/49099.html [2] Fun footnote for p2p fans: Li Gong was the chief security architect for Java, and is now leading Sun's nebulous p2p tech platform that Bill Joy talked about at the O'Reilly conference. 
From wesley at felter.org Mon Feb 26 22:11:01 2001 From: wesley at felter.org (Wesley Felter) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Hello p2p hackers In-Reply-To: Message-ID: On Mon, 26 Feb 2001 zooko@zooko.com wrote: > [2] Fun footnote for p2p fans: Li Gong was the chief security architect > for Java, and is now leading Sun's nebulous p2p tech platform that Bill > Joy talked about at the O'Reilly conference. And considering that Sun apparently isn't planning to answer any of their email about Jxta until the spec/code is released, I don't have a lot of faith in its security model. Maybe the third time will be the charm, though. While I'm at it, I'll pipeline in an introduction: I'm not working on a P2P system; instead I read the docs and protocols for as many of the existing ones as I can and try to learn some lessons from them. Then I try to convince other people to learn those lessons, too. It's somewhat annoying to have to correct the To: line since the Reply-To is not the list... Wesley Felter - wesley@felter.org - http://felter.org/wesley/ From md98-osa at nada.kth.se Tue Feb 27 07:30:02 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) In-Reply-To: <200102271059.KAA04195@longitude.doc.ic.ac.uk> Message-ID: (I'm crossposting this into the p2p-hackers list, since it isn't freenet-dev stuff.) On Tue, 27 Feb 2001, Theodore Hong wrote: > Oskar Sandberg wrote: > > I didn't think the Oceanstore paper was very interesting. Like you say, > > the naming scheme is like ours (and that is pretty much the obvious way > > of doing it). 
The interesting part is the paper they reference for their > > global routing system (which I first thought was just a hypercube mesh, > > but which is actually a lot more complicated): > > > > http://www.cs.utexas.edu/users/plaxton/ps/1999/tocs.ps > > yeah, what you told me at the conference didn't sound that great, but it is > more sophisticated -- from what I gather, the network is covered by a large > number of overlapping trees. Each tree, which corresponds to some object > GUID, covers all the nodes, but with different orderings. To find an > object, you traverse the appropriate tree upwards to its root, and then > downwards to the location of the object. Along the way, however, if you > encounter a downwards reference to the location of the object, go straight > there. Thus the root can be corrupt, but it doesn't matter -- the > important thing is that requests will converge towards the root and > hopefully intersect a storage reference. Actually, it doesn't seem that > dissimilar to Freenet, if you substitute "epicenter" for "root". I need to > go actually read the Plaxton paper, though, since they didn't lay out that > many details. Upon further thought, the actual protocol is not very different from what I told you. It is basically a hypercube-type search, though they modify it to allow for arbitrary dimensions (not hard) and the ability to decrease the number of hops by increasing the size of the table at every node (I don't know how to describe that geometrically). The further complication - going downward in the tree once the root for an object is found - is there, as far as I can tell, to satisfy their objective of minimizing a certain cost function in the final transfer - something that we do not deal with. I'm not sure that their model is at all fault tolerant to the roots of objects falling out.
And to the extent that it is, it has the problem that it is easy to identify the level of "rootiness" of any node for a piece of data - making targeted attacks easy. Also, for all of Oceanstore's big words about mobility of data, this offers little or no such thing as far as I can tell (which the Oceanstore people get around by adding the second Bloom filter level, but large parts of the web are being served by Bloom-filter-using Squid caches already - that hardly makes the data mobile). It is a nice model, but the routing within such a system feels uncomfortably rigid to me. I do have to look at it in more detail before passing any final judgement. > They also had a reference to some type of searching in encrypted data, > without revealing the search string? Presumably I guess you present some > encrypted string, and the algorithm tells you whether the string is present > in the data without decrypting either? That could be useful. I read the paper they reference some time ago, and it is interesting but 100% useless. Basically it is just encrypting every word as a separate block and having the searcher encrypt the search terms with the same key (but modified heavily so as to not suffer from the million holes in that version). The only application it might be useful for is ASP-like systems that keep the data secret from the ASP itself (which would be a cool thing, though not very related to what we are doing). In fact, that they presented this as a workable alternative for searching REALLY did a disservice to the Oceanstore paper in my eyes (they obviously had not really considered it).
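The scheme being dismissed here - deterministically encrypting each word as its own block, so that a searcher holding the same key can match blocks without decrypting anything - can be caricatured in a few lines. This is a sketch only: a keyed PRF stands in for the per-word encryption, and every name in it is invented.

```python
import hmac
import hashlib

KEY = b"shared secret key"  # hypothetical key known to owner and searcher

def index_words(text):
    # Deterministically transform each word into its own opaque block.
    return {hmac.new(KEY, w.encode(), hashlib.sha256).digest()
            for w in text.split()}

def contains(index, word):
    # The searcher applies the same keyed transform to the query and
    # matches blocks directly - nothing is ever decrypted.
    return hmac.new(KEY, word.encode(), hashlib.sha256).digest() in index

idx = index_words("peer to peer routing on overlapping trees")
print(contains(idx, "routing"))  # True
print(contains(idx, "freenet"))  # False
```

The determinism is also the weakness: identical words produce identical blocks, so word frequencies leak to the storage host - one of the "million holes" the hardened version of the scheme has to patch.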
> > theo > > > _______________________________________________ > Devl mailing list > Devl@freenetproject.org > http://www.uprizer.com/mailman/listinfo/devl > From md98-osa at nada.kth.se Tue Feb 27 07:50:01 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Re: [freenet-devl] Alpine, ELF In-Reply-To: <200102271104.LAA04215@longitude.doc.ic.ac.uk> Message-ID: (Also crossposted from freenet-dev into p2p-hackers.) On Tue, 27 Feb 2001, Theodore Hong wrote: > Peter Todd wrote: > > > Alpine is horribly inefficient. Being a all-to-all network topology > > where every search request is sent to *every* machine on the network > > it's bandwidth useage for any single node is n where n is the number > > of nodes in the network. Therefore the bandwidth usage for all of > > nodes is n^2, obviously horribly inefficient. > > Well, that's the point -- they claim they can do it: "The low overhead of a > DTCP connection means hundreds of thousands of concurrent connections can > be used by an application for direct communication with a large number of > peers." Can that be true? If capacity grows with n and the amount of messages with n^2, then it is easy enough to figure out how many nodes can be supported. If you want to support 1000 nodes, the amount of capacity added by each node must be 1000 times greater than the amount of messages generated by each node (per unit time) times the size of the message. If search messages are only 1 kB or so, and nodes in the network generate an average of 10 new searches per hour, then each node must make 10,000 kB of transfer capacity available per hour to handle that traffic. That is a comfortable background level for people with broadband connections (if not for the ISPs serving them).
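The arithmetic above is easy to check mechanically. A small sketch (the 1 kB message and 10 searches per hour are the stated assumptions; taking a "megabit" as 2^20 bits reproduces the 46,000-node figure quoted next):

```python
def required_capacity_kb_per_hour(n_nodes, searches_per_hour=10, message_kb=1):
    # Every search is flooded to every node, so the per-node load
    # grows linearly with the network size n.
    return n_nodes * searches_per_hour * message_kb

# 1000 nodes -> 10,000 kB of transfer per node per hour, as in the text.
print(required_capacity_kb_per_hour(1000))  # 10000

# How many nodes saturate a 1 megabit link?
link_kb_per_hour = (2**20 / 8 / 1024) * 3600           # 460,800 kB/hour
max_nodes = link_kb_per_hour / required_capacity_kb_per_hour(1)
print(int(max_nodes))  # 46080
```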
Continuing up, at 46,000 nodes you are saturating a 1 megabit connection - which would indicate (given that my numbers of 1 kB and 10 searches per hour were probably conservative) that the Alpine people are spouting turkey excrement. If you turn it the other way of course - even if your search horizon contains only 1000 people, that is certainly enough for many uses of P2P (including filesharing). I think that if the people working on Gnutella clones would just get their acts together, do the math, and code their systems with recognition that the network cannot scale, but that the horizon can still be large enough to satisfy most users, we would have a viable decentralized Napster alternative today... > > theo From orasis at acm.cs.umn.edu Tue Feb 27 08:05:01 2001 From: orasis at acm.cs.umn.edu (Justin Chapweske) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) In-Reply-To: ; from md98-osa@nada.kth.se on Tue, Feb 27, 2001 at 04:28:53PM +0100 References: <200102271059.KAA04195@longitude.doc.ic.ac.uk> Message-ID: <20010227100356.D16140@go.cs.umn.edu> > > > They also had a reference to some type of searching in encrypted data, > > without revealing the search string? Presumably I guess you present some > > encrypted string, and the algorithm tells you whether the string is present > > in the data without decrypting either? That could be useful. > > I read the paper they reference some time ago, and it is interesting > but 100% useless. Basically it is just encrypting every word as a > seperate block and having the searcher encrypt the search terms with > the same key (but modified heavily so as to not suffer from the > million holes in that version).
The only application it might be > useful for are ASP like systems that keep the data secret from the > ASP itself (which would be a cool thing, though not very related to > what we are doing). > Are you referring to Schneier's "Clueless Agents" paper (http://www.counterpane.com/clueless-agents.html)? You might also want to check out http://www.islandnet.com/~mskala/limdiff.html but I wouldn't trust it because it requires its own S-box construction. -- Justin Chapweske, Lead Swarmcast Developer, OpenCola Inc. http://www.sourceforge.net/projects/swarmcast/ From orasis at acm.cs.umn.edu Tue Feb 27 10:23:01 2001 From: orasis at acm.cs.umn.edu (Justin Chapweske) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] range bitmap compression and set operations Message-ID: <20010227122223.F16140@go.cs.umn.edu> Hey guys, I figure that some of you must have run into similar situations as I have and was wondering if anyone had some insights before I totally dive into this: I need to send various bitmaps across the network with the size being up to 65536 bits. The nice thing about these bitmaps is that they typically contain very nice ranges of 1's and 0's so they are very amenable to simple RLE-style encoding. For example a typical bitmap will simply have 1's for bits 0-1024,32768-34816 which could be encoded quite simply in XML using exactly that range format. Here is the interesting part: I want to be able to find an encoding of these bitmaps that allows me to VERY efficiently compute unions of two bitmap sets. I could of course expand the bitmaps and execute a simple AND between them, but I don't want to waste the memory. So does anyone know of or have any pointers to an RLE-style encoding that allows simple/efficient set operations on the encoded form????? -- Justin Chapweske, Lead Swarmcast Developer, OpenCola Inc.
http://www.sourceforge.net/projects/swarmcast/ From hal at finney.org Tue Feb 27 10:33:01 2001 From: hal at finney.org (hal@finney.org) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) Message-ID: <200102271829.KAA12263@finney.org> The Plaxton routing method used in OceanStore is interesting but it's not quite as described here. > > > http://www.cs.utexas.edu/users/plaxton/ps/1999/tocs.ps OceanStore is at http://oceanstore.cs.berkeley.edu/. As described in the OceanStore papers, every node gets a number. It appears to be important that no two nodes have the same number, and that the numbering be relatively "dense" - that is, there should not be too many numbers that don't have nodes. (So you can't just let each node pick a 160 bit random number, for example.) The routing scheme then goes from one node to another by number. It can be used for looking up data if the data item has a "home" on the node whose number corresponds to a (truncated) hash of the data item. Each node contains pointers to other nodes which have similar numbers. The example in the OceanStore paper uses base 16. Each node has a set of pointers to certain other nodes in the net. These pointers are organized into levels. Each node has 16 k-level pointers, to the 16 closest neighbors which match in the low order k digits. ("Closest" is in terms of ping time.) For example, node 0325 has 16 level-0 pointers to nodes of ___0, ___1, ___2, ... ___f, where the _ represent "don't care" digits. The pointers are to nearby nodes which match this pattern. So it might actually point to 19a0, 07f1, 6932, ..., 4cbf. Only the last digit matters. Then for the level 1 pointers, these will point to the 16 closest nodes which match in the lower digit: __05, __15, __25, ... __f5. Again, the first two digits don't matter and are based on whichever nodes are closest that satisfy this. It might point to a305, 2915, 0325, ..., 80f5. 
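The pointer table described above can be sketched concretely. A toy illustration (names invented; "closest" is faked with a numeric distance function, where a real node would measure ping times):

```python
def build_pointers(my_label, others, distance, digits="0123456789abcdef"):
    """Level-k slot for digit d points to the closest node whose label
    ends with: digit d followed by my label's low-order k digits."""
    table = {}
    for k in range(len(my_label)):
        low = my_label[len(my_label) - k:]  # my low-order k digits ("" at k=0)
        for d in digits:
            matches = [n for n in others if n.endswith(d + low)]
            if matches:
                table[(k, d)] = min(matches,
                                    key=lambda n: distance(my_label, n))
    return table

# Node 0325's level-0 slot for digit '8' points to some nearby ___8 node,
# and its level-1 slot for digit '0' to some nearby __05 node.
others = ["b4f8", "19a0", "07f1", "4cbf", "a305", "2915", "80f5"]
fake_distance = lambda a, b: abs(int(a, 16) - int(b, 16))  # stand-in for ping
table = build_pointers("0325", others, fake_distance)
print(table[(0, "8")])  # b4f8
print(table[(1, "0")])  # a305
```

Routing then just walks this table: at each hop, follow the level-k slot whose digit matches the target's (k+1)-th digit from the right, fixing one more low-order digit per hop.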
Then there are level 2 pointers of the form _025, _125, _225, ... and level 3 pointers of the form 0325, 1325, 2325, 3325, .... I'm not clear how this system of numbers and pointers is set up and maintained, particularly in a dynamic network where nodes are constantly joining and leaving. I need to read the Plaxton paper to learn more. Once you have the pointers, though, the routing is easy. Simply move through the network, setting the digits from right to left. The example in the paper goes from 0325 to 4598. The first step follows a level 0 pointer from 0325 to ___8, so that the right digit will be correct. It happens to go to b4f8. From this node we will follow a level 1 link to __98, which actually goes to 9098. From here we take a level 2 link to _598, which happens to go to 7598. And from here we can take a level 3 link directly to 4598. To the extent that we view this as a tree traversal, I see it as one where the root of the tree is the destination node, 4598. Its first-level children are those reachable via level-3 pointers: 0598, 1598, 2598, 3598, etc. The children of these nodes are the ones reachable to them via level-2 pointers. The children of 7598 are __98, one of which happens to be 9098. And the children of 9098 are the ones reachable to it by level-1 pointers, one of which is b4f8. Finally, the children of b4f8 are those reachable to it by level-0 pointers, one of which is 0325. In this view, what we did was walk straight up the tree, from leaf node to parent. At each step we got one more digit right. And the single data structure can be looked at as a different tree rooted at each different node. In this way it is similar to hypercube routing, as Oskar noted earlier. Hypercube routing usually is done base 2, but the Plaxton tree could be done that way as well, just substitute 2 for 16 above. The difference is that in the hypercube, we point to the nodes which differ from us in exactly one bit. 
If our address is 1010101, we point to 1010100 and 1010111 and 1010001, etc. But in a binary Plaxton tree we point to ______0, and to _____11, and to ____001, etc. It is like a "loose" hypercube, in that a number of the bit positions are unspecified, so we can pick a closer node. But the same basic routing algorithm (get 1 bit at a time right) is used. I think the main advantage you get from this looseness in the Plaxton system is that you can pick a closer node. Hypercube routing gets you there in a small number of steps, but each step may be long in physical space. With the Plaxton tree, the steps are closer, at least the first ones. (The last step is the same as the hypercube so it may not be particularly close.) That's because with Plaxton you choose your neighbors to be close nodes. You can get something of the same effect with a hypercube if you can map it to the geometry of your network, but since the surface of the earth is topologically 2-dimensional it is impossible to map a 20-dimensional hypercube onto it and maintain distance closeness. So the Plaxton tree should be faster than hypercube routing in practice. However it still seems like it should share some of the inefficiency, particularly in the last couple of steps where you won't have any physically nearby nodes that match yours so closely. You may end up hopping from Chicago to Timbuktu to London on your last three steps, even if the first 10 steps stayed in the U.S. Hal From hal at finney.org Tue Feb 27 11:23:01 2001 From: hal at finney.org (hal@finney.org) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] range bitmap compression and set operations Message-ID: <200102271919.LAA12444@finney.org> Justin Chapweske writes: > I need to send various bitmaps across the network with the size being up > to 65536 bits. The nice thing about these bitmaps is that they typically > contain very nice ranges of 1's and 0's so they are very ammenable to > simple RLE style encoding. 
For example a typical bitmap will simply have > 1's for bits 0-1024,32768-34816 which could be encoded quite simply in XML > using exactly that range format. > > Here is the interesting part: > > I want to be able to find an encoding of these bitmaps that allows me to > VERY efficiently compute unions of two bitmap sets. I could of course > expand the bitmaps and execute a simple AND between them, but I don't want > to waste the memory. > > So does anyone know of or have any pointers to an RLE-style encoding that > allows simple/efficient set operations on the encoded form????? It seems like your proposed encoding is well suited for unions. Have a list of start and end points, and do something like:

    for( ; ; ) {
        if( b->start < a->start ) {
            // Swap a and b
            t = a; a = b; b = t;
        }
        // Now a->start <= b->start
        o->start = a->start;
        for( ; ; ) {
            if( a->end < b->start ) {
                // No overlap: close the segment and consume a
                o->end = a->end;
                o++;
                ++a;
                break; // out of inner loop, back to top of outer loop
            }
            // Overlap
            if( b->end <= a->end ) {
                // b segment enclosed in a
                ++b;
                continue; // back to inner loop
            }
            // b extends beyond a
            // advance a, swap a and b, back to inner loop
            ++a;
            t = a; a = b; b = t;
        }
    }

This copies a union b into o, using fields ->start and ->end as the start and end of the ranges of 1's. It needs to be enhanced to detect the end of the data, but the basic idea is simple. We look at the next segment and see if it overlaps the current one.
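The same sweep is compact over inclusive (start, end) pairs in a higher-level language. A hedged sketch, with an intersection routine included as well since the encoding supports it just as directly:

```python
def union_ranges(a, b):
    """Union of two sorted lists of (start, end) inclusive ranges."""
    merged = []
    for start, end in sorted(a + b):
        if merged and start <= merged[-1][1] + 1:   # overlaps or touches
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(r) for r in merged]

def intersect_ranges(a, b):
    """Intersection, walking both sorted lists in step."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        # Advance whichever range ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

print(union_ranges([(0, 1024)], [(1000, 2000), (32768, 34816)]))
# [(0, 2000), (32768, 34816)]
print(intersect_ranges([(0, 1024), (32768, 34816)], [(500, 40000)]))
# [(500, 1024), (32768, 34816)]
```

Both run in a single pass over the encoded form, so neither ever expands the 65536-bit bitmap.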
Here is another way to do it, without swapping a and b, same idea:

    insegment = 0;
    for( ; ; ) {
        if( !insegment ) {
            if( a->start <= b->start ) {
                o->start = a->start;
                if( a->end < b->start ) {
                    o++->end = a++->end;
                } else
                    insegment = 1;
            } else {
                o->start = b->start;
                if( b->end < a->start ) {
                    o++->end = b++->end;
                } else
                    insegment = 1;
            }
        }
        else if( a->end < b->start ) {
            o++->end = a++->end;
            insegment = 0;
        }
        else if( b->end < a->start ) {
            o++->end = b++->end;
            insegment = 0;
        }
        else if( a->end < b->end )
            ++a;
        else
            ++b;
    }

I haven't tested any of this of course, it's just to show the general idea. I doubt you will come up with a data structure or algorithm that's much faster. Hal From bram at gawth.com Tue Feb 27 12:03:01 2001 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Hello p2p hackers In-Reply-To: Message-ID: On Mon, 26 Feb 2001 zooko@zooko.com wrote: > > > What was that you were saying about a new method of doing replay attack > prevention on IRC today? It has to do with moving towards connection-awareness in our communications. It's a bit involved to go into here, but I think there's a simple lesson - don't worry about the security in the first version of your system too much, you'll probably want to change it around later anyway. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From alk at pobox.com Tue Feb 27 12:30:01 2001 From: alk at pobox.com (Tony Kimball) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Re: [freenet-devl] Alpine, ELF References: <200102271104.LAA04215@longitude.doc.ic.ac.uk> Message-ID: <15004.3599.450174.566258@spanky.love.edu> Quoth Oskar Sandberg on Tuesday, 27 February: : : I think that if the people working on : Gnutella clones would just get there acts together, do the math, and : code their systems with recognition that the network cannot scale, : but that the horizon can still be large enough to satisfy most users...
If the population space is large enough and persistent enough, this will happen by annealing, inevitably: Interest pockets will form, consisting of servents with a lot of interest overlap. The preconditions for such an outcome aren't really there, though, in the current gnutella software. From orasis at acm.cs.umn.edu Tue Feb 27 12:58:01 2001 From: orasis at acm.cs.umn.edu (Justin Chapweske) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] range bitmap compression and set operations In-Reply-To: <200102271919.LAA12444@finney.org>; from hal@finney.org on Tue, Feb 27, 2001 at 11:19:20AM -0800 References: <200102271919.LAA12444@finney.org> Message-ID: <20010227145727.G16140@go.cs.umn.edu> > > > > Here is the interesting part: > > > > I want to be able to find an encoding of these bitmaps that allows me to > > VERY efficiently compute unions of two bitmap sets. I could of course > > expand the bitmaps and execute a simple AND between them, but I don't want > > to waste the memory. > > Shit! I meant intersect! Although quick unions are important as well for adding new ranges to the set so I much appreciate the feedback. Right now the approach I'm taking is to ensure that all lists are ordered and as compact as possible (no entries like "0,0-10,9-15") and then perform the operation, but for some reason I tend to think that this sort/compress phase may be unnecessary. -- Justin Chapweske, Lead Swarmcast Developer, OpenCola Inc. http://www.sourceforge.net/projects/swarmcast/ From orasis at acm.cs.umn.edu Tue Feb 27 13:06:01 2001 From: orasis at acm.cs.umn.edu (Justin Chapweske) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] range bitmap compression and set operations In-Reply-To: <20010227145727.G16140@go.cs.umn.edu>; from orasis@acm.cs.umn.edu on Tue, Feb 27, 2001 at 02:57:28PM +0000 References: <200102271919.LAA12444@finney.org> <20010227145727.G16140@go.cs.umn.edu> Message-ID: <20010227150506.H16140@go.cs.umn.edu> I found my answer in Perl (of course).
Set::IntSpan is exactly what I was looking for: http://www.infoboard.com/perldoc/modules/Set/IntSpan.html Sorry if some folks may find this off-topic, but I think that we will find various algorithms and data structures commonly appearing in our different systems. -- Justin Chapweske, Lead Swarmcast Developer, OpenCola Inc. http://www.sourceforge.net/projects/swarmcast/ From bram at gawth.com Tue Feb 27 13:14:01 2001 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] range bitmap compression and set operations In-Reply-To: <20010227150506.H16140@go.cs.umn.edu> Message-ID: On Tue, 27 Feb 2001, Justin Chapweske wrote: > I found my answer in Perl (of course). Set::IntSpan is exactly what I was > looking for: > > http://www.infoboard.com/perldoc/modules/Set/IntSpan.html > > Sorry if some folks may find this off-topic, but I think that we will find > various algorithms and data structures commonly appearing in our different > systems. This isn't coderpunks - you won't get flamed for talking about code here. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From md98-osa at nada.kth.se Tue Feb 27 17:24:01 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) In-Reply-To: <20010227100356.D16140@go.cs.umn.edu>; from orasis@acm.cs.umn.edu on Tue, Feb 27, 2001 at 10:03:56AM +0000 References: <200102271059.KAA04195@longitude.doc.ic.ac.uk> <20010227100356.D16140@go.cs.umn.edu> Message-ID: <20010228022531.A1131@hobbex.localdomain> On Tue, Feb 27, 2001 at 10:03:56AM +0000, Justin Chapweske wrote: > > I read the paper they reference some time ago, and it is interesting > > but 100% useless.
Basically it is just encrypting every word as a > > seperate block and having the searcher encrypt the search terms with > > the same key (but modified heavily so as to not suffer from the > > million holes in that version). The only application it might be > > useful for are ASP like systems that keep the data secret from the > > ASP itself (which would be a cool thing, though not very related to > > what we are doing). > > > > Are you referring to Schneier's "Clueless Agents" paper > (http://www.counterpane.com/clueless-agents.html)? You might also want to > check out http://www.islandnet.com/~mskala/limdiff.html but I wouldn't > trust it because it requires its own S-box construction. Neither, I'm referring to the paper regarding search on encrypted data that they reference from the Oceanstore paper. You can read it here: http://paris.cs.berkeley.edu/~dawnsong/papers/se.ps -- 'DeCSS would be fine. Where is it?' 'Here,' Montag touched his head. 'Ah,' Granger smiled and nodded. Oskar Sandberg md98-osa@nada.kth.se From md98-osa at nada.kth.se Wed Feb 28 09:03:01 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) In-Reply-To: <200102271829.KAA12263@finney.org>; from hal@finney.org on Tue, Feb 27, 2001 at 10:29:51AM -0800 References: <200102271829.KAA12263@finney.org> Message-ID: <20010228173753.B642@hobbex.localdomain> On Tue, Feb 27, 2001 at 10:29:51AM -0800, hal@finney.org wrote: > The Plaxton routing method used in OceanStore is interesting > but it's not quite as described here. > > > > http://www.cs.utexas.edu/users/plaxton/ps/1999/tocs.ps > > OceanStore is at http://oceanstore.cs.berkeley.edu/. < snip description > It seems to me that Oceanstore is using the system described in Plaxton's paper pretty much straight off. They claim to add more redundant links, but that isn't really a big change.
> I'm not clear how this system of numbers and pointers is set up and > maintained, particularly in a dynamic network where nodes are constantly > joining and leaving. I need to read the Plaxton paper to learn more. The above paper doesn't deal with adding nodes to the network at all. The Oceanstore paper says: "While existing work on Plaxton-like data structures did not include algorithms for online creation and maintenance of the global mesh, we have produced recursive node insertion and removal algorithms." The problem of giving a new node links at each level should be pretty trivial by just following the primary neighbor sequence of its ID from any node (finding the closest ones may be harder - though often you could just start from a node at your POP on the physical network and therefore get it). Giving other nodes links to a new node is probably more difficult, and definitely impossible in an environment where the new node could have less than honest intentions... <> > I think the main advantage you get from this looseness in the Plaxton > system is that you can pick a closer node. Hypercube routing gets > you there in a small number of steps, but each step may be long in > physical space. With the Plaxton tree, the steps are closer, at least > the first ones. (The last step is the same as the hypercube so it may > not be particularly close.) That's because with Plaxton you choose your > neighbors to be close nodes. > > You can get something of the same effect with a hypercube if you can map > it to the geometry of your network, but since the surface of the earth > is topologically 2-dimensional it is impossible to map a 20-dimensional > hypercube onto it and maintain distance closeness. So the Plaxton tree > should be faster than hypercube routing in practice.
> > However it still seems like it should share some of the inefficiency, > particularly in the last couple of steps where you won't have any > physically nearby nodes that match yours so closely. You may end up > hopping from Chicago to Timbuktu to London on your last three steps, > even if the first 10 steps stayed in the U.S. In a way the Plaxton system deals with this. It assumes that there is some person somewhere that is sharing the data. This person does an Insert, which places the data on each step following the routing from their location (in Plaxton's language, the primary neighbor sequence of the inserter). Then, when searching, a node does not just have the primary neighbors for each value and level, it also has a set of secondary neighbors, being some number d of the other nodes that could have been the primary neighbors but weren't closest. At each step in a read, it checks the secondary neighbors as well as the primary to see if they have the data (ie, if they were in the primary neighbor sequence of the inserter). If going through the root node (where the sequences from the inserter and the reader are guaranteed to intersect) is not the closest path between the inserter and reader, then the chances are high that it would have been found along the earlier path (don't trust me, most of the Plaxton paper deals with proving this). -- 'DeCSS would be fine. Where is it?' 'Here,' Montag touched his head. 'Ah,' Granger smiled and nodded. Oskar Sandberg md98-osa@nada.kth.se From hal at finney.org Wed Feb 28 10:21:01 2001 From: hal at finney.org (hal@finney.org) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) Message-ID: <200102281817.KAA16135@finney.org> Oskar writes: > The above paper doesn't deal with adding nodes to the network at all.
The > Oceanstore paper says: > > "While existing work on Plaxton-like data structures did not include > algorithms for online creation and maintenance of the global mesh, we have > produced recursive node insertion and removal algorithms." Of course "recursive" tells us nothing about the actual cost of such algorithms. I have read the Plaxton paper (the part about the algorithms, not the proofs!) now, and there are some subtleties in the Plaxton storage that look to me like they require global knowledge. > The problem of giving a new node links at each level should be pretty > trivial by just following the primary neighbor sequence of it's ID from > any node (finding the closest ones may be harder - though often you could > just start with from a node at your POP on the physical network and > therefore get it). I don't think this will work (starting with a nearby node) because the local node will have a different label than yours. If you are looking for __34 and his number is 4321, he won't have anything that matches. He'll forward to ___4, which will forward to __34, but this is now two jumps away and is not necessarily the closest matching node to you. That is, the returned __34 is the closest such node to the ___4 node, not to 4321 which is where you are. With higher level neighbors there are more intervening hops and maintaining closeness is even more questionable. If you just accept this as being "close enough" then this sloppiness will grow with each node insertion. It's also important that nodes not share labels, or at least that they find out if any other nodes have the same label. This can largely be done by simply querying the network for a given label, but it is vulnerable to race conditions and it's not clear what happens then. > Giving other nodes links to a new node is probably more > difficult, and definitely impossible in an environment where the new node > could have less than honest intentions... 
Yes, Wei Dai pointed out on the bluesky list that the system was highly vulnerable to data-erasing attacks, where a node is able to choose its own label which matches the ID of some document it wants to erase. It then gets that document assigned to it and is able to keep it off the network. Regarding inefficiency of routing: > In a way the Plaxton system deals with this. It assumes that there is some > person somewhere that is sharing the data. This person does an Insert, > which places the data on each step following the routing from their > location (in Plaxton's language, the primary neighbor sequence of the inserter). Yes, I see that Plaxton short-circuits the routing by spreading around pointers to the data. This makes it less likely that the last few hops (which are the expensive ones) will be needed to find the target node. The typical search involves hopping around in the low order parts of the tree, where nodes are physically close, until we find the pointer to the actual data and go there directly. Hence they are able to prove that they are within a constant bound of optimal. OceanStore appears to extend this by spreading the data around, not just references to it, but I need to read that part more carefully. Hal From markm at caplet.com Wed Feb 28 11:42:01 2001 From: markm at caplet.com (Mark S. Miller) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Re: Welcome to the "p2p-hackers" mailing list In-Reply-To: <20010228081920.0DBC93FCA5@capsicum.zgp.org> Message-ID: <5.0.2.1.2.20010228113411.04fa1cf0@shell9.ba.best.com> At 12:19 AM Wednesday 2/28/01, p2p-hackers-request@zgp.org wrote: >So here is a mailing list which I hope will continue the noble >tradition of fraternization among p2p hackers. In the noble tradition of openness, we should also make the archives visible to non-subscribers. That way, the archives serve as a valuable public record accessible to search engines, and for people to link into. 
I've been doing this on e-lang from the beginning, and have been very happy with the results. Cheers, --MarkM From zooko at zooko.com Wed Feb 28 11:42:02 2001 From: zooko at zooko.com (zooko@zooko.com) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] proposal re: scalability of block publication and fetching Message-ID: I was mildly stung by Oskar's assertion that Mojo Nation architecture is inherently non-scalable with respect to fetching a block whose id you know. (Which Freenet people apparently call "routing", I guess because they are thinking of app-level communication routing.) I am thinking about a change to the MN architecture which would be very very simple to deploy (for you Mojo Hackers, it would be simply a new handicapper plug-in). I will describe it in abstract terms, glossing over at least four implementation details that you don't need to know in order to tell me if this is scalable or not. Suppose that you have a network with nodes which hold blocks of data indexed by the SHA1 hash of the block. A node, `A', wants to publish a block of data, and then later a different node `B', who already knows the unique id of that block, wants to fetch the block. `A' knows the "phonebook info" for N other nodes, where the "phonebook info" consists of the public key and other information sufficient to communicate with that node. My proposal, which I call "MaskMatchingHandicapper", is that `A' chooses the log(N) nodes whose public key ids have the highest "mask match" with the id of the block. A "mask match" is currently the number of contiguous leading bits which all match, but any scalar comparison like Hamming distance or integer difference should work as well. `A' then sends the block to those log(N) nodes. Now later `B' wishes to fetch the block, whose id `B' already knows. `B' already knows M other nodes. `B' queries the top log(M) nodes based on mask match. 
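To make the selection rule concrete, here is a minimal sketch in Python (an illustration only, not actual Mojo Nation code; it assumes ids are 160-bit SHA1 values handled as integers, and the function names are made up):

```python
def mask_match(a: int, b: int, bits: int = 160) -> int:
    """Number of contiguous leading bits shared by the two ids."""
    x = a ^ b
    if x == 0:
        return bits
    # The highest set bit of the XOR marks the first disagreement.
    return bits - x.bit_length()

def pick_targets(block_id: int, phonebook: dict, count: int) -> list:
    """Rank the known nodes by mask match against the block id and
    return the top `count` of them.  `A' would publish to these nodes;
    `B' later queries its own top matches computed the same way."""
    ranked = sorted(phonebook,
                    key=lambda node_id: mask_match(block_id, node_id),
                    reverse=True)
    return ranked[:count]
```

In practice `count` would be roughly K * log(N) for the small constant K discussed below.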
(Note that you can consider the fact that `A' and `B' know possibly different sets of counterparties to be either a consequence of one or both of them having an incomplete knowledge of the net, or of `B' operating at a later time than `A', in which case some nodes will have come and gone.) Now what is the chance of success? It is the chance that at least one of the log(N) nodes published-to by `A' is also among the log(M) nodes queried by `B'. The key phrase in there is "at least one", and given some weak assumptions about the chance of an arbitrary node being reachable by one of the counterparties, we can easily gain a high confidence that at least one will be reachable by both simply by changing our "log(X)" to "K * log(X)" for some small constant K. (This has implications for the bandwidth usage and the performance, which is one of those issues that I'm glossing over, although we do have a solution to this already implemented and deployed in Mojo Nation. Actually four solutions, two of which we are grandfathering-out at this point... ;-)) There are (at least) two particular failure modes: 1. All log(N) of the nodes that `A' published to are unknown to `B'. 2. All log(M) of the nodes queried by `B' were unknown to `A'. Also combinations of the two. Note that the latter case (#2) could happen if the size of the network had ballooned dramatically between `A' publishing and `B' querying. But the network would have to actually _square_ in size before _all_ of the log(M) nodes queried by `B' were newbies. Now is this scalable? It seems obvious to me that it is, although it leaves open questions of practical performance (you do not want to publish log(N) times the size of your data), and the big question of how `B' learned the unique id of the block. (Both of these issues are already solved, of course, on Mojo Nation, but those solutions might not scale.) Regards, Zooko P.S. 
The idea of MaskMatchingHandicapper was inspired by an idea that Raph Levien posted to advogato concerning doing the same thing, but for plaintext meta-data instead of for blocks. From md98-osa at nada.kth.se Wed Feb 28 12:35:02 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] proposal re: scalability of block publication and fetching In-Reply-To: ; from zooko@zooko.com on Wed, Feb 28, 2001 at 11:39:16AM -0800 References: Message-ID: <20010228213550.B1736@hobbex.localdomain> On Wed, Feb 28, 2001 at 11:39:16AM -0800, zooko@zooko.com wrote: > > I was mildly stung by Oskar's assertion that Mojo Nation architecture > is inherently non-scalable with respect to fetching a block whose id > you know. Good :-). > (Which Freenet people apparently call "routing", I guess > because they are thinking of app-level communication routing.) Fetching (Freenet: Requesting Data) a piece of data from a known id (Freenet: key) is not what I refer to as routing, but a node deciding where to send a Request is. Since your system assumes that every node has global knowledge of the network, you don't have to route as such, but that is exactly what I'm saying is wrong. <> > Suppose that you have a network with nodes which hold blocks of data > indexed by the SHA1 hash of the block. > > > A node, `A', wants to publish a block of data and then later a > different node `B', who already knows the unique id of that block wants > to fetch the block > > > `A' knows the "phonebook info" for N other nodes, where the "phonebook > info" consists of the public key and other information sufficient to > communicate with that node. The question is, how large is this "phonebook" (freenet: ReferenceStore or Routing table)? For the network to be truly scalable, you don't want the size of the routing table at any node to grow faster than O(log N) where N is the number of nodes on the entire network. 
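As a quick numeric sanity check of how rarely two small, independently chosen phonebooks overlap (a sketch added for illustration, not from the original thread; `overlap_probability` is a made-up name):

```python
import math

def overlap_probability(n: int, k: int) -> float:
    """Chance that two independently, randomly chosen routing tables of
    k nodes each (out of n total) share at least one node.
    Approximated as 1 - (1 - k/n)**k, treating the k picks as
    independent, which is close enough for large n."""
    return 1.0 - (1.0 - k / n) ** k

# With k = log(n) tables, the overlap chance shrinks as the network grows.
for n in (10**3, 10**5, 10**7):
    k = int(math.log(n))
    print(n, overlap_probability(n, k))
```

Running this shows the probability heading toward zero as n grows, which is the point being argued below.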
If you have two parties (A and B) that selected their routing tables independently and randomly, then the chance that they will share at least one node is the chance that A chose a node that B also chose, or: 1-(1-k/n)^k, where k is the routing table size and n the number of nodes on the network (it's not really an exponent since A will only choose a given node once, but as the numbers increase that will stop mattering). If you want the size of the routing tables to grow only with O(log n) then this becomes: 1-(1-O(log n)/O(n))^O(log n) You can try plotting that in Matlab or something, and you'll see that it dives towards zero pretty soon. <> > Now is this scalable? It seems obvious to me that it is, although it > leaves open questions of practical performance (you do not want to > publish log(N) times the size of your data), and the big question of > how `B' learned the unique id of the block. (Both of these issues are > already solved, of course, on Mojo Nation, but those solutions might > not scale.) Not unless I am misunderstanding you in some respect regarding how the nodes' "phonebooks" are gathered. I would assert that it is impossible to make a scalable way of finding data on a network that does not work through some sort of sorting and then a hill-climbing search when trying to locate the data. In fact, I would _really_ recommend that you take a look at the Plaxton scheme that Hal and I were discussing here. While I have issues with its resistance to coordinated attacks, anonymity, and node operator control over the routing, these are things that you guys don't seem to put much weight on, and that your current scheme isn't any better suited for anyway. > > > Regards, > > Zooko > > P.S. The idea of MaskMatchingHandicapper was inspired by an idea that > Raph Levien posted to advogato concerning doing the same thing, but for > plaintext meta-data instead of for blocks. > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers -- 'DeCSS would be fine. 
Where is it?' 'Here,' Montag touched his head. 'Ah,' Granger smiled and nodded. Oskar Sandberg md98-osa@nada.kth.se From bram at gawth.com Wed Feb 28 12:48:02 2001 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] proposal re: scalability of block publication and fetching In-Reply-To: <20010228213550.B1736@hobbex.localdomain> Message-ID: On Wed, 28 Feb 2001, Oskar Sandberg wrote: > > `A' knows the "phonebook info" for N other nodes, where the "phonebook > > info" consists of the public key and other information sufficient to > > communicate with that node. > > The question is, how large is this "phonebook" (freenet: ReferenceStore or > Routing table)? For the network to be truly scalable, you don't want the > size of the routing table at any node have to grow faster than O(log N) > where N is the number of nodes on the entire network. Until we get up to around 10,000 counterparties the size of the phone book won't be more than a few megs. When we get to that point we'll start worrying about how to make the system scale more. DNS has the same scaling problem. It doesn't seem to be melting. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From hal at finney.org Wed Feb 28 12:56:01 2001 From: hal at finney.org (hal@finney.org) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Bluesky list Message-ID: <200102282053.MAA16966@finney.org> A[nother] new mailing list is getting started to discuss general issues of peer-to-peer style file-sharing systems like Freenet, MojoNation, Publius, Gnutella and the like. Information is at: http://www.transarc.com/~ota/bluesky/index.html. Their charter: The purpose of the mailing list is to foster discussion of design and implementation issues related to the development of scalable, decentralized storage systems of literally global scope. 
The emphasis should be on technical descriptions and critique of mechanisms providing efficiency, reliability, security, and similar properties. Discussion of goals and semantics is also desirable, while acknowledging that a diversity of systems will be built and evaluated. Messages with primarily political, legal or philosophical content are discouraged. It's got some smart people signed up although the traffic level has been pretty low. P2P hackers might want to take a look. Hal From orasis at acm.cs.umn.edu Wed Feb 28 12:59:01 2001 From: orasis at acm.cs.umn.edu (Justin Chapweske) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Scalability vs Network Stability In-Reply-To: ; from zooko@zooko.com on Wed, Feb 28, 2001 at 11:39:16AM -0800 References: Message-ID: <20010228145824.L16140@go.cs.umn.edu> One thing that seems to be missing from these conversations on routing and lookup is some quantifiable measure of network stability. A lot of these routing systems being proposed seem nice from a pure scalability perspective but I doubt that most of them will work in any sort of unstable network environment. Does anyone have any good quantifiable definition of network stability that we can use to benchmark our systems against? It seems to me that most of the algorithms in Freenet depend on a relatively high degree of network stability... But what happens when Napster goes down and everyone starts using Espra? Here is my current gut feeling on the order of network stability levels required for various systems to succeed: 1) Oceanstore, Publius 2) Freenet, OpenCola Folders 3) MojoNation 4) Gnutella 5) Swarmcast (Nodes actively shut themselves down after a period of time) Now what I want is to assign some numbers to each of these systems....any guesses how? -- Justin Chapweske, Lead Swarmcast Developer, OpenCola Inc. 
http://www.sourceforge.net/projects/swarmcast/ From md98-osa at nada.kth.se Wed Feb 28 16:12:01 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] proposal re: scalability of block publication and fetching In-Reply-To: ; from bram@gawth.com on Wed, Feb 28, 2001 at 01:58:33PM -0800 References: <20010228213550.B1736@hobbex.localdomain> Message-ID: <20010301011350.A2677@hobbex.localdomain> On Wed, Feb 28, 2001 at 01:58:33PM -0800, Bram Cohen wrote: > On Wed, 28 Feb 2001, Oskar Sandberg wrote: > > > > `A' knows the "phonebook info" for N other nodes, where the "phonebook > > > info" consists of the public key and other information sufficient to > > > communicate with that node. > > > > The question is, how large is this "phonebook" (freenet: ReferenceStore or > > Routing table)? For the network to be truly scalable, you don't want the > > size of the routing table at any node to grow faster than O(log N) > > where N is the number of nodes on the entire network. > > Until we get up to around 10,000 counterparties the size of the phone book > won't be more than a few megs. When we get to that point we'll start > worrying about how to make the system scale more. Until then you can't claim to have a scalable architecture. > DNS has the same scaling problem. It doesn't seem to be melting. I think taking one's cues from DNS is just about the last thing somebody trying to build a decentralized P2P system should do. And remember that systems like Napster and ICQ already have namespaces considerably larger than the DNS domain names - by burdening MN with a simplistic routing model you are severely undershooting its potential. > -Bram Cohen > > "Markets can remain irrational longer than you can remain solvent" > -- John Maynard Keynes > -- 'DeCSS would be fine. Where is it?' 'Here,' Montag touched his head. 'Ah,' Granger smiled and nodded. 
Oskar Sandberg md98-osa@nada.kth.se From md98-osa at nada.kth.se Wed Feb 28 16:35:02 2001 From: md98-osa at nada.kth.se (Oskar Sandberg) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) In-Reply-To: <200102281817.KAA16135@finney.org>; from hal@finney.org on Wed, Feb 28, 2001 at 10:17:40AM -0800 References: <200102281817.KAA16135@finney.org> Message-ID: <20010301013605.B2677@hobbex.localdomain> On Wed, Feb 28, 2001 at 10:17:40AM -0800, hal@finney.org wrote: > Oskar writes: > > The above paper doesn't deal with adding nodes to the network at all. The > > Oceanstore paper says: > > > > "While existing work on Plaxton-like data structures did not include > > algorithms for online creation and maintenance of the global mesh, we have > > produced recursive node insertion and removal algorithms." > > Of course "recursive" tells us nothing about the actual cost of > such algorithms. I have read the Plaxton paper (the part about the > algorithms, not the proofs!) now, and there are some subtleties in the > Plaxton storage that look to me like they require global knowledge. I guess. Plaxton's system had parent links, so shouldn't the node be able to find the root for its id, and then walk backwards up each branch of the tree to find all the options at every level (which would be a high-cost operation, of course)? <> > It's also important that nodes not share labels, or at least that they > find out if any other nodes have the same label. This can largely be done > by simply querying the network for a given label, but it is vulnerable > to race conditions and it's not clear what happens then. Plaxton seems to solve ties by simply invoking an order function on the network (his beta function), so I figure that should be possible to apply here as well. -- 'DeCSS would be fine. Where is it?' 'Here,' Montag touched his head. 'Ah,' Granger smiled and nodded. 
Oskar Sandberg md98-osa@nada.kth.se From hal at finney.org Wed Feb 28 17:55:01 2001 From: hal at finney.org (hal@finney.org) Date: Sat Dec 9 22:11:41 2006 Subject: [p2p-hackers] Oceanstore's routing (was Re: [freenet-devl] updating with access control lists) Message-ID: <200103010152.RAA18066@finney.org> Oskar writes: > On Wed, Feb 28, 2001 at 10:17:40AM -0800, hal@finney.org wrote: > > Of course "recursive" tells us nothing about the actual cost of > > such algorithms. I have read the Plaxton paper (the part about the > > algorithms, not the proofs!) now, and there are some subtleties in the > > Plaxton storage that look to me like they require global knowledge. > > I guess. Plaxton's system had parent links, so shouldn't the node be able > to find the root for its id, and then walk backwards up each branch of > the tree to find all the options at every level (which would be a high-cost > operation, of course)? I don't follow how this would work. Say we're using base 4, and my new node has randomly chosen the label 0123. Now here's a neighboring node with label 3210. For my level-0 links, I need to find the closest nodes matching ___0, ___1, ___2, and ___3. I can get these from 3210 just fine. For my level-1 links, I need close nodes matching __03, __13, __23, __33. I can't get any of these from 3210 directly. He can follow his ___3 link and I can get nodes from that 2nd link, but they may not be closest to me. Likewise for the level-2 links I need _023, _123, _223, _323. For these I can take the ___3 neighbor of 3210, and his __23 neighbor, and use his level-2 links. But now these are three steps away. And last I need 0123, 1123, 2123 and 3123, which I can get by following the ___3, __23, _123 path and asking for his neighbors (cost is not an issue for the highest-level neighbors in Plaxton). 
Possibly a good compromise would be to somehow identify a bunch of physically-nearby nodes participating in the network, and perform this algorithm with each of them to get a good assortment of candidates for each one, and then to take the closest. > > It's also important that nodes not share labels, or at least that they > > find out if any other nodes have the same label. This can largely be done > > by simply querying the network for a given label, but it is vulnerable > > to race conditions and it's not clear what happens then. > > Plaxton seems to solve ties by simply invoking an order function on the > network (his beta function) so I figure that should be possible to apply > here as well. Right, and you could use IP address or some such to break the ties, but you have to recognize that ties exist first, which is global information. There's a special rule for the last-level neighbors; you don't take the closest one, you take the one highest in beta (assuming there is more than one). So in the example above when I go for 1123, 2123, & 3123, I need to query whether there is more than one of these and pick the one with the highest beta value. Now, I suppose I can just copy anyone's links because they also were supposed to use the highest beta. But if a new node has just been added with label 2123, and it happens to have a higher beta, I need to know about it, and so does everybody else who's pointing at 2123. (And for that matter, all the data at the old 2123 has to get sent over to the new one, since that's where people will look.) Similar problems happen when a node leaves. I'd feel better if OceanStore had a claim that adding/removing nodes took log n or log^2 n operations, or some such. Given that nodes will be entering and leaving all the time, if these operations are costly it could be a significant load. Hal