From hemppah at cc.jyu.fi Sun Dec 1 05:11:01 2002 From: hemppah at cc.jyu.fi (hemppah@cc.jyu.fi) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods, key revokation in PKI and Message-ID: <1038747848.3dea08c86601b@tammi2.cc.jyu.fi> >> Currently I'm doing my Master thesis which focuses some specific issues >>related >> to p2p systems. I have a few questions concercning these issues: >> >> a) Is DHT-based routing model (O(log(n)) the best effort for finding (data) >> blocks in p2p network ? >I have a biased description of search/discovery methods for large peer >networks (out of date - i will revise one of these days...) which may be >helpfull: >http://cubicmetercrystal.com/alpine/discovery.html Yes, I have found your text and currently it's one of the references in my thesis (and probably will write about it also). Btw, how well Alpine can scale (e.g. number of users) ? Do you have any "real- life" experiences from Alpine (how social-connection paradigm really works etc.) ? What about search performance in Alpine (any big O's) ? Security (PKIs) ? (Could you notify me, when you will revise it ;)) Thanks, -Hermanni ------------------------------------------------- This mail sent through IMP: http://horde.org/imp/ From bram at gawth.com Sun Dec 1 09:45:01 2002 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] turing testing Message-ID: A way to differentiate humans from computers, I particularly like gimpy - http://www.captcha.net/ -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From agm at SIMS.Berkeley.EDU Sun Dec 1 10:51:02 2002 From: agm at SIMS.Berkeley.EDU (Antonio Garcia) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] turing testing In-Reply-To: Message-ID: Yes, but some them leave something to be desired in terms of accuracy. I concluded I wasn't human on several occasions...A. > A way to differentiate humans from computers, I particularly like gimpy - > > http://www.captcha.net/ > > -Bram Cohen > > "Markets can remain irrational longer than you can remain solvent" > -- John Maynard Keynes > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Antonio Garcia-Martinez cryptologia.com From hal at finney.org Sun Dec 1 12:45:02 2002 From: hal at finney.org (Hal Finney) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] turing testing Message-ID: <200212012127.gB1LRGY02903@finney.org> > http://www.captcha.net/ These are interesting efforts, but I've never seen any analysis which looks at the state of the art in current image recognition and comes up with a quantitative estimate of how much work it would take to be able to make a program that could recognize the letters. I think the assumption is that if someone did come up with a program that could do it, the challenge writers would just change their algorithm and the AI guys would have to start from scratch. But this is uncomfortably reminiscent of the amateur's method of cipher design. He, too, fails to consider the possible attacks on his cipher in any kind of detail or with any familiarity with the field. And he, too, when presented with a break in his cipher will simply make a tweak to avoid whatever particular weakness was found. We know that this is an unsound approach to cipher design. 
Is there any reason to believe that this strategy will work better when creating images that people can recognize but not programs? Hal From bram at gawth.com Sun Dec 1 13:54:01 2002 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] turing testing In-Reply-To: Message-ID: Antonio Garcia wrote: > Yes, but some them leave something to be desired in terms of accuracy. I > concluded I wasn't human on several occasions...A. Some of them I find to be a bit shaky, but gimpy is for me 100% accurate, and has much surreal beauty - http://www.captcha.net/cgi-bin/gimpy -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From coderman at mindspring.com Sun Dec 1 13:55:02 2002 From: coderman at mindspring.com (coderman) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods, key revokation in PKI and In-Reply-To: <1038747848.3dea08c86601b@tammi2.cc.jyu.fi> References: <1038747848.3dea08c86601b@tammi2.cc.jyu.fi> Message-ID: <3DEA854E.5010301@mindspring.com> hemppah@cc.jyu.fi wrote: > ... > > Btw, how well Alpine can scale (e.g. number of users) ? Do you have any "real- > life" experiences from Alpine (how social-connection paradigm really works > etc.) ? What about search performance in Alpine (any big O's) ? Security > (PKIs) ? Scalability is the big question, as it is hard to guage real world scalability without an actual real world network :-) I have done some preliminary scalability testing on the OSDL systems ( http://www.osdl.org/ ) using a pair of 4way SMP systems on a gigabit network link. I was able to establish and communicate with over 4 million concurrent DTCP/Alpine connections between them, using about a gigabyte of RAM. On lower end hardware and bandwidth (5 - 10M of memory, DSL/cable) peers groups would be much smaller, 10,000 to 25,000 concurrent. The main feature of alpine that allows it to scale with less effort is the fact that all communication is direct, using a lightweight UDP based transport. The number of connections is limited only by the memory and bandwidth you have available to use. Search performance is also difficult to guage because of the social discovery mechanism employed. Each peer is continually tuning peer groups to increase the use of high quality peers with similar interests and removing peers that do not contribute or share interests. When you first join the network your query effectiveness is going to be low. As you use the network, query effectiveness increases given the feedback received from previous queries and how they affect the composition of the peer groups you use for resource discovery. This makes it nice for people who share interests in unpopular or obscure resources; they can eventually obtain a fast, effective peer group for finding these resources. This is in direct contrast to most other search mechanisms where obscure or unpopular resources are always more difficult to locate. I am focusing on usability for the next and future devel snapshots of alpine, so hopefully some real world use on a wider scale will allow me to answet your questions which much more detail in the future. Right now all I can provide you is rough guesstimates and intuitions.. Last, regarding security, I would like to integrate PGP/GPG style assymetric cyphers and digital signatures, however this is a low priority. 
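The social discovery mechanism described earlier in this message (peers continually tune their peer groups, keeping peers whose answers were useful and dropping ones that contribute nothing) can be made concrete with a small sketch. The class below is hypothetical and not Alpine code; the scoring and decay constants, the class name, and the group size are invented purely to illustrate the feedback idea.

```python
import heapq

class PeerGroup:
    """Hypothetical sketch of feedback-driven peer selection, loosely
    modelled on the social-discovery idea described above (not Alpine code)."""

    def __init__(self, max_size=1000):
        self.max_size = max_size
        self.scores = {}            # peer address -> running quality score

    def add(self, peer):
        self.scores.setdefault(peer, 0.0)

    def record_result(self, peer, answered):
        # Reward peers whose responses satisfied our query, decay the rest.
        if peer in self.scores:
            self.scores[peer] = 0.9 * self.scores[peer] + (1.0 if answered else -0.5)

    def prune(self):
        # Keep only the best-scoring peers once the group grows too large,
        # so the group drifts toward peers that share our interests.
        if len(self.scores) > self.max_size:
            keep = heapq.nlargest(self.max_size, self.scores, key=self.scores.get)
            self.scores = {p: self.scores[p] for p in keep}

    def query_targets(self, n=50):
        # Send queries to the current best peers first.
        return heapq.nlargest(n, self.scores, key=self.scores.get)

if __name__ == "__main__":
    g = PeerGroup(max_size=3)
    for p in ("a", "b", "c", "d"):
        g.add(p)
    for _ in range(5):
        g.record_result("a", answered=True)
        g.record_result("d", answered=False)
    g.prune()
    print(g.query_targets())   # "a" ranks first; "d" has been dropped
```

A newly joined peer starts with an undifferentiated group and low query effectiveness, exactly as described above; only accumulated feedback shifts the group toward peers that share its interests.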
I am also considering an implementation of peer frogs ( http://cubicmetercrystal.com/peerfrog/ ) for avoiding man-in-the-middle attacks in large decentralized peer networks (although this is geared more for large wireless peer networks) > > (Could you notify me, when you will revise it ;)) Absoutely. > > Thanks, > -Hermanni > Best regards, Martin Peck. -- _____________________________________________________________________ coderman@mindspring.com http://cubicmetercrystal.com/ key fingerprint: 9C00 C63E A71D D488 AF17 F406 56FB 71D9 E17D E793 ( see html source for public key ) --------------------------------------------------------------------- From bram at gawth.com Sun Dec 1 13:57:01 2002 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] turing testing In-Reply-To: <200212012127.gB1LRGY02903@finney.org> Message-ID: Hal Finney wrote: > > http://www.captcha.net/ > > These are interesting efforts, but I've never seen any analysis which > looks at the state of the art in current image recognition and comes up > with a quantitative estimate of how much work it would take to be able > to make a program that could recognize the letters. There isn't a coherent answer to that question. What's needed are fundamental advances in vision algorithms, and nobody has any good time horizon for that. > I think the assumption is that if someone did come up with a program > that could do it, the challenge writers would just change their algorithm > and the AI guys would have to start from scratch. Not necessarily. We might be able to block it for one or two rounds, but eventually the AI side will win. These tests, however, work just fine *today*, and will continue to be a useful anti-spam measure for a while. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From p2phackers at bondolo.org Mon Dec 2 09:24:01 2002 From: p2phackers at bondolo.org (p2phackers@bondolo.org) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] turing testing In-Reply-To: References: Message-ID: <3DEB96FD.2060707@bondolo.org> Bram Cohen wrote: >A way to differentiate humans from computers > It probably wasn't something that was considered by the authors and I could find no mention of on their site, but these approaches also do a good job of differentiating the sighted from the blind. I followed the yahoo email account creation script a few weeks ago and was disappointed that they offered no alternative differentiator. Just a fwiw because I live with someone who worries about these things. Mike From bert at akamail.com Mon Dec 2 09:31:01 2002 From: bert at akamail.com (Bert) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods, key revokation in PKI and In-Reply-To: <3DEA854E.5010301@mindspring.com> References: <1038747848.3dea08c86601b@tammi2.cc.jyu.fi> <3DEA854E.5010301@mindspring.com> Message-ID: <3DEB9A6D.6040503@akamail.com> coderman wrote: > I am also considering an implementation of peer frogs > ( http://cubicmetercrystal.com/peerfrog/ ) for avoiding man-in-the-middle > attacks in large decentralized peer networks (although this is geared > more for large wireless peer networks) Coderman -- this looks interesting. Could you tell us a bit about it? E.g. what's it trying to solve that, say, SSL doesn't? Is it to avoid any need for trusted set of CAs? 
Thanks, Bert From coderman at mindspring.com Mon Dec 2 12:07:01 2002 From: coderman at mindspring.com (coderman) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods, key revokation in PKI and In-Reply-To: <1038747848.3dea08c86601b@tammi2.cc.jyu.fi> References: <1038747848.3dea08c86601b@tammi2.cc.jyu.fi> <3DEA854E.5010301@mindspring.com> <3DEB9A6D.6040503@akamail.com> Message-ID: <3DEBBD96.6060804@mindspring.com> Bert wrote: > coderman wrote: > > > I am also considering an implementation of peer frogs > > ( http://cubicmetercrystal.com/peerfrog/ ) for avoiding man-in-the-middle > > attacks in large decentralized peer networks (although this is geared > > more for large wireless peer networks) > > > Coderman -- this looks interesting. Could you tell us a bit about it? > E.g. what's it trying to solve that, say, SSL doesn't? Is it to avoid > any need for trusted set of CAs? Hi Bert, This is not a replacement for trusted CA's, which are always preferable and more secure (if they are trustworthy :-) Peer frogs is aimed at ad-hoc wireless peer networks where man-in-the-middle attacks are trivial to implement. Simply setup a rogue AP or hijack an existing AP using amplifiers, etc. (see airjack: http://802.11ninja.net/ ) In such an environment, when communication with peers you may not know (i.e. no prior key exchange out of band, and no trusted CA's) you have few options to prevent a man-in-the-middle. Peer frogs is a last resort designed for this type of environment where CA's or prior key exchange is not feasable or available. The value of this is debatable, however, I think it may be useful enough to deserve an implementation... -- _____________________________________________________________________ coderman@mindspring.com http://cubicmetercrystal.com/ key fingerprint: 9C00 C63E A71D D488 AF17 F406 56FB 71D9 E17D E793 ( see html source for public key ) --------------------------------------------------------------------- From bram at gawth.com Tue Dec 3 00:12:02 2002 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] Upcoming p2p-hackers meeting Message-ID: The second sunday this months happens early, the meeting will be this upcoming sunday, the 8th. Usual time, usual place, 3pm, the metreon. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From cefn.hoile at bt.com Tue Dec 3 10:37:01 2002 From: cefn.hoile at bt.com (cefn.hoile@bt.com) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods, key revokation in PKI and Message-ID: <92D69A614ED4924A9071EA4886317878A6DBFD@i2km11-ukbr.domain1.systemhost.net> hemppah@cc.jyu.fi wrote: "Basically, I just ment that, for example, in DHTs the basic assumption is that you can't share your resources from your *own* computer; DHTs maps keys to values in m-bit virtual space address. And in Gnutella, the basic assumption is contrast. You have (well you don't *have* to but..) to share your resources from your own computer." >>>>>>> Thought this information may be of use to you. SWAN (Small World Adaptive Networks) is a fully decentralised lookup system, with a function similar to the DHTs 'CAN' and 'Chord'. It has lookup performance approximating to O(log(n)) and has been implemented and tested both in simulation (1,000,000 nodes) and in trial deployments across a test network (100,000 nodes). 
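The lookup costs quoted in this thread are O(log n) hops for Chord/Tapestry-style DHTs (and, per the message above, approximately for SWAN) versus O(n^(1/d)) for CAN with a d-dimensional coordinate space. The short script below, a rough sketch that ignores each system's constant factors, just tabulates those two scaling terms so the difference at large n is visible.

```python
import math

def chord_style_hops(n):
    # O(log n) routing: roughly one hop per halving of the identifier space.
    return math.log2(n)

def can_style_hops(n, d):
    # O(n**(1/d)) routing for a d-dimensional CAN-style coordinate space.
    return n ** (1.0 / d)

if __name__ == "__main__":
    print(f"{'nodes':>12} {'~log2(n)':>10} {'~n^(1/2)':>10} {'~n^(1/4)':>10}")
    for exp in range(3, 10):          # 10^3 ... 10^9 nodes
        n = 10 ** exp
        print(f"{n:>12} {chord_style_hops(n):>10.1f} "
              f"{can_style_hops(n, 2):>10.1f} {can_style_hops(n, 4):>10.1f}")
```

At a million nodes the log term is about 20 while n^(1/2) is 1000, which is why the O(log n) figure keeps coming up as the benchmark in this discussion.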
However, whilst it is comparable to the DHTs you mention in lookup performance, both the resources and the mapping from resource id to resource address are hosted locally to the client which hosts the resource. In other words there is no notion of servers which are responsible for subsections of the hashing space (to which you have to delegate responsibility if your resource's id falls in that subspace). There is also no requirement for others to host information of the resources that you host (since the route terminates with your machine). In this way it is like a blend of the autonomy of Gnutella, and the scalability of regular DHTs. Is this the sort of function you were after? Cefn Hoile BTexact Technologies -----Original Message----- From: hemppah@cc.jyu.fi [mailto:hemppah@cc.jyu.fi] Sent: 30 November 2002 12:21 To: p2p-hackers@zgp.org Subject: Re: [p2p-hackers] About search methods, key revokation in PKI and Hi, Thanks for you answer. Please see my answers below. >Date: Fri, 29 Nov 2002 09:22:10 -0800 (PST) >From: Antonio Garcia >To: p2p-hackers@zgp.org >Subject: Re: [p2p-hackers] About search methods, key revokation in PKI and > signature management >Reply-To: p2p-hackers@zgp.org > >> a) Is DHT-based routing model (O(log(n)) the best effort for finding >> (data) >> blocks in p2p network ? >Depends on the P2P network. In Gnutella, there are no bounds on >searches. >In CHORD and Tapestry, it is log n, in CAN it is n^(1/d), where d is the >dimension of the logical space. Yes, that's true. The point I was trying to say was that is O(log n) really the best effort and only implemented by DHTs. So there really is no other "log n"s than DHTs (e.g. trees or tries) ? >> >> b) What are assumptions for the best effort ? Look question a (what >>ever it >>is). >> >Rather complicated question! I would recommend reading the papers... Yes, rather unbounded question ;). Basically, I just ment that, for example, in DHTs the basic assumption is that you can't share your resources from your *own* computer; DHTs maps keys to values in m-bit virtual space address. And in Gnutella, the basic assumption is contrast. You have (well you don't *have* to but..) to share your resources from your own computer. etc. >> >> d) How digital signatures should be managed (PKI) in p2p environment >> ? >That's unresolved, and would be an excellent paper if you figured it >out. Hmm..I think Groove has quite clever signature management system, but I'm not sure how well it can scale (e.g. 1,000,000,000s of users). >> >> e) How do I know that if I have searched data, results are accurate >>(not >>fake >> blocks etc.) >Well, if you are searching by file hash, then it's pretty likely to be >what you are looking for. If you are searching by meta-data, then who >knows what you're getting... Hmm..this *might* be weird question, but can Tree Hash EXchange format (THEX) combine file hashes and metadata somehow ? Thanks, Hermanni ------------------------------------------------- This mail sent through IMP: http://horde.org/imp/ _______________________________________________ p2p-hackers mailing list p2p-hackers@zgp.org http://zgp.org/mailman/listinfo/p2p-hackers From cefn.hoile at bt.com Tue Dec 3 12:16:01 2002 From: cefn.hoile at bt.com (cefn.hoile@bt.com) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods, key revokation in PKI and Message-ID: <92D69A614ED4924A9071EA4886317878A6DBFE@i2km11-ukbr.domain1.systemhost.net> FYI With reference to Fritz' two measures : hops and space. 
SWAN approximates to O(log(n)) for lookup, but each node requires only a fixed space for storage, regardless of the number of nodes in the network. The space required is a function of the dimensionality of the hashing space, which is fixed for a given SWAN network. Cefn Hoile BTexact Technologies -----Original Message----- From: J. Fritz Barnes [mailto:barnesjf@vuse.vanderbilt.edu] Sent: 30 November 2002 19:48 To: p2p-hackers@zgp.org Subject: Re: [p2p-hackers] About search methods, key revokation in PKI and There are two measurements to evaluate the routing by: amount of space required and number of hops required to find a specific piece of information. The DHT systems (Chord, Pastry, Tapestry) tend to be O(log(n)) for both space and number of hops. If you were willing to store additional information; you could have O(n) space and O(1) lookup. This is a spectrum... on the other side each host might only keep the next peer in which case it would be O(1) space and O(n) lookup. From myers at maski.org Tue Dec 3 13:28:02 2002 From: myers at maski.org (Myers Carpenter) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods, key revokation in PKI and In-Reply-To: <92D69A614ED4924A9071EA4886317878A6DBFD@i2km11-ukbr.domain1.systemhost.net> References: <92D69A614ED4924A9071EA4886317878A6DBFD@i2km11-ukbr.domain1.systemhost.net> Message-ID: <1038950927.19345.109.camel@trouble> On Tue, 2002-12-03 at 13:36, cefn.hoile@bt.com wrote: > SWAN (Small World Adaptive Networks) is a fully decentralised lookup system, > with a function similar to the DHTs 'CAN' and 'Chord'. Thanks for pointing this out. Any vague idea when the papers behind this will be published? myers From oskar at freenetproject.org Tue Dec 3 14:44:02 2002 From: oskar at freenetproject.org (Oskar Sandberg) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods, key revokation in PKI and In-Reply-To: <92D69A614ED4924A9071EA4886317878A6DBFD@i2km11-ukbr.domain1.systemhost.net> References: <92D69A614ED4924A9071EA4886317878A6DBFD@i2km11-ukbr.domain1.systemhost.net> Message-ID: <20021203224341.GA361@sporty.spiceworld> On Tue, Dec 03, 2002 at 06:36:30PM -0000, cefn.hoile@bt.com wrote: <> > However, whilst it is comparable to the DHTs you mention in lookup > performance, both the resources and the mapping from resource id to resource > address are hosted locally to the client which hosts the resource. > > In other words there is no notion of servers which are responsible for > subsections of the hashing space (to which you have to delegate > responsibility if your resource's id falls in that subspace). There is also > no requirement for others to host information of the resources that you host > (since the route terminates with your machine). Until you actually cough up the papers regarding your panacea, discussion is quite pointless, but what you describe is simply not possible: If data identifiers are derived from the content, rather than a location, and if the network is to find a route to the data offered, then for every piece of data that a node offers it has to tell _some_ node on the network that it contains those keys. Your assertion that "There is also no requirement for others to host information of the resources that you host" leaves only the possibility of blind trial and error querying of nodes - if other nodes do not contain information about which data you host it is not possible for them to make an intelligent determination of whether to send a query to you. 
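To make the point above concrete: in a content-addressed overlay, a publisher must place at least a key-to-address pointer at some predictable node, even if the data itself (and the full key:value mapping) never leaves the publisher's machine. The sketch below shows the usual consistent-hashing version of that arrangement; it is a generic illustration under assumed names and addresses, not a description of SWAN's or any other specific system's mechanism.

```python
import bisect
import hashlib

def h(s: str) -> int:
    # Map resource names and node addresses onto one circular identifier space.
    return int(hashlib.sha1(s.encode()).hexdigest(), 16)

class Ring:
    """Generic consistent-hashing sketch: the ring stores only
    key -> publisher-address pointers; the data stays with the publisher."""

    def __init__(self, node_addrs):
        self.nodes = sorted(node_addrs, key=h)
        self.ids = [h(a) for a in self.nodes]
        self.pointers = {a: {} for a in self.nodes}   # per-node pointer tables

    def successor(self, key_id):
        i = bisect.bisect_left(self.ids, key_id) % len(self.ids)
        return self.nodes[i]

    def publish(self, key, publisher_addr):
        # The publisher must tell *some* node (the key's successor) that it
        # holds this key -- otherwise lookups degrade to blind probing.
        owner = self.successor(h(key))
        self.pointers[owner][key] = publisher_addr

    def lookup(self, key):
        owner = self.successor(h(key))
        return self.pointers[owner].get(key)   # the publisher's address, or None

if __name__ == "__main__":
    ring = Ring(["10.0.0.1:4000", "10.0.0.2:4000", "10.0.0.3:4000"])
    ring.publish("some-file.ogg", publisher_addr="10.0.0.3:4000")
    print(ring.lookup("some-file.ogg"))        # -> 10.0.0.3:4000
```

Whether the pointer is a key:value pair or only a key:address mapping is exactly the distinction taken up later in this thread.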
<> -- Oskar Sandberg oskar@freenetproject.org From bram at gawth.com Tue Dec 3 16:11:01 2002 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] turing testing In-Reply-To: <3DEB96FD.2060707@bondolo.org> Message-ID: p2phackers@bondolo.org wrote: > Bram Cohen wrote: > > >A way to differentiate humans from computers > > > It probably wasn't something that was considered by the authors and I > could find no mention of on their site, but these approaches also do a > good job of differentiating the sighted from the blind. I followed the > yahoo email account creation script a few weeks ago and was disappointed > that they offered no alternative differentiator. That's an interesting problem. Maybe it could quiz you on the distinction between markov chain generated texts and real texts. Of course that requires a fairly large database of sample texts which the attacker can't have, and also requires the user speak the language being quizzed. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From hemppah at cc.jyu.fi Wed Dec 4 01:34:01 2002 From: hemppah at cc.jyu.fi (hemppah@cc.jyu.fi) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods, key revokation in PKI and In-Reply-To: <92D69A614ED4924A9071EA4886317878A6DBFD@i2km11-ukbr.domain1.systemhost.net> References: <92D69A614ED4924A9071EA4886317878A6DBFD@i2km11-ukbr.domain1.systemhost.net> Message-ID: <1038993984.3dedca4060b61@tammi2.cc.jyu.fi> Thank you for your answer. Quoting cefn.hoile@bt.com: >> hemppah@cc.jyu.fi wrote: >> >> "Basically, I just ment that, for example, in DHTs the basic assumption is >> that >> you can't share your resources from your *own* computer; DHTs maps keys to >> values in m-bit virtual space address. And in Gnutella, the basic >> assumption >> is >> contrast. You have (well you don't *have* to but..) to share your resources >> >> from your own computer." > > >>>>>>> > > Thought this information may be of use to you. > > SWAN (Small World Adaptive Networks) is a fully decentralised lookup > system, > with a function similar to the DHTs 'CAN' and 'Chord'. This sounds interesting. However, I wasn't able to any find documentation about SWAN (from the btexact.com, google...). Is there any documentation/papers available yet ? Btw, in SWAN, what is the *maximun* (estimated) number of concurrent users ? -Hermanni ------------------------------------------------- This mail sent through IMP: http://horde.org/imp/ From cefn.hoile at bt.com Wed Dec 4 03:10:02 2002 From: cefn.hoile at bt.com (cefn.hoile@bt.com) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods Message-ID: <92D69A614ED4924A9071EA4886317878A6DC01@i2km11-ukbr.domain1.systemhost.net> Hermanni, Hopefully the paper I sent will clarify the implementation of SWAN for you. I can send this information to others by request. In answer to your question... "Btw, in SWAN, what is the *maximun* (estimated) number of concurrent users ?" ...the maximum number of concurrent nodes will depend upon the network capacity. Since each node (i.e. each named resource) interacts directly with other nodes, the notions of user or server are somewhat irrelevant when discussing capacity. Each machine may host whatever number of _nodes_ (named resources) it has capacity for. Maintenance of the SWAN takes place by passing messages between hosted nodes. Thus, to maintain your connectedness to the network you must have sufficient network capacity. 
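Since a "node" here is a named resource rather than a machine, one process can host many of them and multiplex all of their maintenance traffic over a single socket. The sketch below is a generic illustration of that pattern, with invented message formats and class names; it is not SWAN code. Each virtual node carries its own identifier and polling deadline, but they all share one UDP socket.

```python
import socket
import time
import uuid

class VirtualNode:
    """One named resource: its own id and polling schedule, but no own socket."""
    def __init__(self, name, poll_period=30.0):
        self.node_id = uuid.uuid4().hex
        self.name = name
        self.poll_period = poll_period
        self.next_poll = time.monotonic() + poll_period
        self.links = []                      # (host, port) of neighbour nodes

class Host:
    """One machine hosting many virtual nodes over a single UDP socket."""
    def __init__(self, bind_addr=("0.0.0.0", 0)):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind(bind_addr)
        self.sock.setblocking(False)
        self.nodes = {}

    def add_resource(self, name):
        node = VirtualNode(name)
        self.nodes[node.node_id] = node
        return node

    def tick(self):
        # Send a keep-alive for every node whose polling deadline has passed.
        now = time.monotonic()
        for node in self.nodes.values():
            if now >= node.next_poll:
                for peer in node.links:
                    msg = f"PING {node.node_id}".encode()   # assumed wire format
                    self.sock.sendto(msg, peer)
                node.next_poll = now + node.poll_period

if __name__ == "__main__":
    host = Host()
    for i in range(6000):                    # many nodes, one socket, one process
        host.add_resource(f"resource-{i}")
    host.tick()
    print(len(host.nodes), "virtual nodes sharing UDP port",
          host.sock.getsockname()[1])
```

The poll period here plays the role of the tunable parameter described below: shorter periods mean faster convergence at the cost of more CPU and bandwidth per hosted node.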
To date we have run up to 6,000 nodes on a single PC as part of a deployed SWAN of 100,000 nodes, but we haven't really attempted to max out the network bandwidth. Cefn Hoile BTexact Technologies From hemppah at cc.jyu.fi Wed Dec 4 04:26:01 2002 From: hemppah at cc.jyu.fi (hemppah@cc.jyu.fi) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods In-Reply-To: <92D69A614ED4924A9071EA4886317878A6DC01@i2km11-ukbr.domain1.systemhost.net> References: <92D69A614ED4924A9071EA4886317878A6DC01@i2km11-ukbr.domain1.systemhost.net> Message-ID: <1039004290.3dedf282a50ea@tammi2.cc.jyu.fi> Quoting cefn.hoile@bt.com: > Hermanni, > > Hopefully the paper I sent will clarify the implementation of SWAN for you. > I can send this information to others by request. Again, thank you! :) > > In answer to your question... > > "Btw, in SWAN, what is the *maximun* (estimated) number of concurrent users > ?" > > ...the maximum number of concurrent nodes will depend upon the network > capacity. Since each node (i.e. each named resource) interacts directly > with > other nodes, the notions of user or server are somewhat irrelevant when > discussing capacity. Each machine may host whatever number of _nodes_ > (named > resources) it has capacity for. So, in theory, SWAN should scale to 10^9s (billions) of users, if we can assume that network is fast enough (e.g. cable/xDSL etc.), and if there is only one node per computer. Agree ? -Hermanni ------------------------------------------------- This mail sent through IMP: http://horde.org/imp/ From cefn.hoile at bt.com Wed Dec 4 05:49:02 2002 From: cefn.hoile at bt.com (cefn.hoile@bt.com) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods Message-ID: <92D69A614ED4924A9071EA4886317878A6DC04@i2km11-ukbr.domain1.systemhost.net> "So, in theory, SWAN should scale to 10^9s (billions) of users, if we can assume that network is fast enough (e.g. cable/xDSL etc.), and if there is only one node per computer. Agree ?" In principle, you are right. However, the scenario of use has to be taken into account here. A SWAN node is tunable using a parameter which is basically a polling period. The more often the SWAN node wakes up and sends a message, the faster the SWAN will converge, or the faster a new node will be incorporated, but also the more CPU time and network bandwidth will be used. So the tradeoff you choose will depend upon the needs of your app. Equally, whilst all 'FIND' queries follow a single route from the querying node to the hosting node, with approximate O(log(n)) hops, making a SWAN highly scalable compared to the flooding approach, the frequency of lookups will of course have an impact on the network load. To give you a useful data point, the 108,000 node experiment was run across 18 dual processor machines on a 10Mb/s hub. Running 6000 nodes per machine used up an average of 50% of the machines' cpu, (twin 450MHz processors). The polling period of the SWAN nodes was tuned to achieve this. The network traffic produced by each node was _at most_ 1.7 kb/s, assuming that there is actually a 10Mb/s output from each node (very unlikely given the hub we are using). I suspect also it was much less than the full capacity of the hub in practice, since the CPU load was the dependent variable we focused upon when tuning the experiment. Can people suggest useful linux tools to monitor volume of network usage (UDP), possibly per thread? 
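As a sanity check on the data point above (6,000 nodes per machine across 18 machines, at most ~1.7 kb/s of traffic per node, all behind a shared 10 Mb/s hub), here is a small back-of-envelope calculation. It assumes "kb/s" means kilobits per second; either way it bears out the caveat in the message that the per-node ceiling could not have been sustained in aggregate on that hub.

```python
# Back-of-envelope check of the experiment's bandwidth figures quoted above.
# Assumption: "kb/s" means kilobits per second.

NODES_PER_MACHINE = 6_000
MACHINES = 18
PER_NODE_CEILING_KBPS = 1.7          # "at most" figure quoted in the thread
HUB_CAPACITY_MBPS = 10.0             # shared 10 Mb/s hub

total_nodes = NODES_PER_MACHINE * MACHINES
worst_case_mbps = total_nodes * PER_NODE_CEILING_KBPS / 1_000

print(f"total nodes:          {total_nodes}")
print(f"worst-case aggregate: {worst_case_mbps:.1f} Mb/s")
print(f"hub capacity:         {HUB_CAPACITY_MBPS:.1f} Mb/s")
print(f"per-node average if the hub were saturated: "
      f"{HUB_CAPACITY_MBPS * 1_000 / total_nodes:.3f} kb/s")
# -> 108,000 nodes; ~183.6 Mb/s worst case against a 10 Mb/s hub, so in
#    practice each node must have averaged well under 0.1 kb/s.
```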
The experiment was focused on the worst case scenario - add 100,000 nodes simultaneously and test how long it takes to self-organise from scratch (no structure) into a SWAN. In this scenario, it took 3 hours to 'asymptote' to 100% lookup success. Taking the similar approach, but with one node per computer, we would therefore expect convergence of the 'worst case scenario' network within seconds, assuming that the network can handle the 1.7kb/s UDP load for each node. In reality, of course, we can assume that the network is incremental, and that the vast majority of it is already converged. Given this assumption, making a new node accessible takes much less time and much less network bandwidth. The specific performance under these circumstances has yet to be tested, but we hope to do this in the next few weeks. From hemppah at cc.jyu.fi Wed Dec 4 06:42:01 2002 From: hemppah at cc.jyu.fi (hemppah@cc.jyu.fi) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods In-Reply-To: <92D69A614ED4924A9071EA4886317878A6DC04@i2km11-ukbr.domain1.systemhost.net> References: <92D69A614ED4924A9071EA4886317878A6DC04@i2km11-ukbr.domain1.systemhost.net> Message-ID: <1039012440.3dee12586ce6e@tammi2.cc.jyu.fi> Quoting cefn.hoile@bt.com: > "So, in theory, SWAN should scale to 10^9s (billions) of users, if we can > assume that network is fast enough (e.g. cable/xDSL etc.), and if there is > only one node per computer. Agree ?" > > > In principle, you are right. However, the scenario of use has to be taken > into account here. > ... > variable we focused upon when tuning the experiment. Can people suggest > useful linux tools to monitor volume of network usage (UDP), possibly per > thread? > You could try APSR (http://www.aa-security.de/), at least there is a "Thread support" listed in feature page (support for monitoring threads or *feature* of APSR ? Unfortunately I don't know, since I have used never used it :(). Please tell me, if you find this useful. -Hermanni ------------------------------------------------- This mail sent through IMP: http://horde.org/imp/ From cefn.hoile at bt.com Wed Dec 4 10:43:01 2002 From: cefn.hoile at bt.com (cefn.hoile@bt.com) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods, key revokation in PKI and Message-ID: <92D69A614ED4924A9071EA4886317878A6DC0B@i2km11-ukbr.domain1.systemhost.net> Oskar wrote: "then for every piece of data that a node offers it has to tell _some_ node on the network that it contains those keys" Worth pointing out this ambiguity. The original phrase looked like. > There is > also no requirement for others to host information of the resources > that you host (since the route terminates with your machine). The phrase should be that there is no requirement for others to host your 'mappings'. In other words, no-one else has to hold any key:value pair on your behalf. Of course, some others have to record that your machine is hosting a specific key. However, no out of date values will be returned, since the search terminates with the hosting peer. Perhaps more importantly, responsibility for the hash space is not subdivided into bins, avoiding some of the problems of peer churn. "Until you actually cough up the papers regarding your panacea, discussion is quite pointless" The paper 'Fully Decentralised, Scalable Look-up in a Network of Peers using Small World Networks' has been available since July 2002, published at the SCI conference Orlando Florida. 
Unfortunately, we don't have copyright for electronic publishing, and people have some trouble getting hold of this volume. Of course, peer review is very valuable, so we distribute a version of this paper selectively to those who agree in turn not to publish electronically, as others on the list may confirm. This has been notified to the list before, but some time ago, so worth posting a reminder. An additional paper was published in the same month at the AAMAS conference in Bologna Italy, "A distributed implementation of the SWAN peer-to-peer look-up system using mobile agents" which will be published as part of the Springer Lecture Notes in Computer Science series in the near future. Once again, we need to be careful about electronic distribution. Finally, we have submitted a paper to the IPTPS conference for the coming year, which we are not at liberty to distribute freely since it hasn't been accepted yet. I appreciate your frustration with the controlled distribution of these papers. Our fond hope is IPTPS will publish the latest paper electronically 'in the clear', and eliminate the electronic copyright problems we have had with former papers. Thanks for your input, Oskar. Cefn From bram at gawth.com Wed Dec 4 12:10:01 2002 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods In-Reply-To: <92D69A614ED4924A9071EA4886317878A6DC01@i2km11-ukbr.domain1.systemhost.net> Message-ID: cefn.hoile@bt.com wrote: > To date we have run up to 6,000 nodes on a single PC as part of a deployed > SWAN of 100,000 nodes, but we haven't really attempted to max out the > network bandwidth. How many machines total were involved in that experiment? Was it all on stable test machines or was it in the wild? It sounds like you've got the right attitude towards testing, but I can tell you from experience that real deployed machines, with their huge churn rates, high latency, heterogeneous connectivity speeds, and general funkiness, are *very* different. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From cefn.hoile at bt.com Thu Dec 5 04:56:01 2002 From: cefn.hoile at bt.com (cefn.hoile@bt.com) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods Message-ID: <92D69A614ED4924A9071EA4886317878A6DC0C@i2km11-ukbr.domain1.systemhost.net> Regarding the experiment... > To date we have run up to 6,000 nodes on a single PC as part of a > deployed SWAN of 100,000 nodes, but we haven't really attempted to max > out the network bandwidth. BRAM: How many machines total were involved in that experiment? CEFN: There were 18 machines. In other words 18 * 6000 = 108000 nodes, so a bit more than the 100,000 figure. BRAM: Was it all on stable test machines or was it in the wild? CEFN: You could say 'stable test machines', however, originally there was something wrong with the behaviour of the TCP/IP stack within the RedHat Linux 7.1 cluster we were running it across. This caused machines to disappear from the network occasionally, and the performance of the lookup was surprisingly robust to this (as reported in the 'Mobile Agents' AAMAS paper). The UDP transport which was implemented more recently is the data I have been sharing with the list in the past few days. The UDP approach is much more efficient, and does not suffer from the TCP stack problems we had with the original setup. 
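Both transports in this discussion (Alpine's lightweight UDP-based DTCP and the SWAN UDP transport just described) rely on small acknowledged datagrams rather than a per-peer TCP connection, which is what keeps per-peer state so cheap. As a generic illustration, here is a minimal stop-and-wait send-with-acknowledgement over UDP; the framing, timeout, and retry policy are arbitrary choices for the sketch and do not describe either project's actual protocol.

```python
import socket
import threading

def send_with_ack(sock, dest, seq, payload, retries=3, timeout=1.0):
    """Send one datagram and wait for b'ACK <seq>', retransmitting on timeout.
    Stop-and-wait only; a sketch, not DTCP or the SWAN transport."""
    message = b"MSG %d %s" % (seq, payload)
    expected = b"ACK %d" % seq
    sock.settimeout(timeout)
    for _ in range(retries):
        sock.sendto(message, dest)
        try:
            data, addr = sock.recvfrom(2048)
        except socket.timeout:
            continue                      # lost datagram or lost ack: resend
        if data == expected and addr == dest:
            return True
    return False

def ack_responder(sock):
    """Answer each incoming 'MSG <seq> ...' with 'ACK <seq>'."""
    while True:
        data, addr = sock.recvfrom(2048)
        if data.startswith(b"MSG "):
            seq = data.split(b" ", 2)[1]
            sock.sendto(b"ACK " + seq, addr)

if __name__ == "__main__":
    a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM); a.bind(("127.0.0.1", 0))
    b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM); b.bind(("127.0.0.1", 0))
    threading.Thread(target=ack_responder, args=(b,), daemon=True).start()
    print(send_with_ack(a, b.getsockname(), 1, b"hello"))   # -> True
```

The acknowledgement doubles as a liveness signal, which is the same observation made later in the thread about message acknowledgements serving as polling.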
BRAM: It sounds like you've got the right attitude towards testing, but I can tell you from experience that real deployed machines, with their huge churn rates, high latency, heterogeneous connectivity speeds, and general funkiness, are *very* different. CEFN: I am certainly prepared to accept this. I wouldn't assert that there will be no problem with the deployment in the wild with more asymmetry, unexpected failure etc. However, node behaviours are deliberately designed to accommodate failure, and the impact of node arrival and removal is distributed throughout the network. Roughly, the number of nodes affected by arrivals or removals should be linear in the number of nodes arriving or being removed (this is connected to the fixed number of links which each node has, and hence the number of failed or new links required for each arrival or departure). One perspective on this argument is found in the introduction to Shlomi Dolev's book 'Self-Stabilisation'.... "A self-stabilizing system is a system that can automatically recover following the occurrence of (transient) faults. The idea is to design systems that can be started in an arbitrary state, and still converge to a desired behaviour. The occurrence of faults in the system can cause the system to reach an arbitrary state. Self-stabilising systems that experience faults recover automatically from such faults, since they are designed to start in an arbitrary state" This is of course why the experiment we did involves 100,000 nodes self-organising from scratch (the worst case scenario). If we can solve this one in a reasonable time, (which I think we have demonstrated) then transient faults (which we know will only affect a subset of nodes) are a special case which is much easier. We know that the states arising from transient faults are a graceful degradation of function, not a total loss of function. We also know that the degree of degradation is nice and linear. This means that the success of the system in the context of a specific application depends on several factors; the rate at which the network can fix itself, the rate at which nodes arrive and disappear, and the degree of lookup failure which the application can handle. From melc at fashionvictims.com Thu Dec 5 08:20:01 2002 From: melc at fashionvictims.com (melc) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] CP2PC source release Message-ID: Announcing the CP2PC SourceForge site and first code release. We have started development on our CP2PC uberclient file-sharing application. As part of the CP2PC project we have developed a minimal programming interface (API) to peer-to-peer file-sharing systems. Now, based on this API, we have started building a file-sharing uberclient that will provide seamless access to multiple file-sharing networks from a single client. In following with the open source maxim of "release early, release often" we have set up a SourceForge site for CP2PC and released the current version (0.1) of the code on this site. The URL for the site is: http://sourceforge.net/projects/cp2pc The current release contains * Core CP2PC code. This provides a default implementation of the CP2PC API, an implementation of core facilities (that is facilities that may be used by file-sharing network components, including a local RDF triple database which implements the Tristero search interfaces, a simple CP2PC shell interface, donload and upload monitors, etc.), and a skeleton component that can be extended to build new file-sharing network backends. 
* Gnutella component. This is a file-sharing network backend component that can be used to connect to and use the Gnutella network. Currently, the component provides a simple shell like interface to the network (the shell implements CP2PC API calls). The component can also be used by other programs as a library. This component uses Limewire [1] code to do the actual Gnutella specific work - it forms a bridge between the CP2PC API and the Limewire code. * GDN component. This is a file-sharing network backend component that can be used to connect to and use the GDN [2] network. Currently, the component provides a simple shell like interface to the network (the shell implements CP2PC API calls). The component can also be used by other programs as a library. The component acts as a GDN client; creating, binding to, accessing, and destroying objects as necessary. Future work for the CP2PC project will include the creation of a GUI frontend, XML-RPC interface for components and the creation of more backend components. The code is written in Java and can be downloaded as a tarball or from CVS. Required libraries (i.e., jar files) can also be downloaded from our SourceForge site. Both the tarball and CVS contain the CP2PC documentation, including a description of the API and of the mapping of the API to various file-sharing networks. The code is released under the LGPL. Note that we have also set up two mailing lists, cp2pc-announce and cp2pc-devl. Ihor. [1] http://www.limewire.org [2] http://www.cs.vu.nl/globe From levine at vinecorp.com Thu Dec 5 23:13:01 2002 From: levine at vinecorp.com (James D. Levine) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] (SF Bay Area) South Bay PeerPunks meeting Message-ID: The second monthly...er, semi-annual South Bay PeerPunks meeting will convene Monday December 16 - that's a week from next Monday at the time/place below. PeerPunks is just my clever name for the Silicon Valley contingent of p2p enthusiasts, hackers, well-wishers, etc. who can't make it up to Bram's monthly meeting in SF on a regular basis. Any and all are welcome, so please come and join in... If you don't know what I look like, just look for the guy in the red EFF "Fair Use Has A Possee" t-shirt. See you there and then. James Where: Dana Street Roasting Company 744 W Dana St, Mountain View,CA 94041 Phone: (650) 390-9638 This is just 1/2 block off Castro St. When: 7:00 pm onward, Monday December 16 From bram at gawth.com Sat Dec 7 17:49:01 2002 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] reminder: meeting tomorrow Message-ID: Remember, there's a p2p-hackers meeting tomorrow, the 8th at 3pm in the metreon. -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From aharwood at cs.mu.OZ.AU Sat Dec 7 17:55:01 2002 From: aharwood at cs.mu.OZ.AU (Aaron Harwood) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] PDPTA'03 - Massively Distributed Computing call for papers Message-ID: <9293063A-08B3-11D7-B2BD-000393716D16@cs.mu.oz.au> The 2003 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'03: June 23-26, 2003, Las Vegas, Nevada, USA) Session on Massively Distributed Computing Call for papers to be submitted by Feb. 17, 2003 http://www.cs.mu.oz.au/~aharwood/PDPTA/cfp/ Description: Achieving sustained parallelism in a massively distributed environment is paramount when harnessing the Internet's total computing resources. 
Aspects of algorithm and application designs that are negligible at small scales must be readdressed before being applied on a massive scale. Most notably is the use of completely decentralized operations and adherence to strict complexity requirements that ensures scalability. Computations must be progressive and adapt to a transient infrastructure; many of the nodes in a massive system will be upgraded or completely removed during the computation and many new nodes may be added. The massively distributed environment gives rise to new techniques, computational models and applications that are the focus of this session. Topics of interest : The session chair invites papers reporting original research results including algorithms, protocols, architectures, simulations, applications, computational models and experimental systems that support massively distributed computing. Topics include but are not limited to: scalability and reliability, load balancing and quality of service, resource discovery and directory services, routing and virtual networks, data mining, high performance computation, caching and content/service distribution networks, cooperative systems, security and system administration, programmability and transparency, interconnection networks and peer-to-peer systems, atomic operations, message passing and synchronous processing, data casting, hierarchical techniques and grid systems, autonomous systems and agent technology, data representation and visualization, emergent computation and system behavior, applications to the World Wide Web, IPv6, bluetooth and mobile computing environments. Important dates: Feb. 17, 2003: Draft papers (about 5 pages) due March 21, 2003: Notification of acceptance April 22, 2003: Camera-Ready papers & Prereg. due June 23-26, 2003: Conference Submission: Prospective authors are invited to submit three copies of their draft paper (about 5 pages - single space, font size of 10 to 12) to A. Harwood (address is given below) by Feb. 17, 2003 (Monday). E-mail submissions should use PDF or Postscript format when possible. Fax submissions are also acceptable. The length of the Camera-Ready papers (if accepted) will be limited to 7 (IEEE style) pages. Papers must not have been previously published or currently submitted for publication elsewhere. The first page of the draft paper should include: title of the paper, name, affiliation, postal address, E-mail address, telephone number, and Fax number for each author. The first page should also include the name of the author who will be presenting the paper (if accepted) and a maximum of 5 keywords. Chair of Session: Dr Aaron Harwood Department of Computer Science and Software Engineering The University of Melbourne Victoria 3010, Australia. aharwood@cs.mu.oz.au fax: +61 3 9348 1184 Conference: The 2003 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'03) will be held in Las Vegas, Nevada, June 23 - 26, 2003. The PDPTA'03 Conference will be held simultaneously (i.e., same location and dates) with a number of other international conferences and workshops (CISST'03, IC-AI'03, IC'03, METMBS'03, CIC'03, ...) The last set of conferences (PDPTA'02 and affiliated events) had research contributions from 72 countries (the event held in Las Vegas had over 1,550 participants from all over the world.) It is hoped that PDPTA'03 will also have a strong international flavor. 
For more details see: http://www.ashland.edu/~iajwa/conferences/ From pfh at mail.csse.monash.edu.au Sat Dec 7 20:48:01 2002 From: pfh at mail.csse.monash.edu.au (Paul Harrison) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods In-Reply-To: <92D69A614ED4924A9071EA4886317878A6DC04@i2km11-ukbr.domain1.systemhost.net> Message-ID: On Wed, 4 Dec 2002 cefn.hoile@bt.com wrote: > "So, in theory, SWAN should scale to 10^9s (billions) of users, if we can > assume that network is fast enough (e.g. cable/xDSL etc.), and if there is > only one node per computer. Agree ?" > > > In principle, you are right. However, the scenario of use has to be taken > into account here. > > A SWAN node is tunable using a parameter which is basically a polling > period. The more often the SWAN node wakes up and sends a message, the > faster the SWAN will converge, or the faster a new node will be > incorporated, but also the more CPU time and network bandwidth will be used. > So the tradeoff you choose will depend upon the needs of your app. > Some thoughts on this: In the absence of any other information, if we wonder how long a node is going to continue running, the answer will, on average, be the same amount of time as the node has been running. Ie a node that has been up for a long time will probably continue to be up for a while longer, while a node that has been up for a short time is less likely to be up for much longer. (this is an idea i heard on the radio, i forget who was saying it, but they were using it to estimate how long until the Berlin wall came down :-) ) Anyway this means (Circle uses these, probably other DHTs too...) - you can maintain a constant probability of a given node being up by polling at exponentially increasing intervals - if nodes start out in a passive mode (ie, not storing key->value mappings) and only activates (starts storing part of the hashtable) after a certain period, then you can be more confident of active nodes being up for a while longer, and use a longer polling interval. This also reduces network traffic, since if someone logs in, downloads one file, and logs out again (or whatever) then you won't end up shuffling the hashtable onto and off them again in a short period. cheers, Paul Email: pfh@csse.monash.edu.au one ring, no rulers, thecircle.org.au From levine at vinecorp.com Sun Dec 8 12:48:01 2002 From: levine at vinecorp.com (James D. Levine) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] RESCHEDULE: (SF Bay Area) South Bay PeerPunks meeting Message-ID: Several people have written to remind me that PeerPunks will conflict with the Creative Commons reception. That's worth rescheduling for, so please note the change- Tuesday December 17, 7pm onward James ------ The second monthly...er, semi-annual South Bay PeerPunks meeting will convene Tuesday December 17 - that's a week from next Tuesday at the time/place below. PeerPunks is just my clever name for the Silicon Valley contingent of p2p enthusiasts, hackers, well-wishers, etc. who can't make it up to Bram's monthly meeting in SF on a regular basis. Any and all are welcome, so please come and join in... If you don't know what I look like, just look for the guy in the red EFF "Fair Use Has A Possee" t-shirt. See you there and then. James Where: Dana Street Roasting Company 744 W Dana St, Mountain View,CA 94041 Phone: (650) 390-9638 This is just 1/2 block off Castro St. 
When: 7:00 pm onward, Tuesday December 17 -- From me at aaronsw.com Sun Dec 8 21:17:01 2002 From: me at aaronsw.com (Aaron Swartz) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] (SF Area) Creative Commons Launch In-Reply-To: Message-ID: <61A4C625-0B35-11D7-96C7-003065F376B6@aaronsw.com> James D. Levine wrote: > The second monthly...er, semi-annual South Bay > PeerPunks meeting will convene [...] > When: 7:00 pm onward, Monday December 16 Hm. For those of you who *are* in SF then, you're all invited to the Creative Commons Launch party: Join us in celebrating the release of our licenses at an early-evening reception featuring a chat and screening by DJ Spooky, That Subliminal Kid (NYC); a multimedia jam by People Like Us (London); and an address by Lawrence Lessig, Chairman of Creative Commons and Professor of Law, Stanford University. Plus a few surprises. When: Monday, December 16th; 6:30 pm - 9:00 pm Where: SomArts Cultural Center, 934 Brannan Street, San Francisco, California Space is limited and spots are filling up fast ? RSVP today. We look forward to seeing you there. RSVP to neeru@creativecommons.org More info: http://www.aaronsw.com/weblog/000743 -- Aaron Swartz [http://www.aaronsw.com] I'll be in SanFran for the Creative Commons launch the week of Dec15. From cefn.hoile at bt.com Mon Dec 9 12:18:01 2002 From: cefn.hoile at bt.com (cefn.hoile@bt.com) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] About search methods Message-ID: <92D69A614ED4924A9071EA4886317878A6DC19@i2km11-ukbr.domain1.systemhost.net> Paul wrote: "this means (Circle uses these, probably other DHTs too...) - you can maintain a constant probability of a given node being up by polling at exponentially increasing intervals - if nodes start out in a passive mode (ie, not storing key->value mappings) and only activates (starts storing part of the hashtable) after a certain period, then you can be more confident of active nodes being up for a while longer, and use a longer polling interval. This also reduces network traffic, since if someone logs in, downloads one file, and logs out again (or whatever) then you won't end up shuffling the hashtable onto and off them again in a short period." Cefn replies: Paul, This is a very relevant strategy for minimising churn in cell-based DHTs, and possibly for tuning polling periods for the general case of distributed systems. Exploiting this 'long-livedness' metric is sort of implicit in SWAN, and possibly other approaches, in that once you've been on the network for a while, you tend towards having links to long-lasting nodes, since the short-lived ones are replaced regularly, and you eventually will hit on long-lived ones. Effectively, choosing a node as representing one of your small world links has the dual role of being able to use them for routing, and polling them, (because you require message acknowledgement from them). Short lived nodes will be less and less well represented in your small world network links the longer you remain in the network. I'm not sure you have the same amount freedom to choose neighbours within a cell-based DHT network, once structural roles are assigned (e.g. once X is responsible for cell A and Y is responsible for cell B). However, as you mention, it is possible to selectively assign those structural roles, and of course there is some redundancy. I have a slight concern here, though, in that those 'cell responsibility' roles are somewhat monolithic. 
In other words, responsibility for a cell has a fixed load which may be more than a given machine can handle. SWAN avoids this by more granularity of responsibility (in other words responsibility for network maintenance in chunks which correlate to the number of addressible resources running on your machine). Doesn't the monolithic nature of cell assignments in cell-based DHTs mean that you have a lot of asymmetry of load, depending on whether your machine satisfies certain criteria or not? I am totally prepared to be corrected on this, but this is my initial reading of the situation. Cefn From bram at gawth.com Tue Dec 10 15:25:01 2002 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] eff event today Message-ID: The EFF is having an even today, to celebrate their space expansion, details at eff.org -Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From bdarla at KOM.tu-darmstadt.de Fri Dec 13 01:07:01 2002 From: bdarla at KOM.tu-darmstadt.de (Vasilios Darlagiannis) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] P2P scalable simulator Message-ID: <3DF9A333.3020009@kom.tu-darmstadt.de> Dear all, I am wondering if anybody could suggest a scalable simulator appropriate for P2P-related experiments. I believe that tools like ns-2 are not very suitable, since they deal with many details at lower OSI layers, which are not so important for most of the P2P experiments. What I see as important is to have a general tool capable of simulating a million of nodes with reasonable hardware requirements. Any suggestions? Best Regards, Vasilios Darlagiannis From sam at neurogrid.com Fri Dec 13 02:50:01 2002 From: sam at neurogrid.com (Sam Joseph) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] P2P scalable simulator References: <3DF9A333.3020009@kom.tu-darmstadt.de> Message-ID: <3DF9BE85.8060605@neurogrid.com> Hi Vasilios, I'm not sure what your reasonable hard ware requirements are. The NeuroGrid simulator (which simulates Gnutella, Freenet and NeuroGrid networks, and is extendable to others) tries to be as scalable as possible. Currently I run a 10,000 node network with three documents in each node, and max 60 connections per node in about 300 Megs of RAM. I would certainly like to improve on this. I am currently running million node networks but at prohibitive cost in terms of RAM. I would be very interested to try and work together to improve the memory footprint. You can check out the NeuroGrid simulator at: http://www.neurogrid.net/php/simulation.php CHEERS> SAM Vasilios Darlagiannis wrote: > Dear all, > I am wondering if anybody could suggest a scalable simulator > appropriate for P2P-related experiments. I believe that tools like > ns-2 are > not very suitable, since they deal with many details at lower OSI > layers, which are not so important for most of the P2P experiments. > What I see as important is to have a general tool capable of simulating > a million of nodes with reasonable hardware requirements. > Any suggestions? From bdarla at KOM.tu-darmstadt.de Fri Dec 13 04:32:01 2002 From: bdarla at KOM.tu-darmstadt.de (Vasilios Darlagiannis) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] Re: p2p-hackers digest, Vol 1 #253 - 2 msgs References: <20021213113053.756.6523.Mailman@capsicum.zgp.org> Message-ID: <3DF9D316.4020908@kom.tu-darmstadt.de> Hi Sam, Thank you very much for the link. The numbers you're providing are promising, and I hope that can become even better. 
I will try to make an evaluation of the Neurogrid simulator and I will come back for further discussion. Greetings, Vasilis p2p-hackers-request@zgp.org wrote: > >Hi Vasilios, > >I'm not sure what your reasonable hard ware requirements are. The >NeuroGrid simulator (which simulates Gnutella, Freenet and NeuroGrid >networks, and is extendable to others) tries to be as scalable as possible. > >Currently I run a 10,000 node network with three documents in each node, >and max 60 connections per node in about 300 Megs of RAM. I would >certainly like to improve on this. I am currently running million node >networks but at prohibitive cost in terms of RAM. I would be very >interested to try and work together to improve the memory footprint. > >You can check out the NeuroGrid simulator at: > >http://www.neurogrid.net/php/simulation.php > >CHEERS> SAM > >Vasilios Darlagiannis wrote: > From sam at neurogrid.com Fri Dec 13 07:05:02 2002 From: sam at neurogrid.com (Sam Joseph) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] Re: p2p-hackers digest, Vol 1 #253 - 2 msgs References: <20021213113053.756.6523.Mailman@capsicum.zgp.org> <3DF9D316.4020908@kom.tu-darmstadt.de> Message-ID: <3DF9FA68.7060404@neurogrid.com> Hi Vasilios Vasilios Darlagiannis wrote: > Thank you very much for the link. The numbers you're providing > are promising, and I hope that can become even better. I will try to > make an evaluation of the Neurogrid simulator and I will come back > for further discussion. Sure thing. The releases on sourceforge are a little behind, so you may want to check the source out of cvs to get the code with the best performance. I'm hoping to make some new releases in the next few weeks, various bug fixes, tidying up and features need to be documented and packaged .... CHEERS> SAM From levine at vinecorp.com Fri Dec 13 08:18:02 2002 From: levine at vinecorp.com (James D. Levine) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] REMINDER: (SF Bay Area) South Bay PeerPunks meeting next Tuesday Message-ID: Just a friendly reminder -- next Tuesday 7pm onward in Mountain View. James ------ The second monthly...er, semi-annual South Bay PeerPunks meeting will convene Tuesday December 17 - that's a week from next Tuesday at the time/place below. PeerPunks is just my clever name for the Silicon Valley contingent of p2p enthusiasts, hackers, well-wishers, etc. who can't make it up to Bram's monthly meeting in SF on a regular basis. Any and all are welcome, so please come and join in... If you don't know what I look like, just look for the guy in the red EFF "Fair Use Has A Possee" t-shirt. See you there and then. James Where: Dana Street Roasting Company 744 W Dana St, Mountain View,CA 94041 Phone: (650) 390-9638 This is just 1/2 block off Castro St. When: 7:00 pm onward, Tuesday December 17 -- From levine at vinecorp.com Tue Dec 17 11:45:01 2002 From: levine at vinecorp.com (James D. Levine) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] TONIGHT: South Bay PeerPunks meeting next Tuesday (fwd) Message-ID: See you there! James ------ The second monthly...er, semi-annual South Bay PeerPunks meeting will convene Tuesday December 17 - that's a week from next Tuesday at the time/place below. PeerPunks is just my clever name for the Silicon Valley contingent of p2p enthusiasts, hackers, well-wishers, etc. who can't make it up to Bram's monthly meeting in SF on a regular basis. Any and all are welcome, so please come and join in... 
If you don't know what I look like, just look for the guy in the red EFF "Fair Use Has A Possee" t-shirt. See you there and then. James Where: Dana Street Roasting Company 744 W Dana St, Mountain View,CA 94041 Phone: (650) 390-9638 This is just 1/2 block off Castro St. When: 7:00 pm onward, Tuesday December 17 -- From levine at vinecorp.com Tue Dec 17 11:46:01 2002 From: levine at vinecorp.com (James D. Levine) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] TONIGHT: South Bay PeerPunks meeting Message-ID: Oops. I messed up the title on the other message. It's tonight. See you there. James ------ The second monthly...er, semi-annual South Bay PeerPunks meeting will convene Tuesday December 17 - that's a week from next Tuesday at the time/place below. PeerPunks is just my clever name for the Silicon Valley contingent of p2p enthusiasts, hackers, well-wishers, etc. who can't make it up to Bram's monthly meeting in SF on a regular basis. Any and all are welcome, so please come and join in... If you don't know what I look like, just look for the guy in the red EFF "Fair Use Has A Possee" t-shirt. See you there and then. James Where: Dana Street Roasting Company 744 W Dana St, Mountain View,CA 94041 Phone: (650) 390-9638 This is just 1/2 block off Castro St. When: 7:00 pm onward, Tuesday December 17 -- From wesley at felter.org Wed Dec 18 14:53:01 2002 From: wesley at felter.org (Wes Felter) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] LGPL Java Rendezvous library Message-ID: <1040248741.1310.35.camel@arlx002.austin.ibm.com> Who will be the first Java P2P hacker to integrate this for peer discovery? ftp://ftp.strangeberry.com/pub/ -- Wes Felter - wesley@felter.org - http://felter.org/wesley/ From jlevine at bayarea.net Mon Dec 23 12:07:02 2002 From: jlevine at bayarea.net (jlevine@bayarea.net) Date: Sat Dec 9 22:12:04 2006 Subject: [p2p-hackers] [SF Bay Area] Next Silicon Valley PeerPunks meeting Message-ID: The next Silicon Valley PeerPunks meeting is tentatively scheduled for Tuesday Jan. 14. I'd like to do these monthly from now own - unless I hear major disagreement the second Tuesday of each month looks good to me. Please let me know if that poses a regular conflict for anybody. James