From coderman at gmail.com  Wed Mar  1 06:02:55 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] Re: Identify "defective" nodes
In-Reply-To: <7FE01058-EEAF-4602-B0C7-57B9E48B8984@vt.edu>
References: <du02b0$mve$1@lyon.ram.loc>
	<EPEJIODJLBDLEHGHIEADOEHJFPAB.gbildson@limepeer.com>
	<6.2.3.4.0.20060228105303.01f980a0@mail.cs.cornell.edu>
	<19196d860602280800n2674cc64s4a07565d5651fd4b@mail.gmail.com>
	<20060228161418.GN5704@cs.uoregon.edu>
	<7FE01058-EEAF-4602-B0C7-57B9E48B8984@vt.edu>
Message-ID: <4ef5fec60602282202x4d5ff8fbh83c9d3c0ab7180a4@mail.gmail.com>

On 2/28/06, H. Lally Singh <lally@vt.edu> wrote:
> ...  Hell you could go as far as a CORBA.

noooooooo!

:)

[actually, CORBA isn't that bad; it's just well suited for large
enterprise distributed systems and not a lot else]

From lally at vt.edu  Wed Mar  1 18:08:07 2006
From: lally at vt.edu (H. Lally Singh)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] Re: Identify "defective" nodes
In-Reply-To: <4ef5fec60602282202x4d5ff8fbh83c9d3c0ab7180a4@mail.gmail.com>
References: <du02b0$mve$1@lyon.ram.loc>
	<EPEJIODJLBDLEHGHIEADOEHJFPAB.gbildson@limepeer.com>
	<6.2.3.4.0.20060228105303.01f980a0@mail.cs.cornell.edu>
	<19196d860602280800n2674cc64s4a07565d5651fd4b@mail.gmail.com>
	<20060228161418.GN5704@cs.uoregon.edu>
	<7FE01058-EEAF-4602-B0C7-57B9E48B8984@vt.edu>
	<4ef5fec60602282202x4d5ff8fbh83c9d3c0ab7180a4@mail.gmail.com>
Message-ID: <DAE4CB53-96B9-4DCE-BD8C-BD8D5EF6A7A1@vt.edu>

Haha, I didn't want to get into any kind of debate; just mention that  
you can get lightweight ORBs that I've successfully used in things as  
remote as embedded realtime.  But yeah, you've gotta be real careful  
(and selective of your ORB) to keep it lightweight.


-- 
When the Boogeyman goes to sleep every night he checks his closet for  
Chuck Norris.


On Mar 1, 2006, at 1:02 AM, coderman wrote:

> On 2/28/06, H. Lally Singh <lally@vt.edu> wrote:
>> ...  Hell you could go as far as a CORBA.
>
> noooooooo!
>
> :)
>
> [actually, CORBA isn't that bad; it's just well suited for large
> enterprise distributed systems and not a lot else]
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers@zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences


From philip_matthews at magma.ca  Thu Mar  2 21:20:20 2006
From: philip_matthews at magma.ca (Philip Matthews)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] Number of pinholes supported by low-end NATs?
Message-ID: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>

Does anyone have any idea how many pinholes a P2P application
can have open at one time through a typical low-end NAT box?

Could a P2P application maintain connections to 30 peers simultaneously?
What about 50 peers? Or 100 peers?

Do the numbers differ if the messages are carried over UDP vs TCP?

(Note: My interest is in limitations in the NAT box, and NOT in any  
limitations in
the Windows, Mac, Linux, ... box on which the P2P application is  
running.)

Just wondering if anyone has any solid data in this area.

- Philip

From lemonobrien at yahoo.com  Thu Mar  2 21:45:24 2006
From: lemonobrien at yahoo.com (Lemon Obrien)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] NATs reconfiguring IPs and Port Numbers
In-Reply-To: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>
Message-ID: <20060302214524.21119.qmail@web53604.mail.yahoo.com>

It seems port numbers change after a certain amount of time due to the local
NAT or local ISP, and the peer to peer application has to re-configure
itself to find out what its new global address is...and broadcast that to
others so they can re-connect; does anyone know what the average time is a
port number is good for? I'm getting up to 24 hours testing through Comcast
and SBC; but i have to sleep so...i'm not sure on this number.

thanks
lemon


You don't get no juice unless you squeeze
Lemon Obrien, the Third.
From coderman at gmail.com  Thu Mar  2 22:44:33 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] NATs reconfiguring IPs and Port Numbers
In-Reply-To: <20060302214524.21119.qmail@web53604.mail.yahoo.com>
References: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>
	<20060302214524.21119.qmail@web53604.mail.yahoo.com>
Message-ID: <4ef5fec60603021444k323d801icd0b4134e5450897@mail.gmail.com>

On 3/2/06, Lemon Obrien <lemonobrien@yahoo.com> wrote:
> ...
> It seems port numbers change after a certain amount of time due to the local
> NAT or local ISP, and the peer to peer application has to re-configure
> itself to find out what its new global address is...and broadcast that to
> others so they can re-connect; does anyone know what the average time is a
> port number is good for? I'm getting up to 24 hours testing through Comcast
> and SBC; but i have to sleep so...i'm not sure on this number.

the longest time i've experienced on a dynamic address assigned DSL
connection provided by verizon using a linux 2.6 based NAT is 4 months
continuous.

if you are using a lot of traffic or endpoints, verizon, comcast, and
other providers appear to roll your endpoint more frequently.  i've
got a friend on verizon FiOS who gets an endpoint roll every few score
minutes (few times per hour) when running a large number of torrents.

i'm not aware of a thorough analysis of provider and NAT behavior
under varying conditions but this would certainly be useful to track
over time.
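
A rough sketch of what such tracking could look like: given a log of
(timestamp, observed external endpoint) samples, compute how long each
endpoint lasted before it rolled. The function name and the log format are
assumptions made for illustration; nothing here comes from an existing tool.

```python
from typing import List, Tuple

def endpoint_lifetimes(observations: List[Tuple[float, str]]) -> List[float]:
    """Return how many seconds each endpoint lasted before it changed.

    `observations` is a hypothetical log of (timestamp_seconds, endpoint)
    samples sorted by time. The endpoint still in use at the end of the log
    is not counted, because its true lifetime is not yet known.
    """
    lifetimes: List[float] = []
    start_time = None
    current = None
    for timestamp, endpoint in observations:
        if current is None:
            # First sample: start timing the first endpoint.
            start_time, current = timestamp, endpoint
        elif endpoint != current:
            # Endpoint rolled: record how long the previous one was seen.
            lifetimes.append(timestamp - start_time)
            start_time, current = timestamp, endpoint
    return lifetimes

if __name__ == "__main__":
    log = [(0, "198.51.100.7:40000"), (600, "198.51.100.7:40000"),
           (1200, "198.51.100.7:40012"), (3000, "198.51.100.9:40012")]
    print(endpoint_lifetimes(log))
```

Collected per provider and per NAT model, lifetimes like these would give
the kind of data asked for earlier in the thread.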

From hopper at omnifarious.org  Thu Mar  2 22:54:07 2006
From: hopper at omnifarious.org (Eric Hopper)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] NATs reconfiguring IPs and Port Numbers
In-Reply-To: <4ef5fec60603021444k323d801icd0b4134e5450897@mail.gmail.com>
References: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>
	<20060302214524.21119.qmail@web53604.mail.yahoo.com>
	<4ef5fec60603021444k323d801icd0b4134e5450897@mail.gmail.com>
Message-ID: <20060302225407.GA14429@omnifarious.org>

On Thu, Mar 02, 2006 at 02:44:33PM -0800, coderman wrote:
> if you are using a lot of traffic or endpoints, verizon, comcast, and
> other providers appear to roll your endpoint more frequently.  i've
> got a friend on verizon FiOS who gets an endpoint roll every few score
> minutes (few times per hour) when running a large number of torrents.

I wonder if that kind of behavior is one of the reasons IPv6 isn't being
rolled out much by ISPs.  It would make doing that kind of thing a lot
harder.

I think though that consumer level NAT hardware might have some sort of
limit on the number of mappings it can keep in memory and apply to
packets though.

*sigh*,
-- 
"It does me no injury for my neighbor to say there are twenty gods or no God.
It neither picks my pocket nor breaks my leg."  --- Thomas Jefferson
"Go to Heaven for the climate, Hell for the company."  -- Mark Twain
-- Eric Hopper (http://www.omnifarious.org/~hopper) --
From saikat at cs.cornell.edu  Thu Mar  2 23:15:49 2006
From: saikat at cs.cornell.edu (Saikat Guha)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] NATs reconfiguring IPs and Port Numbers
In-Reply-To: <20060302214524.21119.qmail@web53604.mail.yahoo.com>
References: <20060302214524.21119.qmail@web53604.mail.yahoo.com>
Message-ID: <1141341349.25273.40.camel@localhost.localdomain>

On Thu, 2006-03-02 at 13:45 -0800, Lemon Obrien wrote:
> It seems port numbers change after a certain amount of time due to the
> local NAT or local ISP

Do you mean the external port number allocated by the NAT? Most NATs
will time out idle connections; sending subsequent packets will cause a
new allocation. The length of the timeout depends on the transport
protocol; the new allocation depends on the mapping type of the NAT.

For TCP, most NATs time out somewhere between 1 and 2 hours. Some (rather
aggressive) NATs will time out within 10-15 minutes of inactivity.
Detailed numbers here: http://nutss.net/stunt-results.php?sort=-33 

Standards in progress are trying to peg the default TCP timeout to 2h.
For UDP, the timeout is significantly shorter. Standards are setting the
UDP timeout to 5 minutes of inactivity. 
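
To make the timeout figures concrete, here is a minimal sketch of how an
application might pick a keepalive interval from them. The timeout values
are the ones quoted in this message; the function name, the dictionary and
the 0.5 safety factor are assumptions for illustration only.

```python
# Idle timeouts in seconds, taken from the figures quoted in this message.
ASSUMED_IDLE_TIMEOUT_S = {
    "udp": 5 * 60,              # proposed UDP default: 5 minutes
    "tcp": 2 * 60 * 60,         # proposed TCP default: 2 hours
    "tcp_aggressive": 10 * 60,  # aggressive NATs: 10-15 minutes
}

def keepalive_interval(transport: str, safety_factor: float = 0.5) -> float:
    """Return seconds between keepalives as a fraction of the assumed timeout.

    `safety_factor` is a hypothetical margin; sending well before the
    timeout leaves room for a lost keepalive packet.
    """
    if not 0 < safety_factor < 1:
        raise ValueError("safety_factor must be between 0 and 1")
    return ASSUMED_IDLE_TIMEOUT_S[transport] * safety_factor

if __name__ == "__main__":
    for name in ASSUMED_IDLE_TIMEOUT_S:
        print(name, keepalive_interval(name))
```

An application that cannot tell what kind of NAT it sits behind would
probably have to assume the most aggressive timeout it wants to survive.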

> , and the peer to peer application has to re-configure itself to find
> out what its new global address is
> ...and broadcast that to others so they can re-connect;

That said, the standards are also trying to ensure NATs have consistent
mapping / cone behavior / address and port independent mapping.
Consequently, the need to re-publish this information should diminish
over time.

> I'm getting up to 24 hours testing through Comcast and SBC; but i have
> to sleep so...i'm not sure on this number.

Hmmm. This should not be an ISP issue unless the ISP is putting you
behind a NAT. Also, your subject seems to suggest your IP address is
changing -- is comcast/sbc assigning you new DHCP addresses every few
days? Perhaps the DHCP lease time can hint at the value they are using.
In any event, as long as your endpoint renews your DHCP lease before it
expires, it shouldn't change IP addresses.

-- 
Saikat
From saikat at cs.cornell.edu  Thu Mar  2 23:24:29 2006
From: saikat at cs.cornell.edu (Saikat Guha)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] NATs reconfiguring IPs and Port Numbers
In-Reply-To: <20060302225407.GA14429@omnifarious.org>
References: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>
	<20060302214524.21119.qmail@web53604.mail.yahoo.com>
	<4ef5fec60603021444k323d801icd0b4134e5450897@mail.gmail.com>
	<20060302225407.GA14429@omnifarious.org>
Message-ID: <1141341869.25273.49.camel@localhost.localdomain>

On Thu, 2006-03-02 at 14:54 -0800, Eric Hopper wrote:
> I think though that consumer level NAT hardware might have some sort of
> limit on the number of mappings it can keep in memory and apply to
> packets though.

Certain (really old) NAT models (/firmware) did indeed limit the number
of simultaneous connections to 256. They are really hard to find these
days. The lowest limit I can find is 1000
(http://nutss.net/stunt-results.php?sort=-9), but most NATs support
roughly 65K mappings these days.

You'd think they'd allow memory to fill up before doing any garbage
collection of stale connections; turns out vendors would rather do the
timeout thing (some fud about DoS etc. check the behave list for
messages from Hoffman and Srisuresh on the topic) -- so for the most
part, 65K is more than the app can use anyways and inactivity timeouts
are the primary concern here.

-- 
Saikat
From saikat at cs.cornell.edu  Thu Mar  2 23:38:00 2006
From: saikat at cs.cornell.edu (Saikat Guha)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] Number of pinholes supported by low-end NATs?
In-Reply-To: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>
References: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>
Message-ID: <1141342680.25273.59.camel@localhost.localdomain>

On Thu, 2006-03-02 at 16:20 -0500, Philip Matthews wrote:
> Does anyone have any idea how many pinholes a P2P application
> can have open at one time through a typical low-end NAT box?

I just posted about this and linked to some related data in the previous
thread. Most NATs you can buy off the shelf support upwards of a few
thousand; 65k in many cases.

> Do the numbers differ if the messages are carried over UDP vs TCP?

They may. There are two things to consider here (besides UDP/TCP). One is
the NAT mapping type (think "cone" or not), and the other is the NAT
filtering type (think "full", "restricted" etc.) 

Intuitively, full cone NATs need only keep track of the mapping (just
the local port) -- they can potentially support infinite simultaneous
sessions from the same local port; and support ~65K such local ports.

Non-cone, or restricted cone NATs need to track each session (both local
port, and destination) -- they can support roughly 65K simultaneous
sessions combined over all choices of local ports.

This number can differ for UDP and TCP; and in some cases, the combined
number of simultaneous TCP and UDP sessions could be subject to a sum
total of ~65K.

If you are looking for some really absolute pessimistic lower bounds,
the most conservative I'd go would be ~1K simultaneous sessions (TCP and
UDP combined) through the NAT.
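
The difference between the two bookkeeping styles can be sketched by
counting table entries for a set of sessions. This is only an illustration
of the reasoning above, not a model of any particular NAT; the function
names and the 1000-entry default are assumptions (the latter is the
pessimistic lower bound just mentioned).

```python
from typing import Iterable, Tuple

Destination = Tuple[str, int]
Session = Tuple[int, Destination]  # (local port, (destination IP, port))

def table_entries(sessions: Iterable[Session], per_session: bool) -> int:
    """Count the NAT table entries needed for `sessions`.

    per_session=False: one entry per local port (the full cone case).
    per_session=True: one entry per (local port, destination) pair
    (the non-cone or restricted cone case).
    """
    if per_session:
        return len({(local_port, dest) for local_port, dest in sessions})
    return len({local_port for local_port, _dest in sessions})

def fits(entries: int, limit: int = 1000) -> bool:
    """Check an entry count against an assumed table size limit."""
    return entries <= limit

if __name__ == "__main__":
    # One UDP socket on local port 5000 talking to 100 different peers.
    sessions = [(5000, ("10.0.0.%d" % i, 6881)) for i in range(100)]
    print(table_entries(sessions, per_session=False))
    print(table_entries(sessions, per_session=True))
```

Under these assumptions, 30, 50 or 100 peers stay well below even the
pessimistic bound in either case.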

cheers,
-- 
Saikat
From saikat at cs.cornell.edu  Thu Mar  2 23:47:47 2006
From: saikat at cs.cornell.edu (Saikat Guha)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] NATs reconfiguring IPs and Port Numbers
In-Reply-To: <1141341349.25273.40.camel@localhost.localdomain>
References: <20060302214524.21119.qmail@web53604.mail.yahoo.com>
	<1141341349.25273.40.camel@localhost.localdomain>
Message-ID: <1141343267.25273.65.camel@localhost.localdomain>

On Thu, 2006-03-02 at 18:15 -0500, Saikat Guha wrote:
> Detailed numbers here: http://nutss.net/stunt-results.php?sort=-33 

The column of interest is on the far right, third last, titled 'Timer
ESTD'. This is the time an established TCP connection will stay up
through the NAT despite inactivity (e.g. SSH left on for days without
activity). The column reports the lower-bound and upper-bound inside of
which the true timer value lies. 
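
One way such a lower-bound/upper-bound pair could be obtained is by
bisection over idle periods. The sketch below is an assumption about the
general idea, not a description of how the linked results were actually
measured; `survives` is a hypothetical probe that leaves a connection idle
for the given number of seconds and reports whether it still works.

```python
from typing import Callable, Tuple

def bracket_timeout(survives: Callable[[float], bool],
                    low: float, high: float,
                    tolerance: float) -> Tuple[float, float]:
    """Narrow [low, high] around the NAT's idle timer.

    On return the connection survives `low` idle seconds but not `high`,
    and the two bounds are at most `tolerance` seconds apart.
    """
    if not survives(low) or survives(high):
        raise ValueError("initial bounds do not bracket the timer")
    while high - low > tolerance:
        mid = (low + high) / 2
        if survives(mid):
            low = mid   # still alive: the timer is longer than mid
        else:
            high = mid  # dropped: the timer is at most mid
    return low, high

if __name__ == "__main__":
    # Simulated NAT with a made-up timer, standing in for real probes.
    TRUE_TIMEOUT = 3700.0
    print(bracket_timeout(lambda idle: idle < TRUE_TIMEOUT,
                          60.0, 86400.0, 60.0))
```

Each real probe costs as much wall-clock time as the idle period being
tested, which is presumably why a bracket is reported rather than an exact
value.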

cheers,
-- 
Saikat
From seth.johnson at RealMeasures.dyndns.org  Fri Mar  3 00:10:31 2006
From: seth.johnson at RealMeasures.dyndns.org (Seth Johnson)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] Important Statement to Review for Signing
Message-ID: <44078977.8F6EFFB0@RealMeasures.dyndns.org>


Hello folks,

Please review the important joint statement below, related to the
WIPO Broadcaster's Treaty, and consider adding your signature if
you are an American citizen.  Also make sure those you know who
should sign are also given the opportunity.

Andy Oram has written a good letter to the US Delegation to WIPO
on the subject:
> http://www.oreillynet.com/pub/a/etel/2006/01/13/the-problem-with-webcasting.html?page=2

CPTech Links on the Treaty:
> http://www.cptech.org/ip/wipo/bt/index.html#Coments
Electronic Frontier Foundation Links:
> http://www.eff.org/IP/WIPO/broadcasting_treaty/
IP Justice Links:
> http://www.ipjustice.org/WIPO/broadcasters.shtml
Union for the Public Domain Links:
> http://www.public-domain.org/?q=node/47

The Latest Draft of the Treaty:
> http://www.cptech.org/ip/wipo/sccr12.2rev2.doc

A survey of relevant links:
> http://www.hyperorg.com/blogger/mtarchive/wipo_and_the_war_against_the_i.html

If you choose to sign, please send your name along with an
affiliation or appropriate short phrase to attach to your name
for identification purposes, to mailto:seth.p.johnson@gmail.com. 
If your organization endorses the statement, please indicate that
separately, so your organization will be listed under that
header.

Thank you for consideration.


Seth Johnson
Corresponding Secretary
New Yorkers for Fair Use


Joint Statement to Congress:


Dear (Relevant Congressional Committees) (cc the WIPO
Delegation):

Negotiations are currently underway at the World Intellectual
Property Organization (WIPO) to develop a treaty giving
broadcasters power to suppress currently lawful communications.
The United States delegation is also advocating similar rights
for "webcasters" through which the authors of new works
communicate them to the public.

Some provisions of the proposed "Treaty on the Protection of
Broadcasting Organizations" would merely update and standardize
existing legal norms, but several proposals would require
Congress to enact sweeping new laws that give private parties
control over information, communication, and even copyrighted
works of others, whenever they have broadcast or "webcast" the
work.

The novel policy areas addressed by this treaty go beyond
ordinary treaty-making that seeks worldwide adherence to U.S.
policy. Instead, this initiative invades Congress' prerogative to
develop and establish national policy.  Indeed, even as Congress
is debating how best to protect network neutrality, treaty
negotiators are debating how to eliminate it.

The threat to personal liberties presented by this treaty is too
grave to allow these new policy initiatives be handed over to an
unelected delegation to negotiate with foreign countries, leaving
Congress with the sole option whether to acquiesce.  When dealing
with policies that are related to copyright and communications,
Congress's assigned powers and responsibility under Article I,
Section 8 of the Constitution become particularly important.  We
urge two important steps.  First, the new proposed regulations
should be published in the Federal Register, with an invitation
to the public to comment. Second, the appropriate House and
Senate committees should hold hearings to more fully explore the
impact of these novel legal restrictions on commerce, freedom of
speech, copyright holders, network neutrality, and communications
policy.

Americans currently enjoy substantial freedoms with respect to
broadcast and webcast communications.  Under the proposed treaty,
the existing options available to commercial enterprises and
entrepreneurs as well as the general public to communicate news,
information and entertainment would be limited by a new private
gatekeeper who adds nothing of value to the content.
Communications policies currently under discussion at the FCC
would be impacted.  Individuals and small businesses would be
limited in their freedom of speech.  Copyright owners would find
their freedom to license their works limited by whether the work
had been broadcast or webcast.  The principle of network
neutrality, already the subject of congressional hearings, would
be all but destroyed.

As able as the staff of the United States Patent and Trademark
Office and the Library of Congress may be, it was never intended
that they alone should stake out the United States national
policy to be promoted before an unelected international body in
entirely new areas abridging civil liberties. Congress should be
the first to establish America's national policies in this new
area so that our WIPO delegation will have sufficient guidance to
achieve legitimate objectives without impairing Constitutional
principles such as freedom of speech and assembly, without
impairing the value of copyrights, and without granting to
private parties arbitrary power to suppress existing freedoms or
burden new technologies.

We cannot afford for Congress to wait for the Senate to be
presented with a fully formed treaty calling for the enacting of
domestic law at odds with fundamental American liberties foreign
to American and international legal norms, and that would bring
to a close many of the benefits of widespread personal computing
and the end-to-end connectivity brought by the Internet.  We ask
Congress to use its authority now to shape these important
communications policies impacting constitutionally based
copyright laws and First Amendment liberties.

Signed,

(Affiliations for individual signers are for identification
only.  Endorsing organizations are listed separately.)


    William Abernathy, Independent Technical Editor
    Scottie D. Arnett, President, Info-Ed, Inc.
    Jonathan Askin, Pulver.com
    John Bachir, Ibiblio.org
    Tom Barger, DMusic.com
    Fred Benenson, FreeCulture.org
    Daniel Berninger, VON Coalition
    Eric Blossom, GNU Radio
    Joshua Breitbart, Media Tank
    Dave Burstein, Editor, DSL Prime
    Michael Calabrese, Vice President, New America Foundation
    Dave A. Chakrabarti, Community Technologist, CTCNet Chicago
    Steven Cherry, Senior Associate Editor, IEEE Spectrum
    Steven Clift, Publicus.Net
    Roland J. Cole, J.D., Ph.D., Executive Director, Software 
        Patent Institute
    Gordon Cook, Editor, Publisher and Owner since 1992 of the 
        COOK Report on Internet Protocol
    Walt Crawford, Editor/Publisher, Cites & Insights
    Cynthia H. de Lorenzi, Washington Bureau for ISP Advocacy
    Cory Doctorow, Author, journalist, Fulbright Chair, EFF 
        Fellow
    Marshall Eubanks, CEO, AmericaFree.tv
    Harold Feld, Senior Vice President, Media Access Project
    Miles R. Fidelman, President, The Center for Civic Networking
    Richard Forno  (bio: http://www.infowarrior.org/rick.html)
    Laura N. Gasaway, Professor of Law, University of North 
        Carolina
    Paul Gherman, University Librarian, Vanderbilt University
    Shubha Ghosh, Professor of Law, Southern Methodist University
    Paul Ginsparg, Cornell University
    Fred R. Goldstein, Ionary Consulting
    Robin Gross, IP Justice
    Michael Gurstein, New Jersey Institute for Technology
    Jon Hall, President, Linux International
    Chuck Hamaker, Atkins Library, University of North Carolina -
        Charlotte
    Charles M. Hannum, consultant, founder of The NetBSD Project
    Dewayne Hendricks, CEO, Dandin Group
    David R Hughes, CEO, Old Colorado City Communications, 1993 
        EFF Pioneer Award
    Paul Hyland, Computer Professionals for Social Responsibility
    David S. Isenberg, Ph.D., Founder & CEO, isen.com, LLC
    Seth Johnson, New Yorkers for Fair Use
    Paul Jones, School of Information and Library Science, 
        University of North Carolina - Chapel Hill
    Peter D. Junger, Professor of Law Emeritus, Case Western 
        Reserve University
    Brewster Kahle, Internet Archive
    Jerry Kang, Professor of Law, UCLA School of Law
    Dennis S. Karjala, Jack E. Brown Professor of Law, Arizona 
        State University
    Dan Krimm, Independent Musician
    Michael J. Kurtz, Astronomer and Computer Scientist, Harvard-
        Smithsonian Center for Astrophysics
    Michael Maranda, President, Association For Community 
        Networking
    Kevin Marks, mediAgora
    Anthony McCann, www.beyondthecommons.com
    Sascha Meinrath, Champaign-Urbana Community Wireless Network, 
        Free Press
    Edmund Mierzwinski, Consumer Program Director, U.S. Public 
        Interest Research Group
    Lee N. Miller, Ph.D., Editor Emeritus, Ecological Society of 
        America
    John Mitchell, InteractionLaw
    Tom Moritz, Chief, Knowledge Management, Getty Research 
        Institute
    Andrew Odlyzko, University of Minnesota
    Ken Olthoff, Advisory Board, EFF Austin
    Andy Oram, Editor, O'Reilly Media
    Bruce Perens (bio at http://perens.com/Bio.html)
    Ian Peter, Senior Partner, Ian Peter and Associates Pty Ltd
    Malla Pollack, Law Professor, American Justice School of Law
    Jeff Pulver, Pulver.com
    Tom Raftery, PodLeaders.com
    David P. Reed, contributor to original Internet Protocol 
        design
    Jerome H. Reichman, Bunyan S. Womble Professor of Law
    Lawrence Rosen, Rosenlaw & Einschlag; Stanford University 
        Lecturer in Law
    Bruce Schneier, security technologist and CTO, Counterpane
    David J. Smith, Specialist of Distributed Content 
        Distribution and Protocols, Michigan State University
    Michael E. Smith, LXNY
    Richard Stallman, President, Free Software Foundation
    Fred Stutzman, Ph.D. Student, UNC Chapel Hill
    Peter Suber, Open Access Project Director, Public Knowledge
    Jay Sulzberger, New Yorkers for Fair Use
    Aaron Swartz, infogami
    Stephen H. Unger, Professor, Computer Science Department, 
        Columbia University
    Eric F. Van de Velde, Ph.D., Director, Library Information 
        Technology, California Institute of Technology
    Tom Vogt, independent computer security researcher
    David Weinberger, Harvard Berkman Center
    Frannie Wellings, Free Press
    Adam Werbach, President, Ironweed Films
    Stephen Wolff, igewolff.net
    Brett Wynkoop, Wynn Data Ltd.
    John Young, Cryptome.org


Endorsing Organizations:

    Association For Community Networking (AFCN)
    The Center for Civic Networking
    Computer Professionals for Social Responsibility
    Contact Communications
    The COOK Report on Internet Protocol
    Cryptome.org
    Champaign-Urbana Community Wireless Network
    Dandin Group
    FreeCulture.org
    Free Press
    Free Software Foundation
    Illinois Community Technology Coalition
    Internet Archive
    Ionary Consulting
    IP Justice
    isen.com, LLC
    mediAgora
    New Yorkers for Fair Use
    Old Colorado City Communications
    Podleaders.com
    Pulver.com
    Rosenlaw & Einschlag
    U.S. Public Interest Research Group
    Washington Bureau for ISP Advocacy
    Wyoming.com

--

[separate one-page attachment]

WHY PUBLIC SCRUTINY OF THE PROPOSED BROADCASTER TREATY IS NEEDED

If Congress were to hold public hearings, or if the US delegation
to WIPO were to publish the current proposal for public review
and comment, myriad voices from various segments of society could
come forward to show that the proposed Broadcaster's Treaty:

   * Is written to look like existing copyright treaties, but it 
     is not based on the constitutional requirements for 
     copyright protection, such as originality, and in fact is 
     antagonistic to copyrights

   * Is promoted as a way of standardizing existing signal 
     protection, but in fact extends well beyond signal 
     protection by giving broadcasters and webcasters a monopoly, 
     for 50 years, over the content created by others the moment 
     it is broadcast or transmitted over the Internet

   * Gives broadcasters greater rights than producers of original 
     works

   * Accords exclusive rights to non-authors in direct violation 
     of fundamental rights guaranteed by the Constitution

   * Attacks the principle of network neutrality which serves as 
     the basis by which the Internet has fostered a profound 
     expansion in human capacities and innovation

   * Grants privileges that extend beyond broadcast signals to 
     actually give broadcasters control over works conveyed 
     within a broadcast -- including copyrighted and public 
     domain works

   * Blocks fair use and other copyright provisions that enable 
     the public to make use of and benefit from published 
     information

   * Chills freedom of expression by extending unwarranted 
     controls over broadcast publication

   * Benefits broadcasters at the expense of the web, the public 
     and future innovation

   * Creates a de facto tax on copyrights, freedom of speech, 
     communications and technological progress, all for the 
     benefit of broadcasters and webcasters who have added 
     nothing to deserve such a windfall.


From slavitch at gmail.com  Fri Mar  3 00:42:14 2006
From: slavitch at gmail.com (Michael Slavitch)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] Number of pinholes supported by low-end NATs?
In-Reply-To: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>
References: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>
Message-ID: <1fc32c8f0603021642h4ba80b1dy@mail.gmail.com>

In theory 65K-2. In practise, "less" but lots.  Our testing seems to support
values that are like this:


1
4
16
64
256
1K
4K
16K
64K

With more recent boxes supporting the higher end in test cases if not in
real life.  CPU and bandwidth cack out before ports do.

Gaming forced this.

"Do the numbers differ if the messages are carried over UDP vs TCP?"

If you can guess how much RAM is available you can guess the size of the
pinhole range.

Look at ( http://openwrt.org/)



--
Michael Slavitch
Ottawa, Ontario Canada
From matthew at matthew.at  Fri Mar  3 03:08:30 2006
From: matthew at matthew.at (Matthew Kaufman)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] NATs reconfiguring IPs and Port Numbers
In-Reply-To: <20060302214524.21119.qmail@web53604.mail.yahoo.com>
Message-ID: <00c401c63e6f$c29e15f0$0202fea9@matthewdesk>

If your protocol supports IP address mobility (preferably with protection
against using that against an endpoint for session hijacking, and preferably
protecting that against replay as well), then existing peers wouldn't need
to "re-connect", they'd just be able to keep their connections up through
the change in address,... though you would want to re-determine your
external address/port so any new peers could connect to that. Conveniently,
knowledge that your address had undergone a mobility event would be exactly
how you'd know when you needed to do that re-determination.
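
The re-determination step can be sketched independently of any particular
protocol. The class below is a hypothetical illustration, not the protocol
mentioned in this message: it remembers the last externally observed
endpoint and calls a caller-supplied `republish` function only when that
endpoint changes.

```python
from typing import Callable, Optional, Tuple

Endpoint = Tuple[str, int]  # (external IP address, external port)

class EndpointWatcher:
    """Track the externally visible endpoint and republish on change."""

    def __init__(self, republish: Callable[[Endpoint], None]) -> None:
        self._current: Optional[Endpoint] = None
        self._republish = republish

    def observe(self, endpoint: Endpoint) -> bool:
        """Record an observed endpoint; return True if it was new.

        The very first observation counts as a change, so the initial
        endpoint is published too.
        """
        if endpoint == self._current:
            return False
        self._current = endpoint
        self._republish(endpoint)
        return True

if __name__ == "__main__":
    watcher = EndpointWatcher(lambda ep: print("publishing", ep))
    watcher.observe(("203.0.113.5", 40000))
    watcher.observe(("203.0.113.5", 40000))  # unchanged, nothing published
    watcher.observe(("203.0.113.5", 40012))  # port rolled, republish
```

How the endpoint is observed (a rendezvous server, a peer reporting the
source address it sees, and so on) is left open here.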
 
I happen to have designed, published the specification for, and implemented
a protocol that does just this.
 
Matthew Kaufman
matthew@matthew.at
http://www.amicima.com


From eugen at leitl.org  Fri Mar  3 11:33:11 2006
From: eugen at leitl.org (Eugen Leitl)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] NATs reconfiguring IPs and Port Numbers
In-Reply-To: <20060302225407.GA14429@omnifarious.org>
References: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>
	<20060302214524.21119.qmail@web53604.mail.yahoo.com>
	<4ef5fec60603021444k323d801icd0b4134e5450897@mail.gmail.com>
	<20060302225407.GA14429@omnifarious.org>
Message-ID: <20060303113311.GD25017@leitl.org>

On Thu, Mar 02, 2006 at 02:54:07PM -0800, Eric Hopper wrote:

> I think though that consumer level NAT hardware might have some sort of
> limit on the number of mappings it can keep in memory and apply to
> packets though.

Most consumer NAT implementations are indeed buggy and limited (P2P
crashing consumer firewalls is well-documented), which is the main 
reason I've moved to a m0n0wall on a WRAP.

-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820            http://www.ativel.com
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
From googaya at gmail.com  Fri Mar  3 12:33:55 2006
From: googaya at gmail.com (googaya)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] NATs reconfiguring IPs and Port Numbers
In-Reply-To: <20060303113311.GD25017@leitl.org>
References: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>	<20060302214524.21119.qmail@web53604.mail.yahoo.com>	<4ef5fec60603021444k323d801icd0b4134e5450897@mail.gmail.com>	<20060302225407.GA14429@omnifarious.org>
	<20060303113311.GD25017@leitl.org>
Message-ID: <440837B3.3000206@gmail.com>

Eugen Leitl wrote:

>On Thu, Mar 02, 2006 at 02:54:07PM -0800, Eric Hopper wrote:
>
>>I think though that consumer level NAT hardware might have some sort of
>>limit on the number of mappings it can keep in memory and apply to
>>packets though.
>
>Most consumer NAT implementations are indeed buggy and limited (P2P
>crashing consumer firewalls is well-documented), which is the main
>reason I've moved to a m0n0wall on a WRAP.

Good point. We have had excellent success using OpenVPN, and can connect
to an Asterisk/OpenSER with no problems no matter the location:
enterprise networks, hotels/motels, Starbucks, etc.
-E
http://googaya.com


From m.rogers at cs.ucl.ac.uk  Fri Mar  3 15:42:29 2006
From: m.rogers at cs.ucl.ac.uk (Michael Rogers)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] Number of pinholes supported by low-end NATs?
In-Reply-To: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>
References: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>
Message-ID: <440863E5.2050009@cs.ucl.ac.uk>

Hi Philip,

Gnutella ultrapeers usually have 32 connections to other ultrapeers and 
around the same number to leaf peers. I'd guess a lot of ultrapeers are 
behind domestic NATs but I don't know how you'd find out for sure - 
maybe LimeWire collects some stats from its firewall detection logic?

Cheers,
Michael


Philip Matthews wrote:
> Does anyone have any idea how many pinholes a P2P application
> can have open at one time through a typical low-end NAT box?
> 
> Could a P2P application maintain connections to 30 peers simultaneously?
> What about 50 peers? Or 100 peers?
> 
> Do the numbers differ if the messages are carried over UDP vs TCP?
> 
> (Note: My interest is in limitations in the NAT box, and NOT in any  
> limitations in
> the Windows, Mac, Linux, ... box on which the P2P application is  running.)
> 
> Just wondering if anyone has any solid data in this area.
> 
> - Philip
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers@zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences


From bcg at utas.edu.au  Tue Mar  7 01:35:56 2006
From: bcg at utas.edu.au (Brad Goldsmith)
Date: Sat Dec  9 22:13:10 2006
Subject: [p2p-hackers] Mailing lists for P2P conference announcements?
In-Reply-To: <440863E5.2050009@cs.ucl.ac.uk>
References: <8C1DD971-379F-45F8-8169-8238C52D03F6@magma.ca>
	<440863E5.2050009@cs.ucl.ac.uk>
Message-ID: <6103992410571a1b0388715d8f088442@utas.edu.au>

Hi All,

What mailing lists (or other resources) do people here use for 
conference announcements?

I've just joined the p2p research group mailing list at the ietf 
(p2prg@ietf.org).

Just wondering if anyone can suggest some other good lists for keeping 
up with calls for papers, etc?

I always seem to miss them!

Cheers,
Brad

---

Brad Goldsmith
School of Computing
University of Tasmania, Tasmania, Australia
Office: Launceston Campus, Computing Building, V-177
Telephone: (03) 6324 3389 International: +61-3-6324 3389
Facsimile: (03) 6324 3368 International: +61-3-6324 3368


From lemonobrien at yahoo.com  Tue Mar  7 01:47:13 2006
From: lemonobrien at yahoo.com (Lemon Obrien)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Mailing lists for P2P conference announcements?
In-Reply-To: <6103992410571a1b0388715d8f088442@utas.edu.au>
Message-ID: <20060307014713.12935.qmail@web53607.mail.yahoo.com>

why do you want to go to a conference on peer-to-peer technology? the state of the art is here.

Brad Goldsmith <bcg@utas.edu.au> wrote:

Hi All,

What mailing lists (or other resources) do people here use for 
conference announcements?

I've just joined the p2p research group mailing list at the ietf 
(p2prg@ietf.org).

Just wondering if anyone can suggest some other good lists for keeping 
up with calls for papers, etc?

I always seem to miss them!

Cheers,
Brad

You don't get no juice unless you squeeze
Lemon Obrien, the Third.
From dbarrett at quinthar.com  Tue Mar  7 01:49:59 2006
From: dbarrett at quinthar.com (David Barrett)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Mailing lists for P2P conference announcements?
In-Reply-To: <20060307014713.12935.qmail@web53607.mail.yahoo.com>
Message-ID: <20060307015002.D7A1A3FD10@capsicum.zgp.org>

Speaking of which, we should have another P2P in SFC meetup.  The last one
was a good time.  Any takers?

 
  _____  

From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On
Behalf Of Lemon Obrien
Sent: Monday, March 06, 2006 5:47 PM
To: Peer-to-peer development.
Subject: Re: [p2p-hackers] Mailing lists for P2P conference announcements?

 
why do you want to go to a conference on peer-to-peer technology? the state
of the art is here.

Brad Goldsmith <bcg@utas.edu.au> wrote: 

Hi All,

What mailing lists (or other resources) do people here use for 
conference announcements?

I've just joined the p2p research group mailing list at the ietf 
(p2prg@ietf.org).

Just wondering if anyone can suggest some other good lists for keeping 
up with calls for papers, etc?

I always seem to miss them!

Cheers,
Brad

From ap at hamachi.cc  Tue Mar  7 01:53:15 2006
From: ap at hamachi.cc (Alex Pankratov)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Mailing lists for P2P conference announcements?
In-Reply-To: <20060307015002.D7A1A3FD10@capsicum.zgp.org>
References: <20060307015002.D7A1A3FD10@capsicum.zgp.org>
Message-ID: <440CE78B.10304@hamachi.cc>

Is anyone from Vancouver BC here by the way ? Just curious :)

Alex

David Barrett wrote:
> Speaking of which, we should have another P2P in SFC meetup.  The last
> one was a good time.  Any takers?
> 
>  
> 
> ------------------------------------------------------------------------
> 
> *From:* p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]
> *On Behalf Of *Lemon Obrien
> *Sent:* Monday, March 06, 2006 5:47 PM
> *To:* Peer-to-peer development.
> *Subject:* Re: [p2p-hackers] Mailing lists for P2P conference announcements?
> 
>  
> 
> why do you want to go to a conference on peer-to-peer technology? the
> state of the art is here.
> 
> */Brad Goldsmith <bcg@utas.edu.au>/* wrote:
> 
> Hi All,
> 
> What mailing lists (or other resources) do people here use for
> conference announcements?
> 
> I've just joined the p2p research group mailing list at the ietf
> (p2prg@ietf.org).
> 
> Just wondering if anyone can suggest some other good lists for keeping
> up with calls for papers, etc?
> 
> I always seem to miss them!
> 
> Cheers,
> Brad
> 

From bcg at utas.edu.au  Tue Mar  7 02:32:39 2006
From: bcg at utas.edu.au (Brad Goldsmith)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Mailing lists for P2P conference announcements?
In-Reply-To: <20060307014713.12935.qmail@web53607.mail.yahoo.com>
References: <20060307014713.12935.qmail@web53607.mail.yahoo.com>
Message-ID: <447cc400203a31f4c007ae882bb5b96f@utas.edu.au>


On 07/03/2006, at 12:47 PM, Lemon Obrien wrote:

> why do you want to go to a conference on peer-to-peer technology? the 
> state of the art is here.

Unfortunately, some of us have to publish in order to justify our 
existences. Or at least to justify our PhD grant money...

So, to make this post more than a poor attempt at a witty retort, I 
know that the 6th IEEE p2p conf is coming up:

http://p2p2006.csc.ncsu.edu/ - Submission date is April 26, 2006.


Cheers.


---

Brad Goldsmith
School of Computing
University of Tasmania, Tasmania, Australia
Office: Launceston Campus, Computing Building, V-177
Telephone: (03) 6324 3389 International: +61-3-6324 3389
Facsimile: (03) 6324 3368 International: +61-3-6324 3368


From lemonobrien at yahoo.com  Tue Mar  7 02:33:53 2006
From: lemonobrien at yahoo.com (Lemon Obrien)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Mailing lists for P2P conference announcements?
In-Reply-To: <20060307015002.D7A1A3FD10@capsicum.zgp.org>
Message-ID: <20060307023353.92187.qmail@web53604.mail.yahoo.com>

HEAR HEAR !!!

David Barrett <dbarrett@quinthar.com> wrote:  Speaking of which, we should have another P2P in SFC meetup.  The last one was a good time.  Any takers?
   
        
---------------------------------
  
  From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Lemon Obrien
Sent: Monday, March 06, 2006 5:47 PM
To: Peer-to-peer development.
Subject: Re: [p2p-hackers] Mailing lists for P2P conference announcements?

   
  why do you want to go to a conference on peer-to-peer technology? the state of the art is here.

Brad Goldsmith <bcg@utas.edu.au> wrote: 
  Hi All,

What mailing lists (or other resources) do people here use for 
conference announcements?

I've just joined the p2p research group mailing list at the ietf 
(p2prg@ietf.org).

Just wondering if anyone can suggest some other good lists for keeping 
up with calls for papers, etc?

I always seem to miss them!

Cheers,
Brad

You don't get no juice unless you squeeze
Lemon Obrien, the Third.
From slavitch at gmail.com  Tue Mar  7 03:59:03 2006
From: slavitch at gmail.com (Michael Slavitch)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Mailing lists for P2P conference announcements?
In-Reply-To: <440CE78B.10304@hamachi.cc>
References: <20060307015002.D7A1A3FD10@capsicum.zgp.org>
	<440CE78B.10304@hamachi.cc>
Message-ID: <1fc32c8f0603061959v7d008fb9v61229389f2d91593@mail.gmail.com>

Let's try to be interesting and make all p2psip-hackers meetings virtual and
distributed.

Let's eat what we grow.

M

On 3/6/06, Alex Pankratov <ap@hamachi.cc> wrote:
>
> Is anyone from Vancouver BC here by the way ? Just curious :)
>
> Alex
>
> David Barrett wrote:
> > Speaking of which, we should have another P2P in SFC meetup.  The last
> > one was a good time.  Any takers?
> >
> >
> >
> > ------------------------------------------------------------------------
> >
> > *From:* p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]
> > *On Behalf Of *Lemon Obrien
> > *Sent:* Monday, March 06, 2006 5:47 PM
> > *To:* Peer-to-peer development.
> > *Subject:* Re: [p2p-hackers] Mailing lists for P2P conference announcements?
> >
> >
> >
> > why do you want to go to a conference on peer-to-peer technology? the
> > state of the art is here.
> >
> > */Brad Goldsmith <bcg@utas.edu.au >/* wrote:
> >
> > Hi All,
> >
> > What mailing lists (or other resources) do people here use for
> > conference announcements?
> >
> > I've just joined the p2p research group mailing list at the ietf
> > (p2prg@ietf.org).
> >
> > Just wondering if anyone can suggest some other good lists for keeping
> > up with calls for papers, etc?
> >
> > I always seem to miss them!
> >
> > Cheers,
> > Brad
> >


--
Michael Slavitch
Ottawa, Ontario Canada
From rosejn at gmail.com  Tue Mar  7 10:46:38 2006
From: rosejn at gmail.com (Jeff Rose)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
Message-ID: <440D648E.302@gmail.com>

Anyone out there doing work in clustering?  I'd be interested in hearing 
what people are up to, or if you have pointers to good papers that would 
be great too.  At this point I'm interested in any angle, geographic 
distance based, semantic distance based, hierarchical, non-hierarchical 
etc...  My goal is to work towards a very amorphous clustering scheme 
that allows any kind of object to be located close to relatives in the 
network as long as they share a common distance function.

Peace,
Jeff

From jdefarge at gmail.com  Tue Mar  7 13:58:37 2006
From: jdefarge at gmail.com (jacques defarge)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <440D648E.302@gmail.com>
References: <440D648E.302@gmail.com>
Message-ID: <e08401b0603070558o44cd6c6crb08986696fa13e6f@mail.gmail.com>

Hello Jeff,

I am finishing my MSc dissertation on p2p systems grouped in a cluster-like
structure. I used JXTA as the p2p substrate, but it was the worst possible
choice. Once my MSc work is finished, I will be looking for really nice
projects in this area. If you would like, we could discuss your ideas
further, develop some things, and even collaborate on a paper or a
prototype.

[]'s
Jacques

On 3/7/06, Jeff Rose <rosejn@gmail.com> wrote:
>
> Anyone out there doing work in clustering?  I'd be interested in hearing
> what people are up to, or if you have pointers to good papers that would
> be great too.  At this point I'm interested in any angle, geographic
> distance based, semantic distance based, hierarchical, non-hierarchical
> etc...  My goal is to work towards a very amorphous clustering scheme
> that allows any kind of object to be located close to relatives in the
> network as long as they share a common distance function.
>
> Peace,
> Jeff
From rosejn at gmail.com  Tue Mar  7 14:25:48 2006
From: rosejn at gmail.com (Jeff Rose)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <e08401b0603070558o44cd6c6crb08986696fa13e6f@mail.gmail.com>
References: <440D648E.302@gmail.com>
	<e08401b0603070558o44cd6c6crb08986696fa13e6f@mail.gmail.com>
Message-ID: <440D97EC.8060408@gmail.com>

Why was the JXTA stuff the worst p2p substrate possible?  I don't really 
care much for java and all the corporate speak, but technically what was 
it missing?  Outside of the realm of true research I've been looking 
around for ideas to eventually roll into some kind of p2p infrastructure 
for ruby...  (Probably a set of interfaces for various services, and 
then a couple implementations behind it.  DHT search, unstructured 
search (random walk, smart flood), torrent style download, and some 
glue...)  Is your thesis available on the web?

-Jeff


From jacob at mungo.dk  Tue Mar  7 16:23:13 2006
From: jacob at mungo.dk (Jacob Madsen)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Supernodes in the FastTrack network
Message-ID: <440DB371.8040702@mungo.dk>

Hey

I was reading about FastTrack on Wikipedia and checked
http://www.slyck.com for stats on the network. According to
http://www.slyck.com/stats.php there are almost 3 million users at the
moment.
Does anyone know of a method to calculate the approximate number of supernodes?

Thanks!

From saikat at cs.cornell.edu  Tue Mar  7 16:51:33 2006
From: saikat at cs.cornell.edu (Saikat Guha)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Supernodes in the FastTrack network
In-Reply-To: <440DB371.8040702@mungo.dk>
References: <440DB371.8040702@mungo.dk>
Message-ID: <1141750293.5300.32.camel@localhost.localdomain>

On Tue, 2006-03-07 at 17:23 +0100, Jacob Madsen wrote:
> I was reading about FastTrack [...] almost 3 millions users at the
> moment.
> Do someone know of a method to calculate the approx. number of supernodes?

Bunch of really *rough* estimates.

1) In [1], the authors claim that roughly 50% of MSN users are NAT'ed;
such nodes cannot become supernodes [2] (while [2] relates to Skype
specifically, there is some evidence to suggest Skype uses the same
underlying network as Kazaa = FastTrack; also, [2] finds roughly 4
million users, so in the same ballpark). Based on these two, figure
1.5--2M  supernodes as an upper bound. My personal intuition is that it
is much less.

2) In [2], a crawl of the supernode network identified some 250K, only
some of which were online simultaneously. The crawl was not exhaustive.
Figure this is some sort of weak lower bound, with the real number being
higher.

End result: we don't know for sure, but my guess is somewhere between
200K and 2M active supernodes. An order of magnitude doesn't matter much,
right? :P 
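
The arithmetic behind those bounds can be spelled out as a small Python
sketch. The inputs are the rough figures quoted in this thread (3 million
FastTrack users from slyck.com, roughly 4 million users in [2], roughly
50% NAT'ed from [1], and some 250K supernodes seen by the crawl in [2]);
the variable names are only illustrative, and the result is no more
precise than the inputs.

```python
users_fasttrack = 3_000_000  # slyck.com figure quoted in this thread
users_skype = 4_000_000      # rough user count reported in [2]
nat_fraction = 0.5           # roughly 50% of users NAT'ed, per [1]
crawl_seen = 250_000         # supernodes identified by the crawl in [2]

# NAT'ed nodes cannot become supernodes, so at most the remaining
# fraction of users could be supernodes.
upper_low = users_fasttrack * (1 - nat_fraction)
upper_high = users_skype * (1 - nat_fraction)

print(f"upper bound: {upper_low:,.0f} to {upper_high:,.0f} supernodes")
print(f"weak lower bound from the crawl: about {crawl_seen:,}")
```

The guess of 200K at the low end sits a little under the crawl figure
because only some of the crawled supernodes were online simultaneously.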

A large-scale crawl should yield some tighter bounds. The crawl
methodology is written up in [2].

[1] P Rodriguez, See-Mong Tan, C. Gkantsidis, "On the feasibility of
commercial, legal P2P Content Distribution", In ACM/SIGCOMM CCR, Jan
2006. http://research.microsoft.com/~pablo/papers/CCR.pdf

[2] S Guha, N Daswani and R Jain, "An Experimental Study of the Skype
Peer-to-Peer VoIP System", In IPTPS'06, Feb 2006.
http://saikat.guha.cc/pub/iptps06-skype/

cheers,
-- 
Saikat
From jacob at mungo.dk  Tue Mar  7 18:31:09 2006
From: jacob at mungo.dk (Jacob Madsen)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Supernodes in the FastTrack network
In-Reply-To: <1141750293.5300.32.camel@localhost.localdomain>
References: <440DB371.8040702@mungo.dk>
	<1141750293.5300.32.camel@localhost.localdomain>
Message-ID: <440DD16D.104@mungo.dk>

Hey again

Great answer Saikat - thanks a lot!

The stuff about Skype using the FastTrack protocol or something similar
made me wonder. I thought that the whole network wasn't reachable from a
client's point of view, since the search requests have a time-to-live that
is decremented at each supernode it passes, and a request is dropped when
the value reaches zero (so in the case of Skype, not all users can reach
each other, unless they have implemented the protocol differently).

That's actually 2 questions, and I'm really only interested in the first,
about FastTrack and the time-to-live value limiting the reachability of
the search requests.

As far as I remember, the client has 1 connection to a supernode and
sends its search requests to that supernode. The supernode floods
its supernode neighbours with the request; the time-to-live value in
the request is decremented at each supernode it passes, and the request
is dropped once the value reaches zero.
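
The behaviour described above can be sketched in a few lines of Python.
This is only a model of TTL-limited flooding as recalled in this message,
not the actual FastTrack protocol; the graph, the function name, and the
duplicate suppression are assumptions made for illustration.

```python
from collections import deque


def flood(graph, start, ttl):
    """Return the set of supernodes a query reaches with the given TTL.

    graph maps each supernode to the list of its supernode neighbours.
    The TTL is decremented each time the query is forwarded, and a query
    whose TTL has reached zero is not forwarded any further.
    """
    reached = {start}
    queue = deque([(start, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if remaining == 0:
            continue  # query dropped here
        for peer in graph[node]:
            if peer not in reached:  # duplicate suppression (assumed)
                reached.add(peer)
                queue.append((peer, remaining - 1))
    return reached


# Hypothetical chain of six supernodes: A - B - C - D - E - F
chain = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["D", "F"], "F": ["E"],
}
print(sorted(flood(chain, "A", ttl=2)))  # ['A', 'B', 'C']
```

With a TTL of 2 the query from A never reaches D, E or F, which is the
limited reachability asked about here.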

Again - thanks in advance!

Saikat Guha wrote:
> On Tue, 2006-03-07 at 17:23 +0100, Jacob Madsen wrote:
>   
>> I was reading about FastTrack [...] almost 3 millions users at the
>> moment.
>> Do someone know of a method to calculate the approx. number of supernodes?
>>     
>
> Bunch of really *rough* estimates.
>
> 1) In [1], the authors claim that roughly 50% of MSN users are NAT'ed;
> such nodes cannot become supernodes [2] (while [2] relates to Skype
> specifically, there is some evidence to suggest Skype uses the same
> underlying network as Kazaa = FastTrack; also, [2] finds roughly 4
> million users, so in the same ballpark). Based on these two, figure
> 1.5--2M  supernodes as an upper bound. My personal intuition is that it
> is much less.
>
> 2) In [2], a crawl of the supernode network identified some 250K, only
> some of which were online simultaneously. The crawl was not exhaustive.
> Figure this is some sort of weak lower bound, with the real number being
> higher.
>
> End result: we don't know for sure, but my guess is somewhere between
> 200K to 2M active supernodes. An order of magnitude doesn't matter much,
> right? :P 
>
> A large-scale crawl should yield some tighter bounds. The crawl
> methodology is written up in [2].
>
> [1] P Rodriguez, See-Mong Tan, C. Gkantsidis, "On the feasibility of
> commercial, legal P2P Content Distribution", In ACM/SIGCOMM CCR, Jan
> 2006. http://research.microsoft.com/~pablo/papers/CCR.pdf
>
> [2] S Guha, N Daswani and R Jain, "An Experimental Study of the Skype
> Peer-to-Peer VoIP System", In IPTPS'06, Feb 2006.
> http://saikat.guha.cc/pub/iptps06-skype/
>
> cheers,
>   


From saikat at cs.cornell.edu  Tue Mar  7 19:30:33 2006
From: saikat at cs.cornell.edu (Saikat Guha)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Supernodes in the FastTrack network
In-Reply-To: <440DD16D.104@mungo.dk>
References: <440DB371.8040702@mungo.dk>
	<1141750293.5300.32.camel@localhost.localdomain>
	<440DD16D.104@mungo.dk>
Message-ID: <1141759833.5300.39.camel@localhost.localdomain>

On Tue, 2006-03-07 at 19:31 +0100, Jacob Madsen wrote:
> Thats actually 2 questions and i'm really only interested in the first
> about FastTrack and the time-to-live value limiting the reachability of
> the search requests.

Perhaps [1] may be of some interest to you. I don't know much about the
internal reachability of the FastTrack network.

[1] J Liang, R Kumar, and K W Ross, "The Kazaa Overlay: A Measurement
Study", Computer Networks 49, 6. Oct. 2005.
http://cis.poly.edu/~ross/papers/KazaaOverlay.pdf

-- 
Saikat
From salman at cs.columbia.edu  Tue Mar  7 20:50:14 2006
From: salman at cs.columbia.edu (Salman Abdul Baset)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Supernodes in the FastTrack network
In-Reply-To: <440DB371.8040702@mungo.dk>
References: <440DB371.8040702@mungo.dk>
Message-ID: <Pine.GSO.4.58.0603071549180.6427@flame.cs.columbia.edu>

Please see
http://www1.cs.columbia.edu/~salman/publications/skype1_4.pdf
It provides some insights on Skype super nodes.
regards,
salman

On Tue, 7 Mar 2006, Jacob Madsen wrote:

> Hey
>
> I was reading about FastTrack on wikipedia and checked
> http://www.slyck.com for stats on the network. And according to
> http://www.slyck.com/stats.php there is almost 3 millions users at the
> moment.
> Do someone know of a method to calculate the approx. number of supernodes?
>
> Thanks!

From dbarrett at quinthar.com  Wed Mar  8 00:46:11 2006
From: dbarrett at quinthar.com (David Barrett)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <440D648E.302@gmail.com>
Message-ID: <20060308004615.169FA3FD08@capsicum.zgp.org>

I've been long interested in using synthetic coordinate systems to estimate
the network distance (latency, throughput, etc) between any two nodes (even
if they've never communicated directly) for purposes of clustering, but
haven't really dug into it.  Has anyone tried it in a real-world setting?
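
For readers unfamiliar with the idea, here is a minimal Python sketch of
one synthetic coordinate scheme. The update rule is the simple
spring-relaxation step from the published Vivaldi algorithm, swapped in
here purely as an illustration; it is not something tried in this thread,
and the function names, the 2-D coordinates, and the step size `delta`
are assumptions.

```python
import math


def estimate(xi, xj):
    """Predicted latency: Euclidean distance between two coordinates."""
    return math.dist(xi, xj)


def update(xi, xj, rtt, delta=0.25):
    """Move xi so that its distance to xj better matches a measured rtt."""
    d = estimate(xi, xj)
    if d == 0:
        unit = [1.0] + [0.0] * (len(xi) - 1)  # arbitrary direction
    else:
        unit = [(a - b) / d for a, b in zip(xi, xj)]
    error = rtt - d  # positive: coordinates are too close, push apart
    return [a + delta * error * u for a, u in zip(xi, unit)]


a, b = [0.0, 0.0], [10.0, 0.0]
a2 = update(a, b, rtt=20.0)
print(estimate(a, b), estimate(a2, b))  # 10.0 12.5
```

Each node only adjusts its own coordinate from the latencies it measures,
and two nodes that have never communicated can then estimate their
distance by comparing coordinates.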

-david

> -----Original Message-----
> From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On
> Behalf Of Jeff Rose
> Sent: Tuesday, March 07, 2006 2:47 AM
> To: Peer-to-peer development.
> Subject: [p2p-hackers] clustering
> 
> Anyone out there doing work in clustering?  I'd be interested in hearing
> what people are up to, or if you have pointers to good papers that would
> be great too.  At this point I'm interested in any angle, geographic
> distance based, semantic distance based, hierarchical, non-hierarchical
> etc...  My goal is to work towards a very amorphous clustering scheme
> that allows any kind of object to be located close to relatives in the
> network as long as they share a common distance function.
> 
> Peace,
> Jeff


From ian at locut.us  Wed Mar  8 01:02:45 2006
From: ian at locut.us (Ian Clarke)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <440D648E.302@gmail.com>
References: <440D648E.302@gmail.com>
Message-ID: <5008189E-074B-43CC-A4F6-CEA901134073@locut.us>

I think Chapters 1 and 2 of Oskar Sandberg's recent thesis may be  
extremely relevant:

   http://www.math.chalmers.se/~ossa/lic.pdf

It outlines an algorithm to choose which nodes in a network should
have edges between them such that "greedy" routing (routing to the
peer closest to what you are looking for) has small world
O(log^2(N)) path lengths.

The algorithm is very simple, and pleasingly "natural".  Basically
you use greedy routing to find a path to your intended destination
node, then you add a link from each node along this path to the
destination.  The number of outbound edges from each node obviously
can't increase indefinitely, so they are deleted according to a
least recently used scheme to make room for new edges.
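
To make the steps above concrete, here is a minimal Python sketch of the idea, not the code from the thesis or from Dijjer. The ring layout, the names N, MAX_SHORTCUTS, greedy_route and learn, and the permanent links to the two adjacent ring positions are all assumptions made for illustration; the permanent ring links in particular are a simplification that guarantees greedy routing always makes progress.

```python
import random
from collections import OrderedDict

N = 256            # number of nodes on the ring (hypothetical)
MAX_SHORTCUTS = 4  # cap on learned outbound edges per node (hypothetical)

def ring_dist(a, b):
    """Distance between two positions on a ring of N nodes."""
    d = abs(a - b) % N
    return min(d, N - d)

# shortcuts[u] holds u's learned edges, ordered from least to most recently used.
shortcuts = [OrderedDict() for _ in range(N)]

def neighbours(u):
    # Assumption: every node permanently knows its two ring neighbours.
    return [(u - 1) % N, (u + 1) % N] + list(shortcuts[u])

def greedy_route(src, dst):
    """Forward to whichever neighbour is closest to dst until dst is reached."""
    path = [src]
    while path[-1] != dst:
        u = path[-1]
        nxt = min(neighbours(u), key=lambda v: ring_dist(v, dst))
        if nxt in shortcuts[u]:
            shortcuts[u].move_to_end(nxt)  # mark the edge as recently used
        path.append(nxt)
    return path

def learn(path):
    """Add a link from each node on the path to the destination, LRU-evicting."""
    dst = path[-1]
    for u in path[:-1]:
        if ring_dist(u, dst) <= 1:
            continue  # already a ring neighbour
        shortcuts[u][dst] = None
        shortcuts[u].move_to_end(dst)
        while len(shortcuts[u]) > MAX_SHORTCUTS:
            shortcuts[u].popitem(last=False)  # drop the least recently used edge
```

Calling learn(greedy_route(s, t)) for many random pairs (s, t) lets the shortcut tables evolve; whether the resulting path lengths match the O(log^2(N)) behaviour is something to measure rather than assume from this sketch.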

This is based on Freenet's approach to edge selection.  The paper  
describes both experimental results that confirm that this algorithm  
does lead to a small world topology, and also delves into the  
mathematics behind why this might be the case.

It is interesting not just because of the simplicity of the  
algorithm, but because it is extremely amenable to the construction  
and maintenance of decentralized P2P networks, and because it could
tell us something about why human relationships tend to form small  
world networks.

This algorithm is used by the Dijjer (http://dijjer.org/) P2P network.

Ian.

On 7 Mar 2006, at 02:46, Jeff Rose wrote:

> Anyone out there doing work in clustering?  I'd be interested in  
> hearing what people are up to, or if you have pointers to good  
> papers that would be great too.  At this point I'm interested in  
> any angle, geographic distance based, semantic distance based,  
> hierarchical, non-hierarchical etc...  My goal is to work towards a  
> very amorphous clustering scheme that allows any kind of object to  
> be located close to relatives in the network as long as they share  
> a common distance function.


From networksimulator at gmail.com  Wed Mar  8 02:17:23 2006
From: networksimulator at gmail.com (Ranus)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <440D648E.302@gmail.com>
Message-ID: <000c01c64256$71f96bf0$ccc96fa6@thinkingfish>

I ran into the idea of "small-world" last week and searched the
literature about it. There have been several papers using clustering to
improve system performance, and still more trying to understand
small-world networks. For P2P applications, Hui Zhang has published a paper
named "Using the Small-World Model to Improve Freenet Performance". It
should correspond to your idea, so maybe you could read that.

Actually I'm quite interested in the idea myself. In the p2p context, the
dynamics toward clustering are important... It should be efficient,
simple, and self-organized. I'd be happy if we can have more discussion.


--
Ranus Yue
Tsinghua University

From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On
Behalf Of Jeff Rose
Sent: March 7, 2006 18:47
To: Peer-to-peer development.
Subject: [p2p-hackers] clustering

Anyone out there doing work in clustering?  I'd be interested in hearing
what people are up to, or if you have pointers to good papers that would be
great too.  At this point I'm interested in any angle, geographic distance
based, semantic distance based, hierarchical, non-hierarchical etc...  My
goal is to work towards a very amorphous clustering scheme that allows any
kind of object to be located close to relatives in the network as long as
they share a common distance function.

Peace,
Jeff


From lemonobrien at yahoo.com  Wed Mar  8 02:59:54 2006
From: lemonobrien at yahoo.com (Lemon Obrien)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <000c01c64256$71f96bf0$ccc96fa6@thinkingfish>
Message-ID: <20060308025954.84414.qmail@web53611.mail.yahoo.com>

I read the book 'Sync'; clustering is easy... At WebLogic they use UDP to
ping/pong each other, the same thing all peer-to-peer apps have to do to
punch and keep open a UDP hole behind a firewall. So you already have the
messaging: within the pong or ping, sync your session data (the smaller the
better). When you receive a ping/pong and the session data has changed,
propagate that message to all other known peers, since all low-class,
behind-a-firewall nodes only talk to relay nodes (a.k.a. any other cool name
for a computer which is also used to relay messages to children and to other
relay stations: leaf, super, open). The relay node will transform your
net/mesh/grid's session... and/or sessions, as the French would say...
   
   
Ranus <networksimulator@gmail.com> wrote:
  I ran into the idea of "small-world" last week and searched in the
literature about it, there have been several papers using clustering to
improve the system performance, and still more trying to understand the
small-world networks. For P2P applications, Hui Zhang has published a paper
named "Using the Small-World Model to Improve Freenet Performance". It
should correspond to your idea, so maybe you could read that.

Actually I'm quite interested in the idea myself, in the p2p context, the
dynamics toward clustering is important... It should be efficient and
simple, and self-organized. I'd be happy if we can have more discussion.


--
Ranus Yue
Tsinghua University

From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On
Behalf Of Jeff Rose
Sent: March 7, 2006 18:47
To: Peer-to-peer development.
Subject: [p2p-hackers] clustering

Anyone out there doing work in clustering? I'd be interested in hearing
what people are up to, or if you have pointers to good papers that would be
great too. At this point I'm interested in any angle, geographic distance
based, semantic distance based, hierarchical, non-hierarchical etc... My
goal is to work towards a very amorphous clustering scheme that allows any
kind of object to be located close to relatives in the network as long as
they share a common distance function.

Peace,
Jeff


You don't get no juice unless you squeeze
Lemon Obrien, the Third.
From garyjefferson123 at yahoo.com  Wed Mar  8 03:43:39 2006
From: garyjefferson123 at yahoo.com (Gary Jefferson)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
Message-ID: <20060308034339.54645.qmail@web35809.mail.mud.yahoo.com>

Ranus <networksimulator at gmail.com> wrote:
>  I ran into the idea of "small-world" last week and
> searched in the literature about it,

Just a general question for the list: DHTs don't
exhibit small-world characteristics, at least not in
the overlay network view of things, right?  I mean,
from each node's perspective, it might look a bit like
a small world -- in Kademlia, for instance, my routing
tables will center around my own ID space.  But the
network, as a whole, isn't small world, and isn't
vulnerable to vertex-order attacks, right?  Or am I
missing something?

Gary


From agthorr at cs.uoregon.edu  Wed Mar  8 04:55:52 2006
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <20060308034339.54645.qmail@web35809.mail.mud.yahoo.com>
References: <20060308034339.54645.qmail@web35809.mail.mud.yahoo.com>
Message-ID: <20060308045551.GC5696@cs.uoregon.edu>

On Tue, Mar 07, 2006 at 07:43:39PM -0800, Gary Jefferson wrote:
> Just a general question for the list: DHTs don't
> exhibit small-world characteristics, at least not in
> the overlay network view of things, right?  I mean,
> from each node's perspective, it might look a bit like
> a small world -- in Kademlia, for instance, my routing
> tables will center around my own ID space.  

DHTs typically are small worlds.

The small-world characteristics are short path lengths (comparable to
a random graph) and high clustering (much more than a random graph).

DHTs certainly have short path lengths (that's the point).  All the
DHTs I can think of also have high clustering.  A node's neighbors are
likely to also be neighbors.
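
The two characteristics named above can be measured directly. The following is a rough sketch under my own assumptions (a ring lattice with a few random shortcuts as the small world, and a random graph built by adding the same number of edges uniformly at random); the function names are hypothetical and not taken from any library.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbours on either side."""
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for step in range(1, k + 1):
            v = (u + step) % n
            adj[u].add(v)
            adj[v].add(u)
    return adj

def add_random_edges(adj, count, rng):
    """Add `count` new random edges, retrying on edges that already exist."""
    nodes = list(adj)
    added = 0
    while added < count:
        u, v = rng.sample(nodes, 2)  # two distinct nodes
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            added += 1

def random_graph(n, m, rng):
    """A graph on n nodes with m edges chosen uniformly at random."""
    adj = {u: set() for u in range(n)}
    add_random_edges(adj, m, rng)
    return adj

def clustering_coefficient(adj):
    """Average over nodes of (links among neighbours) / (possible links)."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue  # such nodes contribute 0
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += links / (d * (d - 1) / 2)
    return total / len(adj)

def average_path_length(adj):
    """Mean breadth-first-search distance over all reachable ordered pairs."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs
```

For example, comparing ring_lattice(200, 2) plus 20 shortcuts against random_graph(200, 400, rng) should show similar path lengths but a much higher clustering coefficient for the lattice-based graph.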

> But the network, as a whole, isn't small world, and isn't vulnerable
> to vertex-order attacks, right?  Or am I missing something?

I believe you are thinking of power-law graphs (also called
"scale-free") which have a few very-high-degree peers that make juicy
targets.  Power-law graphs frequently (but don't necessarily) exhibit
small-world characteristics.  Almost everyone is connected to some of
the high-degree peers who are all connected together, so there is a
lot of clustering.  The high degree peers also provide short-paths
everywhere.  As a result, they're small worlds.

However, not all small-worlds are power-law graphs.  In particular,
DHTs are not power-law graphs.  There are not exceptionally high or
low degree peers.

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon

From wesley at felter.org  Wed Mar  8 03:55:26 2006
From: wesley at felter.org (Wes Felter)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <440D648E.302@gmail.com>
References: <440D648E.302@gmail.com>
Message-ID: <1F413594-2AC1-4FBE-A25D-08109D7F5FFC@felter.org>

On Mar 7, 2006, at 4:46 AM, Jeff Rose wrote:

> Anyone out there doing work in clustering?  I'd be interested in  
> hearing what people are up to, or if you have pointers to good  
> papers that would be great too.  At this point I'm interested in  
> any angle, geographic distance based, semantic distance based,  
> hierarchical, non-hierarchical etc...  My goal is to work towards a  
> very amorphous clustering scheme that allows any kind of object to  
> be located close to relatives in the network as long as they share  
> a common distance function.

IIRC, OASIS uses some sophisticated clustering under the hood.
http://www.coralcdn.org/oasis/

Wes Felter - wesley@felter.org - http://felter.org/wesley/


From networksimulator at gmail.com  Wed Mar  8 06:08:20 2006
From: networksimulator at gmail.com (Ranus)
Date: Sat Dec  9 22:13:11 2006
Subject: Re: [p2p-hackers] clustering
In-Reply-To: <20060308045551.GC5696@cs.uoregon.edu>
Message-ID: <000601c64276$b5b267a0$ccc96fa6@thinkingfish>

Well... Small-world is characterized by both a high clustering degree and a
short average path length. While DHTs are designed to have short paths from
node to node, they do not necessarily present clustering.

I just thought about Chord... The nodes keep track of neighbors every 2^n
steps away. I don't think the nodes are clustering here, or am I wrong
somewhere?


--
Ranus Yue
Tsinghua University

-----Original Message-----
From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On
Behalf Of Daniel Stutzbach
Sent: March 8, 2006 12:56
To: p2p-hackers@zgp.org
Subject: Re: [p2p-hackers] clustering

On Tue, Mar 07, 2006 at 07:43:39PM -0800, Gary Jefferson wrote:
> Just a general question for the list: DHTs don't exhibit small-world
> characteristics, at least not in the overlay network view of things,
> right?  I mean, from each node's perspective, it might look a bit like
> a small world -- in Kademlia, for instance, my routing tables will
> center around my own ID space.

DHTs typically are small worlds.

The small-world characteristics are short path lengths (comparable to a
random graph) and high clustering (much more than a random graph).

DHTs certainly have short path lengths (that's the point).  All the DHTs I
can think of also have high clustering.  A node's neighbors are likely to
also be neighbors.

> But the network, as a whole, isn't small world, and isn't vulnerable
> to vertex-order attacks, right?  Or am I missing something?

I believe you are thinking of power-law graphs (also called
"scale-free") which have a few very-high-degree peers that make juicy
targets.  Power-law graphs frequently (but don't necessarily) exhibit
small-world characteristics.  Almost everyone is connected to some of the
high-degree peers who are all connected together, so there is a lot of
clustering.  The high degree peers also provide short-paths everywhere.  As
a result, they're small worlds.

However, not all small-worlds are power-law graphs.  In particular, DHTs are
not power-law graphs.  There are not exceptionally high or low degree peers.

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon


From eugen at leitl.org  Wed Mar  8 07:57:03 2006
From: eugen at leitl.org (Eugen Leitl)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <20060308004615.169FA3FD08@capsicum.zgp.org>
References: <440D648E.302@gmail.com>
	<20060308004615.169FA3FD08@capsicum.zgp.org>
Message-ID: <20060308075703.GM25017@leitl.org>

On Tue, Mar 07, 2006 at 04:46:11PM -0800, David Barrett wrote:

> I've been long interested in using synthetic coordinate systems to estimate
> the network distance (latency, throughput, etc) between any two nodes (even
> if they've never communicated directly) for purposes of clustering, but
> haven't really dug into it.  Has anyone tried it in a real-world setting?

Has routing been converging towards geography lately, or is this an urban myth?

-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820            http://www.ativel.com
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
From m.rogers at cs.ucl.ac.uk  Wed Mar  8 08:35:54 2006
From: m.rogers at cs.ucl.ac.uk (Michael Rogers)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <20060308034339.54645.qmail@web35809.mail.mud.yahoo.com>
References: <20060308034339.54645.qmail@web35809.mail.mud.yahoo.com>
Message-ID: <440E976A.8010906@cs.ucl.ac.uk>

Gary Jefferson wrote:
> Just a general question for the list: DHTs don't
> exhibit small-world characteristics, at least not in
> the overlay network view of things, right?

That depends on your definition of 'small world'. Most people take it to 
mean short paths and high clustering, but as far as I know the Freenet 
work uses a more specific definition due to Jon Kleinberg: a graph is a 
small world if it can be embedded in a metric space such that the length 
of the edges follows a power law distribution, where the magnitude of 
the power law exponent is equal to the number of dimensions in the 
metric space[1]. Kleinberg showed that greedy routing in such a graph is 
efficient, but if the power law exponent has any other value then greedy 
routing is inefficient[2,3]. However, it's also possible that the length 
distribution doesn't follow a power law at all (eg Chord, where the 
length distribution is exponential and greedy routing is efficient).

Most DHTs aren't small worlds in the sense used in the Freenet work; 
Symphony[4] is an exception.
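
As a rough illustration of the Kleinberg-style construction described above, here is a one-dimensional sketch in Python. It assumes a ring as the metric space, so the power law exponent r defaults to 1 (the number of dimensions). The function names, the single long-range link per node, and the two permanent ring neighbours are my own simplifications, not details taken from the cited papers.

```python
import random

def ring_dist(a, b, n):
    """Distance between two positions on a ring of n nodes."""
    d = abs(a - b)
    return min(d, n - d)

def build_kleinberg_ring(n, r=1.0, rng=random):
    """Give each node its two ring neighbours plus one long-range link whose
    length d is drawn with probability proportional to d ** -r."""
    lengths = list(range(2, n // 2 + 1))
    weights = [d ** -r for d in lengths]
    links = []
    for u in range(n):
        d = rng.choices(lengths, weights=weights)[0]
        sign = rng.choice((-1, 1))
        links.append([(u - 1) % n, (u + 1) % n, (u + sign * d) % n])
    return links

def greedy_hops(links, src, dst):
    """Count hops when always forwarding to the neighbour closest to dst."""
    n = len(links)
    cur, hops = src, 0
    while cur != dst:
        cur = min(links[cur], key=lambda v: ring_dist(v, dst, n))
        hops += 1
    return hops
```

Averaging greedy_hops over random pairs for several values of r is one way to see how the path length responds to the exponent; the ring neighbours only guarantee that routing terminates, not that it is efficient.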

Cheers,
Michael

[1] http://www.cs.cornell.edu/home/kleinber/nips14.ps
[2] http://www.cs.cornell.edu/home/kleinber/nat00.pdf
[3] http://www.cs.cornell.edu/home/kleinber/swn.pdf
[4] http://www-db.stanford.edu/~bawa/Pub/symphony.pdf

From networksimulator at gmail.com  Wed Mar  8 16:38:58 2006
From: networksimulator at gmail.com (Ranus)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <440E976A.8010906@cs.ucl.ac.uk>
Message-ID: <001401c642ce$cf4db3b0$ccc96fa6@thinkingfish>

About your previous mail: I checked the definition of the clustering
coefficient <http://en.wikipedia.org/wiki/Clustering_coefficient>. That of a
Chord vertex is 3/(n-2) (for a 2^n node ring), and it's the same for every
vertex, so the overall clustering coefficient is not high when n is
moderately large.

The definition of small-world is indeed clustering and short path.

Well, the network defined by Kleinberg falls into the category of
"scale-free networks". In such networks there are a few nodes that connect
to abundant other nodes. Following the power law distribution of degrees is
one way to construct scale-free networks, which are also small-world
networks. But there could be scale-free networks that are not small worlds.
(E.g., two star networks connected at the centers; here the clustering
coefficient is 0 because there's no triangle.)

On the other hand, a very famous illustration of a small world is a regular
lattice with a few extra, randomly chosen edges. This does not conform to the
power law distribution of edges and there's no supernode, but it's a small
world.

Am I going too far from the point? Better turn around here...

Scale-free networks seem quite appealing as far as performance and scalability
are concerned, and it's simpler to guarantee short paths with powerful
supernodes. Many P2P applications such as Gia (Making Gnutella-like P2P
Systems Scalable, Sigcomm'03) introduce such supernodes to improve
scalability. But it seems harder to build scalable networks with more or
less the same number of edges for each node (this could be realistic when
edges are interpreted as connections, rather than links/routing
information), because then the supernodes are nowhere to be found.

I think some work could be done here... what do you suggest?


--
Ranus Yue
Tsinghua University

-----Original Message-----
From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On
Behalf Of Michael Rogers
Sent: March 8, 2006 16:36
To: Peer-to-peer development.
Subject: Re: [p2p-hackers] clustering

Gary Jefferson wrote:
> Just a general question for the list: DHTs don't exhibit small-world
> characteristics, at least not in the overlay network view of things,
> right?

That depends on your definition of 'small world'. Most people take it to
mean short paths and high clustering, but as far as I know the Freenet work
uses a more specific definition due to Jon Kleinberg: a graph is a small
world if it can be embedded in a metric space such that the length of the
edges follows a power law distribution, where the magnitude of the power law
exponent is equal to the number of dimensions in the metric space[1].
Kleinberg showed that greedy routing in such a graph is efficient, but if
the power law exponent has any other value then greedy routing is
inefficient[2,3]. However, it's also possible that the length distribution
doesn't follow a power law at all (eg Chord, where the length distribution
is exponential and greedy routing is efficient).

Most DHTs aren't small worlds in the sense used in the Freenet work;
Symphony[4] is an exception.

Cheers,
Michael

[1] http://www.cs.cornell.edu/home/kleinber/nips14.ps
[2] http://www.cs.cornell.edu/home/kleinber/nat00.pdf
[3] http://www.cs.cornell.edu/home/kleinber/swn.pdf
[4] http://www-db.stanford.edu/~bawa/Pub/symphony.pdf

From agthorr at cs.uoregon.edu  Wed Mar  8 17:15:36 2006
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <001401c642ce$cf4db3b0$ccc96fa6@thinkingfish>
References: <440E976A.8010906@cs.ucl.ac.uk>
	<001401c642ce$cf4db3b0$ccc96fa6@thinkingfish>
Message-ID: <20060308171535.GI5696@cs.uoregon.edu>

On Thu, Mar 09, 2006 at 12:38:58AM +0800, Ranus wrote:
> About your previous mail: I checked the definition of clustering
> <http://en.wikipedia.org/wiki/Clustering_coefficient> coefficient, that of a
> chord vertex is 3/(n-2) (for a 2^n node ring), and it's the same for every
> vertex, so the overall clustering coefficient is not high when n is
> moderately large.

That's actually a very large clustering coefficient, compared to a
random graph with the same number of nodes and edges.

In an Erdos-Renyi random graph (the standard random graph), every edge
exists with probability |E| / |V|^2.  Consequently, the clustering
coefficient is approximately |E| / |V|^2.  In Chord, |E| = |V|*lg |V|, so
we have a clustering coefficient in the comparable random graph of
lg |V| / |V|.

In your calculation, you're using |V| = 2^n, so the clustering
coefficient is 3/(lg |V| - 2).  This is an enormous clustering
coefficient.  For |V| = 1 million, the clustering coefficient is 17%!
For a comparable random graph, it's only 0.002%.

(I use "lg x" to mean "log base 2 of x")
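
The arithmetic above is easy to check with a few lines of Python. This only evaluates the two formulas quoted in this thread (Ranus's 3/(n-2) figure for Chord with n = lg |V|, and lg |V| / |V| for the comparable random graph); it does not verify the formulas themselves, and the function names are hypothetical.

```python
import math

def chord_clustering(num_nodes):
    """Ranus's 3/(n-2) figure, rewritten with n = lg |V|."""
    return 3 / (math.log2(num_nodes) - 2)

def random_graph_clustering(num_nodes):
    """|E| / |V|^2 with |E| = |V| * lg |V|, which simplifies to lg |V| / |V|."""
    return math.log2(num_nodes) / num_nodes

v = 1_000_000
print(f"Chord:        {chord_clustering(v):.1%}")         # about 17%
print(f"Random graph: {random_graph_clustering(v):.4%}")  # about 0.002%
```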

> The scale-free networks seem quite appealing as performance and scalability
> are concerned, and it's simpler to guarantee short paths with powerful
> supernodes. 

Scale-free networks scale well in some ways and scale very badly in
other ways.  Yes, they maintain short path lengths as the network gets
bigger.  However, they require that you have these increasingly
high-degree peers as the network gets bigger.  

You may have some peers that have more capacity than others, and it
may make sense to utilize those resources, but it does not seem
wise to assume that if your network grows by a factor of 100 you
will be able to find a user who has 100 times more resources than your
previous best-user.

> Many P2P applications such as Gia (Making Gnutella-like P2P
> Systems Scalable, Sigcomm'03) introduce such supernodes to improve the
> scalability. But it seems harder, to build scalable networks with more or
> less the same number of edges for each node (this could be realistic when
> edges are interpreted as connections, rather than links/routing
> information), because at such time the supernodes are no where to be found.

It largely depends on what you're trying to achieve.  For a network
like Gnutella, global visibility is not one of the design
requirements, so it's OK to use a constant number of edges.

DHTs (typically) use O(log |V|) edges to guarantee a worst-case of
O(log |V|) steps per lookup.  That seems like a good compromise to me.
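
As a sketch of that trade-off, the following Python models Chord-style routing on a fully populated ring of 2^m identifiers, where each node keeps m fingers at distances 1, 2, 4, ..., 2^(m-1). The fully populated ring and the function names are assumptions made for illustration; a real DHT has gaps in the identifier space and has to handle churn.

```python
def chord_fingers(node, m):
    """The m = lg |V| outbound edges of a node on a full ring of 2**m ids."""
    size = 1 << m
    return [(node + (1 << k)) % size for k in range(m)]

def chord_lookup_hops(src, dst, m):
    """Greedy clockwise routing: take the longest finger that does not
    overshoot the destination, and count the hops."""
    size = 1 << m
    cur, hops = src, 0
    while cur != dst:
        remaining = (dst - cur) % size
        candidates = [f for f in chord_fingers(cur, m)
                      if (f - cur) % size <= remaining]
        cur = max(candidates, key=lambda f: (f - cur) % size)
        hops += 1
    return hops
```

In this idealized model the remaining clockwise distance loses its highest set bit on every hop, so no lookup takes more than m hops.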

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon

From Pietro.Michiardi at eurecom.fr  Wed Mar  8 18:31:38 2006
From: Pietro.Michiardi at eurecom.fr (Pietro Michiardi)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] PhD in networked and distributed systems,
	Institut Eurecom
Message-ID: <440F230A.7030906@eurecom.fr>

PhD position opening 
 
 Networking and Security Dept. Institut Eurecom 
 
 The Networking and Security Department of Institut EURECOM  is seeking 
 candidates with a background in networked and distributed systems. 
 Candidates should have an MSc in Computer Science and demonstrate 
 exceptional research potential. 
 
 The successful candidate is expected to carry out research in an area 
 related to peer-to-peer and content distribution systems; wireless and 
 mobile systems (including ad hoc networking). Additional skills in 
 cooperation issues in self-organizing systems would be a plus. 
 
 Candidates should also have experience in programming in C/C++; using 
 MATLAB; network simulation environments (ns, glomosim, ...). 
 
 The position is available immediately. The selection process will begin 
 immediately upon receipt of applications. 
 To apply, send resume, a motivation letter and e-mail addresses of at 
 least one reference to: 
 
 Pietro.Michiardi@eurecom.fr 
 
 Please, use the following subject: [EUR-CAS:PHD]. 

From Pietro.Michiardi at eurecom.fr  Wed Mar  8 18:32:24 2006
From: Pietro.Michiardi at eurecom.fr (Pietro Michiardi)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Postdoc in networked and distributed systems,
	Institut Eurecom
Message-ID: <440F2338.4090900@eurecom.fr>

Post-Doctoral Research Fellowship 
     
   Networking and Security Dept. Institut Eurecom 
   
   Duration: 12 months (minimum) up to 18 months 
   
   The Networking and Security Department of Institut EURECOM  is seeking 
   a post-doctoral fellow with a strong research background in networked 
   and distributed systems. Candidates should have a PhD in Computer 
   Science or equivalent and demonstrate exceptional research potential as 
   well as basic project management skills. 
   
   The successful candidate is expected to work in the context of a funded 
   European Project and to carry out research in an area 
   related to peer-to-peer and content distribution systems; wireless and 
   mobile systems (including ad hoc networking).

   Additional skills in game theoretical modeling of networks; network 
   modeling; graph theory are highly recommended. 

   This position also involves some project management. The successful 
   applicant is expected to represent EURECOM at the project level, 
   participate in technical meetings in European locations and handle 
   interactions with the partners of the EU project. 
   
   The position is available immediately. The selection process will begin 
   immediately upon receipt of applications. 
   To apply, send resume, a statement of research and e-mail addresses of 
   at least three references to: 
   
   
   Pietro.Michiardi@eurecom.fr
   
   Please, use the following subject: [EUR-CAS:POSTDOC]. 
 

From m.rogers at cs.ucl.ac.uk  Wed Mar  8 18:36:09 2006
From: m.rogers at cs.ucl.ac.uk (Michael Rogers)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <001401c642ce$cf4db3b0$ccc96fa6@thinkingfish>
References: <001401c642ce$cf4db3b0$ccc96fa6@thinkingfish>
Message-ID: <440F2419.9030807@cs.ucl.ac.uk>

Ranus wrote:
> Well, the network defined by Kleinberg falls into the category of 
> "scale-free networks".

Sorry to contradict you, but a scale-free network has a power law degree 
distribution. A Kleinberg small world has a power law length 
distribution, and there are no restrictions on the degree distribution[1].

> The scale-free networks seem quite appealing as performance and 
> scalability are concerned, and it's simpler to guarantee short paths 
> with powerful supernodes.

I've come across a couple of papers suggesting scale-free topologies for 
P2P networks, but I'm a bit skeptical. Scale-free graphs typically have 
O(log n) diameter, so if you flood O(sqrt n) advertisements from any 
node and O(sqrt n) queries from any other node then the query is likely 
to find the advertisement, but on the other hand the same's true of 
random graphs, which don't depend on a small fraction of the nodes 
handling a large fraction of the messages... I guess it's a question of 
finding a degree distribution that matches the bandwidth distribution of 
the nodes. Maybe that's power law, maybe not... does anyone have any 
figures?
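
The O(sqrt n) figure is a birthday-paradox argument, which a small calculation makes concrete. This sketch assumes the advertisements sit on distinct nodes and that each query reaches an independent, uniformly random node, which is an idealization of flooding on a random graph; the function name is hypothetical.

```python
import math

def hit_probability(n, num_ads, num_queries):
    """Chance that at least one of num_queries uniformly random probes lands
    on one of num_ads distinct nodes holding the advertisement."""
    return 1 - (1 - num_ads / n) ** num_queries

n = 1_000_000
for c in (1, 2, 3):
    k = c * math.isqrt(n)  # c * sqrt(n) advertisements and as many queries
    print(c, round(hit_probability(n, k, k), 4))
```

With c * sqrt(n) advertisements and c * sqrt(n) queries the probability is roughly 1 - exp(-c^2), so a small constant factor on top of sqrt(n) is enough to make a hit very likely under these assumptions.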

Cheers,
Michael

[1] http://fleece.ucsd.edu/~massimo/Journal/SWorld-Submission.pdf

From Pietro.Michiardi at eurecom.fr  Wed Mar  8 18:47:59 2006
From: Pietro.Michiardi at eurecom.fr (Pietro Michiardi)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] PhD in networked and distributed systems,
	Institut Eurecom
Message-ID: <440F26DF.4060406@eurecom.fr>

(Apologies for the previous barely readable posting)


PhD position opening
Networking and Security Dept. Institut Eurecom

The Networking and Security Department of Institut EURECOM  is seeking 
candidates with a background in networked and distributed systems.

Candidates should have an MSc in Computer Science and demonstrate 
exceptional research potential.

The successful candidate is expected to carry out research in an area 
related to peer-to-peer and content distribution systems; wireless and 
mobile systems (including ad hoc networking). Additional skills in 
cooperation issues in self-organizing systems would be a plus.

Candidates should also have experience in programming in C/C++; using 
MATLAB; network simulation environments (ns, glomosim, ...).

The position is available immediately. The selection process will begin 
immediately upon receipt of applications.
To apply, send resume, a motivation letter and e-mail addresses of at 
least one reference to: Pietro.Michiardi@eurecom.fr

Please, use the following subject: [EUR-CAS:PHD].
From Pietro.Michiardi at eurecom.fr  Wed Mar  8 18:53:19 2006
From: Pietro.Michiardi at eurecom.fr (Pietro Michiardi)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Postdoc in networked and distributed systems,
	Institut Eurecom
Message-ID: <440F281F.9050908@eurecom.fr>

(Apologies for the previous barely readable posting)

Post-Doctoral Research Fellowship
Networking and Security Dept. Institut Eurecom

Duration: 12 months (minimum) up to 18 months

The Networking and Security Department of Institut EURECOM  is seeking a 
post-doctoral fellow with a strong research background in networked and 
distributed systems. Candidates should have a PhD in Computer Science or 
equivalent and demonstrate exceptional research potential as well as 
basic project management skills.
The successful candidate is expected to work in the context of a funded 
European Project and to carry out research in an area related to 
peer-to-peer and content distribution systems; wireless and mobile 
systems (including ad hoc networking).

Additional skills in game theoretical modeling of networks; network 
modeling; graph theory are highly recommended.

This position also involves some project management. The successful 
applicant is expected to represent EURECOM at the project level, 
participate in technical meetings in European locations and handle 
interactions with the partners of the EU project.

The position is available immediately. The selection process will begin 
immediately upon receipt of applications.
To apply, send resume, a statement of research and e-mail addresses of  
at least three references to: Pietro.Michiardi@eurecom.fr

Please, use the following subject: [EUR-CAS:POSTDOC]
From ian at locut.us  Thu Mar  9 03:19:33 2006
From: ian at locut.us (Ian Clarke)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <000c01c64256$71f96bf0$ccc96fa6@thinkingfish>
References: <440D648E.302@gmail.com>
	<000c01c64256$71f96bf0$ccc96fa6@thinkingfish>
Message-ID: <823242bd0603081919h7717ccecyad22dff89e93f0a4@mail.gmail.com>

On 3/7/06, Ranus <networksimulator@gmail.com> wrote:
>
> Hui Zhang has published a paper
> named "Using the Small-World Model to Improve Freenet Performance". It
> should correspond to your idea, so maybe you could read that.


Be careful with this paper.  If I recall correctly, most of their results can
be attributed to the fact that they ensured that links existed between
adjacent nodes in the graph.  That obviously has a dramatic beneficial
effect relative to a network where local links may be missing, since it
means that even in the worst case you can find the node you are looking for
by an exhaustive search that just follows local links.

Our findings, as presented in Oskar's thesis, are that Freenet-style edge
selection results in the desired degree of clustering without "artificial"
help.
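
To make the point about local links concrete, here is a minimal sketch,
assuming a ring of n nodes in which every node links to both ring neighbours
plus one long-range contact drawn with probability proportional to
1/distance (Kleinberg-style). The names build_ring and greedy_route are
illustrative only; they are not taken from the paper, from Oskar's thesis, or
from Freenet.

```python
import random


def ring_distance(a, b, n):
    """Shortest distance between positions a and b on a ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)


def build_ring(n, rng):
    """Give each node both ring neighbours plus one harmonic long-range link."""
    distances = list(range(1, n // 2 + 1))
    weights = [1.0 / d for d in distances]  # assumed 1/d length distribution
    links = {}
    for node in range(n):
        d = rng.choices(distances, weights=weights)[0]
        direction = rng.choice((-1, 1))
        long_range = (node + direction * d) % n
        links[node] = {(node - 1) % n, (node + 1) % n, long_range}
    return links


def greedy_route(links, source, target, n):
    """Always forward to the neighbour closest to the target; return hop count."""
    current, hops = source, 0
    while current != target:
        current = min(links[current], key=lambda v: ring_distance(v, target, n))
        hops += 1
    return hops


rng = random.Random(42)
n = 1024
links = build_ring(n, rng)
hops = [greedy_route(links, rng.randrange(n), rng.randrange(n), n)
        for _ in range(200)]
print("max hops:", max(hops), "mean hops:", sum(hops) / len(hops))
```

Because a ring neighbour is always one step closer to the target, every hop
reduces the remaining distance, so a route can never take more than n/2 hops
here. That guarantee comes purely from the local links being present, which
is the effect described above.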

Ian.
From networksimulator at gmail.com  Thu Mar  9 03:24:39 2006
From: networksimulator at gmail.com (Ranus)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <20060308171535.GI5696@cs.uoregon.edu>
Message-ID: <002301c64329$09f29210$ccc96fa6@thinkingfish>

Daniel:	You are right on Chord. Thanks for pointing that out.

Michael: There is no universally accepted definition of a scale-free network
at present. The definition of the "scale-free metric" does not require a
power-law degree distribution, but that is the case in 99% of the available
research, and you could say it's the only case that matters. Some deviation
in real-life cases has been reported (as the degree cannot grow without
bound)[1].

Many DHTs provide good connectivity but do not require global visibility,
and usually there's not much need to add it. We do not really care whether
the network is scale-free or not; we only care whether we can find what we
want. Under current conditions you cannot really feel the difference between
being in one part of the overlay network or another; there's no way to
measure it.

But sometimes you do want to connect to certain nodes more than others. Then,
starting at a random point and having no idea of the whole picture, how do
you quickly find the part where you belong? This does not always mean global
visibility is required, though that does settle the problem. There could be
other approaches, esp. when the network is large and decentralized. So it's
not only a matter of joining and staying somewhere, but of choosing the right
cluster as well. Could we call this "heuristic clustering"? Again, the
question falls back to the very first one: whom do you call your 'relatives',
and how do you ask your current neighbors for them?


[1] http://www.santafe.edu/research/publications/workingpapers/01-03-021.pdf
--
Ranus Yue
Tsinghua University
 

> -----Original Message-----
> From: p2p-hackers-bounces@zgp.org 
> [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Daniel Stutzbach
> Sent: Thursday, March 09, 2006 1:16 AM
> To: 'Peer-to-peer development.'
> Subject: Re: [p2p-hackers] clustering
> 
> That's actually a very large clustering coefficient, 
> compared to a random graph with the same number of nodes and edges.
> 
> In an Erdos-Renyi random graph (the standard random graph), every 
> edge exists with probability |E| / |V|^2.  Consequently, the 
> clustering coefficient is approximately |E| / |V|^2.  In 
> Chord, |E| = |V|*lg |V|, so we have a clustering coefficient 
> in the comparable random graph of lg |V| / |V|.
> 
> In your calculation, you're using |V| = 2^n, so the clustering 
> coefficient is 3/(lg |V| - 2).  This is an enormous 
> clustering coefficient.  For |V| = 1 million, the clustering 
> coefficient is 17%!
> For a comparable random graph, it's only 0.002%.
> 
> (I use "lg x" to mean "log base 2 of x")
> 
> Scale-free networks scale well in some ways and scale very 
> badly in other ways.  Yes, they maintain short path lengths 
> as the network gets bigger.  However, they require that you 
> have these increasingly high-degree peers as the network gets 
> bigger.  
> 
> You may have some peers that have more capacity than others, 
> and it may make sense to utilize those resources, but it does 
> not seem wise to assume that if your network grows by a 
> factor of 100 that you will be able to find a user who has 100 
> times more resources than your previous best-user.
> 
> It largely depends on what you're trying to achieve.  For a 
> network like Gnutella, global visibility is not one of the 
> design requirements, so it's OK to use a constant number of edges.
> 
> DHTs (typically) use O(log |V|) edges to guarantee a 
> worst-case of O(log |V|) steps per lookup.  That seems like a 
> good compromise to me.
> 
> -- 
> Daniel Stutzbach                           Computer Science 
> Ph.D Student
> http://www.barsoom.org/~agthorr                     
> University of Oregon
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers@zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences

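The two figures in Daniel's quoted calculation can be checked with a few
lines of arithmetic. This is only a sketch that plugs |V| = 1 million into
the two expressions quoted above (lg |V| / |V| for the comparable random
graph, and 3 / (lg |V| - 2) for Chord); the function names are made up for
illustration.

```python
import math


def random_graph_clustering(num_nodes):
    """Comparable Erdos-Renyi graph with |E| = |V| lg |V|: C ~ lg|V| / |V|."""
    return math.log2(num_nodes) / num_nodes


def chord_clustering(num_nodes):
    """Expression quoted in the thread for Chord: C = 3 / (lg |V| - 2)."""
    return 3.0 / (math.log2(num_nodes) - 2.0)


v = 1_000_000
print(f"Chord:        {chord_clustering(v):.1%}")
print(f"random graph: {random_graph_clustering(v):.4%}")
```

For one million nodes this gives roughly 17% against roughly 0.002%, in line
with the numbers quoted above.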

From ossa at math.chalmers.se  Thu Mar  9 08:14:14 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <440F2419.9030807@cs.ucl.ac.uk>
References: <001401c642ce$cf4db3b0$ccc96fa6@thinkingfish>
	<440F2419.9030807@cs.ucl.ac.uk>
Message-ID: <440FE3D6.6010002@math.chalmers.se>

Michael Rogers wrote:
> Ranus wrote:
> 
>> Well, the network defined by Kleinberg falls into the category of
>> "scale-free networks".
> 
> 
> Sorry to contradict you, but a scale-free network has a power law degree
> distribution. A Kleinberg small world has a power law length
> distribution, and there are no restrictions on the degree distribution[1].

If you mean by "Kleinberg small world" the model that Jon Kleinberg
proposed for navigable networks, then this is not the case. That model
has directed "long-range" edges, with a fixed out-degree and Poisson
in-degree. While making the out-degree arbitrary is easy, changing the
in-degrees while retaining the mathematical results would be
non-trivial. I'm not sure if anybody has done it.

// oskar

From ossa at math.chalmers.se  Thu Mar  9 08:23:33 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <440E976A.8010906@cs.ucl.ac.uk>
References: <20060308034339.54645.qmail@web35809.mail.mud.yahoo.com>
	<440E976A.8010906@cs.ucl.ac.uk>
Message-ID: <440FE605.8060605@math.chalmers.se>

Michael Rogers wrote:
> However, it's also possible that the length
> distribution doesn't follow a power law at all (eg Chord, where the
> length distribution is exponential and greedy routing is efficient).

Actually, while the frequency of Chord links falls exponentially with
the "level" (not sure what the Chord term is), the length of such links
increases exponentially as well, so in fact the frequency of links with
a given length does fall harmonically. One could see Chord as some sort
of "mean field" version of the same dynamics as Kleinberg's model.
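
One way to read this numerically (a sketch under my own simplifying
assumptions, not something from the Chord or Kleinberg papers): a harmonic
1/d length distribution puts nearly the same weight, about ln 2, into every
"octave" of lengths [2^i, 2^(i+1)), while idealised Chord fingers put exactly
one link of length 2^i into each octave. The helper names below are
hypothetical.

```python
import math


def harmonic_mass_per_octave(levels):
    """Sum of 1/d over each octave [2^i, 2^(i+1)) of link lengths."""
    return [sum(1.0 / d for d in range(2 ** i, 2 ** (i + 1)))
            for i in range(levels)]


def chord_finger_lengths(levels):
    """Idealised Chord fingers: exactly one link of length 2^i per level i."""
    return [2 ** i for i in range(levels)]


masses = harmonic_mass_per_octave(16)
fingers = chord_finger_lengths(16)
for i in (4, 8, 12, 15):
    print(f"level {i:2d}: finger length {fingers[i]:6d}, "
          f"harmonic mass in octave {masses[i]:.4f} (ln 2 = {math.log(2):.4f})")
```

So both constructions spread links evenly over length scales; Chord does it
deterministically, the harmonic model does it in expectation.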

> Most DHTs aren't small worlds in the sense used in the Freenet work;
> Symphony[4] is an exception.

All sensible DHTs are small-world networks. If our definition of the
term doesn't imply this, we are getting lost in semantics.

// oskar

From m.rogers at cs.ucl.ac.uk  Thu Mar  9 11:43:05 2006
From: m.rogers at cs.ucl.ac.uk (Michael Rogers)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <440FE3D6.6010002@math.chalmers.se>
References: <001401c642ce$cf4db3b0$ccc96fa6@thinkingfish>
	<440F2419.9030807@cs.ucl.ac.uk> <440FE3D6.6010002@math.chalmers.se>
Message-ID: <441014C9.1090500@cs.ucl.ac.uk>

Oskar Sandberg wrote:
> If you mean by "Kleinberg small world" the model that Jon Kleinberg
> proposed for navigable networks, then this is not the case. That model
> has directed "long-range" edges, with a fixed out-degree and Poisson
> in-degree.

True, that's why I linked to the Franceschetti & Meester paper which 
generalises Kleinberg's model to any out-degree distribution.

> While making the out-degree arbitrary is easy, changing the
> in-degrees while retaining the mathematical results would be
> non-trivial. I'm not sure if anybody has done it.

That's interesting - would you mind elaborating? As long as the 
distribution of edge lengths is right and the edges are independent, why 
does the in-degree matter?

Cheers,
Michael

From ossa at math.chalmers.se  Thu Mar  9 11:55:30 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <441014C9.1090500@cs.ucl.ac.uk>
References: <001401c642ce$cf4db3b0$ccc96fa6@thinkingfish>	<440F2419.9030807@cs.ucl.ac.uk>
	<440FE3D6.6010002@math.chalmers.se> <441014C9.1090500@cs.ucl.ac.uk>
Message-ID: <441017B2.4080306@math.chalmers.se>

Michael Rogers wrote:
> Oskar Sandberg wrote:
>> While making the out-degree arbitrary is easy, changing the
>> in-degrees while retaining the mathematical results would be
>> non-trivial. I'm not sure if anybody has done it.
> 
> 
> That's interesting - would you mind elaborating? As long as the 
> distribution of edge lengths is right and the edges are independent, why 
> does the in-degree matter?

I doubt it matters to the results, but just off the top of my head I 
would expect it to complicate things in the proofs because it means that 
lengths of the long range links are not purely independent at each step. 
There are probably conditioning arguments to get around it, but it is 
still non-trivial (for some value of trivial).

// oskar

From jacob at mungo.dk  Thu Mar  9 20:10:02 2006
From: jacob at mungo.dk (Jacob Madsen)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Structured P2P Networks
Message-ID: <44108B9A.5040107@mungo.dk>

Hey,

I'm writing a small paper about DHTs and it got me wondering...
I can't recall ever having come across any kind of structured P2P
network other than DHTs. Maybe one of you guys knows about other kinds
of structured P2P networks?

Thanks!

From dbarrett at quinthar.com  Thu Mar  9 21:27:15 2006
From: dbarrett at quinthar.com (David Barrett)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Practical clustering
Message-ID: <20060309212720.AB6063FC50@capsicum.zgp.org>

So I've enjoyed the discussion on the mathematical foundation and taxonomy
of networks vis-à-vis clustering, but what techniques have you found useful
on a more nuts-and-bolts level?  It seems to me the clustering problem can
be broken down into (roughly):

1) Measuring the "distance" between any two nodes: How have you defined
"distance" in this context: physical distance, latency, network hops,
throughput, etc?  And for your selection, what techniques have you used to
measure it (IP geolocation, synthetic coordinate systems, embedded
traceroute, network monitoring, etc)?

2) Measuring the "strength" of any given node: How have you defined
"strength" in this context: uptime, CPU, memory, upload capacity, download
capacity, etc?  How have you measured these?

3) Join algorithm: When a new node comes into the network, how do you
determine its placement?  Is it a top-down, recursive approach, or a
bottom-up, emergent approach?

4) Optimization algorithm: As new nodes come and go (or even as a
"background process") how do you optimize the network by promoting and
demoting nodes, and on what basis?

5) Repair algorithm: Given that any node can disappear at any time without
warning, how do you mitigate the effect of this and clean up afterwards?

And not all clustering need be in a network sense.  With iGlance, for
example, I have a "video buddy list" feature where you have a super
low-bandwidth video stream to all your peers.  My initial thought for this
was to have each client build up a "pose library" in order to optimize for
the massive similarity between frames in a video stream of you sitting in
front of your computer.  For example, there's the "looking at screen" pose
and "talking on phone" pose, and "away from computer" pose, and so on.  My
goal was to accumulate this pose library over time, and then just send out
indices into this library.

I'd classify the clustering algorithm I used for my pose library generation
as follows:

1) Distance: Use an image differencing function that roughly equated to
"number of pixels changed" (it was more sophisticated than this, but not by
much)

2) Strength: If a new image coming in is within some tolerance of an
existing image, just increment the "strength" of the old image and discard
the new.  Thus we'd gradually identify which images were most common, and
call those the strongest poses.

3) Join algorithm: Trickle a new image down from top to bottom.  At each
node, if it's within tolerance, discard and increment its strength.  If
outside tolerance, compare to children and pick whichever is within second
tolerance.  If none are, call it a new child.

4) Optimization algorithm: Set a fixed limit on the number of nodes allowed
in the tree (based on how much memory I was willing to allocate to the
problem) and, before a node creates a new leaf, discard the weakest child
branch if we're at our limit.  I think there was also a fan-out limit and
when overrun, I'd measure the distances between all children, find the pair
that was closest, and then demote the weaker under the stronger.

5) Repair algorithm: Not applicable to this problem because images didn't
just disappear without warning.
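
The distance, strength and join steps above can be sketched roughly as
follows. This is a reconstruction under stated assumptions, not the original
iGlance code: images are flat lists of pixel values, the distance is a plain
count of differing pixels, the two tolerances are called merge_tol and
descend_tol, and when several children fall within the second tolerance the
closest one is chosen. The node limit and fan-out limit from step 4 are left
out.

```python
class PoseNode:
    """One pose in the library: a representative image plus a hit counter."""

    def __init__(self, image):
        self.image = image
        self.strength = 1
        self.children = []


def pixel_distance(a, b):
    """Crude image difference: number of pixels that differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def insert_pose(node, image, merge_tol, descend_tol):
    """Trickle an image down the tree; return the node that absorbs or holds it."""
    if pixel_distance(node.image, image) <= merge_tol:
        node.strength += 1  # close enough: count it and discard the new image
        return node
    candidates = [c for c in node.children
                  if pixel_distance(c.image, image) <= descend_tol]
    if candidates:
        # Assumption: among children within the second tolerance, take the closest.
        closest = min(candidates, key=lambda c: pixel_distance(c.image, image))
        return insert_pose(closest, image, merge_tol, descend_tol)
    child = PoseNode(image)  # nothing similar below this node: start a new pose
    node.children.append(child)
    return child


root = PoseNode([0, 0, 0, 0, 0, 0, 0, 0])
insert_pose(root, [0, 0, 0, 0, 0, 0, 0, 1], merge_tol=1, descend_tol=3)
away = insert_pose(root, [1, 1, 1, 1, 1, 1, 1, 1], merge_tol=1, descend_tol=3)
nested = insert_pose(root, [1, 1, 1, 1, 1, 1, 0, 0], merge_tol=1, descend_tol=3)
```

In this toy run the first image merges into the root, the second becomes a
new child of the root, and the third descends under that child. Both
tolerances are exactly the kind of "magic number" that has to be tuned by
hand.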

 
Anyway, so that's a clustering approach I used, and it worked surprisingly
well.  The big problems were it had too many "magic numbers" (merge
threshold, split threshold, total node numbers, etc) and it was hard
choosing the right values.  That and at the end of the day, the number of
poses you have even staring at your computer is actually quite high, and
simply sending a 1 FPS video feed creates a better experience, is easier to
implement, and is sufficiently low bandwidth for my needs.  I'm sure with
enough tweaking it could be made to work and support a huge number of peers,
but meh.  Work for a future day.  Assuming I even kept the source, when it'd
be just like me to throw it away.

So, that's my real-world clustering story.  What story do you have, and what
lessons did you learn?

-david

 
From m.rogers at cs.ucl.ac.uk  Thu Mar  9 22:15:01 2006
From: m.rogers at cs.ucl.ac.uk (Michael Rogers)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Structured P2P Networks
In-Reply-To: <44108B9A.5040107@mungo.dk>
References: <44108B9A.5040107@mungo.dk>
Message-ID: <4410A8E5.1050303@cs.ucl.ac.uk>

Hi Jacob,

The only ones I've come across are Freedman and Vingralek's distributed 
trie (http://www.cs.rice.edu/Conferences/IPTPS02/167.pdf) and Law and 
Siu's random expander networks 
(http://www.ieee-infocom.org/2003/papers/52_02.PDF).

Cheers,
Michael

From sam at neurogrid.com  Fri Mar 10 02:05:16 2006
From: sam at neurogrid.com (Sam Joseph)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] AP2PC '06 Call for Participation
Message-ID: <4410DEDC.3040306@neurogrid.com>

 **** list of accepted papers available **** early registration
deadline: 10 March 2006 ****

*********************************************************************

Fifth International Workshop on
Agents and Peer-to-Peer Computing
(AP2PC 2006)

9 May 2006

Future University Hakodate, Japan
(held in conjunction with AAMAS 2006)

URL: http://p2p.ingce.unibo.it/

*********************************************************************
CALL FOR PARTICIPATION
*********************************************************************

The workshop on Agents and Peer-to-Peer Computing (AP2PC), in its
fifth edition this year, is a well-established forum for researchers
interested in sharing their experiences in combining peer-to-peer
based approaches with those of agents and multiagent systems. AP2PC 2006
will be held as a satellite workshop of AAMAS 2006, the 5th
International Joint Conference on Autonomous Agents and Multiagent
Systems, in May 2006 in Hakodate, Japan.

The workshop programme will consist of presentations of several
contributed papers, a panel, and an invited speaker. Everybody with an
interest in the crossover between peer-to-peer and agent systems is
cordially invited to attend.

*********************************************************************
REGISTRATION
*********************************************************************

Registration will be handled by the AAMAS organisers. Information is
available at the AAMAS registration page:

http://www.fun.ac.jp/aamas2006/registration.html

Please note the deadline for early registration is already upon us:

*** 10 March 2006 ***

*********************************************************************
ACCEPTED PAPERS
*********************************************************************

The PC has selected 10 full papers and 6 short papers for presentation
at the workshop:

*** Long Papers ***

* Cooperative CBR System for Peer Agent Committee Formation
Hager Karoui, Rushed Kanawati & Laure Petrucci

* Hybrid DHT Design for Mobile Environments
Stefan Zoels, Simon Schubert, Wolfgang Kellerer & Zoran Despotovic

* Mitigating the Impact of Liars by Reflecting Peer's Credibility on P2P
File Reputation Systems
Soyoung Lee, O-Hoon Kwon, Jong Kim & Sung Je Hong

* A Comparative Study of Reasoning Techniques for Service Selection
Murat Sensoy & Pinar Yolum

* Chora: Expert-based P2P Web Search
Halldor Gylfason, Omar Khan & Grant Schoenebeck

* K-link: A Peer-to-Peer Solution for Organizational Knowledge Management
Giuseppe Pirrò, Domenico Talia & Massimo Ruffolo

* An Analysis of Interest Community Facilitated P2P Search
Elth Ogston

* Mobile agent-based approach for resource discovery in peer-to-peer
networks
Jaafar Gaber & Mohamed Bakhouya

* Peer to Peer Grid Computing System based on Mobile Agents
Joon-Min Gil & Sung-Jin Choi

* DANTE: A Self-Adapting Peer-to-Peer System
Luis Rodero, Luis López, Antonio Fernández, Vicent Cholvi

*** Short Papers ***

* Studying viable free markets in Peer-to-Peer file exchange
applications without Altruistic Agents
David Cabanillas & Steven Willmott

* PROSA: P2P Resource Organisation by Social Acquaintances
Vincenza Carchiolo, Michele Malgeri, Giuseppe Mangioni & Vincenzo Nicosia

* The Exclusion of Malicious Routing Peers in Structured P2P Systems
O-Hoon Kwon, Bong Soo Roh, Sung Je Hong & Jong Kim

* Facilitating collaboration in a distributed software development
environment using P2P architecture
Maryam Purvis, Martin Purvis & Bastin Tony Roy Savarimuthu

* Reliable P2P File Sharing Service
Jung Hwa Shin

* Distributed Multilayer Network Management for NEC using Agents
Richard Vaughan, James Wise, Paul Huey, Michael Alcock, Jonathan Vaughan
& Graham Atkins

*********************************************************************

From rosejn at gmail.com  Fri Mar 10 10:08:57 2006
From: rosejn at gmail.com (Jeff Rose)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <823242bd0603081919h7717ccecyad22dff89e93f0a4@mail.gmail.com>
References: <440D648E.302@gmail.com>	<000c01c64256$71f96bf0$ccc96fa6@thinkingfish>
	<823242bd0603081919h7717ccecyad22dff89e93f0a4@mail.gmail.com>
Message-ID: <44115039.6080107@gmail.com>

It seems like people are always putting arbitrary restrictions on p2p 
systems and simulations in terms of connectivity, but is this really 
necessary?  Unless you are trying to use NATed nodes (assume we can 
punch or route through a neighbor), just about any pair of computers on 
the internet can be neighbors.  In essence the internet is a fully 
connected overlay graph.  All DHTs and other less-structured schemes 
are doing is deciding which links to send messages down.  So when you 
talk about "links existing", do you just mean that a given pair maintains 
some amount of regular communication, or just that they know of each 
other's existence in the network?  Maybe since you are coming from the 
freenet side of things connectivity has a lot more meaning than in other 
schemes?

-Jeff

Ian Clarke wrote:
> On 3/7/06, *Ranus* <networksimulator@gmail.com 
> <mailto:networksimulator@gmail.com>> wrote:
> 
>     Hui Zhang has published a paper
>     named "Using the Small-World Model to Improve Freenet Performance". It
>     should correspond to your idea, so maybe you could read that.
> 
> 
> Be careful of this paper.  If I recall correctly, most of their results 
> can be attributed to the fact that they ensured that links existed 
> between adjacent nodes in the graph, which obviously would have a 
> dramatic beneficial effect relative to a network where local links may 
> be missing as it means that in the worst case you will do an exhaustive 
> search for the node you are looking for just by following local links.
> 
> Our findings, as presented in Oskar's thesis, are that Freenet-style 
> edge selection results in the desired degree of clustering without 
> "artificial" help.
> 
> Ian.


From ian at locut.us  Fri Mar 10 18:48:55 2006
From: ian at locut.us (Ian Clarke)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <44115039.6080107@gmail.com>
References: <440D648E.302@gmail.com>	<000c01c64256$71f96bf0$ccc96fa6@thinkingfish>
	<823242bd0603081919h7717ccecyad22dff89e93f0a4@mail.gmail.com>
	<44115039.6080107@gmail.com>
Message-ID: <6534FBFF-07D4-48EC-BB0A-362B0D69A6B3@locut.us>

You can't really ignore the implications of NATs and firewalled nodes  
that easily since most computers on the Internet these days are  
behind NATs or firewalls.

But even if you do ignore their existence, the determining factor of  
whether two nodes in a P2P network can communicate is that they know  
of each other's existence, and that they know each other's location  
in information space (i.e. not just their location in IP space).

It is not realistic to assume that every node in a P2P network will  
have this information for every other node in the P2P network, at  
least not if you want the network to be scalable, and so it is  
necessary for nodes to select a subset of all other nodes in the P2P  
network with which they can communicate.

Of course, the practicalities of operating a P2P network, which  
include issues such as establishing cryptographic tunnels, and  
dealing with NATs and firewalls, provide significant additional  
motivation for restricting the subset of nodes with which a  
particular node might seek to communicate.

Ian.

On 10 Mar 2006, at 02:08, Jeff Rose wrote:

> It seems like people are always putting arbitrary restrictions on  
> p2p systems and simulations in terms of connectivity, but is this  
> really necessary?  Unless you are trying to use NATed nodes (assume  
> we can punch or route through a neighbor),just about any pair of  
> computers on the internet can be neighbors.  In essence the  
> internet is a fully connected overlay graph.  All DHT's and other  
> less-structured schemes are doing is deciding which links to send  
> messages down.  So when you talk about "links existing" you just  
> mean that a given pair maintains some amount of regular  
> communication, or just that they know of each others existence in  
> the network?  Maybe since you are coming from the freenet side of  
> things connectivity has a lot more meaning than in other schemes?
>
> -Jeff
>
> Ian Clarke wrote:
>> On 3/7/06, *Ranus* <networksimulator@gmail.com  
>> <mailto:networksimulator@gmail.com>> wrote:
>>     Hui Zhang has published a paper
>>     named "Using the Small-World Model to Improve Freenet  
>> Performance". It
>>     should correspond to your idea, so maybe you could read that.
>> Be careful of this paper.  If I recall correctly, most of their  
>> results can be attributed to the fact that they ensured that links  
>> existed between adjacent nodes in the graph, which obviously would  
>> have a dramatic beneficial effect relative to a network where  
>> local links may be missing as it means that in the worst case you  
>> will do an exhaustive search for the node you are looking for just  
>> by following local links.
>> Our findings, as presented in Oskar's thesis, are that Freenet- 
>> style edge selection results in the desired degree of clustering  
>> without "artificial" help.
>> Ian.


From mfreed at cs.nyu.edu  Sat Mar 11 15:49:26 2006
From: mfreed at cs.nyu.edu (Michael J Freedman)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <44115039.6080107@gmail.com>
References: <440D648E.302@gmail.com>
	<000c01c64256$71f96bf0$ccc96fa6@thinkingfish>
	<823242bd0603081919h7717ccecyad22dff89e93f0a4@mail.gmail.com>
	<44115039.6080107@gmail.com>
Message-ID: <Pine.BSO.4.62.0603111041280.26067@ludlow.scs.cs.nyu.edu>

On Fri, 10 Mar 2006, Jeff Rose wrote:

> Date: Fri, 10 Mar 2006 11:08:57 +0100
> From: Jeff Rose <rosejn@gmail.com>
> Reply-To: Peer-to-peer development. <p2p-hackers@zgp.org>
> To: Peer-to-peer development. <p2p-hackers@zgp.org>
> Subject: Re: [p2p-hackers] clustering
> 
> It seems like people are always putting arbitrary restrictions on p2p systems 
> and simulations in terms of connectivity, but is this really necessary? 
> Unless you are trying to use NATed nodes (assume we can punch or route 
> through a neighbor),just about any pair of computers on the internet can be 
> neighbors.  In essence the internet is a fully connected overlay graph.

The problem is that "just about every" and "every" node being able to 
communicate are not quite the same thing.  Indeed, it's precisely the 
difference in these two assumptions which actually raises a lot of 
problems when actually deploying DHTs in the wide-area.

We recently presented a short paper at WORLDS '05 which discusses the 
real-world problems that arise from non-transitivity in Internet routing:

   A can speak to B, B can speak to C, but A can't speak to C

as we all independently discovered from running CoralCDN, OpenDHT, and i3. 
(Firewalls and NATs are actually an easier problem than this, as they 
express routing constraints much more symmetrically.)

   http://www.scs.stanford.edu/mfreed/docs/ntr-worlds05.pdf
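
As a small illustration of what non-transitivity looks like in data, the
sketch below scans a reachability table for triples where A reaches B and B
reaches C but A cannot reach C, and lists the nodes that could forward
between the two in one extra hop. The table and function names are invented
for the example; this is not how CoralCDN, OpenDHT or i3 detect or handle
the problem.

```python
from itertools import permutations


def non_transitive_triples(can_reach):
    """Yield (a, b, c) where a reaches b and b reaches c, but a cannot reach c."""
    nodes = sorted(can_reach)
    for a, b, c in permutations(nodes, 3):
        if b in can_reach[a] and c in can_reach[b] and c not in can_reach[a]:
            yield a, b, c


def relays_for(can_reach, src, dst):
    """Nodes that src can reach and that can in turn reach dst."""
    return sorted(r for r in can_reach[src] if dst in can_reach[r])


# Hypothetical measurements: A and C can each talk to B, but not to each other.
can_reach = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B"},
}
triples = list(non_transitive_triples(can_reach))
print(triples)
print(relays_for(can_reach, "A", "C"))
```

A DHT that assumes full connectivity would treat A and C as valid neighbours
here; a routing layer has to either detect such pairs or forward through an
intermediate node such as B.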

I sent an email about this paper to this mailing list a few months ago, 
and my apologies for the repeat.  However, as our main audience for this 
paper was actually meant to be the hacker community, as opposed to the 
academic one, I thought it bears re-mention.

--mike

-----
www.michaelfreedman.org                              www.coralcdn.org

From lemonobrien at yahoo.com  Sat Mar 11 21:19:50 2006
From: lemonobrien at yahoo.com (Lemon Obrien)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] clustering
In-Reply-To: <Pine.BSO.4.62.0603111041280.26067@ludlow.scs.cs.nyu.edu>
Message-ID: <20060311211950.59943.qmail@web53605.mail.yahoo.com>

>>A can speak to B, B can speak to C, but A can't speak to C

  this is why you have a relay/super node...to be the go-between for A and C; this is also why creating p2p systems is very difficult...b/c you'll have to test this along with tons of other scenarios.
  

Michael J Freedman <mfreed@cs.nyu.edu> wrote:
  On Fri, 10 Mar 2006, Jeff Rose wrote:

> Date: Fri, 10 Mar 2006 11:08:57 +0100
> From: Jeff Rose 
> Reply-To: Peer-to-peer development. 

> To: Peer-to-peer development. 

> Subject: Re: [p2p-hackers] clustering
> 
> It seems like people are always putting arbitrary restrictions on p2p systems 
> and simulations in terms of connectivity, but is this really necessary? 
> Unless you are trying to use NATed nodes (assume we can punch or route 
> through a neighbor),just about any pair of computers on the internet can be 
> neighbors. In essence the internet is a fully connected overlay graph.

The problem is that "just about every" and "every" node being able to 
communicate are not quite the same thing. Indeed, it's precisely the 
difference in these two assumptions which actually raises a lot of 
problems when actually deploying DHTs in the wide-area.

We recently presented a short paper at WORLDS '05 which discusses the 
real-world problems that arise from non-transitivity in Internet routing:

A can speak to B, B can speak to C, but A can't speak to C

as we all independently discovered from running CoralCDN, OpenDHT, and i3. 
(Firewalls and NATs are actually an easier problem than this, as they 
express routing constraints much more symmetrically.)

http://www.scs.stanford.edu/mfreed/docs/ntr-worlds05.pdf

I sent an email about this paper to this mailing list a few months ago, 
and my apologies for the repeat. However, as our main audience for this 
paper was actually meant to be the hacker community, as opposed to the 
academic one, I thought it bears re-mention.

--mike

-----
www.michaelfreedman.org www.coralcdn.org


You don't get no juice unless you squeeze
Lemon Obrien, the Third.
From osokin at osokin.com  Sun Mar 12 03:08:38 2006
From: osokin at osokin.com (Serguei Osokine)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <5008189E-074B-43CC-A4F6-CEA901134073@locut.us>
Message-ID: <IJECJCHFLLIPKMNGHDPMOENBIGAA.osokin@osokin.com>

On Tuesday, March 07, 2006 Ian Clarke wrote:
> I think Chapters 1 and 2 of Oskar Sandberg's recent thesis may be  
> extremely relevant:
>
>   http://www.math.chalmers.se/~ossa/lic.pdf
>
> It outlines an algorithm to choose which nodes in a network should  
> have edges between them such that "greedy" routing (routing to the  
> peer closest to what you are looking for) has small world O( log^2 
> (N) ) path lengths.

	Ian, Oskar, that was wonderful. I rarely see anything that has
that much aesthetic value, but the analytical proof that the graph
rewiring caused by the random search actions actually makes it a 
small-world graph with log^2(N) path length was certainly worth the
one-year wait since last April, which is when Ian preannounced 
this paper.

	I have to apologize in advance for my questions - some of them
might be downright stupid (I won't even pretend that I followed the
logic of all your proofs), and in any case there are many of them.
Perhaps, too many. Feel free to answer any subset that does not seem
stupid to you (and as a special case, of course, just ignore all of 
them :-)

	But I waited for this paper for a year, and as you might imagine,
I had time to ponder quite a few issues... So here we go:

1. What is the difference between Freenet[1] and Sandberg/Dijjer[2]
   models that makes Dijjer easier to analyze? This is mentioned a few
   times in [2] and in other Freenet- and Dijjer- related places, but
   I did not notice the explicit explanation of the differences
   anywhere. Was it the directed links in the graph in [2]? Something
   else? I honestly tried to figure this out by myself, but couldn't; 
   sorry... Maybe if I'd spent more time looking at both [1] and [2],
   I'd figure this out, but since you guys are already here, I thought
   that maybe you could answer that? 

2. Whatever this difference is, how relevant do you think are the
   Sandberg/Dijjer results[2] for the original Freenet paper[1]?

   For example:

	a. section 5.2 of [1] says about Fig. 3: "We can see that the
	   pathlength scales approximately logarithmically, with a change
	   of slope near 40,000 nodes."

	   Given what you guys know now, do you think that Fig. 3 shows
	   log^2(N) instead, or Freenet simply has different scaling
	   properties due to its different algorithms?

	b. Fig. 5 shows the link number distribution that awfully
	   resembles the power-law one, whereas Oskar mentioned in
	   another mail that Kleinberg is constant out-degree and 
	   Poisson in-degree. Is my assumption that Sandberg/Dijjer 
	   model is supposed to mimic the Kleinberg distribution
	   incorrect and in fact both Sandberg/Dijjer and Freenet
	   are power-law? Or is just one of them power law due to
	   the different algorithms used? Or maybe Fig. 5 in [1] is
	   actually showing a part of the Poisson distribution that 
	   can be easily mistaken for the power law?

	c. Section 5.4 of [1] says: "In a small-world network, the
	   majority of nodes have only relatively few, local, connections
	   to other nodes, while a small number of nodes have large, 
	   wide-ranging sets of connections. Small-world networks permit
	   efficient short paths between arbitrary points because of the
	   shortcuts provided by the well-connected nodes..."

	   This does not sound like anything I've been able to see in
	   Oskar's thesis[2] - it has left me with a distinct impression
	   that the connectivity in the small world (at least in the one
	   analyzed there) was provided by the exactly right (Kleinberg's)
	   proportion of long-distance links, which Sandberg/Dijjer
	   rewiring model was designed to achieve, and not by the means 
	   of any special nodes with "wide-ranging sets of connections".
	   Is it because Freenet and Dijjer used different algorithms, or
	   the original assumption about the reasons for the Freenet
	   connectivity was wrong? 

	   Or is it simply that I'm misreading the Freenet paper[1] and
	   it did not mean to imply that small world properties of Freenet
	   are due to the small subset of well connected nodes? Maybe my
	   understanding of Sandberg/Dijjer[2] is wrong?

3. Why would one want to use a system with O(log^2(N)) path length
   when pretty much every DHT gives a path length of O(log(N))? At
   the scale of today's P2P nets (millions of nodes) the difference
   between log and log^2 is non-trivial. I understand that creating the small-
   world network by a simple and natural evolutionary process is very
   elegant, simple, and attractive, but is this elegance worth the 
   increased path length? Why not simply use a DHT?

4. Speaking of DHTs: Oskar recently said something curious here that 
   I did not understand: "Actually, while the frequency of Chord links
   falls exponentially with the "level" (not sure what the Chord term
   is) the length of such links increases exponentially as well, so in
   fact the frequency of links with certain lengths do fall
   harmonically. One could see Chord as some sort of "mean field"
   version of the same dynamics as Kleinberg's model."

   What exactly does this mean? DHTs (including Chord) have log(N)
   diameter, and Kleinberg has log^2(N), right? This looks fairly
   different to me, and I totally missed the meaning of the "mean
   field", and why Chord and Kleinberg would have the same dynamics;
   I did not even understand what is this dynamics that you're talking
   about. Oskar, could you please elaborate on this?

5. Throughout this mail I've been using the expression Sandberg/Dijjer 
   to denote both what is described in [2] and Dijjer, as if it is the 
   same system; is this the right thing to do, or are there significant
   differences between these two? What about these directed links, for
   example? Does Dijjer use them? Does this matter?

6. Both the simulation in Freenet paper[1] and Sandberg/Dijjer paper[2]
   use the random requests to rewire the networks. In reality, there
   will be all kinds of hotspots and uneven distributions of requests
   due to the different popularity of content. Do you guys think it
   would affect the results of analysis performed in [2], and if yes,
   how? For example, if one particular file is extremely popular, one
   could imagine that its storage node[s] will have many more links 
   than the average, and the search for the other (rarely requested)
   data items might become ineffective, because these path lengths will
   become very high. 

   Do you think it is a valid concern? Are there any results that would
   show that this is not an issue?

7. While Sandberg [2] assumes that the requests are passing between 
   the random points, both Freenet and Dijjer seem to allow the length
   of the path to be shortened when the requested content is already
   cached on the intermediate node. If this is really so, what is the
   effect that will have on Oskar's analysis results? Fewer links will
   be rewired, so this should have some effect on the graph dynamics,
   right? Any idea what it would be?

8. The answer to '7' above might also depend on the cache size and 
   replacement policies on the nodes, and on the content popularity
   distribution (which I already mentioned in '6' above). All these
   factors are hard to predict for the real system in advance, and I'm
   wondering what is the feeling that you guys have about the overall
   stability of the rewiring algorithm?

   I mean, the original Kleinberg d(x, y)^(-1) approach is extremely
   sensitive to the value of this "-1". Any other value, and you either
   do not have enough long distance links, or you do not have enough
   short-range links once you arrive into the general vicinity of your
   destination. In any case, your path length becomes polynomial instead
   of polylogarithmic once you have the smallest deviation from this
   "-1" number.

   And this makes me a bit uneasy; I do understand that the catastrophic
   failure is unlikely, and the detrimental effect of all these changes
   (if any) is likely to be tolerable, but still... Look at it this
   way: Chapter 1 of Oskar's thesis is dedicated to proving that 
   "...family given by (1.1) allows for polylogarithmic routing at, 
   and only at, one value of the [alpha]...", then Chapter 2 proves
   that this is exactly what happens with random-request rewiring, and
   then Dijjer's practical implementation and deployment arrive and
   throw these random requests out of the window with their caching,
   different content popularity, and whatnot. So given the strict
   requirements that were proven to be vital in Chapter 1, I cannot 
   help wondering: is Dijjer still polylogarithmic, or not?

   I'd feel much better about all this if there would be some results
   showing that the system is stable in the presence of such changes
   (in a sense that reasonably small algorithm input changes should not
   cause any catastrophically large changes in the algorithm results).
   Do you have any results that would point in that direction? Would
   you say that the data displayed on Figures 2.4-2.6 already shows 
   that your results are not exactly Kleinberg, but this does not 
   cause any catastrophic consequences, or you also have some other
   sources of optimism?

9. And finally, if I'm not mistaken, the Dijjer description seems to imply
   that the links are rewired after every search, whereas Oskar suggests
   that this should happen only with probability p, and in section 2.3
   says: "...there are simple heuristic arguments for why p should
   reasonably be on the order of one over the expected length of the 
   greedy walks". Any particular reason for that Dijjer behaviour, or
   I've missed something and Dijjer does, in fact, rewire with this
   probability p? (Would be interesting to hear these simple heuristic
   arguments too, but that's another story...)

[1] Freenet: A Distributed Anonymous Information Storage and Retrieval
    System. Ian Clarke, Oskar Sandberg, Brandon Wiley, and Theodore
    W. Hong.
    http://www.cl.cam.ac.uk/~twh25/academic/papers/icsi-revised.pdf

[2] Searching in a Small World. Oskar Sandberg.
    http://www.math.chalmers.se/~ossa/lic.pdf

	Best wishes -
	S.Osokine.
	11 Mar 2006.


-----Original Message-----
From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
Behalf Of Ian Clarke
Sent: Tuesday, March 07, 2006 5:03 PM
To: Peer-to-peer development.
Subject: Re: [p2p-hackers] clustering


I think Chapters 1 and 2 of Oskar Sandberg's recent thesis may be  
extremely relevant:

   http://www.math.chalmers.se/~ossa/lic.pdf

It outlines an algorithm to choose which nodes in a network should  
have edges between them such that "greedy" routing (routing to the  
peer closest to what you are looking for) has small world O( log^2 
(N) ) path lengths.

The algorithm is very simple, and pleasingly "natural".  Basically  
you use greedy routing to find a path to your intended destination  
node, then you add a link from each node along this path to the  
destination.  The number of outbound edges from each node obviously  
can't increase indefinitely, so they are deleted according to a  
least recently used scheme to make room for new edges.
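
The description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Dijjer's or the thesis's actual code: nodes sit on a ring with two fixed ring neighbours, each node keeps a bounded set of shortcut edges, and "least recently used" is approximated by refreshing an edge only when it is (re)added. The names `Node`, `greedy_route` and `route_and_rewire` are invented for this sketch.

```python
import random
from collections import OrderedDict

def ring_dist(a, b, n):
    """Distance between positions a and b on a ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)

class Node:
    def __init__(self, ident, max_links):
        self.ident = ident
        self.max_links = max_links
        self.shortcuts = OrderedDict()  # oldest entry first

    def touch(self, target):
        """Add (or refresh) an edge to target, evicting the oldest edge if full."""
        if target == self.ident:
            return
        self.shortcuts.pop(target, None)
        self.shortcuts[target] = True
        while len(self.shortcuts) > self.max_links:
            self.shortcuts.popitem(last=False)

def greedy_route(nodes, src, dst):
    """Always step to the known neighbour closest to dst; ring edges guarantee progress."""
    n = len(nodes)
    path = [src]
    cur = src
    while cur != dst:
        candidates = [(cur - 1) % n, (cur + 1) % n, *nodes[cur].shortcuts]
        cur = min(candidates, key=lambda c: ring_dist(c, dst, n))
        path.append(cur)
    return path

def route_and_rewire(nodes, src, dst):
    """Route greedily, then link every node on the path to the destination."""
    path = greedy_route(nodes, src, dst)
    for hop in path[:-1]:
        nodes[hop].touch(dst)
    return path

if __name__ == "__main__":
    random.seed(1)
    n = 256
    nodes = [Node(i, max_links=4) for i in range(n)]
    for _ in range(2000):  # warm-up traffic between random pairs
        src, dst = random.sample(range(n), 2)
        route_and_rewire(nodes, src, dst)
    lengths = [len(greedy_route(nodes, *random.sample(range(n), 2))) - 1
               for _ in range(200)]
    print("mean greedy path length after rewiring:", sum(lengths) / len(lengths))
```

Comparing the printed mean against the roughly n/4 hops of a bare ring gives a feel for how much the rewiring helps at this size; it says nothing about the asymptotic bound.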

This is based on Freenet's approach to edge selection.  The paper  
describes both experimental results that confirm that this algorithm  
does lead to a small world topology, and also delves into the  
mathematics behind why this might be the case.

It is interesting not just because of the simplicity of the  
algorithm, but because it is extremely amenable to the construction  
and maintenance of decentralized P2P networks, and because it could  
tell us something about why human relationships tend to form small  
world networks.

This algorithm is used by the Dijjer (http://dijjer.org/) P2P network.

Ian.

On 7 Mar 2006, at 02:46, Jeff Rose wrote:

> Anyone out there doing work in clustering?  I'd be interested in  
> hearing what people are up to, or if you have pointers to good  
> papers that would be great too.  At this point I'm interested in  
> any angle, geographic distance based, semantic distance based,  
> hierarchical, non-hierarchical etc...  My goal is to work towards a  
> very amorphous clustering scheme that allows any kind of object to  
> be located close to relatives in the network as long as they share  
> a common distance function.



From ossa at math.chalmers.se  Sun Mar 12 16:18:28 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <IJECJCHFLLIPKMNGHDPMOENBIGAA.osokin@osokin.com>
References: <IJECJCHFLLIPKMNGHDPMOENBIGAA.osokin@osokin.com>
Message-ID: <441449D4.7070809@math.chalmers.se>

Serguei Osokine wrote:
> 	Ian, Oskar, that was wonderful. I rarely see anything that has
> that much aesthetic value, but the analytical proof that the graph
> rewiring caused by the random search actions actually makes it a 
> small-world graph with log^2(N) path length, was certainly worth the
> one-year wait since last April, which is when Ian has preannounced 
> this paper.

It should be noted that the analytic proof in fact does not establish 
this. I did come up with the proof that appears in the thesis last 
April, and I have since (with little success) attempted to generalize it 
so it holds for the rewiring algorithm. The problem is that the proof, 
like the Kleinberg proofs, assumes that edges are chosen independently 
at each point, but in fact using our algorithm they will depend on one 
another (in a good way, but that is hard to prove).

In fact, I haven't even managed to show that the class of distributions 
for which I have a bound - those which are solutions to the system 
implied by equations (2.3) and (2.4) in the text - is nonempty. Until 
this is done the theorem as it stands is interesting, but from a strict 
mathematical viewpoint quite meaningless. (Which is why I have held off 
on publication).

<>
> 1. What is the difference between Freenet[1] and Sandberg/Dijjer[2]
>    models that makes Dijjer easier to analyze? This is mentioned a few
>    times in [2] and in other Freenet- and Dijjer- related places, but
>    I did not notice the explicit explanation of the differences
>    anywhere. Was it the directed links in the graph in [2]? Something
>    else? I honestly tried to figure this out by myself, but couldn't; 
>    sorry... Maybe if I'd spend more time looking at both [1] and [2],
>    I'd figure this out, but since you guys are already here, I thought
>    that maybe you could answer that? 

What makes Freenet (by which I mean the old Freenet routing, not the one 
they are using now) much more difficult to analyze is that it is the data, 
rather than the nodes, that have addresses. Data bounces around the 
network dynamically, and so the Markov chain implied becomes a lot more 
complicated. For one thing, Freenet suffers from load balancing issues, 
while the point to point routing cannot have such issues.

> 2. Whatever this difference is, how relevant do you think are the
>    Sandberg/Dijjer results[2] for the original Freenet paper[1]?
> 
>    For example:
> 
> 	a. section 5.2 of [1] says about Fig. 3: "We can see that the
> 	   pathlength scales approximately logarithmically, with a change
> 	   of slope near 40,000 nodes."
> 
> 	   Given what you guys know now, do you think that Fig. 3 shows
> 	   log^2(N) instead, or Freenet simply has different scaling
> 	   properties due to its different algorithms?

Paths in Freenet are affected by other things than just point to point 
routing. Caching for instance will play a big role. I also don't 
remember exactly how we were scaling the routing table size in those 
simulations (the log^2 bound is for a constant routing table size).

I'm guessing that a lot of what was seen in those simulations was an 
artifact of other factors.

> 	b. Fig. 5 shows the link number distribution that awfully
> 	   resembles the power-law one, whereas Oskar mentioned in
> 	   another mail that Kleinberg is constant out-degree and 
> 	   Poisson in-degree. Is my assumption that Sandberg/Dijjer 
> 	   model is supposed to mimic the Kleinberg distribution
> 	   incorrect and in fact both Sandberg/Dijjer and Freenet
> 	   are power-law? Or is just one of them power law due to
> 	   the different algorithms used? Or maybe Fig. 5 in [1] is
> 	   actually showing a part of the Poisson distribution that 
> 	   can be easily mistaken for the power law?

The model presented in the paper has fixed out-degree and (roughly) 
Poisson in-degree like Kleinberg's. Freenet has a fixed out-degree but 
possibly a power-law in-degree, which is the cause of the aforementioned 
load balancing issues (I don't have any good evidence for this, just the 
feeling that there is some sort of preferential attachment going on).

> 	c. Section 5.4 of [1] says: "In a small-world network, the
> 	   majority of nodes have only relatively few, local, connections
> 	   to other nodes, while a small number of nodes have large, 
> 	   wide-ranging sets of connections. Small-world networks permit
> 	   efficient short paths between arbitrary points because of the
> 	   shortcuts provided by the well-connected nodes..."
> 
> 	   This does not sound like anything I've been able to see in the
> 	   Oskar's thesis[2] - it has left me with a distinct impression
> 	   that the connectivity in the small world (at least in the one
> 	   analyzed there) was provided by the exactly right (Kleinberg's)
> 	   proportion of long-distance links, which Sandberg/Dijjer
> 	   rewiring model was designed to achieve, and not by the means 
> 	   of any special nodes with "wide-ranging sets of connections".
> 	   Is it because Freenet and Dijjer used different algorithms, or
> 	   the original assumption about the reasons for the Freenet
> 	   connectivity was wrong? 

The original paper is based on less sophisticated small-world models 
than the later stuff. I think most of the small-world references in 
there are hand-waving really.

If you place an arbitrary point somewhere where you say that connections 
shorter than that are local and others are long distance, then 
Kleinberg's model (and ours) will reflect the somewhat fuzzy statement 
you quote above.

> 3. Why would one want to use the system with O(log^2(N)) path length
>    when pretty much every DHT gives a path length of O(log(N))? At
>    today's P2P nets scale (millions of nodes) the difference between
>    log and log^2 is non-trivial. I understand that creating the small-
>    world network by a simple and natural evolutionary process is very
>    elegant, simple, and attractive, but is this elegance worth the 
>    increased path length? Why not simply use a DHT?

In real life, the order is much less interesting than the actual route 
length. One shouldn't stare blindly at the order when the constants 
could be dominating in real world situations. Also, the log^2 is with 
constant routing table size; if you scale up the routing table with the 
size of the network, you can roughly divide the order by the routing 
table size (most log n DHTs scale the routing table with log n as well).
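
As back-of-the-envelope arithmetic for the point above: if the hop count is of order log^2(n) with a constant-size routing table and is roughly divided by the table size, then a table of about log n entries brings it back to order log n. The helper below is a hypothetical illustration that ignores all constants, which is exactly what the paragraph warns can dominate in practice.

```python
import math

def rough_hops(n, table_size):
    """Order-of-magnitude estimate only: log2(n)^2 / table_size, constants ignored."""
    return math.log2(n) ** 2 / table_size

for n in (2**10, 2**20, 2**30):
    log_n = math.log2(n)
    print(f"n = 2^{int(log_n)}: "
          f"one link -> ~{rough_hops(n, 1):.0f} hops, "
          f"log n links -> ~{rough_hops(n, log_n):.0f} hops")
```

For n = 2^20, log2(n) is 20, so the estimate is about 400 with one long link and about 20 with 20 links.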

> 4. Speaking of DHTs: Oskar recently said something curious here that 
>    I did not understand: "Actually, while the frequency of Chord links
>    falls exponentially with the "level" (not sure what the Chord term
>    is) the length of such links increases exponentially as well, so in
>    fact the frequency of links with certain lengths do fall
>    harmonically. One could see Chord as some sort of "mean field"
>    version of the same dynamics as Kleinberg's model."
> 
>    What exactly does this mean? DHTs (including Chord) have log(N)
>    diameter, and Kleinberg has log^2(N), right? This looks fairly
>    different to me, and I totally missed the meaning of the "mean
>    field", and why Chord and Kleinberg would have the same dynamics;
>    I did not even understand what is this dynamics that you're talking
>    about. Oskar, could you please elaborate on this?

Again, the routing tables grow in Chord. If each node had one "chord" 
link (every log n th node having a chord of the same length), then Chord 
would be roughly log^2 as well.
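
One way to play with this thought experiment is the small simulation below. It is a sketch under assumptions of my own choosing: n = 2^m nodes on a ring, clockwise-only routing that never overshoots, and in the one-link variant node i gets the successor edge plus a single finger of length 2^(i mod m), so that every m-th node has a chord of the same length. It only prints averages for comparison; it proves nothing about the asymptotics.

```python
def full_chord_hops(src, dst, m):
    """Hops with a full finger table (fingers of length 1, 2, 4, ..., 2^(m-1))."""
    n = 1 << m
    d = (dst - src) % n
    hops = 0
    while d:
        d -= 1 << (d.bit_length() - 1)  # take the largest finger that does not overshoot
        hops += 1
    return hops

def one_finger_hops(src, dst, m):
    """Hops when each node has only its successor and one finger of length 2^(i mod m)."""
    n = 1 << m
    cur, hops = src, 0
    while cur != dst:
        d = (dst - cur) % n
        finger = 1 << (cur % m)
        step = finger if finger <= d else 1  # fall back to the successor edge
        cur = (cur + step) % n
        hops += 1
    return hops

if __name__ == "__main__":
    m = 8
    n = 1 << m
    pairs = [(s, t) for s in range(n) for t in range(n) if s != t]
    full = sum(full_chord_hops(s, t, m) for s, t in pairs) / len(pairs)
    single = sum(one_finger_hops(s, t, m) for s, t in pairs) / len(pairs)
    print(f"m = {m}: full finger table {full:.2f} hops, one finger per node {single:.2f} hops")
```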

<>
> 6. Both the simulation in Freenet paper[1] and Sandberg/Dijjer paper[2]
>    use the random requests to rewire the networks. In reality, there
>    will be all kinds of hotspots and uneven distributions of requests
>    due to the different popularity of content. Do you guys think it
>    would affect the results of analysis performed in [2], and if yes,
>    how? For example, if one particular file is extremely popular, one
>    could imagine that its storage node[s] will have many more links 
>    than the average, and the search for the other (rarely requested)
>    data items might become ineffective, because these path lengths will
>    become very high. 

I would imagine that the large number of files stored per node, and the 
random distribution implied by using a secure hash would help even out 
distribution. I'm not sure what an uneven distribution does under our 
algorithm, but intuitively it should handle it just fine (it would mean 
more links to the destination of popular queries, but that is a good 
idea since such links are popular...)


>    Do you think it is a valid concern? Are there any results that would
>    show that this is not an issue?

Yes, and no (besides standard concentration results showing that it 
should even out if there are sufficiently many documents per node).

> 7. While Sandberg [2] assumes that the requests are passing between 
>    the random points, both Freenet and Dijjer seem to allow the length
>    of the path to be shortened when the requested content is already
>    cached on the intermediate node. If this is really so, what is the
>    effect that will have on Oskar's analysis results? Fewer links will
>    be rewired, so this should have some effect on the graph dynamics,
>    right? Any idea what it would be?

This is part of the answer to your first question about why these 
situations are harder to analyze.

> 8. The answer to '7' above might also depend on the cache size and 
>    replacement policies on the nodes, and on the content popularity
>    distribution (which I already mentioned in '6' above). All these
>    factors are hard to predict for the real system in advance, and I'm
>    wondering what is the feeling that you guys have about the overall
>    stability of the rewiring algorithm?

No idea.

>    I mean, the original Kleinberg d(x, y)^(-1) approach is extremely
>    sensitive to the value of this "-1". Any other value, and you either
>    do not have enough long distance links, or you do not have enough
>    short-range links once you arrive into the general vicinity of your
>    destination. In any case, your path length becomes polynomial instead
>    of polylogarithmic once you have the smallest deviation from this
>    "-1" number.

It isn't actually that sensitive. It is sensitive asymptotically, but 
for any given network size varying the exponent a little will make 
rather little difference. In fact, one can calculate numerically (I know 
of no proof) that there are better exponents than -1 for any given n 
(though the best exponent approaches -1 as the network grows towards 
infinity).
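
A quick way to get a feel for this numerically is a small simulation like the one below. It is a sketch under assumptions, not the calculation referred to above: a ring of n nodes, one long link per node drawn with probability proportional to d^(-alpha) (so alpha = 1 corresponds to the exponent -1), greedy routing, and a fixed random seed. All function names are invented here.

```python
import random

def ring_dist(a, b, n):
    d = abs(a - b) % n
    return min(d, n - d)

def build_links(n, alpha, rng):
    """Give each node one long link whose length d has probability proportional to d^(-alpha)."""
    dists = range(1, n // 2 + 1)
    weights = [d ** -alpha for d in dists]
    lengths = rng.choices(dists, weights=weights, k=n)
    return [(i + rng.choice((-1, 1)) * d) % n for i, d in enumerate(lengths)]

def greedy_hops(links, src, dst):
    """Greedy routing over the two ring neighbours and the single long link."""
    n = len(links)
    cur, hops = src, 0
    while cur != dst:
        candidates = ((cur - 1) % n, (cur + 1) % n, links[cur])
        cur = min(candidates, key=lambda c: ring_dist(c, dst, n))
        hops += 1
    return hops

def mean_hops(n, alpha, trials=2000, seed=0):
    rng = random.Random(seed)
    links = build_links(n, alpha, rng)
    total = 0
    for _ in range(trials):
        src, dst = rng.sample(range(n), 2)
        total += greedy_hops(links, src, dst)
    return total / trials

if __name__ == "__main__":
    for alpha in (0.8, 0.9, 1.0, 1.1, 1.2):
        print(f"alpha = {alpha}: mean greedy path length {mean_hops(4096, alpha):.1f}")
```

Running it for a few network sizes and a finer grid of alpha values is enough to see how flat or steep the curve is around 1 at any particular n.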

>    And this makes me a bit uneasy; I do understand that the catastrophic
>    failure is unlikely, and the detrimental effect of all these changes
>    (if any) is likely to be tolerable, but still... Look at it this
>    way: Chapter 1 of Oskar's thesis is dedicated to proving that 
>    "...family given by (1.1) allows for polylogarithmic routing at, 
>    and only at, one value of the [alpha]...", then Chapter 2 proves
>    that this is exactly what happens with random-request rewiring, and
>    then Dijjer's practical implementation and deployment arrive and
>    throw these random requests out of the window with their caching,
>    different content popularity, and whatnot. So given the strict
>    requirements that were proven to be vital in Chapter 1, I cannot 
>    help wondering: is Dijjer still polylogarithmic, or not?

It is certainly not proved.

>    I'd feel much better about all this if there would be some results
>    showing that the system is stable in the presence of such changes
>    (in a sense that reasonably small algorithm input changes should not
>    cause any catastrophically large changes in the algorithm results).
>    Do you have any results that would point in that direction? Would
>    you say that the data displayed on Figures 2.4-2.6 already shows 
>    that your results are not exactly Kleinberg, but this does not 
>    cause any catastrophic consequences, or you also have some other
>    sources of optimism?

I would feel better about it too if more were proven. I am optimistic 
that things actually do work, but I am not very optimistic about how 
much I will ever be able to show (and it isn't just me: though a better 
mathematician may of course have gotten further, I have put these 
problems in front of many good researchers who seem to agree they are 
difficult problems).

> 9. And finally, if I'm not mistaken, Dijjer description seems to imply
>    that the links are rewired after every search, whereas Oskar suggests
>    that this should happen only with probability p, and in section 2.3
>    says: "...there are simple heuristic arguments for why p should
>    reasonably be on the order of one over the expected length of the 
>    greedy walks". Any particular reason for that Dijjer behaviour, or
>    I've missed something and Dijjer does, in fact, rewire with this
>    probability p? (Would be interesting to hear these simple heuristic
>    arguments too, but that's another story...)

It won't matter so much if the routing table is large (remember the 
mathematical model has one long link) though I would recommend that 
Dijjer update less often.

// oskar

From osokin at osokin.com  Sun Mar 12 22:12:15 2006
From: osokin at osokin.com (Serguei Osokine)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <441449D4.7070809@math.chalmers.se>
Message-ID: <IJECJCHFLLIPKMNGHDPMOENLIGAA.osokin@osokin.com>


	Oskar, thank you for a quick and thorough reply!

> In fact, I haven't even managed to show that the class of
> distributions for which I have a bound - those which are solutions 
> to the system implied by equations (2.3) and (2.4) in the text - is
> nonempty. Until this is done the theorem as it stands is interesting,
> but from a strict mathematical viewpoint quite meaningless. (Which is
> why I have held off on publication).

	Damn... I knew that I should've spent more time following the 
chain of proofs... I thought that you did prove your rewiring method
to converge to the balanced distribution in terms of Theorem 2.2 with
some margin of error due to the dynamic nature of the rewiring, and 
that the differences between solid and dotted lines in Figs. 2.4-2.6
were basically the result of this dynamic error, especially since the
solid line on Fig. 2.6 seemed to have the clearly visible simulation
artifacts... :-)

> Also, the log^2 is with constant routing table size, if you scale up
> the routing table with the size of the network, you can roughly divide
> the order by the routing table size (most log n DHTs scale the routing
> table with log n as well).
> ...
> Again, the routing tables grow in Chord. If each node had one "chord"
> link (every log n th node having a chord of the same length), then
> Chord would be roughly log^2 as well.

	Ah, great points. So let me get this straight - what you're 
saying is that Chord and other similar DHTs (Kademlia, etc) all have
basically the Kleinberg distribution, correct? They have the same
average number of shortcuts in every shortcut length interval, where
interval k has size of order 2^k and average shortcut length of order
2^k. The probability P[2^k] of having a shortcut of length 2^k is then
inversely proportional to the interval size 2^k, so what we
have is P[2^k] ~ 1/(2^k), or P[x] ~ 1/x.
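
The same back-of-the-envelope check can be run from the other direction: if link lengths follow P[x] ~ 1/x, then every doubling interval [2^k, 2^(k+1)) should carry about the same total weight, which is what "one finger per power of two" amounts to. The sketch below just sums 1/x over each such interval; the function name is made up for illustration.

```python
import math

def harmonic_octave_mass(k):
    """Sum of 1/x over link lengths x in [2^k, 2^(k+1))."""
    return sum(1.0 / x for x in range(1 << k, 1 << (k + 1)))

for k in range(0, 12):
    print(f"lengths in [2^{k}, 2^{k + 1}): total weight {harmonic_octave_mass(k):.4f}")
print(f"ln 2 = {math.log(2):.4f}")
```

The per-interval weight settles near ln 2 as k grows, so a 1/x distribution spreads links evenly across the doubling intervals, just as a Chord finger table does.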

	So the apparent log(N) Chord path length is due only to log(N)
links from every node, and if, say, Dijjer had log(N) links per
node, its path length would also scale as log(N).

	Very interesting - if I really got all this correctly. It seems
obvious in retrospect, but I, for one, never thought about it this way.
In fact, I could not fully understand your point before doing some back
of the envelope (well, back of your thesis, actually :-) calculations
while writing this mail. Judging from the previous discussion, I won't
be the only one for whom this chain of reasoning will be a surprise -
thank you for taking time to lay it out in the open!

	By the way, as a practical matter, does Dijjer grow its routing
tables with network size? Shouldn't be a problem to scale these tables
as log(N) and have the scalability similar to Chord and other DHTs. If
you do, you might want to mention it somewhere - I'm fairly sure that
I'm not the only person dense enough to miss the link between log(N)
path length and log(N) links per node and to be left with an impression
that Chord scales better than Dijjer as a result.

> It won't matter so much if the routing table is large (remember the
> mathematical model has one long link) though I would recommend that 
> Dijjer updated less often.

	If the Dijjer routing table indeed has log(N) size and you replace
only one entry in it, doesn't it already mean that you are updating with 
the probability of "...order of one over the expected length of the
greedy walks", a length which also becomes log(N) in this case? Or is
your recommendation based on some other considerations?

	Best wishes -
	S.Osokine.
	12 Mar 2006.


-----Original Message-----
From: Oskar Sandberg [mailto:ossa@math.chalmers.se]
Sent: Sunday, March 12, 2006 8:18 AM
To: osokin@osokin.com; Peer-to-peer development.
Subject: Re: Dijjer and Freenet (RE: [p2p-hackers] clustering)


Serguei Osokine wrote:
> 	Ian, Oskar, that was wonderful. I rarely see anything that has
> that much aesthetic value, but the analytical proof that the graph
> rewiring caused by the random search actions actually makes it a 
> small-world graph with log^2(N) path length, was certainly worth the
> one-year wait since last April, which is when Ian has preannounced 
> this paper.

It should be noted that the analytic proof in fact does not establish 
this. I did come up with the proof that appears in the thesis last 
April, and I have since (with little success) attempted to generalize it 
so it holds for the rewiring algorithm. The problem is that the proof, 
like the Kleinberg proofs, assumes that edges are chosen independently 
at each point, but in fact using our algorithm they will depend on one 
another (in a good way, but that is hard to prove).

In fact, I haven't even managed to show that the class of distributions 
for which I have a bound - those which are solutions to the system 
implied by equations (2.3) and (2.4) in the text - is nonempty. Until 
this is done the theorem as it stands is interesting, but from a strict 
mathematical viewpoint quite meaningless. (Which is why I have held off 
on publication).

<>
> 1. What is the difference between Freenet[1] and Sandberg/Dijjer[2]
>    models that makes Dijjer easier to analyze? This is mentioned a few
>    times in [2] and in other Freenet- and Dijjer- related places, but
>    I did not notice the explicit explanation of the differences
>    anywhere. Was it the directed links in the graph in [2]? Something
>    else? I honestly tried to figure this out by myself, but couldn't; 
>    sorry... Maybe if I'd spend more time looking at both [1] and [2],
>    I'd figure this out, but since you guys are already here, I thought
>    that maybe you could answer that? 

What makes Freenet (by which I mean the old Freenet routing, not the one 
they are using now) much more difficult to analyze is that it is the data, 
rather than the nodes, that have addresses. Data bounces around the 
network dynamically, and so the implied Markov chain becomes a lot more 
complicated. For one thing, Freenet suffers from load balancing issues, 
while the point to point routing cannot have such issues.

> 2. Whatever this difference is, how relevant do you think are the
>    Sandberg/Dijjer results[2] for the original Freenet paper[1]?
> 
>    For example:
> 
> 	a. section 5.2 of [1] says about Fig. 3: "We can see that the
> 	   pathlength scales approximately logarithmically, with a change
> 	   of slope near 40,000 nodes."
> 
> 	   Given what you guys know now, do you think that Fig. 3 shows
> 	   log^2(N) instead, or Freenet simply has different scaling
> 	   properties due to its different algorithms?

Paths in Freenet are affected by other things than just point to point 
routing. Caching for instance will play a big role. I also don't 
remember exactly how we were scaling the routing table size in those 
simulations (the log^2 bound is for a constant routing table size).

I'm guessing that a lot of what was seen in those simulations was 
an artifact of other factors.

> 	b. Fig. 5 shows the link number distribution that awfully
> 	   resembles the power-law one, whereas Oskar mentioned in
> 	   another mail that Kleinberg is constant out-degree and 
> 	   Poisson in-degree. Is my assumption that Sandberg/Dijjer 
> 	   model is supposed to mimic the Kleinberg distribution
> 	   incorrect and in fact both Sandberg/Dijjer and Freenet
> 	   are power-law? Or is just one of them power law due to
> 	   the different algorithms used? Or maybe Fig. 5 in [1] is
> 	   actually showing a part of the Poisson distribution that 
> 	   can be easily mistaken for the power law?

The model presented in the paper has fixed out-degree and (roughly) 
Poisson in-degree like Kleinberg's. Freenet has a fixed out-degree but 
possibly a power-law in-degree, which is the cause of the aforementioned 
load balancing issues (I don't have any good evidence for this, just the 
feeling that there is some sort of preferential attachment going on).

> 	c. Section 5.4 of [1] says: "In a small-world network, the
> 	   majority of nodes have only relatively few, local, connections
> 	   to other nodes, while a small number of nodes have large, 
> 	   wide-ranging sets of connections. Small-world networks permit
> 	   efficient short paths between arbitrary points because of the
> 	   shortcuts provided by the well-connected nodes..."
> 
> 	   This does not sound like anything I've been able to see in the
> 	   Oskar's thesis[2] - it has left me with a distinct impression
> 	   that the connectivity in the small world (at least in the one
> 	   analyzed there) was provided by the exactly right (Kleinberg's)
> 	   proportion of long-distance links, which Sandberg/Dijjer
> 	   rewiring model was designed to achieve, and not by the means 
> 	   of any special nodes with "wide-ranging sets of connections".
> 	   Is it because Freenet and Dijjer used different algorithms, or
> 	   the original assumption about the reasons for the Freenet
> 	   connectivity was wrong? 

The original paper is based on less sophisticated small-world models 
than the later stuff. I think most of the small-world references in 
there are hand-waving really.

If you place an arbitrary point somewhere where you say that connections 
shorter than that are local and others are long distance, then 
Kleinberg's model (and ours) will reflect the somewhat fuzzy statement 
you quote above.

> 3. Why would one want to use the system with O(log^2(N)) path length
>    when pretty much every DHT gives a path length of O(log(N))? At
>    today's P2P nets scale (millions of nodes) the difference between
>    log and log^2 is non-trivial. I understand that creating the small-
>    world network by a simple and natural evolutionary process is very
>    elegant, simple, and attractive, but is this elegance worth the 
>    increased path length? Why not simply use a DHT?

In real life, the order is much less interesting than the actual route 
length. One shouldn't stare blindly at the order when the constants 
could be dominating in real world situations. Also, the log^2 is with 
constant routing table size, if you scale up the routing table with the 
size of the network, you can roughly divide the order by the routing 
table size (most log n DHTs scale the routing table with log n as well).
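
The arithmetic behind that last remark can be sketched in a few lines.
This is only an order-of-magnitude illustration: the helper name
hops_estimate is made up here, log base 2 is an assumption, and all
constants are ignored (which, as noted above, may be exactly what
dominates in practice).

```python
import math

def hops_estimate(n, table_size):
    """Rough order log2(n)^2 / table_size; constants are ignored."""
    return math.log2(n) ** 2 / table_size

n = 1_000_000
log_n = math.log2(n)                      # about 19.93
one_link = hops_estimate(n, 1)            # a single shortcut per node
twenty_links = hops_estimate(n, 20)       # fixed table of 20 entries
scaled = hops_estimate(n, round(log_n))   # table grown as log2(n)

print(f"log2(n)={log_n:.2f}  1 link: {one_link:.0f}  "
      f"20 links: {twenty_links:.1f}  log2(n) links: {scaled:.1f}")
```

At one million nodes a table of 20 entries happens to be about log2(n)
entries, so on this crude estimate the two cases coincide at that size.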

> 4. Speaking of DHTs: Oskar recently said something curious here that 
>    I did not understand: "Actually, while the frequency of Chord links
>    falls exponentially with the "level" (not sure what the Chord term
>    is) the length of such links increases exponentially as well, so in
>    fact the frequency of links with certain lengths do fall
>    harmonically. One could see Chord as some sort of "mean field"
>    version of the same dynamics as Kleinberg's model."
> 
>    What exactly does this mean? DHTs (including Chord) have log(N)
>    diameter, and Kleinberg has log^2(N), right? This looks fairly
>    different to me, and I totally missed the meaning of the "mean
>    field", and why Chord and Kleinberg would have the same dynamics;
>    I did not even understand what is this dynamics that you're talking
>    about. Oskar, could you please elaborate on this?

Again, the routing tables grow in Chord. If each node had one "chord" 
link (every log n th node having a chord of the same length), then Chord 
would be roughly log^2 as well.
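
One way to see the "mean field" remark quoted above is to look at link
lengths octave by octave. Chord keeps one finger in every distance range
[2^i, 2^(i+1)); a single shortcut drawn with probability proportional to
1/d lands in each of those ranges with roughly equal probability. The
sketch below checks that numerically. The function name and the choice
of 16 octaves are illustrative assumptions, not anything from the
thesis.

```python
def octave_masses(m):
    """Probability that a link of length d, drawn with P(d) ~ 1/d on
    1 .. 2**m - 1, falls in the octave [2**i, 2**(i+1)), for i < m."""
    n = 2 ** m
    norm = sum(1.0 / d for d in range(1, n))
    return [sum(1.0 / d for d in range(2 ** i, 2 ** (i + 1))) / norm
            for i in range(m)]

masses = octave_masses(16)
for i, mass in enumerate(masses):
    print(f"lengths {2 ** i:>5} .. {2 ** (i + 1) - 1:>5}: {mass:.4f}")
```

Each octave gets a share close to 1/16, so a harmonic shortcut behaves
like a randomly chosen Chord finger; Chord simply holds all of them at
once.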

<>
> 6. Both the simulation in Freenet paper[1] and Sandberg/Dijjer paper[2]
>    use the random requests to rewire the networks. In reality, there
>    will be all kinds of hotspots and uneven distributions of requests
>    due to the different popularity of content. Do you guys think it
>    would affect the results of analysis performed in [2], and if yes,
>    how? For example, if one particular file is extremely popular, one
>    could imagine that its storage node[s] will have many more links 
>    than the average, and the search for the other (rarely requested)
>    data items might become ineffective, because these path lengths will
>    become very high. 

I would imagine that the large number of files stored per node, and the 
random distribution implied by using a secure hash would help even out 
distribution. I'm not sure what an uneven distribution does under our 
algorithm, but intuitively it should handle it just fine (it would mean 
more links to the destination of popular queries, but that is a good 
idea since such links are popular...)


>    Do you think it is a valid concern? Are there any results that would
>    show that this is not an issue?

Yes, and no (besides standard concentration results showing that it 
should even out if there are sufficiently many documents per node).

> 7. While Sandberg [2] assumes that the requests are passing between 
>    the random points, both Freenet and Dijjer seem to allow the length
>    of the path to be shortened when the requested content is already
>    cached on the intermediate node. If this is really so, what is the
>    effect that will have on Oskar's analysis results? Fewer links will
>    be rewired, so this should have some effect on the graph dynamics,
>    right? Any idea what it would be?

This is part of the answer to your first question about why these 
situations are harder to analyze.

> 8. The answer to '7' above might also depend on the cache size and 
>    replacement policies on the nodes, and on the content popularity
>    distribution (which I already mentioned in '6' above). All these
>    factors are hard to predict for the real system in advance, and I'm
>    wondering what is the feeling that you guys have about the overall
>    stability of the rewiring algorithm?

No idea.

>    I mean, the original Kleinberg d(x, y)^(-1) approach is extremely
>    sensitive to the value of this "-1". Any other value, and you either
>    do not have enough long distance links, or you do not have enough
>    short-range links once you arrive into the general vicinity of your
>    destination. In any case, your path length becomes polynomial instead
>    of polylogarithmic once you have the smallest deviation from this
>    "-1" number.

It isn't actually that sensitive. It is sensitive asymptotically, but 
for any given network size varying the exponent a little will make 
rather little difference. In fact, one can calculate numerically (I know 
of no proof) that there are better exponents than -1 for any given n 
(though the best exponent approaches -1 as the network grows towards 
infinity).
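
For anyone who wants to try this numerically, here is a minimal sketch
of greedy routing on a ring where each node has its two ring neighbours
plus one shortcut whose length d is drawn with probability proportional
to d^(-alpha). It is not the simulator used for any result in this
thread; the one-dimensional ring, the function names, the network size
and the trial count are all assumptions made for illustration. Here
alpha = 1.0 corresponds to the "-1" exponent discussed above.

```python
import random

def ring_dist(a, b, n):
    """Shortest distance between positions a and b on a ring of n nodes."""
    d = abs(a - b)
    return min(d, n - d)

def build_shortcuts(n, alpha, rng):
    """One shortcut per node; its length d is drawn with P(d) ~ d**(-alpha)."""
    lengths = list(range(1, n // 2 + 1))
    weights = [d ** -alpha for d in lengths]
    draws = rng.choices(lengths, weights=weights, k=n)
    return [(x + rng.choice((-1, 1)) * d) % n for x, d in enumerate(draws)]

def greedy_steps(src, dst, shortcut, n):
    """Greedy walk: always step to the known node closest to dst."""
    steps, cur = 0, src
    while cur != dst:
        options = ((cur - 1) % n, (cur + 1) % n, shortcut[cur])
        cur = min(options, key=lambda v: ring_dist(v, dst, n))
        steps += 1
    return steps

def mean_steps(n, alpha, trials=300, seed=1):
    """Average greedy route length over random source/destination pairs."""
    rng = random.Random(seed)
    shortcut = build_shortcuts(n, alpha, rng)
    total = 0
    for _ in range(trials):
        total += greedy_steps(rng.randrange(n), rng.randrange(n), shortcut, n)
    return total / trials

if __name__ == "__main__":
    for alpha in (0.8, 0.9, 1.0, 1.1, 1.2):
        print(alpha, mean_steps(4096, alpha))
```

Comparing nearby exponents at a fixed n, and then repeating for larger
n, is one way to look at the finite-size behaviour described above.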

>    And this makes me a bit uneasy; I do understand that the catastrophic
>    failure is unlikely, and the detrimental effect of all these changes
>    (if any) is likely to be tolerable, but still... Look at it this
>    way: Chapter 1 of Oskar's thesis is dedicated to proving that 
>    "...family given by (1.1) allows for polylogarithmic routing at, 
>    and only at, one value of the [alpha]...", then Chapter 2 proves
>    that this is exactly what happens with random-request rewiring, and
>    then Dijjer's practical implementation and deployment arrive and
>    throw these random requests out of the window with their caching,
>    different content popularity, and whatnot. So given the strict
>    requirements that were proven to be vital in Chapter 1, I cannot 
>    help wondering: is Dijjer still polylogarithmic, or not?

It is certainly not proved.

>    I'd feel much better about all this if there would be some results
>    showing that the system is stable in the presence of such changes
>    (in a sense that reasonably small algorithm input changes should not
>    cause any catastrophically large changes in the algorithm results).
>    Do you have any results that would point in that direction? Would
>    you say that the data displayed on Figures 2.4-2.6 already shows 
>    that your results are not exactly Kleinberg, but this does not 
>    cause any catastrophic consequences, or you also have some other
>    sources of optimism?

I would feel better about it too if more were proven. I am optimistic 
that things actually do work, but I am not very optimistic about how 
much I will ever be able to show (and it isn't just me, though a better 
mathematician may of course have gotten further, I have put these 
problems in front of many good researchers, who seem to agree they are 
difficult problems).

> 9. And finally, if I'm not mistaken, Dijjer description seems to imply
>    that the links are rewired after every search, whereas Oskar suggests
>    that this should happen only with probability p, and in section 2.3
>    says: "...there are simple heuristic arguments for why p should
>    reasonably be on the order of one over the expected length of the 
>    greedy walks". Any particular reason for that Dijjer behaviour, or
>    I've missed something and Dijjer does, in fact, rewire with this
>    probability p? (Would be interesting to hear these simple heuristic
>    arguments too, but that's another story...)

It won't matter so much if the routing table is large (remember the 
mathematical model has one long link) though I would recommend that 
Dijjer update less often.
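
As a rough illustration of what "rewire with probability p" means in
the one-shortcut model, the sketch below routes greedily on a ring and
then lets each node on the path, independently with probability p,
repoint its single shortcut at the destination. This is one reading of
the rule discussed in this thread, not code from Dijjer or from the
thesis (which remains the authoritative description); the names, the
ring size and p = 0.1 are hypothetical.

```python
import random

def ring_dist(a, b, n):
    """Shortest distance between positions a and b on a ring of n nodes."""
    d = abs(a - b)
    return min(d, n - d)

def route_and_rewire(src, dst, shortcut, n, p, rng):
    """Greedy-route src -> dst, then rewire along the path (assumed rule):
    each visited node repoints its shortcut at dst with probability p."""
    path, cur = [], src
    while cur != dst:
        path.append(cur)
        options = ((cur - 1) % n, (cur + 1) % n, shortcut[cur])
        cur = min(options, key=lambda v: ring_dist(v, dst, n))
    for node in path:
        if rng.random() < p:
            shortcut[node] = dst
    return len(path)

rng = random.Random(7)
n = 1024
shortcut = [rng.randrange(n) for _ in range(n)]   # arbitrary starting links
lengths = [route_and_rewire(rng.randrange(n), rng.randrange(n),
                            shortcut, n, p=0.1, rng=rng)
           for _ in range(5000)]
print("mean route length, first 500 queries:", sum(lengths[:500]) / 500)
print("mean route length, last 500 queries: ", sum(lengths[-500:]) / 500)
```

With p equal to one over the typical walk length, about one node per
query rewires on average. That is plain arithmetic on the expectation,
not necessarily the heuristic argument referred to in section 2.3.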

// oskar


From stclausen at gmail.com  Mon Mar 13 08:54:43 2006
From: stclausen at gmail.com (steve clausensaun)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Overlay attacks
Message-ID: <f8cce3cf0603130054u7cef2bb2kb5a32cf52fd0a5bf@mail.gmail.com>

Hi all,

I'm doing some experiments on a few overlays. Basically, I'm trying to see
the effect of node misbehavior in landmark-based positioning networks
such as GNP or NPS.
Is there any literature, or are there useful links, that I should know about?
Do you have ideas on behaviors that should be studied?
Last point: is there a way to evade the filtering function established by
NPS, for example, which ignores nodes whose error deviates too far from the
median error of the other nodes? I mean, is there a way to destabilize
this kind of protection?
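
To make the kind of filter in question concrete, here is a minimal
sketch of a median-based rejection rule. It is not the actual NPS rule:
the function name, the factor of 4 and the sample error values are all
hypothetical, chosen only to show the shape of the mechanism.

```python
from statistics import median

def filter_reference_points(errors, factor=4.0):
    """Keep the reference nodes whose fitting error is at most
    factor * median of all reported errors (hypothetical threshold)."""
    med = median(errors.values())
    return {node: err for node, err in errors.items() if err <= factor * med}

# Made-up relative fitting errors for four reference nodes.
errors = {"a": 0.05, "b": 0.04, "c": 0.06, "d": 0.90}
kept = filter_reference_points(errors)
print(sorted(kept))
```

A median only stays put while fewer than half of the inputs are
outliers, so how many reference points a misbehaving party controls
matters a great deal for a rule of this shape.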

Kind regards,

Stuart.
From thorsten.strufe at tu-ilmenau.de  Mon Mar 13 15:23:40 2006
From: thorsten.strufe at tu-ilmenau.de (Thorsten Strufe)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Overlay attacks
In-Reply-To: <f8cce3cf0603130054u7cef2bb2kb5a32cf52fd0a5bf@mail.gmail.com>
References: <f8cce3cf0603130054u7cef2bb2kb5a32cf52fd0a5bf@mail.gmail.com>
Message-ID: <44158E7C.2000201@tu-ilmenau.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Stuart,

the PIC-Paper [1] has quite some stuff on destructive node-behaviour in
systems with synthetic coordinates.


HTH,

Thorsten

[1]
@inproceedings{ costa04pic,
  author = {M. Costa and M. Castro and A. Rowstron and P. Key},
  title = {{PIC: Practical Internet coordinates for distance  estimation}},
  booktitle = {Proceedings of ICDCS},
  year = {2004},
  month = {March}
}
URL: http://www.cl.cam.ac.uk/Teaching/2003/AdvSysTop/pic.pdf

steve clausensaun schrieb:
> Hi all,
>  
> I'm doing some experiments on a few overlays. Basically, I'm trying to
> see the effect of node misbehavior in landmark-based positioning
> networks such as GNP or NPS.
> Is there any literature, or are there useful links, that I should know about?
> Do you have ideas on behaviors that should be studied?
- --8<---
- --
Dipl.-Inf. Thorsten Strufe         +49(0)3677-694552, CC 440
Fachgebiet Telematik               Institut Praktische Informatik	
Technische Universität Ilmenau     http://www-ia.tu-ilmenau.de/IPI/FGT
Sartre:Begehe keine Dummheit zweimal, die Auswahl ist doch groß genug!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEFY58OQ9e8ojbqFYRAhKzAJ9vJzw0iSI1RxaUNT3Oru+bIbxRaACfTZSY
t0+IHP4Lxnay+NRYVMz6n/Y=
=V+H4
-----END PGP SIGNATURE-----

From ian at locut.us  Mon Mar 13 22:33:45 2006
From: ian at locut.us (Ian Clarke)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <IJECJCHFLLIPKMNGHDPMOENLIGAA.osokin@osokin.com>
References: <441449D4.7070809@math.chalmers.se>
	<IJECJCHFLLIPKMNGHDPMOENLIGAA.osokin@osokin.com>
Message-ID: <823242bd0603131433x4993d916qf9c81eeb317c5dfd@mail.gmail.com>

On 3/12/06, Serguei Osokine <osokin@osokin.com> wrote:
>         By the way, as a practical matter, does Dijjer grow its routing
> tables with network size?

I will take this as Oskar isn't really familiar with the details of
Dijjer (I think he is much more interested in the mathematics rather
than the practicalities of implementation).  Dijjer does not scale its
routing tables with the log of the network size; basically it tries to
make the routing tables as big as possible within a given constraint
(I think the default is 20 connections).  While this means that it
doesn't have log(N) scalability, in practice it doesn't really make
much difference and it means that we don't need to try to calculate
the total network size.

>  Shouldn't be a problem to scale these tables
> as log(N) and have the scalability similar to Chord and other DHTs. If
> you do, you might want to mention it somewhere - I'm fairly sure that
> I'm not the single person dense enough to miss the link between log(N)
> path length and log(N) links per node and to be left with an impression
> that Chord scales better than Dijjer as a result.

That is a good point.  If you would like to add something to Dijjer's
FAQ about it, please feel free (it's a wiki; I would do it myself but
I'm on the road).

Ian.

From caihailong at gmail.com  Thu Mar 16 07:43:35 2006
From: caihailong at gmail.com (Hailong Cai)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] any job opportunities in P2P area?
In-Reply-To: <823242bd0603131433x4993d916qf9c81eeb317c5dfd@mail.gmail.com>
Message-ID: <005a01c648cd$557ae210$cfa45d81@csewang03>

Hi guys,

Just want to know if anybody knows some companies (esp. big ones) hiring people with P2P background?
It seems difficult to find such positions since few big companies do P2P development.  Thanks!

-Hailong


From turbogeek at cluck.com  Thu Mar 16 18:07:20 2006
From: turbogeek at cluck.com (Daniel Brookshier)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] any job opportunities in P2P area?
In-Reply-To: <005a01c648cd$557ae210$cfa45d81@csewang03>
References: <005a01c648cd$557ae210$cfa45d81@csewang03>
Message-ID: <39BF82C8-C85E-4963-9523-A727F300E02A@cluck.com>

I may have one. You need to know and love the evil that is JXTA. I  
usually get called but often do not have the bandwidth to cover. Send  
a resume directly to me and detail your JXTA experience.

On Mar 16, 2006, at 1:43 AM, Hailong Cai wrote:

> Hi guys,
>
> Just want to know if anybody knows some companies (esp. big ones)  
> hiring people with P2P background?
> It seems difficult to find such positions since few big companies  
> do P2P development.  Thanks!
>
> -Hailong
>
>
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers@zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
>


From ossa at math.chalmers.se  Sat Mar 18 13:24:25 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <823242bd0603131433x4993d916qf9c81eeb317c5dfd@mail.gmail.com>
References: <441449D4.7070809@math.chalmers.se>	
	<IJECJCHFLLIPKMNGHDPMOENLIGAA.osokin@osokin.com>
	<823242bd0603131433x4993d916qf9c81eeb317c5dfd@mail.gmail.com>
Message-ID: <441C0A09.6040207@math.chalmers.se>

Ian Clarke wrote:
> On 3/12/06, Serguei Osokine <osokin@osokin.com> wrote:
> 
>>        By the way, as a practical matter, does Dijjer grow its routing
>>tables with network size?
> 
> 
> I will take this as Oskar isn't really familiar with the details of
> Dijjer (I think he is much more interested in the mathematics rather
> than the practicalities of implementation).  Dijjer does not scale its
> routing tables with the log of the network size, basically it tries to
> make the routing tables as big as possible within a given constraint
> (I think the default is 20 connections).  While this means that it
> doesn't have log(N) scalability, in practice it doesn't really make
> much difference and it means that we don't need to try to calculate
> the total network size.

To put actual numbers to this, I did simulations of point to point 
routes using the same algorithm that I think is being used in Dijjer 
(the one described in my thesis, with 20 shortcuts, but without using 
"local" links):

1000    4.4446
2000    5.066
4000    5.7604
8000    6.546755
16000   7.476
32000   8.34637
64000   9.434343
128000  10.747675
256000  12.572815
512000  15.743066

Success-rates were above 99% for all levels. These are just the lengths 
of routes, and do not take things like caching into account (off hand, 
you could probably see each cached document as an extra shortcut, since 
having the document required is the same thing as having an edge to its 
"home").

So at least up to 150,000 nodes or so you should have no problems with 
20 neighbors. Above that one might want to consider having more.
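
A quick way to read the table above is to look at how much the mean
route length grows each time the network doubles. The numbers in the
sketch are copied from the table; the variable names are illustrative.
Pure log(N) growth would add a constant amount per doubling, and
log^2(N) growth would add an amount that rises linearly in log(N).

```python
sizes = [1000, 2000, 4000, 8000, 16000, 32000,
         64000, 128000, 256000, 512000]
mean_hops = [4.4446, 5.066, 5.7604, 6.546755, 7.476, 8.34637,
             9.434343, 10.747675, 12.572815, 15.743066]

# Extra hops gained by each doubling of the network size.
increments = [b - a for a, b in zip(mean_hops, mean_hops[1:])]

for size, inc in zip(sizes[1:], increments):
    print(f"{size:>7}: +{inc:.3f} hops over the previous size")
```

The increments are fairly flat for the smaller networks and grow
quickly over the last few doublings.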

// oskar

From aptgetd at gmail.com  Sun Mar 19 06:43:29 2006
From: aptgetd at gmail.com (noc ops)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Supernodes in the FastTrack network
In-Reply-To: <440DB371.8040702@mungo.dk>
References: <440DB371.8040702@mungo.dk>
Message-ID: <441CFD91.7030400@gmail.com>

Jacob Madsen wrote:
> Hey
> 
> I was reading about FastTrack on wikipedia and checked
> http://www.slyck.com for stats on the network. And according to
> http://www.slyck.com/stats.php there are almost 3 million users at the
> moment.
> Does anyone know of a method to calculate the approx. number of supernodes?
-----------------
I'm interested in knowing if there's a way to filter communication among
super nodes? If so, any insight will be appreciated.


Please advise.


regards,
/virendra

> 
> Thanks!

From saikat at cs.cornell.edu  Sun Mar 19 08:51:05 2006
From: saikat at cs.cornell.edu (Saikat Guha)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Supernodes in the FastTrack network
In-Reply-To: <441CFD91.7030400@gmail.com>
References: <440DB371.8040702@mungo.dk>  <441CFD91.7030400@gmail.com>
Message-ID: <1142758265.12050.11.camel@localhost.localdomain>

On Sat, 2006-03-18 at 22:43 -0800, noc ops wrote:
> I'm interested in knowing if there's a way to filter communication among
> super nodes? If so, any insight will be appreciated.

Depends on your end-goal. If your end-goal is to prevent local clients
from becoming supernodes, but allow them to access external supernodes,
then there are many solutions:
- NAT/firewall the local clients being the extreme
- Drop inbound first-contact packets to local clients
- Deep packet inspection for specific patterns identified by Salman

If your intent is to listen-in on supernode communication, or
traffic-shape it etc., that may be harder.

cheers,
-- 
Saikat
From ossa at math.chalmers.se  Sun Mar 19 15:10:36 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <441C0A09.6040207@math.chalmers.se>
References: <441449D4.7070809@math.chalmers.se>		<IJECJCHFLLIPKMNGHDPMOENLIGAA.osokin@osokin.com>	<823242bd0603131433x4993d916qf9c81eeb317c5dfd@mail.gmail.com>
	<441C0A09.6040207@math.chalmers.se>
Message-ID: <441D746C.7070606@math.chalmers.se>

Oskar Sandberg wrote:
> To put actual numbers to this, I did simulations of point to point 
> routes using the same algorithm that I think is being used in Dijjer 
> (the one described in my thesis, with 20 shortcuts, but without using 
> "local" links):
> 
<>
> 256000  12.572815
> 512000  15.743066

I let this simulation run further, and here are the next results:

Run size: 1024000
(474 of 10000 successful: 0.0474)
(Mean Steps: 471.88397)

This means that the network with 20 links undergoes a phase transition 
between 500 thousand and one million nodes, after which it doesn't work 
at all (this behavior isn't surprising). So if the network should grow 
to over half a million nodes, you definitely need more than 20 edges 
(and caching won't help).

// oskar

From aptgetd at gmail.com  Sun Mar 19 20:50:13 2006
From: aptgetd at gmail.com (noc ops)
Date: Sat Dec  9 22:13:11 2006
Subject: [p2p-hackers] Supernodes in the FastTrack network
In-Reply-To: <1142758265.12050.11.camel@localhost.localdomain>
References: <440DB371.8040702@mungo.dk> <441CFD91.7030400@gmail.com>
	<1142758265.12050.11.camel@localhost.localdomain>
Message-ID: <441DC405.10004@gmail.com>


Saikat Guha wrote:
> On Sat, 2006-03-18 at 22:43 -0800, noc ops wrote:
> 
>>I'm interested in knowing if there's a way to filter communication among
>>super nodes? If so, any insight will be appreciated.
> 
> 
> Depends on your end-goal. If your end-goal is to prevent local clients
> from becoming supernodes, but allow them to access external supernodes,
> then there are many solutions:
--------------
The goal is to block Skype communication (super nodes). Blocking TCP
(authentication) is OK; it's the super nodes that are hard to nail.


> - NAT/firewall the local clients being the extreme
> - Drop inbound first-contact packets to local clients
> - Deep packet inspection for specific patterns identified by Salman
-------------
Salman's paper doesn't address blocking super nodes unless I missed it.


regards,
/virendra

> 
> If your intent is to listen-in on supernode communication, or
> traffic-shape it etc., that may be harder.
> 
> cheers,

From osokin at osokin.com  Sun Mar 19 21:31:15 2006
From: osokin at osokin.com (Serguei Osokine)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <441D746C.7070606@math.chalmers.se>
Message-ID: <IJECJCHFLLIPKMNGHDPMEEFCIHAA.osokin@osokin.com>

On Sunday, March 19, 2006 Oskar Sandberg wrote:
> ...the network with 20 links undergoes a phase transition between
> 500 thousand and one million nodes, after which it doesn't work 
> at all (this behavior isn't surprising).

	It is to me. How come it is significantly more than square, 
and doubling of the number of nodes increases the path by a factor 
of *thirty*?

	If I understand things correctly, Chord, for example, would have
about 19 steps at 512,000 (with 19 links, mind you), and 20 steps at
one million - having the same 20 links that your simulation of Dijjer
does. This result of 472 is *extremely* counterintuitive (I'd expect
to see 19-20 at most), and deserves some explanation. Frankly, in the
absence of other data, a simulation bug seems to be the most likely
explanation. 
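
The figures in the two paragraphs above can be checked directly. The
sketch uses log2(N) as the Chord step count, as the estimate above
does; as far as I know that is Chord's worst-case bound with log2(N)
fingers, and its average is nearer half of that. The two mean route
lengths are the ones reported for the simulation; the variable names
are illustrative.

```python
import math

chord_512k = math.log2(512_000)     # Chord-style bound at 512,000 nodes
chord_1024k = math.log2(1_024_000)  # and at 1,024,000 nodes

mean_512k = 15.743066     # reported mean route length at 512,000 nodes
mean_1024k = 471.88397    # reported mean steps at 1,024,000 nodes
ratio = mean_1024k / mean_512k

print(f"log2(512000)  = {chord_512k:.2f}")
print(f"log2(1024000) = {chord_1024k:.2f}")
print(f"growth in mean steps for one doubling: x{ratio:.1f}")
```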

	Best wishes -
	S.Osokine.
	19 Mar 2006.



From ossa at math.chalmers.se  Mon Mar 20 10:49:22 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <IJECJCHFLLIPKMNGHDPMEEFCIHAA.osokin@osokin.com>
References: <IJECJCHFLLIPKMNGHDPMEEFCIHAA.osokin@osokin.com>
Message-ID: <441E88B2.1050200@math.chalmers.se>

Serguei Osokine wrote:
> On Sunday, March 19, 2006 Oskar Sandberg wrote:
> 
>>...the network with 20 links undergoes a phase transition between
>>500 thousand and one million nodes, after which it doesn't work 
>>at all (this behavior isn't surprising).
> 
> 
> 	It is to me. How come it is significantly more than square, 
> and doubling of the number of nodes increases the path by a factor 
> of *thirty*?

Like I said, it means that the network has undergone a phase transition 
to a completely different behavior. At some point the number of edges 
compared to the size is not sufficient to keep the network connected 
enough, and the whole thing breaks down. Phase transitions are a very 
common phenomenon in the study of random graphs.
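
For readers who have not met such transitions before, the textbook
example is the sudden appearance of a giant component in a plain random
graph once the mean degree passes 1. The sketch below shows that effect
with a union-find structure. It is only an analogy: it is not the
Dijjer model or the simulation discussed here, where the node count
grows while the number of links per node stays fixed. The function name
and parameters are made up for illustration.

```python
import random

def largest_component_fraction(n, mean_degree, seed=0):
    """Add mean_degree * n / 2 uniformly random edges to n isolated nodes
    and return the share of nodes in the largest connected component."""
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _ in range(int(mean_degree * n / 2)):
        a, b = find(rng.randrange(n)), find(rng.randrange(n))
        if a != b:
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a          # attach the smaller tree to the larger
            size[a] += size[b]
    return max(size[find(x)] for x in range(n)) / n

if __name__ == "__main__":
    for c in (0.5, 0.9, 1.1, 1.5, 2.0):
        print(c, round(largest_component_fraction(20000, c), 3))
```

Below a mean degree of 1 the largest component is a vanishing share of
the graph; above it, a single component quickly takes in most nodes.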

> 	If I understand things correctly, Chord, for example, would have
> about 19 steps at 512,000 (with 19 links, mind you), and 20 steps at
> one million - having the same 20 links that your simulation of Dijjer
> does. This result of 472 is *extremely* counterintuitive (I'd expect
> to see 19-20 at most), and deserves some explanation. Frankly, in the
> absence of other data, a simulation bug seems to be the most likely
> explanation. 

No, it isn't a bug. In fact, I was expecting it at some point, I just 
didn't know where for this particular case. You cannot compare this to 
Chord, it is a much richer, more complex, probabilistic model.

// oskar

From Serguei.Osokine at efi.com  Mon Mar 20 17:32:57 2006
From: Serguei.Osokine at efi.com (Serguei Osokine)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42AC1@fcexmb04.efi.internal>

On Monday, March 20, 2006 Oskar Sandberg wrote:
> Phase transitions are a very common phenomenon in the study of 
> random graphs.

	Interesting. Is there any way to predict these things in
advance and stick to the networks that do not have this problem? 
I mean, running the simulations with the number of nodes increasing
all the way into millions is not the best method of assuring the
future network operation. Among other things, your simulation might
simply miss the phase transition for whatever reason.

	With Gnutella, for example, we knew that the problem was there 
two months before the actual meltdown, and simple back of the napkin
calculations were sufficient to see that. And once it was fixed, the
network grew into the millions of nodes without any further issues.
Is this kind of prediction possible here? This phase transition (and
the reason for it) was not intuitively obvious or expected - at 
least for me. In fact, I'd still be hard-pressed to explain it, even
with the knowledge that it is expected.

> No, it isn't a bug. In fact, I was expecting it at some point, 
> I just didn't know where for this particular case. You cannot 
> compare this to Chord, it is a much richer, more complex, 
> probabilistic model.

	So what you're saying is that once Dijjer approaches one million
nodes, it will have a catastrophic meltdown? And this is the way it is
supposed to be? What about Freenet? Will it blow up before reaching 
one million nodes as well? And if yes, can you do something to prevent
it? 

	Best wishes -
	S.Osokine.
	20 Mar 2006.


-----Original Message-----
From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
Behalf Of Oskar Sandberg
Sent: Monday, March 20, 2006 2:49 AM
To: osokin@osokin.com; Peer-to-peer development.
Subject: Re: Dijjer and Freenet (RE: [p2p-hackers] clustering)


Serguei Osokine wrote:
> On Sunday, March 19, 2006 Oskar Sandberg wrote:
> 
>>...the network with 20 links undergoes a phase transition between
>>500 thousand and one million nodes, after which it doesn't work 
>>at all (this behavior isn't surprising).
> 
> 
> 	It is to me. How come it is significantly more than square, 
> and doubling of the number of nodes increases the path by a factor 
> of *thirty*?

Like I said, it means that the network has undergone a phase transition 
to a completely different behavior. At some point the number of edges 
compared to the size is not sufficient to keep the network connected 
enough, and the whole thing breaks down. Phase transitions are a very 
common phenomenon in the study of random graphs.

> 	If I understand things correctly, Chord, for example, would have
> about 19 steps at 512,000 (with 19 links, mind you), and 20 steps at
> one million - having the same 20 links that your simulation of Dijjer
> does. This result of 472 is *extremely* counterintuitive (I'd expect
> to see 19-20 at most), and deserves some explanation. Frankly, in the
> absence of other data, the simulation bug seems to be the most likely
> explanation. 

No, it isn't a bug. In fact, I was expecting it at some point, I just 
didn't know where for this particular case. You cannot compare this to 
Chord, it is a much richer, more complex, probabilistic model.

// oskar
_______________________________________________
p2p-hackers mailing list
p2p-hackers@zgp.org
http://zgp.org/mailman/listinfo/p2p-hackers
_______________________________________________
Here is a web page listing P2P Conferences:
http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences

From ian at locut.us  Mon Mar 20 17:56:48 2006
From: ian at locut.us (Ian Clarke)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42AC1@fcexmb04.efi.internal>
References: <4A60C83D027E224BAA4550FB1A2B120EC42AC1@fcexmb04.efi.internal>
Message-ID: <70BB22B9-3AEC-4CD6-B10A-1DD14DBB4CA0@locut.us>

On 20 Mar 2006, at 09:32, Serguei Osokine wrote:
> 	So what you're saying is that once Dijjer approaches one million
> nodes, it will have a catastrophic meltdown? And this is the way it is
> supposed to be? What about Freenet? Will it blow up before reaching
> one million nodes as well? And if yes, can you do something to prevent
> it?

It means that as these networks grow, we will need to increase the  
number of connections per node, just as Chord and other DHTs do.  The  
difference is that we aren't (currently) doing this automatically  
because we don't want to have to calculate the size of the network on  
the fly, so we just chose a high enough number that will suffice for  
the time being.
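
As a rough illustration of the scaling described above, the sketch
below computes how many distinct links a Chord-style node would keep
at a given network size, assuming the usual ceil(log2 N) rule of
thumb. The function name links_needed is made up for this example; it
is not taken from Chord, Dijjer or Freenet code.

```python
def links_needed(n_nodes: int) -> int:
    """Assumed rule of thumb: ceil(log2 N) links for an N-node ring."""
    if n_nodes < 1:
        raise ValueError("n_nodes must be positive")
    # (N - 1).bit_length() equals ceil(log2 N) for every integer N >= 1,
    # and avoids floating-point rounding near exact powers of two.
    return (n_nodes - 1).bit_length()


if __name__ == "__main__":
    for n in (1_000, 512_000, 1_000_000, 10_000_000):
        print(f"{n:>10} nodes -> {links_needed(n)} links")
```

Under that assumption 512,000 nodes need 19 links and one million
need 20, which matches the figures quoted earlier in the thread.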

Ian.


From Serguei.Osokine at efi.com  Mon Mar 20 18:20:02 2006
From: Serguei.Osokine at efi.com (Serguei Osokine)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42AC4@fcexmb04.efi.internal>

On Monday, March 20, 2006 Ian Clarke wrote:
> It means that as these networks grow, we will need to increase the  
> number of connections per node, just as Chord and other DHTs do. The  
> difference is that we aren't (currently) doing this automatically  
> because we don't want to have to calculate the size of the network 
> on the fly, so we just chose a high enough number that will suffice
> for the time being.

	You mean you did not expect to grow Dijjer and Freenet beyond
512K nodes before you'd have to replace all the client code? With
today's P2P network sizes it might be a good idea to have the code
that would be ready to scale into high millions at least - you never
know when you might need it... :-)

	One more thing - Oskar was running his simulations without all
the practical optimizations that Dijjer uses (caching and all), and
without trying to simulate the content with a different popularity 
(using just random point pairs), if I'm not mistaken. This difference
might also affect the number of nodes where the meltdown happens. And
since there's no good way to simulate either, the network with the
real algorithms and access patterns might fail sooner than the
simulation, so I'd be extra conservative when choosing the number of
links for the real deployment.

	By the way - Oskar, did you grow the network to the next higher
size starting with the previous stable graph, or just create it from
scratch? I thought that the former method might be more stable and
give you a higher failure point - and it would more accurately
reflect the real life situation, too, because the real deployments
tend to grow from the stable base and not just be started from zero
nodes. Do you think this might affect your simulation?

	Best wishes -
	S.Osokine.
	20 Mar 2006.



From agthorr at cs.uoregon.edu  Mon Mar 20 18:29:19 2006
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42AC4@fcexmb04.efi.internal>
References: <4A60C83D027E224BAA4550FB1A2B120EC42AC4@fcexmb04.efi.internal>
Message-ID: <20060320182918.GB5200@cs.uoregon.edu>

On Mon, Mar 20, 2006 at 10:20:02AM -0800, Serguei Osokine wrote:
> 	You mean you did not expect to grow Dijjer and Freenet beyond
> 512K nodes before you'd have to replace all the client code? With
> today's P2P network sizes it might be a good idea to have the code
> that would be ready to scale into high millions at least - you never
> know when you might need it... :-)

Most users upgrade their software within 2 months [1], so replacing
all the client code actually isn't that hard.  I'm assuming the
network is robust enough to keep working if a small percentage of
clients have the old code.

[1] = based on measurements of LimeWire Ultrapeer users.  Amir
H. Rasti, Daniel Stutzbach, Reza Rejaie, "On the Long-term Evolution
of the Two-Tier Gnutella Overlay", to appear at the Global Internet
Symposium 2006.

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon

From gbildson at limepeer.com  Mon Mar 20 18:33:43 2006
From: gbildson at limepeer.com (Greg Bildson)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <20060320182918.GB5200@cs.uoregon.edu>
Message-ID: <EPEJIODJLBDLEHGHIEADAEMBGBAB.gbildson@limepeer.com>

Define "most".  If you mean more than 70% (not sure of the exact
percentage), I would have to disagree.

Thanks
-greg



From coderman at gmail.com  Mon Mar 20 19:15:28 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42AC1@fcexmb04.efi.internal>
References: <4A60C83D027E224BAA4550FB1A2B120EC42AC1@fcexmb04.efi.internal>
Message-ID: <4ef5fec60603201115s2f45af3rbda4182fe28670b1@mail.gmail.com>

On 3/20/06, Serguei Osokine <Serguei.Osokine@efi.com> wrote:
> On Monday, March 20, 2006 Oskar Sandberg wrote:
> > Phase transitions are a very common phenomenon in the study of
> > random graphs.
>
>         Interesting. Is there any way to predict these things in
> advance and stick to the networks that do not have this problem?
> I mean, running the simulations with the number of nodes increasing
> all the way into millions is not the best method of assuring the
> future network operation. Among other things, your simulation might
> simply miss the phase transition for whatever reason.

anyone have further insight on predicting the distribution or nature
of phase transitions in arbitrary graphs?  i haven't turned up any
good papers, but i haven't looked that hard yet either.

in particular i'm interested in the node degree distribution and what
effect this has.  (it seems that some number of nodes of sufficient
degree are needed to achieve state transition in any reasonable sized
graph; sparse and randomly connected nodes are much less likely to
exhibit this behavior.  i'd like to find a formal study of this
interplay)

From ian at locut.us  Mon Mar 20 19:15:16 2006
From: ian at locut.us (Ian Clarke)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42AC4@fcexmb04.efi.internal>
References: <4A60C83D027E224BAA4550FB1A2B120EC42AC4@fcexmb04.efi.internal>
Message-ID: <FDAD31F8-5334-422A-8802-B3DAF1B14975@locut.us>


On 20 Mar 2006, at 10:20, Serguei Osokine wrote:

> On Monday, March 20, 2006 Ian Clarke wrote:
>> It means that as these networks grow, we will need to increase the
>> number of connections per node, just as Chord and other DHTs do. The
>> difference is that we aren't (currently) doing this automatically
>> because we don't want to have to calculate the size of the network
>> on the fly, so we just chose a high enough number that will suffice
>> for the time being.
>
> 	You mean you did not expect to grow Dijjer and Freenet beyond
> 512K nodes

No, this conversation applies to Dijjer; Freenet is a different beast
altogether due to its "darknet" approach.  Dijjer is currently very  
much in beta, hence the relatively small number of links.  Freenet  
allows users to have as many links per-node as they like.

Ian.


From ian at locut.us  Mon Mar 20 19:22:19 2006
From: ian at locut.us (Ian Clarke)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <EPEJIODJLBDLEHGHIEADAEMBGBAB.gbildson@limepeer.com>
References: <EPEJIODJLBDLEHGHIEADAEMBGBAB.gbildson@limepeer.com>
Message-ID: <B5279D37-0566-428F-BA4C-4A691E70C058@locut.us>

I think for the purposes of this conversation, around 70% will be  
more than sufficient (although without further thought I am not sure  
exactly how scalability in a small world network is affected by  
having variable numbers of links among nodes).

Ian.

On 20 Mar 2006, at 10:33, Greg Bildson wrote:

> Define "most".  If you mean more than 70% (not sure of the exact
> percentage), I would have to disagree.
>
> Thanks
> -greg


From ossa at math.chalmers.se  Mon Mar 20 19:26:10 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <4ef5fec60603201115s2f45af3rbda4182fe28670b1@mail.gmail.com>
References: <4A60C83D027E224BAA4550FB1A2B120EC42AC1@fcexmb04.efi.internal>
	<4ef5fec60603201115s2f45af3rbda4182fe28670b1@mail.gmail.com>
Message-ID: <441F01D2.7060708@math.chalmers.se>

coderman wrote:
> On 3/20/06, Serguei Osokine <Serguei.Osokine@efi.com> wrote:
> 
>>On Monday, March 20, 2006 Oskar Sandberg wrote:
>>
>>>Phase transitions are a very common phenomenon in the study of
>>>random graphs.
>>
>>        Interesting. Is there any way to predict these things in
>>advance and stick to the networks that do not have this problem?
>>I mean, running the simulations with the number of nodes increasing
>>all the way into millions is not the best method of assuring the
>>future network operation. Among other things, your simulation might
>>simply miss the phase transition for whatever reason.
> 
> 
> anyone have further insight on predicting the distribution or nature
> of phase transitions in arbitrary graphs?  i haven't turned up any
> good papers, but i haven't looked that hard yet either.

Try this:

http://www.arxiv.org/abs/math.PR/0504589

// oskar

From bob.harris.spamcontrol at gmail.com  Mon Mar 20 20:11:50 2006
From: bob.harris.spamcontrol at gmail.com (Bob Harris)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
Message-ID: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>

Hi everyone,

Having lurked on this list for some time, I discern an interesting
trend. There is a lot of hype around small world networks. They have
a catchy name. And they are easy to code up. But they have terrible
performance. I suspect most people who work on small worlds are
either theoreticians who don't care about performance, or innumerate
people caught up in the hype. Who wants O(log^2 N) performance?
Did I really see simulations talking about 40+ hops? Are these people
serious? Do they not understand the difference between 5 hops and 25?
6 and 36?

Those of you who are puzzled by phase transitions ought to read Karp's
paper "The Transitive Closure of a Random Digraph," Random Structures
and Algorithms, Vol. 1, No. 1 (1990). He shows that you need log N
edges per node on average to keep a random graph connected.  While
at it, one might as well read a recent P2P paper on O(log N),
O(d N^(1/d)) or even O(1) systems. Why would anyone want to
reinvent a crappier wheel?
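
A toy experiment in the spirit of the log N connectivity result,
assuming the simpler undirected model (edges dropped uniformly at
random) rather than the digraphs of Karp's paper: it estimates how
often a random graph with a given mean degree comes out connected.
All names and parameters here are illustrative.

```python
import math
import random


def is_connected(n, edges):
    """Union-find check that the undirected graph on n nodes is connected."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
    return components == 1


def connected_fraction(n, mean_degree, trials=20, seed=0):
    """Fraction of random graphs with this mean degree that are connected."""
    rng = random.Random(seed)
    n_edges = int(mean_degree * n / 2)  # each edge adds 2 to the degree sum
    hits = 0
    for _ in range(trials):
        edges = [(rng.randrange(n), rng.randrange(n)) for _ in range(n_edges)]
        hits += is_connected(n, edges)
    return hits / trials


if __name__ == "__main__":
    n = 1000
    for factor in (0.5, 1.0, 2.0):
        degree = factor * math.log(n)
        print(f"mean degree {degree:5.2f}: connected in "
              f"{connected_fraction(n, degree):.0%} of trials")
```

With natural logarithms the threshold sits near ln N, about 6.9 for
N = 1000; graphs well below it should almost never be connected and
graphs well above it almost always.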

Bob.


From coderman at gmail.com  Mon Mar 20 20:28:08 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
Message-ID: <4ef5fec60603201228n65ec3185uab4eee83762a0ff4@mail.gmail.com>

On 3/20/06, Bob Harris <bob.harris.spamcontrol@gmail.com> wrote:
> Hi everyone,
>
> Having lurked on this list for some time, I discern an interesting
> trend. There is a lot of hype around small world networks. They have
> a catchy name. And they are easy to code up. But they have terrible
> performance...

back in 2000: s/small world/peer to peer/g.  like any fad this has
merit and hyperbole. (as will the next technology/idea, and the next,
etc).


> ... I suspect most people who work on small worlds are
> either theoreticians who don't care about performance, or innumerate
> people caught up in the hype. Who wants O(log^2 N) performance?

this varies a _lot_ based on architecture; besides, not everyone wants
to scale a small world to 500,000,000 users.


> Those of you who are puzzled by phase transitions ought to read Karp's
> paper "The Transitive Closure of a Random Digraph," Random Structures
> and Algorithms, Vol. 1, No. 1 (1990). He shows that you need log N
> edges per node on average to keep a random graph connected.

homogeneous, yes. which is why the paper on inhomogeneous random
graphs is useful.  the real world is not homogeneous...

better understanding of the elements in your decentralized networking
toolkit gives better product.

(that said i do agree that far too many designs overlook the impact of
malicious/coordinated attacks on these fragile overlay/routing
architectures.)

From ian at locut.us  Mon Mar 20 20:42:25 2006
From: ian at locut.us (Ian Clarke)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
Message-ID: <E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>

On 20 Mar 2006, at 12:11, Bob Harris wrote:
> There is a lot of hype around small world networks. They have
> a catchy name. And they are easy to code up. But they have terrible
> performance.

It is rather courageous (or perhaps simply foolish) of you to dismiss
an entire avenue of study so cavalierly; time will tell whether you
are right.

>  Who wants O(log^2 N) performance?

It has already been pointed out that actual route lengths are far  
more important than the order of the route lengths in practical  
networks.  It has also been pointed out that O(log^2 N) performance  
presumes a fixed routing table size, where in most if not all  
practical deployments, routing table sizes are increased with the  
size of the network.
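
To make the routing-table point concrete, here is a small simulation
sketch. It assumes a Kleinberg-style ring in which every node has its
two ring neighbours plus k long-range links whose lengths are drawn
log-uniformly (approximating a 1/d distribution), and it routes
greedily. This is a toy model, not the Dijjer or Freenet algorithm,
and the sizes and seeds are arbitrary.

```python
import math
import random


def ring_distance(a, b, n):
    """Shortest distance between positions a and b on a ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)


def build_links(n, k, rng):
    """Give each node its two ring neighbours plus k long-range links."""
    links = []
    for node in range(n):
        out = {(node - 1) % n, (node + 1) % n}
        for _ in range(k):
            # Log-uniform length in [1, n/2]: short links are more likely.
            dist = int(math.exp(rng.uniform(0.0, math.log(n // 2))))
            sign = rng.choice((-1, 1))
            out.add((node + sign * dist) % n)
        links.append(tuple(out))
    return links


def greedy_hops(links, src, dst, n):
    """Hop count when always forwarding to the neighbour closest to dst."""
    hops = 0
    cur = src
    while cur != dst:
        # A ring neighbour is always one step closer, so this terminates.
        cur = min(links[cur], key=lambda v: ring_distance(v, dst, n))
        hops += 1
    return hops


def mean_hops(n, k, pairs=300, seed=1):
    """Average greedy route length over random source/destination pairs."""
    rng = random.Random(seed)
    links = build_links(n, k, rng)
    total = 0
    for _ in range(pairs):
        src, dst = rng.randrange(n), rng.randrange(n)
        total += greedy_hops(links, src, dst, n)
    return total / pairs


if __name__ == "__main__":
    for k in (0, 1, 4, 8):
        print(f"k={k}: mean hops {mean_hops(1024, k):.1f}")
```

The interesting comparison is how the mean route length falls as k
grows at a fixed network size, which is the effect a fixed-k
O(log^2 N) bound hides.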

> Did I really see simulations talking about 40+ hops?

You might have, but I can't recall any such simulations mentioned in  
this thread.

Ian.


From m.rogers at cs.ucl.ac.uk  Mon Mar 20 21:05:37 2006
From: m.rogers at cs.ucl.ac.uk (Michael Rogers)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <4ef5fec60603201115s2f45af3rbda4182fe28670b1@mail.gmail.com>
References: <4A60C83D027E224BAA4550FB1A2B120EC42AC1@fcexmb04.efi.internal>
	<4ef5fec60603201115s2f45af3rbda4182fe28670b1@mail.gmail.com>
Message-ID: <441F1921.7090205@cs.ucl.ac.uk>

coderman wrote:
> anyone have further insight on predicting the distribution or nature
> of phase transitions in arbitrary graphs?  i haven't turned up any
> good papers, but i haven't looked that hard yet either.

"Giant component" and "percolation" might be good phrases to search for.

Cheers,
Michael
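
For readers following up on those search terms: in the classic
Erdos-Renyi random graph with mean degree c, the fraction s of nodes
in the giant component satisfies s = 1 - exp(-c*s), which has a
nonzero solution only for c > 1. The sketch below solves it by
fixed-point iteration. This is the textbook homogeneous model, offered
only as an illustration; it says nothing specific about the Dijjer
topology, and the function name is made up.

```python
import math


def giant_fraction(mean_degree, iterations=500):
    """Solve s = 1 - exp(-c * s) by fixed-point iteration from s = 1."""
    s = 1.0
    for _ in range(iterations):
        s = 1.0 - math.exp(-mean_degree * s)
    return s


if __name__ == "__main__":
    for c in (0.5, 1.5, 2.0, 4.0):
        print(f"mean degree {c}: giant component fraction "
              f"~ {giant_fraction(c):.3f}")
```

Below c = 1 the iteration collapses to zero (no giant component);
above it the fraction rises quickly, which is the sharp change in
behaviour that "phase transition" refers to in this model.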

From lemonobrien at yahoo.com  Mon Mar 20 21:08:56 2006
From: lemonobrien at yahoo.com (Lemon Obrien)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <4ef5fec60603201228n65ec3185uab4eee83762a0ff4@mail.gmail.com>
Message-ID: <20060320210856.59395.qmail@web53602.mail.yahoo.com>

small world/social networking/web 2.0 .... all hype...all that is happening is the re-emergence of the same patterns throughout systems.



You don't get no juice unless you squeeze
Lemon Obrien, the Third.
From turbogeek at cluck.com  Mon Mar 20 21:14:32 2006
From: turbogeek at cluck.com (Daniel Brookshier)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
Message-ID: <9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>

I'll chime in. In the P2P world, O(log^2 N) may not be efficient, but  
it may be the cheapest in terms of resources. For instance, a walker  
may take a while to find a resource in a small world topology, but it  
expends little effort at each node. Conversely, attaining fewer hops
means spending more resources at each node to index and process
the index queries. There are also ways to use the hubs in such  
networks to greatly improve efficiency.

The small world is also not necessarily the complete network or only  
topology available to an application. The number of hops in a search  
is not the same as the number of hops that may be applied to  
communications. Thus even when one part is inefficient, the other may  
be ideal.



From lemonobrien at yahoo.com  Mon Mar 20 21:28:28 2006
From: lemonobrien at yahoo.com (Lemon Obrien)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
Message-ID: <20060320212828.2813.qmail@web53608.mail.yahoo.com>

it's the communications between the nodes..are they passive, aggressive, how do they communicate with each other...for example...should a relay propagate a stream to more than one node; with video, for instance, you're replicating data over the entire network; but something like a simple connection message should not.

I say screw the math...math can't do real measurement of true animalistic behavior..
anger for example? 'fuzzy'....and how nodes communicate and work together is the same way an organism will work with others (small world) and within itself...as well...
so your network can be whatever...it will ultimately be determined by one's view of the world. and science is a view...for example...i believe in the passive approach...and my protocols reflect that...and if you know small world theory; this could be called the 'weak' connection....which if you know anything about human and social behavior, is the one where information travels most efficiently.
   
  
Daniel Brookshier <turbogeek@cluck.com> wrote:
  I'll chime in. In the P2P world, O(log^2 N) may not be efficient, but 
it may be the cheapest in terms of resources. For instance, a walker 
may take a while to find a resource in a small world topology, but it 
expends little effort at each node. Conversely, to attain fewer hops, 
that also means a larger resource at each node to index and process 
the index queries. There are also ways to use the hubs in such 
networks to greatly improve efficiency.

The small world is also not necessarily the complete network or only 
topology available to an application. The number of hops in a search 
is not the same as a the number of hops that may be applied to 
communications. Thus even when one part is inefficient, the other may 
be ideal.

On Mar 20, 2006, at 2:42 PM, Ian Clarke wrote:

> On 20 Mar 2006, at 12:11, Bob Harris wrote:
>> There is a lot of hype around small world networks. They have
>> a catchy name. And they are easy to code up. But they have terrible
>> performance.
>
> It is rather courageous (or perhaps simply foolish) of you to 
> dismiss an entire avenue of study so cavalierly, time will tell 
> whether you are right.
>
>> Who wants O(log^2 N) performance?
>
> It has already been pointed out that actual route lengths are far 
> more important than the order of the route lengths in practical 
> networks. It has also been pointed out that O(log^2 N) performance 
> presumes a fixed routing table size, where in most if not all 
> practical deployments, routing table sizes are increased with the 
> size of the network.
>
>> Did I really see simulations talking about 40+ hops?
>
> You might have, but I can't recall any such simulations mentioned 
> in this thread.
>
> Ian.
>
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers@zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
>


You don't get no juice unless you squeeze
Lemon Obrien, the Third.
From bob.harris.spamcontrol at gmail.com  Mon Mar 20 21:39:24 2006
From: bob.harris.spamcontrol at gmail.com (Bob Harris)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
Message-ID: <a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>

> In the P2P world, O(log^2 N) may not be efficient, but
> it may be the cheapest in terms of resources.

I like the use of "may" in that sentence. Well, maybe not. Small worlds
typically need O(log N) connections to other nodes to maintain
connectivity. Pastry and Chord use O(log N) connections as well. Small
worlds get O(log^2 N) hop lookups, Pastry and Chord get O(log N).
Remind me how x^2 packets and x^2 processing is better than x.
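
To put rough numbers on that comparison, here is a minimal sketch of the
arithmetic, assuming base-2 logarithms and ignoring constant factors (so
these are orders of growth, not measured hop counts). The function name
hop_estimates is invented for illustration.

```python
import math


def hop_estimates(n):
    """Return (log2 n, (log2 n)^2) as rough hop-count orders for n nodes.

    Constant factors are ignored, so these are growth orders only.
    """
    x = math.log2(n)
    return x, x * x


for n in (32, 1024, 1_000_000):
    x, x2 = hop_estimates(n)
    print(f"N={n:>9}  log N ~ {x:5.1f}  log^2 N ~ {x2:6.1f}")
```

For N = 32 this gives 5 versus 25, and the gap widens as N grows. What it
cannot show is the effect of constant factors or of routing tables that
grow with the network.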

>For instance, a walker
> may take a while to find a resource in a small world topology, but it
> expends little effort at each node.

Same arguments apply to a Chord or Pastry lookup. The bandwidth
spent to retain the ring structure is about the same as the bandwidth
spent to ping neighbors.

>Conversely, to attain fewer hops,
> that also means a larger resource at each node to index and process
> the index queries.

True only in unstructured networks. Indexing and aggregation
of the kind you allude to are not necessary to get better lookup
times in structured p2p.

>There are also ways to use the hubs in such
> networks to greatly improve efficiency.

Same techniques to reduce N apply equally well to other systems.

> The small world is also not necessarily the complete network or only
> topology available to an application. The number of hops in a search
> is not the same as the number of hops that may be applied to
> communications. Thus even when one part is inefficient, the other may
> be ideal.

True, when something doesn't work well, there are other ways around
it. I see nothing but hype in small worlds. Why not add UDDI+WSDL+
web services+social networking and become buzzword complete?

How many here realize that the hype around small worlds was
created mostly by physicists by applying elementary math to a highly
idealized and theoretical problem, with no implementation? Those guys
don't care about performance, just the pretty name. You've all seen
the books, and you know they are devoid of content. 5 is really << 25.
Use the better tool.

Bob.

From bob.harris.spamcontrol at gmail.com  Mon Mar 20 21:46:42 2006
From: bob.harris.spamcontrol at gmail.com (Bob Harris)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <20060320212828.2813.qmail@web53608.mail.yahoo.com>
References: <9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<20060320212828.2813.qmail@web53608.mail.yahoo.com>
Message-ID: <a5b3d0b30603201346k7c56715cu9f58d55b531edc99@mail.gmail.com>

Hi Lemon,

>it will ultimately be determined by one's
> view of the world. and science is a view...

This argument leads to sophism, where every idea (think intelligent
design) is as worthy and good as every other. That's false :-). At
least, in the reality-based community.

>for example...i believe in the
> passive approach...and my protocols reflect that...and if you know small
> world theory; this could be called the 'weak' connection....which if you
> know anything about human and social behavior, is the one where information
> travels most efficiently.

I guess none of us know anything about human and social behavior; we
all thought information travels fastest down the shortest fiber link with
the smallest number of hops.

Frankly, a lot of small world hype relies on hand-wavy allusions to
social behavior. Well, computers aren't people; they can be very
efficient when misapplied analogies do not get in the way.

Bob.



From agthorr at cs.uoregon.edu  Mon Mar 20 21:47:09 2006
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
Message-ID: <20060320214709.GF5200@cs.uoregon.edu>

On Mon, Mar 20, 2006 at 04:39:24PM -0500, Bob Harris wrote:
> I like the use of "may" in that sentence. Well, maybe not. Small worlds
> typically need O(log N) connections to other nodes to maintain
> connectivity. Pastry and Chord use O(log N) connections as well. Small
> worlds get O(log^2 N) hop lookups, Pastry and Chord get O(log N).
> Remind me how x^2 packets and x^2 processing is better than x.

Pastry and Chord *are* small worlds, so I think you're a bit confused.

It sounds like you're generalizing based on some very specific network (which
you haven't named) that happens to also be a small world.

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon

From coderman at gmail.com  Mon Mar 20 21:53:03 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
Message-ID: <4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>

On 3/20/06, Bob Harris <bob.harris.spamcontrol@gmail.com> wrote:
> ...
> Small worlds
> typically need O(log N) connections to other nodes to maintain
> connectivity. Pastry and Chord use O(log N) connections as well. Small
> worlds get O(log^2 N) hop lookups, Pastry and Chord get O(log N).
> Remind me how x^2 packets and x^2 processing is better than x.

there's a trade off here; what good is an O(log N) topology when it's
broken?  resilience is typically better in small worlds precisely
because they are unstructured.  directed/coordinated attacks against
highly structured networks are highly effective (not to mention the
usual intermittent failures, but most of these models assume a random
distribution of failure - this is an invitation to malicious intent)

also consider real world networks where such relationships /
topologies exist: in our wireless networks unidirectional links
present from a 30mW client in a coffee shop vary greatly from links
present on a high power amplified panel array mounted on a tower. 
this type of node disparity is more akin to small world than a highly
structured overlay.

and in practice you'll probably find yourself combining features of
both to make a resilient network that can function efficiently in good
conditions and remain functional in volatile / malicious environments.

just my $0.02

From agthorr at cs.uoregon.edu  Mon Mar 20 21:58:49 2006
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>
Message-ID: <20060320215848.GG5200@cs.uoregon.edu>

On Mon, Mar 20, 2006 at 01:53:03PM -0800, coderman wrote:
> there's a trade off here; what good is an O(log N) topology when it's
> broken?  resilience is typically better in small worlds precisely
> because they are unstructured.  

Resiliency depends on other features of the graph.  If the small world
is also a power-law graph, for example, it is vulnerable to attacks on
the high-degree peers.  The fact that the graph is a small world is
irrelevant.
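
A rough way to see the effect is to grow a power-law-ish graph by
preferential attachment, then compare removing the highest-degree peers
with removing the same number of random peers. This is only a sketch
under stated assumptions: the generator, the 10% removal fraction, and
all function names are made up for illustration and do not model any
particular deployed network.

```python
import random
from collections import deque


def preferential_attachment(n, m, rng):
    """Grow a graph where each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    adj = {i: set() for i in range(n)}
    # Seed with a small clique of m + 1 nodes (each has degree m).
    for i in range(m + 1):
        for j in range(i):
            adj[i].add(j)
            adj[j].add(i)
    # Every node appears in this list once per edge endpoint it owns.
    endpoints = [v for v in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints.extend((new, t))
    return adj


def largest_component(adj, removed):
    """Size of the largest connected component after deleting `removed`."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


rng = random.Random(1)
g = preferential_attachment(2000, 2, rng)
k = 200  # remove 10% of the nodes
by_degree = sorted(g, key=lambda v: len(g[v]), reverse=True)[:k]
at_random = rng.sample(sorted(g), k)
print("largest component, targeted removal:", largest_component(g, by_degree))
print("largest component, random removal:  ", largest_component(g, at_random))
```

The comparison is about the degree distribution, not about path lengths,
which is the point being made above.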

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon

From coderman at gmail.com  Mon Mar 20 22:00:50 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <20060320214709.GF5200@cs.uoregon.edu>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<20060320214709.GF5200@cs.uoregon.edu>
Message-ID: <4ef5fec60603201400h48a7bba7mffdc2220736914d5@mail.gmail.com>

On 3/20/06, Daniel Stutzbach <agthorr@cs.uoregon.edu> wrote:
> ...
> Pastry and Chord *are* small worlds, so I think you're a bit confused.

i should distinguish between unstructured small worlds (what i've been
calling small worlds) and highly structured overlay small worlds
(CAN/Chord/Pastry/etc).

where's Zooko's p2p ontology page? :)

From agthorr at cs.uoregon.edu  Mon Mar 20 22:09:43 2006
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <4ef5fec60603201400h48a7bba7mffdc2220736914d5@mail.gmail.com>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<20060320214709.GF5200@cs.uoregon.edu>
	<4ef5fec60603201400h48a7bba7mffdc2220736914d5@mail.gmail.com>
Message-ID: <20060320220942.GH5200@cs.uoregon.edu>

On Mon, Mar 20, 2006 at 02:00:50PM -0800, coderman wrote:
> i should distinguish between unstructured small worlds (what i've been
> calling small worlds) and highly structured overlay small worlds
> (CAN/Chord/Pastry/etc).
> 
> where's Zooko's p2p ontology page? :)

That's an important distinction.  I'd suggest you say "unstructured"
instead of "small world" then, because the Chord, Pastry, and company
are more small-world-ish than unstructured overlays like Gnutella.

(specifically because they have a much higher clustering coefficient
than Gnutella)
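
For anyone unfamiliar with the metric, here is a minimal sketch of the
average local clustering coefficient. The two toy graphs are assumptions
chosen for illustration: a ring lattice is only a loose stand-in for the
successor-list part of a Chord-like ring (it ignores the long-range
fingers), and the random graph is a hypothetical stand-in for an
unstructured overlay, not a model of Gnutella.

```python
import random


def clustering_coefficient(adj):
    """Average local clustering coefficient: for each node, the fraction
    of pairs of its neighbors that are themselves connected."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # contributes 0 for this node
        nb = list(nbrs)
        links = sum(1 for i in range(k) for j in range(i + 1, k)
                    if nb[j] in adj[nb[i]])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


def ring_lattice(n, half):
    """Each node links to its `half` nearest neighbors on each side."""
    return {v: {(v + d) % n for d in range(-half, half + 1) if d != 0}
            for v in range(n)}


def random_graph(n, degree, rng):
    """Undirected graph in which each node picks `degree` random peers."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for w in rng.sample(range(n), degree):
            if w != v:
                adj[v].add(w)
                adj[w].add(v)
    return adj


rng = random.Random(3)
print("ring lattice:", clustering_coefficient(ring_lattice(1000, 2)))
print("random graph:", clustering_coefficient(random_graph(1000, 2, rng)))
```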

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon

From coderman at gmail.com  Mon Mar 20 22:16:07 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <20060320215848.GG5200@cs.uoregon.edu>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>
	<20060320215848.GG5200@cs.uoregon.edu>
Message-ID: <4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>

On 3/20/06, Daniel Stutzbach <agthorr@cs.uoregon.edu> wrote:
> ...
> Resiliency depends on other features of the graph.  If the small world
> is also a power-law graph, for example, it is vulnerable to attacks on
> the high-degree peers.  The fact that the graph is a small world is
> irrelevant.

true. thanks for mentioning this as the high degree node attacks are
indeed a great way to cripple any of the power law networks.

i suppose this highlights a prejudice of mine: that nodes within
unstructured graphs scale according to capability (and thus high
degrees and power laws emerge from aggregate node behavior) and this
in turn is more resilient than highly structured networks which assign
identifier space in a much more homogeneous and fragile manner.

From agthorr at cs.uoregon.edu  Mon Mar 20 22:25:38 2006
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>
	<20060320215848.GG5200@cs.uoregon.edu>
	<4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>
Message-ID: <20060320222537.GI5200@cs.uoregon.edu>

On Mon, Mar 20, 2006 at 02:16:07PM -0800, coderman wrote:
> i suppose this highlights a prejudice of mine: that nodes within
> unstructured graphs scale according to capability (and thus high
> degrees and power laws emerge from aggregate node behavior) and this
> in turn is more resilient than highly structured networks which assign
> identifier space in a much more homogeneous and fragile manner.

I think the key difference is how the search/lookup is conducted, and
not the structure of the graph.  If you use flooding over a DHT, the
DHT will be just as resilient.  The resiliency difficulties of DHTs
are a result of wanting the additional constraint that you need the
DHT-style lookup to go to one particular node.  There really isn't any
risk of the graph fragmenting into tiny pieces.
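
As a rough illustration of flooding over DHT-style links, the sketch
below floods a query over an idealized, fully populated finger table
(node i links to i + 2^j) after a random 30% of nodes have failed. The
link layout, the failure rate, and the names chord_like_links and flood
are assumptions for illustration; this is not a real Chord
implementation.

```python
import random
from collections import deque


def chord_like_links(bits):
    """Node i links to i + 2^j (mod 2^bits) for each j < bits."""
    n = 1 << bits
    return {i: [(i + (1 << j)) % n for j in range(bits)] for i in range(n)}


def flood(links, start, alive, ttl):
    """Breadth-first flood from `start`; returns the set of live nodes
    reached within `ttl` hops. Dead nodes neither receive nor forward."""
    reached = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == ttl:
            continue
        for peer in links[node]:
            if peer in alive and peer not in reached:
                reached.add(peer)
                frontier.append((peer, hops + 1))
    return reached


rng = random.Random(7)
links = chord_like_links(10)  # 1024 nodes
alive = {v for v in links if rng.random() > 0.3}  # roughly 30% fail
alive.add(0)  # keep the querying node alive
reached = flood(links, 0, alive, ttl=20)
print(f"{len(reached)} of {len(alive)} live nodes reached")
```

The flood does not care which particular node is responsible for a key,
which is the constraint that makes the routed DHT lookup more fragile.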

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon

From lemonobrien at yahoo.com  Mon Mar 20 22:30:01 2006
From: lemonobrien at yahoo.com (Lemon Obrien)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <a5b3d0b30603201346k7c56715cu9f58d55b531edc99@mail.gmail.com>
Message-ID: <20060320223001.66458.qmail@web53614.mail.yahoo.com>

>> computers aren't people;
   
  yeah...but people make them; design software for them; and use them...and without people, there would be no need for them.
   
  >>information travels fastest down the shortest fiber link with
the smallest number of hops
   
  you may find this is not always the case either, especially with streams of data, or types of data where delivery may be the most important need. You're looking at the problem from a point of view where your environment is 'pure' and simple. Look at TCP...the receiver sends an 'ack' back to the sender for the data it gets...why? because it was designed that way...and designed to adapt....but also designed from the sender's point of view. If you design from the perspective of the receiver...then the protocol will be totally different.
   
  your perspective shapes your math...as does being human.


You don't get no juice unless you squeeze
Lemon Obrien, the Third.
From ossa at math.chalmers.se  Mon Mar 20 22:42:40 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet
	(RE:	[p2p-hackers] clustering))
In-Reply-To: <20060320222537.GI5200@cs.uoregon.edu>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>	<20060320215848.GG5200@cs.uoregon.edu>	<4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>
	<20060320222537.GI5200@cs.uoregon.edu>
Message-ID: <441F2FE0.5090702@math.chalmers.se>

Daniel Stutzbach wrote:
> I think the key difference is how the search/lookup is conducted, and
> not the structure of the graph.  If you use flooding over a DHT, the
> DHT will be just as resilient.  The resiliency difficulties of DHTs
> are a result of wanting the additional constraint that you need the
> DHT-style lookup to go to one particular node.  There really isn't any
> risk of the graph fragmenting into tiny pieces.

The point with the type of networks discussed earlier in this thread
(which do not fall into the category of structured DHT nor flooding an
unstructured network) is that they offer a middle-ground. Look-ups are
routed queries, but they need not necessarily go to one particular node
(there only needs to be a high probability that two queries for the same
thing will intersect), and nodes do not have any "canonical neighbors"
(*) in the graph. Also, because the network is formed dynamically from
the normal procedure of routing, no special join or leave procedures are
necessary.

Structured DHT networks are brittle. If a Chord node, for instance, for
some reason thinks that a node on the other side of the ring is actually
its closest neighbor, bad things will happen, and in theory it could
cause the whole network to degenerate. Similar scenarios are possible
for other structured DHTs. The randomly-structured (for lack of a better
word) small-world networks OTOH can live through a complete netsplit and
just keep ticking, without any special adjustment needing to be made.

// oskar

(*) By canonical neighbor I mean a neighbor that the node must know
based on their respective IDs.

From alenlpeacock at gmail.com  Mon Mar 20 23:05:27 2006
From: alenlpeacock at gmail.com (Alen Peacock)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <20060320215848.GG5200@cs.uoregon.edu>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>
	<20060320215848.GG5200@cs.uoregon.edu>
Message-ID: <ffe450f90603201505h49bf524l7d375a4b5bcb36e9@mail.gmail.com>

On 3/20/06, Daniel Stutzbach <agthorr@cs.uoregon.edu> wrote:
>
> Resiliency depends on other features of the graph.  If the small world
> is also a power-law graph, for example, it is vulnerable to attacks on
> the high-degree peers.  The fact that the graph is a small world is
> irrelevant.


Speaking of which, I found this quite interesting: "The Topology of
Covert Conflict",
http://www.cl.cam.ac.uk/TechReports/UCAM-CL-TR-637.pdf

(quick and incomplete summary: uses evolutionary game theory to
analyze the effectiveness of different strategies for dealing with
vertex-order attacks in scale-free networks.)

Alen

From agthorr at cs.uoregon.edu  Mon Mar 20 23:11:39 2006
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet
	(RE:	[p2p-hackers] clustering))
In-Reply-To: <441F2FE0.5090702@math.chalmers.se>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>
	<20060320215848.GG5200@cs.uoregon.edu>
	<4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>
	<20060320222537.GI5200@cs.uoregon.edu>
	<441F2FE0.5090702@math.chalmers.se>
Message-ID: <20060320231139.GJ5200@cs.uoregon.edu>

On Mon, Mar 20, 2006 at 11:42:40PM +0100, Oskar Sandberg wrote:
> The point with the type of networks discussed earlier in this thread
> (which do not fall into the category of structured DHT nor flooding an
> unstructured network) is that they offer a middle-ground. Look-ups are
> routed queries, but they need not necessarily go to one particular node
> (there only needs to be a high probability that two queries for the same
> thing will intersect), and nodes do not have any "canonical neighbors"
> (*) in the graph.

I believe all major DHTs have eventually adopted the policy that
look-ups do not necessarily go to one particular node, there just
needs to be a high probability that two queries for the same thing
will intersect.  Really there's no way to guarantee anything else :)

Most DHTs that I'm aware of use some sort of replication in the
publishing process, either by publishing to several nodes in the
neighborhood of the canonical node, or by publishing to several
canonical nodes (such as the one closest to the target, and the one
closest to the bitwise-inverse of the target).
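
The two placement strategies can be sketched as follows. The XOR
distance metric (as used by Kademlia), the 4-bit ID space in the example,
and every function name are assumptions made for illustration; the
message does not say which metric or ID width a given DHT uses.

```python
def xor_distance(a, b):
    """Kademlia-style distance between two integer IDs."""
    return a ^ b


def bitwise_inverse(key, bits):
    """Flip every bit of `key` within a `bits`-wide ID space."""
    return key ^ ((1 << bits) - 1)


def closest_nodes(node_ids, target, k):
    """The k node IDs nearest to `target` under XOR distance."""
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:k]


def neighborhood_replicas(node_ids, key, k):
    """Strategy 1: publish to the k nodes nearest the key."""
    return closest_nodes(node_ids, key, k)


def inverse_replicas(node_ids, key, bits):
    """Strategy 2: publish to the node nearest the key and to the node
    nearest the key's bitwise inverse."""
    return [closest_nodes(node_ids, key, 1)[0],
            closest_nodes(node_ids, bitwise_inverse(key, bits), 1)[0]]


nodes = [1, 4, 7, 10, 13]  # hypothetical 4-bit node IDs
print(neighborhood_replicas(nodes, 5, 3))
print(inverse_replicas(nodes, 5, 4))
```

Either way, a lookup only has to hit one of the replicas, so a single
missing or misbehaving node does not make the key unreachable.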

I'm not saying that having a middle ground is bad, though. :)

> Structured DHT networks are brittle. 

Kad and OverNet seem to be doing okay.  They have around 1 million and
half a million nodes, respectively.  I imagine the BitTorrent DHT is
pretty big by now, too, but I haven't played with it yet.

> If a Chord node, for instance, for some reason thinks that a node on
> the other side of the ring is actually its closest neighbor, bad
> things will happen, and in theory it could cause the whole network
> to degenerate.  Similar scenarios are possible for other structured
> DHTs.

That wouldn't be a problem in Kademlia-based DHTs (nor in other DHTs
that use parallel queries).  If there's a problem, it routes around
it.

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon

From coderman at gmail.com  Mon Mar 20 23:40:48 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <ffe450f90603201505h49bf524l7d375a4b5bcb36e9@mail.gmail.com>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>
	<20060320215848.GG5200@cs.uoregon.edu>
	<ffe450f90603201505h49bf524l7d375a4b5bcb36e9@mail.gmail.com>
Message-ID: <4ef5fec60603201540l2df64420v6d8eb887a871d828@mail.gmail.com>

On 3/20/06, Alen Peacock <alenlpeacock@gmail.com> wrote:
> ...
> Speaking of which, I found this quite interesting: "The Topology of
> Covert Conflict",
> http://www.cl.cam.ac.uk/TechReports/UCAM-CL-TR-637.pdf

an interesting paper! the 'dining steganographers' defence strategy is
widely applicable.  it's also interesting to note that in the
telecommunications realm rapid repair (direct replenishment, not given
as a strategy?) is currently the method of choice. (for example,
at&t's network disaster recovery program
http://www.att.com/ndr/team_equipment.html )

carriers have also been diversifying routes but this is a slow and
expensive process due to the cost of buried infrastructure. 
fortunately the major threats right now are not intentional directed
attacks but random failures / cataclysms.

From bob.harris.spamcontrol at gmail.com  Mon Mar 20 23:55:45 2006
From: bob.harris.spamcontrol at gmail.com (Bob Harris)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <20060320220942.GH5200@cs.uoregon.edu>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<20060320214709.GF5200@cs.uoregon.edu>
	<4ef5fec60603201400h48a7bba7mffdc2220736914d5@mail.gmail.com>
	<20060320220942.GH5200@cs.uoregon.edu>
Message-ID: <a5b3d0b30603201555q7ae967b6o4a76fcb7547c19e1@mail.gmail.com>

I thought we went through this fruitless discussion on what
constitutes a small world already. I use the term in the same sense as
Barabasi, Strogatz and Kleinberg. Kleinberg is the only one who has a
tight, formal definition (albeit in a highly idealized grid), but if
that's too restrictive for your taste, I'm happy to include any
unstructured overlay where the edges are selected at random and the
number of edges per node is relatively small.

I didn't know what a clustering coefficient was. I looked it up and it
seems like a pointless metric. If you are lumping CAN/Chord/Pastry
with Gnutella/Freenet, you are doing something wrong.

Bob.

On 3/20/06, Daniel Stutzbach <agthorr@cs.uoregon.edu> wrote:
> On Mon, Mar 20, 2006 at 02:00:50PM -0800, coderman wrote:
> > i should distinguish between unstructured small worlds (what i've been
> > calling small worlds) and highly structured overlay small worlds
> > (CAN/Chord/Pastry/etc).
> >
> > where's Zooko's p2p ontology page? :)
>
> That's an important distinction.  I'd suggest you say "unstructured"
> instead of "small world" then, because the Chord, Pastry, and company
> are more small-world-ish than unstructured overlays like Gnutella.
>
> (specifically because they have a much higher clustering coefficient
> than Gnutella)
>
> --
> Daniel Stutzbach                           Computer Science Ph.D Student
> http://www.barsoom.org/~agthorr                     University of Oregon
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers@zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
>

From agthorr at cs.uoregon.edu  Tue Mar 21 02:53:57 2006
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec  9 22:13:11 2006
Subject: Dijjer and Freenet (RE: [p2p-hackers] clustering)
In-Reply-To: <EPEJIODJLBDLEHGHIEADAEMBGBAB.gbildson@limepeer.com>
References: <20060320182918.GB5200@cs.uoregon.edu>
	<EPEJIODJLBDLEHGHIEADAEMBGBAB.gbildson@limepeer.com>
Message-ID: <20060321025356.GS5200@cs.uoregon.edu>

Here's the data we've collected, covering October 2004 through January 2006:

http://nazanin.ir/amir/Versions-nolog-norm.png

My original statement was a little misleading.  I don't have any way
to observe how often users *upgrade*; we can only measure what
ultrapeers are currently running.  Presumably, new users download the
latest version.

Also, now that I look at the figure again, I see that it's more than
"a small percentage" still running the old version after 2 months.
However, new versions do seem to consistently get more than 50% market
share within 2 months.

I should also add that by "version", I'm ignoring the third value in
the version number (mostly because including those would make the
graph impossible to read :) ).

On Mon, Mar 20, 2006 at 01:33:43PM -0500, Greg Bildson wrote:
> Define "most".  If you mean more than 70% (not sure of the exact
> percentage), I would have to disagree.
> 
> Thanks
> -greg
> 
> > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> > Behalf Of Daniel Stutzbach
> > Sent: Monday, March 20, 2006 1:29 PM
> > To: Peer-to-peer development.
> > Subject: Re: Dijjer and Freenet (RE: [p2p-hackers] clustering)
> >
> >
> > On Mon, Mar 20, 2006 at 10:20:02AM -0800, Serguei Osokine wrote:
> > > 	You mean you did not expect to grow Dijjer and Freenet beyond
> > > 512K nodes before you'd have to replace all the client code? With
> > > today's P2P network sizes it might be a good idea to have the code
> > > that would be ready to scale into high millions at least - you never
> > > know when you might need it... :-)
> >
> > Most users upgrade their software within 2 months [1], so replacing
> > all the client code actually isn't that hard.  I'm assuming the
> > network is robust enough to keep working if a small percentage of
> > clients have the old code.
> >
> > [1] = based on measurements of LimeWire Ultrapeer users.  Amir
> > H. Rasti, Daniel Stutzbach, Reza Rejaie, "On the Long-term Evolution
> > of the Two-Tier Gnutella Overlay", to appear at the Global Internet
> > Symposium 2006.
> >
> 
> 

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon

From ossa at math.chalmers.se  Tue Mar 21 07:30:03 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet
	(RE:	[p2p-hackers] clustering))
In-Reply-To: <a5b3d0b30603201555q7ae967b6o4a76fcb7547c19e1@mail.gmail.com>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>	<20060320214709.GF5200@cs.uoregon.edu>	<4ef5fec60603201400h48a7bba7mffdc2220736914d5@mail.gmail.com>	<20060320220942.GH5200@cs.uoregon.edu>
	<a5b3d0b30603201555q7ae967b6o4a76fcb7547c19e1@mail.gmail.com>
Message-ID: <441FAB7B.6040808@math.chalmers.se>

Bob Harris wrote:
> I thought we went through this fruitless discussion on what
> constitutes a small world already. I use the term in the same sense as
> Barabasi, Strogatz and Kleinberg. Kleinberg is the only one who has a
> tight, formal definition (albeit in a highly idealized grid), but if
> that's too restrictive for your taste,

The only thing these three models have in common is the small diameter
and high clustering.
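
For anyone who has not met the second of those properties, here is a
minimal Python sketch of the local clustering coefficient: the fraction
of a node's neighbour pairs that are themselves linked. The ring-lattice
helper and every name in it are illustrative assumptions, not code from
any of the papers mentioned in this thread.

```python
from itertools import combinations


def ring_lattice(n, k):
    """Undirected ring of n nodes, each linked to its k nearest nodes
    (k/2 on each side). Returns an adjacency dict of sets."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def clustering(adj, v):
    """Local clustering coefficient of node v: linked neighbour pairs
    divided by all neighbour pairs."""
    nbrs = adj[v]
    if len(nbrs) < 2:
        return 0.0
    pairs = list(combinations(nbrs, 2))
    linked = sum(1 for a, b in pairs if b in adj[a])
    return linked / len(pairs)


def average_clustering(adj):
    """Mean of the local coefficients over all nodes."""
    return sum(clustering(adj, v) for v in adj) / len(adj)
```

On a ring lattice with k = 4, three of the six neighbour pairs of any
node are linked, so the coefficient is 0.5; a sparse graph with edges
chosen uniformly at random would typically score far lower.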

> I'm happy to include any
> unstructured overlay where the edges are selected at random and the
> number of edges per node is relatively small in number.

Small-world network is a graph-theoretic term; it has nothing to do with
overlays. You can use these terms to describe the graph formed by an
overlay, regardless of whether it is structured or not.

> I didn't know what a clustering coefficient was. I looked it up and it
> seems like a pointless metric. If you are lumping CAN/Chord/Pastry
> with Gnutella/Freenet, you are doing something wrong.

Just because a term applies to two things doesn't mean they are the
same. I would "lump" them all as distributed. What is wrong with that?

Where would you place "Symphony" [1] btw? A very structured DHT, but
based directly on Kleinberg's model.

// oskar

[1] http://www-db.stanford.edu/~manku/papers/03usits-symphony/

From ossa at math.chalmers.se  Tue Mar 21 08:10:23 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and
	Freenet	(RE:	[p2p-hackers] clustering))
In-Reply-To: <20060320231139.GJ5200@cs.uoregon.edu>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>	<20060320215848.GG5200@cs.uoregon.edu>	<4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>	<20060320222537.GI5200@cs.uoregon.edu>	<441F2FE0.5090702@math.chalmers.se>
	<20060320231139.GJ5200@cs.uoregon.edu>
Message-ID: <441FB4EF.8070605@math.chalmers.se>

Daniel Stutzbach wrote:
> On Mon, Mar 20, 2006 at 11:42:40PM +0100, Oskar Sandberg wrote:
>>Structured DHT networks are brittle. 
> 
> 
> Kad and OverNet seem to be doing okay.  They have around 1 million and
> half a million nodes, respectively.  I imagine the BitTorrent DHT is
> pretty big by now, too, but I haven't played with it yet.

Is there any more detailed information about these networks somewhere? I
have played with the former a few times, and found that while it seems
to work for its purpose, it does seem to give pretty strange and not
always consistent results. I was under the impression that it used only
a subset of stable users in the actual grid, though, so a million and a
half users seems large. (Haven't most users been scared off these
networks?)

What surprises me most with these networks, actually, is that they
haven't been taken down by DoS attacks. Saying this isn't an attack on
any particular system, since every system I have seen is vulnerable to
the point where it ought to take only one sufficiently motivated script
kiddie (/ national organization) with a few hundred purposely broken nodes.

>>If a Chord node, for instance, for some reason thinks that a node on
>>the other side of the ring is actually its closest neighbor, bad
>>things will happen, and in theory it could cause the whole network
>>to degenerate.  Similar scenarios are possible for other structured
>>DHTs.
> 
> 
> That wouldn't be a problem in Kademlia-based DHTs (nor in other DHTs
> that use parallel queries).  If there's a problem, it routes around
> it.

No, you can have similar inconsistencies that would cause
hypercube-based systems like Kademlia to degenerate. Parallel queries can only
offset such problems a little, unless the network updates itself based
on which queries work well.

You can make a hypercube-based mesh more flexible if you start being
very lax about having the "correct" neighbors and just ask nodes to try
to match the levels as well as possible. But the result is essentially
the same thing as Kleinberg's model (though since you are using bitwise
distance it will be more reminiscent of [1] than the geographic papers).

// oskar

[1] http://privacy.cs.cmu.edu/dataprivacy/papers/socialnetworks/40.pdf

From agthorr at cs.uoregon.edu  Tue Mar 21 08:50:47 2006
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and
	Freenet	(RE:	[p2p-hackers] clustering))
In-Reply-To: <441FB4EF.8070605@math.chalmers.se>
References: <E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>
	<20060320215848.GG5200@cs.uoregon.edu>
	<4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>
	<20060320222537.GI5200@cs.uoregon.edu>
	<441F2FE0.5090702@math.chalmers.se>
	<20060320231139.GJ5200@cs.uoregon.edu>
	<441FB4EF.8070605@math.chalmers.se>
Message-ID: <20060321085046.GW5200@cs.uoregon.edu>

On Tue, Mar 21, 2006 at 09:10:23AM +0100, Oskar Sandberg wrote:
> Daniel Stutzbach wrote:
> > Kad and OverNet seem to be doing okay.  They have around 1 million and
> > half a million nodes, respectively.  I imagine the BitTorrent DHT is
> > pretty big by now, too, but I haven't played with it yet.
> 
> Is there any more detailed information about these networks
> somewhere?

They're all based on Kademlia, for which the original paper is:

Petar Maymounkov and David Mazieres, "Kademlia: A Peer-to-peer
Information System Based on the XOR Metric", IPTPS, 2002,
http://www.scs.cs.nyu.edu/~dm/papers/maymounkov:kademlia.ps.gz

I did a measurement study based on Kad, which you can find here:

Daniel Stutzbach and Reza Rejaie, "Improving Lookup Performance over a
Widely-Deployed DHT", to appear at INFOCOM, 2006,
http://www.barsoom.org/~agthorr/papers/infocom-2006-kad.pdf

Other than that, I'm not aware of much in the way of documentation of
the particular implementations.  Just source code.

> I have played with the former a few times, and found that while it
> seems to work for its purpose, it does seem to give pretty strange
> and not always consistent results.

Yeah, it's a little buggy.  As I understand it, the guy who developed
the Kad code for eMule left the eMule team before he finished
debugging so it's functional but not a sleek machine.  I think in the
past few months someone there has started digging into it.

> I was under the impression that it used only a subset of stable
> users in the actual grid, though, so a million and a half users
> seems large.

Yes, only non-NATed peers are in the actual grid.  Meaning there's a
million users in the Kad DHT, plus a whole bunch of additional users
who do lookups over Kad but don't route.

> (Haven't most users been scared off these networks?)

Rumors of the demise of P2P file-sharing have been greatly exaggerated. :)

> What surprises me most with these networks, actually, is that they
> haven't been taken down by DoS attacks. Saying this isn't an attack on
> any particular system, since every system I have seen is vulnerable to
> the point where it ought to take only one sufficiently motivated script
> kiddie (/ national organization) with a few hundred purposely broken nodes.

I spent a lot of time staring at the Kad source-code and poking and
prodding it with various home-brewed measurement tools.  The
conclusion I came to was this:

    The Kad implementation had many bugs and the bugs were hard to
    find, so they made the system so robust that it continues to
    operate even though it is full of bugs.

    However, it is not very efficient: every node has around 700 neighbors.

So I don't think a few hundred purposely broken nodes would change
much. :)

You should read the original Kademlia paper.  I'd be interested to
know if you'd think it'd be as vulnerable as you describe.

> > That wouldn't be a problem in Kademlia-based DHTs (nor in other DHTs
> > that use parallel queries).  If there's a problem, it routes around
> > it.
> 
> No, you can have similar inconsistencies that would cause hypercube
> based systems like Kademlia to degenerate. Parallel queries can only
> offset such problems a little, unless the network updates itself based
> on which queries work well.

Kademlia isn't hypercube-based, that's CAN.  Kademlia does
prefix-matching like Pastry, though that's not obvious from the
Kademlia paper.  The "XOR metric" is just a clever way of saying "find
me the highest order non-matching bit".
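
A small Python sketch of that equivalence, assuming fixed-width integer
node IDs. The function names and the 8-bit width are illustrative only;
they are not taken from the Kademlia paper or from any implementation.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance between two IDs."""
    return a ^ b


def highest_differing_bit(a: int, b: int) -> int:
    """Index (0 = least significant) of the highest-order bit in which
    a and b differ, or -1 if the IDs are equal."""
    return (a ^ b).bit_length() - 1


def common_prefix_len(a: int, b: int, bits: int = 8) -> int:
    """Number of leading bits a and b share in a bits-wide ID space."""
    return bits - (a ^ b).bit_length()


def closest_first(target: int, ids):
    """Sort IDs by XOR distance to target. An ID sharing a longer prefix
    with target always sorts before one sharing a shorter prefix."""
    return sorted(ids, key=lambda i: xor_distance(target, i))
```

For example, 0b10110000 and 0b10100000 share the 3-bit prefix 101 and
first differ at bit index 4, which is exactly what the magnitude of
their XOR (0b00010000) encodes.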

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon

From ossa at math.chalmers.se  Tue Mar 21 11:25:08 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer
	and	Freenet	(RE:	[p2p-hackers] clustering))
In-Reply-To: <20060321085046.GW5200@cs.uoregon.edu>
References: <E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>	<20060320215848.GG5200@cs.uoregon.edu>	<4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>	<20060320222537.GI5200@cs.uoregon.edu>	<441F2FE0.5090702@math.chalmers.se>	<20060320231139.GJ5200@cs.uoregon.edu>	<441FB4EF.8070605@math.chalmers.se>
	<20060321085046.GW5200@cs.uoregon.edu>
Message-ID: <441FE294.1020609@math.chalmers.se>

Daniel Stutzbach wrote:
> I spent a lot of time staring at the Kad source-code and poking and
> prodding it with various home-brewed measurement tools.  The
> conclusion I came to was this:
> 
>     The Kad implementation had many bugs and the bugs were hard to
>     find, so they made the system so robust that it continues to
>     operate even though it is full of bugs.
> 
>     However, it is not very efficient: every node has around 700 neighbors.

I think this underlines the point that a simple, less structured
solution has value. If the people who added DHT lookups to eMule had
asked me for advice, I would have said:

1) Have each node pick an identity at random when they join, and have
them connect to any k other nodes.
2) Every once in a while have nodes replace one of their current
neighbors with one that is the destination (i.e. the closest matching
identity to the search that can be found) of a query they received.
3) Do greedy routing.

If a node quits, it just quits, and its neighbors just fill their slots
like in step 2. Store data by routing to the key value and having nodes
cache it along the route (perhaps being more likely to cache if their
identity is close to the key) and cache data from queries when they return.

It is that simple. Anybody can implement it in like a hundred lines of
code, making obscure bugs unlikely. Yes, in theory it would only scale
log^2, but with 700 neighbors I can promise it will scale far beyond a
million nodes with only a few steps needed for the average lookup.
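
The three steps above can be sketched roughly as follows in Python. This
is only an illustration under stated assumptions: identities are
integers on a ring of size RING, "closest" means circular distance, and
the replacement probability p_replace is an arbitrary knob. None of
these choices, nor any of the names, come from eMule or Freenet code.

```python
import random

RING = 2 ** 16  # size of the identity space (assumption)


def dist(a, b):
    """Circular distance between two identities (assumed metric)."""
    d = abs(a - b)
    return min(d, RING - d)


def build(n, k, rng):
    """Step 1: n nodes with random identities, each linked to any k others.
    Returns a dict mapping identity -> list of neighbour identities."""
    ids = rng.sample(range(RING), n)
    return {i: rng.sample([j for j in ids if j != i], k) for i in ids}


def greedy_route(net, start, key):
    """Step 3: repeatedly hop to the neighbour closest to key, stopping
    when no neighbour is strictly closer than the current node."""
    path = [start]
    cur = start
    while True:
        best = min(net[cur], key=lambda n: dist(n, key))
        if dist(best, key) >= dist(cur, key):
            return path  # cur is the closest identity this query found
        cur = best
        path.append(cur)


def adapt(net, path, rng, p_replace=0.1):
    """Step 2: nodes that saw the query occasionally swap one random
    neighbour for the query's destination."""
    dest = path[-1]
    for nid in path[:-1]:
        if dest not in net[nid] and rng.random() < p_replace:
            net[nid][rng.randrange(len(net[nid]))] = dest


if __name__ == "__main__":
    rng = random.Random(0)
    net = build(500, 10, rng)
    for _ in range(2000):
        path = greedy_route(net, rng.choice(list(net)), rng.randrange(RING))
        adapt(net, path, rng)
```

Because every hop strictly reduces the distance to the key, a route
always terminates. A node that quits would simply be dropped from the
dict and from its neighbours' lists, which then refill through adapt().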

> You should read the original Kademlia paper.  I'd be interested to
> know if you'd think it'd be as vulnerable as you describe.

I have read it of course, but it was a while ago. The kind of attack I
mean is a wide-scale Sybil-type attack, with a user spawning millions of
fake identities for his node, giving nodes faulty neighbors and
misinforming them about the size of the network etc. There are other
more devious attacks such as those attempting to upset just routes for
one particular key value as well.

I tend to think the reason these networks haven't been brought down is
mostly security by obscurity. The intersection of people who understand
the theory, know the code, and are sufficiently motivated to try to hurt
them is currently empty.

> Kademlia isn't hypercube-based, that's CAN.  Kademlia does
> prefix-matching like Pastry, though that's not obvious from the
> Kademlia paper.  The "XOR metric" is just a clever way of saying "find
> me the highest order non-matching bit".

I use "hypercube-based" in a very general sense. I mean networks
constructed sort of like a L^d mesh, in which one walks by prefix
matching (as you say), having a level of neighbors who match only in the
first b bits, then those that match in 2b bits, etc. The difference
between the Plaxton derivatives (I shall call these P-nets for lack of a
better term) and CAN in this regard is simply that as the network grows
CAN scales L but keeps d constant, while P-nets keep L constant and
scale d.

As far as the P-nets are concerned, one can see the development as:

Hypercube -> Plaxton -> [Pastry|Tapestry|Kademlia]

where in each step, the networks have become less structured (in a
Hypercube, every edge is canonical, but Plaxton realized that since you
only need to match starting from the first bit, you can choose neighbors
which don't match in that bit freely, etc.) Losing unnecessary structure
is good, but I'm saying you can go even further by recognizing the
essential dynamics (which is simply this: if there are N nodes which are
as close to x as y, then the density of edges from x to y should be 1/N)
and creating random algorithms where these dynamics hold.
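
As a rough illustration of that 1/N rule, assume (my simplification for
the sketch) n nodes evenly spaced on a ring. About 2d nodes then lie
within distance d of x, so an edge to a node at distance d should be
drawn with weight proportional to 1/d. The Python below samples edge
offsets that way; all names are hypothetical.

```python
import random


def harmonic_weights(n):
    """Weights for ring distances 1 .. n//2, proportional to 1/d."""
    return [1.0 / d for d in range(1, n // 2 + 1)]


def sample_offset(n, rng):
    """Signed ring offset of one random edge from a node, with
    P(|offset| = d) proportional to 1/d."""
    weights = harmonic_weights(n)
    d = rng.choices(range(1, len(weights) + 1), weights=weights)[0]
    return d if rng.random() < 0.5 else -d


def sample_neighbours(x, n, k, rng):
    """k distinct neighbours of node x on a ring of n nodes."""
    chosen = set()
    while len(chosen) < k:
        chosen.add((x + sample_offset(n, rng)) % n)
    return sorted(chosen)
```

Short offsets dominate, but each doubling of distance receives about the
same total weight, which is what lets greedy routing keep halving the
remaining distance.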

// oskar

From bob.harris.spamcontrol at gmail.com  Tue Mar 21 14:47:59 2006
From: bob.harris.spamcontrol at gmail.com (Bob Harris)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <441FAB7B.6040808@math.chalmers.se>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<20060320214709.GF5200@cs.uoregon.edu>
	<4ef5fec60603201400h48a7bba7mffdc2220736914d5@mail.gmail.com>
	<20060320220942.GH5200@cs.uoregon.edu>
	<a5b3d0b30603201555q7ae967b6o4a76fcb7547c19e1@mail.gmail.com>
	<441FAB7B.6040808@math.chalmers.se>
Message-ID: <a5b3d0b30603210647u61e605e1yb7b2d9e859428257@mail.gmail.com>

>Where would you place "Symphony" [1] btw? A very structured DHT, but
>based directly on Kleinberg's model.

Very telling question. I would file Symphony under the "not worth
implementing or discussing" category.
Even its authors did not bother to implement that system.

As I was saying, there is a lot of hype in this area. All this effort on
small world networks with O(log^2 N) lookups seems to be misplaced in the
presence of practical O(log N) and O(1) lookup systems.

Bob.
From ossa at math.chalmers.se  Tue Mar 21 15:39:31 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet
	(RE:	[p2p-hackers] clustering))
In-Reply-To: <a5b3d0b30603210647u61e605e1yb7b2d9e859428257@mail.gmail.com>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>	<E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>	<20060320214709.GF5200@cs.uoregon.edu>	<4ef5fec60603201400h48a7bba7mffdc2220736914d5@mail.gmail.com>	<20060320220942.GH5200@cs.uoregon.edu>	<a5b3d0b30603201555q7ae967b6o4a76fcb7547c19e1@mail.gmail.com>	<441FAB7B.6040808@math.chalmers.se>
	<a5b3d0b30603210647u61e605e1yb7b2d9e859428257@mail.gmail.com>
Message-ID: <44201E33.7060802@math.chalmers.se>

Bob Harris wrote:
> 
>>Where would you place "Symphony" [1] btw? A very structured DHT, but
>>based directly on Kleinberg's model.
> 
> Very telling question. I would file Symphony under the "not worth 
> implementing or discussing" category.

That is great and all, but it wasn't what I asked. You are free to make
categories of "DHTs I like" and "DHTs I don't like" to your heart's
content. But you can't make a division between "small-world networks"
and DHTs, because all DHTs are, as has been noted, small-world networks, 
just with varying degrees of randomness.

// oskar

From bob.harris.spamcontrol at gmail.com  Tue Mar 21 15:58:42 2006
From: bob.harris.spamcontrol at gmail.com (Bob Harris)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <44201E33.7060802@math.chalmers.se>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>
	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<20060320214709.GF5200@cs.uoregon.edu>
	<4ef5fec60603201400h48a7bba7mffdc2220736914d5@mail.gmail.com>
	<20060320220942.GH5200@cs.uoregon.edu>
	<a5b3d0b30603201555q7ae967b6o4a76fcb7547c19e1@mail.gmail.com>
	<441FAB7B.6040808@math.chalmers.se>
	<a5b3d0b30603210647u61e605e1yb7b2d9e859428257@mail.gmail.com>
	<44201E33.7060802@math.chalmers.se>
Message-ID: <a5b3d0b30603210758x4295075ap4d89098ba58a47e4@mail.gmail.com>

> But you can't make a division between "small-world networks"
> and DHTs, because all DHTs are, as has been noted, small-world networks,
> just with varying degrees of randomness.
>

Oskar: I don't think you realize that your use of the word "small world" is
quite different from everyone else's and the graph theoretic definition. If
you think CAN defines a small world in any sense, we will not be able to
communicate.

I guess the success of the "small worlds" meme owes a lot to having a catchy
name, a loose allusion to human behavior, and a model simple enough that
anyone, including theoretician wannabes and even crackpot physicists, can
get into the p2p game. When performance is not an issue or when one cannot
tell X from X^2, every approach is as good as every other.

Bob.
From agthorr at cs.uoregon.edu  Tue Mar 21 17:09:18 2006
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <a5b3d0b30603210758x4295075ap4d89098ba58a47e4@mail.gmail.com>
References: <9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>
	<20060320214709.GF5200@cs.uoregon.edu>
	<4ef5fec60603201400h48a7bba7mffdc2220736914d5@mail.gmail.com>
	<20060320220942.GH5200@cs.uoregon.edu>
	<a5b3d0b30603201555q7ae967b6o4a76fcb7547c19e1@mail.gmail.com>
	<441FAB7B.6040808@math.chalmers.se>
	<a5b3d0b30603210647u61e605e1yb7b2d9e859428257@mail.gmail.com>
	<44201E33.7060802@math.chalmers.se>
	<a5b3d0b30603210758x4295075ap4d89098ba58a47e4@mail.gmail.com>
Message-ID: <20060321170917.GA6232@cs.uoregon.edu>

On Tue, Mar 21, 2006 at 10:58:42AM -0500, Bob Harris wrote:
> Oskar: I don't think you realize that your use of the word "small
> world" is quite different from everyone else's and the graph
> theoretic definition. If you think CAN defines a small world in any
> sense, we will not be able to communicate.

Oskar and I seem to be using the same definition of "small world
network", the same definition put forth by Watts and Strogatz when
they defined the term, also the same as the definition given in
Wikipedia, as well as in several books and peer-reviewed publications
devoted to the topic.

http://en.wikipedia.org/wiki/Small-world_network

Watts, D. J. and S. H. Strogatz. 1998. "Collective dynamics of
'small-world' networks". Nature 393:440-42.
http://tam.cornell.edu/SS_nature_smallworld.pdf

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon

From bob.harris.spamcontrol at gmail.com  Tue Mar 21 17:38:57 2006
From: bob.harris.spamcontrol at gmail.com (Bob Harris)
Date: Sat Dec  9 22:13:11 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <20060321170917.GA6232@cs.uoregon.edu>
References: <9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>
	<20060320214709.GF5200@cs.uoregon.edu>
	<4ef5fec60603201400h48a7bba7mffdc2220736914d5@mail.gmail.com>
	<20060320220942.GH5200@cs.uoregon.edu>
	<a5b3d0b30603201555q7ae967b6o4a76fcb7547c19e1@mail.gmail.com>
	<441FAB7B.6040808@math.chalmers.se>
	<a5b3d0b30603210647u61e605e1yb7b2d9e859428257@mail.gmail.com>
	<44201E33.7060802@math.chalmers.se>
	<a5b3d0b30603210758x4295075ap4d89098ba58a47e4@mail.gmail.com>
	<20060321170917.GA6232@cs.uoregon.edu>
Message-ID: <a5b3d0b30603210938sc46b7d1hf89c0c0a7eaa658e@mail.gmail.com>

Hi Daniel,

I saw the earlier fruitless discussion on the definition of "small world"
networks. Let me summarize my thoughts:

o You and Oskar may well be using the same definition.

o I suspect you have not read the papers you cite. The wikipedia article
is quite clear that small world networks are a subclass of random graphs.
The paper is talking about random rewirings. Oskar and you agreed the other
day that _all_ DHTs form small world networks. Something is amiss.

o Take some DHT, say CAN. It decidedly does not fit the 'definition'
provided by Watts and Strogatz.

o The definition provided by Watts and Strogatz is quite loose in the
first place.

o I also question why, if 'small worlds' were such an important,
fundamental, defining characteristic of graphs, it took mankind until 1998
to come up with a (catchy) name for them.

o I maintain that there is more hype here than substance.

But look, I don't _really_ care if you guys build systems with O(log^2 N)
lookup time when better techniques are available. It just so happens that
too much noise misplaced in an area will create a fog and lead people
astray. But hey, at the end of the day, it's someone else's problem. I saw
40+ hop simulations and felt the need to call it as I saw it.

Bob.


> Oskar and I seem to be using the same definition of "small world
> network", the same definition put forth by Watts and Strogatz when
> they defined the term, also the same as the definition given in
> Wikipedia, as well as in several books and peer-reviewed publications
> devoted to the topic. (http://en.wikipedia.org/wiki/Small-world_network,
> Watts, D. J. and S. H. Strogatz. 1998. "Collective dynamics of
> 'small-world' networks". Nature 393:440-42.
> http://tam.cornell.edu/SS_nature_smallworld.pdf)
From ossa at math.chalmers.se  Tue Mar 21 20:19:04 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet
	(RE:	[p2p-hackers] clustering))
In-Reply-To: <a5b3d0b30603210758x4295075ap4d89098ba58a47e4@mail.gmail.com>
References: <a5b3d0b30603201211y7bdad875r5150395613cb2557@mail.gmail.com>	<9517DB4A-9964-49F7-A50D-AA9AC5CDCC1F@cluck.com>	<a5b3d0b30603201339w281a6b5bvccc0a78da7ed0710@mail.gmail.com>	<20060320214709.GF5200@cs.uoregon.edu>	<4ef5fec60603201400h48a7bba7mffdc2220736914d5@mail.gmail.com>	<20060320220942.GH5200@cs.uoregon.edu>	<a5b3d0b30603201555q7ae967b6o4a76fcb7547c19e1@mail.gmail.com>	<441FAB7B.6040808@math.chalmers.se>	<a5b3d0b30603210647u61e605e1yb7b2d9e859428257@mail.gmail.com>	<44201E33.7060802@math.chalmers.se>
	<a5b3d0b30603210758x4295075ap4d89098ba58a47e4@mail.gmail.com>
Message-ID: <44205FB8.6090806@math.chalmers.se>

Bob Harris wrote:
> 
>     But you can't make a division between "small-world networks"
>     and DHTs, because all DHTs are, as has been noted, small-world networks,
>     just with varying degrees of randomness.
> 
> 
> Oskar: I don't think you realize that your use of the word "small world"
> is quite different from everyone else's and the graph theoretic definition.
> If you think CAN defines a small world in any sense, we will not be able to
> communicate.

I don't know if somebody using the term "small-world network" got your 
grant money or something, but you need to get over your fixation with 
having your own definition for this term, and be more specific about 
what you do not like. As far as I can tell, your problem seems to be 
with trying to use the Kleinberg model (and further developments on it) 
to build probabilistic DHT networks. This is a particular model, which 
really has nothing to do with Watts and Strogatz, "theoretician wannabes"
or "crackpot physicists".

> I guess the success of the "small worlds" meme owes a lot to having a
> catchy name, a loose allusion to human behavior, and a model simple enough
> that anyone, including theoretician wannabes and even crackpot physicists,
> can get into the p2p game. When performance is not an issue or when one
> cannot tell X from X^2, every approach is as good as every other.

I don't know what part of "log^2 scaling when the node degree is 
constant" it is that you do not understand, but it feels like a lost 
cause trying to repeat it again. For the benefit of any other readers, 
however, I reran the same simulations I posted before, this time scaling 
up the degrees with the size of the network (2 log_2 N edges per node):

1000    4.6811
2000    5.0803
4000    5.446
8000    5.7923
16000   6.1494
32000   6.561
64000   6.8694
128000  7.2843
256000  7.6469
512000  8.0424

For those who are too lazy to calculate the deltas, here is a plot:

http://www.math.chalmers.se/~ossa/temp/llog.png

What kind of scaling does that look like to you? (Hint: what function gives a 
straight line when the x-axis increases exponentially?)
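
For readers who would rather check the numbers than the plot, here is a
minimal sketch in Python. It uses only the ten (N, mean path length) pairs
from the table above; the function names are made up for this illustration.
If the path length grows like log N, every doubling of N should add roughly
the same amount, and a least-squares fit against log2(N) should give a
stable slope.

```python
import math

# (network size, mean path length) copied from the table in the post.
RESULTS = [
    (1000, 4.6811), (2000, 5.0803), (4000, 5.446), (8000, 5.7923),
    (16000, 6.1494), (32000, 6.561), (64000, 6.8694), (128000, 7.2843),
    (256000, 7.6469), (512000, 8.0424),
]

def deltas_per_doubling(results):
    """Increase in mean path length each time N doubles."""
    return [b[1] - a[1] for a, b in zip(results, results[1:])]

def slope_vs_log2(results):
    """Least-squares slope of path length against log2(N)."""
    xs = [math.log2(n) for n, _ in results]
    ys = [h for _, h in results]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

if __name__ == "__main__":
    for (n, _), d in zip(RESULTS[1:], deltas_per_doubling(RESULTS)):
        print(f"{n:>7}  +{d:.4f}")
    print(f"slope: {slope_vs_log2(RESULTS):.3f} hops per doubling")
```

The increments stay between roughly 0.31 and 0.41 hops per doubling with no
upward trend, which is what logarithmic scaling looks like.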

But the fact that the number of edges can be scaled independently of the 
size of the network is a STRENGTH of the randomized networks. 
Prescribing a certain number of edges for a certain N is dumb, because 
more edges can mean quicker lookups, so there is no reason for nodes to 
have fewer edges than they can manage. In the randomized networks every 
edge is helpful, but no edge is necessary.

The bigger point is, however, that staring oneself blind at the 
asymptotic order of the scaling is the silly behavior of people whose 
reasoning is limited to "5 << 25". One should look at the size window 
that one can expect the network to grow to, consider acceptable values 
for other parameters like node degree, and pick the appropriate 
algorithm which performs well in that window.

// oskar

From Serguei.Osokine at efi.com  Tue Mar 21 20:47:06 2006
From: Serguei.Osokine at efi.com (Serguei Osokine)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (was Re: Dijjer and
	Freenet(RE:	[p2p-hackers] clustering))
Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42AD2@fcexmb04.efi.internal>

On Tuesday, March 21, 2006 
> I reran the same simulations I posted before, this time scaling
> up the degrees with the size of the network (2 log_2 N edges per 
> node)...

	Out of curiosity, what happens to the phase transition with
this number of degrees? 1M meltdown was with 20 links, right? Does
this extra 2x avoid the phase transition at 1M nodes, and if so,
does it merely push it to say, 2M, or it is completely avoided
for all practical node counts (say, less than 10^12 or something)?

	Best wishes -
	S.Osokine.
	21 Mar 2006.


_______________________________________________
p2p-hackers mailing list
p2p-hackers@zgp.org
http://zgp.org/mailman/listinfo/p2p-hackers
_______________________________________________
Here is a web page listing P2P Conferences:
http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences

From ossa at math.chalmers.se  Tue Mar 21 20:59:30 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (was Re: Dijjer
	and	Freenet(RE:	[p2p-hackers] clustering))
In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42AD2@fcexmb04.efi.internal>
References: <4A60C83D027E224BAA4550FB1A2B120EC42AD2@fcexmb04.efi.internal>
Message-ID: <44206932.40802@math.chalmers.se>

Serguei Osokine wrote:
> On Tuesday, March 21, 2006 
> 
>>I reran the same simulations I posted before, this time scaling
>>up the degrees with the size of the network (2 log_2 N edges per 
>>node)...
> 
> 
> 	Out of curiosity, what happens to the phase transition with
> this number of degrees? 1M meltdown was with 20 links, right? Does
> this extra 2x avoid the phase transition at 1M nodes, and if so,
> does it merely push it to say, 2M, or it is completely avoided
> for all practical node counts (say, less than 10^12 or something)?

When scaling the node degree with log N I have never observed the number 
of successful searches decreasing with the network size. In this 
particular case there was not a single failed search for any network 
size. There are (slightly less than rigorous) mathematical reasons why 
this should be the case.  It could probably grow forever without having 
a phase transition.

// oskar

From Serguei.Osokine at efi.com  Tue Mar 21 21:11:34 2006
From: Serguei.Osokine at efi.com (Serguei Osokine)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (was Re:
	Dijjerand	Freenet(RE:	[p2p-hackers] clustering))
Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42AD3@fcexmb04.efi.internal>

On Tuesday, March 21, 2006 Oskar Sandberg wrote:
> It could probably grow forever without having a phase transition.

	I sure hope so, but then the next question is, what should be
the scaling factor k in "degree=k x log2(N)" for this to be the case?
I mean, k=1 seems to be not enough - with k=1 you have a transition 
with 1M nodes (when your degree is 20), correct?
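
To make the arithmetic behind "degree = k x log2(N)" concrete, here is a
small sketch. The helper name and the choice to round up are assumptions
made for this illustration; the thread does not say how fractional degrees
were handled.

```python
import math

def degree_for(n_nodes, k):
    """Links per node under the rule degree = k * log2(N), rounded up."""
    return math.ceil(k * math.log2(n_nodes))

if __name__ == "__main__":
    for k in (1, 2):
        for n in (512_000, 1_000_000):
            print(f"k={k}  N={n:>9}  degree={degree_for(n, k)}")
```

With k=1 a network of 1M nodes gets 20 links per node, the configuration in
which the transition was reported; k=2 gives 40.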

	So what is the needed k? Does the theory say anything about it?
And more importantly, when you deviate from the ideal (random-pairs) 
rewiring algorithm due to the caching present in the system, what
should be your practical value of k to give you a safety margin 
sufficient to avoid the phase transition at any network size? 

	In practice, of course, people tend to work with k>>1, if only
to minimize the number of hops and to increase the network robustness
(like this Kad example with hundreds of links). But still it might be
interesting to see whether any value of k used in practice would be
theoretically enough to handle any practically possible content
popularity distributions, caching, and such.

	What do you think about all this?

	Best wishes -
	S.Osokine,
	Crackpot Physicist.
	21 Mar 2006.



From ossa at math.chalmers.se  Tue Mar 21 21:16:34 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (was
	Re:	Dijjerand	Freenet(RE:	[p2p-hackers] clustering))
In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42AD3@fcexmb04.efi.internal>
References: <4A60C83D027E224BAA4550FB1A2B120EC42AD3@fcexmb04.efi.internal>
Message-ID: <44206D32.3020101@math.chalmers.se>

Serguei Osokine wrote:
> On Tuesday, March 21, 2006 Oskar Sandberg wrote:
> 
>>It could probably grow forever without having a phase transition.
> 
> 
> 	I sure hope so, but then the next question is, what should be
> the scaling factor k in "degree=k x log2(N)" for this to be the case?
> I mean, k=1 seems to be not enough - with k=1 you have a transition 
> with 1M nodes (when your degree is 20), correct?

No theory. Your best bet is to simulate and see.

// oskar

From Serguei.Osokine at efi.com  Tue Mar 21 22:15:13 2006
From: Serguei.Osokine at efi.com (Serguei Osokine)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds.
	(wasRe:	Dijjerand	Freenet(RE:	[p2p-hackers] clustering))
Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42AD4@fcexmb04.efi.internal>

On Tuesday, March 21, 2006 Oskar Sandberg wrote:
> No theory. Your best bet is to simulate and see.

	I seem to recall that you had a link to your simulations source
code somewhere in the previous mails, but I could not find it when
I looked. Is it a confabulation and there was no link, or I simply
missed it when looking?

	Sorry to be such a pest, but I'm still trying to come to terms
with this phase transition thing at 1M and degree of 20. Intuitively
I have this feeling that the graph with this constant outdegree should
not disintegrate into unconnected pieces no matter what's its size,
and that phase transition should happen only if the links are truly
random and the outdegree of some nodes can be zero (or close to it)
as a result. Then sure, these nodes are simply not connected to anyone
and their queries all fail. I simply cannot picture the graph where
every node carefully monitors the number of its links, never allowing
their number to drop below 20, and the graph still breaks into many
components - the intuitive feeling is that as every node maintains 
these 20 links, the probability of at least one of them leading to
the giant component should be very close to 1, and as we are adding
nodes, this should still remain the case, so all nodes should stay
clustered together.
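
The connectivity intuition above is easy to test on a small scale. The
sketch below is not the simulation discussed in this thread: it assumes
uniformly random links (no Kleinberg-style distance distribution, no
rewiring, no caching), treats every link as usable in both directions, and
only counts connected components. It says nothing about whether greedy
routing finds short paths. All names are hypothetical.

```python
import random

def count_components(n_nodes, out_degree, seed=0):
    """Each node picks out_degree distinct random peers; links are then
    treated as undirected and the connected components are counted."""
    rng = random.Random(seed)
    parent = list(range(n_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for node in range(n_nodes):
        # Draw one extra candidate so that dropping `node` itself still
        # leaves out_degree distinct peers.
        others = rng.sample(range(n_nodes), out_degree + 1)
        peers = [p for p in others if p != node][:out_degree]
        for peer in peers:
            parent[find(node)] = find(peer)

    return len({find(x) for x in range(n_nodes)})

if __name__ == "__main__":
    print(count_components(2000, 20))
```

Under these assumptions a graph in which every node keeps 20 links stays in
one piece, which suggests that the reported transition is not simply a
matter of the graph falling apart.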

	I have this feeling that I'm not the only one on this list who
might have trouble visualizing the phase transition, so maybe looking
at how it is happening in a real simulation would help. 

	By the way - could it be that the phase transition is due to
the unidirectional nature of the links? I mean, speaking in naive
terms: let's say that the graph breaks down into two components of
roughly the same size. Then as you add one more node to it, the
probability of this node connecting to both halves of the graph at
once is about (1 minus 2^(-19)). Then this node will become a bridge
between the graph components and glue them back together - but *only*
if the links are bidirectional. Unidirectional links won't act as a
"bridge", or "glue", and half of the graph still won't be able to
find anything in the other half.
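
The two claims in the paragraph above can be checked with a short sketch.
It assumes, as the paragraph does, two halves of equal size and 20
independent, uniformly chosen links for the new node. The five-node graph
and all names are invented for the illustration.

```python
from collections import deque
from fractions import Fraction

def p_bridges_both_halves(links):
    """Chance that `links` independent uniform links hit both of two
    equal-sized halves: 1 - P(all in A) - P(all in B)."""
    return 1 - 2 * Fraction(1, 2) ** links

def reachable(adjacency, start):
    """Set of nodes reachable from start by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def undirected(adjacency):
    """Copy of the graph with every link usable in both directions."""
    both = {}
    for node, peers in adjacency.items():
        for peer in peers:
            both.setdefault(node, set()).add(peer)
            both.setdefault(peer, set()).add(node)
    return both

# Two separate halves, plus a newcomer "n" that links out to both.
GRAPH = {
    "a1": {"a2"}, "a2": {"a1"},
    "b1": {"b2"}, "b2": {"b1"},
    "n": {"a1", "b1"},
}

if __name__ == "__main__":
    print(p_bridges_both_halves(20))                    # 1 - 2^(-19)
    print("b1" in reachable(GRAPH, "a1"))               # one-way links
    print("b1" in reachable(undirected(GRAPH), "a1"))   # two-way links
```

With one-way links the newcomer can reach both halves, but neither half can
reach the other through it; once the same links are usable in both
directions, a1 reaches b1 via n.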

	So could it be the case that if the links are bidirectional
(a la Gnutella), you cannot have the phase transitions no matter
what - but if the links are unidirectional (like certain UDP-based 
DHT implementations that I can imagine), you do have the phase
transition problem? And if so, does it mean that it is an extremely
good idea to make sure that all links are bidirectional and can
be used for searching, regardless of whether you establish them 
yourself or receive as an incoming UDP packet from the previously
unknown node?

	Best wishes -
	S.Osokine.
	21 Mar 2006.



From coderman at gmail.com  Tue Mar 21 22:59:09 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (wasRe: Dijjerand Freenet(RE:
	[p2p-hackers] clustering))
In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42AD4@fcexmb04.efi.internal>
References: <4A60C83D027E224BAA4550FB1A2B120EC42AD4@fcexmb04.efi.internal>
Message-ID: <4ef5fec60603211459t2e4ada92wa86a961879da28bf@mail.gmail.com>

On 3/21/06, Serguei Osokine <Serguei.Osokine@efi.com> wrote:
> ...
>         By the way - could it be that the phase transition is due to
> the unidirectional nature of the links?

my impression is that it has more to do with the search / routing
algorithm itself; at some point you time out or fail a query.  once a
network reaches sufficient size with low node degree the ability to
reach some arbitrary subset of the graph is thus constrained by the
excessive path length required.  i don't know that unidirectionality
would alter the behavior much, and fortunately wireless networks are
the only transport where unidirectionality may be common.

the naive approach (high TTL, broadcast and forward?) hits practical
(bandwidth) limits long before theoretical ones.

From ossa at math.chalmers.se  Tue Mar 21 23:06:26 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small
	worlds.	(wasRe:	Dijjerand	Freenet(RE:	[p2p-hackers] clustering))
In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42AD4@fcexmb04.efi.internal>
References: <4A60C83D027E224BAA4550FB1A2B120EC42AD4@fcexmb04.efi.internal>
Message-ID: <442086F2.4070607@math.chalmers.se>

Serguei Osokine wrote:
> On Tuesday, March 21, 2006 Oskar Sandberg wrote:
> 
>>No theory. Your best bet is to simulate and see.
> 
> 
> 	I seem to recall that you had a link to your simulations source
> code somewhere in the previous mails, but I could not find it when
> I looked. Is it a confabulation and there was no link, or I simply
> missed it when looking?

I will try to clean up the code and post a simulator some time, but it
is a very simple algorithm so you are probably just as well off
implementing it yourself.

> 	Sorry to be such a pest, but I'm still trying to come to terms
> with this phase transition thing at 1M and degree of 20. Intuitively
> I have this feeling that the graph with this constant outdegree should
> not disintegrate into unconnected pieces no matter what's its size,
> and that phase transition should happen only if the links are truly
> random and the outdegree of some nodes can be zero (or close to it)
> as a result. Then sure, these nodes are simply not connected to anyone
> and their queries all fail. I simply cannot picture the graph where
> every node carefully monitors the number of its links, never allowing
> their number to drop below 20, and the graph still breaks into many
> components - the intuitive feeling is that as every node maintains 
> these 20 links, the probability of at least one of them leading to
> the giant component should be very close to 1, and as we are adding
> nodes, this should still remain the case, so all nodes should stay
> clustered together.

(a) I didn't say that the network ends up completely disconnected. Only
that it is not sufficiently connected for routing to work. There being a
path between two nodes is not the same as greedy routing finding one in
a reasonable amount of time.

(b) Seen as random graph, the resulting network is very complicated and
rife with dependencies among the edges etc. You can definitely end up
with negative feedback loops because a node not being reachable from one
"region" of the network will not become the destination of any shortcuts
from that region, etc.
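
Point (a) can be seen on a toy example. The sketch below assumes node IDs
are positions on a ring of size 8 and that greedy routing never backtracks;
both are simplifications chosen for the illustration, and the four-node
graph is hand-built, not taken from any simulation in this thread.

```python
from collections import deque

def ring_distance(a, b, size):
    d = abs(a - b) % size
    return min(d, size - d)

def greedy_route(links, source, target, size):
    """Always step to the neighbor closest to the target; give up when no
    neighbor is strictly closer than the current node (no backtracking)."""
    path = [source]
    current = source
    while current != target:
        best = min(links[current], key=lambda p: ring_distance(p, target, size))
        if ring_distance(best, target, size) >= ring_distance(current, target, size):
            return None  # stuck in a local minimum
        path.append(best)
        current = best
    return path

def shortest_path(links, source, target):
    """Breadth-first search; finds a path whenever one exists."""
    prev = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for peer in links[node]:
            if peer not in prev:
                prev[peer] = node
                queue.append(peer)
    return None

SIZE = 8
# Connected graph: 2 - 0 - 6 - 3
LINKS = {0: [2, 6], 2: [0], 6: [0, 3], 3: [6]}

if __name__ == "__main__":
    print(greedy_route(LINKS, 0, 3, SIZE))
    print(shortest_path(LINKS, 0, 3))
```

Routing from 0 to 3, the greedy step goes to node 2 because it is nearest
to the target, and node 2 has no link that gets any closer. The route
0 -> 6 -> 3 exists, but greedy routing never tries it.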

// oskar

From ian at locut.us  Wed Mar 22 00:55:29 2006
From: ian at locut.us (Ian Clarke)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds.
	(wasRe:	Dijjerand	Freenet(RE:	[p2p-hackers] clustering))
In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42AD4@fcexmb04.efi.internal>
References: <4A60C83D027E224BAA4550FB1A2B120EC42AD4@fcexmb04.efi.internal>
Message-ID: <8494B23F-04C1-49E3-9042-78FA5BAC6069@locut.us>


On 21 Mar 2006, at 14:15, Serguei Osokine wrote:
>  Intuitively
> I have this feeling that the graph with this constant outdegree should
> not disintegrate into unconnected pieces no matter what's its size,
> and that phase transition should happen only if the links are truly
> random and the outdegree of some nodes can be zero (or close to it)
> as a result.

Oskar has already said this, but I will say it again just for emphasis.

The phase transition does not occur because the network has become  
disconnected, the network may well be fully connected (indeed, as you  
say, it would be very unlikely for it not to be fully connected), but  
greedy routing may not be able to find short routes between nodes.

It would certainly be useful to come up with a way to predict when  
the phase transition occurs so that one can say with certainty that  
it won't happen, but I suspect one will find (and simulations  
suggest) that if the degree is scaled with the log of the network  
size, then the network will never get anywhere close to this phase  
transition.

Ian.


From alenlpeacock at gmail.com  Wed Mar 22 04:55:39 2006
From: alenlpeacock at gmail.com (Alen Peacock)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <441FE294.1020609@math.chalmers.se>
References: <E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>
	<20060320215848.GG5200@cs.uoregon.edu>
	<4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>
	<20060320222537.GI5200@cs.uoregon.edu>
	<441F2FE0.5090702@math.chalmers.se>
	<20060320231139.GJ5200@cs.uoregon.edu>
	<441FB4EF.8070605@math.chalmers.se>
	<20060321085046.GW5200@cs.uoregon.edu>
	<441FE294.1020609@math.chalmers.se>
Message-ID: <ffe450f90603212055q74111be0va6508675e9aa495e@mail.gmail.com>

On 3/21/06, Oskar Sandberg <ossa@math.chalmers.se> wrote:
>
> The kind of attack I
> mean is wide scale sybil type attack with a user spawning millions of
> fake identities for his node, giving nodes faulty neighbors and
> misinforming them about the size of the network etc.

Isn't the sybil attack against kademlia mitigated by the fact that the
routing table has a "LRU with live nodes never evicted from k-buckets"
strategy?  It seems to me that this preference for old contacts would
make it unlikely that a sybil attack against an established kademlia
DHT could have much success.  Admittedly, churn rate comes into play
here, but the fact that a sybil attack could *never* purge currently
connected valid nodes from a peer's routing table means that such a
peer would always have at least some valid contacts.  And the fact
that each peer has some valid contacts implies that a valid route can
always resolve, doesn't it (admittedly, with some decrease in
performance/efficiency)?
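
For reference, the k-bucket rule being described can be sketched as follows.
This is a simplified model, not code from any Kademlia implementation: the
class name is invented, and the `is_alive` callable stands in for a real
PING round trip.

```python
from collections import OrderedDict

class KBucket:
    """One Kademlia k-bucket: least-recently seen contact first."""

    def __init__(self, k, is_alive):
        self.k = k
        self.is_alive = is_alive        # callable standing in for a PING RPC
        self.contacts = OrderedDict()   # node_id -> address

    def seen(self, node_id, address):
        """Record that node_id was heard from; return True if it is kept."""
        if node_id in self.contacts:
            self.contacts.move_to_end(node_id)   # now most recently seen
            return True
        if len(self.contacts) < self.k:
            self.contacts[node_id] = address
            return True
        oldest = next(iter(self.contacts))
        if self.is_alive(oldest):
            self.contacts.move_to_end(oldest)    # live old contact stays
            return False                         # newcomer is dropped
        del self.contacts[oldest]                # dead contact is evicted
        self.contacts[node_id] = address
        return True
```

A newcomer only gets into a full bucket when the least-recently seen
contact fails to answer, so new identities can fill empty slots and slots
vacated by churn, but cannot push out contacts that are still responding.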


> There are other
> more devious attacks such as those attempting to upset just routes for
> one particular key value as well.

There are defenses against targeted key attacks (in addition to the
old contacts preference).  For example, make each node choose its own
Ku/Kr pair before joining, with nodeID = H(Ku).  A node would have to
'prove' its identity before any of its operations or results are
accepted (through challenge/response or signatures).  Under such a
scheme, an adversary could still spawn millions of sybil identities,
but it wouldn't be able to choose a specific ID space to target.  The
millions of nodes /could/ try to upset some specific route, but
preference for old contacts still makes this rather difficult.
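
A minimal sketch of the nodeID = H(Ku) check follows. SHA-1 is an
assumption (picked because it yields 160-bit IDs), the function names are
hypothetical, and random bytes stand in for a real public key. The
challenge/response or signature step that proves possession of Kr needs a
real public-key library and is left out.

```python
import hashlib
import hmac
import os

def node_id_for(public_key: bytes) -> bytes:
    """nodeID = H(Ku); SHA-1 is used here only as an example hash."""
    return hashlib.sha1(public_key).digest()

def id_matches_key(claimed_id: bytes, public_key: bytes) -> bool:
    """Check a peer's claimed ID against the public key it presents."""
    return hmac.compare_digest(claimed_id, node_id_for(public_key))

if __name__ == "__main__":
    ku = os.urandom(32)              # placeholder for real key material
    my_id = node_id_for(ku)
    print(my_id.hex(), id_matches_key(my_id, ku))
```

A peer that presents a key which does not hash to the ID it claims is
rejected before any of its results are looked at.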

If you wanted to get really paranoid, you could introduce a
trust/reputation system on top of a strong ID system like that
mentioned above.  This would even further diminish the effectiveness
of sybil attacks of this nature.

I hope I'm not being naive or unimaginative by proposing that these
countermeasures make such attacks less effective.  I'd love to see
further discussion on why these are insufficient, as well as further
discussion on attackability of DHTs.

Alen

From m.rogers at cs.ucl.ac.uk  Wed Mar 22 11:33:31 2006
From: m.rogers at cs.ucl.ac.uk (Michael Rogers)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE: 
	[p2p-hackers] clustering))
In-Reply-To: <ffe450f90603212055q74111be0va6508675e9aa495e@mail.gmail.com>
References: <E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>
	<20060320215848.GG5200@cs.uoregon.edu>
	<4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>
	<20060320222537.GI5200@cs.uoregon.edu>
	<441F2FE0.5090702@math.chalmers.se>
	<20060320231139.GJ5200@cs.uoregon.edu>
	<441FB4EF.8070605@math.chalmers.se>
	<20060321085046.GW5200@cs.uoregon.edu>
	<441FE294.1020609@math.chalmers.se>
	<ffe450f90603212055q74111be0va6508675e9aa495e@mail.gmail.com>
Message-ID: <4421360B.4010003@cs.ucl.ac.uk>

Alen Peacock wrote:
> There are defenses against targeted key attacks (in addition to the
> old contacts preference).  For example, make each node choose its own
> Ku/Kr pair before joining, with nodeID = H(Ku).  A node would have to
> 'prove' its identity before any of its operations or results are
> accepted (through challenge/response or signatures).  Under such a
> scheme, an adversary could still spawn millions of sybil identities,
> but it wouldn't be able to choose a specific ID space to target.

If I understand your suggestion correctly, the attacker could generate 
keypairs offline until it found a suitable keypair, then join the network.

Herbivore's entry control protocol [1] mitigates this attack by 
assigning nodeID = H(Ku||y) where y must be a partial hash collision 
with Ku and must also contain the current date. It takes a lot of CPU 
time to find a suitable hash collision and you don't get to find out 
your node's ID until you've done so, so targeting a particular ID 
becomes very expensive. Unfortunately the date requirement forces all 
nodes to search for new partial hash collisions periodically, so the CPU 
requirements have to be kept reasonable, and it's probably safe to 
assume that the attacker is willing to donate more CPU time to the 
problem than innocent nodes. So the protocol limits the Sybil attack 
rather than eliminating it.
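
The following sketch shows one reading of the description above. It is not
a transcription of Herbivore's actual protocol: the hash function (SHA-256),
the encoding of y as "date|nonce", matching on leading bits, and the
difficulty of 12 bits are all assumptions chosen so that the example runs
quickly.

```python
import hashlib
from itertools import count

def top_bits(data: bytes, bits: int) -> int:
    """First `bits` bits of SHA-256(data) as an integer."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - bits)

def solve(public_key: bytes, date: str, bits: int) -> bytes:
    """Search for y = date|nonce whose hash agrees with H(Ku) on `bits` bits."""
    want = top_bits(public_key, bits)
    for nonce in count():
        y = f"{date}|{nonce}".encode()
        if top_bits(y, bits) == want:
            return y

def verify(public_key: bytes, y: bytes, date: str, bits: int) -> bool:
    """Accept y only if it carries the expected date and solves the puzzle."""
    return (y.startswith(f"{date}|".encode())
            and top_bits(y, bits) == top_bits(public_key, bits))

def node_id(public_key: bytes, y: bytes) -> bytes:
    """nodeID = H(Ku || y); unknown until the puzzle has been solved."""
    return hashlib.sha256(public_key + y).digest()

if __name__ == "__main__":
    ku = b"example public key bytes"
    y = solve(ku, "2006-03-22", 12)
    print(y, node_id(ku, y).hex())
```

Each extra bit of difficulty doubles the expected work per identity, for
honest nodes and attackers alike, which is the trade-off described above.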

Cheers,
Michael

[1] http://www.cs.cornell.edu/People/egs/papers/herbivore-tr.pdf

From ossa at math.chalmers.se  Wed Mar 22 16:07:59 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet
	(RE:	[p2p-hackers] clustering))
In-Reply-To: <ffe450f90603212055q74111be0va6508675e9aa495e@mail.gmail.com>
References: <E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>	<4ef5fec60603201353k1a814ebdnacd4edebb91c9c71@mail.gmail.com>	<20060320215848.GG5200@cs.uoregon.edu>	<4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>	<20060320222537.GI5200@cs.uoregon.edu>	<441F2FE0.5090702@math.chalmers.se>	<20060320231139.GJ5200@cs.uoregon.edu>	<441FB4EF.8070605@math.chalmers.se>	<20060321085046.GW5200@cs.uoregon.edu>	<441FE294.1020609@math.chalmers.se>
	<ffe450f90603212055q74111be0va6508675e9aa495e@mail.gmail.com>
Message-ID: <4421765F.8080200@math.chalmers.se>

Alen Peacock wrote:
> Isn't the sybil attack against kademlia mitigated by the fact that the
> routing table has a "LRU with live nodes never evicted from k-buckets"
> strategy?  It seems to me that this preference for old contacts would
> make it unlikely that a sybil attack against an established kademlia
> DHT could have much success.  Admittedly, churn rate comes into play
> here, but the fact that a sybil attack could *never* purge currently
> connected valid nodes from a peer's routing table means that such a
> peer would always have at least some valid contacts.  And the fact
> that each peer has some valid contacts implies that a valid route can
> always resolve, doesn't it (admittedly, with some decrease in
> performance/efficiency)?

But you cannot discount churn from this equation - this is a distributed 
P2P scenario, nobody is up 24/7. And, on the flipside, never changing 
neighbors will mean that once an attacker is in, he can do a lot of 
damage without being replaced.

> There are defenses against targeted key attacks (in addition to the
> old contacts preference).  For example, make each node choose its own
> Ku/Kr pair before joining, with nodeID = H(Ku).  A node would have to
> 'prove' its identity before any of its operations or results are
> accepted (through challenge/response or signatures).  Under such a
> scheme, an adversary could still spawn millions of sybil identities,
> but it wouldn't be able to choose a specific ID space to target.  The
> millions of nodes /could/ try to upset some specific route, but
> preference for old contacts still makes this rather difficult.

I don't understand the problem. Targeting your ID at a certain piece of 
data takes a number of attempts on the order of the size of the network, 
so it won't be a problem to find a key that hashes "close enough".

Hashing a public key will work only if the key has to be signed by some 
certificate authority, but that isn't desirable in most p2p scenarios. 
The only other option I know is having the ID be the hash of the node's 
IP address, which feels very hackish and dependent on a hopefully 
transient situation (the shortage of IPv4 addresses). (Of course, one 
could also make it the hash of the computer's TCPA fingerprint...)

> If you wanted to get really paranoid, you could introduce a
> trust/reputation system on top of a strong ID system like that
> mentioned above.  This would even further diminish the effectiveness
> of sybil attacks of this nature.

I don't think a trust system can do much at all. Because the graph in 
the DHT (however you do it) prescribes at least a distribution regarding 
who should be linking whom (based on ID), there are certain other nodes 
you have to talk to, whether you have reason to trust them or not. The 
only system of integrating trust I know is the new Freenet system which 
allows only verified, trusted, connections, but that has its own set of 
headaches.

// oskar

From adam at cypherspace.org  Wed Mar 22 16:28:02 2006
From: adam at cypherspace.org (Adam Back)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet
	(RE:	[p2p-hackers] clustering))
In-Reply-To: <4421765F.8080200@math.chalmers.se>
References: <20060320215848.GG5200@cs.uoregon.edu>
	<4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>
	<20060320222537.GI5200@cs.uoregon.edu>
	<441F2FE0.5090702@math.chalmers.se>
	<20060320231139.GJ5200@cs.uoregon.edu>
	<441FB4EF.8070605@math.chalmers.se>
	<20060321085046.GW5200@cs.uoregon.edu>
	<441FE294.1020609@math.chalmers.se>
	<ffe450f90603212055q74111be0va6508675e9aa495e@mail.gmail.com>
	<4421765F.8080200@math.chalmers.se>
Message-ID: <20060322162802.GA19611@bitchcake.off.net>

Hashing a public key prevents you choosing arbitrary hashes.

(Same as hashing any other document type... hash functions are
designed to be one-way.)

Adam

On Wed, Mar 22, 2006 at 05:07:59PM +0100, Oskar Sandberg wrote:
> Hashing a public key will work only if the key has to be signed by some 
> certificate authority, but that isn't desirable in most p2p scenarios. 

From ossa at math.chalmers.se  Wed Mar 22 16:38:28 2006
From: ossa at math.chalmers.se (Oskar Sandberg)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (was Re: Dijjer and
	Freenet	(RE:	[p2p-hackers] clustering))
In-Reply-To: <20060322162802.GA19611@bitchcake.off.net>
References: <20060320215848.GG5200@cs.uoregon.edu>	<4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>	<20060320222537.GI5200@cs.uoregon.edu>	<441F2FE0.5090702@math.chalmers.se>	<20060320231139.GJ5200@cs.uoregon.edu>	<441FB4EF.8070605@math.chalmers.se>	<20060321085046.GW5200@cs.uoregon.edu>	<441FE294.1020609@math.chalmers.se>	<ffe450f90603212055q74111be0va6508675e9aa495e@mail.gmail.com>	<4421765F.8080200@math.chalmers.se>
	<20060322162802.GA19611@bitchcake.off.net>
Message-ID: <44217D84.3000904@math.chalmers.se>

Adam Back wrote:
> Hashing a public key prevents you choosing arbitrary hashes.

True, of course, but if there are N nodes in the network, and I only 
want to make my ID closer to x than anybody else's, I don't need to 
choose an arbitrary hash, I only need to test O(N) public keys until I 
find a hash that fits the criteria.
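
The O(N) estimate can be tried out with a small sketch. Several things here
are assumptions made for the illustration: counter strings stand in for
freshly generated public keys, SHA-1 stands in for H, and closeness is
measured with the XOR metric that Kademlia uses. All names are hypothetical.

```python
import hashlib

def sha1_int(data: bytes) -> int:
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

def grind_id(target: int, honest_ids, max_tries=10_000_000):
    """Try candidate keys until one hashes closer (XOR metric) to target
    than every honest node. Returns (attempts, winning_id), or None if
    max_tries is exhausted."""
    best_honest = min(i ^ target for i in honest_ids)
    for attempt in range(1, max_tries + 1):
        candidate = sha1_int(b"candidate-key-%d" % attempt)
        if candidate ^ target < best_honest:
            return attempt, candidate
    return None

if __name__ == "__main__":
    honest = [sha1_int(b"honest-node-%d" % i) for i in range(1000)]
    target = sha1_int(b"some key to attack")
    attempts, _ = grind_id(target, honest)
    print(f"beat {len(honest)} honest nodes after {attempts} candidate keys")
```

The number of candidates needed varies from run to run, but it is governed
by the number of honest nodes, not by the size of the hash space.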

// oskar

From alenlpeacock at gmail.com  Wed Mar 22 19:03:13 2006
From: alenlpeacock at gmail.com (Alen Peacock)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <4421765F.8080200@math.chalmers.se>
References: <E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<4ef5fec60603201416nbe7330eyea94c545f47a7a31@mail.gmail.com>
	<20060320222537.GI5200@cs.uoregon.edu>
	<441F2FE0.5090702@math.chalmers.se>
	<20060320231139.GJ5200@cs.uoregon.edu>
	<441FB4EF.8070605@math.chalmers.se>
	<20060321085046.GW5200@cs.uoregon.edu>
	<441FE294.1020609@math.chalmers.se>
	<ffe450f90603212055q74111be0va6508675e9aa495e@mail.gmail.com>
	<4421765F.8080200@math.chalmers.se>
Message-ID: <ffe450f90603221103i3267ad08s5457800fafbc2779@mail.gmail.com>

On 3/22/06, Oskar Sandberg <ossa@math.chalmers.se> wrote:
>
> But you cannot discount churn from this equation - this is a distributed
> P2P scenario, nobody is up 24/7. And, on the flipside, never changing
> neighbors will mean that once an attacker is in, he can do a lot of
> damage without being replaced.

Churn rate will be specific to the application running on top of the
DHT, but yes, if churn is high and *no one* stays on 24/7, my argument
goes away to a large extent.


> I don't think a trust system can do much at all. Because the graph in
> the DHT (however you do it) prescribes at least a distribution regarding
> who should be linking whom (based on ID),

Nodes in kademlia do have some freedom of choice.  They can certainly
choose to reject adding blacklisted nodes to their own routing tables.
 Nodes are rejected if they are unreachable due to non-transitivity,
for example.  Adding other criteria for rejection is likewise okay,
because the core algorithm's parallel query strategy is resilient to
errors -- in the case of trust, if your closest neighbors don't trust
you, then more distant neighbors certainly shouldn't either, so the
fact that they don't route to you is beneficial.
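
A minimal sketch of what "adding other criteria for rejection" could
look like in a Kademlia-style routing table. Everything here is
illustrative: the RoutingTable class, its blacklist set and the offer
method are hypothetical names, the bucket size of 20 is just a commonly
used value, and the eviction step a real implementation performs on a
full bucket (pinging the oldest contact) is left out.

```python
from collections import deque

K = 20  # bucket size; assumption, a commonly used value


class RoutingTable:
    def __init__(self, own_id: int, id_bits: int = 160):
        self.own_id = own_id
        self.buckets = [deque() for _ in range(id_bits)]
        self.blacklist = set()  # hypothetical local distrust list

    def bucket_index(self, node_id: int) -> int:
        # Position of the highest bit in which the two IDs differ.
        return (self.own_id ^ node_id).bit_length() - 1

    def offer(self, node_id: int, reachable: bool = True) -> bool:
        """Return True if the contact ends up in the table."""
        if node_id == self.own_id or node_id in self.blacklist or not reachable:
            return False  # rejected: self, distrusted, or unreachable
        bucket = self.buckets[self.bucket_index(node_id)]
        if node_id in bucket:
            bucket.remove(node_id)  # refresh: move to most-recently-seen end
            bucket.append(node_id)
            return True
        if len(bucket) < K:
            bucket.append(node_id)
            return True
        return False  # bucket full; a real node would ping the oldest entry
```

Because lookups query several contacts in parallel, a node that drops a
few candidates this way still reaches the target through the others.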

Of course, trust doesn't protect as well against nodes that behave
kindly to all near neighbors, but maliciously to those more distant. 
To start to deal with that you'd need to extend from purely local
trust to some sort of reputation system.

Alen

From coderman at gmail.com  Wed Mar 22 19:49:34 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: Big hype on small worlds. (was Re: Dijjer and Freenet (RE:
	[p2p-hackers] clustering))
In-Reply-To: <ffe450f90603221103i3267ad08s5457800fafbc2779@mail.gmail.com>
References: <E9D2E4C1-705F-4399-A4BA-DEECA9E2B065@locut.us>
	<20060320222537.GI5200@cs.uoregon.edu>
	<441F2FE0.5090702@math.chalmers.se>
	<20060320231139.GJ5200@cs.uoregon.edu>
	<441FB4EF.8070605@math.chalmers.se>
	<20060321085046.GW5200@cs.uoregon.edu>
	<441FE294.1020609@math.chalmers.se>
	<ffe450f90603212055q74111be0va6508675e9aa495e@mail.gmail.com>
	<4421765F.8080200@math.chalmers.se>
	<ffe450f90603221103i3267ad08s5457800fafbc2779@mail.gmail.com>
Message-ID: <4ef5fec60603221149p1d85214ekf0ee6d955e40b434@mail.gmail.com>

On 3/22/06, Alen Peacock <alenlpeacock@gmail.com> wrote:
> ...
> Of course, trust doesn't protect as well against nodes that behave
> kindly to all near neighbors, but maliciously to those more distant.
> To start to deal with that you'd need to extend from purely local
> trust to some sort of reputation system.

this is why i'm fond of highly structured CAN/Chord/Pastry/Kad
networks for small groups where you can implement good
reputation/identity and unstructured iterative unicast discovery for
very large groups (which can also integrate reputation but based on a
local view of your interactions with your peers over time).

i see these as complementary approaches with the efficiency of a
structured overlay appropriate for small groups of collaborators to
provide robust/resilient services to a larger body of peers.

regarding the attacks against keys/identifiers in the network, i was
always fond of the achord approach which constrains the lookup based
on an sha1 digest of the node's IP address.  this isn't perfect, but
it does provide some degree of protection against single nodes causing
lots of problems.  (you now need to use cooperative attacks from
multiple endpoints). see
http://thalassocracy.org/achord/achord-iptps.html
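
The general idea of tying an identifier to an address can be sketched as
below. This is not taken from the Achord paper: the packed-address
encoding and the function names id_from_ip and accept_peer are
assumptions made for illustration only.

```python
import hashlib
import ipaddress


def id_from_ip(ip: str) -> bytes:
    """SHA-1 digest of the packed IP address (encoding is an assumption)."""
    return hashlib.sha1(ipaddress.ip_address(ip).packed).digest()


def accept_peer(claimed_id: bytes, observed_ip: str) -> bool:
    """Accept a peer only if its claimed ID matches the address it connects from."""
    return claimed_id == id_from_ip(observed_ip)
```

A single host then gets one position in the ID space per address it
controls, which is why an attacker needs several endpoints to target a
chosen region.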

From dcarboni at gmail.com  Fri Mar 24 13:41:27 2006
From: dcarboni at gmail.com (Davide "dada" Carboni)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] WORKSHOP on DISTRIBUTED AGENT-BASED RETRIEVAL TOOLS
	(DART '06)
Message-ID: <71b79fa90603240541n57fd0a79h5c283e840c0924b3@mail.gmail.com>

[Apologies if you receive multiple copies]

WORKSHOP on DISTRIBUTED AGENT-BASED RETRIEVAL TOOLS (DART '06)

at the IEEE Symposium on Computers and Communications (ISCC '06)

Pula (Cagliari), Italy,  June 26, 2006
http://www.crs4.it/ict/wsdart/

CALL FOR PAPERS

The current arena in search engine applications is dominated by large
corporations that provide search services based on traditional
crawling/indexing techniques, running on powerful clusters. But new
needs, technologies and opportunities are steering research in this
field in different directions: mobility, ubiquitous access and
personalization, and P2P systems for both content sharing and
collaborative computation open new, exciting perspectives in the
landscape of search applications.

The proposed workshop aims to highlight the cutting edge of search
engine technology and applications. It seeks to foster information
exchange between researchers working in various fields, such as
distributed and pervasive systems, Semantic Web and Web services, and
mobile and personalized services and applications; to compare their
efforts against users' and industry's needs and expectations; and to
increase mutual awareness and cooperation between ongoing projects on
advanced search systems around the world.

Suggested topics of interest include, but are not restricted to:

Large Distributed Computing Systems
    * P2P for collaborative data intensive computation
    * Distributed Filesystems
    * (Semantic) Web Services and Applications
    * Privacy, Security & Accountability in P2P applications

Mobile and Ubiquitous Services and Applications
    * Location Based Services
    * Human Machine Interaction
    * Personalized Services

Multimedia Content Discovery and Distribution
    * Digital Rights Management
    * Intelligent Agents
    * Audio-visual indexing and retrieval
    * Multimedia content adaptation


IMPORTANT DATES

Extended Abstract Due		April 7, 2006
Notification of Acceptance	April 13, 2006
Camera-Ready Version Due	April 27, 2006
Workshop			June 26, 2006

WORKSHOP ORGANIZERS
The Workshop "The Future of Search Engines' Technologies" is jointly
organized by Tiscali, CRS4 and the University of Cagliari, which
participate in the project "Distributed Architecture for Semantic
Search and Personalized Content Distribution" funded by the Italian
Ministry of Research, as well as in several other projects funded by
the European Union and the industry.

Organizers
Giuliano Armano
University of Cagliari (Italy)

Alessandro Soro
CRS4 (Italy)
mailto:workshop_set@crs4.it?subject=Workshop_Information_Request


TECHNICAL PROGRAM COMMITTEE
Maurizio Agelli, CRS4 - Italy.
Giuliano Armano, University of Cagliari - Italy.
Ernesto Damiani, University of Milano - Italy.
Domenico Dato, Tiscali - Italy.
Gabriele Gianini, University of Milano - Italy.
Sylvain Giroux, University of Sherbrooke - Qc. Canada.
Andrea Manconi, University of Cagliari - Italy.
Michele Marchesi, University of Cagliari - Italy.
Claude Moulin, University of Technology of Compiegne - France.
Gavino Paddeu, CRS4 - Italy.
Jean-Christophe Pazzaglia, SAP Research - SAP Labs, France.
Sergej Sizov, University of Koblenz-Landau - Germany.
Alessandro Soro, CRS4 - Italy.
Ivana Turnu, University of Cagliari - Italy.
Eloisa Vargiu, University of Cagliari - Italy.

--
Prima il 30% poi Barbolomeo.
--
http://people.crs4.it/dcarboni

From coderman at gmail.com  Sun Mar 26 16:40:27 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] Fwd: RFA: hardware, wireless,
	defcon (request for assistance with project
	release/distribution/runtime at defcon 14)
In-Reply-To: <4ef5fec60603260837y743c6b2ar4570a26c5cc9406e@mail.gmail.com>
References: <4ef5fec60603260837y743c6b2ar4570a26c5cc9406e@mail.gmail.com>
Message-ID: <4ef5fec60603260840g37076ba6p49c6669f1eea5bfd@mail.gmail.com>

---------- Forwarded message ----------
From: coderman <coderman@gmail.com>
Date: Mar 26, 2006 8:37 AM
Subject: RFA: hardware, wireless, defcon (request for assistance with
project release/distribution/runtime at defcon 14)
To: cypherpunks@jfet.org


:: public request for help with janus wireless / open source project
at defcon 14 ::

if you will be at defcon 14 this august and have one or more of the
following and would be willing to help with an open source project
launch / test during the con please get in touch with me using
Off-The-Record or coordinate a meat space rendezvous via email -
coderman@gmail.

coderman42 on AIM :: OTR print A59CDCB3 46468A16 27D21678 270AF0B5 0B0477CF

my appreciation to anyone and everyone for their help; we will need it
(we are a very small group based in portland with limited resources
and time).

i will try to express my appreciation and reward your generosity in
some fashion.  please forward this to anyone with crypto clue who
might be interested and likely to participate.

desired and/or required:
-  VIA Nehemiah hardware and >128M of memory.  C5XL, C5P or C5J / C7  required.
-  slimline IDE or USB CDROM/DVDROM drives.

-  any x586/Pentium system with > 128M of ram and 8G or more free on
unformatted disk partition.

-  portable USB storage devices that can be formatted to XFS/iso image.

-  any system capable of burning single or dual layer DVD-R discs.

-  any wireless equipment that can support WPA/WPA2 EAP TLS w RADIUS
(enterprise mode)

-  any prism2, hermes, atheros, cisco, intel or other linux supported
wireless hardware in pcmcia/cardbus or mini-PCI/PCI formfactor.
200mW+ especially useful.

-  802.11 or other HAM/FHSS/DSSS/OFDM amplifiers in the 900MHz,
2.4GHz, and 5.8GHz bands (or other reasonable bands - HAM with
auth/no-privacy packet radio signalling?)

-  antennas / cables / filters / mounting systems / for any of the above bands.

-  audio/video recording and/or mastering equipment and knowledge.

-  home/work/edu internet bandwidth that can support and would be
available for the conference (or a subset) running a tor proxy and/or
bittorrent seeder.  traffic shaping and read-only boot/runtime is
supported if you use the live ISO cd for hosting a tor[rent] node.
please consider the potential security risks of running a tor node
reachable from a private defcon wireless network before agreeing to
this.  middle/relay only nodes would still be helpful.

- well CPU and memory endowed systems that you would make available to
a private IPsec/OpenVPN network for distributed build and test
services.


all hardware you want to keep is encouraged to stay in your possession
and a few hours or more would be helpful when contributing time/skills
at the conference.  you will need to meet me in person before or the
day of the conference.  the earlier the better.

thanks again,
  i look forward to meeting any of you in person and discussing this
project and code.

martin - janus wireless

coderman@gmail.com|peertech.org|charter.net|mindspring.com

<coderman> 'bastardized Leonard Cohen; the only quote you'll ever see
me tarnish so,'
---cut---

"It is not to tell you anything
But to live forever
That I write this.

...

This is the only code
I can write.
I am the only one
who has built it.
I didn't kill myself
When things went wrong
I didn't shirk difficult integrity,
  when the easy seduced me.
I learned to write
I learned to code
What might be named
On nights like this
By one like me.
"
---end-cut---

-- out of date and high level description of what this project is all about:


0.  Overview

Warning: this software is in early experimental stages and should be
used accordingly.   The Janus Wireless distribution provides a secure
environment for private group networking.   Please read the rest of
this document for a description of digital identity and group
networking features implemented in this release.


1.  Identity Management

The cornerstone of any secure system is the concept of digital
identity used to establish authenticated sessions and manage
resources.   The Janus Wireless software defines your identity with a
combination of passphrase and a USB memory stick.   Both of these
methods must be used together to authenticate you and should be
protected like you would protect keys to other valuable personal items
like a residence or vehicle.  It is very important that you understand
the security of your communications and data is dependent on the
security of your passphrase and USB memory stick.   Store these safely
and never use them on a computer where your passphrase may be captured
(key logger or shoulder surfer) or the USB memory copied.

Physically hardened tamper resistant and/or evident hardware tokens
may be used where needed for stronger authentication security.


2.  Boot Options

There are four different options to choose from when booting into a
secure operating system instance.   Each has a distinct purpose and
you must reboot your system when changing from one domain to another.
 This may seem a bit cumbersome at first but this step is required to
ensure the security of the operating system by initializing the
computer with a known configuration from the BIOS bootstrap upward.

       keys :  The first option presented is the secure key management
mode which handles creation, modification, and distribution of digital
identities and the cryptographic keys associated with them.  All
interaction with this domain occurs via the USB memory stick and other
storage devices to implement a logical "air gap" boundary between this
secure domain and others.  No network services or capabilities are
provided.

       live :  Live mode provides a client environment that can run
directly off of the disc used to boot the computer.  Network support
is provided for establishing virtual private network connections.

    install :  A permanent installation on encrypted hard disk can be
deployed with this mode.  Please note that full disk encryption across
all partitions is required.

        hdd :  Encrypted operating systems stored on disk can be
launched with this option.  Note that the USB key used to install the
encrypted OS is required to boot.  If you lose this key or it becomes
corrupted, all data on disk will be lost.


3.  Getting Started

Reboot into the 'keys' mode with a USB memory stick inserted to begin
creating user and resource identities.  Any live or hdd configuration
options can be defined at this point as well.


4.  Additional Information

Invoke the 'about' command and select the desired topic for additional
information on using this software and other common questions.

Press the <Ctrl> <Alt> <Delete> keys or invoke the 'reboot' command to
restart the system and enter a different bootstrap target.

f9e6efb5-0374f333-978717d5-9194321e-67215b35-1c1b3106-1496b640-690342ed

gpg --print-md sha512 janus-wireless-pub.txt
/etc/janus/keymgr/public/janus-wireless-pub.txt:
E93E70B4 B457EB34 298C7A00 32CB5FE3 832DBC69 F894E747 F1C86D5F 454B9595 C2CC5C80
 4CFBB105 8639C0A3 A442424F 0CF932F6 AFA8CCD0 25E6FA02 9CEC860C
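
For anyone without gpg at hand, a digest in the same grouped style can
be computed with a short script. This is a sketch only: the function
name sha512_fingerprint is hypothetical, and the eight-character
grouping simply mirrors the listing above rather than any documented
output format.

```python
import hashlib


def sha512_fingerprint(path: str, group: int = 8) -> str:
    """Return the SHA-512 of a file as uppercase hex in space-separated groups."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    hexd = h.hexdigest().upper()
    return " ".join(hexd[i:i + group] for i in range(0, len(hexd), group))
```

Comparing the result against the published value (ignoring line breaks)
checks that the key file was not altered in transit.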

From coderman at gmail.com  Sun Mar 26 17:46:17 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] guidelines for good password policy and maintenance /
	user centric identity with single passwords (or a small
	number at most over time)
Message-ID: <4ef5fec60603260946j7adfa545gd4f70d6c2e4ec3a9@mail.gmail.com>

comments?

Creating a secure password:

    o Include punctuation marks and numbers.
    o Mix capital, lowercase and space characters.
    o Create a unique acronym.
    o Short passwords should be 8 chars at least.

Weaknesses to avoid:

    o Don't use a password that is listed as an example or public.
    o Don't use a password you have been using for years.
    o Don't use a password someone else has seen you type.
    o Don't use a password that contains personal information.
    o Don't use words or acronyms that can be found in a dictionary.
    o Don't use keyboard patterns (qwerty) or sequential numbers.
    o Don't use repeating characters (aa11).

Keep your password secure:

    o Never tell your password to anyone or use it where they can observe it.
    o Never send your password by email or speak it where others may hear.
    o Occasionally verify your current password and change it to a new one.
    o Avoid writing your password down.  (Keep it with you in a purse
or wallet if you have to write down the password until you remember
it.)
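
Several of the rules above are mechanical enough to check in code. The
sketch below covers only those (length, character classes, repeats,
simple keyboard runs); the function name password_problems and the
short KEYBOARD_RUNS list are illustrative assumptions, and rules such as
"personal information" or "found in a dictionary" need external data
that is not modelled here.

```python
import string

# Illustrative, not exhaustive.
KEYBOARD_RUNS = ("qwerty", "asdf", "zxcv", "1234", "abcd")


def password_problems(pw: str, min_len: int = 8) -> list[str]:
    """Return the list of guideline violations found in pw (empty if none)."""
    problems = []
    if len(pw) < min_len:
        problems.append(f"shorter than {min_len} characters")
    if not any(c in string.punctuation for c in pw):
        problems.append("no punctuation")
    if not any(c.isdigit() for c in pw):
        problems.append("no digits")
    if not (any(c.islower() for c in pw) and any(c.isupper() for c in pw)):
        problems.append("not mixed case")
    if any(a == b for a, b in zip(pw, pw[1:])):
        problems.append("repeating characters")
    lowered = pw.lower()
    if any(run in lowered for run in KEYBOARD_RUNS):
        problems.append("keyboard pattern or sequence")
    return problems
```

Returning the reasons rather than a yes/no lets an interface tell the
user what to change instead of just rejecting the input.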

---

High assurance passwords / exotic threat model interactive auth: use
challenge response for single use Key Encryption Keys containing a
minimum of 128 bits of entropy in a full SHA-512 derived key.  exotic
threat model implies full process for physical, emission,
cryptographic and user interface security.  (i.e. expert level
security infrastructure and flawless identity management).

ideally this would be coupled with a personal vascular scan biometric
device (user centric with vascular auth challenge to open/sign
hardened internal secrets)

the odds of such a device being designed, produced and verified in an
open and full disclosure manner are not high. :P

From coderman at gmail.com  Sun Mar 26 18:50:17 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] Re: [Full-disclosure] guidelines for good password
	policy and maintenance / user centric identity with single
	passwords (or a small number at most over time)
In-Reply-To: <4426D76B.2020505@maginetworks.com>
References: <4ef5fec60603260946j7adfa545gd4f70d6c2e4ec3a9@mail.gmail.com>
	<4426D76B.2020505@maginetworks.com>
Message-ID: <4ef5fec60603261050l4a4536b9o2d955de280adb728@mail.gmail.com>

On 3/26/06, J. Theriault <administrator@maginetworks.com> wrote:
> ...
> Why not just encourage your users to use a "passphrase" instead of a
> "password", such as using a (with proper grammar) book/movie quote or
> phrase?

excessive typing == unnecessary leaked information and longer auth
process (acoustic, profiling, easier pattern discovery, etc.)

i don't have a problem supporting a passphrase mode (>16 chars?  >32?)
but i'd rather not make it the default.

(and the default is and must be the most secure and usable path for
this to be trustworthy and widely usable)

From enzomich at gmail.com  Mon Mar 27 09:16:10 2006
From: enzomich at gmail.com (Enzo Michelangeli)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] guidelines for good password policy and maintenance
	/user centric identity with single passwords (or a smallnumber
	at most over time)
References: <4ef5fec60603260946j7adfa545gd4f70d6c2e4ec3a9@mail.gmail.com>
Message-ID: <019101c6517f$2b9a5970$0200a8c0@EMNB>

----- Original Message ----- 
From: "coderman" <coderman@gmail.com>
Sent: Monday, March 27, 2006 1:46 AM

> comments?
>
> Creating a secure password:
>
>     o Include punctuation marks and numbers.
>     o Mix capital, lowercase and space characters.

I would avoid mixed-case passwords: the extra bit of information per
character is a small reward for the increase in difficulty to remember the
position of the dang lower- and uppercase characters... Better just add a
few characters.
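
The trade-off can be put in numbers. The sketch below assumes every
character is chosen uniformly at random from the alphabet, which
overstates the strength of human-chosen passwords; the helper names
bits and length_for are made up for the example.

```python
import math


def bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random string."""
    return length * math.log2(alphabet_size)


def length_for(alphabet_size: int, target_bits: float) -> int:
    """Shortest length that reaches target_bits with the given alphabet."""
    return math.ceil(target_bits / math.log2(alphabet_size))


mixed = bits(52, 8)             # 8 letters, upper and lower case
single = bits(26, 8)            # 8 letters, one case only
needed = length_for(26, mixed)  # single-case length with at least the same entropy
```

Mixing case adds exactly one bit per letter, so an 8-letter mixed-case
password gains 8 bits, and a single-case password of 10 letters already
matches it.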

Enzo

P.S. I don't add the obvious: almost nothing can help if the password is
intended for Windows login ;-) (see www.loginrecovery.com )


From alenlpeacock at gmail.com  Mon Mar 27 16:21:32 2006
From: alenlpeacock at gmail.com (Alen Peacock)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] guidelines for good password policy and maintenance
	/ user centric identity with single passwords (or a small
	number at most over time)
In-Reply-To: <4ef5fec60603260946j7adfa545gd4f70d6c2e4ec3a9@mail.gmail.com>
References: <4ef5fec60603260946j7adfa545gd4f70d6c2e4ec3a9@mail.gmail.com>
Message-ID: <ffe450f90603270821t64fe66a9i2eeac3d9ed0f3a0d@mail.gmail.com>

Chapter 7, "The Memorability and Security of Passwords" in O'Reilly's
Security and Usability: Designing Secure Systems that People Can Use
(http://www.amazon.com/gp/product/0596008279/sr=8-1/qid=1143475157/ref=pd_bbs_1/103-1120557-9567045?%5Fencoding=UTF8)
might be of interest to you.

The overarching theme of the book is that theoretically secure systems
with usability problems end up being neither secure (because users
subvert them) nor usable.  Some findings from Chap 7 include the fact
that a significant number of users did not comply with instructions
for password generation: "Theoretical analysis does not guarantee the
security of systems.  It is often necessary to study systems as they
are used in practice," and "Rigorous experimental testing of interface
usability is one of the necessary ingredients for robust secure
systems."

The authors suggest mnemonic-based passwords (generated from
passphrases) as one alternative that was both usable and which had
nearly as much resistance to brute-force crackers as did completely
random passwords.
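
One common way to build such a mnemonic password is to take the first
character of each word and keep the punctuation. The sketch below shows
that construction only; it is not the procedure used in the book's
study, and the function name mnemonic_password is hypothetical.

```python
def mnemonic_password(phrase: str) -> str:
    """First character of each word, plus any punctuation ending the word."""
    out = []
    for word in phrase.split():
        out.append(word[0])
        if len(word) > 1 and not word[-1].isalnum():
            out.append(word[-1])  # keep trailing punctuation such as ',' or '!'
    return "".join(out)
```

For example, "My sister Peg is 24 years old, and lives in Ohio!" becomes
"MsPi2yo,aliO!", which mixes case, digits and punctuation while staying
tied to a sentence the user can remember.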

Chapter 6 also provides some interesting criteria for evaluating
authentication mechanisms.

Cheers,
Alen

(disclaimer: although I contributed to one of the chapters in the
book, I don't get a dime from sales.  I think it is full of great
insight, though, as do the customer reviews at Amazon)

From coderman at gmail.com  Mon Mar 27 17:10:16 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] guidelines for good password policy and maintenance
	/ user centric identity with single passwords (or a small
	number at most over time)
In-Reply-To: <ffe450f90603270821t64fe66a9i2eeac3d9ed0f3a0d@mail.gmail.com>
References: <4ef5fec60603260946j7adfa545gd4f70d6c2e4ec3a9@mail.gmail.com>
	<ffe450f90603270821t64fe66a9i2eeac3d9ed0f3a0d@mail.gmail.com>
Message-ID: <4ef5fec60603270910l1d908f72n36581d8493bac356@mail.gmail.com>

On 3/27/06, Alen Peacock <alenlpeacock@gmail.com> wrote:
> ...
> The overarching theme of the book is that theoretically secure systems
> with usability problems end up being neither secure (because users
> subvert them) nor usable.

very true.


>  Some findings from Chap 7 include the fact
> that a significant number of users did not comply with instructions
> for password generation

it is my personal hunch that if users had just one password they
needed to remember they could remember a good one.  the janus stuff we
are working on uses loop-aes volumes specifically so you can store
passwords in a browser, store capability URL's, keep accounts and
logins in a text file, etc.

[i'd love to know of any studies to this end though.  i have tried
experiments to see just how much entropy i can commit to memory and it
is more than enough for a good interactive authentication.  i think
this is within the ability of most, if they had a desire to do so and
understood the benefit.]

so the goal is to provide a usable system with a single password, and
make it user centric, so that all the other credentials and secrets
associated with other digital identities can benefit from this bootstrap
(and presumably share this more secure bootstrap).

From fxcabral at yahoo.com.br  Mon Mar 27 17:31:34 2006
From: fxcabral at yahoo.com.br (=?ISO-8859-1?Q?Fabr=EDcio?= Barros Cabral)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] How do P2P file sharing networks statistics?
Message-ID: <1143480694.4364.14.camel@hades.no-ip.org>

Hello everybody!

I'm looking for usage statistics on P2P file sharing networks, for
example the number of users, the number of shared files, and so on. I
found Slyck's P2P Network Stats Page [1], and I have some questions
about its measurements:

1) How are these statistics gathered? With a special program, or just
by traffic analysis?

2) Are these statistics correct? If not, what's the margin of error?

3) Is it possible to measure the Bittorrent network? If not, why?

4) Is it possible to measure the number of shared files? If not, why?

5) Does anyone have suggestions on how to collect usage statistics for
P2P file sharing networks? The method may be better or worse than
Slyck's.

Thanks in advance,

--fx


[1] http://www.slyck.com/stats.php



From dcarboni at gmail.com  Mon Mar 27 19:31:52 2006
From: dcarboni at gmail.com (Davide "dada" Carboni)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] ANN: Tutorial on line
Message-ID: <71b79fa90603271131v4ae85580j95f012de91fb7d89@mail.gmail.com>

Hi, I'm preparing a 45-minute tutorial that introduces P2P systems. It
focuses mainly on gnutella and kademlia. You can find it at:

http://p2p-mentor.berlios.de/

Any comment is welcome.
Bye

--
Prima il 30% poi Barbolomeo.
--
http://people.crs4.it/dcarboni

From john.casey at gmail.com  Mon Mar 27 20:04:14 2006
From: john.casey at gmail.com (John Casey)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] ANN: Tutorial on line
In-Reply-To: <71b79fa90603271131v4ae85580j95f012de91fb7d89@mail.gmail.com>
References: <71b79fa90603271131v4ae85580j95f012de91fb7d89@mail.gmail.com>
Message-ID: <be7f17170603271204w7ad7d14do3642b5335ea5011a@mail.gmail.com>

OK. It looks interesting, but it appears to be a set of PowerPoint
slides printed out to gif/jpeg. Why don't you print them out to
something like PDF format? Also, the "Page 1" etc. under the index page
doesn't really give a good first impression. Just my quick 5-minute
appraisal.

On 3/28/06, Davide dada Carboni <dcarboni@gmail.com> wrote:
> Hi, I'm preparing a 45-minute tutorial that introduces P2P systems. It
> focuses mainly on gnutella and kademlia. You can find it at:
>
> http://p2p-mentor.berlios.de/
>
> Any comment is welcome.
> Bye
>
> --
> Prima il 30% poi Barbolomeo.
> --
> http://people.crs4.it/dcarboni
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers@zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
>

From mfreed at cs.nyu.edu  Mon Mar 27 20:19:32 2006
From: mfreed at cs.nyu.edu (Michael J Freedman)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] guidelines for good password policy and maintenance
	/ user centric identity with single passwords (or a small number at
	most over time)
In-Reply-To: <4ef5fec60603270910l1d908f72n36581d8493bac356@mail.gmail.com>
References: <4ef5fec60603260946j7adfa545gd4f70d6c2e4ec3a9@mail.gmail.com>
	<ffe450f90603270821t64fe66a9i2eeac3d9ed0f3a0d@mail.gmail.com>
	<4ef5fec60603270910l1d908f72n36581d8493bac356@mail.gmail.com>
Message-ID: <Pine.BSO.4.62.0603271513490.6047@ludlow.scs.cs.nyu.edu>

> it is my personal hunch that if users had just one password they
> needed to remember they could remember a good one.  the janus stuff we

This approach is certainly commonly used by people for usability. 
However, the problem is that the best security you get is that 
provided by the weakest site (i.e., the weakest-link-in-the-chain analogy).

As an example, let's say that you use the same password to login to an 
online banking site (which really cares about security) and some 
random-dating site (which stores all unencrypted passwords in a big 
plaintext file on a rootable machine).  An adversary trying to break-in to 
your bank account doesn't need to subvert the security of the bank site: 
He just needs to break into the dating site.  No matter how many bits of 
entropy your password has, you lose.

As a solution developed precisely for this problem, you should check out 
the pwdhash extension for browsers:

   http://crypto.stanford.edu/PwdHash/
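
The underlying idea, one master password turned into a different
password per site, can be sketched as follows. This is not PwdHash's
actual construction or encoding: HMAC-SHA256, the base64 truncation and
the name site_password are assumptions chosen to keep the example short.

```python
import base64
import hashlib
import hmac


def site_password(master: str, domain: str, length: int = 12) -> str:
    """Derive a per-site password from a master password and the site's domain."""
    digest = hmac.new(master.encode(), domain.lower().encode(), hashlib.sha256).digest()
    return base64.b64encode(digest).decode()[:length]
```

With this, the dating site only ever sees a value derived for its own
domain, so a leak there does not hand over the string the bank expects.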

Enjoy,
--mike


-----
www.michaelfreedman.org                              www.coralcdn.org

From dcarboni at gmail.com  Mon Mar 27 20:26:29 2006
From: dcarboni at gmail.com (Davide "dada" Carboni)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] ANN: Tutorial on line
In-Reply-To: <be7f17170603271204w7ad7d14do3642b5335ea5011a@mail.gmail.com>
References: <71b79fa90603271131v4ae85580j95f012de91fb7d89@mail.gmail.com>
	<be7f17170603271204w7ad7d14do3642b5335ea5011a@mail.gmail.com>
Message-ID: <71b79fa90603271226j3b69752m8731da3a53d8968f@mail.gmail.com>

On 3/27/06, John Casey <john.casey@gmail.com> wrote:
> OK. It looks interesting, but it appears to be a set of PowerPoint
> slides printed out to gif/jpeg. Why don't you print them out to
> something like PDF format? Also, the "Page 1" etc. under the index page
> doesn't really give a good first impression. Just my quick 5-minute
> appraisal.

Thank you. I chose the OOImpress export to HTML so as to keep the
notes as well (the bottom frame, below the slide). But I'm considering
publishing a PDF export too.

Bye.

--
Prima il 30% poi Barbolomeo.
--
http://people.crs4.it/dcarboni

From jacob at mungo.dk  Mon Mar 27 20:25:59 2006
From: jacob at mungo.dk (Jacob Madsen)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] Comparing Chord and Kademlia
Message-ID: <44284A57.50004@mungo.dk>

Hey

I'm comparing Chord and Kademlia in a small project of mine, and I'm
trying to find names of actual applications where these two DHTs are
being used, so I can tell a small story about them in the intro to the
project. So far I've found that Kademlia is implemented in Overnet,
Azureus and the official Bittorrent client, but I haven't yet found any
applications where Chord is implemented.

If some of you could spare some time and tell me of applications where
they are being used, I would appreciate it a lot!

Thanks!

From coderman at gmail.com  Mon Mar 27 22:04:55 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] guidelines for good password policy and maintenance
	/ user centric identity with single passwords (or a small
	number at most over time)
In-Reply-To: <Pine.BSO.4.62.0603271513490.6047@ludlow.scs.cs.nyu.edu>
References: <4ef5fec60603260946j7adfa545gd4f70d6c2e4ec3a9@mail.gmail.com>
	<ffe450f90603270821t64fe66a9i2eeac3d9ed0f3a0d@mail.gmail.com>
	<4ef5fec60603270910l1d908f72n36581d8493bac356@mail.gmail.com>
	<Pine.BSO.4.62.0603271513490.6047@ludlow.scs.cs.nyu.edu>
Message-ID: <4ef5fec60603271404g18a7ab75h681cfac70b00acc2@mail.gmail.com>

On 3/27/06, Michael J Freedman <mfreed@cs.nyu.edu> wrote:
> ...
> This approach is certainly commonly used by people for usability.
> However, the problem is that the best security you get is that
> provided by the weakest site (i.e., the weakest-link-in-the-chain analogy).

true; which is why i'd like to see them use a single good password to
mount an encrypted volume and secure OS where the rest of the
(different*) passwords and PIN's and whatever else are kept.


> As a solution developed precisely for this problem, you should check out
> the pwdhash extension for browsers:
>
>    http://crypto.stanford.edu/PwdHash/

this is a handy utility!

i'd still be concerned about dictionary attacks on poor passwords
(that is, discovering '.848fe29s44j' is the hash for pwned.com and
'secret'.)  secure digests make this more expensive but not by much.
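
The concern can be made concrete with a toy derivation and a wordlist
loop. Everything here is an assumption for illustration: derive is a
stand-in built on PBKDF2-HMAC-SHA256 with the domain as salt, not the
PwdHash function, and the iterations parameter is included only to show
where the attacker's cost per guess comes from.

```python
import hashlib


def derive(master: str, domain: str, iterations: int = 1) -> str:
    """Toy per-site derivation: PBKDF2 with the domain used as salt."""
    raw = hashlib.pbkdf2_hmac("sha256", master.encode(), domain.encode(),
                              iterations, dklen=9)
    return raw.hex()


def dictionary_attack(observed: str, domain: str, wordlist, iterations: int = 1):
    """Return the first wordlist entry that derives to the observed value, if any."""
    for guess in wordlist:
        if derive(guess, domain, iterations) == observed:
            return guess
    return None
```

Raising iterations multiplies the work per guess for attacker and user
alike, so it slows the search but cannot rescue a master password that
sits near the top of a wordlist.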

* are you aware of any utility for the browser that generates random
passwords?  i'd like something like this as well, with the idea that
the first time you visit the site (or need to change a password) a
random password is generated, placed in the input text field, and then
the browser password manager remembers it after that point.  (and the
password db is stored on an encrypted file system to prevent theft).
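
The generation step itself is small. The sketch below is plain Python
rather than a browser extension, and the alphabet and the name
random_password are assumptions; some sites reject parts of the
punctuation set, so a real tool would need a per-site alphabet.

```python
import secrets
import string

# Assumed alphabet: letters, digits and ASCII punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length: int = 16) -> str:
    """Generate a password from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

Since the user never has to remember the result, length and alphabet
can be as large as the site allows; the password manager on the
encrypted volume does the remembering.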

someone will ask about users who aren't on their machine and need to
access a site.  i don't like to support this ability because you
should never be using an untrusted computer to access a secure site. 
if the computer is trusted you should also be able to boot from CD and
insert your USB storage key (which lets you use your browser password
manager).

(actually, looking at the source for PwdHash it appears easy enough to
modify for random password generation)

thanks for the tip,

From mgp at ucla.edu  Mon Mar 27 23:51:01 2006
From: mgp at ucla.edu (Michael Parker)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] Comparing Chord and Kademlia
In-Reply-To: <44284A57.50004@mungo.dk>
References: <44284A57.50004@mungo.dk>
Message-ID: <20060327155101.1sjtgovi0wc04wwg@mail.ucla.edu>

I don't know of any applications outside of academia, but from what I 
understand i3 is a fully-deployed (and interesting) application of 
Chord:

http://i3.cs.berkeley.edu/

- Mike


Quoting Jacob Madsen <jacob@mungo.dk>:

> Hey
>
> I'm comparing Chord and Kademlia in a small project of mine, and i'm
> trying to find names of actual applications where these 2 DHTs are being
> used, so i can tell a small story about them in the intro to the project.
> So far i've found out Kademlia is implemented in Overnet, Azureus and
> the official Bittorrent client, but i haven't yet found any applications
> where Chord is implemented.
>
> If some of you could spare some time and tell me of applications where
> they are being used, i would appreciate it a lot!
>
> Thanks!
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers@zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
>


From enzomich at gmail.com  Tue Mar 28 02:42:50 2006
From: enzomich at gmail.com (Enzo Michelangeli)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] Comparing Chord and Kademlia
References: <44284A57.50004@mungo.dk>
Message-ID: <03dd01c65211$74321a00$0200a8c0@EMNB>

----- Original Message ----- 
From: "Jacob Madsen" <jacob@mungo.dk>
To: "Peer-to-peer development." <p2p-hackers@zgp.org>
Sent: Tuesday, March 28, 2006 4:25 AM
Subject: [p2p-hackers] Comparing Chord and Kademlia

> Hey
>
> I'm comparing Chord and Kademlia in a small project of mine, and i'm
> trying to find names of actual applications where these 2 DHTs are being
> used, so i can tell a small story about them in the intro to the
> project.
> So far i've found out Kademlia is implemented in Overnet,

...and eMule (in a non-interoperable way, but with larger user base)

> Azureus and the official Bittorrent client

Talking about which: does anybody have figures on the approximate size (in
terms of average number of NATted nodes and supernodes) of the DHT's in
BitComet/uTorrent/BitTorrent, and the non-interoperable one in Azureus?

Enzo


From dbarrett at quinthar.com  Tue Mar 28 04:36:15 2006
From: dbarrett at quinthar.com (David Barrett)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] guidelines for good password policy and
	maintenance/ user centric identity with single passwords (or a
	smallnumber at most over time)
In-Reply-To: <4ef5fec60603271404g18a7ab75h681cfac70b00acc2@mail.gmail.com>
Message-ID: <20060328043621.981C53FC44@capsicum.zgp.org>

> -----Original Message-----
> From: coderman
> Sent: Monday, March 27, 2006 2:05 PM
> 
> On 3/27/06, Michael J Freedman <mfreed@cs.nyu.edu> wrote:
> > ...
> > This approach is certainly commonly done by people for useability.
> > However, the problem is that the best security you get is that of security
> > provided by the weakest site (i.e., the weakest link the chain analogy).
> 
> true; which is why i'd like to see them use a single good password to
> mount an encrypted volume and secure OS where the rest of the
> (different*) passwords and PIN's and whatever else are kept.

What are your thoughts on using PKI?

For example, create private keys (with no passwords) and put them in an
encrypted volume.  Then use one strong password to unlock your encrypted
volume (and thus, unlock your private keys), and then SSH to everywhere else
securely.  Thus a user need only remember one password to get access to all
servers.  (And you can individually grant or revoke access to servers by
adding/removing the corresponding public key.)

Win32 has 'TrueCrypt', which has a nice feature of auto-unmounting the
encrypted volume on suspend/hibernate.  Thus even if your laptop gets stolen
while hibernated, the private keys aren't compromised.  And if your laptop
is configured to suspend on the screen closing, they'd need to steal your
laptop from you, while it's running, and begin hacking on it before closing
the screen.

(And in the time someone can mount an offline attack, you can remove the
user's corresponding public keys from the servers.)

-david


From coderman at gmail.com  Tue Mar 28 09:06:43 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] guidelines for good password policy and
	maintenance/ user centric identity with single passwords (or a
	smallnumber at most over time)
In-Reply-To: <20060328043621.981C53FC44@capsicum.zgp.org>
References: <4ef5fec60603271404g18a7ab75h681cfac70b00acc2@mail.gmail.com>
	<20060328043621.981C53FC44@capsicum.zgp.org>
Message-ID: <4ef5fec60603280106na9f2184pc6e4256b66e9e443@mail.gmail.com>

On 3/27/06, David Barrett <dbarrett@quinthar.com> wrote:
> ...
> What are your thoughts on using PKI?

fine as long as trust and identity are properly implemented. 
physically hardened tokens are very good (ex: the rsa challenge / pin
based token authenticator via radius)

SPEKE and variants are also highly recommended in my book if you can
use them in a secure context (that is, no rootkits and equivalents to
capture passwords/phrases - a situation where single use passwords /
bingo auth are helpful if secure hardware tokens are not feasible)


> For example, create private keys (with no passwords) and put them in an
> encrypted volume.  Then use one strong password to unlock your encrypted
> volume (and thus, unlock your private keys), and then SSH to everywhere else
> securely.

this works very well, and if you have hardware accelerated encryption
it can be transparent.  you can also pre distribute keys (public and
secret) to the encrypted volumes you mount and run within (via a
secure bootstrap of course...)
[ see http://www.via.com.tw/en/initiatives/padlock/hardware.jsp ]

i think this is a rich field of discovery when considering the user
interface and authentication / session aspects of a secure system.

best regards,

From jinz at mail.ustc.edu.cn  Tue Mar 28 09:28:23 2006
From: jinz at mail.ustc.edu.cn (张进)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] lockstep synchronization protocol problem
Message-ID: <343538103.05175@ustc.edu.cn>

I'm doing research on the synchronization problem in P2P systems. The basic
synchronization protocol is the lockstep protocol, which uses rounds to
synchronize all the peers' movements. My question is: does lockstep only
synchronize the peers' movements? What about the events created by the peers?
Can rounds be used to synchronize them as well, and if so, how?


From coderman at gmail.com  Tue Mar 28 09:49:17 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] lockstep synchronization protocol problem
In-Reply-To: <343538103.05175@ustc.edu.cn>
References: <343538103.05175@ustc.edu.cn>
Message-ID: <4ef5fec60603280149s1ca73327j639244acb4d11032@mail.gmail.com>

On 3/28/06, 张进 <jinz@mail.ustc.edu.cn> wrote:
> I'm doing research about synchronization problem in P2P system,and the basic
> synchronization protocol is the lockstep protocol,and it use rounds to synchronize
> all the peer's movements,the problem is lockstep only synchronize peer's
> movements?what about the event created by all the peers?can it use rounds to
> synchronize them?and how to ?

look at using a quorum based key distribution and agreement protocol
(where quorum == a specific subset of group key management) with
regular attestation / rekeying via trusted and strongly authenticated
mechanisms.  session timeout (for failure / lack of consensus /
malicious attack) should be detected within an appropriate time frame
for the user to respond securely.  (i tend to think 60 seconds is an
acceptable window)

doing this in a user friendly manner is very difficult and probably
the reason prior work in this domain is scarce.

From coderman at gmail.com  Tue Mar 28 10:24:58 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: Fwd: [p2p-hackers] lockstep synchronization protocol problem
In-Reply-To: <343540055.27718@ustc.edu.cn>
References: <343540055.27718@ustc.edu.cn>
Message-ID: <4ef5fec60603280224q5de3742cw7becaf88f3d881e8@mail.gmail.com>

---------- Forwarded message ----------
From: 张进 <jinz@mail.ustc.edu.cn>
Date: Mar 28, 2006 2:00 AM
Subject: Re: [p2p-hackers] lockstep synchronization protocol problem
To: coderman@gmail.com


can you introduce some papers to read about what you have said?I can't get your
meaning,but thank you.how is it related to the synchronization problem?


i am about to go offline for the night; here are a few off the top of
my head that are relevant.  i can post more later this week and others
on this list will likely have input.

group key distribution:
Efficient Self-Healing Group Key Distribution with Revocation Capability (2003)
http://citeseer.ist.psu.edu/623802.html

group reputation / trust metrics:
www.levien.com/thesis/compact.pdf

quorums and usability are more complicated and i don't have any links off hand.

best regards,

P.S.  please reply with any additional research / results if you
encounter them...


In your message you wrote:
>From: coderman <coderman@gmail.com>
>Reply-To:
>To: "张进" <jinz@mail.ustc.edu.cn>,
  "Peer-to-peer development." <p2p-hackers@zgp.org>
>Subject: Re: [p2p-hackers] lockstep synchronization protocol problem
>Date:Tue, 28 Mar 2006 01:49:17 -0800
>
>On 3/28/06, 张进 <jinz@mail.ustc.edu.cn> wrote:
>> I'm doing research about synchronization problem in P2P system,and the basic
>> synchronization protocol is the lockstep protocol,and it use rounds to
synchronize
>> all the peer's movements,the problem is lockstep only synchronize peer's
>> movements?what about the event created by all the peers?can it use rounds to
>> synchronize them?and how to ?
>
>look at using a quorum based key distribution and agreement protocol
>(where quorum == a specific subset of group key management) with
>regular attestation / rekeying via trusted and strongly authenticated
>mechanisms.  session timeout (for failure / lack of consensus /
>malicious attack) should be detected within an appropriate time frame
>for the user to respond securely.  (i tend to think 60 seconds is an
>acceptable window)
>
>doing this in a user friendly manner is very difficult and probably
>the reason prior work in this domain is scarce.

From coderman at gmail.com  Tue Mar 28 11:06:23 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] Fwd: off the record howto / best practices
In-Reply-To: <4ef5fec60603280239o38594f0flf2a1fc2137a6e5d1@mail.gmail.com>
References: <4ef5fec60603280239o38594f0flf2a1fc2137a6e5d1@mail.gmail.com>
Message-ID: <4ef5fec60603280306r4ec2906bhb86f579c504a6f25@mail.gmail.com>

---------- Forwarded message ----------
From: coderman <coderman@gmail.com>
Date: Mar 28, 2006 2:39 AM
Subject: off the record howto / best practices
To: cypherpunks@jfet.org


verify fingerprint:
"Buddy List"
   -> Tools Menu Options
      -> Preferences Menu Option
         -> Plugins Menu Option
            -> Select Username from Known Fingerprints
               -> Press "Verify fingerprint" action
                  -> VIEW FINGERPRINT AND APPROVE/REJECT IF EXPECTED
---
Fingerprint for you, coderman42 (AIM/ICQ):
A59CDCB3 46468A16 27D21678 270AF0B5 0B0477CF

Purported fingerprint for anonymous:
0B0477CF 270AF0B5 27D21678 46468A16 A59CDCB3
---
                     -> Select "i have" verified action only if expected is true
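
The approve/reject decision above boils down to comparing the fingerprint the client shows against one obtained over a separate trusted channel. A minimal Python sketch of that comparison follows. The out-of-band value here is a hypothetical stand-in, and `normalize` / `fingerprints_match` are illustrative names, not part of gaim or the OTR plugin.

```python
import hmac


def normalize(fingerprint: str) -> str:
    """Drop whitespace and case so formatting differences do not matter."""
    return "".join(fingerprint.split()).upper()


def fingerprints_match(expected: str, shown: str) -> bool:
    """Approve only when every hex digit agrees."""
    return hmac.compare_digest(normalize(expected), normalize(shown))


# Value shown by the client (taken from the howto above).
shown = "0B0477CF 270AF0B5 27D21678 46468A16 A59CDCB3"
# Hypothetical value read to you over the phone or in person.
out_of_band = "0b0477cf270af0b527d2167846468a16a59cdcb3"

print(fingerprints_match(out_of_band, shown))                # True
print(fingerprints_match(out_of_band, shown[:-1] + "4"))     # False
```

A partial match is a reject: all 40 hex digits have to agree.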

using verified otr credentials:
  -> Select "OTR: Not Private" image button at lower right corner if
secure channel is down
     -> Verify "OTR: Private" image button at lower right corner before chat


example of a failed key agreement:

the "OTR: Unverified" image, which would indicate that an initial OTR
exchange had taken place, was never shown at the lower right corner of
the window.

---cut---
(02:12:01) anonymous: hi code
(02:12:05) Attempting to start a private conversation with anonymous...
(02:12:11) coderman42: hello
(02:12:14) coderman42: do you have OTR?
(02:12:38) anonymous: hold on
(02:12:51) coderman42: k
(02:13:39) anonymous: *** Encrypted with the Gaim-Encryption plugin
            02:17
(02:18:16) coderman42: sorry, no worky for you.
(02:18:19) coderman42: try again
(02:18:29) coderman42: what client / OS are you using?
(02:18:38) coderman42: i recommend a unix like system with gaim
(02:19:46) anonymous: i am on gaim and i was useing gaim encrypt
(02:21:47) Attempting to start a private conversation with anonymous...
(02:22:00) coderman42: maybe you were; it is not working currently.
            02:22
(02:22:33) anonymous logged out.
(02:23:04) anonymous logged in.
(02:23:10) coderman42: wb
(02:23:12) Attempting to start a private conversation with anonymous...
(02:23:23) coderman42: (02:23:12) Attempting to start a private
conversation with anonymous...
(02:23:27) coderman42: waiting ...
(02:23:44) coderman42: brb
(02:26:58) anonymous: *** Encrypted with the Gaim-Encryption plugin
            02:27
(02:28:45) anonymous logged out.
---end-cut--

remember to protect your keys.

From m.leslie1 at physics.ox.ac.uk  Tue Mar 28 13:46:27 2006
From: m.leslie1 at physics.ox.ac.uk (Matthew Leslie)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] Symmetric Replication for Chord
Message-ID: <97A92A773C8D30458210542B6523B41001063BE3@exchng3.physics.ox.ac.uk>

Hi Michael,

I would suggest that replication factors which do not divide the
keyspace are not going to cause problems, simply make sure you round any
fractional numbers of keys in a consistent manner. A greater issue might
be replication factors which do not divide the number of nodes, as this
will lead to an uneven distribution of replicas over nodes.  This is an
issue I have run into myself - perhaps this is what you meant? I don't
have any useful suggestions on this myself, yet.
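
A minimal Python sketch of what "round consistently" could look like, assuming replicas of key k sit at k + j*N/f (mod N) for j = 0..f-1 and that every node floors the fractional offset. The function name is hypothetical and this is not taken from any particular Chord implementation.

```python
def replica_keys(key: int, keyspace: int, factor: int) -> list[int]:
    """Replica identifiers for `key`; the offset j*keyspace/factor is floored."""
    return [(key + (j * keyspace) // factor) % keyspace for j in range(factor)]


print(replica_keys(0, 12, 3))  # [0, 4, 8]   factor divides the keyspace
print(replica_keys(0, 11, 3))  # [0, 3, 7]   offsets floored
print(replica_keys(3, 11, 3))  # [3, 6, 10]
```

One consequence visible in the output: with flooring, the replica sets are no longer closed equivalence classes (key 3 is a replica of key 0, but the set for 3 is not the set for 0), so a lookup should always start from the original key rather than from an arbitrary replica.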

The practicalities of replication in Chord quickly become somewhat
fiendish.  We've found that although using a second hash function to
place replicas can increase performance, it is at the cost of
considerable extra complexity. You can see my comparison of various
replication schemes, including a form of symmetric replication in a
paper I wrote for DASP2P-06. 

http://urchin.earth.li/~mleslie/mleslieStorage.pdf

The 'Block' replication algorithm in my paper is an instance of
symmetric replication, though it divides the nodes into equivalence
classes in a slightly different way to that proposed in the paper by
Ghodsi et al. The most interesting thing we have found (I think) is that
symmetric replication leads to a more reliable system for a given
replication factor than DHash style replication - the authors of the
original paper do not make this point.


Matt


PS: Sorry it took me so long to get round to replying, I hope this isn't
irrelevant now.
	
 
________________________________

From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]
On Behalf Of Michael Hartmann
Sent: 23 February 2006 17:20
To: p2p-hackers@zgp.org
Subject: [p2p-hackers] Symmetric Replication for Chord


Anyone actually tried implementing the replication scheme suggested by
this paper:
Symmetric Replication for Structured Peer-to-Peer Systems,
A. Ghodsi, L-O Alima, S. Haridi, DBISP2P2005, 2005.
http://dks.sics.se/pub/replication.pdf

I am trying to implement it for Chord, it seems better than the
successor-list approach, but I am wondering if it is possible to
make it work for replication degrees that do not divide the size of the
key space?

Mike 

From a_acquisti at ppmusic.com  Tue Mar 28 16:55:44 2006
From: a_acquisti at ppmusic.com (Alessandro Acquisti)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] CALL FOR NOMINATIONS - 2006 PET AWARD
In-Reply-To: <20060327200005.A48363FDA9@capsicum.zgp.org>
Message-ID: <200603281655.KAA07899@wisteria.propagation.net>

CALL FOR NOMINATIONS - 2006 PET AWARD

[Please forward and distribute]

You are invited to submit nominations to the 2006 PET Award.

The PET Award is presented annually to researchers who have made
an outstanding contribution to the theory, design, implementation,
or deployment of privacy enhancing technology. It is awarded at
the annual Privacy Enhancing Technologies Workshop (PET). The PET
Award carries a prize of 3000 Euros thanks to the generous support
of Microsoft.

Any paper by any author written in the area of privacy enhancing
technologies is eligible for nomination. However, the paper must
have appeared in a refereed journal, conference, or workshop with
published proceedings in the period that goes from the end of the
penultimate PET Workshop (the PET workshop prior to the last PET
workshop that has already occurred: i.e. June 2004) until April
15th, 2006. The complete Award rules including eligibility
requirements can be found at http://petworkshop.org/award/.

Anyone can nominate a paper by sending an email message containing
the following to award-chairs06@petworkshop.org:

    - Paper title
    - Author(s)
    - Author(s) contact information
    - Publication venue
    - A nomination statement of no more than 250 words.

All nominations must be submitted by April 15th, 2006. A
seven-member Award committee will select one or two winners among
the nominations received. Winners must be present at the PET
workshop in order to receive the Award. This requirement can be
waived only at the discretion of the PET Advisory board.

2006 Award Committee:

    - Alessandro Acquisti (chair), Carnegie Mellon University, USA
    - Roger Dingledine (co-chair), The Free Haven Project, USA
    - Ram Chellappa, Emory University, USA
    - Lorrie Cranor, Carnegie Mellon University, USA
    - Rosario Gennaro, IBM Research, USA
    - Ian Goldberg, Zero Knowledge Systems, Canada
    - Markus Jakobsson, Indiana University at Bloomington, USA

More information about the PET award (including past winners) is
available at http://petworkshop.org/award/.

More information about the 2006 PET workshop is available at
http://petworkshop.org/2006/.


-----------------------
Alessandro Acquisti
Heinz School, Carnegie Mellon University
(P) 412 268 9853
(F) 412 268 5339
http://www.heinz.cmu.edu/~acquisti
-----------------------  


From bwong at cs.cornell.edu  Tue Mar 28 17:28:30 2006
From: bwong at cs.cornell.edu (Bernard Wong)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] Octant announcement
Message-ID: <4429723E.9070209@cs.cornell.edu>

Hi,

We have recently deployed a system for determining the physical location 
of Internet hosts called Octant. Given a host that responds to ICMP 
pings, Octant determines the boundaries of the region in which the node 
is likely to lie, and displays the result using Google maps:

   http://www.cs.cornell.edu/~bwong/octant

Behind the scenes, Octant consists of two parts:
- a comprehensive framework for efficiently representing and
  combining a system of constraints.
- a set of mechanisms (aka "crazy hacks") to extract useful
  and tight constraints on where nodes are likely to be, without
  resulting in an unsatisfiable set of constraints.

Our website describes the general framework Octant provides. One key 
feature is that the framework permits reasoning about positive 
constraints (where a node may be located), as well as negative 
constraints (where a node is unlikely to be). Another key feature is 
that Octant can reason about concave regions (e.g. "node is in the 
Boston area but not in Cambridge") using Bezier regions. Finally, Octant 
can reason in the presence of uncertainty (e.g. "this node is within 
30km. of this other one in the Ohio area").
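
As a rough illustration of combining positive and negative constraints, here is a small Python sketch. It substitutes plain circular regions (centre plus radius in km, great-circle distance) for the Bezier regions Octant uses, so it is a simplification and not Octant's code; the coordinates, radii and function names are assumptions chosen for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def feasible(point, positive, negative):
    """True if point lies inside every positive disc and outside every negative one."""
    inside = all(haversine_km(point, c) <= r for c, r in positive)
    outside = all(haversine_km(point, c) > r for c, r in negative)
    return inside and outside


boston = (42.3601, -71.0589)
cambridge = (42.3736, -71.1097)
positive = [(boston, 30.0)]     # "in the Boston area"
negative = [(cambridge, 3.0)]   # "but not in Cambridge"

print(feasible((42.2529, -71.0023), positive, negative))  # Quincy: True
print(feasible(cambridge, positive, negative))            # False
```

The negative disc carves a hole out of the positive one, which is the kind of concave region that a single circle or convex hull cannot express.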

Figuring out the physical location of nodes based solely on network 
measurements is challenging. Routes don't necessarily follow 
great-circle distances, slow nodes appear to be "off the map," routers 
add delays that are hard to predict, and generally network measurements 
are difficult to perform accurately. We have developed various 
mechanisms for these challenges. Our evaluation indicates that we get a 
median error of 22 miles, a factor of 3 better than previous schemes for 
geolocalization.

So, check Octant out if you are interested in a free geolocalization 
service (or you want to cyberstalk that person who has been connecting 
to your SSH port or sending you unwanted addresses). Two caveats: for 
security reasons, we do not geolocalize arbitrary IP addresses, and the 
current deployment is limited to North America (though we hope to deploy 
in Europe+Asia in the future).

Thanks,
Bernard Wong, Ivan Stoyanov, Gun Sirer.

From julien.lociuro at student.uclouvain.be  Tue Mar 28 21:06:35 2006
From: julien.lociuro at student.uclouvain.be (Julien Lociuro)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] Symmetric Replication for Chord
In-Reply-To: <20060328200004.BF2923FDE3@capsicum.zgp.org>
References: <20060328200004.BF2923FDE3@capsicum.zgp.org>
Message-ID: <37633.213.49.167.165.1143579995.squirrel@webmail.sipr.ucl.ac.be>

Hello Michael and Matt,
I have implemented the symmetric replication scheme too.
When the replication factor does not divide the network size I manage it
like this (illustration through an example) :
NetSize = 11
RepFactor = 3

I define a virtual network size as :
VirtualNetSize = NetSize - NetSize mod RepFactor

So here, we have a virtual network of size 9, and the replication factor
divides it.

We can thus define the classes :
{0,3,6}, {1,4,7} and {2,5,8}
The keys in the virtual network are called virtual keys.

I consider that only virtual keys exist [0..8] and replication works
with the virtual network size.

The remaining keys (9 and 10) are just mapped to virtual key 0.

So items with keys 9 and 10 will be stored at the nodes responsible
for {0,3,6}

Hope that helps,

Julien.


From coderman at gmail.com  Thu Mar 30 06:34:15 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] Fwd: how to get johnny to encrypt (his hard drive)
In-Reply-To: <4ef5fec60603292230t5bda1b8ftbebcaa56328b1896@mail.gmail.com>
References: <4ef5fec60603292230t5bda1b8ftbebcaa56328b1896@mail.gmail.com>
Message-ID: <4ef5fec60603292234g63b051aar1d4da1f122ba323c@mail.gmail.com>

usability is the foundation of good security

---------- Forwarded message ----------
From: coderman <coderman@gmail.com>
Date: Mar 29, 2006 10:30 PM
Subject: how to get johnny to encrypt (his hard drive)
To: cypherpunks@jfet.org


thoughts on making this simpler?

0. insert new second disk of equal or greater size
1. boot from trusted cd/dvd ISO image
2. insert USB memory stick (or two if you want a backup)
3. enter new password / passphrase (see good password howto)
4. agree/confirm to copy over empty / target disk
5. wait as new disk is encrypted via loop-aes, keys are stored on
password protected USB image, all existing OS data* on source disk is
copied to encrypted volume on new disk.
6. reboot into new encrypted volume and copy back over original source
hard disk with loop-aes and store keys for this disk on USB image.
7. Johnny gets a data backup with his privacy.

* ubuntu, knoppix, slackware, linspire and centos supported.  a
windoze or other partition (vfat, ntfs, etc) can be copied and mounted
under a new installation of the previously mentioned linux OS'es on
the new encrypted disk. (if one of these linux flavors is not already
installed)

From m.leslie1 at physics.ox.ac.uk  Thu Mar 30 11:26:56 2006
From: m.leslie1 at physics.ox.ac.uk (Matthew Leslie)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] Symmetric Replication for Chord
Message-ID: <97A92A773C8D30458210542B6523B41001063D1D@exchng3.physics.ox.ac.uk>

The virtual network size would give you whole numbers for your replica
locations, but if you choose a keyspace size (virtual or otherwise) that
isn't a power of two, the Chord finger table keys will not necessarily
be whole numbers. Since the object of the exercise seems to be to avoid
non-integer keys, this is probably unacceptable.

Again, this is easily solved by consistently rounding, so personally I'm
not sure it is an issue. 

Still, if you are picking an arbitrary keyspace size, and really want it
to be divisible into finger table entries, and some range of replica
factors, why not have a key space size of something like (2^42*32!)?
This is the same order of magnitude as the 2^160 I think is suggested by
the Chord paper, and would allow you up to 42 whole numbered finger
table entries, and replication factors between 1 and 32, all of which
divide the keyspace evenly.
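
The arithmetic is easy to check with Python's arbitrary-precision integers. This sketch assumes finger offsets of the form keyspace / 2^i; the variable names are illustrative only.

```python
import math

KEYSPACE = 2**42 * math.factorial(32)

# Roughly the same size as a 160-bit identifier space.
print(KEYSPACE.bit_length())  # 160

# Every replication factor from 1 to 32 divides the keyspace exactly.
assert all(KEYSPACE % f == 0 for f in range(1, 33))

# Finger offsets keyspace / 2^i are whole numbers for i = 1..42.
assert all(KEYSPACE % 2**i == 0 for i in range(1, 43))
finger_offsets = [KEYSPACE // 2**i for i in range(1, 43)]
print(len(finger_offsets))  # 42
```

Because 32! itself contributes further factors of two, the keyspace stays evenly divisible for somewhat more than 42 halvings, so 42 is a conservative figure.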


Matt

-----Original Message-----
From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]
On Behalf Of Julien Lociuro
Sent: 28 March 2006 22:07
To: p2p-hackers@zgp.org
Subject: [p2p-hackers] Symmetric Replication for Chord

Hello Michael and Matt,
I have implemented the symmetric replication scheme too.
When the replication factor does not divide the network size I manage it
like this (illustration through an example) :
NetSize = 11
RepFactor = 3

I define a virtual network size as :
VirtualNetSize = NetSize - NetSize mod RepFactor

So here, we have a virtual network of size 9, and the replication factor
divides it.

We can thus define the classes :
{0,3,6}, {1,4,7} and {2,5,8}
The keys in the virtual network are called virtual keys.

I consider that only virtual keys exist [0..8] and replication works
with the virtual network size.

The remaining keys (9 and 10) are just mapped to virtual key 0.

So items with keys 9 and 10 will be stored at the nodes responsible for
{0,3,6}

Hope that helps,

Julien.



From coderman at gmail.com  Thu Mar 30 12:08:16 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] lockstep synchronization protocol problem
In-Reply-To: <343538103.05175@ustc.edu.cn>
References: <343538103.05175@ustc.edu.cn>
Message-ID: <4ef5fec60603300408o198b047blb41c46a5be62eebf@mail.gmail.com>

hi jinz,

i don't have time for a detailed reply but i thought a little more
info would be useful


On 3/28/06, 张进 <jinz@mail.ustc.edu.cn> wrote:
> I'm doing research about synchronization problem in P2P system,and the basic
> synchronization protocol is the lockstep protocol,and it use rounds to synchronize
> all the peer's movements,the problem is lockstep only synchronize peer's
> movements?what about the event created by all the peers?can it use rounds to
> synchronize them?and how to ?

i mentioned quorum systems and group key distribution to achieve a
shared and authenticated state among a group of peers that can be kept
in sync / coherent via frequent attestation (group re keying with
quorum consensus to distribute new keys).

there are many ways to implement this so i'll stick to conceptual
features / attributes and how this relates to a private group network
system we are implementing.

quorum authorities are those who sign all the other authorities keys
as part of the group key distribution.  quorum  or group members are
those who receive keys from one or more quorum authorities.

the quorum authorities maintain an index of all known / trusted group
members and the trust metrics assigned to the roles / services they
can perform.

and peer may solicit, provide and consume the services of another once
they verify they are trusted to do so.  they can contact any of the
quorum authorities (who have a full index and trust metric state /
graph) to certify the remote peer before doing so.

a group authorities may issue a revocation signed by his current group
identity key to disband the quorum / group.

if consensus cannot be reached within the next group re-key interval
(due to failure, lack of consensus at the meatspace / user level, or
malicious attack / DoS) the group must be re-keyed from the face to
face ground up and all reputation rebuilt.

the identifiers signed by the quorum during each iteration consist of:
- the key digests for each authority for the next group key exchange
- the sha-256 digest of the current base share file state image
(includes base OS and private group files/keys)
- the sha-256 digests of all delta based overlay filesystem images.
these are optional among group members but mandatory for all quorum
authorities.

upon this base you can build / tie to various group synchronization
mechanisms that are strongly authenticated and yet still fully
decentralized.

i hope that helps.
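
A rough Python sketch of the per-iteration identifiers this message describes the quorum signing: a digest of each authority's next key, the SHA-256 digest of the base image, and the SHA-256 digests of the overlay images. The manifest layout, the JSON canonicalisation and all function names are assumptions made for illustration, and the signing step itself is left out.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_manifest(next_keys: dict, base_image: bytes, overlays: list) -> dict:
    """Collect the digests that each quorum authority would sign."""
    return {
        "authority_key_digests": {
            name: sha256_hex(key) for name, key in sorted(next_keys.items())
        },
        "base_image_digest": sha256_hex(base_image),
        "overlay_digests": [sha256_hex(o) for o in overlays],
    }


def manifest_digest(manifest: dict) -> str:
    """Single identifier for one iteration; sort_keys makes it order-independent."""
    encoded = json.dumps(manifest, sort_keys=True).encode()
    return sha256_hex(encoded)


# Toy byte strings stand in for real keys and filesystem images.
m = build_manifest({"alice": b"key-a", "bob": b"key-b"}, b"base-os", [b"delta-1"])
print(manifest_digest(m))
```

Peers that compute the same digest for a round hold the same keys and images, so comparing this one value is enough to detect divergence before the next re-key interval.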

From coderman at gmail.com  Fri Mar 31 00:09:38 2006
From: coderman at gmail.com (coderman)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] group key agreement (was: lockstep synchronization
	protocol problem)
Message-ID: <4ef5fec60603301609jb422e7cr7464ba5656ddedb@mail.gmail.com>

a very detailed paper on group key agreement for ad-hoc networks:

http://eprint.iacr.org/2006/006

Abstract. Over the last 30 years the study of group key agreement has
stimulated much work. And as a result of the increased popularity of
ad hoc networks, some approaches for the group key establishment in
such networks are proposed. However, they are either only for static
group or the memory, computation and communication costs are
unacceptable for ad-hoc networks. In this thesis some protocol suites
from the literature (2^d-cube, 2^d-octopus, Asokan-Ginzboorg, CLIQUES,
STR and TGDH) shall be discussed. We have optimized STR and TGDH by
reducing the memory, communication and computation costs. The
optimized version are denoted by µSTR and µTGDH respectively. Based on
the protocol suites µSTR and µTGDH we present a Tree-based group key
agreement Framework for Ad-hoc Networks (TFAN). TFAN is especially
suitable for ad-hoc networks with limited bandwidth and devices with
limited memory and computation capability. To simulate the protocols,
we have implemented TFAN, µSTR and µTGDH with J2ME CDC. The TFAN API
will be described in this thesis.

From philip_matthews at magma.ca  Fri Mar 31 16:40:52 2006
From: philip_matthews at magma.ca (Philip Matthews)
Date: Sat Dec  9 22:13:12 2006
Subject: [p2p-hackers] any job opportunities in P2P area?
In-Reply-To: <005a01c648cd$557ae210$cfa45d81@csewang03>
References: <005a01c648cd$557ae210$cfa45d81@csewang03>
Message-ID: <C3997843-2FA3-40C4-B768-CC849163CECC@magma.ca>

[Sorry to be slow in replying, but I have been away and am just
catching up on e-mail.]

Here at Avaya, we have both a specific product that is based on P2P
technology (a P2P telephone system for small organizations), and Avaya
Labs (which was once a part of Bell Labs) which does more theoretical
P2P research.

I am in the P2P product development organization, and I know we are
hiring, though most of our positions are not P2P specific. I don't know
the hiring situation at the labs.

Note that most of our interest is in P2P for telephony, which is a bit
different than P2P for file sharing.

You can either send me your resume, or go to www.avaya.com and submit
it directly.

- Philip Matthews
   Senior Architect
   Peer-to-Peer Business Unit
   Avaya
   www.avaya.com


On 16-Mar-06, at 2:43 , Hailong Cai wrote:

> Hi guys,
>
> Just want to know if anybody knows some companies (esp. big ones)  
> hiring people with P2P background?
> It seems difficult to find such positions since few big companies  
> do P2P development.  Thanks!
>
> -Hailong