From eugen at leitl.org Fri Oct 1 10:34:34 2004 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] [IP] White House slams file sharing software on FedGov workers' PCs (fwd from dave@farber.net) Message-ID: <20041001103434.GL1457@leitl.org> ----- Forwarded message from David Farber ----- From: David Farber Date: Fri, 1 Oct 2004 06:24:17 -0400 To: Ip Subject: [IP] White House slams file sharing software on FedGov workers' PCs X-Mailer: Apple Mail (2.619) Reply-To: dave@farber.net Begin forwarded message: From: Declan McCullagh Date: October 1, 2004 1:14:04 AM EDT To: politech@politechbot.com Subject: [Politech] White House slams file sharing software on FedGov workers' PCs http://www.whitehouse.gov/omb/memoranda/fy04/m04-26.html September 8, 2004 MEMORANDUM FOR CHIEF INFORMATION OFFICERS FROM: Karen S. Evans Administrator, IT and E-Gov SUBJECT: Personal Use Policies and "File Sharing" Technology The purpose of this memorandum is to detail specific actions agencies must take to ensure the appropriate use of certain technologies used for file sharing across networks. These actions are based on recommended guidance developed by the CIO Council in 1999. The effective use and management of file sharing technology requires a clear policy, training of employees on the policy, and monitoring and enforcement. Background A type of file sharing known as Peer-to-Peer (P2P) refers to any software or system allowing individual users of the Internet to connect to each other and trade files. These systems are usually highly decentralized and are designed to facilitate connections between persons who are looking for certain types of files. While there are many appropriate uses of this technology, a number of studies show that the vast majority of files traded on P2P networks are copyrighted music files and pornography. Data also suggests P2P is a common avenue for the spread of computer viruses within IT systems. 
Federal computer systems or networks (as well as those operated by contractors on the government's behalf) must not be used for the downloading of illegal and/or unauthorized copyrighted content. It is important to ensure computer resources of the Federal government are not compromised and to demonstrate to the American public the importance of adopting ethical and responsible practices on the Internet. The CIO Council has issued recommended guidance on "Limited Personal Use of Government Office Equipment Including Information Technology." Examples of inappropriate personal use include "the creation, download, viewing, storage, copying, or transmission of materials related to illegal gambling, illegal weapons, terrorist activities, and any other illegal activities or activities otherwise prohibited" and "the unauthorized acquisition, use, reproduction, transmission, or distribution of any controlled information including computer software and data, that includes privacy information, copyrighted, trade marked or material with other intellectual property rights (beyond fair use), proprietary data, or export controlled software or data." Direction to Agencies Effective use and management of file sharing technology requires a clear policy, training of employees on the policy, and monitoring and enforcement. Specifically, agencies are directed to: 1. Establish or Update Agency Personal Use Policies to be Consistent with CIO Council Recommended Guidance. OMB expects all agencies to establish personal use policies, consistent with the recommended guidance developed by the CIO Council. Agencies who have not established personal use guidance should do so without delay, but no later than December 1, 2004. 2. Train All Employees on Personal Use Policies and Improper Uses of File Sharing Agencies' IT security or ethics training must train employees on agency personal use policies and the prohibited improper uses of file sharing. 
Training must be consistent with OMB Circular A-130, appendix III paragraph (3)(a)(b) which states agencies must "ensure that all individuals are appropriately trained in how to fulfill their security responsibilities [...]. Such training shall assure that employees are versed in the rules of the system, be consistent with guidance issued by NIST and OPM, and apprise them about available assistance and technical security products and techniques." On October 6, 2004, as part of the agency annual reports required by the Federal Information Security Management Act of 2002 (FISMA) described in OMB Memorandum 04-25, FY 2004 Reporting Instructions for FISMA, agencies must report whether they provide training regarding the appropriate use of P2P file sharing. 3. Implement Security Controls to Prevent and Detect Improper File Sharing As required by FISMA, agencies are to use existing NIST standards and guidance to complete system risk and impact assessments in developing security plans and authorizing systems for operation. Operational controls detailing procedures for handling and distributing information and management controls outlining rules of behavior for the user must ensure the proper controls are in place to prevent and detect improper file sharing. Again, OMB recognizes there are appropriate uses of file sharing technologies, but as with all technology it must be appropriately managed. If you have any questions regarding this memorandum, please contact Jeanette Thornton, Policy Analyst, Information Policy and Technology Branch, Office of Management and Budget, phone (202) 395-3562, fax (202) 395-5167, e-mail: jthornto@omb.eop.gov. 
_______________________________________________ Politech mailing list Archived at http://www.politechbot.com/ Moderated by Declan McCullagh (http://www.mccullagh.org/) ------------------------------------- You are subscribed as eugen@leitl.org To manage your subscription, go to http://v2.listbox.com/member/?listname=ip Archives at: http://www.interesting-people.org/archives/interesting-people/ ----- End forwarded message ----- -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20041001/a68d11e5/attachment.pgp From farez at imetrix.co.uk Mon Oct 4 15:46:05 2004 From: farez at imetrix.co.uk (Farez Rahman) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Soc science and network trust lit. surveys Message-ID: <6.1.2.0.0.20041004164522.02db0300@pop3.metronet.co.uk> Hi, Draft literature survey chapters from my thesis, one on trust in the social sciences and the other on network trust models and comparing them to the findings in the soc science chapter, are online, for those interested in having a read. You can find them near the top of this page: http://www.cs.ucl.ac.uk/staff/F.AbdulRahman/docs Hope to hear some feedback on them, if possible. Cheers, Farez [ trust . reputation . 
research [ www.cs.ucl.ac.uk/staff/f.abdulrahman/ [ skype: callto://alfarez From eugen at leitl.org Tue Oct 5 21:32:44 2004 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] [i2p] weekly status notes [oct 5] (fwd from jrandom@i2p.net) Message-ID: <20041005213244.GI1457@leitl.org> ----- Forwarded message from jrandom ----- From: jrandom Date: Tue, 5 Oct 2004 12:48:58 -0700 To: i2p@i2p.net Subject: [i2p] weekly status notes [oct 5] -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi y'all, it's weekly update time * Index: 1) 0.4.1.1 status 2) Pretty pictures 3) 0.4.1.2 and 0.4.2 4) Bundled eepserver 5) ??? * 1) 0.4.1.1 status After a pretty bumpy 0.4.1 release (and subsequent rapid 0.4.1.1 update), the net seems to be back to normal - 50-something peers active at the moment, and both irc and eepsites are reachable. Most of the pain was caused by insufficient testing of the new transport outside lab conditions (e.g. sockets breaking at strange times, excessive delays, etc). Next time we need to make changes at that layer, we'll be sure to test it more widely prior to release. * 2) Pretty pictures Over the last few days there have been a large number of updates going on in CVS, and one of the new things added was a new stat logging component, allowing us to simply pull out the raw stat data as it's being generated, rather than deal with the crude averages gathered on /stats.jsp. With it, I've been monitoring a few key stats on a few routers, and we're getting closer to tracking down the remaining stability issues. The raw stats are fairly bulky (a 20-hour run on duck's box generated almost 60MB of data), but that's why we've got pretty pictures - http://dev.i2p.net/~jrandom/stats/ The Y axis on most of those is milliseconds, while the X axis is seconds. There are a few interesting things to note. 
First, client.sendAckTime.png is a pretty good approximation of a single round trip delay, as the ack message is sent with the payload and then returns the full path of the tunnel - as such, the vast majority of the nearly 33,000 successful messages sent had a round trip time under 10 seconds. If we then review the client.sendsPerFailure.png alongside client.sendAttemptAverage.png, we see that the 563 failed sends were almost all sent the maximum number of retries we allow (5 with a 10s soft timeout per try and 60s hard timeout) while most of the other attempts succeeded on the first or second try. Another interesting image is client.timeout.png which sheds much doubt on a hypothesis I had - that the message send failures were correlated with some sort of local congestion. The plotted data shows that the inbound bandwidth usage varied widely when failures occurred, there were no consistent spikes in local send processing time, and seemingly no pattern whatsoever with tunnel test latency. The files dbResponseTime.png and dbResponseTime2.png are similar to the client.sendAckTime.png, except they correspond to netDb messages instead of end to end client messages. The transport.sendMessageFailedLifetime.png shows how long we sit on a message locally before failing it for some reason (for instance, due to its expiration being reached or the peer it is targeting being unreachable). Some failures are unavoidable, but this image shows a significant number failing right after the local send timeout (10s). There are a few things we can do to address this: - first, we can make the shitlist more adaptive - exponentially increasing the period a peer is shitlisted for, rather than a flat 4 minutes each. (this has already been committed to CVS) - second, we can preemptively fail messages when it looks like they'd fail anyway. 
To do this, we have each connection keep track of its send rate and whenever a new message is added to its queue, if the number of bytes already queued up divided by the send rate exceeds the time left until expiration, fail the message immediately. We may also be able to use this metric when determining whether to accept any more tunnel requests through a peer. Anyway, on to the next pretty picture - transport.sendProcessingTime.png. In this you see that this particular machine is rarely responsible for much lag - typically 10-100ms, though some spikes to 1s or more. Each point plotted in the tunnel.participatingMessagesProcessed.png represents how many messages were passed along a tunnel that router participated in. Combining this with the average message size gives us an estimated network load that the peer takes on for other people. The last image is the tunnel.testSuccessTime.png, showing how long it takes to send a message out a tunnel and back home again through another inbound tunnel, giving us an estimate of how good our tunnels are. Ok, that's enough pretty pictures for now. You can generate the data yourself with any release after 0.4.1.1-6 by setting the router config property "stat.logFilters" to a comma separated list of stat names (grab the names from the /stats.jsp page). That is dumped to stats.log which you can process with java -cp lib/i2p.jar net.i2p.stat.StatLogFilter stat.log which splits it up into separate files for each stat, suitable for loading into your favorite tool (e.g. gnuplot). * 3) 0.4.1.2 and 0.4.2 There have been lots of updates since the 0.4.1.1 release (see the history [1] for a full list), but no critical fixes yet. We'll be rolling them out in the next 0.4.1.2 patch release later this week after some outstanding bugs relating to IP autodetection are addressed. The next major task at that point will be to hit 0.4.2, which is currently slated [2] as a major revamp to the tunnel processing. 
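[Editor's sketch] Going back to the two queueing fixes jrandom describes under the pretty pictures (the exponential shitlist backoff and the preemptive expiration check), here is a minimal illustration. This is a sketch under stated assumptions, not the actual i2p router code (which is Java); the names `Connection`, `shitlist_period`, and the backoff cap are all hypothetical.

```python
# Sketch of the two queueing ideas described above (hypothetical names,
# not i2p's actual code). A connection tracks its measured send rate;
# a new message is failed immediately if draining the bytes already
# queued would outlast the message's remaining lifetime.

import time


class Connection:
    def __init__(self, send_rate_bps):
        self.send_rate_bps = send_rate_bps  # measured bytes per second
        self.queued_bytes = 0
        self.queue = []

    def enqueue(self, size_bytes, expires_at, now=None):
        """Return True if queued, False if preemptively failed."""
        now = time.time() if now is None else now
        drain_time = self.queued_bytes / self.send_rate_bps
        if drain_time > expires_at - now:
            return False  # would expire before we could even send it
        self.queue.append((size_bytes, expires_at))
        self.queued_bytes += size_bytes
        return True


def shitlist_period(failures, base=240.0, cap=3840.0):
    """Exponential backoff replacing the flat 4-minute (240 s) shitlist.

    The doubling and the cap are illustrative values, not i2p's."""
    return min(base * (2 ** (failures - 1)), cap)
```

The same drain-time estimate (queued bytes divided by send rate) could also gate whether to accept more tunnel requests through a peer, as the message suggests.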
It's going to be a lot of work, revising the encryption and message processing as well as the tunnel pooling, but it's pretty critical, as an attacker could fairly easily mount some statistical attacks on the tunnels right now (e.g. predecessor w/ random tunnel ordering or netDb harvesting). dm raised the question however as to whether it'd make sense to do the streaming lib first (currently planned for the 0.4.3 release). The benefit of that would be the network would become both more reliable and have better throughput, encouraging other developers to get hacking on client apps. After that's in place, I could then return to the tunnel revamp and address the (non-user-visible) security issues. Technically, the two tasks planned for 0.4.2 and 0.4.3 are orthogonal, and they're both going to get done anyway, so there doesn't seem to be much of a downside to switching those around. I'm inclined to agree with dm, and unless someone can come up with some reasons to keep 0.4.2 as the tunnel update and 0.4.3 as the streaming lib, we'll switch 'em. [1] http://dev.i2p.net/cgi-bin/cvsweb.cgi/i2p/history.txt?rev=HEAD [2] http://www.i2p.net/roadmap * 4) Bundled eepserver As was mentioned in the 0.4.1 release notes [3], we've bundled the software and configuration necessary for running an eepsite out of the box - you can simply drop a file in the ./eepsite/docroot/ directory and share the I2P destination found on the router console. A few people called me on my zeal for .war files though - most apps unfortunately need a little more work than simply dropping a file in the ./eepsite/webapps/ dir. I've put together a brief tutorial [4] on running the blojsom [5] blogging engine, and you can see what that looks like on detonate's site [6]. [3] http://dev.i2p.net/pipermail/i2p/2004-September/000456.html [4] http://www.i2p.net/howto_blojsom [5] http://wiki.blojsom.com/wiki/display/blojsom/About+blojsom [6] http://detonate.i2p/ * 5) ??? 
Thats about all I've got at the moment - swing on by the meeting in 90 minutes if you want to discuss things. =jr -----BEGIN PGP SIGNATURE----- Version: PGP 8.1 iQA/AwUBQWL3MxpxS9rYd+OGEQLk1gCfeMpSoYfbIlPWobks3i7lr8MjwDkAoOMS vkNuIUa6ZwkKMVJWhoZdWto4 =hCGS -----END PGP SIGNATURE----- _______________________________________________ i2p mailing list i2p@i2p.net http://i2p.dnsalias.net/mailman/listinfo/i2p ----- End forwarded message ----- -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20041005/6f266bc2/attachment.pgp From farez at imetrix.co.uk Wed Oct 6 11:53:02 2004 From: farez at imetrix.co.uk (Farez Rahman) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Soc science and network trust lit. surveys In-Reply-To: <20041005150341.GA5561@faui10.informatik.uni-erlangen.de> References: <6.1.2.0.0.20041004164522.02db0300@pop3.metronet.co.uk> <20041005150341.GA5561@faui10.informatik.uni-erlangen.de> Message-ID: <6.1.2.0.0.20041006124938.03c161e8@pop3.metronet.co.uk> Matthias, Sorry about that. The links should work now. Again, look forward to comments! If "Trusted Computing" means a platform whose "trustworthiness" is dictated by someone other than the end user then my work will not be relevant :) Will have a look at your PGP stuff. Cheers, Farez At 16:03 05/10/2004, Matthias Bauer wrote: >Farez, >> Hope to hear some feedback on them, if possible. 
> >unfortunately not: > > Error: f.abdulrahman - papers.pl not found in cgi-bin > >The approach taken in the HICSS paper is interesting >because of the insight that trust is not easily modelled >by probabilities, is dynamic over time and depends on social >parameters. Most publications found in access control >and crypto journals tend to ignore that aspect completely. >The application of your work to so-called ''Trusted Computing'' >could be interesting :-) > >I'm doing some research on the pgp web-of-trust, see >the post on > >http://pestilenz.org/cgi-bin/blosxom.cgi > >Best regards, > >Matthias Bauer From Bernard.Traversat at Sun.COM Wed Oct 6 18:38:56 2004 From: Bernard.Traversat at Sun.COM (Bernard Traversat) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] [Fwd: [JXTA user] Reminder: JXTA Kitchen - October 19th] Message-ID: <41643BC0.7060102@sun.com> For people interested in learning more about the JXTA open-source P2P platform. B. -------- Original Message -------- Subject: [JXTA user] Reminder: JXTA Kitchen - October 19th Date: Tue, 05 Oct 2004 12:51:38 -0700 From: Lauren Zuravleff Reply-To: user@jxta.org To: discuss@jxta.org, user@jxta.org, dev@jxta.org You are invited to the JXTA (www.jxta.org) Peer-to-Peer Developer Kitchen on Tuesday, October 19th. This is a full day event at the Sun Microsystems' offices in Santa Clara, CA. Bring your JXTA application and work on it with the Sun JXTA engineers and other JXTA developers. Our engineers will work with you on your projects. We'll also talk about what is new and upcoming with JXTA P2P technology. You take care of your own travel arrangements, and we'll bring in lunch. There is no charge to attend. Registration: Registration for the Kitchen is on a first-come, first-served basis. Please fill out the information requested (address, fax #, etc.) below and email it to Lauren Zuravleff at lauren.zuravleff@sun.com to reserve a seat in the class. RSVP no later than October 8th. 
Please note that you must receive a confirmation from him before concluding that a space has been reserved for you. When/Where it'll be: Tuesday, October 19th, 9am - 6pm The Sun Microsystems Campus in Santa Clara, CA. Building details will be sent to those who RSVP. Equipment: We'll provide a training room with Sun Ultra 10 workstations and wireless access. Please let us know which you prefer to use. What you need to bring: Please bring your application to work on. You can also bring a laptop system to work with the wireless system. Breakfast & Lunch: We'll provide a buffet style breakfast and lunch. Please let me know if you have special dietary needs. RSVP is required! If you would like to attend, please RSVP to Lauren Zuravleff at lauren.zuravleff@sun.com no later than October 8th. We have a limited number of systems available for this event, so registration will be on a first-come/first-served basis. Please wait to receive a RSVP confirmation before making any non-refundable travel plans. Lodging: Sierra Suites is walking distance to the Sun Santa Clara campus. We have a special rate of $105 per night for a queen bed. Reservations available by calling 408.486.0800. You can see more information on the hotel at: http://www.sierrasuites.com/locations/santa-clara.asp =================================== RSVP Form - mailto: lauren.zuravleff@sun.com --------- [ ] Yes, I will be attending the JXTA Kitchen on October 19th. Name: Company: Address: City: State: Zip: Country: Phone: Fax: E-mail: Any Dietary Restrictions? Planning to use our Sun Ultra 10 or bringing your own laptop to run on the wireless network? Which JXTA projects do you work on? --------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscribe@jxta.org For additional commands, e-mail: user-help@jxta.org -- "As Java implies platform independence, and XML implies language independence, then JXTA implies network independence." 
From em at em.no-ip.com Fri Oct 8 03:17:11 2004 From: em at em.no-ip.com (Enzo Michelangeli) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Announcement: KadC library Message-ID: <042f01c4ace5$52f20ee0$0200a8c0@em.noip.com> A while ago I posted an article, archived at http://zgp.org/pipermail/p2p-hackers/2004-March/001780.html , where I discussed the possible use of DHT techniques to implement P2P-style directory services for opensource VoIP and IM clients. Since then, I have worked on a portable C library designed for easy integration with existing applications. I have now released its version 0.0.1, available from: http://kadc.sourceforge.net/ In its current state, it can access the Overnet Kademlia-based overlay network to store and retrieve generic data. Planned extensions include active participation in the DHT, and support for the eMule-KAD and RevConnect flavours of Kademlia. The package comes with two sample applications, one of which (called "namecache") is a DNS caching proxy able to map some user-defined top-domains to the DHT (http://kadc.sourceforge.net/apps.html ). Due to the fact that I intend to retain ownership of the copyright, I cannot accept third-party contributions to the code; however, the released code is licensed under GPL, and it's free for use under its conditions. Bug reports will also be gladly accepted. Comments are welcome. A project-specific mailing list is also available: http://lists.sourceforge.net/lists/listinfo/kadc-users Enzo From aloeser at cs.tu-berlin.de Mon Oct 11 08:56:40 2004 From: aloeser at cs.tu-berlin.de (Alexander =?iso-8859-1?Q?L=F6ser?=) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Help on Query and Uptime Distribution Model among users in File Sharing Networks Message-ID: <416A4AC8.665271E3@cs.tu-berlin.de> Hi, for a simulation network I need to make assumptions on: 1.) the distribution of queries among users in file sharing networks. 
In particular I'm interested in data statistics like user1 5 queries user2 10 queries user3 2 queries ..... Is there any research that shows how the queries are distributed among the users? Probably with a random distribution, normal distribution or zipf distribution? 2.) the average uptime of users in file sharing networks, especially the average uptime and the uptime distribution ( zipf, normal, power law, random...?) Any suggestions are very much appreciated!! Alex -- ___________________________________________________________ Alexander Löser Technische Universitaet Berlin hp: http://cis.cs.tu-berlin.de/~aloeser/ office: +49- 30-314-25551 fax : +49- 30-314-21601 ___________________________________________________________ From sam at neurogrid.com Wed Oct 13 01:12:33 2004 From: sam at neurogrid.com (Sam Joseph) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Help on Query and Uptime Distribution Model among users in File Sharing Networks In-Reply-To: <416A4AC8.665271E3@cs.tu-berlin.de> References: <416A4AC8.665271E3@cs.tu-berlin.de> Message-ID: <416C8101.6020109@neurogrid.com> Hi Alex, I'm sorry not to answer your questions directly as I'm currently snowed under with non-p2p related work. However perhaps I can point you to my paper "An Extendible Open Source P2P Simulator", which I believe contains references to various bits of research relating to your questions. http://www.neurogrid.net/php/publications.php http://www.neurogrid.net/php/P2PSimulator3.pdf If you can't find the answers you need in that paper, I'll try and draw up some additional pointers for you. CHEERS> SAM Alexander Löser wrote: >Hi, >for a simulation network I need to make assumptions on: >1.) the distribution of queries among users in file sharing networks. >In particular I'm interested in data statistics like > >user1 5 queries >user2 10 queries >user3 2 queries >..... > >Is there any research that shows how the queries are distributed among >the users? 
Probably with a random distribution, normal distribution or >zipf distribution? > >2.) the average uptime of users in file sharing networks, especially >the average uptime and the uptime distribution ( zipf, normal, power >law, random...?) > > >Any suggestions are very much appreciated!! > From bram at gawth.com Wed Oct 13 02:34:52 2004 From: bram at gawth.com (Bram Cohen) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] CodeCon 2005 Call For Papers Message-ID: CodeCon 4.0 February 2005 San Francisco CA, USA www.codecon.org Call For Papers CodeCon is the premier showcase of cutting edge software development. It is an excellent opportunity for programmers to demonstrate their work and keep abreast of what's going on in their community. All presentations must include working demonstrations, ideally accompanied by source code. Presentations must be given by one of the active developers of the code in question. We emphasize that demonstrations be of *working* code. We hereby solicit papers and demonstrations. * Papers and proposals due: December 15, 2004 * Authors notified: January 1, 2005 Possible topics include, but are by no means restricted to: * community-based web sites - forums, weblogs, personals * development tools - languages, debuggers, version control * file sharing systems - swarming distribution, distributed search * security products - mail encryption, intrusion detection, firewalls Presentations will be 45 minutes long, with 15 minutes allocated for Q&A. Overruns will be truncated. Submission details: Submissions are being accepted immediately. Acceptance dates are November 15 and December 15. After the first acceptance date, submissions will be either accepted, rejected, or deferred to the second acceptance date. The conference language is English. Ideally, demonstrations should be usable by attendees with 802.11b connected devices either via a web interface, or locally on Windows, UNIX-like, or MacOS platforms. 
Cross-platform applications are most desirable. Our venue will be 21+. To submit, send mail to submissions2005@codecon.org including the following information: * Project name * url of project home page * tagline - one sentence or less summing up what the project does * names of presenter(s) and urls of their home pages, if they have any * one-paragraph bios of presenters, optional, under 100 words each * project history, under 150 words * what will be done in the project demo, under 200 words * slides to be shown during the presentation, if applicable * future plans General Chairs: Jonathan Moore, Len Sassaman Program Chair: Bram Cohen Program Committee: * Jeremy Bornstein, AtomShockwave Corp., USA * Bram Cohen, BitTorrent, USA * Jered Floyd, Permabit, USA * Ian Goldberg, Zero-Knowledge Systems, CA * Dan Kaminsky, Avaya, USA * Klaus Kursawe, Katholieke Universiteit Leuven, BE * Ben Laurie, A.L. Digital Ltd., UK * David Molnar, University of California, Berkeley, USA * Jonathan Moore, Mosuki, USA * Len Sassaman, Nomen Abditum Services, USA Sponsorship: If your organization is interested in sponsoring CodeCon, we would love to hear from you. In particular, we are looking for sponsors for social meals and parties on any of the three days of the conference, as well as sponsors of the conference as a whole and donors of door prizes. If you might be interested in sponsoring any of these aspects, please contact the conference organizers at codecon-admin@codecon.org. Press policy: CodeCon provides a limited number of passes to bona fide press. Complimentary press passes will be evaluated on request. Everyone is welcome to pay the low registration fee to attend without an official press credential. Questions: If you have questions about CodeCon, or would like to contact the organizers, please mail codecon-admin@codecon.org. Please note this address is only for questions and administrative requests, and not for workshop presentation submissions. 
-Bram Cohen "Markets can remain irrational longer than you can remain solvent" -- John Maynard Keynes From aloeser at cs.tu-berlin.de Wed Oct 13 08:41:22 2004 From: aloeser at cs.tu-berlin.de (Alexander =?iso-8859-1?Q?L=F6ser?=) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Help on Query and Uptime Distribution Model among users in File Sharing Networks References: <416A4AC8.665271E3@cs.tu-berlin.de> <416C8101.6020109@neurogrid.com> Message-ID: <416CEA31.4F4BF4BD@cs.tu-berlin.de> Hi Sam, it seems that especially for the query distribution over the users, e.g. how many users only issue a few queries, two contrary assumptions are made in the literature: In http://www.cse.psu.edu/~meli/papers/icnp04.pdf Li et al. assume for the QUERY SEED (see page 7 right above) a ZIPF Distribution while Schlosser et al. http://www.stanford.edu/~sdkamvar/papers/simulator.pdf assume a POISSON distribution. Do you know any work based on real world Gnutella simulations that prove either the first or the second hypothesis? Alex Sam Joseph wrote: > Hi Alex, > > I'm sorry not to answer your questions directly as I'm currently snowed > under with non-p2p related work. However perhaps I can point you to my > paper "An Extendible Open Source P2P Simulator", which I believe > contains references to various bits of research relating to your questions. > > http://www.neurogrid.net/php/publications.php > http://www.neurogrid.net/php/P2PSimulator3.pdf > > If you can't find the answers you need in that paper, I'll try and draw > up some additional pointers for you. > > CHEERS> SAM > > Alexander Löser wrote: > > >Hi, > >for a simulation network I need to make assumptions on: > >1.) the distribution of queries among users in file sharing networks. > >In particular I'm interested in data statistics like > > > >user1 5 queries > >user2 10 queries > >user3 2 queries > >..... > > > >Is there any research that shows how the queries are distributed among > >the users? 
Probably with a random distribution, normal distribution or > >zipf distribution? > > > >2.) the average uptime of users in file sharing networks, especially > >the average uptime and the uptime distribution ( zipf, normal, power > >law, random...?) > > > > > >Any suggestions are very much appreciated!! > > > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences -- ___________________________________________________________ Alexander Löser Technische Universitaet Berlin hp: http://cis.cs.tu-berlin.de/~aloeser/ office: +49- 30-314-25551 fax : +49- 30-314-21601 ___________________________________________________________ From oskar.s at gmail.com Wed Oct 13 10:59:46 2004 From: oskar.s at gmail.com (Oskar Sandberg) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Help on Query and Uptime Distribution Model among users in File Sharing Networks In-Reply-To: <416CEA31.4F4BF4BD@cs.tu-berlin.de> References: <416A4AC8.665271E3@cs.tu-berlin.de> <416C8101.6020109@neurogrid.com> <416CEA31.4F4BF4BD@cs.tu-berlin.de> Message-ID: On Wed, 13 Oct 2004 10:41:22 +0200, Alexander Löser wrote: > Hi Sam, > it seems that especially for the query distribution over the users, e.g. how > many users only issue a few queries, two contrary assumptions are made in the > literature: In http://www.cse.psu.edu/~meli/papers/icnp04.pdf Li et al. assume > for the QUERY SEED (see page 7 right above) a ZIPF Distribution while Schlosser > et al. http://www.stanford.edu/~sdkamvar/papers/simulator.pdf assume a > POISSON distribution. Do you know any work based on real world Gnutella > simulations that prove either the first or the second hypothesis? 
The paper by Schlosser et al. does not assume that the number of nodes having a particular query rate follows a Poisson distribution. I cannot imagine what kind of reasoning could lead to that assumption. What it does say is that the number of queries a given node generates in a given time is Poisson distributed with some intensity (rate). That is the typical assumption for an event model if one doesn't have reason to assume that the rate of queries changes with time (which one probably does, but it is difficult to model). The expected number of queries in a unit of time is then the same as the intensity of the distribution, which they selected uniformly for each node between a minimum and maximum value. Thus the average number of queries per unit time that a node issues is also uniformly distributed between these values. So you are right to say that there are two contrary assumptions, but they are between the Zipf (harmonic) distribution and a uniform distribution between two values. Without empirical data it is difficult to say which is better, but it would surprise me if the data didn't seem to indicate something in between. // oskar From sam at neurogrid.com Wed Oct 13 17:30:26 2004 From: sam at neurogrid.com (Sam Joseph) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Help on Query and Uptime Distribution Model among users in File Sharing Networks In-Reply-To: <416CEA31.4F4BF4BD@cs.tu-berlin.de> References: <416A4AC8.665271E3@cs.tu-berlin.de> <416C8101.6020109@neurogrid.com> <416CEA31.4F4BF4BD@cs.tu-berlin.de> Message-ID: <416D6632.3010707@neurogrid.com> Hi Alex, I believe Kunwadee has the kind of data that might answer your question: http://www-2.cs.cmu.edu/~kunwadee/research/p2p/gnutella.html Although I'm not sure that the above paper covers specifically what you need.
What I mean to say is that I have seen Kunwadee's data and it has timestamp, IP address, and query string, so you could work out something about the query distribution over IP addresses from the data. However, I don't know if Kunwadee himself has done this. You might try and contact Kunwadee to get hold of his data. I think the fundamental difficulty in getting this kind of information out of Gnutella traces is that queries get forwarded and are not differentiated by source, i.e. knowing which IP address a query came in from does not tell you precisely who sent it. I think this is one of the reasons why there is a lack of data on the query distribution over users. The closest you can probably get is to look at something like the query distribution over users of web sites. Unless of course there is some other non-trivial open protocol p2p network that doesn't mask the identity of those querying, which you could extract data from ... CHEERS> SAM Alexander Löser wrote: >Hi Sam, >it seems that especially for the query distribution over the users, e.g. how >many users only issue a few queries, two contrary assumptions are made in the >literature: in http://www.cse.psu.edu/~meli/papers/icnp04.pdf Li et al. assume >for the QUERY SEED (see page 7 right above) a ZIPF distribution, while Schlosser >et al. http://www.stanford.edu/~sdkamvar/papers/simulator.pdf assume a >POISSON distribution. Do you know any work based on real-world Gnutella >measurements that supports either the first or the second hypothesis? > >Alex > > > > > > > >Sam Joseph wrote: > > > >>Hi Alex, >> >>I'm sorry not to answer your questions directly as I'm currently snowed >>under with non-p2p related work. However perhaps I can point you to my >>paper "An Extendible Open Source P2P Simulator", which I believe >>contains references to various bits of research relating to your questions.
>> [...] From seth.johnson at RealMeasures.dyndns.org Sat Oct 16 06:12:17 2004 From: seth.johnson at RealMeasures.dyndns.org (Seth Johnson) Date: Sat
Dec 9 22:12:43 2006 Subject: [p2p-hackers] FTC P2P Workshop Seeks Panelists References: <40F24EC6.94C3637D@RealMeasures.dyndns.org> Message-ID: <4170BBC1.B98CEB82@RealMeasures.dyndns.org> > http://www.ftc.gov/opa/2004/10/p2p.htm FTC to Host Two-day Peer-to-peer File-sharing Workshop The Federal Trade Commission will host a public workshop, “Peer-to-Peer File-Sharing Technology: Consumer Protection and Competition Issues,” to explore consumer protection and competition issues associated with the distribution and use of peer-to-peer (P2P) file-sharing. The workshop will be held December 15 and 16, 2004. It is free and open to the public. A Federal Register Notice to be published shortly says the workshop is intended to provide an opportunity to learn how P2P file-sharing works and to discuss current and future applications of the technology. The workshop will focus on: The uses of P2P file-sharing technology; The role of P2P file-sharing technology in the economy; Identification and disclosure of P2P file-sharing software program risks; Technological solutions to protect consumers from risks associated with P2P file- sharing software programs; P2P file-sharing and music distribution; and P2P file-sharing and its impact on copyright holders. Interested parties can submit written comments to Federal Trade Commission, Office of the Secretary, Room 159-H (Annex B), 600 Pennsylvania Avenue N.W., Washington, DC, 20580. The Commission is particularly interested in studies, surveys, research, and other empirical data related to P2P file-sharing. Comments and envelopes should be marked “P2P File-Sharing Workshop – Comment P034517.” Interested parties are encouraged to submit comments electronically at: https://secure.commentworks.com/ftc-p2pfilesharing/. Persons seeking to participate as panelists in the workshop must notify the FTC in writing of their interest in participating and describe their expertise in, or knowledge of, the issues. 
Interested parties may submit requests to participate by mail, as indicated above, or electronically to: filesharingworkshop@ftc.gov. Panelists will be selected based on whether they have expertise or knowledge, and whether their participation would promote a balance of interests at the workshop. Comments and requests to participate must be submitted on or before Monday, November 15, 2004. Panelists will be notified on or before Monday, November 29, 2004, if they have been selected. A detailed agenda and additional information on the workshop will be posted on the FTC’s Web site at: www.ftc.gov/bcp/workshops/filesharing/index.htm. The commission vote to issue the Federal Register Notice was 4-0-1, with Commissioner Jon Leibowitz not participating. Copies of the Federal Register Notice are available from the FTC’s Web site at http://www.ftc.gov and also from the FTC’s Consumer Response Center, Room 130, 600 Pennsylvania Avenue, N.W., Washington, D.C. 20580. The FTC works for the consumer to prevent fraudulent, deceptive, and unfair business practices in the marketplace and to provide information to help consumers spot, stop, and avoid them. To file a complaint in English or Spanish (bilingual counselors are available to take complaints), or to get free information on any of 150 consumer topics, call toll-free, 1-877-FTC-HELP (1-877-382-4357), or use the complaint form at http://www.ftc.gov. The FTC enters Internet, telemarketing, identity theft, and other fraud-related complaints into Consumer Sentinel, a secure, online database available to hundreds of civil and criminal law enforcement agencies in the U.S. and abroad. MEDIA CONTACT: Claudia Bourne Farrell, Office of Public Affairs 202-326-2181 STAFF CONTACT: Elizabeth Delaney, Bureau of Consumer Protection 202-326-2903 (FTC File No. 
P03 4577) (http://www.ftc.gov/opa/2004/10/p2p.htm) From p2p-hackers at ryanb.org Wed Oct 20 04:45:13 2004 From: p2p-hackers at ryanb.org (Ryan Barrett) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Announcement: p4 Message-ID: Hi all. We've just released p4, a peer-to-peer overlay network library. P4's hook is that it's a library, not an app, and that it provides a network abstraction of users and messages, instead of files or "resources" like most overlay networks. P4 is GPLed. Packages, documentation, protocol spec, etc. are here: http://snarfed.org/space/p4 The p4 library provides an API that exposes network functionality including unicast, multicast, user discovery, application discovery, and hooks for encryption and authentication. This allows developers to focus on building real functionality on top of the overlay network. To be fair, p4's routing algorithms are rudimentary when compared to most modern DHTs. (They weren't even state of the art when it was developed in 2001, and the state of the art has advanced a fair amount since then. :P) However, it has a few desirable features - it's highly portable, has a small memory footprint, and comes as a single 270 KB library. Feedback is welcome! Ken Ashcraft http://www.stanford.edu/~kash Ryan Barrett http://ryan.barrett.name/ Maulik Shah http://maulik.net/ Nathan Stoll http://cs.stanford.edu/~nstoll From john.casey at gmail.com Wed Oct 20 12:19:04 2004 From: john.casey at gmail.com (John Casey) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Opaque Key Access in DHT's ?? Message-ID: Hi All, just wondering if there is any literature on anything but opaque key access in DHTs? I am interested in creating a signature-based search engine/data indexer on top of a DHT, but it doesn't seem feasible when you consider the number of empty signatures that are possible. You don't really want to be doing a lookup() operation on a node many hops away just to find out the bucket you were interested in is empty!
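One compact way for a node to avoid such wasted lookups is to publish a summary of the keys it actually stores. A minimal Bloom-filter sketch of that idea (illustrative Python only; the sizes and the double-hashing index construction are assumptions, not any particular DHT's format):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a compact set summary with false
    positives but no false negatives."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Double hashing: derive num_hashes indices from two
        # halves of one SHA-1 digest (one common construction).
        d = hashlib.sha1(key.encode()).digest()
        h1 = int.from_bytes(d[:10], "big")
        h2 = int.from_bytes(d[10:], "big")
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        # True may be a false positive; False is always correct.
        return all(self.bits & (1 << p) for p in self._positions(key))

# A node summarizes the keys it actually holds ...
cache_digest = BloomFilter()
for stored_key in ["song.mp3", "paper.pdf"]:
    cache_digest.add(stored_key)

# ... and a neighbor tests locally before paying for a remote lookup().
assert cache_digest.might_contain("song.mp3")
```

A false positive costs one wasted multi-hop lookup; a stored key is never reported absent, and the digest itself is a few hundred bytes rather than the full key list.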
Did anyone ever investigate some sort of cache digest (aka. bloom filter) to fulfil this purpose ?? From jdd at dixons.org Wed Oct 20 15:03:11 2004 From: jdd at dixons.org (Jim Dixon) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Opaque Key Access in DHT's ?? In-Reply-To: References: Message-ID: On Wed, 20 Oct 2004, John Casey wrote: > Hi All, just wondering if there was any literature on anything but > opaque key access in DHT's? I am interested in creating a signature > based search engine/data indexer on top of a DHT but it doesn't seem > feasible when you consider the number of empty signatures that are > possible. You don't really want to be doing a lookup() operation on a > node many hops away just to find out the bucket you were interested in > is empty! Did anyone ever investigate some sort of cache digest (aka. > bloom filter) to fulfil this purpose ?? You might look at the org.xlattice.crypto.filters package at http://xlattice.sourceforge.net/components/crypto/src The package was designed to allow neighbors to cheaply look at one another's caches. -- Jim Dixon jdd@dixons.org tel +44 117 982 0786 mobile +44 797 373 7881 http://jxcl.sourceforge.net Java unit test coverage http://xlattice.sourceforge.net p2p communications infrastructure From adamsz at gmail.com Wed Oct 20 17:36:34 2004 From: adamsz at gmail.com (Adam Souzis) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Opaque Key Access in DHT's ?? In-Reply-To: References: Message-ID: Tapestry uses a hierarchical bloom filter data structure for this purpose. I think they wrote a paper on it (or at least there are java docs on it) : http://www.cs.berkeley.edu/~ravenben/tapestry/ -- adam On Wed, 20 Oct 2004 22:19:04 +1000, John Casey wrote: > Hi All, just wondering if there was any literature on anything but > opaque key access in DHT's? 
I am interested in creating a signature > based search engine/data indexer on top of a DHT but it doesn't seem > feasible when you consider the number of empty signatures that are > possible. You don't really want to be doing a lookup() operation on a > node many hops away just to find out the bucket you were interested in > is empty! Did anyone ever investigate some sort of cache digest (aka. > bloom filter) to fulfil this purpose ?? > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From adamsz at gmail.com Wed Oct 20 18:54:19 2004 From: adamsz at gmail.com (Adam Souzis) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] using Kademlia as a distributed binary search tree? Message-ID: hi all, It seems that Kademlia could be modified slightly to enable it to become more like a distributed binary search tree (let's call that a DBT) and this approach has some advantages over other attempts to build searching primitives on a DHT (e.g. using locality preserving hashing with Chord). I'm hoping that those of you with a good understanding of Kademlia could sanity check the description below to make sure I'm not missing something big here. If all goes well, I'd like to try to implement a distributed data store based on these ideas. A brief overview of how it would be used can be found here: http://rhizome.liminalzone.org/FOAFPaper Note that Kademlia organizes its nodes as leaves in a binary tree by assigning each node a 160 bit ID. The value space of the keys of the DHT is mapped to that 160 bit range. With the modifications to the protocol described below, we can search for any keys that fall within a range of the valuespace and enable the node topology to follow the key distribution.
* We indicate which keys a node will store by associating the number of the significant bits (s) to consider in the node ID. Any key between 2**s and 2**(s+1) will be stored on that node. * Instead of randomly assigning node IDs, we only assign them when a node can no longer store any more keys (either because of storage or request load, though caching should mitigate the latter) -- when that happens, we split the node: another k nodes are assigned a node ID that matches the node ID's significant bits, with the next bit being the complement of the first non-significant bit. Then the first node's significant bit count is incremented, and STORE_VALUEs are sent for all the keys that are no longer covered by the original node. (As an optimization, you don't always need to add k new nodes but i won't get into that here.) * Now we can introduce a new RPC message to the protocol: QUERY_VALUE. Instead of finding a whole key, only the leading bits of a key are searched for. It can be used to build search primitives like range, nearest neighbor or wildcard searches. It's identical to the FIND_VALUE message -- except in the following ways: ** The leading bits are padded with trailing zeros, and contacted nodes return all the values whose keys match. ** FIND_VALUE stops as soon as a node returns a value; however, QUERY_VALUE needs to contact all k closest nodes to be sure that it exhaustively retrieved all values associated with the key range. ** QUERY_VALUE is recursive in this way: for each node that returns values and has more significant bits than the query value, we must generate new QUERY_VALUE requests with additional bits appended to the query value in order to exhaust the value space. This means the msgs may increase geometrically for more general searches, so QUERY_VALUE searches should have some bounds to prevent flooding the system.
** Caching the results can be done as with FIND_VALUE, except that the original source node will have to be stored along with the value, so that QUERY_VALUE can use the cached values as a substitute for one of the k nearest nodes it needs to find. Some more details: * One consequence of these changes is that since node IDs are no longer guaranteed to be evenly distributed, the log(n) bound for routing time no longer holds. Instead the maximum number of hops is bounded by the maximum number of significant bits that a node may have. * Node growth implies physical nodes can have multiple node IDs, i.e., support multiple virtual nodes. We assume new physical nodes start with a randomly generated 160 bit node ID, just as Kademlia works today. When a node needs to split, it sends a NEW_NODE RPC message towards a randomly chosen 160 bit node ID. As it routes towards that ID, contacted nodes evaluate whether they have the capacity and capabilities to handle the request; if one can, it adds a virtual node with the requested node ID (as described above). * Keys will need to be more than 160 bits, but we'd only see large node IDs being added when an index gets populated with many, many keys, and while the routing table might end up with many k-buckets, most will be empty. Still, for indexes of relatively long strings or compound indexes, the key will need to be reduced somehow. * To handle the case where there are more values for a given key than a single node can store (aka the "britney spears effect") we can append a random salt or maybe a digest of the value to the key to allow a node to split. (BTW, a wiki page version of this description can be found at http://rhizome.liminalzone.org/DBT) thanks, adam From john.casey at gmail.com Wed Oct 20 23:03:31 2004 From: john.casey at gmail.com (John Casey) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Opaque Key Access in DHT's ?? In-Reply-To: References: Message-ID: Hi, thanks guys. I am just reading up on Tapestry now.
From what I have read it seems they are able to summarise a large range of hash values using bloom filters. What I am asking is: are they able to accurately create key digests for the ranges of keys distributed across each of the nodes? For example, if you have 4 nodes in a bamboo/pastry system that would mean (I think) each node gets 2^160/4 of the range. Can you really successfully summarize 2^40 keys using bloom filters ??? On Wed, 20 Oct 2004 10:36:34 -0700, Adam Souzis wrote: > Tapestry uses a hierarchical bloom filter data structure for this > purpose. I think they wrote a paper on it (or at least there are java > docs on it) : > > http://www.cs.berkeley.edu/~ravenben/tapestry/ From john.casey at gmail.com Wed Oct 20 23:40:51 2004 From: john.casey at gmail.com (John Casey) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Opaque Key Access in DHT's ?? In-Reply-To: References: Message-ID: To be a bit more specific, I am talking about summarizing the keys that are actually in use. Because in reality there could be many keys that have no values. That's what I am talking about. On Thu, 21 Oct 2004 09:03:31 +1000, John Casey wrote: > Hi, thanks guys. I am just reading up on Tapestry now. From what > I have read it seems they are able to summarise a large range of hash > values using bloom filters. What I am asking is are they able to > accurately create key digests for the ranges of keys distributed > across each of the nodes ? For example, if you have 4 nodes in a > bamboo/pastry system that would mean (I think) each node gets 2^160/4 > of the range. Can you really successfully summarize 2^40 keys using > bloom filters ??? > > > > On Wed, 20 Oct 2004 10:36:34 -0700, Adam Souzis wrote: > > Tapestry uses a hierarchical bloom filter data structure for this > > purpose.
I think they wrote a paper on it (or at least there are java > > docs on it) : > > > > http://www.cs.berkeley.edu/~ravenben/tapestry/ > From zooko at zooko.com Thu Oct 21 08:38:15 2004 From: zooko at zooko.com (Zooko Wilcox-O'Hearn) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] using Kademlia as a distributed binary search tree? In-Reply-To: References: Message-ID: <8D0613BB-233C-11D9-A2F8-000A95E2A184@zooko.com> On 2004, Oct 20, at 15:54, Adam Souzis wrote: > It seems that Kademlia could be modified slightly to enable it to > become a more like distributed binary search tree (let's call that a > DBT) With your modification, the tree could devolve into a linked list in response to certain patterns of load. This linked list would require O(n) hops for n nodes in the network, and it would also be very fragile -- loss of any node would partition the network. Regards, Zooko From adamsz at gmail.com Thu Oct 21 17:03:58 2004 From: adamsz at gmail.com (Adam Souzis) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] using Kademlia as a distributed binary search tree? In-Reply-To: <8D0613BB-233C-11D9-A2F8-000A95E2A184@zooko.com> References: <8D0613BB-233C-11D9-A2F8-000A95E2A184@zooko.com> Message-ID: On 21 Oct 2004 05:38:15 -0300, Zooko Wilcox-O'Hearn wrote: > On 2004, Oct 20, at 15:54, Adam Souzis wrote: > > > It seems that Kademlia could be modified slightly to enable it to > > become a more like distributed binary search tree (let's call that a > > DBT) > > With your modification, the tree could devolve into a linked list in > response to certain patterns of load. This linked list would require > O(n) hops for n nodes in the network, and it would also be very fragile > -- loss of any node would partition the network. > No -- I don't think you're getting it -- let me highlight two quotes from my email: >>We indicate which keys a node will store by associating the number >>of the significant bits (s) to consider in the node ID. 
Any key >>between 2**s and 2**(s+1) will be stored on that node. and >>since node IDs are no >>longer guaranteed to be evenly distributed the log(n) bound for >>routing time no longer holds. Instead the maximum number of hops is >>bounded by the maximum number of significant bits that a node may >>have. An easy way to visualize the structure is to think of each node being a subtree or branch of the binary tree -- that's what the significant bit number indicates. The assumption is that each node would store thousands or millions of keys -- what you are saying could only happen if each node could only store one key -- that condition would be a non-starter with any p2p network :) But even if that was true I don't see how the network is more fragile -- just impossibly slow. The k-bucket routing tables would be very sparse but still guaranteed to work as long as keys are replicated on k nodes. -- adam > Regards, > > Zooko > > From Euseval at aol.com Fri Oct 22 07:12:46 2004 From: Euseval at aol.com (Euseval@aol.com) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] new p2p architecture: JetiAnts.tk Message-ID: <1df.2cb1a0bd.2eaa0cee@aol.com> see here, anonymous filesharing app http://www.jetiants.tk any comments ? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://zgp.org/pipermail/p2p-hackers/attachments/20041022/a64bd19e/attachment.html From anwitaman at hotmail.com Fri Oct 22 11:51:39 2004 From: anwitaman at hotmail.com (Anwitaman Datta) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] RE:using Kademlia as a distributed binary search Message-ID: This is a very interesting question. Actually, we address many of these problems in the context of the P-Grid project (www.p-grid.org). Topologically the P-Grid routing network is similar to (same as) Kademlia, but the network is built in an emergent manner (for load-balancing).
And even though the tree so formed will be unbalanced, we prove that randomized routing keeps the routing cost logarithmic (in terms of the number of leaf nodes) in expectation. Look at the website for relevant literature, but the most important one will be: http://www.p-grid.org/Papers/TR-IC-2004-23.pdf For a simple overview of the P-Grid routing network, also check http://www.p-grid.org/Papers/P-Grid-SIGMOD2003.pdf Rgds, Anwitaman From: Adam Souzis Subject: [p2p-hackers] using Kademlia as a distributed binary search tree? To: p2p-hackers@zgp.org Message-ID: Content-Type: text/plain; charset=US-ASCII [...] ------------------------------ End of p2p-hackers Digest, Vol 15, Issue 8 ******************************************
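The bit-prefix arithmetic in the scheme discussed above -- a node whose ID has s significant bits owns every key sharing those s bits, and splitting hands the complementary half-range to freshly assigned IDs -- can be sketched as a toy model (illustrative Python; `VirtualNode`, `covers`, and `split` are made-up names, not Kademlia RPCs):

```python
ID_BITS = 160  # Kademlia-style 160-bit ID space

class VirtualNode:
    """Toy model: a node owns the subtree of keys that share the
    first sig_bits bits of its node ID."""

    def __init__(self, node_id, sig_bits):
        self.node_id = node_id
        self.sig_bits = sig_bits

    def key_range(self):
        # All keys whose top sig_bits equal the node ID's top sig_bits.
        shift = ID_BITS - self.sig_bits
        prefix = self.node_id >> shift
        return prefix << shift, (prefix + 1) << shift  # [low, high)

    def covers(self, key):
        low, high = self.key_range()
        return low <= key < high

    def split(self):
        """When full, consider one more bit: this node keeps the keys
        matching its own next bit; the sibling half-range goes to a
        node whose ID has that bit complemented."""
        self.sig_bits += 1
        shift = ID_BITS - self.sig_bits
        sibling_prefix = (self.node_id >> shift) ^ 1
        return VirtualNode(sibling_prefix << shift, self.sig_bits)

# A node with 1-bit prefix "1" owns the top half of the key space.
n = VirtualNode(1 << (ID_BITS - 1), sig_bits=1)
assert n.covers(2**159) and not n.covers(0)

sibling = n.split()  # n now owns prefix "10", the sibling owns "11"
assert n.covers(2**159) and not n.covers(2**159 + 2**158)
assert sibling.covers(2**159 + 2**158)
```

Because splits only happen where keys actually accumulate, prefixes (and hence hop counts) grow with the local key density rather than uniformly, which is exactly the unbalanced-tree trade-off debated in the thread.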
From adamsz at gmail.com Thu Oct 28 19:00:38 2004 From: adamsz at gmail.com (Adam Souzis) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] RE:using Kademlia as a distributed binary search In-Reply-To: References: Message-ID: hmm, P-Grid (and Gridvine) looks very promising -- are you planning to publically release an implementation of them? thanks, adam On Fri, 22 Oct 2004 17:21:39 +0530, Anwitaman Datta wrote: > This is a very interesting question. > Actually, we address many of these problems in the context of P-Grid project > (www.p-grid.org). > Topologically P-Grid routing network is similar to (same as) Kademlia, but > the network is built in an emergent manner (for load-balancing). And even > though the tree so formed will be unbalanced, we prove that randomized > routing keeps the routing cost logarithmic (in terms of number of leaf > nodes) in expectation. > > Look at the website for relevant literature, but most important one will be: > > http://www.p-grid.org/Papers/TR-IC-2004-23.pdf > > For a simple overview of the P-Grid routing network, > also check http://www.p-grid.org/Papers/P-Grid-SIGMOD2003.pdf > > Rgds, > Anwitaman > > From: Adam Souzis > Subject: [p2p-hackers] using Kademlia as a distributed binary search > tree? > To: p2p-hackers@zgp.org > Message-ID: > Content-Type: text/plain; charset=US-ASCII > > hi all, > > It seems that Kademlia could be modified slightly to enable it to > become a more like distributed binary search tree (let's call that a > DBT) and this approach has some advantages over other attempts to > build searching primitives on a DHT (e.g. using locality preserving > hashing with Chord). > > I'm hoping that those of you with a good understanding of Kademlia > could sanity check the description below to make sure I'm not missing > something big here. If all goes well, I'd like to try to implement a > distributed data store based on these ideas. 
A brief overview of how > it would be used can be found here: > http://rhizome.liminalzone.org/FOAFPaper > > Note that Kademlia organizes its nodes as leaves in a binary tree by > assigning each node a 160 bit ID. The value space of the keys of the > DHT is mapped to that 160 bit range. With the modifications to the > protocol described below, we can search for any keys that fall within > a range of the value space and enable the node topology to follow the > key distribution. > > * We indicate which keys a node will store by associating the number > of significant bits (s) to consider with the node ID. Any key > between 2**s and 2**(s+1) will be stored on that node. > > * Instead of randomly assigning node IDs, we only assign them when a > node can no longer store any more keys (either because of storage or > request load, though caching should mitigate the latter) -- when that > happens, we split the node: another k nodes are assigned a node ID > that matches the node ID's significant bits, with the next bit being the > complement of the first non-significant bit. Then the first node's > significant bit count is incremented, and STORE_VALUEs are sent for all > the keys that are no longer covered by the original node (as an > optimization, you don't always need to add k new nodes, but I won't get > into that here). > > * Now we can introduce a new RPC message to the protocol: QUERY_VALUE. > Instead of finding a whole key, only the leading bits of the key are > searched for. It can be used to build search primitives like range, > nearest neighbor, or wildcard searches. It's identical to the > FIND_VALUE message -- except in the following ways: > ** The leading bits are padded with trailing zeros, and contacted nodes > return all the values whose keys match. > ** FIND_VALUE stops as soon as a node returns a value; however, > QUERY_VALUE needs to contact all k closest nodes to be sure that it > has exhaustively retrieved all values associated with the key range. 
> ** QUERY_VALUE is recursive in this way: for each node that returns > values and has more significant bits than the query value, we must generate > new QUERY_VALUE requests with additional bits in > order to exhaust the value space. This means the messages may increase > geometrically for more general searches, so QUERY_VALUE searches > should have some bounds to prevent flooding the system. > ** Caching the results can be done as with FIND_VALUE, except that the > original source node will have to be stored along with the value, so > that QUERY_VALUE can use the cached values as a substitute for one of > the k nearest nodes it needs to find. > > Some more details: > * One consequence of these changes is that since node IDs are no > longer guaranteed to be evenly distributed, the log(n) bound for > routing time no longer holds. Instead, the maximum number of hops is > bounded by the maximum number of significant bits that a node may > have. > * Node growth implies physical nodes can have multiple node IDs, i.e., > support multiple virtual nodes. We assume new physical nodes start > with a randomly generated 160 bit node ID, just as Kademlia works > today. When a node needs to split, it sends a NEW_NODE RPC message > towards a randomly chosen 160 bit node ID. As it routes towards that > ID, each contacted node evaluates whether it has the capacity and > capabilities to handle the request; if it does, it adds a virtual node > with the requested node ID (as described above). > * Keys will need to be more than 160 bits, but we'd only see large node > IDs being added when an index gets populated with many, many keys, and > while the routing table might end up with many k-buckets, most will be > empty. Still, for indexes of relatively long strings or compound > indexes, the key will need to be reduced somehow. 
> * To handle the case where there are more values for a given key than > a single node can store (aka the "Britney Spears effect") we can > append a random salt or maybe a digest of the value to the key to > allow a node to split. > > (BTW, a wiki page version of this description can be found at > http://rhizome.liminalzone.org/DBT) > > thanks, > adam > > ------------------------------ > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > > End of p2p-hackers Digest, Vol 15, Issue 8 > ****************************************** > > _______________________________________________ > p2p-hackers mailing list > p2p-hackers@zgp.org > http://zgp.org/mailman/listinfo/p2p-hackers > _______________________________________________ > Here is a web page listing P2P Conferences: > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences > From eugen at leitl.org Fri Oct 29 18:29:38 2004 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] [FoRK] Peer quest for peerless peer-to-peer knowledge (fwd from sdw@lig.net) Message-ID: <20041029182938.GG1457@leitl.org> ----- Forwarded message from "Stephen D. Williams" ----- From: "Stephen D. Williams" Date: Fri, 29 Oct 2004 13:46:12 -0400 To: fork@xent.com Subject: [FoRK] Peer quest for peerless peer-to-peer knowledge User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913) I know a number of you are giants among peers in peer-to-peer. 
Could you please forward any links or documents that analyze various architectures for peer-to-peer, distributed communication, zeroconfig, and route detection and maintenance (like Ant Routing), show requirements, successes and failures, and illustrate best current practices? I'm interested in both Internet-scale and Internet-realistic approaches and those that may rely on more progressive ideas. I'm not so interested in simplistic, inefficient, half-duplex, and heavy systems like HTTP/SOAP. I'm aware of a number of systems, but help in pointing right to the most complete analysis a la [Fielding2000] would be helpful. I have been designing some reusable middleware with specialized support for knowledge processing, and I'd like to validate, extend, and complete ideas and design details with all available knowledge. I'm not planning a startup anytime soon; this work is not likely to be a product per se. Systems like Skype, SETI at Home, and Jabber have a lot of the characteristics I'm interested in. Thanks! sdw [Fielding2000] http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm _______________________________________________ FoRK mailing list http://xent.com/mailman/listinfo/fork ----- End forwarded message ----- -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20041029/9642f303/attachment.pgp From mgp at ucla.edu Fri Oct 29 22:30:47 2004 From: mgp at ucla.edu (Michael Parker) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] Optimized UDP Protocols Message-ID: <4182C497.4040504@ucla.edu> Hi all, I know that in peer-to-peer networks with high levels of churn and activity, TCP isn't really the protocol of choice because of the overhead caused by nodes constantly arriving and departing, retransmissions due to queuing, cross-traffic, lost packets, etc. That said, I've read in many papers that UDP has been used with success. Most protocols simply put a reliable, TCP-like layer on top of it, maintaining exponentially weighted averages and variances of RTTs to other nodes, slow-start congestion windows, etc. Has there been a survey of the different reliable UDP protocols used in peer-to-peer networks, either in simulation or in real-life experimental setups? Are there any optimizations that we can make to this reliable UDP protocol that will make it perform better assuming high levels of churn? The only paper I've seen that tries to "optimize" a UDP protocol for high churn has been the one on the Bamboo DHT, a variant of the Pastry DHT. Just curious if anyone knew of any others. Thanks, Michael Parker From Euseval at aol.com Fri Oct 29 21:35:13 2004 From: Euseval at aol.com (Euseval@aol.com) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] [FoRK] Peer quest for peerless peer-to-peer knowledge (fwd from sdw@lig.net) Message-ID: <3C7DE8CC.421143FD.00175F91@aol.com> see here: http://www.JetiAnts.tk In an email of Fri, 29 Oct 2004, 19:29 CET, Eugen Leitl writes: >----- Forwarded message from "Stephen D. Williams" ----- > >From: "Stephen D. 
Williams" >Date: Fri, 29 Oct 2004 13:46:12 -0400 >To: fork@xent.com >Subject: [FoRK] Peer quest for peerless peer-to-peer knowledge >User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913) > >I know a number of you are giants among peers in peer-to-peer. > >Could you please forward any links or documents that analyze various >architectures for peer-to-peer, distributed communication, zeroconfig, >and route detection and maintenance (like Ant Routing), show >requirements, successes and failures, and illustrate best current practices? > >I'm interested in both Internet scale and Internet realistic approaches >and those that may rely on more progressive ideas. I'm not so >interested in simplistic, inefficient, half-duplex, and heavy systems >like HTTP/Soap. > >I'm aware of a number of systems, but help in pointing right to the most >complete analysis a la [Fielding2000] would be helpful. I have been >designing some reusable middleware with specialized support for >knowledge processing and I'd like to validate, extend, and complete >ideas and design details with all available knowledge. I'm not planning >a startup anytime soon, this work is not likely to be a product per se. > >Systems like Skype, SETI at Home, and Jabber have a lot of the >characteristics I'm interested in. > >Thanks! >sdw > >[Fielding2000] http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm > >_______________________________________________ >FoRK mailing list >http://xent.com/mailman/listinfo/fork > >----- End forwarded message ----- >-- >Eugen* Leitl leitl >______________________________________________________________ >ICBM: 48.07078, 11.61144 http://www.leitl.org >8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE >http://moleculardevices.org 
http://nanomachines.net > From eugen at leitl.org Sat Oct 30 18:25:24 2004 From: eugen at leitl.org (Eugen Leitl) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] P2P Not Dead, Just Hiding (fwd from brian-slashdotnews@hyperreal.org) Message-ID: <20041030182524.GE1457@leitl.org> ----- Forwarded message from brian-slashdotnews@hyperreal.org ----- From: brian-slashdotnews@hyperreal.org Date: 30 Oct 2004 01:26:01 -0000 To: slashdotnews@hyperreal.org Subject: P2P Not Dead, Just Hiding User-Agent: SlashdotNewsScooper/0.0.3 Link: http://slashdot.org/article.pl?sid=04/10/29/1915210 Posted by: michael, on 2004-10-29 23:59:00 from the hunting-wabbits dept. adavies42 writes "Contrary to media reports, P2P [1]is not dying ([2]PDF); it's just becoming harder to detect. In a paper for CAIDA, the Cooperative Association for Internet Data Analysis, researchers present evidence that the supposed decline in P2P traffic is actually due to a decline in easy-to-track protocols as those that change port numbers on a regular basis become more popular." References 1. http://www.caida.org/outreach/papers/2004/p2p-dying/ 2. http://www.caida.org.nyud.net:8090/outreach/papers/2004/p2p-dying/p2p-dying.pdf ----- End forwarded message ----- -- Eugen* Leitl leitl ______________________________________________________________ ICBM: 48.07078, 11.61144 http://www.leitl.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE http://moleculardevices.org http://nanomachines.net -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://zgp.org/pipermail/p2p-hackers/attachments/20041030/3d02a07d/attachment.pgp From Euseval at aol.com Sun Oct 31 14:11:08 2004 From: Euseval at aol.com (Euseval@aol.com) Date: Sat Dec 9 22:12:43 2006 Subject: [p2p-hackers] QNEXT versus GAIM and JABBER and JetiAnts.tk Message-ID: <4AC104D4.77BCF02E.00175F91@aol.com> http://www.zeropaid.com/bbs/showthread.php?threadid=24148 Qnext 1.0.4.75. all-in-one Messenger for Filesharing -------------------------------------------------------------------------------- Future of Peer-to-Peer Communications & Sharing One powerful P2P software to communicate, share and interact with rich media with complete security and no restrictions on file size. Universal Messenger (Chat with friends on Yahoo, AIM, MSN & ICQ!) Voice over IP Video Conferencing Unlimited File Transfer p2p File Sharing Online Gaming Group Text Chat Advanced Photo Sharing Remote PC Access Our Universal Instant Messenger keeps you connected to all popular IMs; you can also Talk Online anywhere around the world for free, host a live Video Conference or Transfer Files of any size with complete security. But the closer you look, the better it gets. Qnext allows you to create private, shared environments called Zones where you invite other users to securely share your content and online services. A Zone can contain exciting services such as File Sharing, Group Text Chat, Qnext Photo and Live Online Games. And this is just the beginning: Qnext will be offering many new services to enhance your online experience. We have also included a powerful remote access service that gives you fast, easy and secure access to your computer from anywhere in the world using QnextMyPC. Downloading and Using Qnext is Always Free! 
Stay connected to your online buddy lists on other IM clients High level security using 192-bit data encryption Fast, direct P2P connections between users Easy content and service sharing using Qnext Zones Permission-based sharing between friends Multi-language support Personalize Qnext with downloadable skins Put an End to FTP With Qnext you can take a file of any size from your computer and drag it onto another user's name. This initiates a secure P2P file transfer. A file can be 1 MB or 10 MB or your entire C drive! Qnext puts an end to FTP sites and email attachment limitations and makes handling large file transfers incredibly easy. Files are sent using a secure, encrypted P2P connection directly to the other user. Receiving a file requires the Qnext user to accept and initiate the transfer. Without any servers in the way to slow you down, file transfers occur as fast as your Internet service connection and computer system will allow. And nobody can ever hack or intercept your file transfer. You can send multiple files or folders to online and offline users. The Ultimate in Collaborative Computing Our sophisticated P2P engine allows you to securely share files with your friends, as a group or individually. By opening a Zone with the File Sharing service, you can instantly collaborate and download/upload files into the shared folders of your choice. We even allow you to create virtual folders that you share with other users. These virtual folders can be composed of multiple folders located anywhere on your hard drive. Your files always sit securely on your computer and you decide who has permission to access your File Sharing Zone. 
Create custom virtual folders for specific users Select download/upload/overwrite permissions No size limitations on files or folders Eliminate bulky email attachments Share and receive files without being in front of your computer Collaborate as a group in a shared, secure Zone You can include multiple Qnext services inside the same Zone that contains File Sharing. Also, you can host a Group Text Chat or talk live with an Audio/Video call while using File Sharing. http://www.qnext.com/index.html http://downloads2.qnext.com/QnextSetup.exe
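[Editor's note] Michael Parker's message earlier in this archive describes reliable-UDP layers that maintain "exponentially weighted averages and variances of RTTs to other nodes." That bookkeeping typically follows the Jacobson/Karels estimator used for TCP's retransmission timer. A minimal sketch follows; the class name and gain constants are illustrative (the constants match the conventional TCP values), not taken from any system mentioned in the thread.

```python
# Per-peer RTT tracking for a reliable-UDP layer: EWMA of the RTT plus
# an EWMA of its mean deviation, combined into a retransmission timeout.

class RttEstimator:
    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT deviation

    def __init__(self) -> None:
        self.srtt = None    # smoothed round-trip time (seconds)
        self.rttvar = None  # smoothed mean deviation of the RTT

    def sample(self, rtt: float) -> None:
        """Fold one measured round-trip time into the running estimates."""
        if self.srtt is None:
            # First sample seeds both estimates.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Update the deviation before the mean, using the old mean.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt

    def rto(self) -> float:
        """Retransmission timeout: smoothed RTT plus four deviations."""
        return self.srtt + 4 * self.rttvar
```

A node under churn would keep one such estimator per peer and back the timeout off exponentially on each retransmission; how aggressively to decay state for departed peers is exactly the kind of tuning question the message asks about.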