[p2p-research] Yacy search engine

Heath Matlock heathmatlock at gmail.com
Tue Nov 10 23:21:18 CET 2009


On Tue, Nov 10, 2009 at 4:00 PM, Eugen Leitl <eugen at leitl.org> wrote:
> On Tue, Nov 10, 2009 at 03:49:49PM +0000, Heath Matlock wrote:
>
>> A friend and I had done some initial work on converting it to Python
>> (since we weren't fans of Java), but he got a job in Chicago and I
>
> Right. So no full Python port yet? Too bad.

We were going for a p2p anonymous search engine, because paranoia had
us both in its grasp. So we were only porting some aspects of it.

> How many nodes are out there, and how large the total index crawled?
> What's the average query response time? How is query farmed (and
> babby formed)? To how many nodes would it scale?

You want to visit http://yacystats.de , and if you don't speak German,
punch that in at http://translate.google.com . From there, you'll want
to look at the "Network Statistics" link and also the "Peer Table". I
went ahead and emailed a developer those questions, but I don't expect
a response. That was another reason we decided to fork the work: the
devs never responded to our inquiries, nor are they active on the
#yacy channel on Freenode.

Anyway, on average there are about 55 peers hosting a YaCy instance,
and about 1 billion links indexed and available. There are more peers
in total; 55 is just the average number available at any one time.
It's supposed to scale to an infinite number of nodes. It's slow,
since it's based on HTTP requests. Query response time depends on the
peers; the default timeout is
at 3 seconds at the moment. There aren't any statistics for this
available right now. "YaCy knows 5 different ways to acquire web
indexes"; you can find the details by downloading it and going to
http://localhost:8080/CrawlResults.html . The image without the
details is on my server at
http://ybit.ath.cx/images/yacy_index_monitor.png
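
To make the timeout point concrete, here is a rough Python sketch of
how a query fanned out to peers with a 3 second cutoff might behave.
This is not YaCy's actual code; query_peers and the fetch callback are
made-up names, and fetch stands in for whatever HTTP call a real peer
would need. Peers that error out or take longer than the timeout are
simply dropped, which is why response time depends on the peers.

```python
import concurrent.futures


def query_peers(peers, term, fetch, timeout=3.0):
    """Ask every peer for `term` in parallel and keep whatever arrives in time.

    `fetch(peer, term)` is a hypothetical callable standing in for the real
    HTTP request to a peer. `timeout` is the overall deadline in seconds
    (3 seconds is the default mentioned above).
    """
    results = {}
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=max(1, len(peers)))
    futures = {pool.submit(fetch, peer, term): peer for peer in peers}
    try:
        # as_completed raises TimeoutError once the overall deadline passes.
        for future in concurrent.futures.as_completed(futures, timeout=timeout):
            peer = futures[future]
            try:
                results[peer] = future.result()
            except Exception:
                # An unreachable or failing peer just contributes nothing.
                pass
    except concurrent.futures.TimeoutError:
        # Deadline hit: return what we have and ignore the slow peers.
        pass
    finally:
        # Do not block on peers that are still answering.
        pool.shutdown(wait=False)
    return results
```

You would call it as query_peers(list_of_peers, "some term", my_fetch),
where my_fetch does the actual request. Passing fetch in as a parameter
is only a convenience so the sketch can be tried without a network.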


-- 
Heath Matlock
+1 256 274 4225


