Received: from sog-mx-2.v43.ch3.sourceforge.com ([172.29.43.192]
	helo=mx.sourceforge.net)
	by sfs-ml-2.v29.ch3.sourceforge.com with esmtp (Exim 4.76)
	(envelope-from <ayeowch@gmail.com>) id 1XCcka-0001iO-JR
	for bitcoin-development@lists.sourceforge.net;
	Wed, 30 Jul 2014 22:53:36 +0000
Received-SPF: pass (sog-mx-2.v43.ch3.sourceforge.com: domain of gmail.com
	designates 209.85.214.178 as permitted sender)
	client-ip=209.85.214.178; envelope-from=ayeowch@gmail.com;
	helo=mail-ob0-f178.google.com; 
Received: from mail-ob0-f178.google.com ([209.85.214.178])
	by sog-mx-2.v43.ch3.sourceforge.com with esmtps (TLSv1:RC4-SHA:128)
	(Exim 4.76) id 1XCckZ-0005X8-DU
	for bitcoin-development@lists.sourceforge.net;
	Wed, 30 Jul 2014 22:53:36 +0000
Received: by mail-ob0-f178.google.com with SMTP id nu7so1031240obb.37
	for <bitcoin-development@lists.sourceforge.net>;
	Wed, 30 Jul 2014 15:53:30 -0700 (PDT)
MIME-Version: 1.0
X-Received: by 10.182.114.169 with SMTP id jh9mr160002obb.25.1406760809917;
	Wed, 30 Jul 2014 15:53:29 -0700 (PDT)
Received: by 10.76.25.8 with HTTP; Wed, 30 Jul 2014 15:53:29 -0700 (PDT)
In-Reply-To: <CAJHLa0O1EP8aUn4KLbo3OvzjgVfF8onrMjNnkRAnuWHwbofWBQ@mail.gmail.com>
References: <CAJHLa0O1EP8aUn4KLbo3OvzjgVfF8onrMjNnkRAnuWHwbofWBQ@mail.gmail.com>
Date: Thu, 31 Jul 2014 08:53:29 +1000
Message-ID: <CAA3bHnyk5etZvYmbsYcBqBwMLG5VJstbAJDzFrzNWU1bTTdzkg@mail.gmail.com>
From: Addy Yeow <ayeowch@gmail.com>
To: Jeff Garzik <jgarzik@bitpay.com>
Content-Type: multipart/alternative; boundary=001a11c2f63e5bf47904ff7106a1
X-Spam-Score: -0.6 (/)
X-Spam-Report: Spam Filtering performed by mx.sourceforge.net.
	See http://spamassassin.org/tag/ for more details.
	-1.5 SPF_CHECK_PASS SPF reports sender host as permitted sender for
	sender-domain
	0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
	(ayeowch[at]gmail.com)
	-0.0 SPF_PASS               SPF: sender matches SPF record
	0.0 NORMAL_HTTP_TO_IP URI: URI host has a public dotted-decimal IPv4
	address
	0.0 WEIRD_PORT URI: Uses non-standard port number for HTTP
	1.0 HTML_MESSAGE           BODY: HTML included in message
	-0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from
	author's domain
	0.1 DKIM_SIGNED            Message has a DKIM or DK signature,
	not necessarily valid
	-0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
X-Headers-End: 1XCckZ-0005X8-DU
Cc: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Subject: Re: [Bitcoin-development] Abusive and broken bitcoin seeders
X-BeenThere: bitcoin-development@lists.sourceforge.net
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: <bitcoin-development.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://sourceforge.net/mailarchive/forum.php?forum_name=bitcoin-development>
List-Post: <mailto:bitcoin-development@lists.sourceforge.net>
List-Help: <mailto:bitcoin-development-request@lists.sourceforge.net?subject=help>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=subscribe>
X-List-Received-Date: Wed, 30 Jul 2014 22:53:36 -0000

--001a11c2f63e5bf47904ff7106a1
Content-Type: text/plain; charset=UTF-8

I believe the requests Jeff is seeing came from my crawler, although anyone
could be running it (https://github.com/ayeowch/bitnodes), since there is no
IP address in the log to confirm the source of the requests.

This is a sample log of an actual request from my crawler at
148.251.238.178:
2014-07-30 22:43:54 receive version message: /getaddr.bitnodes.io:0.1/:
version 70001, blocks=313244, us=X.X.X.X:8333, them=0.0.0.0:0,
peer=148.251.238.178:47635

Currently, the crawler takes a full snapshot of the network of reachable
nodes as soon as it is done with the previous snapshot. I want to be able
to diff consecutive snapshots periodically to find the nodes that have
joined and left. Each full snapshot takes 3 to 4 minutes on average, hence
the requests that you see from the crawler every 3 to 4 minutes.
I have a task in my schedule
(https://github.com/ayeowch/bitnodes/wiki/Schedule#crawlpypingpy) to improve
upon this method by skipping new connections to currently reachable nodes
while still being able to perform the diff.
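
To illustrate the snapshot diff described above, here is a minimal sketch.
It assumes each snapshot is simply a set of (address, port) tuples; the
function name and the sample addresses are hypothetical and are not taken
from the bitnodes code.

```python
def diff_snapshots(previous, current):
    """Return (joined, left) for two snapshots of reachable nodes.

    Each snapshot is assumed to be an iterable of (address, port) tuples.
    """
    previous, current = set(previous), set(current)
    joined = current - previous  # reachable now, absent from the last snapshot
    left = previous - current    # in the last snapshot, no longer reachable
    return joined, left


# Hypothetical snapshots using documentation-range addresses.
prev_snapshot = {("192.0.2.1", 8333), ("192.0.2.2", 8333)}
curr_snapshot = {("192.0.2.2", 8333), ("192.0.2.3", 8333)}

joined, left = diff_snapshots(prev_snapshot, curr_snapshot)
```

Set difference keeps the comparison cheap even for several thousand nodes,
and nodes present in both snapshots (192.0.2.2 here) are the ones a crawler
could avoid reconnecting to.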


On Wed, Jul 30, 2014 at 11:22 PM, Jeff Garzik <jgarzik@bitpay.com> wrote:

> Seeing this on one of my public nodes:
> 2014-07-30 13:13:26 receive version message:
> /getaddr.bitnodes.io:0.1/: version 70001, blocks=313169,
> us=162.219.2.72:8333, peer=11847
> 2014-07-30 13:13:33 receive version message:
> /getaddr.bitnodes.io:0.1/: version 70001, blocks=290000,
> us=162.219.2.72:8333, peer=11848
> 2014-07-30 13:14:21 receive version message:
> /getaddr.bitnodes.io:0.1/: version 70001, blocks=313169,
> us=162.219.2.72:8333, peer=11849
>
> That is abusive, taking up public slots.  There is no reason to
> connect so rapidly to the same node.
>
> Other seeders are also rapidly reconnect'ers, though the time window
> is slightly more wide:
> 2014-07-30 13:09:35 receive version message: /bitcoinseeder:0.01/:
> version 60000, blocks=230000, us=162.219.2.72:8333, peer=11843
> 2014-07-30 13:12:42 receive version message: /bitcoinseeder:0.01/:
> version 60000, blocks=230000, us=162.219.2.72:8333, peer=11846
>
> The version message helpfully tells me my own IP address but not theirs ;p
>
> --
> Jeff Garzik
> Bitcoin core developer and open source evangelist
> BitPay, Inc.      https://bitpay.com/
>


--001a11c2f63e5bf47904ff7106a1--