Message-ID: <1380600219.932.21.camel@Wailaba2>
From: Olivier Langlois <olivier@trillion01.com>
To: slush <slush@centrum.cz>
Date: Tue, 01 Oct 2013 00:03:39 -0400
In-Reply-To: <CAJna-Hi+eyRnZUtHpfvod_uRCmjPOL5HS3ZZpr54yzbKRRT9-w@mail.gmail.com>
References: <CAJna-Hi+eyRnZUtHpfvod_uRCmjPOL5HS3ZZpr54yzbKRRT9-w@mail.gmail.com>
Organization: Trillion01 Inc
Cc: "bitcoin-development@lists.sourceforge.net"
<bitcoin-development@lists.sourceforge.net>
Subject: Re: [Bitcoin-development] bitcoind stops responding
On Mon, 2013-09-30 at 22:44 +0200, slush wrote:
> Hi,
>
>
> For several weeks I have been observing more and more frequent issues with
> bitcoind. The problem is that bitcoind stops responding to RPC calls,
> but there is no other suspicious activity in the bitcoind log, CPU usage is
> low, disk I/O is normal, etc.
>
>
> I observed this problem with version 0.8.2, but it is still happening
> with 0.8.5. Originally this happened just once or twice per day. Today my
> monitoring scripts restarted bitcoind more than 30 times, which sounds
> alarming. This happens on various backends, so it isn't a problem with
> one specific node. Is anybody else observing a similar
> problem?

What a coincidence. I have observed the same thing, right now with
0.8.5. I am writing a small app. My JSON-RPC client is programmed to
time out after 2 seconds, and I did see a couple of timeouts once in a while.

So I wrote a simple test app that just hammers bitcoind with 3 RPC
requests every 30 seconds, and I abort it as soon as it encounters a
timeout.

The 3-request burst is performed on the same HTTP/1.1 keep-alive
connection. Then I disconnect. When I launch my app before leaving in
the morning, I am pretty sure to have a core dump waiting for me when I
come back.

I chose very simple calls: getinfo, getaccount
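
The probing logic described above (a short burst of calls, each with its own timeout, stopping at the first call that goes unanswered) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual test app: the transport is abstracted behind a hypothetical `RpcCall` hook, and `call_with_timeout` and `run_burst` are names made up for this sketch. A real client would issue the HTTP/1.1 JSON-RPC requests inside the hook and reuse one connection for the whole burst.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <future>
#include <optional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical transport hook: performs one RPC call and returns the raw reply.
// A real client would send an HTTP/1.1 JSON-RPC request to bitcoind here.
using RpcCall = std::function<std::string(const std::string& method)>;

// Run one call on a helper thread and give up waiting after `timeout`.
// Returns std::nullopt on timeout. The helper thread is detached, so a call
// that never returns leaks that thread; tolerable for a throwaway probe.
std::optional<std::string> call_with_timeout(const RpcCall& call,
                                             const std::string& method,
                                             std::chrono::milliseconds timeout) {
    // Capture by value: the helper thread may outlive this function.
    std::packaged_task<std::string()> task([call, method] { return call(method); });
    std::future<std::string> reply = task.get_future();
    std::thread(std::move(task)).detach();
    if (reply.wait_for(timeout) != std::future_status::ready) {
        return std::nullopt;
    }
    return reply.get();
}

// Send one burst of calls in order. Returns the index of the first call that
// timed out, or -1 if every call in the burst was answered in time.
int run_burst(const RpcCall& call, const std::vector<std::string>& methods,
              std::chrono::milliseconds timeout) {
    for (std::size_t i = 0; i < methods.size(); ++i) {
        if (!call_with_timeout(call, methods[i], timeout)) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

A driver would call `run_burst(call, {"getinfo", "getaccount"}, std::chrono::milliseconds(2000))` every 30 seconds and stop at the first result other than -1, which is the moment to capture the state of bitcoind.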

I added a couple of traces in the RPC handling code. (BTW, timestamps in
traces would be tremendously useful for tracking problems...) I see my
request being received by bitcoind, but so far no trace shows that the
reply is sent.
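
To illustrate the kind of timestamped trace meant here, a minimal sketch follows. The helper name `timestamped` is hypothetical and is not part of bitcoind; the sketch assumes a POSIX system for `gmtime_r` and produces lines of the form `YYYY-MM-DD HH:MM:SS.mmm message` in UTC.

```cpp
#include <chrono>
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper, not part of bitcoind: prefix a trace message with the
// current UTC time at millisecond resolution.
std::string timestamped(const std::string& message) {
    using namespace std::chrono;
    const auto now = system_clock::now();
    const std::time_t secs = system_clock::to_time_t(now);
    const auto millis = duration_cast<milliseconds>(now.time_since_epoch()) % 1000;

    std::tm utc{};
    gmtime_r(&secs, &utc);  // POSIX, thread-safe; Windows would need gmtime_s

    std::ostringstream out;
    out << std::put_time(&utc, "%Y-%m-%d %H:%M:%S") << '.'
        << std::setfill('0') << std::setw(3) << millis.count()
        << ' ' << message;
    return out.str();
}
```

With a prefix like this on both the "request received" and "reply sent" traces, the gap between the two lines shows directly how long a request sat inside the handler.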

I am not sure yet exactly where the problem is, but my current #1 suspect is:
LOCK2(cs_main, pwalletMain->cs_wallet);
with some kind of lock contention with the other threads.
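
The suspected failure mode can be reproduced in miniature: a handler that needs two locks stalls for as long as another thread holds either of them. The sketch below is an assumption-laden illustration, not bitcoind's `LOCK2` implementation. The mutexes `cs_main_demo` and `cs_wallet_demo` and the functions `handler_gets_both_locks` and `simulate` are invented stand-ins, and `std::timed_mutex` is used only so the sketch can give up and report instead of blocking forever.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Stand-ins for the two locks named above. These are NOT bitcoind's types.
std::timed_mutex cs_main_demo;
std::timed_mutex cs_wallet_demo;

// Try to take both locks, always in the same order, within `budget`.
// Returns true if the simulated RPC handler got both locks, false if it
// would still have been waiting when the budget ran out.
bool handler_gets_both_locks(std::chrono::milliseconds budget) {
    const auto deadline = std::chrono::steady_clock::now() + budget;
    std::unique_lock<std::timed_mutex> main_lock(cs_main_demo, deadline);
    if (!main_lock.owns_lock()) {
        return false;
    }
    std::unique_lock<std::timed_mutex> wallet_lock(cs_wallet_demo, deadline);
    return wallet_lock.owns_lock();
}

// Simulate another thread holding the first lock for `hold`, then run the
// handler with a waiting budget of `budget`.
bool simulate(std::chrono::milliseconds hold, std::chrono::milliseconds budget) {
    std::promise<void> locked;
    std::thread holder([&] {
        std::lock_guard<std::timed_mutex> guard(cs_main_demo);
        locked.set_value();                 // signal: the lock is now held
        std::this_thread::sleep_for(hold);  // pretend to do long work under the lock
    });
    locked.get_future().wait();  // make sure the holder owns the lock first
    const bool ok = handler_gets_both_locks(budget);
    holder.join();
    return ok;
}
```

If the holder keeps the lock longer than the client's 2-second timeout, the client sees exactly the symptom described: the request is received, but no reply is sent in time.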