Re: [GENERAL] (Never?) Kill Postmaster?
From: Tom Lane <tgl(at)sss.pgh.pa.us>
Date: Sun Nov 11 2007 - 14:48:50 EST

> Seems to be the same as for the processes that were stuck inside of a

Well, the top of the stack is the same, but this is useful anyway because it shows that an I/O error on the input side can trigger the problem as well as one on the output side. We're still left wondering how a thread mutex down inside strerror() could be left in a "stuck" state, when the process doesn't appear to contain more than one thread.

> I recompiled the server with debugging symbols enabled and then did the

That is probably not the same situation because (assuming the query didn't produce a lot of output) the kernel does not yet think that the network connection is lost irretrievably. You'd have to wait for the TCP timeout interval to elapse, whereupon the kernel would report the connection lost (EPIPE or ECONNRESET error), whereupon we'd enter the code path shown above.

One thing I'm suddenly thinking might be related: didn't you mention that you have some process that goes around and SIGINT's backends that it thinks are running too long? I'm wondering if a SIGINT event is a necessary component of producing the problem ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Received on Sun Nov 11 14:50:26 2007

This archive was generated by hypermail 2.1.8 : Mon Jun 16 2008 - 19:40:29 EDT