When a conn is going to execute a query, the first thing it does is to
deallocate any invalidated prepared statements from the statement cache.
However, the statements were removed from the cache regardless of
whether the deallocation succeeded. This would cause subsequent calls of
the same SQL to fail with "prepared statement already exists" error.
This problem is easy to trigger by running a query with a context that
is already canceled.
This commit changes the deallocate invalidated cached statements logic
so that the statements are only removed from the cache if the
deallocation was successful on the server.
https://github.com/jackc/pgx/issues/1847
QueryExecModeCacheDescribe actually is safe even when the schema or
search_path is modified. It may return an error on the first execution
but it should never silently encode or decode a value incorrectly. Add a
test to demonstrate and ensure this behavior.
Update documentation of QueryExecModeCacheDescribe to remove warning of
undetected result decoding errors.
Update documentation of QueryExecModeCacheStatement and
QueryExecModeCacheDescribe to indicate that the first execution of an
invalidated statement may fail.
The non-blocking IO system was designed to solve three problems:
1. Deadlock that can occur when both sides of a connection are blocked
writing because all buffers between are full.
2. The inability to use a write deadline with a TLS.Conn without killing
the connection.
3. Efficiently check if a connection has been closed before writing.
This reduces the cases where the application doesn't know if a query
that does a INSERT/UPDATE/DELETE was actually sent to the server or
not.
However, the nbconn package is extraordinarily complex, has been a
source of very tricky bugs, and has OS specific code paths. It also does
not work at all with underlying net.Conn implementations that do not
have platform specific non-blocking IO syscall support and do not
properly implement deadlines. In particular, this is the case with
golang.org/x/crypto/ssh.
I believe the deadlock problem can be solved with a combination of a
goroutine for CopyFrom like v4 used and a watchdog for regular queries
that uses time.AfterFunc.
The write deadline problem actually should be ignorable. We check for
context cancellation before sending a query and the actual Write should
be almost instant as long as the underlying connection is not blocked.
(We should only have to wait until it is accepted by the OS, not until
it is fully sent.)
Efficiently checking if a connection has been closed is probably the
hardest to solve without non-blocking reads. However, the existing code
only solves part of the problem. It can detect a closed or broken
connection the OS knows about, but it won't actually detect other types
of broken connections such as a network interruption. This is currently
implemented in CheckConn and called automatically when checking a
connection out of the pool that has been idle for over one second. I
think that changing CheckConn to a very short deadline read and changing
the pool to do an actual Ping would be an acceptable solution.
Remove nbconn and non-blocking code. This does not leave the system in
an entirely working state. In particular, CopyFrom is broken, deadlocks
can occur for extremely large queries or batches, and PgConn.CheckConn
is now a `select 1` ping. These will be resolved in subsequent commits.
The test was relying on sending so big a message that the write blocked.
However, it appears that on Windows the TCP connections over localhost
have an very large or infinite sized buffer. Change the test to simply
set the deadline to the current time before triggering the write.
The first 5 fake non-blocking reads are limited to 1 byte. This should
ensure that there is a measurement of a read where bytes are already
waiting in Go or the OS's read buffer.
The reason for a high max wait time was to ensure that reads aren't
cancelled when there is data waiting for it in Go or the OS's receive
buffer. Unfortunately, there is no way to know ahead of time how long
this should take.
This new code uses 2x the fastest successful read time as the max read
time. This allows the code to adapt to whatever host it is running on.
https://github.com/jackc/pgx/issues/1481
Previously, a batch with 10 unique parameterized statements executed
100 times would entail 11 network round trips. 1 for each prepare /
describe and 1 for executing them all. Now pipeline mode is used to
prepare / describe all statements in a single network round trip. So it
would only take 2 round trips.