81 Commits (63422c7d6cfe092af402f48e16729acd1e3bae1c)

Author SHA1 Message Date
Felix Röhrich 05e72a5ab1 make connection logic more forgiving 2025-02-17 21:24:38 +01:00
Anthonin Bonnefoy 228cfffc20 Unwatch and close connection on a batch write error
Previously, a conn.Write would simply unlock pgconn, leaving the
connection as Idle and reusable while the multiResultReader would be
closed. From this state, calling multiResultReader.Close won't try to
receiveMessage and thus won't unwatch and close the connection since it
is already closed. This leaves the connection "open" and the next time
it's used, a "Watch already in progress" panic could be triggered.

This patch fixes the issue by unwatching and closing the connection on a
batch write error. The same was done on Sync.Encode error even if the
path is unreachable as Sync.Encode never returns an error.
2025-01-24 08:49:07 +01:00
zenkovev c96a55f8c0 private const for pipelineRequestType 2025-01-11 19:54:18 +03:00
zenkovev de3f868c1d pipeline queue for client requests 2025-01-06 13:54:48 +03:00
zenkovev 76593f37f7 add flush request in pipeline 2024-12-17 11:49:13 +03:00
Jack Christensen a966716860 Replace DSN with keyword/value in comments and documentation
The term DSN is not used in the PostgreSQL documentation. I'm not sure
why it was originally used. Use the correct PostgreSQL terminology.
2024-05-11 14:33:35 -05:00
Jack Christensen 8db971660e Failed connection attempts include all errors
A single Connect("connstring") may actually make multiple connection
requests due to TLS or HA configuration. Previously, when all attempts
failed only the last error was returned. This could be confusing.
Now details of all failed attempts are included.

For example, the following connection string:

host=localhost,127.0.0.1,foo.invalid port=1,2,3

Will now return an error like the following:

failed to connect to `user=postgres database=pgx_test`:
	lookup foo.invalid: no such host
	[::1]:1 (localhost): dial error: dial tcp [::1]:1: connect: connection refused
	127.0.0.1:1 (localhost): dial error: dial tcp 127.0.0.1:1: connect: connection refused
	127.0.0.1:2 (127.0.0.1): dial error: dial tcp 127.0.0.1:2: connect: connection refused

https://github.com/jackc/pgx/issues/1929
2024-05-11 14:25:03 -05:00
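As a rough illustration of the idea in 8db971660e above (report every failed attempt rather than only the last one), here is a minimal sketch using only the Go standard library. The helper connectAny, the dial callback, and errRefused are hypothetical names made up for this example; this is not how pgx itself builds its error, and it assumes Go 1.20 or later for errors.Join.

```go
package main

import (
	"errors"
	"fmt"
)

// errRefused stands in for a real dial error in this sketch.
var errRefused = errors.New("connection refused")

// connectAny is a hypothetical helper, not part of pgx. It tries each host in
// order and returns nil on the first success. If every attempt fails, it
// returns a single error that wraps all of the individual failures.
func connectAny(hosts []string, dial func(host string) error) error {
	if len(hosts) == 0 {
		return errors.New("no hosts given")
	}
	var errs []error
	for _, host := range hosts {
		err := dial(host)
		if err == nil {
			return nil
		}
		// Keep the failure and move on to the next host.
		errs = append(errs, fmt.Errorf("%s: %w", host, err))
	}
	return errors.Join(errs...)
}

func main() {
	alwaysFail := func(string) error { return errRefused }
	err := connectAny([]string{"localhost", "127.0.0.1"}, alwaysFail)
	fmt.Println(err)                        // one line per failed host
	fmt.Println(errors.Is(err, errRefused)) // the causes stay inspectable
}
```

Because the joined error wraps each cause, callers can still use errors.Is or errors.As on it.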
Jack Christensen 48ae1f4b2c Fix ResultReader.Read() to handle nil values
The ResultReader.Read() method was erroneously converting nil values
to []byte{}.

https://github.com/jackc/pgx/issues/1987
2024-05-09 17:13:26 -05:00
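The distinction fixed in 48ae1f4b2c above is easy to lose when copying byte slices in Go. The sketch below is not the pgx code; copyLossy and copyPreservingNil are hypothetical helpers that only show how a nil slice can silently become an empty one, assuming nil is what marks a missing value.

```go
package main

import "fmt"

// copyLossy shows the class of bug described above: appending to an empty,
// non-nil slice turns a nil input into []byte{}.
func copyLossy(src []byte) []byte {
	return append([]byte{}, src...)
}

// copyPreservingNil copies src but keeps a nil input as nil, so callers can
// still tell "no value" apart from "empty value".
func copyPreservingNil(src []byte) []byte {
	if src == nil {
		return nil
	}
	return append([]byte{}, src...)
}

func main() {
	fmt.Println(copyLossy(nil) == nil)              // false: nil became []byte{}
	fmt.Println(copyPreservingNil(nil) == nil)      // true
	fmt.Println(copyPreservingNil([]byte{}) == nil) // false: empty stays empty
}
```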
Jack Christensen 6f0deff015 Add custom data to pgconn.PgConn
https://github.com/jackc/pgx/issues/1896
2024-05-09 15:39:28 -05:00
Jack Christensen 93a579754b Add CancelRequestContextWatcherHandler
This allows a context to cancel a query by sending a cancel request to
the server before falling back to setting a deadline.
2024-05-08 07:41:02 -05:00
Jack Christensen 42c9e9070a Allow customizing context canceled behavior for pgconn
This feature made the ctxwatch package public.
2024-05-08 07:41:02 -05:00
Jack Christensen a3d9120636 Add SeverityUnlocalized field to PgError / Notice
https://github.com/jackc/pgx/issues/1971
2024-04-07 08:58:10 -05:00
Jack Christensen adbb38f298 Do not allow protocol messages larger than ~1GB
The PostgreSQL server will reject messages greater than ~1 GB anyway.
However, worse than that is that a message that is larger than 4 GB
could wrap the 32-bit integer message size and be interpreted by the
server as multiple messages. This could allow a malicious client to
inject arbitrary protocol messages.

https://github.com/jackc/pgx/security/advisories/GHSA-mrww-27vc-gghv
2024-03-04 09:09:29 -06:00
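The wrap-around described in adbb38f298 above can be shown with plain integer arithmetic. This is a sketch only: wrappedLength, checkMessageLen, and the maxMessageBodyLen value here are illustrative names and numbers chosen for the example, not the identifiers or the exact limit used by pgx.

```go
package main

import "fmt"

// maxMessageBodyLen is an illustrative limit of roughly 1 GB. It is an
// assumption for this sketch, not the constant pgx uses.
const maxMessageBodyLen = 0x3fffffff

// wrappedLength returns the value that would land in a 32-bit length field if
// a body of n bytes were encoded without a bounds check.
func wrappedLength(n uint64) uint32 {
	return uint32(n) // keeps only the low 32 bits
}

// checkMessageLen rejects bodies that exceed the limit before any length is
// written to the wire.
func checkMessageLen(n uint64) error {
	if n > maxMessageBodyLen {
		return fmt.Errorf("message body too large: %d bytes", n)
	}
	return nil
}

func main() {
	n := uint64(1)<<32 + 100 // 4 GiB plus 100 bytes

	fmt.Println(wrappedLength(n))          // 100
	fmt.Println(checkMessageLen(n) != nil) // true
}
```

A receiver that trusts the wrapped length of 100 would treat the remaining bytes of the body as new protocol messages, which is why the check has to happen before encoding.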
Jack Christensen 2e84dccaf5 *Pipeline.getResults should close pipeline on error
Otherwise, it might be possible to panic when closing the pipeline if it
tries to read a connection that should be closed but still has a fatal
error on the wire.

https://github.com/jackc/pgx/issues/1920
2024-02-29 18:44:01 -06:00
Jack Christensen 5d26bbefd8 Make pgconn.ConnectError and pgconn.ParseConfigError public
fixes #1773
2024-01-12 17:52:25 -06:00
Jack Christensen cbc5a7055f Fix: close conn on read failure in pipeline
Suggested by @jameshartig in https://github.com/jackc/pgx/issues/1847
2023-12-23 12:11:23 -06:00
James Hartig 22fe50149b pgconn: check if pipeline is closed in Sync/GetResults
Otherwise there will be a nil pointer exception accessing the conn
2023-12-23 12:04:21 -06:00
Ryan Fowler dfd198003a Fix panic in Pipeline when PgConn is busy or closed 2023-12-23 10:30:59 -06:00
Samuel Stauffer 2daeb8dc5f pgconn: normalize startTLS connection error
Normalize the error that is returned by startTLS in pgconn.connect. This
makes it possible to determine if the error was a context error.
2023-12-16 11:15:35 -06:00
Jack Christensen df3c5f4df8 Use "Pg" instead of "PG" in new PgError related identifiers
Arguably, PGError might have been better. But since the precedent is
long since established it is better to be consistent.
2023-12-15 18:33:51 -06:00
James Hartig b1631e8e35 pgconn: add OnPGError to Config for error handling
OnPGError is called on every error response received from Postgres and can
be used to close connections on specific errors. Defaults to closing on
FATAL-severity errors.

Fixes #1803
2023-12-15 18:29:32 -06:00
Jack Christensen 7d5a3969d0 Improve docs and tests 2023-11-18 07:44:24 -06:00
Jack Christensen 4dbd57a7ed Add PgConn.Deallocate method
This method uses the PostgreSQL protocol Close method to deallocate a
prepared statement. This means that it can succeed in an aborted
transaction.
2023-11-18 07:44:24 -06:00
Jack Christensen 0570b0e196 Better document PgConn.Prepare implementation 2023-11-18 07:44:24 -06:00
Ivan Posazhennikov 6f7400f428 fix typo in the comment in the pgconn.go 2023-10-14 18:02:35 -05:00
Anton Levakin 304697de36 CancelRequest: Wait for the cancel request to be acknowledged by the server 2023-10-14 17:48:16 -05:00
Anton Levakin 6ca3d8ed4e Revert "CancelRequest: don't try to read the reply"
This reverts commit c861bce438.
2023-10-14 17:48:16 -05:00
Jack Christensen 81ddcfdefb Fix spurious deadline exceeded error
stdlib_test.TestConnConcurrency had been flickering on CI deadline /
timeout errors. This was extremely confusing because the test deadline
was set for 2 minutes and the errors would occur much quicker.

The problem only manifested in an extremely specific and timing
sensitive situation.

1. The watchdog timer for deadlocked writes starts the goroutine to
   start the background reader
2. The background reader is stopped
3. The next operation is a read without a preceding write (AFAIK only
   CheckConn does this)
4. The deadline is set to interrupt the read
5. The goroutine from 1 actually starts the background reader
6. The background reader gets an error reading the connection with the
   deadline
7. The deadline is cleared
8. The next read on the connection will get the timeout error
2023-10-14 11:38:33 -05:00
Ville Skyttä c6c50110db Spelling and grammar fixes 2023-10-07 09:26:23 -05:00
Jack Christensen 163eb68866 Normalize timeout error when receiving pipeline results
https://github.com/jackc/pgx/issues/1748#issuecomment-1740437138
2023-09-30 08:50:40 -05:00
Alexey Palazhchenko 8fb309c631 Use Go 1.20's link syntax for `ParseConfig` 2023-07-28 17:51:42 -05:00
smaher-edb f47f0cf823 connect_timeout is not obeyed for sslmode=allow|prefer
The connect_timeout given in the connection string was not obeyed when
sslmode was unspecified (the default is prefer) or set to allow or
prefer. Connecting could take twice the time specified by
connect_timeout. That is correct when multiple hosts are given in the
connection string, but not for a single host, and it did not match
libpq.

The root cause is that sslmode=allow|prefer connections are tried twice:
first with TLSConfig and, if that fails, without TLSConfig. The fix uses
the same context for both attempts against the same host. This change
does not affect the existing multi-host behavior.

This PR aims to close issue [jackc/pgx/issues/1672](https://github.com/jackc/pgx/issues/1672)
2023-07-15 09:49:09 -05:00
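The approach described in f47f0cf823 above (one timeout shared by every attempt against the same host) could look roughly like the sketch below. connectHost and sharedDeadline are hypothetical names for this example and the attempts are stand-ins for the TLS and non-TLS tries; it is not the pgx implementation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// connectHost is a hypothetical helper. It creates ONE timeout context for
// the host and reuses it for every attempt, so a second attempt cannot start
// a fresh timeout.
func connectHost(parent context.Context, timeout time.Duration, attempts ...func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	err := errors.New("no attempts given")
	for _, attempt := range attempts {
		if err = attempt(ctx); err == nil {
			return nil
		}
	}
	return err
}

// sharedDeadline reports whether two failing attempts against the same host
// observed the same deadline.
func sharedDeadline() bool {
	var seen []time.Time
	record := func(ctx context.Context) error {
		d, _ := ctx.Deadline()
		seen = append(seen, d)
		return errors.New("attempt failed")
	}
	_ = connectHost(context.Background(), time.Second, record, record)
	return len(seen) == 2 && seen[0].Equal(seen[1])
}

func main() {
	fmt.Println(sharedDeadline()) // true
}
```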
Jack Christensen 95aa87f2e8 exitPotentialWriteReadDeadlock stops bgReader
It's not enough to stop the slowWriteTimer, because the bgReader may
have been started.
2023-07-11 21:29:11 -05:00
Jack Christensen f512b9688b Add PgConn.SyncConn
This provides a way to ensure it is safe to directly read or write to
the underlying net.Conn.

https://github.com/jackc/pgx/issues/1673
2023-07-11 21:29:11 -05:00
Jack Christensen cd46cdd450 Recreate the frontend in Construct with the new bgReader
https://github.com/jackc/pgx/pull/1629#discussion_r1251472215
2023-07-08 11:39:39 -05:00
Adrian-Stefan Mares 2bf5a61401 fix: Do not use infinite timers 2023-07-08 11:24:39 -05:00
Brandon Kauffman 1dd69f86a1 Enable failover efforts when pg_hba.conf disallows non-ssl connections
Copy of https://github.com/jackc/pgconn/pull/133
2023-06-24 06:41:35 -05:00
Jack Christensen 5f28621394 Add docs clarifying that FieldDescriptions may return nil
https://github.com/jackc/pgx/issues/1634
2023-06-14 07:42:11 -05:00
Jack Christensen 34eddf9983 Increase slowWriteTimer to 15ms and document why 2023-06-12 09:39:26 -05:00
Jack Christensen 5d4f9018bf failed to write startup message error should be normalized 2023-06-12 09:39:26 -05:00
Jack Christensen 482e56a79b Fix race condition when CopyFrom is cancelled. 2023-06-12 09:39:26 -05:00
Jack Christensen 3ea2f57d8b Deprecate CheckConn in favor of Ping 2023-06-12 09:39:26 -05:00
Jack Christensen 26c79eb215 Handle writes that could deadlock with reads from the server
This commit adds a background reader that can optionally buffer reads.
It is used whenever a potentially blocking write is made to the server.
The background reader is started on a slight delay so there should be no
meaningful performance impact as it doesn't run for quick queries and
its overhead is minimal relative to slower queries.
2023-06-12 09:39:26 -05:00
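The delayed start described in 26c79eb215 above can be sketched with time.AfterFunc: a timer is armed before the write, and the slow-path handler runs only if the write is still in progress when the timer fires. writeWithWatchdog and demo are hypothetical names for this example, and the handler is a stand-in for starting a background reader; the real pgx code is more involved.

```go
package main

import (
	"fmt"
	"time"
)

// writeWithWatchdog is a hypothetical helper. It runs write and, if write has
// not returned after delay, calls onSlow in its own goroutine. Quick writes
// stop the timer before it fires, so they never pay for onSlow.
func writeWithWatchdog(delay time.Duration, write func() error, onSlow func()) error {
	timer := time.AfterFunc(delay, onSlow)
	// Stop only prevents a future firing. If onSlow already ran, whatever it
	// started still has to be shut down separately.
	defer timer.Stop()
	return write()
}

// demo reports whether the slow-path handler fired for a fast or slow write.
func demo(slow bool) bool {
	started := make(chan struct{})
	delay := time.Hour
	write := func() error { return nil } // returns immediately
	if slow {
		delay = time.Millisecond
		// Blocks until the handler has run, like a write that cannot make
		// progress until someone reads from the connection.
		write = func() error { <-started; return nil }
	}
	_ = writeWithWatchdog(delay, write, func() { close(started) })

	select {
	case <-started:
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(demo(false)) // false: quick write, handler never ran
	fmt.Println(demo(true))  // true: blocked write, handler ran
}
```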
Jack Christensen 85136a8efe Restore pgx v4 style CopyFrom implementation
This approach uses an extra goroutine to write while the main goroutine
continues to read. This avoids the need to use non-blocking I/O.
2023-06-12 09:39:26 -05:00
Jack Christensen 4410fc0a65 Remove nbconn
The non-blocking IO system was designed to solve three problems:

1. Deadlock that can occur when both sides of a connection are blocked
   writing because all buffers between are full.
2. The inability to use a write deadline with a tls.Conn without killing
   the connection.
3. Efficiently check if a connection has been closed before writing.
   This reduces the cases where the application doesn't know if a query
   that does an INSERT/UPDATE/DELETE was actually sent to the server or
   not.

However, the nbconn package is extraordinarily complex, has been a
source of very tricky bugs, and has OS specific code paths. It also does
not work at all with underlying net.Conn implementations that do not
have platform specific non-blocking IO syscall support and do not
properly implement deadlines. In particular, this is the case with
golang.org/x/crypto/ssh.

I believe the deadlock problem can be solved with a combination of a
goroutine for CopyFrom like v4 used and a watchdog for regular queries
that uses time.AfterFunc.

The write deadline problem actually should be ignorable. We check for
context cancellation before sending a query and the actual Write should
be almost instant as long as the underlying connection is not blocked.
(We should only have to wait until it is accepted by the OS, not until
it is fully sent.)

Efficiently checking if a connection has been closed is probably the
hardest to solve without non-blocking reads. However, the existing code
only solves part of the problem. It can detect a closed or broken
connection the OS knows about, but it won't actually detect other types
of broken connections such as a network interruption. This is currently
implemented in CheckConn and called automatically when checking a
connection out of the pool that has been idle for over one second. I
think that changing CheckConn to a very short deadline read and changing
the pool to do an actual Ping would be an acceptable solution.

Remove nbconn and non-blocking code. This does not leave the system in
an entirely working state. In particular, CopyFrom is broken, deadlocks
can occur for extremely large queries or batches, and PgConn.CheckConn
is now a `select 1` ping. These will be resolved in subsequent commits.
2023-06-12 09:39:26 -05:00
Nicola Murino c861bce438 CancelRequest: don't try to read the reply
Postgres will just process the request and close the connection
2023-06-03 06:45:28 -05:00
Jack Christensen b3739c1289 pgconn.CheckConn locks connection
This ensures that a closed connection at the pgconn layer is not
considered okay when the background closing of the net.Conn is still in
progress.

This also means that CheckConn cannot be called when the connection is
locked (for example, by an in-progress query). But that seems
reasonable. It's not exactly clear that that would have ever worked
anyway.

https://github.com/jackc/pgx/issues/1618#issuecomment-1563702231
2023-05-26 06:03:25 -05:00
Evan Jones 11d892dfcf pgconn.CancelRequest: Fix unix sockets: don't use RemoteAddr()
The tests for cancelling requests were failing when using unix
sockets. The reason is that net.Conn.RemoteAddr() calls getpeername()
to get the address. For Unix sockets, this returns the address that
was passed to bind() by the *server* process, not the address that
was passed to connect() by the *client*. For postgres, this is always
relative to the server's directory, so is a path like:

    ./.s.PGSQL.5432

Since it does not return the full absolute path, this function cannot
connect, so it cannot cancel requests. To fix it, use the connection's
config for Unix sockets. I think this should be okay, since a system
using unix sockets should not have "fallbacks". If that is incorrect,
we will need to save the address on PgConn.

Fixes the following failed tests when using Unix sockets:

--- FAIL: TestConnCancelRequest (2.00s)
    pgconn_test.go:2056:
          Error Trace:  /Users/evan.jones/pgx/pgconn/pgconn_test.go:2056
                              /Users/evan.jones/pgx/pgconn/asm_arm64.s:1172
          Error:        Received unexpected error:
                        dial unix ./.s.PGSQL.5432: connect: no such file or directory
          Test:         TestConnCancelRequest
    pgconn_test.go:2063:
          Error Trace:  /Users/evan.jones/pgx/pgconn/pgconn_test.go:2063
          Error:        Object expected to be of type *pgconn.PgError, but was <nil>
          Test:         TestConnCancelRequest
--- FAIL: TestConnContextCanceledCancelsRunningQueryOnServer (5.10s)
    pgconn_test.go:2109:
          Error Trace:  /Users/evan.jones/pgx/pgconn/pgconn_test.go:2109
          Error:        Received unexpected error:
                        timeout: context already done: context deadline exceeded
          Test:         TestConnContextCanceledCancelsRunningQueryOnServer
2023-05-20 08:08:47 -05:00
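The problem in 11d892dfcf above comes down to a relative versus an absolute socket path. A minimal sketch, assuming the usual PostgreSQL socket file name of .s.PGSQL.<port> inside the socket directory: unixSocketPath is a hypothetical helper for this example, and /var/run/postgresql is only a sample directory.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
)

// unixSocketPath is a hypothetical helper. It builds the socket file path
// from the configured socket directory and port instead of relying on the
// address reported by the peer.
func unixSocketPath(dir string, port uint16) string {
	return filepath.Join(dir, ".s.PGSQL."+strconv.Itoa(int(port)))
}

func main() {
	// The kind of address the commit message reports from RemoteAddr().
	fromPeer := "./.s.PGSQL.5432"
	// The same socket, rebuilt from configuration (sample directory).
	fromConfig := unixSocketPath("/var/run/postgresql", 5432)

	fmt.Println(filepath.IsAbs(fromPeer))   // relative to the server's directory
	fmt.Println(filepath.IsAbs(fromConfig)) // usable from the client process
	fmt.Println(fromConfig)
}
```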
Jack Christensen eee854fb06 iobufpool uses *[]byte instead of []byte to reduce allocations 2023-01-28 08:02:49 -06:00
Jack Christensen a95cfe5cc5 Fix connect with multiple hostnames when one can't be resolved
If multiple hostnames are provided and one cannot be resolved the others
should still be tried.

Long term, it would be nice for the connect process to return a list of
errors rather than just one.

fixes https://github.com/jackc/pgx/issues/1464
2023-01-14 09:19:00 -06:00