This introduces an interface for the freelist and splits it into two
concrete implementations.
fixes etcd-io#773
Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
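As a rough illustration of the freelist split described above, the sketch
below puts two implementations behind one interface. The interface methods
and the array/map split are assumptions made for this example, not the
actual bbolt types.

```go
package main

import (
	"fmt"
	"sort"
)

// pgid identifies a page.
type pgid uint64

// freelist is a hypothetical, minimal interface for this sketch.
type freelist interface {
	Free(id pgid)           // mark a page as free
	Allocate() (pgid, bool) // hand out the lowest free page, if any
	Count() int             // number of free pages
}

// arrayFreelist keeps free page ids in a sorted slice.
type arrayFreelist struct{ ids []pgid }

func (f *arrayFreelist) Free(id pgid) {
	i := sort.Search(len(f.ids), func(i int) bool { return f.ids[i] >= id })
	if i < len(f.ids) && f.ids[i] == id {
		return // already free
	}
	f.ids = append(f.ids, 0)
	copy(f.ids[i+1:], f.ids[i:])
	f.ids[i] = id
}

func (f *arrayFreelist) Allocate() (pgid, bool) {
	if len(f.ids) == 0 {
		return 0, false
	}
	id := f.ids[0]
	f.ids = f.ids[1:]
	return id, true
}

func (f *arrayFreelist) Count() int { return len(f.ids) }

// mapFreelist keeps free page ids in a set.
type mapFreelist struct{ ids map[pgid]struct{} }

func newMapFreelist() *mapFreelist {
	return &mapFreelist{ids: map[pgid]struct{}{}}
}

func (f *mapFreelist) Free(id pgid) { f.ids[id] = struct{}{} }

func (f *mapFreelist) Allocate() (pgid, bool) {
	if len(f.ids) == 0 {
		return 0, false
	}
	first := true
	var lowest pgid
	for id := range f.ids {
		if first || id < lowest {
			lowest, first = id, false
		}
	}
	delete(f.ids, lowest)
	return lowest, true
}

func (f *mapFreelist) Count() int { return len(f.ids) }

func main() {
	// Callers only see the interface, so implementations are swappable.
	for _, f := range []freelist{&arrayFreelist{}, newMapFreelist()} {
		for _, id := range []pgid{7, 3, 5} {
			f.Free(id)
		}
		id, _ := f.Allocate()
		fmt.Printf("%T allocated page %d, %d still free\n", f, id, f.Count())
	}
}
```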
If there is no logger defined (discardLogger), skip logging altogether for
frequently called methods (Put, Delete, CreateBucket,
CreateBucketIfNotExists, DeleteBucket, Begin, Commit, Open, MoveBucket, Sync).
Signed-off-by: Ivan Valdes <ivan@vald.es>
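A minimal sketch of the guard described above, assuming a Logger
interface with a single Debugf method and a package-level discard value.
The store type and its Put method are hypothetical stand-ins for the hot
paths listed in the message.

```go
package main

import "fmt"

// Logger is a hypothetical one-method logging interface.
type Logger interface {
	Debugf(format string, args ...interface{})
}

// discardLogger drops everything.
type discardLogger struct{}

func (discardLogger) Debugf(string, ...interface{}) {}

// discard is the shared "no logger configured" value.
var discard Logger = discardLogger{}

// recordingLogger keeps formatted lines so the effect is observable.
type recordingLogger struct{ lines []string }

func (r *recordingLogger) Debugf(format string, args ...interface{}) {
	r.lines = append(r.lines, fmt.Sprintf(format, args...))
}

// store is a stand-in for a type with frequently called methods.
type store struct {
	logger Logger
	data   map[string]string
}

func (s *store) Put(k, v string) {
	// Skip the logging calls (and their argument evaluation) entirely
	// when no real logger is configured.
	if s.logger != discard {
		s.logger.Debugf("putting key %q", k)
		defer s.logger.Debugf("put key %q done", k)
	}
	s.data[k] = v
}

func main() {
	rec := &recordingLogger{}
	s := &store{logger: rec, data: map[string]string{}}
	s.Put("a", "1")
	fmt.Println(len(rec.lines), "lines logged with a real logger")

	q := &store{logger: discard, data: map[string]string{}}
	q.Put("b", "2")
	fmt.Println("value stored without logging:", q.data["b"])
}
```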
This moves the error variables that had been moved to the
internal/common package during recent refactoring to a non-internal
errors package, once again allowing consumers to test for particular
error conditions.
To preserve API compatibility with bbolt v1.3, these error variables
are also redefined in the bbolt package, with a deprecation notice to
migrate to bbolt/errors.
Signed-off-by: Josh Rickmar <jrick@zettaport.com>
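The compatibility approach above can be sketched in a single file,
assuming the old name is simply assigned the same error value as the new
one. Both variables below are stand-ins; in the real layout they would
live in two different packages.

```go
package main

import (
	"errors"
	"fmt"
)

// newErrBucketNotFound stands in for the variable in the public
// errors package.
var newErrBucketNotFound = errors.New("bucket not found")

// ErrBucketNotFound stands in for the old top-level name.
//
// Deprecated: use the variable from the errors package instead.
var ErrBucketNotFound = newErrBucketNotFound

// lookup is a hypothetical call that fails with a wrapped error.
func lookup(name string) error {
	return fmt.Errorf("lookup %q: %w", name, newErrBucketNotFound)
}

// matchesBoth reports whether err matches the old and the new name.
func matchesBoth(err error) bool {
	return errors.Is(err, ErrBucketNotFound) && errors.Is(err, newErrBucketNotFound)
}

func main() {
	err := lookup("widgets")
	fmt.Println("matches old and new name:", matchesBoth(err))
}
```

Because both names hold the same value, existing callers that compare
against the old variable keep working while they migrate.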
The recursive checker confirms database consistency with respect to b-tree
key order constraints:
- keys on pages must be sorted
- keys on child pages must lie between 2 consecutive keys on the parent
  branch page.
Signed-off-by: Piotr Tabor <ptab@google.com>
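The two constraints above could be checked along these lines. This is a
sketch over a simplified in-memory node type, not the on-disk page
format. It assumes that branch key i is the lower bound (inclusive) of
child i and that the next branch key is its upper bound (exclusive).

```go
package main

import "fmt"

// node is a simplified tree node; children is empty for leaves and
// has one entry per key for branches.
type node struct {
	keys     []string
	children []*node
}

// check verifies that keys are sorted and fall inside [lo, hi).
// A nil bound means "unbounded" on that side.
func check(n *node, lo, hi *string) error {
	if len(n.children) != 0 && len(n.children) != len(n.keys) {
		return fmt.Errorf("branch has %d keys but %d children", len(n.keys), len(n.children))
	}
	for i, k := range n.keys {
		if i > 0 && n.keys[i-1] >= k {
			return fmt.Errorf("keys not sorted: %q before %q", n.keys[i-1], k)
		}
		if lo != nil && k < *lo {
			return fmt.Errorf("key %q below parent bound %q", k, *lo)
		}
		if hi != nil && k >= *hi {
			return fmt.Errorf("key %q not below parent bound %q", k, *hi)
		}
	}
	for i, c := range n.children {
		childLo := &n.keys[i]
		childHi := hi // last child inherits the parent's upper bound
		if i+1 < len(n.keys) {
			childHi = &n.keys[i+1]
		}
		if err := check(c, childLo, childHi); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	tree := &node{
		keys: []string{"a", "m"},
		children: []*node{
			{keys: []string{"a", "c"}},
			{keys: []string{"b", "z"}}, // "b" is below the bound "m"
		},
	}
	fmt.Println(check(tree, nil, nil))
}
```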
So far the code frequently traversed all the keys (ignoring the flag that
says whether a key is a bucket) and tried to open each key as a bucket
(seeking the same entry again from scratch).
In this proposal, we iterate only through bucket keys.
Signed-off-by: Piotr Tabor <ptab@google.com>
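A minimal sketch of the idea above, assuming each leaf element carries a
flags field with a bucket bit. The leafElem type and forEachBucket helper
are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// bucketLeafFlag marks a leaf element whose value is a nested bucket.
const bucketLeafFlag = 0x01

// leafElem is a simplified leaf element.
type leafElem struct {
	flags uint32
	key   string
}

// forEachBucket calls fn only for elements flagged as buckets, so
// plain keys are never looked up again or opened as buckets.
func forEachBucket(elems []leafElem, fn func(name string) error) error {
	for _, e := range elems {
		if e.flags&bucketLeafFlag == 0 {
			continue // plain key/value pair
		}
		if err := fn(e.key); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	elems := []leafElem{
		{flags: 0, key: "config"},
		{flags: bucketLeafFlag, key: "users"},
		{flags: 0, key: "version"},
		{flags: bucketLeafFlag, key: "widgets"},
	}
	_ = forEachBucket(elems, func(name string) error {
		fmt.Println("bucket:", name)
		return nil
	})
}
```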
This makes it easy to find which page actually appears to be corrupted.
```
go build ./cmd/bbolt/ && ./bbolt check ~/Downloads/db
panic: freepages: failed to get all reachable pages (page 8314893338927566090: out of bounds: 6258 (stack: [4517 395 821]))
goroutine 18 [running]:
go.etcd.io/bbolt.(*DB).freepages.func2()
/Users/ptab/gits/bbolt/db.go:1056 +0x8c
created by go.etcd.io/bbolt.(*DB).freepages
/Users/ptab/gits/bbolt/db.go:1054 +0x138
```
Signed-off-by: Piotr Tabor <ptab@google.com>
After the checkptr fixes in 2fc6815c, new issues were hit in
production systems, in particular when a single process
opened and updated multiple separate databases. This indicates that
a bug related to bad unsafe usage was introduced by that
commit.
This commit combines several attempts at fixing this new issue. For
example, slices are once again created by slicing an array of "max
allocation" elements, but this time with the cap set to the intended
length. This operation is expressly permitted according to the Go
wiki, so it should be preferred to type converting a
reflect.SliceHeader.
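The slicing pattern described above looks roughly like this. It is a
sketch that assumes a 64-bit platform; the constant value and the helper
names are chosen for this example only.

```go
package main

import (
	"fmt"
	"unsafe"
)

// maxAllocSize is the element count of the oversized array type used
// only for the pointer conversion (assumed value, 64-bit platforms).
const maxAllocSize = 0x7FFFFFFF

// byteSlice views n bytes starting at base as a []byte. The three-index
// slice expression sets both len and cap to n, so the result cannot be
// resliced or appended past the intended length.
func byteSlice(base unsafe.Pointer, n int) []byte {
	return (*[maxAllocSize]byte)(base)[:n:n]
}

// window is a small helper for the demo: n bytes of buf from off.
func window(buf []byte, off, n int) []byte {
	return byteSlice(unsafe.Pointer(&buf[off]), n)
}

func main() {
	buf := make([]byte, 16)
	for i := range buf {
		buf[i] = byte(i)
	}
	s := window(buf, 4, 8)
	fmt.Println(len(s), cap(s), s[0], s[len(s)-1])
}
```

No array of that size is ever allocated; the array type only gives the
compiler something to slice, and the cap keeps the view bounded.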
When the database has a lot of free pages, the cost of syncing all
free pages to disk is high. If the total database size is
small (<10GB), and the application can tolerate ~10 seconds of
recovery time, then it is reasonable to simply not sync the freelist
and instead rescan the db to rebuild the freelist on recovery.
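The rebuild step can be pictured as "every page below the high-water
mark that is not reachable is free". The sketch below assumes the set of
reachable pages has already been collected by walking the tree, and that
pages 0 and 1 are meta pages; the function name is hypothetical.

```go
package main

import "fmt"

// pgid identifies a page.
type pgid uint64

// rebuildFreelist returns every page id in [2, highWater) that is not
// in the reachable set. Pages 0 and 1 are skipped as meta pages.
func rebuildFreelist(highWater pgid, reachable map[pgid]bool) []pgid {
	var free []pgid
	for id := pgid(2); id < highWater; id++ {
		if !reachable[id] {
			free = append(free, id)
		}
	}
	return free
}

func main() {
	reachable := map[pgid]bool{3: true, 4: true, 6: true}
	fmt.Println(rebuildFreelist(8, reachable))
}
```

The scan touches every reachable page, which is where the recovery time
mentioned above comes from.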
Read txns would lock pages allocated after the txn, keeping those pages
off the free list until closing the read txn. Instead, track allocating
txid to compute page lifetime, freeing pages if all txns between
page allocation and page free are closed.
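One way to picture the lifetime rule above: a freed page stays pending
only while some open read txn could still see it. The sketch assumes a
reader with txid r can see a page allocated at txid a and freed at txid
f exactly when a <= r < f. The types and this boundary choice are
simplifications for illustration, not the actual implementation.

```go
package main

import "fmt"

type pgid uint64
type txid uint64

// pendingPage is a freed page awaiting release, tagged with the txid
// that allocated it and the txid that freed it.
type pendingPage struct {
	page  pgid
	alloc txid
	freed txid
}

// releasable reports whether no open reader falls inside the page's
// lifetime [alloc, freed).
func releasable(p pendingPage, readers []txid) bool {
	for _, r := range readers {
		if p.alloc <= r && r < p.freed {
			return false
		}
	}
	return true
}

// release splits pending pages into those that can be reused now and
// those that must keep waiting.
func release(pending []pendingPage, readers []txid) (free []pgid, still []pendingPage) {
	for _, p := range pending {
		if releasable(p, readers) {
			free = append(free, p.page)
		} else {
			still = append(still, p)
		}
	}
	return free, still
}

func main() {
	pending := []pendingPage{
		{page: 10, alloc: 5, freed: 7}, // allocated after reader 3 began
		{page: 11, alloc: 2, freed: 6}, // reader 3 may still see it
	}
	free, still := release(pending, []txid{3})
	fmt.Println("free:", free, "still pending:", len(still))
}
```

With only the oldest reader's txid as the cutoff, page 10 would stay
pending until reader 3 closed, even though that reader can never
reference it.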