We just need to cache the set of free page IDs and don't care what
the value (true or false) is at all, so change the map's value type
from bool to struct{}.
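
For illustration, a minimal sketch of a set keyed by page ID. The
names `pgid`, `buildCache` and `isFree` are assumptions made for this
example and are not taken from the patch itself:

```go
package main

import "fmt"

// pgid stands in for the page ID type (assumed to be uint64 here).
type pgid uint64

// buildCache records each free page ID as a set member. The empty
// struct carries no data, so only key presence is meaningful.
func buildCache(ids []pgid) map[pgid]struct{} {
	cache := make(map[pgid]struct{}, len(ids))
	for _, id := range ids {
		cache[id] = struct{}{}
	}
	return cache
}

// isFree reports whether id is a member of the set.
func isFree(cache map[pgid]struct{}, id pgid) bool {
	_, ok := cache[id]
	return ok
}

func main() {
	cache := buildCache([]pgid{3, 4, 9})
	fmt.Println(isFree(cache, 4), isFree(cache, 5)) // true false
}
```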
Signed-off-by: Benjamin Wang <wachao@vmware.com>
After checkptr fixes by 2fc6815c, it was discovered that new issues
were hit in production systems, in particular when a single process
opened and updated multiple separate databases. This indicates that
some bug relating to bad unsafe usage was introduced by that
commit.
This commit combines several attempts at fixing this new issue. For
example, slices are once again created by slicing an array of "max
allocation" elements, but this time with the cap set to the intended
length. This operation is expressly permitted according to the Go
wiki, so it should be preferred to type converting a
reflect.SliceHeader.
When the database has a lot of freepages, the cost to sync all
freepages down to disk is high. If the total database size is
small (<10GB), and the application can tolerate ~10 seconds
recovery time, then it is reasonable to simply not sync the freelist
and rescan the db to rebuild the freelist on recovery.
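
As a rough sketch of the rescan idea: any page below the high water
mark that is not reachable from the tree is free. The function name,
the `reachable` set and the `firstDataPage` parameter are hypothetical
and only illustrate the principle:

```go
package main

import "fmt"

type pgid uint64

// rebuildFreelist returns, in ascending order, every page ID in
// [firstDataPage, highWater) that is not present in reachable.
// How reachable is collected (walking the tree) is out of scope.
func rebuildFreelist(reachable map[pgid]struct{}, firstDataPage, highWater pgid) []pgid {
	var free []pgid
	for id := firstDataPage; id < highWater; id++ {
		if _, ok := reachable[id]; !ok {
			free = append(free, id)
		}
	}
	return free
}

func main() {
	reachable := map[pgid]struct{}{2: {}, 3: {}, 5: {}}
	fmt.Println(rebuildFreelist(reachable, 2, 8)) // [4 6 7]
}
```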
Read txns would lock pages allocated after the txn, keeping those pages
off the free list until closing the read txn. Instead, track allocating
txid to compute page lifetime, freeing pages if all txns between
page allocation and page free are closed.
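
A minimal sketch of the lifetime check, under the assumption that a
reader with txid r can see a page only if allocTx <= r < freeTx. The
type `pendingPage` and the function `releasable` are hypothetical:

```go
package main

import "fmt"

type pgid uint64
type txid uint64

// pendingPage records which txn allocated a page and which freed it.
type pendingPage struct {
	id      pgid
	allocTx txid
	freeTx  txid
}

// releasable reports whether no open read txn can still see the page.
// Readers older than allocTx never saw it; readers at or after freeTx
// already see the version in which it is gone.
func releasable(p pendingPage, openReaders []txid) bool {
	for _, r := range openReaders {
		if r >= p.allocTx && r < p.freeTx {
			return false
		}
	}
	return true
}

func main() {
	p := pendingPage{id: 7, allocTx: 5, freeTx: 8}
	fmt.Println(releasable(p, []txid{3})) // true: reader predates the page
	fmt.Println(releasable(p, []txid{6})) // false: reader may still use it
}
```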
Using a large (50GB) database with a read-write-delete heavy load,
nearly 100% of allocated space came from freelists.
1/3 came from freelist.release, 1/3 from freelist.write,
and 1/3 came from tx.allocate to make space for freelist.write.
In the case of freelist.write, the newly allocated giant slice gets
copied to the space prepared by tx.allocate and then discarded.
To avoid this, add func mergepgids that accepts a destination slice,
and use it in freelist.write.
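
A sketch of what a merge into a caller-supplied destination could
look like. The signature is an assumption based on the description
above, not a copy of the patch:

```go
package main

import "fmt"

type pgid uint64

// mergepgids writes the sorted union of the sorted slices a and b
// into dst without allocating. dst must hold len(a)+len(b) elements.
func mergepgids(dst, a, b []pgid) {
	if len(dst) < len(a)+len(b) {
		panic("mergepgids: destination too small")
	}
	i, j, k := 0, 0, 0
	for i < len(a) && j < len(b) {
		if a[i] <= b[j] {
			dst[k] = a[i]
			i++
		} else {
			dst[k] = b[j]
			j++
		}
		k++
	}
	// One of the inputs is exhausted; copy the remainder of the other.
	k += copy(dst[k:], a[i:])
	copy(dst[k:], b[j:])
}

func main() {
	a := []pgid{2, 5, 9}
	b := []pgid{3, 4, 10, 11}
	dst := make([]pgid, len(a)+len(b))
	mergepgids(dst, a, b)
	fmt.Println(dst) // [2 3 4 5 9 10 11]
}
```

In freelist.write the destination would be the space already prepared
by tx.allocate, so no intermediate slice is needed.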
This has a mild negative impact on the existing benchmarks,
but cuts allocated space in my real world db by over 30%.

name                      old time/op    new time/op    delta
_FreelistRelease10K-8       18.7µs ±10%    18.2µs ± 4%    ~      (p=0.548 n=5+5)
_FreelistRelease100K-8       233µs ± 5%     258µs ±20%    ~      (p=0.151 n=5+5)
_FreelistRelease1000K-8     3.34ms ± 8%    3.13ms ± 8%    ~      (p=0.151 n=5+5)
_FreelistRelease10000K-8    32.3ms ± 1%    32.2ms ± 7%    ~      (p=0.690 n=5+5)
DBBatchAutomatic-8          2.18ms ± 3%    2.19ms ± 4%    ~      (p=0.421 n=5+5)
DBBatchSingle-8              140ms ± 6%     140ms ± 4%    ~      (p=0.841 n=5+5)
DBBatchManual10x100-8       4.41ms ± 2%    4.37ms ± 3%    ~      (p=0.548 n=5+5)

name                      old alloc/op   new alloc/op   delta
_FreelistRelease10K-8       82.5kB ± 0%    82.5kB ± 0%    ~      (all samples are equal)
_FreelistRelease100K-8       805kB ± 0%     805kB ± 0%    ~      (all samples are equal)
_FreelistRelease1000K-8     8.05MB ± 0%    8.05MB ± 0%    ~      (all samples are equal)
_FreelistRelease10000K-8    80.4MB ± 0%    80.4MB ± 0%    ~      (p=1.000 n=5+5)
DBBatchAutomatic-8           384kB ± 0%     384kB ± 0%    ~      (p=0.095 n=5+5)
DBBatchSingle-8             17.2MB ± 1%    17.2MB ± 1%    ~      (p=0.310 n=5+5)
DBBatchManual10x100-8        908kB ± 0%     902kB ± 1%    ~      (p=0.730 n=4+5)

name                      old allocs/op  new allocs/op  delta
_FreelistRelease10K-8         5.00 ± 0%      5.00 ± 0%    ~      (all samples are equal)
_FreelistRelease100K-8        5.00 ± 0%      5.00 ± 0%    ~      (all samples are equal)
_FreelistRelease1000K-8       5.00 ± 0%      5.00 ± 0%    ~      (all samples are equal)
_FreelistRelease10000K-8      5.00 ± 0%      5.00 ± 0%    ~      (all samples are equal)
DBBatchAutomatic-8           10.2k ± 0%     10.2k ± 0%  +0.07%   (p=0.032 n=5+5)
DBBatchSingle-8              58.6k ± 0%     59.6k ± 0%  +1.70%   (p=0.008 n=5+5)
DBBatchManual10x100-8        6.02k ± 0%     6.03k ± 0%  +0.17%   (p=0.029 n=4+4)

This commit fixes a bug where page end-of-header pointers were being
converted to byte slices even when the pointer did not point to
allocated memory. This occurs with pages that have a `page.count`
of zero.
Note: This was not an issue in Go 1.6 but the new Go 1.7 SSA backend
handles `nil` checks differently.
See https://github.com/golang/go/issues/16772
This commit expands calls to _assert() that use variadic arguments. These calls require conversion to interface{} so there
was a large number of calls to Go's internal convT2E() function. In some profiling this was taking over 20% of total runtime.
I don't remember seeing this before Go 1.4 so perhaps something has changed.
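
To illustrate the expansion: with the variadic helper the arguments
are boxed into interface{} on every call, even when the condition
holds, while the expanded form only touches them on failure. The
functions `checkVariadic`, `checkExpanded` and `panics` are made up
for this sketch:

```go
package main

import "fmt"

// _assert panics with a formatted message if condition is false.
func _assert(condition bool, msg string, v ...interface{}) {
	if !condition {
		panic(fmt.Sprintf("assertion failed: "+msg, v...))
	}
}

// checkVariadic converts n to interface{} on every call.
func checkVariadic(n int) {
	_assert(n > 0, "invalid count: %d", n)
}

// checkExpanded inlines the check, so the conversion to interface{}
// happens only on the failure path.
func checkExpanded(n int) {
	if n <= 0 {
		panic(fmt.Sprintf("assertion failed: invalid count: %d", n))
	}
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	fmt.Println(panics(func() { checkExpanded(0) })) // true
	fmt.Println(panics(func() { checkExpanded(1) })) // false
}
```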
This commit is a backwards compatible change that allows the freelist to overflow the
page.count (uint16). It works by checking if the overflow will occur and marking the
page.count as 0xFFFF and setting the actual count to the first element of the freelist.
This approach was used because it's backwards compatible and it doesn't make sense to
change the data type of all page counts when only the freelist's page can overflow.
Fixes #192.
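
A sketch of the encoding, assuming that 0xFFFF is reserved as the
overflow marker so that any list of 0xFFFF or more IDs takes the
overflow path. `encodeFreelist` and `decodeFreelist` are hypothetical
helpers that operate on plain slices instead of a real page:

```go
package main

import "fmt"

const overflowMarker = 0xFFFF

// encodeFreelist returns the value for the uint16 count field and the
// payload to store. On overflow the real count becomes element zero.
func encodeFreelist(ids []uint64) (count uint16, payload []uint64) {
	if len(ids) < overflowMarker {
		return uint16(len(ids)), ids
	}
	payload = make([]uint64, 0, len(ids)+1)
	payload = append(payload, uint64(len(ids)))
	payload = append(payload, ids...)
	return overflowMarker, payload
}

// decodeFreelist reverses encodeFreelist.
func decodeFreelist(count uint16, payload []uint64) []uint64 {
	if count == overflowMarker {
		n := payload[0]
		return payload[1 : 1+n]
	}
	return payload[:count]
}

func main() {
	big := make([]uint64, 70000)
	count, payload := encodeFreelist(big)
	fmt.Println(count, payload[0], len(decodeFreelist(count, payload)))
	// 65535 70000 70000
}
```

Files written with small freelists are unchanged by this scheme, which
is what keeps it backwards compatible.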
This commit adds a cache to the freelist which combines the available free pages and pending free pages in
a single map. This was added to improve performance where freelist.isFree() was consuming 70% of CPU time
for large freelists.
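
A cut-down sketch of such a cache. The struct layout and the
`reindex` method are assumptions for this example; only the idea of
one map covering both free and pending pages comes from the text:

```go
package main

import "fmt"

type pgid uint64
type txid uint64

// freelist is a simplified stand-in for the real structure.
type freelist struct {
	ids     []pgid          // pages available for allocation
	pending map[txid][]pgid // pages freed but possibly still in use
	cache   map[pgid]bool   // union of ids and all pending pages
}

// reindex rebuilds the cache from both sources.
func (f *freelist) reindex() {
	f.cache = make(map[pgid]bool, len(f.ids))
	for _, id := range f.ids {
		f.cache[id] = true
	}
	for _, ids := range f.pending {
		for _, id := range ids {
			f.cache[id] = true
		}
	}
}

// isFree is a single map lookup instead of a scan over every list.
func (f *freelist) isFree(id pgid) bool {
	return f.cache[id]
}

func main() {
	f := &freelist{
		ids:     []pgid{3, 4},
		pending: map[txid][]pgid{10: {8}, 11: {9}},
	}
	f.reindex()
	fmt.Println(f.isFree(4), f.isFree(9), f.isFree(5)) // true true false
}
```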
This commit adds an explicit DefaultOptions variable for additional documentation.
Open() can still be passed a nil options which will cause options to be changed to
the DefaultOptions variable. This change also allows options to be set globally for
an application if more than one database is being opened in a process.
This commit also moves all errors to errors.go so that the godoc groups them together.
This commit fixes the freelist so that it frees from the beginning of the data file
instead of the end. It also adds a fast path for pages which can be allocated from
the first free pages and it includes read transaction stats.
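
A sketch of front-first allocation over a sorted ID list, with the
fast path taken when the run starts at the very first free page. The
struct is simplified and the details are assumptions, not a copy of
the patch:

```go
package main

import "fmt"

type pgid uint64

type freelist struct {
	ids []pgid // sorted free page IDs
}

// allocate returns the first page of the lowest contiguous run of n
// free pages and removes the run from the list, or 0 if none exists.
func (f *freelist) allocate(n int) pgid {
	if n <= 0 || len(f.ids) == 0 {
		return 0
	}
	var initial, previd pgid
	for i, id := range f.ids {
		// Restart the run whenever the IDs stop being consecutive.
		if previd == 0 || id-previd != 1 {
			initial = id
		}
		if int(id-initial)+1 == n {
			if i+1 == n {
				// Fast path: the run is the head of the list.
				f.ids = f.ids[i+1:]
			} else {
				copy(f.ids[i-n+1:], f.ids[i+1:])
				f.ids = f.ids[:len(f.ids)-n]
			}
			return initial
		}
		previd = id
	}
	return 0
}

func main() {
	f := &freelist{ids: []pgid{3, 4, 7, 8, 9, 12}}
	fmt.Println(f.allocate(3), f.ids) // 7 [3 4 12]
	fmt.Println(f.allocate(2), f.ids) // 3 [12]
}
```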
Well, this is embarrassing. Somehow the freelist was never getting written after each commit.
This commit fixes that and fixes a small reporting issue with "bolt pages".
I changed the Transaction/RWTransaction types to Tx/RWTx, respectively. This makes the naming
more consistent with other packages such as database/sql. The txnid is changed to txid as well.