42 Commits (dd4458ce3a82c1e6f7134d746adb5b64d36495e5)

Author SHA1 Message Date
Benjamin Wang 70e7654959 change freelist.cache from map[pgid]bool to map[pgid]struct{}
We just need to cache a list of free page IDs and don't care what
the value (true or false) is at all, so changed the map's value from
bool to struct{}.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-11-24 16:06:38 +08:00
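A minimal sketch of the set idiom this commit describes. The `pgid` definition and the `pageSet` type with its methods are illustrative assumptions, not code taken from the repository.

```go
package main

// pgid stands in for the page ID type; its real definition is not shown in
// this log, so uint64 is an assumption.
type pgid uint64

// pageSet is a hypothetical set of free page IDs. struct{} has zero size,
// so the map carries only its keys and no per-entry value.
type pageSet map[pgid]struct{}

// add marks id as present in the set.
func (s pageSet) add(id pgid) { s[id] = struct{}{} }

// has reports whether id is present, using the comma-ok form because the
// stored value itself carries no information.
func (s pageSet) has(id pgid) bool {
	_, ok := s[id]
	return ok
}
```

Membership is tested with the comma-ok lookup, so a `true`/`false` value would add nothing.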
Josh Rickmar 9034717d69 Try to use reflect.SliceHeader correctly this time 2020-05-21 18:50:41 +00:00
Josh Rickmar f9d3ff6648 Fix incorrect unsafe usage
After checkptr fixes by 2fc6815c, it was discovered that new issues
were hit in production systems, in particular when a single process
opened and updated multiple separate databases.  This indicates that
some bug relating to bad unsafe usage was introduced during this
commit.

This commit combines several attempts at fixing this new issue.  For
example, slices are once again created by slicing an array of "max
allocation" elements, but this time with the cap set to the intended
length.  This operation is expressly permitted according to the Go
wiki, so it should be preferred to type converting a
reflect.SliceHeader.
2020-04-28 20:30:23 +00:00
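The slicing pattern mentioned in this message can be sketched as follows. The names `maxElems` and `viewFrom` and the element type are hypothetical; only the shape of the expression (convert to a pointer to a large array, then slice with both length and capacity set) comes from the commit text.

```go
package main

import "unsafe"

// maxElems is a hypothetical "max allocation" element count. It only bounds
// the array type used for the conversion; no array of this size is allocated.
const maxElems = 1 << 20

// viewFrom returns a []uint64 of length n and capacity n that starts at
// first. The caller must guarantee that at least n uint64 values really
// exist at that address and that n <= maxElems.
func viewFrom(first *uint64, n int) []uint64 {
	// The full slice expression [:n:n] caps the result at n, so append and
	// reslicing cannot reach past the memory the caller vouched for.
	return (*[maxElems]uint64)(unsafe.Pointer(first))[:n:n]
}
```

Since Go 1.17 the standard library also offers `unsafe.Slice`, which expresses the same intent without the oversized array type.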
Josh Rickmar 543c40ab41 Fix unsafe pointer conversions caught by Go 1.14 checkptr 2020-03-18 21:18:39 -04:00
Xingyu Chen a0458a2b35 fix rollback panic bug (#153) 2019-06-08 09:57:04 -07:00
Xingyu Chen 8693da9f4d use segregated hashmap to boost the freelist allocate and release performance (#141) 2019-01-25 10:30:05 -08:00
Xingyu Chen f0ad07c7d4 add getFreePageIDs (#140) 2019-01-20 23:42:17 -08:00
Xingyu Chen c5638469ec update the freelist readIDs (#139) 2019-01-20 21:45:53 -08:00
Gyuho Lee 76a4670663 *: update import paths "go.etcd.io/bbolt"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 08:15:54 -07:00
Anthony Romano 386b851495 freelist: set alloc tx for freelist to prior txn
Was causing freelist corruption on tx.WriteTo
2017-11-16 08:16:58 -08:00
Joe Betz 237a4fcb31 Panic if page provided to freelist.read is incorrect page type. 2017-11-15 15:52:34 -08:00
Anthony Romano d3d8bbd794 pass gofmt 2017-08-10 22:07:25 -07:00
Anthony Romano 03f5e16968 freelist: read all free pages on count overflow
count is not shifted up by start index when taking subslice of free
list, dropping the last entry in the list.
2017-08-08 23:38:46 -07:00
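The off-by-one described here can be illustrated with a small decoder. This is a sketch under assumptions: `readIDs`, its parameters, and the `pgid` definition are hypothetical, and the 0xFFFF marker is taken from the "Allow freelist overflow" commit further down this log.

```go
package main

// pgid stands in for the page ID type (assumed to be uint64 here).
type pgid uint64

// readIDs is a hypothetical decoder for a freelist page. headerCount is the
// 16-bit count from the page header and raw holds the page's elements.
func readIDs(headerCount uint16, raw []pgid) []pgid {
	idx, count := 0, int(headerCount)
	if headerCount == 0xFFFF {
		// Overflow form: the real count is stored in the first element and
		// the IDs start at index 1.
		idx = 1
		count = int(raw[0])
	}
	// The end bound has to be shifted by idx as well. Writing raw[idx:count]
	// here would silently drop the last ID in the overflow case.
	return raw[idx : idx+count]
}
```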
Gyu-Ho Lee 7ce671beee *: fix gofmt style issues in 'range' 2017-07-27 14:57:54 -07:00
Xiang Li ad39960eb4 Merge pull request #3 from heyitsanthony/range-gc
Garbage collect pages allocated after minimum txid
2017-06-23 18:19:54 -07:00
Xiang 7149270521 *: add option to skip freelist sync
When the database has a lot of freepages, the cost to sync all
freepages down to disk is high. If the total database size is
small (<10GB), and the application can tolerate ~10 seconds
recovery time, then it is reasonable to simply not sync freelist
and rescan the db to rebuild freelist on recovery.
2017-06-22 12:46:56 -07:00
Anthony Romano 78d099ed1f Garbage collect pages allocated after minimum txid
Read txns would lock pages allocated after the txn, keeping those pages
off the free list until closing the read txn. Instead, track allocating
txid to compute page lifetime, freeing pages if all txns between
page allocation and page free are closed.
2017-06-05 16:07:55 -07:00
Josh Bleecher Snyder 7adfa44e02 Fix freelist.size calculation for large freelists
freelist.size did not account for the extra
fake freelist item used to hold the number of
elements when the freelist is large.
2016-12-23 09:18:57 -08:00
Josh Bleecher Snyder 0e120dc470 Precalculate size of pending pgids in freelist.copyall
This recovers the slight alloc regression in #636.
2016-12-23 09:18:47 -08:00
Josh Bleecher Snyder 1858583b3b Clean up after #636
freelist.lenall duplicated freelist.count.
freelist.copyall and mergepgids docs had typos.
2016-12-23 08:56:04 -08:00
Josh Bleecher Snyder 4d8824b05d Don't allocate huge slices to merge pgids in freelist.write
Using a large (50gb) database with a read-write-delete heavy load,
nearly 100% of allocated space came from freelists.
1/3 came from freelist.release, 1/3 from freelist.write,
and 1/3 came from tx.allocate to make space for freelist.write.
In the case of freelist.write, the newly allocated giant slice gets
copied to the space prepared by tx.allocate and then discarded.

To avoid this, add func mergepgids that accepts a destination slice,
and use it in freelist.write.

This has a mild negative impact on the existing benchmarks,
but cuts allocated space in my real world db by over 30%.

name                      old time/op    new time/op    delta
_FreelistRelease10K-8       18.7µs ±10%    18.2µs ± 4%    ~             (p=0.548 n=5+5)
_FreelistRelease100K-8       233µs ± 5%     258µs ±20%    ~             (p=0.151 n=5+5)
_FreelistRelease1000K-8     3.34ms ± 8%    3.13ms ± 8%    ~             (p=0.151 n=5+5)
_FreelistRelease10000K-8    32.3ms ± 1%    32.2ms ± 7%    ~             (p=0.690 n=5+5)
DBBatchAutomatic-8          2.18ms ± 3%    2.19ms ± 4%    ~             (p=0.421 n=5+5)
DBBatchSingle-8              140ms ± 6%     140ms ± 4%    ~             (p=0.841 n=5+5)
DBBatchManual10x100-8       4.41ms ± 2%    4.37ms ± 3%    ~             (p=0.548 n=5+5)

name                      old alloc/op   new alloc/op   delta
_FreelistRelease10K-8       82.5kB ± 0%    82.5kB ± 0%    ~     (all samples are equal)
_FreelistRelease100K-8       805kB ± 0%     805kB ± 0%    ~     (all samples are equal)
_FreelistRelease1000K-8     8.05MB ± 0%    8.05MB ± 0%    ~     (all samples are equal)
_FreelistRelease10000K-8    80.4MB ± 0%    80.4MB ± 0%    ~             (p=1.000 n=5+5)
DBBatchAutomatic-8           384kB ± 0%     384kB ± 0%    ~             (p=0.095 n=5+5)
DBBatchSingle-8             17.2MB ± 1%    17.2MB ± 1%    ~             (p=0.310 n=5+5)
DBBatchManual10x100-8        908kB ± 0%     902kB ± 1%    ~             (p=0.730 n=4+5)

name                      old allocs/op  new allocs/op  delta
_FreelistRelease10K-8         5.00 ± 0%      5.00 ± 0%    ~     (all samples are equal)
_FreelistRelease100K-8        5.00 ± 0%      5.00 ± 0%    ~     (all samples are equal)
_FreelistRelease1000K-8       5.00 ± 0%      5.00 ± 0%    ~     (all samples are equal)
_FreelistRelease10000K-8      5.00 ± 0%      5.00 ± 0%    ~     (all samples are equal)
DBBatchAutomatic-8           10.2k ± 0%     10.2k ± 0%  +0.07%          (p=0.032 n=5+5)
DBBatchSingle-8              58.6k ± 0%     59.6k ± 0%  +1.70%          (p=0.008 n=5+5)
DBBatchManual10x100-8        6.02k ± 0%     6.03k ± 0%  +0.17%          (p=0.029 n=4+4)
2016-12-20 14:32:15 -08:00
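The idea of merging into a caller-supplied destination, rather than allocating a new slice, might look like the sketch below. `mergeInto` is a hypothetical name and a plain two-pointer merge; it is not presented as the `mergepgids` implementation from this commit.

```go
package main

// pgid stands in for the page ID type (assumed to be uint64 here).
type pgid uint64

// mergeInto merges the sorted slices a and b into dst and returns the
// filled prefix of dst. It allocates nothing; dst must have room for
// len(a)+len(b) elements and must not overlap a or b.
func mergeInto(dst, a, b []pgid) []pgid {
	if len(dst) < len(a)+len(b) {
		panic("mergeInto: destination too small")
	}
	i, j, k := 0, 0, 0
	for i < len(a) && j < len(b) {
		if a[i] <= b[j] {
			dst[k] = a[i]
			i++
		} else {
			dst[k] = b[j]
			j++
		}
		k++
	}
	// At most one of these copies moves anything: the leftover tail.
	k += copy(dst[k:], a[i:])
	k += copy(dst[k:], b[j:])
	return dst[:k]
}
```

With this shape, the writer can pass the space it has already reserved for the freelist page as `dst`, so no temporary slice is built and then discarded.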
Nikita Vetoshkin 3d34fbcbfb Lower number of allocation in freelist.reindex()
Here is a profile taken from etcd.
Before:
     10924      10924 (flat, cum)  4.99% of Total
         .          .    230:
         .          .    231:// reindex rebuilds the free cache based on available and pending free lists.
         .          .    232:func (f *freelist) reindex() {
         .          .    233:	f.cache = make(map[pgid]bool)
         .          .    234:	for _, id := range f.ids {
     10924      10924    235:		f.cache[id] = true
         .          .    236:	}
         .          .    237:	for _, pendingIDs := range f.pending {
         .          .    238:		for _, pendingID := range pendingIDs {
         .          .    239:			f.cache[pendingID] = true
         .          .    240:		}
After:
         1          1 (flat, cum) 0.0017% of Total
         .          .    228:	f.reindex()
         .          .    229:
         .          .    230:}
         .          .    231:// reindex rebuilds the free cache based on available and pending free lists.
         .          .    232:func (f *freelist) reindex() {
         1          1    233:	f.cache = make(map[pgid]bool, len(f.ids))
         .          .    234:	for _, id := range f.ids {
         .          .    235:		f.cache[id] = true
         .          .    236:	}
         .          .    237:	for _, pendingIDs := range f.pending {
         .          .    238:		for _, pendingID := range pendingIDs {
2016-09-05 14:04:40 +05:00
Ben Johnson 92410e0673 fix Go 1.7 pointer reference bug
This commit fixes a bug where page end-of-header pointers were being
converted to byte slices even when the pointer did not point to
allocated memory. This occurs with pages that have a `page.count`
of zero.

Note: This was not an issue in Go 1.6 but the new Go 1.7 SSA backend
handles `nil` checks differently.

See https://github.com/golang/go/issues/16772
2016-08-18 08:44:57 -06:00
Martin Kobetic 04a3e85793 Merge sorted pgids rather than resorting everything 2015-06-16 13:48:54 -06:00
Ben Johnson b4d00c394a Expand assertion statements.
This commit expands calls to _assert() that use variadic arguments. These calls require conversion to interface{} so there
was a large number of calls to Go's internal convT2E() function. In some profiling this was taking over 20% of total runtime.
I don't remember seeing this before Go 1.4 so perhaps something has changed.
2015-01-30 14:15:49 -05:00
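To illustrate the difference between the two call shapes, here is a sketch with a hypothetical variadic helper `assertf` and a hypothetical function `checkFreeable` that uses the expanded form. The message text and the page ID rule are invented for the example.

```go
package main

import "fmt"

// pgid stands in for the page ID type (assumed to be uint64 here).
type pgid uint64

// assertf is a hypothetical variadic assertion helper. A call such as
// assertf(id > 1, "bad page: %d", id) may have to convert id to
// interface{} before the call, even when the condition holds.
func assertf(cond bool, format string, args ...interface{}) {
	if !cond {
		panic(fmt.Sprintf("assertion failed: "+format, args...))
	}
}

// checkFreeable shows the expanded form: the condition is tested inline and
// the formatting arguments are converted only on the failure path.
func checkFreeable(id pgid) {
	if id <= 1 {
		panic(fmt.Sprintf("assertion failed: cannot free page 0 or 1: %d", id))
	}
}
```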
Ben Johnson ce0754b0d3 Allow freelist overflow.
This commit is a backwards compatible change that allows the freelist to overflow the
page.count (uint16). It works by checking if the overflow will occur and marking the
page.count as 0xFFFF and setting the actual count to the first element of the freelist.

This approach was used because it's backwards compatible and it doesn't make sense to
change the data type of all page counts when only the freelist's page can overflow.

Fixes #192.
2014-07-10 14:50:21 -06:00
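The encoding described above can be sketched as follows. `encodeIDs` and its return shape are assumptions made for illustration; only the 0xFFFF marker and the count-in-first-element rule come from the commit message.

```go
package main

// pgid stands in for the page ID type (assumed to be uint64 here).
type pgid uint64

// encodeIDs is a hypothetical encoder. It returns the value to store in the
// 16-bit page count field and the elements to write to the page.
func encodeIDs(ids []pgid) (headerCount uint16, elems []pgid) {
	if len(ids) < 0xFFFF {
		// The count fits in the header, so the layout is unchanged and
		// older files stay readable.
		return uint16(len(ids)), ids
	}
	// Overflow: mark the header with 0xFFFF and store the real count as an
	// extra leading element.
	elems = make([]pgid, 0, len(ids)+1)
	elems = append(elems, pgid(len(ids)))
	elems = append(elems, ids...)
	return 0xFFFF, elems
}
```

Because 0xFFFF is reserved as the marker in this sketch, a list of exactly 0xFFFF IDs also takes the overflow path.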
Ben Johnson 333c586ed0 Clean up freelist reindex. 2014-07-10 14:16:26 -06:00
Ben Johnson def455554b Add freelist cache.
This commit adds a cache to the freelist which combines the available free pages and pending free pages in
a single map. This was added to improve performance where freelist.isFree() was consuming 70% of CPU time
for large freelists.
2014-06-30 08:01:41 -06:00
Ben Johnson 642b104396 Add DefaultOptions variable.
This commit adds an explicit DefaultOptions variable for additional documentation.
Open() can still be passed a nil options which will cause options to be changed to
the DefaultOptions variable. This change also allows options to be set globally for
an application if more than one database is being opened in a process.

This commit also moves all errors to errors.go so that the godoc groups them together.
2014-06-22 12:44:20 -06:00
Martin Kobetic 571f201672 split the freelist page count stats to free and pending 2014-06-20 14:53:25 +00:00
Martin Kobetic c105316292 add freelist stats to db stats 2014-06-17 18:40:56 +00:00
Ben Johnson 4db99647eb Fix freelist rollback. 2014-06-13 15:50:47 -06:00
Ben Johnson f448639ce4 Check for freelist overflow. 2014-06-13 07:56:10 -06:00
Ben Johnson 2eaf8f7ce0 Add freelist assertion on every free().
This commit performs a check on the freelist pages to ensure that a double free can never happen.
2014-05-29 08:02:15 -06:00
Ben Johnson 12b36fe70c Fix freelist allocate(). 2014-05-19 14:11:32 -06:00
Ben Johnson 782ead0dbf Fix freelist allocation direction.
This commit fixes the freelist so that it frees from the beginning of the data file
instead of the end. It also adds a fast path for pages which can be allocated from
the first free pages and it includes read transaction stats.
2014-05-19 12:08:33 -06:00
Ben Johnson 698b07b074 Add nested buckets.
This commit adds the ability to create buckets inside of other buckets.
It also replaces the buckets page with a root bucket.

Fixes #56.
2014-04-11 12:36:54 -06:00
Ben Johnson 440b89418f Write freelist after each commit.
Well, this is embarrassing. Somehow the freelist was never getting written after each commit.
This commit fixes that and fixes a small reporting issue with "bolt pages".
2014-03-31 08:52:19 -06:00
Ben Johnson 0e4d77d424 Add 'bolt pages'. 2014-03-21 22:34:54 -06:00
Ben Johnson 57376f0905 Rename Transaction to Tx.
I changed the Transaction/RWTransaction types to Tx/RWTx, respectively. This makes the naming
more consistent with other packages such as database/sql. The txnid is changed to txid as well.
2014-03-08 17:04:02 -07:00
Ben Johnson 8ad59edd02 API Documentation. 2014-02-13 10:58:27 -07:00
Ben Johnson 509e93dff4 Add freelist. 2014-02-10 14:04:01 -07:00