Using a large (50GB) database with a read-write-delete heavy load,
nearly 100% of allocated space came from freelists:
roughly 1/3 from freelist.release, 1/3 from freelist.write,
and 1/3 from tx.allocate making space for freelist.write.
In the case of freelist.write, the newly allocated giant slice gets
copied into the space prepared by tx.allocate and then discarded.
To avoid this, add func mergepgids that accepts a destination slice,
and use it in freelist.write.
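A minimal sketch of the idea, assuming two already-sorted pgid lists; the type
definitions (pgid, pgids) and the error handling are illustrative stand-ins, not
copied from the repository:

    package main

    import "fmt"

    // pgid/pgids are stand-ins for bolt's page id types.
    type pgid uint64
    type pgids []pgid

    // mergepgids copies the merge of the two sorted lists a and b into dst.
    // It allocates nothing itself, so the caller can pass in space that has
    // already been set aside for it.
    func mergepgids(dst, a, b pgids) {
        if len(dst) < len(a)+len(b) {
            panic("mergepgids: dst too small")
        }
        i, j, k := 0, 0, 0
        for i < len(a) && j < len(b) {
            if a[i] <= b[j] {
                dst[k] = a[i]
                i++
            } else {
                dst[k] = b[j]
                j++
            }
            k++
        }
        // Copy whatever remains of the longer input.
        k += copy(dst[k:], a[i:])
        copy(dst[k:], b[j:])
    }

    func main() {
        dst := make(pgids, 7)
        mergepgids(dst, pgids{3, 5, 9}, pgids{1, 4, 7, 11})
        fmt.Println(dst) // [1 3 4 5 7 9 11]
    }

Because the merge writes straight into dst, freelist.write can hand it the
region tx.allocate already prepared instead of building a temporary slice.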
This has a mild negative impact on the existing benchmarks,
but cuts allocated space in my real-world db by over 30%.
name                        old time/op    new time/op    delta
_FreelistRelease10K-8         18.7µs ±10%    18.2µs ± 4%    ~     (p=0.548 n=5+5)
_FreelistRelease100K-8         233µs ± 5%     258µs ±20%    ~     (p=0.151 n=5+5)
_FreelistRelease1000K-8       3.34ms ± 8%    3.13ms ± 8%    ~     (p=0.151 n=5+5)
_FreelistRelease10000K-8      32.3ms ± 1%    32.2ms ± 7%    ~     (p=0.690 n=5+5)
DBBatchAutomatic-8            2.18ms ± 3%    2.19ms ± 4%    ~     (p=0.421 n=5+5)
DBBatchSingle-8                140ms ± 6%     140ms ± 4%    ~     (p=0.841 n=5+5)
DBBatchManual10x100-8         4.41ms ± 2%    4.37ms ± 3%    ~     (p=0.548 n=5+5)

name                        old alloc/op   new alloc/op   delta
_FreelistRelease10K-8         82.5kB ± 0%    82.5kB ± 0%    ~     (all samples are equal)
_FreelistRelease100K-8         805kB ± 0%     805kB ± 0%    ~     (all samples are equal)
_FreelistRelease1000K-8       8.05MB ± 0%    8.05MB ± 0%    ~     (all samples are equal)
_FreelistRelease10000K-8      80.4MB ± 0%    80.4MB ± 0%    ~     (p=1.000 n=5+5)
DBBatchAutomatic-8             384kB ± 0%     384kB ± 0%    ~     (p=0.095 n=5+5)
DBBatchSingle-8               17.2MB ± 1%    17.2MB ± 1%    ~     (p=0.310 n=5+5)
DBBatchManual10x100-8          908kB ± 0%     902kB ± 1%    ~     (p=0.730 n=4+5)

name                        old allocs/op  new allocs/op  delta
_FreelistRelease10K-8           5.00 ± 0%      5.00 ± 0%    ~     (all samples are equal)
_FreelistRelease100K-8          5.00 ± 0%      5.00 ± 0%    ~     (all samples are equal)
_FreelistRelease1000K-8         5.00 ± 0%      5.00 ± 0%    ~     (all samples are equal)
_FreelistRelease10000K-8        5.00 ± 0%      5.00 ± 0%    ~     (all samples are equal)
DBBatchAutomatic-8             10.2k ± 0%     10.2k ± 0%  +0.07%  (p=0.032 n=5+5)
DBBatchSingle-8                58.6k ± 0%     59.6k ± 0%  +1.70%  (p=0.008 n=5+5)
DBBatchManual10x100-8          6.02k ± 0%     6.03k ± 0%  +0.17%  (p=0.029 n=4+4)
The operands of the subtraction for `TxN` were previously transposed,
which caused the result to be a negative number. This change swaps the
order so that the correct (positive) result is returned.
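A trimmed-down illustration of the change, assuming a Stats.Sub-style diff
between the current snapshot (the receiver) and an earlier one; the struct here
keeps only the relevant field:

    package main

    import "fmt"

    // Stats is trimmed to the one field that matters for this fix.
    type Stats struct {
        TxN int // number of started read transactions
    }

    // Sub returns the difference between two snapshots. The bug computed
    // other.TxN - s.TxN, which is negative when s is the newer snapshot;
    // subtracting in this order gives the expected positive delta.
    func (s *Stats) Sub(other *Stats) Stats {
        var diff Stats
        diff.TxN = s.TxN - other.TxN
        return diff
    }

    func main() {
        prev := Stats{TxN: 100}
        curr := Stats{TxN: 175}
        fmt.Println(curr.Sub(&prev).TxN) // 75, not -75
    }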
This commit fixes a bug where page end-of-header pointers were being
converted to byte slices even when the pointer did not point to
allocated memory. This occurs with pages that have a `page.count`
of zero.
Note: This was not an issue in Go 1.6, but the new Go 1.7 SSA backend
handles `nil` checks differently.
See https://github.com/golang/go/issues/16772
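A rough sketch of the guard, using a simplified page layout and a made-up
maxPageIDs bound in place of the real constants; the only point carried over
from the commit is that the unsafe cast of the end-of-header field is skipped
when `page.count` is zero:

    package main

    import (
        "fmt"
        "unsafe"
    )

    type pgid uint64

    // page loosely mirrors the on-disk header; ptr marks the first byte
    // past the header and is only backed by data when count > 0.
    type page struct {
        id       pgid
        flags    uint16
        count    uint16
        overflow uint32
        ptr      uintptr
    }

    const maxPageIDs = 0x7FFFFFF // illustrative upper bound for the cast

    // readIDs only reinterprets &p.ptr when the page has entries.
    // With count == 0 the old code still formed a slice from that
    // address, which the Go 1.7 SSA backend no longer tolerates.
    func readIDs(p *page) []pgid {
        if p.count == 0 {
            return nil
        }
        return ((*[maxPageIDs]pgid)(unsafe.Pointer(&p.ptr)))[:p.count:p.count]
    }

    func main() {
        buf := make([]byte, 4096) // back the page with real memory
        p := (*page)(unsafe.Pointer(&buf[0]))
        fmt.Println(readIDs(p)) // [] — empty page, no cast performed

        ids := (*[2]pgid)(unsafe.Pointer(&p.ptr))
        ids[0], ids[1] = 7, 9
        p.count = 2
        fmt.Println(readIDs(p)) // [7 9]
    }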
ARMv5 devices and older (i.e. <= ARM9 generation) require addresses that are
stored to and loaded from to be 4-byte aligned.
If this is not the case, the lower 2 bits of the address are cleared and the
load is performed in an unexpected order, including up to 3 bytes of data
located prior to the address.
Inlined buckets are stored after their key in a page, and since there is no
guarantee that the key's length will be a multiple of 4, unaligned loads and
stores can occur when the data is cast back to bucket and page pointer types.
The fix adds a new field that tracks whether the current architecture exhibits
this issue and sets it at module load time on ARM architectures. Then, on
bucket open, if the field is set and the address is unaligned, a byte-by-byte
copy of the inlined bucket is performed.
Ref: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.html
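A sketch of the approach described above, assuming an init-time probe and
illustrative names (brokenUnaligned, cloneBytes, alignInline); in the real code
the detection would sit behind ARM-only build constraints:

    package main

    import (
        "fmt"
        "unsafe"
    )

    // brokenUnaligned records whether this architecture mishandles
    // unaligned word loads; it is set once at package load time.
    var brokenUnaligned bool

    func init() {
        // Load a uint32 from an address that is deliberately 2 bytes off.
        // On affected ARM cores the low address bits are dropped and the
        // bytes come back in a jumbled order, so the value won't match.
        raw := [6]byte{0xfe, 0xef, 0x11, 0x22, 0x22, 0x11}
        val := *(*uint32)(unsafe.Pointer(uintptr(unsafe.Pointer(&raw)) + 2))
        brokenUnaligned = val != 0x11222211
    }

    // cloneBytes returns an independently allocated (aligned) copy of v.
    func cloneBytes(v []byte) []byte {
        c := make([]byte, len(v))
        copy(c, v)
        return c
    }

    // alignInline is what a bucket-open path would do with the raw bytes
    // of an inlined bucket: if the architecture is affected and the value
    // starts at a non-4-byte-aligned address, copy it into fresh memory
    // before casting it back to bucket and page pointer types.
    func alignInline(value []byte) []byte {
        if brokenUnaligned && uintptr(unsafe.Pointer(&value[0]))&3 != 0 {
            value = cloneBytes(value)
        }
        return value
    }

    func main() {
        raw := make([]byte, 64)
        v := alignInline(raw[1:9]) // deliberately unaligned view
        fmt.Println(len(v), brokenUnaligned)
    }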