- Attention
- Using any combination of MDBX_SAFE_NOSYNC, MDBX_NOMETASYNC and especially MDBX_UTTERLY_NOSYNC always trades durability for write performance. You must know exactly what you are doing and what risks you are taking!
- Note
- For LMDB users: MDBX_SAFE_NOSYNC is NOT similar to LMDB_NOSYNC, but MDBX_UTTERLY_NOSYNC exactly matches LMDB_NOSYNC. See details below.
THE SCENE:
- The DAT-file contains several MVCC-snapshots of the B-tree at the same time, and each of these B-trees has its own root page.
- Each of the meta pages at the beginning of the DAT file contains a pointer to the root page of the B-tree that is the result of a particular transaction, together with the number of that transaction.
- For data durability, MDBX must first write all MVCC-snapshot data pages and ensure that they are written to the disk, then update a meta page with the new transaction number and a pointer to the corresponding new root page, and then flush the buffers once again.
- Thus, during a commit the I/O buffers must be flushed to disk twice; i.e. fdatasync(), FlushFileBuffers() or a similar syscall must be called twice for each commit. This is very expensive for performance, but it guarantees durability even on an unexpected system failure or power outage, provided that the operating system and the underlying hardware (e.g. disk) work correctly.
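The write ordering described above can be sketched in a few lines of POSIX C. This is only an illustration of the "data, flush, meta, flush" sequence and not the actual libmdbx implementation: the `sketch_meta` layout, the `sketch_*` function names, the 4096-byte page size and the use of two alternating meta slots at pages 0 and 1 are all assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size, for illustration only */

/* Hypothetical meta-page content; NOT the real libmdbx on-disk format. */
typedef struct {
    uint64_t txn_id;    /* number of the committed transaction */
    uint64_t root_page; /* page number of that transaction's B-tree root */
} sketch_meta;

/* Write `len` bytes, zero-padded to a full page, at page number `pgno`. */
static int sketch_write_page(int fd, uint64_t pgno, const void *buf, size_t len) {
    unsigned char page[SKETCH_PAGE_SIZE] = {0};
    assert(len <= SKETCH_PAGE_SIZE);
    memcpy(page, buf, len);
    ssize_t n = pwrite(fd, page, sizeof page, (off_t)(pgno * SKETCH_PAGE_SIZE));
    return n == (ssize_t)sizeof page ? 0 : -1;
}

/* Fully durable commit: two writes, each followed by its own flush. */
static int sketch_commit_durable(int fd, uint64_t txn_id, uint64_t root_pgno,
                                 const void *root_data, size_t root_len) {
    /* 1. Write the data pages of the new snapshot (a single page here). */
    if (sketch_write_page(fd, root_pgno, root_data, root_len) != 0)
        return -1;
    /* 2. First flush: the data pages must reach the disk before the meta page. */
    if (fdatasync(fd) != 0)
        return -1;
    /* 3. Publish the snapshot: record the transaction number and root pointer.
     *    Alternating between two meta slots is an assumption of this sketch. */
    sketch_meta meta = { txn_id, root_pgno };
    if (sketch_write_page(fd, txn_id % 2, &meta, sizeof meta) != 0)
        return -1;
    /* 4. Second flush: the commit is durable only after this returns. */
    return fdatasync(fd);
}

/* Read back the meta information stored in the given slot (0 or 1). */
static int sketch_read_meta(int fd, unsigned slot, sketch_meta *out) {
    ssize_t n = pread(fd, out, sizeof *out, (off_t)slot * SKETCH_PAGE_SIZE);
    return n == (ssize_t)sizeof *out ? 0 : -1;
}
```

In terms of this sketch, dropping step 4 postpones the moment at which the new meta page is guaranteed to be on disk, and dropping step 2 as well removes the ordering guarantee between data pages and the meta page that points at them.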
TRADE-OFF: By skipping some of the stages described above, you can gain significant speed, while partially or completely losing the guarantee of data durability and/or consistency in the event of a system or power failure. Moreover, if for any reason the disk write order is not preserved, then at the moment of a system crash a meta-page with a pointer to the new B-tree may already be written to disk, while the B-tree itself is not yet. In that case, the database will be corrupted!
- See also
- MDBX_SYNC_DURABLE
- MDBX_NOMETASYNC
- MDBX_SAFE_NOSYNC
- MDBX_UTTERLY_NOSYNC