Currently, libmdbx is only available in source code form. Package support for common Linux distributions is planned for the future, starting with the release of version 1.0.
libmdbx provides three official ways for integration in source code form:
1. Using an amalgamated source code. An amalgamated source code includes all files required to build and use libmdbx, but not for testing libmdbx itself. Besides the releases, the amalgamated sources can be created at any time from the original clone of the git repository on Linux by executing `make dist`. As a result, the desired set of files will be formed in the `dist` subdirectory.
2. Using Conan: create the package by executing `conan create .` inside the libmdbx repo subdirectory.
3. Adding the complete source code as a `git submodule` from the origin git repository on GitFlic. This allows you to build both libmdbx and the testing tool. On the other hand, this way requires you to pull git tags and to use a C++11 compiler for the test tool.
Please avoid using any other techniques. Otherwise, at least don't ask for support and don't name such chimeras `libmdbx`.
Both the amalgamated and the original source code can be built using CMake or GNU Make with bash.
All build ways are completely traditional and have minimal prerequisites such as `build-essential`, i.e. a non-obsolete C/C++ compiler and an SDK for the target platform. Obviously you also need the build tools themselves, i.e. `git`, `cmake` or GNU `make` with `bash`. For your convenience, `make help` and `make options` are also available for listing existing targets and build options respectively.
The only significant peculiarity is that git tags are required to build from the complete (not amalgamated) source code. Executing **`git fetch --tags --force --prune`** is enough to get them, and `--unshallow` or `--update-shallow` is required for a shallow clone.
So just use CMake or GNU Make in your habitual manner, and feel free to file an issue or make a pull request if something turns out to be unexpected or broken.
The amalgamated source code does not contain any tests, for several reasons. Please read the explanation and don't ask to alter this. So for testing libmdbx itself you need the full source code, i.e. a clone of the git repository; there is no other option.
The full source code of libmdbx has a `test` subdirectory with a minimalistic test "framework". Actually, it holds the source code of `mdbx_test`, a console utility with a set of command-line options that allow you to construct and run reasonably thorough test scenarios. This test utility is intended for libmdbx's developers to test the library itself, not for use by users. Therefore, only basic information is provided:
- The `basic` test scenario.
- The `Makefile` provides several self-described targets for testing: `smoke`, `test`, `check`, `memcheck`, `test-valgrind`, `test-asan`, `test-leak`, `test-ubsan`, `cross-gcc`, `cross-qemu`, `gcc-analyzer`, `smoke-fault`, `smoke-singleprocess`, `test-singleprocess`, `long-test`. Please run `make help` if in doubt.
- Besides the `mdbx_test` utility, there is the script `stochastic.sh`, which calls `mdbx_test` while going through a set of modes and options, gradually increasing the number of operations and the size of transactions. This script is used for almost all automatic testing, including `Makefile` targets and Continuous Integration.
- A brief description of the command-line options is available via `--help`. However, to get all the details you should dive into the source code; there is no other option.
Anyway, no matter how thoroughly libmdbx is tested, you should rely only on your own tests, for a few reasons:
By default libmdbx tracks the build time via the `MDBX_BUILD_TIMESTAMP` build option and macro. So for reproducible builds you should predefine/override it to a known fixed string value. For instance:
- `make MDBX_BUILD_TIMESTAMP=unknown ...`
- `cmake -DMDBX_BUILD_TIMESTAMP:STRING=unknown ...`
Of course, in addition to this, your toolchain must ensure the reproducibility of builds. For more information please refer to reproducible-builds.org.
There are no special traits or quirks if you use libmdbx ONLY inside a single container. But in cross-container cases, or with a mix of host and container(s), two major things MUST be guaranteed:
- `--pid=host` is required to run DB-aware processes inside Docker, or, when there is no interaction with the host, `--pid=container:<name|id>` with the same name/id.
- `OpenProcess(SYNCHRONIZE, ..., PID)` must return a reasonable error, including `ERROR_ACCESS_DENIED`, but not `ERROR_INVALID_PARAMETER` as for an invalid/non-existent PID.
When building libmdbx as a shared library, or using static libmdbx as a part of another dynamic library, it is advisable to make sure that your system correctly calls the destructors of Thread-Local-Storage objects when unloading dynamic libraries.
If this is not the case, then unloading a dynamic-link library with libmdbx code inside can result in either a resource leak or a crash due to calling destructors from an already unloaded DSO/DLL object. The problem can only manifest in a multithreaded application that unloads shared dynamic libraries with libmdbx code inside after using libmdbx. It is known that TLS destructors are properly maintained in the following cases:
- Systems with the `__cxa_thread_atexit_impl()` function in the standard C library, including systems with GNU libc version 2.18 and later.
To build the library it is enough to execute `make all` in the source code directory, and `make check` to execute the basic tests.
If the `make` installed on the system is not GNU Make, there will be a lot of errors from make when trying to build. In this case, you should perhaps use `gmake` instead of `make`, or even `gnu-make`, etc.
As a rule, on BSD and its derivatives the default is Berkeley Make, and Bash is not installed.
So you need to install the required components: GNU Make, Bash, and C and C++ compilers compatible with GCC or Clang. After that, to build the library, it is enough to execute `gmake all` (or `make all`) in the source code directory, and `gmake check` (or `make check`) to run the basic tests.
To build libmdbx on Windows, the original CMake and Microsoft Visual Studio 2019 are recommended. Please use recent versions of CMake, Visual Studio and the Windows SDK to avoid trouble with C11 support and the `alignas()` feature.
To build with MinGW, version 10.2 or later coupled with a modern CMake is required. So it is recommended to use chocolatey to install and/or update them.
Other ways to build are potentially possible, but they are not supported and will not be. The `CMakeLists.txt` or `GNUMakefile` scripts will probably need to be modified accordingly. When using other methods, do not forget to add `ntdll.lib` to the link step.
It should be noted that libmdbx makes an effort to avoid runtime dependencies on the CRT and other MSVC libraries. To get this, it is enough to pass the `-DMDBX_WITHOUT_MSVC_CRT:BOOL=ON` option when configuring with CMake.
To run the long stochastic test scenario, bash is required, and it is recommended to place the test data on a RAM disk.
libmdbx can be used in a WSL2 environment but NOT in WSL1. This is a consequence of the fundamental shortcomings of WSL1 and cannot be fixed. To avoid data loss, libmdbx returns the `ENOLCK` (37, "No record locks available") error when opening a database in a WSL1 environment.
Current native build tools for macOS include GNU Make, Clang and an outdated version of Bash. Therefore, to build the library, it is enough to run `make all` in the source code directory, and `make check` to execute the basic tests. If something goes wrong, it is recommended to install Homebrew and try again.
To run the long stochastic test scenario, you will need to install the current (not outdated) version of Bash. To do this, I recommend that you install Homebrew and then execute `brew install bash`.
I recommend using CMake to build libmdbx for Android. Please refer to the official guide.
To build libmdbx for iOS, I recommend using CMake with the "toolchain file" from the ios-cmake project.
This section is based on Bert Hubert's intro "LMDB Semantics", with edits reflecting the improvements and enhancements that were made in MDBX. See Bert Hubert's original.
Everything starts with an environment, created by mdbx_env_create(). Once created, this environment must also be opened with mdbx_env_open(), and after use be closed by mdbx_env_close(). A non-zero value of the last argument, "mode", means that MDBX will create the database and directory if they do not exist. In this case the non-zero "mode" argument specifies the file mode bits to be applied when new files are created by the `open()` function.
Within that directory, a lock file (aka LCK-file) and a storage file (aka DXB-file) will be generated. If you don't want to use a directory, you can pass the MDBX_NOSUBDIR option, in which case the path you provided is used directly as the DXB-file, and another file with a "-lck" suffix added will be used for the LCK-file.
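To make the role of the "mode" argument more concrete, here is a minimal POSIX sketch of how file mode bits passed to `open()` end up on a newly created file. It does not call libmdbx at all: the helper name `create_with_mode` is an assumption made for illustration, and it only shows the operating system behavior that the text above refers to.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of libmdbx): create `path` with the requested
 * permission bits and return the permission bits the new file actually got,
 * or -1 on error. The kernel clears the bits set in the process umask, so the
 * result is `mode & ~umask`. */
int create_with_mode(const char *path, mode_t mode) {
  assert(path != NULL);
  /* O_EXCL makes the call fail if the file already exists, so the mode
   * bits are guaranteed to come from this call. */
  int fd = open(path, O_CREAT | O_EXCL | O_RDWR, mode);
  if (fd < 0)
    return -1;
  struct stat st;
  int rc = fstat(fd, &st);
  close(fd);
  if (rc != 0)
    return -1;
  return (int)(st.st_mode & 0777);
}
```

For example, with a umask of 022, requesting 0664 yields a file with mode 0644, so group write access has to be allowed by the umask as well as by the "mode" argument.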
Once the environment is open, a transaction can be created within it using mdbx_txn_begin(). Transactions may be read-write or read-only, and read-write transactions may be nested. A transaction must only be used by one thread at a time. Transactions are always required, even for read-only access. The transaction provides a consistent view of the data.
Once a transaction has been created, a database (i.e. key-value space inside the environment) can be opened within it using mdbx_dbi_open(). If only one database will ever be used in the environment, `NULL` can be passed as the database name. For named databases, the MDBX_CREATE flag must be used to create the database if it doesn't already exist. Also, mdbx_env_set_maxdbs() must be called after mdbx_env_create() and before mdbx_env_open() to set the maximum number of named databases you want to support.
Within a transaction, mdbx_get() and mdbx_put() can retrieve and store single key-value pairs if that is all you need to do (but see Cursors below if you want to do more).
A key-value pair is expressed as two MDBX_val structures. This struct is exactly like POSIX's `struct iovec` and has two fields, `iov_len` and `iov_base`. The data is a `void` pointer to an array of `iov_len` bytes.
Because MDBX is very efficient (and usually zero-copy), the data returned in an MDBX_val structure may be memory-mapped straight from disk. In other words, look but do not touch (or `free()` for that matter). Once a transaction is closed, the values can no longer be used, so make a copy if you need to keep them after that.
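Since MDBX_val has the same shape as POSIX's `struct iovec`, the "make a copy" advice can be sketched with `struct iovec` as a stand-in, without calling libmdbx. The helper name `val_dup` is hypothetical; in real code the source would be a value filled in by mdbx_get() or a cursor, and the copy would be made before the transaction ends.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical helper (not part of libmdbx): duplicate the bytes described by
 * `src` into heap memory owned by the caller. Returns 0 on success, -1 on
 * allocation failure (in which case `dst` is left untouched). On success,
 * `dst->iov_base` must later be released with free(). */
int val_dup(const struct iovec *src, struct iovec *dst) {
  assert(src != NULL && dst != NULL);
  /* malloc(0) may return NULL, so always request at least one byte. */
  void *copy = malloc(src->iov_len ? src->iov_len : 1);
  if (copy == NULL)
    return -1;
  if (src->iov_len)
    memcpy(copy, src->iov_base, src->iov_len);
  dst->iov_base = copy;
  dst->iov_len = src->iov_len;
  return 0;
}
```

The copy stays valid regardless of what later happens to the memory the original value pointed to, which is the property you need once the transaction is closed.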
To do more powerful things, we must use a cursor.
Within the transaction, a cursor can be created with mdbx_cursor_open(). With this cursor we can store/retrieve/delete (multiple) values using mdbx_cursor_get(), mdbx_cursor_put() and mdbx_cursor_del().
The mdbx_cursor_get() positions itself depending on the cursor operation requested, and for some operations, on the supplied key. For example, to list all key-value pairs in a database, use operation MDBX_FIRST for the first call to mdbx_cursor_get(), and MDBX_NEXT on subsequent calls, until the end is hit.
To retrieve all keys starting from a specified key value, use MDBX_SET. For more cursor operations, see the C API reference.
When using mdbx_cursor_put(), either the function will position the cursor for you based on the key, or you can use operation MDBX_CURRENT to use the current position of the cursor.
So we have a cursor in a transaction which opened a database in an environment which is opened from a filesystem after it was separately created.
Or, we create an environment, open it from a filesystem, create a transaction within it, open a database within that transaction, and create a cursor within all of the above.
Got it?
Do not open a database twice in the same process at the same time; MDBX will track and prevent this. Instead, share the MDBX environment that has opened the file across all threads. The reason for this is:
Do not use opened MDBX environment(s) after `fork()` in child process(es); MDBX will check and prevent this at critical points. Instead, ensure there are no open MDBX instances during `fork()`, or at least close them immediately after `fork()` in the child process and reopen if required, for instance by using `pthread_atfork()`. The reason for this is:
- After `fork()`, in order to remain connected to a database, the child process must have its own such "slot", which can't be assigned in any simple and robust way other than the regular one.
Do not start more than one transaction per thread. If you think about it, it's really strange to work with two data snapshots at once, which may be different. MDBX checks and prevents this by returning the corresponding error code (MDBX_TXN_OVERLAPPING, MDBX_BAD_RSLOT, MDBX_BUSY) unless you use the MDBX_NOSTICKYTHREADS option on the environment. Nonetheless, with the MDBX_NOSTICKYTHREADS option you must know exactly what you are doing, otherwise you will get deadlocks or read alien data.
Also note that a transaction is tied to one thread by default using Thread Local Storage. If you want to pass transactions across threads, you can use the MDBX_NOSTICKYTHREADS option on the environment. Nevertheless, a write transaction must be committed or aborted in the same thread in which it was started. MDBX checks this in a reasonable manner and returns the MDBX_THREAD_MISMATCH error if the rule is violated.
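Coming back to the `fork()` rule above, the `pthread_atfork()` suggestion can be sketched as follows. This is only an illustration under stated assumptions: the "environment" is a plain flag, all `demo_*` names are hypothetical, and no libmdbx function is called. A real program would close its MDBX environment in the child handler and reopen it later if required.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for an opened database environment (hypothetical, not libmdbx). */
static int demo_env_is_open = 0;

/* Runs in the child right after fork(): drop the inherited handle.
 * A real program would close its MDBX environment here. */
static void demo_close_in_child(void) { demo_env_is_open = 0; }

/* Open the stand-in environment and register the fork handler.
 * Returns 0 on success or the pthread_atfork() error code. */
int demo_env_open(void) {
  /* Handlers cannot be unregistered, so registering twice would stack
   * duplicate handlers; this sketch allows a single open only. */
  assert(!demo_env_is_open);
  int rc = pthread_atfork(NULL, NULL, demo_close_in_child);
  if (rc == 0)
    demo_env_is_open = 1;
  return rc;
}

/* Report whether the stand-in environment is open in the calling process. */
int demo_env_opened(void) { return demo_env_is_open; }

/* Fork and report what the child observed: 1 if the handle was still open in
 * the child, 0 if the handler had closed it, -1 on fork()/waitpid() failure. */
int demo_child_sees_open_env(void) {
  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit(demo_env_is_open ? 1 : 0);
  int status = 0;
  if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
    return -1;
  return WEXITSTATUS(status);
}
```

The parent keeps its handle untouched, while the child starts without one and has to open the database again in the regular way if it needs access.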
To actually get anything done, a transaction must be committed using mdbx_txn_commit(). Alternatively, all of a transaction's operations can be discarded using mdbx_txn_abort().
For read-only transactions, obviously there is nothing to commit to storage.
In addition, as long as a transaction is open, a consistent view of the database is kept alive, which requires storage. A read-only transaction that no longer requires this consistent view should be terminated (committed or aborted) when the view is no longer needed (but see below for an optimization).
There can be multiple simultaneously active read-only transactions but only one that can write. Once a single read-write transaction is opened, all further attempts to begin one will block until the first one is committed or aborted. This has no effect on read-only transactions, however, and they may continue to be opened at any time.
mdbx_get() and mdbx_put() respectively have no and only some support for multiple key-value pairs with identical keys. If there are multiple values for a key, mdbx_get() will only return the first value.
When multiple values for one key are required, pass the MDBX_DUPSORT flag to mdbx_dbi_open(). In an MDBX_DUPSORT database, by default mdbx_put() will not replace the value for a key if the key existed already. Instead it will add the new value to the key. In addition, mdbx_del() will pay attention to the value field too, allowing for specific values of a key to be deleted.
Finally, additional cursor operations become available for traversing through and retrieving duplicate values.
If you frequently begin and abort read-only transactions, as an optimization, it is possible to only reset and renew a transaction.
mdbx_txn_reset() releases any old copies of data kept around for a read-only transaction. To reuse this reset transaction, call mdbx_txn_renew() on it. Any cursors in this transaction can also be renewed using mdbx_cursor_renew() or freed by mdbx_cursor_close().
To permanently free a transaction, reset or not, use mdbx_txn_abort().
Any created cursors must be closed using mdbx_cursor_close(). It is advisable to repeat:
It is very rarely necessary to close a database handle, and in general they should just be left open. When you close a handle, it immediately becomes unavailable for all transactions in the environment. Therefore, you should avoid closing the handle while at least one transaction is using it.
The full C API documentation lists further details below, like how to:
Runtime | Repo | Author |
---|---|---|
Scala | mdbx4s | David Bouyssié |
Haskell | libmdbx-hs | Francisco Vallarino |
NodeJS, Deno | lmdbx-js | Kris Zyp |
NodeJS | node-mdbx | Сергей Федотов |
Ruby | ruby-mdbx | Mahlon E. Smith |
Go | mdbx-go | Alex Sharov |
Nim | NimDBX | Jens Alfke |
Lua | lua-libmdbx | Masatoshi Fukunaga |
Rust | libmdbx-rs | Artem Vorotnikov |
Rust | mdbx | gcxfd |
Java | mdbxjni | Castor Technologies |
Python (draft) | python-bindings branch | Noel Kuntze |
.NET (obsolete) | mdbx.NET | Jerry Wang |