Archive of libmdbx telegram group messages

8 June 2021

СО

11:27

Станислав Очеретный

Also, the writer thread calls int mdbx_mapresize(MDBX_env *env, const pgno_t used_pgno,
which acquires the lock in exclusive mode:
mdbx_srwlock_AcquireExclusive(&env->me_remap_guard);

11:28

Sorry if my explanation is muddled

Л(

11:28

Леонид Юрьев (Leonid Yuriev)

Yeah, there are roughly two options:
1. Go back to the one-thread = one-transaction paradigm, i.e. give up NOTLS.
2. Use a fixed-size DB.

Nevertheless, what you noticed is 50% a bug.
I need to either document this explicitly, or remove NOTLS, or replace SRW on Windows with something else.

СО

11:32

Станислав Очеретный

Thanks for the information, I'll think about it

Л(

11:35

Леонид Юрьев (Leonid Yuriev)

On Windows one has to use certain "crutches", because the system cannot fully extend mmap sections and cannot shrink them at all.
Accordingly, this rwlock was added to work around these system limitations.

AV

16:05

Artem Vorotnikov

@erthink
Assertion failed: (!"Invalid data-size"), function mdbx_cursor_put, file mdbx, line 14385.

https://github.com/vorot93/mdbx-issue-repro

16:05

if I remove DUP_SORT from the dbi, it works

Л(

16:06

Леонид Юрьев (Leonid Yuriev)

Saw it, I'll take a look this evening.

9 June 2021

AV

10:07

Artem Vorotnikov

In reply to this message

we async folks need NOTLS 🙂 if it goes away, there will simply be workarounds in the bindings / applications, as is already the case with write transactions

Л(

13:11

Леонид Юрьев (Leonid Yuriev)

In reply to this message

As far as I understand (so far), you are simply exceeding the maximum data length for DUPSORT.
The assert that fires checks a precondition on the parameters of a function called by the user, not the correctness of internal state.
Right after this assert there is a return MDBX_BAD_VALSIZE;.

Just in case, for understanding:
- data/values of DUPSORT tables are stored in nested b-trees, i.e. every key with multiple values has its own mini-DB inside.
- accordingly, the length of DUPSORT values is subject to roughly the same limits as the length of keys.
- see mdbx_limits_valsize_max()
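One way for client code to avoid the debug-build assert is to check the value length itself before calling put. The following is only a minimal sketch of that idea: the enum and the helper name are hypothetical, and the limit is assumed to have been obtained once from mdbx_limits_valsize_max() for the table's page size and flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch (these are not libmdbx's). */
enum put_check { PUT_OK = 0, PUT_VALUE_TOO_LONG = 1 };

/* Decide up front whether a value fits into a DUPSORT table.
 * `max_len` is assumed to come from mdbx_limits_valsize_max(). */
static enum put_check check_dupsort_value(size_t value_len, size_t max_len) {
  assert(max_len > 0); /* precondition of this helper itself */
  return value_len <= max_len ? PUT_OK : PUT_VALUE_TOO_LONG;
}
```

With such a check the application gets an ordinary error path for oversized values in both debug and release builds.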

AV

13:12

Artem Vorotnikov

Yes, but why an assert rather than an error?

13:13

an assert aborts the whole program, which is probably not great?

Л(

13:17

Леонид Юрьев (Leonid Yuriev)

In reply to this message

- this is knowingly invalid data in a debug build, and at some point it was convenient to catch errors in external/client code this way.
- this function is also called from inside libmdbx, and in that case the asserts are clearly needed.
- nobody complained...

Л(

14:33

Леонид Юрьев (Leonid Yuriev)

In reply to this message

https://github.com/erthink/libmdbx/issues/203

10 June 2021

Л(

14:09

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Please check/try the preliminary version of the fix in the devel branch on GitHub.

The essence of this fix is improved support for the MDBX_NOTLS mode:
- as much as possible has been done without replacing the system SRW with a custom implementation.
- read transactions do not acquire the shared SRW lock.
- when the DB size has to change, the mapping is not removed, which allows only growing the DB via NtExtendSection(), within the initially requested geometry.
- automatic shrinking of the DB does not work, but it will be done the next time the DB is opened.
- the geometry of an already open DB cannot be changed, but this can be done when the DB is opened by the first process.
- under Wine, automatic DB growth does not work either, since Wine does not implement NtExtendSection().

СО

15:07

Станислав Очеретный

Unfortunately, I can't check it in the near future, I'm on vacation until the end of June

11 June 2021

Л(

15:47

Леонид Юрьев (Leonid Yuriev)

off-topic:

Can anyone in Moscow help with testing LRDIMMs on a 2nd or 3rd generation Xeon Scalable?

I.e. I need a machine/server with a 2nd or 3rd generation Xeon Scalable, preferably with a large number of memory slots, to plug in LRDIMMs and run memtest at least for a little while.

?

13 June 2021

AV

22:46

Artem Vorotnikov

@erthink Clement has handed over the parent repository with the mdbx bindings, I'll publish it on crates.io soon

Л(

22:46

Леонид Юрьев (Leonid Yuriev)

In reply to this message

👍

14 June 2021

AV

19:28

Artem Vorotnikov

@erthink FYI
https://github.com/rpm-software-management/rpm/issues/958

15 June 2021

NK

07:42

Noel Kuntze

@erthink I completed writing all the python bindings for the relevant libmdbx functions and integrated it into cmake and the GNUmakefile.

07:42

Code will be put up sometime today.

07:42

It has tests for all relevant value passing between the bindings and the C code.

07:43

Would you be willing to eventually take over hosting of the code in your repo? That way it's all in one place. Also, the python code requires no compiling. It's just a sed-run over the .in file to fill in the LIBDIR var.

16 June 2021

Л(

02:40

Леонид Юрьев (Leonid Yuriev)

In reply to this message

👍
Unfortunately, I was very busy today.
I hope tomorrow I will find time to carefully consider the PR.

Preliminarily, I have a couple of comments:
1. By default the bindings may be disabled, but they should be explicitly enabled in CI builds.
2. If a binding lives in the same repo as the library, I think the same version number must be used for both the library itself and the bindings.

NK

18:39

Noel Kuntze

In reply to this message

1) That's fine, it's what I want to do anyway. :)
2) I agree with the majority of it. I think because the libraries do not need to be released in lock-step with the bindings (e.g. when the bindings get an urgent bugfix or something like that), the patch number should be allowed to diverge. E.g. libmdbx 2.4 with python bindings 2.4.1

NK

21:43

Noel Kuntze

2) is done

21:44

Building bindings is disabled by default.

17 June 2021

NK

18:14

Noel Kuntze

In reply to this message

Should be all done.

18:14

(except circleci and windows, I don't want to touch that powershell for that)

19 June 2021

Л(

01:05

Леонид Юрьев (Leonid Yuriev)

Thanks to @thermi, the Python bindings are mostly done.
Big thanks for the contribution!
This is a big enough first step to move on.

Essentially, these bindings make it possible to use libmdbx from Python.
However, they are not ready to merge into the master branch until the TODO list is empty (or at least mostly empty).

I will continue reviewing the code (so the TODO list will be updated) and will try to find some time to work on these TODO items.
But anyone is welcome to help with these tasks.
https://github.com/erthink/libmdbx/blob/python-bindings/python/TODO.md

NK

01:06

Noel Kuntze

@erthink You are very welcome. I am currently writing a comment on your TODO items.

01:06

Also, big thanks to cloud4job for financing the writing of the bindings.

21 June 2021

AV

16:18

Artem Vorotnikov

@erthink how safe is it to use an MDBX_txn in another MDBX instance?

Specifically, an instance from the Rust binding in a C++ project that builds libmdbx on its own

Л(

16:32

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I don't quite understand what you are doing:
- opening the same DB twice in one process is, in general, not allowed/a bad idea.
- using two instances of the library in one process (whether linked statically or dynamically) is not allowed either; otherwise you must fully understand, in detail, how Thread Local Storage Destructors will work in your particular case, and guarantee that the versions and build options of all library instances match exactly.
- in all other cases you simply have an MDBX_env, an MDBX_txn instance associated with it, and the libmdbx code, which can be called from Rust as well as from C/C++, Python, etc.

AV

16:34

Artem Vorotnikov

In reply to this message

The DB will be opened once, but the write transaction will be passed into the C++ library

16:34

inside the C++ library there are only reads and writes, no commits

16:36

specifically, we are currently writing two more Ethereum clients: Akula (Rust) and Silkworm (C++)

since Silkworm is already a year old, to speed up development Akula will pass the write transaction to Silkworm to run some sync stages that are not implemented yet

Л(

16:37

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Then it is necessary and sufficient to have a single instance of libmdbx.so.
Plus, of course, compliance with the API constraints/requirements (transactions/threads, etc.)

AV

16:38

Artem Vorotnikov

In reply to this message

the single instance is exactly the problem, because there are different build systems (CMake and Cargo), plus the Rust binding generator

16:38

there was an idea to simply point CMake in Silkworm at the MDBX source code inside the binding, but still build two static instances

Л(

16:40

Леонид Юрьев (Leonid Yuriev)

In reply to this message

That is not a problem, because there is only one way out = build and use a single libmdbx.so.
+ Everything else = utter trash and a trap you set for yourself, don't even think about it...

16:40

I'm off, will be back in a couple of hours.

Andrea Lanfranchi invited Andrea Lanfranchi

22 June 2021

AV

03:03

Artem Vorotnikov

@erthink a nested transaction can have another nested transaction as its parent, right?

Л(

03:06

Леонид Юрьев (Leonid Yuriev)

In reply to this message

The nesting depth can be greater than 1; the tests routinely go up to 5 levels inclusive.

AV

03:11

Artem Vorotnikov

thanks! I'll try writing execution on top of nested transactions


---- execution stage --\                                           /--\
                        \--block--\            /-\            /---/    \- block
                                   \--eth tx--/   \--eth tx--/

03:12

and how expensive is it to open/close a nested transaction?

right now Erigon processes several dozen blocks per second, which is several thousand Ethereum transactions

if we open a nested transaction for every block, and inside it open one for every Ethereum transaction, won't that be a big hot spot in the profiler?

@erthink

Л(

08:01

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Nesting is not free, it adds overhead:
- each nesting level has its own page lists: dirty, spilled, retired.
- and its own copies of the dirty pages in RAM.
- when looking up/reading a page, the engine has to scan the dirty and spilled page lists across the whole nesting stack: Sum(m=0..M) O(log(Nm))
- when a nested transaction commits, its dirty/spilled/retired page lists have to be sifted and merged with the lists of the parent transaction: Sum(m=1..M) Nm*O(log(Nm))
- where M is the number of nesting levels and Nm is the number of dirty/spilled/retired pages at the m-th nesting level.

Nested transactions are only needed if they sometimes have to be aborted rather than committed. Otherwise there is no point in them.
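To get a feel for the two sums above, here is a rough cost model in C. It is only an illustrative sketch: the function names are hypothetical, the results are abstract "steps" rather than measured time, and ceil(log2(N)) merely stands in for the O(log N) factor.

```c
#include <assert.h>
#include <stddef.h>

/* Integer ceil(log2(n)); returns 0 for n <= 1. */
static unsigned ceil_log2(size_t n) {
  unsigned bits = 0;
  if (n <= 1)
    return 0;
  for (n -= 1; n != 0; n >>= 1) /* count the bits of n-1 */
    bits++;
  return bits;
}

/* Lookup: the page lists of every level 0..M are probed.
 * pages[m] is Nm, levels is M+1. */
static size_t lookup_cost(const size_t *pages, size_t levels) {
  size_t cost = 0;
  assert(pages != NULL);
  for (size_t m = 0; m < levels; m++)
    cost += ceil_log2(pages[m]);
  return cost;
}

/* Commit of the nested levels 1..M: each list is sifted and merged
 * into the lists of its parent. */
static size_t commit_cost(const size_t *pages, size_t levels) {
  size_t cost = 0;
  assert(pages != NULL);
  for (size_t m = 1; m < levels; m++)
    cost += pages[m] * ceil_log2(pages[m]);
  return cost;
}
```

For example, with 1024, 256 and 16 tracked pages at levels 0, 1 and 2, this model gives 10 + 8 + 4 = 22 steps per lookup and 256*8 + 16*4 = 2112 steps to commit both nested levels.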

AV

08:03

Artem Vorotnikov

In reply to this message

Yes, an Ethereum transaction may be invalid, and then all of its changes have to be rolled back

08:05

In other clients (even in Erigon) rollback is implemented by hand via in-memory journaling, essentially a homemade WAL

08:06

we are actually the first to build a client on an MVCC database, so it is interesting to try applying this property here as well

08:08

In reply to this message

Or an invalid block may arrive, and that may only become clear after all of its Ethereum transactions have been processed and applied

Л(

08:18

Леонид Юрьев (Leonid Yuriev)

In reply to this message

But there is a flip side here: until the root transaction is committed, all changes exist only in RAM, i.e. if something goes wrong, the changes will be lost.

AV

08:19

Artem Vorotnikov

If a block with transactions is invalid, the whole block has to be rolled back completely

08:20

I.e. either a rollback through the whole homemade WAL, or simply an abort of the nested MDBX transaction
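The block / Ethereum-transaction rollback scheme discussed here can be illustrated with a toy model. This is not libmdbx code: every name below is hypothetical, and each level copies the whole state, whereas the real engine tracks per-level page lists. It only shows that aborting a nested level leaves the parent untouched, while committing folds the changes into the parent.

```c
#include <assert.h>
#include <string.h>

#define SLOTS 4     /* toy "database": four integer cells */
#define MAX_DEPTH 8 /* arbitrary nesting limit for this sketch */

struct toy_txn_stack {
  int state[MAX_DEPTH][SLOTS]; /* each level keeps its own copy of the data */
  int depth;                   /* index of the innermost open level */
};

/* Level 0 plays the role of the root write transaction. */
static void toy_init(struct toy_txn_stack *s) { memset(s, 0, sizeof(*s)); }

/* Open a nested level that starts from the parent's current view. */
static int toy_begin(struct toy_txn_stack *s) {
  if (s->depth + 1 >= MAX_DEPTH)
    return -1;
  memcpy(s->state[s->depth + 1], s->state[s->depth], sizeof(s->state[0]));
  s->depth++;
  return 0;
}

static void toy_put(struct toy_txn_stack *s, int slot, int value) {
  assert(slot >= 0 && slot < SLOTS);
  s->state[s->depth][slot] = value;
}

static int toy_get(const struct toy_txn_stack *s, int slot) {
  assert(slot >= 0 && slot < SLOTS);
  return s->state[s->depth][slot];
}

/* Abort: drop the innermost level; the parent never sees its changes. */
static void toy_abort(struct toy_txn_stack *s) {
  assert(s->depth > 0);
  s->depth--;
}

/* Commit: fold the innermost level's changes into its parent. */
static void toy_commit(struct toy_txn_stack *s) {
  assert(s->depth > 0);
  memcpy(s->state[s->depth - 1], s->state[s->depth], sizeof(s->state[0]));
  s->depth--;
}
```

In this model a block is one toy_begin() under the root, each Ethereum transaction is a further toy_begin() inside it, an invalid transaction ends with toy_abort(), and an invalid block ends with toy_abort() on the block level, which discards everything its transactions committed into it.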

AS

09:15

Alex Sharov

In reply to this message

Monero runs on LMDB, and Howard did a lot of the work there, so you can dig into it. My opinion: get off the ground with transactions; a journal/LRU can be slapped on top later. And not having such add-ons gives you access to the database's low-level features, things like PutReserve, cursor.Current(), … perhaps they will open new doors.

23 June 2021

СО

10:34

Станислав Очеретный

@erthink Good afternoon. I checked the develop version (Recursive use of SRW-lock on Windows cause by MDBX_NOTLS option. · Issue #203 · erthink/libmdbx). It works

10:35

@erthink But when building on Windows with MSVC 2019, this check fires:
//
#if (!defined(__cplusplus) || __cplusplus < 201103L) && \
!(defined( \
_MSC_VER) /* MSVC is mad and don't define __cplusplus properly */ \
&& _MSC_VER == 1900)
#error "C++11 or better is required"
#endif

10:35

#error "C++11 or better is required"

10:35

I had to comment it out

10:36

2. In the VERSION file I had to add two slashes before the version, otherwise the library would not build

10:37

externals\libmdbx\version(1): error C2059: syntax error: 'constant'
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\include\sal.h(2361): error C2143: syntax error: missing ';' before '{'

10:38

I am building from AMALGAMATED_SOURCE

10:39

This error occurs when compiling mdbx.c++

10:44

Most likely this is incorrect integration into my project

10:45

Because "clean" MDBX builds without errors

10:51

Regarding the #error: in my project __cplusplus = 199711L, in clean MDBX = 201705L

СО

11:35

Станислав Очеретный

Regarding (2. In the VERSION file I had to add two slashes before the version, otherwise the library would not build)
As it turned out, in mdbx.h++
#if __has_include(<version>)
#include <version>
#endif /* <version> */
the local VERSION file was being picked up instead of the system header

Л(

11:50

Леонид Юрьев (Leonid Yuriev)

Stanislav, these are Windows/MSVC problems:

1. By default MSVC works in a "compatibility" mode that violates the standard, setting an incorrect value of __cplusplus.
This is cured by adding a certain command-line option, and the MDBX CMake build scripts do that.
If you build the library yourself or without CMake, you need to add this option by hand, or locally patch this check in mdbx.
I will add a more detailed error message, but no fixes/workarounds for this situation will be added to mdbx.

2. Windows does not distinguish case in file names, and MSVC does not provide the <version> header file.
Apparently this did not show up earlier because before 19.29 the compiler did not support __has_include().
I'll think about what can be done; apparently I will simply have to avoid including <version> under MSVC.

СО

11:54

Станислав Очеретный

1. I build via CMake. I have an externals folder into which the libmdbx folder (AMALGAMATED_SOURCE) has been added
In that folder, the part that concerns MDBX:
set(MDBX_BUILD_SHARED_LIBRARY ON)
set(MDBX_BUILD_CXX OFF)
add_subdirectory(libmdbx)
set_target_properties( mdbx
PROPERTIES
ARCHIVE_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib"
LIBRARY_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib"
RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/bin"
)

11:55

Apparently the higher-level CMake makes some changes

Л(

11:55

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Hmm, that is strange then. I'll take a look.

СО

11:56

Станислав Очеретный

C++17 is enabled in my project

11:56

Maybe because of

11:56

that

Л(

11:57

Леонид Юрьев (Leonid Yuriev)

No, not because of that.
You need the /Zc:__cplusplus option

СО

11:58

Станислав Очеретный

Regarding VERSION: the VERSION file could be renamed to VERSION_MDBX and the CMake scripts adjusted

12:12

In reply to this message

[CMake] -- MDBX Compile Flags: /DWIN32 /D_WINDOWS /W3 /MDd /Zi /Ob0 /Od /RTC1 LIBMDBX_EXPORTS MDBX_BUILD_SHARED_LIBRARY=1

12:14

setup_compile_flags() is called in the if(NOT MDBX_AMALGAMATED_SOURCE) branch

Л(

12:20

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Looks like some kind of cut&paste bug.
I'll take a look today and fix it.

СО

12:27

Станислав Очеретный

setup_compile_flags() is called when SUBPROJECT = OFF
In my case SUBPROJECT = ON

Л(

13:53

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Then everything is correct.
If libmdbx is built not as a standalone project but as a "subdirectory", the compile options/flags of the main project will be used (and must be prepared accordingly).
In turn, you need to either fix up MSVC with the necessary options yourself, or (if you wish) include and use the cmake functions from the libmdbx cmake files.

СО

13:54

Станислав Очеретный

In reply to this message

ok

Л(

13:56

Леонид Юрьев (Leonid Yuriev)

In reply to this message

You can also include mdbx as an external project rather than a subdirectory.
Then the options will be formed independently (and perhaps I should rename SUBPROJECT to something like SUBDIR).

СО

13:59

Станислав Очеретный

Although in my view it would be logical to add /Zc:__cplusplus to the main project's flags when building MDBX, because the error that pops up is not at all obvious

Л(

14:00

Леонид Юрьев (Leonid Yuriev)

In reply to this message

As I wrote above, I will fix the error message, i.e. add an explicit mention of this option for the MSVC case.

СО

14:05

Станислав Очеретный

@erthink is it planned to ship the fixes (Recursive use of SRW-lock on Windows cause by MDBX_NOTLS option) in the next minor release?

Л(

14:10

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Yes, definitely.

But I also wanted to fix/rework https://github.com/erthink/libmdbx/issues/191, and that turned out to be harder than it seems.
I hope to manage it this week.

Deleted invited Deleted Account

24 June 2021

NK

21:06

Noel Kuntze

At what times are file descriptors for the DB file open? I'm asking because when doing compaction, after moving the new, now compacted, DB file over the old one, any open FDs still refer to the old file, not the new, compacted one

25 June 2021

NK

14:20

Noel Kuntze

@erthink The situation is not that easy to resolve because we potentially also need to take care of the case where the directory with the DB file in it was moved, so just caching the path isn't going to work in that case

14:21

e.g. mdbx_env_open() ... (some operation) ... (user moves DB dir or file to different path) ... mdbx_env_get_fd() or so

Л(

14:26

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Hmm, please read the API description.
There is no "in-place compaction", only "copy with optional compaction".
I.e. you should:
- call the API to make a DB copy with compaction.
- close the ENV.
- rename the DB files if reasonable and/or remove the previous non-compacted instance.
- open the ENV from the new DB file/path after the copy-with-compaction.

NK

14:27

Noel Kuntze

Yes, I know that. I saw many questions about (transparent) compaction and I'd like to try to implement that in the bindings as an abstraction

14:28

It's not always feasible/good to stop a long running process or so for runtime compaction

14:28

Stopping the process for compaction isn't good either, and opening the Env for every database access isn't good for performance either

Л(

14:37

Леонид Юрьев (Leonid Yuriev)

@thermi, briefly, you can't do this and shouldn't try to implement such an abstraction:
- this API is copy-aware, i.e. meant for making an online backup.
- you will get a consistent MVCC snapshot of the DB, but during copying other processes may commit new transactions.
- other processes may be working with the database, and they do not expect the database files to be deleted and replaced with new ones...

NK

14:37

Noel Kuntze

Yes, exactly that's why I asked when FDs are open

14:38

If FDs are not open, we can acquire a write TXN, do the stuff, and then abort the TXN to get rid of the FD

14:46

Also, how's work on mithrildb going?

8 July 2021

NK

22:54

Noel Kuntze

@erthink How can we proceed in getting the bindings into the master branch?

9 July 2021

Л(

10:53

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Please read the https://github.com/erthink/libmdbx/blob/python-bindings/python/README.md

NK

18:55

Noel Kuntze

Okay, I see. I will put work into providing it in another place then.

10 July 2021

Eugene Istomin invited Eugene Istomin

11 July 2021

Deleted invited Deleted Account

16:30

Deleted Account

indx_t mp_ptrs[]

this contains all the pointers to the MDBX_node correct?

Л(

16:31

Леонид Юрьев (Leonid Yuriev)

In reply to this message

in-page offsets to ones

16:32

Deleted Account

in-page offset to nodes?

Л(

16:32

Леонид Юрьев (Leonid Yuriev)

In reply to this message

yes

16:34

Deleted Account

thanks!

19:26

Deleted Account

in LMDB to find a page_ptr you do 64bit_value & 0xFFFFFFFFFFFF, in MDBX page_ptrs are 32bit, correct? do I need to mask it the same way to find page_ptr, and what's the mask value, if any?

Л(

19:33

Леонид Юрьев (Leonid Yuriev)

In reply to this message

The size of the pointer depends on the bit depth of the target platform.
The mask for getting the page address depends on the page size.

In LMDB, the page size is defined at library build time.
But in MDBX, the page size is determined by the geometry, which is set when the database is created.
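As an illustration of a mask that depends on the page size, the sketch below rounds an address down to the start of its page. It assumes a power-of-two page size and a page-aligned mapping; the helper name is hypothetical and is not part of libmdbx.

```c
#include <assert.h>
#include <stdint.h>

/* Base address of the page that contains `addr`, for a page size that is
 * only known at run time (after the database has been opened). */
static uintptr_t page_base(uintptr_t addr, uintptr_t page_size) {
  /* the mask trick below is valid only for power-of-two page sizes */
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  return addr & ~(page_size - 1);
}
```

The same address therefore yields different page bases for different page sizes, e.g. 0x12345 maps to 0x12000 with 4 KiB pages and to 0x10000 with 64 KiB pages.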

19:34

In reply to this message

But why do you need a page address?

19:37

Deleted Account

In reply to this message

I'm doing some work on mdbx-go based project, and they recently switched from LMDB to MDBX so I'm looking at the go code written for LMDB and trying to do the same thing for MDBX

19:39

so I'm reading a mdbx.dat file and trying to traverse the tree, so I can better understand the structure and details

Л(

19:41

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Take a look at mdbx_chk.c

AA

20:00

Alexey Akhunov

@erthink I have asked 0x0 to help produce the MDBX equivalents of the images I made for this doc: https://github.com/ledgerwatch/erigon/wiki/LMDB-freelist-illustrated-guide

13 July 2021

10:36

Deleted Account

is this how i find page number

return (pgno_t)(bytes >> env->me_psize2log);

10:40

if node is a SUB_DATABASE

Л(

10:48

Леонид Юрьев (Leonid Yuriev)

In reply to this message

if bytes is an offset inside DB file and/or mapping, then the bytes >> env->me_psize2log is a corresponding DB page number.
It's obvious, isn't it?

10:54

Deleted Account

thanks, so this is how i find page number... and psize2log is log2(pageSize), correct?

Л(

11:05

Леонид Юрьев (Leonid Yuriev)

In reply to this message

yes
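The relationship confirmed in this exchange can be written down as a small C sketch. The helper names are hypothetical; only the shift itself (bytes >> psize2log) comes from the conversation, and page numbers are taken to be 32-bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* log2 of a power-of-two page size, i.e. the shift called psize2log above. */
static unsigned psize_to_log2(size_t page_size) {
  unsigned shift = 0;
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  while (((size_t)1 << shift) < page_size)
    shift++;
  return shift;
}

/* Page number containing byte offset `bytes` of the DB file/mapping. */
static uint32_t offset_to_pgno(uint64_t bytes, unsigned psize2log) {
  return (uint32_t)(bytes >> psize2log);
}

/* Inverse direction: byte offset at which page `pgno` starts. */
static uint64_t pgno_to_offset(uint32_t pgno, unsigned psize2log) {
  return (uint64_t)pgno << psize2log;
}
```

With 4096-byte pages the shift is 12, so byte offset 8292 falls into page 2, which starts at offset 8192.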

11:07

Deleted Account

one more question please, if node is SUBDATA, data size is 48 bytes, first 8 bytes are offset to the root node, how about the rest of the bytes?

Л(

11:14

Леонид Юрьев (Leonid Yuriev)

> first 8 bytes are are offset to the root node

No.
There are no 64-bit offsets in the database structure, but only 16-bit offsets and 32-bit page numbers.

If node type is SUBDATA, then node data is struct MDBX_db.

11:16

Deleted Account

oh that makes so much sense right now, thanks!

14 July 2021

09:31

Deleted Account

Leonid, please, if node flag is BIGDATA, then what is the node's data?

09:57

Deleted Account

looks like i figured this out. the first ksize bytes are the key and the next sizeof(pgno_t) bytes are the page number

11:44

Deleted Account

how do I read sub-pages? looks like the MDBX_page struct isn't the right struct for it

11:50

I have a node with the DUPDATA flag, i was trying to read the header the same way as for regular pages... looks like it's not the right way

Л(

11:53

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Please read the code like an open book, but don't ask me to voice it.
In addition, there are several presentations by Howard Chu about the internal structure of the LMDB that are suitable for MDBX.

11:54

Deleted Account

understood

21 July 2021

Л(

16:40

Леонид Юрьев (Leonid Yuriev)

libmdbx v0.10.2 will be released soon

24 July 2021

AV

01:45

Artem Vorotnikov

a slightly unusual question

@erthink have you looked at btrfs for your tasks? what if, instead of a CoW database, one used a CoW file system, with files instead of keys?

Л(

13:16

Леонид Юрьев (Leonid Yuriev)

In reply to this message

btrfs will be several times slower, for the classic reasons:
- btrfs is geared toward different target use cases.
- mdbx has a single writer, which makes it possible to simplify a lot and greatly reduce overhead.
- etc.

26 July 2021

Л(

06:38

Леонид Юрьев (Leonid Yuriev)

https://github.com/erthink/libmdbx/releases/tag/v0.10.2

4 August 2021

18:59

Deleted Account

I've just started working on a custom KV store with Go and MDBX. Will I get better performance if I use smaller keys, e.g. 12 bytes instead of 16?

Л(

19:32

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Yes

Marin Ivanov invited Marin Ivanov

MI

19:52

Marin Ivanov

Thanks

5 August 2021

Л(

21:06

Леонид Юрьев (Leonid Yuriev)

off-topic: I have a request for those in Russia.

I need to borrow 8-16 Samsung M386AAG40MM2-CVF modules (128Gb ECC LRDIMM), or M386AAG40MMB-CVF, for a while.
The goal is to check their compatibility with the Supermicro X12DPi-NT6 motherboard, specifically when working together with Intel NMB1XXD1281S (Persistent Memory 200).

Supermicro technical support cannot give a substantive answer and simply suggests using M393AAG40M32-CAE, which are RDIMM, not LRDIMM.
But the point is that the recommended M393AAG40M32-CAE are, as expected, substantially slower: CL=26, tRCD=22, tRP=22 versus CL=21, tRCD=21, tRP=21 for the LRDIMMs.

If the check succeeds, I am ready to buy at least 8 modules right away.
I.e. the point is to avoid buying 1-2 terabytes just to "try it out".

C

21:07

CZ

In reply to this message

I would be glad to help if I understood any of this 🙂

11 August 2021

AV

17:56

Artem Vorotnikov

Can MDBX be used under WebAssembly, and what would that require?

Л(

19:07

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Hypothetically "yes", but in reality "no".

1. In practice, supporting each flavor of WebAssembly (in each browser and OS) amounts to supporting one more OS with its own specific capabilities and set of system calls, i.e. it is a large amount of uninteresting routine work.

2. WebAssembly is an "owl" that still needs a lot of patching up after being (regularly) stretched over the globe.
For example, support for mmap, IPC, etc. is needed.

3. And if one "doesn't bother" and builds some semblance of MDBX that is incompatible with the native version, and/or supports the DB only inside the WebAssembly sandbox, and/or lacks other features, then it is unclear why it would be needed.
It is easier to use something ready-made/suitable from the JS world.

16 August 2021

MI

22:53

Marin Ivanov

What about Wasm runtime + WASI?

23:03

You can expose syscalls to the Wasm program. I am not sure how mmap is handled, but I think it is.
However I think the major issue with the wasm runtime is the limited memory space, which has 32-bit addresses, thus limited to 4GB.

23:07

It would be easier to expose some function interface to the runtime and keep the library outside the wasm program.

Л(

23:15

Леонид Юрьев (Leonid Yuriev)

In reply to this message

AFAIK there is no support for a number of system services/calls (mmap, file locking, shared/IPC mutexes) which are required for MDBX.

Briefly, there is no way to implement a common API for both Windows and POSIX, simply because the Windows kernel is unable to do some things.
Moreover, the Windows limitations require the use of undocumented ntdll functions and a good deal of additional effort.

In general, it is possible, but quite time-consuming.
So I don't see any point in doing this.

MI

23:17

Marin Ivanov

Yep, I totally agree with you.

MI

23:45

Marin Ivanov

In reply to this message

Afaik, The Graph protocol are compiling AssemblyScript to wasm and they are exposing functions to save Ethereum event data into Postgres database. It sounds reasonable to implement the MDBX solution as an external service to the wasm programs as well.

23 August 2021

MI

14:41

Marin Ivanov

Ok, so I am running the ioarena bench, and lmdb and rocksdb finish in a few seconds, but mdbx is going for ages. I am using the defaults - 1M operations, 16 byte key, 32 byte value. The mdbx engine caps at 166 set requests per second, which is quite slow. Am I doing something wrong?

14:46

Deleted Account

IOARENA (embedded storage benchmarking)

configuration:
  engine       = mdbx
  datadir      = ./_ioarena
  benchmark    = set, get
  durability   = lazy
  wal          = indef
  operations   = 1000000
  key size     = 16
  value size   = 32
  binary       = no
  continuous   = no

key-gen: using 24 bits, up to 1000000 keys
doer.0: {set, get}, key-space 0, key-sequence 0
     time | bench      rps      min       avg       rms       max       vol           #N | bench      rps      min       avg       rms       max       vol           #N
    1.013 |   set:  43.326K 320.000ns  22.832us 383.898us  13.226ms   2.080Mbps  43.901K |   get        -        -         -         -         -         -           - 
    2.021 |   set: 171.708    2.423ms   5.820ms   6.096ms  12.861ms   8.242Kbps  44.074K |   get        -        -         -         -         -         -           - 
    3.027 |   set: 152.029    1.894ms   6.574ms   7.026ms  15.937ms   7.297Kbps  44.227K |   get        -        -         -         -         -         -           - 
    4.028 |   set: 161.890    2.102ms   6.173ms   6.607ms  19.476ms   7.771Kbps  44.389K |   get        -        -         -         -         -         -           - 
    5.029 |   set: 160.858    3.838ms   6.213ms   6.551ms  19.244ms   7.721Kbps  44.550K |   get        -        -         -         -         -         -           - 
    6.031 |   set: 168.611    2.483ms   5.927ms   6.272ms  18.173ms   8.093Kbps  44.719K |   get        -        -         -         -         -         -           - 
    7.033 |   set: 173.704    2.350ms   5.753ms   5.991ms  12.194ms   8.338Kbps  44.893K |   get        -        -         -         -         -         -           - 
    8.036 |   set: 158.412    2.120ms   6.308ms   6.794ms  20.175ms   7.604Kbps  45.052K |   get        -        -         -         -         -         -           - 
    9.041 |   set: 147.356    3.490ms   6.783ms   7.166ms  14.143ms   7.073Kbps  45.200K |   get        -        -         -         -         -         -           - 
   10.046 |   set: 170.121    3.426ms   5.876ms   6.349ms  20.710ms   8.166Kbps  45.371K |   get        -        -         -         -         -         -           - 
   11.050 |   set: 177.379    2.247ms   5.635ms   5.957ms  16.931ms   8.514Kbps  45.549K |   get        -        -         -         -         -         -           - 
   12.061 |   set: 171.976    1.877ms   5.813ms   6.158ms  15.371ms   8.255Kbps  45.723K |   get        -        -         -         -         -         -           - 
   13.070 |   set: 168.557    3.497ms   5.930ms   6.236ms  15.038ms   8.091Kbps  45.893K |   get        -        -         -         -         -         -           - 
   14.071 |   set: 162.772    3.579ms   6.140ms   6.445ms  19.071ms   7.813Kbps  46.056K |   get        -        -         -         -         -         -           - 
   15.084 |   set: 170.823    2.962ms   5.850ms   6.168ms  14.239ms   8.199Kbps  46.229K |   get        -        -         -         -         -         -           - 
   16.091 |   set: 172.821    2.097ms   5.783ms   6.159ms  15.177ms   8.295Kbps  46.403K |   get        -        -         -         -         -         -           - 
   17.100 |   set: 161.474    3.518ms   6.189ms   6.453ms  15.083ms   7.751Kbps  46.566K |   get        -        -         -         -         -         -           - 
   18.108 |   set: 165.651    3.503ms   6.033ms   6.308ms  13.021ms   7.951Kbps  46.733K |   get        -        -         -         -         -         -           - 
   19.116 |   set: 171.631    2.364ms   5.823ms   6.097ms  13.350ms   8.238Kbps  46.906K |   get        -        -         -         -         -         -           - 
   20.117 |   set: 171.853    1.796ms   5.816ms   6.169ms  12.736ms   8.249Kbps  47.078K |   get        -        -         -         -         -         -           - 
   21.120 |   set: 158.628    3.835ms   6.300ms   6.642ms  18.696ms   7.614Kbps  47.237K |   get        -        -         -         -         -         -           -

14:46

And here is lmdb with the same settings:

key-gen: using 24 bits, up to 1000000 keys
doer.0: {set, get}, key-space 0, key-sequence 0
     time | bench      rps      min       avg       rms       max       vol           #N | bench      rps      min       avg       rms       max       vol           #N
    1.001 |   set: 126.223K   5.027us   7.779us   8.250us 157.693us   6.059Mbps 126.327K |   get        -        -         -         -         -         -           - 
    2.001 |   set: 130.709K   6.296us   7.523us   7.693us  39.340us   6.274Mbps 257.089K |   get        -        -         -         -         -         -           - 
    3.002 |   set: 122.507K   6.370us   8.021us   8.467us 607.975us   5.880Mbps 379.647K |   get        -        -         -         -         -         -           - 
    4.002 |   set: 123.721K   6.498us   7.948us   8.166us 140.770us   5.939Mbps 503.420K |   get        -        -         -         -         -         -           - 
    5.003 |   set: 124.238K   6.475us   7.922us   8.123us 104.968us   5.963Mbps 627.714K |   get        -        -         -         -         -         -           - 
    6.003 |   set: 108.162K   6.604us   9.082us   9.517us 411.214us   5.192Mbps 735.927K |   get        -        -         -         -         -         -           - 
    7.394 |   set:  14.143K   6.634us  70.541us   8.493ms   1.191s  678.884Kbps 755.603K |   get        -        -         -         -         -         -           - 
    8.397 |   set: 109.924K   6.698us   8.953us  14.354us   2.771ms   5.276Mbps 865.879K |   get        -        -         -         -         -         -           - 
   10.577 |   set:  36.224K   6.652us  27.466us   5.277ms   1.483s    1.739Mbps 944.842K |   get        -        -         -         -         -         -           - 
   11.579 |   set:  55.083K   7.879us  12.203us 370.741us  86.997ms   2.644Mbps   1.000M |   get: 375.055K 347.000ns 768.000ns 874.000ns  31.745us   6.001Mbps 375.566K
   12.113 |   set        -        -         -         -         -         -           -  |   get:   1.168M 306.000ns 772.000ns 902.000ns 180.391us  18.690Mbps   1.000M
complete.

24 August 2021

Л(

12:26

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Briefly, this is because the "lazy" mode works differently in each engine and provides different guarantees for data durability.
In other words, it is impossible to ensure equal working conditions for different engines in the "lazy" mode, since the engines have too different feature sets.

So for comparing engines, it is better to use the less ambiguous "sync" and "nosync" ioarena modes.
For the "lazy" mode you should dig deeper to check exactly which options are used for each engine, and maybe change some flags/options to suit your case.

MI

12:54

Marin Ivanov

Thanks

Л(

14:10

Леонид Юрьев (Leonid Yuriev)

The release of v0.10.3 is planned for the end of this week.

26 August 2021

Максим Заикин invited Максим Заикин

28 August 2021

Л(

02:26

Леонид Юрьев (Leonid Yuriev)

https://github.com/erthink/libmdbx/releases/tag/v0.10.3

29 August 2021

Deleted invited Deleted Account

Thinker invited Thinker

31 August 2021

МЗ

19:34

Максим Заикин

Good afternoon. A question about mdbx_env_copy: unlike mdb_env_copy in LMDB, it requires a path that includes the database file name, not just a directory path. Is there a fundamental reason for this, or did it just turn out that way?

Л(

19:49

Леонид Юрьев (Leonid Yuriev)

In reply to this message

This is intentional. In particular, it removes the ambiguity of behavior depending on whether the MDBX_NOSUBDIR option was used when the source DB was created.

МЗ

19:54

Максим Заикин

In reply to this message

Then I would expect that a file name has to be given everywhere, but it turns out that in some ways of creating a database file there is a default name. I only recently learned about your, so to speak, product :)) The C# wrapper you link to on GitHub is, to put it mildly, outdated. I found Lightning.NET, but it is for LMDB; as far as I understand, it is more or less actively maintained and has a fairly large number of tests of the engine itself... I am trying to make it work with mdbx.

Л(

19:56

Леонид Юрьев (Leonid Yuriev)

> The C# wrapper you link to on GitHub is, to put it mildly, outdated
https://github.com/wangjia184/mdbx.NET/issues/3

19:58

In reply to this message

> Then I would expect that a file name has to be given everywhere, but it turns out that in some ways of creating a database file there is a default name

There are no ideal solutions.
The current implementation is a compromise between the need to keep API compatibility with earlier versions of libmdbx, ease of use, and consistency of the API.

МЗ

20:00

Максим Заикин

Perhaps so, if it weren't for the phrase :) MithrilDB will be radically different from libmdbx by the new database format and API

20:02

mdbx.net has not been updated since the end of 2018; it needs work to bring it up to your current version.

Л(

20:10

Леонид Юрьев (Leonid Yuriev)

In practice, most developers (libmdbx users) prefer to use the MDBX_NOSUBDIR option, i.e. to create new DBs without an additional directory. In turn, when opening a DB, libmdbx handles both variants automatically.

Furthermore, most developers (libmdbx users) found it more logical and convenient for the copy function to work the same way regardless of MDBX_NOSUBDIR in the source DB, and to interpret the argument as the name of the target file rather than the name of a directory.

There was an idea to add some "automation" here too, i.e. if pathname exists and is a directory, then enable the MDBX_NOSUBDIR behavior.
But no scenarios were found where this would be really needed or useful, so the current behavior was kept to avoid extra complexity.

МЗ

20:11

Максим Заикин

In reply to this message

OK, I see, the question is withdrawn.

Л(

20:11

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I don't see what the notice that MithrilDB will have a different DB format and API has to do with this.

20:11

In reply to this message

Yes, an issue has been filed there (I gave the link).

МЗ

20:13

Максим Заикин

In reply to this message

I mean that if someone uses the original engine and suddenly wants to use yours... then in most use cases the API will be roughly similar.

Л(

20:16

Леонид Юрьев (Leonid Yuriev)

In reply to this message

When moving from LMDB to libmdbx, yes.
But MithrilDB is something completely different, completely.

1 September 2021

МЗ

14:51

Максим Заикин

I keep studying the engine :)) I ran into a problem with mdbx_dbi_open that I don't quite understand: there is an MDBX_CREATE option, yet the database file is created on disk even if it is not set (the whole bitmask is 0). A second, related problem: the MDBX_DUPSORT option without MDBX_CREATE gives the MDBX_INCOMPATIBLE error.

Л(

16:27

Леонид Юрьев (Leonid Yuriev)

In reply to this message

1. You got confused, (apparently) because LMDB carries a historical terminological muddle:
- "environment" refers both to the DB on disk and to the context instance used for working with the DB on disk.
- "database" refers both to a single set of key-value pairs (aka "space" or "map" in the terms of many other key-value engines) and to "the whole DB" as a file on disk in the usual sense (which may contain several independent key-value sets).

Historically, this confusion goes back to the days of dbm and the first versions of Berkeley DB, where the whole DB was a single set of key-value pairs.
So you mixed up DB as "the whole DB, or environment" with DB as "a single key-value set, or map/space".

2. The DB file on disk is created by the mdbx_env_open() call, i.e. before mdbx_dbi_open() is called. To open an existing DB without creating a new one, pass the argument mode = 0.

3. The MDBX_CREATE option/flag is intended for use with mdbx_dbi_open(), not with mdbx_env_open().
The remaining elements of enum MDBX_db_flags_t define the key type and the map/multimap mode of the key-value set being created.
Accordingly, you get the MDBX_INCOMPATIBLE error when you try to open (get a dbi handle for) an existing key-value set while passing a different/incompatible set of options/flags to mdbx_dbi_open() (just in case, see MDBX_DB_ACCEDE).

4. There is a bit more confusion still...
A single "DB on disk", aka environment, can hold either several named sets of key-value pairs or a single unnamed one (you can get its handle by calling mdbx_dbi_open() with name = NULL).
Moreover, these modes can be combined to some extent.

5. If you feel like improving the documentation on this topic, please submit a pull request.
It is best done right away, while your eyes are still fresh.

МЗ

17:05

Максим Заикин

Yes, you are right (I apparently mixed things up while writing); I understand this concept.
But I still think there is a problem, namely with the dbi that is, so to speak, NULL (unnamed).
For a named dbi, if MDBX_CREATE is not set, I am honestly told that there is no such dbi (mdbx_dbi_open: (-30798) MDBX_NOTFOUND: No matching key/data pair found). A strange error description, apparently because of how it is implemented...

But with dbi(NULL), if MDBX_CREATE is not set, it opens fine, apparently because it already exists... If only MDBX_DUPSORT is given, the MDBX_INCOMPATIBLE error occurs, but with the options MDBX_CREATE | MDBX_DUPSORT it turns out that dbi(NULL) is re-created, although it actually already exists once the env is opened.

Л(

17:48

Леонид Юрьев (Leonid Yuriev)

In reply to this message

All such oddities are explained by how it is implemented, by the original shortcomings/defects of the API, and by their subsequent fixes.

Inside LMDB/MDBX there is always an unnamed key-value set (aka "@MAIN"), and the named key-value sets that are added live inside the main unnamed one as special key-value records.
Then it goes roughly like this:
- the map/multimap mode, key type, etc. are set when a key-value set is created;
- an attempt to open an existing key-value set with incompatible options returns MDBX_INCOMPATIBLE;
- if the MDBX_CREATE option is set and the target set is absent or empty, then changing the options is allowed.

However, the unnamed @MAIN always exists, and after the DB is created it has the default (zero) options.
So to set options other than zero, you need to call mdbx_dbi_open(NULL, MDBX_CREATE | XXX) at least once while the DB is empty.

In LMDB, as far as I remember, in many such cases you can break the database, or get a SIGSEGV or an unexpected error at any later moment.
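
Editor's sketch: the rules listed in this message can be written down as a tiny decision function. This is a toy model only, not libmdbx code: the `toy_*` names, the flag values, and the `toy_map` structure are all hypothetical and exist purely to illustrate the three rules and the special case of the always-present unnamed set.

```c
#include <stddef.h>

/* Toy model only: names and values are illustrative, not libmdbx's. */
enum { TOY_CREATE = 1, TOY_DUPSORT = 2 };
enum toy_rc { TOY_OK = 0, TOY_NOTFOUND = -1, TOY_INCOMPATIBLE = -2 };

typedef struct toy_map {
  int exists;     /* has this key-value set been created? */
  unsigned flags; /* options fixed at creation (without the CREATE bit) */
  size_t entries; /* number of stored key-value pairs */
} toy_map;

/* Models the open rules described in the message above. */
enum toy_rc toy_dbi_open(toy_map *m, unsigned flags) {
  unsigned wanted = flags & ~(unsigned)TOY_CREATE;
  if (!m->exists) {
    if (!(flags & TOY_CREATE))
      return TOY_NOTFOUND; /* a named set that was never created */
    m->exists = 1;         /* options are fixed at creation time */
    m->flags = wanted;
    m->entries = 0;
    return TOY_OK;
  }
  if (m->flags == wanted)
    return TOY_OK; /* same options: plain open */
  if ((flags & TOY_CREATE) && m->entries == 0) {
    m->flags = wanted; /* empty set + CREATE: options may be changed */
    return TOY_OK;
  }
  return TOY_INCOMPATIBLE; /* existing set, different options */
}
```

In this model the unnamed set starts as `{1, 0, 0}` (it exists, with zero options, and is empty), which reproduces the behavior reported above: opening it with only the DUPSORT bit is rejected, while CREATE plus DUPSORT succeeds as long as it is still empty.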

МЗ

18:10

Максим Заикин

OK, thanks for the explanation...

28 September 2021

Л(

18:38

Леонид Юрьев (Leonid Yuriev)

A simple benchmark that I ran in /dev/shm on an old laptop with an Intel i7-4600U fixed at 2 GHz.

$ make bench-mdbx_25000000.txt
// TIP: Use `make V=1` for verbose.
  RUNNING ioarena for mdbx/25000000...
 throughput: 124.021Kops/s
 throughput:  13.570Mops/s
 throughput:   1.138Mops/s

$ make bench-lmdb_25000000.txt 
// TIP: Use `make V=1` for verbose.
  RUNNING ioarena for lmdb/25000000...
 throughput: 108.858Kops/s
 throughput:  14.758Mops/s
 throughput:   1.211Mops/s

So, by this benchmark:

1) libmdbx is ~14% faster than LMDB in CRUD cases.
Such a speedup should be expected in most use cases (without taking into account the waiting time for disk operations).

2) libmdbx is ~8% slower than LMDB when iterating records sequentially.
This is expected, since libmdbx has more checks (for the correctness of arguments and of the database structure), and in this scenario these overhead costs become noticeable because the engine performs significantly fewer other actions.

However, in most real-world scenarios the difference will most likely be less noticeable, since most of the time will be spent waiting for disk operations.
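
Editor's sketch: the two percentages can be re-derived from the throughput lines quoted above. The mapping of the first line to the CRUD case and of the second line to sequential iteration is an assumption inferred from the numbers themselves; the helper names are hypothetical.

```c
#include <stdio.h>

/* Relative difference of `a` against baseline `b`, in percent.
 * A positive result means `a` is higher than `b`. */
double percent_diff(double a, double b) {
  return (a - b) / b * 100.0;
}

/* Prints the mdbx-vs-lmdb comparison for the figures quoted above. */
void print_comparison(void) {
  /* First throughput line, Kops/s (assumed to be the CRUD case). */
  printf("line 1: %+.1f%%\n", percent_diff(124.021, 108.858));
  /* Second throughput line, Mops/s (assumed to be sequential iteration). */
  printf("line 2: %+.1f%%\n", percent_diff(13.570, 14.758));
}
```

With these inputs the first line comes out at roughly +13.9% and the second at roughly -8.0%, consistent with the "~14% faster" and "~8% slower" statements.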

5 October 2021

AS

14:16

Alex Sharov

Hello.
I have a service with a high read-only RPS, written in a language with coroutines/actors, which means MDBX_NOTLS | MDBX_RDONLY.
And I see that beginning a read transaction sometimes takes hundreds of milliseconds.
I tried mdbx_txn_renew, and it also takes hundreds of milliseconds.
As I understand it, bind_rslot is expensive under high concurrency?
Can I try something to improve it (before I start thinking about having fewer read transactions)?

Л(

15:00

Леонид Юрьев (Leonid Yuriev)

In reply to this message

OK, could you estimate how much of those hundreds of ms is I/O waiting (paging, etc.)?

AS

15:29

Alex Sharov

If I create an endpoint which only begins/aborts a transaction and stress-test it, then:
- /proc/self/io: syscw=195/sec syscr=120/sec
- /proc/self/stat: Minflt=2K/sec Majflt=0/sec

Л(

21:27

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Transactions with MDBX_NOTLS | MDBX_RDONLY require locking/unlocking the "Readers Table" on each creation and finalization.
To avoid the malloc + lock/unlock + free overhead and reduce rdt-mutex contention, you should implement a pool of transactions of reasonable size (e.g. one txn per coroutine, etc.):
- when necessary (while there are no transactions in the pool), create transactions as usual;
- when completing a read-only transaction, decide whether to return it to the pool (the pool is not full) or not (the pool is full);
- when placing it in the pool, use mdbx_txn_reset(), which allows it to be reused; otherwise destroy the transaction as usual with mdbx_txn_abort();
- use mdbx_txn_renew() when retrieving a transaction from the pool.

P.S. This solution has already been described and explained several times ;)
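
Editor's sketch: the pooling scheme described above might look roughly like this. To keep the example self-contained, the `fake_txn_*` functions are hypothetical stand-ins for mdbx_txn_begin(), mdbx_txn_reset(), mdbx_txn_renew() and mdbx_txn_abort(); the pool structure, its capacity and all names are assumptions, and the sketch is not thread-safe (a real pool would need a lock or per-thread pools).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a read-only transaction handle. */
typedef struct fake_txn { int active; } fake_txn;

/* Stand-ins for mdbx_txn_begin/reset/renew/abort (illustration only). */
fake_txn *fake_txn_begin(void) {
  fake_txn *t = (fake_txn *)malloc(sizeof(*t));
  if (t) t->active = 1;
  return t;
}
void fake_txn_reset(fake_txn *t) { t->active = 0; } /* parked, reusable */
void fake_txn_renew(fake_txn *t) { t->active = 1; } /* fresh snapshot */
void fake_txn_abort(fake_txn *t) { free(t); }

#define POOL_CAPACITY 4 /* assumed size, tune to the workload */

typedef struct txn_pool {
  fake_txn *slots[POOL_CAPACITY];
  size_t count;
} txn_pool;

/* Take a transaction: renew a pooled one, or create one if the pool is empty. */
fake_txn *pool_acquire(txn_pool *p) {
  if (p->count > 0) {
    fake_txn *t = p->slots[--p->count];
    fake_txn_renew(t);
    return t;
  }
  return fake_txn_begin();
}

/* Give a transaction back: reset and park it, or destroy it if the pool is full. */
void pool_release(txn_pool *p, fake_txn *t) {
  if (p->count < POOL_CAPACITY) {
    fake_txn_reset(t);
    p->slots[p->count++] = t;
  } else {
    fake_txn_abort(t);
  }
}

/* Destroy all parked transactions, e.g. before closing the environment. */
void pool_destroy(txn_pool *p) {
  while (p->count > 0)
    fake_txn_abort(p->slots[--p->count]);
}
```

The point of the design is that the handle is allocated once and then only cycles between the reset and renewed states, so the per-request allocation and deallocation disappear.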

6 October 2021

AS

04:33

Alex Sharov

Yes, I have a pool and renew, and renew sometimes takes several hundred ms. The time grows under load: at low RPS it takes 1 microsecond.

AS

06:27

Alex Sharov

Actually, it looks like I'm wrong: after setting up better Grafana dashboards I don't see a correlation between RPS and txn_renew speed anymore.

AS

08:01

Alex Sharov

Yes, confirmed: I handled 150K RPS on an 8-core machine, and mdbx_txn_begin didn't slow down as RPS grew. It slows down only if there is too much concurrency in the app (context switches, scheduling, etc.). And my initial problem was too much concurrency.

Л(

08:51

Леонид Юрьев (Leonid Yuriev)

In reply to this message

👍😎

8 October 2021

Deleted invited Deleted Account

10 October 2021

Л(

13:26

Леонид Юрьев (Leonid Yuriev)

I decided not to delay the release and to do it today, for three reasons:
- the main change (a bug fix) is already sufficiently tested (about 3000 hours of the stochastic test in total);
- the other changes are transparent and cannot lead to regressions;
- for version coherence with libfpta.

b

13:26

basiliscos

👍

Л(

15:02

Леонид Юрьев (Leonid Yuriev)

https://github.com/erthink/libmdbx/releases/tag/v0.10.4

12 October 2021

NK

00:10

Noel Kuntze

Hey Leonid, it seems not all is fine and dandy in libmdbx land. With the new release, the smoke tests crash on armhf and armv7 (unknown for aarch64 and ppc64le) with Bus error (core dumped)

00:10

The pipeline is currently running here: https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/26344/pipelines#diff-content-08453e920742f5966eb201875a30c8b41a5ae7ca

00:11

This is more of an FYI and a heads-up to other users. It might be breakage introduced by changes in surrounding software (compilers, headers, ...).

Л(

00:32

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Hmm, this is very strange, but it is unlikely that the problem is in libmdbx:
- there are no changes that could cause such a regression;
- there is a success story with iOS after fixing pragma pack for modern Apple LLVM;
- make cross-qemu is still OK.

Nonetheless, please provide a full stack trace from your cases.

NK

00:33

Noel Kuntze

🤷

00:33

We'll see what we can get.

Viktor invited Viktor

Л(

14:58

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Oops, it seems this is a regression from the pragma pack fix.

Л(

16:30

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Confirmed.
This is my stupid mistake, sorry.
All architectures that require aligned access are affected.

A fix will be available promptly.
To avoid similar regressions in the future, I will add the UBSAN test to CI.

NK

16:42

Noel Kuntze

Ty. Maybe do some release testing on real hardware too, like Alpine does. I could ask them for privileges for you to do your testing there.

Л(

16:43

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Basically, qemu + UBSAN is enough.

NK

16:53

Noel Kuntze

Well evidently it isn't as we just saw

Л(

16:56

Леонид Юрьев (Leonid Yuriev)

In reply to this message

My mistake was that I didn't run the UBSAN test after the pragma pack fix.
Otherwise, the problem would have been identified immediately.

NK

16:56

Noel Kuntze

Ubsan found it?

Л(

16:57

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Yes, of course.

Л(

18:01

Леонид Юрьев (Leonid Yuriev)

In reply to this message

https://github.com/erthink/libmdbx/commit/6e6448559e8167a5d22967fdd7a924a37fc84e64

AV

20:42

Artem Vorotnikov

@thermi I wonder if it makes sense to build tarball in your AUR package, as opposed to just pulling libmdbx repo

20:42

I do that with Erigon, at least
https://aur.archlinux.org/packages/erigon

13 October 2021

NK

00:03

Noel Kuntze

No, I want to run all the tests.

15 October 2021

M

21:06

Mark

Hey guys. Will you be tagging another release for the error return fixes checked in after 0.10.5?

Л(

21:25

Леонид Юрьев (Leonid Yuriev)

In reply to this message

No.

Since it is an insignificant bug: in the worst case, and only if a database is damaged, instead of MDBX_PAGE_NOTFOUND or MDBX_CORRUPTED you will get 1 as the error code (EPERM on Linux, INVALID_FUNCTION on Windows, etc.).

I.e. in any case the transaction will be aborted, and checking the database with mdbx_chk will show its corruption.

21:27

Do you agree?

M

21:30

Mark

Well, for my application no. It's in an embedded device that is not maintainable by people easily. So I have code that looks for page corruption. If it happens I'm able to rebuild my database.

21:30

But having potentially wrong data in the application logic would be worse.

21:31

So for me that change is actually pretty important. Everywhere I process error codes I check for page corruption.

If not I can use a specific version. I asked because I would have liked something a little more official just for sanity

Л(

21:39

Леонид Юрьев (Leonid Yuriev)

In reply to this message

👍

16 October 2021

b

20:53

basiliscos

Is the following correct: I read values from a cursor (MDBX_val key, value), finished the R/O transaction, and only then started reading (deserializing) the obtained values. Will they remain valid if there were no new R/W transactions?

20:57

I would like to pass std::string_view instead of std::string to another thread (or threads), so that they can build my model there. Clearly, std::string_view is much lighter than std::string.

Л(

20:59

Леонид Юрьев (Leonid Yuriev)

In reply to this message

The data you read will remain valid if:
- no write transactions were running when your read transaction started;
- no new write transactions are started;
- no other processes open or close the DB, or change its geometry.

In all other cases there is a chance of reading garbage. And there is no reasonable way to find out about it, other than simply not finishing the read transaction.

b

21:00

basiliscos

Great, just what I need. I will ensure these conditions. Thanks!

МЗ

21:01

Максим Заикин

In reply to this message

And what is the problem with copying what the cursor returned into your own memory before closing the transaction?

Л(

21:01

Леонид Юрьев (Leonid Yuriev)

In reply to this message

std::string does not guarantee alignment. More precisely, the std::string implementation in clang's C++ library guarantees the absence of alignment.
So I suggest looking at mdbx::buffer from the C++ API.

21:05

In reply to this message

As an option, use mdbx::buffer and call mdbx::buffer::make_freestanding().
https://erthink.github.io/libmdbx/classmdbx_1_1buffer.html#aa80450b8539ee20ead1589d6f65daa7a

b

21:06

basiliscos

In reply to this message

I will have a full copy of the model on different threads. So either, as you suggest, read once and restore the model in the current thread, then serialize it again (into something like std::string), pass it on and deserialize (double memory use, slow). Or expose MDBX to the other threads and read there (violates the architecture). Or pass an R/O snapshot of the model wherever needed and deserialize there, which would be the fastest.

Л(

21:06

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I don't recommend it.
But as a last resort, I advise opening the DB in exclusive mode (the MDBX_EXCLUSIVE option/flag).

b

21:07

basiliscos

In reply to this message

Yes, great, I have only one access point, so the option suits me fine. Thanks for the advice.

Л(

21:07

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Use MDBX_NOTLS; then you will be able to use a read transaction from different threads.

МЗ

21:10

Максим Заикин

In reply to this message

Well, if a 2x memory allocation is that critical... then maybe, but as I understand it the core uses buffers and does not allocate new memory every time something needs to be read.

b

21:11

basiliscos

In reply to this message

Yes, it is critical, since effectively the whole DB will be read.

Л(

21:11

Леонид Юрьев (Leonid Yuriev)

If you reuse read transactions (mdbx_txn_reset() + mdbx_txn_renew()), the overhead will be roughly zero.

I.e. this is the right, reliable way, without hacks and extra overhead.

b

21:13

basiliscos

In reply to this message

Yes, I have NOTLS set. As I understood it, if there is one access point (r/w and r/o) from a single thread, this should remove the extra synchronization through thread-local storage.

Л(

21:17

Леонид Юрьев (Leonid Yuriev)

In reply to this message

It is somewhat "the other way around".

Without MDBX_NOTLS, every thread that starts a read transaction registers in the reader table, but this registration happens once per thread.

With MDBX_NOTLS, registration (and deregistration) in the reader table happens for every read transaction, but this can be avoided, together with memory allocation/deallocation, by using mdbx_txn_reset() + mdbx_txn_renew().

21:19

But overall, in libmdbx there is almost always a way to make readers lock-free.

МЗ

21:21

Максим Заикин

In reply to this message

To be honest, I did not get the idea; or rather, I understood that you want the core to read all data from the database file into memory, and then you will use what the core has allocated... Leonid will of course say more precisely, but it seems to me this is not possible... sorry for butting in (:

Л(

21:45

Леонид Юрьев (Leonid Yuriev)

In reply to this message

In MDBX, all data read is simply pointers to data residing in the MVCC snapshot inside the memory-mapped file.
Each read transaction reads the MVCC snapshot that was current at the moment it started, and fully pins it.

So while the read transaction is active, all data in the MVCC snapshot stays available and valid (unchanged).
Accordingly, instead of copying data it is almost always better not to finish the read transaction while the data is needed.
There is only one problem here: a read transaction stops garbage recycling (of old data versions), so difficulties may arise with a heavy stream of changes alongside long read transactions (see https://erthink.github.io/libmdbx/intro.html#long-lived-read).

However, in the case under discussion no write transactions are expected (or their number can be controlled).
So the most logical and correct way:
- do not copy data from the DB; simply do not finish the transaction while access to the data is needed;
- there will be no extra overhead, but there will be a full guarantee of reading consistent data.

17 October 2021

AV

11:40

Artem Vorotnikov

A modest question: could support for decrementing counters be added? Judging by the code, it is just a matter of making a call identical to mdbx_dbi_sequence with - instead of +

11:43

or change the type of the increment argument from u64 to i64

Л(

15:39

Леонид Юрьев (Leonid Yuriev)

In reply to this message

It is not a counter but a sequence generator. So it is forward-only by design.

Л(

18:46

Леонид Юрьев (Leonid Yuriev)

In reply to this message

If it is very, very much needed, one more function could be added (but it is better not to spoil it).

19 October 2021

Sproul invited Sproul

10:28

Deleted Account

Are there (up-to-date) bindings for C#? I saw mdbx.net, but let's say it has not been updated for three years, and the code quality is lacking in places.

Л(

11:05

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Nothing is known about any other bindings for C#.
If you decide to write them yourself, it is better to mirror the current C++ API.

11:41

Deleted Account

Ok

11:41

I will do it myself

11:41

Thanks

МЗ

13:22

Максим Заикин

In reply to this message

I am also writing my own; as a basis I took the C# code for mdbx from GitHub, and there is also some for LMDB.

27 October 2021

Mikhail K. invited Mikhail K.

10 November 2021

b

22:41

basiliscos

> MDBX_THREAD_MISMATCH: A thread has attempted to use a not owned object, e.g. a transaction that started by another thread

Argh! I opened a transaction in one thread and want to close it in another (so that pointers to the data read stay valid). MDBX_NOTLS helped me.

NK

22:58

Noel Kuntze

Can also happen if you have a problem with memory corruption. If you can, run with valgrind. That's quick and easy to check for that case.

23:00

In reply to this message

I see the problem now though. Gimme a second.

b

23:00

basiliscos

Thank you for the advice, valgrind shows all OK.

NK

23:02

Noel Kuntze

In reply to this message

I understand it's an issue with handling the data. How are you trying to handle the data? Remember that libmdbx returns pointers into the memory mapped page. It does not copy it beforehand. So you probably want to copy it after getting it from the database and then pass it on inside your application.

b

23:05

basiliscos

There was a discussion above in Russian. In short: I read the data as vector<std::string_view> (aka pointers), send them to different threads, and the last reader closes the R/O transaction. MDBX_NOTLS helped me; I just want to be sure that everything is OK.

NK

23:06

Noel Kuntze

I see.

AV

23:07

Artem Vorotnikov

In reply to this message

strictly speaking, not really; we just hacked together our own counters

23:08

so, the question: what was the original use case for a sequence generator?

b

23:08

basiliscos

In reply to this message

why?

AV

23:09

Artem Vorotnikov

In reply to this message

we need counters with decrement support

b

23:10

basiliscos

In reply to this message

reading (essentially marking up) the whole database, sending it out to the relevant threads, which then restore the model, and from there each works with its own copy.

23:11

In reply to this message

counters of what?

23:11

refcounters? Of what?

AV

23:13

Artem Vorotnikov

just a u64 counter bound to a table

23:13

in our case, the index of the last transaction in the blockchain

b

23:16

basiliscos

well, in my case it is basically simple: there is one transaction, it reads everything, then it gets closed (from another thread). No other write or read transactions are even started there.

11 November 2021

AS

04:09

Alex Sharov

In reply to this message

Don't close the transaction while the data is needed. Wait until the data is no longer needed. And close the transaction in the same thread where you opened it.

Ryan invited Ryan

R

06:45

Ryan

is libmdbx expected to handle point in time recovery snapshots? eg: persistent disk snapshots on google cloud?

06:45

(because I have a snapshot that wont open)

Л(

07:18

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Briefly:
- if the storage subsystem works correctly, then in MDBX_SYNC_DURABLE mode the write order guarantees the consistency of the DB.
- if you have a disk issue, or you got a system crash/failure in MDBX_UTTERLY_NOSYNC mode, then some data (pages) was simply never written and thus cannot be recovered.
- you can try the mdbx_chk tool, especially with the -0, -1, -2 options, to check the snapshots pointed to by each of the three meta-pages, and use the -t option to switch to the valid one with the highest transaction id.

R

07:19

Ryan

What mode does erigon use? Not sure myself...

Л(

07:19

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Please provide the output of mdbx_chk -vvv (from current master branch) for such DB.

R

07:20

Ryan

I already nuked the data, it was useless to me. next time if it pops up. is there a docker build for mdbx_chk ?

07:20

Dockerfile build at least

Л(

07:23

Леонид Юрьев (Leonid Yuriev)

Please follow https://github.com/erthink/libmdbx#building-and-testing or ask Erigon' team for support.

R

07:24

Ryan

mmm if I write a dockerfile would you merge it in?

Л(

07:25

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Hm, what for?

07:37

In reply to this message

Currently, there are no known cases or scenarios of database corruption due to errors in Erigon or in libmdbx, including due to incorrect/unsafe usage of the fragile modes of operation.
On the other hand, there are known incidents caused by RAM failures and BTRFS issues (maybe because of disk or RAID troubles).

R

07:37

Ryan

ok thanks for the info!

AS

07:39

Alex Sharov

In reply to this message

Erigon uses MDBX_SYNC_DURABLE.
I will try to add mdbx_chk to our dockerfile (there is no real problem, just some CI issue).
This chat is for MDBX users, they don't know about Erigon's details.

C

16:47

CZ

hey guys, have a quick question: one of our clients is trying to launch our daemon on a Raspberry Pi (Raspberry Pi 4 Model B Rev 1.4, width: 64 bits), and we get an error opening the database: (12)Cannot allocate memory.
Mdbx submodule linked to commit b7ed675

16:48

any ideas what that might be?

Л(

16:49

Леонид Юрьев (Leonid Yuriev)

In reply to this message

b7ed675 is an extremely obsolete version.
Please use the last release or current master branch.

C

16:53

CZ

thanks, let me do this

16:54

btw, we never had any problems with old version too

16:55

seems to be very stable

Л(

16:58

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Thank you, but several errors have still been found and fixed since then (most of them occur quite rarely and only in certain scenarios).

C

19:09

CZ

https://prnt.sc/1z60t2h

19:09

it is now reporting a problem with compiler compatibility, which is a bit confusing for me

19:12

i do have visual studio 2015 update 3

C

20:02

CZ

the compiler version is just a bit lower than needed

20:07

that is the latest update i could locate

20:07

https://prnt.sc/1z675ng

20:09

https://prnt.sc/1z67dto

20:09

any ideas where i can get 19.00.24234 ?

Л(

20:11

Леонид Юрьев (Leonid Yuriev)

The 19.0.24241.7 is the latest = https://ci.appveyor.com/project/leo-yuriev/libmdbx/builds/41487416/job/wt67qegg2tn4y94v#L49
Please ask M$ about the version differences and how to get builds with all fixes.

Л(

20:46

Леонид Юрьев (Leonid Yuriev)

In reply to this message

https://github.com/erthink/libmdbx/commit/79a5802ad4dcb8798b8b5aa7f2aa3dd6747696dd

20:52

In reply to this message

https://community.chocolatey.org/packages?q=Visual+Studio+2015

C

20:55

CZ

Thanks will try it!

C

22:45

CZ

Sorry for late messages, with the new codebase still the same problem: mdbx_env_open returns (12)Cannot allocate memory
on Raspberry Pi 4

22:46

any ideas why it might be happening?

Л(

22:50

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Most likely the DB's upper size limit is too large for the available RAM, i.e. the system is unable to allocate memory for PTEs.

Please use strace or mdbx_chk -vvv to clarify the causes of problems.

22:52

To begin with, just show the output of mdbx_chk -vvv for this DB run on your host system.

17 November 2021

Mauro invited Mauro

Л(

17:16

Леонид Юрьев (Leonid Yuriev)

In reply to this message

For general awareness:

In particular, the output of mdbx_chk -vvv for this DB contains:
- dynamic datafile: 12288 (12.00 Kb) .. 1099511627776 (1.00 Tb)
- current datafile: 2684313600 (2.50 Gb), 655350 pages

Thus this DB is simply too large for a Raspberry Pi 4, but the problems actually began when the database size reached ≈655350 pages.
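
The numbers in this report can be checked with a little arithmetic. The sketch below assumes a 4096-byte database page (which is what 2684313600 / 655350 gives) and, as a further assumption not stated in the chat, 4 KiB system pages with 8-byte last-level page-table entries, as is typical for 64-bit Linux. The function names are illustrative only. The result is an upper bound: kernels normally populate page tables lazily, so this does not by itself show where the "Cannot allocate memory" error came from.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the data file in bytes for a given page count and page size. */
static uint64_t datafile_bytes(uint64_t pages, uint64_t page_size) {
  return pages * page_size;
}

/* Worst-case bytes of last-level page-table entries needed to map
 * `mapped_bytes`, assuming one entry of `pte_size` bytes per system page. */
static uint64_t worst_case_pte_bytes(uint64_t mapped_bytes,
                                     uint64_t sys_page_size,
                                     uint64_t pte_size) {
  assert(sys_page_size > 0);
  uint64_t pages = (mapped_bytes + sys_page_size - 1) / sys_page_size;
  return pages * pte_size;
}
```

Under these assumptions, fully mapping the 1 TiB upper limit would need up to 2 GiB of last-level page-table entries, while the 2.5 GB actually in use needs about 5 MiB.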

C

17:18

CZ

I talked to the guy today; he says he will manage to give me console access to his device, so I can investigate it further if we really need it. For example, I can install gdb and see exactly where the error happens.

17:20

I'm wondering: this hardware has 8 GB of RAM and Linux, while our daemon (which works on lmdb and mdbx) runs perfectly well on VPS hosts with only 2 GB of memory. I tend to believe it is worth investigating a bit.

M

17:21

Mauro

Hi, I am evaluating libmdbx for a project where I need to have separate databases (or environments) for each user/account (since it is sensitive data and one corrupted DB should not affect other users) and would like to ask you a few questions I could not find an answer for in the docs:

1) I understand there is a limit on the number of databases that can be opened within an environment, but is there a limit on the number of environments that can be opened on the same thread?

2) Is it safe to open the same environment for reading or writing from multiple threads?

3) Would there be a performance impact if on each request the environment is opened, a database is queried/updated and then the environment closed? The reason I'm asking is because there might be around 100k accounts on the server and keeping all those environments open would be too much for MMAP.

4) I also need to store an inverted index to perform full text search. Does libmdbx perform prefix compression similarly to RocksDB? When RocksDB stores a key, it drops the prefix shared with the previous key, which helps reduce the space requirement significantly.

Thank you for your time and for creating this amazing DB!

Cheers.

Л(

17:30

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I expect/assume he has a model with 4 or 2 GB of RAM, with other software running and swapping disabled.
As I wrote before, you should prefer to use strace first rather than gdb.

C

17:30

CZ

ok, will do strace, thanks

AS

18:43

Alex Sharov

In reply to this message

Hi. 4. Prefix deduplication is called DupSort. 2. No parallel write transactions for one environment; everything else can be parallel. 1. No limit on environments: you can open as many environments as the OS allows.

M

18:52

Mauro

Thank you very much!

M

18:59

Mark

CZ: are they running the Raspberry Pi in 32-bit mode? Is your code compiled for 64 bits?

18:59

You can only map so much in a 32-bit address space

C

19:01

CZ

In reply to this message

He sent me this info regarding his system:
pi@raspiMB1:~ $ more /etc/os-release
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
[2:45 PM]
pi@raspiMB1:~ $ uname -a
Linux raspiMB1 5.10.63-v8+ #1459 SMP PREEMPT Wed Oct 6 16:42:49 BST 2021 aarch64 GNU/Linux

Л(

19:02

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I will answer your questions, but in order to avoid confusion, it is better to ask them one by one.

> 1) I understand there is a limit on the number of databases that can be opened within an environment, but is there a limit on the number of environments that can be opened on the same thread?

An environment is created/opened not for a thread but for a whole process.
There is no such explicit limit; the actual limit is determined by the available amount of RAM.
But it should be borne in mind that there will be some overhead for processing TLS (thread local storage) destructors during thread termination/finalization.

> 2) Is it safe to open the same environment for reading or writing from multiple threads?

No. The good rules are:
- non-const (i.e. read-write) access to an environment object is allowed strictly from a single thread at a time.
- non-const (i.e. read-write) access to a transaction object is allowed strictly from a single thread at a time.
- one transaction = one thread, with some exceptions for read-only transactions in an environment with the MDBX_NOTLS option.

> 3) Would there be a performance impact if on each request the environment is opened, a database is queried/updated and then the environment closed? The reason I'm asking is because there might be around 100k accounts on the server and keeping all those environments open would be too much for MMAP.

Yes, there will be a performance impact.
Roughly, I estimate that such a scenario will be 10 times slower than the traditional transaction-per-request approach.
Obviously, you should use a pool/cache of open instances/DBs to avoid performing an open-close cycle with each request.
Besides, MDBX is not very good in write-intensive scenarios (because it does not have a WAL), and additionally in your case the LCK file will be updated on each opening/closing of a DB.

> 4) I also need to store an inverted index to perform full text search. Does libmdbx perform prefix compression similarly to RocksDB? When RocksDB stores a key, it drops the prefix shared with the previous key, which helps reduce the space requirement significantly.

As @AskAlexSharov answered, MDBX has the MDBX_DUPSORT feature, which is extremely useful for inverted lists and secondary indices.
Briefly, you can have multi-million values associated with a particular key: such a key will be stored once and the values will be stored as a sorted searchable/iterable list in a nested B-tree.
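
The pool/cache of open instances suggested in the answer to question 3 could look roughly like the sketch below: a small fixed-capacity LRU cache keyed by account id. Everything here is hypothetical (the names, the capacity, the callback types); in a real program the open and close callbacks would presumably wrap mdbx_env_open() and mdbx_env_close(), and the demo_* stand-ins exist only to make the sketch self-contained. It is not thread-safe, and it does not check whether an evicted environment still has transactions in flight, which a real pool would have to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_CAPACITY 4 /* hypothetical; size it to what mmap/RAM allows */

typedef void *(*open_fn)(uint64_t account_id); /* would wrap the env open */
typedef void (*close_fn)(void *handle);        /* would wrap the env close */

struct pool_slot {
  uint64_t account_id;
  void *handle; /* NULL marks a free slot */
  uint64_t last_used;
};

struct env_pool {
  struct pool_slot slots[POOL_CAPACITY];
  uint64_t clock; /* logical time used for LRU ordering */
  open_fn open;
  close_fn close;
};

static void pool_init(struct env_pool *p, open_fn open, close_fn close) {
  for (size_t i = 0; i < POOL_CAPACITY; ++i)
    p->slots[i] = (struct pool_slot){0, NULL, 0};
  p->clock = 0;
  p->open = open;
  p->close = close;
}

/* Returns a cached handle; on a miss opens a new one, evicting the
 * least recently used entry when no slot is free. */
static void *pool_get(struct env_pool *p, uint64_t account_id) {
  struct pool_slot *victim = &p->slots[0];
  for (size_t i = 0; i < POOL_CAPACITY; ++i) {
    struct pool_slot *s = &p->slots[i];
    if (s->handle && s->account_id == account_id) {
      s->last_used = ++p->clock;
      return s->handle;
    }
    /* Prefer a free slot; otherwise remember the oldest occupied one. */
    if (victim->handle && (!s->handle || s->last_used < victim->last_used))
      victim = s;
  }
  void *h = p->open(account_id);
  if (!h)
    return NULL;
  if (victim->handle)
    p->close(victim->handle);
  *victim = (struct pool_slot){account_id, h, ++p->clock};
  return h;
}

/* Stand-ins for demonstration only: count calls instead of opening a DB. */
static int demo_opens, demo_closes;
static void *demo_open(uint64_t id) {
  ++demo_opens;
  return (void *)(uintptr_t)(id + 1);
}
static void demo_close(void *h) {
  (void)h;
  ++demo_closes;
}
```

With such a cache, only a miss pays for the open/close cycle (and the LCK file update mentioned above); repeated requests for a recently used account reuse the already opened handle.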

C

19:03

CZ

In reply to this message

raspimb1
description: Computer
product: Raspberry Pi 4 Model B Rev 1.4
serial: 10000000716f7b7e
width: 64 bits
capabilities: smp cp15_barrier setend swp tagged_addr_disabled
*-core
description: Motherboard
physical id: 0
*-cpu:0
description: CPU
product: cpu
physical id: 1
bus info: cpu@0
size: 1500MHz
capacity: 1500MHz
capabilities: fp asimd evtstrm crc32 cpuid cpufreq
*-cpu:1
description: CPU
product: cpu
physical id: 2
bus info: cpu@1
size: 1500MHz
capacity: 1500MHz
capabilities: fp asimd evtstrm crc32 cpuid cpufreq
*-cpu:2
description: CPU
product: cpu
physical id: 3
bus info: cpu@2
size: 1500MHz
capacity: 1500MHz
capabilities: fp asimd evtstrm crc32 cpuid cpufreq
*-cpu:3
description: CPU
product: cpu
physical id: 4
bus info: cpu@3
size: 1500MHz
capacity: 1500MHz
capabilities: fp asimd evtstrm crc32 cpuid cpufreq
*-memory
description: System memory
physical id: 5
size: 7632MiB
*-pci
description: PCI bridge
product: Broadcom Limited
vendor: Broadcom Limited
physical id: 0
bus info: pci@0000:00:00.0
version: 10
width: 32 bits
clock: 33MHz
capabilities: pci pm pciexpress normal_decode bus_master cap_list
resources: memory:600000000-6000fffff
*-usb
description: USB controller
product: VL805 USB 3.0 Host Controller
vendor: VIA Technologies, Inc.
physical id: 0
bus info: pci@0000:01:00.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress xhci bus_master cap_list
configuration: driver=xhci_hcd latency=0
resources: irq:48 memory:600000000-600000fff
*-usbhost:0
product: xHCI Host Controller
vendor: Linux 5.10.63-v8+ xhci-hcd
physical id: 0
bus info: usb@1
logical name: usb1
version: 5.10
capabilities: usb-2.00
configuration: driver=hub slots=1 speed=480Mbit/s
*-usb
description: USB hub
product: USB2.0 Hub
vendor: VIA Labs, Inc.
physical id: 1
bus info: usb@1:1
version: 4.21
capabilities: usb-2.10
configuration: driver=hub maxpower=100mA slots=4 speed=480Mbit/s
*-usb
description: Keyboard
product: USB Receiver
vendor: Logitech
physical id: 3
bus info: usb@1:1.3
version: 23.00
capabilities: usb-2.00
configuration: driver=usbhid maxpower=98mA speed=12Mbit/s
*-usbhost:1
product: xHCI Host Controller
vendor: Linux 5.10.63-v8+ xhci-hcd
physical id: 1
bus info: usb@2
logical name: usb2
version: 5.10
capabilities: usb-3.00
configuration: driver=hub slots=4 speed=5000Mbit/s
*-usb
description: Mass storage device
product: ASM225
vendor: ASMT
physical id: 2
bus info: usb@2:2
logical name: scsi0
version: 1.00
serial: 000000000001
capabilities: usb-3.10 scsi
configuration: driver=uas speed=5000Mbit/s
*-disk
description: SCSI Disk

19:03

In reply to this message

product: 2235
vendor: ASMT
physical id: 0.0.0
bus info: scsi@0:0.0.0
logical name: /dev/sda
version: 0
serial: 100000000000
size: 223GiB (240GB)
capabilities: partitioned partitioned:dos
configuration: ansiversion=6 logicalsectorsize=512 sectorsize=512 signature=99075c74
*-volume:0 UNCLAIMED
description: Windows FAT volume
vendor: mkfs.fat
physical id: 1
bus info: scsi@0:0.0.0,1
version: FAT32
serial: 54e3-79ce
size: 255MiB
capacity: 256MiB
capabilities: primary fat initialized
configuration: FATs=2 filesystem=fat label=boot
*-volume:1
description: EXT4 volume
vendor: Linux
physical id: 2
bus info: scsi@0:0.0.0,2
logical name: /dev/sda2
version: 1.0
serial: c6dd3b94-a789-4d57-9080-1472f721804b
size: 223GiB
capacity: 223GiB
capabilities: primary journaled extended_attributes large_files dir_nlink recover extents ext4 ext2 initialized
configuration: created=2020-08-20 13:58:06 filesystem=ext4 label=rootfs lastmountpoint=/ modified=2020-08-20 13:58:52 mounted=2021-11-11 14:33:28 state=clean
*-network:0
description: Ethernet interface
physical id: 1
logical name: eth0
serial: dc:a6:32:cc:ba:2f
size: 1Gbit/s
capacity: 1Gbit/s
capabilities: ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd 1000bt 1000bt-fd autonegotiation
configuration: autonegotiation=on broadcast=yes driver=bcmgenet driverversion=5.10.63-v8+ duplex=full ip=192.168.0.96 link=yes multicast=yes port=twisted pair speed=1Gbit/s
*-network:1 DISABLED
description: Wireless interface
physical id: 2
logical name: wlan0
serial: dc:a6:32:cc:ba:30
capabilities: ethernet physical wireless
configuration: broadcast=yes driver=brcmfmac driverversion=7.45.229 firmware=01-2dbd9d2e multicast=yes wireless=IEEE 802.11

M

19:05

Mark

CZ: The only thing that is relevant there is that the kernel is in fact 64-bit.

19:05

The next question is about your application. Is that compiled for 64-bit? You can use the file command on the binary
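
Besides running the file command, the same question can be answered from inside the program. The helper names below are made up for illustration; the sketch only reports the pointer width of the build and whether a mapping of a given size could fit into that build's address space at all.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width of a data pointer in bits for this build (32 or 64 on common targets). */
static unsigned pointer_bits(void) {
  return (unsigned)(sizeof(void *) * CHAR_BIT);
}

/* Non-zero if `bytes` cannot be addressed by this build, whatever the RAM size. */
static int exceeds_address_space(unsigned long long bytes) {
  return bytes > (unsigned long long)SIZE_MAX;
}
```

For example, the 1 Tb upper limit reported by mdbx_chk earlier in this thread would exceed the address space of any 32-bit build, even on a 64-bit kernel.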

M

19:06

Mauro

Thank you very much for the detailed answer.

19 November 2021

AS

18:21

Alex Sharov

Please remind me what MDBX_commit_latency.write measures (it measures the duration of mdbx_txn_write, but what does that do?). And what does it mean if this parameter takes > 10 seconds during commit (MDBX_commit_latency.sync = 1 sec, MDBX_commit_latency.gc = 1 sec)?

Л(

18:37

Леонид Юрьев (Leonid Yuriev)

In reply to this message

https://github.com/erthink/libmdbx/commit/773172cc99edb4f3c19c09b3863ac9f0737aff45

AS

18:54

Alex Sharov

Thanks.

Л(

22:23

Леонид Юрьев (Leonid Yuriev)

The JNI bindings for Java by Castor Technologies have been updated, i.e. they are no longer obsolete.
https://github.com/castortech/mdbxjni

Many thanks to Alain Picard (Chief Strategy Officer of Castor Technologies Inc) for this job.

22 November 2021

S

09:41

Sproul

I've been working on getting @vorot93's mdbx-rs bindings working on Windows and I noticed a test failure related to mdbx_cursor_get. In a few of the mdbx-rs tests we look up a key in an empty database, which results in ERROR_HANDLE_EOF (38) on Windows but MDBX_NOTFOUND on Unix. We could work around this in the Rust bindings, but it seems like it should be fixed on the MDBX side?

This is the PR containing the failing tests (`cargo test` triggers it): https://github.com/vorot93/mdbx-rs/pull/6

And thank you for MDBX by the way, I'm using it in place of LMDB and loving the garbage collection and improved use of free space 😊

Л(

11:32

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I'll check this out.
Thanks for reporting.

Л(

15:45

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I haven't found any flaws in the behaviour of libmdbx, and I can't confirm an mdbx_cursor_get() issue yet.

Basically, the difference between MDBX_NOTFOUND and MDBX_ENODATA is:
- MDBX_NOTFOUND is returned for an unsuccessful but valid search operation, or similarly for a cursor move operation from a valid state.
- MDBX_ENODATA is returned for a get or move operation which is not valid for the current cursor state.

For instance, MDBX_ENODATA will be returned for the MDBX_GET_CURRENT operation on an empty DB, or for MDBX_NEXT when a cursor is in the end-of-file state.

---

However, perhaps the reason is that MDBX_ENODATA depends on the system:
- MDBX_ENODATA == ENODATA on systems that define it natively (Linux, but not *BSD);
- MDBX_ENODATA == ERROR_HANDLE_EOF on Windows;
- MDBX_ENODATA == -1 on *nix without ENODATA in errno.h;
- MDBX_ENODATA == 9919 from the Clang C++ headers on *nix without ENODATA in errno.h;

I will think about how to improve this situation, but libmdbx will still return system-dependent error codes in some cases (since libmdbx is not an abstraction/virtualization layer for system error codes).
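
For bindings, one way to cope with the system-dependent code is to classify return codes instead of comparing against a single constant. The sketch below is self-contained and uses placeholder macros that only mirror the list above (the 9919 case is left out); the value -30798 for "not found" is an assumption here, and real bindings should use MDBX_NOTFOUND and MDBX_ENODATA from mdbx.h rather than these stand-ins.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins, NOT the real mdbx.h definitions. */
#define SKETCH_NOTFOUND (-30798)
#if defined(_WIN32)
#define SKETCH_ENODATA 38 /* numeric value of ERROR_HANDLE_EOF */
#elif defined(ENODATA)
#define SKETCH_ENODATA ENODATA
#else
#define SKETCH_ENODATA (-1)
#endif

enum lookup_result { LOOKUP_OK, LOOKUP_ABSENT, LOOKUP_ERROR };

/* Map a raw return code to what a binding usually cares about. */
static enum lookup_result classify(int rc) {
  if (rc == 0)
    return LOOKUP_OK;
  if (rc == SKETCH_NOTFOUND || rc == SKETCH_ENODATA)
    return LOOKUP_ABSENT;
  return LOOKUP_ERROR;
}
```

Whether "no data" and "not found" should really be merged is a design choice for the binding: the two codes describe different situations, as explained above, so some callers may want to keep them apart.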

23 November 2021

S

00:27

Sproul

Awesome, I wasn't aware of MDBX defining MDBX_ENODATA. I've updated the Rust bindings to handle that error and the tests are passing. Thanks!

Л(

00:30

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Take a look at the devel branch (https://github.com/erthink/libmdbx/tree/devel).
There are some related changes.

25 November 2021

Л(

20:27

Леонид Юрьев (Leonid Yuriev)

The v0.11.2 release is planned before the start of winter.

26 November 2021

AI

23:03

A I

Hello. I need a custom key comparator. In the current code, the functions that take a custom key comparator are marked as deprecated. Could you tell me the best way to use a custom key comparator in libmdbx?

Л(

23:55

Леонид Юрьев (Leonid Yuriev)

In reply to this message

https://github.com/erthink/libmdbx/commit/e912f87b2afe2970f1e661a5ecaa129fe9f9ee36

27 November 2021

Л(

00:07

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Nonetheless, the right way is not to use custom comparison functions, but instead to convert the keys to one of the forms suitable for the built-in comparators (for instance, take a look at the Value-to-Key functions).
https://erthink.github.io/libmdbx/group__value2key.html
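
To illustrate the idea of converting a value into a key that a built-in byte-wise comparator orders correctly, here is a generic, well-known encoding for signed 64-bit integers. It is a sketch of the general technique with made-up function names, not the implementation behind the Value-to-Key functions linked above, whose output format may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Encode a signed 64-bit integer into 8 bytes whose memcmp() order
 * equals the numeric order: flip the sign bit, then store big-endian. */
static void key_from_i64(int64_t value, unsigned char out[8]) {
  uint64_t u = (uint64_t)value ^ UINT64_C(0x8000000000000000);
  for (int i = 7; i >= 0; --i) {
    out[i] = (unsigned char)(u & 0xff);
    u >>= 8;
  }
}

/* Compare two integers through their encoded keys, the way a plain
 * lexicographic comparator would see them. */
static int cmp_encoded(int64_t a, int64_t b) {
  unsigned char ka[8], kb[8];
  key_from_i64(a, ka);
  key_from_i64(b, kb);
  return memcmp(ka, kb, 8);
}
```

Flipping the sign bit moves negative numbers below positive ones, and big-endian storage puts the most significant byte first, so no custom comparison function is needed for such keys.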

Л(

17:28

Леонид Юрьев (Leonid Yuriev)

Current addresses for donations/sponsorship:
ETH => 0x19291d8658f762f3baceae1700c0b9466572ceab
BTC => 152u2KXNWWGHQS3qiBEoQaveWyPvaSWYGC
Rubles => https://sobe.ru/na/libmdbx
USD => https://paypal.me/erthink

Rgds.

AV

19:43

Artem Vorotnikov

Looks like some Chinese guy squatted mdbx name on crates.io so I will have to rename the bindings

Л(

19:45

Леонид Юрьев (Leonid Yuriev)

In reply to this message

try the libmdbx

AV

21:01

Artem Vorotnikov

In reply to this message

probably will, sorry for being an idiot and not publishing on crates.io 🙄

1 December 2021

Simon C. invited Simon C.

SC

12:52

Simon C.

Hey, is MithrilDB still under development? Will it replace MDBX? Will it be possible to upgrade without losing data? Thanks 🙌

Л(

13:11

Леонид Юрьев (Leonid Yuriev)

In reply to this message

> is MithrilDB still under development?
Yes, but it is for Positive Technologies first, and it will be opened after the corresponding PT products are released.

> Will it replace MDBX?
No.

> Will it be possible to upgrade without losing data?
No, of course not.
But it will be possible to export+import data and migrate to a new API.

SC

13:16

Simon C.

👍

3 December 2021

Л(

01:22

Леонид Юрьев (Leonid Yuriev)

https://github.com/erthink/libmdbx/releases/tag/v0.11.2

6 December 2021

terry invited terry

t

16:04

terry

According to MVCC, if no one reads the data, the data file will keep growing. How can I solve this if I just want to overwrite the data? Thanks

Л(

16:31

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I did not understand your question, because in the described situation there is no reason for the size of the database to increase.

t

18:12

terry

i will try again tomorrow

SC

19:02

Simon C.

@erthink are there plans to support encryption like lmdb (mdb.master3)

Л(

19:04

Леонид Юрьев (Leonid Yuriev)

In reply to this message

No, since it requires changing the database format, which is frozen for libmdbx.
MithrilDB will support such a feature.

SC

19:13

Simon C.

Sounds good. Do you have an appropriate release estimation for Mithril?

Л(

19:15

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Sorry, no details until the release.

7 December 2021

SC

13:30

Simon C.

"Currently a moderate number of slots are cheap but a huge number gets expensive: 7-120 words per txn"

Does every db used in a txn require 7-120 words or every db in the env?

Should named dbs be avoided for best performance (for example by using a key prefix instead)

Л(

13:39

Леонид Юрьев (Leonid Yuriev)

In reply to this message

This depends on the number of records, the size of the keys (including the prefix length) and the number of named subDBs, if they are used.

In general, the bad ideas for performance are:
- using a large number (100 or more) of subDBs.
- using a key prefix instead of a subDB for a large number (10^7 or more) of items.

SC

13:58

Simon C.

Thanks a lot! Libfpta uses a really high number of maxdbs right (not entirely sure, can't speak Russian 😅)

Is there a reason / trick why it does that?

Л(

14:02

Леонид Юрьев (Leonid Yuriev)

In reply to this message

libfpta does this since users require it.
On the other hand, libfpta uses some tricks to reduce the overhead (e.g. the cache of dbi-handles: https://github.com/erthink/libfpta/blob/master/src/dbi.cxx#L50-L305, etc)

SC

14:05

Simon C.

Thank you! That helps a lot

8 December 2021

AV

11:40

Artem Vorotnikov

In reply to this message

what if mdbx_env stored opened dbi handles in string => dbi hashmap?

9 December 2021

Л(

22:35

Леонид Юрьев (Leonid Yuriev)

In reply to this message

This would just add more overhead, but you can follow this approach in your own bindings/wrapper.

11 December 2021

?

16:05

󠃲

In reply to this message

In general, the bad ideas for performance:
 - using a large number (100 and more) of subDB.
 - using key-prefix instead of subDB for large number (10^7 and more) of items.

huh
should we bother with subDBs with much more than 10^7 items, but with only 8 byte keys each (not counting the data)?
what is the magnitude of perf degradation are we talking about?

Л(

16:13

Леонид Юрьев (Leonid Yuriev)

In reply to this message

No. It was about something else:
- for large numbers of items a subDB should be used instead of adding a key prefix.
- conversely, for many "bushes" of ~10-100 items each, a key prefix is preferable to many subDBs.
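
For the second case (many small "bushes"), a key prefix can be as simple as a fixed-width bush id placed in front of the item key. The sketch below is only an illustration with hypothetical names and a hypothetical 4-byte prefix width; it shows that with a big-endian prefix, plain lexicographic ordering keeps all items of one bush adjacent.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build "prefix || item", where a 4-byte big-endian prefix identifies the bush.
 * Returns the total key length, or 0 if the output buffer is too small. */
static size_t make_prefixed_key(uint32_t bush_id, const void *item,
                                size_t item_len, unsigned char *out,
                                size_t out_cap) {
  if (out_cap < 4 || item_len > out_cap - 4)
    return 0;
  out[0] = (unsigned char)(bush_id >> 24);
  out[1] = (unsigned char)(bush_id >> 16);
  out[2] = (unsigned char)(bush_id >> 8);
  out[3] = (unsigned char)bush_id;
  if (item_len)
    memcpy(out + 4, item, item_len);
  return 4 + item_len;
}

/* Plain lexicographic comparison; on a common prefix the shorter key sorts first. */
static int lex_cmp(const unsigned char *a, size_t alen,
                   const unsigned char *b, size_t blen) {
  size_t n = alen < blen ? alen : blen;
  int c = n ? memcmp(a, b, n) : 0;
  if (c)
    return c;
  return (alen > blen) - (alen < blen);
}
```

A range scan over one bush then starts at the key consisting of the prefix alone and stops at the first key whose first four bytes differ.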

?

16:25

󠃲

In reply to this message

we do talk about the same thing :)

but 10^7 doesn't sound like much, I thought millions upon millions (w/ short key length) was a norm

Л(

20:11

Леонид Юрьев (Leonid Yuriev)

In reply to this message

What was meant was not a significant degradation, but a slight slowdown relative to the alternative.

Nonetheless, these are very rough estimates, since the costs of starting a transaction are being compared with the costs of searching or updating data.
So the subDB approach will always beat key prefixes in the case of long/large transactions with many lookups and/or updates.

?

20:16

󠃲

understood
thanks

12 December 2021

SC

21:18

Simon C.

I found the libmdbx binaries to be about 30% larger than lmdb on all platforms. That's a problem for mobile use-cases where size is important.
Are the added features the reason for the size increase? Are there compile options that improve binary size?

Л(

21:40

Леонид Юрьев (Leonid Yuriev)

In reply to this message

There are four reasons for this:
- features were added;
- some originally universal functions and code fragments were cloned with specialization, for both clarity and speed;
- a lot of checks (for arguments and preconditions) were added;
- some tricks were used (__hot and likely/unlikely macros, etc.) which allow the compiler to optimize more aggressively, and that in many cases leads to increased code volume (e.g. through inlining).

21:44

If you want to minimize the amount of executable code, then there is a well-known good recipe: -Os, a static library, and LTO (link-time optimization) for both _libmdbx_ and your app or bindings.

21:47

The next-level trick is using PGO (profile-guided optimization), where the profile is gathered from real app usage.

13 December 2021

SC

00:21

Simon C.

Will try that, thanks for the explanation and ideas 🙌

AS

05:51

Alex Sharov

About AugmentLimit:
As I understand it now, it only stops the iteration and allocates new space for the GC record. But it doesn't break down a large GC record (one that hit AugmentLimit) into smaller pieces.
The problem is: if one record in the GC hit AugmentLimit once, it will likely hit it in each new write transaction (if the record is big enough), and the DB size will grow to infinity.

Is it possible to handle the "hitting AugmentLimit" event somehow (for example, by breaking down a big GC record into smaller ones), so that it totally prevents hitting AugmentLimit again for the same GC record? This would rule out the "DB grows to infinity" scenario.
I am willing to reduce AugmentLimit (ours is too high now), but I am afraid of facing "DB grows to infinity".

Л(

16:09

Леонид Юрьев (Leonid Yuriev)

In reply to this message

No, these are different/independent scopes of logic and code.

MDBX_opt_rp_augment_limit affects only the fetching of records from the GC and the growth of the current write transaction's reclaimed_pglist in RAM inside mdbx_page_alloc().
But then, during transaction commit, a large reclaimed_pglist, as well as a large retired_pages list, will be chunked by me_maxgc_ov1page inside mdbx_update_gc() if free GC ids are available to push such records into the GC.

So if you have trouble in some case(s), I need a reproducible scenario.
On the other hand, the behaviour for extremely huge page lists cannot be fixed perfectly because of fundamental drawbacks inherited from LMDB.
Only MithrilDB (will) solve this issue completely, through a streaming mode for large objects and some tricks that beat "roaring bitmaps" for these use cases.
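
As a rough picture of what "chunked" means here: a long list of page numbers is split into records that each hold at most a fixed number of entries. The sketch below shows only that arithmetic, with a hypothetical per_record parameter standing in for me_maxgc_ov1page; the real mdbx_update_gc() is considerably more involved (it also depends on available GC ids, as noted above) and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Number of records needed to store `total` page numbers when one
 * record holds at most `per_record` of them. */
static size_t chunk_count(size_t total, size_t per_record) {
  assert(per_record > 0);
  return total / per_record + (total % per_record != 0);
}

/* Number of page numbers in the i-th record (0-based) of that split. */
static size_t chunk_len(size_t total, size_t per_record, size_t i) {
  assert(i < chunk_count(total, per_record));
  size_t rest = total - i * per_record;
  return rest < per_record ? rest : per_record;
}
```

For example, 2500 page numbers with a cap of 1000 per record would give three records holding 1000, 1000 and 500 entries.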

14 December 2021

AS

05:18

Alex Sharov

Thank you, and sorry. Actually, I again described the issue incorrectly. My problem is: I hit some "** restart: " corner case, where the pglist is full and needs to grow, but it grows in small steps and I get many "** restart: " log lines, and this is probably what makes the commit slow. I will create a GitHub issue with more useful details today.

15 December 2021

?

13:06

󠃲

lmdb had this behavior where it would put values exceeding some limit in overflow pages and pin them -- not move them around upon gc
is this missing in mdbx? can't seem to find anything in either docs or src except for a single mention of ms_overflow_pages in the header

Л(

13:29

Леонид Юрьев (Leonid Yuriev)

git grep overflow ?

?

13:45

󠃲

well there are mentions of both "hard pages" and "overflow pages"
also there's this, core.c: /* LY: add configurable threshold to keep reserve space */
so the question is, what is the actual behavior?
valsize_max does some calculations based on max num of normal pages...
I guess we can deduce max num of hard pages from this
but what is the threshold at which the value goes into overflow page?
lurking at mdbx_cursor_put there's me_leaf_nodemax, but what is its value?

Л(

13:48

Леонид Юрьев (Leonid Yuriev)

Some clarification:
- due to MVCC, any used pages will be COW'ed on update, i.e. when values are updated;
- LMDB will not convert an overflow/large node back to a usual one even if the new value doesn't require an overflow page, but mdbx does this;
- other behavior is the same, except that mdbx allows larger keys "out of the box".

?

13:50

󠃲

so there is no pinning?

Л(

13:53

Леонид Юрьев (Leonid Yuriev)

In reply to this message

con si con sa.
A value on overflow page(s) is "pinned" (i.e. not COW'ed) when neighboring keys/values are updated.

?

13:57

󠃲

hmm, well

Л(

14:04

Леонид Юрьев (Leonid Yuriev)

In reply to this message

?

14:07

󠃲

In reply to this message

con si con sa on you

but the edited last message answered the question
good

Л(

18:44

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Well, so we had a talk ;)

?

19:37

󠃲

yep

20 December 2021

Dmitry Savonin invited Dmitry Savonin

DS

00:46

Dmitry Savonin

Hey. I have an issue with libmdbx in erigon, it crashes on mdbx_pnl_check. Do u know why it might happen or what can I do with it?

Assertion failed: ((pl)[1]) < limit (mdbx: mdbx_pnl_check: 6368)
SIGABRT: abort
PC=0x7f3cf16ad3f2 m=37 sigcode=18446744073709551610
signal arrived during cgo execution

It happens after very long opened and heavy cursor. Database size is 4tb+.

Л(

03:22

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Hmm, this looks like database corruption caused by a bug in the code or by a hardware failure.
Please contact Erigon's support first to ensure you are using the latest release, etc.

P.S.
As currently understood, errors in the code are quite unlikely, since we have not seen such problems for quite a long time.
On the other hand, we have received some reports of problems caused by hardware failures.
Nevertheless, all such reports will be analyzed.
This will (most likely) require the database file, a core dump, and a scenario constructed to reproduce the problem.

AL

18:52

Andrea Lanfranchi

In reply to this message

Hi, I'm from Erigon team. Could you please provide some further detail about the environment and the chain you are syncing ?
Very first hint is your DB (4TB+) is totally out of scale (no chain actually requires that space)

18:52

Please refer to our discord channel so we don't pollute this place.

DS

19:43

Dmitry Savonin

In reply to this message

This is BSC chain

https://github.com/ledgerwatch/erigon/pull/3144

AL

19:51

Andrea Lanfranchi

In reply to this message

Have you already reported the issue on Erigon's discord channel ? (Maybe #testing is the appropriate place)

21 December 2021

DS

11:13

Dmitry Savonin

In reply to this message

I've setup new machine with clean environment and after 1 day of sync it showed this error:

failed MdbxKV cursor.Next(): mdbx_cursor_get: MDBX_PAGE_NOTFOUND: Requested page not found

I think the nature of this error is the same as above.

Erigon uses this Go binding (https://github.com/torquem-ch/mdbx-go) that is running libmdbx v0.10.4

11:22

In reply to this message

It seems it was upgraded to v0.11.1 that is almost the latest known

AS

11:33

Alex Sharov

In reply to this message

Here is an example of how to check your RAM: https://github.com/ledgerwatch/erigon/issues/2777#issuecomment-935578878

22 December 2021

b

09:31

basiliscos

MDBX_MAP_FULL: Environment mapsize limit reached. Could you advise what to do about this?

AS

11:00

Alex Sharov

mdbx_stat -a

11:00

What to do next depends on which sub-database has eaten all the space

b

11:04

basiliscos

Status of Main DB
  Pagesize: 4096
  Tree depth: 3
  Branch pages: 3
  Leaf pages: 95
  Overflow pages: 43
  Entries: 4914

what is wrong here?

AS

11:05

Alex Sharov

In reply to this message

Are you hacking together some quick MVP? If so, just raise the database size limit.

11:07

Via geometry: https://erthink.github.io/libmdbx/usage.html#autotoc_md52

b

11:11

basiliscos

I don't quite understand. I don't seem to set any limits anywhere.

mdbx_env_create(&env);
auto flags = MDBX_WRITEMAP | MDBX_COALESCE | MDBX_LIFORECLAIM | MDBX_EXCLUSIVE | MDBX_NOTLS;
auto r = mdbx_env_open(env, db_dir.c_str(), flags, 0664);

so, apparently, the default limits are not what I need :-/

11:12

but yes, 2-3 transactions go through there: the first is a few hundred bytes, and the second is already a few megabytes. Still, I expected it to sort itself out automatically; the data is not that large

МЗ

11:15

Максим Заикин

In reply to this message

you should set the sizes via mdbx_env_set_geometry

b

11:16

basiliscos

thanks. I'll dig through the docs at my leisure to see how to make it "unlimited"

Deleted invited Deleted Account

SC

16:27

Simon C.

If I remove an entry, what happens to other cursors in the same txn pointing at or near that entry?
Only the behavior for the current cursor is documented.

Л(

16:42

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Briefly, the most that you can expect, as in the STL (std::) and other DBs with proper cursors:
- cursor(s) positioned before or after the deleted record remain valid and unchanged.
- cursor(s) at the deleted position go into a "deleted" state, but can be moved to the previous or next record if one is available.

SC

16:43

Simon C.

Perfect, thanks!

b

19:10

basiliscos

Once more, for the dummies (like me). I read the docs for mdbx_env_set_geometry(), and the defaults suit me fine; as I understood it, by default the database should grow if it suddenly runs out of space. At least, that is how I understood "automatic size management". What do I need to do to avoid getting MDBX_MAP_FULL during operation (inserts into the DB)?

МЗ

19:11

Максим Заикин

In reply to this message

by default, it seems to me, it is not dynamic

b

19:14

basiliscos

> For instance, the default for database size is 10485760 bytes.

found this. Quite OK, if it grows when needed

Л(

19:14

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Historically, the default behaviour of libmdbx (without calling mdbx_env_set_geometry()) matches the default behaviour of LMDB (and the maximum size there is smaller).
Simply put, to avoid the hassle and not depend on the default behaviour, it is better to call mdbx_env_set_geometry() explicitly.
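
The difference between a fixed-size default and an explicitly configured dynamic geometry can be pictured with a toy model. This is not libmdbx code and the struct fields are hypothetical names that only loosely mirror the idea of a current size, an upper limit and a growth step; the 10485760-byte figure is the default database size quoted from the docs above, and the "map full" outcome stands in for MDBX_MAP_FULL.

```c
#include <assert.h>
#include <stdint.h>

struct toy_geometry {
  uint64_t size_now;    /* current datafile size, bytes */
  uint64_t size_upper;  /* hard upper limit, bytes */
  uint64_t growth_step; /* growth increment, bytes; 0 means "never grow" */
};

enum { TOY_OK = 0, TOY_MAP_FULL = 1 };

/* Try to make room for `needed` bytes in total, growing in whole steps
 * and never beyond the upper limit. */
static int toy_ensure(struct toy_geometry *g, uint64_t needed) {
  if (needed <= g->size_now)
    return TOY_OK;
  if (needed > g->size_upper || g->growth_step == 0)
    return TOY_MAP_FULL;
  uint64_t steps = (needed - g->size_now + g->growth_step - 1) / g->growth_step;
  uint64_t grown = g->size_now + steps * g->growth_step;
  g->size_now = grown < g->size_upper ? grown : g->size_upper;
  return TOY_OK;
}
```

In this model a database whose upper limit equals its current size fails as soon as a few megabytes more are needed, which resembles the behaviour reported above, while one with headroom and a non-zero growth step simply grows.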

b

19:17

basiliscos

damn, I see.

> In contrast to LMDB, the MDBX provide automatic size management of an database according the given parameters, including shrinking and resizing on the fly. From user point of view all of these just working.

then the docs need to be corrected. Otherwise, from the text above, one gets the impression that the "no hassle" variant is exactly what works by default.

Л(

19:20

Леонид Юрьев (Leonid Yuriev)

In reply to this message

PR(s) are welcome ;)

SC

20:48

Simon C.

Sorry for asking a lot of questions 😅 Can you reset a transaction in one environment and renew it in another one? E.g. use a common txn pool for multiple environments?

AL

22:11

Andrea Lanfranchi

In reply to this message

A transaction is mostly a pointer to a database view: a lightweight structure whose members are initialized from the current state of the environment at the moment the transaction starts. There is no reuse of storage space whatsoever.
So I don't see the point in having a pool of transactions to pull from.

23 December 2021

Л(

14:37

Леонид Юрьев (Leonid Yuriev)

In reply to this message

> ... E.g. use a common txn pool for multiple environments?

No, of course not.

1. The actual size of a txn object depends on the environment, i.e. on the configured maximum number of DBI-handles.
2. With MDBX_NOTXN the txn object is bound to a slot of the environment's reader table.

14:51

In reply to this message

Minor addition/clarification:
Pre-allocation and re-use of r/o transactions and cursors makes sense to ensure lock-free operation (mostly wait-free for libmdbx), including the avoidance of malloc()/free() overhead.
So such an approach is necessary for real-time scenarios, including the ability to start r/o txn(s) and use cursor(s) from interrupt context, IRQL > PASSIVE, signal handlers, and so on.

25 December 2021

t

12:21

terry

Merry Christmas! Given full ACID compliance through MVCC and CoW: if data is constantly written but not consumed, will the amount of data keep increasing? How can this problem be solved?

Л(

12:22

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Please RTFM https://erthink.github.io/libmdbx/intro.html#long-lived-read

t

12:27

terry

Got it, thanks. When will MithrilDB be ready?

Л(

12:39

Леонид Юрьев (Leonid Yuriev)

In reply to this message

No announcements and/or promises, but it will be available inside Positive Technologies' product(s) first.

t

13:05

terry

👍

28 December 2021

Л(

00:26

Леонид Юрьев (Leonid Yuriev)

There is some progress in resolving https://github.com/erthink/libmdbx/issues/254.
If no deviations are noticed during the next three days of testing, then release v0.11.3 will be out on New Year's Eve.

Л(

00:42

Леонид Юрьев (Leonid Yuriev)

After that, over the New Year holidays, I hope to implement:
https://github.com/erthink/libmdbx/issues/224, https://github.com/erthink/libmdbx/issues/223, https://github.com/erthink/libmdbx/issues/210, https://github.com/erthink/libmdbx/issues/204.

Then, as a second priority:
https://github.com/erthink/libmdbx/issues/193, https://github.com/erthink/libmdbx/issues/192, https://github.com/erthink/libmdbx/issues/115, https://github.com/erthink/libmdbx/issues/124

Together this removes the obstacles to releasing libmdbx 1.0, i.e. to fully freezing the API.
Feedback is welcome: who finds this interesting and/or needs it.

00:50

🤔 invited 🤔

?

13:13

🤔

Hello! I'm currently facing the following segfault:

what I do:

mdbx_env_create(...);
mdbx_env_set_mapsize(...);
mdbx_env_set_maxdbs(..., 8);
mdbx_env_open(...);

mdbx_txn_begin(...); // with flags=0
mdbx_dbi_open(...); // with name & with MDB_CREATE
mdbx_txn_commit(...); // segfaults here!

More details:

  * frame #0: 0x0000000001e3dc87 mdbx_cursors_eot(txn=0x0000621000003d00, merge=1) at mdbx.c:8575:28
    frame #1: 0x0000000001e3c8d2 mdbx_txn_commit(txn=0x0000621000003d00) at mdbx.c:10606:5

    frame #0: 0x0000000002d029f6 mdbx_cursors_eot(txn=0x0000621000003d00, merge=1) at mdbx.c:8575:28
   8572
   8573   for (i = txn->mt_numdbs; --i >= 0;) {
   8574     for (mc = cursors[i]; mc; mc = next) {
-> 8575       unsigned stage = mc->mc_signature;
   8576       mdbx_ensure(txn->mt_env,
   8577                   stage == MDBX_MC_SIGNATURE || stage == MDBX_MC_WAIT4EOT);
   8578       next = mc->mc_next;

Here there are 3 cursor entries (I guess that is 2 for the core dbs and 1 for the newly added one).
The first two are just NULLs and the third one is a gibberish value, hence it segfaults trying to read that.

This issue disappears sometimes (mostly when I try to debug in gdb :D).

Л(

13:20

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Are you sure you are using a current version, i.e. the latest release?

13:23

@ololocat, please show the output of ./mdbx_chk -V

?

13:24

🤔

just checked, that is actually:

/* MDBX version 0.7.0, released 2020-03-18 */

13:25

is it a known issue that was fixed? or are there so many things changed, so I might not face the issue in the latest release? :)

Л(

13:28

Леонид Юрьев (Leonid Yuriev)

In reply to this message

We've come a long way in two years.
Please just switch to the latest release first.

?

13:30

🤔

ok, will try the latest one and get back to you. thanks and have nice holidays!

Л(

13:30

Леонид Юрьев (Leonid Yuriev)

In reply to this message

👌👍

?

15:01

🤔

In reply to this message

ok, upgraded to the latest version aaand it seems that now I can't repro the original issue, thanks!

btw, a side question: is mdbx_dbi_open really lightweight? can it be used for any new transaction?

Л(

15:15

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Historically this is a "gray area" inherited from LMDB:
1) Opened DBI handles remain available for subsequent transactions, i.e. you do not need to call mdbx_dbi_open() for every transaction.
2) But(!) an opened DBI handle immediately becomes invalid for all transactions when it is closed/dropped explicitly, or when a new transaction starts on a recent MVCC snapshot in which the corresponding named subDB was dropped by another process.

?

15:16

🤔

👌 ty

Deleted invited Deleted Account

19:25

Deleted Account

Hi really amazing work with libmdbx! Congrats on this ultra fast solution. With this key-value store, we notice the data seems to get moved around to align on 4-byte boundaries, however for ARM64 we need to align to 8-byte boundaries for our use case (when updating data, the ptr to this data is potentially shared with multiple threads that need 64-bit atomic increment and we're getting bus errors on ARM64). Any ideas on how we might be able to achieve this?

Л(

20:55

Леонид Юрьев (Leonid Yuriev)

In reply to this message

It seems you have wrong assumptions and intend to do bad things ;)

Basically there are very few, limited cases when you can change data inside the DB directly through a pointer:
- your process runs the write transaction;
- AND you got the data's address using the MDBX_RESERVE put option or by the last get-operation from a DB opened with the MDBX_WRITEMAP option;
- AND such an address is valid (can be used to read/modify data) ONLY until the next modify operation AND while the write transaction is running (neither committed nor aborted).

And since write transactions in libmdbx are strictly serialized (protected by a shared POSIX mutex), it makes no sense to change data placed in the DB using atomic operations.

21:22

Deleted Account

Yes, I thought you might say that! I believe each of those conditions is indeed met but there is still a need for ultra fast atomic operations for what we do while the write transaction is still uncommitted. The POSIX mutex wrapping the write operation doesn't help in this case as multiple threads need to update atomically during this phase and the last thing we want is an additional mutex to slow operations between the threads during the write commit. I'm not the main person working on this so let me get back to you.

Л(

21:57

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Ok, you have been warned.

So about alignment, there are two major points:
- on the one hand: the binary format of libmdbx is frozen, thus you can get some unique properties only with your own fork that breaks compatibility with the mainstream;
- on the other hand: you may rely on current format properties, especially the fact that the MDBX_node structures are even-byte aligned and are placed at the end of (non-MDBX_DUPSORT_FIXED) pages inside a DB.

See https://github.com/erthink/libmdbx/blob/master/src/internals.h#L1445-L1470

Thus you can achieve 8-byte alignment for values by satisfying these preconditions:
- the length/size of each node (4-byte header, key's bytes, value's bytes) is a multiple of 8;
- the offset of each value inside the node is a multiple of 8, i.e. the size of the key plus the 4-byte node header is a multiple of 8;
- no overly large values, i.e. no overflow/large pages.
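The two size preconditions above can be sketched as a small check. This is only an illustration, not libmdbx code: the function name is hypothetical, and the header size is a parameter because the 4-byte figure quoted above is questioned later in this thread and should be verified against the libmdbx sources. The third precondition (no overflow/large pages) is not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of libmdbx: checks the two size
 * preconditions from the message above for a single in-page node.
 * header_size is an assumption supplied by the caller. */
bool value_is_8byte_aligned(size_t header_size, size_t key_len,
                            size_t value_len) {
  const size_t value_offset = header_size + key_len; /* value offset in node */
  const size_t node_size = value_offset + value_len; /* total node length */
  return value_offset % 8 == 0 && node_size % 8 == 0;
}
```

For example, with a 4-byte header a 12-byte key and a 16-byte value pass the check, while an 8-byte key does not.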

29 December 2021

18:17

Deleted Account

Thank you! I'm still a little confused as to where the 4-byte alignment is coming from. Is MDBX_struct what you mean by the header? That struct appears to be 8 bytes? (32bit + 8bit + 8bit + 16bit)

Л(

20:16

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Oh, oops, I'm sorry.
I confused the actual libmdbx nodes with other experimental/internal versions.

31 December 2021

Л(

10:52

Леонид Юрьев (Leonid Yuriev)

@vorot93, please pull the current master branch into Akula.
Thanks to @AskAlexSharov and Erigon resources for remote debugging, we can expect up to 100x better performance when committing huge transactions.

🔥

AV

10:53

Artem Vorotnikov

In reply to this message

Thanks, will check it out when I’m back from vacation in January

Л(

11:03

Леонид Юрьев (Leonid Yuriev)

@fuunyK, thanks for support!
libmdbx v0.11.3 will be released today.
So I expect the January versions of Erigon and Akula to be even faster.

b

11:26

basiliscos

👍👍

Л(

12:45

Леонид Юрьев (Leonid Yuriev)

https://github.com/erthink/libmdbx/releases/tag/v0.11.3

👍

AV

G

Л(

18:06

Леонид Юрьев (Leonid Yuriev)

https://opennet.ru/56438-libmdbx

🎉

AV

3 January 2022

Deleted invited Deleted Account

SC

00:38

Simon C.

Does libmdbx mmap the current size of the database or the maximum size?

I was under the impression that it only mmaps the current size (or a little more) but looking at the code it seems to mmap the maximum size.
If that's true, it's a big problem for iOS devices...

AS

04:56

Alex Sharov

In reply to this message

Lmdb does mmap maximum size, mdbx does mmap according to set_geometry. Worst case “current_size+growth_step”.

👍

SC

5 January 2022

Л(

22:16

Леонид Юрьев (Leonid Yuriev)

@vorot93, please move Akula to the current master.
Following a tip from @AskAlexSharov, GC recycling has been fixed for the case when records from ultra-large transactions land there.

The bug and the fix are logically simple, but without the fix GC recycling will "choke" on a large transaction in the history and the DB will grow permanently.
As a workaround you can set MDBX_opt_rp_augment_limit = MDBX_PGL_LIMIT (0x7FFFffff for 64-bit builds).

AV

22:17

Artem Vorotnikov

In reply to this message

Thanks for the tip, I'll update the binding and Akula.

👍

Л(

22:26

In reply to this message

By the way, I currently set rp_augment_limit in Akula so that GC works, although I get 2 s commits.

22:26

With the latest master, is setting the parameter no longer needed?

Л(

22:32

Леонид Юрьев (Leonid Yuriev)

In reply to this message

It is needed, but it is a trade-off between DB growth (with smaller values) and slow commits of large transactions (with larger values).

An experiment on the current code base is probably worthwhile: set MDBX_opt_rp_augment_limit to 0x7FFFffff (the maximum) and observe the behavior.

22:33

There may be no major slowdowns, since a lot has already been reworked.

8 January 2022

AV

14:42

Artem Vorotnikov

In reply to this message

I wonder about the Python bindings: will they be released?

Л(

14:50

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Please see https://github.com/erthink/libmdbx/issues/147
Especially the additions to README by https://github.com/erthink/libmdbx/commit/ba0dd1db49b7061cc701e13eda79e376d3d01536

So the Python bindings should be provided and released separately, not inside the libmdbx repo.

10 January 2022

Л(

14:32

Леонид Юрьев (Leonid Yuriev)

Good news: futex_waitv() is available since Linux 5.16!
https://www.kernel.org/doc/html/latest/userspace-api/futex2.html#futex-waitv

Will Q invited Will Q

12 January 2022

pahuljica invited pahuljica

14 January 2022

МЗ

15:01

Максим Заикин

Good afternoon. I found a problem when using the library in .NET 5 with C# code: mdbx_env_create, then mdbx_env_open ... and after that any unhandled exception in the code. The console application prints the exception text and the process terminates; that does happen, but only after the process hangs for several minutes. Moreover, not only does the process itself hang, other random processes also freeze, e.g. Task Manager. On WS2012R2 this lasts several minutes, on WS2022 about 15-20 seconds.

Л(

15:27

Леонид Юрьев (Leonid Yuriev)

In reply to this message

The freezing of other processes hints that the problem is in Windows (in the kernel or in the support of the user's environment/session).
In theory libmdbx may act as the trigger for the problem, but I have no hypotheses about the causes yet.

I can suggest:
1) See how the application behaves without libmdbx but with similar unhandled exceptions.
This means not using the library at all (without the dll file), if possible.
2) See how the application behaves with libmdbx loaded and with unhandled exceptions, but without calling libmdbx functions.
3) Using ProcessHacker and/or WinDbg, find out where exactly (in which call of which library, ntdll, or kernel system call) the hang occurs.

+4) Use a Checked Build of the kernel with remote debugging.

МЗ

15:28

Максим Заикин

From experience I know that this behavior is possible if at least one thread besides the main one is active in the process; apparently the library terminates it after some time, after which the process itself terminates.

15:31

Perhaps W2012 does have problems (it was upgraded from 2008R2), but 2022 is a completely clean system. I have already reduced the application to a few lines... I'll check with the library loaded but without calls.

Л(

15:31

Леонид Юрьев (Leonid Yuriev)

In reply to this message

libmdbx does not create threads internally, except when copying with compactification.
So lost threads should be sought in the C# bindings, in the application, or in other libraries used by the application.

15:33

In reply to this message

Install ProcessHacker and show the call stack of the problematic thread during the hang.

15:35

And just in case: please use the current master or the latest release.

МЗ

15:35

Максим Заикин

In reply to this message

Yes, I downloaded the latest release.

Л(

15:44

Леонид Юрьев (Leonid Yuriev)

Just in case: depending on how libmdbx is used, the following effect is possible:
- the application does something with the DB, but the kernel does not write the changes to disk immediately;
- on a crash/termination of the process, the kernel waits for all deferred disk writes to complete;
- in some scenarios (a full NTFS volume, HDD, antivirus) this may look like a "hang".

МЗ

15:46

Максим Заикин

There is no antivirus. Without calling any functions everything is fine; it is enough to call mdbx_env_create and then throw an exception, and everything hangs.

15:47

Most notably, Process Explorer hangs...

15:59

If mdbx_env_close_ex is called after mdbx_env_create, the hang does not occur.

15:59

I think there is some problem in DllMain

16:00

mdbx_rthc_global_dtor(); or mdbx_rthc_thread_dtor(module);

Л(

16:03

Леонид Юрьев (Leonid Yuriev)

I haven't used it for a long time, but Process Hacker was smarter than Process Explorer.
Among other things, it could work when some of the process tables are locked.

I suggest trying this:
- run the application and reach the point where the exception is thrown;
- run Process Hacker and look at all the stacks;
- set a breakpoint in DllMain and continue execution (the exception is thrown and the process terminates).

МЗ

17:07

Максим Заикин

http://prntscr.com/26dmqy8

17:09

After the exception is thrown in C#, the debugger shows http://prntscr.com/26dmrin

Л(

17:15

Леонид Юрьев (Leonid Yuriev)

In reply to this message

This (first) screenshot is not interesting; everything is as expected.

МЗ

17:15

Максим Заикин

Well, I showed that it enters DllMain.

Л(

17:16

Леонид Юрьев (Leonid Yuriev)

In reply to this message

This (second) screenshot is useless; we only see that the exception is thrown in the binding code.
A screenshot with the call stack of the problematic thread during the hang is needed.

МЗ

18:57

Максим Заикин

Some kind of mystery (: here are the timings on W2022 http://prntscr.com/26dnwl5 Without an exception the env closes quickly, but via an exception Dispose happens after 12 seconds and mdbx_env_close then takes 30 seconds (:

18:58

http://prntscr.com/26dnyrc

19:00

On 2012R2 things are really sad

19:01

http://prntscr.com/26dnzu8

19:01

although the close happens quickly (:

Л(

19:01

Леонид Юрьев (Leonid Yuriev)

In reply to this message

What happens to the transactions that are active at the moment the exception is thrown?
I suspect they are not explicitly aborted in your case, while a regular close of the DB is still performed.

МЗ

19:02

Максим Заикин

In reply to this message

Once again, there are no transactions... the env is created and that's it... nothing else from the library is called.

Л(

19:03

Леонид Юрьев (Leonid Yuriev)

And which error, from which libmdbx function, causes the exception to be thrown?

МЗ

19:04

Максим Заикин

In reply to this message

None :)))) http://prntscr.com/26do17c I create it myself

19:05

In the main code the exceptions were thrown based on GET/PUT return codes; now I have simply reduced it to minimal code.

Л(

19:07

Леонид Юрьев (Leonid Yuriev)

Do I understand correctly that the problem occurs when an exception is thrown between mdbx_env_open() and mdbx_env_close()?

МЗ

19:07

Максим Заикин

yes

19:10

In reply to this message

yes, that's right

19:11

and it doesn't matter whether the file already exists or is being created for the first time

19:12

I can prepare simple code if you have a way to test it

Л(

19:13

Леонид Юрьев (Leonid Yuriev)

A call stack at the moment of the hang is needed.
Even if the debugger is sluggish during the hang, it should (in theory) break the process right after it returns from the system call.
Try simply stepping into mdbx_env_close() in the problematic scenario.

19:13

In reply to this message

I'm short on Windows machines, there are almost none left...

19:16

Alternatively, try Process Monitor; as far as I remember it can write logs with timestamps.
I.e. almost like strace on Linux.

https://stackoverflow.com/questions/3847745/systrace-for-windows

МЗ

19:18

Максим Заикин

But the problem is not so much in close as in the process doing something incomprehensible after the exception is thrown.....

Л(

19:19

Леонид Юрьев (Leonid Yuriev)

In reply to this message

That way you will see in which system calls the "sticking" happens.

МЗ

19:49

Максим Заикин

I think there is not much useful info (:

19:50

At this point the application threw the exception http://prntscr.com/26doisl

19:51

10 seconds later it loaded WerFault (error registration in the Windows event log)

19:51

what was happening during those 10 seconds is unclear

19:54

http://prntscr.com/26dokfg and in this test env_close started, and it is likewise unclear what it was doing for 5 seconds

20:03

the stack at UnlockFileSingle http://prntscr.com/26doo2d

Л(

20:04

Леонид Юрьев (Leonid Yuriev)

Looks like the work of a rootkit or an antivirus (including the "native" Defender).
But I can't assert that, since I have not used Windows at a deep system level for ~15 years.

20:06

Deleted Account

Have you tried the mono runtime?

МЗ

20:07

Максим Заикин

on 2012 it is quite a show http://prntscr.com/26dopek

20:07

if you just open it and it doesn't hang, it shows one thread loaded at 12%

20:08

In reply to this message

no, I see no point in mono... they can't keep up with .NET

20:09

Deleted Account

What does that have to do with it)))

МЗ

20:09

Максим Заикин

In reply to this message

and there is no Defender on 2012

Л(

20:09

Леонид Юрьев (Leonid Yuriev)

In reply to this message

UnlockFile is handled very poorly inside Windows; in particular, a substantial part of the processing runs in a separate thread.
In places it is almost polling (scanning the processes waiting for locks).
Because of this there is a traditional glitch: a waiting process may not see the lock release for a long time, especially if it checks without waiting.

But a delay of ~5-10 seconds is too much.

МЗ

20:10

Максим Заикин

In reply to this message

then it would always happen on env_close

20:14

managed to get a stack, though I'm not sure it was at the moment of the hang (: http://prntscr.com/26dorln

Л(

20:15

Леонид Юрьев (Leonid Yuriev)

In reply to this message

There is nothing in libmdbx that would interfere with the dotnet runtime and/or native exceptions.
Apart from file locks and critical sections, the most unusual thing is a growable memory section.
This is partially undocumented, but de facto used in plenty of products, including Microsoft's own.
However, such "unusual" sections often break the logic of rootkits and antiviruses...

МЗ

20:17

Максим Заикин

That is possible .... but if the exceptions are caught, I have not noticed any problems using the library so far

20:17

Deleted Account

I would compare the .NET 5 runtime with (any) .NET Framework, purely as an experiment

Л(

20:18

Леонид Юрьев (Leonid Yuriev)

In reply to this message

If the hang is here (i.e. inside ZwUnmapViewOfSection), then the problem is most likely exactly what I wrote above.

МЗ

20:18

Максим Заикин

In reply to this message

That's possible, I'll try it now :))) and then on .NET 6 as well :)) since they already recommend using it

Л(

20:19

Леонид Юрьев (Leonid Yuriev)

In reply to this message

In this case differences are extremely unlikely, since the "sticking" happens inside system calls.

20:19

Deleted Account

Well, you never know))) although I don't really believe in it either

МЗ

20:20

Максим Заикин

In reply to this message

Well, they debugged the Framework for many years, while 5-6 still feels like quite a piece of work to me

Л(

20:22

Леонид Юрьев (Leonid Yuriev)

If possible, I recommend trying a Checked Build, i.e. install it in a virtual machine and run the system itself (the kernel) under WinDbg.
There will be more internal logging, and maybe some assert inside the kernel will fire...
https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/checked-build-of-windows

20:30

Just in case, keep in mind that such "sticking" can be caused by driver bugs, in particular errors in handling IRQL (not restoring it, restoring it incorrectly, etc.).
I don't know how effectively current Windows kernels detect and work around/isolate such errors (in my "Windows" days there could be plenty of glitches without blue screens).

МЗ

20:43

Максим Заикин

on 4.7.2 everything is the same......

Л(

21:24

Леонид Юрьев (Leonid Yuriev)

As an interim result of the issue discussed above: for now I see neither any problems/shortcomings in libmdbx nor any opportunities to fix/improve anything.

МЗ

21:26

Максим Заикин

In reply to this message

Yes, thanks for the help; when I have time I'll try to dig into this some more.

👍

Л(

15 January 2022

Deleted invited Deleted Account

AL

20:39

Andrea Lanfranchi

Did something change in structure of freelist bucket values ? Have a small tool which accounts freelist by txid but after 11.3 I get martian values

20:41

Specifically I had the first 8 bytes (sizeof(size_t)) of value being the number of pages

20:41

(sorry for lazy question ... didn't have time to look into code changes)

Л(

20:43

Леонид Юрьев (Leonid Yuriev)

In reply to this message

No.
The database format is frozen.

AL

20:46

Andrea Lanfranchi

FYI this snippet returns a martian number of pages https://github.com/torquem-ch/silkworm/blob/master/cmd/toolbox.cpp#L200-L219

Л(

20:48

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Your code is wrong!
A PNL (page number list) in libmdbx is a linear array of 32-bit numbers, where the zero-indexed item holds the size of the list.

1) Page numbers in libmdbx are always 32-bit, not `size_t`.
2) Transaction numbers are always 64-bit.
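The layout described above (item 0 is the length, followed by 32-bit page numbers) can be read along these lines. This is a minimal sketch, not libmdbx code: the function names are hypothetical, the buffer is assumed to be a raw GC value in native byte order, and memcpy is used so that no alignment of the buffer is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the number of page numbers declared by a
 * PNL-shaped buffer, or (size_t)-1 if the buffer is shorter than the
 * length it declares. */
size_t pnl_count(const void *data, size_t size) {
  uint32_t count;
  if (size < sizeof(count))
    return (size_t)-1;
  memcpy(&count, data, sizeof(count)); /* item 0 holds the list length */
  if ((size - sizeof(count)) / sizeof(uint32_t) < count)
    return (size_t)-1;
  return count;
}

/* Hypothetical helper: reads the i-th page number (0-based), which is
 * array item i + 1. The caller must ensure i < pnl_count(...). */
uint32_t pnl_page(const void *data, size_t i) {
  uint32_t pgno;
  memcpy(&pgno, (const unsigned char *)data + (i + 1) * sizeof(pgno),
         sizeof(pgno));
  return pgno;
}
```

Reading the first 8 bytes as a size_t, as in the snippet discussed here, would merge the length with the first page number on a 64-bit build.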

AL

20:50

Andrea Lanfranchi

Ouch ... I had consistent and correct values up to 11.2
Trying to understand what happened ... code by toolbox is frozen since 6 months now
Maybe one element of the array was not valued before ?

Л(

20:52

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Surely you have missed something.
You can verify that I am right by using the mdbx_chk tool from the current release and from some old version(s).

AL

20:54

Andrea Lanfranchi

Not saying I am right ... for sure.
Only trying to understand why the same code produced correct results up to libmdbx 11.2 but on 11.3 is producing martian values.
Not your fault ... sure is mine. Only wanted to understand.

Л(

20:57

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I think you just need to take the old database and see in the debugger what data is actually being read from GC/freelist.
Then repeat this with a new DB.

AL

21:00

Andrea Lanfranchi

Nevermind ... I figured it out. Another contributor to our project has introduced the bug yesterday. Sorry for that.

17 January 2022

b

08:34

basiliscos

https://db.cs.cmu.edu/papers/2022/cidr2022-p13-crotty.pdf

It would be interesting to hear your opinion on this )

Л(

09:48

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Nothing new, at all.
Just a somewhat odd and (as it seemed to me) not entirely competent retelling of "water is wet, and you'd better not eat off a knife".
Besides, in places the authors do not fully understand causes and effects, e.g. "... LMDB solves this problem by allowing only a single writer".
A strategic omission: not a word about NVDIMM.

Seriously though, mmap is (of course) far from an ideal tool. But the point is not mmap itself, rather how it is used in a particular storage engine and how well a given storage engine fits a particular usage scenario.
Simply put, mmap in particular saves a lot when the data fits in memory, but random reads of cold data through mmap (as a rule) incur more overhead due to page faults. This is an utterly obvious thing that has nothing to do with "MMAP Gone Wrong".
So, for example, in a "lots of cold OLAP data" scenario MonetDB (uses mmap) will lose to Clickhouse (a filter pipe with asynchronous reads).
But the situation flips instantly if the data stays in RAM while I/O on spinning disks is slow.

b

09:50

basiliscos

Well, if the DB fits in memory, everything seems fine with mmap, but if not, then yes, you get uncontrolled perf degradation, as I understood it

Л(

10:18

Леонид Юрьев (Leonid Yuriev)

In reply to this message

"Uncontrolled" degradation will occur if you do many random reads from media with slow seeking (spinning disks), regardless of mmap.
mmap will add the cost of page faults and implicitly engage kernel caching, but it will not add anything "uncontrolled".

Ultimately, the efficiency or inefficiency of mmap here is determined by how worthwhile data caching is for the particular usage scenario.

That is why such "scientific papers" annoy me somewhat: we stretch an owl over a globe, and then draw the profound conclusion that (surprise!) the owl stretches poorly...

b

10:19

basiliscos

Thanks for the comments ) 👍

20 January 2022

Alexander Brilliantov invited Alexander Brilliantov

chen invited chen

21 January 2022

Л(

23:34

Леонид Юрьев (Leonid Yuriev)

The issue of excessive/too-strict assertions revealed at Binance was fixed yesterday.

https://github.com/erthink/libmdbx/issues/260

👍

b

23:34

The libmdbx v0.11.4 release is scheduled for the end of January (2022-01-28), with important fixes for maximum-sized DBs.
See the changelog for details https://erthink.github.io/libmdbx/md__change_log.html

23:36

It seems that all promising/high-performance Ethereum implementations use libmdbx as a storage backend.
We are conquering the World, step by step ;)

👍

АМ

ПО

SA

6

25 January 2022

Л(

00:27

Леонид Юрьев (Leonid Yuriev)

Guess which one of them will be called Mithril?

M

01:13

Mark

So adorable!!

AS

04:43

Alex Sharov

Clockwise: Berkley, Light, Barsik, Mithril

👍

A

2

🔥

SC

Л(

28 January 2022

Awpteamoose invited Awpteamoose

Л(

21:55

Леонид Юрьев (Leonid Yuriev)

It seems that https://github.com/erthink/libmdbx/issues/260 is finally (re)fixed completely, so 0.11.4 may be released.
However, I decided to wait a bit and reschedule the release for 2022-01-31 (next Monday).

29 January 2022

AL

00:04

Andrea Lanfranchi

In reply to this message

❤️

AL

14:10

Andrea Lanfranchi

Maybe dumb question: is there a way to determine if a cursor is "live" (i.e. bound to a live transaction) using C++ bindings ?

14:11

Actually the only reliable way I found is to catch an exception trying to access crs.txn() but I wish I could avoid that

Л(

15:52

Леонид Юрьев (Leonid Yuriev)

This was not envisaged, since needing it does not speak in favor of an application's design.
So currently there is no other way to do this with the C++ API alone.
However, you can directly call mdbx_cursor_txn() or submit a PR with the addition of the appropriate method.

👍

AL

31 January 2022

b

09:41

basiliscos

Hello, it's me again with MDBX_MAP_FULL: Environment mapsize limit reached . This time mdbx_env_set_geometry(env, 0, -1, 0, -1, -1, 0) is definitely called, after which I was counting on "automatic size management". What am I doing wrong?

Л(

11:49

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Sorry, I don't remember what happened last time. But right now the call mdbx_env_set_geometry(env, 0, -1, 0, -1, -1, 0) means:
size_lower = minimal
size_now = don't change (default)
size_upper = minimal
growth_step = don't change (default)
shrink_threshold = don't change (default)
pagesize = minimal (256)

Accordingly, what you are doing wrong == almost everything.
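The reading of the arguments given above (0 means "minimal", -1 means "keep current/default") can be modeled with a tiny self-contained sketch. The names below are hypothetical and this is not libmdbx code; it only mirrors the interpretation listed in the message, under the assumption that any negative value is treated like -1.

```c
#include <stdint.h>

/* Hypothetical model of how a single geometry argument is interpreted. */
enum geo_arg { GEO_KEEP_DEFAULT, GEO_MINIMAL, GEO_EXPLICIT };

enum geo_arg classify_geo_arg(intptr_t value) {
  if (value < 0)
    return GEO_KEEP_DEFAULT; /* -1: keep current or use default */
  if (value == 0)
    return GEO_MINIMAL; /* 0: minimal acceptable */
  return GEO_EXPLICIT; /* anything else: explicit value */
}

/* Returns 1 when both bounds resolve to "minimal", which leaves the
 * database no room to grow. */
int size_pinned_to_minimum(intptr_t size_lower, intptr_t size_upper) {
  return classify_geo_arg(size_lower) == GEO_MINIMAL &&
         classify_geo_arg(size_upper) == GEO_MINIMAL;
}
```

With the arguments from the question, size_pinned_to_minimum(0, 0) is true, which matches the MDBX_MAP_FULL result reported here.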

b

11:51

basiliscos

Well, last time it was explained to me that calling it is enough and there will be ASM (automatic size management). So I called it with the defaults

11:52

which parameters should I call it with "so as not to worry"? ;)

МЗ

11:54

Максим Заикин

In reply to this message

set the maximum value of the parameter's type for size_upper

👍

b

Л(

11:58

Леонид Юрьев (Leonid Yuriev)

In reply to this message

In this case you explicitly turned ASM off, since size_lower == size_upper == minimal.

👍

b

11:59

In reply to this message

Yes, you need at least mdbx_env_set_geometry(env, -1, -1, upper_limit_of_db_size, -1, -1, -1)

b

12:02

basiliscos

I'm a bit wary of setting this upper limit. I have no idea what it will be; it depends on the amount of data. I.e. if I set some fixed value, sooner or later I can get MDBX_MAP_FULL again anyway

МЗ

12:04

Максим Заикин

In reply to this message

the maximum number of pages is 0x7FFFFFFF; with a 4K page that is 8TB. Is your database that big?
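The arithmetic behind that figure, as a short sketch (the function name is hypothetical; the page count and page size are the values quoted in the message):

```c
#include <stdint.h>

/* Upper bound on database size in bytes for a given page-count limit
 * and page size; computed in 64 bits to avoid overflow. */
uint64_t max_db_bytes(uint32_t max_pages, uint32_t page_size) {
  return (uint64_t)max_pages * page_size;
}
```

For 0x7FFFFFFF pages of 4096 bytes this gives 8796093018112 bytes, i.e. one page short of 8 TiB.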

b

12:05

basiliscos

In reply to this message

Smaller. I think a couple of gigs, well maybe 16G. Is it safe to set 8TB?

МЗ

12:08

Максим Заикин

In reply to this message

set 160GB :)) with a margin...

b

12:09

basiliscos

ok, thanks everyone.

Л(

19:31

Леонид Юрьев (Leonid Yuriev)

https://hn.svelte.dev/item/30013919
Hm, how do I log in to comment there?

AV

20:08

Artem Vorotnikov

In reply to this message

https://news.ycombinator.com/item?id=30013919

Л(

20:09

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Thanks.

Л(

23:38

Леонид Юрьев (Leonid Yuriev)

The v0.11.4 release is delayed for 1-2 days until https://github.com/erthink/libmdbx/issues/265 is resolved.

1 February 2022

AL

21:16

Andrea Lanfranchi

Sorry for another maybe dumb question.
I have a *dup sorted* table with these records:

k: A v: AA
k: A v: AB
k: A v: AC

Why do I get MDBX_KEYEXIST when trying to use cursor::insert(A, AD)?
Do I necessarily have to use cursor::upsert?

Л(

21:18

Леонид Юрьев (Leonid Yuriev)

In reply to this message

https://github.com/erthink/libmdbx/blob/master/mdbx.h%2B%2B#L2860-L2865

AL

21:21

Andrea Lanfranchi

Not sure I fully follow. Shouldn't insert_unique for dup sorted tables check uniqueness of the k/v pair?

Л(

21:22

Леонид Юрьев (Leonid Yuriev)

Please see these tables with the cases:
https://erthink.github.io/libmdbx/group__c__crud.html#c_crud_hints

21:25

The second table has all the info you need.

AL

21:27

Andrea Lanfranchi

Thank you

2 February 2022

b

08:24

basiliscos

MDBX_SAFE_NOSYNC ... a system crash can't corrupt the database, but you will lose the last transactions;

Could you clarify exactly how many of the last transactions may be lost, or how to estimate that myself?

b

09:52

basiliscos

and another thing...

    auto flags = MDBX_WRITEMAP | MDBX_COALESCE | MDBX_LIFORECLAIM | MDBX_EXCLUSIVE | MDBX_NOTLS | MDBX_SAFE_NOSYNC;

With these flags I got MDBX_CORRUPTED: Database is corrupted . WTF?

Л(

10:07

Леонид Юрьев (Leonid Yuriev)

In reply to this message

If a system failure happens (power loss, etc.), then on the next open of the DB all transactions will be "rolled back" to the last steady commit point.
Technically, the meta page that is marked as steady and has the highest transaction number will be used, and the other meta pages will be wiped.

Steady points are formed:
- on every commit without nosync;
- on every commit or mdbx_env_sync_poll() call once the thresholds set by mdbx_env_set_syncbytes() or mdbx_env_set_syncperiod() are reached;
- on an explicit mdbx_env_sync() call;
- on a regular close of the DB;
- when the DB file grows or shrinks.

In utterly-nosync mode steady commit points may be destroyed.

So the number of lost transactions is determined by the history of operations on the DB.
Simply put, everything that has not been lost yet may be lost.
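The rollback rule described above can be illustrated with a toy model. This is a deliberately simplified sketch with hypothetical names, not libmdbx code: it tracks only the last transaction number and the last steady point, and ignores thresholds, file resizing, and the utterly-nosync case in which steady points can be destroyed.

```c
#include <stdint.h>

/* Toy model: the latest committed txn and the latest steady point. */
struct durability_model {
  uint64_t last_txn;
  uint64_t last_steady_txn;
};

/* A commit advances the txn number; a synced commit is also steady. */
void model_commit(struct durability_model *m, int synced) {
  m->last_txn += 1;
  if (synced)
    m->last_steady_txn = m->last_txn;
}

/* An explicit sync makes the latest commit a steady point. */
void model_sync(struct durability_model *m) {
  m->last_steady_txn = m->last_txn;
}

/* A system crash rolls back to the last steady point and returns the
 * number of transactions lost. */
uint64_t model_crash(struct durability_model *m) {
  const uint64_t lost = m->last_txn - m->last_steady_txn;
  m->last_txn = m->last_steady_txn;
  return lost;
}
```

In this model, two unsynced commits followed by an explicit sync and one more unsynced commit lose exactly one transaction on a crash.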

👍

b

10:09

In reply to this message

Without seeing the whole picture it is impossible to say what happened and how.
But all flag combinations are checked by stochastic tests (there are about 448 of them).

b

10:10

basiliscos

In reply to this message

Thanks. An explicit sync (mdbx_env_sync()) suits me, and whatever may be lost is what comes after it. I'll track that separately.

Л(

10:11

Леонид Юрьев (Leonid Yuriev)

To start with, run mdbx_chk -vvv on the damaged DB and show the utility's output.

b

10:12

basiliscos

In reply to this message

Got it. I'll do that next time. I can't right now, since I have already deleted the DB.

10:15

In reply to this message

well, so far the case was this: a new DB is opened with the flags above. Then transactions with data are piled on periodically and a sync never happens (except perhaps an implicit one, when the DB file grows or shrinks), then a crash.

Л(

10:20

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Problems caused by internal errors are unlikely, but there may be flaws or omissions triggered by odd or unanticipated use of the API.
So I prefer to investigate every case of corruption and, where necessary, add foolproofing.

10:21

In reply to this message

And how exactly did you produce the "crash"?

b

10:22

basiliscos

In reply to this message

ctrl+c in gdb, it drops into its console, and there q (forced quit). Something like that.

10:23

yes, like that ( https://i.imgur.com/ajbYdaA.png )

Л(

10:24

Леонид Юрьев (Leonid Yuriev)

In reply to this message

That cannot break the DB.
Even if it is opened in MDBX_UTTERLY_NOSYNC mode, the bootid will match on reopening, so weak commit points will not be rejected.

b

10:25

basiliscos

well, the screenshot proof is above. libmdbx version `f836c928`

10:25

(git commit)

Л(

10:27

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Either something was written directly into memory because of WRITEMAP (i.e. the DB was destroyed by the application), or something similar happened.
I advise enabling logging. In non-debug builds (with NDEBUG) it does not affect performance, but you will see the explicit reasons for MDBX_CORRUPTED.

b

10:32

basiliscos

Logging in mdbx? How do I enable it?

Л(

10:33

Леонид Юрьев (Leonid Yuriev)

MDBX_WRITEMAP is recommended only once you are fully confident that the application code is correct.
Otherwise there is a chance of damaging the data (or the whole DB) by writing through a wrong pointer.

b

10:35

basiliscos

In reply to this message

Well, that is the case for me (I run it under a sanitizer periodically, and nothing has shown up for a long time). WM has been enabled for ages, but I only turned on MDBX_SAFE_NOSYNC today and got the corruption.

Л(

10:35

Леонид Юрьев (Leonid Yuriev)

In reply to this message

See mdbx_setup_debug().

I recommend reading mdbx.h in full once.

👍

b

10:36

basiliscos

OK, thanks. I'll experiment.

Л(

11:18

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Just in case: on linux/unix one more DB corruption scenario is possible, via an erroneous write to a file descriptor or misuse of a descriptor.
Roughly speaking, for example, when a write goes to a previously used file descriptor number that was closed and is now associated with the DB file.

After https://github.com/erthink/libmdbx/issues/144 a safeguard was added to rule out overlap with stdin/stdout/stderr, but it is impossible to protect against this completely.

11:23

The "beauty" of such situations is that they can go unnoticed for a long time, since stray writes through a "wrong" file descriptor to the beginning of the DB get overwritten by meta-page updates during normal operation.
But in abnormal situations the problem can show up, exactly as it did for @basiliscos.
And detecting such a problem is sometimes very hard...

b

11:24

basiliscos

unlikely to be my case: I run a console application and don't touch std in/out/err there.

Л(

11:32

Леонид Юрьев (Leonid Yuriev)

In reply to this message

To be sure of that, check with strace that no descriptors are opened/closed other than by libmdbx.
As an option: open ~1000 descriptors before opening the DB and close them right after.
That at least makes it easier to isolate and/or see the problem.
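The "open ~1000 descriptors first" trick can be sketched for POSIX systems as follows. This is a minimal illustration only: `reserve_descriptors` and `release_descriptors` are hypothetical helper names, `/dev/null` is just a convenient placeholder target, and the point where the DB environment is opened between the two calls is assumed, not shown.

```cpp
#include <cstddef>
#include <vector>
#include <fcntl.h>
#include <unistd.h>

// Open up to `count` placeholder descriptors on /dev/null (POSIX only).
// Stops early if the process runs out of descriptors (e.g. RLIMIT_NOFILE).
std::vector<int> reserve_descriptors(std::size_t count) {
  std::vector<int> fds;
  fds.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    const int fd = open("/dev/null", O_RDONLY);
    if (fd < 0)
      break;
    fds.push_back(fd);
  }
  return fds;
}

// Close every placeholder descriptor and forget it.
void release_descriptors(std::vector<int> &fds) {
  for (std::size_t i = 0; i < fds.size(); ++i)
    close(fds[i]);
  fds.clear();
}
```

The intended order would be: reserve, open the DB, release. The DB's descriptors then get high numbers, far from the low numbers that application code typically recycles, so a stray write through a stale descriptor is less likely to land in the DB file. Note that the default soft limit on open files is often 1024, so the requested count may not be fully reached.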

AS

12:08

Alex Sharov

In reply to this message

Turn off WriteMap while developing and in tests; then misuse of the DB (an accidental write into the DB's memory region) will lead to sigbus.

b

12:36

basiliscos

In reply to this message

👍👍

12:36

caught a bug

12:36


Thread 2 "ss/net" received signal SIGBUS, Bus error.
[Switching to Thread 0x7ffff6bee640 (LWP 20006)]
0x0000555555a16b18 in mdbx_page_alloc (mc=0x7ffff6becff0, num=1, flags=67108871) at /home/b/development/tmp/syncspirit/lib/mbdx/src/core.c:6752
6752      ret.page->mp_pgno = pgno;
(gdb) bt
#0  0x0000555555a16b18 in mdbx_page_alloc (mc=0x7ffff6becff0, num=1, flags=67108871) at /home/b/development/tmp/syncspirit/lib/mbdx/src/core.c:6752
#1  0x0000555555a1785e in mdbx_page_touch (mc=0x7ffff6becff0) at /home/b/development/tmp/syncspirit/lib/mbdx/src/core.c:6881
#2  0x0000555555a50170 in mdbx_cursor_touch (mc=0x7ffff6becff0) at /home/b/development/tmp/syncspirit/lib/mbdx/src/core.c:14760
#3  0x0000555555a51a9b in mdbx_cursor_put (mc=0x7ffff6becff0, key=0x7ffff6bed3e0, data=0x7ffff6bed3f0, flags=0) at /home/b/development/tmp/syncspirit/lib/mbdx/src/core.c:15078
#4  0x0000555555a7255e in mdbx_put (txn=0x7ffff0005340, dbi=1, key=0x7ffff6bed3e0, data=0x7ffff6bed3f0, flags=0) at /home/b/development/tmp/syncspirit/lib/mbdx/src/core.c:18619
#5  0x000055555588308d in syncspirit::db::save (container=..., txn=...) at /home/b/development/tmp/syncspirit/src/db/utils.cpp:135

12:37

it's sitting in gdb right now. What should I do?

Л(

12:41

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Rather strange...
The best option is to give ssh access.

b

12:43

basiliscos

I'll try

Л(

12:47

Леонид Юрьев (Leonid Yuriev)

In reply to this message

It is also worth building libmdbx with -DMDBX_FORCE_ASSERTIONS=1 and trying to reproduce the problem.

b

12:47

basiliscos

it reproduces poorly. It fired somewhere after 20-40 runs

Л(

12:50

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I think you need to enable assertions and reproduce it, since given the current usage history and the volume of testing it is most likely that the problem is provoked either by some peculiarity of your code or by an error in it.

+ That will probably be better than debugging a build without assertions over ssh.

Л(

14:53

Леонид Юрьев (Leonid Yuriev)

@basiliscos, I'll wait for information from you for another 3-4 hours.
If there is reason to worry about a bug in libmdbx, I'll postpone the v0.11.4 release.
Otherwise I'll get on with the release.

b

14:57

basiliscos

Don't wait specially, it isn't reproducing so far

Л(

15:08

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I have a test run in progress (just in case).
It takes about 3 days, and it should finish by this evening Moscow time...

Л(

23:23

Леонид Юрьев (Leonid Yuriev)

https://github.com/erthink/libmdbx/releases/tag/v0.11.4

4 February 2022

Л(

03:23

Леонид Юрьев (Leonid Yuriev)

I am asking for help with purchasing two NMB1XXD128GPSU modules.
For almost a year I have not been able to buy them or arrange delivery to Russia (

#MithrilDB

b

13:14

basiliscos

in short, it no longer reproduces for me, but I changed the usage patterns in my code. Here is what it was before: about 5-10k small inserts with a UUID key and varying payload (about 100 bytes, I think). Each insert was a transaction with a commit.

Yesterday I found out that it was "slow" precisely because I commit on every little thing; so I reworked it to commit once per 150 inserts. If I hit the problem again I'll let you know; I'm unlikely to add anything in the near future.
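The change from "commit per insert" to "commit per 150 inserts" can be sketched with a small counter. This is an illustration only: `BatchCommitter` is a hypothetical helper, and its `commit()` merely counts where real code would commit the write transaction and begin a new one.

```cpp
#include <cstddef>

// Counts how many commits a run of inserts performs when a commit is
// issued after every `batch` inserts. Hypothetical helper for illustration.
class BatchCommitter {
public:
  explicit BatchCommitter(std::size_t batch)
      : batch_(batch ? batch : 1), pending_(0), commits_(0) {}

  // Call after each insert; commits once the batch is full.
  void on_insert() {
    if (++pending_ == batch_)
      commit();
  }

  // Call once at the end to flush a partially filled batch.
  void finish() {
    if (pending_ != 0)
      commit();
  }

  std::size_t commits() const { return commits_; }

private:
  // Real code would commit the transaction here and start the next one.
  void commit() {
    ++commits_;
    pending_ = 0;
  }

  std::size_t batch_;
  std::size_t pending_;
  std::size_t commits_;
};
```

For 10,000 inserts, a batch size of 1 yields 10,000 commits, while a batch size of 150 yields 67. The trade-off is that a failure in the middle of a batch discards the whole uncommitted batch.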

Л(

15:37

Леонид Юрьев (Leonid Yuriev)

In reply to this message

On the whole, of course, that is more correct now.

But, alas, I'm more concerned about the reported DB corruption problem.
I estimate the probability that the cause was a bug in libmdbx as very low.
However, it is highly desirable to find the cause anyway, even if the problem does not reproduce with the current code.

The problem may have been caused by an error in your code, but it may also be that some peculiarity of your code's behaviour provokes some error or flaw in libmdbx.
In that case it is much more rational to make the effort now, to rule out a recurrence of the problem in production.

So I ask you, if possible, to go back to the old code and get the problem to reproduce.
Then I will be able to understand what is going on and make fixes if necessary.

b

15:45

basiliscos

unfortunately, there was also a bug in another library of mine, which caused SOMEWHAT different behaviour in the main application. I think in a month (vacation + a certain feature) I'll be able to run something like an integration test automatically, and then it will run more often and may stumble on it. Launching it manually every time is very laborious, especially since it fired rather rarely.

5 February 2022

Л(

19:54

Леонид Юрьев (Leonid Yuriev)

Comrades from Ethereum (Erigon/Akula/Silkworm), please give a brief status/situation/outlook for your projects and describe the consequences of using libmdbx.
This information will be used by Positive Technologies (which funds libmdbx development) to produce a news item/story about the use of libmdbx in promising/high-performance Ethereum implementations.
If you wish, you can give me contacts, which I will pass to the press office for clarifying information and/or getting comments.

FYI: @ledgerwatch, @AskAlexSharov, @vorot93, @leisim, @ioverclock, @fuunyK.
Sorry if I forgot to mention anyone.

M

20:20

Mark

I don't have anything to do with Ethereum but I am using libmdbx in an embedded system. It's an access control system. It's normally cloud-based but some of our customers wanted an option to run some features when there is an internet outage. For this we use libmdbx. Having a transactional nature is very important to us. Because when the connection is up we are synchronizing changes from the server side. And we don't want those changes to be visible until they are all made. We originally used LMDB in our prototype but it did not work well. We hit many corner cases where it could not garbage collect. For us high reliability is the most important characteristic. Performance is secondary. Even though we are scanning vehicles they are not driving fast enough that there are any performance issues with libmdbx even on flash storage.

We also found the support much better than LMDB.

❤

M

Л(

AV

20:41

Artem Vorotnikov

In reply to this message

I think @ledgerwatch will give a detailed comment. For my part, I'll say many thanks for MVCC alone: it automatically makes MDBX the go-to storage engine for practically any use case.

20:41

Unfortunately, the lack of Python bindings scared off at least one project in the Ethereum ecosystem; they currently use LMDB

Л(

21:04

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Python bindings were made by https://github.com/Thermi in the python-bindings branch.
As far as I know, they build and work.
But it is extremely problematic for me to embed their source code inside the libmdbx repository.
In fact, such "embedding" makes no sense; it only brings inconvenience and confusion.
Besides, I don't use Python myself, at all.

So I don't think it is right for me to work on these bindings.
But I'll repeat once more: de facto they are ready, and all that remains is for a capable person to set up a repository, configure CI and then maintain it.

SC

22:23

Simon C.

I use libmdbx as a backend for the Isar database, one of the fastest NoSQL databases for Flutter apps.
It has only been released a month ago and is already used by thousands of developers. At ClickUp we plan to use it for the offline mode of our app.
Thanks to libmdbx Isar runs on iOS, Android, macOS, Windows, Linux and soon even the web.
Libmdbx provides incredible performance even on older devices. Compared to LMDB, it is faster, the codebase is very readable, it is actively maintained and has a much nicer community! Keep up the great work 💪

❤

Л(

🔥

E

6 February 2022

AS

04:33

Alex Sharov

For me the most important feature over lmdb is exposing metrics. With metrics I can make better decisions and see bottlenecks more precisely (what part of the app is generating most of the dirty pages?). And mdbx_durable holding my pens well. It's also more compressible than lmdb because of empty space nullification. And support is over 9000. Seems like we had 0 production issues last year (some things were caught in testing, other things were related to broken hardware).

❤

Л(

M

19:49

Mark

Is nullification of empty space something that can be disabled? For embedded it's best to prevent excessive flash wear

Л(

19:52

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Yes, see MDBX_NOMEMINIT.

❤

M

14 February 2022

18:35

Deleted Account

Is it necessary to use mdbx_dbi_close? I'm asking because in the documentation I found the following: "doing so can cause misbehaviour from database corruption...". As I understand it, you suggest setting the maxdbs value, but not to something huge?

Л(

18:54

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Hmm, it is too hard to guess what is necessary in your case ;)

However, it follows from the documentation that only unused handles should be closed, i.e. closing handles that are in use on the fly is obviously a bad idea.
So you should either provide some synchronization to satisfy this rule or set maxdbs large enough.
Nonetheless, transaction overhead is in many respects linearly proportional to maxdbs and/or to the number of currently used/opened handles.

15 February 2022

12:17

Deleted Account

Small question: if I set max_readers to, for example, 50 or 100, why does get_max_readers return 120?

Л(

12:21

Леонид Юрьев (Leonid Yuriev)

In reply to this message

It is rounded up to fill the whole system page(s) used for the LCK-file.
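The rounding can be illustrated with a little arithmetic. This is a sketch under stated assumptions: the 4096-byte page, 256-byte header and 32-byte reader slot are illustrative values chosen because they reproduce the 120 from the question; they are not taken from the libmdbx sources, and `effective_max_readers` is a hypothetical name.

```cpp
#include <cstddef>

// Round the reader table up to whole pages and return how many slots fit.
// All size defaults are assumptions made for illustration.
std::size_t effective_max_readers(std::size_t requested,
                                  std::size_t page_size = 4096,
                                  std::size_t header_bytes = 256,
                                  std::size_t slot_bytes = 32) {
  const std::size_t needed = header_bytes + requested * slot_bytes;
  const std::size_t pages = (needed + page_size - 1) / page_size;  // round up
  return (pages * page_size - header_bytes) / slot_bytes;
}
```

Under these assumptions both 50 and 100 requested readers fit in a single page and yield 120 slots, while requesting 121 spills into a second page and yields 248.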

18:03

Deleted Account

bool env::is_empty() const { return get_stat().ms_branch_pages == 0; }

🔥

Л(

18:03

Why does is_empty use only branch_pages?

18:04

How about leaf pages?

Л(

18:04

Леонид Юрьев (Leonid Yuriev)

Oops, seems like a typo
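Why the branch-only check misfires can be shown with a toy stat structure. This is a sketch, assuming a B-tree whose root is a single leaf page until it first splits; `PageStat` and both functions are hypothetical, and the exact change made in the library is not reproduced here.

```cpp
#include <cstdint>

// Hypothetical subset of the page counters reported for an environment.
struct PageStat {
  std::uint64_t branch_pages;
  std::uint64_t leaf_pages;
};

// The check as quoted in the question: "no branch pages" means "empty".
bool is_empty_by_branch(const PageStat &s) { return s.branch_pages == 0; }

// Leaf-based check: any stored record occupies at least one leaf page.
bool is_empty_by_leaf(const PageStat &s) { return s.leaf_pages == 0; }
```

A small database whose only page is a leaf root has zero branch pages, so the branch-based check reports it as empty even though it holds records; the leaf-based check does not.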

18:05

Deleted Account

Ok

Л(

18:17

Леонид Юрьев (Leonid Yuriev)

In reply to this message

https://github.com/erthink/libmdbx/commit/72bc655eceed2e530ecd800ec0644cc7c9b60bdc

16 February 2022

11:28

Deleted Account

I also have a question regarding env::remove.
It returns bool and internally uses error::boolean_or_throw:
    inline bool error::boolean_or_throw(int error_code) {
      switch (error_code) {
      case MDBX_RESULT_FALSE:
        return false;
      case MDBX_RESULT_TRUE:
        return true;
      default:
        MDBX_CXX20_UNLIKELY throw_exception(error_code);
      }
    }

env::remove returns:
false - if files or directories were removed
true - if files or directories do not exist
otherwise - exception

Is this the right behaviour?
Maybe it should be:
true - if files or directories were removed
false - if files or directories do not exist
otherwise - exception

Л(

11:38

Леонид Юрьев (Leonid Yuriev)

In reply to this message

This is to avoid/reduce confusion with the C API error codes:
- historically MDBX_SUCCESS == 0 == MDBX_RESULT_FALSE, and it is false because it is zero.
- MDBX_RESULT_TRUE (a successful result with a special meaning or a flag) == -1, and it is true because it is non-zero.
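The mapping can be restated as a self-contained sketch. The constants below are local stand-ins using the values given in the message (0 and -1), not the real mdbx.h definitions; `was_removed` is a hypothetical convenience wrapper for callers who prefer the opposite polarity, and the exception type is an assumption.

```cpp
#include <stdexcept>
#include <string>

namespace sketch {

// Local stand-ins for the values quoted in the discussion.
const int RESULT_FALSE = 0;  // same value as the plain success code
const int RESULT_TRUE = -1;  // success with a special meaning

// Zero maps to false, -1 maps to true, anything else is an error.
bool boolean_or_throw(int error_code) {
  switch (error_code) {
  case RESULT_FALSE:
    return false;
  case RESULT_TRUE:
    return true;
  default:
    throw std::runtime_error("error code " + std::to_string(error_code));
  }
}

// Hypothetical wrapper: true when something was actually removed,
// assuming "false" means "removed" as described in the question.
bool was_removed(int remove_result) { return !boolean_or_throw(remove_result); }

}  // namespace sketch
```

With this convention a plain success (0) reads as `false`, so code that wants "true on removal" has to invert the result explicitly, as `was_removed` does.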

🤷‍♂️

12:49

Deleted Account

Ok

20 February 2022

SC

23:13

Simon C.

Do you think it would be possible to make a small release with the fix from https://github.com/erthink/libmdbx/issues/267

I have a few users waiting for it 😅

21 February 2022

Л(

00:13

Леонид Юрьев (Leonid Yuriev)

In reply to this message

A release requires extensive testing when it includes certain kinds of changes.
This is exactly the case now.
So I have no plans to make a release anytime soon.

On the other hand, you can always use a local make dist or point a git submodule to any commit.

👍

SC

Л(

00:36

Леонид Юрьев (Leonid Yuriev)

Some good news!
Thanks to nix.ru I got a couple of NMB1XXD128GPSU (Intel Persistent Memory 200 NVDIMM).
This makes it possible to continue some of the work on MithrilDB.

However, the acquisition took almost 9 months, so in the near future I am extremely busy with other projects (replication for libfpta).
Moreover, I will only be able to install these NVDIMMs after testing of the next libmdbx release (scheduled for 2022-03-01) is done.

00:37

Л(

13:14

Леонид Юрьев (Leonid Yuriev)

Thanks to Kai Wetlesen for RPMs!
http://copr.fedorainfracloud.org/coprs/kwetlesen/libmdbx/

17:11

Deleted Account

Why does mdbx_dbi_close return BAD_DBI if the current transaction was aborted (not committed)?

17:11

Is this expected behaviour?

Л(

17:12

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Was this dbi created in that (aborted) transaction?

17:22

Deleted Account

Yes

Л(

17:43

Леонид Юрьев (Leonid Yuriev)

In reply to this message

When a transaction is aborted, all changes made in it must be cancelled, including the creation of new key-value map(s) (i.e. named subdbs).
Thus, after the transaction is aborted, there is no underlying key-value map to associate with such a dbi-handle.

18:03

Deleted Account

Ok

Л(

19:59

Леонид Юрьев (Leonid Yuriev)

In a few hours I expect to get the result of one more test iteration.
Perhaps it will confirm a regression in the Linux kernel, in the unified page cache subsystem and/or its interaction with the io-scheduler, disk/controller drivers & the virtual memory manager.
If the regression is confirmed, this will be sufficient reason to make a release tomorrow with an emergency fix/workaround.
2022-02-22

👍

SC

20:01

However, in the current understanding, this is not a bug in libmdbx.

22 February 2022

Л(

22:09

Леонид Юрьев (Leonid Yuriev)

https://github.com/erthink/libmdbx/issues/269

23 February 2022

Л(

21:41

Леонид Юрьев (Leonid Yuriev)

https://github.com/erthink/libmdbx/releases/tag/v0.11.5

25 February 2022

Kai Wetlesen invited Kai Wetlesen

KW

05:37

Kai Wetlesen

Thanks for the RPM shoutout! I’ll update the build constants tomorrow to pick up new versions

❤

Л(

b

20:17

basiliscos

if github gets blocked, where will you move to?

Л(

20:17

Леонид Юрьев (Leonid Yuriev)

In reply to this message

see the bottom of readme.md
it has been there for a long time...

МЗ

20:18

Максим Заикин

In reply to this message

everything will be fine as long as oil is at 100 and gas at 1500

b

20:21

basiliscos

In reply to this message

👍

Л(

20:35

Леонид Юрьев (Leonid Yuriev)

In reply to this message

None of this is essential, but:
- The economies of the USA and the eurozone are on the verge of collapse (a repeat of the depression of the early last century, see Karaganov, Khazin and Delyagin), not counting a heap of other problems.
- China is still deciding whether or not to sink the USA together with its own trillions in treasuries, all in the context of Taiwan and the situation in that region.
- Let's try not to discuss politics and economics here.

Everything will be fine!

C

20:58

CZ

but you're writing about politics and economics yourself…

21:14

Deleted Account

I've been hearing for a very long time now that the USA is about to collapse

Л(

21:21

Леонид Юрьев (Leonid Yuriev)

Comrades, for discussion of this off-topic please turn to the economists named above.

21:35

21:41

For my part I'll add: I will not tolerate any pro-Bandera or anti-Russian rhetoric here, and the like.

❤

ED

AB

AV

28 February 2022

KW

00:04

Kai Wetlesen

Hi all, just wanted to inform the most recent package version is now built and linted. As libmdbx is still on major version 0, there is no increment to the shared object version in accordance with libtool and semver standards. (huzzah!!! 🙌🏼 for semantic versioning!) ABI checker came out good, so it should be suitable for use with any v0 application. The current COPR repo is https://copr.fedorainfracloud.org/coprs/kwetlesen/libmdbx/

Л(

00:12

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Basically it is too hard to release 1.0, since that means making promises to users.
I want to resolve most of the TODOs first.

KW

00:21

Kai Wetlesen

In reply to this message

I totally understand! Let the users' needs do the navigation. Is there any todo that a crusty old American C programmer can help with?

Л(

00:26

Леонид Юрьев (Leonid Yuriev)

In reply to this message

No ;)
libmdbx was cultivated step by step from LMDB rather than rewritten from scratch. Thus libmdbx has the same "rebus" code style, only larger and more complex. Therefore both projects are effectively solo developments, without options...

KW

01:09

Kai Wetlesen

Ah got it, I thought it looked familiar. Formed from the old openldap database format then?

Л(

01:13

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Yes, but there are several "old formats" of the OpenLDAP DB, so this should be clarified: https://github.com/erthink/libmdbx#history

1 March 2022

13:51

Deleted Account

    inline bool cursor::erase(const slice &key, const slice &value) {
      move_result data = find_multivalue(key, value, false);
      return data.done ? erase() : data.done;
    }

I think the following construction is easier to read:
data.done && erase();
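That the two forms behave identically, including not calling `erase()` when nothing was found, can be checked with a stand-in type. This is a sketch only: `FakeCursor` is hypothetical and merely mimics the shape of the quoted code.

```cpp
// Stand-in for a cursor: `found` plays the role of find_multivalue(...).done,
// `erase_result` is what erase() would return. Hypothetical type.
struct FakeCursor {
  bool found;
  bool erase_result;
  int erase_calls;

  FakeCursor(bool f, bool e) : found(f), erase_result(e), erase_calls(0) {}

  bool erase() {
    ++erase_calls;
    return erase_result;
  }

  // Form used in the quoted code.
  bool erase_ternary() { return found ? erase() : found; }

  // Proposed form: short-circuit evaluation skips erase() when not found.
  bool erase_short_circuit() { return found && erase(); }
};
```

For every combination of `found` and `erase_result`, both member functions return the same value and call `erase()` the same number of times, so the change is purely a readability choice.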

3 March 2022

Deleted invited Deleted Account

Л(

18:51

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Ok.

18:53

Deleted Account

Hi, I am surprised by the size of database I am generating for my data.

18:53

[try again after editing]

21:29

Deleted Account

Hi there,
I am surprised by the sizes of database I am generating.
I am confused with how page size impacts this significantly.
I hope you can help me understand, please?

My data is 20MB in 75,000 records.
Most (70,000) records are 200-250 bytes,
A few (<4000) get bigger - 3 are 5K and 2 are 16K, 29K.
Keys are 16 byte hashes, so should total <1.5MB.
Of course there is some key duplication to build the tree, and some record-keeping.
This is being injected in about 75,000 transactions, with the big records being logarithmically replaced as they grow.

The issue is:
With the default page size (on linux), this generates a DB file of 140MB!
With the largest page size then it uses 48MB for this data.
[But yesterday I thought it stayed within 32MB initial allocation so perhaps this varies somehow]
with page size set to MAX_PAGESIZE/2 Db grows to 40MB
with page size set to MAX_PAGESIZE/4 Db grows to 37MB

Clearly there is a sweet-spot here somewhere, but I don't know if it is a good idea to find it?
Is there some leakage when the DB dynamically grows the file?

Are these the numbers I should expect? Or have I broken my system in some way?

My understanding is that at worst case each record might bad fit in half of 1 page or 2 pages,
so the DB would be max 2x the size of the data plus the keys,
plus some allowance for the write transaction?

So how can it be 7 times the data size?

/Gem

Л(

21:42

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I advise you to try mdbx_chk -vvv, mdbx_chk -vvvv, mdbx_chk -vvvvv to get information about the number of pages and their fullness.
The output of mdbx_chk -vvv is small enough to be posted here to get some answers. And then, I think, you can figure it out for yourself.

21:43

Deleted Account

Thanks for the tip!

👍

Л(

6 March 2022

Peter Johnson invited Peter Johnson

7 March 2022

20:49

Deleted Account

Back to my oversized DB 😊
These are 2 consecutive 'live' figures at the point where the DB starts to grow.
I'm thinking that up until the DB overflows then it might be greedy to use new pages?

mdbx_chk v0.11.5-18-g9569b864 (2022-03-05T15:37:32+03:00, T-402e8b2770025a48cb8d4c1551398c0850a6e584)
Running for /dev/shm/cuno/cuno.db.uid.kalimba.1004.1004/persist in 'read-only' mode...
- cooperative mode
- current boot-id ba3a1620c88213e7-ca5ba05782044e9f
- pagesize 16384 (4096 system), max keysize 8124..8166, max readers 2038
- mapsize 4294967296 (4.00 Gb)
- dynamic datafile: 1048576 (1.00 Mb) .. 4294967296 (4.00 Gb), +524288 (512.00 Kb), -0 (0.00 Kb)
- current datafile: 4194304 (4.00 Mb), 256 pages
- meta-0: steady txn#43, stay
- meta-1: steady txn#44, head
- meta-2: steady txn#42, tail
- skip check recent-txn-id with meta-pages (monopolistic or read-write mode only)
- transactions: recent 44, latter reader 44, lag 0
Traversal b-tree by txn#44...
- found 'table_1' area
- pages: walked 111, left/unused 126
- summary: average fill 70.6%, 0 problems
Processing '@MAIN'...
- key-value kind: usual-key => single-value
- summary: 1 records, 0 dups, 7 key's bytes, 48 data's bytes, 0 problems
Processing '@GC'...
- key-value kind: ordinal-key => single-value
- last modification txn#44
- fixed key-size 8
- summary: 3 records, 0 dups, 24 key's bytes, 516 data's bytes, 0 problems
- space: 262144 total pages, backed 256 (0.1%), allocated 237 (0.1%), available 261961 (99.9%)
- skip check used and gc pages (btree-traversal with monopolistic or read-write mode only)
Processing 'table_1'...
- key-value kind: usual-key => single-value
- last modification txn#44
- summary: 3802 records, 0 dups, 60832 key's bytes, 1179040 data's bytes, 0 problems
No error is detected, elapsed 0.001 seconds
10008 10008 800396
mdbx_chk v0.11.5-18-g9569b864 (2022-03-05T15:37:32+03:00, T-402e8b2770025a48cb8d4c1551398c0850a6e584)
Running for /dev/shm/cuno/cuno.db.uid.kalimba.1004.1004/persist in 'read-only' mode...
- cooperative mode
- current boot-id ba3a1620c88213e7-ca5ba05782044e9f
- pagesize 16384 (4096 system), max keysize 8124..8166, max readers 2038
- mapsize 4294967296 (4.00 Gb)
- dynamic datafile: 1048576 (1.00 Mb) .. 4294967296 (4.00 Gb), +524288 (512.00 Kb), -0 (0.00 Kb)
- current datafile: 4718592 (4.50 Mb), 288 pages
- meta-0: steady txn#49, head
- meta-1: steady txn#47, tail
- meta-2: steady txn#48, stay
- skip check recent-txn-id with meta-pages (monopolistic or read-write mode only)
- transactions: recent 49, latter reader 49, lag 0
Traversal b-tree by txn#49...
- found 'table_1' area
- pages: walked 125, left/unused 143
- summary: average fill 71.3%, 0 problems
Processing '@MAIN'...
- key-value kind: usual-key => single-value
- summary: 1 records, 0 dups, 7 key's bytes, 48 data's bytes, 0 problems
Processing '@GC'...
- key-value kind: ordinal-key => single-value
- last modification txn#49
- fixed key-size 8
- summary: 3 records, 0 dups, 24 key's bytes, 584 data's bytes, 0 problems
- space: 262144 total pages, backed 288 (0.1%), allocated 268 (0.1%), available 261948 (99.9%)
- skip check used and gc pages (btree-traversal with monopolistic or read-write mode only)
Processing 'table_1'...
- key-value kind: usual-key => single-value
- last modification txn#49
- summary: 4433 records, 0 dups, 70928 key's bytes, 1339360 data's bytes, 0 problems
No error is detected, elapsed 0.001 seconds
10008 10008 800396

👍

T

20:52

My reading of this is that we have 1.5MB of data+keys, using 75% of 111-125 pages of the capacity, but there are another 126-143 pages free that could be being used, but are not? This gives us our overall 30% utilisation. No stuck readers or other silly errors that I can see?
Is that about right?

Л(

21:42

Леонид Юрьев (Leonid Yuriev)

For the last chunk of output:
- the DB file is 288 pages in size;
- 268 pages are allocated: 125 are used and 143 are left (the GC has 3 records holding 584/4 - 3 = 143 pages);
- the average fill of used pages is 71.3% (which is good enough).

GC pages will be reclaimed/reused in subsequent transactions, but the source pages of CoW'ed pages will in turn be put into the GC.

The circulation of pages into and out of the GC is a big, complex picture.
So for more on how the GC works (aka FreeDB in LMDB), please see Howard Chu's presentation about LMDB internals.
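The page accounting above can be checked with a few lines of arithmetic. This is a sketch under an assumption inferred from the "584/4 - 3" formula: each GC record's data is taken to be a 4-byte length word followed by 4-byte page numbers. The function names are hypothetical.

```cpp
#include <cstddef>

// Pages listed in the GC, assuming every record stores one 4-byte length
// word followed by 4-byte page numbers (inferred from "584/4 - 3").
std::size_t gc_pages(std::size_t gc_data_bytes, std::size_t gc_records) {
  return gc_data_bytes / 4 - gc_records;
}

// Share of the data file occupied by payload (keys + values), in percent.
double payload_percent(std::size_t key_bytes, std::size_t data_bytes,
                       std::size_t file_pages, std::size_t page_size) {
  return 100.0 * static_cast<double>(key_bytes + data_bytes) /
         (static_cast<double>(file_pages) * static_cast<double>(page_size));
}
```

Applied to the two mdbx_chk dumps, the formula gives 126 and 143 GC pages, which together with the 111 and 125 walked pages add up to the reported 237 and 268 allocated pages. For the second dump, keys plus values occupy a little under 30% of the 288-page file, matching the utilisation estimated in the question.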

8 March 2022

00:42

Deleted Account

Agreed the 73% filling of in-use pages is OK. I would not complain if this was the only problem 😊
Ok so the problem was that for these runs I used aggregated write transactions. 150 record insertions hit almost all the pages!
Using a smaller transaction batch reduces this significantly because that has less churn.
Thanks for the clues!
This does not explain why the DB grew so bad for the default page size, but I have stopped using that anyway.

12 March 2022

AS

09:34

Alex Sharov

If use SafeNoSync+syncPeriod - does mdbx do any special madvise after background sync?

Л(

19:26

Леонид Юрьев (Leonid Yuriev)

In reply to this message

No.
Only the common "readahead" hint, following the result of mdbx_is_readahead_reasonable() for the actual DB size.

14 March 2022

KW

20:50

Kai Wetlesen

Hi all, does there exist any command line client by which a user of libmdbx may query a database?

Л(

23:32

Леонид Юрьев (Leonid Yuriev)

In reply to this message

No, there is no such thing.

In general, there are no plans for any development of libmdbx, except for the implementation of already planned features (see ChangeLog.md) as soon as there is free time.

24 March 2022

10:08

Deleted Account

I think we have a bug in cursor::find_multivalue:
    inline cursor::move_result cursor::find_multivalue(const slice &key,
                                                       const slice &value,
                                                       bool throw_notfound) {
      return move(key_exact, key, value, throw_notfound);
    }

It looks like key_exact is the wrong parameter; it should be multi_find_pair.
Am I right?

10:08

@erthink?

Л(

10:16

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Yes, seems like a copy&paste mistake.

https://github.com/erthink/libmdbx/commit/b67074c52057604206eb77dd2301220ee8661bc6

10:22

In reply to this message

Thanks for reporting.

12:08

Deleted Account

No problem

Л(

23:16

Леонид Юрьев (Leonid Yuriev)

https://github.com/erthink/libmdbx/releases/tag/v0.11.6

👍

KW

AA

25 March 2022

KW

01:09

Kai Wetlesen

In reply to this message

New RPM release upcoming for RedHat systems!

👍

Л(

KW

02:32

Kai Wetlesen

So how do we enable MDBX_WRITEMAP mode? Is this something that happens when building libmdbx? Or is this a runtime thing?

Л(

08:51

Леонид Юрьев (Leonid Yuriev)

In reply to this message

- Runtime.
But now it is not required, since the workaround is done.

29 March 2022

Vladislav Shchapov invited Vladislav Shchapov

30 March 2022

Andrei M invited Andrei M

6 April 2022

KW

04:00

Kai Wetlesen

Excellent! So will we still be able to read and write to MDBX databases from old kernels as normal?

AS

04:04

Alex Sharov

Yes

9 April 2022

EI

18:43

Eugene Istomin

Hello,

I've started getting this error, not often, this is the second time so far: "mdbx_txn_begin: (-30420) MDBX_EBADSIGN: Wrong signature of a runtime object(s), e.g. memory corruption or double-free"

Does this mean the transaction was not committed?

Л(

18:49

Леонид Юрьев (Leonid Yuriev)

In reply to this message

The transaction did not begin (was not started).
Most likely some bogus pointer was passed to mdbx_txn_begin().

Check the correctness of your code. If necessary, use Valgrind or ASAN (Address Sanitizer).

EI

18:50

Eugene Istomin

In reply to this message

Thanks!

Л(

18:51

Леонид Юрьев (Leonid Yuriev)

You're welcome.

10 April 2022

AndreiIV invited AndreiIV

EI

14:44

Eugene Istomin

In reply to this message

Figured it out; along the way a question came up about collections and keys, I'll show it with an example:
there are three contexts:
- aa
- bb
- cc

Each differs logically (i.e. in terms of data) from the others; which approach would be more appropriate:

1) Three different collections, with natural keys inside that reflect the data structure
2) One collection, with a salt of the form "aa::" prepended to the key
3) Combine 1 and 2, i.e. segment unrelated areas into separate collections, and inside them link things with the "aa::" salt

Where the question comes from: inside a transaction (transaction, snapshot) one cannot switch to another collection on the fly, i.e. if the iterator needs access to objects of the same logical data domain, the only option left is to switch to synthetic keys with a salt.

14:47

In reply to this message

When switching to another collection on the fly, "Unable to change collection: transaction open" is raised

14:49

In reply to this message

As I understand it, I can be inside a snapshot iterator and from there open many similar snapshots or a single transaction?
That way I don't violate isolation, provided I don't write to the same keys I'm reading

Л(

15:23

Леонид Юрьев (Leonid Yuriev)

1. Judging by "Unable to change collection: transaction open", you are using some bindings.
So there may be additional restrictions imposed by those bindings and/or the corresponding programming language.

2. In read transactions you have full access to the entire database snapshot (which corresponds exactly to the last transaction committed at the moment the read transaction started).
So you can freely read all collections (key->value mappings), including through cursors.

3. In write transactions things are somewhat more complicated, but on the whole libmdbx tracks all data changes and reflects them on all open cursors. Simply put, "everything just works" and you can read/iterate the same collections and keys that you are modifying.
The only subtlety is working with data through pointers directly into the body of the database: in that case, pointers to data located in a "dirty" (already modified) database page may become invalid after any subsequent operation that modifies data.
See mdbx_is_dirty().

4. When mapping a data schema/structure onto key-value, you should strive to minimize both the length of keys and the number of separate collections.
These are conflicting requirements, but pursuing them only makes sense for large volumes of data.
So with small databases (say, under 10 MB) you can do whatever is more convenient, while large databases (gigabytes) always require careful analysis and balanced trade-offs.
Some options and thoughts on the topic are in the article https://habr.com/ru/company/vk/blog/480850/ and its comments.
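
Option 2 from the question (one collection, keys prefixed with a namespace such as "aa::") can be illustrated with a minimal sketch. The helper name make_ns_key and the "::" separator handling are hypothetical and not part of libmdbx or the ruby bindings; the point is only that the prefix becomes part of every key, which is why point 4 above cares about key length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libmdbx function): writes "<ns>::<natural_key>"
 * into out and returns the key length in bytes, or 0 if it does not fit.
 * No NUL terminator is written, because key-value keys carry an explicit
 * length. */
size_t make_ns_key(const char *ns, const char *natural_key,
                   char *out, size_t cap) {
  assert(ns != NULL && natural_key != NULL && out != NULL);
  const size_t ns_len = strlen(ns);
  const size_t key_len = strlen(natural_key);
  const size_t total = ns_len + 2 + key_len; /* 2 bytes for "::" */
  if (total > cap)
    return 0;
  memcpy(out, ns, ns_len);
  memcpy(out + ns_len, "::", 2);
  memcpy(out + ns_len + 2, natural_key, key_len);
  return total;
}
```

Every stored key pays for the prefix bytes, so on large databases a shorter namespace tag is cheaper than a long textual one.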

👍

АМ

EI

15:27

Eugene Istomin

In reply to this message

1. https://gitlab.com/mahlon/ruby-mdbx/-/blob/master/ext/mdbx_ext/database.c#L422

if ( db->txn )
rb_raise( rmdbx_eDatabaseError, "Unable to change collection: transaction open" );

Do I understand correctly that this ruby binding's implementation of switching collections is not quite correct from the mdbx point of view? If so, I'll contact the developer and we'll think it over

15:36

In reply to this message

2. Got it, very cool
3. That's good news too; understood about mdbx_is_dirty
4. https://habr.com/ru/company/vk/blog/480850/#comment_21704514 covers the topic well, thanks

Л(

23:23

Леонид Юрьев (Leonid Yuriev)

Mithril (at right) with brother, 2 kg.
Both make me hurry up with the release of MithrilDB ;)

👍

AM

ED

7

14 April 2022

b

17:27

basiliscos

LIBMDBX_API int mdbx_cursor_get   (   MDBX_cursor *    cursor,
    MDBX_val *    key,
    MDBX_val *    data,
    MDBX_cursor_op    op 
  )     

[in,out]  data  The data of a retrieved item.

tell me, what is the point of data as an input parameter? Mine lives on the stack and contains all sorts of garbage.

AS

17:37

Alex Sharov

In reply to this message

Look at the DupSort feature and the cursor operations related to DupSort

18:03

Deleted Account

there's a small problem: I built it for Android (x86; x86_64; armeabi-v7a; arm64-v8a).
On x86 devices, calling mdbx_env_open returns EINVAL (system error code 22).
size_lower = 0; size_now=0; upper_size=100Mb; page_size=4096(Default);
while on all other architectures the same code works without errors.

18:03

?

18:39

Deleted Account

Could it be connected with the mmap() function?

Л(

20:09

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Presumably some incompatibility and/or deficiency in the libc implementation or in a system call (Android on x86 is rather exotic).
To start with, I suggest using strace to find out which system call returns EINVAL, or whether the problem is somewhere else.

You should also check that the value of __ANDROID_API__ at build time matches the platform (kernel and bionic).

22:09

Deleted Account

Thx

22:09

I'll report back with the results

15 April 2022

EI

02:01

Eugene Istomin

In reply to this message

While working with synthetic composite keys a question came up: is there a call that returns the keys starting with a given combination?

For example, there are keys:
aa::bb::cc
aa::bb::dd
aa::dd::ee

It would be much faster to say "give me the selection that starts with aa::bb" and get the two keys right away, instead of iterating over all keys.
Even a notional find (find keys by mask) would help

02:04

In reply to this message

https://erthink.github.io/libmdbx/functions_f.html
Found it )

I'll add it to the ruby bindings
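
For context, a prefix query on an ordered key-value store is usually done by seeking to the first key that is not less than the prefix and then stepping forward while the keys still start with it. The sketch below shows that pattern on a plain sorted array rather than on a real database; the function names are hypothetical. With libmdbx cursors the same idea would presumably map to a "seek to key or next greater" operation followed by "next" steps, but that mapping is an assumption and is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Index of the first key that is >= prefix in a sorted array: the in-memory
 * analogue of a "seek to this key or the next greater one" cursor step. */
static size_t lower_bound(const char *const *keys, size_t n,
                          const char *prefix) {
  size_t lo = 0, hi = n;
  while (lo < hi) {
    const size_t mid = lo + (hi - lo) / 2;
    if (strcmp(keys[mid], prefix) < 0)
      lo = mid + 1;
    else
      hi = mid;
  }
  return lo;
}

/* Counts the keys that start with prefix and stores the index of the first
 * candidate in *first. Only the matching range is visited. */
size_t prefix_range(const char *const *keys, size_t n, const char *prefix,
                    size_t *first) {
  assert(keys != NULL && prefix != NULL && first != NULL);
  const size_t plen = strlen(prefix);
  size_t i = lower_bound(keys, n, prefix);
  size_t count = 0;
  *first = i;
  while (i < n && strncmp(keys[i], prefix, plen) == 0) {
    ++count;
    ++i;
  }
  return count;
}
```

Because the keys are sorted, all keys sharing a prefix are adjacent, so the scan stops at the first key that no longer matches.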

AM

04:26

Andrei M

https://github.com/erthink/libmdbx returns 404 🤔

Л(

04:28

Леонид Юрьев (Leonid Yuriev)

Github deleted my account (with all repositories) without any warning or explanation.
In this regard, it is worth recalling that for two years now the primary repository has been https://abf.io/erthink/libmdbx (see the end of the README), while github was used for the convenience of users, for CI and for hosting the generated reference pages.

https://gitflic.ru/project/erthink/libmdbx remains a backup for now, but may become the primary one because of its more convenient interface and speed.

04:29

In reply to this message

You got ahead of me while I was writing the previous message.

АМ

04:29

Антон Марфин

It looks like all Positive Technologies repos have been wiped.
https://abf.io/erthink/libmdbx.git returns 404 too

AM

04:30

Andrei M

the correct web link seems to be the one without .git: https://abf.io/erthink/libmdbx

Л(

04:32

Леонид Юрьев (Leonid Yuriev)

In reply to this message

It depends on what it's for: git clone needs the link with .git

👍

AM

АМ

04:35

Антон Марфин

The links to the docs should be fixed too, then.
Pages are being deleted right now. An hour ago I opened a link to a documentation page and navigated through the links. Now it seems only what is saved in my cache still works

KW

04:38

Kai Wetlesen

I’ll update the RPM spec tonight to point at the new remote.

👍

Л(

❤

AV

04:41

Well… once my girlfriend and I can finally exit this dreadful hospital.

👍

Л(

AV

05:25

Artem Vorotnikov

like, retweet
https://twitter.com/vorot93/status/1514791500025978883

AA

10:36

Alexey Akhunov

Is there an English interface for gitflic.ru? Some people say that learning Russian is a bit of a hurdle 🙂

EI

10:41

Eugene Istomin

In reply to this message

Last night I thought it was a bug, but damn ..

МЗ

10:42

Максим Заикин

In reply to this message

funny :)) looks like some kind of clown show (:

10:44

In reply to this message

clowns wiggle Russian developers, I think it's time to learn Russian

AA

10:58

Alexey Akhunov

In reply to this message

I can read Russian, but we have non-Russian-speaking developers on our team. We could of course make a cheat-sheet for the interface, but if it is easy to make an English version alongside the Russian one, that would be great

Л(

11:08

Леонид Юрьев (Leonid Yuriev)

In reply to this message

So far I have not seen any languages in the gitflic interface other than Russian, nor any way of switching.

However, machine translation seems to be possible: https://gitflic-ru.translate.goog/project/erthink/libmdbx?_x_tr_sl=ru&_x_tr_tl=en

AL

11:10

Andrea Lanfranchi

In reply to this message

Yes, true ... for any inner page it works automatically, but for whatever reason the main page does not translate automatically (you have to manually tamper with the arguments in the URL). Just FYI.
For what it's worth ... I'm sorry this happened

Л(

13:05

Леонид Юрьев (Leonid Yuriev)

https://gitflic.ru/project/erthink/libmdbx/issue/1

dubbelosix | Overclock invited dubbelosix | Overclock

Л(

14:47

Леонид Юрьев (Leonid Yuriev)

In reply to this message

gitflic developers plan to release the English-language interface within 1-2 weeks.

👍

AL

МЗ

18:49

Максим Заикин

GitHub, the web service for hosting IT projects, has started blocking accounts of Russian companies; dozens of accounts have been blocked so far

18:49

as a supplement, so to speak (:

18:50

However, in March GitHub declined to block Russian developers, citing that the service "is home for all developers"

16 April 2022

MI

03:00

Marin Ivanov

In reply to this message

Yep, that's what I first thought when I heard about erthink's deleted/disabled account.

M

03:17

Mark

It's kind of ridiculous. We (Americans) are not at war with the Russian people. There's no point in punishing the Russian people with this nonsense. Especially software developers. Companies should stay the hell out of politics. Silicon Valley companies love to get involved in this crap. 😡

I wonder if there is an email address I can complain to at GitHub

👍

ED

Colossus Data Company invited Colossus Data Company

17 April 2022

AV

01:38

Artem Vorotnikov

In reply to this message

it seems to me that if Windows and macOS aren't tested anyway, the crutches for supporting them could be removed altogether, until better times or even for good

Deleted invited Deleted Account

11:11

Deleted Account

Please don't

11:11

I had plans for it here

Л(

12:25

Леонид Юрьев (Leonid Yuriev)

In reply to this message

libmdbx operation on Windows and macOS has been tested well enough.
Many users already rely on support for these OSes in libmdbx.
"Crutches" were indeed added, but only for Windows (because instead of a normal mmap() there is a mass of restrictions).

So on the one hand support for these OSes cannot be removed, and on the other hand removing it makes no sense.

👍

ED

AV

b

13:27

basiliscos

In reply to this message

uh-uh. Could you elaborate? Will the Windows and Mac versions be dropped, or will they stay supported?

Л(

14:38

Леонид Юрьев (Leonid Yuriev)

In reply to this message

As I already answered above, nothing will be removed deliberately.

Right now all the mechanisms that were used for Continuous Integration are broken/unavailable, and restoring them takes (at the very least) time.
For open-source platforms there are ready-made solutions, while the proprietary, sanctions-affected ones (Windows and OSX/iOS) bring additional difficulties and risks:
- officially they may turn out to be unavailable for use, including for CI.
- licensing and/or subscription payment questions have to be solved somehow.

In turn, the lack of CI complicates the development and release cycle.
This is NOT A REASON to remove working code and/or create problems for users, but it is a reason to support Windows/OSX/iOS on request: something broke => an issue is filed => fixed => verified => merged into master.

b

15:12

basiliscos

👍👍

18 April 2022

AV

05:18

Artem Vorotnikov

@erthink Why did you give up on crypto donations? Crypto stays neutral and does not obey the cockroaches in the heads of the NATO allies 😉

Л(

10:52

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Crypto started being blocked back in March. Not only withdrawals, but also (for example) trades/orders on exchanges.
On the one hand, you can have a wallet and receive transfers, but you can no longer get any use out of it.
In addition, there are problems with paying taxes: either openly break the law, or go through a lot of hassle with no guarantee that regulators won't raise claims.

AV

10:56

Artem Vorotnikov

In reply to this message

Exchanges are not crypto; they are the same thing as a bank, just seen from another angle. Besides exchanges, you can cash out to rubles through various exchangers, or pay for various services with crypto, which is super relevant given the departure of Visa and Mastercard.

As for taxes, as far as I know there is no case law, nor any ability or even desire on the part of the FNS (Federal Tax Service) to administer crypto taxes.

16:55

Deleted Account

Regarding Android on x86

16:55

28374 fstatat64(AT_FDCWD, "/data/user/0/com.companyname.libmdbx.android.samples/files/bingo_mdbx_new/mdbx-test.db", {st_mode=S_IFREG|0600, st_size=0, ...}, 0) = 0
28374 ioctl(0, TCGETS, 0xffef0e58) = -1 ENOTTY (Not a typewriter)
28374 ioctl(1, TCGETS, 0xffef0e58) = -1 ENOTTY (Not a typewriter)
28374 ioctl(2, TCGETS, 0xffef0e58) = -1 ENOTTY (Not a typewriter)
28374 openat(AT_FDCWD, "/data/user/0/com.companyname.libmdbx.android.samples/files/bingo_mdbx_new/mdbx-test.db", O_RDWR|O_LARGEFILE|O_CLOEXEC) = 49
28374 ioctl(0, TCGETS, 0xffef0e58) = -1 ENOTTY (Not a typewriter)
28374 ioctl(1, TCGETS, 0xffef0e58) = -1 ENOTTY (Not a typewriter)
28374 ioctl(2, TCGETS, 0xffef0e58) = -1 ENOTTY (Not a typewriter)
28374 openat(AT_FDCWD, "/data/user/0/com.companyname.libmdbx.android.samples/files/bingo_mdbx_new/mdbx-test.db", O_WRONLY|O_DSYNC|O_LARGEFILE|O_CLOEXEC) = 56
28374 fstat64(49, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
28374 ioctl(0, TCGETS, 0xffef0dc8) = -1 ENOTTY (Not a typewriter)
28374 ioctl(1, TCGETS, 0xffef0dc8) = -1 ENOTTY (Not a typewriter)
28374 ioctl(2, TCGETS, 0xffef0dc8) = -1 ENOTTY (Not a typewriter)
28374 openat(AT_FDCWD, "/data/user/0/com.companyname.libmdbx.android.samples/files/bingo_mdbx_new/mdbx-test.db-lck", O_RDWR|O_CREAT|O_LARGEFILE|O_CLOEXEC, 0600) = 59
28374 sched_yield() = 0
28374 fcntl64(59, F_OFD_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=4294967296, l_len=4113621231163408384}) = -1 EINVAL (Invalid argument)
28374 fcntl64(59, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0
28374 fstat64(49, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
28374 fstat64(59, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
28374 fcntl64(49, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=-65536}) = -1 EINVAL (Invalid argument)

👍

DH

16:56

@erthink?

16:57

What value should be used for mode in mdbx_env_open?

16:58

I tried 640

16:58

And 0

Л(

17:01

Леонид Юрьев (Leonid Yuriev)

In reply to this message

That has nothing to do with it, at all.
Judging by the log, the kernel refuses to accept 64-bit arguments for fcntl64(F_OFD_SETLK).

17:03

Deleted Account

Could this be because I'm on an emulator?

Л(

17:09

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I can't answer yet. We need to understand what exactly the kernel dislikes about the parameters.

Л(

17:48

Леонид Юрьев (Leonid Yuriev)

Here is what is observed:

1. fcntl64(59, F_OFD_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=4294967296, l_len=4113621231163408384}) => EINVAL
This is an attempt to use F_OFD_SETLK, and EINVAL is quite possible here.
However, the arguments l_start=4294967296 and l_len=4113621231163408384 are wrong; there are no such values in the source code.
It should be fcntl64(fd, F_OFD_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}).

2. Because of the F_OFD_SETLK error, a fallback to F_SETLK occurs:
fcntl64(59, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) => OK
Here everything works fine, and the arguments are already normal, although in the code this is literally the same place.

3. fcntl64(49, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=-65536}) => EINVAL
Here FFFF0000 turned into -65536.
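
One possible reading of these numbers, as a minimal sketch: assuming a little-endian 32-bit structure in which a 32-bit l_start is immediately followed by a 32-bit l_len (this layout is an assumption, not something stated in the chat), reading l_start=0 and l_len=1 back as a single 64-bit field gives exactly 4294967296, and the bit pattern FFFF0000 read as a signed 32-bit integer gives -65536. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Value obtained when two adjacent little-endian 32-bit fields (lo first in
 * memory, then hi) are read back as one 64-bit field. */
uint64_t fuse_le32_pair(uint32_t lo, uint32_t hi) {
  return (uint64_t)lo | ((uint64_t)hi << 32);
}

/* Interprets a 32-bit pattern as a two's-complement signed value, spelled
 * out arithmetically so it does not depend on conversion rules. */
int64_t as_signed32(uint32_t bits) {
  return bits < UINT32_C(0x80000000)
             ? (int64_t)bits
             : (int64_t)bits - INT64_C(0x100000000);
}

/* Reproduces the two suspicious values from the trace. */
void check_trace_values(void) {
  assert(fuse_le32_pair(0, 1) == UINT64_C(4294967296)); /* l_start */
  assert(as_signed32(UINT32_C(0xFFFF0000)) == -65536);  /* l_len */
}
```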

17:48

Check how the build environment defines the macros _LARGEFILE_SOURCE, _LARGEFILE64_SOURCE and _FILE_OFFSET_BITS.
As an option, try adding the following to the beginning of src/osal.h:
#define _LARGEFILE_SOURCE
#define _LARGEFILE64_SOURCE
#define _FILE_OFFSET_BITS 64

17:59

Deleted Account

Thx

17:59

I'll try

17:59

I'll report back

19 April 2022

Mks invited Mks

Deleted invited Deleted Account

17:49

Deleted Account

In reply to this message

Tried it, didn't help

17:50

https://android.googlesource.com/platform/bionic/+/master/docs/32-bit-abi.md

17:51

The problems seem to be described here, but everything appears to be done accordingly

17:54

@erthink does it matter where I compile for Android, on Windows or on Linux?

Л(

17:55

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I ran some experiments, but with this information I'll rework them a bit.

17:56

In reply to this message

There should be no difference if the build is configured and runs correctly, but right now there is no certainty about that.

17:58

Deleted Account

It works on all the other platforms, and I use nothing but the ndk and cmake

Л(

18:00

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Try taking https://gitflic.ru/project/erthink/libmdbx/commit/593d025ed1f7fd661a82958a1f2752c24d364b69 (i.e. the devel branch but without the latest commits) and building it with assertions enabled (-DCMAKE_BUILD_TYPE=Debug or -DMDBX_FORCE_ASSERTION=ON).

18:02

Deleted Account

Ok

18:04

Maybe I should ask in English whether I'm the only one here for whom x86 on Android doesn't work

G

18:05

G

In reply to this message

are there really many x86 Android devices?

18:06

Deleted Account

Not many, but there are some

18:06

The problem is

18:06

That the store doesn't allow uploading x86_64 without x86

18:07

Guys, has anybody compiled libmdbx for Android x86?

Л(

18:08

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Given that, the strace logs show oddities that seem to explain everything.
The logs show the fcntl64() system call, which exists only on 32-bit platforms and must accept 64-bit offsets.
Accordingly, if the arguments for this system call are prepared as for the 32-bit variant, the result is exactly the observed problem (garbage in part of the arguments).

18:09

In short, I'll now explicitly add a check for a 32-bit off_t on 32-bit Android.
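
As an illustration only (this is not the check that went into libmdbx, whose exact form is not shown in the chat), the width of off_t that a translation unit actually ends up with can be inspected as in the sketch below. The function names are hypothetical; _FILE_OFFSET_BITS is the standard feature-test macro discussed above.

```c
#define _FILE_OFFSET_BITS 64 /* must come before any system header */
#include <assert.h>
#include <limits.h>
#include <sys/types.h>

/* Returns 1 when off_t can hold 64-bit file offsets in this build,
 * 0 otherwise (e.g. when the libc ignores _FILE_OFFSET_BITS). */
int off_t_is_64bit(void) {
  return sizeof(off_t) * CHAR_BIT >= 64;
}

/* Aborts in debug builds if the build silently fell back to 32-bit off_t. */
void require_64bit_off_t(void) {
  assert(off_t_is_64bit());
}
```

A compile-time variant (for example a static assertion on sizeof(off_t)) would catch the mismatch earlier, at the cost of refusing to build on such targets.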

G

18:09

G

In reply to this message

which compiler do you build with? the one from the NDK? which NDK version?

18:11

Deleted Account

Version 21.4

G

18:11

G

In reply to this message

have you tried other versions?

18:11

Deleted Account

Not yet

G

18:15

G

In reply to this message

are you uploading an apk or an app bundle to google play?

18:17

Deleted Account

Apk

G

18:18

G

In reply to this message

have you tried an app bundle? and specifying only x86_64 without x86 in build.gradle?

S

18:58

Sproul

A Lighthouse user reported an assert fail on libmdbx 0.11.6.4:

lighthouse: mdbx:18954: mdbx_cursor_put: Assertion `(char *)olddata.iov_base + olddata.iov_len <= (char *)(mc->mc_pg[mc->mc_top]) + env->me_psize' failed.

Is this a known issue fixed in a more recent commit? I think it's distinct from the assert that was fixed for Erigon recently.

Л(

19:24

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Sure, this has nothing to do with the recent fix.
Presumably this is the result of database corruption due to a RAM or disk failure, as has already happened in several recent cases.
To give a more definite answer, I need to get this database (if it currently fails verification by mdbx_chk).

Please try to register and file an issue at https://gitflic.ru/project/erthink/libmdbx/issue
There is no English language support yet, but the developers promise to add it in 1-2 weeks.

19:24

In reply to this message

+ Consider using https://www.memtest86.com/ to check RAM, etc.

19:26

In reply to this message

I found the cause of the problems.
I'll try to fix it today.

S

19:54

Sproul

Thanks 🙏

Л(

20:06

Леонид Юрьев (Leonid Yuriev)

In reply to this message

https://gitflic.ru/project/erthink/libmdbx/commit/eb8bc865d120676f3a83bf03d1d6df975cffb155

Try it, it should work.

Л(

20:26

Леонид Юрьев (Leonid Yuriev)

In reply to this message

The problem consisted of two parts:

1. A long time ago I added an explicit #define _FILE_OFFSET_BITS 64 under #ifndef _FILE_OFFSET_BITS, so long ago that I forgot about it.
This eliminated problems on old systems and was not supposed to break anything.
On systems without off64_t this definition should be ignored, or in the worst case lead to an error/warning that is cured by defining _FILE_OFFSET_BITS in the build parameters.

2. On Android, however, the strangest and most fragile variant is observed:
- the 32-bit kernel supports 64-bit file offsets and sizes, including fcntl64().
- with _FILE_OFFSET_BITS=64 everything builds (without warnings!) and the 64-bit system call is even used, but it is passed a pointer to the 32-bit version of the structure (i.e. to bogus parameters).

22:37

Deleted Account

Hi There!

I have been investigating an issue that some records are created in the database at alignments of 4 bytes,
but we have fields in our records that have 8 byte alignments (double, int64_t, etc).
This causes problems with SIMD optimisation and with ARM64, also slightly slower on x86.
If I memcpy the records out of the database to read them, then this makes the code significantly slower.
Also I want to use mdbx_cursor_put(MDBX_RESERVE) so I can construct records faster direct in the DB.

Of course, I make sure /all/ our DB records and indexes are %8 size.

The particular case I discovered is when a record is too big for the page: F_BIGDATA.
In this case the key is stored on the "top page" as MDBX_node+key+pgno_t
The data is stored in a new largepage at MDBX_page::mp_ptrs

But mp_ptrs is at %4 offset because the previous field of MDBX_page is "pgno_t mp_pgno".

Also MDBX_node+key+pgno_t is %4 offset because pgno_t is %4. (MDBX_node is good %8).
When the next (smaller) record is inserted, the offset is %4 for that record,
even though its size MDBX_node+key+data is %8.

It seems the "obvious" fix is to make pgno_t into uint64_t,
but this has many second effects in the code.

Of course: it breaks compatibility with other MDBX tools I did not compile, but that is acceptable for me :-).

I don't mind fixing some things to make it work (use PRIaPGNO in printf), but I see problems so some design is needed. Please help me consider these options?

1) If I also change atomic_pgno_t then I bang STATIC_ASSERT(sizeof(MDBX_reader) == 32);
I could reduce mr_tid to 32 bits?

2) If I do not change atomic_pgno_t then there is the risk they will mismatch? I could keep "#define MAX_PAGENO UINT64_C(0x7FFFffff)"

3) I could change pgno_t to "struct { int32_t value, filler; }" - will produce many cosmetic ".value" changes, but is the safest change?

4) I can try to understand where the code assumes "MDBX_node+key+pgno_t", but I know I don't understand all the code.
For example I worry that if pages are split or merged then it will not copy these gaps correctly.
Similarly I don't know all the places where F_BIGDATA assumes MDBX_page::mp_ptrs.
Also I don't use duplicate records, so I did not try to understand that code.

If you have a preferred fix, or suggest another fix I could try? I can branch or offer a patch when we are friends again on github.

Thank you for your help in advance

/Gem

22:46

Deleted Account

In reply to this message

I'll check tomorrow, thanks a lot for the help

Л(

23:46

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Oh, a lot of questions...

I will try to answer briefly, but giving you the most important information.

1. Pages are always filled with keys and data from the end, as MDBX_node instances. Therefore, to get the right alignment it is necessary and sufficient that (sizeof(MDBX_node) + node_key.length + node_data.length) % alignment == 0 for each item.
1.1 This also holds for F_BIGDATA nodes, where node_data.length == sizeof(pgno_t)
1.2 P_LEAF2 pages are the major exception, but that does not seem to be your case.

2. I think you are somewhat overestimating the impact of alignment.
2.1 Modern compilers optimize away unnecessary copying well for short fixed-length items, using instructions for accessing unaligned data where possible.
And you can greatly help the compiler by hinting the assumed alignment with std::assume_aligned<>, etc.
2.2 When searching over keys, the influence of transitions far exceeds the influence of alignment, as long as it does not come to an unaligned-access exception.
2.3 Long data will be aligned to the page border, and short data will have no effect and/or will be cached in the CPU register(s).
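
The condition in point 1 can be turned into a small helper that reports how many padding bytes an item needs. This is a sketch under stated assumptions: the header size is passed in as a parameter rather than taken from libmdbx (the 8-byte figure used in the example comes from the "MDBX_node is good %8" remark earlier in the thread), and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of extra bytes an item needs so that
 * (header + key_len + data_len + padding) % alignment == 0. */
size_t padding_for_alignment(size_t header, size_t key_len, size_t data_len,
                             size_t alignment) {
  assert(alignment != 0);
  const size_t rem = (header + key_len + data_len) % alignment;
  return rem ? alignment - rem : 0;
}
```

For example, with an 8-byte header, an 8-byte key and a 4-byte data part (the F_BIGDATA case from point 1.1, where the data part is a page number), 4 more bytes are needed for 8-byte alignment; since the data part is fixed there, the padding would have to go into the key.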

23:48

Deleted Account

Hi Leonid. Currently we just get crashes on ARM. yes, we could use the unaligned data types in c++20, except we are still only using c++11.

23:48

If you are not that interested then we will just branch.

23:50

Similar with SIMD. The only option is to disable SIMD optimisation on all modules that interface with the DB. That is what we currently do.

20 April 2022

Л(

00:16

Леонид Юрьев (Leonid Yuriev)

For now I doubt that the micro-optimizations that you want to do will take a noticeable effect.
Of course, something will be a little faster, but you will spend time and may suddenly worsen something.

I think you will be pleasantly surprised if you make a lightweight C++ wrapper over MDBX_val with hints about the assumed alignment and/or with copying of fixed-length fields for alignment.
It is not necessary to use C++20 for this; all modern compilers have long provided the means to do this, or accept some tricks.
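
The "copying fixed-length fields" variant can be sketched in plain C as follows. This is a generic technique rather than libmdbx code, and the function names are hypothetical: memcpy into a local variable is a well-defined way to read a value from an address of unknown alignment, and compilers typically reduce it to a single load where the target allows unaligned access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads a uint64_t from a possibly unaligned address without undefined
 * behaviour; the copy is normally optimized into a plain load. */
uint64_t load_u64_unaligned(const void *src) {
  assert(src != NULL);
  uint64_t value;
  memcpy(&value, src, sizeof(value));
  return value;
}

/* Returns nonzero when ptr is aligned to `alignment` bytes; alignment must
 * be a power of two. */
int is_aligned(const void *ptr, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return ((uintptr_t)ptr & (alignment - 1)) == 0;
}
```

Note that this only covers ordinary field reads; it does not make a read atomic.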

However, in any case I have (for a long time) been aiming at stabilisation of libmdbx rather than adding features, to have more time for other projects, including MithrilDB.
In other words, there are a lot of things that can be improved and redone in libmdbx, but then it would turn into MithrilDB ;)

Л(

01:53

Леонид Юрьев (Leonid Yuriev)

The origin repository of the project has been moved to https://gitflic.ru/project/erthink/libmdbx, since on April 15, 2022 the Github administration, without warning and without explanation, deleted libmdbx along with a lot of other projects, simultaneously blocking access for many developers. For the same reason Github is blacklisted forever.

In case it was an accident or a mistake, we waited 5 days (three working days), but no miracle happened. So Github has died, as have many declared liberal values (freedom of speech, presumption of innocence and right to trial, inviolability of the person and private property, etc).

https://gitflic.ru/project/erthink/libmdbx/commit/1a471ed04b12d90514d37d95af3316a59503d943

GK

04:16

Gleb K.

In reply to this message

Probably because positive technologies is under sanctions.
It is of course a little tempting to ask why people who don't support liberal values get so upset when others violate them, but that's beside the point.
In any case, thanks for an excellent library and the huge amount of work put into it!

💩

AV

ED

13:37

Deleted Account

Hmm the other problem we have with arm is we have some 64bit atomics in the records - our own transaction event counter and expiry time.
We have MDBX_WRITEMAP, and for our own sanity we generally only set the atomics inside write transaction, but we read them in read transactions, so atomic read is important. On arm that has to be aligned.

Л(

14:11

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Of course you can do it, but it's a somewhat strange design from a (my) theoretical point of view.

Changes to such atomics must be synchronized with writing transactions, otherwise the updates may be lost/overwritten.
Moreover, such changes should be synchronized with mdbx::put operations, since each of them can move a specific element (i.e. change its actual RAM address).

Therefore, it looks like a non-optimal, fragile design born of the desire to use atomic types where they are useless (from the point of view of the temporal topology of data operations).
I think it would be better to use a separate/independent lock-free data structure with cache-aligned atomics, which is periodically converted/merged/flattened to a linear representation and then stored in a DB.

16:27

Deleted Account

In reply to this message

No, it didn't help; now mdbx_env_open crashes the process altogether. Log https://ctxt.io/2/AABgf929EA

Л(

16:47

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Oops, that's my bug: I overdid it and under-tested.

Revert the commit https://gitflic.ru/project/erthink/libmdbx/commit/befb8dd2033a0aa4ecc300935802d7de134038fa
git revert befb8dd2033a0aa4ecc300935802d7de134038fa

17:00

Deleted Account

Yes, perhaps it is not the best design for a database, but we are using the DB engine mainly to perform managed IPC between parallel processes, and also to provide temporal caching between consecutive process runs.
I'm aware that records can move when new records are being put, and the transactional nature means the reader process won't see the latest atomic value change if the writer relocates, but it will see a valid and safe value. Mostly, the actual value visible to a client at any moment is not so important, but it must be a valid value, not a torn value. Clearly it is possible to spin-lock against torn values, but that is also potentially fragile code 😊

17:10

Deleted Account

In reply to this message

No problem, thx

18:22

Deleted Account

Works, @erthink thx for the help.

👍

Л(

20:27

Леонид Юрьев (Leonid Yuriev)

In reply to this message

The variant with the commit reverted may hang on 32-bit Android because of problems in bionic.
The final fix will come tomorrow.

KW

21:50

Kai Wetlesen

What is the current latest release available from gitflic.ru? I see tags corresponding to release versions up to 0.9.3

21:51

Last version I have a version for is v0.11.6

Л(

21:51

Леонид Юрьев (Leonid Yuriev)

In reply to this message

All tags are available.
v0.11.7 is scheduled for 2022-04-22

KW

21:54

Kai Wetlesen

Awesome, thank you Leonid!

21:57

Is there an amalgamated source download available for the gitflic version?

Л(

21:59

Леонид Юрьев (Leonid Yuriev)

No.
I decided that it was better to focus on releasing the next release using new tools, rather than restoring the old one.

👍

KW

Deleted invited Deleted Account

23:21

Deleted Account

hello, is there a plan to fix x86, 32 bit issue?

Л(

23:22

Леонид Юрьев (Leonid Yuriev)

In reply to this message

For Android/Bionic?
- Yes

23:23

Deleted Account

is there a timeline for that?

Л(

23:25

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Sure before v0.11.7 (scheduled for 2022-04-22)
just git revert befb8dd2033a0aa4ecc300935802d7de134038fa if you're in a hurry.

23:27

Deleted Account

Awesome.

21 April 2022

Л(

00:04

Леонид Юрьев (Leonid Yuriev)

Online docs restored at https://libmdbx.website.yandexcloud.net/
The libmdbx.dqdkfa.ru domain is still pending (waiting for TLS Certificate).

👍

SC

ED

5

❤

1

KW

01:39

Kai Wetlesen

The libmdbx RPM hosted in Fedora COPR is now updated. I’m just manually generating and posting amalgamated sources for the build and linking back to GitFlic until I can figure out a less stupid way to do this.

👍

Л(

❤

01:40

Would it be possible to simply upstream the RPM spec to the project?

chriss invited chriss

22 April 2022

Л(

11:20

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Yes.
Please try to make a merge request at GitFlic.

12:21

Deleted Account

In reply to this message

About this: is the update available now?

Л(

13:17

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Not ready yet.
I am busy with CI and with fixing minor shortcomings.

👍

SC

VS

16:46

Vladislav Shchapov

Leonid, could you please tell me why https://abf.io/erthink/t1ha has not moved to https://gitflic.ru/user/erthink?

Л(

16:46

Леонид Юрьев (Leonid Yuriev)

In reply to this message

It has moved, but I need to write to support so that they grant the right to make it public.

16:47

There is a copy on abf.io

16:52

Nothing has been updated in t1ha for a long time.
A new version was planned, but uncertainties remained regarding the seventh e2k architecture.
In particular, support for Magma, Kuznyechik and Streebog via dedicated instructions was discussed there, and the direction of t1ha's development fundamentally depends on that.
Roughly speaking, it's either competing with wyhash or looking towards e2kv7.

VS

16:54

Vladislav Shchapov

Thanks for the answer! I thought their flag for the right to create public repositories was granted per account, not per repository.

16:55

Alas, I have x86 everywhere for now. The only Elbrus I have is a souvenir processor.

Л(

16:57

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Well, I registered back when these rights were granted right away, but at that time I simply didn't create a repo for t1ha.

МЗ

17:03

Максим Заикин

In reply to this message

interesting, I somehow completely missed that wyhash exists; are there any materials on collisions anywhere?

Л(

17:11

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Well, it's trivially googleable.

МЗ

17:22

Максим Заикин

In reply to this message

searching :)

SC

20:22

Simon C.

Do I understand the changelog of v0.11.7 correctly: mdbx_env_set_geometry() is no longer idempotent and the env needs to be discarded if it fails?

Л(

20:26

Леонид Юрьев (Leonid Yuriev)

In reply to this message

No.
What exactly was the reason for such thought?

20:29

Maybe I should clarify the changelog, etc?

SC

21:21

Simon C.

"Switched to using MDBX_EPERM instead of MDBX_RESULT_TRUE to indicate that the geometry cannot be updated."

21:21

But I understand what it means now 🙂

👍

Л(

23 April 2022

Л(

00:32

Леонид Юрьев (Leonid Yuriev)

libmdbx 0.11.7 (Resurrected Sarmat)
https://gitflic.ru/project/erthink/libmdbx/release/90ec9985-cd60-4d9a-8c98-8417506fd26d

+++ link was updated !!!

🔥

AA

t

06:05

timothy

In reply to this message

It can no longer be downloaded without registration.

AA

09:17

Alexey Akhunov

It works fine for me without registration: git clone https://gitflic.ru/project/erthink/libmdbx.git

t

09:27

timothy

I mean the releases, e.g. for Bazel, which are needed to build the amalgamated versions.

09:57

Deleted Account

@leisim https://gitflic.ru/project/erthink/libmdbx/issue/2 issue is fixed.

Л(

12:02

Леонид Юрьев (Leonid Yuriev)

In reply to this message

I noticed the problem only when you pointed it out, and then spent a while figuring out how to fix it.
As a result, the files are hosted separately (next to the auto-generated online documentation), with links from the release description.
But the release had to be recreated, which is why its own link changed (

👍

t

Л(

12:44

Леонид Юрьев (Leonid Yuriev)

Once again:
- unexpectedly, it turned out that the archives attached to the release could be downloaded only after GitFlic registration/authorization;
- the v0.11.7 release was republished today to make the archives downloadable without authorization at GitFlic, but as a consequence the release ID/URL has changed;
- there are still no restrictions on access to libmdbx.

https://gitflic.ru/project/erthink/libmdbx/release/90ec9985-cd60-4d9a-8c98-8417506fd26d

👍

SC

20:36

Simon C.

Github mirror until Gitflic properly supports English 🙂 https://github.com/isar/libmdbx

👍

A

22:25

Aleksei🐈

Does anyone have a link to a ready-made simple example of using the C++ wrapper?

22:39

As I understand it, the mdbx::env constructor must be given a pointer to an already created MDBX_env instance?

Л(

22:42

Леонид Юрьев (Leonid Yuriev)

In reply to this message

No.
Always start with instances of the managed classes, and obtain unmanaged ones from them when necessary.

I.e. create an mdbx::env_managed first.

A

22:42

Aleksei🐈

Aha, ok, thanks.

Л(

22:59

Леонид Юрьев (Leonid Yuriev)

I made several fixes for Doxygen and updated the online documentation.
It has gotten better, especially for C++.
https://libmdbx.website.yandexcloud.net/group__cxx__api.html

24 April 2022

Л(

13:55

Леонид Юрьев (Leonid Yuriev)

In reply to this message

GitFlic already supports Russian and English, and plans to support more languages, including 中文.
You are welcome!

SC

16:16

Simon C.

Does anyone know how to fix this Linux arm64 build error? Windows and Mac arm64 work fine

/usr/include/stdint.h:26:10: fatal error: 'bits/libc-header-start.h' file not found

Л(

16:24

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Evidently your system and/or toolchain is broken or inconsistent for this task, but there are too many possible reasons.
At the very least you should know enough about cross-compilation.

For instance see https://gitflic.ru/project/erthink/libmdbx/blob?file=GNUmakefile&commit=ce229c750082d642d94e2a89b161acb439b0f8b5#line-num-696

👍

SC

25 April 2022

Artyom Vorobyov invited Artyom Vorobyov

Deleted invited Deleted Account

13:55

Deleted Account

Good afternoon, everyone. Leonid, I wrote to you on VK half an hour ago. I did what you said, but I keep getting the error: No tags can describe 'b7ed67543fefb0878dba1c70dea2a81201041314'. Try --always, or create some tags. Sorry for being a nuisance. I have not slept for a long time, so forgive my slowness.

Л(

14:08

Леонид Юрьев (Leonid Yuriev)

In reply to this message

After moving to GitFlic I deleted the old tags as a temporary workaround for a usability problem in the GitFlic web interface.
This should not have caused problems, because:
- new clones and/or new users do not need the old tags;
- the old tags remain in repositories that were already cloned;
- besides, removing the old tags helps to reveal stale/forgotten submodules (which is probably your case).

However, you are trying to build a very old version (2019), and your local repository has no old tags.

You just need to update the branch used in your local copy of the repository and/or fast-forward the submodule pointer.
For example: git checkout master && git reset --hard origin/master.

14:17

Deleted Account

In reply to this message

I will try, thank you Leonid. 👍

26 April 2022

Kirill Temnenkov invited Kirill Temnenkov

Л(

15:15

Леонид Юрьев (Leonid Yuriev)

@vorot93, @AskAlexSharov and anybody from SilkWorm.

Pay attention to the v0.11.7 release (https://gitflic.ru/project/erthink/libmdbx/release/90ec9985-cd60-4d9a-8c98-8417506fd26d )
There were no bugs after v0.11.6-6-g18789654 from 2022-03-27, i.e. this version is enough for stable operation.

However:
1. SilkWorm uses v0.11.6 without the last fix d5220690 and should be updated.
2. I want to make sure there are no troubles after migrating to GitFlic.

AS

15:20

Alex Sharov

Thank you. No troubles.

18:11

Deleted Account

No trouble !

👍

Л(

Wtz_LASR invited Wtz_LASR

28 April 2022

Deleted invited Deleted Account

17:31

Deleted Account

mdbx_env_get_maxreaders: is it possible that this function returns different values on different OSes? For example, it returns 120 on Windows and 116 on Mac.

Л(

17:34

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Yes.

17:34

Deleted Account

Ok

Л(

17:41

Леонид Юрьев (Leonid Yuriev)

Actually it depends on MDBX_LOCKING:
https://gitflic.ru/project/erthink/libmdbx/blob?file=src%2Foptions.h#line-num-214

30 April 2022

aarvay invited aarvay

2 May 2022

SC

09:18

Simon C.

Is there a way to not include this data in release binaries? https://ibb.co/xX366xh

Л(

10:04

Леонид Юрьев (Leonid Yuriev)

In reply to this message

It seems you are producing a build with assertions enabled (i.e. not disabled by -DNDEBUG), aka a debug build.
https://en.cppreference.com/w/c/error/assert

To build libmdbx, the GNUmakefile and CMakeLists.txt are provided for GNU Make and CMake respectively.
By default these scripts produce a release build with assertions disabled and compiler optimization enabled, unless additional flags/options are specified.
If you build in a different way, you should take care of the necessary flags/options yourself (i.e. enable/disable compiler optimization, assertion control, etc).

In addition, you may want to use the strip utility to remove debug information.

SC

10:22

Simon C.

I do use the CMakeLists (through a rust build script) and stripping is not supported for iOS afaik. Let me do some more research why these symbols are included 🤔

Л(

10:23

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Please, show the ./mdbx_chk -V output.

👍

SC

10:32

Simon C.

mdbx_chk version 0.11.7.0
- source: v0.11.7-0-g40ec559 2022-04-22T23:28:56+03:00, commit 40ec559c8c1e0b9794cdc0cd12c155d05e7e8547, tree 6fdefd6844bd6d212ea6dec99892ae1fac981521
- anchor: f0dd6710aa0a5b973b65ceaa06a09586a416662a062842f6a0955cb97ec29e1a_v0_11_7_0_g40ec559
- build: 2022-05-02T09:31:05+0200 for arm64-apple-darwin21.4.0 by Apple clang version 13.1.6 (clang-1316.0.21.2.3)
- flags: -DNDEBUG=1 -std=gnu++2b -std=gnu11 -O2 -g -Wall -Werror -Wextra -Wpedantic -ffunction-sections -fPIC -fvisibility=hidden -pthread -Wno-error=attributes -fno-semantic-interposition -Wno-unused-command-line-argument -Wno-tautological-compare -lm
- options: MDBX_DEBUG=0 MDBX_WORDBITS=64 BYTE_ORDER=LITTLE_ENDIAN MDBX_ENV_CHECKPID=AUTO=1 MDBX_TXN_CHECKOWNER=AUTO=1 MDBX_64BIT_ATOMIC=AUTO=1 MDBX_64BIT_CAS=AUTO=1 MDBX_TRUST_RTC=AUTO=1 MDBX_ENABLE_REFUND=1 MDBX_ENABLE_MADVISE=1 _GNU_SOURCE=NO MDBX_OSX_SPEED_INSTEADOF_DURABILITY=0 MDBX_LOCKING=AUTO=2001 MDBX_USE_OFDLOCKS=AUTO=0 MDBX_CACHELINE_SIZE=64 MDBX_CPU_WRITEBACK_INCOHERENT=1 MDBX_MMAP_INCOHERENT_CPU_CACHE=0 MDBX_MMAP_INCOHERENT_FILE_WRITE=0 MDBX_UNALIGNED_OK=4 MDBX_PNL_ASCENDING=0

Л(

10:56

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Ok.
As you can see, some error messages are still present even in this build.
Some of these messages correspond to the mdbx_ensure() checks (git grep mdbx_ensure |wc => 117 564 7830), which I prefer to keep even in release builds.
The rest are log messages for the following cases:
1) incorrect data is detected, i.e. various kinds of database damage;
2) erroneous and/or noteworthy rare situations.
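
The difference between assert() and an always-on check can be sketched in plain C. This is only an illustration under stated assumptions: ENSURE, checked_div and assert_evaluation_count are hypothetical names, and the macro is not libmdbx's actual mdbx_ensure() implementation. With NDEBUG defined, assert() expands to nothing, so neither its condition nor its message text ends up in the binary, while an always-on check keeps both.

```c
/* -DNDEBUG on the compiler command line has the same effect as this define:
 * it must be visible before <assert.h> is included. */
#define NDEBUG 1
#include <assert.h>
#include <stdio.h>

/* Hypothetical always-on check, NOT libmdbx's mdbx_ensure(): both the
 * condition and the message text stay in the binary in release builds. */
#define ENSURE(cond)                                                      \
  do {                                                                    \
    if (!(cond)) {                                                        \
      fprintf(stderr, "ensure failed: %s (%s:%d)\n", #cond, __FILE__,     \
              __LINE__);                                                  \
      return -1;                                                          \
    }                                                                     \
  } while (0)

/* Incremented only if an assert() argument is really evaluated. */
static int assert_evaluations = 0;

int assert_evaluation_count(void) { return assert_evaluations; }

/* Hypothetical helper: returns 0 on success, -1 if the always-on check fails. */
int checked_div(int a, int b, int *out) {
  assert(++assert_evaluations && b != 0); /* removed entirely under NDEBUG */
  ENSURE(b != 0);                         /* still checked in release builds */
  *out = a / b;
  return 0;
}
```

Calling checked_div(1, 0, &out) in this sketch returns -1 and prints the message even though assert() is disabled, which is the kind of string that stays visible in a release binary.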

SC

10:57

Simon C.

Okay got it, thanks!!

5 May 2022

Руслан Лайшев invited Руслан Лайшев

РЛ

14:21

Руслан Лайшев

Hi there!

Л(

14:21

Леонид Юрьев (Leonid Yuriev)

In reply to this message

The group is bilingual, so Russian is fine too.

РЛ

14:22

Руслан Лайшев

Good day, colleagues!

👍

Л(

14:25

Leonid, do I understand correctly that the pair mdbx_create_env().mdbx_open() must be called for each file if I need several table files?

Л(

14:26

Леонид Юрьев (Leonid Yuriev)

In reply to this message

Yes, if (in libmdbx/LMDB terms) we are talking about a "named subDB".

РЛ

14:28

Руслан Лайшев

(hm ... I missed that in the docs, will reread them)
And a second question: do I need to somehow create a separate context for each thread? Or can I not worry, because the mutexes/locks are handled inside the API?

Л(

14:39

Леонид Юрьев (Leonid Yuriev)

In reply to this message

1. https://libmdbx.dqdkfa.ru/usage.html#starting

2. In general, all the necessary synchronization/locking is done inside libmdbx, but you must not work with the same object (env, txn, cursor) from several threads at once, apart from a few obvious exceptions (starting/stopping transactions, some functions that retrieve information/statistics).

3. The murkiest part is the lifetime and validity of DBI handles in cases where tables (aka named subDBs) have to be dropped and/or recreated.

РЛ

14:41

Руслан Лайшев

I could not find anything about subDB via search. Could you point me to a link, please?

14:42

Also, mdbx_env_set_maxdbs() must be called after mdbx_env_create() and before mdbx_env_open() to set the maximum number of named databases you want to support.

14:42

Is this it?

Л(

14:42

Леонид Юрьев (Leonid Yuriev)

In reply to this message

https://libmdbx.dqdkfa.ru/group__c__dbi.html#ga9bef4a9fdf27655e9343bbbf8b6fc5a1

14:50

In reply to this message

The necessary information is in mdbx.h and/or at libmdbx.dqdkfa.ru (the content is generated by doxygen + make).
But it is often hard to find right away, since it somehow does not catch the eye, etc.
So merge requests with documentation improvements are welcome.

РЛ

14:53

Руслан Лайшев

Thank you, Leonid. It finally dawned on me: we set maxdbs to the number of table files, and on mdbx_open we get a dbi, which is the id of the table file.

14:55

And we do not worry about threading, since everything is already done inside. 😊

РЛ

20:12

Руслан Лайшев

I have a bit of trouble understanding, Leonid:
I take the example and do 1350 puts instead of one, and I get:

mdbx_txn_commit: (-30792) MDBX_MAP_FULL: Environment mapsize limit reached

What should I do about this?

20:13

  for ( int i = 1350; i; i--)
    {
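        /* snprintf takes the buffer size, not a pointer; assumes l_key and l_val are char arrays */
        key.iov_len  = snprintf(l_key, sizeof(l_key), "key - %d", i);
        data.iov_len = snprintf(l_val, sizeof(l_val), "key - %d", i);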
        if ( MDBX_SUCCESS != (rc = mdbx_txn_begin(env, NULL, 0, &txn)) )
        {
            $LOG(STS$K_ERROR, "mdbx_txn_begin: (%d) %s", rc, mdbx_strerror(rc));
            goto bailout;
        }

        if ( MDBX_SUCCESS != (rc = mdbx_put(txn, dbi, &key, &data, 0)) )
        {
            $LOG(STS$K_ERROR, "mdbx_put: (%d) %s", rc, mdbx_strerror(rc));
            goto bailout;
        }

        if ( MDBX_SUCCESS != (rc = mdbx_txn_commit(txn)) )
        {
            $LOG(STS$K_ERROR, "mdbx_txn_commit: (%d) %s", rc, mdbx_strerror(rc));
            goto bailout;
        }


    }

20:14

Or am I doing something completely wrong?

Л(

20:14

Леонид Юрьев (Leonid Yuriev)

You need to make the DB more elastic, see https://libmdbx.dqdkfa.ru/group__c__settings.html#ga79065e4f3c5fb2ad37a52b59224d583e

РЛ

20:16

Руслан Лайшев

No, wait, what limit am I hitting? About 30 puts went through, after all.

20:17

The database size? And can I enable unlimited right away?

Л(

20:22

Леонид Юрьев (Leonid Yuriev)

Well, read the docs and do what is required.

РЛ

20:23

Руслан Лайшев

😊 Looks like I will have to.

Л(

20:24

Леонид Юрьев (Leonid Yuriev)

And it is better to use the C++ API.

РЛ

20:24

Руслан Лайшев

Alas, I do not write C++.

6 May 2022

РЛ

14:48

Руслан Лайшев

Hi, everyone!
Maybe I am asking a silly question again, but I cannot figure out how to do something like truncating a "table", or deleting all of its contents in a single call?

Л(

14:50

Леонид Юрьев (Leonid Yuriev)

In reply to this message

https://libmdbx.dqdkfa.ru/group__c__crud.html#gadc9d57659242902003bb5459bc063b32

РЛ

15:09

Руслан Лайшев

Ha-ha! I was searching for "del", and here there is a whole drop. 😊

7 May 2022

Deleted invited Deleted Account

evm invited evm

e

22:29

evm

with libmdbx on Windows 11 I get

11 | use std::os::unix::ffi::OsStrExt;
| ^^^^ could not find unix in os

22:29

how do I fix that?

Л(

22:36

Леонид Юрьев (Leonid Yuriev)

There is no such code inside libmdbx.
Are these the Rust bindings?

e

22:37

evm

In reply to this message

it is from a rust project that i'm trying to compile. I'm just reading what is printed on the console

22:38

>cargo build --all --profile=production
Compiling libmdbx v0.1.5
error[E0433]: failed to resolve: could not find unix in os

22:38

.cargo\registry\src\github.com-1ecc6299db9ec823\libmdbx-0.1.5\src\environment.rs:11:14
|
11 | use std::os::unix::ffi::OsStrExt;
| ^^^^ could not find unix in os

22:39

seemed logical to come here for support. but how do I figure out whats going on and what to do

Л(

22:52

Леонид Юрьев (Leonid Yuriev)

@vorot93, take a look pls, this looks like yours

AV

22:53

Artem Vorotnikov

In reply to this message

there's no Windows support code in Rust bindings atm, unix module is a hard dep

22:53

my suggestion is to use WSL

e

22:53

evm

okay, I'll try that

22:53

I didn't realize rust was completely not Windows

22:53

given how popular the interest in rust is

AV

22:54

Artem Vorotnikov

Rust is on windows, rust-libmdbx is not

22:54

after GitHub incident, I'm not going to go out of my way to support proprietary platforms, especially since no one on Erigon team is using them

e

22:54

evm

In reply to this message

I found a github repo about this, https://github.com/vorot93/libmdbx-rs, how do I make that a part of my project

22:55

In reply to this message

this is you

AV

22:55

Artem Vorotnikov

yes

e

22:55

evm

In reply to this message

oh, this still won't help me on windows then, unless trying WSL2

22:56

oh and you're in Akula

AV

22:56

Artem Vorotnikov

yes, you should use WSL2, but in the long run move to Linux

e

22:56

evm

welp. guess I'm at the edge of development in this area lol.

AL

22:57

Andrea Lanfranchi

In reply to this message

Unfortunately WSL2, though it works, is very bad for performance

e

22:57

evm

In reply to this message

damn. ok.

22:57

I have a very powerful machine, for its time, but with Erigon new blocks are processed at 1 block per second

22:57

its very sad performance

22:57

and that machine is behind 100,000 blocks

AV

22:57

Artem Vorotnikov

you're not running Linux why?

e

22:58

evm

In reply to this message

that machine was running windows 11 for some other development

22:58

it can be nuked or potentially partitioned

22:58

it had very good performance for everything else

22:58

and could do everything

AV

22:59

Artem Vorotnikov

well, Akula does not have Windows support either for that matter, so it's Linux or bust anyway

22:59

if that's your interest in libmdbx-rs

AL

23:01

Andrea Lanfranchi

An Akula sister project, named silkworm, is on the path to becoming a full sustainable node. And it does support Windows.
Not as ready as Akula is, though

e

23:01

evm

In reply to this message

how is syncing performance compared to erigon?

23:01

archival node syncing

23:01

it is important for our development to branch to prior states

AL

23:02

Andrea Lanfranchi

Download of headers and blocks is same but all other stages are from 40% to 300% faster

e

23:07

evm

In reply to this message

yeah stage 5 is the important one, where it is processing blocks

23:11

In reply to this message

even for silkworm?

AL

23:16

Andrea Lanfranchi

Download heavily relies on network peers and how fast they're in providing blocks. Hence process perf and iron are less relevant.
But all other stages which do not require networking are very fast. Actually on my windows box (32 GB Ram and 2TB NVMe) stage execution grinds at 350Mgas/sec at 14.6M blocks

23:17

I must say Windows is not the best env if you aim for sheer perf, as the MSVC compiler lacks support for 128-bit integers. You'll always be faster on Linux

23:18

But I stop here as I'm polluting this channel with off topics

AV

23:20

Artem Vorotnikov

Windows is a poor env regardless, because it's a closed platform, has its own quirks compared to Linux, and it's a PITA to develop on

👍

Л(

e

23:20

evm

In reply to this message

very insightful thanks

AV

23:22

Artem Vorotnikov

also Microsoft-owned GitHub's deletion of libmdbx repo and Windows 11 being centralized spyware-riddled bloatware should be a wake-up call to all who have not moved yet

AL

23:26

Andrea Lanfranchi

In reply to this message

A lot of userbase though. As long as we can provide support for them we definitely help spreading node software usage.

AV

23:26

Artem Vorotnikov

95% of Ethereum nodes run on Linux

https://ethernodes.org/os

AL

23:30

Andrea Lanfranchi

Providing a decent implementation may not bring about a flip, but for sure many Win users will be able to run a node in the background on their desktops.

e

23:31

evm

In reply to this message

why was libmdbx repo deleted

23:31

In reply to this message

thats because they are on aws

23:32

I consentfully have a machine amongst many machines that is centralized spyware, I do not care, moving on

23:33

In reply to this message

yeah that would be great

AV

23:34

Artem Vorotnikov

In reply to this message

https://twitter.com/vorot93/status/1514791500025978883

e

23:35

evm

In reply to this message

ah got it, have you considered a different git host

AV

23:36

Artem Vorotnikov

In reply to this message

Akula will be moving to radicle when it's ready, Erigon won't be staying on GitHub either

Л(

23:41

Леонид Юрьев (Leonid Yuriev)

In reply to this message

"The origin for now is at GitFlic since on 2022-04-15 the Github administration, without any warning nor explanation, deleted libmdbx along with a lot of other projects, simultaneously blocking access for many developers. For the same reason Github is blacklisted forever."

No new information nor updates for now.

10 May 2022

РЛ

11:38

Руслан Лайшев

Hi all!
Colleagues, is there a way to "walk" through a table element by element and retrieve the elements in insertion order?