Univention Bugzilla – Full Text Bug Listing |
Summary: | LDAP server restart is not reliable | ||
---|---|---|---|
Product: | UCS | Reporter: | Stefan Gohmann <gohmann> |
Component: | LDAP | Assignee: | Felix Botner <botner> |
Status: | CLOSED FIXED | QA Contact: | Arvid Requate <requate> |
Severity: | normal | ||
Priority: | P5 | CC: | markus.daehlmann, schwardt |
Version: | UCS 4.1 | ||
Target Milestone: | UCS 4.1-2-errata | ||
Hardware: | Other | ||
OS: | Linux | ||
What kind of report is it?: | --- | What type of bug is this?: | --- |
Who will be affected by this bug?: | --- | How will those affected feel about the bug?: | --- |
User Pain: | Enterprise Customer affected?: | ||
School Customer affected?: | ISV affected?: | ||
Waiting Support: | Flags outvoted (downgraded) after PO Review: | ||
Ticket number: | Bug group (optional): | ||
Max CVSS v3 score: | |||
Bug Depends on: | 33992 | ||
Bug Blocks: |
Description
Stefan Gohmann
2014-01-25 09:32:06 CET
Happened again: Ticket #2016050621000287

Please provide the Erratum for UCS 4.1-1 and 4.1-2.

On an i386 system with ldap mdb and an mdb maxsize of at least 2147483648, slapd sometimes refuses to start (exactly the same setup as on Ticket #2016050621000287). amd64 works fine.

-> while true; do /etc/init.d/slapd restart; done
    Jun 28 00:31:54 slave2 slapd[23930]: mdb_db_open: database "dc=w2k12,dc=test" cannot be opened: Cannot allocate memory (12). Restore from backup!
    Jun 28 00:31:54 slave2 slapd[23930]: backend_startup_one (type=mdb, suffix="dc=w2k12,dc=test"): bi_db_open failed! (12)
    Jun 28 00:31:54 slave2 slapd[23930]: slapd stopped.

-> strace
    4006 open("/var/lib/univention-ldap/ldap/data.mdb", O_RDWR|O_CREAT|O_LARGEFILE, 0600) = 17
    4006 fstatfs64(17, 84, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=12486714, f_bfree=11816933, f_bavail=11176882, f_files=3180464, f_ffree=3059825, f_fsid={11195914, 1546682992}, f_namelen=255, f_frsize=4096}) = 0
    4006 uname({sys="Linux", node="slave2", ...}) = 0
    4006 pread64(17, "\0\0\0\0\0\0\10\0\0\0\0\0\336\300\357\276\1\0\0\0\0\0\0\0\0\0\0\200\0\20\0\0\10\0\1\0\0\0\0\0\1\0\0\0\0\0\0\0\r\0\0\0\26\0\0\0\0\0\0\0\0\0\2\0\1\0\0\0\2\0\0\0\0\0\0\0T\0\0\0a\1\0\0t\1\0\0P\2\0\0", 92, 0) = 92
    4006 pread64(17, "\1\0\0\0\0\0\10\0\0\0\0\0\336\300\357\276\1\0\0\0\0\0\0\0\0\0\0\200\0\20\0\0\10\0\1\0\0\0\0\0\1\0\0\0\0\0\0\0\16\0\0\0v\0\0\0\0\0\0\0\0\0\2\0\1\0\0\0\2\0\0\0\0\0\0\0T\0\0\0\33\0\0\0t\1\0\0Q\2\0\0", 92, 4096) = 92
    4006 mmap2(NULL, 2147483648, PROT_READ, MAP_SHARED, 17, 0) = -1 ENOMEM (Cannot allocate memory)

-> gdb --args /usr/sbin/slapd -d -1 -h "ldapi:/// ldap://:7389/ ldap://:389/ ldaps://:7636/ ldaps://:636/"
    break mdb_env_map
    run
    3739 {
    (gdb) step
    3741 unsigned int flags = env->me_flags;
    (gdb)
    3739 {
    (gdb)
    3782 if (flags & MDB_WRITEMAP) {
    (gdb)
    3739 {
    (gdb)
    3782 if (flags & MDB_WRITEMAP) {
    (gdb)
    3781 int prot = PROT_READ;
    (gdb)
    3787 env->me_map = mmap(addr, env->me_mapsize, prot, MAP_SHARED,
    (gdb) print pro
    No symbol "pro" in current context.
    (gdb) print prot
    $1 = 1
    (gdb) print flags
    $2 = 805306368
    (gdb) print env->me_mapsize
    $3 = 2147483648
    (gdb) print addr
    $4 = (void *) 0x0
    (gdb) step
    3789 if (env->me_map == MAP_FAILED) {
    (gdb)
    3787 env->me_map = mmap(addr, env->me_mapsize, prot, MAP_SHARED,
    (gdb)
    3789 if (env->me_map == MAP_FAILED) {
    (gdb) print env->me_map
    $5 = 0xffffffff <Address 0xffffffff out of bounds>
    (gdb)
    $6 = 0xffffffff <Address 0xffffffff out of bounds>
    (gdb) step
    3790 env->me_map = NULL;
    (gdb)
    3791 return ErrCode();
    (gdb)
    0xb7a81440 in __errno_location () from /lib/i386-linux-gnu/i686/cmov/libpthread.so.0

It seems that slapd cannot allocate enough address space for the database memory map. The strange thing is that this happens only sporadically. And my test script

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/mman.h>

    int main (int argc, char *argv[])
    {
        /* 2 GiB; size_t avoids overflowing a 32-bit int */
        size_t size = 2147483648UL;
        int prot = PROT_READ;
        void *p;
        int fd;

        (void) argc;
        /* map the test binary itself, read-only and shared */
        fd = open (argv[0], O_RDONLY);
        if (fd < 0) {
            perror ("open");
            return 1;
        }
        p = mmap (NULL, size, prot, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) {
            perror ("mmap");
            close (fd);
            return 1;
        }
        close (fd);
        return 0;
    }

-> gcc -g -D_FILE_OFFSET_BITS=64 test.c

always works.

But https://symas.com/getting-down-and-dirty-with-lmdb-qa-with-symas-corporations-howard-chu-about-symass-lightning-memory-mapped-database/ explains that there is indeed a limit on the database size on i386 systems, and that limit happens to be our default:

    A: No. The maximum database size is constrained only by the amount of disk space available and by the size of the address space on the machine. For 32-bit implementations it’s restricted to approximately 2^31 bytes (2 GB), and for 64-bit implementations, which typically bring 48 address bits out of the CPU, it’s restricted to 2^47 bytes (128 TB). The operating system takes care of moving data in and out of available memory as needed. This means that database sizes can be many multiples of available physical memory.
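To check from a test program whether a given map size fits into the address space at all, one option is to reserve the range without backing it by a file. The following is only a minimal sketch under assumptions: the helper name `can_reserve_address_space` is hypothetical, and it uses an anonymous `PROT_NONE` mapping on Linux rather than the file-backed `MAP_SHARED` mapping that slapd performs, so a positive result here does not guarantee that slapd will start.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS / MAP_NORESERVE on glibc */
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: returns 1 if a contiguous range of `size` bytes
 * can currently be reserved in this process's address space, else 0.
 * The range is released again immediately; no memory is touched.
 */
static int can_reserve_address_space(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    if (p == MAP_FAILED)
        return 0;

    munmap(p, size);
    return 1;
}
```

Calling it with the configured maxsize (for example 2147483648 on an i386 build) in a loop would show whether the reservation fails only some of the time, as observed with slapd.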
So maybe we should reduce the mdb maxsize default?

All I can do for now is to triple-check the slapd start in /etc/init.d/slapd on i686 systems. If the start fails, just try it again.

univention-ldap: 12.1.6-24.815.201606281334

The customer reported that reducing the maxsize made his slapd restart reliable again. Maybe we want to change the maxsize by default?

+1 from my side. It's a bit unclear, though, which value to choose. So it might not be bad to have this fallback in place too?

My recommendation to Felix was: if a failure has been detected, decrease maxsize by 1 and try again. This should converge to the optimal value :-)

Changed the default for i686 systems to 1.9GB in univention-ldap.

The documentation was not changed, as the recommendation is not to migrate i386 systems to MDB: "If BDB is still in use, a migration to MDB should be performed for amd64 systems (not i386)."

Ok, works.
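The "decrease maxsize and try again" idea from the comments above can be sketched as a small search loop. This is an illustration only, not the change that went into univention-ldap: `shrink_until_ok`, the `try_start_fn` callback type, and the stand-in predicate `fits_below_1900m` (with its 1900 MiB threshold) are all hypothetical names and values chosen for the example. A real caller would pass a function that actually attempts the mapping or the slapd start.

```c
#include <stddef.h>

/* Hypothetical callback: returns nonzero if a start with this maxsize works. */
typedef int (*try_start_fn)(size_t maxsize);

/*
 * Starting at `maxsize`, step downwards by `step` bytes until `try_start`
 * succeeds. Never goes below `floor`. Returns the first working size,
 * or 0 if none in the range worked (or if step is 0).
 */
static size_t shrink_until_ok(size_t maxsize, size_t step, size_t floor,
                              try_start_fn try_start)
{
    if (step == 0)
        return 0;

    while (maxsize >= floor) {
        if (try_start(maxsize))
            return maxsize;
        if (maxsize - floor < step)   /* next step would pass the floor */
            break;
        maxsize -= step;
    }
    return 0;
}

/* Stand-in predicate for the example: pretend anything up to 1900 MiB fits. */
static int fits_below_1900m(size_t maxsize)
{
    return maxsize <= (size_t)1900 * 1024 * 1024;
}
```

With a start value of 2048 MiB and a step of 64 MiB, the loop tries 2048, 1984, 1920 and then succeeds at 1856 MiB. A step larger than 1 byte keeps the number of restart attempts small, at the cost of not finding the exact limit.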