Bug 29337 - Comparison of LDAP contents
Status: CLOSED FIXED
Product: UCS Test
Classification: Unclassified
Component: LDAP
unspecified
Other Linux
: P5 normal (vote)
: UCS 3.2
Assigned To: Philipp Hahn
Sönke Schwardt-Krummrich
: interim-1
Reported: 2012-11-20 17:23 CET by Janek Walkenhorst
Modified: 2013-11-19 06:43 CET (History)
Description Janek Walkenhorst univentionstaff 2012-11-20 17:23:53 CET
There should be a program that, given two files/hostnames, compares two LDIF files (read from a file or fetched via SSH+slapcat from the host) and prints the differences in a readable diff format.
Comment 1 Liam Schwez univentionstaff 2012-12-18 13:15:16 CET
A new script was developed and added to:

billy.knut.univention.de/var/univention/svn/dev/branches/ucs-3.1/ucs/test/ucs-test/compareldap/


Usage: ./compareldif [option] [TARGET] [option] TARGET
The program compares the LDAP directory contents of the TARGETs
TARGET can be a local LDIF file or a hostname whose directory will be read using slapcat over ssh. If the first TARGET is omitted, the local host's directory is used instead. Options can be: "--file" for an LDIF filename, "--host" for an LDAP host.

To avoid false findings, please make sure that the LDIF file being compared is valid LDAP Data Interchange Format.
Comment 2 Sönke Schwardt-Krummrich univentionstaff 2013-04-09 13:54:10 CEST
The compare process is extremely slow when larger LDIF files come into play. I stopped the compare script after 140 minutes while comparing two 300 MiB LDIF files. Additionally, the memory footprint of the compare script rose to 1.7 GiB.

The compare process should be sped up and (if possible) the memory footprint reduced.
Comment 3 Stefan Gohmann univentionstaff 2013-06-12 08:39:33 CEST
I think we should put the compare function for two LDAP streams/files into a library. Then we could use this library in a generic way for ACL checks, for listener consistency checks and so on. From my point of view, one big advantage is that we could check subtrees or individual objects instead of comparing the whole slapcat tree.
Comment 4 Philipp Hahn univentionstaff 2013-07-18 13:48:42 CEST
svn42561:
Rewrite as /usr/bin/ldiff
* Improve performance: O(n) comparison after sorting in O(n*log(n))
* Sort DNs reversed, support RDN.
* Use OptParser instead of custom command line parsing.
* Ignore operational attributes by default.
* Allow to show unmodified attributes of changed objects.
* Allow to show unmodified objects.
* Handle OIDs instead of attribute names.
* Install by default.
* Return comparison result as return value.

Also available from univention.testing.ldif from Python.

The parser is more picky about proper LDIF format (Bug #31997), but that is a good thing.
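
For illustration, the "sort, then compare in O(n)" idea from the list above can be sketched as a merge over two DN-sorted entry lists. This is only a sketch under assumptions, not the code from svn42561: the names `dn_sort_key` and `merge_diff` are hypothetical, entries are modelled as plain dicts mapping attribute names to sets of values, and the DN is split naively on "," (escaped commas are not handled). Reversing the RDNs makes parent objects sort before their children.

```python
from typing import Dict, Iterator, Set, Tuple

Entry = Dict[str, Set[str]]  # attribute name -> set of values (assumed model)


def dn_sort_key(dn: str) -> Tuple[str, ...]:
    """Return the lowercased RDNs of a DN in reversed order.

    Naive split on ','; escaped commas inside values are not handled.
    """
    rdns = [rdn.strip().lower() for rdn in dn.split(",")]
    return tuple(reversed(rdns))


def merge_diff(old: Dict[str, Entry], new: Dict[str, Entry]) -> Iterator[Tuple[str, str]]:
    """Yield (marker, dn) pairs: '-' removed, '+' added, '~' changed."""
    a = sorted(old, key=dn_sort_key)  # O(n*log(n))
    b = sorted(new, key=dn_sort_key)
    i = j = 0
    # Single linear pass over both sorted lists: O(n)
    while i < len(a) and j < len(b):
        ka, kb = dn_sort_key(a[i]), dn_sort_key(b[j])
        if ka < kb:
            yield "-", a[i]
            i += 1
        elif ka > kb:
            yield "+", b[j]
            j += 1
        else:
            if old[a[i]] != new[b[j]]:
                yield "~", a[i]
            i += 1
            j += 1
    for dn in a[i:]:
        yield "-", dn
    for dn in b[j:]:
        yield "+", dn
```

Because both inputs are walked exactly once after sorting, unchanged objects cost only one dictionary comparison each.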
Comment 5 Sönke Schwardt-Krummrich univentionstaff 2013-08-20 10:28:57 CEST
Tested:
OK: 2 files
OK: 1 file, 1 IP address
OK: 1 file, 1 hostname
FAIL: host unreachable (see below problem 1)
FAIL: 2 IP addresses (see below problem 2)
OK: unittests
FAIL: CLI arguments (see below problem 3)
OK: performance (see below)
OK: python API
OK: changelog

Performance)
Tested with two customer LDIF backup files (originating from slapcat). Each file had a size of about 282MiB.
The compare run took 5min of real/CPU time and about 2880 MiB RAM. A compare run between a file and an IP address (ssh) took much, much longer and was aborted after 10min. The process hadn't reached its final size in memory at that point in time.


Problem 1)
root@master80:~# ldiff -o -a -H billy > /dev/null
ssh: Could not resolve hostname billy: Name or service not known
root@master80:~# 
→ If ssh fails, the corresponding LDIF will be treated as empty → all objects/attributes of the local slapcat output are treated as "new".
→ If stdout does _not_ get redirected, it looks like ldiff/ssh fails silently, which is the regular case.


Problem 2)
root@master80:~# ldiff -H 10.200.18.80 -H 10.200.18.81
Password: Password: 
→ Both ssh subprocesses are called simultaneously and are asking for a password
→ The first password entered will not be echoed back. The second password is echoed back and therefore visible in terminal.


Problem 3)
If --operational is given, ONLY operational attributes are compared. It is not possible to compare operational AND regular attributes.
Comment 6 Philipp Hahn univentionstaff 2013-08-20 20:32:18 CEST
(In reply to Sönke Schwardt-Krummrich from comment #5)
> Performance)

# time python ldif.py test/ucs-test-tools/usr/share/ucs-test-tools/customer5000.ldif{,}
real    0m15.761s
user    0m15.588s
sys     0m0.180s

A version using arrays instead of dictionaries doesn't perform better, but it might use less RAM:
real    0m16.792s
user    0m16.644s
sys     0m0.152s

> → If ssh fails, the corresponding LDIF will be treated as empty → all
> objects/attributes of the local slapcat output are treated as "new".

Command exiting != 0 is now treated as an error.

> → Both ssh subprocesses are called simultaneously and are asking for a
> password

Commands are now serialized.
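
The two fixes above (non-zero exit is an error, commands run one after another) can be sketched roughly as follows. This is a hedged illustration, not the code from svn43342: `CommandError`, `read_ldif` and `read_all` are hypothetical names, and the whole output is buffered in memory for simplicity.

```python
import subprocess
from typing import List, Sequence


class CommandError(Exception):
    """Raised when a command that should produce LDIF exits non-zero."""


def read_ldif(command: Sequence[str]) -> bytes:
    """Run `command`, wait for it to finish, and return its stdout.

    A non-zero exit status raises instead of yielding an empty LDIF.
    """
    result = subprocess.run(command, stdout=subprocess.PIPE)
    if result.returncode != 0:
        raise CommandError(
            "%r exited with status %d" % (list(command), result.returncode)
        )
    return result.stdout


def read_all(commands: Sequence[Sequence[str]]) -> List[bytes]:
    """Run the commands strictly one after another, never in parallel,
    so that at most one of them can prompt for a password at a time."""
    return [read_ldif(command) for command in commands]
```

For a remote host the command would presumably look like `["ssh", host, "slapcat"]`; a failing ssh then raises `CommandError` instead of making every local object appear as "new".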

> If --operational is given, ONLY operational attributes are compared. It is
> not possible to compare operational AND regular attributes.

WORKS-FOR-ME: *not* specifying --operational *excludes* the operational attributes from both the comparison and the output, so using --operational treats all objects containing differences in operational attributes as different and prints the objects including the operational attributes.
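
To make the semantics concrete, here is a minimal sketch of that behaviour. It is not taken from ldiff: the attribute list is only an illustrative subset of common OpenLDAP operational attributes, and the names `OPERATIONAL`, `visible` and `differs` are hypothetical.

```python
from typing import Dict, Set

Entry = Dict[str, Set[str]]  # attribute name -> set of values (assumed model)

# Illustrative subset only; an assumption, not the list ldiff actually uses.
OPERATIONAL = frozenset(name.lower() for name in (
    "entryUUID", "entryCSN", "createTimestamp", "modifyTimestamp",
    "creatorsName", "modifiersName", "structuralObjectClass",
))


def visible(entry: Entry, operational: bool = False) -> Entry:
    """Return the attributes that take part in comparison and output."""
    if operational:
        return dict(entry)  # regular AND operational attributes
    return {a: v for a, v in entry.items() if a.lower() not in OPERATIONAL}


def differs(old: Entry, new: Entry, operational: bool = False) -> bool:
    """Compare two entries, ignoring operational attributes by default."""
    return visible(old, operational) != visible(new, operational)
```

With the flag set, regular attributes are still compared; the flag only adds the operational ones on top.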

svn43342, ucs-test_4.0.92-1.546.201308202026
ChangeLog: ±0
Comment 7 Sönke Schwardt-Krummrich univentionstaff 2013-08-21 13:43:50 CEST
(In reply to Philipp Hahn from comment #6)
> (In reply to Sönke Schwardt-Krummrich from comment #5)
> > Performance)
> 
> # time python ldif.py
> test/ucs-test-tools/usr/share/ucs-test-tools/customer5000.ldif{,}
> real    0m15.761s
> user    0m15.588s
> sys     0m0.180s
> 
> A version using arrays instead of dictionaries doesn't perform better, but
> it might use less RAM:
> real    0m16.792s
> user    0m16.644s
> sys     0m0.152s

The "array version" needs slightly more memory (2875 instead of 2800MiB) and also takes about 40% longer (7m01s instead of 4m57s) than the original version.
I think the current performance is ok.
 
> > → If ssh fails, the corresponding LDIF will be treated as empty → all
> > objects/attributes of the local slapcat output are treated as "new".
> 
> Command exiting != 0 is now treated as an error.

→ Yes and no. Across a series of runs, the old and the new behaviour both occur nondeterministically. → REOPEN

> > → Both ssh subprocesses are called simultaneously and are asking for a
> > password
> 
> Commands are now serialized.

→ OK
 
> > If --operational is given, ONLY operational attributes are compared. It is
> > not possible to compare operational AND regular attributes.
> 
> WORKS-FOR-ME: *not* specifying --operational *excludes* the operational
> attributes from the comparison and displaying them, so using --operational
> treats all objects containing differences in operational attributes as
> different and print the object including the operational attributes.

This has been a PICNIC.
Comment 8 Philipp Hahn univentionstaff 2013-08-21 18:43:09 CEST
I could easily reproduce the error from comment 7 by specifying an unreachable host, e.g. ldiff.py -H 10.200.18.180


r43371 | Bug #29337: ldiff: Option for ssh replacement

Allows to configure a replacement command for ssh; useful for testing with univention-ssh

r43372 | Bug #29337: ldiff: Fix error and signal handling

Ctrl-C printed an ugly traceback and did not terminate my surrounding while-loop.

r43373 | Bug #29337: ldiff: Handle ssh exiting early

Delay reading the process status of the forked ssh by half a second to decide whether the ssh command terminated with an error or with success; in the latter case the pipe becomes readable with EOF.
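
A rough sketch of that "give ssh half a second to fail" idea, written for current Python 3 and not copied from r43373: `open_stream` and its `grace` parameter are hypothetical names, and the real code may inspect the process status differently.

```python
import subprocess
from typing import Sequence


def open_stream(command: Sequence[str], grace: float = 0.5) -> subprocess.Popen:
    """Start `command` and give it `grace` seconds to fail early.

    Returns the process so the caller can read LDIF from proc.stdout.
    Raises RuntimeError if the command exits non-zero within the grace period.
    """
    proc = subprocess.Popen(command, stdout=subprocess.PIPE)
    try:
        status = proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # Still running after the grace period: assume it started fine.
        return proc
    if status != 0:
        proc.stdout.close()
        raise RuntimeError(
            "%r exited early with status %d" % (list(command), status)
        )
    # Exited successfully already: the pipe delivers buffered data, then EOF.
    return proc
```

A command that fails only after the grace period is not caught here, so the exit status should still be checked once the stream has been read to the end.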

ucs-test_4.0.94-1.548.201308211840
Comment 9 Sönke Schwardt-Krummrich univentionstaff 2013-08-23 14:25:26 CEST
(In reply to Philipp Hahn from comment #8)
> I could easily reproduce the error from comment 7 by specifying an
> unreachable host, e.g. ldiff.py -H 10.200.18.180
> 
> r43371 | Bug #29337: ldiff: Option for ssh replacement
> 
> Allows to configure a replacement command for ssh; useful for testing with
> univention-ssh

→ OK

> r43372 | Bug #29337: ldiff: Fix error and signal handling
> 
> Ctrl-C printed an ugly traceback and did not terminate my surrounding
> while-loop.

→ OK

> r43373 | Bug #29337: ldiff: Handle ssh exiting early
> 
> Delay reading the process status of the forked ssh by half a second to
> decide, if the ssh command terminated with an error or with success; in the
> later case the pipe becomes readable with EOF.

→ OK
 
> ucs-test_4.0.94-1.548.201308211840

→ VERIFIED
Comment 10 Stefan Gohmann univentionstaff 2013-11-19 06:43:59 CET
UCS 3.2 has been released:
 http://docs.univention.de/release-notes-3.2-en.html
 http://docs.univention.de/release-notes-3.2-de.html

If this error occurs again, please use "Clone This Bug".