Bug 54203 - Partial transaction import on Backup - block downstream replication
Partial transaction import on Backup - block downstream replication
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Notifier (univention-directory-notifier)
UCS 5.0
Other Linux
: P5 normal (vote)
: UCS 5.0-1-errata
Assigned To: Julia Bremer
Arvid Requate
:
Depends on: 54204 51911 53355
Blocks: 53821
  Show dependency treegraph
 
Reported: 2021-12-07 07:28 CET by Philipp Hahn
Modified: 2022-02-10 10:09 CET (History)
3 users (show)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 3: Simply Wrong: The implementation doesn't match the docu
Who will be affected by this bug?: 3: Will affect average number of installed domains
How will those affected feel about the bug?: 2: A Pain – users won’t like this once they notice it
User Pain: 0.103
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional): bitesize
Max CVSS v3 score:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Philipp Hahn univentionstaff 2021-12-07 07:28:15 CET
During domain join /usr/sbin/univention-join:232 the following happens:
1. Copy /var/lib/univention-ldap/notify/transaction from primary
2. Initialize cn=translog via /usr/share/univention-ldap/setup-translog init
3. Import via /usr/share/univention-directory-notifier/univention-translog

By default this only does a *partial* import:

# Import via /usr/share/univention-directory-notifier/univention-translog import --help
…
minimum:
  Amount of transactions to import

  --count COUNT, -c COUNT
                        Minimum number of transaction [100000]
  --percent PERCENT, -p PERCENT
                        Minimum percentage of transaction file [0%]
  --size SIZE, -s SIZE  Minimum number of bytes from transaction file [10M]


My primary currently has only 362 transactions in 23KiB:

# ls -gG
insgesamt 32
-rw-r----- 1 23515 Jul 15 01:02 transaction
-rw-r--r-- 1  3275 Jul 15 01:02 transaction.index
-rw------- 1     0 Jun 15 10:32 transaction.lock
# wc -l transaction
362 transaction

After joining a Backup both the Primary and the Backup reports this:

# wc -l transaction
447 transaction
# ls -gG
insgesamt 40
-rw-r----- 1 28965 Dez  6 13:45 transaction
-rw-r--r-- 1  4040 Dez  6 13:45 transaction.index
-rw------- 1     0 Jun 15 10:32 transaction.lock
# /usr/share/univention-directory-notifier/univention-translog stat
Index.file: /var/lib/univention-ldap/notify/transaction.index
Index.size: 4040
Index.count: 447
Index.okay: yes
Translog.file: /var/lib/univention-ldap/notify/transaction
Translog.size: 28965
Translog.first: 1
Translog.last: 447
Translog.count: 447
Ldap.last: 447
Ldap.okay: yes

Doing a lookup from LDAP works fine on both Primary and Backup:
# /usr/share/univention-directory-notifier/univention-translog ldap 1
tid=b'1' dn=b'zoneName=qa50.pmh,cn=dns,dc=qa50,dc=pmh' command=b'a'

But doing the same for the file will work on Primary but fail on Backup:
# /usr/share/univention-directory-notifier/univention-translog lookup 1
/usr/share/univention-directory-notifier/univention-translog:364: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
  self.log.warn("Transaction %d is invalid", tid)
2021-12-06 13:53:50,284:WARNING:Transaction 1 is invalid
Traceback (most recent call last):
  File "/usr/share/univention-directory-notifier/univention-translog", line 1444, in <module>
    exit(main())
  File "/usr/share/univention-directory-notifier/univention-translog", line 424, in main
    return opt.func(opt) or 0
  File "/usr/share/univention-directory-notifier/univention-translog", line 608, in lookup
    print(translog[tid])
  File "/usr/share/univention-directory-notifier/univention-translog", line 330, in __getitem__
    self.seek(tid)
  File "/usr/share/univention-directory-notifier/univention-translog", line 365, in seek
    raise IndexError(tid)
IndexError: 1

Dumping the index will catually show that UDN only initialized the last X transaction on Backup:
# /usr/share/univention-directory-notifier/univention-translog index
       1[ ]: 0
…
     384[ ]: 0
     385[x]: 24987
…
     447[x]: 28918


univention-translog will create an empty /var/lib/univention-ldap/notify/transaction.index
When /usr/sbin/univention-directory-notifier starts it will fill the cache of size 1000; but if less then 1000 transactions exists it will fail due to using mixed *signed* and *unsigned* int:

Breakpoint 2, notifier_cache_init (max_id=447) at cache.c:66
66              for ( i=MAX(max_id - (notifier_cache_size-1), 1); i <= max_id; i++) {
(gdb) p i
$10 = -552
(gdb) p max_id
$11 = 447
(gdb) p notifier_cache_size
$12 = 1000
(gdb) p max_id
$13 = 447

> int notifier_cache_init ( unsigned long max_id) {
> 	int i;
> 	int size;
> 	int count = 0;

The Bug was probably introduced via bb2dc8fa802 and ec5d2e22505 for Bug #51911



+++ This bug was initially created as a clone of Bug #53355 +++

univention-directory-notifier should be migrated to be Python 3 compatible.

> management/univention-directory-notifier/univention-translog
Comment 1 Philipp Hahn univentionstaff 2021-12-07 08:10:56 CET
(In reply to Philipp Hahn from comment #0)
> > int notifier_cache_init ( unsigned long max_id) {
> > 	int i;
...
> >     for ( i=MAX(max_id - (notifier_cache_size-1), 1); i <= max_id ) {


In C99, integer promotion is clearly defined in following rule (6.3.1.1):
> Then, if either operand is unsigned long the other shall be converted to unsigned long.

`i: int = -552`  → `unsigned int = 18446744073709551064` <= `max_id: unsigned long = 1000` → FALSE
Comment 2 Philipp Hahn univentionstaff 2021-12-07 13:43:25 CET
(In reply to Philipp Hahn from comment #1)
> (In reply to Philipp Hahn from comment #0)
> > > int notifier_cache_init ( unsigned long max_id) {
> > > 	int i;
> ...
> > >     for ( i=MAX(max_id - (notifier_cache_size-1), 1); i <= max_id ) {
> 
> 
> In C99, integer promotion is clearly defined in following rule (6.3.1.1):
> > Then, if either operand is unsigned long the other shall be converted to unsigned long.
> 
> `i: int = -552`  → `unsigned int = 18446744073709551064` <= `max_id:
> unsigned long = 1000` → FALSE


From @arequate via chat:
> sollte das MAX hier nicht dafür sorgen, dass i immer größer 0 ist?

With
> i: signed int (32 bit)
> max_id: unsigned long == long unsigned int (64 bit)
> notifier_cache_size: signed long long == long long int (64 bit)
and
> #define MAX(x,y) (((x)>(y))?(x):(y))
this becomes
> i = max_id-(n_c_s-1) > 1 ? max_id-(n_c_s-1) : 1
              ^^^^^        → long long          int
              ^^^^^^^      → long long          int
      ^^^^^^               → long      unsigned int
      ^^^^^^^^^^^^^^^      → long long unsigned int
      ^^^^^^^^^^^^^^^^^^^^ → TRUE (except for 0 and 1)

So
> i = (signed int)(long long unsigned int)-522
>   = (signed int)18446744073709551094
>   = -522


# cc -Wall -g -D_FILE_OFFSET_BITS=64 -Wsign-compare -Wsign-conversion -c -o cache.o cache.c
…
cache.c:66:21: warning: conversion to ‘long long unsigned int’ from ‘long long int’ may change the sign of the result [-Wsign-conversion]
  for ( i=MAX(max_id - (notifier_cache_size-1), 1); i <= max_id; i++) {
                     ^
cache.c:48:21: note: in definition of macro ‘MAX’
 #define MAX(x,y) (((x)>(y))?(x):(y))
                     ^        ^
cache.c:66:54: warning: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Wsign-compare]
  for ( i=MAX(max_id - (notifier_cache_size-1), 1); i <= max_id; i++) {
                                                      ^~
…
Here "int i = -522" gets promoted to "long unsigned int" → "(long unsigned int)-522 = 4294966774" <= "max_id: long unsigned int = 100" → FALSE


FYI: <https://www.oreilly.com/library/view/c-in-a/0596006977/ch04.html>
The relevant rule is
> Otherwise, one operand `n_c_s` has a signed type T `long²int` whose conversion rank is higher than that of the other operand’s `max_id` type `long unsigned int`.
> The other operand `max_id` is converted to type T `long²int` only if type T `long²int` is capable of representing all values of its previous type `long unsigned int`.
→ No, they are both 64 bit and negative values cannot be represented in `unit64`
> If not, then both operands are converted to the unsigned type that corresponds to the signed type T.
→ Yes, -522 is cast to unsigned, becomes positive and the >1 will "always" match except for 0 and 1

PS: There are many more sign conversions warning in the code
Comment 3 Julia Bremer univentionstaff 2021-12-21 17:00:35 CET
714a411a42 Bug #54208: advisory
30192bbfbd Bug #54208: Fix unsigned values fail to compare when negative

I changed the code so that only positive unsigned values are compared. Not negative ones. I reactivated the tests and tested a fresh backup join with transactions < 1000 and index 1 was initialized.

The problem was reproducible with the old code as well, so this is not a regression due to bb2dc8fa802 and ec5d2e22505.
Comment 4 Arvid Requate univentionstaff 2021-12-21 20:38:39 CET
As discussed, the type definition of notifier_cache_size in univention-directory-notifier.c needs to be adjusted too,
and that in turn suggests adding some error handling around the atoll call in the getopt loop.
Comment 5 Julia Bremer univentionstaff 2021-12-23 10:38:44 CET
The function parse_ullong was created,
it checks if the given arg is negative, or consists of characters.

bb166e5f13 Bug #54208: Advisory update
11bb3d7c04 Bug #54208: Make sure notifier_cache_size is not negative
9d8fe894b8 fixup! Bug #54208: Fix unsigned values fail to compare when negative


Successful build
Package: univention-directory-notifier
Version: 14.0.5-7A~5.0.0.202112231026
Branch: ucs_5.0-0
Scope: errata5.0-1
Comment 6 Arvid Requate univentionstaff 2022-01-11 18:25:17 CET
Ok, Bug number twist here, commits can be found with

git log -p --grep "Bug #54208" ## instead of 54203

Verified:
* Code review
* Package update
* Functional test
* Advisory

ee20f50424 | Minor typo fixes