Bug 57038 - Add new cn=udm_archive LDAP database and syncrepl
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: LDAP
Version: UCS 5.0
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 5.0-6-errata
Assigned To: Felix Botner
QA Contact: Julia Bremer
URL: https://git.knut.univention.de/univen...
Depends on:
Blocks: 56999
Reported: 2024-02-06 09:05 CET by Felix Botner
Modified: 2024-02-28 13:17 CET (History)

See Also:
What kind of report is it?: Development Internal


Description Felix Botner univentionstaff 2024-02-06 09:05:08 CET
Add a cn=udm_archive database to the univention-ldap-server configuration, then set up and provide syncrepl for it.
Comment 1 Philipp Hahn univentionstaff 2024-02-06 10:08:40 CET
Why?
Where is the concept to prune that DB?
Comment 2 Felix Botner univentionstaff 2024-02-06 10:48:12 CET
We want to use a dedicated database for the blocklist feature. The blocklist feature creates objects in LDAP to block the re-use of certain user/group attributes (email).

Why an extra database:
* We don't want these objects in listener/notifier (especially not during join)
* We only need them on the primary (but we syncrepl this database to backups)

Blocklist objects have a retention-time and we will have a cron job to remove old objects.
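
The retention-based cleanup mentioned above could look roughly like the following minimal sketch. It assumes each blocklist object carries an expiry timestamp; the `BlocklistEntry` class, its `blocked_until` field and the example DNs are hypothetical and not taken from the actual implementation. A real cron job would read the entries from LDAP and delete the expired ones there.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class BlocklistEntry:
    """Hypothetical in-memory stand-in for a blocklist object in LDAP."""
    dn: str
    blocked_until: datetime  # assumed expiry timestamp (timezone-aware)


def expired_entries(entries, now):
    """Return the entries whose retention time has passed at `now`."""
    return [entry for entry in entries if entry.blocked_until <= now]


# Example data: one entry already expired, one still within its retention time.
now = datetime(2024, 2, 6, tzinfo=timezone.utc)
entries = [
    BlocklistEntry("cn=old,cn=example", now - timedelta(days=1)),
    BlocklistEntry("cn=new,cn=example", now + timedelta(days=30)),
]
to_remove = expired_entries(entries, now)
print([entry.dn for entry in to_remove])
```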
Comment 3 Philipp Hahn univentionstaff 2024-02-06 11:09:34 CET
Q: Why another `udm settings/lock`?
A: Because it is not replicated?
> replication.py:68 filter = '(!(objectClass=lock))'  

Q: Why a new replication mechanism to learn and maintain in the future? Please remember that this will increase the cognitive load for every developer, supporter and professional servicer.

Nit: Why "blocklist" and not "denylist"? The term "blocklist" has a second meaning in "file systems".
> However, the term 'denylist' is often preferred because it more accurately describes the function of the list.

Nit: The name "cn=udm_archive" confuses me even more as there is no relation to "blocklist".
Comment 4 Felix Botner univentionstaff 2024-02-06 11:42:31 CET
> Q: Why a new replication mechanism to learn and maintain in the future? Please remember that this will increase the cognitive load for every developer, supporter and professional servicer.

We don't want to have an impact on the initial join, but we also want to replicate these objects to backups. So we do the initial join for ldap/base and afterwards the syncrepl of cn=udm_archive to backups. Syncrepl does its thing in the background and nobody waits for the replication; it is only replicated for a backup2master.

And about syncrepl, yes it is new in UCS. But there are ideas in "nubus" and k8s to replace listener/notifier with a new provisioning and syncrepl for LDAP. So it is also a bit of a test-run with syncrepl for us on this very small feature.

"blocklist" was used by us and the customer who bought this feature. So we just went with that.

> Nit: The name "cn=udm_archive"

There is a container "cn=blocklistentries,cn=udm_archive" for blocklist objects. We thought we may want to reuse this database in the future for other things, like tombstone objects. But that is just an assumption, we can still rename.
Comment 5 Florian Best univentionstaff 2024-02-06 18:57:53 CET
I am also a little bit unhappy with "cn=udm_archive", because we usually don't use abbreviations like "udm" in such names.
I would find just "cn=archive"/etc more suitable.

(In reply to Philipp Hahn from comment #3)
> Q: Why another `udm settings/lock`?
> A: Because it is not replicated?
> > replication.py:68 filter = '(!(objectClass=lock))'  
yes, it's basically a second implementation of settings/lock but with the difference that it
1. stores UDM property values and not LDAP attribute values
2. stores only SHA256 hashes of the values
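
To illustrate point 2, here is a minimal sketch of checking a value against stored SHA256 hashes. The function names, the UTF-8 encoding and the use of an in-memory set are assumptions for illustration only; the thread does not say how the real implementation encodes or normalizes values.

```python
import hashlib


def value_hash(value: str) -> str:
    """Return the hex SHA256 digest of a property value (UTF-8 assumed)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def is_blocked(value: str, blocked_hashes: set) -> bool:
    """Check a candidate value against a set of stored hashes."""
    return value_hash(value) in blocked_hashes


# Only the hash is stored, never the clear-text value.
blocked = {value_hash("old.user@example.com")}
print(is_blocked("old.user@example.com", blocked))
print(is_blocked("new.user@example.com", blocked))
```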
Comment 6 Felix Botner univentionstaff 2024-02-07 11:30:26 CET
> I would find just "cn=archive"/etc more suitable.

then "cn=archive" it is (not sure about this /etc though)
Comment 7 Philipp Hahn univentionstaff 2024-02-07 11:55:26 CET
(In reply to Felix Botner from comment #4)
> > Nit: The name "cn=udm_archive"
> 
> There is a container "cn=blocklistentries,cn=udm_archive" for blocklist
> objects. We thought we may want to reuse this database in the future for
> other things, like tombstone objects. But that is just an assumption, we can
> still rename.

Please try to not share databases for different functionality. With amd64 only it is relatively easy to add another one. It increases data security as a "disk-full" will only affect one functionality and will not trash all user data. It also simplifies designing ACLs as you only have to think about one use-case.

(In reply to Felix Botner from comment #6)
> > I would find just "cn=archive"/etc more suitable.
> 
> then "cn=archive" it is (not sure about this /etc though)

Quoting:
> There are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one-errors. 
So this is number 2: naming. That name is very generic: Maybe try to answer the following questions to find a better name:
- Is it something like a "graveyard" where entries removed from "$ldap_base" are moved to?
- What is "archived" here actually?
- Why and for how long?
- At least from the name I also read that "something" is archived, but not "what" or "why" and why should I ever (not) look at it.
- If the disk gets full and I find all space taken up by the "archive" database, I might delete it to free up that space. But which functionality will I lose then?
Comment 8 Daniel Tröder univentionstaff 2024-02-12 11:49:06 CET
AFAIK this database will be used only for the "blocklist" feature.
A name that is more specific and also says something about the function could include the word "uniqueness".

UCS@school creates two containers for a related feature:
* cn=unique-usernames,cn=ucsschool,cn=univention,$ldap_base
* cn=unique-email,cn=ucsschool,cn=univention,$ldap_base

So, I propose "cn=unique" as the root node.
The next level will be "cn=<blocklistname>" e.g. "cn=mailPrimaryAddress".
Below that we'll have "cn=<algo>:<hash>" e.g. "cn=sha3-256:c0067d4af4e87f00dbac63b6156828237059172d1bbeac67427345d6a9fda484"

The complete DN would be:

cn=<algo>:<hash>,cn=<blocklistname>,cn=unique
  or
cn=sha3-256:c0067d4af4e87f00dbac63b6156828237059172d1bbeac67427345d6a9fda484,cn=mailPrimaryAddress,cn=unique
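
A minimal sketch of how a DN following this proposed scheme could be assembled. The function name and the UTF-8 encoding are assumptions, and the sketch assumes the blocklist name contains no characters that need DN escaping.

```python
import hashlib


def blocklist_entry_dn(value: str, blocklist_name: str, base: str = "cn=unique") -> str:
    """Build cn=<algo>:<hash>,cn=<blocklistname>,<base> for a value.

    Hypothetical helper: hashes the UTF-8 encoded value with SHA3-256.
    """
    digest = hashlib.sha3_256(value.encode("utf-8")).hexdigest()
    return f"cn=sha3-256:{digest},cn={blocklist_name},{base}"


dn = blocklist_entry_dn("user@example.com", "mailPrimaryAddress")
print(dn)
```

The `base` parameter is kept separate so that a different root node can be passed in without touching the hashing logic.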
Comment 9 Felix Botner univentionstaff 2024-02-13 09:54:12 CET
Team decision:

LDAP database: cn=internal
Blocklists: cn=blocklists,cn=internal
Comment 10 Felix Botner univentionstaff 2024-02-21 15:49:46 CET
Successful build
Package: univention-ldap
Version: 16.0.14-5
Branch: ucs_5.0-0
Scope: errata5.0-6
Comment 11 Felix Botner univentionstaff 2024-02-21 15:52:58 CET
Successful build
Package: univention-management-console-module-diagnostic
Version: 6.0.7-3
Branch: ucs_5.0-0
Scope: errata5.0-6
bug: [56999, 57038]
* A check has been added to verify that the LDAP server's config
   file has the filesystem permissions `640`.
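
A permission check of this kind can be sketched as follows. This is not the code of the diagnostic module; the function name is hypothetical and the path of the config file has to be supplied by the caller.

```python
import os
import stat


def has_expected_mode(path: str, expected: int = 0o640) -> bool:
    """Return True if the permission bits of `path` equal `expected`."""
    return stat.S_IMODE(os.stat(path).st_mode) == expected
```

Comparing against `stat.S_IMODE(...)` strips the file-type bits from `st_mode`, so only the permission bits are compared with `0o640`.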
Comment 12 Julia Bremer univentionstaff 2024-02-27 09:46:03 CET
OK: LDAP config
OK: cn=internal
OK: Syncrepl config
OK: ACLs
OK: ACLs configurable
OK: Join performance

Verified