Bug 54273 - Analyzing failed server_password_change is difficult due to missing error handling and debug logging
Analyzing failed server_password_change is difficult due to missing error han...
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Password changes
UCS 4.4
Other Linux
: P5 normal (vote)
: UCS 5.0-2-errata
Assigned To: Juan Carlos
Arvid Requate
https://git.knut.univention.de/univen...
:
Depends on: 53182
Blocks: 56630
  Show dependency treegraph
 
Reported: 2021-12-23 17:51 CET by Dirk Schnick
Modified: 2023-09-20 14:47 CEST (History)
8 users (show)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 2: Improvement: Would be a product improvement
Who will be affected by this bug?: 3: Will affect average number of installed domains
How will those affected feel about the bug?: 3: A User would likely not purchase the product
User Pain: 0.103
Enterprise Customer affected?: Yes
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2021032221000553, 2021121421000211
Bug group (optional):
Max CVSS v3 score:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Dirk Schnick univentionstaff 2021-12-23 17:51:10 CET
Please excuse my unclear definition of my/our (support-team) needs in the first bug.

To debug a failed server password change we need a log entry with exact timestamps _for each relevant action_. 
So _f.e._ the heart of the server password change
/usr/sbin/univention-directory-manager "computers/$server_role" modify --binddn "$ldap_hostdn" --bindpwd "$old_password" --dn "$ldap_hostdn" --set password="$new_password" >&3 2>&3
should be relevant enough to write a log entry.
Alternatively, you could of course evaluate the return value and only write such an entry if it is not 0. From my point of view it would be a giant idea to generally evaluate the return value.

So a clear definition of the needs of the support-team (but I think also our customer will like that):

Please check the _whole_ server_password_change process (including all subscripts) and add a log entry with detailed timestamp to every in debugging relevant action that will take place during server_password_change; at least if the action runs against the wall.


+++ This bug was initially created as a clone of Bug #53182 +++

We have to debug failed server password changes quite a lot, but in the first step it is mostly not possible as there are no timestamps in the logfile.

A logfile should contain always a timestamp, especially when you need to compare with other logfiles.
In the case of a failed server password change you need to know when exactly it failed to examine with other logfiles what may cause this.

Please improve the server_password_change script and the hook scripts with f.e. ($(date '+%F %T.%N')) in the echos.
Comment 1 Ingo Steuwer univentionstaff 2022-01-04 12:59:40 CET
I changed the "What type of bug is this" to "Improvement" instead of "Simply Wrong" - I think there is nothing wrong with the current implementation, but additional information would be helpfull in case of debugging.

What I'm unsure is wheter it is a good idea to add this information in the password change process. The example given references to UMC, which has it's own logging mechanism and in addition can be "tracked" using the Admin Diary module.

I haven't checked the linked tickets - which information is missing in the log entries produced by UDM itself?
Comment 2 Dirk Schnick univentionstaff 2022-01-04 13:29:02 CET
Please excuse my sarcasm: sure let us remove all logging as logging is overloading our code.

Find the error with that logfile:

root@slmem1:~ # cat /var/log/univention/server_password_change.log 
Starting server password change (Fr 24. Dez 09:20:50 CET 2021)
Proceeding with regular server password change scheduled for today
run-parts: executing /usr/lib/univention-server/server_password_change.d/50univention-mail-server prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/portal-server-password-rotate prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-admin-diary prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-libnss-ldap prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-nscd prechange
Permission denied.
run-parts: executing /usr/lib/univention-server/server_password_change.d/50univention-mail-server nochange
File: /etc/listfilter.secret
Comment 3 Christina Scheinig univentionstaff 2022-01-05 13:28:15 CET
(In reply to Ingo Steuwer from comment #1)
> I changed the "What type of bug is this" to "Improvement" instead of "Simply
> Wrong" - I think there is nothing wrong with the current implementation, but
> additional information would be helpfull in case of debugging.
> 
> What I'm unsure is wheter it is a good idea to add this information in the
> password change process. The example given references to UMC, which has it's
> own logging mechanism and in addition can be "tracked" using the Admin Diary
> module.
> 
> I haven't checked the linked tickets - which information is missing in the
> log entries produced by UDM itself?

Let me show with the following example, what we need.
We face the challenge in customer environments, that we have to check each step in the password change and replication of the new password. We had to implement print statements in the code after each step manually, to get to know, what is the root cause of failing. We had to do this several times in several environments, and it just costs a lot of time.

Example logfile part from the customer:
Starting server password change (Thu Oct  7 01:03:47 CEST 2021)
Proceeding with regular server password change scheduled for today
run-parts: executing /usr/lib/univention-server/server_password_change.d/50univention-mail-server prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/portal-server-password-rotate prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/squid-pw-rotate prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/ucs-school-netlogon-user-logonscripts prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-admin-diary prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-bind prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-dhcp prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-libnss-ldap prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-node-exporter prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-nscd prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-s4-connector prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-samba4 prechange
Object modified: cn=ucs-slave,cn=dc,cn=server,cn=computers,ou=school,dc=schein,dc=int
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-samba4 localchange
Changed password OK
Stopping samba-ad-dc (via systemctl): samba-ad-dc.service.
Stopping smbd (via systemctl): smbd.service.
Stopping nmbd (via systemctl): nmbd.service.
Starting nmbd (via systemctl): nmbd.service.
Starting smbd (via systemctl): smbd.service.
Starting samba-ad-dc (via systemctl): samba-ad-dc.service.
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
ldap_bind: Invalid credentials (49)
(2021-10-07 01:05:10.036981019): Starting revert in master
Object modified: cn=ucs-slave,cn=dc,cn=server,cn=computers,ou=school,dc=schein,dc=int
(2021-10-07 01:05:10.203483160): Rollback machine.secret
run-parts: executing /usr/lib/univention-server/server_password_change.d/50univention-mail-server nochange
File: /etc/listfilter.secret

We did this to get a clue:

Object modified: cn=ucs-slave,cn=dc,cn=server,cn=computers,ou=school,dc=schein,dc=int
(2021-11-25 01:08:19.882289290): Start counting
(2021-11-25 01:08:20.884210374): doing univention-ldapsearch against master
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-samba4 localchange
Changed password OK
Stopping samba-ad-dc (via systemctl): samba-ad-dc.service.
Stopping smbd (via systemctl): smbd.service.
Stopping nmbd (via systemctl): nmbd.service.
Starting nmbd (via systemctl): nmbd.service.
Starting smbd (via systemctl): smbd.service.
Starting samba-ad-dc (via systemctl): samba-ad-dc.service.
(2021-11-25 01:08:29.620639274): doing count after change in samba
(2021-11-25 01:08:30.622179125): doing univention-ldapsearch against myself
ldap_bind: Invalid credentials (49)
(2021-11-25 01:08:31.782274367): doing univention-ldapsearch against myself
ldap_bind: Invalid credentials (49)
(2021-11-25 01:08:32.974216740): doing univention-ldapsearch against myself
ldap_bind: Invalid credentials (49)
[...]

I think outsourcing the timestamps is not helpfull.
Comment 4 Ingo Steuwer univentionstaff 2022-01-05 16:53:26 CET
Thanks for the additional information / examples. 

About the additional debugging you added: 

* the debugging was needed in the "sub-scripts" below "/usr/lib/univention-server/server_password_change.d/", not in the main script for the password change - right?

* were the actual timestamps helpfull, or the information that came in addition (like "doing count after change in samba")?

As we added timestamps in Bug #54019 I doubt that adding even more timestamps is a solution. Based on comment #3 I think what you really ask for here is a "debug logging" that gives information about the "what", not (only) the "when". If you agree I'll change the bug description.
Comment 5 Christina Scheinig univentionstaff 2022-01-06 10:57:16 CET
(In reply to Ingo Steuwer from comment #4)
> Thanks for the additional information / examples. 
> 
> About the additional debugging you added: 
> 
> * the debugging was needed in the "sub-scripts" below
> "/usr/lib/univention-server/server_password_change.d/", not in the main
> script for the password change - right?

Both. We have to add timestamps and the description, to get the process clear, because we have to check with the timestamps in other logfiles where the error comes from. Especially for the replication this is important to have exact timestamps.

So both is very helpful.

> * were the actual timestamps helpfull, or the information that came in
> addition (like "doing count after change in samba")?

Yes, because in the code we have to count parts. So we could now differentiate which check fails:
So 

(2021-11-25 01:08:19.882289290): Start counting
(2021-11-25 01:08:20.884210374): doing univention-ldapsearch against master

works and

(2021-11-25 01:08:29.620639274): doing count after change in samba
(2021-11-25 01:08:30.622179125): doing univention-ldapsearch against myself
ldap_bind: Invalid credentials (49)

failes.

> 
> As we added timestamps in Bug #54019 I doubt that adding even more
> timestamps is a solution. Based on comment #3 I think what you really ask
> for here is a "debug logging" that gives information about the "what", not
> (only) the "when". If you agree I'll change the bug description.

Unfortunately I do not have a logfile after the fix from this customer, so I could not compare. but I have this one from an other customer, which is not helpful at all with the new timestamps.

[2021-12-14 01:08:46.368033655] Starting server password change
[2021-12-14 01:08:46.652200440] Proceeding with regular server password change scheduled for today
run-parts: executing /usr/lib/univention-server/server_password_change.d/50univention-mail-server prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/portal-server-password-rotate prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-admin-diary prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-dhcp prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-libnss-ldap prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-nscd prechange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-radius prechange
Permission denied.
run-parts: executing /usr/lib/univention-server/server_password_change.d/50univention-mail-server nochange
File: /etc/listfilter.secret
Multifile: /etc/postfix/ldap.distlist
Multifile: /etc/postfix/ldap.groups
Multifile: /etc/postfix/ldap.external_aliases
Multifile: /etc/postfix/ldap.sharedfolderlocal
Multifile: /etc/postfix/ldap.virtualwithcanonical
Multifile: /etc/postfix/ldap.virtual_mailbox
Multifile: /etc/postfix/ldap.sharedfolderremote
Multifile: /etc/postfix/ldap.sharedfolderlocal_aliases
Multifile: /etc/postfix/ldap.virtual
Multifile: /etc/postfix/ldap.canonicalrecipient
Multifile: /etc/postfix/ldap.transport
Multifile: /etc/postfix/ldap.canonicalsender
Multifile: /etc/postfix/ldap.saslusermapping
Multifile: /etc/postfix/ldap.virtualdomains
run-parts: executing /usr/lib/univention-server/server_password_change.d/portal-server-password-rotate nochange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-admin-diary nochange
751e3cc6-2413-46e8-9f67-c9aa01108725
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-dhcp nochange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-libnss-ldap nochange
File: /etc/libnss-ldap.conf
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-nscd nochange
run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-radius nochange
[2021-12-14 01:08:50.059359415] failed to change server password for cn=sn-member1,cn=computers,ou=hebk,dc=hebk,dc=de
[2021-12-15 01:01:50.701342614] Starting server password change

Here we have a permission denied between univention-radius prechange and 50univention-mail-server nochange. So the interesting part is, where does the permission denied comes from. So what is done here and when. 

run-parts: executing /usr/lib/univention-server/server_password_change.d/univention-radius prechange
Permission denied.
run-parts: executing /usr/lib/univention-server/server_password_change.d/50univention-mail-server nochange

You may be right if you think that is part of the support to find out, but you may also agree, that it could be easier for us to see in the logfile what is going on especially because server_password_change is really complex with 'prechange', 'change', 'replication' and 'postchange' through many different scripts, and more than one server.
Comment 7 Juan Carlos univentionstaff 2022-07-14 12:52:13 CEST
Updated packages:

Package: univention-admin-diary
Version: 2.0.4-4A~5.0.0.202207141212
Branch: ucs_5.0-0
Scope: errata5.0-2

Package: univention-bind
Version: 14.0.2-1A~5.0.0.202207141215
Branch: ucs_5.0-0
Scope: errata5.0-2

Package: univention-dhcp
Version: 14.0.5-1A~5.0.0.202207141218
Branch: ucs_5.0-0
Scope: errata5.0-2

Package: univention-directory-manager-rest
Version: 10.0.4-2A~5.0.0.202207141219
Branch: ucs_5.0-0
Scope: errata5.0-2

Package: univention-mail-postfix
Version: 14.0.2-1A~5.0.0.202207141221
Branch: ucs_5.0-0
Scope: errata5.0-2

Package: univention-pam
Version: 13.0.4-1A~5.0.0.202207141222
Branch: ucs_5.0-0
Scope: errata5.0-2

Package: univention-portal
Version: 4.0.7-5A~5.0.0.202207141223
Branch: ucs_5.0-0
Scope: errata5.0-2

Package: univention-postgresql
Version: 12.0.3-1A~5.0.0.202207141230
Branch: ucs_5.0-0
Scope: errata5.0-2

Package: univention-s4-connector
Version: 14.0.10-2A~5.0.0.202207141231
Branch: ucs_5.0-0
Scope: errata5.0-2

Package: univention-samba4
Version: 9.0.8-2A~5.0.0.202207141233
Branch: ucs_5.0-0
Scope: errata5.0-2

Package: univention-squid
Version: 13.0.4-2A~5.0.0.202207141235
Branch: ucs_5.0-0
Scope: errata5.0-2

Package: univention-server
Version: 15.0.5-2A~5.0.0.202207141236
Branch: ucs_5.0-0
Scope: errata5.0-2


----------------------
Commits:

9b58db624c58 | Bug #54273: Create debug log only when the execution is not successfully completed
a032fdaf2036 | Bug #54273: Update README.API to add information about debugging
8f3563a10f7c | Bug #54273: Improved debug mechanism for server_password_change script
ff9f0281344b | Bug #54273: Update debian
62f4319202c5 | Bug #54273: update advisories
d1839a6923bc | Bug #54273: Pause debugging when executing ucr shell
5a077da5d599 | Bug #54273: add advisories
ff9f0281344b | Bug #54273: Update debian changelog
77c16a75bab9 | Bug #54273: Update debian install files
6b73bc91e1aa | Bug #54273: Updated the bash subscripts. Change the hashbang to bash and include the new debug script
Comment 8 Arvid Requate univentionstaff 2022-07-19 18:54:17 CEST
LGTM. Waiting for Bug 55010 to restore the advisory for univention-portal.
Comment 9 Arvid Requate univentionstaff 2022-07-20 13:55:30 CEST
Verified:
* Code review
* Package update
* Functional test with mocked error
* Functional test under normal conditions
* CI test 01_base/02server_password_change still Ok
* No documentation update required
* Advisories