Bug 52188 - Self Service reports Errorcode 20 when changing password, but the password-change takes place anyway
Status: CLOSED FIXED
Alias: None
Product: UCS
Classification: Unclassified
Component: Self Service
Version: UCS 4.4
Hardware: Other Linux
Importance: P5 major
Target Milestone: UCS 4.4-7-errata
Assignee: Felix Botner
QA Contact: Erik Damrose
URL:
Keywords:
Duplicates: 52058
Depends on:
Blocks:
 
Reported: 2020-10-06 13:39 CEST by Marc Schwarz
Modified: 2021-03-02 14:45 CET

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 2: Will only affect a few installed domains
How will those affected feel about the bug?: 4: A User would return the product
User Pain: 0.229
Enterprise Customer affected?: Yes
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2020100721000231
Bug group (optional): Error handling
Customer ID: 57195
Max CVSS v3 score:


Attachments
error 20 message to user (96.94 KB, image/png)
2020-10-16 14:42 CEST, Marc Schwarz

Description Marc Schwarz univentionstaff 2020-10-06 13:39:56 CEST
univention-app info
UCS: 4.4-6 errata758
Installed: adconnector=12.0 itslearning=3.2 self-service=4.0 self-service-backend=4.0 ucs-to-school-transformer=1.3.2 ucsschool=4.4 v7 ucsschool-kelvin-rest-api=1.1.1
Upgradable: ucsschool-kelvin-rest-api

In some scenarios (we do not yet know how to reproduce the behaviour), the Self Service reports Errorcode 20 when a user changes their password, although the new password is in fact stored correctly.
This is a major issue for an important UCS@School customer. I will attach some UMC logs.
Comment 4 Marc Schwarz univentionstaff 2020-10-09 10:27:30 CEST
*** Bug 52058 has been marked as a duplicate of this bug. ***
Comment 5 Felix Botner univentionstaff 2020-10-16 09:50:32 CEST
what we need is

* A detailed description of the setup (this is some special school environment, I guess)
* Is there a samba DC involved?
* /var/log/auth.log ("debug" for pam_krb5.so in /etc/pam.d/univention-management-console)
* /var/log/syslog
* /var/log/univention/management-console-server.log, /var/log/univention/management-console-web-server.log (umc/server/debug/level=3 (4)? )
* test user object (in the form of an import CSV or an LDAP LDIF)

There is also the possibility to enable debug for the kerberos libs. But that should only be added for a limited time period!!!

-> more /root/.krb5/config 
[logging]
   krb5 = 0-/FILE:/tmp/krb.log

But that did indeed help a lot for the other "error code 20" bugs.
Comment 7 Marc Schwarz univentionstaff 2020-10-16 13:54:11 CEST
(In reply to Felix Botner from comment #5)

> * A detailed description of the setup
The setup is a UCS DC Master with an installed and configured AD-Connector, which is in sync mode with the AD. Primary group sync is deactivated by the ad-connector mapping. Please find the full UCR AD-Connector settings in the private comments.

Furthermore the itslearning connector and the UCS-to-School-Transformer App (customer app) are installed. The Transformer App constantly converts synced users from regular UCS-Users to UCS@School-Users and updates them based on the information which is located in an ext. attribute on each user.

> * Is there a samba DC involved?
Samba is not installed in this environment. There is atm just one UCS DC backup.
Comment 10 Marc Schwarz univentionstaff 2020-10-16 14:13:19 CEST
Created attachment 10520
management-console-web-server.log
Comment 11 Felix Botner univentionstaff 2020-10-16 14:18:06 CEST
One more thing: please give us the username of a user who was unable to change the password correctly.
Comment 17 Marc Schwarz univentionstaff 2020-10-16 14:42:13 CEST
Created attachment 10526
error 20 message to user
Comment 20 Philipp Hahn univentionstaff 2020-10-29 10:49:24 CET
Some ideas / hints:

> # grep ^[^#] /etc/pam.d/univention-management-console
> auth  sufficient  pam_saml.so grace=600 userid=urn:oid:0.9.2342.19200300.100.1.1 idp=/usr/share/univention-management-console/saml/idp/ucs-sso.phahn.dev.xml trusted_sp=https://m34.phahn.dev/univention/saml/metadata
> auth  sufficient  pam_unix.so try_first_pass
> auth  sufficient  pam_krb5.so use_first_pass defer_pwchange
> auth  sufficient  pam_ldap.so use_first_pass
> auth  required    pam_deny.so
> 
> account  sufficient  pam_unix.so
> account  sufficient  pam_krb5.so force_pwchange
> account  required    pam_ldap.so
> 
> session  required   pam_unix.so
> 
> password  requisite   pam_cracklib.so
> password  sufficient  pam_unix.so obscure use_first_pass use_authtok
> password  required    pam_krb5.so use_first_pass use_authtok force_pwchange

- there are 4 phases:
  - "auth" to prove the identity of the user and to grant additional permissions
  - "account" to verify the user is allowed to use the service
  - "session" to setup the service for the user
  - "password" to change the credentials
- if the user account credentials have expired, "auth" returns "chauthtok" instead of "okay", which forces a credential change NOW; the result of "auth" is deferred until AFTER the password was changed successfully. If the new credentials are too weak, "auth" is denied.
- PAM is modular
- modules can abort the stack early (die), but to prevent attackers from identifying the denying module this is not done for "auth".
- each module contributes its local result into the global result. "sufficient", "required", "requisite" are just shortcuts for the longer format.
- for "password" change 3 modules are called:
  - "cracklib" is called 1st and aborts early ("requisite"=[default=die]) if the new credentials do not meet the password policy
  - "unix" is called 2nd: As the user is not local but from LDAP, the user password is not managed in /etc/shadow and the module fails. Due to "sufficient"=[default=ignore] this is not fatal and PAM continues with the next module
  - "krb5" is called 3rd and succeeds changing the credentials.
- for password change all modules are called twice:
  - 1st phase is "prelim" where each module should check the pre-condition, i.e. whether changing the password is okay. If any module signals an error, the change is not initiated at all.
  - 2nd phase is "update" where each module actually performs the update.

When multiple modules are used, this always becomes error-prone, as there is no guarantee that the update is atomic. Due to the mixed use of "sufficient" and "required" the log output becomes very confusing, as some errors in specific modules are ignored ("sufficient"=[default=ignore]) and don't contribute to the final global return code.

It would be better to call "unix" only for local users and "krb5" only for LDAP users.
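
The way the control flags combine the per-module results can be sketched as a small simulation. This is a simplified model and not the Linux-PAM dispatcher: `run_stack` and the boolean per-module results are hypothetical, the "prelim" and "update" phases are collapsed into a single pass, and only "requisite", "required" and "sufficient" are modelled.

```python
# Simplified model of how PAM control flags combine module results.
# Hypothetical helper for illustration; not the Linux-PAM dispatcher.

# The "password" stack quoted above, in call order.
PASSWORD_STACK = [
    ("requisite", "pam_cracklib"),
    ("sufficient", "pam_unix"),
    ("required", "pam_krb5"),
]


def run_stack(stack, results):
    """Return True if the stack as a whole succeeds.

    stack   -- list of (control, module) tuples in call order
    results -- dict mapping module name to True (success) or False (failure)
    """
    failed = False      # a "required" module has failed
    succeeded = False   # at least one module contributed a success
    for control, module in stack:
        ok = results[module]
        if control == "requisite":
            if not ok:
                return False      # "die": abort the stack immediately
            succeeded = True
        elif control == "required":
            if ok:
                succeeded = True
            else:
                failed = True     # "bad": remember the failure, keep going
        elif control == "sufficient":
            if ok and not failed:
                return True       # "done": stop early with success
            # a failing "sufficient" module is ignored
        else:
            raise ValueError("unsupported control flag: %s" % control)
    return succeeded and not failed
```

For an LDAP user, `run_stack(PASSWORD_STACK, {"pam_cracklib": True, "pam_unix": False, "pam_krb5": True})` succeeds although "unix" failed. If "krb5" reports a failure, the whole stack fails, even when the module had already changed the password before it ran into the error.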


Also look at management/univention-management-console/src/univention/management/console/pam.py, which implements the PAM logic for UMC: The main problem there is that the PAM "conversation" function is supposed to be used interactively, and a PAM module can use it to print and query arbitrary values. Currently it has a hard-coded list of "known prompt texts" which it recognizes. As you can see from the logs, it is confused by the extra PAM_TEXT_INFO message showing the expiry date:

> Prompts: [
>   ('Current Kerberos password: ', 1),
>   ('Your password will expire at Thu Sep 24 02:00:00 2020\n', 4),
>   ('Geben Sie ein neues Passwort ein: ', 1),
>   ('Geben Sie das neue Passwort erneut ein: ', 1)
> ]

include/security/_pam_types.h
> #define PAM_PROMPT_ECHO_OFF     1
> #define PAM_PROMPT_ECHO_ON      2
> #define PAM_ERROR_MSG           3
> #define PAM_TEXT_INFO           4
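
One way to make a conversation handler robust against such extra messages is to dispatch on the message style instead of matching prompt texts. The following is a minimal sketch and not the code in pam.py: `answer_prompts` is a hypothetical helper, the `(response, 0)` reply tuples are an assumed convention, and it assumes that the echo-off prompts arrive in the order current password, new password, repeated new password.

```python
# Message styles as defined in include/security/_pam_types.h
PAM_PROMPT_ECHO_OFF = 1
PAM_PROMPT_ECHO_ON = 2
PAM_ERROR_MSG = 3
PAM_TEXT_INFO = 4


def answer_prompts(prompts, old_password, new_password):
    """Build one reply per PAM message, keyed on style rather than text.

    prompts -- list of (text, style) tuples as shown in the log above
    Returns a list of (response, 0) tuples, one per message (assumed format).
    """
    # Assumption: password prompts come as current, new, new (repeated).
    secrets = iter([old_password, new_password, new_password])
    answers = []
    for text, style in prompts:
        if style in (PAM_ERROR_MSG, PAM_TEXT_INFO):
            # Purely informational: acknowledge with an empty response.
            answers.append(("", 0))
        elif style == PAM_PROMPT_ECHO_OFF:
            secret = next(secrets, None)
            if secret is None:
                raise ValueError("unexpected extra password prompt: %r" % text)
            answers.append((secret, 0))
        else:
            # Echo-on prompts are not expected during a password change here.
            raise ValueError("unsupported prompt style %r for %r" % (style, text))
    return answers
```

With the four prompts from the log, the expiry notice gets an empty reply and the three password prompts are answered in order, independent of the language of the prompt texts.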

To reproduce the bug, make sure you have password expiry set, which should get you that message.

I have been unable to reproduce that so far myself.
Maybe it only happens when SAML is in use, which seems to be the case judging from comment 1.
Comment 24 Julia Bremer univentionstaff 2021-03-01 11:39:31 CET
TL;DR: The pam-krb5 function, which tries to get a ticket immediately after changing the password, is only used in this specific scenario and creates an Errorcode20 if the replication is slow. We think it is best to deactivate it. 

============================================================================

To analyse this problem further, we configured the DC Primary to be the only KDC of the domain. This eliminated the replication time factor to the DC backups from the equation and seemingly eliminated the problem for the customer. 

We can "replicate" this problem by creating a user in AD, expiring its password and stopping the notifier on the Primary. Changing the password on the DC backup then raises Errorcode20, because getting a new Kerberos Ticket after changing the password fails, due to the new password not being replicated to the backup yet. 

Doing this, we noticed that no Errorcode20 was raised for users which were created in UCS and not in AD.
Here we found out that pam-krb5 does not try to get a new ticket after a password_change if a user has a valid userpassword. Users created in AD do not have one; their userpassword is set to {K5KEY}.

pam_krb5 only tries to get a ticket after a password_change if it knows that the password was previously expired. The only way it can get that information is during the initial authentication before the password change, in the function pamk5_authenticate.
For users created in UCS, this does not happen, since pam_unix is configured earlier in the PAM stack than pam_krb5. Since the user's password can be validated by "getent shadow", pam_unix is used for this initial authentication before the password change through pam_krb5 is made. pamk5_authenticate is never executed in that case.
For users created in AD, pam_unix does not work for the initial authentication and pam_krb5 is used as a fallback. Here, the function pamk5_authenticate is executed, the error code PAM_NEW_AUTHTOK_REQD is raised, and pam_krb5 can recognize that the password was actually expired (ctx->expired = 1;).
int
pamk5_authenticate(struct pam_args *args) {
    if (pamret == PAM_NEW_AUTHTOK_REQD) {
            ...       
            putil_debug(args, "expired account, deferring failure");
            ctx->expired = 1;
            ...

Later, when pamk5_password is run, which changes the password, pam_krb5 tries to get a new ticket only if ctx->expired.
     */
    if (pamret == PAM_SUCCESS && ctx->expired) {
        krb5_creds *creds = NULL;

        putil_debug(args, "obtaining credentials with new password");
        pamret = pamk5_password_auth(args, NULL, &creds);

which is where an "Errorcode20" is raised if a backup server is used to get the ticket and replication was not fast enough.

This explains why this problem never occurred on other customer systems: it is specific to users whose password was set in AD and whose password had expired.
Since in the default case (password was set in UCS) pam_krb5 does not try to get a new ticket after the password change, we think it is best to deactivate it. We need to patch pam_krb5 to do that.
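
The interplay described in this comment can be modelled in a few lines. This is a toy model under stated assumptions and not pam_krb5 code: `Kdc`, `User` and `change_password` are hypothetical names, replication is reduced to a boolean, and the `request_ticket` parameter only stands in for a switch that turns the ticket request after the password change on or off.

```python
from dataclasses import dataclass


@dataclass
class Kdc:
    """A KDC that only knows the password replicated to it so far."""
    password: str

    def get_ticket(self, password):
        return password == self.password


@dataclass
class User:
    unix_auth_works: bool    # False models userPassword = {K5KEY}
    password_expired: bool


def change_password(user, primary, backup, new_password,
                    replicated, request_ticket=True):
    """Return True if the change is reported as successful."""
    # auth phase: pam_krb5 only runs (and notices the expiry) when
    # pam_unix could not authenticate the user.
    ctx_expired = (not user.unix_auth_works) and user.password_expired

    # password phase: the new password reaches the primary; the backup
    # only sees it once replication has caught up.
    primary.password = new_password
    if replicated:
        backup.password = new_password

    # Only a context marked as expired asks the (backup) KDC for a ticket.
    if ctx_expired and request_ticket:
        return backup.get_ticket(new_password)
    return True
```

In this model, an expired AD user with stalled replication gets an error although `primary.password` already holds the new value, while a UCS user or a call with `request_ticket=False` reports success.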
Comment 25 Felix Botner univentionstaff 2021-03-01 17:30:07 CET
libpam-krb5.yaml
univention-management-console.yaml
univention-pam.yaml

libpam-krb5 (4.4-0-0-ucs/4.7-4+deb9u1-errata4.4-7 and 5.0-0-0-ucs/4.8-2+deb10u1/)
Added patch 050-bug52188-added-ticket-after-pwchange.quilt to disable ticket request after pwchange and added flag ticket_after_pwchange to restore default behavior.

univention-management-console - a22876037c6020954d021ba4e0ac6c3e7c63a2a4
Added UCRV pam/krb5/ticket_after_pwchange to configure ticket_after_pwchange flag in pam.d/univention-management-console.d/80_password

univention-pam - 54dd968b85039713747d9837ca48709f8b1ec8c4
Added UCRV pam/krb5/ticket_after_pwchange to configure ticket_after_pwchange flag in pam.d/common-password

The changes in univention-pam and univention-management-console are not yet merged to 5.0-0, please re-open anyway for merge.
Comment 26 Felix Botner univentionstaff 2021-03-01 21:57:24 CET
ucs-test - 306b48597b1b7d3a259218260e35ffa65db2e46f
added 01_base/91_pwchange_without_password_synchronization
 * only on dc backup
 * use (only) backup as kdc
 * create user with userPassword {K5KEY}
 * stop listener (ldap replication, so that we do not see the new password on the backup/kdc)
 * change password
Comment 27 Erik Damrose univentionstaff 2021-03-02 09:45:11 CET
OK: Reproduce issue
* testuser with {K5KEY} in userPassword attribute (e.g. user created in AD)
* stop replication
* set dc backup as only KDC:
 ucr set kerberos/defaults/dns_lookup_kdc=false kerberos/kdc=$(ucr get hostname).$(ucr get domainname)
* change password via pam-umc
OK: Fix for libpam-krb5 050-bug52188-added-ticket-after-pwchange.quilt
OK: ucs-test 01_base/91_pwchange_without_password_synchronization
OK: UCRv pam/krb5/ticket_after_pwchange to toggle old behavior in u-pam and u-management-console
OK: yaml

Reopen: Please merge to UCS 5
Comment 29 Felix Botner univentionstaff 2021-03-02 09:58:54 CET
(In reply to Florian Best from comment #28)
> The test case fails on a DC Backup:
> https://jenkins.knut.univention.de:8181/job/UCS-4.4/job/UCS-4.4-7/job/
> AutotestJoin/lastCompletedBuild/SambaVersion=s4,Systemrolle=backup/
> testReport/01_base/91_pwchange_without_password_synchronization/backup093/

The reproducer part of the test failed, so maybe this isn't a problem in a Samba environment. For now I deactivated the test for Samba environments, but I will look into that.
Comment 30 Felix Botner univentionstaff 2021-03-02 10:23:11 CET
(In reply to Erik Damrose from comment #27) 
> Reopen: Please merge to UCS 5

merged, packages built

cf5878dab47a50a25e4a804feb0a7703fd034be0 - univention-management-console
570ed98287012858e27cfd6ea57937e3d7344964 - univention-pam
13e06bdcb6f257a4f421cde9a6193bc11101c8b8 - ucs-test
Comment 31 Erik Damrose univentionstaff 2021-03-02 14:15:53 CET
(In reply to Felix Botner from comment #30)
> merged, packages built
> 
> cf5878dab47a50a25e4a804feb0a7703fd034be0 - univention-management-console
> 570ed98287012858e27cfd6ea57937e3d7344964 - univention-pam
> 13e06bdcb6f257a4f421cde9a6193bc11101c8b8 - ucs-test

OK: Changes in UCS 5
OK: 91_pwchange_without_password_synchronization successful

Verified