Bug 51501 - The transport connection is now disconnected
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: AD Connector
Version: UCS 4.4
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.4-6-errata
Assigned To: Felix Botner
QA Contact: Julia Bremer
Depends on: 45127
Blocks: 52432 48266
Reported: 2020-06-16 11:33 CEST by Christina Scheinig
Modified: 2020-11-25 16:20 CET

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.143
Enterprise Customer affected?:
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2020061621000436
Bug group (optional): Large environments
Max CVSS v3 score:


Attachments
reconnect_set_password_in_ad.patch (1.58 KB, patch)
2020-06-18 13:16 CEST, Arvid Requate

Description Christina Scheinig univentionstaff 2020-06-16 11:33:15 CEST
+++ This bug was initially created as a clone of Bug #45127 +++

From https://help.univention.com/t/possible-bug-in-ad-sync-connector/4916

I think there may be a bug in the connection to a 2008 AD.

We are running a mirror off the main AD, as a test measure.
Yesterday we had a non-graceful C&B (crash & burn) loss of a UPS, which took the 2008 AD server down hard.

After the system was brought back up and functioning, we noticed that the Univention system could not reconnect and was filling the error logs.

It seems that once a connection to an MS AD is established, nothing really checks whether the connection goes down; instead the log files just fill with Python errors, even after the remote system comes back up.

Stopping and restarting the AD connection on the Univention side fixes the log errors and catches up on the domain syncs.

It seems the error handling needs to be made more robust and should try to re-establish the connection when the code produces connection errors.

    26.01.2017 09:44:37,729 LDAP (ERROR ): failed in post_con_modify_functions
    26.01.2017 09:44:37,730 LDAP (ERROR ): Traceback (most recent call last):
    File "/usr/lib/pymodules/python2.7/univention/connector/__init__.py", line 1326, in sync_to_ucs
    f(self, property_type, object)
    File "/usr/lib/pymodules/python2.7/univention/connector/ad/password.py", line 381, in password_sync
    res = get_password_from_ad(connector, univention.connector.ad.compatible_modstring(object['dn']))
    File "/usr/lib/pymodules/python2.7/univention/connector/ad/password.py", line 180, in get_password_from_ad
    (level, ctr) = connector.drs.DsGetNCChanges(connector.drsuapi_handle, 8, req8)
    NTSTATUSError: (-1073741300, 'The transport connection is now disconnected.')

then after re-connecting:

    File "/usr/lib/pymodules/python2.7/univention/connector/__init__.py", line 1326, in sync_to_ucs
    f(self, property_type, object)
    File "/usr/lib/pymodules/python2.7/univention/connector/ad/password.py", line 381, in password_sync
    res = get_password_from_ad(connector, univention.connector.ad.compatible_modstring(object['dn']))
    File "/usr/lib/pymodules/python2.7/univention/connector/ad/password.py", line 180, in get_password_from_ad
    (level, ctr) = connector.drs.DsGetNCChanges(connector.drsuapi_handle, 8, req8)
    NTSTATUSError: (-1073741300, 'The transport connection is now disconnected.')

    26.01.2017 09:45:18,218 MAIN (------ ): DEBUG_INIT
    26.01.2017 09:45:18,237 LDAP (ERROR ): Failed to lookup AD LDAP base, using UCR value.
    26.01.2017 09:45:18,270 LDAP (PROCESS): Building internal group membership cache
    26.01.2017 09:45:18,411 LDAP (PROCESS): Internal group membership cache was created
    26.01.2017 09:45:18,449 LDAP (PROCESS): Using GP01 as AD Netbios domain name
    26.01.2017 09:45:18,521 LDAP (PROCESS): sync from ucs: Resync rejected file: /var/lib/univention-connector/ad/1485393377.000030
    26.01.2017 09:45:18,548 LDAP (PROCESS): sync from ucs: [ user] [ modify] cn=xxxxxx,ou=hk office,DC=xx,DC=xx,DC=xxx,DC=xx

and everything is fine with the world until next time
---------------------------------------------------------------------------------------
#######################################################################################
---------------------------------------------------------------------------------------


The description is nearly the same. The traceback is a little bit different:

16.06.2020 10:55:48.847 LDAP        (WARNING): sync failed, saved as rejected
16.06.2020 10:55:48.847 LDAP        (WARNING): Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/univention/connector/__init__.py", line 780, in __sync_file_from_ucs
    if ((old_dn and not self.sync_from_ucs(key, object, premapped_ucs_dn, unicode(old_dn, 'utf8'))) or (not old_dn and not self.sync_from_ucs(key, object, premapped_ucs_dn, old_dn))):
  File "/usr/lib/python2.7/dist-packages/univention/connector/ad/__init__.py", line 2649, in sync_from_ucs
    f(self, property_type, object)
  File "/usr/lib/python2.7/dist-packages/univention/connector/ad/password.py", line 397, in password_sync_ucs
    res = set_password_in_ad(connector, object['attributes']['sAMAccountName'][0], pwd)
  File "/usr/lib/python2.7/dist-packages/univention/connector/ad/password.py", line 211, in set_password_in_ad
    (rids, types) = connector.samr.LookupNames(connector.dom_handle, [sam_accountname, ])
NTSTATUSError: (3221225996, 'The transport connection is now disconnected.')

Restarting the connector fixes this issue again, till the next occurrence.
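
The two tracebacks appear to report the same status: -1073741300 (2017) and 3221225996 (2020) are the signed and unsigned 32-bit readings of 0xC000020C, which the NTSTATUS tables list as STATUS_CONNECTION_DISCONNECTED. A small sketch of the conversion; the helper name is made up for illustration and is not part of the connector:

```python
def ntstatus_unsigned(code):
    """Map a signed or unsigned 32-bit NTSTATUS value to its unsigned form."""
    return code & 0xFFFFFFFF


old = ntstatus_unsigned(-1073741300)  # value from the 2017 traceback
new = ntstatus_unsigned(3221225996)   # value from the 2020 traceback
print(hex(old), hex(new), old == new)
```

So the difference between the two reports lies in the call site (DsGetNCChanges vs. LookupNames), not in the error itself.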


univention-app info
UCS: 4.4-4 errata620
Installed: adconnector=12.0 office365=3.2 self-service=4.0 self-service-backend=4.0 ucsschool=4.4 v5

operatingSystem: Windows Server 2016 Datacenter
Comment 1 Ingo Steuwer univentionstaff 2020-06-17 08:43:42 CEST
OK, took me a while to understand this, the old bug description confused me.

To be sure:

- this happens with an up to date Windows Server (not the 2008 mentioned in the cloned bug reported in 2017)

- the traceback at the end of the bug report is the current one, occurring with UCS 4.4 and Windows Server 2016

- the affected customer is an @school environment (therefore I change the flags from enterprise to school)

- the issue happens after every restart / downtime of the AD DC
Comment 2 Ingo Steuwer univentionstaff 2020-06-17 08:45:15 CEST
Maybe this was introduced with the new password retrieval mechanism that can fetch the Kerberos hashes?
Comment 3 Christina Scheinig univentionstaff 2020-06-17 09:09:57 CEST
(In reply to Ingo Steuwer from comment #1)
> OK, took me a while to understand this, the old bug description confused me.
> 
> To be sure:
> 
> - this happens with an up to date Windows Server (not the 2008 mentioned in
> the cloned bug reported in 2017)
> 
> - the traceback at the end of the bug report is the current one, occurring
> with UCS 4.4 and Windows Server 2016
> 
> - the affected customer is an @school environment (therefore I change the
> flags from enterprise to school)
> 
> - the issue happens after every restart / downtime of the AD DC

The issue occurs after some time, maybe after an import of school users, but that is a guess. I am waiting for the customer's reply to this question.

So after some time, the customer gets a lot of rejects after importing students and teachers. These rejects cause the teachers' accounts to be deactivated.

Restarting the AD connector resolves these rejects, which are logged with the message "NTSTATUSError: (3221225996, 'The transport connection is now disconnected.')", and also resolves the deactivation of the users.
Comment 5 Arvid Requate univentionstaff 2020-06-18 13:16:58 CEST
Created attachment 10395 [details]
reconnect_set_password_in_ad.patch

The attached custom patch was applied in the customer environment on the 2nd of June to fix a similar/related support issue at that time, but for some reason it is no longer present in the file on the customer system. Judging from the timestamp, the file may have been overwritten by https://errata.software-univention.de/ucs/4.4/554.html. Anyway, the attached patch may be an improvement for the AD-Connector.
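
The attachment itself is not reproduced here. As a rough sketch of the general idea its name suggests (reconnect, then retry once), assuming a stand-in exception class and a fake connector that are both hypothetical and not the connector's real API:

```python
class TransportDisconnected(Exception):
    """Hypothetical stand-in for the NTSTATUSError raised by the Samba bindings."""


class FakeConnector(object):
    """Hypothetical connector whose first call after an AD restart fails."""

    def __init__(self):
        self.connected = False
        self.reconnects = 0

    def open_samr(self):
        # Stand-in for re-opening the SAMR pipe and the domain handle.
        self.connected = True
        self.reconnects += 1

    def lookup(self, name):
        if not self.connected:
            raise TransportDisconnected("The transport connection is now disconnected.")
        return "rid-of-%s" % name


def lookup_with_reconnect(connector, name, reconnect=True):
    """Try once; on a disconnect, rebuild the connection and retry exactly once."""
    try:
        return connector.lookup(name)
    except TransportDisconnected:
        if not reconnect:
            raise
        connector.open_samr()
        # reconnect=False on the retry avoids an endless loop if AD is still down.
        return lookup_with_reconnect(connector, name, reconnect=False)


con = FakeConnector()
result = lookup_with_reconnect(con, "alice")
print(result, con.reconnects)
```

Retrying only once keeps the behaviour predictable: if the AD DC is still unreachable, the second failure propagates and the object is saved as rejected as before.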
Comment 7 Florian Best univentionstaff 2020-09-23 17:03:05 CEST
(In reply to Arvid Requate from comment #5)
> Created attachment 10395 [details]
> reconnect_set_password_in_ad.patch
Just a superficial look:
Why reconnect on any "Exception" and not only on "NTSTATUSError"?
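
To illustrate the concern: a handler for the bare Exception also catches programming errors, which a reconnect cannot fix. A minimal sketch with a stand-in exception class and helper names that are hypothetical, not the Samba binding or the connector code:

```python
class NTStatusStandIn(Exception):
    """Hypothetical stand-in for samba's NTSTATUSError."""


def call_with_retry(operation, reconnect, catch):
    """Run operation; on an exception of type `catch`, reconnect and retry once."""
    try:
        return operation()
    except catch:
        reconnect()
        return operation()


def broken_operation():
    # A programming error, unrelated to the network connection.
    raise TypeError("bad argument")


reconnect_calls = []


def fake_reconnect():
    reconnect_calls.append(1)


# Broad handler: reconnects for nothing, then fails again on the retry.
try:
    call_with_retry(broken_operation, fake_reconnect, Exception)
except TypeError:
    pass
broad_reconnects = len(reconnect_calls)

# Narrow handler: the TypeError propagates at once, without a reconnect.
del reconnect_calls[:]
try:
    call_with_retry(broken_operation, fake_reconnect, NTStatusStandIn)
except TypeError:
    pass
narrow_reconnects = len(reconnect_calls)

print(broad_reconnects, narrow_reconnects)
```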
Comment 8 Christian Castens univentionstaff 2020-09-30 09:17:17 CEST
Applied patch

Package: univention-ad-connector
Version: 13.0.0-52A~4.4.0.202009291828
Branch: ucs_4.4-0
Scope: errata4.4-6

changed file:
services/univention-ad-connector/modules/univention/connector/ad/password.py

commits (4.4-6):
7ad8faf09a67ceeff9965ce8eecece7bc9053672 (changes and changelog)
b7a6d30b3ebf49909621ebbe40ba4698976cc200 (yaml)
Comment 9 Felix Botner univentionstaff 2020-09-30 09:24:50 CEST
see jenkins ad connector tests

29.09.2020 23:13:14.726 LDAP        (WARNING): sync failed, saved as rejected
29.09.2020 23:13:14.744 LDAP        (WARNING): Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/univention/connector/__init__.py", line 803, in __sync_file_from_ucs
    or (not old_dn and not self.sync_from_ucs(key, object, premapped_ucs_dn, old_dn, object_old))):
  File "/usr/lib/python2.7/dist-packages/univention/connector/ad/__init__.py", line 2640, in sync_from_ucs
    f(self, property_type, object)
  File "/usr/lib/python2.7/dist-packages/univention/connector/ad/password.py", line 408, in password_sync_ucs
    res = set_password_in_ad(connector, object['attributes']['sAMAccountName'][0], pwd, reconnect=True)
TypeError: set_password_in_ad() got an unexpected keyword argument 'reconnect'
Comment 10 Felix Botner univentionstaff 2020-09-30 09:32:00 CEST
(In reply to Felix Botner from comment #9)
> see jenkins ad connector tests
> 
> 29.09.2020 23:13:14.726 LDAP        (WARNING): sync failed, saved as rejected
> 29.09.2020 23:13:14.744 LDAP        (WARNING): Traceback (most recent call
> last):
>   File "/usr/lib/python2.7/dist-packages/univention/connector/__init__.py",
> line 803, in __sync_file_from_ucs
>     or (not old_dn and not self.sync_from_ucs(key, object, premapped_ucs_dn,
> old_dn, object_old))):
>   File
> "/usr/lib/python2.7/dist-packages/univention/connector/ad/__init__.py", line
> 2640, in sync_from_ucs
>     f(self, property_type, object)
>   File
> "/usr/lib/python2.7/dist-packages/univention/connector/ad/password.py", line
> 408, in password_sync_ucs
>     res = set_password_in_ad(connector,
> object['attributes']['sAMAccountName'][0], pwd, reconnect=True)
> TypeError: set_password_in_ad() got an unexpected keyword argument
> 'reconnect'

I think this part is missing

-def set_password_in_ad(connector, samaccountname, pwd):
+def set_password_in_ad(connector, samaccountname, pwd, reconnect=False):

please fix and restart the ad connector test.
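
The failure can be reproduced in isolation: the caller already passes reconnect=True while the definition still has the old three-argument signature. A minimal sketch in which only the two signatures come from the diff above; the `_old`/`_new` suffixes and the placeholder bodies are made up for illustration:

```python
def set_password_in_ad_old(connector, samaccountname, pwd):
    return "set"  # placeholder body


def set_password_in_ad_new(connector, samaccountname, pwd, reconnect=False):
    return "set (reconnect=%s)" % reconnect  # placeholder body


try:
    set_password_in_ad_old(None, "user", "secret", reconnect=True)
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
print(set_password_in_ad_new(None, "user", "secret", reconnect=True))
print(set_password_in_ad_new(None, "user", "secret"))
```

Because the new parameter defaults to False, any existing three-argument caller would keep working unchanged.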
Comment 11 Christian Castens univentionstaff 2020-09-30 10:51:35 CEST
fix + new build:

Package: univention-ad-connector
Version: 13.0.0-53A~4.4.0.202009301026
Branch: ucs_4.4-0
Scope: errata4.4-6

commits (4.4-6)
e056eb354aa661b4529cc14565749360cb366140 (fix and changelog version update)
cd4cffe46b04224cb97fad75cf374cd2266e4d5d (yaml update)
Comment 12 Felix Botner univentionstaff 2020-09-30 12:19:27 CEST
FAIL - yaml: I don't like this message, but I don't have a better one. Maybe
       something like "the initialization of the service for password changes
       has been fixed"?
TODO - jenkins Tests (wait)

OK - univention-ad-connector (manual tests)
Comment 13 Felix Botner univentionstaff 2020-10-01 09:06:27 CEST
TODO - yaml
TODO - merge to 5.0
OK - Jenkins tests
Comment 14 Christian Castens univentionstaff 2020-10-02 09:56:10 CEST
revised yaml file:
commit (4.4-6):
d07788d09895e9c92667685a7dfbb99f62d463a9

created merge request:
https://git.knut.univention.de/univention/ucs/-/merge_requests
Comment 15 Felix Botner univentionstaff 2020-10-02 11:02:34 CEST
OK
Comment 16 Florian Best univentionstaff 2020-10-02 14:48:40 CEST
Why is nobody answering the question in comment #7?
Comment 17 Felix Botner univentionstaff 2020-10-02 16:08:12 CEST
(In reply to Florian Best from comment #7)
> (In reply to Arvid Requate from comment #5)
> > Created attachment 10395 [details]
> > reconnect_set_password_in_ad.patch
> Just a superficial look:
> Why reconnect on any "Exception" and not only on "NTSTATUSError"?

Sorry, I totally forgot that. I will speak to Christian on Monday; it depends on whether we can make that change (NTSTATUSError instead of Exception) in the current sprint.
Comment 18 Felix Botner univentionstaff 2020-10-05 10:53:13 CEST
b62a5e990ecd0bfe6fb89a459a1f6ee38cbcea78 - univention-ad-connector
ae3595feda62f42ed948c31a58b0540aaa365e8d - yaml
Comment 19 Julia Bremer univentionstaff 2020-10-05 11:31:58 CEST
Package install: OK
Code review: OK
Password change still works: OK
Exception handling fixed: OK
Merge request updated: OK
Yaml: OK

Verified