Bug 39802 - Windows Client join: smbd core due to panic /SIGABRT "smbXsrv_session_logoff_all failed"
Windows Client join: smbd core due to panic /SIGABRT "smbXsrv_session_logoff_...
Status: RESOLVED WONTFIX
Product: UCS
Classification: Unclassified
Component: Samba4
UCS 4.1
Other Linux
: P5 normal (vote)
: ---
Assigned To: Samba maintainers
https://bugzilla.samba.org/show_bug.c...
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2015-11-05 17:56 CET by Arvid Requate
Modified: 2019-01-03 07:23 CET (History)
2 users (show)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 1: Cosmetic issue or missing function but workaround exists
Who will be affected by this bug?: 3: Will affect average number of installed domains
How will those affected feel about the bug?: 1: Nuisance – not a big deal but noticeable
User Pain: 0.017
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2015110921000315 2016011921000304 2016032921000318 2017030821000936
Bug group (optional):
Max CVSS v3 score:


Attachments
win7pro_smbd.core.log (9.17 KB, text/plain)
2015-11-05 17:56 CET, Arvid Requate
Details
win7pro_join.tar.bz2 all logs (6.18 MB, application/x-bzip)
2015-11-05 17:57 CET, Arvid Requate
Details
fix-windowscomputer-uid.sh (790 bytes, patch)
2016-03-29 18:31 CEST, Arvid Requate
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Arvid Requate univentionstaff 2015-11-05 17:56:18 CET
During UCS 4.1 Samba4 product tests (Samba 4.3.1) I discovered an smbd core dump in /var/log/samba/cores/smbd/core. The essential part is:

============================================================================
Program terminated with signal 6, Aborted.
#0  0x00007fc862685165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#0  0x00007fc862685165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1  0x00007fc8626883e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#2  0x00007fc863fabc5b in dump_core () at ../source3/lib/dumpcore.c:337
        called = true
        __FUNCTION__ = "dump_core"
#3  0x00007fc863f9e2f9 in smb_panic_s3 (why=<optimized out>) at ../source3/lib/util.c:812
        cmd = <optimized out>
        result = <optimized out>
        __FUNCTION__ = "smb_panic_s3"
#4  0x00007fc86607687f in smb_panic (why=why@entry=0x7fc865db6578 "smbXsrv_session_logoff_all failed") at ../lib/util/fault.c:166
No locals.
#5  0x00007fc865c90cb8 in exit_server_common (how=SERVER_EXIT_ABNORMAL, how@entry=SERVER_EXIT_NORMAL, reason=0x7fc865db6578 "smbXsrv_session_logoff_all failed") at ../source3/smbd/server_exit.c:246
============================================================================

This is from log.smbd:
============================================================================
[2015/11/05 16:26:56.005699,  0, pid=4560] ../source3/auth/pampass.c:89(smb_pam_error_handler)
  smb_pam_error_handler: PAM: session setup failed : User not known to the underlying authentica
tion module
[2015/11/05 16:26:56.007041,  1, pid=4560] ../source3/smbd/session.c:70(session_claim)
  pam_session rejected the session for AR41S4PT1+WIN7PRO$ [smb/67916629]
[2015/11/05 16:26:56.007071,  1, pid=4560] ../source3/smbd/smb2_sesssetup.c:448(smbd_smb2_auth_g
eneric_return)
  smb2: Failed to claim session for vuid=67916629
[2015/11/05 16:26:56.007269,  0, pid=4560] ../source3/smbd/smbXsrv_session.c:1553(smbXsrv_sessio
n_logoff)
  smbXsrv_session_logoff(0x040c5355): failed to delete global key '040C5355': NT_STATUS_NOT_FOUN
D
[2015/11/05 16:26:56.013638,  0, pid=4560] ../source3/smbd/smbXsrv_session.c:1662(smbXsrv_sessio
n_logoff_all)
  smbXsrv_session_logoff_all: count[1] errors[1] first[NT_STATUS_NOT_FOUND]
[2015/11/05 16:26:56.013668,  0, pid=4560] ../source3/smbd/server_exit.c:159(exit_server_common)
  Server exit (NT_STATUS_CONNECTION_DISCONNECTED)
[2015/11/05 16:26:56.013680,  0, pid=4560] ../source3/smbd/server_exit.c:162(exit_server_common)
  exit_server_common: smbXsrv_session_logoff_all() failed (NT_STATUS_NOT_FOUND) - triggering cle
anup
[2015/11/05 16:26:56.014792,  0, pid=4560] ../source3/lib/util.c:789(smb_panic_s3)
  PANIC (pid 4560): smbXsrv_session_logoff_all failed
[2015/11/05 16:26:56.015730,  0, pid=4560] ../source3/lib/util.c:900(log_stack_trace)
  BACKTRACE: 23 stack frames:
   #0 /usr/lib/x86_64-linux-gnu/libsmbconf.so.0(log_stack_trace+0x1a) [0x7fc863f9e1ba]
   #1 /usr/lib/x86_64-linux-gnu/libsmbconf.so.0(smb_panic_s3+0x20) [0x7fc863f9e290]
   #2 /usr/lib/x86_64-linux-gnu/libsamba-util.so.0(smb_panic+0x2f) [0x7fc86607687f]
   #3 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(+0x177cb8) [0x7fc865c90cb8]
   #4 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(+0x177fde) [0x7fc865c90fde]
   #5 /usr/lib/x86_64-linux-gnu/samba/libsmbd-shim.so.0(exit_server_cleanly+0x12) [0x7fc86395dd42]
   #6 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(smbd_server_connection_terminate_ex+0x28) [0x7fc865c6eb88]
   #7 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(+0x15cb4b) [0x7fc865c75b4b]
   #8 /usr/lib/x86_64-linux-gnu/libtevent.so.0(_tevent_req_error+0x22) [0x7fc8629e47b2]
   #9 /usr/lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_loop_immediate+0xe2) [0x7fc8629e3d92]
   #10 /usr/lib/x86_64-linux-gnu/libsmbconf.so.0(run_events_poll+0x48) [0x7fc863fb41d8]
   #11 /usr/lib/x86_64-linux-gnu/libsmbconf.so.0(+0x384eb) [0x7fc863fb44eb]
   #12 /usr/lib/x86_64-linux-gnu/libtevent.so.0(_tevent_loop_once+0x9d) [0x7fc8629e34dd]
   #13 /usr/lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7fc8629e369b]
   #14 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(smbd_process+0x715) [0x7fc865c5fa75]
   #15 /usr/sbin/smbd(+0xa006) [0x560b085ba006]
   #16 /usr/lib/x86_64-linux-gnu/libsmbconf.so.0(run_events_poll+0x187) [0x7fc863fb4317]
   #17 /usr/lib/x86_64-linux-gnu/libsmbconf.so.0(+0x38599) [0x7fc863fb4599]
   #18 /usr/lib/x86_64-linux-gnu/libtevent.so.0(_tevent_loop_once+0x9d) [0x7fc8629e34dd]
   #19 /usr/lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7fc8629e369b]
   #20 /usr/sbin/smbd(main+0x15d6) [0x560b085b7fa6]
   #21 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x7fc862671ead]
   #22 /usr/sbin/smbd(+0x8375) [0x560b085b8375]
[2015/11/05 16:26:56.015858,  0, pid=4560] ../source3/lib/dumpcore.c:318(dump_core)
  dumping core in /var/log/samba/cores/smbd
============================================================================

This happened 4 times in about one second. Looking at the listener.log it was the time where I joined a Windows 7 client. This is also confirmed by the object timestamps:

Samba4:       whenCreated: 20151105152653.0Z
OpenLDAP: createTimestamp: 20151105152657Z

The join succeeded and the client seemed to behave normal (user login, share access). No panic messages after that.
Comment 1 Arvid Requate univentionstaff 2015-11-05 17:56:47 CET
Created attachment 7251 [details]
win7pro_smbd.core.log
Comment 2 Arvid Requate univentionstaff 2015-11-05 17:57:36 CET
Created attachment 7252 [details]
win7pro_join.tar.bz2   all logs
Comment 3 Stefan Gohmann univentionstaff 2015-11-05 22:43:33 CET
I'm currently unable to reproduce it in my test environment.
Comment 4 Arvid Requate univentionstaff 2015-11-05 22:53:42 CET
Might be a timing issue. auth.log shows exactly 4 messages

  pam_mkhomedir(samba:session): User unknown.

This is the relevant part:
============================================================================
    Nov  5 16:26:54 master100 PAM-univentionsambadomain[4546]: continuing as user WIN7PRO$
    Nov  5 16:26:54 master100 smbd[4546]: pam_unix(samba:session): session opened for user WIN7PRO$
    by (uid=0)
    Nov  5 16:26:54 master100 smbd[4546]: pam_mkhomedir(samba:session): User unknown.
    Nov  5 16:26:54 master100 smbd[4548]: pam_unix(samba:session): session closed for user AR41S4PT1+WIN7PRO$
    Nov  5 16:26:54 master100 PAM-univentionsambadomain[4550]: continuing as user WIN7PRO$
    Nov  5 16:26:54 master100 smbd[4550]: pam_unix(samba:session): session opened for user WIN7PRO$ by (uid=0)
    Nov  5 16:26:55 master100 smbd[4546]: pam_unix(samba:session): session closed for user AR41S4PT1+WIN7PRO$
    Nov  5 16:26:55 master100 smbd[4550]: pam_mkhomedir(samba:session): User unknown.
    Nov  5 16:26:55 master100 smbd[4553]: pam_unix(samba:session): session closed for user AR41S4PT1+WIN7PRO$
    Nov  5 16:26:55 master100 smbd[4550]: pam_unix(samba:session): session closed for user AR41S4PT1+WIN7PRO$
    Nov  5 16:26:55 master100 PAM-univentionsambadomain[4556]: continuing as user WIN7PRO$
    Nov  5 16:26:55 master100 smbd[4556]: pam_unix(samba:session): session opened for user WIN7PRO$ by (uid=0)
    Nov  5 16:26:55 master100 smbd[4556]: pam_mkhomedir(samba:session): User unknown.
    Nov  5 16:26:55 master100 smbd[4558]: pam_unix(samba:session): session closed for user AR41S4PT1+WIN7PRO$
    Nov  5 16:26:55 master100 PAM-univentionsambadomain[4560]: continuing as user WIN7PRO$
    Nov  5 16:26:55 master100 smbd[4560]: pam_unix(samba:session): session opened for user WIN7PRO$ by (uid=0)
    Nov  5 16:26:55 master100 smbd[4556]: pam_unix(samba:session): session closed for user AR41S4PT1+WIN7PRO$
    Nov  5 16:26:55 master100 smbd[4560]: pam_mkhomedir(samba:session): User unknown.
    Nov  5 16:26:55 master100 smbd[4563]: pam_unix(samba:session): session closed for user AR41S4PT1+WIN7PRO$
    Nov  5 16:26:56 master100 smbd[4560]: pam_unix(samba:session): session closed for user AR41S4PT1+WIN7PRO$
    Nov  5 16:27:09 master100 smbd[4537]: pam_unix(samba:session): session closed for user AR41S4PT1+Administrator
    Nov  5 16:28:00 master100 PAM-univentionsambadomain[4570]: continuing as user WIN7PRO$
    Nov  5 16:28:00 master100 smbd[4570]: pam_unix(samba:session): session opened for user WIN7PRO$ by (uid=0)
    Nov  5 16:29:52 master100 PAM-univentionsambadomain[4608]: continuing as user WIN7PRO$
    Nov  5 16:29:52 master100 smbd[4608]: pam_unix(samba:session): session opened for user WIN7PRO$ by (uid=0)
    Nov  5 16:29:57 master100 PAM-univentionsambadomain[4614]: continuing as user WIN7PRO$
    Nov  5 16:29:57 master100 smbd[4614]: pam_unix(samba:session): session opened for user WIN7PRO$ by (uid=0)
============================================================================

So I guess pam_mkhomedir ('required' in common-session) fails because the uid is not yet resolvable via nss_ldap.
Comment 5 Arvid Requate univentionstaff 2016-01-21 21:23:39 CET
At Ticket#: 2016011921000304 I discovered the same traceback/panic/core dump in the log.samba of the DC Master (UCS 4.1-0, Errata 56), also triggered by a machine account but repeatedly. Will report more details as they become available.
Comment 6 Arvid Requate univentionstaff 2016-02-16 17:03:00 CET
The same traceback I see at Ticket#:  2015110921000315 (UCS 4.0-4 Samba 4.2.3)
Comment 7 Arvid Requate univentionstaff 2016-03-08 13:48:35 CET
We see this regularly in log.smbd, e.g. Ticket#2016030421000203.

My impression is, that this happens whenever a machine account logs off.
My gut feeling is that it might be due to homeDirectory: /dev/null.

Looks a bit but not entirely similar like https://bugzilla.samba.org/show_bug.cgi?id=9873 .

Anyway, it's annoying but apparently of no consequence.
Comment 8 Arvid Requate univentionstaff 2016-03-22 19:09:17 CET
In the case of 2015110921000315 (Comment 6) the coredump is triggered by two clients, both of which have uid != sAMAccontName, so getent passwd doesn't work on the sAMAccountName. Probably this specific case is yet another collateral damage of Bug 37388.
Comment 9 Arvid Requate univentionstaff 2016-03-29 18:31:00 CEST
Created attachment 7564 [details]
fix-windowscomputer-uid.sh

Happened again at Ticket# 2016032921000318, fixed by renaming the corresponding computers/windows object in OpenLDAP. The attached script may be useful to check and fix all computers/windows objects ("untested").
Comment 10 Arvid Requate univentionstaff 2016-11-07 12:27:41 CET
In a Samba 4.5.1 test domain Felix saw this during the initial join of a Windows client, but only in one out of four cases.

I looked at core dump and it's not a segfault (code defect/regression) but an explicit call to smb_panic in the samba code. The reason seems to be that Samba performs a pam login with the machine account and that fails during pam_mkhomedir. It seems to be a timing issue (S4-Connector -> OpenLDAP -> NSS). /var/log/auth.log says:

==============================================================================
Nov  7 10:38:53 master PAM-univentionsambadomain[13866]: continuing as user WIN7EN$
Nov  7 10:38:53 master smbd[13866]: pam_unix(samba:session): session opened for user WIN7EN$ by (uid=0)
Nov  7 10:38:53 master smbd[13866]: pam_mkhomedir(samba:session): User unknown.
Nov  7 10:38:53 master smbd[13868]: pam_unix(samba:session): session closed for user W2K12+WIN7EN$
Nov  7 10:38:53 master PAM-univentionsambadomain[13870]: continuing as user WIN7EN$
Nov  7 10:38:53 master smbd[13870]: pam_unix(samba:session): session opened for user WIN7EN$ by (uid=0)
Nov  7 10:38:53 master smbd[13866]: pam_unix(samba:session): session closed for user W2K12+WIN7EN$
Nov  7 10:38:53 master smbd[13870]: pam_mkhomedir(samba:session): User unknown.
Nov  7 10:38:53 master smbd[13873]: pam_unix(samba:session): session closed for user W2K12+WIN7EN$
Nov  7 10:38:53 master smbd[13870]: pam_unix(samba:session): session closed for user W2K12+WIN7EN$
==============================================================================

And the corresponding log.smbd messages (before the usual stack trace):
==============================================================================
[2016/11/07 10:38:53.832846,  1, pid=8082] ../source3/smbd/server.c:868(remove_child_pid)
  Scheduled cleanup of brl and lock database after unclean shutdown
[2016/11/07 10:38:53.848626,  0, pid=13873] ../source3/auth/pampass.c:89(smb_pam_error_handler)
  smb_pam_error_handler: PAM: session setup failed : Benutzer bei zu Grunde liegendem Authentifi
zierungsmodul nicht bekannt
[2016/11/07 10:38:53.850747,  1, pid=13873] ../source3/smbd/session.c:70(session_claim)
  pam_session rejected the session for W2K12+WIN7EN$ [smb/2546232234]
[2016/11/07 10:38:53.850781,  1, pid=13873] ../source3/smbd/smb2_sesssetup.c:462(smbd_smb2_auth_
generic_return)
  smb2: Failed to claim session for vuid=2546232234
[2016/11/07 10:38:53.913918,  0, pid=13870] ../source3/auth/pampass.c:89(smb_pam_error_handler)
  smb_pam_error_handler: PAM: session setup failed : Benutzer bei zu Grunde liegendem Authentifi
zierungsmodul nicht bekannt
[2016/11/07 10:38:53.915006,  1, pid=13870] ../source3/smbd/session.c:70(session_claim)
  pam_session rejected the session for W2K12+WIN7EN$ [smb/2546232234]
[2016/11/07 10:38:53.915029,  1, pid=13870] ../source3/smbd/smb2_sesssetup.c:462(smbd_smb2_auth_
generic_return)
  smb2: Failed to claim session for vuid=2546232234
[2016/11/07 10:38:53.915216,  0, pid=13870] ../source3/smbd/smbXsrv_session.c:1675(smbXsrv_session_logoff)
  smbXsrv_session_logoff(0x97c46baa): failed to delete global key '97C46BAA': NT_STATUS_NOT_FOUND
[2016/11/07 10:38:53.920076,  0, pid=13870] ../source3/smbd/smbXsrv_session.c:1775(smbXsrv_session_logoff_all)
  smbXsrv_session_logoff_all: count[1] errors[1] first[NT_STATUS_NOT_FOUND]
==============================================================================

Later auth.log looks ok, there are several logins with the machine account and all of them are ok.
Comment 11 Stefan Gohmann univentionstaff 2019-01-03 07:23:38 CET
This issue has been filled against UCS 4.1. The maintenance with bug and security fixes for UCS 4.1 has ended on 5st of April 2018.

Customers still on UCS 4.1 are encouraged to update to UCS 4.3. Please contact
your partner or Univention for any questions.

If this issue still occurs in newer UCS versions, please use "Clone this bug" or simply reopen the issue. In this case please provide detailed information on how this issue is affecting you.