Univention Bugzilla – Bug 42582
list_add corruption - probably timer/workqueue related
Last modified: 2019-01-03 07:20:42 CET
Created attachment 8070: Kernel OOPS
Notice the swapped arguments between consecutive calls.
I asked on LKML but haven't received a reply yet: <https://marc.info/?l=linux-kernel&m=147508265316854&w=2> I also haven't found a similar bug report. It has now happened at a second school with (probably) different virtualization: Bochs vs. VMware (?).
Created attachment 8076: Kernel OOPS 2
The common thing is a timer addition:

  [<ffffffff810da5a6>] ? internal_add_timer+0x36/0xa0
  [<ffffffff810dc42b>] ? add_timer_on+0x8b/0x100
  [<ffffffff810da5a6>] ? internal_add_timer+0x36/0xa0
  [<ffffffff810dc6fa>] ? mod_timer_pending+0xfa/0x140

So something is racing without proper locking. In the 2nd OOPS it looks as if some RCU locking might be missing for the workqueue (WQ). See <http://linux-kernel.2935.n7.nabble.com/mod-timer-list-add-corruption-WARNING-CPU-1-PID-0-at-lib-list-debug-c-33-list-add-0xbe-0xd0-td684405.html>.

There have been updates in that area between v4.1.16 and v4.1.33, e.g.

  add92082e2d14367b27b0e18b0deeaedd7c1f938
  68fce03ba7901aa338a566292a59e6a753948861 !

Especially the last one looks promising:

  v4.1.12~18 introduced the bug
  v4.1.19~70 fixed it

That would explain why no one except UCS customers sees this bug. It will hopefully be fixed with the new linux-4.1.33 kernel from Bug #41058. Maybe enabling CONFIG_DEBUG_OBJECTS could help.
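To illustrate what a "list_add corruption" warning means, here is a minimal user-space sketch of the kind of sanity check the kernel's list debugging performs before linking a new entry. It is not the kernel source: `struct list_node`, `checked_list_add()` and `simulate_stale_insert()` are hypothetical names, and the "race" is simulated sequentially by reusing a stale neighbour pointer, which is roughly what happens when two contexts modify the same timer list without proper locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Minimal circular doubly linked list node, modelled on struct list_head. */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/*
 * Link new_node between prev and next, but only if the two neighbours still
 * point at each other. Returns false (and reports) instead of corrupting
 * the list further.
 */
static bool checked_list_add(struct list_node *new_node,
                             struct list_node *prev,
                             struct list_node *next)
{
    if (next->prev != prev) {
        fprintf(stderr, "list_add corruption: next->prev should be prev\n");
        return false;
    }
    if (prev->next != next) {
        fprintf(stderr, "list_add corruption: prev->next should be next\n");
        return false;
    }
    if (new_node == prev || new_node == next) {
        fprintf(stderr, "list_add double add\n");
        return false;
    }
    next->prev = new_node;
    new_node->next = next;
    new_node->prev = prev;
    prev->next = new_node;
    return true;
}

/*
 * Simulate an unlocked update: one writer samples head.next, a second
 * writer inserts in between, then the first writer continues with its
 * stale pointer. Returns true if the stale insert was rejected.
 */
static bool simulate_stale_insert(void)
{
    struct list_node head, a, b, c;
    struct list_node *stale_next;

    list_init(&head);
    assert(checked_list_add(&a, &head, head.next));

    stale_next = head.next;                          /* writer 1 samples &a */
    assert(checked_list_add(&b, &head, head.next));  /* writer 2 inserts b  */

    /* Writer 1 resumes: a.prev is now &b, not &head, so the check fails. */
    return !checked_list_add(&c, &head, stale_next);
}
```

With a lock held around the sample-and-insert sequence, the stale pointer could not occur, which is why the warning points towards a missing or incorrect lock rather than towards the list code itself.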
This might need a debug-enabled kernel build: <https://marc.info/?l=linux-btrfs&m=147694635511693&w=2>
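As a rough sketch of what such a debug build could enable, the following kernel config fragment lists the object-debugging options that seem relevant to timers and workqueues. This is an assumption about a useful set, not a configuration that was tested for this bug. CONFIG_DEBUG_LIST is presumably already set, since the list_add warning comes from lib/list_debug.c.

```
# List sanity checks (source of the "list_add corruption" warning)
CONFIG_DEBUG_LIST=y
# Generic object lifetime tracking
CONFIG_DEBUG_OBJECTS=y
# Track timer_list objects (use after free, re-initialisation of active timers)
CONFIG_DEBUG_OBJECTS_TIMERS=y
# Track work_struct objects queued on workqueues
CONFIG_DEBUG_OBJECTS_WORK=y
```

These options add runtime overhead, so they are better suited to a test system that can reproduce the OOPS than to a production server.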
Did it happen again or has it been fixed with the latest kernel updates?
There is a Customer ID set so I set the flag "Enterprise Customer affected".
This issue has been filed against UCS 4.1. The maintenance with bug and security fixes for UCS 4.1 ended on 5th of April 2018. Customers still on UCS 4.1 are encouraged to update to UCS 4.3. Please contact your partner or Univention if you have any questions. If this issue still occurs in newer UCS versions, please use "Clone this bug" or simply reopen the issue. In this case, please provide detailed information on how this issue is affecting you.