Bug 35733 - Jenkins failed reboot
Status: CLOSED FIXED
Product: UCS Test
Classification: Unclassified
Component: General
Version: unspecified
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.0
Assigned To: Philipp Hahn
QA Contact: Stefan Gohmann
: interim-2
Depends on: 35648 35767
Blocks:
Reported: 2014-08-28 08:13 CEST by Alexander Kramer
Modified: 2014-11-26 06:54 CET

Description Alexander Kramer univentionstaff 2014-08-28 08:13:34 CEST
During the Jenkins tests the instances got stuck after rebooting and had to be terminated manually. As a quick fix I commented out the reboot line in the cfg files so that the tests can start.

As an example, see autotest-09*.cfg (Autotest MultiEnv):
command3:
  univention-license-import /root/autotest090.ldif
  . utils.sh; install_ucs_test
  # reboot
  LOCAL sleep 60
  . utils.sh; wait_for_reboot
 command4:
Comment 1 Alexander Kramer univentionstaff 2014-08-28 08:24:42 CEST
I also commented out the following lines, so the whole reboot command is skipped for now.

 # reboot failed (Bug 35733)
 # reboot
 # LOCAL sleep 60
 # . utils.sh; wait_for_reboot
Comment 2 Alexander Kramer univentionstaff 2014-08-28 09:24:17 CEST
I tested whether the same problem occurs with autotest-07*-update-3.2-to-4.0-master-s4.cfg,
but it works fine there.
Comment 3 Philipp Hahn univentionstaff 2014-08-29 14:02:20 CEST
Our EC2-images for pre-UCS-4 (Bug #35648) are broken:

1. insserv is used
The images were set up with the still unpatched insserv, which re-ordered the symlinks in /etc/rc?.d/*.
The patched version is only installed with the update to testing.
That update then also seems to do some voodoo of its own, which breaks the following boot.
Manually restoring the explicit order fixed the problem.

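 #!/bin/sh
 # Re-register one init script: drop its current symlinks, then re-create
 # them with the arguments taken from the package's postinst script.
 up () {
   update-rc.d -f "$1" remove
   update-rc.d "$@"
 }
 # Collect every update-rc.d call from the installed postinst scripts,
 # de-duplicate them and replay each one through up().
 eval "$(sed -rne 's/^[ \t]*update-rc\.d[ \t]+([a-z][^\|>]+).*/up \1/p' /var/lib/dpkg/info/*.postinst | sort -u)"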
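
Before running the script above through eval, it may help to look at what would be replayed. The following is only a minimal sketch that reuses the same sed expression and prints the resulting "up ..." lines instead of executing them. The helper name list_rc_calls and its optional directory argument are assumptions made for illustration; they are not part of the original work-around.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not part of the original work-around):
# print the "up <script> <args>" lines that the work-around would eval,
# without touching any symlink in /etc/rc?.d/.
# $1: directory containing the *.postinst files (defaults to the dpkg info dir)
list_rc_calls () {
  dir=${1:-/var/lib/dpkg/info}
  # Same expression as in the work-around: match update-rc.d calls whose
  # first argument starts with a lowercase letter (so "-f name remove" is
  # skipped) and cut the line at the first pipe or redirection.
  sed -rne 's/^[ \t]*update-rc\.d[ \t]+([a-z][^\|>]+).*/up \1/p' "$dir"/*.postinst | sort -u
}
```

Calling list_rc_calls without arguments reads /var/lib/dpkg/info; passing a directory allows trying it against a copy of the postinst scripts first.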


2. UUIDs are wrong:
 # cat /proc/cmdline 
  root=UUID=128d6760-7b69-496b-a25b-eaed20ec7275
 # cat /etc/fstab
  UUID=376f1ea7-a4a8-427b-94c6-511cdac387af /
  UUID=229d2c00-9366-48cc-b81e-4992017a1871 swap
 # ls -gG /dev/disk/by-uuid/
  128d6760-7b69-496b-a25b-eaed20ec7275 -> ../../xvda1
 # grep root /boot/grub/grub.cfg
  set root='hd0,msdos1'
  search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1  376f1ea7-a4a8-427b-94c6-511cdac387af
 # grep root= /boot/grub/menu.lst
  kernel /boot/vmlinuz-3.10.0-ucs71-amd64 root=UUID=128d6760-7b69-496b-a25b-eaed20ec7275

I know that Xen supports a feature where a single partition can be passed to a VM instead of the disk containing it; Xen then does some magic to create a fake partition scheme.

 # update-grub
 # grep root /boot/grub/grub.cfg
  search --no-floppy --fs-uuid --set=root  128d6760-7b69-496b-a25b-eaed20ec7275
 # grep root= /boot/grub/menu.lst 
  kernel /boot/vmlinuz-3.10.0-ucs71-amd64 root=UUID=128d6760-7b69-496b-a25b-eaed20ec7275

Please also note that the swap partition does not exist.
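
The mismatch between /etc/fstab, /proc/cmdline and /dev/disk/by-uuid shown above can also be checked mechanically. The following is a minimal sketch, not something taken from utils.sh: the helper names missing_fstab_uuids and cmdline_root_uuid are hypothetical, and the optional path arguments are an assumption added so the functions can be tried against copies of the files.

```shell
#!/bin/sh
# Hypothetical helper: print every UUID referenced in an fstab-style file
# that has no matching entry in the by-uuid directory.
# $1: fstab file (default /etc/fstab)
# $2: by-uuid directory (default /dev/disk/by-uuid)
missing_fstab_uuids () {
  fstab=${1:-/etc/fstab}
  byuuid=${2:-/dev/disk/by-uuid}
  # Take the UUID from lines that start with "UUID=<value>".
  sed -E -n 's/^[[:space:]]*UUID=([^[:space:]]+).*/\1/p' "$fstab" |
  while read -r uuid
  do
    [ -e "$byuuid/$uuid" ] || echo "$uuid"
  done
}

# Hypothetical helper: print the UUID given as root=UUID=... on the
# kernel command line.
# $1: command line file (default /proc/cmdline)
cmdline_root_uuid () {
  sed -E -n 's/.*root=UUID=([^[:space:]]+).*/\1/p' "${1:-/proc/cmdline}"
}
```

Going by the listings above, missing_fstab_uuids would report both 376f1ea7-a4a8-427b-94c6-511cdac387af and 229d2c00-9366-48cc-b81e-4992017a1871 on an affected instance, since only 128d6760-7b69-496b-a25b-eaed20ec7275 is listed under /dev/disk/by-uuid.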


According to Drees only the file system content was rsynced. Perhaps the UUIDs were not updated?
Comment 4 Stefan Gohmann univentionstaff 2014-09-02 07:11:02 CEST
I've created Bug #35767. I think this bug can be closed as duplicate.
Comment 5 Philipp Hahn univentionstaff 2014-09-02 09:04:49 CEST
(In reply to Stefan Gohmann from comment #4)
> I've created Bug #35767. I think this bug can be closed as duplicate.

The changes to the Jenkins file need to be reverted as soon as Bug #35767 is fixed, as the reboot is required.
Comment 6 Philipp Hahn univentionstaff 2014-09-02 15:09:19 CEST
r53254 | Bug #35733 EC2: Fix reboot for broken pre-UCS-4.0-interim1 EC2 images
 Added a temporary work-around to fix the insserv re-ordering
Comment 7 Philipp Hahn univentionstaff 2014-09-03 15:18:55 CEST
r53299 | Bug #35733 EC2: Disable EC2 reboot for << UCS-4.0-interim1
  Disable reboot for now as they still fail to reboot until Bug #35767 is fixed.
Comment 8 Philipp Hahn univentionstaff 2014-09-16 18:00:20 CEST
r53709 | Bug #35733 EC2: Re-enable reboot
r53708 | Bug #35767 EC2: Switch to updated UCS-4.0 EC2 AMI
Comment 9 Stefan Gohmann univentionstaff 2014-10-01 15:50:09 CEST
The Jenkins tests for UCS 4 are up and running.
Comment 10 Stefan Gohmann univentionstaff 2014-11-26 06:54:52 CET
UCS 4.0-0 has been released:
 http://docs.univention.de/release-notes-4.0-0-en.html
 http://docs.univention.de/release-notes-4.0-0-de.html

If this error occurs again, please use "Clone This Bug".