Bug 56574 - EC2/KVM/OpenStack: Wrong grub-pc/install_devices will break on next GRUB update
Status: NEW
Product: UCS
Classification: Unclassified
Component: Grub
Version: UCS 5.0
Hardware: All other
Importance: P5 normal
Target Milestone: UCS 5.1
Assigned To: Philipp Hahn
QA Contact: UCS maintainers
Depends on: 38911 ucs505ec2
Blocks:
Reported: 2023-09-13 00:06 CEST by Philipp Hahn
Modified: 2024-03-08 11:20 CET (History)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 6: Setup Problem: Issue for the setup process
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 2: A Pain – users won’t like this once they notice it
User Pain: 0.069
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Description Philipp Hahn univentionstaff 2023-09-13 00:06:32 CEST
Our EC2 image for UCS 5.0-5 from Bug #56447 (and probably all previous images, too) has a wrong value for the debconf variable `grub-pc/install_devices`, which controls the device GRUB is installed to:
- the image is built internally within KVM and thus has either `/dev/vda` when using VirtIO or `/dev/sda` when using VirtIO-SCSI or SATA.
- for AWS-EC2 instance `t2.large` the root device is `/dev/xvda`
- for AWS-EC2 instance `x5.xlarge` the root device is `/dev/nvme0`

The image boots up fine, but the VM will break on the next update of GRUB, as `grub-install` will try to install GRUB to a non-existent device.

cloud-init provides the module [cc_grub_dpkg](https://cloudinit.readthedocs.io/en/latest/reference/modules.html#grub-dpkg) for this purpose.

- [ ] detect correct boot device on first boot and fix setting
- [ ] Fix git:09bca8ac6e8d9eaf9df8f7fa943491dc303f738f
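
A minimal sketch of the first task above, assuming `findmnt` and `lsblk` from util-linux are available in the image. The function names `first_disk`, `root_disk` and `fix_grub_install_device` are hypothetical and not part of UCS; the `debconf-communicate` call would have to run as root on first boot.

```shell
# Sketch only: the function names are hypothetical, not part of UCS.

# Read "NAME TYPE" pairs (as printed by lsblk) on stdin and
# print the first entry whose type is a whole disk.
first_disk () {
    awk '$2 == "disk" { print $1; exit }'
}

# Print the whole disk backing the root filesystem, e.g. /dev/xvda.
root_disk () {
    src="$(findmnt -n -o SOURCE /)" || return 1
    # -s lists the device followed by its parents (LVM volume, partition, disk)
    lsblk -s -p -n -l -o NAME,TYPE "$src" | first_disk
}

# Store the detected disk in debconf so that the next grub-pc
# update installs GRUB to a device that actually exists.
fix_grub_install_device () {
    disk="$(root_disk)"
    [ -b "$disk" ] || { echo "no root disk found" >&2; return 1; }
    echo "set grub-pc/install_devices $disk" | debconf-communicate
}
```

Walking up from the mounted root filesystem avoids guessing device names per platform, which matters because the name differs between the KVM build host and the EC2 instance types listed above.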
Comment 1 Philipp Hahn univentionstaff 2023-09-27 10:47:56 CEST
OpenStack is also affected: <https://jenkins2022.knut.univention.de/job/UCS-5.0/job/UCS-5.0-5/view/all/job/AutotestJoinOpenstack/9/SambaVersion=s4,Systemrolle=master/testReport/junit/00_checks/12_check_grub_debconf/master091/>

[2023-09-26 16:04:30.923229] Currently grub-pc/install_devices is set to '/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive0'
[2023-09-26 16:04:30.924894] Checking '/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive0'...
[2023-09-26 16:04:30.924996] The device '/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive0' is MISSING.
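
For reference, a check of this kind can be sketched as follows. This is not the code of `12_check_grub_debconf`; the function names are hypothetical, and the sketch assumes that debconf stores several install devices as a comma-separated list.

```shell
# Sketch only: not the actual ucs-test check; function names are hypothetical.

# Print the current debconf value; debconf-communicate answers
# "0 <value>" on success, so only such lines are passed through.
current_install_devices () {
    echo "get grub-pc/install_devices" | debconf-communicate | sed -n 's/^0 //p'
}

# Print every entry of a "dev1, dev2" list that is not a block device.
missing_devices () {
    echo "$1" | tr ',' '\n' | while read -r dev; do
        [ -n "$dev" ] || continue
        [ -b "$dev" ] || echo "$dev"
    done
}
```

Running `missing_devices "$(current_install_devices)"` as root would print nothing on a correctly configured system and the stale device path otherwise.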
Comment 2 Felix Botner univentionstaff 2023-10-10 09:44:56 CEST
Also happens during the 5.1 update of EC2 instances:

----
Setting up grub-pc (2.06-3~deb11u5) ...


Creating config file /etc/default/grub.debian with new version
/dev/sda does not exist, so cannot grub-install to it!
You must correct your GRUB install devices before proceeding:

  DEBIAN_FRONTEND=dialog dpkg --configure grub-pc
  dpkg --configure -a
dpkg: error processing package grub-pc (--configure):
 installed grub-pc package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
 grub-pc
needrestart is being skipped since dpkg has failed
...
E: Sub-process /usr/bin/dpkg returned an error code (1)
exitcode of apt-get dist-upgrade: 100
ERROR: update failed. Please check /var/log/univention/updater.log
----

----
$ ucr search --value sda
grub/boot: /dev/sda

grub/root: /dev/sda1
----
Comment 3 Philipp Hahn univentionstaff 2023-10-10 09:49:45 CEST
This might block the security import of GRUB 2.06-3~deb10u4 into UCS 5.0-[45] fixing Secure Boot compromise CVE-2023-4693.
Comment 4 Philipp Hahn univentionstaff 2023-10-17 18:16:30 CEST
We already have code in "test/utils/utils.sh" to fix this on AWS, but currently it is hard-coded to "/dev/xvda" which is no longer correct for newer EC2 generations or for other virt-platforms:

basic_setup_allow_uss () {
...
    case "$VIRTTECH" in
    qemu|kvm)
...
    amazon|xen)
        echo "Assuming Amazon Cloud"
        if grep -F /dev/vda /boot/grub/device.map && [ -b /dev/xvda ] # Bug 36256
        then
            grub-mkdevicemap
            echo set grub-pc/install_devices /dev/xvda | debconf-communicate
        fi

As a temporary work-around:
- [ ] Execute that code block also for qemu/kvm
- [ ] Fix the code block to also work with /dev/{[hs[x]v]da,nvme0}

Mid-term we need a solution which is part of our KVM/VirtualBox/VMware/EC2 images themselves by default, instead of running a piece of code which is only part of our test suite.
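
The temporary work-around could look roughly like the sketch below. It is not the change that was committed; `first_block_device` and `fix_install_devices` are hypothetical names. Note one assumption that deviates from the text above: the sketch probes `/dev/nvme0n1`, because on Linux `/dev/nvme0` is the NVMe controller (a character device) while the disk itself is the namespace `/dev/nvme0n1`.

```shell
# Sketch only: not the committed fix; function names are hypothetical.

# Print the first argument that exists as a block device; fail if none does.
first_block_device () {
    for dev in "$@"; do
        if [ -b "$dev" ]; then
            echo "$dev"
            return 0
        fi
    done
    return 1
}

# Probe the usual disk names (Xen, VirtIO, SCSI/SATA, NVMe) and
# point grub-pc at the first one that is present.
fix_install_devices () {
    dev="$(first_block_device /dev/xvda /dev/vda /dev/sda /dev/nvme0n1)" || return 1
    grub-mkdevicemap
    echo "set grub-pc/install_devices $dev" | debconf-communicate
}
```

A fixed candidate list is simple but picks the first disk found, which may not be the boot disk on machines with several disks.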
Comment 5 Philipp Hahn univentionstaff 2023-10-18 17:55:51 CEST
(In reply to Philipp Hahn from comment #0)
> - for AWD-EC2 instance `x5.xlarge` the root device is `/dev/nvme0`

m5.xlarge
Comment 6 Philipp Hahn univentionstaff 2023-10-18 18:03:31 CEST
[5.0-5] 524ae16fa1 fix(utils.sh): Fix GRUB root device
 test/utils/utils.sh | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

[5.0-4] 2c9854ab8b fix(utils.sh): Fix GRUB root device
 test/utils/utils.sh | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

[4.4-9] 4799d86dae fix(utils.sh): Fix GRUB root device
 test/utils/utils.sh | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)
Comment 7 Philipp Hahn univentionstaff 2023-10-19 11:07:00 CEST
(In reply to Philipp Hahn from comment #4)
> As a temporary work-around:
> - [x] Execute that code block also for qemu/kvm
> - [x] Fix the code block to also work with /dev/{[hs[x]v]da,nvme0}

(In reply to Philipp Hahn from comment #0)
> - [ ] detect correct boot device on first boot and fix setting
> - [ ] Fix git:09bca8ac6e8d9eaf9df8f7fa943491dc303f738f
Comment 8 Philipp Hahn univentionstaff 2023-10-26 09:08:39 CEST
https://cloudinit.readthedocs.io/en/latest/reference/base_config_reference.html#example

/etc/cloud/cloud.cfg:
  cloud_config_modules:
    - grub_dpkg

echo get grub-pc/install_devices | debconf-communicate
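
A sketch of how the module could be switched on through a drop-in file, assuming the image's `/etc/cloud/cloud.cfg` already lists `grub_dpkg` under `cloud_config_modules` as shown above. The function name and the file name `99-grub-dpkg.cfg` are hypothetical; whether the cloud-init version shipped in the image detects the boot disk correctly on every platform is untested here.

```shell
# Sketch only: function name and file name are hypothetical.

# Write a cloud-init drop-in into directory $1
# (default: /etc/cloud/cloud.cfg.d) that enables grub_dpkg.
write_grub_dpkg_dropin () {
    dir="${1:-/etc/cloud/cloud.cfg.d}"
    cat > "$dir/99-grub-dpkg.cfg" <<'EOF'
# Let cloud-init detect the boot disk and update the grub-pc debconf settings
grub_dpkg:
  enabled: true
EOF
}
```

After the next first boot of an instance, the `debconf-communicate` query above can be used to verify which device was recorded.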
Comment 9 Florian Best univentionstaff 2024-03-08 11:19:58 CET
ucs-test (10.0.19-15)
525d5477a2c3 | test: Fix ucs.sh unset VALID_CHARS substitution
5f46ec1251ed | test: Ignore /var/log/cloud-init.log

524ae16fa1ec | fix(utils.sh): Fix GRUB root device