Bug 51498 – Rework NTP configuration


Summary:	Rework NTP configuration

Status:	NEW

Product:	UCS
Classification:	Unclassified
Component:	NTP
Version:	UCS 5.0
Hardware:	Other Linux

Importance:	P5 normal with 2 votes
Target Milestone:	---
Assigned To:	UCS maintainers
QA Contact:	UCS maintainers

URL:
Keywords:

Depends on:	30854 42171 chrony 50269 ucs445ec2
Blocks:	51493

Reported:	2020-06-16 09:08 CEST by Philipp Hahn
Modified:	2022-01-21 17:47 CET
CC List:	2 users

See Also:
What kind of report is it?:	Development Internal
What type of bug is this?:	---
Who will be affected by this bug?:	---
How will those affected feel about the bug?:	---


Description Philipp Hahn

2020-06-16 09:08:53 CEST

Our NTP configuration has many deficiencies:

- running NTP inside a VM is not recommended (Bug #47939) and leads to errors like Bug #51493 or Bug #49455

- DCs always configure a local clock; if only one external NTP service is configured, NTPd cannot decide which one is correct and requires a long time to synchronize. At least 3 external services should be configured. (Bug #30854)

- All systems except the DC Master configure all other DCs as time servers first and add the servers explicitly configured via UCRV "timeserver[23]" on top (the order is not important). Switch to peer mode between Master and Backups? (Bug #42171)

- No encryption

- Better support for pool: a pool should not be configured through "server", as that can lead to one server being selected multiple times (by accident) (Bug #50269)

- By default no external NTP server is configured (Bug #27728)

- NTP configuration is not provided via DHCP option 042

- Also support PTP?
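
Some of the points above can be checked mechanically. The following is only a minimal sketch, not part of any existing UCS tool: the function name `lint_ntp_sources`, the threshold parameter and the `pool.ntp.org` suffix heuristic are assumptions made for illustration. It counts the external sources in an `ntp.conf` text (ignoring `127.127.x.y` reference clock drivers such as the local clock) and flags pool names configured through `server`.

```python
import re

# Matches "server", "pool" or "peer" lines that are not commented out.
SOURCE_RE = re.compile(r"^\s*(server|pool|peer)\s+(\S+)", re.MULTILINE)


def lint_ntp_sources(conf_text, minimum=3):
    """Return a list of warnings for the time sources in an ntp.conf text."""
    warnings = []
    sources = SOURCE_RE.findall(conf_text)

    # 127.127.x.y are ntpd reference clock drivers (127.127.1.x is the
    # local clock), so they do not count as external sources.
    external = [host for _kind, host in sources
                if not host.startswith("127.127.")]
    if len(external) < minimum:
        warnings.append("only %d external source(s), expected at least %d"
                        % (len(external), minimum))

    # Heuristic (assumption): names below pool.ntp.org are pools.
    for kind, host in sources:
        if kind == "server" and host.endswith("pool.ntp.org"):
            warnings.append("%s is a pool name; use 'pool' instead of 'server'"
                            % host)
    return warnings


if __name__ == "__main__":
    example = (
        "server 127.127.1.0\n"
        "fudge 127.127.1.0 stratum 5\n"
        "server 0.pool.ntp.org\n"
    )
    for warning in lint_ntp_sources(example):
        print(warning)
```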

Comment 1 Philipp Hahn

2020-07-03 05:55:07 CEST

(In reply to Philipp Hahn from comment #0)
> - NTP configuration is not provided via DHCP option 042

Actually this already works through `/etc/dhcp/dhclient-exit-hooks.d/ntp`. The hook uses `/etc/ntp.conf` as its base, overwrites the `server|pool|peer` configuration with the servers from DHCP, writes the result to `/run/ntp.conf.dhc` and restarts `ntpd` through `/etc/init.d/ntp`, which prefers said file IFF it exists. The UCRVs `timeserver[23]` are thus IGNORED.
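
To illustrate why the UCRV-configured servers disappear, here is a rough Python sketch of the rewrite step described above. The real hook is a shell script; the function name `rewrite_for_dhcp` and the exact form of the generated lines (including the `iburst` option) are assumptions made for this illustration.

```python
import re

# Lines that configure a time source and get dropped by the rewrite.
SOURCE_LINE = re.compile(r"^\s*(server|pool|peer)\b")


def rewrite_for_dhcp(base_conf, dhcp_servers):
    """Return the base configuration with all sources replaced by DHCP ones."""
    # Keep everything from the base file except existing source lines.
    kept = [line for line in base_conf.splitlines()
            if not SOURCE_LINE.match(line)]
    # Append one "server" line per NTP server announced via DHCP
    # (the "iburst" option is an assumption of this sketch).
    added = ["server %s iburst" % server for server in dhcp_servers]
    return "\n".join(kept + added) + "\n"


if __name__ == "__main__":
    base = (
        "driftfile /var/lib/ntp/ntp.drift\n"
        "server dc1.example.org\n"
        "pool 2.debian.pool.ntp.org iburst\n"
    )
    print(rewrite_for_dhcp(base, ["10.0.0.13"]), end="")
```

Every source from the base file, whether it came from the default template or from the UCRVs, is removed before the DHCP servers are added.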

This led to the strange situation during my last 3 TT
TT 2020-07-01/02
TT 2020-06-18/19
TT 2020-05-14/15
that NTP was NOT WORKING at all in AWS EC2. The reason was that our CloudFormation template `schulung_v1.json` contains this chunk:

    "DhcpOptions" : {
      "Type" : "AWS::EC2::DHCPOptions",
      "Properties" : {
        "DomainName" : "schulung.ucs",
        "DomainNameServers" : ["10.0.0.13"],
        "NtpServers" : ["10.0.0.13"],
        "NetbiosNameServers" : ["10.0.0.13"],
        "NetbiosNodeType" : "2"
      }
    },

This also gets applied to the "DC Master", which then uses ITSELF as the ONLY time source. NTPd refuses to do so and will be stuck in the INIT state forever:

$ ntpq  -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 dc1.schulung.uc .INIT.          16 u    - 1024    0    0.000    0.000   0.000

This then cascades to the "DC Slave", which receives the same NTP configuration and will only contact NTPd on the Master. As the Master never gets a stable time, it will not answer any request, so the Slave is also stuck in the INIT state forever.

This is then flagged as an error by `check_univention_ntp` in Nagios.
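
The stuck state can be recognized from the `reach` column of `ntpq -p`, which stays at 0 as long as no reply was ever received. The following sketch is NOT the implementation of `check_univention_ntp`; the function name `count_unreachable` is hypothetical, and the second peer line in the example uses made-up values.

```python
def count_unreachable(ntpq_output):
    """Return (total, unreachable) peer counts parsed from `ntpq -p` text."""
    total = unreachable = 0
    in_table = False
    for line in ntpq_output.splitlines():
        if line.startswith("==="):
            # Peer lines start after the separator below the header.
            in_table = True
            continue
        if not in_table or not line.strip():
            continue
        # The first character is the tally code (' ', '*', '+', ...).
        fields = line[1:].split()
        if len(fields) < 10:
            continue
        # Columns from the right: jitter, offset, delay, reach.
        reach = int(fields[-4], 8)  # reach is an octal shift register
        total += 1
        if reach == 0:
            unreachable += 1
    return total, unreachable


if __name__ == "__main__":
    sample = (
        "     remote           refid      st t when poll reach   delay   offset  jitter\n"
        "==============================================================================\n"
        " dc1.schulung.uc .INIT.          16 u    - 1024    0    0.000    0.000   0.000\n"
    )
    total, bad = count_unreachable(sample)
    if total == bad:
        print("no NTP peer was ever reached")
```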

As AWS nowadays runs its own internal NTP server network <https://docs.aws.amazon.com/de_de/AWSEC2/latest/UserGuide/set-time.html>, "169.254.169.123" should be used by default in AWS EC2. It should be configured through
  ucr set timeserver='169.254.169.123 iburst'
to also speed up the initial sync.
As those servers are internal to AWS, no extra SecurityGroup rule like the following is needed:
            { "IpProtocol" : "udp", "FromPort" : "123", "ToPort" : "123", "CidrIp" : "0.0.0.0/0" },

Comment 2 Philipp Hahn

2020-11-26 11:48:06 CET

With UCS 5.0-0 `apt-get update` is broken after a suspend, as the downloaded file is dated too far in the future: Bug #51493

Comment 3 Philipp Hahn

2021-08-31 09:44:45 CEST

For AWS we set 169.254.169.123 by default since Bug #51558

Comment 4 Philipp Hahn

2022-01-21 17:47:41 CET

(In reply to Philipp Hahn from comment #0)
> - running NTP inside a VM is not recommended (Bug #47939) and leads to
> errors like Bug #51493 or Bug #49455

With Linux-4.11+ on the host there is "ptp-kvm", which exports the time of the host into any VM; "chrony" then can use "/dev/ptp_kvm" inside the VM to keep the VM up-to-date. "chrony" is also needed on the host to keep the host itself synchronized to wall-clock.
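
A possible way to locate the device inside the VM is sketched below. Several details are assumptions rather than facts from this bug: on many systems the clock shows up as `/dev/ptpN` instead of `/dev/ptp_kvm`, the sysfs attribute `clock_name` is expected to read "KVM virtual PTP" for the ptp-kvm driver, and the `poll 2` value in the generated chrony `refclock PHC` line is just an example. The function names are hypothetical.

```python
import os


def find_kvm_ptp_device(sysfs_root="/sys/class/ptp"):
    """Return the /dev path of the KVM PTP clock, or None if there is none."""
    try:
        names = sorted(os.listdir(sysfs_root))
    except OSError:
        # No PTP class directory at all, e.g. the module is not loaded.
        return None
    for name in names:
        try:
            with open(os.path.join(sysfs_root, name, "clock_name")) as fd:
                clock_name = fd.read().strip()
        except OSError:
            continue
        # Assumption: this is the name the ptp-kvm driver registers.
        if clock_name == "KVM virtual PTP":
            return "/dev/" + name
    return None


def chrony_refclock_line(device):
    """Return a chrony.conf line using the given PTP hardware clock."""
    return "refclock PHC %s poll 2" % device


if __name__ == "__main__":
    device = find_kvm_ptp_device()
    if device is None:
        print("no KVM PTP clock found")
    else:
        print(chrony_refclock_line(device))
```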