Bug 50295 - New Docker service does not take /etc/default/docker into account
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: App Center
Version: UCS 4.4
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.4-2-errata
Assigned To: Dirk Wiesenthal
QA Contact: Johannes Keiser
URL: https://docs.docker.com/v17.09/engine...
Reported: 2019-09-30 15:16 CEST by Dirk Wiesenthal
Modified: 2019-10-09 14:21 CEST (History)
What kind of report is it?: Bug Report
What type of bug is this?: 6: Setup Problem: Issue for the setup process
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.171
Description Dirk Wiesenthal univentionstaff 2019-09-30 15:16:36 CEST
... instead, it uses
  /etc/docker/daemon.json

Therefore, nothing set in the former file is used. And we put some UCR variables in there, notably docker/daemon/default/opts/bip.

This value is also used as "DB_HOST" for some Apps (owncloud, nextcloud), and now they cannot access our MySQL / Postgres DB.
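
To illustrate why the bip value matters for database access: the address of docker0 is the host part of that value. A minimal Python sketch, assuming DB_HOST is obtained by stripping the prefix length from bip (the function name db_host_from_bip is hypothetical, this is not App Center code):

```python
import ipaddress


def db_host_from_bip(bip):
    """Return the docker0 host address for a bip value such as '172.17.42.1/16'."""
    # ip_interface keeps the host bits; ip_network would reject them by default.
    return str(ipaddress.ip_interface(bip).ip)


print(db_host_from_bip("172.17.42.1/16"))
```

If the daemon ignores the configured bip, docker0 gets a different address and the derived DB_HOST points nowhere.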
Comment 1 Dirk Wiesenthal univentionstaff 2019-09-30 15:17:06 CEST
This is the file that seems to make it work:

{
	"storage-driver": "overlay",
	"log-opts": {
		"max-file": "4",
		"max-size": "10m"
	},
	"log-driver": "json-file",
	"bip": "172.17.42.1/16",
	"live-restore": true
}
Comment 2 Dirk Wiesenthal univentionstaff 2019-09-30 15:21:05 CEST
Double check: After I restarted the Docker Daemon with the new json file, all running containers seemed to be gone...
Comment 3 Dirk Wiesenthal univentionstaff 2019-10-01 02:36:13 CEST
Findings so far:

If the system was at 4.4-1 at some point, the docker0 is accessible via 172.17.42.1 (important for database integration) and the storage driver is "overlay" (deprecated!). Even after a service restart.

If it is a new 4.4-2, docker0 is at 172.168.0.1 (problem!) and storage driver is "overlay2".


If we change the storage driver during a package update (and restart the daemon), all containers are "gone". At least it seems like it. Changing the storage driver back makes them appear again. => We may not change the currently running driver. Maybe not set it in daemon.json at all?
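
Before touching the storage driver, it helps to know which one an existing container uses. A rough Python sketch that reads it from the JSON printed by "docker inspect $container"; the sample input is made up and heavily shortened, and the helper name storage_driver is hypothetical:

```python
import json


def storage_driver(inspect_output):
    """Return the storage driver name from the JSON printed by `docker inspect`."""
    containers = json.loads(inspect_output)  # docker inspect prints a JSON array
    return containers[0]["GraphDriver"]["Name"]


# Made-up, shortened sample of `docker inspect` output for one container.
sample = '[{"Id": "abc123", "GraphDriver": {"Name": "overlay2", "Data": {}}}]'
print(storage_driver(sample))
```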

If we use the daemon.json and restart the service (thereby leaving the storage driver as is), some containers fail to start. Their status is "Exited (137)". But a "docker restart $id" helps! From what I can tell, this affects only Single Container Apps (nextcloud, horde, etc.), not Multi Container Apps (rocketchat, etc.). Maybe docker-compose created them / manages them in a smarter way? => "docker restart $id" in postinst?


Also found that the service is very picky regarding its config file. If the json does not fully fit its expectations, the service refuses to start. This is a problem with our current approach to just write anything that we find in UCR. We may want to support only a well-defined subset of the daemon.json?
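
One way to restrict ourselves to a well-defined subset could look like the following Python sketch. The whitelist SUPPORTED and the function filter_daemon_config are hypothetical; the option names are taken from the daemon.json above, the expected types are my assumption:

```python
import json

# Hypothetical whitelist: option name -> expected JSON type.
SUPPORTED = {
    "bip": str,
    "log-driver": str,
    "log-opts": dict,
    "live-restore": bool,
}


def filter_daemon_config(candidate):
    """Keep only supported options; raise if a supported option has the wrong type."""
    result = {}
    for key, value in candidate.items():
        expected = SUPPORTED.get(key)
        if expected is None:
            continue  # drop anything outside the supported subset
        if not isinstance(value, expected):
            raise ValueError("%s must be of type %s" % (key, expected.__name__))
        result[key] = value
    return result


config = filter_daemon_config(
    {"bip": "172.17.42.1/16", "live-restore": True, "unknown-flag": "x"}
)
print(json.dumps(config, indent=2, sort_keys=True))
```

Unknown options are dropped instead of being written to the file, so a stray UCR variable cannot keep the service from starting.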


"overlay2" seems to be the way to go. Unfortunately, I am not aware of any fool-proof migration from "overlay". And I hesitate to do it automatically during a postinst. We may have to leave that for later and run with "overlay" for updated UCS installations while staying at "overlay2" for new installations. The App Center can handle both today as it parses "docker inspect $container".
Comment 4 Dirk Wiesenthal univentionstaff 2019-10-01 13:22:14 CEST
/etc/docker/daemon.json is now generated by

docker/daemon/default/opts/max-file
docker/daemon/default/opts/max-size
docker/daemon/default/opts/log-driver
docker/daemon/default/parameter/live-restore
docker/daemon/default/opts/bip

The storage driver is intentionally not set.
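
For illustration, the mapping from these UCR variables to daemon.json could be sketched in Python as below. This is not the actual template; the function name daemon_json_from_ucr and the set of strings treated as true are assumptions:

```python
import json


def daemon_json_from_ucr(ucr):
    """Build daemon.json content from a dict of UCR variables (values are strings)."""
    prefix = "docker/daemon/default/"
    config = {}
    log_opts = {}
    for name in ("max-file", "max-size"):
        value = ucr.get(prefix + "opts/" + name)
        if value:
            log_opts[name] = value
    if log_opts:
        config["log-opts"] = log_opts
    for name in ("log-driver", "bip"):
        value = ucr.get(prefix + "opts/" + name)
        if value:
            config[name] = value
    live_restore = ucr.get(prefix + "parameter/live-restore")
    if live_restore:
        # daemon.json expects a JSON boolean, UCR stores text (assumed spellings).
        config["live-restore"] = live_restore.lower() in ("true", "yes", "1")
    # No "storage-driver" key: the running driver must not be changed.
    return json.dumps(config, indent=2, sort_keys=True)


example = {
    "docker/daemon/default/opts/max-file": "4",
    "docker/daemon/default/opts/max-size": "10m",
    "docker/daemon/default/opts/log-driver": "json-file",
    "docker/daemon/default/parameter/live-restore": "true",
    "docker/daemon/default/opts/bip": "172.17.42.1/16",
}
print(daemon_json_from_ucr(example))
```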

Containers that do not survive the restart of Docker (they happen to belong to Single Container Apps, still don't know why) are restarted with
  docker restart

This is only done if the system was not updated from UCS 4.4-1 and was not installed from the latest DVD, which already contains the first fix.
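
The selection of containers to restart can be sketched in Python as follows. This is not the postinst code; it assumes input in the form printed by docker ps -a --format '{{.ID}} {{.Status}}', and the sample IDs are made up:

```python
import re

EXITED = re.compile(r"^Exited \((\d+)\)")


def containers_to_restart(ps_output, exit_codes=(137,)):
    """Return IDs of containers whose status is 'Exited (<code>)' for a given code."""
    ids = []
    for line in ps_output.splitlines():
        container_id, _, status = line.partition(" ")
        match = EXITED.match(status)
        if match and int(match.group(1)) in exit_codes:
            ids.append(container_id)
    return ids


# Made-up sample: one running container, two stopped ones.
sample = """1a2b3c Up 3 hours
4d5e6f Exited (137) 2 minutes ago
7a8b9c Exited (0) 2 minutes ago"""
print(containers_to_restart(sample))
```

Note that a filter on a single exit code skips containers that stopped with any other code; pass a wider exit_codes tuple to include them.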

Additionally, ucs-test now has an updated 00_checks/51_check_for_docker that checks for the IP address of docker0 and checks the storage driver.
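
A check of this kind could look roughly like the Python sketch below. It is not the ucs-test script; it assumes the caller passes in the output of "ip -4 -o addr show docker0" and the driver name reported by docker info --format '{{.Driver}}', and the function name check_docker_setup is hypothetical:

```python
import re


def check_docker_setup(ip_addr_output, driver, expected_bip,
                       allowed_drivers=("overlay", "overlay2")):
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    match = re.search(r"\binet (\S+)", ip_addr_output)
    actual = match.group(1) if match else None
    if actual != expected_bip:
        problems.append("docker0 has %s, expected %s" % (actual, expected_bip))
    if driver not in allowed_drivers:
        problems.append("unexpected storage driver %s" % driver)
    return problems


# Made-up sample line in the format of `ip -4 -o addr show docker0`.
sample = "4: docker0    inet 172.17.42.1/16 brd 172.17.255.255 scope global docker0"
print(check_docker_setup(sample, "overlay2", "172.17.42.1/16"))
```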
Comment 5 Felix Botner univentionstaff 2019-10-01 17:56:17 CEST
FAIL - yaml version
FAIL - the container restart is only performed for exit code 137, but grafana and prometheus exit with code 0 (during the docker restart in the univention-docker postinst), so these two containers (Apps) are no longer running after the errata update of univention-docker. How do we proceed?

OK - restart is NOT performed on a new installation
OK - restart is NOT performed on an update from UCS < 4.4-2
OK - restart IS performed on an update from a broken UCS 4.4-2 installation (univention-docker 4.0.0-4A~4.4.0.201904301731)
Comment 6 Dirk Wiesenthal univentionstaff 2019-10-02 10:19:42 CEST
Fixed in
  univention-docker 4.0.1-4A~4.4.0.201910021015

Containers are not restarted with "docker restart" but with "univention-app restart". Theoretically, some Apps were stopped by the Admin for a reason and theoretically, there may be some Docker containers not managed by the App Center that are not restarted. But all of this seems unlikely as the broken version was only available for 7 days and only for new installations.
Comment 7 Dirk Wiesenthal univentionstaff 2019-10-02 11:17:21 CEST
These Apps from the production App Center may be affected:

Multi Container Apps _should_ be restarted by docker-compose (at least this is what I found), the rest needs to be restarted. Those with "database" may not be installable at all. At least they should have some issues as their database connection cannot be established until the new univention-docker version is installed.

multi           4.3/guacamole=0.9.13-univention14
multi           4.3/rocketchat=1.2.1
multi           4.3/zammad=3.0.0-9
multi           seafile=7.0.7
multi           wekan=3.42
multi  database 4.3/owncloud=10.2.1-1
single          4.1/jira=7.1.4
single          4.1/tecart=4.5.1 ucs-1
single          4.1/wildfly=9.0.2-1
single          4.2/bettermarks=1.0
single          4.2/jenkins=2.150.2
single          4.2/matterbridge=1.14.4
single          4.2/mattermost=5.10.0
single          4.2/minio=RELEASE.2018-09-12T18-49-56Z
single          4.3/admin-dashboard=1.2
single          4.3/agorumcore-pro=9.0.0
single          4.3/benno-mailarchiv=2.4.6
single          4.3/collabora-online=3.4.2.1
single          4.3/collabora=4.0.6.1
single          4.3/dudle=1.2.0-1
single          4.3/itslearning=2.1
single          4.3/openid-connect-provider=1.1-konnect-0.23.3
single          4.3/prometheus=1.1
single          filewave=13.1.1
single          gitlab=12.1.0
single database 4.1/nextcloud=16.0.4-0
single database 4.1/wawision=16.3
single database 4.2/digitec-bis=1.2.0
single database 4.2/noctua=3.0
single database 4.2/tine20=2017.11.8-ucs1
single database 4.2/wordpress=4.9.4
single database 4.3/bluespice=3.0.1-ucs.1
single database 4.3/digitec-suitecrm=7.11.3
single database 4.3/egroupware=17.1.20190808-docker-ucs43
single database 4.3/etherpad-lite=1.6.6
single database 4.3/horde=5.2.17-3
single database 4.3/openproject=9.0.2
single database 4.3/relution=4.52
single database 4.3/xentral=19.1
single database odoo=12.0
single database onlyoffice-ds=5.3.0.243
Comment 8 Johannes Keiser univentionstaff 2019-10-08 12:18:05 CEST
(In reply to Dirk Wiesenthal from comment #7)

OK: docker containers are still running after upgrade
OK: yaml
-> verified
Comment 9 Erik Damrose univentionstaff 2019-10-09 14:21:19 CEST
<http://errata.software-univention.de/ucs/4.4/300.html>