Univention Bugzilla – Bug 39429
Handle a restart of docker gracefully
Last modified: 2015-11-17 12:12:35 CET
When docker.io is restarted (invoke-rc.d docker restart), it hard-kills all containers. Those containers do not come back unless they are started manually. We may want to patch the docker init script to send stop to the containers (and wait) when docker is stopped. We may also want to save the state of the containers prior to the stop and restart those containers afterwards. Also check how docker behaves after a reboot of the host, and how the apps behave: they should be started automatically depending on $appid/autostart. Does the init script of the apps wait until docker is actually up and running?
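Regarding the last question, a minimal sketch of how an app init script could wait for the daemon before starting its container. This is an assumption for illustration, not what the shipped init scripts do: the function name wait_for_docker and its two parameters are hypothetical, and "docker info" is used only as a check that the daemon answers.

```shell
#!/bin/sh
# Hypothetical helper: poll the Docker daemon until it answers or we give up.
# $1 = maximum number of attempts (default 30)
# $2 = seconds to sleep between attempts (default 1)
wait_for_docker() {
    max_attempts="${1:-30}"
    interval="${2:-1}"
    attempt=0
    while [ "$attempt" -lt "$max_attempts" ]; do
        # "docker info" fails as long as the daemon is not reachable.
        if docker info >/dev/null 2>&1; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$interval"
    done
    return 1
}
```

An app init script could then call "wait_for_docker 30 1 || exit 1" before starting its container.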
Commit 64262: fix univention-firewall if Docker has not yet run
Commit 64270: make docker, app-container and non-app-container {start,stop,restart} in correct order

START
1. docker.io is started
2. if containers were stopped on shutdown of docker.io, they are started now
3. app-containers are started

STOP
1. app-containers are stopped
2. (docker.io init script) all running (non-app) containers are stopped, their IDs saved for START.2
3. docker.io is stopped

The init order is hard-coded; grep for 40/15 and 41/14 in r64270.
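For illustration, step STOP.2 could look roughly like the sketch below. It is not the code from r64270: the function name save_and_stop_containers and the default path of CONT_ID_FILE are assumptions (the variable name itself appears later in this bug). It relies on "docker ps -q" listing the IDs of running containers and "docker stop" stopping a container gracefully.

```shell
#!/bin/sh
# Assumed location; only the variable name is known from the init script.
CONT_ID_FILE="${CONT_ID_FILE:-/var/run/docker-previous-containers}"

# Hypothetical helper for STOP.2: remember which containers are still
# running (app containers were already stopped in STOP.1), then stop them.
save_and_stop_containers() {
    # One container ID per line, saved for START.2.
    docker ps -q > "$CONT_ID_FILE"
    while read -r id; do
        if [ -n "$id" ]; then
            # "docker stop" sends a stop signal and waits before killing.
            docker stop "$id" </dev/null >/dev/null
        fi
    done < "$CONT_ID_FILE"
    return 0
}
```

Saving the list before stopping matters: once the containers are down, "docker ps -q" no longer reports them.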
(In reply to Daniel Tröder from comment #1)
> Commit 64270: make docker, app-container and non-app-container
> {start,stop,restart} in correct order

This seems to break the App tests. Simply try this:

ucr set repository/online/unmaintained='yes'
univention-install ucs-test-docker
/usr/share/ucs-test/80_docker/55_app_modproxy -f

From the command line output:

Going to remove np1gw082xh (9.1.7)
Stopping np1gw082xh Container 2f896c594db7e0cf20b452c7efb2aedd4375e5995d07177568f127e9220c71eb ....
2f896c594db7
2f896c594db7e0cf20b452c7efb2aedd4375e5995d07177568f127e9220c71eb
File: /etc/univention/service.info/services/univention-appcenter.cfg
Multifile: /etc/apache2/sites-available/default-ssl
Multifile: /etc/apache2/sites-available/default
Registering UCR for np1gw082xh
Removing localhost from LDAP object
Removing LDAP object
Setting overview variables
File: /var/www/ucs-overview/entries.json
Reloading web server config: apache2.
update-rc.d -f docker-app-np1gw082xh remove
 Removing any system startup links for /etc/init.d/docker-app-np1gw082xh ...
   /etc/rc0.d/K14docker-app-np1gw082xh
   /etc/rc1.d/K14docker-app-np1gw082xh
   /etc/rc2.d/S41docker-app-np1gw082xh
   /etc/rc3.d/S41docker-app-np1gw082xh
   /etc/rc4.d/S41docker-app-np1gw082xh
   /etc/rc5.d/S41docker-app-np1gw082xh
   /etc/rc6.d/K14docker-app-np1gw082xh
Removing /etc/init.d/docker-app-np1gw082xh
File: /usr/share/univention-management-console/modules/apps.xml
File: /usr/share/univention-management-console/i18n/de/apps.mo
File: /etc/apt/apt.conf.d/55user_agent
Downloading "https://master441.deadlock44.intranet/meta-inf/categories.ini"...
Downloading "https://master441.deadlock44.intranet/meta-inf/4.1/index.json.gz"...
0 file(s) are new
'module' object has no attribute 'rm'
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/__init__.py", line 182, in call_with_namespace
    result = self.main(namespace)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/remove.py", line 59, in main
    self.do_it(args)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/install_base.py", line 106, in do_it
    self._do_it(app, args)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/docker_remove.py", line 54, in _do_it
    super(Remove, self)._do_it(app, args)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/remove.py", line 67, in _do_it
    self._unregister_app(app, args)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/docker_remove.py", line 84, in _unregister_app
    return super(Remove, self)._unregister_app(app, args)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/remove.py", line 95, in _unregister_app
    os.rm(init_script)
AttributeError: 'module' object has no attribute 'rm'
Traceback (most recent call last):
  File "/usr/bin/univention-app", line 84, in <module>
    main()
  File "/usr/bin/univention-app", line 74, in main
    ret = args.func(args)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/__init__.py", line 182, in call_with_namespace
    result = self.main(namespace)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/remove.py", line 59, in main
    self.do_it(args)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/install_base.py", line 106, in do_it
    self._do_it(app, args)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/docker_remove.py", line 54, in _do_it
    super(Remove, self)._do_it(app, args)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/remove.py", line 67, in _do_it
    self._unregister_app(app, args)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/docker_remove.py", line 84, in _unregister_app
    return super(Remove, self)._unregister_app(app, args)
  File "/usr/lib/pymodules/python2.7/univention/appcenter/actions/remove.py", line 95, in _unregister_app
    os.rm(init_script)
AttributeError: 'module' object has no attribute 'rm'
Cleanup after exception: <class 'dockertest.UCSTest_DockerApp_RemoveFailed'>
Ah great - I didn't know how to test the code. Actually, I found out that the symlink is already removed by other code, so the offending code has been removed. Fixed in 64296.
Seems to work, but is this correct?

-> started docker container

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain DOCKER (0 references)
target     prot opt source               destination
ACCEPT     tcp  --  0.0.0.0/0            172.17.0.20          tcp dpt:23
ACCEPT     tcp  --  0.0.0.0/0            172.17.0.20          tcp dpt:21

-> univention-firewall stop

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain DOCKER (0 references)
target     prot opt source               destination

-> univention-firewall start

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     tcp  --  0.0.0.0/0            172.17.0.20          tcp dpt:21
ACCEPT     tcp  --  0.0.0.0/0            172.17.0.20          tcp dpt:23

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain DOCKER (0 references)
target     prot opt source               destination

-> docker container restarted

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     tcp  --  0.0.0.0/0            172.17.0.20          tcp dpt:21
ACCEPT     tcp  --  0.0.0.0/0            172.17.0.20          tcp dpt:23

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain DOCKER (0 references)
target     prot opt source               destination
ACCEPT     tcp  --  0.0.0.0/0            172.17.0.21          tcp dpt:21
ACCEPT     tcp  --  0.0.0.0/0            172.17.0.21          tcp dpt:23
(IMO this kind of belongs to #38307, but I fixed it referencing this bug.) I moved the Docker container rules to the DOCKER chain, so the Docker engine removes them when it shuts down containers. Commit: 64626
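To make the change concrete, here is a rough sketch of what appending a container rule to the DOCKER chain instead of FORWARD amounts to. It is not the code from commit 64626: the helper name docker_chain_rule is hypothetical, and it only prints the iptables call so it can be inspected without touching the live rule set.

```shell
#!/bin/sh
# Hypothetical helper: print the iptables command that accepts TCP traffic
# to one container port in the DOCKER chain (dry run, nothing is applied).
# $1 = container IP address, $2 = destination port
docker_chain_rule() {
    ip="$1"
    port="$2"
    echo "iptables -A DOCKER -p tcp -d $ip --dport $port -j ACCEPT"
}
```

For example, "docker_chain_rule 172.17.0.20 21" prints "iptables -A DOCKER -p tcp -d 172.17.0.20 --dport 21 -j ACCEPT", which corresponds to the "tcp dpt:21" entry for 172.17.0.20 in the listings of the previous comment.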
(In reply to Daniel Tröder from comment #5)
> (IMO this kind of belongs to #38307, but I fixed it referencing this bug.)
>
> I moved the Docker container rules to the DOCKER chain, so the Docker engine
> removes them when it shuts down containers.
>
> Commit: 64626

OK, reopened #38307.

After stopping containers via /etc/init.d/docker stop, the ID is written to $CONT_ID_FILE. /etc/init.d/docker start starts all containers in $CONT_ID_FILE.

But as soon as I use "/etc/init.d/docker stop" once, the ID exists in $CONT_ID_FILE and the container is always started during docker start, regardless of whether the container was stopped by docker or by the app init script. Who is responsible for cleaning up $CONT_ID_FILE?
(In reply to Felix Botner from comment #6)
> But as soon as i use "/etc/init.d/docker stop" once, the id exists in
> $CONT_ID_FILE and the container is always started during docker start,
> regardless if the container was stopped by docker or the app init script.
> Who is responsible for cleaning up $CONT_ID_FILE?

The file is now cleaned after starting the container: r64976
previous_containers_list_clean() {
    ehco -n > "$CONT_ID_FILE"
}

=> echo
(In reply to Felix Botner from comment #8)
> previous_containers_list_clean() {
>     ehco -n > "$CONT_ID_FILE"
> }
>
> => echo

Yes: r64977
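For reference, the start side with the cleanup could be sketched as follows. This is a reconstruction under assumptions, not the code from r64976/r64977: start_previous_containers and the default path of CONT_ID_FILE are hypothetical, and the truncation uses ": >" in place of the "echo -n >" from the init script (both leave an empty file).

```shell
#!/bin/sh
# Assumed location; only the variable name is known from the init script.
CONT_ID_FILE="${CONT_ID_FILE:-/var/run/docker-previous-containers}"

# Truncate the list so a later "docker start" does not revive containers
# that were stopped by something else in the meantime.
previous_containers_list_clean() {
    : > "$CONT_ID_FILE"
}

# Hypothetical helper: start every container recorded at the last stop,
# then clear the list.
start_previous_containers() {
    # Nothing to do if the file is missing or empty.
    [ -s "$CONT_ID_FILE" ] || return 0
    while read -r id; do
        if [ -n "$id" ]; then
            docker start "$id" </dev/null >/dev/null \
                || echo "could not start container $id" >&2
        fi
    done < "$CONT_ID_FILE"
    previous_containers_list_clean
}
```

Clearing the list only after the loop keeps the IDs available while they are still being read.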
OK, works fine
UCS 4.1 has been released: https://docs.software-univention.de/release-notes-4.1-0-en.html https://docs.software-univention.de/release-notes-4.1-0-de.html If this error occurs again, please use "Clone This Bug".