Univention Bugzilla – Bug 45572
[4.2] Teachers collect data from themselves, quickly multiplying the folder size
Last modified: 2017-12-21 12:22:57 CET
When a teacher collects a project that has a class as member, files are collected from every user inside that class. This unfortunately includes the teacher's own project folder, which contains every previously collected version of the project. If the teacher collects the project multiple times, the size of each newly collected version multiplies, because it includes all previously collected versions from inside the teacher's project folder.

The collecting teacher shouldn't collect his own folder content.

-------------------
Simple example, assuming no one touches any files:

1 class containing 2 students and 1 teacher
1 project with that class as member and 1 project file, size: 1 MB

Teacher distributes project:
All class members (3) receive one 1 MB file.

Teacher collects project:
Teacher collects one 1 MB file from each of the two other members (2 MB).
Teacher then collects his own folder, including one 1 MB file and all folders from the other members (3 MB).
: Total teacher size: 1 MB + 2 MB + 3 MB = 6 MB

Teacher collects project a second time:
Teacher collects one 1 MB file from each of the two other members (2 MB).
Teacher then collects his own folder, which now contains one 1 MB file, his first collection of members (2 MB), his second collection of members (2 MB) and his first collection of himself (3 MB). This new collection of himself is 8 MB.
: Total teacher size: 1 MB + 2 MB + 2 MB + 3 MB + 8 MB = 16 MB

The folder would have a size of 36 MB after a third collection.

Graphical display after the second collection:

[16.0M]
├── [1.0M] FILE
├── [3.0M] teacher1
│   ├── [1.0M] FILE
│   ├── [1.0M] bschwartz
│   └── [1.0M] testStudent
├── [8.0M] teacher1 version2
│   ├── [1.0M] FILE
│   ├── [3.0M] teacher1
│   │   ├── [1.0M] FILE
│   │   ├── [1.0M] bschwartz
│   │   └── [1.0M] testStudent
│   ├── [1.0M] bschwartz
│   ├── [1.0M] bschwartz version2
│   ├── [1.0M] testStudent
│   └── [1.0M] testStudent version2
├── [1.0M] bschwartz
├── [1.0M] bschwartz version2
├── [1.0M] testStudent
└── [1.0M] testStudent version2
--------------------
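The arithmetic of the example can be checked with a small simulation. This is only a sketch of the size bookkeeping, not code from ucs-school-umc-distribution: the function name `teacher_folder_sizes` and its parameters are made up for illustration, and it assumes, as the example does, that no one touches any files between collections.

```python
def teacher_folder_sizes(collections, file_mb=1, students=2, collect_own_folder=True):
    """Return the size (in MB) of the teacher's project folder after each collection.

    Hypothetical model of the example in this report: every class member holds
    one project file of `file_mb` MB and nobody modifies anything.
    """
    size = file_mb  # the teacher's own copy of the distributed file
    sizes = []
    for _ in range(collections):
        # One fresh copy of every student's (unchanged) project folder.
        size += students * file_mb
        if collect_own_folder:
            # The teacher is a class member too: a snapshot of the whole
            # folder, including all earlier collections, is stored inside it.
            size += size
        sizes.append(size)
    return sizes


print(teacher_folder_sizes(3))                            # [6, 16, 36]
print(teacher_folder_sizes(3, collect_own_folder=False))  # [3, 5, 7]
```

With the self-collection the folder size roughly doubles on every run; without it the size grows by a constant 2 MB per collection.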
*** Bug 45573 has been marked as a duplicate of this bug. ***
The teacher's home directory is now skipped when collecting distributed materials.
[4.2 3cf055cd] Bug #45572: don't collect teachers own data
[4.2 c876fe56] Bug #45572: advisory
ucs-school-umc-distribution 15.0.1-3A~4.2.0.201711031346
If missing, please add a note to the UCS@school teacher manual that clearly states that files of users with the role teacher are never collected.
[4.2 7f2bbda9] Bug #45572: mention that teachers file are not collected Last sentence of chapter 3.7.3: http://jenkins.knut.univention.de:8080/job/UCSschool%204.2/job/Manual/20/artifact/webroot/ucsschool-lehrer-handbuch-4.2.pdf
Are you sure that the DNs are always equal in case? Proper DN comparison must be done with lo.compareDn(a, b). Also, the manual says that no teacher projects are collected, but this check only excludes the project owner.
Commit 700c665e: store and compare school roles instead of DNs Commit c8e1cce8: update advisory
So the behavior now is: "Teacher" and "Teachers and Staff" materials aren't collected. "Students" and "Staff" materials are collected.
The behavior is not consistent anymore. Why are the files distributed to teachers and teachers+staff but not collected from them? They should not be distributed at all then.
(In reply to Florian Best from comment #8)
> The behavior is not consistent anymore. Why are the files distributed to
> teachers and teachers+staff but not collected from them? They should not be
> distributed at all then.

That is not the issue here. It might even be intended. Please create a new bug if you think it should be changed.
As Sönke will discuss with you: Please revert the changes and instead implement a solution that renames the directory into which the files are collected. This prevents the exponential blow-up.
(In reply to Florian Best from comment #7)
> So the behavior now is: "Teacher" and "Teachers and Staff" materials aren't
> collected. "Students" and "Staff" materials are collected.

Staff users are usually not replicated to the educational UCS@school servers (→ Edu DC Slave). I.e. only files of "student" users are collected.

(In reply to Florian Best from comment #8)
> The behavior is not consistent anymore. Why are the files distributed to
> teachers and teachers+staff but not collected from them? They should not be
> distributed at all then.

Good question. We initiated a new discussion with ProfServices and Development, and we all think that the current approach does not exactly meet the requirements. The current approach will prevent the collection of files distributed to other teachers.

Instead, a new solution has been discussed:

--- a/ucs-school-umc-distribution/umc/python/distribution/util.py
+++ b/ucs-school-umc-distribution/umc/python/distribution/util.py
@@ -54,6 +54,7 @@ _ = Translation('ucs-school-umc-distribution').translate
 DISTRIBUTION_CMD = '/usr/lib/ucs-school-umc-distribution/umc-distribution'
 DISTRIBUTION_DATA_PATH = ucr.get('ucsschool/datadistribution/cache', '/var/lib/ucs-school-umc-distribution')
+POSTFIX_DATADIR_SENDER_PROJECT_SUFFIX = ucr.get('ucsschool/datadistribution/datadir/sender/project/suffix', '-Ergebnisse')
 POSTFIX_DATADIR_SENDER = ucr.get('ucsschool/datadistribution/datadir/sender', 'Unterrichtsmaterial')
 POSTFIX_DATADIR_RECIPIENT = ucr.get('ucsschool/datadistribution/datadir/recipient', 'Unterrichtsmaterial')
@@ -264,7 +265,7 @@ class Project(_Dict):
     def sender_projectdir(self):
         '''The absolute path of the project directory in the senders home.'''
         if self.sender and self.sender.homedir:
-            return os.path.join(self.sender.homedir, POSTFIX_DATADIR_SENDER, self.name)
+            return os.path.join(self.sender.homedir, POSTFIX_DATADIR_SENDER, '%s%s' % (self.name, POSTFIX_DATADIR_SENDER_PROJECT_SUFFIX))
         return None
     @property

→ files are distributed to $HOME/Unterrichtsmaterial/${PROJECTNAME}/
→ files are collected to $HOME/Unterrichtsmaterial/${PROJECTNAME}-Ergebnisse/
→ this should prevent the exponential collection loop
→ REOPEN
→ please revert the modifications done so far and apply the patch above

A hotfix is currently also possible: use different directory names in "ucsschool/datadistribution/datadir/sender" and "ucsschool/datadistribution/datadir/recipient". Currently both default to "Unterrichtsmaterial".
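The effect of the proposed suffix on the two directories can be sketched as follows. This is a standalone illustration, not the module code: the constants are hard-coded to the defaults shown in the patch instead of being read from UCR, `recipient_projectdir` and the `homedir`/`project_name` parameters are hypothetical stand-ins for the `Project` properties, and `posixpath` is used so that the result does not depend on the platform.

```python
import posixpath

# Defaults as shown in the patch; the real module reads them from UCR.
POSTFIX_DATADIR_SENDER = 'Unterrichtsmaterial'
POSTFIX_DATADIR_RECIPIENT = 'Unterrichtsmaterial'
POSTFIX_DATADIR_SENDER_PROJECT_SUFFIX = '-Ergebnisse'


def recipient_projectdir(homedir, project_name):
    """Directory into which a project is distributed (hypothetical helper)."""
    return posixpath.join(homedir, POSTFIX_DATADIR_RECIPIENT, project_name)


def sender_projectdir(homedir, project_name):
    """Directory into which results are collected, with the suffix applied."""
    return posixpath.join(
        homedir,
        POSTFIX_DATADIR_SENDER,
        '%s%s' % (project_name, POSTFIX_DATADIR_SENDER_PROJECT_SUFFIX),
    )


home = '/home/teacher1'  # example path, assumed for illustration
distributed = recipient_projectdir(home, 'project1')
collected = sender_projectdir(home, 'project1')
print(distributed)  # /home/teacher1/Unterrichtsmaterial/project1
print(collected)    # /home/teacher1/Unterrichtsmaterial/project1-Ergebnisse

# The collection directory is a sibling of the distribution directory, so a
# later collection of the teacher's distributed folder no longer contains
# earlier results.
assert not collected.startswith(distributed + '/')
```

Because the results now land next to the distributed project folder rather than inside it, collecting the teacher's own distributed folder picks up only the original files.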
The previous commits were reverted:

commit d9988b95c2138d5906768b53a4ffcfac5eef1eeb
    Revert "Bug #45572: store and compare school roles instead of DNs"
    This reverts commit 700c665ea1fe51326a84fa874a977f314249d3d5.

commit fcc9fd2cfb0c06928b96b85218d0e1eee49da1bd
    Revert "Bug #45572: mention that teachers file are not collected"
    This reverts commit 7f2bbda9078d7ec9b8c7345f52f32e2eeecc5592.

commit cd60ae1e2479814a9d8e31d3a726920c321226a4
    Revert "Bug #45572: don't collect teachers own data"
    This reverts commit 3cf055cd97be2116ec5664964d01f7f0fa7fd720.

The new UCRV ucsschool/datadistribution/datadir/sender/project/suffix was introduced:

commit bf2fbb22e537af0c32a41ee012a1bac7532b54b7
    Bug #45572: separate project distribution and collection directories

It is explained in the manual:

commit 26c28855cd54db3fb59fb142cbb61cfb31038f02
    Bug #45572: literal and anvar cannot be children of footnote

commit 1cb4ac5b8d4056ff2941a9040d4a45b5d015be31
    Bug #45572: explain UCR in manual

http://jenkins.knut.univention.de:8080/job/UCSschool%204.2/job/Manual/28/artifact/webroot/ucsschool-lehrer-handbuch-4.2.pdf
→ "3.7.3. Einsammeln der verteilten Dokumente"

Asked @ProfS: UCRV descriptions have not been requested.
commit 150fde31d0be7f11beb627f883b8f747bfe179e6
    Bug #45572: advisory update

commit c0c00b8ced39ab34f755e7f2b315c98996112d1e
    Bug #45572: changelog and advisory

ucs-school-umc-distribution 15.0.1-5A~4.2.0.201711221151
OK: the changes work nicely.
REOPEN: the adjustment in the documentation. It is a teacher manual, not an administrator manual, so UCR variables shouldn't be mentioned there. There is also a typo: "des Lehrkraft".
[4.2 3838aeca] Bug #45572: don't mention UCR variable in teacher manual, fix typo http://jenkins.knut.univention.de:8080/job/UCSschool%204.2/job/Manual/29/artifact/webroot/ucsschool-lehrer-handbuch-4.2.pdf
OK: docs
I found several problems:
1) The ucs-test script fails because it has not been adapted to the changes of this bug → I will commit the necessary changes in a minute.
2) The manual and the YAML file now state that the results are collected to "Unterrichtsmaterial-Ergebnisse", but they are not! The results are collected to "Unterrichtsmaterial/${USERNAME}-Ergebnisse/".
3) Suffix versus prefix in the YAML file: "The prefix for the name of that directory can be modified with the UCR variable <envar>ucsschool/datadistribution/datadir/sender/project/suffix</envar>"

Please fix the manual and the YAML file. The target directory is correct (it matches our discussion between development and professional services).
ucs-test-ucsschool (4.0.4-46): 220796179424 | Bug #45572: fixed essential/distribution.py according to new directory for results
[4.2 b4274dd0] Bug #45572: fix directory name
3.7.3. Einsammeln der verteilten Dokumente -> http://jenkins.knut.univention.de:8080/job/UCSschool%204.2/job/Manual/lastSuccessfulBuild/artifact/webroot/ucsschool-lehrer-handbuch-4.2.pdf
(In reply to Daniel Tröder from comment #19)
> [4.2 b4274dd0] Bug #45572: fix directory name

→ VERIFIED
UCS@school 4.2 v6 has been released. http://docs.software-univention.de/changelog-ucsschool-4.2v6-de.html If this error occurs again, please clone this bug.