Bug 45572 - [4.2] Teachers collect data from themselves, quickly multiplying the folder size
Status: CLOSED FIXED
Product: UCS@school
Classification: Unclassified
Component: UMC - Distribution
Version: UCS@school 4.1 R2
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS@school 4.2 v6
Assigned To: Daniel Tröder
QA Contact: Florian Best
Duplicates: 45573
Depends on:
Blocks: 45645
Reported: 2017-10-20 15:21 CEST by Hendrik Peter
Modified: 2017-12-21 12:22 CET
CC List: 4 users

What kind of report is it?: Bug Report

Description Hendrik Peter univentionstaff 2017-10-20 15:21:17 CEST
When a teacher collects a project that has a class as a member, files are collected from every user in that class.
Unfortunately, this includes the teacher's own project folder, which contains every previously collected version of the project.

If the teacher collects the project multiple times, the size of each newly collected version multiplies, because it includes all previously collected versions from the teacher's project folder.

The collecting teacher shouldn't collect his own folder content.

-------------------
Simple example, assuming no one touches any files:
1 Class containing
2 Students and
1 Teacher
1 Project with that Class as member and
1 Project File, Size: 1Mb

Teacher distributes project:
All class members (3) receive one 1Mb file

Teacher collects project:
Teacher collects one 1Mb file from each of the two other members (2Mb)
Teacher then collects his own folder, which contains one 1Mb file and the folders just collected from the other members (3Mb)
:Total Teacher size: 1Mb + 2Mb + 3Mb = 6Mb

Teacher collects project a second time:
Teacher collects one 1Mb file from each of the two other members (2Mb)
Teacher then collects his own folder, which by now contains
 one 1Mb file,
 his first collection of members (2Mb),
 his second collection of members (2Mb) and
 his first collection of himself (3Mb),
 which makes the new collection of himself 8Mb
:Total Teacher size: 1Mb + 2Mb + 2Mb + 3Mb + 8Mb = 16Mb

The folder would have a size of 36Mb after a third collection.
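
The arithmetic above can be replayed with a short simulation. This is only a sketch of the example's assumptions (nobody touches any file, and the teacher's own folder is copied after the students' folders in each run); the function name and parameters are made up for illustration and are not part of ucs-school-umc-distribution.

```python
def folder_size_after(collections, file_size=1, students=2):
    """Size of the teacher's project folder after N collections.

    Sizes use the same unit as the example above. Hypothetical helper,
    written only to replay the numbers from the report.
    """
    size = file_size  # the teacher's own copy of the distributed file
    for _ in range(collections):
        size += students * file_size  # one fresh copy from each student
        size += size                  # the teacher's folder is copied into itself
    return size


sizes = [folder_size_after(n) for n in range(5)]
print(sizes)  # [1, 6, 16, 36, 76]
```

Each run roughly doubles the folder, so under these assumptions the size grows exponentially with the number of collections rather than linearly.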

Graphical Display after second collection:
[16.0M]
├── [1.0M]  FILE
├── [3.0M]  teacher1
│   ├── [1.0M]  FILE
│   ├── [1.0M]  bschwartz
│   └── [1.0M]  testStudent
├── [8.0M]  teacher1 version2
│   ├── [1.0M]  FILE
│   ├── [3.0M]  teacher1
│   │   ├── [1.0M]  FILE
│   │   ├── [1.0M]  bschwartz
│   │   └── [1.0M]  testStudent
│   ├── [1.0M]  bschwartz
│   ├── [1.0M]  bschwartz version2
│   ├── [1.0M]  testStudent
│   └── [1.0M]  testStudent version2
├── [1.0M]  bschwartz
├── [1.0M]  bschwartz version2
├── [1.0M]  testStudent
└── [1.0M]  testStudent version2

--------------------
Comment 1 Florian Best univentionstaff 2017-10-23 10:10:18 CEST
*** Bug 45573 has been marked as a duplicate of this bug. ***
Comment 2 Daniel Tröder univentionstaff 2017-11-03 13:50:04 CET
The teacher's home directory is now skipped when collecting distributed materials.

[4.2 3cf055cd] Bug #45572: don't collect teachers own data
[4.2 c876fe56] Bug #45572: advisory

ucs-school-umc-distribution 15.0.1-3A~4.2.0.201711031346
Comment 3 Sönke Schwardt-Krummrich univentionstaff 2017-11-06 11:41:20 CET
If missing, please add a note to the UCS@school teacher manual that clearly states that files of users with the role teacher are never collected.
Comment 4 Daniel Tröder univentionstaff 2017-11-06 15:56:56 CET
[4.2 7f2bbda9] Bug #45572: mention that teachers file are not collected

Last sentence of chapter 3.7.3: http://jenkins.knut.univention.de:8080/job/UCSschool%204.2/job/Manual/20/artifact/webroot/ucsschool-lehrer-handbuch-4.2.pdf
Comment 5 Florian Best univentionstaff 2017-11-08 11:00:23 CET
Are you sure that the DNs are always equal in case? Proper DN comparison must be done with lo.compareDn(a, b).
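
To illustrate the concern: two DN strings can refer to the same object and still differ as plain strings. The helper below is a deliberately simplified, hypothetical stand-in (it ignores escaped commas and multi-valued RDNs and treats every value as case-insensitive); in the module itself the LDAP connection's compareDn mentioned above would be the thing to use. The DNs are invented for the example.

```python
def naive_dn_equal(a, b):
    """Compare two DNs ignoring case and whitespace around separators.

    Simplified illustration only: no support for escaped commas or
    multi-valued RDNs, and all attribute values are lower-cased.
    """
    def normalize(dn):
        parts = []
        for rdn in dn.split(','):
            attr, _, value = rdn.partition('=')
            parts.append((attr.strip().lower(), value.strip().lower()))
        return parts

    return normalize(a) == normalize(b)


# Hypothetical DNs that differ only in case.
a = 'uid=teacher1,cn=lehrer,cn=users,ou=School1,dc=example,dc=com'
b = 'UID=Teacher1,CN=lehrer,CN=users,OU=school1,DC=example,DC=com'

print(a == b)                # False: plain string comparison misses the match
print(naive_dn_equal(a, b))  # True
```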

Also, in the manual you write that no teacher projects are collected, but this check only excludes the project owner.
Comment 6 Daniel Tröder univentionstaff 2017-11-08 12:31:22 CET
Commit 700c665e: store and compare school roles instead of DNs
Commit c8e1cce8: update advisory
Comment 7 Florian Best univentionstaff 2017-11-09 12:53:15 CET
So the behavior now is: "Teacher" and "Teachers and Staff" materials aren't collected. "Students" and "Staff" materials are collected.
Comment 8 Florian Best univentionstaff 2017-11-09 13:13:25 CET
The behavior is not consistent anymore. Why are the files distributed to teachers and teachers+staff but not collected from them? They should not be distributed at all then.
Comment 9 Daniel Tröder univentionstaff 2017-11-09 13:59:43 CET
(In reply to Florian Best from comment #8)
> The behavior is not consistent anymore. Why are the files distributed to
> teachers and teachers+staff but not collected from them? They should not be
> distributed at all then.
That is not the issue here. It might even be intended.
Please create a new bug if you think it should be changed.
Comment 10 Florian Best univentionstaff 2017-11-09 14:55:56 CET
As Sönke will discuss with you:
Please revert the changes and instead implement a solution that renames the directory into which the files are collected. This prevents the exponential blow-up.
Comment 11 Sönke Schwardt-Krummrich univentionstaff 2017-11-09 15:02:26 CET
(In reply to Florian Best from comment #7)
> So the behavior now is: "Teacher" and "Teachers and Staff" materials aren't
> collected. "Students" and "Staff" materials are collected.

Staff users are usually not replicated to the educational UCS@school servers (→ Edu DC Slave). I.e. only files of "student" users are collected.

(In reply to Florian Best from comment #8)
> The behavior is not consistent anymore. Why are the files distributed to
> teachers and teachers+staff but not collected from them? They should not be
> distributed at all then.

Good question. We initiated a new discussion with ProfServices and Development, and we all think that the current approach does not quite meet the requirements. The current approach will prevent the collection of files distributed to other teachers.

Instead, a new solution has been discussed:

--- a/ucs-school-umc-distribution/umc/python/distribution/util.py
+++ b/ucs-school-umc-distribution/umc/python/distribution/util.py
@@ -54,6 +54,7 @@ _ = Translation('ucs-school-umc-distribution').translate
 
 DISTRIBUTION_CMD = '/usr/lib/ucs-school-umc-distribution/umc-distribution'
 DISTRIBUTION_DATA_PATH = ucr.get('ucsschool/datadistribution/cache', '/var/lib/ucs-school-umc-distribution')
+POSTFIX_DATADIR_SENDER_PROJECT_SUFFIX = ucr.get('ucsschool/datadistribution/datadir/sender/project/suffix', '-Ergebnisse')
 POSTFIX_DATADIR_SENDER = ucr.get('ucsschool/datadistribution/datadir/sender', 'Unterrichtsmaterial')
 POSTFIX_DATADIR_RECIPIENT = ucr.get('ucsschool/datadistribution/datadir/recipient', 'Unterrichtsmaterial')
 
@@ -264,7 +265,7 @@ class Project(_Dict):
 	def sender_projectdir(self):
 		'''The absolute path of the project directory in the senders home.'''
 		if self.sender and self.sender.homedir:
-			return os.path.join(self.sender.homedir, POSTFIX_DATADIR_SENDER, self.name)
+			return os.path.join(self.sender.homedir, POSTFIX_DATADIR_SENDER, '%s%s' % (self.name, POSTFIX_DATADIR_SENDER_PROJECT_SUFFIX))
 		return None
 
 	@property


→ files are distributed to $HOME/Unterrichtsmaterial/${PROJECTNAME}/
→ files are collected to $HOME/Unterrichtsmaterial/${PROJECTNAME}-Ergebnisse/
→ this should prevent the exponential collection loop
→ REOPEN
→ please revert the modifications done so far and apply the patch above

A hotfix is currently also possible:
Use different directory names in "ucsschool/datadistribution/datadir/sender" and "ucsschool/datadistribution/datadir/recipient". Currently both default to "Unterrichtsmaterial".
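
A minimal sketch of why both approaches break the loop, assuming the paths are built as in the patch above. The two helper functions, the home directory, the project name and the directory name 'Collected' are hypothetical and only mirror the os.path.join calls from the diff.

```python
import os.path


def recipient_projectdir(homedir, project, recipient_dir='Unterrichtsmaterial'):
    """Where a recipient (here: the teacher as class member) gets the files."""
    return os.path.join(homedir, recipient_dir, project)


def sender_projectdir(homedir, project, sender_dir='Unterrichtsmaterial', suffix=''):
    """Where the collected results end up in the sender's home."""
    return os.path.join(homedir, sender_dir, '%s%s' % (project, suffix))


home = '/home/teacher1'  # hypothetical home directory
project = 'project1'     # hypothetical project name

received = recipient_projectdir(home, project)
old = sender_projectdir(home, project)
patched = sender_projectdir(home, project, suffix='-Ergebnisse')
hotfix = sender_projectdir(home, project, sender_dir='Collected')

# Old behaviour: results land in the very folder that is collected next time.
print(old == received)
# Patched and hotfix targets are neither the received folder nor inside it.
for target in (patched, hotfix):
    print(target, target == received, target.startswith(received + os.sep))
```

With the suffix, the results directory is a sibling of the distributed project directory, so a later collection of the teacher's own project folder no longer picks up earlier results.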
Comment 12 Daniel Tröder univentionstaff 2017-11-22 11:22:54 CET
The previous commits were reverted:

commit d9988b95c2138d5906768b53a4ffcfac5eef1eeb
    Revert "Bug #45572: store and compare school roles instead of DNs"    
    This reverts commit 700c665ea1fe51326a84fa874a977f314249d3d5.
commit fcc9fd2cfb0c06928b96b85218d0e1eee49da1bd
    Revert "Bug #45572: mention that teachers file are not collected"    
    This reverts commit 7f2bbda9078d7ec9b8c7345f52f32e2eeecc5592.
commit cd60ae1e2479814a9d8e31d3a726920c321226a4
    Revert "Bug #45572: don't collect teachers own data"
    This reverts commit 3cf055cd97be2116ec5664964d01f7f0fa7fd720.

And the new UCRV ucsschool/datadistribution/datadir/sender/project/suffix was introduced.

commit bf2fbb22e537af0c32a41ee012a1bac7532b54b7
    Bug #45572: separate project distribution and collection directories

And it is explained in the manual.

commit 26c28855cd54db3fb59fb142cbb61cfb31038f02
    Bug #45572: literal and anvar cannot be children of footnote
commit 1cb4ac5b8d4056ff2941a9040d4a45b5d015be31
    Bug #45572: explain UCR in manual

http://jenkins.knut.univention.de:8080/job/UCSschool%204.2/job/Manual/28/artifact/webroot/ucsschool-lehrer-handbuch-4.2.pdf → "3.7.3. Einsammeln der verteilten Dokumente"

Asked @ProfS, UCRV descriptions have not been requested.
Comment 13 Daniel Tröder univentionstaff 2017-11-22 11:56:33 CET
commit 150fde31d0be7f11beb627f883b8f747bfe179e6
    Bug #45572: advisory update
commit c0c00b8ced39ab34f755e7f2b315c98996112d1e
    Bug #45572: changelog and advisory

ucs-school-umc-distribution 15.0.1-5A~4.2.0.201711221151
Comment 14 Florian Best univentionstaff 2017-11-22 13:47:37 CET
OK: changes work nicely
REOPEN: the adjustment in the documentation:
It's a teacher manual, not an administrator manual: UCR variables shouldn't be mentioned here.
There is a typo "des Lehrkraft".
Comment 15 Daniel Tröder univentionstaff 2017-11-23 09:49:36 CET
[4.2 3838aeca] Bug #45572: don't mention UCR variable in teacher manual, fix typo

http://jenkins.knut.univention.de:8080/job/UCSschool%204.2/job/Manual/29/artifact/webroot/ucsschool-lehrer-handbuch-4.2.pdf
Comment 16 Florian Best univentionstaff 2017-11-23 11:36:43 CET
OK: docs
Comment 17 Sönke Schwardt-Krummrich univentionstaff 2017-12-11 13:59:13 CET
I found several problems:
1) the ucs-test script fails because it has not been adapted to the changes of this bug
→ will commit the necessary changes in a minute

2) manual and YAML now state that the results are collected to "Unterrichtsmaterial-Ergebnisse", but they are not! The results are collected to
"Unterrichtsmaterial/${USERNAME}-Ergebnisse/"

3) suffix versus prefix in YAML file:
"The prefix for the name of that directory can be modified with the UCR variable <envar>ucsschool/datadistribution/datadir/sender/project/suffix</envar>"

Please fix the manual and the YAML file. The target directory is correct (it matches our discussion between development and professional services).
Comment 18 Sönke Schwardt-Krummrich univentionstaff 2017-12-11 14:25:52 CET
ucs-test-ucsschool (4.0.4-46):
220796179424 | Bug #45572: fixed essential/distribution.py according to new directory for results
Comment 19 Daniel Tröder univentionstaff 2017-12-11 15:15:08 CET
[4.2 b4274dd0] Bug #45572: fix directory name

3.7.3. Einsammeln der verteilten Dokumente -> http://jenkins.knut.univention.de:8080/job/UCSschool%204.2/job/Manual/lastSuccessfulBuild/artifact/webroot/ucsschool-lehrer-handbuch-4.2.pdf
Comment 20 Sönke Schwardt-Krummrich univentionstaff 2017-12-11 21:31:22 CET
(In reply to Daniel Tröder from comment #19)
> [4.2 b4274dd0] Bug #45572: fix directory name

→ VERIFIED
Comment 21 Sönke Schwardt-Krummrich univentionstaff 2017-12-21 12:22:57 CET
UCS@school 4.2 v6 has been released.

http://docs.software-univention.de/changelog-ucsschool-4.2v6-de.html

If this error occurs again, please clone this bug.