Univention Bugzilla – Bug 34984
image decompression with python-lzma destroys "sparse" feature in ucc images
Last modified: 2023-06-28 10:33:04 CEST
After downloading and decompressing (python-lzma), the ucc image is no longer a sparse file: 16G /var/lib/univention-client-boot/ucc-2.0-desktop-image.img When I decompress the xz file with unxz, the actual image size is much smaller: 3,8G /opt/ucc-2.0-desktop-image.img We are losing the "sparse" feature when decompressing with python-lzma. Not tragic at the moment, but maybe we can keep the image a sparse file for future optimizations (e.g. transferring the sparse file to the ucc client instead of the whole 16GB image).
Three possible options:
- Check whether Pylzma can be instructed to write the decompressed file in sparse mode
- Wrap the image file in a tar archive (which preserves the sparseness)
- Re-sparse the downloaded file locally using "cp --sparse=always"
(In reply to Moritz Muehlenhoff from comment #1)
> Three possible options:
> - Check whether Pylzma can be instructed to write the decompressed file in sparse mode
> - Wrap the image file in a tar archive (which preserves the sparseness)
> - Re-sparse the downloaded file locally using "cp --sparse=always"

AFAICS this is, in our usage, not a Pylzma issue, as we are writing the file data to the hard disk ourselves:

> [...]
> decompressor = lzma.LZMADecompressor()
> [...]
> with contextlib.nested(open(infile, 'rb'), open(outfile, 'wb')) as (fin, fout):
>     [...]
>     while True:
>         [...]
>         compressed_data = fin.read(DEFAULT_CHUNK_SIZE)
>         [...]
>         uncompressed_data = decompressor.decompress(compressed_data)
>         fout.write(uncompressed_data)
>         [...]

If I see it correctly, sparse files could be created by checking whether the current uncompressed_data chunk (8 KiB in size) contains only zero bytes. If so, that chunk can be skipped via fout.seek(). Some example code that I found: http://blogs.tulsalabs.com/?p=166
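A rough sketch of this idea (function name and chunk size are illustrative, not the actual ucc-image-toolkit code; it assumes Python 3's built-in lzma module rather than Pylzma):

```python
import lzma

DEFAULT_CHUNK_SIZE = 8192  # 8 KiB, matching the read size in the original loop

def decompress_sparse(infile, outfile, chunk_size=DEFAULT_CHUNK_SIZE):
    """Decompress an .xz file, seeking over all-zero chunks so the
    output stays sparse on filesystems that support holes."""
    decompressor = lzma.LZMADecompressor()
    with open(infile, 'rb') as fin, open(outfile, 'wb') as fout:
        while True:
            compressed_data = fin.read(chunk_size)
            if not compressed_data:
                break
            uncompressed_data = decompressor.decompress(compressed_data)
            if not uncompressed_data:
                continue
            if uncompressed_data.strip(b'\x00'):
                fout.write(uncompressed_data)
            else:
                # All zeros: leave a hole instead of writing the chunk.
                fout.seek(len(uncompressed_data), 1)
        # Make sure a trailing hole still counts toward the file size.
        fout.truncate()
```

The final truncate() is needed because a hole created by seek() alone does not extend the file size if the image ends in zeros.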
Drees, could you please provide a patch for this issue (as an attachment to this bug)?
Created attachment 5970 [details] Patch for ucc-image-toolkit so images will still be sparse files after decompressing
Created a patch according to Alex's proposed method: if a decompressed chunk consists only of 0-bytes, it will be skipped in the output file.
Was this already built in a scope targeted for release? If not, please don't mark this as RESOLVED yet.
Created attachment 5971 [details] New approach: split uncompressed data into 4 KiB blocks and check them separately, so we can keep the progress functionality of the original code. Data will be decompressed as in the original code, then split into 4 KiB blocks. Each block is checked whether it consists solely of 0-bytes (in which case it will not be written). This way the original progress function can be preserved.
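The approach described in this attachment can be sketched as follows; write_sparse is a hypothetical helper, not the attached patch itself. Because the full uncompressed chunk is still available to the caller, the original progress reporting can stay unchanged:

```python
BLOCK_SIZE = 4096  # 4 KiB blocks, as in the patch description

def write_sparse(fout, data, block_size=BLOCK_SIZE):
    """Write `data` to `fout` in 4 KiB blocks, replacing all-zero
    blocks with a seek so the output file stays sparse."""
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        if block.strip(b'\x00'):
            fout.write(block)
        else:
            # Zero block: skip forward, leaving a hole in the file.
            fout.seek(len(block), 1)
```

After the decompression loop finishes, the file should still be truncated to the expected size (fout.truncate()) so a trailing hole is not lost.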
Created attachment 5985 [details] Newer version of the patch, changed according to Alexander's proposals
Patch does not work (can't mount the image). So this is no option for now.
UCC is EoL