-
Hi, and thanks for moving this over. The rzip pass DOES remove redundancies; in fact, all compression programs try to do this. rzip is special because it works over much longer ranges. The rzip application uses two passes, and its first pass is identical to the one in lrzip-next. I assume you read the Wikipedia article on rzip and the rzip page. One suggestion, to help you appreciate what the rzip pass can do, is to use the -R (rzip compression level) option. Here's output from the enwik9 compression test file:

-R1: [output elided]
-R9: [output elided]
There's a lot to learn, but you can see the impact of the rzip pass above. The hash indexes are always stored in Stream 0; Stream 1 contains the actual data.
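To make the two-stream idea concrete, here is a tiny, hypothetical Python sketch of an rzip-style long-range pass. It is not lrzip-next's actual code (the real pass uses a rolling hash over a multi-gigabyte window); the function names and the 16-byte minimum match are made up for illustration. Repeats of earlier data become (offset, length) records in a "stream 0", and bytes with no earlier copy land in a "stream 1":

```python
MIN_MATCH = 16  # toy value; real rzip finds matches with a rolling hash

def rzip_like_pass(data: bytes):
    """Split data into match records (stream 0) and literals (stream 1)."""
    stream0 = []           # match records: (literal_run_len, offset, match_len)
    stream1 = bytearray()  # literal bytes that had no earlier copy
    seen = {}              # small block -> earliest position it occurred at
    i = lit_start = 0
    while i + MIN_MATCH <= len(data):
        key = data[i:i + MIN_MATCH]
        j = seen.setdefault(key, i)
        if j < i:                       # this block occurred before: a match
            k = MIN_MATCH               # greedily extend the match
            while i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            stream1 += data[lit_start:i]            # flush pending literals
            stream0.append((i - lit_start, j, k))   # record the long-range copy
            i += k
            lit_start = i
        else:
            i += 1
    stream1 += data[lit_start:]         # trailing literals
    return stream0, stream1

def rebuild(stream0, stream1):
    """Invert rzip_like_pass: replay literals and long-range copies."""
    out = bytearray()
    pos = 0
    for lit_len, offset, match_len in stream0:
        out += stream1[pos:pos + lit_len]
        pos += lit_len
        for k in range(match_len):      # byte-wise copy allows overlapping matches
            out.append(out[offset + k])
    out += stream1[pos:]
    return bytes(out)

data = (b"long-range redundancy: " * 4 + b"unique tail") * 2
s0, s1 = rzip_like_pass(data)
assert rebuild(s0, s1) == data
print(f"{len(data)} bytes -> {len(s1)} literal bytes + {len(s0)} match records")
```

The rebuild step shows why the match records alone are enough to reinflate the deduplicated regions: Stream 0 only has to say where the data already appeared.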
-
Not quite. And I hate big words. The first pass produces both streams, and the backend receives both the hashed data and the Stream 1 data. The backend will even compress the Stream 0 hashes.
rzip --> backend = lrzip-next file

But this is limited by total system RAM. So the hypothetical example of a 1 TB disk image will be processed in chunks, each limited in size to the total compression window. Each chunk has its own pass 1; chunks are not connected, and hashing starts over with each one. Consider this output from AOSP, which was 22 GB in size (each Stream 0 is processed separately):

[output elided]

The net result is that the original data was reduced by 34% prior to being sent to the backend. The backend reduced the already-deduped data by a further 59%. Overall, there was a 73% reduction in size. One other very significant enhancement of …
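As a sanity check on those percentages (illustrative arithmetic only, not a real measurement), note that the two stages compound multiplicatively on what remains, rather than adding:

```python
# Two-stage reduction quoted above: rzip pass first, backend second.
rzip_reduction    = 0.34   # rzip pass removed 34% of the original data
backend_reduction = 0.59   # backend removed 59% of what was left

remaining = (1 - rzip_reduction) * (1 - backend_reduction)   # 0.66 * 0.41
print(f"overall reduction: {1 - remaining:.0%}")             # -> 73%
```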
-
Following on from @pete4abw's last reply in ckolivas/lrzip#214:
OK, so there's no way any software could remove this kind of redundancy?
I thought the rzip compression pass meant only redundancy retrieval and removal, but I really don't know enough about compression at all.