This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

100mb size decision is optimal? #2

Open
dim-geo opened this issue Nov 8, 2019 · 2 comments

Comments

@dim-geo

dim-geo commented Nov 8, 2019

Hi,

Can you please elaborate on the 100MB block size decision?
Sia is using 40MB as a block.

Does LZ always compress chunks of data to sizes near but not higher than 40 MB?
From your screenshot it seems that chunks can reach 50MB, which is bad because each one would then consume 80MB of Sia storage and upload(?) in total.

Would it make sense to use a block size near 40MB?

In that case the blocks would always be close to the 40MB limit without crossing it, and you could estimate more accurately the space that sia-slice will consume (at the risk of underusing that space).
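To make the arithmetic in this question concrete, here is a minimal sketch. It assumes that Sia pads every file up to a whole number of 40MB chunks, as described in this thread; the function name `sia_footprint_mb` is hypothetical and is not part of Sia or Sia Slice.

```python
import math

# Chunk size as quoted in this thread; treated as an assumption here.
SIA_CHUNK_MB = 40


def sia_footprint_mb(file_mb: float, chunk_mb: int = SIA_CHUNK_MB) -> int:
    """Return the space a file occupies if it is padded up to whole chunks."""
    if file_mb <= 0:
        return 0
    return math.ceil(file_mb / chunk_mb) * chunk_mb


if __name__ == "__main__":
    # Compare a few compressed block sizes against the space they would consume.
    for size in (39, 41, 50, 80):
        used = sia_footprint_mb(size)
        print(f"{size}MB file -> {used}MB on Sia ({used - size}MB padding)")
```

Under that assumption, a block that compresses to 50MB occupies two chunks, which is where the 80MB figure comes from.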

@YoRyan
Owner

YoRyan commented Nov 10, 2019

The 100MB block size is by no means optimal. It's a balance between minimizing the number of files that need to be periodically reuploaded and maximizing the potential LZ compression. I chose a value that I thought was reasonable for multi-terabyte targets: 10,000 files per 1TB. One must also, of course, mind the atomic Sia chunk size...

...which I had misinterpreted. The Sia docs imply the 40MB limit is a minimum file size and that as long as your files are larger than that, you'll be okay. But of course, you'll waste lots of space if your files are all 41MB, too! So for Sia Slice, I feel like an 80MB block size would be a sane default; just write off any losses to LZ compression.
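The file-count side of this trade-off can be sketched as follows. This is only an illustration using decimal units (1TB = 10^12 bytes, 1MB = 10^6 bytes); the helper name `blocks_for_target` is hypothetical and does not come from siaslice.py.

```python
TB = 10**12  # decimal terabyte, an assumption for this illustration
MB = 10**6   # decimal megabyte


def blocks_for_target(target_bytes: int, block_bytes: int) -> int:
    """Number of fixed-size blocks needed to cover a target (ceiling division)."""
    if block_bytes <= 0:
        raise ValueError("block size must be positive")
    return -(-target_bytes // block_bytes)


if __name__ == "__main__":
    # Smaller blocks mean more files to track and periodically reupload.
    for block_mb in (40, 80, 100):
        count = blocks_for_target(1 * TB, block_mb * MB)
        print(f"{block_mb}MB blocks -> {count:,} files per 1TB")
```

With these units, 100MB blocks give the 10,000 files per 1TB mentioned above, and 80MB blocks give 12,500.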

If you want to tweak this yourself, there is currently no CLI switch, but the block size is easily accessible as a constant at the top of siaslice.py. On subsequent syncs, Sia Slice will pick up the last block size used, even if it differs from this constant. But choose wisely, because there's no way to change the block size after the initial sync.

@YoRyan
Owner

YoRyan commented Nov 13, 2019

CLI switch is now implemented with 018a2f3.
