reduce disk usage during write #239

Merged
DanEngelbrecht merged 12 commits into main from de/reduce-disk-usage-during-write
Mar 3, 2024

DanEngelbrecht (Owner) commented Feb 8, 2024

  • CHANGED API Longtail_StorageAPI.OpenAppend added to Longtail_StorageAPI to open files without truncating existing data
  • CHANGED API Longtail_CreateConcurrentChunkWriteAPI changed to take source_version_index and version_diff
  • CHANGED API Longtail_ConcurrentChunkWriteAPI refactored to use asset index and to open/close files instead of keeping all files open during its entire lifetime
    • Longtail_ConcurrentChunkWriteAPI.CreateDir now takes asset index instead of version local path
    • Longtail_ConcurrentChunkWriteAPI.Open now takes asset index instead of version local path and drops the chunk_write_count parameter
    • Longtail_ConcurrentChunkWriteAPI.Write now takes asset index instead of version local path and drops the chunk_write_count parameter
  • CHANGED API Longtail_SetMonitor callback functions refactored to accommodate changes in Longtail_ConcurrentChunkWriteAPI
  • REMOVED API Removed platform API for Read/Write mutex
    • Longtail_GetRWLockSize
    • Longtail_CreateRWLock
    • Longtail_DeleteRWLock
    • Longtail_LockRWLockRead
    • Longtail_LockRWLockWrite
    • Longtail_UnlockRWLockRead
    • Longtail_UnlockRWLockWrite
  • ADDED memtracer now tracks allocations in stb_ds
  • ADDED memtracer now tracks allocations in zstd
  • FIXED Excessive "Disk Used" increase during Longtail_ChangeVersion2 execution, which caused out-of-disk-space errors.
    The changes also improve performance for the more common case of smaller archives (60 GB raw data, many files) but cause a small regression compared to 0.4.1 for archives with many very large files. It still performs much more reasonably than 0.4.0 in these cases.
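    | Version | Files   | Raw Size | Compressed Size | Unpack Time | Peak Memory |
    |---------|---------|----------|-----------------|-------------|-------------|
    | 0.4.0   | 1019    | 735 GB   | 214 GB          | 2h44m26s    | 7.9 GB      |
    | 0.4.1   | 1019    | 735 GB   | 214 GB          | 0h12m14s    | 1.9 GB      |
    | 0.4.2   | 1019    | 735 GB   | 214 GB          | 0h13m25s    | 2.2 GB      |
    | 0.4.0   | 239 340 | 60 GB    | 17 GB           | 0h01m24s    | 4.2 GB      |
    | 0.4.1   | 239 340 | 60 GB    | 17 GB           | 0h02m48s    | 0.9 GB      |
    | 0.4.2   | 239 340 | 60 GB    | 17 GB           | 0h01m12s    | 0.9 GB      |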
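
To illustrate why opening without truncation matters once files are opened and closed per write rather than kept open, here is a minimal sketch in plain C stdio. It is not Longtail code and does not use Longtail_StorageAPI: `write_block_at` and `demo_reopen_writes` are hypothetical names, and using `"r+b"` as the non-truncating open mode is an assumption made for the example only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Longtail): write `size` bytes at `offset`
   in `path`, keeping the file open only for the duration of this one write. */
int write_block_at(const char* path, long offset, const void* data, size_t size)
{
    assert(path != NULL && data != NULL);

    /* "r+b" opens an existing file for update WITHOUT truncating it,
       so blocks written by earlier open/close cycles are preserved. */
    FILE* f = fopen(path, "r+b");
    if (f == NULL) {
        /* Assume the file does not exist yet: create it.
           "w+b" truncates, but there is no earlier data to lose here. */
        f = fopen(path, "w+b");
        if (f == NULL) return -1;
    }

    int ok = fseek(f, offset, SEEK_SET) == 0 &&
             fwrite(data, 1, size, f) == size;

    /* fclose flushes buffered data; treat a failed close as a failed write. */
    if (fclose(f) != 0) ok = 0;
    return ok ? 0 : -1;
}

/* Three separate open/write/close cycles against the same file, then a
   read-back to check that no cycle discarded what an earlier one wrote. */
int demo_reopen_writes(const char* path)
{
    char buf[8];
    size_t n;
    FILE* f;

    remove(path); /* start clean; failure is fine if the file is absent */

    if (write_block_at(path, 0, "AAAA", 4) != 0) return -1;
    if (write_block_at(path, 4, "BBBB", 4) != 0) return -1;
    if (write_block_at(path, 1, "CC", 2) != 0) return -1;

    f = fopen(path, "rb");
    if (f == NULL) return -1;
    n = fread(buf, 1, sizeof(buf), f);
    fclose(f);
    remove(path);

    /* Expected content: "ACCABBBB" (the third write overlays bytes 1..2). */
    return (n == 8 && memcmp(buf, "ACCABBBB", 8) == 0) ? 0 : -1;
}
```

Calling `demo_reopen_writes("some_scratch_file.bin")` returns 0 when all three writes survive. Had each cycle opened the file with a truncating mode such as `"wb"`, only the last write would remain, which is the problem a non-truncating open avoids when file handles are no longer held for the whole operation.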

DanEngelbrecht force-pushed the de/reduce-disk-usage-during-write branch from 15e9e25 to 9398684 February 8, 2024 23:24
DanEngelbrecht marked this pull request as ready for review March 3, 2024 10:51
DanEngelbrecht merged commit 37cca28 into main Mar 3, 2024
37 checks passed
DanEngelbrecht deleted the de/reduce-disk-usage-during-write branch March 3, 2024 10:54