Make the hdf5 videos store as int8 format #1559
Changes from 3 commits
@@ -1443,19 +1443,28 @@ def to_hdf5(

        def encode(img):
            _, encoded = cv2.imencode("." + format, img)
-           return np.squeeze(encoded)
+           return np.squeeze(encoded).astype("int8")

+       # pad with zeroes to guarantee int8 type in hdf5 file
+       frames = []
+       for i in range(len(frame_numbers)):
+           frames.append(encode(frame_data[i]))
+
+       max_frame_size = max([len(x) for x in frames])
+
        dtype = h5.special_dtype(vlen=np.dtype("int8"))
        dset = f.create_dataset(
-           dataset + "/video", (len(frame_numbers),), dtype=dtype
+           dataset + "/video",
+           (len(frame_numbers), max_frame_size),
+           dtype="int8",
        )
coderabbitai (bot): The creation of the dataset now includes a second dimension for the maximum frame size, which is a change from the previous implementation. This is done to accommodate frames of varying sizes in the same dataset. However, this change doubles the file size as mentioned in the PR summary. Consider enabling gzip compression on the frames to mitigate the increase in file size.

lambdaloop: It doesn't actually double the file size, I got confused. See my correction above.

coderabbitai (bot): @lambdaloop: Thank you for the clarification. It's good to know that the file size doesn't double as initially thought. The use of padding to ensure a consistent data type and size across all frames is a clever approach. However, it's still worth considering the use of gzip compression as suggested in the PR summary, especially for larger videos, to further optimize the file size. This could be particularly beneficial when loading the SLP files into a web browser for the web annotation tool.

lambdaloop: Okay, I mean you're only bringing up gzip compression because I brought it up, silly bot. I did in fact include it in commit 95aa144, though. It reduces the file size by about 10%. Not dramatic, but saving 10% of your time downloading/uploading a file is always nice.

coderabbitai (bot): @lambdaloop: Indeed, every bit of optimization counts, especially when dealing with large files. The inclusion of gzip compression in commit 95aa144 is a great move. A 10% reduction in file size can translate to significant time savings in download/upload operations, particularly in environments with slower network connections. Thank you for your diligent work on this feature.

lambdaloop: Sure thing, I'm happy to add it.

coderabbitai (bot): @lambdaloop: Thank you for your willingness to add gzip compression. It's a valuable addition that can help optimize file size and improve the overall performance of the system. Your contributions are greatly appreciated!
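For reference, a minimal sketch of what enabling gzip on this dataset could look like with h5py; the exact settings used in commit 95aa144 are not shown in this diff, so the compression level and chunk shape below are assumptions:

```python
dset = f.create_dataset(
    dataset + "/video",
    (len(frame_numbers), max_frame_size),
    dtype="int8",
    compression="gzip",          # transparent per-chunk gzip compression
    compression_opts=1,          # low level: modest size savings, fast writes (assumed setting)
    chunks=(1, max_frame_size),  # one chunk per frame so frames decompress independently (assumed)
)
```

Readers do not need to change, since HDF5 decompresses chunks transparently on access.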
        dset.attrs["format"] = format
        dset.attrs["channels"] = self.channels
        dset.attrs["height"] = self.height
        dset.attrs["width"] = self.width

-       for i in range(len(frame_numbers)):
-           dset[i] = encode(frame_data[i])
+       for i, frame in enumerate(frames):
+           dset[i, 0 : len(frame)] = frame

    else:
        f.create_dataset(
            dataset + "/video",
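Not part of this diff, but as an illustration of what the new layout means on the reading side: each row of the `video` dataset is one encoded image, zero-padded to `max_frame_size` and stored as signed bytes. A consumer could recover a frame roughly as sketched below; the file name and dataset path are hypothetical, and this is not SLEAP's actual reader:

```python
import cv2
import h5py
import numpy as np

# Hypothetical file and dataset names, for illustration only.
with h5py.File("example_with_images.slp", "r") as f:
    dset = f["video0/video"]
    fmt = dset.attrs["format"]          # e.g. "png" or "jpg", as written by to_hdf5

    row = dset[0]                       # one zero-padded int8 row
    buf = row.view(np.uint8)            # reinterpret the signed bytes as raw bytes
    buf = np.trim_zeros(buf, trim="b")  # drop trailing padding (PNG/JPEG streams don't end in 0x00)
    img = cv2.imdecode(buf, cv2.IMREAD_UNCHANGED)
```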
coderabbitai (bot): The new code creates a list of encoded frames and calculates the maximum frame size. This is a change from the previous implementation, where each frame was encoded and stored directly in the HDF5 dataset. This change could increase memory usage if the number of frames is large. Consider a more memory-efficient approach if this is a concern.
lambdaloop: Yes, as @coderabbitai says, this will store the encoded frames in RAM in order to find the maximum frame size. I think this is rarely a concern for the number of frames most people deal with, including all the example SLEAP datasets.
Note that the current SLEAP code already stores the full frames in memory anyway, through the get_frames() function.
In the future, if someone has far more frames than fit into memory, you could do a two-pass encoding, which would be roughly twice as slow but would not consume any more memory than the previous code.
coderabbitai (bot): @lambdaloop, thank you for your response. You're correct that the current implementation already stores the full frames in memory, and for most use cases this won't be a problem.
Your suggestion of a two-pass encoding process for larger datasets is a good one. This would involve a first pass to calculate the maximum frame size and a second pass to encode and store the frames. While this would indeed be slower, it would keep memory usage in check.
Here's a rough sketch of how that could be implemented:
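(The sketch below assumes the `encode` helper, `frame_data`, `frame_numbers`, `dataset`, and the open HDF5 handle `f` from the `to_hdf5` diff above; it illustrates the two-pass idea rather than being tested code.)

```python
# First pass: encode each frame only to measure its size,
# keeping a single encoded frame in memory at a time.
max_frame_size = 0
for i in range(len(frame_numbers)):
    max_frame_size = max(max_frame_size, len(encode(frame_data[i])))

# Allocate the padded int8 dataset once the maximum size is known.
dset = f.create_dataset(
    dataset + "/video",
    (len(frame_numbers), max_frame_size),
    dtype="int8",
)

# Second pass: re-encode each frame and write it into its zero-padded row.
for i in range(len(frame_numbers)):
    frame = encode(frame_data[i])
    dset[i, 0 : len(frame)] = frame
```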
This way, we only keep one frame in memory at a time during the encoding process. However, as you mentioned, this would double the encoding time. It's a trade-off between memory usage and processing time, and the best approach would depend on the specific requirements of the use case.
lambdaloop: Wow, I didn't expect this bot to write the full code for me, that's awesome!
Anyhow, I don't think it's worth doing the two-pass system at this time. The rest of the code should be refactored first anyway.
coderabbitai (bot): @lambdaloop, I'm glad you found the code snippet helpful! You're right, it's important to prioritize refactoring and optimizing the existing code base before introducing new changes. The two-pass system can be considered in the future when handling larger datasets becomes a necessity. Feel free to reach out if you need further assistance or code reviews in the future. Happy coding!