Make the hdf5 videos store as int8 format #1559

Merged · 4 commits · Oct 20, 2023
Changes from 3 commits
2 changes: 2 additions & 0 deletions AUTHORS
@@ -11,3 +11,5 @@
 John Smith <john at example.com> Example Inc.

 Jeremy Delahanty <jdelahanty@salk.edu> The Salk Institute for Biological Studies
+
+Lili Karashchuk <lili.karashchuk@alleninstitute.org> Allen Institute of Neural Dynamics
19 changes: 14 additions & 5 deletions sleap/io/video.py
@@ -1443,19 +1443,28 @@ def to_hdf5(

 def encode(img):
     _, encoded = cv2.imencode("." + format, img)
-    return np.squeeze(encoded)
+    return np.squeeze(encoded).astype("int8")
+
+# pad with zeroes to guarantee int8 type in hdf5 file
+frames = []
+for i in range(len(frame_numbers)):
+    frames.append(encode(frame_data[i]))
+
+max_frame_size = max([len(x) for x in frames])
Comment on lines +1448 to +1453
coderabbitai bot · Oct 17, 2023

The new code creates a list of encoded frames and calculates the maximum frame size. This is a change from the previous implementation where each frame was encoded and directly stored in the HDF5 dataset. This change could potentially increase memory usage if the number of frames is large. Consider a more memory-efficient approach if this is a concern.

lambdaloop (Contributor, Author)

Yes, as @coderabbitai says, this will store the frames in RAM to find the maximum frame size. I think this is rarely a concern for the number of frames most people deal with, including all the example SLEAP datasets.

Note that the current SLEAP code already stores the full frames in memory anyway, through the get_frames() function.

In the future, if someone has many more frames than fit into memory, you could do a two-pass encoding, which would be about twice as slow but would not consume any more memory than the previous code.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lambdaloop, thank you for your response. You're correct that the current implementation already stores the full frames in memory, and for most use cases, this won't be a problem.

Your suggestion of a two-pass encoding process for larger datasets is a good one. This would involve a first pass to calculate the maximum frame size and a second pass to encode and store the frames. While this would indeed be slower, it would keep memory usage in check.

Here's a rough sketch of how that could be implemented:

# First pass to calculate max frame size
max_frame_size = 0
for i in range(len(frame_numbers)):
    encoded_frame = encode(frame_data[i])
    if len(encoded_frame) > max_frame_size:
        max_frame_size = len(encoded_frame)

# Create the dataset with the calculated max frame size
dset = f.create_dataset(
    dataset + "/video",
    (len(frame_numbers), max_frame_size),
    dtype="int8",
)

# Second pass to encode and store the frames
for i in range(len(frame_numbers)):
    encoded_frame = encode(frame_data[i])
    dset[i, 0:len(encoded_frame)] = encoded_frame

This way, we only keep one frame in memory at a time during the encoding process. However, as you mentioned, this would double the encoding time. It's a trade-off between memory usage and processing time, and the best approach would depend on the specific requirements of the use case.

lambdaloop (Contributor, Author)

Wow, I didn't expect this bot to write the full code for me, that's awesome!

Anyhow, I don't think it's worth doing the two-pass system at this time. The rest of the code should be refactored first anyway.

coderabbitai bot

@lambdaloop, I'm glad you found the code snippet helpful! You're right, it's important to prioritize refactoring and optimizing the existing code base before introducing new changes. The two-pass system can be considered in the future when handling larger datasets becomes a necessity. Feel free to reach out if you need further assistance or code reviews in the future. Happy coding!


-dtype = h5.special_dtype(vlen=np.dtype("int8"))
 dset = f.create_dataset(
-    dataset + "/video", (len(frame_numbers),), dtype=dtype
+    dataset + "/video",
+    (len(frame_numbers), max_frame_size),
+    dtype="int8",
 )
coderabbitai bot · Oct 18, 2023

The creation of the dataset now includes a second dimension for the maximum frame size, which is a change from the previous implementation. This is done to accommodate frames of varying sizes in the same dataset. However, this change doubles the file size as mentioned in the PR summary. Consider enabling gzip compression on the frames to mitigate the increase in file size.

lambdaloop (Contributor, Author)

It doesn't actually double the file size, I got confused. See my correction above.

coderabbitai bot

@lambdaloop: Thank you for the clarification. It's good to know that the file size doesn't double as initially thought. The use of padding to ensure a consistent data type and size across all frames is a clever approach. However, it's still worth considering the use of gzip compression as suggested in the PR summary, especially for larger videos, to further optimize the file size. This could be particularly beneficial when loading the SLP files into a web browser for the web annotation tool.

lambdaloop (Contributor, Author)

Okay, I mean you're only bringing up gzip compression because I brought it up, silly bot.

I did in fact include it in commit 95aa144, though. It reduces the file size by about 10%. Not dramatic, but you know, saving 10% of your time downloading or uploading a file is always nice.

coderabbitai bot

@lambdaloop: Indeed, every bit of optimization counts, especially when dealing with large files. The inclusion of gzip compression in commit 95aa144 is a great move towards this. A 10% reduction in file size can indeed translate to significant time savings in download/upload operations, particularly in environments with slower network connections. Thank you for your diligent work on this feature.

lambdaloop (Contributor, Author)

Sure thing, I'm happy to add compression="gzip" for you anytime, coderabbitai.

coderabbitai bot

@lambdaloop: Thank you for your willingness to add gzip compression. It's a valuable addition that can help optimize file size and improve the overall performance of the system. Your contributions are greatly appreciated!
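
For context, here is a hedged, self-contained sketch of what the compression="gzip" addition from commit 95aa144 likely amounts to; the file name, dataset name, and shape below are invented for illustration and are not the commit's actual code:

import h5py

# Hypothetical illustration; names and sizes are invented.
with h5py.File("example.h5", "w") as f:
    dset = f.create_dataset(
        "video0/video",
        (100, 50_000),       # (n_frames, max_frame_size)
        dtype="int8",
        compression="gzip",  # h5py's built-in gzip filter
    )
    # Rows are zero-padded past each frame's encoded length; gzip
    # compresses those trailing zero runs to almost nothing, which is
    # why the padding costs little extra on disk.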

dset.attrs["format"] = format
dset.attrs["channels"] = self.channels
dset.attrs["height"] = self.height
dset.attrs["width"] = self.width

for i in range(len(frame_numbers)):
dset[i] = encode(frame_data[i])
for i, frame in enumerate(frames):
dset[i, 0 : len(frame)] = frame

 else:
     f.create_dataset(
         dataset + "/video",
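
To close the loop, a minimal reading sketch, not part of this PR: the read_frame helper and the file and dataset names are hypothetical, and it assumes the padded int8 layout created above. The stored int8 bytes are viewed as uint8 again and passed to cv2.imdecode, which stops at the encoded image's end marker and should therefore ignore the trailing zero padding.

import cv2
import h5py
import numpy as np

def read_frame(path, dataset, i):
    # Decode frame i from a padded int8 video dataset written as above.
    with h5py.File(path, "r") as f:
        dset = f[dataset + "/video"]
        # Reinterpret the stored int8 bytes as uint8 before decoding.
        buf = np.asarray(dset[i]).view("uint8")
        return cv2.imdecode(buf, cv2.IMREAD_UNCHANGED)

# Hypothetical usage:
# img = read_frame("example.h5", "video0", 0)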