Adaptive batch_size #369

Open
zchmielewska opened this issue Apr 22, 2024 · 0 comments

zchmielewska commented Apr 22, 2024

Currently, we calculate batch_size (to avoid a memory error) only once.

If the available memory varies during a run, maybe we should recalculate batch_size before each batch.

while batch_start < range_end:
    # Calculate the batch size based on available memory
    batch_size = self.calculate_batch_size(num_output_columns)

    # Process the batch
    batch_end = min(batch_start + batch_size, range_end)
    results = calculate_model_point_partial(batch_start, batch_end)

    # Move the start index to the end of the batch just processed
    batch_start = batch_end

    # ... process the results ...
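
The loop above can be sketched as a self-contained generator. This is only an illustration under assumptions: `estimate_batch_size`, `adaptive_batches`, `bytes_per_row`, and `safety_fraction` are hypothetical names, not part of the existing code, and the memory reading is passed in as a callable so the logic can be tried without any particular memory library.

```python
from typing import Callable, Iterator, Tuple


def estimate_batch_size(available_bytes: int, bytes_per_row: int,
                        safety_fraction: float = 0.5) -> int:
    """Return how many rows fit in a fraction of the available memory (at least 1).

    Hypothetical helper: bytes_per_row and safety_fraction are assumptions.
    """
    if bytes_per_row <= 0:
        raise ValueError("bytes_per_row must be positive")
    usable = int(available_bytes * safety_fraction)
    return max(1, usable // bytes_per_row)


def adaptive_batches(range_start: int, range_end: int, bytes_per_row: int,
                     get_available_bytes: Callable[[], int]) -> Iterator[Tuple[int, int]]:
    """Yield (batch_start, batch_end) pairs, re-reading memory before each batch."""
    batch_start = range_start
    while batch_start < range_end:
        # Recalculate the batch size from the current memory reading
        batch_size = estimate_batch_size(get_available_bytes(), bytes_per_row)
        batch_end = min(batch_start + batch_size, range_end)
        yield batch_start, batch_end
        # Move on to the next batch
        batch_start = batch_end
```

With psutil installed, the callable could be `lambda: psutil.virtual_memory().available`. Whether a fixed fraction of available memory is the right rule for this model is an open question.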

How to profile memory usage?

import psutil

# Get the current system-wide memory usage (all processes, not just this one)
mem_usage = psutil.virtual_memory().used

# Log the memory usage
print(f"Memory usage: {mem_usage / (1024 * 1024):.2f} MB")

Or, to measure the change for each batch:

import psutil

# ...

while batch_start < range_end:
    # Calculate the batch size based on available memory
    batch_size = self.calculate_batch_size(num_output_columns)

    # Record system-wide memory usage before processing the batch
    mem_usage_before = psutil.virtual_memory().used

    # Process the batch
    batch_end = min(batch_start + batch_size, range_end)
    results = calculate_model_point_partial(batch_start, batch_end)

    # Record system-wide memory usage after processing the batch
    mem_usage_after = psutil.virtual_memory().used

    # Log the difference in memory usage (negative if memory was freed)
    mem_usage_diff = mem_usage_after - mem_usage_before
    print(f"Memory usage changed by {mem_usage_diff} bytes")

    # ... process the results ...

We can use this code to better understand memory usage.
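
One caveat: `psutil.virtual_memory().used` is a system-wide figure, so other processes can affect the readings. As a possible complement, here is a minimal sketch using the standard-library `tracemalloc` module, which tracks allocations made through Python's memory allocator in the current process (native allocations made outside it may not be counted). `measure_peak_memory` is a hypothetical helper name, not something in the existing code.

```python
import tracemalloc
from typing import Any, Callable, Tuple


def measure_peak_memory(func: Callable[..., Any], *args: Any,
                        **kwargs: Any) -> Tuple[Any, int]:
    """Run func and return (result, peak traced bytes during the call).

    Hypothetical helper: assumes tracemalloc is not already running elsewhere,
    because it starts and stops tracing around the call.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```

For example, `results, peak = measure_peak_memory(calculate_model_point_partial, batch_start, batch_end)` would give the peak traced allocation for one batch, which could then feed back into the batch size estimate.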
