Issue with memory de-allocation #154

Open · schlegelp opened this issue Aug 15, 2024 · 1 comment

schlegelp (Collaborator) commented Aug 15, 2024

I've been having issues with my OS complaining that it was running out of memory while doing some seemingly straightforward processing of neuron meshes in Python.

Consider this example from a fresh Python session in which we have loaded 200 neuron meshes:

>>> nl
<class 'navis.core.neuronlist.NeuronList'> containing 200 neurons (1.3GiB)
                 type  name     id            units  n_vertices  n_faces
0    navis.MeshNeuron  None  90672  1 dimensionless       87307   180694
1    navis.MeshNeuron  None  73645  1 dimensionless       83005   170294
..                ...   ...    ...              ...         ...      ...
198  navis.MeshNeuron  None  47779  1 dimensionless      102872   208686
199  navis.MeshNeuron  None  80131  1 dimensionless       98341   202250

>>> import os, psutil
>>> mem_info = psutil.Process(os.getpid()).memory_full_info()
>>> print(f"Resident Set Size: {mem_info.rss / 1e9:.2f}Gb")
Resident Set Size: 1.92Gb
>>> print(f"Unique Set Size: {mem_info.uss / 1e9:.2f}Gb")
Unique Set Size: 0.30Gb

The size of the neuron list is around 1.3 GB, which accounts for most of the RSS of the process. The macOS Activity Monitor reports a "Real Memory Size" of around 1.8 GB.

Now watch what happens if we simply try to subset the neurons:

>>> nl_pr = navis.subset_neuron(nl, subset=lambda x: x.vertices[:, 2] > 224000)
>>> nl_pr
<class 'navis.core.neuronlist.NeuronList'> containing 200 neurons (1.3GiB)
                 type  name     id            units  n_vertices  n_faces
0    navis.MeshNeuron  None  90672  1 dimensionless       39859    82941
1    navis.MeshNeuron  None  73645  1 dimensionless       43707    88873
..                ...   ...    ...              ...         ...      ...
198  navis.MeshNeuron  None  47779  1 dimensionless       48063    97005
199  navis.MeshNeuron  None  80131  1 dimensionless       46913    95878

>>> # Force garbage collection before we measure the memory footprint again
>>> import gc
>>> gc.collect()

>>> mem_info = psutil.Process(os.getpid()).memory_full_info()
>>> print(f"Resident Set Size: {mem_info.rss / 1e9:.2f}Gb")
Resident Set Size: 9.02Gb
>>> print(f"Unique Set Size: {mem_info.uss / 1e9:.2f}Gb")
Unique Set Size: 6.31Gb

The size of the process has exploded to ~9 GB even though the new neuron list is considerably smaller (fewer faces/vertices after pruning). Naively, I would have expected at worst a doubling of the memory usage. So what's happening?

I did a bit of digging and found that not all operations cause this behavior. For example, a simple NeuronList.copy() only doubles the memory footprint, as expected. In this particular case, the issue seems to be with trimesh's submesh function, which we use under the hood. My best guess at the moment is that subset_neuron generates a bunch of temporary data that is correctly garbage-collected when the function finishes, but the memory is never released back to the operating system. The joys of automatic memory management...
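
One way to check that hypothesis from the same session is to compare what Python itself thinks is still allocated against the process RSS. This is just a sketch (it assumes the nl from above is still loaded, the same threshold as before, and a reasonably recent numpy that reports its array buffers to tracemalloc):

>>> import tracemalloc
>>> tracemalloc.start()
>>> # Subset a single neuron, then force garbage collection
>>> n_pr = navis.subset_neuron(nl[0], subset=nl[0].vertices[:, 2] > 224000)
>>> gc.collect()
>>> current, peak = tracemalloc.get_traced_memory()
>>> rss = psutil.Process(os.getpid()).memory_full_info().rss
>>> print(f"Live Python-level allocations: {current / 1e9:.2f} GB (peak {peak / 1e9:.2f} GB)")
>>> print(f"Process RSS:                   {rss / 1e9:.2f} GB")
>>> tracemalloc.stop()

If the live allocations come out small while the peak is large and the RSS stays high, that points at the allocator holding on to pages that were already freed at the Python level rather than at an actual reference leak.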

The above becomes an annoying problem when processing hundreds or even thousands of meshes. I've had the same subset_neuron procedure crash with around 2k meshes on a machine with 32 GB of memory. One workaround is to run the function in a child process, which ensures that the memory is released back to the system when that process terminates:

>>> from concurrent.futures import ProcessPoolExecutor
>>> with ProcessPoolExecutor(max_workers=1) as executor:
...   nl_pr = [executor.submit(navis.subset_neuron, n, subset=n.vertices[:, 2] > 224000).result() for n in nl]

>>> gc.collect()
>>> mem_info = psutil.Process(os.getpid()).memory_full_info()
>>> print(f"Resident Set Size: {mem_info.rss / 1e9:.2f}Gb")
Resident Set Size: 3.95Gb
>>> print(f"Unique Set Size: {mem_info.uss / 1e9:.2f}Gb")
Unique Set Size: 0.74Gb

This is obviously a pretty crude example, but you can already achieve the same result with subset_neuron(..., parallel=True, n_cores=1).
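
Spelled out, that call would look something like this (a sketch using the same data and threshold as above, with the parallel/n_cores parameters mentioned in the previous sentence):

>>> nl_pr = navis.subset_neuron(nl, subset=lambda x: x.vertices[:, 2] > 224000,
...                             parallel=True, n_cores=1)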

A few options to deal with this:

  1. Add something (short tutorial?) on this to the docs
  2. Issue a warning when running potentially expensive operations and suggest running them in a child process
  3. Run all or just certain functions by default in a child process

(3) is the nuclear option, but (1) and (2) would be pretty straightforward. A rough sketch of what (3) could look like is below.
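
For illustration only (this helper is hypothetical and not part of navis): functions with a large temporary-memory footprint could be routed through a throwaway child process, so that whatever the allocator holds on to is returned to the OS when the child exits:

>>> from concurrent.futures import ProcessPoolExecutor
>>> def run_in_child(func, *args, **kwargs):
...     # Run `func` in a fresh single-worker process; its memory is handed
...     # back to the OS as soon as that process terminates.
...     with ProcessPoolExecutor(max_workers=1) as executor:
...         return executor.submit(func, *args, **kwargs).result()
...
>>> nl_pr = [run_in_child(navis.subset_neuron, n, subset=n.vertices[:, 2] > 224000) for n in nl]

The trade-off is the overhead of spawning a process and pickling the neuron back and forth, which is exactly why (3) is the nuclear option.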

schlegelp (Collaborator, Author) commented

The immediate problem in subset_neuron has been addressed with #155. Leaving this issue open as a reminder though.
