
Standalone auto-generated reconstruction GUI #487

Merged
38 commits · Jan 23, 2025

Conversation

amitabhverma
Collaborator

@amitabhverma commented Dec 9, 2024

Quick guide to install the test version:

Create a new conda environment and activate it

conda create -y -n recOrder-test python=3.10
conda activate recOrder-test

Install the current main branch with the [acq] optional flag

git clone https://github.com/mehta-lab/recOrder.git
cd recOrder
pip install -e .[acq]

run napari to test
napari

If you get an error you might need to install napari with its Qt bindings:
python -m pip install "napari[pyqt6]"

Make sure napari runs now. If you want to test the current main branch version, you can test that from the plugins menu. Close napari.

We will now install the version for testing from the branch using:
pip install git+https://github.com/mehta-lab/recOrder.git@pydantic-model-prototype

If you are on a Windows machine, you will need a patched submitit package (skip this step on other OSes):
pip uninstall submitit
pip install git+https://github.com/amitabhverma/submitit.git@windows_fix
Quick guide on Reconstruction:
  1. Select your zarr input storage location.
  2. Select which processing mode(s) you want (birefringence, phase, birefringence & phase, fluorescence) and use the Create Model button to add how many you need.
  3. For each added model set your parameters and your output storage location.
  4. Once ready to process, hit the Build & Run button which will start the Reconstruction.
  5. If any parameter fails validation, the model with the error will show an alert symbol and the offending field will be highlighted in the model GUI.
  6. If any model fails validation the Reconstruction will not proceed.
  7. Once all models are validated the Reconstruction will proceed and the processing updates will pop up in the bottom section.

Once a process completes successfully, its table entry is purged; only entries that errored remain.

Note

For the first test, I suggest running a single model and making sure no errors are encountered and the data processes as expected. If you are running on a local machine, I also suggest keeping the task manager open to make sure the OS is not running out of resources, which might lead to a crash when running multiple processes concurrently.
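For reference, here is what a minimal birefringence model .yml can look like (field values taken from the example settings that appear later in this thread; treat them as illustrative, not recommended defaults):

```yaml
input_channel_names: [State0, State1, State2, State3]
time_indices: all
reconstruction_dimension: 3
birefringence:
  transfer_function:
    swing: 0.1
  apply_inverse:
    wavelength_illumination: 0.532
    background_path: ""
    remove_estimated_background: false
    flip_orientation: false
    rotate_orientation: false
```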

- pydantic model UI widget creation based on CLI model
- does no processing, creates yaml files
@amitabhverma amitabhverma self-assigned this Dec 9, 2024
Collaborator

@talonchandler talonchandler left a comment


Looking good @amitabhverma. I think the overall GUI, the "multiple reconstructions" logic, and the input/output paths all make sense.

I have three comments at this stage which might be useful before running complete CLI calls:

  1. When you hit "Build & Run Models" you generate a json with an ID, I/O paths, and the complete .yaml. Do you intend to have a list of each reconstruction in this file? The current behavior seems to add only the first reconstruction to the json, and there is only a single output path.

  2. Is there anything preventing your json file from being a pydantic model? A major reason we use pydantic is that it is typed and validated early...are there equivalent features in your .json model? I like the design of saving everything to a file, but my concern is that when you save to a file, it will be hand edited then fail ungracefully, or worse...after a long reconstruction.

  3. It seems like you read most of the default parameters from the examples (which is great), but the regularization strengths seem to be set to zero (because of coarse resolution from the GUI elements?), which leads to non-validating defaults. I'm sure this is on your radar (and it's great that the validator flags this!), but it'll be great to have working defaults to iterate quickly.

@amitabhverma
Collaborator Author

Thanks for the feedback @talonchandler

  1. The .json is not intended to be retained and serves only a temporary purpose (for now) until those identifiers are passed onto a table. Eventually there will be no intermediary json at all - the required params will be displayed directly in the table that shows the status of running processes. Sorry if that was not made clear in the code comments. All the reconstructions should show up in the json, each with a time-ID and suffix-index. I'll double check.

  2. I guess this ties into the first part, but as I said, it was just a temporary placeholder for some of the params being gathered, so I just appended them as text. Everything needed to repeat an experiment is already stored in the yaml, so I really don't see any value in retaining all this information. Any additional info (yaml file ID) should be updated in the dataset metadata, in my opinion.

  3. I'll have a second look at this. I was under the impression the GUI derives the significant digits from the pydantic model's value properties, but maybe they need to be set explicitly. Yup, agreed - working defaults are required.

An additional comment I made in the code concerns root-level validation versus per-value validation. For example, if a user tries to add two modes that cannot be combined, I could check that using the same validation and restrict it. Currently you can add a Bire+FL mode as a model only to be rejected later down the process. Point 3 will fix this part, though, and can be implemented.
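The time-ID plus suffix-index scheme mentioned in point 1 could be sketched as follows (make_unique_id is a hypothetical helper; the actual format in this PR may differ):

```python
import time
import uuid

def make_unique_id(suffix_index: int) -> str:
    # Illustrative only: epoch-seconds time-ID, a short random component,
    # and the per-reconstruction suffix-index, joined with dashes.
    return f"{int(time.time())}-{uuid.uuid4().hex[:8]}-{suffix_index}"
```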

- refactored model to GUI implementation
- added table to hold processing info, entries purge on completion
- using click testing for now in prototyping
- creates yaml to GUI
- using napari notifications for msgs
- testing on Windows
- Multi processing using pydantic models
- Implemented a client/server approach for managing jobs queue
- Updates each job status based on logs
- Implemented Unique ID in CLI to associate Jobs with Process
- Tested on Windows with locally modified submitit
@talonchandler
Collaborator

Thanks for the demo earlier @amitabhverma!

On my first trial running on my macOS machine, I ran into issues whenever I created or added a model. For example, when I added the birefringence.yml example file, I received the following:

Traceback
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File src/psygnal/_signal.py:1196, in _run_emit_loop()

File src/psygnal/_signal.py:1225, in _run_emit_loop_immediate()

File src/psygnal/_weak_callback.py:354, in cb()

File ~/recOrder/recOrder/plugin/tab_recon.py:159, in Ui_Form.__init__.<locals>.<lambda>()
    158 # Passing model location label to model location selector
--> 159 _load_model_btn.clicked.connect(lambda: self.browse_dir_path_model(_load_model_loc))
        _load_model_loc = LineEdit(value='/Users/talon.chandler/recOrder/examples/birefringence.yml', annotation=None, name='')
        self = <recOrder.plugin.tab_recon.Ui_Form object at 0x158c4d2d0>
    161 # HBox for Loading Model

File ~/recOrder/recOrder/plugin/tab_recon.py:354, in Ui_Form.browse_dir_path_model(self=<recOrder.plugin.tab_recon.Ui_Form object>, elem=LineEdit(value='/Users/talon.chandler/recOrder/e...les/birefringence.yml', annotation=None, name=''))
    352     return
--> 354 pydantic_model = self._create_acq_contols2(selected_modes, exclude_modes, pydantic_model, json_dict)
        self = <recOrder.plugin.tab_recon.Ui_Form object at 0x158c4d2d0>
        json_dict = {'input_channel_names': ['State0', 'State1', 'State2', 'State3'], 'time_indices': 'all', 'reconstruction_dimension': 3, 'birefringence': {'transfer_function': {'swing': 0.1}, 'apply_inverse': {'wavelength_illumination': 0.532, 'background_path': '', 'remove_estimated_background': 'false', 'flip_orientation': 'false', 'rotate_orientation': 'false'}}}
        selected_modes = ['birefringence']
        exclude_modes = ['phase', 'fluorescence']
        pydantic_model = ReconstructionSettings(input_channel_names=['State0', 'State1', 'State2', 'State3'], time_indices='all', reconstruction_dimension=3, birefringence=BirefringenceSettings(transfer_function=BirefringenceTransferFunctionSettings(swing=0.1), apply_inverse=BirefringenceApplyInverseSettings(wavelength_illumination=0.532, background_path='', remove_estimated_background=False, flip_orientation=False, rotate_orientation=False)), phase=None, fluorescence=None)
    355 if pydantic_model is None:

File ~/recOrder/recOrder/plugin/tab_recon.py:585, in Ui_Form._create_acq_contols2(self=<recOrder.plugin.tab_recon.Ui_Form object>, selected_modes=['birefringence'], exclude_modes=['phase', 'fluorescence'], myLoadedModel=ReconstructionSettings(input_channel_names=['Sta...ientation=False)), phase=None, fluorescence=None), json_dict={'birefringence': {'apply_inverse': {'background_path': '', 'flip_orientation': 'false', 'remove_estimated_background': 'false', 'rotate_orientation': 'false', 'wavelength_illumination': 0.532}, 'transfer_function': {'swing': 0.1}}, 'input_channel_names': ['State0', 'State1', 'State2', 'State3'], 'reconstruction_dimension': 3, 'time_indices': 'all'})
    583 recon_pydantic_container = widgets.Container(name=_str, scrollable=False)     
--> 585 self.add_pydantic_to_container(pydantic_class, recon_pydantic_container, exclude_modes, json_dict)
        recon_pydantic_container = <Container ()>
        pydantic_class = ReconstructionSettings(input_channel_names=['State0', 'State1', 'State2', 'State3'], time_indices='all', reconstruction_dimension=3, birefringence=BirefringenceSettings(transfer_function=BirefringenceTransferFunctionSettings(swing=0.1), apply_inverse=BirefringenceApplyInverseSettings(wavelength_illumination=0.532, background_path='', remove_estimated_background=False, flip_orientation=False, rotate_orientation=False)), phase=None, fluorescence=None)
        self = <recOrder.plugin.tab_recon.Ui_Form object at 0x158c4d2d0>
        exclude_modes = ['phase', 'fluorescence']
        json_dict = {'input_channel_names': ['State0', 'State1', 'State2', 'State3'], 'time_indices': 'all', 'reconstruction_dimension': 3, 'birefringence': {'transfer_function': {'swing': 0.1}, 'apply_inverse': {'wavelength_illumination': 0.532, 'background_path': '', 'remove_estimated_background': 'false', 'flip_orientation': 'false', 'rotate_orientation': 'false'}}}
    587 # Run a validation check to see if the selected options are permitted
    588 # before we create the GUI
    589 # get the kwargs from the container/class

File ~/recOrder/recOrder/plugin/tab_recon.py:1001, in Ui_Form.add_pydantic_to_container(self=<recOrder.plugin.tab_recon.Ui_Form object>, py_model=ReconstructionSettings(input_channel_names=['Sta...ientation=False)), phase=None, fluorescence=None), container=<Container ()>, excludes=['phase', 'fluorescence'], json_dict={'birefringence': {'apply_inverse': {'background_path': '', 'flip_orientation': 'false', 'remove_estimated_background': 'false', 'rotate_orientation': 'false', 'wavelength_illumination': 0.532}, 'transfer_function': {'swing': 0.1}}, 'input_channel_names': ['State0', 'State1', 'State2', 'State3'], 'reconstruction_dimension': 3, 'time_indices': 'all'})
   1000 ftype = field_def.type_                
-> 1001 if isinstance(ftype, pydantic.BaseModel) or isinstance(ftype, pydantic.main.ModelMetaclass) or isinstance(ftype, pydantic.v1.main.ModelMetaclass):
        ftype = <class 'str'>
        pydantic.BaseModel = <class 'pydantic.main.BaseModel'>
        pydantic.main = <module 'pydantic.main' from '/opt/anaconda3/envs/recorder-dev-test2/lib/python3.10/site-packages/pydantic/main.cpython-310-darwin.so'>
        pydantic.v1.main = <module 'pydantic.v1.main' from '/opt/anaconda3/envs/recorder-dev-test2/lib/python3.10/site-packages/pydantic/v1/main.py'>
        pydantic.v1 = <module 'pydantic.v1' from '/opt/anaconda3/envs/recorder-dev-test2/lib/python3.10/site-packages/pydantic/v1/__init__.py'>
   1002     json_val = None

AttributeError: module 'pydantic.v1.main' has no attribute 'ModelMetaclass'

The above exception was the direct cause of the following exception:

EmitLoopError                             Traceback (most recent call last)
File /opt/anaconda3/envs/recorder-dev-test2/lib/python3.10/site-packages/magicgui/widgets/bases/_value_widget.py:71, in ValueWidget._on_value_change(self=PushButton(value=False, annotation=None, name='LoadModel'), value=False)
     69 if value is self.null_value and not self._nullable:
     70     return
---> 71 self.changed.emit(value)
        value = False
        self.changed = <SignalInstance 'changed' on PushButton(value=False, annotation=None, name='LoadModel')>
        self = PushButton(value=False, annotation=None, name='LoadModel')

File src/psygnal/_signal.py:1176, in emit()

File src/psygnal/_signal.py:1213, in _run_emit_loop()

File src/psygnal/_signal.py:1196, in _run_emit_loop()

File src/psygnal/_signal.py:1225, in _run_emit_loop_immediate()

File src/psygnal/_weak_callback.py:354, in cb()

File ~/recOrder/recOrder/plugin/tab_recon.py:159, in Ui_Form.__init__.<locals>.<lambda>()
    153 _load_model_btn = widgets.PushButton(
    154         name="LoadModel",
    155         label="Load Model"
    156 )
    158 # Passing model location label to model location selector
--> 159 _load_model_btn.clicked.connect(lambda: self.browse_dir_path_model(_load_model_loc))
        _load_model_loc = LineEdit(value='/Users/talon.chandler/recOrder/examples/birefringence.yml', annotation=None, name='')
        self = <recOrder.plugin.tab_recon.Ui_Form object at 0x158c4d2d0>
    161 # HBox for Loading Model
    162 _hBox_widget_model = QWidget()

File ~/recOrder/recOrder/plugin/tab_recon.py:354, in Ui_Form.browse_dir_path_model(self=<recOrder.plugin.tab_recon.Ui_Form object>, elem=LineEdit(value='/Users/talon.chandler/recOrder/e...les/birefringence.yml', annotation=None, name=''))
    351     self.messageBox(ret_msg)
    352     return
--> 354 pydantic_model = self._create_acq_contols2(selected_modes, exclude_modes, pydantic_model, json_dict)
        self = <recOrder.plugin.tab_recon.Ui_Form object at 0x158c4d2d0>
        json_dict = {'input_channel_names': ['State0', 'State1', 'State2', 'State3'], 'time_indices': 'all', 'reconstruction_dimension': 3, 'birefringence': {'transfer_function': {'swing': 0.1}, 'apply_inverse': {'wavelength_illumination': 0.532, 'background_path': '', 'remove_estimated_background': 'false', 'flip_orientation': 'false', 'rotate_orientation': 'false'}}}
        selected_modes = ['birefringence']
        exclude_modes = ['phase', 'fluorescence']
        pydantic_model = ReconstructionSettings(input_channel_names=['State0', 'State1', 'State2', 'State3'], time_indices='all', reconstruction_dimension=3, birefringence=BirefringenceSettings(transfer_function=BirefringenceTransferFunctionSettings(swing=0.1), apply_inverse=BirefringenceApplyInverseSettings(wavelength_illumination=0.532, background_path='', remove_estimated_background=False, flip_orientation=False, rotate_orientation=False)), phase=None, fluorescence=None)
    355 if pydantic_model is None:
    356     self.messageBox("Error - pydantic model returned None")

File ~/recOrder/recOrder/plugin/tab_recon.py:585, in Ui_Form._create_acq_contols2(self=<recOrder.plugin.tab_recon.Ui_Form object>, selected_modes=['birefringence'], exclude_modes=['phase', 'fluorescence'], myLoadedModel=ReconstructionSettings(input_channel_names=['Sta...ientation=False)), phase=None, fluorescence=None), json_dict={'birefringence': {'apply_inverse': {'background_path': '', 'flip_orientation': 'false', 'remove_estimated_background': 'false', 'rotate_orientation': 'false', 'wavelength_illumination': 0.532}, 'transfer_function': {'swing': 0.1}}, 'input_channel_names': ['State0', 'State1', 'State2', 'State3'], 'reconstruction_dimension': 3, 'time_indices': 'all'})
    581 # Container holding the pydantic UI components
    582 # Multiple instances/copies since more than 1 might be created
    583 recon_pydantic_container = widgets.Container(name=_str, scrollable=False)     
--> 585 self.add_pydantic_to_container(pydantic_class, recon_pydantic_container, exclude_modes, json_dict)
        recon_pydantic_container = <Container ()>
        pydantic_class = ReconstructionSettings(input_channel_names=['State0', 'State1', 'State2', 'State3'], time_indices='all', reconstruction_dimension=3, birefringence=BirefringenceSettings(transfer_function=BirefringenceTransferFunctionSettings(swing=0.1), apply_inverse=BirefringenceApplyInverseSettings(wavelength_illumination=0.532, background_path='', remove_estimated_background=False, flip_orientation=False, rotate_orientation=False)), phase=None, fluorescence=None)
        self = <recOrder.plugin.tab_recon.Ui_Form object at 0x158c4d2d0>
        exclude_modes = ['phase', 'fluorescence']
        json_dict = {'input_channel_names': ['State0', 'State1', 'State2', 'State3'], 'time_indices': 'all', 'reconstruction_dimension': 3, 'birefringence': {'transfer_function': {'swing': 0.1}, 'apply_inverse': {'wavelength_illumination': 0.532, 'background_path': '', 'remove_estimated_background': 'false', 'flip_orientation': 'false', 'rotate_orientation': 'false'}}}
    587 # Run a validation check to see if the selected options are permitted
    588 # before we create the GUI
    589 # get the kwargs from the container/class
    590 pydantic_kwargs = {}

File ~/recOrder/recOrder/plugin/tab_recon.py:1001, in Ui_Form.add_pydantic_to_container(self=<recOrder.plugin.tab_recon.Ui_Form object>, py_model=ReconstructionSettings(input_channel_names=['Sta...ientation=False)), phase=None, fluorescence=None), container=<Container ()>, excludes=['phase', 'fluorescence'], json_dict={'birefringence': {'apply_inverse': {'background_path': '', 'flip_orientation': 'false', 'remove_estimated_background': 'false', 'rotate_orientation': 'false', 'wavelength_illumination': 0.532}, 'transfer_function': {'swing': 0.1}}, 'input_channel_names': ['State0', 'State1', 'State2', 'State3'], 'reconstruction_dimension': 3, 'time_indices': 'all'})
    999 def_val = field_def.default
   1000 ftype = field_def.type_                
-> 1001 if isinstance(ftype, pydantic.BaseModel) or isinstance(ftype, pydantic.main.ModelMetaclass) or isinstance(ftype, pydantic.v1.main.ModelMetaclass):
        ftype = <class 'str'>
        pydantic.BaseModel = <class 'pydantic.main.BaseModel'>
        pydantic.main = <module 'pydantic.main' from '/opt/anaconda3/envs/recorder-dev-test2/lib/python3.10/site-packages/pydantic/main.cpython-310-darwin.so'>
        pydantic.v1.main = <module 'pydantic.v1.main' from '/opt/anaconda3/envs/recorder-dev-test2/lib/python3.10/site-packages/pydantic/v1/main.py'>
        pydantic.v1 = <module 'pydantic.v1' from '/opt/anaconda3/envs/recorder-dev-test2/lib/python3.10/site-packages/pydantic/v1/__init__.py'>
   1002     json_val = None
   1003     if json_dict is not None:

EmitLoopError: 

While emitting signal 'magicgui.widgets.PushButton.changed', a AttributeError occurred in a callback:

  Signal emitted at: /opt/anaconda3/envs/recorder-dev-test2/lib/python3.10/site-packages/magicgui/widgets/bases/_value_widget.py:71, in _on_value_change
    >  self.changed.emit(value)

  Callback error at: /Users/talon.chandler/recOrder/recOrder/plugin/tab_recon.py:1001, in add_pydantic_to_container
    >  if isinstance(ftype, pydantic.BaseModel) or isinstance(ftype, pydantic.main.ModelMetaclass) or isinstance(ftype, pydantic.v1.main.ModelMetaclass):

    Local variables:
       py_model = ReconstructionSettings(input_channel_names=['State0', 'State...
       container = <Container ()>
       excludes = ['phase', 'fluorescence']
       json_dict = {'input_channel_names': ['State0', 'State1', 'State2', 'Stat...
       field = 'input_channel_names'
       field_def = ModelField(name='input_channel_names', type=List[str], requi...
       def_val = ['State0', 'State1', 'State2', 'State3']
       ftype = <class 'str'>

See AttributeError above for original traceback.

Trying on linux now.

- import OS specific pydantic imports for ModelMetaclass
- pydantic possibly needs to be pinned at 1.10.19 @talonchandler
- implemented ver specific import for now
- check for update on the job file, if not - notify user in process table and exit after a 2 min timeout
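The version-specific import the commits above refer to could look roughly like this (a sketch, not the exact code in the PR): try the pydantic-1 location of ModelMetaclass first, then the v1 shim vendored in pydantic 2, and fall back to plain BaseModel checks if neither exists.

```python
import importlib

def import_model_metaclass():
    # Try each known location of ModelMetaclass; return None if the
    # installed pydantic exposes it nowhere (the caller can then fall
    # back to issubclass(..., BaseModel) checks instead).
    for mod_name in ("pydantic.main", "pydantic.v1.main"):
        try:
            mod = importlib.import_module(mod_name)
        except ImportError:
            continue
        cls = getattr(mod, "ModelMetaclass", None)
        if cls is not None:
            return cls
    return None
```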
@talonchandler
Collaborator

Working on macOS now. Thanks for the fix @amitabhverma.

My testing just now was very successful. I think we're moving into the refinement stage, and I'm noting several minor issues:

  • I think the modified .yaml files are currently saved next to the loaded .yaml. Can the .yaml files used for each reconstruction be saved beside the output zarrs instead?
  • "Build & Run N Models" does not decrement after deleting models. Steps: (1) add three models, (2) delete two models, (3) button still says "3 Models".
  • The "reconstruction algorithm" is a text box...can it be a dropdown menu to select between the two options "Tikhonov" and "TV"? The "reconstruction dimension" field looks like the correct behavior.
  • The "regularization strength" field does not accept smaller values, say down to 1e-5, which we sometimes need.
  • Can "background path" have a button for selecting a folder?
  • I like how each run result has a foldable GUI element. I think it might make sense to have similar folding on each model's settings, but this is not critical.

- fixes newly created yaml files to reside beside output zarrs
- fixes Model numbers not correctly representing on Build & Run button
- fixes "reconstruction algorithm" to show up as dropdown box
- fixes "regularization strength" spinbox to accept 5 significant decimal place values
- background path now accepts Path and str in model and shows up as directory selector button
- each model container setting is now a collapsible item
- fixes issues when widget is closed/hidden and then started again

- all model validation errors are now collected, presented in a notification, and highlighted on the GUI in one go
- each collapsible model entry also highlights if any validation errors are present
- added confirm dialog for certain actions
- added clear results button
- job update thread will exit after a timeout limit and no update detected in job output file
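The job-update watchdog described in the last bullet could be sketched like this (names and the completion callback are illustrative; the PR's thread logic may differ):

```python
import time
from pathlib import Path

def wait_for_job_updates(log_path, is_done, timeout_s=120.0, poll_s=2.0):
    # Poll the job's output file. Whenever its mtime advances, reset the
    # timeout deadline; if no update arrives within timeout_s seconds,
    # give up so the caller can notify the user and exit the thread.
    p = Path(log_path)
    last_mtime = p.stat().st_mtime if p.exists() else None
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_done():
            return True  # job finished normally
        if p.exists():
            mtime = p.stat().st_mtime
            if mtime != last_mtime:
                last_mtime = mtime
                deadline = time.monotonic() + timeout_s  # progress seen
        time.sleep(poll_s)
    return False  # timed out with no updates
```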
@amitabhverma
Collaborator Author

@talonchandler I've committed all the fixes and the other points you made. I see the test-and-deploy run uses pydantic==1.10.17. Should I go ahead and modify setup.cfg to use 1.10.19? I'll shortly be checking submitit compatibility.

@talonchandler
Collaborator

Should I go ahead and modify that in the setup.cfg file to use 1.10.19 ?

Yes please! We try to use the automated tests to flag installation issues, so this is very helpful.

- pinning pydantic lib ver to 1.10.19
- use PyQt6 for pyqtSignal
- fixes standalone widget
@amitabhverma amitabhverma changed the title pydantic-model initial commit Standalone auto-generated reconstruction GUI Dec 19, 2024
- fixes BG selection and resetting if Path is invalid
- fixes removal of row/stability - using Queued List to counter race conditions

- added .zarr data validation check for Input, also used for BG
- make uniqueID more unique
- break thread update operations when user has cleared the results table
- standalone GUI niceties
- another check if result row is deleted by user action while processing
@ieivanov
Contributor

I did some initial testing with phase reconstruction on Mac. I had no problems with the install following your instructions. The plugin also generally worked well - it ran the reconstruction, but for some reason I didn't see the reconstruction results pop up (more below). I have some suggestions on improving the user experience, feel free to implement whatever makes sense to you. Also, let me know if any of these notes are not clear and I can try to elaborate more.

  • The plugin is listed as Main Menu (recOrder-napari) - it would be nicer if we name it just recOrder-napari
  • What sets the width of the widget? The minimum width seems a bit wider than it needs to be. We should reduce it, and maybe set it dynamically, especially if we plan to remove one or more of the tabs, which seem to take most of the horizontal space.
  • Edge case: The plugin is built with iohub 0.2.0a1, which uses the read_micromanager reader. The plugin breaks if we upgrade to the latest iohub, which renamed the universal reader to read_images (or something like that) - I needed to install it in order to use iohub set-scale. read_micromanager may still be available as a deprecated reader. I also noticed you're using pydantic==1.10.19. It may be good to upgrade to pydantic 2 for compatibility with our other repos, but I understand that may be a bit of a heavy lift since you use extensive validation here; these should come in a separate PR.
  • At first I was a bit confused by the widget structure. Also by what exactly it means to "Create", "Build" and "Run" a model. Here are some suggestions for UI refactor:
    • First row: Input Data: (as text), to the left a field for entering a data path, and then a button that says Choose or Browse (Browse would match the style of the LC Calibration tab).
    • Second row: Models Path: then text field, a button that says Load Model would pop up a window for the user to select a *.yml file and would validate it.
    • Third row: Three checkboxes as you have them currently, then a button that says Build Model (rather than Create Model). This button would validate the model and save the .yml file in the path given in the line above.
    • Fourth row: Output Path:, then text field, then two buttons that say Run Models and Clear Results. Run Models would compute the transfer functions and run the reconstructions. In this refactor there is no "Create" model, instead only "Build" and "Run" model.
  • Currently each model specifies a full path to the output zarr store. I would suggest the following structure for the output data path: f'{OUTPUT_PATH}{INPUT_DATA}{MODEL_NAME}.zarr', for example /Users/ivan.ivanov/Documents/images_local/Kazansky_phase_01.zarr. This way there is a clearer mapping between .yml model files and reconstructed zarr stores (I know that the recon parameters are also saved as zarr metadata). The output zarr stores would also have unique names and would be harder to overwrite.
  • It seems currently the reconstruction would overwrite any existing zarr stores - that's not a great behavior, it should instead throw an error when validating the output path.
  • I ran one phase reconstruction (on Mac). The reconstruction finished - the job status changed to finished and I got a check mark next to the model, but in napari the message still says "Updating phase ..." and I didn't see the reconstructed images pop up. If I drag and drop the reconstructed zarr store in napari, then I can see the recon results. I was expecting the results of the reconstruction to pop up once it's finished.
  • I then went on to create a second phase model. It started off with the default settings. It would be nice if any new models are initialized with the settings from the previous models of that type such that only one value would need to be updated during parameter tuning, for example.
  • Output datasets were chunked in XY planes - it would be better to chunk in XYZ volumes
  • It would be nice if the Load Model button looks for *.yml or *.yaml files specifically, rather than any files
  • I'm not too sure what the Clear Results button does - maybe there was an issue with my reconstructions running to completion, so this button didn't have much of an effect.

Feel free to further play around with the UI structure I suggested - I've reworked it a couple of times while writing this message. You may be able to combine the Models Path and Output Path into one field.

@ieivanov
Contributor

P.S. the tests fail because pytest stalls on a call to lock.acquire() - not sure why

- timeout of 5 min per Job
- BF button error handling when not available
- GUI initiates clearing logs on first run to avoid file exist errors

- a single processing run now supports the CLI spawning multiple jobs
@amitabhverma
Collaborator Author

Thank you @ieivanov, that is some very constructive and detailed feedback and will be very helpful moving forward. Some of the points you raised are outside the scope of this PR but are nonetheless valid. I will try to answer some of them in a first pass and leave the rest for @talonchandler to fill in. This Reconstruction GUI is also intended to work stand-alone: it is agnostic of the acquisition, complements the CLI, and can come up without napari when run via the command line (recorder gui). Since the idea is to batch process, pushing the reconstructions to the viewer makes less sense (even when a viewer is available).


At first I was a bit confused by the widget structure. Also by what exactly it means to "Create", "Build" and "Run" a model. Here are some suggestions for UI refactor:

  • I agree: we don't end up creating a .yml file until we actually process the input data, at which point the distinction becomes moot. I'm also not a big fan of "Run", since it does not convey that we are processing (reconstructing) the dataset. We might also want to rethink the use of the word "model" in this context. To a user who does not understand the backend it conveys no meaning; an alternative like "Settings" or "Parameters" conveys more, in my opinion.

  • I also think it makes sense to drop the label and edit field for the model button. In most cases the .yml file will reside next to the input data, so providing a separate path for it has no value. It might make sense to have only a Load Model(s) button instead. With this new setup the GUI will fit in 2 rows and be less confusing.

  • The Clear Results button clears the processing table, where entries pop up and remain on errors. Moving it next to that table makes its purpose clearer.

Currently each model specifies a full path to the output zarr store. I would suggest the following structure of the output data path: f'{OUTPUT_PATH}{INPUT_DATA}{MODEL_NAME}.zarr', for example /Users/ivan.ivanov/Documents/images_local/Kazansky_phase_01.zarr. This way there can be a more clear mapping between .yml model files and reconstructed zarr stores (I know that the recon paramers are also saved as zarr metadata). The output zarr stores would also have unique names and would be harder to overwrite them.

  • The output data will get a new name similar to what you suggest, except using the first 3 letters of each mode in order to keep Birefringence+Phase short.
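That naming scheme could be sketched as below (build_output_path is a hypothetical helper; the abbreviation rule is the "first 3 letters" idea above, and the real code may differ):

```python
from pathlib import Path

def build_output_path(output_dir, input_data, model_modes):
    # Hypothetical: join the input store's stem with each mode's first
    # three letters, e.g. birefringence+phase -> "bir_pha".
    abbrev = "_".join(mode[:3].lower() for mode in model_modes)
    stem = Path(input_data).stem
    return Path(output_dir) / f"{stem}_{abbrev}.zarr"
```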

It seems currently the reconstruction would overwrite any existing zarr stores - that's not a great behavior, it should instead throw an error when validating the output path.

  • I believe there is an intended need to overwrite reconstructed data under certain circumstances; however, adding a clear indication that the data exists and will be overwritten makes sense.
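A minimal sketch of that output-path validation (check_output_path is a hypothetical helper; the real check would likely surface a warning through the GUI rather than raise):

```python
from pathlib import Path

def check_output_path(path, allow_overwrite=False):
    # Refuse to clobber an existing store unless the user opted in;
    # the GUI could present this as a confirm-overwrite dialog instead.
    p = Path(path)
    if p.exists() and not allow_overwrite:
        raise FileExistsError(
            f"{p} already exists; enable overwrite to replace it."
        )
    return p
```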

I ran one phase reconstruction (on Mac). The reconstruction finished - the job status changed to finished and I got a check mark next to the model, but in napari the message still says "Updating phase ..." and I didn't see the reconstructed images pop up. If I drag and drop the reconstructed zarr store in napari, then I can see the recon results. I was expecting the results of the reconstruction to pop up once it's finished.

  • The checkmark indicates that the parameters validated; the unmarked results table is where a processing entry should pop up when "Run" is clicked. Did you see this entry show up, or was it affected by the error you posted in your second post?

I then went on to create a second phase model. It started off with the default settings. It would be nice if any new models are initialized with the settings from the previous models of that type such that only one value would need to be updated during parameter tuning, for example.

  • Carrying over settings from a previous reconstruction makes sense in some cases. Re-thinking this, we should also clear any built models when a new dataset is selected, to make sure the user has validated the output paths.

It would be nice if the Load Model button looked for *.yml or *.yaml files specifically, rather than any files.

  • Agree
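A minimal sketch of such a filter in pure Python (hypothetical helper; in the actual Qt dialog the equivalent would be a name filter along the lines of "Model files (*.yml *.yaml)"):

```python
from pathlib import Path

def find_model_files(directory: str) -> list[Path]:
    """Return only .yml/.yaml files from a directory, the way the
    Load Model dialog should filter its candidates."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in {".yml", ".yaml"}
    )
```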

I'm not too sure what the Clear Results button does - maybe there was an issue with my reconstructions running to completion, so this button didn't have much of an effect.

  • Moved to the Results table section to make its purpose more clear.

Feel free to further play around with the UI structure I suggested - I've reworked it a couple of times while writing this message. You may be able to combine the Models Path and Output Path into one field.

Thanks again for your time and thoughts on this.

- stand-alone GUI cmd
- refactored UI based on suggestions
- fixes cyclic import for stand-alone GUI
- duplicates prev model settings when available
- clears Model when changing Input datasets
- load model looks for .yml files

codecov bot commented Jan 9, 2025

Codecov Report

Attention: Patch coverage is 13.65462% with 215 lines in your changes missing coverage. Please review.

Project coverage is 10.68%. Comparing base (7687259) to head (4ef11a1).
Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
recOrder/cli/jobs_mgmt.py 16.56% 136 Missing ⚠️
recOrder/scripts/simulate_zarr_acq.py 0.00% 55 Missing ⚠️
recOrder/cli/apply_inverse_transfer_function.py 18.75% 13 Missing ⚠️
recOrder/plugin/main_widget.py 0.00% 6 Missing ⚠️
recOrder/plugin/gui.py 0.00% 3 Missing ⚠️
recOrder/cli/parsing.py 50.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##            main     #487      +/-   ##
=========================================
+ Coverage   9.54%   10.68%   +1.14%     
=========================================
  Files         30       41      +11     
  Lines       4591     7275    +2684     
=========================================
+ Hits         438      777     +339     
- Misses      4153     6498    +2345     


@talonchandler
Collaborator

Looking very good @amitabhverma. I just completed a set of tests that I'll summarize here:

All tests on Biohub HPC (linux):

  • PASSED single time-point birefringence reconstruction CZYX = (5, 350, 1500, 2048)
  • NEAR-PASS multi time-point birefringence and phase reconstruction TCZYX = (30, 5, 81, 1496, 1496)
    • /hpc/projects/comp.micro/mantis/2024_05_24_A549_Fig1_redo/0-zarr/a549_tomm20_lysotracker_w1_1/a549_tomm20_lysotracker_w1_labelfree_1.zarr/
    • First 7 timepoints completed correctly, and the plugin caught a submitit error and printed most of it to the processing area. Unfortunately the printout was incomplete, so I couldn't see the cause of the error. I also could not find the submitit/SLURM logs near the output, so I could not dig any deeper. The plugin correctly showed the incomplete reconstruction and did not crash. I suspect I ran out of resources, but I'm not able to confirm without logs.
  • NEAR-PASS multi-position fluorescence reconstruction PZYX = (8, 107, 2048, 2048)
    • /hpc/projects/comp.micro/zebrafish/20240126_3dpf_she_h2b_cldnb_mcherry/0-zarr/20240126_3dpf_she_h2b_cldnb_mcherry/fish1_60x_1_multipos_1.zarr/
    • All reconstructions completed correctly, but the plugin did not update correctly ("Updating fluorescence...please wait") and the result did not appear in the napari window.
  • NEAR-PASS on-the-fly reconstruction using the small test dataset you shared PTCZYX = (5, 10, 7, 11, 256, 256)
    • Running the "fake-acquisition" copying script worked as expected.
    • Was able to connect the plugin to the in-progress zarr. Green lights in the plugin, and new reconstructions were started as time points arrived, including setting off submitit jobs, and filling the output zarr correctly.
    • From the plugin's perspective, the jobs did not seem to complete ("Updating fluorescence...please wait"), and the results did not appear in napari.

I spotted a few other issues:

  • the simulate_zarr_acq.py script is a critical part of the on-the-fly design. Can this script be simplified as much as possible (no need for conversion, focus on the minimum necessary communication) and added to the PR?
  • as the GUI gets narrower, the "Delete Model" button disappears (but the "Clear" button doesn't)
  • you mentioned that you encountered some race conditions in the simultaneous reading/writing of metadata. I did not observe these race conditions in my quick tests, but I'm wondering if this sets a meaningful limit for the on-the-fly reconstruction throughput? Is there a characterization test we could run to check for throughput?

@talonchandler talonchandler linked an issue Jan 10, 2025 that may be closed by this pull request
@amitabhverma
Collaborator Author

Thanks @talonchandler for your comprehensive testing and detailed summary.

  • NEAR-PASS multi time-point birefringence and phase reconstruction TCZYX = (30, 5, 81, 1496, 1496)

    • /hpc/projects/comp.micro/mantis/2024_05_24_A549_Fig1_redo/0-zarr/a549_tomm20_lysotracker_w1_1/a549_tomm20_lysotracker_w1_labelfree_1.zarr/
    • First 7 timepoints completed correctly, and the plugin caught a submitit error and printed most of it to the processing area. Unfortunately the printout was incomplete, so I couldn't see the cause of the error. I also could not find the submitit/SLURM logs near the output, so I could not dig any deeper. The plugin correctly showed the incomplete reconstruction and did not crash. I suspect I ran out of resources, but I'm not able to confirm without logs.

This should be a single submitit job, since there were no positions involved and it was not an on-the-fly reconstruction. I am assuming it went through the time-points and then choked at t=7 due to resources. I have set the -rx 20 flag as default, but we might need to add an option in the model GUI if the estimation is still off. I will test this on the HPC.

  • NEAR-PASS multi-position fluorescence reconstruction PZYX = (8, 107, 2048, 2048)

    • /hpc/projects/comp.micro/zebrafish/20240126_3dpf_she_h2b_cldnb_mcherry/0-zarr/20240126_3dpf_she_h2b_cldnb_mcherry/fish1_60x_1_multipos_1.zarr/
    • All reconstructions completed correctly, but the plugin did not update correctly ("Updating fluorescence...please wait") and the result did not appear napari window.

This seems to be an issue related to the HPC, since I was unable to replicate it on my local system. A possible reason could be access to the submitit log files, or communication over the port we have designated for use. I will check on the HPC and get back.

  • NEAR-PASS on-the-fly reconstruction using the small test dataset you shared PTCZYX = (5, 10, 7, 11, 256, 256)

    • Running the "fake-acquisition" copying script worked as expected.
    • Was able to connect the plugin to the in-progress zarr. Green lights in the plugin, and new reconstructions were started as time points arrived, including setting off submitit jobs, and filling the output zarr correctly.
    • From the plugin's perspective, the jobs did not seem to complete ("Updating fluorescence...please wait"), and the results did not appear in napari.

Seems like the same update issue with file/port access on the HPC.

I spotted a few other issues:

  • as the GUI gets narrower, the "Delete Model" button disappears (but the "Clear" button doesn't)

I will remove the line widget and that should make the scrollbar at the end of the container more apparent for each model.

  • the simulate_zarr_acq.py script is a critical part of the on-the-fly design. Can this script be simplified as much as possible (no need for conversion, focus on the minimum necessary communication) and added to the PR?
  • you mentioned that you encountered some race conditions in the simultaneous reading/writing of metadata. I did not observe these race conditions in my quick tests, but I'm wondering if this sets a meaningful limit for the on-the-fly reconstruction throughput? Is there a characterization test we could run to check for throughput?

As discussed, I don't see any bottlenecks occurring on the acquisition side or in the reconstruction. I will however change the script and reduce the update calls to write after every z dimension instead of every channel.

- added info icon next to Input Store label when a dataset path is set which displays channel names for convenience, more metadata could be displayed that might be relevant for reconstruction
- removed line widget at end of each model container to make scrollbar more apparent
- we are only reading the location here but it needs to exist, the CLI is creating the logs based on the path
- script to test on-the-fly reconstruction POC
@amitabhverma
Collaborator Author

@talonchandler following up on the pending items on your last post here.

All tests on Biohub HPC (linux):

  • /hpc/projects/comp.micro/mantis/2024_05_24_A549_Fig1_redo/0-zarr/a549_tomm20_lysotracker_w1_1/a549_tomm20_lysotracker_w1_labelfree_1.zarr/
  • /hpc/projects/comp.micro/zebrafish/20240126_3dpf_she_h2b_cldnb_mcherry/0-zarr/20240126_3dpf_she_h2b_cldnb_mcherry/fish1_60x_1_multipos_1.zarr/
  • All reconstructions completed correctly, but the plugin did not update correctly ("Updating fluorescence...please wait") and the result did not appear in the napari window.

Tested with both datasets; they ran and updated successfully. The issue was reading the logs folder, which did not exist. The logs folder is created in the cwd from which napari is started. I would suggest setting it based on the output directory we specify; the CLI could do something similar. It would need a little re-work on the jobs management side to make sure multiple jobs with different parent output directories are handled correctly when updating the results window, which reads in the logs.
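The suggestion above could look roughly like this (a hypothetical helper; the actual log layout is decided by the CLI):

```python
from pathlib import Path

def logs_dir_for(output_store: str) -> Path:
    """Place the submitit logs folder next to the output dataset rather
    than in the cwd napari was launched from, so each reconstruction's
    logs can be located via its own output path."""
    out = Path(output_store)
    return out.parent / f"{out.stem}_logs"
```

The caller would create the directory with `.mkdir(parents=True, exist_ok=True)` before submitting jobs.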

  • NEAR-PASS on-the-fly reconstruction using the small test dataset you shared PTCZYX = (5, 10, 7, 11, 256, 256)
    • From the plugin's perspective, the jobs did not seem to complete ("Updating fluorescence...please wait"), and the results did not appear in napari.

This got fixed with the logs issue, but I do see a recurring error (pasted below). The error interrupted the on-the-fly reconstruction once out of my 5 runs. We might need input from IT about this; I see it mentioned frequently on the nomachine forum too.
ERROR: ld.so: object '/usr/NX/lib/libnxegl.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

  • the simulate_zarr_acq.py script is a critical part of the on-the-fly design. Can this script be simplified as much as possible (no need for conversion, focus on the minimum necessary communication) and added to the PR?

A simplified version of that script has been added to the /scripts folder.

@amitabhverma
Collaborator Author

@talonchandler
Following up on the error when reconstructing a fake-acquisition
ERROR: ld.so: object '/usr/NX/lib/libnxegl.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

I tested the simulate_zarr_acq.py script using nomachine, but added an option to run the CLI cmd as we would via the GUI, and I still see the error, so I'm pretty sure it's not related to our code per se.

As a follow-up, I ran the same script via ssh directly (without nomachine or noVNC) on the same node and did not see the error.

- logs folder will reside next to the output dataset
- fixed an issue with reconstructed data showing irrespective of selection
@talonchandler
Collaborator

talonchandler commented Jan 16, 2025

Looking good @amitabhverma! I love the (i) info that shows the available metadata---extremely useful.

I went through the same tests as last time and here are my results:


  • NEAR-PASS multi time-point birefringence and phase reconstruction TCZYX = (30, 5, 81, 1496, 1496)
    /hpc/projects/comp.micro/mantis/2024_05_24_A549_Fig1_redo/0-zarr/a549_tomm20_lysotracker_w1_1/a549_tomm20_lysotracker_w1_labelfree_1.zarr/

I had an unknown failure after 11 timepoints...similar outward appearance to last time. I've attached the logs, but they didn't seem to suggest much...can you take a look? Are you able to reconstruct all 30 timepoints?

I also checked the job efficiency on the HPC and found no evidence of OOM.

a549_tomm20_lysotracker_w1_labelfree_1_Bir_01_logs.zip
Note: First job is from a single time point, which completed successfully. Second job failed.


  • NEAR-PASS multi-position fluorescence reconstruction PZYX = (8, 107, 2048, 2048)
    /hpc/projects/comp.micro/zebrafish/20240126_3dpf_she_h2b_cldnb_mcherry/0-zarr/20240126_3dpf_she_h2b_cldnb_mcherry/fish1_60x_1_multipos_1.zarr/

This time I errored out with the attached traceback, which I'm having difficulty understanding. I tried restarting napari, reconstructing a different channel, and reconstructing to a different output directory---all three resulted in the same output.

zebrafish-recon-traceback.txt

Edit: OSError: [Errno 122] Disk quota exceeded: my fault! Will retry.


  • PASS on-the-fly reconstruction using the small test dataset you shared PTCZYX = (5, 10, 7, 11, 256, 256)

Running and reconstructing the fake-acquisition copying script worked as expected.

fixes:
- multi-pos dataset would be displayed after single pos processing

enhancements:
- CLI will print Job status when used as cmd line, not for GUI
- use single socket connection when multi-pos is spawned by a request
- added "rx" field to model-container
- minor GUI tweaks
Collaborator

@talonchandler talonchandler left a comment


All of my manual tests are passing. My issues this week were related to the (new) restriction on the /mydata/ folders...thanks for your help and patience.

I gave all of the code a quick pass, and I left relatively minor comments. We use black for formatting (details here), and that should help automatically address some of the nits. We've been meaning to start automatically checking for formatting, but we haven't gotten to it.

After these are addressed, I will approve and merge.

recOrder/cli/jobs_mgmt.py (2 review comments, resolved)
recOrder/cli/monitor.py (1 review comment, resolved)
recOrder/plugin/tab_recon.py (4 review comments, resolved)
@talonchandler
Collaborator

Looks great @amitabhverma! Just ran another quick test, and everything is working well. Merging.

@talonchandler talonchandler added this pull request to the merge queue Jan 23, 2025
Merged via the queue into main with commit e2eb24a Jan 23, 2025
7 checks passed
Development

Successfully merging this pull request may close these issues.

Plugin GUI is quite wide