IOS conversion changes #74

Open
JessyBarrette opened this issue Jan 31, 2024 · 7 comments · May be fixed by #84

@JessyBarrette
Collaborator

Some changes are needed to the IOS conversion.

@JessyBarrette
Collaborator Author

This is related to the conversion of the different IOS datasets here

@guanlu129
Collaborator

Two items to follow up from today's meeting:

  1. Output files should be placed in folders by year (e.g. 2023, 2022).
    Which way would be easier: 1) specify output directory in config file, or 2) make changes in the parser source code?

  2. Make changes so that "creator_email" and "creator_url" can be specified in the config file.

Thanks,
Lu

@JessyBarrette
Collaborator Author

Here are a few answers:

  1. Parametrized output path: you can define a parametrized path either in your config file, under output: with path: or file-name: (see example config here), or on the command line via the --output-path and --output-file-name options. You can then define a path like:

    • --output-path 'folder/{time_min.year}/'
    • or use any date format: --output-path 'folder/{time_min:%Y-%m-%d}/'
    • A few parameters are available within the path. I need to add them to the CLI documentation.
  2. We would need to modify the dictionary here
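
The path templates in point 1 look like Python `str.format` fields, so their expansion can be sketched as below. This is only an illustration under that assumption: `expand_output_path` and the sample `time_min` value are hypothetical and are not part of the ocean-data-parser API.

```python
from datetime import datetime


def expand_output_path(template: str, **fields) -> str:
    """Expand a path template using str.format-style replacement fields."""
    return template.format(**fields)


# Hypothetical value standing in for the earliest timestamp found in a file.
time_min = datetime(2023, 2, 14, 8, 30)

# Attribute access inside the field selects the year.
by_year = expand_output_path("folder/{time_min.year}/", time_min=time_min)
# A format spec after the colon is passed to datetime's own formatting.
by_day = expand_output_path("folder/{time_min:%Y-%m-%d}/", time_min=time_min)

print(by_year)  # folder/2023/
print(by_day)   # folder/2023-02-14/
```

With a template such as `'folder/{time_min.year}/'`, files would end up in one folder per year, which is what item 1 of the meeting notes asks for.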

@guanlu129
Collaborator

guanlu129 commented Apr 25, 2024

For IOS CTD data, I’ve compared the netCDF files generated by ODPY and by the old ios_data_transform, and would like to suggest a few modifications:

  • 1. Float32 should be enough for latitude and longitude. Should we change the variable type from double to float for latitude and longitude, or does it not matter if we leave them as double?

  • 2. Adding long name and standard name to time.
    Example:
    double time ;
    time:_FillValue = NaN ;
    time:timezone = "UTC" ;
    time:long_name = "time" ;
    time:standard_name = "time" ;
    time:units = "seconds since 1970-01-01T00:00:00+00:00" ;
    time:calendar = "proleptic_gregorian" ;

  • 3. Variable double depth(depth):
    E.g. "/home/guanl/Documents/data/ios_raw_files/cruise_data_odpy/2023-003/CTD/2023-003-0001.ctd" - Duplicated variable from sub variables: depth, renamed depth02

    • Should keep pressure in the netCDF file without any conversion to depth in meters.
    • Write depth from ASCII file to depth (not depth02) in netCDF file.
    • If there’s no depth in the original file, then convert pressure to depth.
  • 4. In the ODPY version, there are two temperature variables, TEMPPR01 and TEMPS901. Should we keep both or remove TEMPPR01?
    For the .ctd files, it would be better to keep only TEMPS901(depth).
    I would suggest using the file suffix (.ctd, .mctd) to determine the temperature variable name.

  • 5. Missing vocabularies:

    • "/home/guanl/Documents/data/ios_raw_files/cruise_data_odpy/2023-003/CTD/2023-003-0001.ctd"
      Missing vocabulary for file_type=ctd; variable name=number_of_bin_records, units=n/a
    • "/home/guanl/Documents/data/ios_raw_files/cruise_data_odpy/2023-025/CTD/2023-025-0046.ctd" Missing vocabulary for file_type=ctd; variable name=par,units=umol/m^2/sec
    • "/home/guanl/Documents/data/ios_raw_files/cruise_data_odpy/2023-025/CTD/2023-025-0046.ctd" Missing vocabulary for file_type=ctd; variable name=ph:sbe:nominal,units=n/a
  • 6. The old version includes the float variables sea_water_temperature, sea_water_practical_salinity and sea_water_pressure.
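
The depth handling proposed in item 3 could look roughly like the sketch below. It is a minimal illustration, not the ODPY implementation: `resolve_depth` and the dictionary layout are hypothetical, and the pressure-to-depth conversion uses the UNESCO 1983 (Fofonoff and Millard) formula as one common choice, which may differ from what the parser actually applies.

```python
import math


def pressure_to_depth(pressure_dbar: float, latitude_deg: float) -> float:
    """Depth in metres from pressure in decibars (UNESCO 1983 formula)."""
    x = math.sin(math.radians(latitude_deg)) ** 2
    # Gravity variation with latitude and pressure.
    gravity = 9.780318 * (1.0 + (5.2788e-3 + 2.36e-5 * x) * x) + 1.092e-6 * pressure_dbar
    p = pressure_dbar
    numerator = (((-1.82e-15 * p + 2.279e-10) * p - 2.2512e-5) * p + 9.72659) * p
    return numerator / gravity


def resolve_depth(variables: dict, latitude_deg: float) -> dict:
    """Keep pressure untouched; derive depth only when the file has none."""
    out = dict(variables)
    if "depth" not in out and "pressure" in out:
        out["depth"] = [pressure_to_depth(p, latitude_deg) for p in out["pressure"]]
    return out


# Hypothetical inputs: one file with its own depth column, one without.
with_depth = resolve_depth({"pressure": [10.0], "depth": [9.9]}, latitude_deg=49.0)
without_depth = resolve_depth({"pressure": [10.0]}, latitude_deg=49.0)

print(with_depth["depth"])     # depth from the file is kept as-is
print(without_depth["depth"])  # depth derived from pressure
```

In both cases the pressure values are passed through unchanged, and a depth column read from the ASCII file always takes precedence over a computed one.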

@JessyBarrette
Collaborator Author

Thanks @guanlu129 for taking the time to compare the data from the original version with the new version from the ocean-data-parser. Here are a few points:

  1. We can certainly reduce the lat/long variables to float. From a quick look, floats should still have sub-meter accuracy, so I think that's fair enough to do.

  2. Sounds good to me.

  3. I believed your suggested workflow was already implemented, but it looks like it was not. I will add the file 2023-003-0001.ctd as a test file to make sure the resulting variables correspond to what we're expecting.

  4. The main reason TEMPPR01 is generated within those files is to make it possible for ERDDAP to compile all the temperatures associated with the CTD datasets. As of now, ODPy will generate:

    • always a TEMPPR01 (this applies to temperature data with known or unknown scales)
    • if temperature is ITS-90 -> TEMPS901
    • if temperature is IPTS-68 -> TEMPS681
      Because of that, TEMPPR01 regroups all the temperatures even if we don't know their related temperature scale. Assuming that all the data is ITS-90, then yes, we can certainly drop TEMPPR01 and rely only on TEMPS901.
  5. I can add those to the vocabulary.

  6. Would you want those back in the new files too?
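
The sub-meter estimate for float32 coordinates can be checked with a quick round trip through single precision. This is a rough sketch: the sample position is made up, and 111.32 km per degree is only an approximate conversion.

```python
import math
import struct

METERS_PER_DEGREE = 111_320.0  # approximate length of one degree of latitude


def to_float32(value: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def float32_error_m(lat: float, lon: float) -> tuple:
    """Position error, in metres, introduced by storing lat/lon as float32."""
    lat_err = abs(to_float32(lat) - lat) * METERS_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    lon_err = abs(to_float32(lon) - lon) * METERS_PER_DEGREE * math.cos(math.radians(lat))
    return lat_err, lon_err


# Hypothetical station position on the British Columbia coast.
lat_err, lon_err = float32_error_m(48.651234, -123.498765)
print(f"latitude error:  {lat_err:.3f} m")
print(f"longitude error: {lon_err:.3f} m")
```

For magnitudes between 64 and 128 degrees, adjacent float32 values are 2^-17 degrees apart, so the rounding error stays below one metre at these longitudes.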

@guanlu129
Collaborator

> Thanks @guanlu129 for taking the time to compare the data from the original version with the new version from the ocean-data-parser. Here are a few points:
>
>   1. We can certainly reduce the lat/long variables to float. From a quick look, floats should still have sub-meter accuracy, so I think that's fair enough to do.

I'd like to confirm: let's keep double for the lat/long variables, so no changes are needed here. Thanks!

>   2. Sounds good to me.
>   3. I believed your suggested workflow was already implemented, but it looks like it was not. I will add the file 2023-003-0001.ctd as a test file to make sure the resulting variables correspond to what we're expecting.

Are the changes implemented in the develop branch?

>   4. The main reason TEMPPR01 is generated within those files is to make it possible for ERDDAP to compile all the temperatures associated with the CTD datasets. As of now, ODPy will generate:
>
>     • always a TEMPPR01 (this applies to temperature data with known or unknown scales)
>     • if temperature is ITS-90 -> TEMPS901
>     • if temperature is IPTS-68 -> TEMPS681
>       Because of that, TEMPPR01 regroups all the temperatures even if we don't know their related temperature scale. Assuming that all the data is ITS-90, then yes, we can certainly drop TEMPPR01 and rely only on TEMPS901.

I see! Yes, we would like to keep TEMPPR01. This is equivalent to sea_water_temperature in the old version, correct?

>   5. I can add those to the vocabulary.

Yes please.

>   6. Would you want those back in the new files too?

If TEMPPR01 in the ODPY version is the equivalent of sea_water_temperature in the old version, how about sea_water_practical_salinity and sea_water_pressure? Any suggestions on keeping these two? Thanks.
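
The temperature naming rule discussed above can be summarized in a few lines. This is a sketch of the rule as described in the thread, not the ODPy source: `temperature_variable_names` and the way the scale is passed in are hypothetical.

```python
def temperature_variable_names(scale=None):
    """Return the variable names written for one temperature channel.

    scale is "ITS-90", "IPTS-68", or None when the scale is unknown.
    """
    # TEMPPR01 is always written, whether or not the scale is known.
    names = ["TEMPPR01"]
    scale_specific = {"ITS-90": "TEMPS901", "IPTS-68": "TEMPS681"}
    if scale in scale_specific:
        names.append(scale_specific[scale])
    return names


print(temperature_variable_names("ITS-90"))   # ['TEMPPR01', 'TEMPS901']
print(temperature_variable_names("IPTS-68"))  # ['TEMPPR01', 'TEMPS681']
print(temperature_variable_names())           # ['TEMPPR01']
```

Keeping TEMPPR01 gives every file a temperature variable with the same name, which is what lets a single aggregated dataset cover files of mixed or unknown scales.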

@JessyBarrette
Collaborator Author

> 6. Float sea_water_temperature, sea_water_practical_salinity and sea_water_pressure in the old version.

For this one, we can certainly rename variables like TEMPPR01 to sea_water_temperature. But if the intention is only to match the old ERDDAP dataset, we can easily modify the ERDDAP configuration for those datasets so that, for example, TEMPPR01 is served as the sea_water_temperature variable.
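
As an illustration of the ERDDAP option, the sketch below builds a `<dataVariable>` entry of the kind used in ERDDAP's datasets.xml, mapping a source name to a destination name. The `data_variable` helper and the `float` data type are assumptions made for this example; the real dataset entries would carry more settings and attributes.

```python
import xml.etree.ElementTree as ET


def data_variable(source_name: str, destination_name: str, data_type: str = "float") -> ET.Element:
    """Build a <dataVariable> element that renames a source variable."""
    var = ET.Element("dataVariable")
    ET.SubElement(var, "sourceName").text = source_name
    ET.SubElement(var, "destinationName").text = destination_name
    ET.SubElement(var, "dataType").text = data_type  # assumed type
    return var


# Serve the netCDF variable TEMPPR01 under the old dataset's variable name.
element = data_variable("TEMPPR01", "sea_water_temperature")
ET.indent(element)  # pretty-print; requires Python 3.9+
snippet = ET.tostring(element, encoding="unicode")
print(snippet)
```

With this approach the netCDF files keep the TEMPPR01 name, and only the ERDDAP-facing variable name changes.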

@JessyBarrette JessyBarrette linked a pull request May 1, 2024 that will close this issue
5 tasks
@JessyBarrette JessyBarrette self-assigned this May 1, 2024

2 participants