
Added NDBC as a data source #146

Merged
41 commits merged on Sep 16, 2024

Conversation

abdu558
Contributor

@abdu558 abdu558 commented Jun 17, 2024

Added NDBC

@SorooshMani-NOAA
Contributor

SorooshMani-NOAA commented Jun 17, 2024

@abdu558 please go over failed tests and make sure we pass the basic tests. Make sure the code works for the oldest supported Python (in this case 3.9)

@pmav99
Member

pmav99 commented Jun 17, 2024

hi Abdu and welcome :)

The CI is failing due to mypy. Chances are that you haven't set up your dev environment properly. Check here: https://github.com/oceanmodeling/searvey?tab=readme-ov-file#development (i.e. run make init which will ensure that the pre-commit hooks are installed). After that you should fix the mypy errors. To make a long story short, try to avoid Optional. We use | None instead. Check here for more context: https://stackoverflow.com/questions/69440494/python-3-10-optionaltype-or-type-none
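As a minimal illustration of the convention (the function and parameter names here are hypothetical, not searvey's actual API): with the postponed-evaluation import, the `X | None` spelling also works on Python 3.9.

```python
from __future__ import annotations  # makes `float | None` legal on Python 3.9

# Preferred in this codebase:  timeout: float | None = None
# Discouraged:                 timeout: Optional[float] = None
def fetch(station_id: str, timeout: float | None = None) -> str:
    if timeout is None:
        timeout = 30.0  # fall back to a default when the caller passes nothing
    return f"{station_id}:{timeout}"

print(fetch("41001"))  # annotations are purely static; runtime behavior is unchanged
```

Both spellings mean the same thing to mypy; `| None` is simply the house style here.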

@pmav99
Member

pmav99 commented Jun 17, 2024

Please convert to a draft PR

@abdu558 abdu558 marked this pull request as draft June 17, 2024 20:31
@abdu558
Contributor Author

abdu558 commented Jun 17, 2024

hi Abdu and welcome :)

The CI is failing due to mypy. Chances are that you haven't set up your dev environment properly. Check here: https://github.com/oceanmodeling/searvey?tab=readme-ov-file#development (i.e. run make init which will ensure that the pre-commit hooks are installed). After that you should fix the mypy errors. To make a long story short, try to avoid Optional. We use | None instead. Check here for more context: https://stackoverflow.com/questions/69440494/python-3-10-optionaltype-or-type-none

Okay, thank you. Do you know how to fix these two errors below?
poetry export main.......................................................Failed

  • hook id: poetry-export
  • files were modified by this hook

poetry export dev........................................................Failed

  • hook id: poetry-export
  • files were modified by this hook

@tomsail
Contributor

tomsail commented Jun 18, 2024

Hi @abdu558
I think this happens because your requirements files are in conflict with the poetry files. i.e.

  • requirements/requirements-dev.txt and
  • requirements/requirements.txt

Since you haven't committed changes to pyproject.toml or poetry.lock, there is no reason these two files should have changes.

So you need to either drop the changes in the requirements files or commit the corresponding changes to poetry.lock / pyproject.toml.
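For context, the poetry-export failures come from poetry's own pre-commit hooks, which regenerate the requirements files from poetry.lock on every commit. The repo's actual configuration may differ, but the relevant entries in .pre-commit-config.yaml look roughly like this (the rev pin and args below are illustrative, not copied from the repo):

```yaml
repos:
  - repo: https://github.com/python-poetry/poetry
    rev: "1.8.3"  # illustrative pin; check the repo's actual config
    hooks:
      - id: poetry-export
        name: poetry export main
        args: ["-f", "requirements.txt", "-o", "requirements/requirements.txt"]
      - id: poetry-export
        name: poetry export dev
        args: ["--with", "dev", "-f", "requirements.txt", "-o", "requirements/requirements-dev.txt"]
```

When the hook reports "files were modified by this hook", the exported requirements no longer matched the lock file; staging the regenerated requirements files (or the updated poetry.lock) resolves the conflict.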

@tomsail
Contributor

tomsail commented Jun 18, 2024

@abdu558 to avoid having to do this every time on the GitHub CI,
I recommend setting up the pre-commit hooks locally on your machine.
General instructions for pre-commit are here: https://pre-commit.com/

But we've automated things in the Makefile:
run the make init step, then have a look at and test the different targets (style, lint, mypy).

Once pre-commit is installed, when you commit your changes it will run all the steps in .pre-commit-config.yaml.

Some hooks automatically modify files (like ruff and black), so you need to stage the modifications and try to commit again, which runs .pre-commit-config.yaml again... and so on until you don't have any error tracebacks.

@abdu558
Contributor Author

abdu558 commented Jun 18, 2024

Hi @abdu558 I think this happens because your requirements files are in conflict with the poetry files. i.e.

* `requirements/requirements-dev.txt` and

* `requirements/requirements.txt`

Since you haven't committed changes to pyproject.toml or poetry.lock, there is no reason these two files should have changes.

So you need to either drop the changes in the requirements files or commit the corresponding changes to poetry.lock / pyproject.toml.

I was trying to add the new pyproject/poetry files, but the commits kept getting rejected by pre-commit.

@abdu558
Contributor Author

abdu558 commented Jun 18, 2024

@abdu558 to avoid having to do this every time on the GitHub CI, I recommend setting up the pre-commit hooks locally on your machine. General instructions for pre-commit are here: https://pre-commit.com/

But we've automated things in the Makefile: run the make init step, then have a look at and test the different targets (style, lint, mypy).

Once pre-commit is installed, when you commit your changes it will run all the steps in .pre-commit-config.yaml.

Some hooks automatically modify files (like ruff and black), so you need to stage the modifications and try to commit again, which runs .pre-commit-config.yaml again... and so on until you don't have any error tracebacks.

Thank you, I did set it up, but it would reject my commits every time with:

poetry export main.......................................................Failed

  • hook id: poetry-export
  • files were modified by this hook

I'm not really sure how to proceed. The "poetry export main" failure is still happening after I added the dependency to poetry. I considered updating the lock file completely with poetry update, but that results in IOC errors/incompatibilities, so I'm not really sure how to proceed.

@SorooshMani-NOAA SorooshMani-NOAA changed the title Added NDBC as a data source GSoC-2024: Added NDBC as a data source Jun 18, 2024
@SorooshMani-NOAA SorooshMani-NOAA changed the title GSoC-2024: Added NDBC as a data source Added NDBC as a data source Jun 18, 2024
@SorooshMani-NOAA
Contributor

@abdu558 I was able to update the requirements files and commit on my local machine after pulling your PR commits. We can resolve this in a quick meeting, or tomorrow during the group meeting.

@abdu558
Contributor Author

abdu558 commented Jun 18, 2024

Yeah, tomorrow would be good.

@pmav99 pmav99 force-pushed the dev branch 2 times, most recently from 9357962 to 18b6a92 Compare June 19, 2024 10:17
@SorooshMani-NOAA
Contributor

@abdu558 the resolved start and end times are causing problems. It seems that mypy blindly looks at the original definition of the input argument types, so it says they cannot be indexed. Either use a different name for the resolved times or just resolve them where you're indexing:

-    start_date = _resolve_start_date(now, start_date)
-    end_date = _resolve_end_date(now, end_date)

     # Prepare arguments for each function call
     func_kwargs = [
         {
             "station_id": station_id,
             "mode": mode,
-            "start_time": start_date[0],
-            "end_time": end_date[0],
+            "start_time": _resolve_start_date(now, start_date)[0],
+            "end_time": _resolve_end_date(now, end_date)[0],
             "columns": columns,
         }

For the result kwarg dictionary I'm not sure what to do ... we seem to be doing similar in case of IOC:

    for result in ioc_responses:
        station_id = result.kwargs["station_id"]  # type: ignore[index]

@pmav99 what do you think?
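The mypy complaint can be reproduced in miniature. In the sketch below the helper name mimics the PR's `_resolve_start_date`, but its body and the surrounding function are hypothetical; the point is that rebinding the parameter would clash with its declared type, so the resolved value gets a fresh name instead:

```python
from __future__ import annotations

from datetime import datetime, timedelta

def _resolve_start_date(now: datetime, start_date: datetime | None) -> list[datetime]:
    # Hypothetical stand-in for the real helper: default to one week back.
    return [start_date if start_date is not None else now - timedelta(days=7)]

def build_kwargs(station_id: str, start_date: datetime | None = None) -> dict:
    now = datetime(2024, 6, 18)
    # Writing `start_date = _resolve_start_date(now, start_date)` would make
    # mypy flag both the assignment (list vs. datetime | None) and the later
    # `start_date[0]` (the declared type is not indexable). A new name avoids it.
    resolved_start = _resolve_start_date(now, start_date)
    return {"station_id": station_id, "start_time": resolved_start[0]}
```

The same pattern applies to the end date; no `# type: ignore` is needed when the resolved values live under their own names.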

@SorooshMani-NOAA SorooshMani-NOAA changed the base branch from dev to master June 26, 2024 16:09
@SorooshMani-NOAA SorooshMani-NOAA marked this pull request as ready for review June 26, 2024 16:12
@SorooshMani-NOAA SorooshMani-NOAA marked this pull request as draft June 26, 2024 19:11
@SorooshMani-NOAA SorooshMani-NOAA marked this pull request as ready for review June 27, 2024 12:13
@SorooshMani-NOAA
Contributor

@pmav99 we rebased everything and notebooks are also updated. All the tests pass now. Can we go ahead with the merge?

@SorooshMani-NOAA SorooshMani-NOAA requested a review from pmav99 June 27, 2024 12:14
@pmav99
Member

pmav99 commented Jun 30, 2024

Can we go ahead with the merge?

I don't think it is ready yet. For starters, I would expect that the API would follow the conventions we used on #125 but that doesn't seem to be the case.

@SorooshMani-NOAA
Contributor

@pmav99 you're right, sorry I missed it!!

@abdu558 please review the IOC API:

  • List station API:

    searvey/searvey/ioc.py

    Lines 188 to 194 in 0209bcd

    def get_ioc_stations(
        region: Optional[Union[Polygon, MultiPolygon]] = None,
        lon_min: Optional[float] = None,
        lon_max: Optional[float] = None,
        lat_min: Optional[float] = None,
        lat_max: Optional[float] = None,
    ) -> gpd.GeoDataFrame:
  • Internal multistation data API:

    searvey/searvey/_ioc_api.py

    Lines 184 to 194 in 0209bcd

    def _fetch_ioc(
        station_ids: abc.Collection[str],
        start_dates: pd.DatetimeIndex,
        end_dates: pd.DatetimeIndex,
        *,
        rate_limit: multifutures.RateLimit | None,
        http_client: httpx.Client | None,
        multiprocessing_executor: multifutures.ExecutorProtocol | None,
        multithreading_executor: multifutures.ExecutorProtocol | None,
        progress_bar: bool,
    ) -> dict[str, pd.DataFrame]:
  • Single station data API:

    searvey/searvey/_ioc_api.py

    Lines 222 to 232 in 0209bcd

    def fetch_ioc_station(
        station_id: str,
        start_date: DatetimeLike | None = None,
        end_date: DatetimeLike | None = None,
        *,
        rate_limit: multifutures.RateLimit | None = None,
        http_client: httpx.Client | None = None,
        multiprocessing_executor: multifutures.ExecutorProtocol | None = None,
        multithreading_executor: multifutures.ExecutorProtocol | None = None,
        progress_bar: bool = False,
    ) -> pd.DataFrame:

and make sure the NDBC API follows the same naming conventions as well as similar arguments (in cases where they make sense). The current corresponding functions you implemented are:

  • get_ndbc_stations: Looks OK.
  • _fetch_ndbc_station_data
    • Rename to _fetch_ndbc
    • Needs to accept multiple stations and multiple start/end dates
  • fetch_ndbc_stations_data
    • Rename to fetch_ndbc_station
    • Needs to accept a single station and a single start and end date

I think the rest of the arguments are not as important in this case, but it would be ideal to add them as well. Since columns is specific to NDBC (like the COOPS product, datum, etc.), it's OK for it to remain there.

@pmav99 do you see anything else that you'd like to be changed before going for the merge?

@abdu558
Contributor Author

abdu558 commented Jul 2, 2024

Alright sounds good

@SorooshMani-NOAA
Contributor

SorooshMani-NOAA commented Sep 12, 2024

Note to self: #161 #162 #163

@tomsail
Contributor

tomsail commented Sep 13, 2024

Considering your last observations, especially #161, which would benefit EMC's work (ping @AliS-Noaa), I think we still need to implement the Historical data part and, if we can, reuse parts of the Realtime implementation.

Also, I think we could leverage these two functions, available_realtime and available_historical, which return only stations with available data.

@SorooshMani-NOAA
Contributor

I think it'd be better to review all the current GSoC PRs/branches and merge them first, then start thinking about adding the missing pieces (unless the missing pieces make the PR useless). While I'd like everything to be complete before merging, realistically I don't think we have the capacity to do so, so we have to do it iteratively.

I'm still reviewing the missing pieces of NDBC and creating issues!

@SorooshMani-NOAA SorooshMani-NOAA merged commit 4d23de7 into oceanmodeling:master Sep 16, 2024
8 checks passed

Successfully merging this pull request may close these issues.

NDBC data