open_mfdataset leads to UndefVarError: diskstack not defined #476

Closed
Balinus opened this issue Nov 27, 2024 · 7 comments · Fixed by #477 or #481
Labels: bug (Something isn't working)

Comments

Balinus commented Nov 27, 2024

I tried the newly exported open_mfdataset function to concatenate files along a new dimension, and I get the following error:

using YAXArrays
using NetCDF
using Dates  # provides DateTime
import DimensionalData as DD

files = ["2020_09_crps_ecmwf.nc", "2020_10_crps_ecmwf.nc"]
dates = [DateTime(2020,9), DateTime(2020,10)]

newds = open_mfdataset(DD.DimArray(files,DD.Dim{:Ti}(dates)))
ERROR: UndefVarError: `diskstack` not defined
Stacktrace:
 [1] merge_new_axis(alldatasets::DimensionalData.DimVector{…}, firstcube::YAXArray{…}, var::Symbol, mergedim::Dim{…})
   @ YAXArrays.Datasets ~/.julia/packages/YAXArrays/ppMtD/src/DatasetAPI/Datasets.jl:343
 [2] (::YAXArrays.Datasets.var"#64#65"{Dim{}, Dataset, DimensionalData.DimVector{}})(var::Symbol)
   @ YAXArrays.Datasets ~/.julia/packages/YAXArrays/ppMtD/src/DatasetAPI/Datasets.jl:390
 [3] iterate
   @ ./generator.jl:47 [inlined]
 [4] _collect(c::Vector{Symbol}, itr::Base.Generator{Vector{…}, YAXArrays.Datasets.var"#64#65"{…}}, ::Base.EltypeUnknown, isz::Base.HasShape{1})
   @ Base ./array.jl:854
 [5] collect_similar(cont::Vector{Symbol}, itr::Base.Generator{Vector{…}, YAXArrays.Datasets.var"#64#65"{…}})
   @ Base ./array.jl:763
 [6] map
   @ ./abstractarray.jl:3285 [inlined]
 [7] open_mfdataset(vec::DimensionalData.DimVector{…}; kwargs::@Kwargs{})
   @ YAXArrays.Datasets ~/.julia/packages/YAXArrays/ppMtD/src/DatasetAPI/Datasets.jl:385
 [8] open_mfdataset(vec::DimensionalData.DimVector{…})
   @ YAXArrays.Datasets ~/.julia/packages/YAXArrays/ppMtD/src/DatasetAPI/Datasets.jl:381
 [9] top-level scope
   @ REPL[9]:1
Some type information was truncated. Use `show(err)` to see complete types.

Here are the dimensions of the underlying data in the NetCDF files:

Cube(open_dataset(files[1]))
╭───────────────────────────────────────────────╮
│ 251×199×2 YAXArray{Union{Missing, Float64},3} │
├───────────────────────────────────────────────┴───────────────────────────────────────────────────────────────────────────────────────── dims ┐
  ↓ longitude Sampled{Float32} -80.0f0:0.1f0:-55.0f0 ForwardOrdered Regular Points,
  → latitude  Sampled{Float32} 42.2f0:0.1f0:62.0f0 ForwardOrdered Regular Points,
  ↗ Variable  Categorical{String} ["tp", "tmean"] ReverseOrdered
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 1 entry:
  "missing_value" => 1.0e32
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── loaded lazily ┤
  data size: 780.45 KB


Cube(open_dataset(files[2]))
╭───────────────────────────────────────────────╮
│ 251×199×2 YAXArray{Union{Missing, Float64},3} │
├───────────────────────────────────────────────┴───────────────────────────────────────────────────────────────────────────────────────── dims ┐
  ↓ longitude Sampled{Float32} -80.0f0:0.1f0:-55.0f0 ForwardOrdered Regular Points,
  → latitude  Sampled{Float32} 42.2f0:0.1f0:62.0f0 ForwardOrdered Regular Points,
  ↗ Variable  Categorical{String} ["tp", "tmean"] ReverseOrdered
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 1 entry:
  "missing_value" => 1.0e32
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── loaded lazily ┤
  data size: 780.45 KB

(Saison) pkg> st

  [179af706] CFTime v0.1.3
  [336ed68f] CSV v0.10.15
  [13f3f980] CairoMakie v0.12.16
  [7a955b69] CircularArrays v1.4.0
  [4f4ee721] ClimateTools v0.24.2 `~/.julia/dev/ClimateTools`
  [a93c6f00] DataFrames v1.7.0
⌃ [0703355e] DimensionalData v0.28.5
  [b4f34e82] Distances v0.10.12
  [31c24e10] Distributions v0.25.113
  [fe3fe864] Extremes v1.0.2
  [db073c08] GeoMakie v0.7.8
  [c27321d9] Glob v1.3.1
  [ee78f7c6] Makie v0.21.16
  [436b0209] NaturalEarth v0.1.0
  [30363a11] NetCDF v0.12.0
  [91a5bcdd] Plots v1.40.9
  [f27b6e38] Polynomials v4.0.12
  [1fd47b50] QuadGK v2.11.1
  [3cb90ccd] RasterDataSources v0.7.0
  [8e980c4a] Shapefile v0.13.1
  [f3b207a7] StatsPlots v0.15.7
  [592b5752] Trapz v2.0.3
  [c21b50f5] YAXArrays v0.5.14
  [0a941bbe] Zarr v0.9.4
  [ade2ca70] Dates
  [10745b16] Statistics v1.10.0

lazarusA added the bug label Nov 28, 2024

lazarusA commented

Ohh... diskstack is not being imported? Is this one from DiskArrays? @meggart

meggart commented Nov 28, 2024

Thanks for the report. I was indeed a bit quick in merging this. There was a missing import, and also another small bug for the case where the arrays are concatenated along a new dimension not yet present in the existing datasets. This should be fixed by #477.
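
For readers unfamiliar with this class of error, here is a minimal sketch of how a missing import produces an UndefVarError only when the function is actually called. The modules Provider and Consumer and the function stack_lazily are hypothetical stand-ins and are not the YAXArrays source:

```julia
# Hypothetical modules for illustration only; not YAXArrays code.
module Provider
export stack_lazily
# Stand-in for a helper that lives in another package
stack_lazily(xs) = collect(xs)
end

module Consumer
# Note: there is no `import ..Provider: stack_lazily` here,
# so the name is unresolved inside this module.
merge_all(xs) = stack_lazily(xs)
end

# Defining Consumer succeeds; the error only appears at call time.
err = try
    Consumer.merge_all([1, 2])
catch e
    e
end

println(typeof(err))
println(err.var)
```

Because the name is only looked up when the function body runs, a code path that is not exercised by the tests can ship with this kind of bug.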

Balinus commented Nov 28, 2024

Yes, it works on my end with #477:

newds = Cube(open_mfdataset(DD.DimArray(files[end-1:end],DD.Dim{:Year}(dates))))
╭──────────────────────────────────────────────────╮
│ 251×199×2×2 YAXArray{Union{Missing, Float64}, 4} │
├──────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────── dims ┐
   longitude Sampled{Float32} -80.0f0:0.1f0:-55.0f0 ForwardOrdered Regular Points,
   latitude  Sampled{Float32} 42.2f0:0.1f0:62.0f0 ForwardOrdered Regular Points,
  ↗ Year      Sampled{Int64} 1:2 ForwardOrdered Regular Points,
  ⬔ Variable  Categorical{String} ["tp", "tmean"] ReverseOrdered
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 1 entry:
  "missing_value" => 1.0e32
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── loaded lazily ┤
  data size: 1.52 MB

However, the new dimension seems to fall back to integer values?

dates
2-element Vector{DateTime}:
 2020-11-01T00:00:00
 2020-12-01T00:00:00

(tmp) > newds.Year
Year Sampled{Int64} ForwardOrdered Regular DimensionalData.Dimensions.Lookups.Points
wrapping: 1:2

(tmp) > collect(newds.Year)
╭──────────────────────────────╮
│ 2-element DimArray{Int64, 1} │
├──────────────────────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────── dims ┐
   Year Sampled{Int64} 1:2 ForwardOrdered Regular Points
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
 1  1
 2  2

Trying with a vector of strings also gives integer values:

newds = Cube(open_mfdataset(DD.DimArray(files[end-1:end],DD.Dim{:Year}(["a", "b"]))))
╭──────────────────────────────────────────────────╮
│ 251×199×2×2 YAXArray{Union{Missing, Float64}, 4} │
├──────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────── dims ┐
   longitude Sampled{Float32} -80.0f0:0.1f0:-55.0f0 ForwardOrdered Regular Points,
   latitude  Sampled{Float32} 42.2f0:0.1f0:62.0f0 ForwardOrdered Regular Points,
  ↗ Year      Sampled{Int64} 1:2 ForwardOrdered Regular Points,
  ⬔ Variable  Categorical{String} ["tp", "tmean"] ReverseOrdered
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 1 entry:
  "missing_value" => 1.0e32
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── loaded lazily ┤
  data size: 1.52 MB

Balinus commented Dec 11, 2024

There is still the last bug, where the values of the new dimension are not taken into account.

julia> files = ["2020_09_crps_ecmwf.nc", "2020_10_crps_ecmwf.nc"]
2-element Vector{String}:
 "2020_09_crps_ecmwf.nc"
 "2020_10_crps_ecmwf.nc"

julia> dates = [DateTime(2020,9), DateTime(2020,10)]
2-element Vector{DateTime}:
 2020-09-01T00:00:00
 2020-10-01T00:00:00

julia> import DimensionalData as DD

julia> using NetCDF
[ Info: new driver key :netcdf, updating backendlist.

julia> newds = open_mfdataset(DD.DimArray(files,DD.Dim{:Ti}(dates)))
YAXArray Dataset
Shared Axes:
  ( longitude Sampled{Float32} -80.0f0:0.1f0:-55.0f0 ForwardOrdered Regular Points,
   latitude  Sampled{Float32} 42.2f0:0.1f0:62.0f0 ForwardOrdered Regular Points,
  ↗ Ti        Sampled{Int64} 1:2 ForwardOrdered Regular Points)

Variables:
tmean, tp

lazarusA commented

If those files are small enough, could you please share them? It might make things easier to debug/fix.

lazarusA reopened this Dec 11, 2024

Balinus commented Dec 12, 2024

I'll see if I can put together a MWE. The files cannot be shared due to cybersecurity policies here :)

Balinus commented Dec 12, 2024

Here's a standalone MWE with random arrays and dates:

using YAXArrays
using NetCDF
using Dates
import DimensionalData as DD

a1 = YAXArray(rand(10, 20, 5))
a2 = YAXArray(rand(10, 20, 5))

savecube(a1, "a1.nc")
savecube(a2, "a2.nc")

files = ["a1.nc", "a2.nc"]

dates = [Date(2020, 1, 1) + Dates.Day(i) for i in 1:2]

ds = open_mfdataset(DD.DimArray(files,DD.Dim{:Ti}(dates)))

The output of ds is the following, with the Ti dimension holding integers instead of the dates:

ds = open_mfdataset(DD.DimArray(files,DD.Dim{:Ti}(dates)))
YAXArray Dataset
Shared Axes:
  ( Dim_1 Sampled{Int64} 1:1:10 ForwardOrdered Regular Points,
   Dim_2 Sampled{Int64} 1:1:20 ForwardOrdered Regular Points,
  ↗ Dim_3 Sampled{Int64} 1:1:5 ForwardOrdered Regular Points,
  ⬔ Ti    Sampled{Int64} 1:2 ForwardOrdered Regular Points)

Variables:
layer
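
To narrow down where the dates are lost, it may help to confirm that the input DimArray itself carries them before it reaches open_mfdataset. A minimal sketch, assuming only Dates and DimensionalData are installed; no files are read, and the names filevec and ti are illustrative:

```julia
using Dates
import DimensionalData as DD

files = ["a1.nc", "a2.nc"]
dates = [Date(2020, 1, 1) + Dates.Day(i) for i in 1:2]

# Same construction as in the MWE above, without opening any file
filevec = DD.DimArray(files, DD.Dim{:Ti}(dates))

# Take the only dimension by position and inspect its lookup values
ti = first(DD.dims(filevec))
println(DD.name(ti))
println(collect(DD.lookup(ti)))
```

If the lookup printed here contains the dates, the input is fine and the integer fallback is introduced somewhere during the merge inside open_mfdataset.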
