Transportation Year 2 Updates #35

Draft
nreinicke wants to merge 1 commit into dsgrid-project-IEF-Phase2-2025
Conversation

nreinicke

I'm very new to this data format, so I'm starting this as a draft to get feedback. I've attempted to update the schema to accommodate the new load data from the evi-grid-national-framework. The new raw data is hourly load profiles spanning one week for each month of the year. We've run these for four scenarios:

  • low_baseline
  • high_baseline
  • high_inefficient
  • breakthrough_baseline

The raw data is in parquet format on Kestrel at `/projects/evix/evbps/grid-team-deliverables/2025-01-24-IEF/aggregations/load-profiles/month-week-hour`. Here's a sample of what the data looks like:

```text
region   month  body_style         charge_location  plug_name  hour  energy_kwh  scenario                 state  year
str      u8     enum               enum             cat        i32   f64         str                      str    i64
"08083"  1      "ldv-light-truck"  "enroute"        "DC150"    0     0.0         "breakthrough_baseline"  "CO"   2025
"08083"  8      "ldv-light-truck"  "public"         "L2"       0     14.131195   "breakthrough_baseline"  "CO"   2025
"08055"  9      "ldv-light-truck"  "public"         "L2"       0     0.0         "breakthrough_baseline"  "CO"   2025
"08057"  11     "ldv-car"          "home"           "L2"       0     0.0         "breakthrough_baseline"  "CO"   2025
```

Some questions:

  • How much post-processing should we do on our side? For example, it looks like the current transportation config names the load metrics like `electricity_ev_ldv_work_l2`. Should we run our data through a script that maps our format into that format, or should that be something we modify in the config so that dsgrid can handle it?
  • In our current pipeline, we aggregate BEV and PHEV light-duty vehicles into a larger category like `ldv-car`. Do we need to retain the distinction between BEV and PHEV for this analysis?

@nreinicke nreinicke marked this pull request as draft January 31, 2025 22:14
@nreinicke
Author

Tagging @ahcyip @elainethale @daniel-thom for any feedback. Sorry if I've totally butchered the existing schema.

@ahcyip
Contributor

ahcyip commented Feb 5, 2025

  • How much post-processing should we do on our side? For example, it looks like the current transportation config names the load metrics like `electricity_ev_ldv_work_l2`. Should we run our data through a script that maps our format into that format, or should that be something we modify in the config so that dsgrid can handle it?

I think the idea is that the data on our side should be as raw and detailed as possible and the config should be modified to handle the raw output.

  • In our current pipeline, we aggregate BEV and PHEV light-duty vehicles into a larger category like `ldv-car`. Do we need to retain the distinction between BEV and PHEV for this analysis?

@bborlaug should make the call on whether our data should keep BEV separate from PHEV. On one hand, the data will be very different (the magnitude, timing, and location of load per BEV vs. per PHEV differ substantially); on the other hand, if our pipeline is already aggregating, I don't know whether we will be checking the results with distinct BEV and PHEV categories or doing any analysis that uses the distinction, so we may not need to go "backwards" and keep the distinction for dsgrid.

@ahcyip
Contributor

ahcyip commented Feb 5, 2025

Regarding the date format, you may have to change from hour 0-167 to day_of_week 0-6 (zero-based, starting on Monday: Mon -> 0, Tue -> 1, ...) crossed with hour 0-23 (zero-based, starting at midnight) instead.
https://dsgrid.github.io/dsgrid/reference/dataset_formats.html
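
Something like this might work as a starting point (just a sketch; it assumes the hour column runs 0-167 within each representative week, starting Monday at midnight, and that the files can be read directly with polars):

```python
import polars as pl

# Sketch: split a 0-167 week-hour index into day_of_week and hour-of-day.
path = "/projects/evix/evbps/grid-team-deliverables/2025-01-24-IEF/aggregations/load-profiles/month-week-hour"
df = pl.read_parquet(f"{path}/**/*.parquet")

df = df.with_columns(
    (pl.col("hour") // 24).alias("day_of_week"),  # 0 = Monday ... 6 = Sunday
    (pl.col("hour") % 24).alias("hour"),          # 0-23, starting at midnight
)
```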

P.S. @daniel-thom helped me with the dsgrid software last time, but @nreinicke is a software pro, so Nick could probably handle everything discussed in https://dsgrid.github.io/dsgrid/tutorials/create_and_submit_dataset.html (if it is up to date). Also, @nreinicke sorry I forgot to pass this dsgrid documentation to you earlier - this may have covered a lot of what we chatted about.

@daniel-thom
Contributor

Regarding the date format, you may have to change from hour 0-167 to day_of_week 0-6 (zero-based, starting on Monday: Mon -> 0, Tue -> 1, ...) crossed with hour 0-23 (zero-based, starting at midnight) instead. https://dsgrid.github.io/dsgrid/reference/dataset_formats.html


The data tables are already in a very good format for dsgrid. Here are the minor changes that need to be made:

  • Converting to day_of_week as discussed by @ahcyip would be very helpful because we already have support for that. I'll emphasize that while we are flexible on this point, we would prefer consistency with prior formats so that it's less work.
  • dsgrid currently requires specific column names: (1) columns for the dimension types scenario, sector, subsector, metric, model_year, weather_year, and geography; (2) a value column called `value`; (3) the time columns can be whatever you want. We might need some discussion here.

I'd be happy to help with the post-processing to convert to a dsgrid format. This would be a simple Spark query.
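
For illustration, that query might look roughly like the following (the sector label, metric naming, and weather_year choice here are placeholders and would need to match the project's dimension config):

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical mapping from the raw columns to dsgrid's required column names.
raw = spark.read.parquet(
    "/projects/evix/evbps/grid-team-deliverables/2025-01-24-IEF/"
    "aggregations/load-profiles/month-week-hour"
)

dsgrid_table = raw.select(
    F.col("scenario"),
    F.lit("transportation").alias("sector"),               # assumed sector label
    F.col("body_style").alias("subsector"),
    F.concat_ws(                                            # assumed metric naming
        "_", F.lit("electricity_ev"), F.col("charge_location"), F.col("plug_name")
    ).alias("metric"),
    F.col("year").alias("model_year"),
    F.col("year").alias("weather_year"),                    # assumption: same as model_year
    F.col("region").alias("geography"),
    F.col("month"),
    F.floor(F.col("hour") / 24).cast("int").alias("day_of_week"),  # if hour is 0-167
    (F.col("hour") % 24).alias("hour"),                     # hour of day, 0-23
    F.col("energy_kwh").alias("value"),                     # kWh
)
```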
