
v0.3.0

@rodrigodelazcano rodrigodelazcano released this 17 May 14:59
· 155 commits to main since this release
7fc53fd

v0.3.0: Minari is ready for testing

Minari 0.3.0 Release Notes:

This beta release brings considerable changes from v0.2.2. As a major refactor, the C source code and the Cython dependency have been removed in favor of a pure Python API in order to reduce code complexity. If we require a more efficient API in the future, we will explore the use of C.

Apart from the API changes and new features, we are excited to include the first official Minari datasets, which have been re-created from the D4RL project.

The documentation page at https://minari.farama.org/ has also been updated with the latest changes.

We are constantly developing this library. Please don't hesitate to open a GitHub issue or reach out to us directly. Your ideas and contributions are highly appreciated and will help shape the future of this library. Thank you for using our library!

New Features and Improvements

Dataset File Format

We are keeping the HDF5 file format to store the Minari datasets. However, the internal structure of the datasets has been modified: the data is now stored on a per-episode basis. Each Minari dataset has at least one HDF5 file (📄, main_data.hdf5). In the dataset file, the collected transitions are separated into episode groups (📁) that contain 5 required datasets (💾): observations, actions, terminations, truncations, and rewards. Other optional groups and datasets can be included in each episode, as is the case for the infos step return. This structure allows us to store metadata for each episode.

📄 main_data.hdf5
├ 📁 episode_id
│  ├ 💾 observations
│  ├ 💾 actions
│  ├ 💾 terminations
│  ├ 💾 truncations
│  ├ 💾 rewards
│  ├ 📁 infos
│  │  ├ 💾 info datasets
│  │  └ 📁 info subgroup
│  │     └ 💾 info subgroup dataset
│  └ 📁 extra dataset group
│     └ 💾 extra datasets
└ 📁 next_episode_id
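
The layout above can be mimicked in memory to check that an episode group carries the five required datasets. This is only a sketch: nested dictionaries stand in for HDF5 groups and plain lists stand in for HDF5 datasets, and the helper `missing_datasets` and the episode names are hypothetical, not part of Minari. The values are arbitrary placeholders.

```python
# Hypothetical in-memory model of the per-episode layout: nested dicts stand in
# for HDF5 groups and lists stand in for HDF5 datasets (no HDF5 file is touched).
REQUIRED_DATASETS = ("observations", "actions", "terminations", "truncations", "rewards")


def missing_datasets(episode_group: dict) -> list:
    """Return the required dataset names that are absent from an episode group."""
    return [name for name in REQUIRED_DATASETS if name not in episode_group]


main_data = {
    "episode_0": {
        "observations": [0.0, 0.1],
        "actions": [1, 0],
        "terminations": [False, True],
        "truncations": [False, False],
        "rewards": [1.0, 1.0],
        "infos": {"success": [False, True]},  # optional group
    },
    "episode_1": {
        "observations": [0.0, 0.5],
        "actions": [0, 1],
        # terminations, truncations and rewards are deliberately left out
    },
}

# Map each episode id to the list of required datasets it lacks.
report = {episode_id: missing_datasets(group) for episode_id, group in main_data.items()}
print(report)
```

An empty list for an episode means that all five required datasets are present in that group.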

MinariDataset

When loading a dataset, the MinariDataset object now delegates the HDF5 file access to a MinariStorage object. The MinariDataset provides new methods, MinariDataset.sample_episodes() (#34) and MinariDataset.iterate_episodes() (#54), to retrieve EpisodeData from the available episode indices in the dataset.

NOTE: for now, the user is in charge of creating their own replay buffers with the provided episode sampling methods. We are currently working on creating standard replay buffers (#55) and making Minari datasets compatible with other offline RL libraries.

The available episode indices can be filtered using metadata or other information from the episodes' HDF5 datasets with MinariDataset.filter_episodes(condition: Callable[[h5py.Group], bool]) (#34).

dataset = minari.load_dataset("door-human-v0")

print(f'TOTAL EPISODES ORIGINAL DATASET: {dataset.total_episodes}')

# get episodes with mean reward greater than 2
filter_dataset = dataset.filter_episodes(lambda episode: episode["rewards"].attrs.get("mean") > 2)

print(f'TOTAL EPISODES FILTER DATASET: {filter_dataset.total_episodes}')
>>> TOTAL EPISODES ORIGINAL DATASET: 25
>>> TOTAL EPISODES FILTER DATASET: 18
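
The idea behind a condition callable can be illustrated without HDF5. In the sketch below, a plain dictionary of reward lists stands in for the episode groups, and `filter_episode_indices` is a hypothetical helper, not Minari's implementation. The condition is applied once per episode and only the indices that satisfy it are kept.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical stand-in for the episode groups: episode index -> list of rewards.
episodes: Dict[int, List[float]] = {
    0: [1.0, 2.0, 3.0],  # mean 2.0
    1: [3.0, 4.0],       # mean 3.5
    2: [0.5, 0.5, 0.5],  # mean 0.5
    3: [2.5, 2.5],       # mean 2.5
}


def filter_episode_indices(
    episodes: Dict[int, List[float]],
    condition: Callable[[List[float]], bool],
) -> List[int]:
    """Keep the indices of the episodes for which the condition returns True."""
    return [index for index, rewards in episodes.items() if condition(rewards)]


# Same criterion as in the example above: mean reward strictly greater than 2.
kept = filter_episode_indices(episodes, lambda rewards: mean(rewards) > 2)
print(kept)
```

Episode 0 is dropped because its mean reward is exactly 2, which does not satisfy the strict inequality.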

The episodes in a MinariDataset can also be split into smaller sub-datasets with minari.split_dataset(dataset: MinariDataset, sizes: List[int], seed: int | None = None) (#34).

dataset = minari.load_dataset("door-human-v0")

split_datasets = minari.split_dataset(dataset, sizes=[20, 5], seed=123)

print(f'TOTAL EPISODES FIRST SPLIT: {split_datasets[0].total_episodes}')
print(f'TOTAL EPISODES SECOND SPLIT: {split_datasets[1].total_episodes}')
>>> TOTAL EPISODES FIRST SPLIT: 20
>>> TOTAL EPISODES SECOND SPLIT: 5
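
One way to think about a seeded split is as a reproducible shuffle of the episode indices followed by slicing. The sketch below shows that idea in plain Python. The function `split_indices` is hypothetical, and shuffling before slicing is an assumption of this sketch rather than a description of how minari.split_dataset works internally.

```python
import random
from typing import List, Optional, Sequence


def split_indices(
    indices: Sequence[int], sizes: List[int], seed: Optional[int] = None
) -> List[List[int]]:
    """Shuffle the indices with a seeded generator and cut consecutive slices."""
    if sum(sizes) > len(indices):
        raise ValueError("The requested split sizes exceed the number of episodes.")
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)  # local generator, global state untouched
    splits, start = [], 0
    for size in sizes:
        splits.append(shuffled[start:start + size])
        start += size
    return splits


# 25 episodes split into 20 and 5, mirroring the sizes used above.
first, second = split_indices(range(25), sizes=[20, 5], seed=123)
print(len(first), len(second))
```

Using the same seed yields the same partition, which is useful when a train/evaluation split has to be reproduced later.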

Finally, Gymnasium release v0.28.0 made it possible to convert the environment's EnvSpec to a JSON dictionary. This allows Minari to save the description of the environment used to generate the dataset in the HDF5 file and recover it later through MinariDataset.recover_environment() (#31). NOTE: the entry_point of the environment must be available, i.e. to recover the environment from the door-human-v0 dataset, the gymnasium-robotics library must be installed.
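
The round trip described above can be sketched with a simplified stand-in for an environment spec. `ToyEnvSpec` and its field values are hypothetical and far smaller than Gymnasium's EnvSpec; the point is only that the spec is stored as a JSON string and rebuilt from it later.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ToyEnvSpec:
    """Hypothetical, simplified stand-in for an environment spec."""

    id: str
    entry_point: str  # import path, stored as a plain string
    max_episode_steps: Optional[int] = None
    kwargs: Dict[str, Any] = field(default_factory=dict)


spec = ToyEnvSpec(
    id="ToyEnv-v0",
    entry_point="toy_package.envs:ToyEnv",
    max_episode_steps=200,
    kwargs={"reward_type": "sparse"},
)

# Serialize the spec to a JSON string, as it could be stored alongside the data.
spec_json = json.dumps(asdict(spec))

# Rebuild the spec from the stored string.
restored = ToyEnvSpec(**json.loads(spec_json))
print(restored == spec)
```

Because the entry point survives only as a string, the package it refers to has to be importable when the environment is re-created, which is the reason for the NOTE above.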

Dataset Creation (#31)

We are facilitating the logging of environment data by providing a Gymnasium environment wrapper, DataCollectorV0. This wrapper buffers the parameters from a Gymnasium step transition. The DataCollectorV0 is also memory efficient: it provides a step/episode scheduler to cache the recorded data. In addition, this wrapper can be initialized with two custom callbacks:

  • StepDataCallback - This callback automatically flattens Dictionary or Tuple observation/action spaces (this functionality will be removed in a future release following the suggestions of #57). This class can be overridden to store additional environment data.

  • EpisodeMetadataCallback - This callback adds metadata to each recorded episode. For now automatic metadata will be added to the rewards dataset of each episode. It can also be overridden to include additional metadata.

To save the Minari dataset to disk with a specific dataset id, two functions are provided. If the data is collected by wrapping the environment with a DataCollectorV0, use minari.create_dataset_from_collector_env. Otherwise, you can collect the episode trajectories with dictionary collection buffers and use minari.create_dataset_from_buffers.

These functions return a MinariDataset object, which can be used to checkpoint the data collection process and later append more data with MinariDataset.update_dataset_from_collector_env(collector_env: DataCollectorV0).

import minari
import gymnasium as gym

env = gym.make('CartPole-v1')
collector_env = minari.DataCollectorV0(env)

dataset_id = 'cartpole-test-v0'

# Collect 1000 episodes for the dataset
for n_episode in range(1000):
    collector_env.reset(seed=123)
    while True:
        # Sample a random action and step the wrapped environment
        action = collector_env.action_space.sample()
        obs, rew, terminated, truncated, info = collector_env.step(action)
        if terminated or truncated:
            break

    # Checkpoint data after each 100 episodes
    if (n_episode + 1) % 100 == 0:
        # If the Minari dataset id does not exist create a new dataset, otherwise update the existing one
        if dataset_id not in minari.list_local_datasets():
            dataset = minari.create_dataset_from_collector_env(collector_env=collector_env, dataset_id=dataset_id)
        else:
            dataset.update_dataset_from_collector_env(collector_env)
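
For the buffer-based path, the following is a minimal sketch of what collecting episodes into dictionary buffers could look like without the wrapper. `CountdownEnv` and `collect_episode` are hypothetical, the key names follow the five required datasets listed earlier, and storing the initial observation in the buffer is an assumption of this sketch. The call to minari.create_dataset_from_buffers is left out because its arguments are not described in these notes.

```python
from typing import Callable, Dict, List


class CountdownEnv:
    """Hypothetical toy environment with a Gymnasium-style reset/step interface."""

    def __init__(self, start: int = 3):
        self.start = start
        self.state = start

    def reset(self, seed=None):
        self.state = self.start
        return self.state, {}

    def step(self, action: int):
        self.state -= 1
        reward = 1.0 if action == 1 else 0.0
        terminated = self.state == 0
        truncated = False
        return self.state, reward, terminated, truncated, {}


def collect_episode(env: CountdownEnv, policy: Callable[[int], int]) -> Dict[str, List]:
    """Run one episode and record every transition in a dictionary buffer."""
    buffer: Dict[str, List] = {
        "observations": [], "actions": [], "rewards": [],
        "terminations": [], "truncations": [],
    }
    obs, _ = env.reset()
    buffer["observations"].append(obs)  # assumption: the initial observation is kept
    while True:
        action = policy(obs)
        obs, rew, terminated, truncated, _ = env.step(action)
        buffer["observations"].append(obs)
        buffer["actions"].append(action)
        buffer["rewards"].append(rew)
        buffer["terminations"].append(terminated)
        buffer["truncations"].append(truncated)
        if terminated or truncated:
            break
    return buffer


env = CountdownEnv(start=3)
# One dictionary per episode, gathered in a list.
buffers = [collect_episode(env, policy=lambda obs: obs % 2) for _ in range(2)]
print(buffers[0])
```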

We provide a curated tutorial in the documentation on how to use these dataset creation tools: https://minari.farama.org/main/tutorials/dataset_creation/point_maze_dataset/#sphx-glr-tutorials-dataset-creation-point-maze-dataset-py

Finally, multiple existing Minari datasets can be combined into a larger dataset under a new id. The datasets to be combined must have the same observation/action space and the same EnvSpec, except for the max_episode_steps argument, for which the largest value among all the datasets is selected:

dataset_v1 = minari.load_dataset('dataset-v1')
dataset_v2 = minari.load_dataset('dataset-v2')

dataset_v3 = minari.combine_datasets(datasets_to_combine=[dataset_v1, dataset_v2], new_dataset_id='dataset-v3')
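
The compatibility rule for combining datasets can be sketched on plain dictionaries. `combined_env_spec` is a hypothetical helper and the spec dictionaries are illustrative stand-ins for an EnvSpec: all fields except max_episode_steps must match, and the largest max_episode_steps is kept. A check of the observation/action spaces would follow the same pattern and is omitted here.

```python
from typing import Any, Dict, List


def combined_env_spec(specs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge spec dictionaries that agree on everything except max_episode_steps."""
    if not specs:
        raise ValueError("At least one spec is required.")

    def without_max_steps(spec: Dict[str, Any]) -> Dict[str, Any]:
        return {key: value for key, value in spec.items() if key != "max_episode_steps"}

    reference = without_max_steps(specs[0])
    for spec in specs[1:]:
        if without_max_steps(spec) != reference:
            raise ValueError("The specs differ in more than max_episode_steps.")

    # Keep the largest step limit among the specs that define one.
    limits = [s["max_episode_steps"] for s in specs if s.get("max_episode_steps") is not None]
    combined = dict(reference)
    combined["max_episode_steps"] = max(limits) if limits else None
    return combined


spec_v1 = {"id": "ToyEnv-v0", "entry_point": "toy_package.envs:ToyEnv", "max_episode_steps": 200}
spec_v2 = {"id": "ToyEnv-v0", "entry_point": "toy_package.envs:ToyEnv", "max_episode_steps": 500}

merged = combined_env_spec([spec_v1, spec_v2])
print(merged["max_episode_steps"])
```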

CLI

To improve accessibility to the remote public datasets, we are also including a CLI tool with commands to list, download, and upload Minari datasets.

New Public Datasets

Below is a list of newly available dataset ids from different Gymnasium environments. These datasets have been re-created from the original D4RL project.

AdroitHandDoor-v1

AdroitHandHammer-v1

AdroitHandPen-v1

AdroitHandRelocate-v1

PointMaze

FrankaKitchen-v1