All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Creating playlists from lasso selection in comparison mode
- New mode: comparison interface for proper music exploration
- New mode: similarity interface for gathering user feedback on segment similarity
- Annoy indexes for embeddings storage
- SQLite database for
  - Mapping between Annoy ids and `track_id` with timestamps
  - Track metadata (artist, album, tags)
- Reading metadata from ID3 tags for local collections and consuming Jamendo API for MTG-Jamendo dataset
- New projection: STD-PCA that standardizes individual embedding dimensions before applying PCA
- New projection: UMAP
- Experiments to measure hubness and spread
- Anonymized models for the user experiments
- Aggregation of embeddings into one file instead of having multiple small .npy files per track
- Restructured app to have scripts as part of Flask app with `click` instead of `argparse`
- Optimized embeddings extraction script
- Modularized JS code
- Old processing code using MusiCNN package directly in favor of newer Essentia extractors
- Support for seaborn plotting
- Show artist name and track name with link to Jamendo website when playing audio
- Better alerts
- Merged code for showing segments and trajectories
- Support for different segment lengths (for VGGish embeddings)
- UI for selecting different models and datasets
- New dataset (pre-trained): MSD (Million Song Dataset)
- New model: VGG trained on both MSD and MTAT
- Dropdown selector for tags and PCA components with search functionality
- UI for audio toggle and log scale moved
- Made the plot section bigger
- Support for dynamic PCA
- Selection of layers for t-SNE (now only [0, 1])
- Graph area for visualization
- Input field for number of tracks
- Embeddings and taggrams from MusiCNN model trained on MTAT
- 3 ways to visualize latent spaces: segments, averages, trajectories
- 2 spaces: taggrams and penultimate
- 3 projections: PCA, t-SNE and original with numerical input fields for X and Y
- Selection of the way audio is played: on-click, on-hover and disabled
- Optional log-scale for visualizations that are clustered too close to the axes
- Experimental UI for music exploration