Commit
Merge pull request #3 from PAIR-code/50k
Add raw 50k mammoth subsample
cannoneyed authored Dec 4, 2019
2 parents 8a41565 + 0f7238d commit 1ff6f3a
Showing 3 changed files with 5 additions and 2 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -29,7 +29,9 @@ yarn dev:toy_comparison

#### Data preprocessing

-_Understanding UMAP_ uses a few tricks to make the data payloads for some of the interactive figures small enough to download in a reasonable time. The `mammoth` figures use a 10-bit encoding scheme to compress the 10,000 3D points into a significantly smaller payload. The `hyperparameters` and `toy_comparison` figures precompute UMAP embeddings for all of their different combinations, then use the same 10-bit encoding scheme to compress the data.
+For the mammoth figures, the [raw 3D data](https://github.com/MNoichl/UMAP-examples-mammoth-/blob/master/mammoth_a.csv) was downsampled to 50,000 points before being projected with UMAP / t-SNE. These 50,000 points were then randomly subsampled to 10,000 points in order to minimize the payload size.
+
+_Understanding UMAP_ uses a few tricks to make the data payloads for some of the interactive figures small enough to download in a reasonable time. The `mammoth` figures use a 10-bit encoding scheme to compress the 10,000 data points into a significantly smaller payload. The `hyperparameters` and `toy_comparison` figures precompute UMAP embeddings for all of their different combinations, then use the same 10-bit encoding scheme to compress the data.
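
The diff does not show the preprocessing code itself, so the following is only a rough Python sketch of the two steps the README describes: random subsampling and 10-bit encoding. The function names (`subsample`, `quantize_10bit`, `dequantize_10bit`, `pack_xyz`, `unpack_xyz`), the min/max normalization, and the 3-values-per-integer packing layout are all assumptions made for illustration and may differ from what the repository actually does.

```python
import random


def subsample(points, k, seed=0):
    """Draw k points uniformly at random, without replacement (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so the subsample is reproducible
    return rng.sample(points, k)


def quantize_10bit(values):
    """Map floats onto integer codes 0..1023, scaled by the input's min and max.

    Assumed scheme: linear scaling followed by rounding to the nearest code.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # guard against division by zero for constant input
    codes = [round((v - lo) / span * 1023) for v in values]
    return codes, lo, hi


def dequantize_10bit(codes, lo, hi):
    """Approximate inverse of quantize_10bit, given the stored min and max."""
    return [lo + c / 1023 * (hi - lo) for c in codes]


def pack_xyz(x, y, z):
    """Pack three 10-bit codes into one 30-bit integer (assumed layout)."""
    return (x << 20) | (y << 10) | z


def unpack_xyz(packed):
    """Recover the three 10-bit codes from a packed integer."""
    mask = 0x3FF  # lowest 10 bits
    return (packed >> 20) & mask, (packed >> 10) & mask, packed & mask


# Usage on synthetic data (not the mammoth dataset): 50 points down to 10.
points = [(random.random(), random.random(), random.random()) for _ in range(50)]
shown = subsample(points, 10)
xs, x_lo, x_hi = quantize_10bit([p[0] for p in shown])
ys, y_lo, y_hi = quantize_10bit([p[1] for p in shown])
zs, z_lo, z_hi = quantize_10bit([p[2] for p in shown])
packed = [pack_xyz(x, y, z) for x, y, z in zip(xs, ys, zs)]
```

Under this assumed scheme each coordinate costs 10 bits, so 10,000 3D points need 300,000 bits (37,500 bytes) before any container overhead, and the reconstruction error per coordinate is at most half a quantization step.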

```bash
yarn preprocess:hyperparameters
1 change: 1 addition & 0 deletions raw_data/mammoth_3d_50k.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion src/article/Article.svx
@@ -103,7 +103,7 @@ The following visualization - extended from excellent work by [Max Noichl](https
<MammothUmapVisualization />
<span slot="caption">
<span class="figure-number">Figure 5: </span>
-UMAP projections of a 3D woolly mammoth skeleton (50,000 points) into 2 dimensions, with various settings for the <code>n_neighbors</code> and <code>min_dist</code> parameters.
+UMAP projections of a 3D woolly mammoth skeleton (50k points, 10k shown) into 2 dimensions, with various settings for the <code>n_neighbors</code> and <code>min_dist</code> parameters.
</span>
</Figure>
