readme updates
bmschmidt committed May 15, 2024
1 parent e76e2f3 commit fbed675
Showing 7 changed files with 53 additions and 23 deletions.
2 changes: 1 addition & 1 deletion integers.html
@@ -18,7 +18,7 @@
<div id="deepscatter"></div>
</body>
<script type="module" lang="ts">
import { Scatterplot, Dataset, Bitmask, dictionaryFromArrays } from './src/deepscatter';
import { Scatterplot, Deeptable, Bitmask, dictionaryFromArrays } from './src/deepscatter';
import {
tableFromArrays,
Table,
25 changes: 18 additions & 7 deletions release_notes.md
@@ -19,16 +19,23 @@ Breaking changes:

This allows the export of several useful types for advanced functions in scatterplots we've found useful at Nomic. The initial set of exported items is `{ Dataset, Bitmask, Scatterplot, dictionaryFromArrays, LabelMaker }`. Bitmasks are an efficient way to update and refer to selection masks.

2. Apache Arrow is now a peer dependency of deepscatter rather than
2. Apache Arrow is now a peer dependency of
deepscatter rather than
being bundled into the distribution. Most bundlers will hopefully take care of installation for you, but if you are writing raw HTML code,
it will be necessary to include and re-export it. In general that will look like this.

```
import * as Arrow from 'apache-arrow';
export { Arrow };
```

3. The distinction between `QuadTile` and `ArrowTile`
has been eliminated in favor of `Tile`, and with it the need to supply
generics around them through the system. Similarly, `QuadTileDataset` and `ArrowDataset` have both been removed in favor of `Dataset`.
generics around them through the system. Similarly, `QuadTileDataset` and `ArrowDataset` have both been removed in favor of `Deeptable`, which is a generalized
version of the dataset class. It has been renamed because
the word 'dataset' is overloaded, and 'deeptable' better
captures that this thing is one of the primary novel
objects in this library--a lazily loaded structure for operations on a collection of Arrow record batches that are arranged in a tree.

4. Deepscatter no longer accepts strings as direct
arguments to `Scatterplot.plotAPI` in places where they were previously cast to functions
@@ -39,6 +46,7 @@ Breaking changes:

5. Shortcuts for passing `position` and `position0` rather
than naming the `x` and `y` dimensions explicitly have been removed.

6. Tile objects no longer have `ready` and `promise` states.
This is because tiles
other than the first no longer necessarily download any data at all. Code that blocked on these states should instead block on the dataset's `ready` promise; code needing to know if a particular tile has a record batch can check for the presence of `tile.record_batch`, but this no
@@ -50,15 +58,18 @@ Breaking changes:
object (such as childLocations, min_ix, max_ix, highest_known_ix, etc.) is now located in an object called `manifest` that is used to manage children. This is designed to
make it possible (though not yet necessary) to pre-load a single file enumerating all the tiles in the dataset.

7. The syntax for expressly passing a categorical scale may change.
7. The tools for handling a DataSelection have been moved
from the scatterplot class to the `deeptable` class. This is because the selection is a property of the dataset, not the plot, and can be instantiated without the plot
being drawn.

8. When the underlying data is boolean, the API encoding channels `filter`, `filter2`, and `foreground` no longer handle it with `op` commands: instead, true is true and false is false.
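
Items 3 and 7 above amount to a rename plus a move of the selection tools onto the table object. A minimal sketch of how old call sites can keep working through deprecated aliases is below. The three classes are hypothetical stand-ins, not deepscatter's real implementations, and the `{ name: string }` parameter shape is an assumption made only for illustration.

```typescript
// Hypothetical stand-ins: these are NOT deepscatter's real classes, only
// minimal shapes that reuse the names from the release notes.
class DataSelection {
  constructor(public readonly name: string) {}
}

class Deeptable {
  // Selections are created on the table, so no plot needs to exist.
  // The parameter shape here is an assumption for illustration.
  async select_data(params: { name: string }): Promise<DataSelection> {
    return new DataSelection(params.name);
  }
}

class Scatterplot {
  constructor(public readonly deeptable: Deeptable) {}

  /** @deprecated Old name kept as an alias for `deeptable`. */
  get dataset(): Deeptable {
    return this.deeptable;
  }

  /** @deprecated Forwards its arguments unchanged to the table. */
  async select_data(
    ...params: Parameters<Deeptable['select_data']>
  ): Promise<DataSelection> {
    return this.deeptable.select_data(...params);
  }
}

const table = new Deeptable();
const plot = new Scatterplot(table);

// Preferred: call the table directly. The plot wrapper gives the same result.
const viaTable = table.select_data({ name: 'evens' });
const viaPlot = plot.select_data({ name: 'evens' });
```

Typing the wrapper with `Parameters<Deeptable['select_data']>` means its signature follows the table's method automatically, so the deprecated path cannot drift out of sync with the preferred one.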

## Fundamental design changes

1. Previously `Aesthetic` objects were stateful;
they are now stateless, with all necessary state held in the pair of `StatefulAesthetic` that defines them. This allows for tighter binding and type safety with d3 scales; it should
2. The preferred tile input type has changed.
(There will be associated changes to the quadfeather package as well). Although for the sake of back-compatibility the special keys `x`, `y`, and `ix` will still work, deepscatter now falls back to those as defaults, preferring to find them wrapped in a struct field called `_deepscatter`.
3. Datasets where underlying data is boolean are no longer passed to filters with `op` commands: instead, true is true and false is false.
they are now stateless, with all necessary state held in the pair of `StatefulAesthetic` that defines them. This allows for tighter binding and type safety with d3 scales.

2. Datasets where underlying data is boolean

# 2.15.3

22 changes: 11 additions & 11 deletions src/Deeptable.ts
@@ -98,7 +98,7 @@ export class Deeptable {

public readonly tileStucture: DS.TileStructure;
/**
* @param plot The plot to which this dataset belongs.
* @param plot The plot to which this deeptable belongs.
**/

constructor({
@@ -193,7 +193,7 @@ export class Deeptable {
}

/**
* Ensures that all the tiles in a dataset are downloaded that include
* Ensures that all the tiles in a deeptable are downloaded that include
* datapoints of index less than or equal to max_ix.
* @param max_ix the depth to download to.
*/
@@ -287,7 +287,7 @@ export class Deeptable {
}

/**
* Generate an ArrowDataset from a single Arrow table.
* Generate an ArrowDeeptable from a single Arrow table.
*
* @param table A single Arrow table
* @param prefs The API Call to use for rendering.
@@ -301,7 +301,7 @@ export class Deeptable {
/**
*
* @param name The name of the column to check for
* @returns True if the column exists in the dataset, false otherwise.
* @returns True if the column exists in the deeptable, false otherwise.
*/
has_column(name: string) {
return (
@@ -328,7 +328,7 @@ export class Deeptable {
* The generic T tracks whether this reads strings from JSON and returns dates,
* or reads numbers from JSON and returns numbers
*
* @param columnName A column in the dataset.
* @param columnName A column in the deeptable.
* @returns A pair of numbers. Dates and bigints will be
* converted to numbers.
*/
@@ -407,8 +407,8 @@ export class Deeptable {
}
/**
* Map a function against all tiles.
* It is often useful simply to invoke Dataset.map(d => d) to
* get a list of all tiles in the dataset at any moment.
* It is often useful simply to invoke Deeptable.map(d => d) to
* get a list of all tiles in the deeptable at any moment.
*
* @param callback A function to apply to each tile.
* @param after Whether to perform the function in bottom-up order
@@ -424,7 +424,7 @@ export class Deeptable {
}

/**
* Invoke a function on all tiles in the dataset that have been downloaded.
* Invoke a function on all tiles in the deeptable that have been downloaded.
* The general architecture here is taken from the
* d3 quadtree functions. That's why, for example, it doesn't
* recurse.
@@ -464,7 +464,7 @@ export class Deeptable {
}

/**
* Invoke a function on all tiles in the dataset, downloading those that aren't
* Invoke a function on all tiles in the deeptable, downloading those that aren't
* here yet.
* The general architecture here is taken from the
* d3 quadtree functions. That's why, for example, it doesn't
@@ -578,7 +578,7 @@ export class Deeptable {
*
* @param ids A list of ids to get, keyed to the value to set them to.
* @param field_name The name of the new field to create
* @param key_field The column in the dataset to match them against.
* @param key_field The column in the deeptable to match them against.
*/

add_label_identifiers(
@@ -676,7 +676,7 @@ export class Deeptable {
) {
/*
Browsing can spawn a *lot* of download requests that persist on
unneeded parts of the dataset. So the dataset handles its own queue for dispatching
unneeded parts of the deeptable. So the deeptable handles its own queue for dispatching
downloads in case tiles have slipped from view while parents were requested.
*/
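
The doc comments on `map` and the tile visitors above describe a non-recursive walk in the style of the d3 quadtree functions, with an `after` flag for bottom-up order. A minimal sketch of that idea follows. `TileNode` and `mapTiles` are hypothetical names invented for illustration, and the real `Deeptable` methods may be implemented differently.

```typescript
// Hypothetical stand-in for a tile: only a key and child links.
interface TileNode {
  key: string;
  children: TileNode[];
}

// Visit every node without recursion, using an explicit stack.
// With after = false, parents come before their children (top-down);
// with after = true, children come before their parents (bottom-up).
function mapTiles<T>(
  start: TileNode,
  callback: (tile: TileNode) => T,
  after = false
): T[] {
  const ordered: TileNode[] = [];
  const stack: TileNode[] = [start];
  while (stack.length > 0) {
    const tile = stack.pop() as TileNode;
    ordered.push(tile);
    // Push children in reverse so the first child is popped first.
    for (let i = tile.children.length - 1; i >= 0; i--) {
      stack.push(tile.children[i]);
    }
  }
  // Reversing a pre-order walk yields an order in which every child
  // precedes its parent.
  if (after) ordered.reverse();
  return ordered.map(callback);
}

const leaf = (key: string): TileNode => ({ key, children: [] });
const rootTile: TileNode = {
  key: '0/0/0',
  children: [
    { key: '1/0/0', children: [leaf('2/0/0'), leaf('2/0/1')] },
    leaf('1/0/1'),
  ],
};

// Passing the identity-like callback lists every tile, as the doc comment suggests.
const topDown = mapTiles(rootTile, (t) => t.key);
const bottomUp = mapTiles(rootTile, (t) => t.key, true);
```

An explicit stack keeps the walk safe on deep trees and makes the bottom-up order a simple reversal of the top-down one.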

1 change: 1 addition & 0 deletions src/deepscatter.ts
@@ -3,3 +3,4 @@ export { Bitmask, DataSelection } from './selection';
export { Deeptable } from './Deeptable';
export { LabelMaker } from './label_rendering';
export { dictionaryFromArrays } from './utilityFunctions';
export { Tile } from './tile';
20 changes: 19 additions & 1 deletion src/scatterplot.ts
@@ -189,6 +189,21 @@ export class Scatterplot {
this.bound = true;
}

/**
* Create a data selection. For back-compatibility,
* this wraps the select_data object on a deeptable;
* it's recommended to use the deeptable directly.
*
* @deprecated
*
* @param params argument passed to deeptable.select_data.
* @returns
*/
async select_data(
...params: Parameters<Deeptable['select_data']>
): Promise<DataSelection> {
return this.deeptable.select_data(...params);
}
/**
* Creates a new selection from a set of parameters, and immediately applies it to the plot.
* @param params A set of parameters defining a selection.
@@ -807,9 +822,12 @@ export class Scatterplot {
zoom.restart_timer(60_000);
}

get dataset() {
return this.deeptable;
}
get root_batch() {
if (!this._root) {
throw new Error('No dataset has been loaded');
throw new Error('No deeptable has been loaded');
}
return this.deeptable.root_tile.record_batch;
}
2 changes: 1 addition & 1 deletion tests/dataset.spec.js
@@ -1,4 +1,4 @@
import { Dataset, DataSelection, Bitmask } from '../dist/deepscatter.js';
import { Deeptable, DataSelection, Bitmask } from '../dist/deepscatter.js';
import { Table, vectorFromArray, Utf8 } from 'apache-arrow';
import { test } from 'uvu';
import * as assert from 'uvu/assert';
4 changes: 2 additions & 2 deletions tests/datasetHelpers.js
@@ -1,5 +1,5 @@
import { Table, vectorFromArray, Utf8 } from 'apache-arrow';
import { Dataset, Bitmask } from '../dist/deepscatter.js';
import { Deeptable, Bitmask } from '../dist/deepscatter.js';

// Creates a tile transformation for factors of n.
export function selectFunctionForFactorsOf(n) {
@@ -68,7 +68,7 @@ function createTable(n_batches) {
export function createIntegerDataset() {
const num_batches = 4;
const table = createTable(num_batches);
return Dataset.fromArrowTable(table);
return Deeptable.fromArrowTable(table);
}

const memo = {};