
Backends

Jordan Matelsky edited this page Oct 11, 2020 · 4 revisions

At this time, all backends support all features. You do not need to interact directly with a backend beyond specifying one for your Grand Graph objects:

import grand
from grand.backends import SQLBackend

G = grand.Graph(backend=SQLBackend())

There are currently three supported backends:

NetworkXBackend

If you're aiming for performance, you can most likely ignore this one; this is a backend that simply passes all operations through to a NetworkX graph.

Best For:

This is most relevant for performance benchmarking and feature-consistency checks.

Warnings:

None
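The pass-through design is easy to picture: every graph operation is forwarded, unchanged, to an underlying graph object. Below is a toy sketch of that pattern; the dict-of-sets stand-in is purely illustrative and is not Grand's actual NetworkXBackend:

```python
class PassthroughBackend:
    """Toy backend: every operation is delegated to an in-memory
    dict-of-sets adjacency structure, the way NetworkXBackend
    delegates every operation to a networkx graph."""

    def __init__(self):
        self._adj = {}

    def add_node(self, node):
        self._adj.setdefault(node, set())

    def add_edge(self, u, v):
        # Ensure both endpoints exist, then record the edge.
        self.add_node(u)
        self.add_node(v)
        self._adj[u].add(v)

    def neighbors(self, node):
        return sorted(self._adj[node])


backend = PassthroughBackend()
backend.add_edge("A", "B")
backend.add_edge("A", "C")
```

Because nothing is translated or cached, such a backend adds no capability on its own, which is why it is mainly useful as a correctness and timing baseline.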

SQLBackend

This backend relays operations to a SQL database. Interestingly, it is faster to ingest data into the SQLBackend than into the NetworkXBackend, so for large data ingests from an edgelist, it may be advantageous to use a SQLBackend instead of vanilla NetworkX, even if you don't care about other Grand features.
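One reason bulk edgelist ingest into SQL can be fast is that many rows go in as a single batched statement. Here is that pattern in raw stdlib sqlite3, independent of Grand; the `edges` table name and two-column schema are illustrative, not Grand's actual schema:

```python
import sqlite3

edgelist = [("A", "B"), ("B", "C"), ("C", "A")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (source TEXT, target TEXT)")

# One batched statement instead of one round-trip per edge:
conn.executemany("INSERT INTO edges VALUES (?, ?)", edgelist)
conn.commit()
```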

Best For:

Quick ingests of data and fast operations on the structure of a graph out-of-memory.

Warnings:

Data IO is slower and highly dependent upon where the SQL database lives. If you're using a file on disk (sqlite):

import grand
from grand.backends import SQLBackend

G = grand.Graph(backend=SQLBackend("sqlite:///my-file.db"))

...you may find that operations are slower than with a true SQL database service, or with an in-memory SQLite database (indicated by passing no string to the SQLBackend constructor).
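The distinction the warning draws (a file on disk vs. an in-memory database) exists in SQLite itself, independent of Grand. A minimal stdlib sqlite3 sketch, with an illustrative table name and file path:

```python
import os
import sqlite3
import tempfile

# In-memory database: lives in RAM and disappears when the connection closes.
mem = sqlite3.connect(":memory:")

# On-disk database: persists between runs, but every commit pays for disk IO.
path = os.path.join(tempfile.mkdtemp(), "example.db")
disk = sqlite3.connect(path)

for conn in (mem, disk):
    conn.execute("CREATE TABLE edges (source TEXT, target TEXT)")
    conn.executemany("INSERT INTO edges VALUES (?, ?)", [("A", "B"), ("B", "C")])
    conn.commit()
```

A remote SQL service sits somewhere between the two: writes survive the process, but every operation also pays a network round-trip.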

DynamoBackend

This backend relays operations to a DynamoDB database. All metadata attributes are "promoted" to top-level attributes in the table, so DynamoDB scan and query operations work on any metadata attribute in your nodes or edges. This means that even on Very Large Graphs, attribute queries are still quite speedy.
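The "promotion" can be pictured with plain Python dicts: metadata keys sit at the top level of each item instead of being nested under a single attributes blob, so a scan can filter on them directly. The `__id` key and the metadata names below are hypothetical, not Grand's actual table schema:

```python
# Hypothetical node items as they might be stored, with metadata promoted
# to top-level attributes rather than nested under an "attributes" key:
nodes = [
    {"__id": "A", "type": "neuron", "volume": 120},
    {"__id": "B", "type": "glia", "volume": 80},
    {"__id": "C", "type": "neuron", "volume": 95},
]

# Because "type" is a top-level attribute, a scan-style filter
# needs no unpacking of a nested blob:
neurons = [n["__id"] for n in nodes if n.get("type") == "neuron"]
```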

Best For:

Extremely large graphs (>10s of GB to TB). Also perfect for compatibility with GrandIso-Cloud, which is arguably the fastest subgraph monomorphism library for graphs of this size.

Warnings:

All data IO is done in single atomic calls to the server, so adding a billion edges takes a long time. Fixes for this are currently under experimentation.
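One direction such a fix could take is client-side batching, since DynamoDB's batch_write_item accepts up to 25 items per request. A sketch of the chunking half only; this is not Grand's implementation:

```python
def batched(items, size=25):
    """Split items into chunks of at most `size`; 25 is DynamoDB's
    batch_write_item limit per request."""
    return [items[i:i + size] for i in range(0, len(items), size)]


edges = [("u%d" % i, "v%d" % i) for i in range(60)]
batches = batched(edges)  # 60 edges split into batches of 25, 25, and 10
```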
