Stellar developer docs for Galexie
urvisavla committed Nov 1, 2024
1 parent 753be56 commit af812bc
Showing 13 changed files with 280 additions and 2 deletions.
14 changes: 14 additions & 0 deletions config/sidebars.ts
Original file line number Diff line number Diff line change
@@ -26,6 +26,7 @@ const sidebars: SidebarsConfig = {
{ type: 'ref', id: 'data/rpc/README', label: 'Soroban RPC'},
{ type: 'ref', id: 'data/hubble/README', label: 'Hubble'},
{ type: 'ref', id: 'data/horizon/README', label: 'Horizon'},
{ type: 'ref', id: 'data/galexie/README', label: 'Galexie'},
],
tools: [
{
@@ -74,6 +75,19 @@ const sidebars: SidebarsConfig = {
collapsible: false,
},
],
galexie: [
{
type: 'category',
label: 'Galexie',
items: [
{
type: "autogenerated",
dirName: "data/galexie",
},
],
collapsible: false,
},
],
soroban_rpc: [
{
type: "category",
2 changes: 1 addition & 1 deletion docs/README.mdx
@@ -20,7 +20,7 @@ Information on how to issue assets on the Stellar network and create custom smar

### [Data](/docs/data/README.mdx)

Discover various data availability options: RPC, Hubble, and Horizon.
Discover various data availability options: RPC, Hubble, Horizon, and Galexie.

### [Tools](/docs/tools/README.mdx)

6 changes: 5 additions & 1 deletion docs/data/README.mdx
@@ -10,7 +10,7 @@ This section will walk you through the differences between the various platforms

- **[RPC](#rpc)** - live network gateway
- **[Horizon](#horizon)** - API for network state data
- **Galexie** - exports raw ledger metadata files
- **[Galexie](#galexie)** - exports raw ledger metadata files
- **[Hubble](#hubble)** - analytics database for network data

| Features | RPC | Horizon | Galexie | Hubble |
@@ -70,3 +70,7 @@ Horizon is an API for accessing and interacting with the Stellar network data. I
Horizon stores three types of data (current state, historical state, and derived state) in one database, and the data is available in real-time for transactional use, which makes Horizon more expensive and resource-intensive to operate. If you’re considering using Horizon over the RPC, let us know in the [Stellar Developer Discord](https://discord.gg/stellardev) or file an issue in the [RPC repo](https://github.com/stellar/soroban-rpc) and let us know why!

You can [run your own instance of Horizon](./horizon/admin-guide/README.mdx) or use one of the publicly available Horizon services from [these infrastructure providers](./horizon/horizon-providers.mdx).

## [Galexie](./galexie/README.mdx)

Galexie is a tool for exporting Stellar ledger metadata to Google Cloud Storage.
39 changes: 39 additions & 0 deletions docs/data/galexie/README.mdx
@@ -0,0 +1,39 @@
---
title: Galexie Introduction
sidebar_position: 0
---

## What is Galexie?

Galexie is a tool for extracting and processing Stellar ledger metadata and exporting it to external storage, creating a data lake of pre-processed ledger metadata. Galexie is the foundation of the Composable Data Platform (CDP) and serves as the first step in extracting raw Stellar ledger metadata and making it accessible. Learn more about CDP’s benefits and applications in this [blog post](https://stellar.org/blog/developers/composable-data-platform).

## What Are the Key Features of Galexie?

Galexie is designed to make exporting ledger metadata streamlined and efficient through a simple, user-friendly interface. Its key features include:

- Exporting Stellar ledger metadata to cloud storage.
- Exporting either a specified range of ledgers or a continuous stream of new ledgers as they are created on the Stellar network.
- Exporting ledger metadata in XDR, which is Stellar Core’s native format.
- Compressing data before export to optimize storage efficiency in the data lake.

**Galexie Architecture**

![](/assets/galexie-architecture.png)

## Why XDR Format?

Exporting data in XDR—the native Stellar Core format—enables Galexie to preserve full transaction metadata, ensuring data integrity while keeping storage efficient. The XDR format maintains compatibility with all Stellar components, providing a solid foundation for applications that require consistent access to historical data. Refer to the [XDR](/docs/learn/encyclopedia/data-format/xdr) documentation for more information on this format.

## Why Run Galexie?

Galexie enables you to make a copy of Stellar ledger metadata over which you have complete control. Galexie can continuously sync your data lake with the latest ledger data, freeing you from tedious data ingestion and allowing you to focus on building customized applications that consume and analyze the exported data.

## What Can You Do with the Data Lake Created by Galexie?

Once data is stored in the cloud, it becomes easily accessible for integration with modern data processing and analytics tools, enabling various workflows and insights.

The pre-processed ledger data exported by Galexie can be utilized across various applications, such as:

- Analytics Tools: Analyze trends over time.
- Audit Applications: Retrieve historical transaction data for auditing and compliance.
- Monitoring Systems: Create tools to track network metrics.
6 changes: 6 additions & 0 deletions docs/data/galexie/admin_guide/README.mdx
@@ -0,0 +1,6 @@
---
title: Admin Guide
sidebar_position: 15
---

This guide provides step-by-step instructions on installing and running Galexie.
45 changes: 45 additions & 0 deletions docs/data/galexie/admin_guide/configuring.mdx
@@ -0,0 +1,45 @@
---
title: Configuring
sidebar_position: 20
---

# Configuring

## Steps to Configure Galexie

1. **Copy the Sample Configuration**
Start with the provided sample file, [`config.example.toml`](https://github.com/stellar/go/blob/master/services/galexie/config.example.toml).

2. **Rename and Update**
Rename the file to `config.toml` and adjust settings as needed.


**Key Settings Include:**

- **Google Cloud Storage (GCS) Bucket**

Specify the GCS bucket where Galexie will export Stellar ledger data. Update `destination_bucket_path` to the complete path of your GCS bucket, including subpaths if applicable.

```toml
destination_bucket_path = "stellar-network-data/testnet"
```

- **Stellar Network**

Set the Stellar network to be used in creating the data lake.

```toml
network = "testnet"
```

- **Data Organization (Optional)**

  Configure how the exported data is organized in the GCS bucket. The example below stores 64 ledgers in each file and groups the files into directories (partitions) of 1000 files each.

```toml
# Number of ledgers stored in each file
ledgers_per_file = 64

# Number of files per partition/directory
files_per_partition = 1000
```
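
To make the arithmetic behind these two settings concrete, the sketch below computes which file range and partition range a given ledger sequence number would fall into. It assumes that files and partitions are aligned to multiples of their size counting from ledger 0, which is an illustrative assumption and not a statement about how Galexie names its objects. `ledger_location` is a hypothetical helper, not part of Galexie.

```shell
# ledger_location SEQ LEDGERS_PER_FILE FILES_PER_PARTITION
# Hypothetical helper: prints the ledger range of the file and of the
# partition that would contain SEQ, assuming ranges are aligned to
# multiples of their size starting at ledger 0 (an assumption).
ledger_location() (
  seq=$1
  ledgers_per_file=$2
  files_per_partition=$3

  # A partition holds files_per_partition files of ledgers_per_file ledgers.
  partition_size=$((ledgers_per_file * files_per_partition))

  # Integer division rounds down to the start of the enclosing range.
  file_start=$((seq / ledgers_per_file * ledgers_per_file))
  file_end=$((file_start + ledgers_per_file - 1))
  partition_start=$((seq / partition_size * partition_size))
  partition_end=$((partition_start + partition_size - 1))

  echo "file ${file_start}-${file_end} partition ${partition_start}-${partition_end}"
)
```

With the values from the example configuration, `ledger_location 350000 64 1000` prints `file 349952-350015 partition 320000-383999`: each partition spans 64 × 1000 = 64000 ledgers.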
12 changes: 12 additions & 0 deletions docs/data/galexie/admin_guide/installing.mdx
@@ -0,0 +1,12 @@
---
title: Installing
sidebar_position: 30
---

# Installing

To install Galexie, retrieve the Docker image from the [Stellar Docker Hub registry](https://hub.docker.com/r/stellar/stellar-galexie) using the following command:

```shell
docker pull stellar/stellar-galexie
```
6 changes: 6 additions & 0 deletions docs/data/galexie/admin_guide/monitoring.mdx
@@ -0,0 +1,6 @@
---
title: Monitoring
sidebar_position: 50
---

# Monitoring
23 changes: 23 additions & 0 deletions docs/data/galexie/admin_guide/prerequisites.mdx
@@ -0,0 +1,23 @@
---
title: Prerequisites
sidebar_position: 10
---

# Prerequisites

### 1. Google Cloud Platform (GCP) Account

Galexie exports Stellar ledger metadata to Google Cloud Storage (GCS), so you need a GCP account with:

- Permissions to create a new GCS bucket, or
- Access to an existing bucket with read/write permissions.

### 2. Docker (Recommended)

Galexie is available as a Docker image, which simplifies installation and setup. Ensure you have Docker Engine installed on your system ([Docker installation guide](https://docs.docker.com/engine/install/)).

> **_NOTE:_** While it is possible to natively install Galexie (without Docker), this requires manual dependency management and is recommended only for advanced users.


## Hardware Requirements

The minimum hardware requirements for running Galexie are:
102 changes: 102 additions & 0 deletions docs/data/galexie/admin_guide/running.mdx
@@ -0,0 +1,102 @@
---
title: Running
sidebar_position: 40
---

# Running

With the Docker image available and the configuration file set up, you're now ready to run Galexie and start exporting Stellar ledger data to the GCS bucket.

## Command Line Usage

The primary way of running Galexie is using the `append` command.

### Append

Using the `append` command, Galexie can either continuously monitor the network for new ledgers and export them, or export a fixed ledger range and stop once the range has been exported.

Syntax:

```shell
stellar-galexie append --start <start_ledger> [--end <end_ledger>] [--config-file <config_file>]
```

Arguments:

`--start <start_ledger>` **(required)**

- The starting ledger sequence number of the range being exported.

`--end <end_ledger>` **(optional)**

- The ending ledger sequence number of the range being exported. If unspecified or set to 0, the exporter will continuously export new ledgers as they appear on the network.

`--config-file <config_file>` **(optional)**

- The path to the configuration file. If unspecified, the application will look for a file named `config.toml` in the current directory.

Example usage:

```shell
docker run --platform linux/amd64 -d \
-v "$HOME/.config/gcloud/application_default_credentials.json":/.config/gcp/credentials.json:ro \
-e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json \
-v "${PWD}/config.toml":/config.toml \
stellar/stellar-galexie \
append --start 350000 --end 450000 --config-file config.toml
```

`--platform linux/amd64`

- Specifies the platform architecture (adjust if needed for your system).

`-v` Mounts volumes to map your local GCP credentials and config.toml file to the container:

- `$HOME/.config/gcloud/application_default_credentials.json`: Your local GCP credentials file.
- `${PWD}/config.toml`: Your local configuration file.

`-e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json`

- Sets the environment variable for credentials within the container.

`stellar/stellar-galexie`

- The Docker image name.

### Resumability

The `append` command includes built-in resumability, allowing exports to continue seamlessly after an interruption. If Galexie is stopped mid-export, it will scan for the first missing ledger after the specified starting ledger upon restart. Exporting will then resume from that missing ledger, with no manual adjustment needed. To utilize resumability, simply restart Galexie with the same starting ledger, and it will pick up right where it left off.

### Scan-and-fill

While the `append` command is efficient, it may miss data gaps if there are multiple non-sequential gaps in the range. For more thorough verification, the `scan-and-fill` command provides a slower but comprehensive alternative, scanning a specified ledger range to locate and fill any gaps, ensuring data completeness. Due to its slower execution, `scan-and-fill` should be used sparingly and only when data gaps are suspected.

Syntax:

```shell
stellar-galexie scan-and-fill --start <start_ledger> --end <end_ledger> [--config-file <config_file>]
```

Arguments:

`--start <start_ledger>` **(required)**

- The starting ledger sequence number of the range being exported.

`--end <end_ledger>` **(required)**

- The ending ledger sequence number of the range being exported.

`--config-file <config_file>` **(optional)**

- The path to the configuration file. If unspecified, the exporter will look for a file named `config.toml` in the current directory.

Example usage:

```shell
docker run --platform linux/amd64 -d \
-v "$HOME/.config/gcloud/application_default_credentials.json":/.config/gcp/credentials.json:ro \
-e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json \
-v "${PWD}/config.toml":/config.toml \
stellar/stellar-galexie \
scan-and-fill --start 64000 --end 68000 --config-file config.toml
```
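
The idea of scanning a ledger range for gaps can be sketched as follows. This is an illustration only and not Galexie's implementation: `find_gaps` is a hypothetical helper, it works on individual ledger sequence numbers passed as arguments rather than on exported files in a bucket, and it only reports gaps without filling them.

```shell
# find_gaps START END SEQ...
# Hypothetical helper: prints every run of ledger sequence numbers in
# [START, END] that is absent from the SEQ list, one "first-last" per line.
find_gaps() (
  start=$1
  end=$2
  shift 2
  present=" $* "   # space-delimited list used for membership checks
  gap_start=""
  seq=$start

  while [ "$seq" -le "$end" ]; do
    case $present in
      *" $seq "*)
        # Ledger is present: close the gap that was open, if any.
        if [ -n "$gap_start" ]; then
          echo "${gap_start}-$((seq - 1))"
          gap_start=""
        fi
        ;;
      *)
        # Ledger is missing: open a gap unless one is already open.
        if [ -z "$gap_start" ]; then
          gap_start=$seq
        fi
        ;;
    esac
    seq=$((seq + 1))
  done

  # Close a gap that runs to the end of the range.
  if [ -n "$gap_start" ]; then
    echo "${gap_start}-${end}"
  fi
)
```

For example, `find_gaps 1 10 1 2 5 6 10` prints `3-4` and `7-9` on separate lines. Because the whole range is visited, every gap is reported, which mirrors why a full scan is slower but more thorough.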
22 changes: 22 additions & 0 deletions docs/data/galexie/admin_guide/setup.mdx
@@ -0,0 +1,22 @@
---
title: Setup
sidebar_position: 10
---

# Setup

### Google Cloud Platform (GCP) credentials

Create application default credentials for your GCP project using your user account by following these steps:

1. Download the [SDK](https://cloud.google.com/sdk/docs/install).
2. Install and initialize the [gcloud CLI](https://cloud.google.com/sdk/docs/initializing).
3. Create [application default credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc#google-idp). They should be stored automatically at this location: `$HOME/.config/gcloud/application_default_credentials.json`.
4. Verify that this file exists before moving on to the next step.
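
The verification in step 4 can be scripted. The sketch below is a minimal example, assuming the default credentials path given in step 3; `check_adc` and the `ADC_FILE` variable are hypothetical names introduced here for illustration.

```shell
# Default location of the application default credentials file (from step 3).
ADC_FILE="${ADC_FILE:-$HOME/.config/gcloud/application_default_credentials.json}"

# check_adc PATH
# Hypothetical helper: succeeds if PATH exists and is non-empty,
# otherwise prints a message to stderr and returns 1.
check_adc() {
  if [ -s "$1" ]; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Typical use, after creating the credentials with the gcloud CLI:
#   gcloud auth application-default login
#   check_adc "$ADC_FILE"
```

The check uses `-s` rather than `-e` so that an empty file left behind by an interrupted login is also reported as missing.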

### Google Cloud Storage (GCS) bucket

If you already have a GCS bucket with read and write permissions, you can skip this section. If not, follow these steps:

1. Visit the GCP Console's Storage section (https://console.cloud.google.com/storage) and create a new bucket.
2. Choose a descriptive name for the bucket, such as `stellar-ledger-data`. Refer to [Google Cloud Storage Bucket Naming Guideline](https://cloud.google.com/storage/docs/buckets#naming) for bucket naming conventions. Note the bucket name; you will need it later during the configuration process.
5 changes: 5 additions & 0 deletions docusaurus.config.ts
@@ -164,6 +164,11 @@ const config: Config = {
docId: "data/horizon/README",
label: "Horizon",
},
{
type: 'doc',
docId: "data/galexie/README",
label: "Galexie",
},

]
},
Binary file added static/assets/galexie-architecture.png