v0.5.0-rc1
Pre-release

This release introduces multiple changes to the way the node operates.
TL;DR
- A new database: we switch from MongoDB to PostgreSQL.
- A new implementation of the processing pipeline: the message pipeline is now split into two parts to fix race conditions and optimize overall throughput.
- Materialized aggregates: aggregates are now faster to query through the API.
- New endpoints: we make it easier to post new Aleph messages and to determine whether your messages were processed or rejected.
- Major dependency updates: CCNs now run on Python 3.11.
Switch to PostgreSQL
One of the main features of this release is the switch from MongoDB to PostgreSQL. This switch is motivated by the development of new features for which we feel a relational database is more appropriate.
Each type of message is now associated with one or more DB tables that store the actual objects mentioned in Aleph messages. API endpoints and internal operations can now directly access these object tables instead of having to search through messages.
Additionally, we now use a DB migration system that guarantees the consistency of the data across updates.
As we dropped MongoDB, files are now stored on the local file system in a dedicated volume.
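For a sense of what this enables, here is a hypothetical query against such an object table; the table and column names are illustrative, not the actual pyaleph schema:

```python
# Hypothetical example: with per-object tables, posts can be queried directly
# instead of scanning raw messages. Table and column names are illustrative,
# not the actual pyaleph schema.
import psycopg2

conn = psycopg2.connect("postgresql://aleph:<password>@postgres/aleph")
with conn, conn.cursor() as cur:
    cur.execute(
        "SELECT item_hash, owner, created "
        "FROM posts WHERE owner = %s "
        "ORDER BY created DESC LIMIT 10",
        ("<wallet-address>",),
    )
    for row in cur.fetchall():
        print(row)
```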
New message pipeline
Fetcher and processor
The new message pipeline addresses two issues: determinism and observability. We now use two separate processes (see the sketch below):
- the **fetcher** performs network accesses for messages that require additional downloads. It ensures that all the data required to process a message is available on the node before any further processing. It uses asyncio tasks to fetch data for multiple messages in parallel.
- the **message processor** is in charge of checking the integrity and permissions of messages. It processes messages atomically, guaranteeing the absence of race conditions.
This new architecture allows messages to be processed as soon as they are fetched. As most messages are immediately ready for processing, this maximizes the throughput of the message pipeline.
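A minimal sketch of this split, assuming a simple in-memory queue between the two stages; this only illustrates the control flow, not the actual pyaleph implementation:

```python
# Illustrative fetch/process split: fetches run concurrently, while a single
# processor consumes fetched messages one at a time (atomic processing).
import asyncio

async def fetcher(message: dict, ready: asyncio.Queue) -> None:
    # Download any external content the message references (network-bound).
    await asyncio.sleep(0.1)  # placeholder for the actual download
    message["content"] = "<fetched data>"
    await ready.put(message)

async def processor(ready: asyncio.Queue) -> None:
    while True:
        message = await ready.get()
        # Check signature, permissions and integrity, then write to the DB.
        print("processed", message["item_hash"])
        ready.task_done()

async def main() -> None:
    ready: asyncio.Queue = asyncio.Queue()
    messages = [{"item_hash": f"hash-{i}"} for i in range(5)]
    worker = asyncio.create_task(processor(ready))
    # Fetches run in parallel; processing starts as soon as a fetch completes.
    await asyncio.gather(*(fetcher(m, ready) for m in messages))
    await ready.join()
    worker.cancel()

asyncio.run(main())
```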
Errors and error codes
The error checking mechanism of the message pipeline was completely rewritten. Each error is now specified as its own exception type and is made visible to the user as an error code. Using the new `GET /api/v0/messages/{item_hash}` endpoint, users can now determine if and why their message was rejected by a node.
Additionally, we now use exponential retry delays to reduce the total number of retries and the CPU/network load that comes with them. Messages are now retried up to 10 times within a span of around 20 minutes.
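As an illustration, an exponential schedule with 10 attempts fits in roughly that window; the base delay and growth factor below are assumptions, not the node's exact values:

```python
# Illustrative exponential retry schedule. BASE_DELAY and FACTOR are
# assumptions, not the exact values used by the node.
BASE_DELAY = 2.0  # seconds
FACTOR = 1.85
MAX_RETRIES = 10

delays = [BASE_DELAY * FACTOR**i for i in range(MAX_RETRIES)]
print([round(d) for d in delays])
print(f"total: {sum(delays) / 60:.1f} minutes")  # ~18 minutes
```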
Materialized aggregates
Aggregates are now re-calculated as soon as a new aggregate message is processed. This improves the performance when querying large aggregates.
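Reading a large aggregate is therefore a single lookup instead of a fold over its message history. For example, assuming the usual aggregates endpoint:

```python
# Read a materialized aggregate over the REST API. The endpoint path follows
# the /api/v0 conventions; verify it against your node's API documentation.
import requests

address = "<wallet-address>"
resp = requests.get(
    f"https://official.aleph.cloud/api/v0/aggregates/{address}.json",
    params={"keys": "profile"},  # optionally restrict the response to one key
)
resp.raise_for_status()
print(resp.json())
```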
API updates
New endpoints
- `POST /api/v0/messages`: allows users to post a new message and then track its progress through the processing pipeline. This endpoint supports a synchronous mode where the response is only sent once the node processes the message or a timeout occurs (see the example after this list).
- `GET /api/v0/messages/{item_hash}`: allows users to track the status of individual messages. The `status` field allows users to determine if their message is processed, rejected, pending or forgotten.
- `GET /api/v0/addresses/{address}/balance`: returns the balance in Aleph of a wallet address.
- `GET /api/v0/addresses/{address}/files`: returns the list of files stored by the user, along with the total number of files they store on Aleph and the total space used.
- `GET /api/v1/posts.json`: a new implementation of the /posts/ endpoint. This new implementation removes message-specific fields and focuses on the post content and metadata. `/api/v0/posts.json` is now deprecated.
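For example, a client could combine the first two endpoints to post a message and wait for a final status. The request body shape below is an assumption based on the descriptions above; check it against your node's API documentation:

```python
# Post a signed Aleph message, then poll its status until it is final.
# The body shape {"message": ..., "sync": ...} is an assumption.
import time

import requests

NODE = "https://official.aleph.cloud"

def post_and_wait(message: dict, timeout: float = 60) -> str:
    # Synchronous mode: the node only answers once the message is processed
    # or its own timeout is reached.
    resp = requests.post(f"{NODE}/api/v0/messages",
                         json={"message": message, "sync": True})
    resp.raise_for_status()

    # Poll the per-message endpoint until a final status is reported.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = requests.get(
            f"{NODE}/api/v0/messages/{message['item_hash']}"
        ).json()["status"]
        if status in ("processed", "rejected"):
            return status
        time.sleep(5)
    return "pending"
```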
New features
- The messages websocket now allows `history=0`. It was reimplemented to use a RabbitMQ queue to read new messages directly from the message pipeline (see the example below).
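For example, with the third-party `websockets` package (the exact websocket path is an assumption; check your node's documentation):

```python
# Stream new messages live, skipping history (history=0).
import asyncio
import json

import websockets

async def watch() -> None:
    uri = "wss://official.aleph.cloud/api/ws0/messages?history=0"
    async with websockets.connect(uri) as ws:
        async for raw in ws:
            message = json.loads(raw)
            print(message.get("item_hash"), message.get("type"))

asyncio.run(watch())
```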
Breaking changes
- `GET /api/v0/messages`:
  - The endpoint only returns processed messages. Forgotten messages are now ignored.
  - The `size`, `content_type` and `engine_info` fields added by the node on STORE messages are not returned anymore. If you need this information, use the new `GET /api/v0/addresses/{address}/files` endpoint.
- `GET /api/v0/posts`: a lot of fields were dropped as they were redundant.
- `GET /api/v0/addresses/stats.json`: removed the `address` field. It was redundant with the key of the dictionary.
- Message specification (see the validation sketch after this list):
  - The `content` field of aggregate messages is now required to be a dictionary.
  - The `ref` field of program volumes is now required to be a message hash.
  - Dropped support for the NaN float value and the `\u0000` character in aggregates and posts.
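Client code can catch the message specification changes before posting. A minimal validation sketch for aggregate content, illustrative rather than the node's actual checks:

```python
# Pre-flight checks matching the new message specification (illustrative).
import json

def contains_nul(obj) -> bool:
    # Recursively look for the now-unsupported \u0000 character in strings.
    if isinstance(obj, str):
        return "\u0000" in obj
    if isinstance(obj, dict):
        return any(contains_nul(k) or contains_nul(v) for k, v in obj.items())
    if isinstance(obj, list):
        return any(contains_nul(x) for x in obj)
    return False

def validate_aggregate_content(content) -> None:
    if not isinstance(content, dict):
        raise ValueError("aggregate content must be a dictionary")
    if contains_nul(content):
        raise ValueError("the \\u0000 character is not supported")
    # allow_nan=False makes json.dumps raise ValueError on NaN/Infinity.
    json.dumps(content, allow_nan=False)

validate_aggregate_content({"key": "value"})       # OK
validate_aggregate_content({"bad": float("nan")})  # raises ValueError
```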
Upgrade guide
Prerequisites
Make sure that your node is up-to-date with the latest release. Specifically, you must ensure that your private key is in the format introduced in the 0.4.x releases. You can find the full upgrade guide here. You can also skip this update and convert your private key file to the right format using `openssl` in your `keys` directory:
cd ./keys
openssl pkcs8 -topk8 -inform PEM -outform DER -in node-secret.key -out node-secret.pkcs8.der -nocrypt
Stop the node
This release requires a full re-sync of your node. While you wait for your node to resynchronize, use any of our official nodes to access data: `official.aleph.cloud`.
The full resync is the simplest option and will work for all node operators who do not require their node to be up at the time.
The following instructions assume that you use one of our official Docker Compose files.
First, switch off your node:
docker-compose down
Now, retire your old Docker Compose file and download the new one.
mv docker-compose.yml docker-compose-old.yml
wget "https://raw.githubusercontent.com/aleph-im/pyaleph/v0.5.0-rc1/deployment/samples/docker-compose/docker-compose.yml"
The new Docker Compose file comes with a default password for PostgreSQL. Generate a new password and specify it in your `docker-compose.yml` and `config.yml` files.
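Any sufficiently random string works; for example:

```
python3 -c "import secrets; print(secrets.token_hex(32))"
```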
Update `docker-compose.yml`:
services:
  postgres:
    environment:
      POSTGRES_PASSWORD: "<new-password>"
Add to `config.yml`:
postgres:
  host: "postgres"
  password: "<new-password>"
Do not forget to keep other passwords from the previous Docker Compose file, like the one you generated for RabbitMQ.
You can now restart your node:
docker-compose up -d
The sync process takes around a full day.
Cleanup
Once you are confident that you will not need to roll back the release, you can delete the MongoDB volume:
docker volume rm <docker-compose-directory>_pyaleph-mongodb
Full Changelog: v0.4.7...v0.5.0-rc1