awslabs/real-time-vectorization-of-streaming-data

Real-time Vector Embedding Blueprint

Real-time Vector Embedding Blueprint is an Amazon Managed Service for Apache Flink (MSF) blueprint that deploys an MSF application and the supporting infrastructure needed to vectorize incoming stream data and persist the resulting vectors in a vector database. The MSF application consumes messages from an Amazon MSK cluster, creates embeddings of those messages with a supported Amazon Bedrock model, and stores the embeddings in an Amazon OpenSearch domain or collection.
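
The consume, embed, and store flow described above can be pictured with a small Python sketch. This is only an illustration of the data flow, not the blueprint's actual Flink code: `toy_embed`, `InMemoryVectorStore`, and `vectorize_stream` are hypothetical names, the embedding function is a hash-based stand-in for a Bedrock model, the list of byte strings stands in for records read from an MSK topic, and the in-memory store stands in for an OpenSearch index.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


def toy_embed(text: str, dim: int = 8) -> List[float]:
    """Hypothetical stand-in for a Bedrock embedding model.

    Maps text to a deterministic vector of `dim` floats in [0, 1].
    It carries no semantic meaning; it only mimics the shape of the output.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


@dataclass
class InMemoryVectorStore:
    """Hypothetical stand-in for an OpenSearch index; keeps documents in a list."""

    docs: List[dict] = field(default_factory=list)

    def index(self, doc: dict) -> None:
        self.docs.append(doc)


def vectorize_stream(
    messages: Iterable[bytes],
    embed: Callable[[str], List[float]],
    store: InMemoryVectorStore,
) -> int:
    """Decode each message, embed it, and persist text plus vector.

    Returns the number of messages processed.
    """
    count = 0
    for raw in messages:
        text = raw.decode("utf-8")
        store.index({"text": text, "embedding": embed(text)})
        count += 1
    return count


# Assumed sample input standing in for records consumed from an MSK topic.
messages = [b"order 42 shipped", b"sensor 7 temperature high"]
store = InMemoryVectorStore()
processed = vectorize_stream(messages, toy_embed, store)
```

Passing the embedding function and the store as arguments keeps the three stages separable, so each stand-in could in principle be swapped for a real client without changing the loop.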

Get started with Real-time Vector Embedding

Installation

Follow the installation instructions here to install the shared libraries and begin developing.

Deploying

Follow the steps here to build, deploy, and run the application.