Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic.

Streaming Simulator

This streaming simulator uses Eventsim as the data simulator and Fluentd as the data collector, sending a small stream of data to Minio.

The data is generated inside a while loop, creating a continuous flow of logs in the /opt/eventsim/output folder. The Fluentd agent tails the logs from this folder and ships them as batches of JSON files to S3 (or an S3-compatible storage such as Minio). A sample of these files is provided in 2023061719_20.json.
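The collection side of this pipeline is wired up in fluent.conf. A minimal sketch of the tail source is shown below; the exact path pattern, pos_file location, and tag are assumptions and may differ from the actual configuration:

<source>
  @type tail
  path /opt/eventsim/output/*              # folder written by the Eventsim loop (pattern assumed)
  pos_file /var/log/fluentd/eventsim.pos   # remembers the read position across restarts (path assumed)
  tag listen_events                        # tag assumed; must match the <match> section
  <parse>
    @type json                             # each log line is a JSON record
  </parse>
</source>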

Each record contains the following information:

{
  "artist": "Daryl Hall & John Oates",
  "song": "Don't Hold Back Your Love",
  "duration": 313.8869,
  "ts": 1661129304330,
  "sessionId": 20857,
  "auth": "Logged In",
  "level": "paid",
  "itemInSession": 20,
  "city": "Memphis",
  "zip": "38135",
  "state": "TN",
  "userAgent": "Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko",
  "lon": -89.848559,
  "lat": 35.238925,
  "userId": 212,
  "lastName": "Morgan",
  "firstName": "Tenli",
  "gender": "F",
  "registration": 1655316754330
}

The maximum size of each JSON file is 3 MB; this can be modified in fluent.conf.
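In fluent-plugin-s3 this limit is enforced by the buffer's chunk_limit_size. The sketch below shows roughly where it lives; the bucket, endpoint, and path come from this setup, while the tag, store_as, and timekey values are assumptions:

<match listen_events>
  @type s3
  aws_key_id minio_ak                      # Minio access key
  aws_sec_key minio_sk                     # Minio secret key
  s3_bucket de-streaming-test
  s3_endpoint http://minio:9000/
  path Raw/listen_events/
  store_as json                            # upload plain JSON instead of the default gzip (assumed)
  <buffer time>
    timekey 3600                           # assumed hourly time slices (matches 2023061719 in the sample file name)
    chunk_limit_size 3m                    # maximum size of each uploaded file
  </buffer>
</match>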

Getting Started

Prerequisites

  • Docker
  • Docker Compose plugin

Usage

This streaming simulator runs two services with Docker Compose:

  1. Minio: an open-source, S3-compatible object storage. This service lets us emulate S3 behavior locally, and it can be replaced with real S3 on AWS or another object storage.
  2. Eventsim: the service in charge of generating the simulated events and forwarding them to Minio as a streaming sink.

To start the streaming simulator, just run:

docker compose up

The entrypoint.sh script initializes the Minio bucket and starts sinking all the data into the bucket de-streaming-test under the path Raw/listen_events/. The credentials to access the bucket are defined in minio.env, and the bucket can be browsed through the Minio UI at http://localhost:9001 with the same credentials.

Advanced Configuration

Kafka

Kafka can be configured as a sink by adding the corresponding output plugin configuration to fluent.conf.
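A minimal sketch, assuming the fluent-plugin-kafka output plugin is installed in the Fluentd image and a broker is reachable at kafka:9092 (both are assumptions, as are the tag, topic name, and flush interval):

<match listen_events>
  @type kafka2
  brokers kafka:9092                       # assumed broker address
  default_topic listen_events              # assumed topic name
  <format>
    @type json                             # send each record as JSON
  </format>
  <buffer topic>
    flush_interval 10s                     # assumed flush interval
  </buffer>
</match>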

S3

If the preferred sink is real S3 instead of the compatible solution (Minio), the following parameters in fluent.conf need to be updated with the proper credentials and bucket, and the s3_endpoint parameter must be removed.

  aws_key_id minio_ak         # The access key for Minio
  aws_sec_key minio_sk        # The secret key for Minio
  s3_bucket de-streaming-test         # The bucket to store the log data
  s3_endpoint http://minio:9000/          # The endpoint URL (like "http://localhost:9000/")
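
For real S3, the same block could look roughly like the sketch below; the credentials, bucket, and region are placeholders, and the s3_endpoint line is dropped entirely:

  aws_key_id YOUR_AWS_ACCESS_KEY        # real AWS credentials instead of the Minio ones
  aws_sec_key YOUR_AWS_SECRET_KEY
  s3_bucket your-production-bucket      # target bucket in AWS (placeholder name)
  s3_region us-east-1                   # bucket region (placeholder); no s3_endpoint parameter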

Other sinks

Refer to the Fluentd documentation to configure additional sinks.
