Releases · data-mill-cloud/data-mill · GitHub

This repository has been archived by the owner on Oct 21, 2020. It is now read-only.

25 Apr 15:13

pilillo

Third release (Latest)

Master status at 2019-04-25
A few new components: Druid, Superset, NiFi
Bug fixes

23 Feb 16:14

pilillo

Second release

Added installation script to ease initial setup
Made the run script callable from anywhere (/usr/local/bin)
YAML configuration for components and Kubernetes can be centralised (all-in-one flavour file) or distributed (per component)
Shortened the preamble (loading of component variables and configs) with new utility functions

12 Feb 13:43

pilillo

Second release (Pre-release)

Added components for:

Cassandra
Flink
Elasticsearch
Kibana
Grafana
Argo
MetalLB
Ambassador
Traefik

12 Feb 13:38

pilillo

First release

Kubernetes setup
- local using minikube, as well as microk8s and multipass+microk8s
- remote, using sample kops scripts for setup on AWS and GKE
Setup of common components
- Ingestion (e.g. Kafka, RabbitMQ)
- Persistent storage (e.g. S3, ArangoDB, InfluxDB)
- Data Versioning (e.g. Pachyderm)
- Processing (e.g. Dask, Spark)
- Exploration Environment (e.g. JupyterHub)
- BI Dashboarding (e.g. Superset)
- ML model versioning and benchmarking, as well as project management (e.g. MLflow)
- ML model serving (e.g. Seldon-core)
- Monitoring (e.g. Prometheus, Grafana)
Data Science environments
- Scientific Python Environment
- PySpark Environment
- Keras/TensorFlow Environment
- Keras/TensorFlow GPU Environment
Example code
- notebooks
