Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Updated Feb 20, 2025 - Python
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
An orchestration platform for the development, production, and observation of data assets.
Fancy stream processing made operationally mundane
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Zero-ETL, infinite possibilities. Live query APIs, code & more with SQL. No DB required.
The developer-first cloud governance platform
Flink CDC is a streaming data integration tool
Privacy- and security-focused Segment alternative, in Golang and React
Build data pipelines, the easy way 🛠️
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Open source data anonymization and synthetic data platform for developers. Anonymize your production data and sync it across your environments so that developers can safely use it.
Spreadsheet with AI, Code, Connections
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
A curated list with resources about node-based UIs
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
A fast, simple, and cost-effective tool to replicate data from Postgres to data warehouses, queues, and storage