LiliValGo/README.md

Hi, I'm Lili Valencia Gonzalez 💚👩🏻‍💻

I'm an Environmental Engineer turned Data Engineer with a strong foundation in building efficient and scalable data architectures. My expertise includes data manipulation, transformation, and integration using Python (pandas, NumPy) and SQL. I have extensive experience in designing and managing data architectures for both Online Transaction Processing (OLTP) systems, which handle real-time transaction data, and Online Analytical Processing (OLAP) systems, optimized for complex data analysis.

Additionally, I have developed and optimized ETL pipelines for diverse data sources, implementing best practices for data processing. My technical stack includes cloud data management with AWS services such as S3, Lambda, and RDS, where I design and manage data storage solutions to support analytics and reporting. I am passionate about creating impactful data solutions that drive decision-making and contribute to sustainable outcomes.

♻️ In my free time, I actively participate in technology communities, including serving as an organizer for a Google Developer Group (GDG). I also enjoy data-centric hackathons, where I draw on my environmental engineering background to extract meaningful insights from data.

🦀 I'm currently learning Rust

Some technologies I use

Python, Rust, Docker, Apache Kafka, Linux, Oracle, Microsoft SQL Server, Amazon AWS, PostgreSQL, Talend, VS Code, Visual Studio

Some of my projects

An ETL pipeline for analyzing environmental crime data in Colombia. It extracts data from a public API, transforms it with pandas, and loads it into PostgreSQL and AWS S3. It includes logging, error handling, and unit tests for reliability and scalability.

A simulation of a real-time Smart City data pipeline built with Kafka, Apache Spark, and S3. It streams and processes vehicle, GPS, weather, traffic, and emergency data, using Dockerized components and Parquet storage for efficient, scalable data engineering.

This Rust-based command-line app records, analyzes, and assesses risk levels from data entries and geolocation info. Using DuckDB for storage, it calculates and stores monthly risk statuses, leveraging asynchronous requests and JSON handling for comprehensive risk analysis.

This project combines web scraping, PDF processing, and Natural Language Processing (NLP) to extract and analyze IPCC climate reports. It automates downloading the PDFs, validates the downloaded files, and applies NLP to extract insights from the text.

Find me around the web

Pinned

  1. NLP-for-IPCC-Climate-Reports (Public)

    This project combines web scraping, PDF processing, and Natural Language Processing (NLP) to extract and analyze IPCC climate reports. It automates downloading PDFs, processes file validation, and …

    Jupyter Notebook

  2. Coal-Production-Colombia (Public)

    A data analysis of annual coal production, the royalties it generated, and climate variables, using descriptive and visual analysis techniques.

    Jupyter Notebook 1

  3. ML_Time_Series (Public)

    This project uses time series data to predict corn crop yield in Colombia.

    Jupyter Notebook