carloshgalvan95/README.md

Hi there, I'm Carlos Galván! 👋

Petroleum Engineer turned Data Scientist & AI Engineer

M.Sc. Applied Artificial Intelligence Candidate | Based in Villahermosa, Mexico

I am a specialized Data Scientist bridging the gap between Industrial Engineering and Advanced AI. Currently, I work at PEMEX, developing production-grade RAG systems and predictive models for reservoir engineering, while pursuing my Master's in Applied AI at Tecnológico de Monterrey.

Previously, I engineered pricing algorithms and data pipelines at KAVAK, contributing to high-impact inventory optimization and revenue recovery.


Technical Stack

Languages: Python, C++, JavaScript, SQL, VBA, Bash

Artificial Intelligence & GenAI: PyTorch, TensorFlow, Keras, Scikit-Learn, XGBoost, OpenCV, HuggingFace, LangChain, LlamaIndex, OpenAI, Gradio, Streamlit, RAG, Agentic AI, Fine-Tuning, LoRA

Vector Databases & Retrieval: Pinecone, ChromaDB, Weaviate, Qdrant

Data Engineering & Cloud: AWS, SageMaker, Glue, Lambda, Docker, Kubernetes, Apache Spark, Airflow, Kafka, dbt, Databricks

Backend & Web Frameworks: FastAPI, Flask, Node.js, Angular, gRPC, Bruno

Databases: PostgreSQL, MySQL, SQL Server, MongoDB

Data Science & Visualization: Pandas, NumPy, Matplotlib, Seaborn, Tableau, PowerBI


Professional Work Highlights

While much of my work is proprietary, here are some key systems I have engineered:

PEMEX (Current)

  • RAG System for Petroleum Reserves: Engineered a hybrid retrieval architecture (ChromaDB + BM25) to process technical engineering documents with semantic search, significantly improving information accessibility for reservoir engineers.
  • Water Breakthrough Prediction: Built a discrete-time survival model (Logistic Regression + 32 engineered features) to predict water breakthrough events, enabling proactive intervention for 3 critical wells.
  • ML Production Ops: Automated the classification of 140,000+ daily oil well movements with 96% accuracy using LinearSVC.
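
For readers unfamiliar with discrete-time survival models, here is a minimal sketch of the data reshaping that such a model typically relies on. It is not the production code: the function name, the well identifiers, and the period counts are hypothetical. Each well's history is expanded into one row per period at risk, with a binary target that is 1 only in the period where breakthrough occurs. A classifier such as logistic regression can then be fit on those rows.

```python
from typing import Dict, List


def expand_to_well_periods(
    well_id: str, last_period: int, event_observed: bool
) -> List[Dict[str, object]]:
    """Expand one well's history into one row per period at risk.

    The binary target is 1 only in the period where the event occurs.
    Censored wells (no event observed) contribute only zeros.
    """
    rows = []
    for period in range(1, last_period + 1):
        is_event_period = event_observed and period == last_period
        rows.append(
            {"well_id": well_id, "period": period, "event": int(is_event_period)}
        )
    return rows


# Hypothetical wells: W-1 reaches breakthrough in period 3,
# W-2 is still event-free (censored) after period 2.
rows = expand_to_well_periods("W-1", 3, True) + expand_to_well_periods("W-2", 2, False)
events = [row["event"] for row in rows]
```

In a real pipeline, each row would also carry the engineered features for that well and period, and the fitted probabilities would be read as per-period hazards.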

KAVAK

  • Inventory Optimization: Recovered $104M+ MXN in business value by identifying negative-margin inventory through predictive modeling.
  • Pricing Engine: Maintained and optimized AWS-based pricing algorithms (Glue, Lambda, Airflow) processing millions of vehicle valuations with 99%+ uptime.

Open Source & Personal Projects

You can find my public analysis and tools in my repositories below:

  • personal_finance_gui: A TypeScript-based personal finance application with a GUI for tracking income, expenses, and financial goals.
  • Square_Meter_Value_Real_Estate: Data analysis determining the most influential factors on property pricing per square meter.
  • World_Weather_Analysis: API-driven analysis correlating latitude with weather conditions for travel recommendations.

Connect with Me

LinkedIn Email

Pinned

  1. PyBer_Analysis (Public)

    Analysis of the disparities in average fares across city types, using the Pandas Python library, Jupyter Notebooks, and Matplotlib for data visualization.

    Jupyter Notebook

  2. School_District_Analysis (Public)

    School district analysis of overall grade performance to determine how much budget affects student grades, using the Pandas Python library and Jupyter Notebooks.

    Jupyter Notebook

  3. World_Weather_Analysis (Public)

    World weather analysis to determine correlations between latitude and weather conditions to use as parameters on trip recommendations given the desired weather conditions using Python Pandas, SciPy…

    Jupyter Notebook

  4. Craigslist_Used_Cars_Analysis (Public)

    Analysis of a Craigslist used-car database to determine how price depreciation correlates with the manufacturer, model, and age of used cars.

    Jupyter Notebook

  5. Square_Meter_Value_Real_Estate (Public)

    Analysis of the most important factors that influence a property's price per square meter.

    Jupyter Notebook

  6. personal_finance_gui (Public)

    A personal finance application with a GUI for tracking income, expenses, and financial goals.

    TypeScript