Final-year student at ENSAE Paris and in the Master of Data Science at École Polytechnique.
I’m passionate about Data Science, Machine Learning, and Statistics, with a strong interest in turning complex problems into elegant, data-driven solutions.
Currently diving deep into ML engineering, statistical modeling, and scalable data pipelines — always curious, always building.
🔬 Plasma Blob Detection ("codalab_tokam2d")
Bounding-box prediction on plasma imaging data to detect and localize blobs: applying computer vision to fusion research on a constrained dataset.
⚗️ Optimal Transport Tutorial ("constrained_sinkhorn_ott")
A hands-on tutorial for the ott-jax library by Marco Cuturi about a Sinkhorn-type algorithm for constrained optimal transport problems. It covers core concepts and practical usage of optimal transport in JAX.
🔐 Federated Learning Research ("Federated-Learning-Research")
Implementing and evaluating state-of-the-art Byzantine attacks and defenses in a federated learning setting — where robustness meets distributed ML.
🛡️ Adversarial Examples ("Adversarial_examples")
Studying the adversarial robustness of image classification models: crafting and defending against imperceptible perturbations that fool neural networks.
🧬 Graph & NLP — Molecular Description Generation ("Molecular-Graph-Captioning")
Translating 2D atomic structures (molecular graphs) into human-readable chemical descriptions, bridging graph neural networks and natural language generation.
🖼️ CLIP Few-Shot Classification ("CLIP-Few-Shot")
Exploring OpenAI's CLIP model for few-shot image classification — leveraging vision-language pretraining to generalize across visual categories with minimal labeled data.
🎵 Music Genre Prediction ("Python-pour-la-Data-Science")
Using Random Forest & XGBoost on data from Spotify’s API to classify music genres (because data can groove too!).
📚 Hi!ckathon ("Hickathon_2025")
Leveraging a LightGBM model to predict students' math scores using the PISA dataset.
🎭 Generate Theater ("Generate-Theater")
Training a character-level language model on a Shakespeare text dataset, as a short introduction to LLMs.
📦 Delivery Network Optimization ("ensae-prog23")
Implementing Kruskal’s, Floyd-Warshall, and greedy algorithms to optimize logistics: efficiency meets algorithm design.
📊 Interactive Data Visualization Dashboard ("Stock-Price-Dashboard")
Building a Streamlit app for exploratory data analysis.
Always open to collaboration, research opportunities, or just a good data chat!

