I'm Ahmad, an Egyptian data engineer passionate about building efficient, reliable, and scalable data pipelines.
💼 Data Engineering Stack
| Category | Tools/Technologies |
| --- | --- |
| 🚀 Big Data Frameworks | PySpark |
| 📦 Data Storage and Management | Iceberg, MinIO, Nessie |
| 🔄 Workflow Orchestration | Airflow, SSIS |
| ✔️ Data Quality | Soda, dbt, Regex for failure detection |
| 🔧 Data Transformation | dbt (Data Build Tool), SQL, Jinja templating |
| 🔗 Version Control for Data | Branching and versioning with Nessie |
| 📄 File Formats | Parquet, CSV, JSON, YAML |
| 🔁 CI/CD | GitHub Actions, act |
| 🐳 Containerization | Docker, Docker Compose |
| 🧪 Testing | Python unittest, dbt unit tests, dbt data tests, Soda quality tests |
| 🏗️ Data Modeling | Kimball approach, Data Vault |
| 💻 Programming Languages | Python, JavaScript, Shell |
🛠️ Projects and Tools I Work With
⚙️ ETL Pipelines
🤖 Orchestration and Automation
🧊 Loading and Partitioning
🌐 Orchestrating remote Spark jobs
☁️ Object Storage Integration
🛠️ Custom Airflow Operators via SSH
🐳 Environment Orchestration
⏱️ Data-Aware Scheduling
🧠 Core Principles
👾👾👾👾👾👾👾👾👾👾👾👾👾👾
📟 Precision Over Convenience
📍 Efficiency First
🔋 Collaboration is Key
🧩 Modularity and Reusability
🎯 Future Goals
Deepen my knowledge of dbt and evaluate its potential against custom SQL workflows.
Continue refining incremental load strategies to support real-time analytics.
Explore advanced lakehouse concepts and cutting-edge tools.
🤝 Let's Connect!
I’m always open to learning and collaborating. If you’re working on an interesting data engineering project, I’d love to discuss it and exchange ideas. Let’s build something amazing together!