Course repository for Intelligent Systems II at Universidad de Caldas.
Class taken with Prof. Jorge Alberto Jaramillo Garzón.
Live Demo: Bank Marketing ML Analysis
Intelligent Systems II - Advanced machine learning and artificial intelligence techniques
Institution: Universidad de Caldas | Computer Engineering
Professor: Jorge Alberto Jaramillo Garzón
Academic Period: 2024-2025
- 50% Partial Exams (Zero, First, Second, Third)
- 50% Final Project (Cybersecurity Incident Prediction)
Machine Learning · Data Science · Support Vector Machines · Neural Networks
PCA Analysis · Bayesian Inference · Ensemble Methods · Deep Learning
Dimensionality Reduction · Cross-Validation · Hyperparameter Tuning
Kernel Tricks · Gradient Boosting · Graph Neural Networks · Transformers
Technologies: python streamlit scikit-learn pytorch xgboost pandas numpy matplotlib seaborn plotly solara
.jorge/
├── partials/ # Partial Exams (50%)
│ ├── zero/ # Demo exam (practice)
│ ├── first/ # Bayesian & K-NN classifiers
│ ├── second/ # SVM, ANN, PCA ⭐ COMPLETE
│ └── third/ # TBD
│
├── project/ # Final Project (50%)
│ ├── Cybersecurity Incident Predictor
│ └── Microsoft GUIDE Dataset Analysis
│
└── notebooks/ # Class Materials
├── Theory: Perceptron, SVM, Kernels, ANN
└── Weekly notes
Location: partials/zero/
Topics: Validation, Bayesian classifiers, K-NN
Dataset: Iris
Exercises:
- Cross-Validation (10-fold) - Compare Bayesian vs Geometric classifiers on Iris
- Bootstrapping K-NN - Investigate performance vs number of neighbors
- Classifier Comparison - Contrast assumptions, requirements, dimensionality impact
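A minimal sketch of the 10-fold cross-validation exercise, assuming scikit-learn's `GaussianNB` as the Bayesian classifier and `KNeighborsClassifier` (k=5) as a stand-in for the geometric classifier. Both choices, and the random seed, are illustrative assumptions rather than the configuration used in the exam.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# 10 stratified folds; the seed is an arbitrary choice for reproducibility.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Assumed model choices: one probabilistic, one distance-based classifier.
models = {
    "Gaussian naive Bayes": GaussianNB(),
    "K-NN (k=5)": KNeighborsClassifier(n_neighbors=5),
}

# One accuracy value per fold for each model.
cv_scores = {
    name: cross_val_score(model, X, y, cv=cv) for name, model in models.items()
}

for name, scores in cv_scores.items():
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")
```

Changing `n_neighbors` in the K-NN entry is one way to explore the performance-versus-neighbors question from the second exercise.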
Location: partials/first/
Topics: Data preprocessing, model training, evaluation
Dataset: Iris classification
Completed:
- ✅ Data preprocessing pipeline
- ✅ Multiple classifier training
- ✅ Performance evaluation metrics
Location: partials/second/
Status: ✅ COMPLETE (2.5/2.5 points)
Live: intel-ii-exam-ii.streamlit.app
Tasks Completed:
- 4 kernel types: Linear, RBF, Polynomial, Sigmoid
- Hyperparameter tuning: C, gamma, degree
- Validation with K-Fold cross-validation or a train/test split
- Experiment history tracking and comparison
- Confusion matrix visualization
- Best model auto-identification
- 12 architectures: 1-3 layers (10-100 neurons)
- 3 activation functions: ReLU, Tanh, Logistic
- 3 solvers: Adam, SGD, L-BFGS
- Learning curves visualization
- Performance comparison charts
- Best model saver
- Feature analysis: correlation, distributions, Q-Q plots
- Data exploration: 6 plots (3×2D + 3×3D interactive)
- PCA transformation with variance thresholds
- Model retraining on PCA data
- BEFORE vs AFTER comparison
- Automated insights and recommendations
- Answer: "What can you conclude for YOUR dataset?"
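The BEFORE vs AFTER comparison in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the dashboard's code: it uses scikit-learn's bundled Wine dataset as a stand-in for Bank Marketing, an RBF SVM with default-like settings, and a 95% variance threshold chosen only for the example.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in dataset (assumption); the exam itself uses UCI Bank Marketing.
X, y = load_wine(return_X_y=True)

# BEFORE: scale the features, then fit an RBF SVM on all of them.
before = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))

# AFTER: a float n_components keeps the fewest principal components
# whose cumulative explained variance reaches that fraction.
after = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95),
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)

acc_before = cross_val_score(before, X, y, cv=5).mean()
acc_after = cross_val_score(after, X, y, cv=5).mean()

# Fit once on the full data to see how many components the threshold kept.
after.fit(X, y)
n_kept = after.named_steps["pca"].n_components_

print(f"Before PCA: {acc_before:.3f} accuracy with {X.shape[1]} features")
print(f"After PCA:  {acc_after:.3f} accuracy with {n_kept} components")
```

Putting scaling and PCA inside the pipeline means both are refit on each training fold, which keeps the held-out fold from influencing the transformation.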
Features:
- 🎯 Interactive Streamlit dashboard
- 📊 Real-time experiment tracking
- 💾 Persistent experiment history
- 📈 Automated performance insights
- 🎨 Professional visualizations
- 💡 Smart recommendations (USE/AVOID PCA)
- 📥 CSV export functionality
Tech Stack:
Python | Streamlit | scikit-learn | pandas | matplotlib | seaborn | plotly
Documentation: See .jorge/partials/second/README.md
Location: partials/third/
Status: 🔜 Pending (TBD)
Location: .jorge/project/
Advanced ensemble ML platform for predicting cybersecurity incidents before they occur. Transforms reactive cybersecurity into proactive prevention for Security Operations Centers (SOCs).
- Predictive (not reactive): Predicts incidents 1-24 hours in advance
- Hybrid ensemble: LSTM + GNN + XGBoost + Transformer
- Meta-learning: Adaptive model weighting by context
- Production-ready: Professional Solara dashboard
Level 0: Specialized Base Models
- LSTM/GRU - Temporal pattern recognition
  - Learns incident sequences over time
  - Captures long-term dependencies
- Graph Neural Networks - Entity relationship modeling
  - Models risk propagation through network
  - 33 entity types (users, IPs, domains, etc.)
- XGBoost - Alert pattern classification
  - Complex decision rules
  - 9,100+ detector patterns
- Transformers - Evidence sequence analysis
  - Self-attention over evidence chains
  - MITRE ATT&CK technique mapping
Level 1: Meta-Ensemble
- Adaptive weight learning by context
- Organization-specific optimization
- Online learning for drift adaptation
- 13M+ evidences from real cybersecurity incidents
- 1.6M alerts from 9,100+ unique detectors
- 1M incidents with expert triage labels
- 6,100+ organizations across industries
- 441 MITRE ATT&CK techniques mapped
- 2-week period with temporal resolution
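The Level 1 meta-ensemble described above combines the outputs of the Level 0 models. A much simplified sketch of that idea is plain stacking, where a logistic regression learns one fixed weight per base model. Everything here is an assumption for illustration: the synthetic data, the two tree-based base models standing in for LSTM/GNN/XGBoost/Transformer, and the use of scikit-learn's `StackingClassifier`. The weights are static, not context-adaptive as in the project description.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary data (assumption); not the Microsoft GUIDE dataset.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Level 0: stand-in base models (hypothetical substitutes for the real ones).
base_models = [
    ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("boosting", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]

# Level 1: the meta-learner is trained on out-of-fold base-model predictions.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

stack_accuracy = stack.score(X_test, y_test)
# For a binary problem there is one learned coefficient per base model.
meta_weights = stack.final_estimator_.coef_.ravel()

print(f"Stacked accuracy: {stack_accuracy:.3f}")
for (name, _), weight in zip(base_models, meta_weights):
    print(f"  weight for {name}: {weight:.3f}")
```

Making the weights depend on context would need extra inputs to the meta-learner, for example organization or alert metadata, which this sketch leaves out.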
Prediction Accuracy (4h): 94.2%
Early Warning Score: 0.89
Cost-Weighted Recall: 0.91
Alert Fatigue Score: 0.85
MTTD Reduction: 4+ hours
- ✅ Prevents incidents before escalation
- ✅ Reduces MTTD by 4+ hours
- ✅ Optimizes analyst workload with intelligent prioritization
- ✅ Scales across organizations with adaptive learning
Python 3.10+ | Solara | PyTorch | XGBoost | scikit-learn | Microsoft GUIDE Dataset
Documentation:
- Perceptron fundamentals
- Linear classification
- Support Vector Machine theory
- Margin maximization
- Kernel trick explained
- Mapping φ to higher dimensions
- RBF, Polynomial, Sigmoid kernels
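- Computational advantages: O(n²) kernel evaluations for the Gram matrix vs O(n²d) work for inner products in an explicitly mapped d-dimensional feature space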
- Parameter selection (σ, degree)
Key Concepts:
- Kernel function: K(x,y) = φ(x)ᵀφ(y), computed without explicit mapping
- RBF kernel: Maps to infinite dimensions
- Polynomial kernel: (xᵀy + c)ᵈ, captures interactions
- Binomial theorem: Connects products in original/transformed space
- Neural network architectures
- Backpropagation algorithm
- Activation functions (ReLU, Tanh, Sigmoid)
- Training strategies
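As a quick numeric check of the kernel identity K(x,y) = φ(x)ᵀφ(y) from the Key Concepts above, the sketch below compares the degree-2 polynomial kernel with an explicit feature map for 2-D inputs. The function names `poly_kernel` and `phi` and the sample vectors are made up for the example. The feature map follows from expanding (xᵀy + c)².

```python
import numpy as np


def poly_kernel(x, y, c=1.0):
    """Degree-2 polynomial kernel, evaluated in the original 2-D space."""
    return (x @ y + c) ** 2


def phi(x, c=1.0):
    """Explicit degree-2 feature map for a 2-D input (6 output dimensions)."""
    x1, x2 = x
    return np.array([
        x1 ** 2,
        x2 ** 2,
        np.sqrt(2) * x1 * x2,
        np.sqrt(2 * c) * x1,
        np.sqrt(2 * c) * x2,
        c,
    ])


x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

kernel_value = poly_kernel(x, y)   # one dot product in 2 dimensions
mapped_value = phi(x) @ phi(y)     # dot product after mapping to 6 dimensions

print(kernel_value, mapped_value)  # both are 4.0, up to floating-point rounding
```

The kernel reaches the same value without ever building the 6-dimensional vectors, which is the saving that grows with the degree and the input dimension.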
cd .jorge/partials/second
# Install dependencies
pip install streamlit pandas numpy scikit-learn matplotlib seaborn plotly
# Launch app
streamlit run app.py
Visit: http://localhost:8501
cd .jorge/project
# Install with UV
uv sync
# Download Microsoft GUIDE dataset from Kaggle
# Extract to data/microsoft_guide/
# Run dashboard
uv run cybersec-dashboard
Visit: http://localhost:8765
cd .jorge/notebooks/docs
jupyter notebook
| Component | Status | Description | Grade |
|---|---|---|---|
| Demo Exam | ✅ Complete | Bayesian, K-NN, Validation | Practice |
| First Exam | ✅ Complete | Fundamentals, Iris dataset | TBD |
| Second Exam | ✅ Complete | SVM + ANN + PCA Dashboard | 2.5/2.5 |
| Third Exam | 🔜 Pending | TBD | - |
| Final Project | ✅ Complete | Cybersecurity Incident Predictor | TBD |
Overall Progress: 80% Complete
By the end of this course, you will master:
- ✅ Support Vector Machines with kernel methods
- ✅ Bayesian classifiers and probabilistic inference
- ✅ K-Nearest Neighbors algorithms
- ✅ Cross-validation and bootstrapping
- ✅ Hyperparameter optimization
- ✅ Artificial Neural Networks (feedforward)
- ✅ LSTM/GRU for temporal sequences
- ✅ Graph Neural Networks for relationships
- ✅ Transformers and attention mechanisms
- ✅ Principal Component Analysis (PCA)
- ✅ Feature selection and engineering
- ✅ Variance analysis and scree plots
- ✅ Component interpretation
- ✅ Random Forest, XGBoost, LightGBM
- ✅ Gradient boosting techniques
- ✅ Meta-ensemble with adaptive weighting
- ✅ Model stacking strategies
- ✅ Deploy ML models in production
- ✅ Build interactive dashboards (Streamlit, Solara)
- ✅ Handle imbalanced datasets (SMOTE, SMOTE-ENN)
- ✅ Evaluate with business-focused metrics
- ✅ Make data-driven conclusions
- ✅ Communicate technical results effectively
Interactive dashboard comparing SVM, ANN, and PCA on UCI Bank Marketing dataset
Highlights:
- 3 ML algorithms with comprehensive tuning
- Automated experiment tracking
- PCA impact analysis with insights
- Smart recommendations based on results
- Professional production deployment
Live Demo: https://intel-ii-exam-ii.streamlit.app/
Enterprise ML platform for SOC teams with 4-hour incident predictions
Innovation:
- Hybrid ensemble (LSTM + GNN + XGBoost + Transformer)
- Meta-learning with context adaptation
- Microsoft GUIDE dataset (13M+ evidences)
- Professional Solara dashboard
- 94.2% prediction accuracy
Impact: Prevents incidents before escalation, saves millions in damages
- Second Exam: .jorge/partials/second/README.md
  - SVM Guide: .jorge/partials/second/ui/pages/svm/README.md
  - ANN Guide: .jorge/partials/second/ui/pages/ann/README.md
  - PCA Guide: .jorge/partials/second/ui/pages/pca/README.md
  - Deployment: .jorge/partials/second/DEPLOYMENT.md
- Overview: .jorge/project/README.md
- Project Vision: .jorge/project/project_overview.md
- Architecture: .jorge/project/architecture_design.md
- Dataset: .jorge/project/dataset_guide.md
- Metrics: .jorge/project/evaluation_metrics.md
- SVM Theory: .jorge/project/clase-03.md
- Notebooks: .jorge/notebooks/docs/
- ✅ Deployed Production ML App - Streamlit Cloud
- ✅ Built Professional SOC Dashboard - Solara
- ✅ Implemented Ensemble Learning - 4 specialized models
- ✅ Achieved 94%+ Accuracy - Real-world dataset
- ✅ Created Comprehensive Documentation - Theory + Practice
- ✅ Applied Advanced ML Techniques - Kernels, PCA, Meta-learning
Jorge Alberto Jaramillo Garzón
Computer Engineering Student
Universidad de Caldas
- Professor: Jorge Alberto Jaramillo Garzón
- Institution: Universidad de Caldas
- Course: Sistemas Inteligentes II (Intelligent Systems II)
- Datasets:
- UCI Machine Learning Repository (Bank Marketing, Iris)
- Microsoft GUIDE (Cybersecurity Incidents)
- CIC-IDS2017, UNSW-NB15 (Network intrusion)
Academic project for Universidad de Caldas coursework.