# Wall-Following Robot

A machine learning project for classifying movement commands of a wall-following mobile robot (SCITOS-G5) based on ultrasound sensor readings.
## Table of Contents

- Project Overview
- Dataset Description
- Data Insights
- Installation
- Project Structure
- Usage
- Methodology
- Experimental Results
- Key Insights
- References
- License
## Project Overview

This project aims to classify the movement commands of a wall-following mobile robot (SCITOS-G5) based on ultrasound sensor readings. The robot navigates a room following walls in a clockwise direction, and the task is to predict its movement command at each time step:
- Move-Forward
- Slight-Right-Turn
- Sharp-Right-Turn
- Slight-Left-Turn
The classification problem is non-linearly separable, making it an interesting benchmark for supervised machine learning algorithms.
## Dataset Description

The dataset comes from the UCI Machine Learning Repository and is provided in three versions, based on the number of input features:

### 24-Sensor Dataset

- 24 ultrasound sensors arranged circularly around the robot's "waist"
- Sensors `US1`–`US24`, each with a reference angle measured clockwise from the front (180°)
- Samples: 5,456
- Attributes: 24 numeric features + 1 class label

### 4-Sensor Dataset

- Sensor readings aggregated into four arcs: `SD_front`, `SD_left`, `SD_right`, `SD_back`
- Each value is the minimum distance within a 60° arc
- Samples: 5,456
- Attributes: 4 numeric features + 1 class label

### 2-Sensor Dataset

- Only `SD_front` and `SD_left`
- Samples: 5,456
- Attributes: 2 numeric features + 1 class label
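All three versions can be loaded the same way. A minimal sketch, assuming (as in the UCI files) no header row and the class label in the last column; `load_sensor_csv` and `COLUMNS_4` are illustrative names, not the project's actual code:

```python
from io import StringIO

import pandas as pd

# Column names for the 4-sensor version; the last column is the class label.
COLUMNS_4 = ["SD_front", "SD_left", "SD_right", "SD_back", "Class"]

def load_sensor_csv(path_or_buffer, columns):
    """Load one dataset version into features X and labels y.

    Assumes the UCI layout: no header row, class label in the last column.
    """
    df = pd.read_csv(path_or_buffer, header=None, names=columns)
    return df.iloc[:, :-1], df.iloc[:, -1]

# Tiny in-memory sample standing in for data/sensor_readings_4.csv
sample = StringIO(
    "1.2,0.5,1.8,1.1,Move-Forward\n"
    "0.6,0.4,1.9,1.2,Sharp-Right-Turn\n"
)
X, y = load_sensor_csv(sample, COLUMNS_4)
```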
## Data Insights

### Class Distribution

| Class | Samples | Percentage |
|---|---|---|
| Move-Forward | 2,205 | 40.41% |
| Slight-Right-Turn | 826 | 15.14% |
| Sharp-Right-Turn | 2,097 | 38.43% |
| Slight-Left-Turn | 328 | 6.01% |

Missing values: None
### Sensor Statistics (24-Sensor Dataset)

| Sensor | Max | Min | Mean | SD |
|---|---|---|---|---|
| US1 | 5.0000 | 0.40000 | 1.47162 | 0.80280 |
| US2 | 5.0250 | 0.43700 | 2.32704 | 1.41015 |
| US3 | 5.0290 | 0.47000 | 2.48935 | 1.24743 |
| US4 | 5.0170 | 0.83300 | 2.79650 | 1.30937 |
| US5 | 5.0000 | 1.12000 | 2.95855 | 1.33922 |
| US6 | 5.0050 | 1.11400 | 2.89307 | 1.28258 |
| US7 | 5.0080 | 1.12200 | 3.35111 | 1.41369 |
| US8 | 5.0870 | 0.85900 | 2.54040 | 1.11155 |
| US9 | 5.0000 | 0.83600 | 3.12562 | 1.35697 |
| US10 | 5.0220 | 0.81000 | 2.83239 | 1.30784 |
| US11 | 5.0190 | 0.78300 | 2.54940 | 1.38203 |
| US12 | 5.0000 | 0.77800 | 2.07778 | 1.24930 |
| US13 | 5.0030 | 0.77000 | 2.12578 | 1.40717 |
| US14 | 5.0000 | 0.75600 | 2.19049 | 1.57687 |
| US15 | 5.0000 | 0.49500 | 2.20577 | 1.71543 |
| US16 | 5.0000 | 0.42400 | 1.20211 | 1.09857 |
| US17 | 5.0000 | 0.37300 | 0.98983 | 0.94207 |
| US18 | 5.0000 | 0.35400 | 0.91027 | 0.88953 |
| US19 | 5.0000 | 0.34000 | 1.05811 | 1.14463 |
| US20 | 5.0000 | 0.35500 | 1.07632 | 1.14150 |
| US21 | 5.0000 | 0.38000 | 1.01592 | 0.88744 |
| US22 | 5.0000 | 0.37000 | 1.77803 | 1.57169 |
| US23 | 5.0000 | 0.36700 | 1.55505 | 1.29145 |
| US24 | 5.0000 | 0.37700 | 1.57851 | 1.15048 |
### Feature Statistics (4-Sensor Dataset)

| Feature | Max | Min | Mean | SD |
|---|---|---|---|---|
| SD_front | 5 | 0.49500 | 1.29031 | 0.62670 |
| SD_left | 5 | 0.34000 | 0.68127 | 0.34259 |
| SD_right | 5 | 0.83600 | 1.88182 | 0.56253 |
| SD_back | 5 | 0.36700 | 1.27369 | 0.82175 |
### Feature Statistics (2-Sensor Dataset)

| Feature | Max | Min | Mean | SD |
|---|---|---|---|---|
| SD_front | 5 | 0.49500 | 1.29031 | 0.62670 |
| SD_left | 5 | 0.34000 | 0.68127 | 0.34259 |
⚠️ Note: These statistics show sensor ranges and variability, which are the features fed into the machine learning models.
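Statistics like those in the tables above can be recomputed directly with pandas; a short sketch (`sensor_summary` is a hypothetical helper, and the sample data is synthetic):

```python
import pandas as pd

def sensor_summary(X: pd.DataFrame) -> pd.DataFrame:
    """Reproduce the Max/Min/Mean/SD columns of the tables above."""
    return pd.DataFrame({
        "Max": X.max(),
        "Min": X.min(),
        "Mean": X.mean(),
        "SD": X.std(),  # pandas defaults to the sample SD (ddof=1)
    })

# Synthetic stand-in for one of the feature matrices
X = pd.DataFrame({"SD_front": [1.0, 2.0, 3.0], "SD_left": [0.4, 0.6, 0.8]})
summary = sensor_summary(X)
```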
## Installation

### Prerequisites

- Python 3.7 or higher
- pip package manager

### Steps

1. Clone the repository:

   ```bash
   git clone https://github.com/bugsNburgers/wall-following-robot.git
   cd wall-following-robot
   ```

2. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

### Dependencies

- numpy
- pandas
- scikit-learn
- matplotlib
- seaborn
- joblib
## Project Structure

```
wall-following-robot/
├── data/                          # Dataset files
│   ├── sensor_readings_2.csv      # 2-sensor dataset
│   ├── sensor_readings_4.csv      # 4-sensor dataset
│   └── sensor_readings_24.csv     # 24-sensor dataset
├── src/                           # Source code
│   ├── loader.py                  # Data loading utilities
│   ├── train_and_eval.py          # Model training and evaluation
│   └── demo.py                    # Demo script for predictions
├── models/                        # Trained model files (generated)
├── outputs/                       # Evaluation outputs (generated)
│   ├── summary_metrics_*.csv      # Performance metrics
│   └── confusion_matrix_*.png     # Confusion matrices
├── requirements.txt               # Python dependencies
└── README.md                      # This file
```
## Usage

### Training Models

To train all models on all datasets:

```bash
python -m src.train_and_eval
```

This will:
- Train SVM, Random Forest, and Logistic Regression models
- Evaluate them on held-out test sets
- Generate confusion matrices
- Save trained models to the `models/` directory
- Save performance metrics to the `outputs/` directory
### Making Predictions

To make predictions on specific data points:

```bash
# Predict using a specific model on the full dataset
python -m src.demo data/sensor_readings_4.csv sensor_readings_4_svm_rbf 0

# Predict using a specific model on the test set
python -m src.demo data/sensor_readings_4.csv sensor_readings_4_random_forest 10 test
```

Arguments:
- `<path_to_csv>`: Path to the dataset file
- `<model_name>`: Model filename (without the `.joblib` extension)
- `<row_index>`: Index of the row to predict
- `test` (optional): Use the test set instead of the full dataset
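Since trained models are stored as `.joblib` files, the demo presumably performs a joblib save/load round trip before predicting. A self-contained sketch of that flow (the file name and model choice here are illustrative, not the project's actual code):

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Fit a tiny stand-in model (two samples, two classes)
X = np.array([[0.5, 0.4], [2.0, 1.8]])
y = np.array([0, 1])
model = LogisticRegression().fit(X, y)

with tempfile.TemporaryDirectory() as d:
    # What training would save into models/<model_name>.joblib ...
    path = os.path.join(d, "example_model.joblib")
    joblib.dump(model, path)
    # ... and what the demo would reload for inference
    loaded = joblib.load(path)
    pred = loaded.predict([[0.6, 0.5]])
```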
### Quick Workflow

```bash
# 1. Train models on all datasets
python -m src.train_and_eval

# 2. View results
cat outputs/summary_metrics_sensor_readings_4.csv

# 3. Test prediction with a trained model
python -m src.demo data/sensor_readings_4.csv sensor_readings_4_random_forest 0 test
```

## Methodology

### Preprocessing

- Data split: train-test split (80% train / 20% test)
- Encoding classes: Class labels mapped to integers:
- Move-Forward → 0
- Slight-Right-Turn → 1
- Sharp-Right-Turn → 2
- Slight-Left-Turn → 3
- Feature scaling: StandardScaler applied for SVM and Logistic Regression; not required for Random Forest
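The preprocessing steps above can be sketched as follows, on synthetic stand-in data (the real pipeline lives in `src/train_and_eval.py`):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Class-label encoding used throughout the project
LABEL_MAP = {
    "Move-Forward": 0,
    "Slight-Right-Turn": 1,
    "Sharp-Right-Turn": 2,
    "Slight-Left-Turn": 3,
}

# Synthetic stand-in for one of the sensor feature matrices
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
labels = rng.choice(list(LABEL_MAP), size=100)
y = np.array([LABEL_MAP[label] for label in labels])

# 80% / 20% train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training split only, then apply to both splits
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)
```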
### Models

- **Support Vector Machine (SVM)**
  - Kernel: RBF (Radial Basis Function)
  - Class weights: balanced
- **Random Forest Classifier**
  - Number of estimators: 200
  - Class weights: balanced
- **Logistic Regression**
  - Solver: lbfgs
  - Multi-class: auto
  - Max iterations: 500
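In scikit-learn, the three configurations look roughly like this (a sketch: anything not listed above is left at its library default, which also covers multi-class "auto"; `random_state` is added here only for reproducibility):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# The three model configurations from the Methodology section
models = {
    "svm_rbf": SVC(kernel="rbf", class_weight="balanced"),
    "random_forest": RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=42),
    "logistic_regression": LogisticRegression(solver="lbfgs", max_iter=500),
}
```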
### Evaluation Metrics

- Accuracy: Overall correctness
- Precision: Correct positive predictions / Total predicted positives
- Recall: Correct positive predictions / Total actual positives
- F1-Score: Harmonic mean of precision and recall
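These metrics can be computed with scikit-learn. Macro averaging over the four classes is an assumption here (the averaging choice isn't stated above); `summarize` is a hypothetical helper:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def summarize(y_true, y_pred):
    """Compute the four reported metrics, macro-averaged over classes."""
    acc = accuracy_score(y_true, y_pred)
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0)
    return {"Accuracy": acc, "Precision": prec, "Recall": rec, "F1": f1}

# Toy example: class 3 is mispredicted as class 2
metrics = summarize([0, 1, 2, 3], [0, 1, 2, 2])
```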
## Experimental Results

| Dataset | Model | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|
| 2-Sensor | SVM (RBF) | 0.9588 | 0.9364 | 0.9715 | 0.9520 |
| 2-Sensor | Random Forest | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| 2-Sensor | Logistic Regression | 0.9167 | 0.9046 | 0.9298 | 0.9122 |
| 4-Sensor | SVM (RBF) | 0.9469 | 0.9306 | 0.9641 | 0.9450 |
| 4-Sensor | Random Forest | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| 4-Sensor | Logistic Regression | 0.9112 | 0.8806 | 0.9233 | 0.8968 |
| 24-Sensor | SVM (RBF) | 0.8773 | 0.8423 | 0.9016 | 0.8662 |
| 24-Sensor | Random Forest | 0.9927 | 0.9902 | 0.9839 | 0.9870 |
| 24-Sensor | Logistic Regression | 0.6850 | 0.6422 | 0.7521 | 0.6676 |
As the number of sensors increases, Logistic Regression struggles due to non-linear separability and more complex feature interactions. Random Forest consistently achieves near-perfect accuracy, while SVM performs well but drops slightly on the 24-sensor data.
## Key Insights

- **Random Forest is robust** to input feature size and non-linearity, achieving near-perfect classification across all datasets.
- **SVM works well on smaller sensor sets** but degrades slightly on the full 24-sensor data, while still maintaining good performance.
- **Logistic Regression performs decently on simplified datasets** but struggles on the full 24-sensor data, confirming the non-linear nature of the task.
- **Sensor statistics explain the difficulty**: increased variance and overlapping sensor ranges make the classes linearly inseparable as the number of sensors grows.
- **Simpler can be better**: the 2-sensor and 4-sensor datasets achieve excellent performance while being computationally more efficient.
## References

- Lichman, M. (2013). UCI Machine Learning Repository. http://archive.ics.uci.edu/ml
- Freire, A. L., Barreto, G. A., Veloso, M., & Varela, A. T. (2009). Short-Term Memory Mechanisms in Neural Network Learning of Robot Navigation Tasks: A Case Study. LARS 2009, Valparaíso, Chile. DOI: 10.1109/LARS.2009.5418323
## License

Dataset: UCI Machine Learning Repository – "Other" license
This project demonstrates the application of machine learning algorithms to robotics navigation problems and serves as a benchmark for classification algorithms on non-linearly separable data.