This project is a Behavioral Analysis Tool that detects potential security threats (such as attackers probing accounts) in server logs. Instead of relying on static rules, it uses unsupervised machine learning (Isolation Forest) to flag users whose behavior deviates from the norm.
The system operates on the principle that "Attackers behave differently from normal users."
- Data Generation: We simulate a dataset of 1,020 users (1,000 normal users + 20 malicious actors).
- Feature Selection: The AI analyzes two key behavioral dimensions:
  - login_duration: How long the user stays online.
  - failed_attempts: How many times the user entered a wrong password.
- The Algorithm: Using the Isolation Forest algorithm, the model isolates data points that deviate from the dense cluster of "normal" behavior.
- Visualization: A Matplotlib scatter plot renders the decision boundary, highlighting safe users in Blue and detected anomalies in Red.
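
The data generation and detection steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the distributions, the units (minutes for login_duration), the column name `anomaly`, and the `contamination` value are all assumptions chosen to match the 1,000 + 20 split described above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Assumed behavior of normal users: sessions of about 30 minutes, rare failed logins.
normal = pd.DataFrame({
    "login_duration": rng.normal(loc=30, scale=5, size=1000),
    "failed_attempts": rng.poisson(lam=0.5, size=1000),
})

# Assumed behavior of malicious actors: very short sessions, many failed logins.
malicious = pd.DataFrame({
    "login_duration": rng.uniform(low=1, high=5, size=20),
    "failed_attempts": rng.integers(low=8, high=20, size=20),
})

users = pd.concat([normal, malicious], ignore_index=True)

# contamination is set from the known simulated ratio (20 of 1,020);
# with real logs this fraction is unknown and would have to be estimated.
features = ["login_duration", "failed_attempts"]
model = IsolationForest(contamination=20 / 1020, random_state=42)

# fit_predict returns 1 for inliers and -1 for anomalies.
users["anomaly"] = model.fit_predict(users[features])

print(users["anomaly"].value_counts())
```

Rows with `anomaly == -1` are the users the model considers suspicious. Because the model is unsupervised, it never sees which rows were generated as malicious; it only learns which points are easy to isolate.
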
The project is built with:
- Python 3.x: The core logic.
- Pandas: For data manipulation and CSV handling.
- Scikit-Learn: For the Isolation Forest model.
- Matplotlib: For visualizing the detection map.
You will need the standard data science libraries:
pip install pandas scikit-learn matplotlib
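
With those libraries installed, the visualization step could look something like the sketch below. It only colors the points by the model's verdict (blue for safe, red for detected anomalies) and does not draw a boundary contour. The helper name `plot_detection_map`, the output file name, the `anomaly` column, and the small demo dataset are hypothetical and exist only to make the example self-contained.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest


def plot_detection_map(users, path="detection_map.png"):
    """Scatter safe users in blue and detected anomalies in red, then save to path."""
    safe = users[users["anomaly"] == 1]
    flagged = users[users["anomaly"] == -1]

    fig, ax = plt.subplots(figsize=(8, 6))
    ax.scatter(safe["login_duration"], safe["failed_attempts"],
               c="blue", s=12, label="Safe user")
    ax.scatter(flagged["login_duration"], flagged["failed_attempts"],
               c="red", s=30, label="Detected anomaly")
    ax.set_xlabel("login_duration")
    ax.set_ylabel("failed_attempts")
    ax.set_title("Isolation Forest detection map")
    ax.legend()
    fig.savefig(path, dpi=150)
    plt.close(fig)
    return path


# Small assumed demo dataset: 200 normal users and 5 malicious ones.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "login_duration": np.concatenate([rng.normal(30, 5, 200), rng.uniform(1, 5, 5)]),
    "failed_attempts": np.concatenate([rng.poisson(0.5, 200), rng.integers(8, 20, 5)]),
})
model = IsolationForest(contamination=5 / 205, random_state=0)
demo["anomaly"] = model.fit_predict(demo[["login_duration", "failed_attempts"]])

saved = plot_detection_map(demo)
```

For interactive use, dropping the `matplotlib.use("Agg")` line and calling `plt.show()` instead of saving would open the plot in a window.
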