Joyabrata001/CyberSecurity-Works

CyberSecurity-Works

  1. Implementations of a honeypot, honeynets, a simple malware classifier, and the RSA algorithm, plus exercises with Python's socket library
  2. Malware Classification using ML

Malware Classification using Machine Learning

The project classifies software samples as malware or legitimate using machine learning techniques. It uses the "MalwareData.csv" dataset and proceeds through the following key steps:

Data Preparation: The dataset is loaded into a pandas DataFrame and divided into two subsets: "legit", containing legitimate software samples, and "mal", containing malware samples.
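
A minimal sketch of the splitting step. The README does not name the label column or the file's separator, so the `legitimate` column (1 for legitimate, 0 for malware), the two feature columns, and the small in-memory stand-in frame are assumptions made for illustration. With the real file, the frame would come from `pd.read_csv` on "MalwareData.csv" instead.

```python
import pandas as pd


def split_by_label(data: pd.DataFrame, label_col: str = "legitimate"):
    """Return (legit, mal) subsets based on a binary label column.

    The column name and the 1 = legitimate / 0 = malware encoding are
    assumptions; the README does not state them.
    """
    legit = data[data[label_col] == 1]
    mal = data[data[label_col] == 0]
    return legit, mal


# Hypothetical stand-in for the contents of MalwareData.csv.
data = pd.DataFrame({
    "SizeOfCode": [4096, 512, 8192, 1024],
    "NumberOfSections": [5, 3, 6, 2],
    "legitimate": [1, 0, 1, 0],
})

legit, mal = split_by_label(data)
print("legit:", legit.shape, "mal:", mal.shape)
```

Keeping the split in a small function makes the label convention explicit in one place, which helps if the real column turns out to be named or encoded differently.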

Exploratory Data Analysis: The columns and initial rows of the dataset are examined to understand the structure and content of the data.
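
The inspection described above might look like the following sketch. The frame and its column names are hypothetical stand-ins, not the real dataset; only the pandas calls (`columns`, `head`, `value_counts`) are the point.

```python
import pandas as pd

# Hypothetical stand-in for the contents of MalwareData.csv.
data = pd.DataFrame({
    "SizeOfCode": [4096, 512, 8192, 1024],
    "NumberOfSections": [5, 3, 6, 2],
    "legitimate": [1, 0, 1, 0],  # assumed label column
})

columns = data.columns.tolist()                     # column names
first_rows = data.head(3)                           # initial rows
class_counts = data["legitimate"].value_counts()    # samples per class

print(columns)
print(first_rows)
print(class_counts)
```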

Feature Selection: An ExtraTreesClassifier model is trained on the dataset to determine the importance of each feature. SelectFromModel then keeps only the features whose importance meets its threshold, so that later models train on the most informative columns.
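
A sketch of this selection step using scikit-learn. Synthetic data from `make_classification` stands in for the real feature matrix, and the hyperparameters (`n_estimators=50`, `random_state=0`) are illustrative assumptions rather than the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in: 20 features, of which 5 carry signal.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5,
    n_redundant=0, random_state=0,
)

# Fit the tree ensemble to obtain per-feature importances.
extratrees = ExtraTreesClassifier(n_estimators=50, random_state=0).fit(X, y)

# Keep features whose importance meets the selector's threshold
# (left at its default here).
selector = SelectFromModel(extratrees, prefit=True)
X_new = selector.transform(X)

print("before:", X.shape, "after:", X_new.shape)
```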

Feature Importance: The feature importances are ranked, and the top features contributing to the classification are identified and printed.
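
Ranking and printing the top features can be sketched as below. The data is synthetic and the feature names (`feature_0`, `feature_1`, ...) and `top_k = 5` are hypothetical; in the project the names would come from the DataFrame's columns.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5,
    n_redundant=0, random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

model = ExtraTreesClassifier(n_estimators=50, random_state=0).fit(X, y)
importances = model.feature_importances_

# Indices sorted from most to least important.
order = np.argsort(importances)[::-1]

top_k = 5
ranked = [(feature_names[i], float(importances[i])) for i in order[:top_k]]
for rank, (name, score) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: {score:.4f}")
```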

Model Training: The dataset is split into training and testing sets using the train_test_split function. A RandomForestClassifier model is trained on the selected features.
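
A sketch of the split-and-train step. The 80/20 split, stratification, and forest size are assumptions chosen for the example, and synthetic data stands in for the selected features.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the selected feature matrix and labels.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0,
)

# Hold out 20% of the samples for testing (ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y,
)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

print("train:", X_train.shape, "test:", X_test.shape)
```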

Model Evaluation: The trained model is evaluated by computing its score on the test data, which for a scikit-learn classifier is the mean accuracy of its predictions.
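
As an illustration of what the score represents, the sketch below compares `clf.score` with an explicit `accuracy_score` on the same predictions. The data and hyperparameters are synthetic assumptions, not the project's.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y,
)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# score() returns the fraction of test samples classified correctly.
test_accuracy = clf.score(X_test, y_test)

# The same number, computed explicitly from the predictions.
manual_accuracy = accuracy_score(y_test, clf.predict(X_test))

print(f"Test accuracy: {test_accuracy:.3f}")
```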

Confusion Matrix Analysis: A confusion matrix is generated by comparing the predicted labels with the true labels from the test set. The matrix helps analyze the performance of the model by examining false positives and false negatives.
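
A sketch of the confusion matrix step on synthetic data. Treating class 1 as the positive class is an assumption made here so that "false positive" and "false negative" have a concrete meaning; the README does not say which class the project treats as positive.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y,
)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true labels, columns are predicted labels, in the order [0, 1].
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

# Rates relative to the true class sizes (class 1 assumed positive).
false_positive_rate = fp / (fp + tn)
false_negative_rate = fn / (fn + tp)

print(cm)
print(f"False positive rate: {false_positive_rate:.3f}")
print(f"False negative rate: {false_negative_rate:.3f}")
```

Reporting the two error rates separately matters for malware detection because a missed malware sample and a flagged legitimate program usually carry different costs.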

Gradient Boosting Classifier: Another machine learning model, GradientBoostingClassifier, is trained on the dataset, and its accuracy score is evaluated.
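
The same pattern applies to the second model. In this sketch the data is synthetic and `n_estimators=50` is an illustrative assumption; the README does not state the settings the project used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y,
)

# Boosted trees are fitted sequentially, each correcting the previous ones.
gb = GradientBoostingClassifier(n_estimators=50, random_state=0)
gb.fit(X_train, y_train)

gb_accuracy = gb.score(X_test, y_test)
print(f"Gradient boosting test accuracy: {gb_accuracy:.3f}")
```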

The project demonstrates the application of machine learning techniques to malware classification. Ranking feature importances shows which attributes help distinguish legitimate software from malware, and the test accuracy scores of the RandomForestClassifier and GradientBoostingClassifier models provide a measure of how effective each one is.
