CyberSecurity-Works

Implementing Honeypot, Honeynets, Simple Malware Classifier, RSA algorithm and Learning Python's Socket library

Malware Classification using Machine Learning

This project classifies software samples as either malware or legitimate software using machine learning. It uses the "MalwareData.csv" dataset and proceeds through the following steps:

Data Preparation: The dataset is loaded into a pandas DataFrame and divided into two subsets: "legit", containing the legitimate software samples, and "mal", containing the malware samples.

Exploratory Data Analysis: The columns and initial rows of the dataset are examined to understand the structure and content of the data.
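The loading, splitting, and inspection steps above can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the in-memory sample stands in for "MalwareData.csv", and the `|` separator, the `legitimate` label column (1 for legitimate, 0 for malware), and the `load_and_split` helper are all assumptions made for illustration.

```python
import io

import pandas as pd

# Tiny stand-in for MalwareData.csv. The "|" separator and every column name
# here (including the "legitimate" label column) are assumptions.
SAMPLE_CSV = """Name|SizeOfCode|NumberOfSections|legitimate
notepad.exe|4096|5|1
calc.exe|8192|6|1
dropper.exe|512|3|0
"""


def load_and_split(source, label_col="legitimate", sep="|"):
    """Load the dataset and split it into legitimate and malware subsets."""
    df = pd.read_csv(source, sep=sep)
    legit = df[df[label_col] == 1]  # assumed encoding: 1 = legitimate
    mal = df[df[label_col] == 0]    # assumed encoding: 0 = malware
    return df, legit, mal


df, legit, mal = load_and_split(io.StringIO(SAMPLE_CSV))

# Exploratory look at the structure and the first rows.
print(df.columns.tolist())
print(df.head())
print(f"legitimate: {len(legit)}, malware: {len(mal)}")
```

To try this on the real file, pass its path instead of the `io.StringIO` object and adjust `label_col` and `sep` to match the actual layout.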

Feature Selection: An ExtraTreesClassifier model is trained on the dataset to determine the importance of each feature. SelectFromModel is used to select the most important features for improving accuracy.

Feature Importance: The feature importances are ranked, and the top features contributing to the classification are identified and printed.
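The feature selection and ranking steps above might look like the sketch below. It uses synthetic data from `make_classification` in place of the real dataset, and the feature names, sample counts, and hyperparameters are placeholders rather than the project's settings. With a tree ensemble, `SelectFromModel` keeps features whose importance is at least the mean importance by default.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in for the malware feature matrix (assumption).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, n_redundant=2, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

# Fit the tree ensemble inside the selector and keep the important features.
selector = SelectFromModel(ExtraTreesClassifier(n_estimators=100, random_state=0))
selector.fit(X, y)
X_selected = selector.transform(X)

# Rank the kept features by importance, highest first.
importances = selector.estimator_.feature_importances_
kept = np.flatnonzero(selector.get_support())
ranked = sorted(kept, key=lambda i: importances[i], reverse=True)

print(f"kept {X_selected.shape[1]} of {X.shape[1]} features")
for rank, i in enumerate(ranked, start=1):
    print(f"{rank}. {feature_names[i]}: {importances[i]:.4f}")
```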

Model Training: The dataset is split into training and testing sets using the train_test_split function. A RandomForestClassifier model is trained on the selected features.

Model Evaluation: The trained model's accuracy is evaluated by calculating its score on the test data.
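A minimal sketch of the training and evaluation steps is shown below. It again uses synthetic data as a stand-in for the selected features, and the 80/20 split, the number of trees, and the random seeds are illustrative choices, not values taken from the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the selected malware features (assumption).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, random_state=0
)

# Hold out 20% of the samples for testing (illustrative split size).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

# score() returns the mean accuracy on the held-out data.
accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

Scoring on the held-out split rather than on the training data gives an estimate of how the model behaves on samples it has not seen.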

Confusion Matrix Analysis: A confusion matrix is generated by comparing the predicted labels with the true labels from the test set. The matrix helps analyze the performance of the model by examining false positives and false negatives.
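To make the confusion matrix step concrete, here is a small sketch with hand-written labels rather than real predictions. It assumes the encoding 1 = legitimate and 0 = malware, which the summary does not state, so the interpretation of each cell should be adapted to the actual labels. In scikit-learn's `confusion_matrix`, rows are true labels and columns are predicted labels.

```python
from sklearn.metrics import confusion_matrix

# Hand-written example labels. Assumed encoding: 0 = malware, 1 = legitimate.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]

# Rows are true labels, columns are predicted labels, in the order [0, 1].
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
print(cm)

# Malware predicted as legitimate (missed detections).
missed_malware_rate = cm[0, 1] / cm[0].sum()
# Legitimate software predicted as malware (false alarms).
flagged_legit_rate = cm[1, 0] / cm[1].sum()

print(f"malware missed: {missed_malware_rate:.2%}")
print(f"legitimate flagged: {flagged_legit_rate:.2%}")
```

Here one of the four malware samples is missed and two of the six legitimate samples are flagged, so the two error rates are 25% and about 33%.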

Gradient Boosting Classifier: Another machine learning model, GradientBoostingClassifier, is trained on the dataset, and its accuracy score is evaluated.
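The gradient boosting step could be sketched in the same way. As before, the data is synthetic and the hyperparameters are placeholders, so this shows only the shape of the workflow, not the project's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the malware dataset (assumption).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Boosted trees are fitted one after another, each correcting the last.
gbc = GradientBoostingClassifier(n_estimators=50, random_state=0)
gbc.fit(X_train, y_train)

gbc_accuracy = gbc.score(X_test, y_test)
print(f"gradient boosting test accuracy: {gbc_accuracy:.3f}")
```

Using the same train/test split for both classifiers keeps their accuracy scores directly comparable.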

The project demonstrates how machine learning can be applied to malware classification. Ranking feature importances and training classifiers on the selected features gives insight into what distinguishes legitimate software from malware, and the accuracy scores of the RandomForestClassifier and GradientBoostingClassifier models measure how effective each one is.

Joyabrata001/CyberSecurity-Works