Drug-Classification

This Drug Classification project uses a Decision Tree Classifier to predict which drug a patient is prescribed from their health profile. The implementation is a modular scikit-learn Pipeline with a ColumnTransformer that applies separate preprocessing to categorical and numeric features.

Dataset Used: https://www.kaggle.com/datasets/prathamtripathi/drug-classification

My Solution

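I chose a Pipeline approach to keep the code clean and prevent data leakage: the preprocessing steps are fitted on the training data only and then applied unchanged to the test data.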

The features are split by type, and each group gets its own preprocessing.

Since BP and Cholesterol have a natural order, I used ordinal encoding.

For numeric features such as the sodium-to-potassium ratio, I used scaling so that they share a comparable range.
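
The type-based split described above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the column names (`Age`, `BP`, `Cholesterol`, `Na_to_K`), the category labels and their ordering, and the choice of `StandardScaler` are assumptions, and the sample rows are made up. Columns not listed (for example a sex column) are dropped here because the text does not say how they are handled.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Made-up sample rows; column names and category labels are assumptions.
sample = pd.DataFrame({
    "Age": [23, 47, 61, 35],
    "BP": ["HIGH", "LOW", "NORMAL", "HIGH"],
    "Cholesterol": ["HIGH", "NORMAL", "HIGH", "NORMAL"],
    "Na_to_K": [25.4, 13.1, 9.8, 17.2],
})

ordinal_cols = ["BP", "Cholesterol"]
numeric_cols = ["Age", "Na_to_K"]

# Explicit category lists make the encoder respect the natural order
# (LOW < NORMAL < HIGH) instead of falling back to alphabetical order.
ordinal_encoder = OrdinalEncoder(
    categories=[["LOW", "NORMAL", "HIGH"], ["NORMAL", "HIGH"]]
)

preprocessor = ColumnTransformer(
    transformers=[
        ("ordinal", ordinal_encoder, ordinal_cols),
        ("scale", StandardScaler(), numeric_cols),
    ]
)

# Output columns follow the transformer order: BP, Cholesterol, Age, Na_to_K.
encoded = preprocessor.fit_transform(sample)
print(encoded)
```

Passing the category order explicitly matters: without it, `OrdinalEncoder` sorts labels alphabetically, which would place HIGH before LOW.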

I limited the tree to 6 leaf nodes so that the model stays simple and easy to interpret.
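
One way the leaf limit might be wired into the pipeline is shown below. It is a sketch under stated assumptions: the limit is taken to be scikit-learn's `max_leaf_nodes` parameter, the column names and category labels are assumed, and both the data and the labelling rule are synthetic stand-ins for `drug200.csv`.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; column names and labels are assumptions.
rng = np.random.default_rng(0)
n = 120
X = pd.DataFrame({
    "Age": rng.integers(15, 75, size=n),
    "BP": rng.choice(["LOW", "NORMAL", "HIGH"], size=n),
    "Cholesterol": rng.choice(["NORMAL", "HIGH"], size=n),
    "Na_to_K": rng.uniform(6.0, 38.0, size=n),
})
# Made-up labelling rule, only so the tree has something to learn.
y = np.where(
    X["Na_to_K"] > 15, "DrugY",
    np.where(X["BP"] == "HIGH", "drugA", "drugX"),
)

preprocessor = ColumnTransformer([
    ("ordinal",
     OrdinalEncoder(categories=[["LOW", "NORMAL", "HIGH"], ["NORMAL", "HIGH"]]),
     ["BP", "Cholesterol"]),
    ("scale", StandardScaler(), ["Age", "Na_to_K"]),
])

# max_leaf_nodes caps the number of leaves in the fitted tree.
model = Pipeline([
    ("preprocess", preprocessor),
    ("tree", DecisionTreeClassifier(max_leaf_nodes=6, random_state=0)),
])
model.fit(X, y)

n_leaves = model.named_steps["tree"].get_n_leaves()
print("leaves in fitted tree:", n_leaves)
```

A tree this small can be read in full, for instance with `sklearn.tree.export_text(model.named_steps["tree"])`, which is what makes the leaf cap useful for interpretability.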

Results and Discussion

The model achieved 100% accuracy on the test set.

Performance Summary

The confusion matrix shows perfect precision and recall for all five drug classes (A, B, C, X, and Y).
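
For readers who want to reproduce this kind of summary, the sketch below shows how accuracy, a confusion matrix, and per-class precision and recall can be computed on a held-out test set. It runs on synthetic data with a made-up labelling rule and only three classes, so its numbers say nothing about the results reported above. The split ratio, the random seeds, and the simplified numeric-only pipeline are all assumptions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with a made-up labelling rule.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = np.where(X[:, 0] > 0.5, "DrugY", np.where(X[:, 1] > 0.5, "drugA", "drugX"))

# Hold out a stratified test set before any fitting takes place.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# The scaler is fitted inside the pipeline, so it only ever sees X_train.
model = make_pipeline(
    StandardScaler(),
    DecisionTreeClassifier(max_leaf_nodes=6, random_state=42),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

labels = np.unique(y)
acc = accuracy_score(y_test, y_pred)
# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=labels)

print(f"accuracy: {acc:.3f}")
print(cm)
print(classification_report(y_test, y_pred, labels=labels, zero_division=0))
```

Off-diagonal entries of the confusion matrix count misclassifications, so a matrix with non-zero values only on the diagonal corresponds to perfect precision and recall for every class.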

