tkpratardan/Credit-Card-Fraud-Detection

OBJECTIVE: The Credit-Card-Fraud-Detection pipeline is an end-to-end machine learning project that predicts fraudulent transactions for a bank.

DESIGN:

image

The pipeline uses a Cassandra database to store the credit-card transaction data, which serves as the source of training and testing data for the model. The trained models (one for Kafka, one for REST) are deployed to make predictions on live transaction data.
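A minimal sketch of how training rows might be read from Cassandra, assuming the DataStax `cassandra-driver` package. The keyspace `fraud`, the table `transactions`, and the lower-cased column names (`time`, `v1` to `v28`, `amount`, `class`, mirroring the Kaggle CSV header) are hypothetical and do not come from this repository.

```python
from typing import Iterable, List, Mapping, Tuple

# Assumed column names: lower-cased versions of the Kaggle CSV header.
FEATURE_COLUMNS = ["time"] + [f"v{i}" for i in range(1, 29)] + ["amount"]
LABEL_COLUMN = "class"


def rows_to_xy(rows: Iterable[Mapping[str, float]]) -> Tuple[List[List[float]], List[int]]:
    """Split row mappings into a feature matrix X and a label vector y."""
    X, y = [], []
    for row in rows:
        X.append([float(row[c]) for c in FEATURE_COLUMNS])
        y.append(int(row[LABEL_COLUMN]))
    return X, y


def load_from_cassandra(hosts=("127.0.0.1",), keyspace="fraud", table="transactions"):
    """Fetch every row of a (hypothetical) table and return (X, y)."""
    # Imported lazily so rows_to_xy works without the driver installed.
    from cassandra.cluster import Cluster
    from cassandra.query import dict_factory

    cluster = Cluster(list(hosts))
    session = cluster.connect(keyspace)
    session.row_factory = dict_factory  # rows come back as dicts
    try:
        return rows_to_xy(session.execute(f"SELECT * FROM {table}"))
    finally:
        cluster.shutdown()
```

With a reachable cluster, `X, y = load_from_cassandra()` would give plain Python lists that can be passed straight to a scikit-learn estimator.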

Tools:

  1. Database: Cassandra
  2. ML estimators: SGDClassifier, RandomForestClassifier and an SVM classifier; the best-performing estimator is selected
  3. Preprocessing and sampling: StandardScaler for feature scaling; imbalanced-learn (SMOTE, SMOTEENN) for resampling, since the data is highly imbalanced
  4. Serving: one model for the Kafka client and one model for the REST (Flask) interface
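One way the estimator comparison could look, as a sketch rather than the repository's actual training code. It assumes scikit-learn and imbalanced-learn are installed, uses `SVC` as the SVM classifier, and scores with F1 under 3-fold cross-validation; the function names, the scoring metric and the fold count are all assumptions. The imbalanced-learn `Pipeline` applies the resampler only while fitting, so validation folds keep their original class balance.

```python
from typing import Dict


def pick_best(scores: Dict[str, float]) -> str:
    """Return the name of the candidate with the highest score."""
    return max(scores, key=scores.get)


def compare_estimators(X, y, cv=3, random_state=0):
    """Cross-validate every sampler/estimator pair and return (best_name, scores)."""
    # Imported lazily so pick_best stays usable without the ML stack installed.
    from imblearn.combine import SMOTEENN
    from imblearn.over_sampling import SMOTE
    from imblearn.pipeline import Pipeline
    from sklearn.base import clone
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    samplers = {
        "smote": SMOTE(random_state=random_state),
        "smoteenn": SMOTEENN(random_state=random_state),
    }
    estimators = {
        "sgd": SGDClassifier(random_state=random_state),
        "random_forest": RandomForestClassifier(random_state=random_state),
        "svc": SVC(random_state=random_state),
    }

    scores = {}
    for s_name, sampler in samplers.items():
        for e_name, estimator in estimators.items():
            pipe = Pipeline([
                ("scale", StandardScaler()),
                ("resample", clone(sampler)),  # applied during fit only
                ("clf", clone(estimator)),
            ])
            fold_scores = cross_val_score(pipe, X, y, cv=cv, scoring="f1")
            scores[f"{s_name}+{e_name}"] = float(fold_scores.mean())
    return pick_best(scores), scores
```

F1 is used here because plain accuracy is uninformative when almost every transaction is legitimate; a precision-recall based metric would be another reasonable choice.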

Source of Data: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud?resource=download (Kaggle API command: kaggle datasets download -d mlg-ulb/creditcardfraud)

KAFKA MODEL:

  1. A simulated Kafka producer publishes transaction records, serialising the JSON contents of each new transaction
  2. A Kafka consumer subscribes to the topic ('credit-card-tx')
  3. Whenever the consumer receives a message, the transaction data is used to predict whether it is fraudulent
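The three steps above can be sketched as follows, assuming the `kafka-python` client and a broker on `localhost:9092`. Only the topic name comes from this README; the helper names, the broker address and the `predict` callable (anything that accepts a list of feature rows and returns one label per row, such as a fitted scikit-learn model's `predict`) are assumptions. Feature names follow the Kaggle CSV header.

```python
import json
from typing import Callable, Dict, Iterable, List

TOPIC = "credit-card-tx"  # topic name from the README
FEATURES = ["Time"] + [f"V{i}" for i in range(1, 29)] + ["Amount"]


def serialize_tx(tx: Dict[str, float]) -> bytes:
    """Encode a transaction dict as UTF-8 JSON bytes."""
    return json.dumps(tx).encode("utf-8")


def deserialize_tx(raw: bytes) -> Dict[str, float]:
    """Decode UTF-8 JSON bytes back into a transaction dict."""
    return json.loads(raw.decode("utf-8"))


def to_feature_vector(tx: Dict[str, float]) -> List[float]:
    """Order the transaction fields the way the model expects them."""
    return [float(tx[name]) for name in FEATURES]


def is_fraud(tx: Dict[str, float], predict: Callable) -> bool:
    """Run the model on a single transaction; label 1 means fraud."""
    return int(predict([to_feature_vector(tx)])[0]) == 1


def produce(transactions: Iterable[Dict[str, float]], bootstrap_servers="localhost:9092"):
    """Step 1: publish each transaction to the topic."""
    from kafka import KafkaProducer  # lazy import: needs kafka-python

    producer = KafkaProducer(bootstrap_servers=bootstrap_servers,
                             value_serializer=serialize_tx)
    for tx in transactions:
        producer.send(TOPIC, tx)
    producer.flush()


def consume(predict: Callable, bootstrap_servers="localhost:9092"):
    """Steps 2 and 3: subscribe to the topic and score each message."""
    from kafka import KafkaConsumer  # lazy import: needs kafka-python

    consumer = KafkaConsumer(TOPIC, bootstrap_servers=bootstrap_servers,
                             value_deserializer=deserialize_tx)
    for message in consumer:  # blocks, yielding records as they arrive
        label = "FRAUD" if is_fraud(message.value, predict) else "ok"
        print(message.offset, label)
```

Keeping serialisation and scoring in plain functions means they can be exercised without a running broker; only `produce` and `consume` touch Kafka.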

REST (flask) MODEL:

  1. The model's predict function sits behind a REST interface
  2. REST API calls pass the transaction features as parameters
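A rough sketch of such an interface, assuming Flask and a `predict` callable with the scikit-learn signature (a list of feature rows in, one label per row out). The `/predict` route, the use of query-string parameters and the response shape are hypothetical; the feature names follow the Kaggle CSV header.

```python
from typing import Callable, List, Mapping

FEATURES = ["Time"] + [f"V{i}" for i in range(1, 29)] + ["Amount"]


def parse_features(params: Mapping[str, str]) -> List[float]:
    """Turn request parameters into an ordered feature vector.

    Raises ValueError if a feature is missing or is not a number.
    """
    missing = [name for name in FEATURES if name not in params]
    if missing:
        raise ValueError(f"missing feature params: {', '.join(missing)}")
    return [float(params[name]) for name in FEATURES]


def create_app(predict: Callable):
    """Build a Flask app that exposes the model behind GET /predict."""
    from flask import Flask, jsonify, request  # lazy import: needs Flask

    app = Flask(__name__)

    @app.route("/predict", methods=["GET"])
    def predict_endpoint():
        try:
            features = parse_features(request.args)
        except ValueError as exc:
            return jsonify(error=str(exc)), 400
        label = int(predict([features])[0])
        return jsonify(fraud=bool(label))

    return app
```

With a fitted model in hand, `create_app(model.predict).run(port=5000)` would start a development server, and a request such as `GET /predict?Time=0&V1=...&Amount=12.5` (all 30 features supplied) would return a JSON body with a boolean `fraud` field.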

Code flow:

image
