# Gendered Abuse Detection in Indic Languages 🌐


Online gender-based violence limits marginalized voices. Detection in Indic languages is hard due to limited data and linguistic complexity. This work builds better classifiers for improved abuse detection in such settings.

## Table of Contents

- [Introduction](#introduction)
- [Problem Statement](#problem-statement)
- [Dataset](#dataset)
- [Models](#models)
- [Installation](#installation)
- [Usage](#usage)
- [Results](#results)
- [Contributing](#contributing)
- [License](#license)
- [Contact](#contact)
- [Releases](#releases)

## Introduction

Gender-based violence is a pervasive problem online, and the voices of those most affected are often silenced. Detecting abuse in Indic languages poses unique challenges: linguistic diversity and a shortage of labeled resources complicate the development of effective detection systems.

This repository tackles these challenges by building stronger classifiers, with a focus on improving the detection of gendered abuse in Indic languages.

## Problem Statement

Existing models for detecting gender-based violence often perform poorly on Indic languages. The primary reasons include:

- **Limited data:** Labeled datasets for training are scarce.
- **Linguistic complexity:** Indic languages have diverse structures and scripts, which standard NLP models handle poorly.

By addressing these issues, we hope to enhance the detection of gendered abuse and provide better support for marginalized voices.

## Dataset

We use various datasets that include text from social media, forums, and other platforms where abuse may occur. The datasets are curated to include instances of gender-based violence.

### Data Sources

- Social media platforms
- Online forums
- Community reports

### Data Preparation

Data preprocessing involves:

- Tokenization
- Normalization
- Removing noise

This step ensures that the models receive clean and relevant data for training.
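A minimal sketch of these steps, assuming regex-based noise removal (URLs, mentions) and simple whitespace tokenization; the actual pipeline may use a subword tokenizer instead:

```python
import re
import unicodedata

def preprocess(text):
    """Normalize, strip noise, and tokenize a raw post."""
    text = unicodedata.normalize("NFC", text)        # normalization (important for Indic scripts)
    text = re.sub(r"https?://\S+|@\w+", " ", text)   # remove noise: URLs and @-mentions
    return text.lower().split()                      # tokenization (whitespace)

print(preprocess("Check this @user https://example.com NOW"))
# → ['check', 'this', 'now']
```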

## Models

We explore several models to improve detection accuracy.

### BERT

BERT (Bidirectional Encoder Representations from Transformers) has shown promise in understanding context in language. We fine-tune BERT for our specific task, allowing it to learn nuances in Indic languages.
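The fine-tuning setup can be sketched with Hugging Face Transformers. The tiny randomly-initialised config below is purely for illustration; in practice one would load pretrained multilingual weights (e.g. `bert-base-multilingual-cased` — a stand-in here, as the repository's actual checkpoint may differ):

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Tiny random-weight BERT for illustration only. For real fine-tuning:
# model = BertForSequenceClassification.from_pretrained(
#     "bert-base-multilingual-cased", num_labels=2)
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64, num_labels=2)
model = BertForSequenceClassification(config)

input_ids = torch.randint(0, 100, (4, 16))  # batch of 4 token-id sequences
labels = torch.randint(0, 2, (4,))          # binary abuse labels

out = model(input_ids=input_ids, labels=labels)
out.loss.backward()                         # one fine-tuning step (gradients only)
print(out.logits.shape)                     # (4, 2): one score per class
```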

### Convolutional Neural Networks

CNNs are effective in capturing local patterns in text. We adapt CNNs to analyze sequences of words, which helps in identifying abusive language.
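A minimal text-CNN sketch in PyTorch illustrating this idea — parallel 1-D convolutions over word embeddings capture n-gram patterns. Hyperparameters here are hypothetical, not the repository's actual configuration:

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """1-D convolutions over word embeddings capture local n-gram patterns."""
    def __init__(self, vocab_size, embed_dim=50, num_filters=16,
                 kernel_sizes=(2, 3, 4), num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One conv per kernel size: bigram, trigram, 4-gram detectors
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, x):                      # x: (batch, seq_len) token ids
        e = self.embed(x).transpose(1, 2)      # (batch, embed_dim, seq_len)
        # Max-over-time pooling keeps the strongest match per filter
        pooled = [c(e).relu().max(dim=2).values for c in self.convs]
        return self.fc(torch.cat(pooled, dim=1))

model = TextCNN(vocab_size=1000)
logits = model(torch.randint(0, 1000, (8, 20)))
print(logits.shape)  # (8, 2)
```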

### GRU

Gated Recurrent Units (GRUs) are another option for sequence modeling. They help in understanding context over longer sequences, making them suitable for our needs.
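A minimal bidirectional-GRU classifier sketch in PyTorch (illustrative hyperparameters; the repository's actual architecture may differ). The final hidden states of the two directions summarise the whole sequence:

```python
import torch
import torch.nn as nn

class GRUClassifier(nn.Module):
    """Bidirectional GRU; final hidden states summarise the sequence."""
    def __init__(self, vocab_size, embed_dim=50, hidden=32, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden, batch_first=True,
                          bidirectional=True)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):                 # x: (batch, seq_len) token ids
        _, h = self.gru(self.embed(x))    # h: (2, batch, hidden) — fwd + bwd
        h = torch.cat([h[0], h[1]], dim=1)
        return self.fc(h)

model = GRUClassifier(vocab_size=1000)
logits = model(torch.randint(0, 1000, (8, 20)))
print(logits.shape)  # (8, 2)
```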

## Installation

To set up the project, follow these steps:

1. Clone the repository:

   ```bash
   git clone https://github.com/Tanlouie/Gendered_Abuse_Detection_In_Indic-Languages.git
   ```

2. Navigate to the project directory:

   ```bash
   cd Gendered_Abuse_Detection_In_Indic-Languages
   ```

3. Install the required packages:

   ```bash
   pip install -r requirements.txt
   ```

4. Ensure you have the necessary libraries:

   - PyTorch
   - Transformers
   - scikit-learn

## Usage

After installation, you can start using the models.

1. Load the model:

   ```python
   from model import load_model

   model = load_model('path_to_model')
   ```

2. Make predictions:

   ```python
   predictions = model.predict(input_text)
   ```

3. Evaluate the model:

   ```python
   from evaluator import evaluate

   results = evaluate(model, test_data)
   ```

## Results

We report results on a held-out validation set. The metrics include:

- **Accuracy:** the fraction of correct predictions.
- **Precision:** true positives divided by all positive predictions, TP / (TP + FP).
- **Recall:** true positives divided by all actual positives, TP / (TP + FN).

Our models show promising results, with improved accuracy in detecting gender-based violence in Indic languages.
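These metrics can be computed with scikit-learn. The labels below are made up purely to illustrate the formulas:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical labels for illustration: 1 = abusive, 0 = not abusive
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
# Here TP=3, TN=3, FP=1, FN=1

print(accuracy_score(y_true, y_pred))   # 0.75 = 6 correct / 8 total
print(precision_score(y_true, y_pred))  # 0.75 = 3 TP / (3 TP + 1 FP)
print(recall_score(y_true, y_pred))     # 0.75 = 3 TP / (3 TP + 1 FN)
```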

## Contributing

We welcome contributions to improve this project. If you have ideas or want to report issues, please follow these steps:

  1. Fork the repository.
  2. Create a new branch for your feature or bug fix.
  3. Make your changes and commit them.
  4. Push to your fork and submit a pull request.

## License

This project is licensed under the MIT License. See the LICENSE file for details.

## Contact

For questions or feedback, feel free to reach out.

## Releases

For the latest updates and versions, please visit the [Releases](https://github.com/Tanlouie/Gendered_Abuse_Detection_In_Indic-Languages/releases) section, where you can find packaged downloads.

By improving detection methods, we can help amplify marginalized voices and address gender-based violence more effectively. Your support and contributions are invaluable in this mission.
