This research investigates the presence of gender bias in Large Language Models (LLMs). We assess bias through various experiments and propose strategies to mitigate gender stereotypes in generated text. The goal is to identify and analyze bias patterns and suggest methods for making LLM outputs fairer.
- Data Collection: A set of gender-neutral and gender-specific prompts was curated to assess model responses.
- Bias Evaluation: Metrics for explicit and implicit bias were used to analyze the text generated by the LLMs.
- Experiments: Multiple LLMs were evaluated, comparing their responses and bias levels.
- Analysis: Results were categorized into bias types (e.g., occupation, personality traits) and compared across models.
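As an illustration of what an explicit bias metric might look like, the sketch below scores generated text by the balance of male and female terms and averages the score per bias category. This is not the metric used in the notebook: the word lists, `gender_lean`, and `mean_lean_by_category` are hypothetical names introduced here, and a real evaluation would rely on a vetted lexicon.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately small word lists (assumption, not from the notebook).
MALE_TERMS = {"he", "him", "his", "himself", "man", "men"}
FEMALE_TERMS = {"she", "her", "hers", "herself", "woman", "women"}


def gender_lean(text):
    """Score in [-1, 1]: +1 means only male terms, -1 only female terms,
    0 means balanced or no gendered terms at all."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    male = sum(counts[t] for t in MALE_TERMS)
    female = sum(counts[t] for t in FEMALE_TERMS)
    total = male + female
    return 0.0 if total == 0 else (male - female) / total


def mean_lean_by_category(records):
    """records: iterable of (category, generated_text) pairs.
    Returns {category: mean gender_lean over that category's texts}."""
    scores = defaultdict(list)
    for category, text in records:
        scores[category].append(gender_lean(text))
    return {cat: sum(vals) / len(vals) for cat, vals in scores.items()}
```

A score near zero for a gender-neutral prompt suggests balanced output under this simple measure. It captures only explicit term usage; implicit bias needs other metrics.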
- Consistent gender bias was identified across various LLMs.
- Bias tends to be more pronounced in areas such as occupational roles and personality traits.
- Larger models, trained on broader datasets, tend to exhibit more subtle biases.
- Python 3.8 or later
- Jupyter Notebook
- Required Python packages (install via requirements.txt)
- Clone the repository:
    git clone https://github.com/your-repo/LLM-gender-bias.git
    cd LLM-gender-bias
- Install dependencies:
    pip install -r requirements.txt
- Run the Jupyter Notebook: Open the notebook file in Jupyter Lab or Notebook environment:
    jupyter notebook Coling_LLM_gender_bias.ipynb
- Data: The dataset used for evaluation is included in the repository under the data/ folder.
- Step 1: Load the evaluation datasets in the notebook.
- Step 2: Run the bias evaluation scripts. The notebook includes all the necessary code to conduct experiments.
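For readers who want to inspect the data outside the notebook, a loading step could look roughly like the sketch below. It assumes the files under data/ are CSVs with a header row, which this README does not confirm; `load_prompt_files` is a hypothetical helper, not a function from the repository.

```python
import csv
from pathlib import Path


def load_prompt_files(data_dir="data"):
    """Read every CSV file directly under data_dir.

    Assumption: the datasets are CSV files with a header row.
    Returns {file name without extension: list of row dicts}.
    """
    datasets = {}
    for path in sorted(Path(data_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            datasets[path.stem] = list(csv.DictReader(handle))
    return datasets
```

If the repository stores its data in another format, the notebook's own loading cells are the authoritative reference.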
Results of the experiments will be displayed in the notebook after execution. These include bias scores for different text generation scenarios and comparative charts across multiple models.
Our findings indicate that gender bias is a significant issue in LLMs, most visibly in occupational roles and personality traits. They highlight the need for more inclusive training data and bias mitigation techniques in AI models.
We suggest exploring advanced fine-tuning methods, adversarial training, and dataset refinement to reduce bias in LLMs.
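One common form of dataset refinement is counterfactual data augmentation: each training sentence is paired with a copy in which gendered terms are swapped. The sketch below is a minimal illustration of that idea, not a method implemented in this repository. The swap table and the names `swap_gender` and `augment` are hypothetical.

```python
import re

# Hypothetical, deliberately small swap table (assumption for illustration).
# "her" is ambiguous (him/his); mapping it to "his" is a simplification that
# a real pipeline would resolve with part-of-speech tagging.
SWAPS = {
    "he": "she", "she": "he",
    "him": "her", "his": "her", "her": "his",
    "himself": "herself", "herself": "himself",
    "man": "woman", "woman": "man",
    "men": "women", "women": "men",
}

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in SWAPS) + r")\b",
    re.IGNORECASE,
)


def swap_gender(text):
    """Replace each gendered term with its counterpart, keeping a leading capital."""
    def replace(match):
        word = match.group(0)
        swapped = SWAPS[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped
    return _PATTERN.sub(replace, text)


def augment(corpus):
    """Return the corpus plus a gender-swapped copy of every sentence that changes."""
    augmented = []
    for sentence in corpus:
        augmented.append(sentence)
        counterfactual = swap_gender(sentence)
        if counterfactual != sentence:
            augmented.append(counterfactual)
    return augmented
```

Sentences without gendered terms pass through unchanged, so the augmented corpus grows only where a counterfactual actually differs.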
For questions, contributions, or collaborations, please reach out to Tetiana Bas at tetiana@uni.minerva.edu.