List of resources and tools developed with a focus on Portuguese.
Updated Mar 20, 2024
End-to-end Python implementation of Muço’s (2025) corruption measurement framework. Combines an NLP pipeline (regex extraction, Porter stemming, TF-IDF), PCA-based dimensionality reduction, and fixed-effects OLS to quantify institutional quality from Brazilian audit reports. Includes supervised-learning robustness checks and leave-one-out (LOO) sensitivity analysis.
Reproducibility materials for the article "LexIris-pt and LexBert-pt: Specialized Sentence Embeddings for Legal Similarity in Brazilian Portuguese".
This repository contains the official Python code and resources for the research paper: "Portuguese Automated Fact-checking: Information Retrieval with Claim extraction".
Honest benchmark: does DSPy beat a hand-written prompt, and at what cost? Manual vs. BootstrapFewShot vs. MIPROv2 on 3 real PT-BR tasks, reporting accuracy gains and USD cost.
Portuguese split from MQA
Comparative study of 23 LLMs for Brazilian Portuguese sentiment analysis via in-context learning. Evaluates multilingual vs Portuguese-specialized models across 12 datasets. Code and data included.
Reproducible experiments on PT-BR fake news detection, shortcut learning, robustness, XAI and cross-dataset generalization.
Reproducibility materials for the article "Classification of the Conciliation Profile in Initial Petitions in the Brazilian Judiciary".
CurupiraIA: A Brazilian Portuguese hate speech detection model using BERT fine-tuning, inspired by folklore guardianship principles to protect digital communities from toxic content.