KRC00112/LegalLens

LegalLens

About LegalLens

LegalLens is an Android application that automates the summarization of legislative bills using Natural Language Processing (NLP). The project compared four abstractive transformer-based models (Pegasus, ProphetNet, T5, and Longformer) with an extractive technique based on K-means clustering, in order to identify the best-performing model for integration into the mobile app. We used the BillSum dataset. Because the full corpus is large and our resources were limited, we worked only with its California bills: 1,237 bills with reference summaries, of which 791 were allocated to training, 198 to validation, and 248 to testing.
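The split sizes above (791 / 198 / 248) happen to match two successive 80/20 holdouts in which the held-out count is rounded up. That is an observation about the arithmetic, not a confirmed description of how the data was actually split. A minimal sketch under that assumption follows; `holdout_split`, the seed, and the integer bill IDs are hypothetical and used only for illustration.

```python
import math
import random


def holdout_split(items, holdout_fraction, seed=0):
    """Shuffle items and hold out ceil(len * holdout_fraction) of them.

    Returns (kept, held_out). Hypothetical helper, not taken from the project.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_holdout = math.ceil(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


# Stand-in IDs for the 1,237 California bills.
bill_ids = range(1237)

# First holdout: 20% of all bills become the test set.
train_val_ids, test_ids = holdout_split(bill_ids, 0.2)

# Second holdout: 20% of the remaining bills become the validation set.
train_ids, val_ids = holdout_split(train_val_ids, 0.2)

print(len(train_ids), len(val_ids), len(test_ids))  # 791 198 248
```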

Model Results

The transformer-based models were each trained for 10 epochs and evaluated with ROUGE. Their scores are shown in the following table:

Model                                ROUGE-1  ROUGE-2  ROUGE-L  ROUGE-Lsum
google/pegasus-cnn_dailymail         0.48     0.24     0.33     0.33
google-t5/t5-small                   0.15     0.08     0.13     0.13
allenai/led-base-16384               0.14     0.08     0.12     0.13
microsoft/prophetnet-large-uncased   0.50     0.23     0.31     0.31
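For readers unfamiliar with the metric, ROUGE-N measures n-gram overlap between a generated summary and its reference. The sketch below is a simplified ROUGE-N F1 using whitespace tokenization and no stemming. It is meant only to illustrate the idea, is not the evaluation code used for the table, and will not reproduce the reported numbers exactly. The function names and the example sentences are made up for illustration.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1: lowercase, whitespace tokens, no stemming."""
    ref = ngram_counts(reference.lower().split(), n)
    cand = ngram_counts(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "the bill amends the education code"
candidate = "the bill changes the education code"

print(round(rouge_n_f1(reference, candidate, n=1), 4))  # 0.8333
print(round(rouge_n_f1(reference, candidate, n=2), 4))  # 0.6
```

In the example, five of six unigrams and three of five bigrams are shared, which is why the ROUGE-2 value is lower than ROUGE-1, the same pattern visible in every row of the table.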

Additionally, the K-means extractive technique achieved the following ROUGE scores:

  • ROUGE-1: 0.3505
  • ROUGE-2: 0.0909
  • ROUGE-L: 0.2680

Pegasus was the strongest model overall: it led on ROUGE-2 and on both ROUGE-L variants, and its ROUGE-1 score (0.48) was second only to ProphetNet's (0.50). On that basis, Pegasus was selected for integration into LegalLens. With Pegasus, LegalLens gives legal professionals, students, and researchers an accessible tool for quickly generating summaries of lengthy legal documents.

User Interface

Authorization (Login and Registration)


Login Image                  Register Image

Main Screen and Summarization History Screen


Main Screen Image                  history Image

Settings


settings Image

Admin Interface

Authorization (Login and Registration)


Login Image                  Register Image

Note: Out of curiosity, we authenticate employees by employee ID rather than by username. Regular users log in with their email address, whereas each employee logs in with a unique employee ID. An employee ID must match the regex "LL20(2[4-9]|[3-9][0-9])(DEV|MAN)\d{4}"; employees whose ID does not match cannot log in. The regex ensures the following:

  • The ID starts with "LL20".
  • Followed by a number between 24 and 99.
  • Followed by either "DEV" or "MAN".
  • Followed by exactly 4 digits.
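The rules above can be checked with a few lines of code. The sketch below uses Python for illustration only; the app itself is an Android application, and its actual validation code is not shown here. The function name `is_valid_employee_id` and the sample IDs are hypothetical. The sketch assumes the entire ID must match the pattern, hence `fullmatch`, and uses `re.ASCII` so that `\d` accepts only the digits 0-9.

```python
import re

# Pattern copied from the note above; re.ASCII restricts \d to 0-9.
EMPLOYEE_ID_PATTERN = re.compile(
    r"LL20(2[4-9]|[3-9][0-9])(DEV|MAN)\d{4}", re.ASCII
)


def is_valid_employee_id(employee_id: str) -> bool:
    """Return True only if the whole string matches the employee ID pattern."""
    return EMPLOYEE_ID_PATTERN.fullmatch(employee_id) is not None


# Made-up sample IDs covering each rule.
print(is_valid_employee_id("LL2024DEV0001"))   # True
print(is_valid_employee_id("LL2099MAN1234"))   # True
print(is_valid_employee_id("LL2023DEV0001"))   # False: 23 is below 24
print(is_valid_employee_id("LL2024ADM0001"))   # False: role is not DEV or MAN
print(is_valid_employee_id("LL2024DEV001"))    # False: only 3 trailing digits
print(is_valid_employee_id("LL2024DEV00011"))  # False: 5 trailing digits
```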

Users' List & Details Screens

listScreen Image                  details Image

