Create a folder called .model in the same folder as this file and place the appropriate finetuned GPT-2 model (see the Models section)
inside it (.model/GPT2-rap-recommended/config.json, pytorch...). The model is available here.
You also need hardware capable of running GPT-2.
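Once the checkpoint sits under .model/, it can be loaded with the transformers library. The sketch below is illustrative, not the project's actual generation script: the helper names and sampling parameters are assumptions, only the directory layout comes from the instructions above.

```python
from pathlib import Path

# Directory layout described in the setup instructions above.
MODEL_DIR = Path(".model/GPT2-rap-recommended")

def model_files_present(model_dir: Path) -> bool:
    """Check that the finetuned checkpoint is where the loader expects it."""
    return (model_dir / "config.json").is_file()

def generate_rap(prompt: str, max_new_tokens: int = 100) -> str:
    """Sample a continuation from the finetuned GPT-2 (sketch)."""
    # Deferred import so the file can be read without transformers installed.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer
    tokenizer = GPT2Tokenizer.from_pretrained(str(MODEL_DIR))
    model = GPT2LMHeadModel.from_pretrained(str(MODEL_DIR))
    inputs = tokenizer(prompt, return_tensors="pt")
    # Sampling usually gives livelier lyrics than greedy decoding.
    output = model.generate(**inputs, max_new_tokens=max_new_tokens,
                            do_sample=True, top_p=0.95)
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    if model_files_present(MODEL_DIR):
        print(generate_rap("Started from the bottom"))
    else:
        print("Place the finetuned model under .model/ first (see above).")
```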
Data Documentation
We gathered raps from genius.com, ohhla.com and battlerap.com. For genius.com we used the official API
(see the GeniusLyrics and GetRankings repos), while battlerap.com and ohhla.com were scraped with purpose-built scrapy spiders.
In total we gathered ~70k raps, which we used for finetuning. GPT-2 was finetuned on all lyrics concatenated into one large text, while T5 was finetuned
on prompts of the form KEYWORDS: <keywords> RAP-LYRICS: <rap text>, which proved insufficient for our task.
We eventually chose the finetuned GPT-2 model. Both the experimental and the final scripts can be found in ./preprocessing/finetunging.
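The T5 prompt format described above can be reconstructed as a small serialization helper; the function name and keyword separator are assumptions for illustration.

```python
def build_t5_prompt(keywords, rap_text):
    """Serialize one training example in the KEYWORDS/RAP-LYRICS prompt format."""
    return f"KEYWORDS: {', '.join(keywords)} RAP-LYRICS: {rap_text}"

# Example:
prompt = build_t5_prompt(["hustle", "city"], "I was out here grinding in the city")
```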
Additionally, a RoBERTa model was finetuned on data from the English Wikipedia, hate-speech tweets, the CNN/DailyMail dataset,
and 4k rap lyrics (available under Data) to classify the quality of the generated raps.
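Such a quality classifier can be used to filter or rank generations. A minimal sketch, assuming a RoBERTa sequence-classification checkpoint at a hypothetical path and that label index 1 means "good rap" (both assumptions, not documented by this project):

```python
def score_rap(text: str) -> float:
    """Probability that `text` is a good rap, per the quality classifier.

    The checkpoint path and label index below are assumptions for illustration.
    """
    import torch
    from transformers import RobertaForSequenceClassification, RobertaTokenizer
    path = ".model/roberta-rap-quality"  # hypothetical location
    tokenizer = RobertaTokenizer.from_pretrained(path)
    model = RobertaForSequenceClassification.from_pretrained(path)
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()  # class 1 = "good"

def pick_best(raps, scorer=score_rap):
    """Of several candidate generations, keep the one the scorer rates highest."""
    return max(raps, key=scorer)
```

In practice one would generate several candidates with GPT-2 and call pick_best on them to surface the strongest verse.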