I worked on this project by reading and studying papers on GPT so I could better understand how to implement it myself. I came to realize that creating something like ChatGPT isn't as easy as it looks, but it is possible to build an application around one of its main components: language modeling.
How to navigate this project
This is a pretty easy project to follow, since most of the Python programming is done in a single file. langModel.py is where the basic structure of a GPT model is laid out, and train.py is where the model is actually trained to formulate responses based on the input data it reads.
Why I built the project this way
I built the project around the execution of two files: langModel.py holds the basic foundations of the language model, and train.py is where that structure is actually used to train the model to generate dialogue following the syntax of the two input files.
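To make the idea of language modeling concrete, here is a minimal sketch of a character-level bigram model in plain Python. This is not the actual code from langModel.py or train.py; the function names and corpus are hypothetical, and it only illustrates the core task both files work toward: predicting the next character from the current one, then sampling from those predictions to generate text.

```python
import random
from collections import defaultdict

def train_bigram(text):
    """Count how often each character follows each other character.

    These counts are the entire 'model': a table of transition
    frequencies, which is the simplest possible language model.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def generate(counts, start, length, seed=0):
    """Sample a sequence by repeatedly drawing the next character
    in proportion to how often it followed the previous one."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        nxt = counts.get(out[-1])
        if not nxt:
            break  # no continuation seen in training data
        chars = list(nxt)
        weights = [nxt[c] for c in chars]
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

# Tiny hypothetical corpus standing in for the project's input files.
corpus = "hello world, hello there"
model = train_bigram(corpus)
sample = generate(model, "h", 10)
print(sample)
```

A real GPT replaces the count table with a Transformer that conditions on a long context rather than a single character, but the train-then-sample loop is the same shape.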
If I had more time, I would change this
There is a way to optimize this program so that its results read more naturally, with a better command of the English language. However, that would require a lot more time and a deeper understanding of the different components of the Transformer model architecture presented in the paper "Attention Is All You Need".
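For reference, the central component from that paper is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d)) V. The sketch below is a hypothetical pure-Python illustration of that formula (not part of this project's code), using nested lists in place of tensors:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d)) V.

    Q, K, V are lists of row vectors. Each query attends over all
    keys; the resulting weights mix the value vectors.
    """
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out
```

In the full Transformer this runs in parallel across multiple heads and is stacked with feed-forward layers, which is exactly the extra machinery a more sophisticated version of this project would need.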
Papers that helped me design and understand GPT with more clarity
miniGPT is an application that takes some building blocks from the Generative Pre-trained Transformer, such as a simple language model, and uses them to create a thread of dialogue in English based on the input it receives.