How I worked on this project

I started this project by reading papers on GPT so I could better understand how to implement one myself. I learned that building something like ChatGPT is harder than it looks, but that it is possible to build an application around one of its core components: language modeling.

How to navigate this project

This project is easy to follow because the code lives in just two Python files. langModel.py lays out the basic structure of a GPT model, and train.py trains that model to generate responses based on the input data it reads.
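To illustrate the split between a model file and a training file, here is a minimal sketch. It is not the code in this repository: it swaps the GPT model for a much simpler character-level bigram model, and the names `BigramModel`, `train`, `next_char_probs`, and `generate` are hypothetical, chosen only for this example.

```python
import random
from collections import defaultdict


class BigramModel:
    """Hypothetical stand-in for the kind of model a file like langModel.py defines."""

    def __init__(self):
        # counts[a][b] = number of times character b followed character a
        self.counts = defaultdict(lambda: defaultdict(int))

    def next_char_probs(self, ch):
        """Return the estimated probability of each character following ch."""
        followers = self.counts[ch]
        total = sum(followers.values())
        if total == 0:
            return {}
        return {c: n / total for c, n in followers.items()}

    def generate(self, start, length, seed=0):
        """Sample up to `length` characters, one at a time, after `start`."""
        rng = random.Random(seed)
        out = start
        for _ in range(length):
            probs = self.next_char_probs(out[-1])
            if not probs:
                break  # the last character was never seen with a follower
            chars, weights = zip(*probs.items())
            out += rng.choices(chars, weights=weights)[0]
        return out


def train(model, text):
    """Hypothetical stand-in for a training script: fit the model to input text."""
    for a, b in zip(text, text[1:]):
        model.counts[a][b] += 1
    return model


model = train(BigramModel(), "hello hello help")
print(model.next_char_probs("l"))
print(model.generate("h", 10))
```

A real GPT replaces the count table with a neural network and the counting loop with gradient descent, but the division of labor is the same: one piece defines how the next token is predicted, the other fits it to data.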

Why I built the project this way

I split the project into two files: langModel.py holds the basic foundations of the language model, and train.py uses that model, training it to produce dialogue that follows the syntax of the two input files.

If I had more time I would change this

The program could be optimized to produce more polished output that shows a more sophisticated command of the English language. However, that would require much more time and a deeper understanding of the components of the Transformer, the model architecture introduced in the paper "Attention Is All You Need".
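For reference, the core component of the Transformer is scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. The following is a minimal pure-Python sketch of that formula, not code from this project; the function names `softmax` and `attention` and the toy inputs are assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention for lists of row vectors.

    Q: queries (n_q x d_k), K: keys (n_k x d_k), V: values (n_k x d_v).
    """
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k)
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors
        out.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )
    return out


# Toy example: two identical keys receive equal weight,
# so the output is the average of the two value vectors.
result = attention([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]], [[1.0, 2.0], [3.0, 4.0]])
print(result)
```

A full Transformer adds learned projections for Q, K, and V, multiple attention heads, causal masking, and feed-forward layers on top of this operation.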

Papers that helped me design and understand GPT with more clarity