input1.txt
1 7
Large language models learn from text.
Tokenization reduces repetition and improves efficiency.
Subword merges are a simple idea with powerful impact.
We study small examples to understand the big picture.
This assignment focuses on basic arrays and careful iteration.
Please test your code with redirected input and compare outputs.
Finally, report what you learned about compression tradeoffs.
3
32 116 128
101 110 129
97 108 130
76 97 114 103 101 128 109 111 100 101 108 115 32 108 101 97 114 110 32 102 114 111 109 32 116 101 120 116 46 32 84 111 107 101 110 105 122 97 116 105 111 110 32 114 101 100 117 99 101 115 32 114 101 112 101 116 105 116 105 111 110 32 97 110 100 32 105 109 112 114 111 118 101 115 32 101 102 102 105 99 105 101 110 99 121 46