Basic Usage:

  • Concatenate your corpus into a single file with cat (e.g., copcorp.txt).
  • Edit the first line of freq.sh so that it contains the Unicode characters you need (unicode-table.com is a good reference for this).
  • Run freq.sh. This takes a while and uses a lot of CPU.
  • Check for errors along the way. Because the run takes so long, freq.sh writes a file at each step so you can inspect the intermediate results.
  • Run parser.pl on the last output of freq.sh. This normalizes the frequency numbers to get them ready for ASK.
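
To make the idea behind the counting and normalizing steps concrete, here is a minimal Python sketch. It is not the code in freq.sh or parser.pl. It assumes that a word is any run of characters from the chosen alphabet and that normalizing means scaling counts so the most frequent word receives a fixed maximum. The function names count_words and normalize and the max_value default of 255 are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter


def count_words(text, alphabet):
    """Count words, where a word is a run of characters from `alphabet`.

    Assumption: this mirrors the role of the character list on the first
    line of freq.sh; the real script may tokenize differently.
    """
    pattern = re.compile("[" + re.escape(alphabet) + "]+")
    return Counter(pattern.findall(text))


def normalize(counts, max_value=255):
    """Scale raw counts so the most frequent word maps to `max_value`.

    Every word keeps a frequency of at least 1. The target range is an
    assumption, not something the original scripts document.
    """
    if not counts:
        return {}
    top = max(counts.values())
    return {word: max(1, round(n * max_value / top)) for word, n in counts.items()}


if __name__ == "__main__":
    # Tiny stand-in corpus; "xx" is ignored because "x" is not in the alphabet.
    raw = count_words("ab ba ab xx ab", alphabet="ab")
    print(normalize(raw))  # {'ab': 255, 'ba': 85}
```

For a real corpus you would read the concatenated file (copcorp.txt in the example above) and pass its contents as text.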

parser.pl is taken from this repository, which is licensed under Apache 2.0.