After splitting the text, a typical next step is to clean (aka normalize) it before indexing or vectorization. Here is the class that I wrote. It might be an interesting new feature to add to baran.
https://github.com/MadBomber/lib_ruby/blob/master/text_cleaner.rb
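To give a rough idea of the kind of cleaning I mean, here is a minimal sketch. It is not the code from the linked file: the `ChunkNormalizer` module and its `normalize` method are hypothetical names made up for this illustration, and the chunks are assumed to be plain strings.

```ruby
# Hypothetical helper for illustration only; not the TextCleaner class linked above.
module ChunkNormalizer
  module_function

  # Normalize one chunk of text before indexing or vectorization.
  def normalize(text)
    text
      .unicode_normalize(:nfkc)            # fold Unicode compatibility forms
      .downcase                            # case-fold
      .gsub(/[^[:alnum:][:space:]]/, " ")  # replace punctuation with spaces
      .gsub(/\s+/, " ")                    # collapse runs of whitespace
      .strip                               # trim leading/trailing whitespace
  end
end

# Assumed input: chunks already produced by a text splitter, as plain strings.
chunks  = ["  Hello,   World!\n", "Ruby's  TEXT\tsplitter. "]
cleaned = chunks.map { |chunk| ChunkNormalizer.normalize(chunk) }
```

Which steps are appropriate (lowercasing, dropping punctuation, and so on) depends on the indexer or embedding model downstream, so each one should probably be optional.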
I think the TextCleaner class is too small to be a stand-alone gem. It might fit in well with your project.
Dewayne
o-*