text encoding option for Iso2709 files #24
VeniVidiVici wants to merge 9 commits into fredericd:master from
Hi, Thanks for your feedback. Your code looks good. What is your use case? Could you document how you use it? I'm reluctant to extend this library to deal with non-utf8 ISO2709 files. Those kinds of files belong to the past... and could be handled and transformed into utf8 with other tools.
Hi, The data we are importing does not belong to us, but I will ask a friendly contact and one of the libraries if they can supply me with a few records to share (most imports are more than a GB in size). In the past I tried to transform the data between reading the file and passing it to marcjs but never got it working. This was more than a year ago, so I don't remember the specifics. Maybe with the recent updates this could now be possible again.
Do you have a sample file? I'm currently working on the module and may add your code. |
Update Marcxml parser library to ignore self closing XML tags
Hi,
When parsing Iso2709 records, toString is called with utf8 hard-coded. We are sometimes supplied files with strings in latin1.
I have been unable to find a way to reliably detect the string encoding when parsing, so this patch makes the encoding an optional parameter when creating the stream. There may be a better way to handle this, but this seems to be working fine.
Chris
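
For readers following along, here is a minimal sketch of the idea described above: the encoding passed to `Buffer.toString` becomes a caller-supplied option that falls back to utf8. The helper `decodeField` and its `options.encoding` parameter are hypothetical names chosen for illustration; they are not taken from the patch or from the marcjs API.

```javascript
// Hypothetical helper (not the marcjs API): decode a slice of a record buffer
// with a caller-supplied encoding, falling back to utf8 when none is given.
function decodeField(buffer, start, end, options = {}) {
  const encoding = options.encoding || 'utf8';
  return buffer.toString(encoding, start, end);
}

// "Café" encoded as latin1: the é is the single byte 0xE9.
const latin1Bytes = Buffer.from([0x43, 0x61, 0x66, 0xe9]);

// With the default, 0xE9 is not valid utf8 and becomes U+FFFD.
const asUtf8 = decodeField(latin1Bytes, 0, latin1Bytes.length);

// With the option set, the byte is decoded as intended.
const asLatin1 = decodeField(latin1Bytes, 0, latin1Bytes.length, { encoding: 'latin1' });

console.log(asUtf8);
console.log(asLatin1);
```

Decoding the same bytes with the default shows the symptom this PR addresses: the accented character is replaced rather than decoded. Passing the encoding explicitly sidesteps detection, which matches the approach described in the comment above.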