…ount is mismatched
|
Thank you for the contribution! I believe the problem is a bit more subtle, though. Until now the implementation has been naive, in the sense that I didn't try to follow any standard. If we are to make this extension, then we need to choose a spec and follow it, e.g. https://csv-spec.org/. On the implementation level, there are things I am currently not happy with and that I shall address. |
|
Sure. Should we implement a flag to ignore errors, or offer an option to skip them? I was dealing with a file of 249 million lines and was splitting it into files of 500,000 lines each. I picked your tool because it includes the headers in every split file. But when the program exits on an error without any useful info and with no reasonable way to resume, recovering is painful: it took some math, then dd with skipped bytes, then sed to append the header, and then cut to take the first column, build an index, and figure out which line was broken. so maybe
I will check out the spec. |
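To make the "skip errors and say where they are" idea concrete, here is a minimal sketch in C. The function name report_bad_rows and the message format are hypothetical and not part of the tool. It assumes the header row defines the expected column count, and it deliberately ignores quoted fields to stay short. Instead of exiting on the first bad row, it prints the line number and the byte offset where that row starts, then keeps going.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of the tool: scan a CSV stream, take the
 * header's column count as the expected width, and report every row whose
 * width differs instead of aborting. Returns the number of bad rows.
 * Assumption: commas are counted naively, quoted fields are not handled. */
size_t report_bad_rows(FILE *in, FILE *err)
{
    assert(in != NULL && err != NULL);

    size_t expected = 0, columns = 1, bad = 0;
    unsigned long line_no = 1;                      /* 1-based line number */
    unsigned long long offset = 0, line_start = 0;  /* byte positions */

    for (;;) {
        int c = fgetc(in);
        if (c == ',') {
            ++columns;
        }
        /* A row ends at '\n', or at EOF if the last row is unterminated. */
        if (c == '\n' || (c == EOF && offset > line_start)) {
            if (line_no == 1) {
                expected = columns;
            } else if (columns != expected) {
                fprintf(err,
                        "line %lu (byte offset %llu): "
                        "expected %zu columns, found %zu\n",
                        line_no, line_start, expected, columns);
                ++bad;
            }
            ++line_no;
            columns = 1;
            line_start = offset + 1;
        }
        if (c == EOF) {
            break;
        }
        ++offset;
    }
    return bad;
}
```

The reported byte offset is the start of the offending row, so it could be passed to something like dd or fseek to resume from that point rather than working it out by hand.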
|
It looks like, if we want to stick with the spec, we are better off integrating with https://github.com/rgamble/libcsv. What do you think? |
|
Sorry for ghosting you. I think this tool wants to be standalone, for learning purposes; of course you can fork and do whatever you want with it, no hard feelings. However, options 2 and 3 seem quite sensible and possibly the easiest to implement? |
The column count verifier was not treating an enclosed (quoted) string column as a single column, so it was splitting columns on any ',' inside the string. I did a quick fix for it; I hope it's good enough. There were also some compiler warnings in flags.c about pointers being compared to non-pointer types, etc.
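For readers following along, the kind of fix described above can be sketched as follows. This is not the code from the PR. The function name count_columns is hypothetical, and the sketch assumes double quotes as the enclosing character, a comma as the delimiter, and one record per line (a quoted field that contains a newline is not handled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Count the fields in one CSV record, treating commas inside
 * double-quoted fields as data rather than delimiters.
 * An escaped quote ("") inside a quoted field toggles the state twice,
 * so the scanner stays inside the field, which is what we want. */
size_t count_columns(const char *line)
{
    assert(line != NULL);

    size_t columns = 1;      /* an empty line still has one (empty) field */
    bool in_quotes = false;

    for (const char *p = line; *p != '\0'; ++p) {
        if (*p == '"') {
            in_quotes = !in_quotes;
        } else if (*p == ',' && !in_quotes) {
            ++columns;
        }
    }
    return columns;
}
```

With this, a record such as a,"b,c",d counts as three columns instead of four. A parser that follows a spec strictly would also need to reject stray quotes in unquoted fields, which this sketch does not attempt.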