Cannot read Twix headers with \r or \r\n line endings #50

@drewclements

When parsing Siemens Twix files, read_twix_hdr.py assumes that all line endings are Unix-style (\n). However, some scanners or environments generate headers with bare carriage returns (\r) or Windows-style CRLF (\r\n). This causes parsing failures in functions such as parse_ascconv and parse_buffer, because the regexes that expect \n no longer match and the split('\n') calls do not split the header into clean lines.

To help a site I support, I created this workaround in read_twix_hdr.py for them:

  1. Normalize line endings immediately after decoding the header buffer:
    buffer = buffer.decode('latin-1', errors='ignore')
    buffer = buffer.replace('\r\n', '\n').replace('\r', '\n')
  2. And adjust regexes to allow any line ending:
    - vararray = re.finditer(r'(?P<name>\S*)\s*=\s*(?P<value>\S*)\n', buffer)
    + vararray = re.finditer(r'(?P<name>\S*)\s*=\s*(?P<value>\S*)(?:\r\n|\n|\r)', buffer)
    ...
    - reASCCONV = re.compile(r'### ASCCONV BEGIN[^\n]*\n(.*)\s### ASCCONV END ###', re.DOTALL)
    + reASCCONV = re.compile(r'### ASCCONV BEGIN[^\r\n]*[\r\n](.*)\s### ASCCONV END ###', re.DOTALL)
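A minimal, self-contained sketch of step 1 may help show why normalizing early is probably sufficient on its own: once the decoded buffer contains only \n, the original \n-only pattern matches again. The helper names (normalize_line_endings, parse_vars) and the sample header bytes are hypothetical and are not part of read_twix_hdr.py; only the regex and the decode/replace calls are taken from the snippets above.

```python
import re

# Original pattern from the issue, which expects a Unix-style "\n" terminator.
VAR_RE = re.compile(r'(?P<name>\S*)\s*=\s*(?P<value>\S*)\n')


def normalize_line_endings(text: str) -> str:
    """Convert CRLF and bare CR to LF. CRLF must be replaced first,
    otherwise each CRLF would turn into two newlines."""
    return text.replace('\r\n', '\n').replace('\r', '\n')


def parse_vars(raw: bytes, normalize: bool = True) -> dict:
    """Hypothetical stand-in for the name/value parsing step.
    Decodes the header bytes as in the workaround, optionally
    normalizes line endings, then collects name/value pairs."""
    text = raw.decode('latin-1', errors='ignore')
    if normalize:
        text = normalize_line_endings(text)
    return {m.group('name'): m.group('value') for m in VAR_RE.finditer(text)}


# Made-up header fragments using the three line-ending styles.
unix_hdr = b"a = 1\nb = 2\n"
crlf_hdr = b"a = 1\r\nb = 2\r\n"
cr_hdr = b"a = 1\rb = 2\r"

for hdr in (unix_hdr, crlf_hdr, cr_hdr):
    print(parse_vars(hdr))
```

With normalization in place, all three fragments should parse to the same dictionary. Passing normalize=False shows the failure mode described above for the CRLF and CR fragments.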

I would appreciate it if this fix were added to main, to simplify setup for sites that experience this issue. The site has not been able to determine why their Twix files use these line endings. Thanks.
