Due to a broken check in the codegolf function, non-Latin-1 characters (all those above U+00FF) at odd-numbered positions in the input string have their code point silently truncated to its low 8 bits, instead of raising an error so that the user can be notified.
To Reproduce
- Enter 'ज़' into the input box, which has a non-Latin-1 character at index 1.
- Click on "Golf it" button
- Observe the printed output: exec(bytes('嬧‧','u16')[2:])
- Verify that bytes('嬧‧','u16')[2:] evaluates to b"'[' ", which does not match the input code.
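The verification step can be run directly in a Python interpreter. This is a minimal sketch, assuming the input character is the precomposed code point U+095B and that the platform is little-endian (so the 'u16' codec emits a 2-byte BOM followed by little-endian code units). The variable names are illustrative only.

```python
# The two characters from the golfed output, written as escapes:
# U+5B27 and U+2027.
packed = '\u5b27\u2027'

# Same expression as in the golfed output: encode as UTF-16, drop the BOM.
decoded = bytes(packed, 'u16')[2:]
print(decoded)

# Assumption: the input character is the precomposed U+095B.
# Keeping only its low 8 bits yields 0x5B, which is '['.
truncated = chr(ord('\u095b') & 0xFF)
print(truncated)
```

Under those assumptions, the second byte of the decoded output is the low byte of U+095B, which is where the '[' comes from.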
Expected behavior
An error message about non-ASCII characters should be displayed, as it is for the input ' ज़' (space added to put the character into an even-numbered position).
Environment
- OS: Windows 10
- Browser: Brave Version 1.39.111 Chromium: 102.0.5005.61 (Official Build) (64-bit)
Additional context
The code causing the issue is here
Effectively c1 (the even-numbered character) is checked but c2 is ignored and subsequently truncated.
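The behaviour described above can be modelled as follows. This is a hypothetical sketch in Python, not the site's actual code: the function name `pack_pairs`, the `check_both` flag, and the padding with a space are assumptions chosen to match the observed output.

```python
def pack_pairs(code: str, check_both: bool) -> str:
    """Pack each pair of characters into one UTF-16 code unit.

    Hypothetical model of the golfing step. With check_both=False only
    c1 is validated, which mirrors the reported bug.
    """
    if len(code) % 2:
        code += ' '  # pad to an even length (assumed from the observed output)
    packed = []
    for i in range(0, len(code), 2):
        c1, c2 = ord(code[i]), ord(code[i + 1])
        if c1 > 127 or (check_both and c2 > 127):
            raise ValueError(f"non-ASCII character in pair starting at index {i}")
        # c1 becomes the low byte and c2 the high byte; the mask models
        # the silent truncation of an unchecked c2.
        packed.append(chr(c1 | ((c2 & 0xFF) << 8)))
    return ''.join(packed)


buggy = pack_pairs("'\u095b'", check_both=False)  # no error, c2 is truncated
```

With `check_both=True` the same input raises `ValueError`, which is the behaviour this report asks for. The input with a leading space inside the quotes raises in both modes, because the character then lands in the c1 slot.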
Also
Handling of characters from the Latin-1 Supplement block (U+0080 to U+00FF) by this site is unclear. These are non-ASCII characters, but is there a reason to ban them from the input? Shouldn't the check really be > 255 instead of > 127?
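To help answer that question, here is a small experiment on what happens to Latin-1 Supplement characters under the same packing scheme. It is only a sketch: `pack` is a hypothetical helper, and it assumes that exec decodes a bytes argument as UTF-8 (Python's default source encoding), which has not been checked against the site's output.

```python
def pack(code: str) -> str:
    """Pack pairs of code points <= 255 into UTF-16 code units (hypothetical)."""
    return ''.join(
        chr(ord(a) | (ord(b) << 8)) for a, b in zip(code[::2], code[1::2])
    )


# Case 1: a Latin-1 character in the low-byte slot survives the packing...
raw = pack("\u00e9=").encode('utf-16-le')
latin1_preserved = raw == "\u00e9=".encode('latin-1')

# ...but the resulting bytes are not valid UTF-8.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Case 2: a high byte in 0xD8..0xDF produces a lone surrogate,
# which the UTF-16 codec refuses to encode.
try:
    pack("a\u00d8").encode('utf-16-le')
    surrogate_ok = True
except UnicodeEncodeError:
    surrogate_ok = False
```

If those assumptions hold, raising the limit to > 255 would stop the silent truncation but would still let through inputs that fail later, so > 127 may be the safer limit as long as it is applied to both c1 and c2.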