invalid continuation byte error #4

@conniec

I'm getting this error when trying to run urlnorm.norm on this URL:

http://productiveRamadan.com/ar/%d8%a7%d9%86%d8%aa%d9%81%d8%b9-%d9%85%d9%86-%d8%a7%d9%84%d8%b5%d9%88%d9%85-%d9%88-%d8%aa%d8%ac%d9%86%d8%a8-%d9%87%d8%b0%d9%87-%d8%a7%d9%84%d8%a3%d9%86%d9%88%d8%a7%d8%b9-%d9%85%d9%86-%d8%a7%d9%84%d8%-2

The URL is a valid Arabic URL, but something in norm_path() causes value.decode("utf-8") in _unicode() to fail with
"UnicodeDecodeError: 'utf8' codec can't decode byte 0xd8 in position 74: invalid continuation byte"

Printing the value inside _unicode() and then calling decode('utf-8') on it in a Python shell works fine. Any ideas?

Thanks in advance.
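
A possible way to narrow this down is the sketch below. It is not urlnorm's code: it assumes norm_path() percent-decodes the path to raw bytes before _unicode() decodes them, and it uses Python 3's standard `urllib.parse.unquote_to_bytes` as a stand-in for that step (the `'utf8' codec` wording in the traceback suggests the report came from Python 2). The path is copied verbatim from the URL above. Its last escape sequence, `%d8%-2`, looks truncated: `%d8` is a UTF-8 lead byte with nothing valid after it, and the lone byte lands at offset 74 of the decoded path, the same position the traceback reports.

```python
from urllib.parse import unquote_to_bytes

# Path of the URL from the report, copied verbatim (note the trailing "%d8%-2").
path = ("/ar/%d8%a7%d9%86%d8%aa%d9%81%d8%b9-%d9%85%d9%86-"
        "%d8%a7%d9%84%d8%b5%d9%88%d9%85-%d9%88-%d8%aa%d8%ac%d9%86%d8%a8-"
        "%d9%87%d8%b0%d9%87-%d8%a7%d9%84%d8%a3%d9%86%d9%88%d8%a7%d8%b9-"
        "%d9%85%d9%86-%d8%a7%d9%84%d8%-2")

# Stand-in for the percent-decoding step (assumption, not urlnorm's own code).
# "%-2" is not a valid escape, so it is left as the literal bytes b"%-2".
raw = unquote_to_bytes(path)

# Strict decoding fails on the lone lead byte 0xd8 followed by "%".
try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(raw[74:])   # the bytes from the failing offset onward
print(error)      # codec, offending byte, position, and reason

# A lenient decode substitutes U+FFFD for the bad byte instead of raising.
lenient = raw.decode("utf-8", errors="replace")
print(lenient)
```

If the same failure reproduces this way, the problem would be the truncated escape in the URL itself rather than the decode call, and decoding with `errors="replace"` (or rejecting the URL) would be a choice for the caller or the library to make.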
