Add UTF-8 encoding support for query data and update version to 2.1.0 #9

Merged
semantiDan merged 5 commits into main from feature/add-utf8-encoding-to-query-data
Dec 11, 2025
@semantiDan

Add UTF-8 Encoding to Query Data

Summary

Fixed an encoding issue when sending SQL queries with Unicode characters to the Timbr API. Queries containing non-ASCII characters (e.g., Chinese, Arabic, emoji, accented letters) were not properly encoded, causing server-side errors.
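
One plausible mechanism for this kind of failure, offered as an assumption rather than a confirmed diagnosis of the connector: Python's `http.client` encodes a `str` request body as ISO-8859-1, so a server expecting UTF-8 could receive bytes it cannot decode. The sketch below only demonstrates that byte-level mismatch; the query strings are made up for illustration.

```python
# Illustrative only: both query strings are hypothetical.
query = "SELECT * FROM customers WHERE city = 'Zürich'"

latin1_bytes = query.encode("iso-8859-1")  # what a Latin-1 default would produce
utf8_bytes = query.encode("utf-8")         # what explicit UTF-8 encoding produces

# The two encodings disagree on 'ü' (one byte in Latin-1, two bytes in UTF-8).
encodings_differ = latin1_bytes != utf8_bytes

# A UTF-8 decoder rejects the Latin-1 bytes.
try:
    latin1_bytes.decode("utf-8")
    latin1_decodes_as_utf8 = True
except UnicodeDecodeError:
    latin1_decodes_as_utf8 = False

# Characters outside Latin-1 (e.g. Chinese) cannot be encoded that way at all.
try:
    "SELECT '你好'".encode("iso-8859-1")
    chinese_fits_latin1 = True
except UnicodeEncodeError:
    chinese_fits_latin1 = False
```

Encoding to UTF-8 before the request is built removes the dependence on whatever default the HTTP layer applies.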

Changes Made

1. Core Fix (pytimbr_api/timbr_http_connector.py)

  • Modified run_query() function to explicitly encode string queries as UTF-8 bytes
  • Added type checking to handle both string and pre-encoded byte inputs
  • Changed: data = query → data = query.encode('utf-8') if isinstance(query, str) else query
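
The changed expression can be tried in isolation. This is a minimal sketch: `to_request_body` is a hypothetical helper name used only for illustration, not a function in pytimbr_api, and the queries are invented.

```python
def to_request_body(query):
    """Return the request payload as bytes (hypothetical helper, not part of pytimbr_api)."""
    # str input is encoded explicitly as UTF-8; bytes input passes through unchanged.
    return query.encode("utf-8") if isinstance(query, str) else query


body_from_str = to_request_body("SELECT 'café'")   # str is encoded
body_from_bytes = to_request_body(b"SELECT 1")      # bytes are left alone
```

The `isinstance` check is what keeps callers that already pass bytes working: encoding is applied only when the input is a `str`.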

2. Test Coverage (test/test_encoding.py) - New file

  • Added 4 comprehensive test cases to verify encoding functionality:
    • test_run_query_with_unicode_characters - Tests Chinese, French accents, and emoji
    • test_run_query_with_already_encoded_bytes - Ensures backward compatibility with byte inputs
    • test_run_query_with_special_sql_characters - Tests SQL special characters and copyright symbol
    • test_run_query_with_multilingual_text - Tests Latin, Cyrillic, Arabic, and Japanese characters
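
A test in this spirit might look like the sketch below. It is not the content of test/test_encoding.py: `run_query_sketch` and its `post` parameter are hypothetical stand-ins for the real `run_query()` and its HTTP call, whose signatures are not shown in this PR description.

```python
from unittest import mock


def run_query_sketch(query, post):
    """Hypothetical stand-in for run_query(): encodes str input, then calls `post`."""
    data = query.encode("utf-8") if isinstance(query, str) else query
    return post(data=data)


def check_unicode_query_is_sent_as_utf8_bytes():
    post = mock.Mock(return_value="ok")   # replaces the real HTTP call
    query = "SELECT '你好', 'café', '🎉'"
    run_query_sketch(query, post)
    sent = post.call_args.kwargs["data"]  # payload handed to the HTTP layer
    assert isinstance(sent, bytes)
    assert sent.decode("utf-8") == query  # round-trips without loss
    return sent


def check_pre_encoded_bytes_pass_through():
    post = mock.Mock(return_value="ok")
    payload = "SELECT 'Привет'".encode("utf-8")
    run_query_sketch(payload, post)
    sent = post.call_args.kwargs["data"]
    assert sent is payload                # bytes are not re-encoded or copied
    return sent
```

Mocking the HTTP call keeps such tests offline and lets them assert on the exact bytes that would be sent.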

3. Version Bump (pyproject.toml)

  • Updated version from 2.0.0 to 2.1.0 (minor version bump for new functionality)

4. Dependencies (requirements.txt)

  • Updated file encoding to UTF-8

Impact

  • Fixes issues with international characters and special symbols in SQL queries
  • Maintains backward compatibility with existing code
  • All tests pass successfully

@semantiDan semantiDan self-assigned this Dec 10, 2025
@semantiDan semantiDan requested a review from Copilot December 10, 2025 10:12

Copilot AI left a comment

Pull request overview

This PR adds UTF-8 encoding support for SQL queries containing unicode characters (e.g., Chinese, Arabic, emoji, accented letters) to prevent server-side errors. The fix ensures queries are properly encoded before being sent to the Timbr API.

Key Changes:

  • Modified run_query() to explicitly encode string queries as UTF-8 bytes
  • Added comprehensive test coverage for unicode character handling
  • Bumped version from 2.0.0 to 2.1.0

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pytimbr_api/timbr_http_connector.py | Added UTF-8 encoding logic with type checking for backward compatibility |
| test/test_encoding.py | New test file with 4 test cases covering unicode, multilingual text, and special characters |
| pyproject.toml | Version bump to 2.1.0 reflecting the new encoding functionality |

Comment thread pytimbr_api/timbr_http_connector.py
Comment thread test/test_encoding.py
Comment thread test/test_encoding.py
Comment thread test/test_encoding.py
@semantiDan semantiDan merged commit 77e0b32 into main Dec 11, 2025
6 checks passed