feat: Enforce `ml_dtypes.bfloat16` for BF16 I/O in Python client by yinggeh · Pull Request #897 · triton-inference-server/client

yinggeh · 2026-05-15T03:10:07Z

Summary

NumPy has no native BF16 dtype. Currently, the Python client casts between FP32 and BF16, which magnifies accuracy loss. This PR enforces ml_dtypes.bfloat16 for native BF16 support.

Example

For 0.01 + 0.01, converting between BF16 and FP32 loses precision through truncation, which biases the output away from the true result, 0.02.

0.01(BF16) + 0.01(BF16) => 0.0200195 (BF16)
0.01(FP32->BF16) + 0.01(FP32->BF16) => 0.01989746 (BF16->FP32)
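
The two figures above can be reproduced with a minimal standard-library sketch. It does not use ml_dtypes or the Triton client. It only emulates BF16 at the bit level, assuming the native path rounds to nearest (ties to even) and the cast path drops the low 16 bits of the FP32 pattern. The helper names are hypothetical and exist only for this illustration.

```python
import struct


def f32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bf16_truncate(x: float) -> float:
    # Assumed cast path: drop the low 16 bits of the FP32 pattern, no rounding.
    return bits_to_f32(f32_bits(x) & 0xFFFF0000)


def bf16_round(x: float) -> float:
    # Assumed native path: round to nearest, ties to even.
    # NaN and overflow handling are omitted in this sketch.
    bits = f32_bits(x)
    bias = 0x7FFF + ((bits >> 16) & 1)
    return bits_to_f32((bits + bias) & 0xFFFF0000)


# 0.01 + 0.01 with every operand and the result kept in BF16 precision.
rounded_sum = bf16_round(bf16_round(0.01) + bf16_round(0.01))
truncated_sum = bf16_truncate(bf16_truncate(0.01) + bf16_truncate(0.01))

print(rounded_sum)    # 0.02001953125
print(truncated_sum)  # 0.0198974609375
```

Under these assumptions the truncated result is roughly five times further from 0.02 than the rounded one.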

whoisj · 2026-05-15T19:32:03Z

+        dtype = np_to_triton_dtype(input_tensor.dtype)
+        if self._input.datatype != dtype:
+            error_message = (
+                "got unexpected datatype {} from numpy array, expected {}.".format(


why not use an interpolated string instead?

f"got unexpected datatype {dtype} from numpy array, expected {self._input.datatype}."
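
As a small sketch of the suggestion, the two styles produce the same message. The function names and sample datatype strings here are hypothetical, and the argument order passed to `.format` is an assumption taken from the suggested f-string, since the diff excerpt is cut off before the arguments.

```python
def message_with_format(actual: str, expected: str) -> str:
    # Style used in the diff excerpt (argument order assumed).
    return "got unexpected datatype {} from numpy array, expected {}.".format(
        actual, expected
    )


def message_with_fstring(actual: str, expected: str) -> str:
    # Style proposed in the review comment.
    return f"got unexpected datatype {actual} from numpy array, expected {expected}."


same = message_with_format("FP32", "BF16") == message_with_fstring("FP32", "BF16")
print(same)  # True
```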

whoisj · 2026-05-15T19:32:40Z

+        dtype = np_to_triton_dtype(input_tensor.dtype)
+        if self._datatype != dtype:
+            error_message = (
+                "got unexpected datatype {} from numpy array, expected {}.".format(


again, why not an interpolated string?

Initial commit

e48e514

yinggeh changed the title ~~feat: Require ml_dtypes.bfloat16 for BF16 in Python client (TRI-801)~~ feat: Require ml_dtypes.bfloat16 for BF16 in Python client May 15, 2026

yinggeh mentioned this pull request May 15, 2026

test: Align QA BF16 with ml_dtypes and generate ONNX BF16 models triton-inference-server/server#8782

Open

11 tasks

yinggeh requested review from mc-nv, mudit-eng, pskiran1 and whoisj and removed request for mc-nv May 15, 2026 03:14

yinggeh self-assigned this May 15, 2026

yinggeh added the enhancement New feature or request label May 15, 2026

yinggeh changed the title ~~feat: Require ml_dtypes.bfloat16 for BF16 in Python client~~ feat: Use ml_dtypes.bfloat16 for BF16 I/O in Python client (TRI-801) May 15, 2026

yinggeh changed the title ~~feat: Use ml_dtypes.bfloat16 for BF16 I/O in Python client (TRI-801)~~ feat: Use ml_dtypes.bfloat16 for BF16 I/O in Python client May 15, 2026

yinggeh changed the title ~~feat: Use ml_dtypes.bfloat16 for BF16 I/O in Python client~~ feat: Enforce ml_dtypes.bfloat16 for BF16 I/O in Python client May 15, 2026

whoisj requested changes May 15, 2026

yinggeh wants to merge 1 commit into main from yinggeh/tri-801-deprecate-bf16-to-fp32-conversion-in-python-client-library

2 participants