NIM Proxy prevents empty responses in Continue + VSCode/VSCodium

Fixes Step 3.7 Flash, Nemotron 3 Ultra and Kimi k2.6 (and other models) silently returning empty responses in Continue.

Root Cause: Step 3.7 Flash, Nemotron 3 Ultra and Kimi k2.6 on NVIDIA NIM run with speculative decoding and include a usage field on every streaming chunk. Continue's OpenAI provider interprets any chunk containing usage as the final chunk and stops, silently discarding all content without showing an error.
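To make the failure mode concrete, here is a minimal sketch in Python. It is not Continue's actual code: `naive_client` is a hypothetical, simplified model of the behavior described above, and `SAMPLE_STREAM` is made-up data shaped like an OpenAI-style streaming response in which every chunk carries usage.

```python
import json

# Made-up stream for illustration: every chunk carries "usage".
SAMPLE_STREAM = [
    'data: {"choices":[{"delta":{"content":"Hel"}}],"usage":{"total_tokens":5}}',
    'data: {"choices":[{"delta":{"content":"lo"}}],"usage":{"total_tokens":6}}',
    "data: [DONE]",
]


def naive_client(lines):
    """Hypothetical client that treats any chunk with usage as the final one."""
    text = ""
    for line in lines:
        payload = line.removeprefix("data: ").strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        if chunk.get("usage"):
            break  # assumed behavior: stop here, content of this chunk is lost
        text += chunk["choices"][0]["delta"].get("content") or ""
    return text


def strip_usage(lines):
    """Return the same stream with the usage field removed from each chunk."""
    cleaned = []
    for line in lines:
        payload = line.removeprefix("data: ").strip()
        if payload == "[DONE]":
            cleaned.append(line)
            continue
        chunk = json.loads(payload)
        chunk.pop("usage", None)
        cleaned.append("data: " + json.dumps(chunk))
    return cleaned
```

Under these assumptions, `naive_client(SAMPLE_STREAM)` returns an empty string, while `naive_client(strip_usage(SAMPLE_STREAM))` returns the full text.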

Overview of the Proxy:

The proxy sits between Continue and NIM and applies these fixes to each request:

Strips min_p from outgoing requests (it causes a silent HTTP 400)
Strips usage from content chunks in the streaming response (it causes a silent empty reply)
Strips reasoning/reasoning_content chunks (they have empty content)
Preserves tool_calls chunks so Continue can execute tools
Forwards the stream in near real time
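The per-request fixes listed above could look roughly like the following sketch. It is an illustration, not the code in nim_proxy.py: the function names `sanitize_request` and `filter_chunk` are hypothetical, and removing usage from tool_calls chunks as well as content chunks is an assumption made here for simplicity.

```python
import json
from typing import Optional


def sanitize_request(body: bytes) -> bytes:
    """Drop min_p from an outgoing chat completion request body."""
    payload = json.loads(body)
    payload.pop("min_p", None)
    return json.dumps(payload).encode("utf-8")


def filter_chunk(chunk: dict) -> Optional[dict]:
    """Return the chunk to forward to the client, or None to drop it."""
    choices = chunk.get("choices") or []
    delta = choices[0].get("delta") or {} if choices else {}
    has_content = bool(delta.get("content"))
    has_tools = bool(delta.get("tool_calls"))
    has_reasoning = bool(delta.get("reasoning") or delta.get("reasoning_content"))

    # Reasoning-only chunks carry no visible content, so they are dropped.
    if has_reasoning and not has_content and not has_tools:
        return None

    # Assumption: usage is removed from any chunk that carries content or
    # tool calls, so the client does not mistake it for the final chunk.
    if has_content or has_tools:
        return {key: value for key, value in chunk.items() if key != "usage"}

    # Everything else (for example the real final chunk) passes through as-is.
    return chunk
```

In this sketch, each parsed streaming chunk would go through `filter_chunk` and be written to the client immediately when the result is not None, which keeps the stream close to real time.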

Requirements

Python 3.x (standard library only; tested with Python 3.14)
NVIDIA NIM API key

Setup

0. Download the proxy

nim_proxy.py

1. Set the port you want to use

Open nim_proxy.py and replace the default LISTEN_PORT = 7606 with the port you want to use (make sure nothing else is using it).
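If you are unsure whether a port is already taken, a quick check along these lines may help. This is a standalone sketch and not part of nim_proxy.py; the helper name `port_is_free` is hypothetical, and it only tests whether the port can be bound on the local loopback address.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if host:port can currently be bound, False otherwise."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # typically "address already in use"
        return True
```

For example, `port_is_free(7606)` should return True before the proxy is started and False while it is running on the default port.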

2. Run the proxy (keep this terminal open while using Step 3.7 Flash, Nemotron 3 Ultra and Kimi k2.6 in Continue)

# source your venv if you use one, then run
python nim_proxy.py

3. Point Continue to the proxy in your config.yaml. Here is an example configuration (pay attention to apiBase):

models:
  - name: Step-3.7-Flash
    provider: openai
    model: stepfun-ai/step-3.7-flash
    apiBase: http://localhost:7606   # important: proxy instead of https://integrate.api.nvidia.com/v1
    apiKey: your-nim-key-here
    roles: [chat, edit, apply, summarize]
    capabilities: [tool_use, image_input]
    defaultCompletionOptions:
      temperature: 0.7
      top_p: 0.95
      top_k: 35
      contextLength: 262144
      maxTokens: 16384
    chatOptions:
      baseSystemMessage: |
        You are an expert ... # enter your system prompt here
      baseAgentSystemMessage: |
        You are an expert ... # enter your system prompt here
      basePlanSystemMessage: |
        You are an expert ... # enter your system prompt here

Contact

Developer: Johannes Faber — fais.udder466@passinbox.com
Hub-Website: https://fai-solutions.github.io/
Issues: https://github.com/FAI-Solutions/Continue-NIM-Proxy/issues

License

MIT

Summary

This repository contains a practical workaround for empty responses from Step 3.7 Flash, Nemotron 3 Ultra and Kimi k2.6 in Continue: no reply in VSCode, stepfun-ai/step-3.7-flash not working in Continue, the silent HTTP 400 caused by min_p on NIM, and related NIM / Continue integration issues. nim_proxy.py acts as a proxy between NIM and Continue, rewriting the model's stream into a Continue-compatible format so the model works again in VSCode / VSCodium. Keep the proxy running while using Step 3.7 Flash, Nemotron 3 Ultra and Kimi k2.6; it can remain active alongside other models.
