diff --git a/docs/providers/neosantara.md b/docs/providers/neosantara.md
new file mode 100644
index 00000000..30da66b3
--- /dev/null
+++ b/docs/providers/neosantara.md
@@ -0,0 +1,294 @@
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+# Neosantara
+
+## Overview
+
+| Property | Details |
+|-------|-------|
+| Description | Neosantara provides an OpenAI-compatible API for chat completions and the Responses API. |
+| Provider Route on LiteLLM | `neosantara/` |
+| Link to Provider Doc | [Neosantara ↗](https://docs.neosantara.xyz) |
+| Base URL | `https://api.neosantara.xyz/v1` |
+| Supported Operations | [`/chat/completions`](#usage---litellm-python-sdk), [`/responses`](#responses-api), [function calling](#function-calling) |
+
+
+
+
+https://api.neosantara.xyz/v1
+
+**We support Neosantara models through the `neosantara/` prefix in LiteLLM.**
+
+## Required Variables
+
+```python showLineNumbers title="Environment Variables"
+os.environ["NEOSANTARA_API_KEY"] = "" # your Neosantara API key
+```
+
+You can override the default base URL with:
+
+```python showLineNumbers title="Optional Base URL Override"
+os.environ["NEOSANTARA_API_BASE"] = "https://api.neosantara.xyz/v1"
+```
+
+## Usage - LiteLLM Python SDK
+
+### Non-streaming
+
+```python showLineNumbers title="Neosantara Non-streaming Completion"
+import os
+from litellm import completion
+
+os.environ["NEOSANTARA_API_KEY"] = ""
+
+response = completion(
+ model="neosantara/gemini-3.5-flash",
+ messages=[{"role": "user", "content": "Hello from LiteLLM"}],
+ max_tokens=64,
+)
+
+print(response.choices[0].message.content)
+```
+
+### Streaming
+
+```python showLineNumbers title="Neosantara Streaming Completion"
+import os
+from litellm import completion
+
+os.environ["NEOSANTARA_API_KEY"] = ""
+
+response = completion(
+ model="neosantara/gemini-3.5-flash",
+ messages=[{"role": "user", "content": "Write a one-line poem about Jakarta"}],
+ max_tokens=64,
+ stream=True,
+)
+
+for chunk in response:
+ if chunk.choices[0].delta.content is not None:
+ print(chunk.choices[0].delta.content, end="")
+```
+
+## Function Calling
+
+### Chat Completions
+
+Function calling works on Neosantara through `completion()`, but LiteLLM currently requires explicitly allowing the OpenAI params on this provider route.
+
+```python showLineNumbers title="Neosantara Function Calling via completion()"
+import os
+from litellm import completion
+
+os.environ["NEOSANTARA_API_KEY"] = ""
+
+response = completion(
+ model="neosantara/gemini-3.5-flash",
+ messages=[{"role": "user", "content": "What is the weather in Jakarta? Use the tool."}],
+ tools=[
+ {
+ "type": "function",
+ "function": {
+ "name": "get_weather",
+ "description": "Get weather by city name",
+ "parameters": {
+ "type": "object",
+ "properties": {
+ "city": {"type": "string"}
+ },
+ "required": ["city"]
+ }
+ }
+ }
+ ],
+ tool_choice="auto",
+ allowed_openai_params=["tools", "tool_choice"],
+ max_tokens=128,
+)
+
+print(response.choices[0].message.tool_calls)
+```
+
+## Responses API
+
+Neosantara also supports LiteLLM's `responses()` interface.
+
+```python showLineNumbers title="Neosantara Responses API"
+import os
+from litellm import responses
+
+os.environ["NEOSANTARA_API_KEY"] = ""
+
+response = responses(
+ model="neosantara/gemini-3.5-flash",
+ input="Reply with exactly: pong",
+ max_output_tokens=8,
+)
+
+print(response.output_text)
+```
+
+### Responses API Tool Calling
+
+Function calling also works through `responses()`:
+
+```python showLineNumbers title="Neosantara Function Calling via responses()"
+import os
+from litellm import responses
+
+os.environ["NEOSANTARA_API_KEY"] = ""
+
+response = responses(
+ model="neosantara/gemini-3.5-flash",
+ input="What is the weather in Jakarta? Use the tool.",
+ tools=[
+ {
+ "type": "function",
+ "name": "get_weather",
+ "description": "Get weather by city name",
+ "parameters": {
+ "type": "object",
+ "properties": {
+ "city": {"type": "string"}
+ },
+ "required": ["city"]
+ }
+ }
+ ],
+ max_output_tokens=128,
+)
+
+print(response.output)
+```
+
+## Parameter Compatibility
+
+LiteLLM maps `max_completion_tokens` to `max_tokens` for Neosantara automatically.
+
+```python showLineNumbers title="Neosantara max_completion_tokens Compatibility"
+import os
+from litellm import completion
+
+os.environ["NEOSANTARA_API_KEY"] = ""
+
+response = completion(
+ model="neosantara/gemini-3.5-flash",
+ messages=[{"role": "user", "content": "Summarize LiteLLM in one sentence"}],
+ max_completion_tokens=64,
+)
+
+print(response.choices[0].message.content)
+```
+
+## Reasoning Effort
+
+LiteLLM can pass `reasoning_effort` through to Neosantara, but like tool calling on `completion()`, you should explicitly allow the parameter on this provider route.
+
+```python showLineNumbers title="Neosantara reasoning_effort"
+import os
+from litellm import completion
+
+os.environ["NEOSANTARA_API_KEY"] = ""
+
+response = completion(
+ model="neosantara/gemini-3.5-flash",
+ messages=[{"role": "user", "content": "Reply with exactly: reasoned"}],
+ reasoning_effort="high",
+ allowed_openai_params=["reasoning_effort"],
+ max_tokens=8,
+)
+
+print(response.choices[0].message.content)
+```
+
+## Usage - LiteLLM Proxy
+
+Add the following to your LiteLLM Proxy configuration file:
+
+```yaml showLineNumbers title="config.yaml"
+model_list:
+ - model_name: neosantara-gemini-flash
+ litellm_params:
+ model: neosantara/gemini-3.5-flash
+ api_key: os.environ/NEOSANTARA_API_KEY
+```
+
+Start your LiteLLM Proxy server:
+
+```bash showLineNumbers title="Start LiteLLM Proxy"
+litellm --config config.yaml
+
+# RUNNING on http://0.0.0.0:4000
+```
+
+
+
+
+```python showLineNumbers title="Neosantara via Proxy - OpenAI SDK"
+from openai import OpenAI
+
+client = OpenAI(
+ base_url="http://localhost:4000",
+ api_key="your-proxy-api-key",
+)
+
+response = client.chat.completions.create(
+ model="neosantara-gemini-flash",
+ messages=[{"role": "user", "content": "Say hello from the proxy"}],
+)
+
+print(response.choices[0].message.content)
+```
+
+
+
+
+
+```python showLineNumbers title="Neosantara via Proxy - LiteLLM SDK"
+import litellm
+
+response = litellm.completion(
+ model="litellm_proxy/neosantara-gemini-flash",
+ messages=[{"role": "user", "content": "Say hello from the proxy"}],
+ api_base="http://localhost:4000",
+ api_key="your-proxy-api-key",
+)
+
+print(response.choices[0].message.content)
+```
+
+
+
+
+
+```bash showLineNumbers title="Neosantara via Proxy - cURL"
+curl http://localhost:4000/v1/chat/completions \
+ -H "Content-Type: application/json" \
+ -H "Authorization: Bearer your-proxy-api-key" \
+ -d '{
+ "model": "neosantara-gemini-flash",
+ "messages": [{"role": "user", "content": "Say hello from the proxy"}]
+ }'
+```
+
+```bash showLineNumbers title="Neosantara via Proxy - Responses API"
+curl http://localhost:4000/v1/responses \
+ -H "Content-Type: application/json" \
+ -H "Authorization: Bearer your-proxy-api-key" \
+ -d '{
+ "model": "neosantara-gemini-flash",
+ "input": "Reply with exactly: pong"
+ }'
+```
+
+
+
+
+## Notes
+
+- Use the `neosantara/` model prefix when calling Neosantara through LiteLLM.
+- The default auth env var is `NEOSANTARA_API_KEY`.
+- `NEOSANTARA_API_BASE` can be used to point LiteLLM at a custom Neosantara-compatible base URL.
+- `max_completion_tokens` is mapped to `max_tokens` automatically.
+- On the `completion()` route, `tools`, `tool_choice`, and `reasoning_effort` may require `allowed_openai_params=[...]` so LiteLLM will pass them through.
diff --git a/sidebars.js b/sidebars.js
index 8dc25e84..7a50e12f 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -993,6 +993,7 @@ const sidebars = {
"providers/moonshot",
"providers/morph",
"providers/nebius",
+ "providers/neosantara",
"providers/nlp_cloud",
"providers/nano-gpt",
"providers/novita",