Advancing Universal Scientific Research Intelligence via Evolving Polymathic Synthesis
- [2026-03-14] UniScientist-30B-A3B is now available for download on HuggingFace and ModelScope.
- [2026-03-11] We release the full inference trajectories of UniScientist on the FrontierScience-Research benchmark. See the `trajectory/` folder for details.
UniScientist advances universal scientific research intelligence through a unified paradigm. By casting LLMs as cross-disciplinary generators and human experts as high-precision verifiers, it produces research-grade data spanning 50+ scientific disciplines with structured, rubric-based supervision. A 30B-parameter model trained on this data achieves highly competitive performance across five research benchmarks. Read the blog post first for an overview.
UniScientist formalizes open-ended scientific research as Active Evidence Integration and Model Abduction, and proposes the Evolving Polymathic Synthesis paradigm for synthesizing high-quality research problems and evaluation rubrics at scale.
The approach comprises three key components:
- Evolving Polymathic Synthesis — A human-LLM collaborative data paradigm that generates research-grade scientific problems across 50+ disciplines, each accompanied by co-evolved rubrics refined through completeness, consistency, and distinguishability checks.
- Agentic Research Loop — The model conducts scientific research by iteratively acquiring evidence, deriving formally justified results, and updating hypotheses via abductive inference, using tools including `web_search`, `google_scholar`, `page_fetching`, and `code_interpreter`.
- Report Aggregation — Given multiple candidate research reports, the model learns to synthesize a consolidated report that integrates the best elements, enabling research quality to self-evolve over time.
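The Agentic Research Loop can be pictured as a dispatch cycle over the four named tools. The sketch below is illustrative only: the stub functions are hypothetical stand-ins (the real implementations live under `tools/`), and the JSON tool-call shape is an assumption, not the project's actual protocol.

```python
import json

# Hypothetical stand-ins for the four tools named above; the real
# implementations live in tools/tool_search.py, tool_scholar.py,
# tool_visit.py, and tool_code.py.
TOOLS = {
    "web_search": lambda query: f"search results for {query!r}",
    "google_scholar": lambda query: f"scholar results for {query!r}",
    "page_fetching": lambda url: f"summary of {url}",
    "code_interpreter": lambda code: repr(eval(code)),  # illustration only; never eval untrusted code
}

def run_step(tool_call_json: str) -> str:
    """Dispatch one model-emitted tool call,
    e.g. '{"name": "web_search", "args": {"query": "..."}}'."""
    call = json.loads(tool_call_json)
    return TOOLS[call["name"]](**call["args"])
```

In the full loop, the model would read each tool result as new evidence, derive intermediate results, and update its working hypothesis before emitting the next call.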
UniScientist-30B-A3B achieves top-tier performance across all five benchmarks, surpassing or matching much larger proprietary models:
- FrontierScience-Research: 28.3 (surpassing Claude Opus 4.5 at 17.5 and GPT-5.2 at 25.2), reaching 33.3 with test-time scaling (Aggr@8)
- FrontierScience-Olympiad: 66.0 without tools, 71.0 with tools + Aggr@8 (matching Claude Opus 4.5)
- DeepResearch Bench: 46.0 (vs. Perplexity Deep Research 42.3, OpenAI Deep Research 47.0)
- DeepResearch Bench II: 48.0 (surpassing OpenAI Deep Research 45.4, Gemini-3-Pro Deep Research 44.6)
- ResearchRubrics: 59.9 (comparable to OpenAI Deep Research 59.7, Gemini Deep Research 61.5)
```
UniScientist/
├── local_deploy.sh                  # Step 1: Deploy local LLM via vLLM
├── inference_local_qwen.sh          # Step 2: Run agentic inference (repeat for multiple rollouts)
├── inference_local_qwen.py          # Agentic inference engine
├── inference_local_aggregate.py     # Step 3: Aggregate multiple rollouts into a final report
├── tools/
│   ├── tool_search.py               # Google web search (via Serper API)
│   ├── tool_scholar.py              # Google Scholar search (via Serper API)
│   ├── tool_visit.py                # Webpage reader (via Jina Reader API) with LLM summarization
│   └── tool_code.py                 # Python code interpreter
├── trajectory/
│   ├── uniscientist_research_traj.jsonl             # Single-rollout trajectories (with tools)
│   ├── uniscientist_research_no_tool_traj.jsonl     # Single-rollout trajectories (without tools)
│   └── uniscientist_research_aggregate8_traj.jsonl  # Aggregated trajectories (Aggr@8)
├── data/                            # Place your input data here (see Data Format below)
├── requirements.txt
└── .gitignore
```
```bash
pip install -r requirements.txt
```
Edit `local_deploy.sh` to set `MODEL_PATH` to your model weights, then:
```bash
bash local_deploy.sh
```
This starts a vLLM OpenAI-compatible server on port 8000. Wait until the server is ready before proceeding.
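One way to wait for the server programmatically is to poll vLLM's standard `/health` endpoint. This is a minimal sketch, assuming the default port 8000 from `local_deploy.sh`; `wait_for_server` is a hypothetical helper, not part of the repo.

```python
import time
import urllib.request
import urllib.error

def wait_for_server(base_url: str, timeout: float = 600.0, interval: float = 5.0) -> bool:
    """Poll the vLLM /health endpoint until it answers, or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; retry after a short pause
        time.sleep(interval)
    return False
```

Usage: `wait_for_server("http://localhost:8000")` returns `True` once the server is ready, `False` if the timeout elapses first.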
Edit inference_local_qwen.sh to fill in your API keys and configuration, then run it multiple times to collect diverse rollouts:
```bash
# Run N times to collect N rollouts
bash inference_local_qwen.sh
bash inference_local_qwen.sh
bash inference_local_qwen.sh
```
Each run produces (or appends to) a `.jsonl` output file named `<STORED_MODEL_NAME>_<BENCHMARK>.jsonl`.
Merge multiple rollout results into a single comprehensive report:
```bash
python inference_local_aggregate.py \
    --model-name "UniScientist-30B-A3B" \
    --data-paths rollout_1.jsonl rollout_2.jsonl rollout_3.jsonl \
    --benchmark research \
    --rollout-num 1 \
    --llm-max-concurrency 32
```
The following API keys are required for the tool suite:
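Conceptually, the aggregation step consumes several rollout files and clusters records that answer the same question before synthesizing a consolidated report. The sketch below illustrates only that grouping stage; the per-record schema (a `problem` field per line) is an assumption extrapolated from the Data Format section, and `group_rollouts` is a hypothetical helper, not the repo's actual code.

```python
import json
from collections import defaultdict
from pathlib import Path

def group_rollouts(paths):
    """Group rollout records from several .jsonl files by their problem text.

    Assumes each line carries at least a "problem" field (see Data Format);
    all other fields are passed through untouched.
    """
    clusters = defaultdict(list)
    for path in paths:
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            if line.strip():
                record = json.loads(line)
                clusters[record["problem"]].append(record)
    return clusters
```

Each resulting cluster is what a single aggregation pass (`--rollout-num`) would then condense into one report.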
| Key | Service | Description |
|---|---|---|
| `SERPER_KEY_ID` | Serper | Google web search & Google Scholar |
| `JINA_API_KEYS` | Jina Reader | Webpage content reading |
| `OPENROUTER_API_KEY` | OpenRouter | LLM-based webpage summarization |
Place your input data in the `data/` directory as `.jsonl` files:
```json
{"problem": "Your research question here", "answer": "Ground truth answer / rubrics (optional)"}
```
| Argument | Required | Default | Description |
|---|---|---|---|
| `--model-name` | Yes | - | Model identifier for naming the output file |
| `--data-paths` | Yes | - | One or more rollout `.jsonl` files to aggregate |
| `--benchmark` | No | `research` | Benchmark name for naming the output file |
| `--rollout-num` | No | `1` | Number of aggregation passes per question cluster |
| `--local-base-url` | No | `http://localhost:8000/v1` | vLLM server endpoint |
| `--output-path` | No | auto-generated | Custom output file path |
| `--llm-max-concurrency` | No | `32` | Max concurrent LLM requests |
If you find UniScientist useful in your research, please cite:
```bibtex
@misc{unipat2026uniscientist,
  title  = {UniScientist: Advancing Universal Scientific Research Intelligence},
  author = {Baixuan Li and Jialong Wu and Yida Zhao and Wendong Xu and Xuanzhong Chen and Huifeng Yin and Liang Chen and Wentao Zhang and Kuan Li},
  year   = {2026},
  url    = {https://unipat.ai/blog/UniScientist}
}
```
We are continuously expanding the Universal Scientific Research Dataset to cover additional disciplines and research paradigms. We welcome collaborations with research teams interested in advancing scientific research intelligence. Reach out at contact@unipat.ai.