feature: Native R Language Support as a First-Class Script Block #8258

@Ohkyusung

Feature Request: Native R Language Support as a First-Class Script Block

Summary

We are deploying Windmill as the core MLOps orchestration platform for a Virtual Metrology (VM) system at a major semiconductor/display manufacturer. R is the dominant language used by their process engineers and data scientists. The lack of native R support in Windmill is currently the single biggest blocker to enterprise adoption.

We are requesting native R language support — not just Bash-wrapped Rscript execution, but a first-class R script block with IDE-level features comparable to what Windmill already offers for Python and TypeScript.

Context & Business Case

Who We Are

We are an IT solutions provider building smart factory systems (FDC, APC, SPC, Virtual Metrology) for semiconductor, display, and battery manufacturers. We are currently architecting a Virtual Metrology system targeting MLOps Level 3 maturity (automated model deployment with CI/CD pipelines, model registry, continuous monitoring, and automated retraining triggers).

Why Windmill

We evaluated Airflow, Prefect, and Windmill. Windmill stood out for its:

  • Built-in Web IDE with LSP support per language
  • Auto-generated UI for script parameters
  • Flow editor for DAG-based orchestration with sub-20ms overhead
  • Self-hostable, air-gapped deployment capability (critical for semiconductor fabs)
  • Enterprise RBAC, SSO, and audit logging

We chose Windmill as our MLOps backbone. However, R support is a critical gap.

Why R Matters in This Domain

In semiconductor and display manufacturing, R is deeply embedded in the engineering workflow:

  • Statistical Process Control (SPC): Process engineers use R for control chart analysis, Cp/Cpk calculations, and distribution fitting
  • Virtual Metrology models: Many existing VM models are written in R using packages like caret, randomForest, xgboost, and custom in-house libraries
  • R2R (Run-to-Run) control: Feedback/feedforward control algorithms often originate as R scripts in RStudio
  • Regulatory & validation: Rewriting validated R models in Python solely due to platform limitations introduces re-validation overhead and risk

Our customer's engineering team has hundreds of existing R scripts powering their current VM and process control pipelines. Asking them to rewrite everything in Python is not viable — it would add months of migration effort and introduce regression risk in a domain where model accuracy directly impacts yield and product quality.

Enterprise Subscription Potential

This engagement involves a Fortune Global 500 display manufacturer. Native R support in Windmill could be the deciding factor for them to commit to an Enterprise plan. The deployment would span multiple fab lines, with 50+ engineers using Windmill daily for model training, validation, and deployment workflows. This is not a one-off request — we see the same R dependency across multiple customers in the semiconductor and battery industries.

What We're Requesting

Tier 1: Core R Script Block (Essential)

  • Native R executor (Rscript runtime) as a first-class language option in the script editor, flow steps, and app inline scripts
  • Typed input/output signature — R function parameters parsed and exposed as auto-generated UI inputs (similar to Python's type hints → UI mapping)
  • Dependency management: renv.lock or equivalent for reproducible R package installation, with worker-level caching (similar to how Windmill caches pip and npm dependencies)
  • Resource/variable access — Windmill client library (or REST helper) for R to fetch resources, variables, and state from the Windmill workspace
  • Result handling — structured JSON return values from R scripts that downstream flow steps can consume

Tier 2: IDE Experience (Highly Desired)

  • LSP integration — R Language Server (languageserver package) connected via WebSocket, providing autocomplete, signature help, hover docs, and diagnostics in the Web IDE
  • Inline execution & debugging — ability to run R code cell-by-cell or line-by-line within the editor (similar to RStudio / Jupyter R kernel experience), with output/plot rendering in the preview pane
  • Plot rendering — capture and display R plot output (base R, ggplot2) as images in the job result view

Tier 3: Advanced (Nice to Have)

  • Dedicated worker tag for R-heavy workloads (pre-installed R + common packages to avoid cold-start)
  • R ↔ Python interop within a single flow — e.g., pass a data frame from a Python step to an R step via Arrow/Parquet serialization through S3
  • R Markdown rendering as a job output type

Current Workaround & Its Limitations

We are currently wrapping R execution in Bash steps:

```bash
Rscript -e "source('/path/to/script.R'); main(arg1='$1', arg2='$2')"
```

This works for simple cases but has significant limitations:

  • No LSP, no autocomplete, no syntax highlighting for R in the editor
  • No typed parameter UI — everything is a raw string
  • Dependency management is manual (must pre-install packages in the Docker image)
  • No structured output — must manually parse stdout
  • Debugging is painful — errors surface as raw stderr text with no line mapping
  • Cannot leverage Windmill's dependency caching or resource system from R
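
One fragility of the one-liner above is that `$1` and `$2` are spliced directly into R source, so a value containing a quote breaks the expression. A minimal sketch of a slightly safer wrapper follows. The names `run_r_main` and `R_SCRIPT_PATH` are hypothetical and not part of Windmill, and it assumes the sourced file defines `main(arg1, arg2)` as in our workaround. It removes only the quoting problem, not the other limitations listed above.

```bash
#!/usr/bin/env bash
# Sketch only: run_r_main and R_SCRIPT_PATH are hypothetical names,
# not part of Windmill. Assumes the sourced R file defines main(arg1, arg2).

R_SCRIPT_PATH="${R_SCRIPT_PATH:-/path/to/script.R}"

run_r_main() {
  # Only the script path is interpolated into R code. Caller-supplied values
  # are passed as trailing command-line arguments and read back in R with
  # commandArgs(trailingOnly = TRUE), so quotes in them cannot alter the
  # R expression.
  Rscript -e "source('${R_SCRIPT_PATH}')" \
          -e 'args <- commandArgs(trailingOnly = TRUE)' \
          -e 'main(arg1 = args[1], arg2 = args[2])' \
          "$@"
}
```

Inside a Bash step this would be called as `run_r_main "$1" "$2"`. Every value still arrives in R as a character string, so the lack of typed parameters remains.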

Reference

In Issue #4102, @rubenfiszel mentioned:

"Other languages we intend to support are R and Rust"

Rust has since been added as a native language. We would like to advocate for R to be next in line, backed by a concrete enterprise deployment that depends on it.

Summary

| Aspect | Detail |
| --- | --- |
| Industry | Semiconductor / Display Manufacturing |
| Use Case | Virtual Metrology, SPC, R2R Control |
| Target MLOps Level | Level 3 (Automated Model Deployment) |
| Scale | 50+ engineers, multiple fab lines |
| Subscription Impact | Enterprise plan adoption contingent on R support |
| Migration Alternative | Rewriting R → Python is not viable (validation cost, regression risk) |

We're happy to collaborate on design, testing, and feedback. If there's an RFC or beta program for new language support, we'd love to participate.
