Since the Fork

Opengrep is a fork of Semgrep version 1.100.0. Since the fork, it has introduced a number of new features and bug fixes. This document outlines the most important changes and improvements.

Summary

Main New Features

  • Switched to OCaml 5.3.0 with large-scale refactoring to support multicore execution with shared-memory parallelism. At the time of the fork, the project was using OCaml 4 and parallelism was achieved using a process forking approach, which did not work on Windows.

  • Added native support for Windows.

  • Added support for a language that is not available in Semgrep CE or Semgrep PRO: Visual Basic.

  • Added support for languages that are not available in Semgrep CE: Apex and Elixir.

  • Improved support for Clojure, which now supports taint analysis; in Semgrep, the Clojure translation is very limited.

  • Added support for intrafile cross-function tainting, with the flag --taint-intrafile, supporting higher-order functions. This works similarly to Semgrep's --pro-intrafile.
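To make the last item concrete, here is a small, hypothetical Python target file showing the kind of flow that intrafile cross-function tainting is meant to follow: a value crosses several function boundaries, one of them a higher-order call, before reaching a sink. All function names are invented for illustration, and whether a finding is actually reported depends on the sources and sinks declared in the taint rule being run.

```python
# Hypothetical target file: every name here is invented for illustration.

def read_user_input(request):
    """Stands in for a taint source: returns caller-controlled text."""
    return request["q"]


def apply(fn, value):
    """Higher-order helper: whatever `value` carries is passed on to `fn`."""
    return fn(value)


def build_query(term):
    """Concatenates the (possibly tainted) term into a SQL string."""
    return "SELECT * FROM items WHERE name = '" + term + "'"


def run_query(sql):
    """Stands in for a taint sink such as a database call."""
    return sql


def handle(request):
    # The data crosses three function boundaries within one file:
    # read_user_input -> apply(build_query, ...) -> run_query.
    term = read_user_input(request)
    sql = apply(build_query, term)
    return run_query(sql)
```

An analysis that only looks inside a single function body would lose track of the value once it enters apply; following it through the call is what the cross-function mode is for.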

Features

  • Improvements in many existing languages such as C, C#, C++, Dart, Elixir, Java, JavaScript, Kotlin, PHP, Ruby, Rust, Scala.

  • Self-contained binaries for multiple architectures, built using Nuitka. This is a departure from Semgrep, which is typically distributed as Python wheels or via Homebrew. Users no longer need to have Python installed, and the performance is comparable to the Python wheel.

  • Install script for macOS and Linux.

  • Release binaries signed with Cosign to ensure authenticity.

  • Metavariable values and fingerprints are now included in both JSON and SARIF outputs.

  • Added support for reporting the enclosing context of a match (e.g., class, function, module) in the JSON output. Use the flag: --output-enclosing-context.

  • Per-rule timeouts that override the --timeout CLI parameter. Set using the timeout rule option. Requires --allow-rule-timeout-control and a CLI --timeout greater than 0 (a value of 0 disables timeouts).

  • Dynamic timeouts that scale with file size. Enabled via --dynamic-timeout, but can also be controlled per rule using dynamic_timeout: true together with --allow-rule-timeout-control. The behaviour can be fine-tuned with --dynamic-timeout-unit-kb (rule option: dynamic_timeout_unit_kb) and --dynamic-timeout-max-multiplier (rule option: dynamic_timeout_max_multiplier); see the CLI man page for details.

  • Added a per-rule limit on the number of reported matches. Use the rule option: max-match-per-file.

  • Taint analysis now supports per-rule timeout configuration. Use the rule option: taint-fixpoint-timeout. This should mostly be reserved for cases where results are not stable across consecutive runs, and the value should remain relatively low, for example between 0.2 and 2 seconds. Update: this option is now deprecated, as taint fixpoint timeouts no longer exist.

  • Postprocessing (autofix and nosem annotations) now works with incremental output. Use the flag: --incremental-output-postprocess.

  • Added option to inline metavariable values in the metadata section of the JSON output. Use the flag: --inline-metavariables.

  • Support for custom ignore annotations (instead of default nosem / nosemgrep). Use the option: --opengrep-ignore-pattern=<VAL>.

  • Added a CLI option for specifying a custom ignore file name: --semgrepignore-filename=<VAL>.

  • Improved control over file inclusion/exclusion. Use --force-exclude to apply --include / --exclude rules even on explicitly passed file targets instead of just on directories.

  • The test command now accepts multiple target files.

  • Taint intrafile: support for rest/variadic parameters in IL and taint signatures #538

  • Many performance improvements.

  • Many bug fixes.
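As a rough sketch of how the timeout and match-limit options above fit together, the Python snippet below builds a rule as a dictionary (mirroring the shape of a YAML rule file) and assembles a matching command line. The flag and option names come from the list above. Everything else is an assumption: the rule id, pattern and message are invented, the option values are arbitrary and assumed to be in seconds where they are timeouts, and the scan subcommand, --config and --json are assumed to carry over from Semgrep.

```python
import shlex

# Hypothetical rule, shaped like a YAML rule file. Only the keys under
# "options" are taken from the feature list; the values are arbitrary.
RULE = {
    "rules": [
        {
            "id": "example-eval-call",  # invented rule id
            "languages": ["python"],
            "severity": "WARNING",
            "message": "Avoid eval on dynamic input",
            "pattern": "eval(...)",
            "options": {
                "timeout": 10,             # per-rule timeout (assumed seconds)
                "dynamic_timeout": True,   # scale the timeout with file size
                "max-match-per-file": 20,  # cap on reported matches per file
            },
        }
    ]
}


def build_scan_command(rule_path, target, base_timeout=5):
    """Return an argv list for a scan in which rules may set their own timeouts."""
    # Per-rule timeouts need a CLI --timeout greater than 0; 0 disables timeouts.
    if base_timeout <= 0:
        raise ValueError("--timeout must be greater than 0 for per-rule timeouts")
    return [
        "opengrep", "scan",                # subcommand assumed from Semgrep
        "--config", rule_path,             # assumed from Semgrep
        "--timeout", str(base_timeout),
        "--allow-rule-timeout-control",
        "--output-enclosing-context",
        "--json",                          # assumed from Semgrep
        target,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_scan_command("rule.yaml", "src/")))
```

The builder rejects a base timeout of 0 up front because, per the list above, that value disables timeouts and per-rule overrides would then have no effect.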

Important bug fixes and language feature support

  • Taint tracking: Fix bug in propagating taints through composites and dictionaries #367

  • Matching: Fix wrong ranges reported for a number of languages #272

  • C/C++: Fix failure in to-ast-generic #385

  • C/C++: Make C/C++ parser more lenient when dealing with preprocessor directives #393

  • C#: Fix string literals in the parser #186

  • C#: Allow implicit variables in properties to be taint sources #516

  • C#: Conditional array access in l-values #535

  • C#: Primary constructor arguments on base class #589

  • Clojure: New translation in #501 and #517

  • Dart: Use identifier casing to guess Call vs New #555

  • Dart: Add typed metavariables #551

  • Dockerfile: Add missing BuildKit constructs #581

  • Elixir: Fix short lambdas #556

  • Go: Add goroutine call to IL #559

  • Kotlin: Fix string templates #191

  • Kotlin: Enable taint tracking through the Elvis operator (?:) #334

  • Kotlin: Enable taint tracking through scope functions (let, also, use, takeIf, takeUnless) #332

  • PHP: Add union types to PHP menhir parser #201

  • PHP: Add arrow functions to the menhir parser #205

  • PHP: Interpolated strings parsed as normal strings #296

  • PHP: Add match and enum in the primary parser #306

  • PHP: Property hooks, asymmetric visibility, first-class callable syntax, union/intersection DNF types #529

  • Python: Taint propagation via for comprehensions #564

  • Ruby: Improve Ruby tainting #324

  • Rust: Propagate taint through variable shadowing #572

  • Rust: Fix missing type alias translation #549

  • Scala: Support metavariables as elements in interpolated strings #403

  • TypeScript: Fix bug related to lambdas #378

  • TypeScript: Fix naming in the presence of typed patterns #395

Acknowledgments

We'd like to thank all external contributors and our industry partners for their invaluable support.