thescanner42/WindowSweep


WindowSweep

WindowSweep is a lightweight framework for scanning files and archives.

Motivation

Suppose a large input is being processed. Fundamentally, there are two steps:

  • detection: the relevant part of the input is identified
  • postprocessing: a contiguous span of bytes is usually handed on for further processing, and it may need to include some number of bytes before or after the matched span

To fulfill these functional requirements, other scanning tools such as YARA load the whole content into memory before scanning. WindowSweep takes a different approach: it works in a configurable, constant amount of memory and scans with a sliding window. The sliding window has the following settings:

  • buffer size: how many bytes the window holds at a time
  • lookbehind and lookahead: how many contiguous bytes before and after a match must remain available
  • match overlap: how many bytes are rescanned so that a match straddling a segment boundary is not missed
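
The interplay of these settings can be sketched in a few lines of Rust. This is a minimal illustration under stated assumptions, not WindowSweep's implementation: `WindowConfig`, `Match`, and `scan` are hypothetical names, and a naive literal search stands in for a real detector. The buffer never grows past `buffer_size`, positions too close to the end of the window are deferred and rescanned after the next refill, and `lookbehind` bytes are retained so that context before a match stays available.

```rust
use std::io::{self, Read};

/// Hypothetical settings mirroring the list above (not the crate's API).
pub struct WindowConfig {
    pub buffer_size: usize, // bytes held in memory at a time
    pub lookbehind: usize,  // context bytes kept before a match
    pub lookahead: usize,   // context bytes required after a match
    pub overlap: usize,     // bytes rescanned across a segment boundary
}

#[derive(Debug, PartialEq)]
pub struct Match {
    pub offset: u64,      // absolute offset of the match in the input
    pub context: Vec<u8>, // match plus available lookbehind/lookahead
}

/// Scans `reader` for `needle` while holding at most `buffer_size` input bytes.
pub fn scan<R: Read>(mut reader: R, cfg: &WindowConfig, needle: &[u8]) -> io::Result<Vec<Match>> {
    // The overlap must cover a match that starts just before a boundary.
    assert!(!needle.is_empty() && cfg.overlap + 1 >= needle.len());
    let tail = cfg.overlap + cfg.lookahead;
    // Otherwise the window could never slide forward.
    assert!(cfg.buffer_size > cfg.lookbehind + tail);

    let mut buf: Vec<u8> = Vec::with_capacity(cfg.buffer_size);
    let mut base = 0u64; // absolute offset of buf[0]
    let mut scan_from = 0usize;
    let mut matches = Vec::new();

    loop {
        // Refill the window up to its fixed size.
        let missing = (cfg.buffer_size - buf.len()) as u64;
        reader.by_ref().take(missing).read_to_end(&mut buf)?;
        let eof = buf.len() < cfg.buffer_size;

        // Positions in the last `tail` bytes are deferred until more data
        // arrives, unless the input has ended.
        let limit = if eof { buf.len() } else { buf.len() - tail };
        for i in scan_from..limit {
            if buf[i..].starts_with(needle) {
                let start = i.saturating_sub(cfg.lookbehind);
                let end = (i + needle.len() + cfg.lookahead).min(buf.len());
                matches.push(Match { offset: base + i as u64, context: buf[start..end].to_vec() });
            }
        }
        if eof {
            return Ok(matches);
        }

        // Slide: drop everything except the lookbehind and the deferred tail.
        let keep_from = limit - cfg.lookbehind;
        buf.drain(..keep_from);
        base += keep_from as u64;
        scan_from = cfg.lookbehind;
    }
}
```

In this sketch a match that straddles a refill boundary is reported once, in the window where its full lookahead is present. Near the start or end of the input, the context is simply shorter.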

Plugins

A plugin implements a generic interface that describes two things: the state that persists while a file is being scanned, and how input is accepted. The underlying scanning technology must therefore have a mode of operation that accepts data in chunks.
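
To make the idea of persistent state plus chunked input concrete, here is a rough Rust sketch. The `Plugin` trait, `Literal`, `LiteralState`, and `run` are all hypothetical names invented for illustration; the crate's actual interface may look quite different. The example matcher is a streaming Knuth-Morris-Pratt search, chosen only because its state (the number of needle bytes matched so far) carries naturally from one chunk to the next.

```rust
/// Hypothetical plugin shape (an assumption, not the crate's trait):
/// a per-file state plus a method that accepts one chunk at a time.
pub trait Plugin {
    type State;
    fn start(&self) -> Self::State;
    fn feed(&self, state: &mut Self::State, chunk: &[u8], hits: &mut Vec<u64>);
}

/// Example plugin: streaming literal search using a KMP failure table.
pub struct Literal {
    needle: Vec<u8>,
    fail: Vec<usize>,
}

pub struct LiteralState {
    matched: usize, // needle bytes matched so far
    consumed: u64,  // total input bytes seen across all chunks
}

impl Literal {
    pub fn new(needle: &[u8]) -> Self {
        assert!(!needle.is_empty());
        // fail[i] = length of the longest proper prefix of needle[..=i]
        // that is also a suffix of it.
        let mut fail = vec![0; needle.len()];
        let mut k = 0;
        for i in 1..needle.len() {
            while k > 0 && needle[i] != needle[k] {
                k = fail[k - 1];
            }
            if needle[i] == needle[k] {
                k += 1;
            }
            fail[i] = k;
        }
        Self { needle: needle.to_vec(), fail }
    }
}

impl Plugin for Literal {
    type State = LiteralState;

    fn start(&self) -> LiteralState {
        LiteralState { matched: 0, consumed: 0 }
    }

    fn feed(&self, st: &mut LiteralState, chunk: &[u8], hits: &mut Vec<u64>) {
        for &b in chunk {
            while st.matched > 0 && b != self.needle[st.matched] {
                st.matched = self.fail[st.matched - 1];
            }
            if b == self.needle[st.matched] {
                st.matched += 1;
            }
            st.consumed += 1;
            if st.matched == self.needle.len() {
                // Record the absolute start offset, then allow overlapping matches.
                hits.push(st.consumed - self.needle.len() as u64);
                st.matched = self.fail[st.matched - 1];
            }
        }
    }
}

/// Drives any plugin over a sequence of chunks belonging to one file.
pub fn run<P: Plugin>(plugin: &P, chunks: &[&[u8]]) -> Vec<u64> {
    let mut state = plugin.start();
    let mut hits = Vec::new();
    for chunk in chunks {
        plugin.feed(&mut state, chunk, &mut hits);
    }
    hits
}
```

Because the state survives between calls to `feed`, this particular matcher reports the same offsets no matter how the input is split into chunks.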

This crate provides a dfa plugin which uses the Rust regex dense DFA. This allows scanning with multiple regular expressions simultaneously. The top-level binary of this repo is a CLI wrapper around this plugin.

You can create other plugins. If sub-linear time complexity is desired, consider using daachorse. If linear-time, constant-space SAST scanning is desired, consider LexerSearch.

License

Licensed under Apache-2.0 (LICENSE-APACHE) and MIT (LICENSE-MIT).
