Preserving original unsplit value for the command fieldtype #220

@Jacob-gr

Background: I am working on an Elastic Integration to help ingest Dissect data into Elastic via Elastic Agent. (The Elastic ingest pipelines would also allow improved parsing of documents written via rdump's Elasticsearch writer.) I've been going over the Dissect output of various functions and figuring out how it could best be mapped to ECS and other fields that are helpful for analysts. During this process, I've come across a few things that I'd like to bring up for discussion, to determine whether I am missing a flag for a command or whether a feature request is needed.

I'm opening an issue regarding a proposal to add a feature to the command fieldtype: an original property that preserves the exact input string before any splitting or normalization takes place. If there is already something like this and I'm just not seeing it, please let me know.

Observed Problem

While testing the runkeys function, I noticed the command executable and args being split incorrectly because of what I think are missing quotes in the original value. This results in a structure like:

"command": {"executable": "%ProgramFiles%\\Windows", "args": ["Mail\\wab.exe /Upgrade"]}

When one would expect:

"command": {"executable": "%ProgramFiles%\\Windows Mail\\wab.exe", "args": ["/Upgrade"]}
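
For reference, the cut at the first space can be reproduced outside Dissect with plain shlex in non-POSIX mode, which is what the fieldtype quoted below appears to use for Windows paths. This is only a minimal sketch: split_windows is a hypothetical helper that mirrors the _split method, not part of the library, and the exact grouping of the args may differ from the JSON output above depending on the version in use.

```python
import shlex

# Value as it might appear in a Run key: no quotes around a path with a space.
unquoted = r"%ProgramFiles%\Windows Mail\wab.exe /Upgrade"
# Same value with the executable path quoted.
quoted = r'"%ProgramFiles%\Windows Mail\wab.exe" /Upgrade'


def split_windows(value: str) -> tuple[str, tuple[str, ...]]:
    """Hypothetical helper mirroring the fieldtype's _split for Windows paths."""
    executable, *args = shlex.split(value, posix=False)
    # Non-POSIX shlex keeps the surrounding quotes, so strip them afterwards.
    return executable.strip("'\" "), tuple(args)


print(split_windows(unquoted))
# ('%ProgramFiles%\\Windows', ('Mail\\wab.exe', '/Upgrade'))
print(split_windows(quoted))
# ('%ProgramFiles%\\Windows Mail\\wab.exe', ('/Upgrade',))
```

Without quotes there is nothing in the string that tells the splitter where the executable path ends, so the first space wins.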

Looking into it, I see that this is a recognized issue in the docstring of the command FieldType:

class command(FieldType):
    """The command fieldtype splits a command string into an ``executable`` and its arguments.

    Args:
        value: the string that contains the command and arguments
        path_type: When specified it forces the command to use a specific path type

    Example:

        .. code-block:: text

            'c:\\windows\\malware.exe /info' -> windows_path('c:\\windows\\malware.exe) ['/info']
            '/usr/bin/env bash' -> posix_path('/usr/bin/env') ['bash']

            # In this situation, the executable path needs to be quoted.
            'c:\\user\\John Doe\\malware.exe /all /the /things' -> windows_path('c:\\user\\John')
                ['Doe\\malware.exe /all /the /things']
    """

    __executable: path
    __args: tuple[str, ...]
    __path_type: type[path]

    def __init__(self, value: str = "", *, path_type: type[path] | None = None):
        if not isinstance(value, str):
            raise TypeError(f"Expected a value of type 'str' not {type(value)}")

        raw = value.strip()

        # Detect the kind of path from value if not specified
        self.__path_type = path_type or type(path(raw.lstrip("\"'")))

        self.executable, self.args = self._split(raw)

    def __repr__(self) -> str:
        return f"(executable={self.executable!r}, args={self.args})"

    def __eq__(self, other: object) -> bool:
        if isinstance(other, command):
            return self.executable == other.executable and self.args == other.args
        if isinstance(other, str):
            return self.raw == other
        if isinstance(other, (tuple, list)):
            return self.executable == other[0] and self.args == (*other[1:],)
        return False

    def _split(self, value: str) -> tuple[str, tuple[str, ...]]:
        if not value:
            return "", ()

        executable, *args = shlex.split(value, posix=self.__path_type is posix_path)
        return executable.strip("'\" "), (*args,)

    def _pack(self) -> tuple[str, int]:
        path_type = TYPE_WINDOWS if self.__path_type is windows_path else TYPE_POSIX
        return self.raw, path_type

    @classmethod
    def _unpack(cls, data: tuple[str, int]) -> command:
        raw_str, path_type = data
        if path_type == TYPE_POSIX:
            return command(raw_str, path_type=posix_path)
        if path_type == TYPE_WINDOWS:
            return command(raw_str, path_type=windows_path)
        # default, infer type of path from str
        return command(raw_str)

    @property
    def executable(self) -> path:
        return self.__executable

    @property
    def args(self) -> tuple[str, ...]:
        return self.__args

    @executable.setter
    def executable(self, val: str | path | None) -> None:
        self.__executable = self.__path_type(val)

    @args.setter
    def args(self, val: str | tuple[str, ...] | list[str] | None) -> None:
        if val is None:
            self.__args = ()
            return

        if isinstance(val, str):
            self.__args = tuple(shlex.split(val, posix=self.__path_type is posix_path))
        elif isinstance(val, list):
            self.__args = tuple(val)
        else:
            self.__args = val

    @property
    def raw(self) -> str:
        exe = str(self.executable)

        if " " in exe:
            exe = shlex.quote(exe)

        result = [exe]
        # Only quote on posix paths as shlex doesn't remove the quotes on non posix paths
        if self.__path_type is posix_path:
            result.extend(shlex.quote(part) if " " in part else part for part in self.args)
        else:
            result.extend(self.args)
        return " ".join(result)

    @classmethod
    def from_posix(cls, value: str) -> command:
        return command(value, path_type=posix_path)

    @classmethod
    def from_windows(cls, value: str) -> command:
        return command(value, path_type=windows_path)

Proposed Solution

I could join the executable string and the args array to get an approximation of the original value; however, I feel there would be benefit in having access to the original, unmodified, unsplit, and unstripped value (especially when dealing with forensics). I've read about various whitespace-padding and null-injection techniques used to store payloads in registry values.
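
To illustrate why joining is only an approximation, here is a small sketch using plain shlex rather than the fieldtype itself. The reassemble helper and the padded sample value are made up for this example; the point is only that stripping and re-joining with single spaces cannot recover the padding that was in the stored value.

```python
import shlex


def reassemble(value: str) -> str:
    """Hypothetical helper: strip, split, then join the parts with single spaces."""
    parts = shlex.split(value.strip(), posix=False)
    return " ".join(parts)


# Made-up value with extra whitespace between arguments and at the end.
padded = "cmd.exe    /c   calc.exe   "

print(repr(reassemble(padded)))      # 'cmd.exe /c calc.exe'
print(reassemble(padded) == padded)  # False
```

Any padding or trailing data that an analyst might want to see is gone after the round trip, which is why I would rather keep the untouched string alongside the parsed parts.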

Would it be reasonable to include something like an executable.raw or executable.original field to provide the original string? I would think it would look something like this in the Dissect JSON output:

"command": {"executable": "%ProgramFiles%\\Windows", "args": ["Mail\\wab.exe /Upgrade"], "original": "%ProgramFiles%\\Windows Mail\\wab.exe /Upgrade"}

Or might it be better to provide it as a separate field altogether?

"raw_command": "%ProgramFiles%\\Windows Mail\\wab.exe /Upgrade"

Having this original value would provide forensic accuracy and also allow me to map it to the ECS registry.data.strings field without worrying about reassembly. Adding this as a new field would also hopefully avoid any issues with users' current workflows that rely on command.executable and command.args.
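
As a rough idea of the shape I have in mind, here is a standalone sketch. It is not a patch against the real fieldtype: the class name CommandSketch, the windows flag, and the to_json method are all hypothetical, and it uses plain strings instead of the library's path types. The only point is that the input string is stored before any strip or split happens.

```python
import shlex


class CommandSketch:
    """Hypothetical stand-in for the command fieldtype that keeps the original input."""

    def __init__(self, value: str = "", *, windows: bool = True):
        # Keep the exact input before any normalization takes place.
        self.original = value

        raw = value.strip()
        if raw:
            executable, *args = shlex.split(raw, posix=not windows)
        else:
            executable, args = "", []
        self.executable = executable.strip("'\" ")
        self.args = tuple(args)

    def to_json(self) -> dict:
        # Existing keys stay as they are; "original" is purely additive.
        return {
            "executable": self.executable,
            "args": list(self.args),
            "original": self.original,
        }


# Trailing spaces are kept in "original" but do not affect the parsed parts.
cmd = CommandSketch("%ProgramFiles%\\Windows Mail\\wab.exe /Upgrade  ")
print(cmd.to_json())
```

Consumers that only read executable and args would see no change, while a pipeline like mine could map original directly.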

I could try to work out an implementation and submit a PR if this field would be seen as a positive addition. I'm not sure what the ideal implementation would be, though.
