Skip to content

[source_ref] add support for named arguments in a Rust format string#4

Merged
ttiimm merged 6 commits into
ttiimm:mainfrom
tstack:namedargs
Sep 5, 2025
Merged

[source_ref] add support for named arguments in a Rust format string#4
ttiimm merged 6 commits into
ttiimm:mainfrom
tstack:namedargs

Conversation

@tstack

@tstack tstack commented Sep 4, 2025

Copy link
Copy Markdown
Contributor

A simple first project, adding support for named arguments in a Rust format string. The SourceRef now contains more detail about the placeholders in the format string since that is where the name comes from. (The Java stuff works because tree-sitter understands the StringTemplate magic)

Files:

  • Cargo.toml: Try using tree-sitter-rust-orchard which seems to be more up-to-date than the main one.
  • lib.rs:
    • Capture all strings in a macro, to make sure we don't miss anything.
    • Add body() method to LogRef to make it easy to get the string to run the regex over.
    • lookup_source() needs to pull expressions from the placeholder or the arguments.
  • log_format.rs: Make build_src_filter() return an Option that is None if there are no filters.
  • source_query.rs: Extract the expressions used in the macro since tree-sitter is not going to do it for us.
  • source_ref.rs: Use a language-specific regex for processing format strings.

A simple first project, adding support for named arguments
in a Rust format string.  The SourceRef now contains more
detail about the placeholders in the format string since
that is where the name comes from.  (The Java stuff works
because tree-sitter understands the StringTemplate magic)

Files:
* Cargo.toml: Try using tree-sitter-rust-orchard which seems
  to be more up-to-date than the main one.
* lib.rs:
  - Capture all strings in a macro, to make sure we
    don't miss anything.
  - Add body() method to LogRef to make it easy to get
    the string to run the regex over.
  - lookup_source() needs to pull expressions from the
    placeholder or the arguments.
* log_format.rs: Make build_src_filter() return an Option
  that is None if there are no filters.
* source_query.rs: Extract the expressions used in the
  macro since tree-sitter is not going to do it for us.
* source_ref.rs: Use a language-specific regex for
  processing format strings.
@ttiimm

ttiimm commented Sep 4, 2025

Copy link
Copy Markdown
Owner

Can you merge main? I pushed a change so that the action should work better.

@ttiimm

ttiimm commented Sep 4, 2025

Copy link
Copy Markdown
Owner

Can you add or modify a test case for the change?

@tstack

tstack commented Sep 4, 2025

Copy link
Copy Markdown
Contributor Author

Can you add or modify a test case for the change?

Ah, sorry. I only added a test for build_matcher() and not whether the variables were plumbed through. I was testing that manually. I've updated the Rust test to check that.

The PR builds seem to work now, thanks!

I used the "filters" feature of insta to replace the paths in the expected output. Seems to work.

@ttiimm ttiimm left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Stack.

Comment thread Cargo.toml
serde_json = "1.0.140"
tree-sitter = "0.25.3"
tree-sitter-rust = "0.24.0"
tree-sitter-rust-orchard = "0.12.0"

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine trying out the orchard stuff, but am curious if there was a particular case that the canonical lib wasn't handling well or another reason to swap over.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it ended up making a difference. At first, I was confused by the dump of the "token-tree" code for a macro. For something like i * 2, j, it would not group the binary expression and would show the i, *, and j as siblings. So, I wasn't sure how to separate the arguments. I switched to orchard thinking it was a bug. But, it turns out the comma nodes exist, but they don't seem to be shown when converting the AST to a string.

Comment thread src/lib.rs
#[serde(rename(serialize = "srcRef"))]
pub src_ref: Option<SourceRef>,
pub variables: HashMap<String, String>,
pub variables: BTreeMap<String, String>,

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you switch to a BTreeMap here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the keys in the generated JSON would be sorted.

Comment thread src/source_ref.rs
pub(crate) text: String,
#[serde(skip_serializing)]
pub(crate) matcher: Regex,
pub(crate) args: Vec<FormatArgument>,

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using vars in the debug adapter to try and populate the variables when available. Could you also wire up the args to that or open an issue that reminds me to do it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the placeholders in the format string. I'm not sure they're relevant in the debug adapter. They're needed to figure out the variable mapping if the name is in the placeholder and not passed to the macro (e.g. "i is {i}" instead of ("i is {}", i)).

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't those be useful for showing more values? For example,

fn bar(j: u32) {
    debug!("hello from bar {j}");
}

[...] Hello from bar j=2

In this case, couldn't the debugger now show that j has the value 2 by checking for j in the args?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, vars will contain a mapping from j to 2.

The following code handles iterating through the format placeholders and captures together, then populating vars the appropriate values:

log2src/src/lib.rs

Lines 178 to 192 in 56dbaa4

for (index, (cap, placeholder)) in
std::iter::zip(captures.iter().skip(1), src_ref.args.iter()).enumerate()
{
let key = match placeholder {
FormatArgument::Named(name) => name.clone(),
FormatArgument::Positional(pos) => src_ref
.vars
.get(*pos)
.map(|s| s.as_str())
.unwrap_or("<unknown>")
.to_string(),
FormatArgument::Placeholder => src_ref.vars[index].to_string(),
};
variables.insert(key, cap.unwrap().as_str().to_string());
}

@ttiimm ttiimm merged commit 56dbaa4 into ttiimm:main Sep 5, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants