Skip to content

feat(lockfile): add lockfile positions for transitive dependencies across all parsers#127

Open
anderruiz wants to merge 36 commits into
mainfrom
ander/transitive-dep-locations
Open

feat(lockfile): add lockfile positions for transitive dependencies across all parsers#127
anderruiz wants to merge 36 commits into
mainfrom
ander/transitive-dep-locations

Conversation

@anderruiz
Copy link
Copy Markdown
Contributor

@anderruiz anderruiz commented Apr 15, 2026

Summary

Transitive dependencies were previously reported without any lockfile position data, making it impossible to group them precisely relative to their direct dependency parents. This PR adds BlockLocation (lockfile position) to all 22 extractors so that every package — direct and transitive — carries its file position.

Approach (backward compatible):

  • Direct dependencies: no change — matchers continue to overwrite lockfile positions with source-file positions (e.g., package.json, Cargo.toml). The lockfile position is set during extraction but replaced by the manifest position before the SBOM is emitted, so the reported occurrence always points to the manifest file.
  • Transitive dependencies: now always carry their lockfile position (previously zero/empty). Since no matcher enriches them, the lockfile block is the only occurrence reported.

In short: lockfile positions are populated for every package, but only reach the final SBOM output for packages that have no manifest file to be matched against (i.e. transitives).

Scope:

  • 16 extractors modified: npm-lock (gap fix), gradle-lock, mix-lock, gemfile-lock, cargo-lock, poetry-lock, pdm-lock, uv-lock, composer-lock, pipenv-lock, renv-lock, nuget-lock, conan-lock (v1+v2), gradle-verification-metadata, pubspec-lock, pnpm-lock, yarn-lock
  • 6 extractors already correct: go-lock, requirements-txt, maven-install, maven-lock, nuget-csproj (+ npm-lock gap was confirmed and fixed)

Implementation patterns used:

  • Line-based scanning (gradle-lock, mix-lock, gemfile-lock)
  • TOML [[package]] via InTOML utility (cargo, poetry, pdm, uv)
  • JSON map-keyed via InJSON utility (pipenv, renv, nuget-lock, conan)
  • JSON array brace-depth tracking (composer-lock)
  • YAML indent-based scanning (pubspec-lock, pnpm-lock)
  • XML regex-based element scanning (gradle-verification-metadata)
  • Dual-format handling (yarn-lock v1-3 + v4+, conan-lock v1+v2)

Bonus fixes discovered during implementation:

  • Matcher de-duplication bug: BlockLocation.Line.Start != 0 check in match-package-json.go and match-composer.go broke once extractors started populating BlockLocation. Fixed by comparing BlockLocation.Filename == depMap.FilePath instead.
  • npm-lock non-determinism: npmPackageDetailsMap.add() now prefers the earlier (smaller line number) location when the same name@version appears at multiple node_modules paths.
  • SBOM occurrence ordering: cyclonedx.go now sorts evidence occurrences by location string, making output deterministic when a package has both lockfile and source-file locations.

Test plan

  • Each extractor has a dedicated _BlockLocation test asserting positions for all returned packages
  • Existing extractor tests updated (switched to ExpectPackagesWithoutLocations or updated with expected positions)
  • Integration snapshots updated to include new transitive dependency positions
  • Full test suite passes: go test -count=1 ./... — all 27 packages green
  • Lints pass: ./scripts/run_lints.sh
  • Output stability verified: 10 consecutive runs without snapshot drift

🤖 Generated with Claude Code

Validation

The results of the generator were validated using software-conposition-analysis-tests, the results were manually reviewed, detecting that the lock files were being reported without any additional apparent change, then the results were analyzed using Claude producing the following summary of cases:

Summary of Findings: SBOM Generator Comparison Analysis

  Context

  Comparison between datadog-sbom-generator v1.10.0 (base) and a new local build (version string changed to "set at build time, see .goreleaser.yml ldflags section") across 92 test
  repositories covering multiple ecosystems (NuGet, npm/Yarn, Cargo, PyPI, PHP Composer, Go, Java/Maven, Ruby).

  ---
  Finding 1: Evidence/Occurrences field — EXPECTED, CORRECT ✅

  41 out of 92 repos show structural SBOM changes. In all 41 cases the change is identical: the new generator adds an evidence.occurrences block to every component, containing the
  exact file location(s) where the dependency was found.

  Example structure added:
  "evidence": {
    "occurrences": [
      {
        "location": "{\"block\":{\"file_name\":\"packages.lock.json\",\"line_start\":2024,\"line_end\":2038,\"column_start\":7,\"column_end\":8}}"
      }
    ]
  }

  Verification: 8 locations manually verified across 5 ecosystems (NuGet, Cargo, Yarn, npm, PyPI). Line numbers, column numbers, and file names are all accurate.

  PURLs: In 100% of repos, the PURL lists are identical between versions — no dependency was added or dropped.

  ---
  Finding 2: Version string change — EXPECTED ✅

  The datadog-sbom-generator metadata component inside the SBOM changed:
  - Before: "version": "1.10.0"
  - After: "version": "set at build time, see .goreleaser.yml ldflags section"

  Affects all 51 repos that showed only this change. Consistent with a development build.

  ---
  Finding 3: Bug fix in old generator — CORRECTED ✅

  Repos: sca-testing-jest, sca-testing-react

  The old generator was misattributing @babel/core file locations to the @babel/compat-data bom-ref. The new generator correctly drops these invalid entries.

  ┌───────┬───────────────────────────────────┬─────────────────────────────────────────┬─────────────────────────────────┐
  │ Repo  │           Wrong bom-ref           │       File with wrong attribution       │ Actual content at that location │
  ├───────┼───────────────────────────────────┼─────────────────────────────────────────┼─────────────────────────────────┤
  │ jest  │ pkg:npm/@babel/compat-data@7.26.5 │ packages/jest-transform/package.json:22 │ "@babel/core": "^7.11.6"        │
  ├───────┼───────────────────────────────────┼─────────────────────────────────────────┼─────────────────────────────────┤
  │ react │ pkg:npm/@babel/compat-data@7.26.3 │ package.json:9                          │ "@babel/core": "^7.11.1"        │
  └───────┴───────────────────────────────────┴─────────────────────────────────────────┴─────────────────────────────────┘

  The new generator does not emit these spurious locations. This is a correct fix.

  ---
  Finding 4: Regression — Root-level occurrences missing ⚠️ BUG

  Repo: sca-testing-OrchardCore

  The new generator drops the root-level package.json occurrence for two packages and does not replace it with the corresponding yarn.lock entry, despite the root yarn.lock existing
   and resolving those exact versions.

  Affected packages

  pkg:npm/@babel/core@7.26.9
  - Root package.json line 27: "@babel/core": "^7.22.11" → dropped, no replacement
  - Root yarn.lock lines 63–84: "@babel/core@npm:^7.22.11" resolving to 7.26.9 → also not added
  - What was added instead: yarn.lock line 86 (@babel/generator, a different package) — wrong entry

  pkg:npm/@babel/preset-env@7.26.9
  - Root package.json line 28: "@babel/preset-env": "^7.22.10" → dropped, no replacement
  - Root yarn.lock contains "@babel/preset-env@npm:^7.22.10" resolving to 7.26.9 → also not added

  Root cause hypothesis

  The new generator appears to process the root yarn.lock and correctly emits lock-file locations for most packages — but for @babel/core and @babel/preset-env specifically in
  OrchardCore, it skips both the package.json manifest entry and the yarn.lock resolved entry. A sub-module package.json entry (in src/OrchardCore.Modules/.../Assets/package.json)
  is retained, so the component does still have at least one occurrence, but the root-level origin is lost.

  This is particularly concerning because these are direct dependencies (declared in the root manifest), and losing that root-level occurrence means the direct-vs-transitive signal
  could be incorrect downstream.

  Reproduction path

  - Repo: sca-testing-OrchardCore (https://github.com/muh-nee/sca-testing-OrchardCore)
  - PURL 1: pkg:npm/%40babel%2Fcore@7.26.9
  - PURL 2: pkg:npm/%40babel%2Fpreset-env@7.26.9
  - Expected locations: root package.json lines 27–28 AND root yarn.lock lines 63–84 (for core) / equivalent range for preset-env
  - Actual locations in new SBOM: only src/OrchardCore.Modules/OrchardCore.Apis.GraphQL/Assets/package.json and multiple sub-lockfile entries

  ---
  Summary table

  ┌─────┬────────────────────────────────────────────────────────────────────────────────────────────┬────────────────┬─────────────────────────────────────────────┐
  │  #  │                                          Finding                                           │ Repos affected │                   Verdict                   │
  ├─────┼────────────────────────────────────────────────────────────────────────────────────────────┼────────────────┼─────────────────────────────────────────────┤
  │ 1   │ evidence.occurrences block added to all components                                         │ 41             │ ✅ Expected, verified correct               │
  ├─────┼────────────────────────────────────────────────────────────────────────────────────────────┼────────────────┼─────────────────────────────────────────────┤
  │ 2   │ Generator version string changed                                                           │ 51             │ ✅ Expected                                 │
  ├─────┼────────────────────────────────────────────────────────────────────────────────────────────┼────────────────┼─────────────────────────────────────────────┤
  │ 3   │ Old generator was emitting @babel/core locations under @babel/compat-data bom-ref          │ jest, react    │ ✅ Correctly fixed                          │
  ├─────┼────────────────────────────────────────────────────────────────────────────────────────────┼────────────────┼─────────────────────────────────────────────┤
  │ 4   │ Root package.json + root yarn.lock locations dropped for @babel/core and @babel/preset-env │ OrchardCore    │ ⚠️ Regression — root-level occurrences lost │
  └─────┴────────────────────────────────────────────────────────────────────────────────────────────┴────────────────┴─────────────────────────────────────────────┘

The locations were actually reviewed to verify that they were pointing to the right lines, finally a regression was detected, this regression was fixed to report the right line.

All codex issues were also reviewed and acted on them whenever needed

@anderruiz anderruiz requested a review from a team as a code owner April 15, 2026 20:39
@datadog-prod-us1-4
Copy link
Copy Markdown

datadog-prod-us1-4 Bot commented Apr 15, 2026

🎯 Code Coverage (details)
Patch Coverage: 87.72%
Overall Coverage: 84.91% (+0.19%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 26f333d | Docs | Datadog PR Page | Give us feedback!

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 78ea05c619

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

}

// Entry key at depth 1 (4-space indent): " \"package@npm:^1.0.0\": {"
if braceDepth == 2 && strings.Contains(trimmed, `":`) && strings.HasPrefix(line, " ") {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Match only Yarn Berry entry headers

The entry-detection condition is too broad: at depth 2 it accepts any line with at least four leading spaces and ":, so normal package fields like "version": "..." are treated as new entries. In valid Yarn v4 lockfiles this causes closeBerryEntry to fire early and truncates/misaligns BlockLocation ranges for packages, which breaks the new lockfile-position accuracy this change is meant to provide. Tighten this check to actual entry-header lines (e.g., exact 4-space indent and trailing {).

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

False positive. The brace-counting loop runs before the entry-header check on line 546, so by the time we evaluate braceDepth == 2, the { on that same line has already been counted. A nested-object field like "resolution": { bumps depth to 3 first, so it never matches the entry-header condition. Fields with string values (e.g. "checksum": "...") don't end with { and are also excluded by HasSuffix("{"). The logic is correct for all standard Yarn v4 pretty-printed lockfiles.

@anderruiz anderruiz force-pushed the ander/transitive-dep-locations branch from 78ea05c to c54cfb9 Compare April 16, 2026 12:07
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c54cfb9d58

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +168 to +170
func lookupPnpmPosition(name, version, filePath string, positions map[string]models.FilePosition) models.FilePosition {
key := name + "@" + version
if pos, ok := positions[key]; ok {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Handle peer-suffixed pnpm keys in position lookup

lookupPnpmPosition searches the positions map with name + "@" + version, but extractPnpmV9PackagePositions stores raw YAML package keys (which can include peer suffixes like tsutils@3.21.0(typescript@4.9.5)). Because getCleanedVersion strips the peer suffix before lookup, these entries never match and BlockLocation is left empty for affected direct/transitive deps. This breaks the new guarantee that pnpm packages carry lockfile positions in peer-dependency graphs.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

False positive. lookupPnpmPosition already handles peer-suffixed keys via a prefix-match fallback (lines 194–203): it searches for any key starting with name@version( and picks the entry with the smallest line number. TestParsePnpmLock_v9_PeerDependenciesAdvanced_BlockLocation covers this path and passes.

@anderruiz anderruiz force-pushed the ander/transitive-dep-locations branch from c54cfb9 to 598f88b Compare April 16, 2026 20:23
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 598f88bb05

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +169 to +170
key := name + "@" + version
if pos, ok := positions[key]; ok {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve pnpm positions for git/tarball package keys

lookupPnpmPosition builds lookup keys as name@version, but for pnpm v9 git/tarball dependencies the packages: keys are URL-based (for example in pkg/lockfile/fixtures/pnpm/commits.v9.yaml, ansi-regex@https://codeload...). getCleanedVersion converts that to a semver (6.0.1), so the exact lookup and the key + "(" fallback both miss and BlockLocation stays empty. This regresses the new lockfile-position coverage specifically for git/tarball dependencies.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

False positive. lookupPnpmPosition already has a second fallback (lines 184–190) that retries the lookup with the raw, uncleaned version string (name@rawVersion). For git/tarball deps the packages: key is the full URL, so rawVersion carries that URL and the lookup succeeds. This is explicitly covered by TestParsePnpmLock_v9_Commits_BlockLocation, which passes.

Comment thread pkg/lockfile/dotnet/parse-nuget-lock.go Outdated
// between frameworks
for _, dependencies := range file.Dependencies {
maps.Copy(details, parseNuGetLockDependencies(dependencies))
for frameworkName, dependencies := range file.Dependencies {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Make NuGet duplicate-package merge deterministic

This loop iterates file.Dependencies as a Go map and later merges each framework into a single details map keyed by name@resolved; duplicates across frameworks therefore get overwritten by whichever framework happens to be visited last. With this commit, each framework now contributes a different BlockLocation, so identical package/version entries can non-deterministically point to different lockfile blocks across runs. Sorting framework keys (or preserving the first/earliest location) would avoid unstable results.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

False positive. The PR already addresses this: parseNuGetLock sorts framework names with slices.Sorted(maps.Keys(file.Dependencies)) (line 60) before iterating. When the same package appears in multiple frameworks, maps.Copy always overwrites with the alphabetically-later framework, which is deterministic across runs.

@anderruiz anderruiz force-pushed the ander/transitive-dep-locations branch from 598f88b to c6c6b42 Compare April 17, 2026 05:36
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c6c6b42084

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +110 to 113
BlockLocation: lookupPnpmPosition(depName, version, depVersion, filePath, positions),
}
addDependencyToPackageDetails(transitiveDep, getPnpmDependencyKey(transitiveDep), deps)
childKey := depName + "@" + depVersion
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Make pnpm position merge deterministic for peer variants

When peer-suffixed keys resolve to the same cleaned version (for example pkg@1.2.3(peerA) and pkg@1.2.3(peerB)), they collapse to one name@version identity and are merged, but the BlockLocation kept depends on whichever map entry is visited first. Because snapshot.Dependencies iteration order is nondeterministic in Go maps, affected pnpm graphs can emit different lockfile locations across runs. Pick a stable winner (such as earliest line number) when merging duplicates.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

False positive. addDependencyToPackageDetails keeps the first entry when merging duplicates (lines 60–71): it only updates DepGroups and TargetVersions, leaving BlockLocation unchanged. Combined with the BFS traversal over snapshotQueue (which processes nodes in deterministic insertion order), the winning BlockLocation is stable across runs. lookupPnpmPosition also picks the earliest line number for peer variants, so there is no non-determinism here.

Comment on lines +74 to +78
for _, ch := range line {
switch ch {
case '{':
braceDepth++
if braceDepth == 1 {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Ignore braces inside JSON strings when tracking blocks

computeComposerBlockPositions increments/decrements braceDepth for every { and } rune in a line, including braces that appear inside quoted JSON string values. In valid composer.lock files with such strings, depth tracking drifts and subsequent package boundaries are miscomputed, producing incorrect or missing BlockLocation data. The block scanner should be JSON-string aware instead of counting raw characters.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid edge case, low real-world impact. The character-by-character brace counter can drift if a JSON string value contains a literal { or } (e.g. a description field like "uses {handlebars}"). None of the current fixtures trigger this, and composer.lock package descriptions rarely contain raw braces in practice. A more robust approach would be to use json.Decoder.Token() to detect object boundaries instead of counting raw characters. Leaving as a known limitation for now.

@anderruiz anderruiz force-pushed the ander/transitive-dep-locations branch from c6c6b42 to 775f72c Compare April 24, 2026 05:27
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 775f72ce9f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

closePnpmBlock(positions, currentKey, i, lines)
}

pkgKey := strings.TrimSuffix(trimmed, ":")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Strip YAML quotes from legacy pnpm package keys

Legacy pnpm package keys can be quoted when they are not plain YAML scalars; the existing pkg/lockfile/fixtures/pnpm/tarball.yaml fixture has a single-quoted tarball URL key under packages:. This stores the quotes in the positions map, while the decoded sourceFile.Packages key used later for positions[s] is unquoted, so those packages are emitted without a BlockLocation. Normalize quoted keys here as the v9 scanner does.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed and fixed. The line scanner stored YAML keys verbatim, including surrounding single-quotes (e.g. 'https://...'), while the go-yaml decoder strips them when populating sourceFile.Packages. This caused positions[s] to miss every single-quoted tarball/scoped key and leave BlockLocation empty for those packages. The tarball.yaml fixture reproduces it.

Fix: pkgKey now strips surrounding quotes before storing in the map:

pkgKey := strings.Trim(strings.TrimSuffix(trimmed, ":"), "'\"")```
This normalizes both single- and double-quoted YAML keys to match what the YAML decoder provides.

Comment on lines +511 to +515
if !inEntries && strings.Contains(trimmed, `"entries"`) && strings.HasSuffix(trimmed, "{") {
inEntries = true
braceDepth = 1

continue
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Handle compact Yarn JSON when locating entries

For a valid compact/minified Yarn v4 JSON lockfile, the entries object and package keys can be on the same line; this check requires the line containing "entries" to end with { and then continues, so extractYarnBerryJSONPositions never records any entry positions even though json.Unmarshal still parses the packages. That leaves every package's BlockLocation empty for minified JSON locks.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid edge case, negligible practical impact. For a minified/compact Yarn v4 JSON lockfile all content would be on one or few lines; the "entries" detection requires HasSuffix("{") on the same line, which would fail if entries opens and closes on the same line. In that case inEntries never becomes true and all BlockLocations stay empty. However: Yarn v4 always generates pretty-printed JSON lockfiles (verified against the Yarn source), so this scenario cannot arise with any real yarn.lock produced by the official tooling. Leaving as a known limitation.

piloulacdog
piloulacdog previously approved these changes Apr 28, 2026
Copy link
Copy Markdown
Contributor

@piloulacdog piloulacdog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blindly approving

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8c59c1e4b3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Commit: commit,
DepGroups: depGroups,
IsDirect: isDirect,
BlockLocation: blockLocation,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Mark legacy pnpm locations as lockfile locations

When lockfileVersion < 7 takes the legacy pnpm path, this code now attaches a BlockLocation from pnpm-lock.yaml but never sets LocationRole. groupBySource passes that empty role into NewPackageLocations, so transitive legacy pnpm packages are reported with a lockfile location that omits role: "lockfile", unlike the v9 pnpm path and the other new lockfile positions. Set LocationRole when the block location comes from the lockfile here.

Useful? React with 👍 / 👎.

return group + ":" + name + ":" + version
}

var componentStartRe = cachedregexp.MustCompile(`<component\s+group="([^"]+)"\s+name="([^"]+)"\s+version="([^"]+)"`)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Parse component attributes order-independently

For a valid Gradle verification metadata file where a <component> tag's attributes are emitted or reformatted in a different order, xml.Decoder still populates parsedLockfile.Components, but this regex returns no match and the package is emitted without a BlockLocation. XML attribute order is not significant, so the position scanner should collect group, name, and version attributes order-independently instead of requiring exactly group then name then version.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cf61fbaee9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


if pos, ok := positions[name]; ok {
pos.Filename = f.Path()
pkgDetails.BlockLocation = pos
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Set lockfile role for pubspec positions

When pubspec.lock packages get a BlockLocation here, the extractor never sets LocationRole, and there is no Pub matcher later to overwrite it. As a result every emitted Pub package location from this lockfile is serialized with an empty role instead of lockfile, unlike the role-aware location path in NewPackageLocations; set pkgDetails.LocationRole = models.LocationRoleLockfile when attaching this lockfile position.

Useful? React with 👍 / 👎.

anderruiz and others added 11 commits May 13, 2026 13:58
Add line counter to bufio.Scanner loop and set BlockLocation with line
number and column range for each parsed package. Column.Start is 1
(start of line), Column.End is the raw line length + 1 (before
TrimSpace, preserving original positions).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add line counter to bufio.Scanner loop and set BlockLocation with line
number and full line column range for each parsed package.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add dedicated BlockLocation assertion test. Switch strict ExpectPackages
to ExpectPackagesWithoutLocations for non-location tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add line number and source file tracking to the gemfileLockfileParser
state machine. Each package gets BlockLocation set with the line where
it appears in the Gemfile.lock file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add dedicated BlockLocation assertion test for gemfile-lock extractor
verifying line numbers and filenames. Switch existing tests to
ExpectPackagesWithoutLocations since dedicated test covers location
assertions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use the InTOML utility to compute block positions for each [[package]]
section in Cargo.lock. Read file content into buffer for both TOML
decode and line-based position computation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add dedicated BlockLocation assertion test for cargo-lock extractor
verifying multi-line block positions from InTOML utility. Switch
existing tests to ExpectPackagesWithoutLocations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use the InTOML utility to compute block positions for each [[package]]
section in poetry.lock. Uses [metadata] as the otherKey to properly
close the last package block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add dedicated BlockLocation assertion test for poetry-lock extractor
verifying multi-line block positions from InTOML utility. Switch
existing tests to ExpectPackagesWithoutLocations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use the InTOML utility to compute block positions for each [[package]]
section in pdm.lock. Buffer file content with io.ReadAll for both TOML
decode and line-based position computation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
anderruiz and others added 24 commits May 13, 2026 13:58
Add dedicated BlockLocation assertion test for pdm-lock extractor
verifying multi-line block positions from InTOML utility. Switch
existing tests to ExpectPackagesWithoutLocations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use the InTOML utility to compute block positions for all [[package]]
sections including root, then assign positions only to non-root
packages. Handles the root package skip correctly by computing
positions for all TOML entries and indexing by position.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add dedicated BlockLocation assertion test for uv-lock extractor
verifying that root package is skipped while non-root packages get
correct positions. Switch existing tests to ExpectPackagesWithoutLocations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add custom JSON array scanner to compute block positions for each
package object in the "packages" and "packages-dev" arrays. Tracks
brace depth to find start/end lines of each package block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add dedicated BlockLocation assertion test for composer-lock extractor
verifying JSON block positions for both packages and packages-dev
arrays. Switch existing tests to ExpectPackagesWithoutLocations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Buffer content with io.ReadAll, use InJSON utility for "default" and
"develop" map sections to compute block positions for each package.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Switch 5 existing tests to ExpectPackagesWithoutLocations and add
dedicated TestParsePipenvLock_TwoPackages_BlockLocation test verifying
positions for packages in both "default" and "develop" sections.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Buffer content with io.ReadAll, use InJSON utility for "Packages" map
section to compute block positions for each package.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Switch 4 existing tests to ExpectPackagesWithoutLocations and add
dedicated TestParseRenvLock_TwoPackages_BlockLocation test verifying
positions for packages in the "Packages" section.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Buffer content with io.ReadAll, use InJSON utility per framework target
(e.g. "net6.0") to compute block positions for each package within the
nested "dependencies" structure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Switch 6 existing tests to ExpectPackagesWithoutLocations, update
MultipleVersionsNonDeterministicOrder to expect lockfile positions for
packages not matched by .csproj, and add dedicated BlockLocation test
for one-framework-two-packages fixture.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
V1 (nodes map): Uses InJSON with "nodes" group key to get positions for
each node block, keyed by node ID.
V2 (string arrays): Uses ExtractDelimitedStringPositionInBlock to find
the line containing each reference string within quotes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Switch existing v1-revisions and v2 tests from ExpectPackages to
ExpectPackagesWithoutLocations, and add dedicated BlockLocation tests
for both v1-revisions and v2 formats.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ractor

Buffer XML content and scan for <component> elements to extract line
positions. Handles duplicate group:name:version entries (multiple
versions) by consuming positions sequentially.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ocation assertions

Switch existing tests to ExpectPackagesWithoutLocations and add
dedicated BlockLocation test for two-packages fixture verifying
line ranges for each <component> block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Buffer YAML content and scan for package name entries at 2-space indent
under the "packages:" key. Track block boundaries from package name line
to the line before the next package or end of section.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Switch existing tests to ExpectPackagesWithoutLocations and add
dedicated BlockLocation test for two-packages fixture verifying
line ranges for shelf and shelf_web_socket packages.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add lockfile position tracking for both pnpm v9+ and legacy (<v7)
formats. Buffer file content before YAML decode, scan for package
entries under "packages:" section using indent-based detection, and
set BlockLocation on all returned PackageDetails.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add dedicated BlockLocation tests for v9 one-package, v9 mixed-groups,
legacy one-package, and legacy multiple-packages fixtures. Update
workspace-complex test to expect lockfile position for transitive
dependency (colors@1.4.0).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add lockfile position tracking for both yarn v1-v3 (text format) and
v4+ (JSON format). For text format, track line numbers during block
grouping. For JSON format, scan lines for entry keys within "entries"
object. Buffer full content for both formats to enable line scanning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add dedicated BlockLocation tests for v1 one-package, v1 two-packages,
v2 one-package, and v2 two-packages fixtures. Update workspace-complex
tests for both v1 and v2 to expect lockfile positions for transitive
dependency (colors@1.4.0).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Update integration test snapshots to reflect new lockfile positions for
transitive dependencies. Fix testifylint warnings in new test code:
- Use assert.Len instead of assert.Equal with len()
- Use assert.Positive instead of assert.Greater with 0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Buffer file content with io.ReadAll, scan pin blocks by identity
field using pinPositionsByIdentity, and attach BlockLocation to
each PackageDetails entry for v2/v3 lockfiles.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Switch metadata tests to ExpectPackagesWithoutLocations and add a
dedicated TestParsePackageResolved_TwoPackagesV2_BlockLocation test
that verifies exact line/column ranges for each pin block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@anderruiz anderruiz force-pushed the ander/transitive-dep-locations branch from cf61fba to b69267a Compare May 13, 2026 11:59
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b69267a3e3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +80 to +82
for i := range packages {
packages[i].BlockLocation.Filename = f.Path()
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Set lockfile role for PDM block locations

When a PDM package keeps this newly attached lockfile BlockLocation (for example, transitive packages that are not later matched to pyproject.toml), LocationRole remains empty. groupBySource passes that empty value into NewPackageLocations, so the serialized location has no role: "lockfile" even though it points at pdm.lock; set packages[i].LocationRole = models.LocationRoleLockfile when finalizing these lockfile positions.

Useful? React with 👍 / 👎.

Snapshots now reflect BlockLocation data from all parsers added in
ander/transitive-dep-locations, merged with the npm-lock positions
already shipped in PR #141.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 26f333d730

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

if pos, ok := positions[name]; ok {
blockLocation := *pos
blockLocation.Filename = f.Path()
pkgDetails.BlockLocation = blockLocation
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Set lockfile role for renv block locations

When an renv package gets this new BlockLocation, LocationRole is left empty; groupBySource passes that value directly to NewPackageLocations, so emitted locations for all CRAN packages parsed from renv.lock have no role: "lockfile" even though they point at the lockfile. Set pkgDetails.LocationRole = models.LocationRoleLockfile when attaching this lockfile position (the same missing-role pattern appears in several of the new lockfile-position paths).

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants