feat: extend scanner language support for mobile dev (stacked on #10) #17
Open
smk508 wants to merge 2 commits into cytostack:main from
Adds Flutter-critical languages (Kotlin, Swift, Objective-C) plus common gaps (C++ variants, C#, Ruby, PHP, Lua, Vue, Svelte, HTML, Protobuf, GraphQL, Terraform, shell variants) to both extension sets. Also brings src/tracker/token-estimator.ts CODE_EXTS back in sync with src/scanner/anatomy-scanner.ts CODE_EXTENSIONS — the two sets had drifted apart since only CODE_EXTENSIONS gets the .dart addition from cytostack#10. Adds a one-line "Keep in sync with ..." comment above each so future additions hit both places. These sets control the chars-per-token ratio (3.5 for code vs 3.75 fallback) used by estimateTokens; the net effect is ~7% more accurate token accounting in anatomy.md and detectContentType() consumers for projects written in these languages.
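As a rough illustration of what the two extension sets control, the sketch below maps a file extension to a content type and then to a chars-per-token ratio. The names `detectContentType`, `estimateTokens`, and `CODE_EXTS` mirror those mentioned in this PR, but the function bodies, the `RATIOS` table, the local `extname` helper, the abbreviated extension list, and the `Math.ceil` rounding are all assumptions made for illustration, not the repository's actual code.

```typescript
// Hypothetical sketch; not the actual implementation in this repository.
type ContentType = "code" | "mixed" | "prose";

// Chars-per-token ratios as described in this PR.
const RATIOS: Record<ContentType, number> = {
  code: 3.5,
  mixed: 3.75, // fallback for extensions not in the code set
  prose: 4.0,
};

// Abbreviated for illustration; the real set is longer.
const CODE_EXTS: Set<string> = new Set([
  ".ts", ".dart", ".kt", ".swift", ".m", ".mm",
]);

// Minimal stand-in for path.extname(), kept local so the sketch has no imports.
function extname(file: string): string {
  const base = file.slice(file.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  return dot > 0 ? base.slice(dot).toLowerCase() : "";
}

// Files with a known code extension are "code"; everything else falls through.
function detectContentType(file: string): ContentType {
  return CODE_EXTS.has(extname(file)) ? "code" : "mixed";
}

// Token estimate is character count divided by the ratio for the content type.
// Rounding up is an assumption of this sketch.
function estimateTokens(text: string, type: ContentType): number {
  return Math.ceil(text.length / RATIOS[type]);
}
```

Under these assumptions, adding an extension to the set never changes whether a file is scanned; it only switches that file's divisor from 3.75 to 3.5.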
Hey, this is a stacked follow-up to #10. @levnikmyskin's patch adds `.dart` to `CODE_EXTENSIONS` in `src/scanner/anatomy-scanner.ts:44`; I wanted to extend that to cover the rest of the Flutter toolchain (Kotlin, Swift, ObjC) and a few other common gaps, and fix a small drift issue I spotted in the same area.

This PR is stacked on #10, so the commit history here is @levnikmyskin's unchanged `.dart` commit followed by mine. If #10 merges first, I'll rebase and this diff will shrink to only the net-new additions. If this one lands first, #10 becomes a no-op.

## The context

`CODE_EXTENSIONS` isn't a file-inclusion gate; it only controls which chars-per-token ratio `estimateTokens` uses (3.5 for code, 3.75 fallback, 4.0 for prose). A `.dart` file was already being scanned and written to `anatomy.md` before #10; its token count was just ~7% low because it fell through to the default ratio. Same is true today for `.kt`, `.swift`, `.m`, `.mm`, and plenty of others. Worth flagging in case the framing of "add dart support" made the change sound bigger than it is.

## The drift

There are actually two near-identical extension sets in the repo:

- `CODE_EXTENSIONS` in `src/scanner/anatomy-scanner.ts:41`, used by the anatomy scanner's internal `estimateTokens`.
- `CODE_EXTS` in `src/tracker/token-estimator.ts:3`, used by the exported `detectContentType()` helper that other parts of the codebase consume.

They drifted apart the moment #10 added `.dart` to the first set only. This PR brings them back in sync (`.dart` is added to `CODE_EXTS` here) and drops a one-line `// Keep in sync with ...` comment above each set so the coupling is visible to the next person who touches either file.

## The additions

Added to both sets (`.dart` was already in `CODE_EXTENSIONS` from #10 and is newly added to `CODE_EXTS` here):

- `.kt`, `.kts`, `.swift`, `.m`, `.mm`
- `.hpp`, `.hh`, `.cc`, `.cxx`
- `.cs`, `.rb`, `.php`, `.lua`
- `.vue`, `.svelte`, `.html`, `.htm`
- `.proto`, `.graphql`, `.gql`, `.tf`
- `.bash`, `.zsh`, `.fish`

No behavioral change beyond the ratio swap: `description-extractor.ts` already handles per-language description extraction for Kotlin, Swift, Dart, Ruby, C#, PHP, etc. via its own `path.extname()` routing, so nothing else needs touching.

## Testing
Environment: macOS Darwin 25.3.0, Node 22, pnpm build pipeline unchanged.
Scratch Flutter layout used for rows 5–6:
1. `pnpm build`
2. `node dist/bin/openwolf.js --help`
3. `detectContentType("foo.kt" / ".swift" / ".m" / ".mm" / ".dart")` on upstream/main: `"mixed"` (falls through to 3.75 ratio)
4. The same call on this branch: `"code"` (3.5 ratio)
5. `buildAnatomy()` on scratch Flutter layout, upstream/main: `MainActivity.kt ~46 tok`, `AppDelegate.swift ~61 tok`, `main.dart ~76 tok`
6. The same call on this branch: `MainActivity.kt ~49 tok`, `AppDelegate.swift ~65 tok`, `main.dart ~82 tok` (+6.5%, +6.6%, +7.9% respectively)
7. `estimateTokens(text, "mixed")` vs `estimateTokens(text, "code")`
8. `anatomy.md` section/description output on scratch project: `description-extractor.ts` (no change needed there)

Rows 3 and 5 were captured by temporarily reverting both files to `upstream/main` on the same branch, running the same node one-liners, then restoring.
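
For anyone wanting to sanity-check the before/after numbers, the arithmetic can be sketched as follows. The token counts are the ones reported above; the `percentIncrease` helper and the variable names are hypothetical and exist only for this illustration.

```typescript
// Expected relative increase when a file moves from the 3.75 fallback
// ratio to the 3.5 code ratio: 3.75 / 3.5 - 1, roughly 7.1%.
const expectedIncrease: number = 3.75 / 3.5 - 1;

// Before/after token counts as reported in the testing notes above.
const observed: { file: string; before: number; after: number }[] = [
  { file: "MainActivity.kt", before: 46, after: 49 },
  { file: "AppDelegate.swift", before: 61, after: 65 },
  { file: "main.dart", before: 76, after: 82 },
];

// Percentage increase, rounded to one decimal place (hypothetical helper).
function percentIncrease(before: number, after: number): number {
  return Math.round(((after - before) / before) * 1000) / 10;
}

const deltas: number[] = observed.map((o) => percentIncrease(o.before, o.after));
console.log(deltas, (expectedIncrease * 100).toFixed(1));
```

The per-file deltas sit on either side of the ~7.1% expected from the ratio swap alone; the spread is presumably down to the reported counts being rounded to whole tokens on files this small.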