Add InlineIngredients function for amount injection into instructions #16
Open
xcapaldi wants to merge 29 commits into
Parse RecipeMD format: title, description, tags, yields, ingredients, ingredient groups, and instructions. Includes amount parsing with fractions, decimals, and unicode vulgar fractions.
Validation: empty titles, missing dividers, invalid amounts, multiple tags/yields, empty ingredients, paragraphs in ingredients section.
Extract description and instructions from raw source bytes instead of reconstructing from AST nodes. This preserves CommonMark reference-style links and their definitions exactly as written. Also fixes setext heading underlines (===) being included in description.
All 31 canonical tests pass.
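The amount parsing mentioned above can be sketched as follows. This is a minimal illustration, not the PR's code: `parseFactor` and the `vulgar` table are hypothetical names, the table covers only a handful of vulgar fractions, and the sketch handles only the numeric factor (no unit).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// vulgar is an illustrative subset of unicode vulgar fractions.
var vulgar = map[rune]float64{'½': 0.5, '¼': 0.25, '¾': 0.75, '⅓': 1.0 / 3, '⅔': 2.0 / 3}

// parseFactor (hypothetical) accepts decimals ("1.5"), fractions ("1/2"),
// mixed numbers ("1 1/2") and a trailing vulgar fraction ("½", "1½").
func parseFactor(s string) (float64, error) {
	s = strings.TrimSpace(s)
	if s == "" {
		return 0, fmt.Errorf("empty amount")
	}
	total := 0.0
	// Peel off a trailing vulgar fraction, if present.
	runes := []rune(s)
	if v, ok := vulgar[runes[len(runes)-1]]; ok {
		total = v
		s = strings.TrimSpace(string(runes[:len(runes)-1]))
	}
	// Sum the remaining whitespace-separated parts.
	for _, part := range strings.Fields(s) {
		if num, den, ok := strings.Cut(part, "/"); ok {
			n, err1 := strconv.ParseFloat(num, 64)
			d, err2 := strconv.ParseFloat(den, 64)
			if err1 != nil || err2 != nil || d == 0 {
				return 0, fmt.Errorf("invalid fraction %q", part)
			}
			total += n / d
			continue
		}
		f, err := strconv.ParseFloat(part, 64)
		if err != nil {
			return 0, fmt.Errorf("invalid number %q", part)
		}
		total += f
	}
	return total, nil
}

func main() {
	for _, in := range []string{"1.5", "1/2", "1 1/2", "½", "1½"} {
		v, err := parseFactor(in)
		fmt.Println(in, v, err)
	}
}
```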
Replace the single-pass state machine with a cleaner three-phase approach:
1. splitSections: uses goldmark to count real thematic breaks (correctly ignoring --- inside fenced code blocks or setext H2 headers), then a raw line scan for exact byte positions. This preserves invisible content (link reference definitions, setext underlines, fenced code block delimiters) that goldmark's AST does not expose as nodes.
2. parsePreamble: parses only the preamble section for title, description, tags, and yields. Eliminates the state machine and the excludeRanges mechanism; uses skipSetextUnderline for correct descStart on setext headings.
3. parseIngredientsSection: parses only the ingredients section, reusing the existing parseIngredientList/parseIngredientGroup helpers.
The refactor also fixes a correctness bug in the original findThematicBreaks: it would previously match --- inside fenced code blocks.
Removes: parserState type, stateStart/Description/TagsYields/Ingredients/Instructions constants, the ast.Walk state machine closure, skipSetextUnderline (restored), and the excludeRanges accumulator pattern.
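To illustrate the kind of line scan described for splitSections, here is a simplified stdlib-only sketch. It does not use goldmark; instead it approximates the "ignore fenced code and setext underlines" rule with its own fence and previous-line tracking, which is a swapped-in simplification and misses CommonMark details such as fence length and list or blockquote context. `isBreakLine` and `breakOffsets` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// isBreakLine reports whether a line looks like a thematic break: at most
// three leading spaces, then three or more of the same '-', '*' or '_',
// optionally separated by spaces or tabs.
func isBreakLine(line string) bool {
	if len(line)-len(strings.TrimLeft(line, " ")) > 3 {
		return false
	}
	t := strings.TrimSpace(line)
	if t == "" {
		return false
	}
	marker := t[0]
	if marker != '-' && marker != '*' && marker != '_' {
		return false
	}
	count := 0
	for i := 0; i < len(t); i++ {
		switch t[i] {
		case marker:
			count++
		case ' ', '\t':
		default:
			return false
		}
	}
	return count >= 3
}

// breakOffsets returns the byte offset of each thematic-break line, skipping
// lines inside fenced code blocks and "---" setext heading underlines.
func breakOffsets(src string) []int {
	var offsets []int
	fence := ""       // active fence marker, empty when outside a fence
	prevText := false // previous line was ordinary text
	pos := 0
	for _, line := range strings.SplitAfter(src, "\n") {
		t := strings.TrimSpace(line)
		switch {
		case fence != "":
			if strings.HasPrefix(t, fence) {
				fence = ""
			}
			prevText = false
		case strings.HasPrefix(t, "```"):
			fence, prevText = "```", false
		case strings.HasPrefix(t, "~~~"):
			fence, prevText = "~~~", false
		case isBreakLine(line):
			// A run of only '-' directly under text is a setext underline.
			if !(prevText && strings.Trim(t, "-") == "") {
				offsets = append(offsets, pos)
			}
			prevText = false
		default:
			prevText = t != ""
		}
		pos += len(line)
	}
	return offsets
}

func main() {
	src := "# Title\n\n---\n\n- *1* egg\n\n```\n---\n```\n\n---\n\nMix.\n"
	fmt.Println(breakOffsets(src))
}
```

Only the two real dividers are reported; the `---` inside the fence is skipped.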
Add two CLI commands matching the RecipeMD CLI specification:
- recipemd: parse, display, scale, and serialize recipes with support for title-only, ingredients-only, JSON output, rounding, multiply, and yield-based scaling
- recipemd-find: search recipe collections with filter expressions, list recipes/tags/ingredients/units with multi-column output
Walk the full-document AST linearly (preamble → thematic break → ingredients → thematic break → instructions) instead of splitting into byte slices and re-parsing each section separately. ~43% faster, ~45% less memory, ~35% fewer allocs vs the 3-parse original (17.5μs vs 30.9μs per recipe).
Explored storing and threading goldmark AST nodes through the Recipe struct so renderers could derive text on demand. Not worth it: AST nodes are read-only source-offset references that can't survive scaling, serialization, or caching. Pure data structs (eagerly extracted strings) stay decoupled from goldmark and are trivially cacheable in SQLite.
Also fixes ingredients_empty.invalid: empty list items now correctly error instead of producing a nameless ingredient.
Migrate to a `Parser` struct with functional options. This struct contains the underlying Goldmark parser/renderer for future HTML rendering.
A few functional options are exposed:
- a simple frontmatter extension which ignores YAML or TOML frontmatter
- a GitHub Flavored Markdown (GFM) extension
The frontmatter extension simply allows RecipeMD files to pass if they have some frontmatter (maybe as part of another note system like Prot's Denote). The frontmatter is not currently parsed or rendered, to keep dependencies minimal.
The GFM extension is almost purely for rendering purposes, except for autolink support, which infers links from plain text. With that extension, bare URLs in ingredients will correctly populate the link field.
Recipe struct is now pure data and transforms on that data. No IO.
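The functional-options pattern described above generally looks like the sketch below. The exported names follow the ones listed elsewhere in this PR (Parser, Option, NewParser, WithFrontmatter, WithGithubFormattedMarkdown); the struct fields are assumptions, and the goldmark instance the real struct holds is omitted.

```go
package main

import "fmt"

// Parser holds configuration. The field names are assumptions for this
// sketch; the real struct also wraps a goldmark parser/renderer.
type Parser struct {
	frontmatter bool
	gfm         bool
}

// Option mutates a Parser during construction.
type Option func(*Parser)

// WithFrontmatter lets documents with YAML or TOML frontmatter pass.
func WithFrontmatter() Option {
	return func(p *Parser) { p.frontmatter = true }
}

// WithGithubFormattedMarkdown enables the GFM extension.
func WithGithubFormattedMarkdown() Option {
	return func(p *Parser) { p.gfm = true }
}

// NewParser applies the options in order over the defaults.
func NewParser(opts ...Option) *Parser {
	p := &Parser{}
	for _, opt := range opts {
		opt(p)
	}
	return p
}

func main() {
	p := NewParser(WithFrontmatter())
	fmt.Printf("%+v\n", *p)
}
```

Callers that want the defaults write `NewParser()`, and new options can be added later without changing that call.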
Fixed a bug where --- inside fenced code was treated as a thematic break.
Added support for GFM task lists, which look like normal CommonMark lists except they have a "checkbox":
- [ ] an unchecked element
- [x] a checked element
When used as an ingredient list, this breaks the RecipeMD spec, which expects the first element to be the amount (with its unit) in emphasis. It would still parse, but the entire `[x] *1 cup* flour` would be treated as the text field. With this commit, we properly skip the ast.TaskCheckBox before parsing.
Finally, these were found because we added a massive golden test suite in the vein of the canonical tests but extended to cover many more cases.
Clean it up and split into files.
Match Python reference where possible, accept Go flag package defaults.
Differences from Python:
- Flags must precede positional args (Go flag behavior). For recipemd-find this means: `recipemd-find [flags] <action> [folder]`, not `recipemd-find <action> [folder] [flags]`.
- -h/--help exits 2 (Go default), not 0
- --export-links always requires DIR argument
- Error/help message formatting differs
Go-only extensions: --gfm, --frontmatter
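The flag-ordering difference comes from the standard `flag` package, which stops parsing at the first non-flag argument. A small sketch shows this; `parseArgs` is a hypothetical helper, only the `--gfm` flag is modelled, and "tags" stands in for an action name.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// parseArgs (hypothetical) parses a recipemd-find style command line and
// returns the --gfm value plus whatever the flag package left unparsed.
func parseArgs(args []string) (bool, []string, error) {
	fs := flag.NewFlagSet("recipemd-find", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet on errors
	gfm := fs.Bool("gfm", false, "enable the GFM extension")
	if err := fs.Parse(args); err != nil {
		return false, nil, err
	}
	return *gfm, fs.Args(), nil
}

func main() {
	gfm, rest, err := parseArgs(os.Args[1:])
	fmt.Println(gfm, rest, err)
}
```

With `--gfm tags .` the flag is set and the positionals are `tags` and `.`; with `tags . --gfm` parsing stops at `tags`, so `--gfm` is returned as a positional argument and the flag stays false.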
Replace cmd/recipemd and cmd/recipemd-find with a set of simple, self-contained examples in examples/ that demonstrate library usage:
- examples/parse/ – parse a .md file, output compact JSON (pipe to jq)
- examples/scale/ – scale a recipe by factor or yield, output markdown
- examples/flatten/ – inline linked recipes, output markdown
Also have some minimal tests for the examples to ensure future changes don't break the examples.
Instead of returning on the first error, Parse() now collects all recoverable errors and returns them together via errors.Join. Each error is wrapped in a *ParseError that includes byte offset, 1-based line, and 1-based column. This is enough info for a future LSP diagnostic.
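A minimal sketch of this error-collection approach, assuming a `ParseError` shaped roughly as described (byte offset, 1-based line, 1-based column). The field names, `newParseError`, and the toy `checkIngredients` validator are hypothetical, and the column here is counted in bytes.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ParseError carries a position; the field names are assumptions.
type ParseError struct {
	Offset int // byte offset into the source
	Line   int // 1-based
	Column int // 1-based, counted in bytes
	Msg    string
}

func (e *ParseError) Error() string {
	return fmt.Sprintf("%d:%d: %s", e.Line, e.Column, e.Msg)
}

// newParseError derives line and column from a byte offset.
func newParseError(src string, offset int, msg string) *ParseError {
	before := src[:offset]
	line := strings.Count(before, "\n") + 1
	// LastIndex returns -1 on the first line, which yields offset+1.
	col := offset - strings.LastIndex(before, "\n")
	return &ParseError{Offset: offset, Line: line, Column: col, Msg: msg}
}

// checkIngredients is a toy validator: it reports every empty list item
// instead of stopping at the first one.
func checkIngredients(src string) error {
	var errs []error
	pos := 0
	for _, line := range strings.SplitAfter(src, "\n") {
		if strings.TrimSpace(line) == "-" {
			errs = append(errs, newParseError(src, pos, "empty ingredient"))
		}
		pos += len(line)
	}
	return errors.Join(errs...) // nil when nothing was collected
}

func main() {
	err := checkIngredients("- *1* egg\n-\n- flour\n- \n")
	fmt.Println(err)

	// errors.As still finds an individual *ParseError inside the joined error.
	var pe *ParseError
	if errors.As(err, &pe) {
		fmt.Println("first error at line", pe.Line)
	}
}
```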
Accepting io.Reader is idiomatic Go for byte-stream consumption. Callers can now pass *os.File, http.Response.Body, strings.NewReader, etc. directly. Internally io.ReadAll buffers the bytes for goldmark, which is identical overhead to what callers did before with os.ReadFile.
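As a rough illustration of the io.Reader signature, the sketch below uses a hypothetical `parseTitle` as a stand-in for Parse: it buffers the stream with io.ReadAll and then works on the bytes. `parseTitleString` is an equally hypothetical convenience wrapper for callers that already hold a string.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// parseTitle (hypothetical stand-in for Parse) buffers the whole stream,
// then extracts the first line as the title.
func parseTitle(r io.Reader) (string, error) {
	src, err := io.ReadAll(r)
	if err != nil {
		return "", err
	}
	first, _, _ := strings.Cut(string(src), "\n")
	return strings.TrimSpace(strings.TrimPrefix(first, "#")), nil
}

// parseTitleString wraps a string in a reader; an *os.File or an HTTP
// response body could be passed to parseTitle in the same way.
func parseTitleString(s string) (string, error) {
	return parseTitle(strings.NewReader(s))
}

func main() {
	title, err := parseTitleString("# Pancakes\n\nFluffy.\n")
	fmt.Println(title, err)
}
```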
Implements Parser.RenderHTML that converts a Recipe to an HTML <article> element. Markdown fields (Description, Instructions) are rendered to HTML via the parser's goldmark instance. Ingredient amounts are wrapped in <em>, yields in <strong>, and tags in <em> to restore the emphasis conveyed by the original RecipeMD markdown formatting. All elements carry class attributes matching RecipeMD types (recipemd-recipe, recipemd-title, recipemd-preamble, recipemd-description, recipemd-tags, recipemd-yields, recipemd-separator, recipemd-ingredients, recipemd-ingredient-list, recipemd-ingredient, recipemd-amount, recipemd-ingredient-name, recipemd-ingredient-link, recipemd-ingredient-group, recipemd-group-title, recipemd-instructions) for CSS styling. Ingredient group headings use h2-h6 matching nesting depth. Nested groups are supported recursively.
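The heading-level rule for ingredient groups might be sketched as below. The mapping is an assumption: the source only says h2-h6 match nesting depth, so this sketch takes depth 0 as h2 and clamps deeper groups at h6. `groupHeading` is a hypothetical helper; the class name is the one listed above.

```go
package main

import (
	"fmt"
	"html"
)

// groupHeading (hypothetical) renders an ingredient group title.
// Assumption: depth 0 maps to h2, and anything deeper than h6 is clamped.
func groupHeading(title string, depth int) string {
	level := depth + 2
	if level > 6 {
		level = 6
	}
	return fmt.Sprintf(`<h%d class="recipemd-group-title">%s</h%d>`,
		level, html.EscapeString(title), level)
}

func main() {
	fmt.Println(groupHeading("Dough", 0))
	fmt.Println(groupHeading("Salt & Pepper", 1))
}
```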
- Add doc.go with package-level overview, usage examples for parsing, scaling, rendering, and flattening linked ingredients.
- Document all exported types (Recipe, Ingredient, IngredientGroup, Amount) with field-level comments explaining optionality and semantics.
- Document all exported methods: Scale, ScaleForYield, LeafIngredients, FormatFactor, Serialize, MarshalJSON.
- Document Parser, Option, NewParser, WithFrontmatter, WithGithubFormattedMarkdown, Parse, Flatten, ParseAmountString, RenderMarkdown, and RenderJSON following godoc conventions (comments begin with the symbol name, full sentences, doc-links).
https://claude.ai/code/session_01MzFkhNfhVMPGFeoB6nYAVv
Adds InlineIngredients(r *Recipe, rounding int) *string which returns a copy of the instructions with each ingredient's amount injected before its name wherever it appears in the text. For example, "*1/2 tsp* cinnamon" in the ingredient list causes "add cinnamon and mix" to become "add 0.5 tsp cinnamon and mix".
Matching is case-insensitive and respects word boundaries so partial words (e.g. "salt" inside "salted") are not affected. A single combined regex pass handles all ingredients at once, ensuring that a longer name like "brown sugar" is always matched before the shorter "sugar" and that the shorter name is not spuriously injected into already-replaced text.
All ingredients nested inside IngredientGroups are also considered via Recipe.LeafIngredients().
https://claude.ai/code/session_012wP9EExNKgdFcpp9JJELGC
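The single-pass matching can be sketched with the standard `regexp` package. This is a simplified illustration rather than the PR's implementation: `inlineAmounts` is a hypothetical function that takes a plain map from ingredient name to an already formatted amount instead of a *Recipe, and it ignores rounding.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// inlineAmounts (hypothetical) injects each amount before its ingredient
// name in one regex pass over the instructions.
func inlineAmounts(instructions string, amounts map[string]string) string {
	if len(amounts) == 0 {
		return instructions
	}
	names := make([]string, 0, len(amounts))
	byLower := make(map[string]string, len(amounts))
	for name, amt := range amounts {
		names = append(names, name)
		byLower[strings.ToLower(name)] = amt
	}
	// Longest names first: Go's regexp picks the first alternative that
	// matches at a position, so "brown sugar" must precede "sugar".
	sort.Slice(names, func(i, j int) bool {
		if len(names[i]) != len(names[j]) {
			return len(names[i]) > len(names[j])
		}
		return names[i] < names[j]
	})
	for i, n := range names {
		names[i] = regexp.QuoteMeta(n)
	}
	// (?i) makes matching case-insensitive; \b keeps "salt" out of "salted".
	re := regexp.MustCompile(`(?i)\b(?:` + strings.Join(names, "|") + `)\b`)
	return re.ReplaceAllStringFunc(instructions, func(m string) string {
		return byLower[strings.ToLower(m)] + " " + m
	})
}

func main() {
	out := inlineAmounts("Add cinnamon and Brown Sugar, then salted butter.",
		map[string]string{
			"cinnamon":    "0.5 tsp",
			"brown sugar": "1 cup",
			"sugar":       "2 tbsp",
			"salt":        "1 pinch",
		})
	fmt.Println(out)
}
```

Because the replacement text is never rescanned, "sugar" cannot be injected into the output already produced for "brown sugar".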
Removes the standalone InlineIngredients function in favour of a parser-level
option: WithInlineIngredients(...InlineIngredientsOption). Injection now
happens automatically in RenderMarkdown and RenderHTML.
Sub-options:
• WithInlineFormat(InlineIngredientsBefore|InlineIngredientsAfter)
Controls whether the amount appears before the name ("3 eggs") or after
it in parentheses ("eggs (3)"). Default is InlineIngredientsBefore.
• WithInlineHTMLHover()
For RenderHTML only: places the amount in the span's title attribute so
it appears as a CSS/browser tooltip rather than inline text. Each matched
name is wrapped in <span class="recipemd-inline-ingredient" title="...">.
• WithInlinePrepSeparators(seps ...string)
Splits ingredient names on the first matching separator to extract a
preparation note. "garlic, minced" (sep=",") → base="garlic", prep="minced".
Paired closers are stripped automatically ("(" strips trailing ")").
Prep is included in the rendered amount:
InlineIngredientsBefore → "3 cloves minced garlic"
InlineIngredientsAfter → "garlic (3 cloves minced)"
Hover → title="3 cloves minced"
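The prep-separator and format sub-options can be illustrated with two small helpers. Both names (`splitPrep`, `renderInline`) are hypothetical, and the sketch assumes "first matching separator" means the first separator in the given list that occurs in the name.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPrep (hypothetical) splits an ingredient name into base and prep
// note on the first listed separator found in it. Paired closers are
// stripped, so "(" also removes a trailing ")".
func splitPrep(name string, seps ...string) (base, prep string) {
	closers := map[string]string{"(": ")", "[": "]"}
	for _, sep := range seps {
		b, p, ok := strings.Cut(name, sep)
		if !ok {
			continue
		}
		p = strings.TrimSpace(p)
		if closer, paired := closers[sep]; paired {
			p = strings.TrimSuffix(p, closer)
		}
		return strings.TrimSpace(b), strings.TrimSpace(p)
	}
	return strings.TrimSpace(name), ""
}

// renderInline (hypothetical) formats the amount and prep either before
// the base name or after it in parentheses.
func renderInline(amount, base, prep string, after bool) string {
	detail := strings.TrimSpace(amount + " " + prep)
	if detail == "" {
		return base
	}
	if after {
		return base + " (" + detail + ")"
	}
	return detail + " " + base
}

func main() {
	base, prep := splitPrep("garlic, minced", ",")
	fmt.Println(renderInline("3 cloves", base, prep, false))
	fmt.Println(renderInline("3 cloves", base, prep, true))
}
```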
Pluralization (built-in, no external dependency):
Matching is case-insensitive and covers both singular and plural forms so
that ingredient "egg" also matches "eggs" in the instructions. wordForms()
applies standard English suffix rules (y→ies, f→ves, ch/sh/x/z→es, +s)
with an irregulars table for common food vocabulary (potato/potatoes,
leaf/leaves, knife/knives, etc.) and invariant uncountable nouns (flour,
garlic, rice, sugar, …).
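A rough sketch of such suffix-based pluralization follows. The tables contain only the words named above, the consonant check before y→ies is an added assumption, and the real wordForms() may differ in both rules and return shape.

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative tables containing only words mentioned in the description.
var irregular = map[string]string{"potato": "potatoes", "leaf": "leaves", "knife": "knives"}
var invariant = map[string]bool{"flour": true, "garlic": true, "rice": true, "sugar": true}

// pluralize applies the tables first, then the suffix rules.
func pluralize(word string) string {
	w := strings.ToLower(word)
	if invariant[w] {
		return w
	}
	if p, ok := irregular[w]; ok {
		return p
	}
	n := len(w)
	switch {
	// Assumption: y→ies only after a consonant ("berry", but not "whey").
	case n > 1 && strings.HasSuffix(w, "y") && !strings.ContainsRune("aeiou", rune(w[n-2])):
		return w[:n-1] + "ies"
	case strings.HasSuffix(w, "f"):
		return w[:n-1] + "ves"
	case strings.HasSuffix(w, "ch"), strings.HasSuffix(w, "sh"),
		strings.HasSuffix(w, "x"), strings.HasSuffix(w, "z"):
		return w + "es"
	default:
		return w + "s"
	}
}

// wordForms returns the lowercased singular plus its plural when they differ.
func wordForms(word string) []string {
	s := strings.ToLower(word)
	if p := pluralize(word); p != s {
		return []string{s, p}
	}
	return []string{s}
}

func main() {
	for _, w := range []string{"egg", "berry", "peach", "loaf", "knife", "flour"} {
		fmt.Println(w, wordForms(w))
	}
}
```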
A single-pass combined regex ensures that longer names (e.g. "brown sugar")
are always matched and replaced before shorter overlapping names ("sugar"),
preventing double-injection into already-replaced text.
HTML rendering post-processes the goldmark HTML output with a lightweight
text-node splitter that leaves tag markup untouched, so no unsafe HTML is
injected into the markdown source.
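One way to picture a text-node splitter of this kind is the sketch below. `mapTextNodes` is a hypothetical helper that assumes well-formed HTML in which literal '<' and '>' inside text and attribute values are escaped. It makes no attempt to skip code blocks or comments, which a real implementation would likely need to consider.

```go
package main

import (
	"fmt"
	"strings"
)

// mapTextNodes (hypothetical) applies fn to the text between tags and
// copies the tags themselves, including attributes, unchanged.
func mapTextNodes(html string, fn func(string) string) string {
	var out strings.Builder
	for len(html) > 0 {
		open := strings.IndexByte(html, '<')
		if open < 0 {
			out.WriteString(fn(html)) // trailing text after the last tag
			break
		}
		if open > 0 {
			out.WriteString(fn(html[:open]))
		}
		end := strings.IndexByte(html[open:], '>')
		if end < 0 {
			out.WriteString(html[open:]) // unterminated tag: copy as is
			break
		}
		out.WriteString(html[open : open+end+1])
		html = html[open+end+1:]
	}
	return out.String()
}

func main() {
	in := `<p title="eggs">Beat the eggs.</p>`
	out := mapTextNodes(in, func(text string) string {
		return strings.ReplaceAll(text, "eggs", "3 eggs")
	})
	fmt.Println(out)
}
```

Only the text node is rewritten; the `title="eggs"` attribute is left untouched.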
https://claude.ai/code/session_012wP9EExNKgdFcpp9JJELGC
bfc1d4b to
bb7b4dc