
Add SQL lexer with token scanning and position tracking#23

Open
rahulc0dy wants to merge 7 commits into main from feat/sql/lexer

Conversation

@rahulc0dy rahulc0dy commented Jun 7, 2026

Contributor

Issue Reference

Summary by CodeRabbit

  • New Features

    • Added a full SQL lexer to recognize keywords, identifiers, numbers, strings, operators, punctuation, and both line and block comments.
    • Introduced single-element lookahead support for parsing streams.
  • Bug Fixes / Reliability

    • Improved line/column tracking and error reporting for unterminated strings, comments, and unexpected characters.
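The string handling mentioned in this summary can be illustrated with a minimal sketch. It assumes a byte-indexed source string; `scanString` here is a standalone function and `errUnterminatedString` is a hypothetical sentinel, so neither is the PR's actual method or error type.

```go
package main

import (
	"errors"
	"strings"
)

// errUnterminatedString is a hypothetical sentinel for this sketch only.
var errUnterminatedString = errors.New("unterminated string literal")

// scanString reads a single-quoted SQL literal whose opening quote is at
// src[start]. It returns the unescaped value and the index just past the
// closing quote. A doubled quote ('') inside the literal stands for one quote.
func scanString(src string, start int) (string, int, error) {
	var b strings.Builder
	i := start + 1 // skip the opening quote
	for i < len(src) {
		if src[i] == '\'' {
			// Two quotes in a row are an escaped quote, not the terminator.
			if i+1 < len(src) && src[i+1] == '\'' {
				b.WriteByte('\'')
				i += 2
				continue
			}
			return b.String(), i + 1, nil
		}
		// Copying bytes is safe for UTF-8 input because the quote is ASCII.
		b.WriteByte(src[i])
		i++
	}
	// Input ended before a closing quote was found.
	return "", len(src), errUnterminatedString
}
```

Calling `scanString("'it''s' x", 0)` yields the value `it's` and the index 7, while `scanString("'abc", 0)` returns the sentinel error.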

coderabbitai Bot commented Jun 7, 2026


Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a position-tracked SQL lexer: token types, keyword map and lookup, scanners for identifiers/numbers/strings, whitespace/comment skipping, lexer error types, NextToken returning (Token, error), and a generic LookaheadIterator for parser peek semantics.
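As a rough illustration of what "position-tracked" means here, the following sketch shows a rune cursor that keeps a 1-based line and column up to date as it advances. The `cursor` type, its field names, and `skipLineComment` are assumptions made for this example and are not taken from the PR's code.

```go
package main

import "unicode/utf8"

// cursor is a hypothetical stand-in for the lexer's position state.
type cursor struct {
	src  string
	pos  int // byte offset of the next unread rune
	line int // 1-based line number
	col  int // 1-based column, counted in runes
}

func newCursor(src string) *cursor {
	return &cursor{src: src, line: 1, col: 1}
}

// peek returns the next rune without consuming it, or 0 at end of input.
func (c *cursor) peek() rune {
	if c.pos >= len(c.src) {
		return 0
	}
	r, _ := utf8.DecodeRuneInString(c.src[c.pos:])
	return r
}

// advance consumes one rune and updates line/col; it returns 0 at end of input.
func (c *cursor) advance() rune {
	if c.pos >= len(c.src) {
		return 0
	}
	r, size := utf8.DecodeRuneInString(c.src[c.pos:])
	c.pos += size
	if r == '\n' {
		c.line++
		c.col = 1
	} else {
		c.col++
	}
	return r
}

// skipLineComment consumes a "--" comment up to, but not including,
// the terminating newline.
func (c *cursor) skipLineComment() {
	for c.pos < len(c.src) && c.peek() != '\n' {
		c.advance()
	}
}
```

Routing every consumed rune through `advance` keeps the line/column bookkeeping in one place, so scanners and comment skipping cannot drift out of sync with the reported position.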

Changes

SQL Lexer Infrastructure

| Layer | File(s) | Summary |
|---|---|---|
| Token types and representation | internal/sql/lexer/tokens.go | TokenType enum with human-readable names and Token struct carrying Type, Literal, and 1-based Line/Col; string formatting methods. |
| Lexer state, cursor, and errors | internal/sql/lexer/lexer.go (lines 1–110), internal/sql/lexer/errors.go | Lexer position tracking (absolute index, line/col), NewLexer, rune cursor helpers (peek, peekNext, advance) updating line/col, comment skipping (--, /*...*/), and LexError sentinel/errors. |
| Scanning: identifiers, keyword lookup, numbers, strings | internal/sql/lexer/lexer.go (lines 112–202), internal/sql/lexer/keywords.go | makeToken; scanIdentifier with lookupIdent using an upper-case keywords map; scanNumber supports integers, floats and leading-dot floats; scanString supports single-quoted literals and doubled-quote escaping. |
| NextToken dispatch and operators | internal/sql/lexer/lexer.go (lines 204–282) | NextToken() (Token, error) skips whitespace/comments, returns TOKEN_EOF at end, dispatches to scanners, recognizes punctuation, arithmetic, =, and multi-char comparisons (<=, <>, >=, !=); returns TOKEN_ILLEGAL plus LexError for unexpected input. |
| Generic lookahead iterator | internal/sql/lexer/lookahead.go | LookaheadIterator[T] generic with single-element peek buffer; Peek, Next, Count, ExpectNextValue, and ExpectNextMatches helpers for parser consumption semantics. |
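The single-element peek buffer described in the last row could look roughly like the sketch below. The lowercase `lookahead` type, the `buf` and `filled` fields, and the exact `Peek`/`Next` signatures are assumptions for illustration; only the `nextFn func() T` constructor shape is visible in this conversation.

```go
package main

// lookahead is a hypothetical sketch of an iterator with a one-element
// peek buffer wrapped around a producer function.
type lookahead[T any] struct {
	nextFn func() T
	buf    T    // holds the peeked element while filled is true
	filled bool // reports whether buf currently holds an unconsumed element
}

func newLookahead[T any](nextFn func() T) *lookahead[T] {
	return &lookahead[T]{nextFn: nextFn}
}

// Peek returns the upcoming element without consuming it. Repeated calls
// return the same element and invoke nextFn at most once.
func (it *lookahead[T]) Peek() T {
	if !it.filled {
		it.buf = it.nextFn()
		it.filled = true
	}
	return it.buf
}

// Next consumes and returns the upcoming element, draining the buffer first.
func (it *lookahead[T]) Next() T {
	if it.filled {
		v := it.buf
		var zero T
		it.buf = zero // drop the reference so the buffer does not pin the value
		it.filled = false
		return v
	}
	return it.nextFn()
}
```

A parser would typically call `Peek` to decide which grammar rule applies and then `Next` to consume the token, which is why a buffer of one element is enough for single-token lookahead.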

Assessment against linked issues

Objective Addressed Explanation
Define token constants for SQL keywords, operators, identifiers [#10]
Build lexer struct to read SQL strings rune-by-rune [#10]
Implement NextToken method with whitespace handling and terminator detection [#10]

Comment @coderabbitai help to get the list of available commands and usage tips.

@rahulc0dy rahulc0dy requested review from Souvik606 and theMr17 June 7, 2026 11:53
@rahulc0dy rahulc0dy added the sql gateway Components for parsing SQL and translating queries into native Key-Value storage operations. label Jun 7, 2026
@coderabbitai coderabbitai Bot changed the title from "@coderabbitai" to "Add SQL lexer with token scanning and position tracking" Jun 7, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
internal/sql/lexer/lookahead.go (1)

16-17: ⚡ Quick win

Consider validating nextFn parameter.

If nextFn is nil, the iterator will panic later during the first Peek() or Next() call with a nil pointer dereference. Adding an early check would provide a clearer error message at construction time.

🛡️ Suggested validation
 func NewLookaheadIterator[T any](nextFn func() T) *LookaheadIterator[T] {
+	if nextFn == nil {
+		panic("lookahead: nextFn cannot be nil")
+	}
 	return &LookaheadIterator[T]{nextFn: nextFn}
 }
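To make the effect of the suggested check concrete, here is a small self-contained sketch. The struct is reduced to the single `nextFn` field visible in the diff above, and `panicMessage` is a hypothetical helper added only to observe the panic value.

```go
package main

import "fmt"

// LookaheadIterator is a reduced stand-in holding only the field shown in
// the suggested diff; the real type in the PR has more to it.
type LookaheadIterator[T any] struct {
	nextFn func() T
}

// NewLookaheadIterator fails fast with a descriptive message instead of
// deferring the failure to the first call through nextFn.
func NewLookaheadIterator[T any](nextFn func() T) *LookaheadIterator[T] {
	if nextFn == nil {
		panic("lookahead: nextFn cannot be nil")
	}
	return &LookaheadIterator[T]{nextFn: nextFn}
}

// panicMessage is a hypothetical helper: it runs f and returns the panic
// value formatted as a string, or "" if f returned normally.
func panicMessage(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return ""
}
```

With the check in place, `panicMessage(func() { NewLookaheadIterator[int](nil) })` reports `lookahead: nextFn cannot be nil`, whereas without it the failure would surface later as a generic runtime nil dereference at the first call site.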

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: bc4ffb44-7265-4271-9e98-3f6725f40711

📥 Commits

Reviewing files that changed from the base of the PR and between 885d5cc and a0d4c72.

📒 Files selected for processing (6)
  • internal/sql/lexer/errors.go
  • internal/sql/lexer/keywords.go
  • internal/sql/lexer/lexer.go
  • internal/sql/lexer/lookahead.go
  • internal/sql/lexer/position.go
  • internal/sql/lexer/tokens.go

Comment thread internal/sql/lexer/tokens.go
@rahulc0dy rahulc0dy removed the request for review from Souvik606 June 7, 2026 12:39

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/sql/lexer/lexer.go (1)

22-22: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Typo in doc comment: "fro" → "for".

-// NewLexer creates a Lexer fro the given SQL source string
+// NewLexer creates a Lexer for the given SQL source string

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 1099c6c0-fd02-47c0-b261-90848c729c04

📥 Commits

Reviewing files that changed from the base of the PR and between 07b2f6f and ab07f8a.

📒 Files selected for processing (3)
  • internal/sql/lexer/.gitkeep
  • internal/sql/lexer/errors.go
  • internal/sql/lexer/lexer.go


Labels

sql gateway Components for parsing SQL and translating queries into native Key-Value storage operations.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement Lexer / Scanner

2 participants