Skip to content

Regex Text Extraction tool#12

Open
sharananurag998 wants to merge 1 commit into
mainfrom
vk/a9f1-regex-text
Open

Regex Text Extraction tool#12
sharananurag998 wants to merge 1 commit into
mainfrom
vk/a9f1-regex-text

Conversation

@sharananurag998
Copy link
Copy Markdown
Collaborator

I've successfully created a comprehensive regex/text extractor tool. Here's what was implemented:

📁 Created Files

  1. src/tools/regexExtractorTool.ts - The main tool implementation with:

    • Core regexExtractorTool for custom regex patterns
    • predefinedPatternExtractorTool for common patterns
    • 18 predefined patterns (emails, phones, URLs, IPs, dates, UUIDs, invoice numbers, etc.)
    • Support for capturing groups with named groups
    • createCustomExtractor function for creating specialized extractors
  2. data/ folder with sample text files:

    • support_transcript.txt - Customer support conversation
    • invoice_sample.txt - Business invoice with various IDs and prices
    • server_logs.txt - Server log entries with timestamps, IPs, and UUIDs
  3. examples/regex_extractor.ts - Comprehensive demonstration showing:

    • Email extraction from support transcripts
    • Order ID and invoice number extraction
    • Server log parsing (IPs, UUIDs, timestamps)
    • Custom pattern extraction (log levels)
    • Group extraction with named captures
    • Price extraction and summation

✨ Key Features

  • 18 Predefined Patterns: email, phone, url, ipv4, date, time, creditCard, ssn, zipCode, hashtag, mention, uuid, invoiceNumber, orderId, trackingNumber, price, percentage
  • Flexible Pattern Support: Use custom regex or predefined patterns
  • Group Capture: Support for named capturing groups
  • Batch Extraction: Extract all matches or just the first one
  • Type-Safe: Full TypeScript support with proper interfaces

🎯 Example Output

The tool successfully extracts:

  • 6 email addresses from support transcript
  • 20+ reference numbers (orders, invoices, support tickets)
  • 6 unique IP addresses from logs
  • UUIDs, timestamps, and log levels
  • Prices with automatic summation ($90,775.50 total from invoice)

The implementation follows Vibe Kanban's functional architecture pattern and integrates seamlessly with the existing ADK tool system.

…or Vibe Kanban. Here's what was implemented:

## 📁 Created Files

1. **`src/tools/regexExtractorTool.ts`** - The main tool implementation with:
   - Core `regexExtractorTool` for custom regex patterns
   - `predefinedPatternExtractorTool` for common patterns
   - 18 predefined patterns (emails, phones, URLs, IPs, dates, UUIDs, invoice numbers, etc.)
   - Support for capturing groups with named groups
   - `createCustomExtractor` function for creating specialized extractors

2. **`data/` folder** with sample text files:
   - `support_transcript.txt` - Customer support conversation
   - `invoice_sample.txt` - Business invoice with various IDs and prices
   - `server_logs.txt` - Server log entries with timestamps, IPs, and UUIDs

3. **`examples/regex_extractor.ts`** - Comprehensive demonstration showing:
   - Email extraction from support transcripts
   - Order ID and invoice number extraction
   - Server log parsing (IPs, UUIDs, timestamps)
   - Custom pattern extraction (log levels)
   - Group extraction with named captures
   - Price extraction and summation

## ✨ Key Features

- **18 Predefined Patterns**: email, phone, url, ipv4, date, time, creditCard, ssn, zipCode, hashtag, mention, uuid, invoiceNumber, orderId, trackingNumber, price, percentage
- **Flexible Pattern Support**: Use custom regex or predefined patterns
- **Group Capture**: Support for named capturing groups
- **Batch Extraction**: Extract all matches or just the first one
- **Type-Safe**: Full TypeScript support with proper interfaces

## 🎯 Example Output

The tool successfully extracts:
- 6 email addresses from support transcript
- 20+ reference numbers (orders, invoices, support tickets)
- 6 unique IP addresses from logs
- UUIDs, timestamps, and log levels
- Prices with automatic summation ($90,775.50 total from invoice)

The implementation follows Vibe Kanban's functional architecture pattern and integrates seamlessly with the existing ADK tool system.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants