Add WebSearchAPI.ai Provider and MCP Server #1
Open
nazq wants to merge 1 commit into
Features:
- WebSearchAPI.ai provider with full markdown content extraction
- MCP server supporting stdio (default) and HTTP transport modes
- Docker deployment with health check endpoint
- Exa provider with optional autopromptString parsing
- Tavily provider integration

Includes Dockerfile.mcp for containerized deployments and comprehensive test coverage for all providers.
Summary
This PR introduces three enhancements to the websearch library:
- WebSearchAPI.ai provider
- Extended SearchResult type to include full page content
- MCP server

These additions expand the library's capabilities while maintaining full backward compatibility with existing code.
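To make the backward-compatibility claim concrete, here is a minimal sketch of how an optional content field can sit next to existing fields without disturbing code that only reads the old ones. The struct below is a hypothetical stand-in, not the library's actual definition: the field names `title`, `url`, `snippet`, and `content`, and both helper functions, are assumptions made for illustration.

```rust
/// Hypothetical stand-in for the library's SearchResult; all field names are assumptions.
#[derive(Debug, Clone, Default, PartialEq)]
struct SearchResult {
    title: String,
    url: String,
    snippet: String,
    /// Assumed name for the new optional full-page content field.
    content: Option<String>,
}

/// Older-style consumer: reads only the original fields, so the new field never matters.
fn headline(result: &SearchResult) -> String {
    format!("{} <{}>", result.title, result.url)
}

/// Newer consumer: prefers full content and falls back to the snippet when it is absent.
fn best_text(result: &SearchResult) -> &str {
    result.content.as_deref().unwrap_or(result.snippet.as_str())
}

fn main() {
    // A provider without content extraction leaves the optional field as None.
    let without_content = SearchResult {
        title: "Rust".into(),
        url: "https://www.rust-lang.org".into(),
        snippet: "A language empowering everyone.".into(),
        ..Default::default()
    };
    // A provider with content extraction fills the field in.
    let with_content = SearchResult {
        content: Some("# Rust\n\nFull page as markdown.".into()),
        ..without_content.clone()
    };

    println!("{}", headline(&without_content));
    println!("{}", best_text(&without_content));
    println!("{}", best_text(&with_content));
}
```

This only covers code that reads results. If the real type can be built by callers with a full struct literal, that code would need the new field or a `Default`-based update like the one in `main`.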
Motivation
The original search-sdk provides excellent multi-provider search capabilities. As AI assistants become more prevalent, there's growing demand for:
This PR addresses all three needs while respecting the existing architecture and patterns established in the codebase.
Changes
1. WebSearchAPI.ai Provider (src/providers/websearchapi_ai.rs)
A new provider that integrates with WebSearchAPI.ai, which is specifically designed for AI/LLM use cases:
2. Extended SearchResult Type (src/types.rs)
Added optional fields to SearchResult for providers that support content extraction:
These fields are Option types, so existing providers continue to work unchanged. Providers that support content (WebSearchAPI.ai, Exa, Tavily advanced mode) populate these fields automatically.
3. MCP Server (src/mcp/, src/bin/websearch_mcp.rs)
A complete Model Context Protocol implementation enabling AI assistants to use the search library:
Features:
- web_search and list_providers
Tools exposed:
- web_search
- list_providers
Usage:
Docker support:
docker build -f Dockerfile.mcp -t websearch-mcp .
docker run -p 3000:3000 -e WEBSEARCHAPI_KEY=xxx websearch-mcp
4. Provider Improvements
- Exa: autopromptString field parsing (now optional with #[serde(default)])
- Tavily: raw_content field in advanced mode
Files Changed
- Cargo.toml: mcp feature with rmcp, schemars, axum dependencies
- src/lib.rs
- src/types.rs
- src/providers/mod.rs
- src/providers/websearchapi_ai.rs
- src/providers/exa.rs
- src/mcp/mod.rs
- src/mcp/server.rs
- src/mcp/schemas.rs
- src/bin/websearch_mcp.rs
- Dockerfile.mcp
- MCP.md
- .cargo/config.toml.example
- tests/provider_integration.rs
- tests/cli_tests.rs
Testing
All existing tests pass. New test coverage includes:
New Integration Test Suite (tests/provider_integration.rs)
Added a comprehensive integration test framework that validates providers against live APIs. Tests are gated by environment variables so they only run when API keys are available:
The integration tests verify:
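As an illustration of the environment-variable gating described in this section, the sketch below shows one way a live test can skip itself when no key is configured. It is not the code in tests/provider_integration.rs. The helper names `usable_key`, `key_from_env`, and `run_gated` are hypothetical. The variable name WEBSEARCHAPI_KEY is taken from the Docker example, and it is an assumption that the tests read the same variable.

```rust
use std::env;

/// Treat a missing or blank value as "no key configured".
fn usable_key(raw: Option<String>) -> Option<String> {
    raw.map(|k| k.trim().to_string()).filter(|k| !k.is_empty())
}

/// Look up a provider key from the environment (variable name supplied by the caller).
fn key_from_env(var: &str) -> Option<String> {
    usable_key(env::var(var).ok())
}

/// Run `body` only when a key is available; return None when the test is skipped.
fn run_gated<T>(key: Option<String>, body: impl FnOnce(&str) -> T) -> Option<T> {
    key.map(|k| body(&k))
}

fn main() {
    // The closure stands in for a real live-API call; here it only measures the key.
    match run_gated(key_from_env("WEBSEARCHAPI_KEY"), |key| key.len()) {
        Some(len) => println!("key present ({len} chars); live test would run"),
        None => println!("WEBSEARCHAPI_KEY not set; skipping live test"),
    }
}
```

Keeping the blank-value check in a pure function means the gating logic itself can be tested without touching the process environment.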
Backward Compatibility
This PR maintains full backward compatibility:
- SearchResult consumers continue to work (new fields are Option)
- ... (--features mcp)
Documentation
- MCP.md - Complete guide for MCP server setup and usage
- .cargo/config.toml.example - Template for local development
Future Considerations
Some ideas for potential future work (not in scope for this PR):
- ... (fetch_url for direct URL content extraction)