Skip to content

Intelligent Python system that extracts real estate property data as structured JSON using AI agents, Nebius Qwen LLM, and Bright Data MCP.

Notifications You must be signed in to change notification settings

brightdata/real-estate-ai-agent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 

Repository files navigation

Bright Data Logo

Real Estate AI Agent System

AI-Powered Solution for Real Estate Public Data Extraction


🌟 Overview

Real Estate AI Agent System is a Python-based solution that leverages AI agents and Bright Data's Model Context Protocol (MCP) server to extract, process, and deliver structured real estate property data from multiple sources.

  • Automates public property data extraction from real estate websites like Zillow, Realtor.com, Redfin, and more
  • Integrates with Bright Data proxies for robust anti-bot and geo-unblocking
  • Uses Nebius Qwen LLM for adaptive, schema-validated property data extraction
  • Outputs results as structured JSON for analytics or downstream applications

Table of Contents

  • ✨ Features
  • 🚀 Quickstart
  • 🔧 Environment Setup
  • 💡 Usage Example
  • 📈 Key Capabilities
  • 🔒 Security Best Practices

✨ Features

  • Intelligent AI Agents: Uses CrewAI and LLM for adaptive data extraction and property detail parsing.
  • Bright Data Integration: Seamless support for proxy rotation, CAPTCHA solving via MCP server.
  • Strict JSON Schema: Always returns result in snake_case, schema-validated JSON.
  • Plug-and-Play: Spin up an advanced real estate data pipeline in minutes.
  • Cross-Platform: Python 3.9; requires Node.js for Bright Data MCP server.

🚀 Quickstart

  1. Clone this repository

    git clone https://github.com/brightdata-com/real-estate-ai-agents.git
    cd real-estate-ai-agents

🔧 Environment Setup

Prerequisites

  • Python 3.9+
  • Node.js + npm (for Bright Data MCP server)
  • Bright Data account with API token
  • Nebius AI API key

Virtual Environment

macOS/Linux

python3.9 -m venv venv
source venv/bin/activate

Windows

python3.9 -m venv venv
.\venv\Scripts\activate

Install Dependencies

pip install "crewai-tools[mcp]" crewai mcp python-dotenv pandas

Add Environment Variables

Create a .env file in your project directory with the following:

BRIGHT_DATA_API_TOKEN="your_api_token_here"
WEB_UNLOCKER_ZONE="your_web_unlocker_zone"
BROWSER_ZONE="your_browser_zone"
NEBIUS_API_KEY="your_nebius_api_key"

💡 Usage Example

To run the agent:

python real_estate_agents.py

If successful, the script will extract property data from a real estate listing and output result like:

{
  "address": "123 Main Street, City, State 12345",
  "price": "$450,000",
  "bedrooms": 3,
  "bathrooms": 2,
  "square_feet": 1850,
  "lot_size": "0.25 acres",
  "year_built": 1995,
  "property_type": "Single Family Home",
  "listing_agent": "John Doe, ABC Realty",
  "days_on_market": 45,
  "mls_number": "MLS123456",
  "description": "Beautiful home with updated kitchen...",
  "image_urls": ["https://example.com/image1.jpg", "https://example.com/image2.jpg"],
  "neighborhood": "Downtown Historic District"
}

📈 Key Capabilities

  • Extracts address, price, bedrooms, bathrooms, square footage, lot size, year built, property type, listing agent, days on market, MLS number, description, image URLs, and neighborhood.
  • Strict JSON schema validation: always outputs snake_case keys.
  • Handles proxy rotation, CAPTCHAs, and anti-bot protections using Bright Data’s MCP stack.
  • Easily extendable for more data fields and custom sources.

🔒 Security Best Practices

  • Store all API keys and credentials securely in your .env file.
  • Always validate and sanitize extracted data before use.
  • Respect robots.txt and website terms of service.

Bright Data Logo

About

Intelligent Python system that extracts real estate property data as structured JSON using AI agents, Nebius Qwen LLM, and Bright Data MCP.

Topics

Resources

Stars

Watchers

Forks

Languages