This scraper extracts structured auction data from Drouot.com, one of Europe’s most active art marketplaces. Given listing or search URLs, it captures catalog details, bidding information, artwork metadata, and seller information for analysts, collectors, dealers, and art-market intelligence platforms.
Created by Bitbash to showcase our approach to scraping and automation.
The project automates the extraction of detailed auction listings from Drouot, turning scattered catalog pages into clean, structured JSON.
Collecting consistent price ranges, lot descriptions, and auction statuses by hand is slow and error-prone; the scraper removes that manual step for anyone studying market trends or building valuation tools.
- Complete artwork and lot metadata
- Pricing ranges, bidding activity, and auction status
- Images and catalog references
- Seller and contact information
- Scalable crawling for long catalog lists
| Feature | Description |
|---|---|
| Artwork Metadata Extraction | Retrieves lot names, descriptions, categories, edition notes, signatures, provenance, and more. |
| Auction Data Capture | Extracts estimate ranges, bidding levels, reserve price info, and auction timing. |
| Image Collection | Saves all artwork images including the main catalog image. |
| Seller Information | Pulls seller or auction house contact details. |
| Search-URL Crawling | Works from any listing or search page to fetch multiple lots at once. |
| Structured JSON | Clean output ready for databases, pricing models, or dashboards. |
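
To make the auction data capture step more concrete, here is a minimal sketch of how an estimate string might be turned into the numeric `estimate_low` and `estimate_high` fields. The function name `parseEstimateRange` and the input format (`"12 000 - 18 000 EUR"`, with spaces as thousands separators) are assumptions for illustration; they are not taken from the project's source or from Drouot's actual markup.

```javascript
// Hypothetical helper, not part of the project's code.
// Assumes the input contains only the estimate text, with whole-number
// amounts and optional space-like thousands separators.
function parseEstimateRange(text) {
  if (typeof text !== "string") {
    return { estimate_low: null, estimate_high: null };
  }
  // Match digit groups that may contain regular, non-breaking, or narrow
  // non-breaking spaces between digits.
  const groups = text.match(/\d[\d \u00A0\u202F]*/g) || [];
  const numbers = groups.map((group) => Number(group.replace(/[^\d]/g, "")));

  if (numbers.length === 0) {
    return { estimate_low: null, estimate_high: null };
  }
  if (numbers.length === 1) {
    // A single figure is treated as both bounds.
    return { estimate_low: numbers[0], estimate_high: numbers[0] };
  }
  const [first, second] = numbers;
  return {
    estimate_low: Math.min(first, second),
    estimate_high: Math.max(first, second),
  };
}

console.log(parseEstimateRange("12 000 - 18 000 EUR"));
// → { estimate_low: 12000, estimate_high: 18000 }
```

Returning `null` for both bounds when no number is found keeps the output shape the same for lots that list no estimate. Decimal amounts are deliberately out of scope for this sketch.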
| Field Name | Field Description |
|---|---|
| lot_number | SKU / lot identifier for the artwork. |
| title | Artwork or object name. |
| description | Full catalog description including technique, condition, and notes. |
| category | Classification (e.g., Painting, Sculpture, Photography). |
| images | Array of all extracted image URLs. |
| main_image | Primary catalog image. |
| estimate_low | Lower bound of the price estimate. |
| estimate_high | Upper bound of the price estimate. |
| current_bid | Current highest bid if available. |
| next_bid | Required next bid amount. |
| reserve_met | Indicates if the reserve price was met. |
| auction_type | Online or live auction. |
| auction_status | Status such as ongoing, closed, or upcoming. |
| start_time | Auction start timestamp. |
| end_time | Auction end timestamp. |
| seller_name | Auction house or seller. |
| seller_contact | Contact details extracted from the catalog page. |
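
A downstream consumer may want to check that each record has the fields listed above before loading it into a database. The sketch below shows one way to do that. The type assigned to each field is an assumption inferred from the descriptions in the table, and `validateLot` is a hypothetical name; this is not the project's own `validator.js`.

```javascript
// Assumed type for each documented field (inferred, not an official schema).
const FIELD_TYPES = {
  lot_number: "string",
  title: "string",
  description: "string",
  category: "string",
  images: "array",
  main_image: "string",
  estimate_low: "number",
  estimate_high: "number",
  current_bid: "number",
  next_bid: "number",
  reserve_met: "boolean",
  auction_type: "string",
  auction_status: "string",
  start_time: "string",
  end_time: "string",
  seller_name: "string",
  seller_contact: "string",
};

// Returns a list of problems; an empty list means the record matches the
// assumed shape. A null value is accepted for any field, since a listing
// may not expose every piece of information.
function validateLot(record) {
  if (record === null || typeof record !== "object") {
    return ["record is not an object"];
  }
  const problems = [];
  for (const [field, expected] of Object.entries(FIELD_TYPES)) {
    if (!(field in record)) {
      problems.push(`missing field: ${field}`);
      continue;
    }
    const value = record[field];
    if (value === null) continue;
    const actual = Array.isArray(value) ? "array" : typeof value;
    if (actual !== expected) {
      problems.push(`${field}: expected ${expected}, got ${actual}`);
    }
  }
  return problems;
}
```

Calling `validateLot(record)` on each item of the output array and logging any non-empty result is usually enough to catch a layout change on the source site early.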
[
  {
    "lot_number": "153",
    "title": "Bernard Buffet — Nature morte au vase",
    "description": "Oil on canvas, signed and dated 1963. Good condition. Provenance noted.",
    "category": "Peinture",
    "images": [
      "https://example.com/img1.jpg",
      "https://example.com/img2.jpg"
    ],
    "main_image": "https://example.com/img1.jpg",
    "estimate_low": 12000,
    "estimate_high": 18000,
    "current_bid": 14500,
    "next_bid": 15000,
    "reserve_met": true,
    "auction_type": "online",
    "auction_status": "in progress",
    "start_time": "2024-05-12T09:00:00Z",
    "end_time": "2024-05-19T18:00:00Z",
    "seller_name": "Maison de Ventes Dupont",
    "seller_contact": "+33 1 44 22 00 00"
  }
]
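
As an illustration of how the output above might be consumed, the following sketch compares a lot's current bid with its estimate range. `summarizeLot` and the three position labels are hypothetical and only rely on the `estimate_low`, `estimate_high`, `current_bid`, and `lot_number` fields shown in the sample.

```javascript
// Hypothetical consumer of the scraper's JSON output.
function summarizeLot(lot) {
  const { estimate_low: low, estimate_high: high, current_bid: bid } = lot;
  const hasEstimate = typeof low === "number" && typeof high === "number";
  const midpoint = hasEstimate ? (low + high) / 2 : null;

  // Without both an estimate and a bid, the position cannot be determined.
  let position = "unknown";
  if (hasEstimate && typeof bid === "number") {
    if (bid < low) position = "below estimate";
    else if (bid > high) position = "above estimate";
    else position = "within estimate";
  }
  return { lot_number: lot.lot_number, midpoint, position };
}

// Values copied from the sample record.
const sampleLot = {
  lot_number: "153",
  estimate_low: 12000,
  estimate_high: 18000,
  current_bid: 14500,
};
console.log(summarizeLot(sampleLot));
// → { lot_number: '153', midpoint: 15000, position: 'within estimate' }
```

Mapping this function over a full catalog gives a quick view of which lots are attracting bids above their estimates.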
drouot-scraper/
├── src/
│   ├── main.js
│   ├── crawler/
│   │   ├── playwright_engine.js
│   │   ├── pagination.js
│   │   └── extractors.js
│   ├── utils/
│   │   ├── logger.js
│   │   ├── formatting.js
│   │   └── validator.js
│   └── config/
│       └── input_schema.json
├── data/
│   ├── sample_input.json
│   └── sample_output.json
├── Dockerfile
├── package.json
└── README.md
- Art market analysts track pricing trends, estimate accuracy, and bidding behavior.
- Auction houses compare their catalogs with competitors and monitor artist popularity.
- Dealers and collectors evaluate artworks, provenance, and historical bidding patterns.
- Data platforms enrich valuation tools and price databases with structured auction insights.
- Researchers study trends across categories, artists, or sale cycles.
Can it scrape entire auction catalogs?
Yes. Given a search or category URL, it crawls all available lots.
Does it retrieve high-resolution images?
It extracts all image URLs; quality depends on what Drouot provides.
What if a page has missing pricing info?
Fallback extraction rules keep the JSON structure stable even with incomplete fields.
Can this be used for real-time bidding analysis?
It captures current bids and status but is not meant for automated bidding systems.
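
The answer above about missing pricing info can be illustrated with a small sketch of how a stable JSON structure might be kept when a page lacks some fields. The helper name `normalizeLot` and the choice of `null` (or an empty array for `images`) as fallback values are assumptions; the project's actual fallback rules are not documented here.

```javascript
// Assumed fallback value for every documented output field.
const LOT_DEFAULTS = {
  lot_number: null,
  title: null,
  description: null,
  category: null,
  images: [],
  main_image: null,
  estimate_low: null,
  estimate_high: null,
  current_bid: null,
  next_bid: null,
  reserve_met: null,
  auction_type: null,
  auction_status: null,
  start_time: null,
  end_time: null,
  seller_name: null,
  seller_contact: null,
};

// Hypothetical helper: returns a record containing every documented field,
// using the extracted value when present and the fallback otherwise.
function normalizeLot(partial) {
  const source = partial !== null && typeof partial === "object" ? partial : {};
  const normalized = {};
  for (const [field, fallback] of Object.entries(LOT_DEFAULTS)) {
    const value = source[field];
    // Copy array defaults so that records never share one mutable array.
    const defaultValue = Array.isArray(fallback) ? [...fallback] : fallback;
    normalized[field] = value === undefined ? defaultValue : value;
  }
  return normalized;
}
```

With this approach a lot that has a title but no estimate still yields all 17 keys, so consumers can rely on the same shape for every record and test for `null` instead of checking whether a key exists.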
Primary Metric:
Efficiently crawls dozens of catalog lots per minute while preserving detailed metadata.
Reliability Metric:
Consistently handles mixed content formats and varying catalog layouts.
Efficiency Metric:
Optimized Playwright workflows reduce page-load overhead for large search crawls.
Quality Metric:
High metadata completeness with robust parsing for images, pricing, and auction states.
