6 changes: 6 additions & 0 deletions .github/workflows/wordlist.txt
Original file line number Diff line number Diff line change
@@ -134,3 +134,9 @@ us
vn
za
ar
ajax
Contributor


This seems to be a mistake.

Collaborator Author


Ajax here means loading data over the network, not the Looney Tunes secret stuff.

DOM
iframes
webhooks
unix
customizable
3 changes: 3 additions & 0 deletions docs/CSS selector.md
@@ -51,3 +51,6 @@ For example: `.shadow-root-parent-element:shadow-root .selector-within-shadow-ro
[How to select elements that have a specific element]: https://www.webscraper.io/how-to-video/jquery-has-selector
[How to select elements that don’t contain specific text]: https://www.webscraper.io/how-to-video/jquery-not-contains-selector
[How to select elements that don’t have a specific element]: https://www.webscraper.io/how-to-video/jquery-not-has-selector

description: Learn CSS selectors for Web Scraper - a comprehensive guide to CSS selectors, jQuery selectors and selecting elements within iframes and shadow-root
keywords: css selectors, web scraper css, jquery selectors, iframe selectors, shadow-root selectors
3 changes: 3 additions & 0 deletions docs/Installation.md
@@ -15,3 +15,6 @@ Browser version requirements:

[1]: https://chrome.google.com/webstore/detail/web-scraper/jnhgnonknehpejjnehehllkliplmbmhn "Install web scraper from Chrome store"
[2]: https://addons.mozilla.org/en-US/firefox/addon/web-scraper/

description: Install Web Scraper browser extension for Chrome and Firefox - step-by-step installation guide
keywords: web scraper installation, chrome extension, firefox addon, browser extension install, web scraping setup
3 changes: 3 additions & 0 deletions docs/Open Web Scraper.md
@@ -17,3 +17,6 @@ Shortcuts:

[open-web-scraper]: images/open-web-scraper/open-web-scraper-chrome.png?raw=true
[How to open Web Scraper extension for the first time]: https://www.webscraper.io/how-to-video/open-web-scraper

description: Learn how to open Web Scraper extension in Chrome and Firefox developer tools - keyboard shortcuts and step-by-step guide to access the scraping interface
keywords: open web scraper, developer tools, browser extension access, chrome devtools, firefox developer tools, web scraper interface
3 changes: 3 additions & 0 deletions docs/Scraping a site.md
@@ -108,3 +108,6 @@ data as CSV* panel.
[delay-definition]: images/scraping-a-site/delay-definition.png
[How to create a sitemap]: https://www.webscraper.io/how-to-video/create-a-sitemap
[How to add multiple start URLs]: https://www.webscraper.io/how-to-video/add-multiple-start-urls

description: Complete guide to scraping websites with Web Scraper - learn how to create sitemaps, set start URLs, configure selectors, and extract data from web pages
keywords: website scraping, sitemap creation, web scraper tutorial, data extraction, start urls, scraping guide, web scraping workflow
3 changes: 3 additions & 0 deletions docs/Selectors.md
@@ -127,3 +127,6 @@ selectors on [CSS selector][css-selectors] page.
[select-tool]: images/selectors/select-tool.png
[select-tool-hotkeys]: images/selectors/select-tool-hotkeys.png
[css-selectors]: CSS%20selector.md

description: Comprehensive guide to Web Scraper selectors - learn about data extraction selectors, link selectors, and element selectors for effective web scraping
keywords: web scraper selectors, data extraction selectors, link selectors, element selectors, text selector, image selector, table selector, web scraping selectors
3 changes: 3 additions & 0 deletions docs/Selectors/Element attribute selector.md
@@ -31,3 +31,6 @@ how data is returned
[text-selector]: Text%20selector.md
[css-selector]: ../CSS%20selector.md
[How to extract data from element attribute]: https://www.webscraper.io/how-to-video/element-attribute

description: Web Scraper Element Attribute Selector - extract element attributes like title, data attributes, and custom properties from web pages
keywords: attribute selector, html attributes, element attributes, data attributes, title attribute, custom attributes, attribute extraction
3 changes: 3 additions & 0 deletions docs/Selectors/Element click selector.md
@@ -92,3 +92,6 @@ and during the clicking process, resulting in duplicate or unusable row of data.
[How to iterate through item drop-down variations]: https://www.webscraper.io/how-to-video/dropdown-variation
[How to iterate through item button variations]: https://www.webscraper.io/how-to-video/button-variation
[How to iterate through two or more item variation selects]: https://www.webscraper.io/how-to-video/product-with-multiple-variations

description: Web Scraper Element Click Selector - interact with clickable elements to load dynamic content and extract data from JavaScript-driven websites
keywords: element click selector, click selector, dynamic content, interactive scraping, javascript scraping, button clicking, ajax loading
3 changes: 3 additions & 0 deletions docs/Selectors/Element selector.md
@@ -47,3 +47,6 @@ Though [Table selector] [table-selector] might be much better solution.
[table-selector]: Table%20selector.md
[multiple-elements-with-text-selectors]: ../images/selectors/text/text-selector-multiple-elements-with-text-selectors.png?raw=true
[How to scrape multiple items within a listings page]: https://www.webscraper.io/how-to-video/multiple-items

description: Web Scraper Element Selector - select multiple data elements from lists and containers with scroll support for dynamic content loading
keywords: element selector, multiple elements, list scraping, scroll selector, dynamic content, multiple items scraping
3 changes: 3 additions & 0 deletions docs/Selectors/HTML selector.md
@@ -12,3 +12,6 @@ See [Text selector] [text-selector] use cases.

[text-selector]: Text%20selector.md
[css-selector]: ../CSS%20selector.md

description: Web Scraper HTML Selector - extract raw HTML content and inner HTML from selected elements while preserving formatting and structure
keywords: html selector, html extraction, inner html, raw html, html content extraction, preserve html formatting
3 changes: 3 additions & 0 deletions docs/Selectors/Image selector.md
@@ -68,3 +68,6 @@ how data is returned
[windows-image-download-script]: ../images/selectors/image/win-image-downloader.gif?raw=true
[osx-image-download-script]: ../images/selectors/image/osx-image-downloader.gif?raw=true
[image-downloader]: https://github.com/webscraperio/image-downloader/tags

description: Extract image URLs and download images from websites using Web Scraper's Image selector with multiple handling options
keywords: image selector, web scraper, image download, src attribute, CSS selector, image extraction, python image downloader, bulk download
3 changes: 3 additions & 0 deletions docs/Selectors/Link selector.md
@@ -50,3 +50,6 @@ selector.
[element-click]: Element%20click%20selector.md
[css-selector]: ../CSS%20selector.md
[pagination-selector]: Pagination%20selector.md

description: Web Scraper Link Selector - extract URLs and navigate websites with support for different link types including scripted links and AJAX navigation
keywords: link selector, url extraction, website navigation, link scraping, web scraper navigation, href extraction, scripted links
4 changes: 3 additions & 1 deletion docs/Selectors/Pagination selector.md
@@ -50,4 +50,6 @@ data from those.

[css-selector]: ../CSS%20selector.md
[pagination-selector]: ../images/selectors/pagination/pagination-selector.png?raw=true


description: Web Scraper Pagination Selector - navigate through paginated content and load more buttons to scrape data from multiple pages automatically
keywords: pagination selector, pagination scraping, load more button, page navigation, multi-page scraping, automatic pagination, next page
3 changes: 3 additions & 0 deletions docs/Selectors/Sitemap xml selector.md
@@ -76,3 +76,6 @@ When using Sitemap.xml selector, set the main page of the site as a start URL.
[cloud-web-scraper]: https://www.webscraper.io/cloud-scraper
[sitemap-xml-link-selectors]: ../images/selectors/sitemap-xml/sitemap-xml-link-selector.png?raw=true
[sitemap format]: https://www.sitemaps.org/protocol.html

description: Extract URLs from sitemap.xml files to scrape entire websites without pagination using Web Scraper's Sitemap.xml link selector
keywords: sitemap xml, link selector, web scraper, URL extraction, site crawling, sitemap.xml, robots.txt, website scraping
5 changes: 4 additions & 1 deletion docs/Selectors/Table selector.md
@@ -23,4 +23,7 @@ See [Text selector] [text-selector] use cases.

[table-selector-selectors]: ../images/selectors/table/selectors.png?raw=true
[text-selector]: Text%20selector.md
[css-selector]: ../CSS%20selector.md
[css-selector]: ../CSS%20selector.md

description: Web Scraper Table Selector - extract data from tables with automatic header detection and configurable row and column selection
keywords: table selector, table scraping, table data extraction, html tables, table rows, table columns, structured data extraction
3 changes: 3 additions & 0 deletions docs/Selectors/Text selector.md
@@ -72,3 +72,6 @@ how data is returned
[element-selector]: Element%20selector.md
[css-selector]: ../CSS%20selector.md
[How to scrape multiple items within a listings page]: https://www.webscraper.io/how-to-video/multiple-items

description: Web Scraper Text Selector - extract text content from elements within a web page
keywords: text selector, text extraction, web scraper selector, content extraction, text scraping
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud.md
@@ -116,3 +116,6 @@ element click selector. If the timeout is reached, no data will be scraped from
[data-quality-control]: Web%20Scraper%20Cloud/Data%20quality%20control.md
[notifications]: Web%20Scraper%20Cloud/Notifications.md
[sitemap-sync]: Web%20Scraper%20Cloud/Sitemap%20sync.md

description: Web Scraper Cloud - automated enterprise web scraping with Proxy support, scheduling, API access, automatic data export, and scalable cloud-based scraping solutions
keywords: web scraper cloud, cloud scraping, proxy scraping, automated web scraping, scraping api, scheduled scraping, cloud data extraction
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/API.md
@@ -194,3 +194,6 @@ Returns **empty** and **failed** urls for specific scraping job.
[node]: https://github.com/webscraperio/api-client-nodejs
[api-page]: https://cloud.webscraper.io/api
[queue system]: https://laravel.com/docs/10.x/queues

description: Web Scraper Cloud API documentation - manage sitemaps, scraping jobs, and download data via REST API with Node.js and PHP SDK support
keywords: web scraper api, scraping api, rest api, nodejs sdk, php sdk, api documentation, cloud scraping api, sitemap api
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/API/Webhooks.md
@@ -40,3 +40,6 @@ otherwise notification sender could timeout and resend the notification which co

[cloud]: https://www.webscraper.io/cloud-scraper
[api-page]: https://cloud.webscraper.io/api

description: Configure webhooks in Web Scraper Cloud to receive notifications about scraping job statuses
keywords: webhook, notifications, API, web scraper cloud, scraping jobs, POST request, job status, automation
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Data Export.md
@@ -103,3 +103,6 @@ import feature:
[excel-load-data-from-text-csv]: ../images/data-export/excel-load-data-from-text-csv.png?raw=true

[libre-office-calc]: https://www.libreoffice.org/discover/calc/

description: Web Scraper Cloud data export options - download scraped data in CSV, JSON, and XLSX formats with automated export delivery options
keywords: data export, csv export, json export, excel export, scraped data download, automated data delivery, cloud data export
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Data quality control.md
@@ -40,3 +40,6 @@ previous data quality notifications for that exact sitemap will be deleted.

[notifications]: Notifications.md
[data-quality-control-image]: ../images/cloud/data-quality-control-example.png

description: Web Scraper Cloud Data Quality Control - monitor and validate scraped data quality with automated checks, alerts, and data integrity verification
keywords: data quality control, data validation, scraped data quality, data integrity, quality monitoring, data quality alerts, scraping validation
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Notifications.md
@@ -52,3 +52,6 @@ Notification channels can be configured separately for each notification type.
[cloud-notification-popup-image]: ../images/cloud/notification-popup-example.png
[cloud-notification-image]: ../images/cloud/notification-list-example.png
[notification-settings-image]: ../images/cloud/notification-settings.png

description: Web Scraper Cloud Notifications - configure email alerts and platform-based notifications for scraping job completion, failures, data quality checks and more.
keywords: scraping notifications, email alerts, scraping job alerts, automated notifications, data delivery alerts
4 changes: 3 additions & 1 deletion docs/Web Scraper Cloud/Parser.md
@@ -54,4 +54,6 @@ These parsers can be used for data post processing:
[virtual-column]: Parser/Virtual%20column.md
[drag-n-drop]: ../images/parsers/drag-n-drop.gif
[cloud]: https://cloud.webscraper.io/


description: Web Scraper Cloud Parser - transform and clean scraped data with virtual columns, regex matching, HTML stripping, text replacement, and timestamp conversion
keywords: data parser, web scraper parser, data transformation, virtual columns, regex parser, html stripping, text replacement, data cleaning
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Parser/Append and prepend text.md
@@ -5,3 +5,6 @@ Add text at the beginning of the string and/or at the end of the string.
* Append text - add text at the end of the string.
* Prepend text - add text at the beginning of the string.
* Text to place - text that will be placed. Escape sequences as `\n\r\t` can be used.

description: Add text at the beginning or end of scraped strings using Web Scraper Cloud append and prepend text parser
keywords: append text, prepend text, parser, text manipulation, web scraper, string processing, data formatting
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Parser/Convert UNIX timestamp.md
@@ -10,3 +10,6 @@ designation.


[php-date]: https://www.php.net/manual/en/function.date.php

description: Web Scraper Cloud Convert UNIX Timestamp Parser - convert UNIX timestamps to human-readable date formats with customizable output formatting
keywords: unix timestamp, timestamp conversion, date conversion, timestamp parser, unix time, epoch time, date formatting
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Parser/Regex match.md
@@ -23,3 +23,6 @@ expressions.
| date: 2014-08-20 | `\d+-\d+-\d+` | 0 | 2014-08-20 |

[regexr]: https://regexr.com/

description: Web Scraper Cloud Regex Match Parser - extract specific data patterns from scraped content using regular expressions and capture groups
keywords: regex parser, regular expressions, pattern matching, data extraction, regex match, capture groups, text parsing
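
The context row in this hunk (input `date: 2014-08-20`, pattern `\d+-\d+-\d+`, group `0`, result `2014-08-20`) can be reproduced with a short sketch. Python's `re` module is used here only for illustration; the diff does not say which regex engine the Cloud parser runs on, and `regex_match` is a hypothetical helper, not part of Web Scraper.

```python
import re


def regex_match(text: str, pattern: str, group: int = 0) -> str:
    """Return the requested group of the first match.

    Returning "" when nothing matches is an assumption made for this
    sketch; the documented parser may behave differently.
    """
    match = re.search(pattern, text)
    return match.group(group) if match else ""


# Same input, pattern and group number as the table row in the hunk above.
print(regex_match("date: 2014-08-20", r"\d+-\d+-\d+", 0))
```

Group `0` is the whole match; a pattern with parentheses would expose its captures as groups `1`, `2`, and so on.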
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Parser/Remove column.md
@@ -18,3 +18,6 @@ them.
[virtual-column]: Virtual%20column.md
[remove-columns]: ../../images/parsers/remove-column.gif
[remove-virtual-column]: ../../images/parsers/remove-virtual-column.gif

description: Clean up scraped data by removing unnecessary columns using Web Scraper Cloud parser
keywords: remove column, parser, data cleanup, web scraper, column management, virtual column, scraped data
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Parser/Remove whitespaces.md
@@ -7,3 +7,6 @@ The parser allows you to remove whitespaces and new lines. It is useful for clea
* Remove new lines - replaces all new line groups with a single space.

[text-selector]: ../../Selectors/Text%20selector.md

description: Remove whitespaces and newlines from scraped data using the Web Scraper Cloud parser to clean up text fields
keywords: remove whitespaces, parser, text cleaning, web scraper, data processing, newlines removal
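
The "Remove new lines" option quoted in this hunk (each group of new lines becomes a single space) might look roughly like the sketch below. This is an illustration in Python, not the parser's actual implementation, and `remove_new_lines` is a hypothetical name.

```python
import re


def remove_new_lines(text: str) -> str:
    """Replace every run of consecutive line breaks with one space."""
    # [\r\n]+ matches one or more CR/LF characters in a row, so a Windows
    # "\r\n" pair or several blank lines collapse into a single space.
    return re.sub(r"[\r\n]+", " ", text)


print(remove_new_lines("first line\n\n\nsecond line\r\nthird line"))
```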
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Parser/Replace text.md
@@ -49,3 +49,6 @@ Text to place: `https://website.com/`
Use Regex: `checked`

[regexr]: https://regexr.com/

description: Web Scraper Cloud Replace Text Parser - find and replace text patterns in scraped data using string matching or regular expressions
keywords: replace text, text replacement, find replace, string replacement, regex replacement, text substitution, data cleaning
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Parser/Strip HTML.md
@@ -9,3 +9,6 @@ parser beforehand.

[html-entities]: https://www.w3schools.com/html/html_entities.asp
[replace-text]: Replace%20text.md

description: Web Scraper Cloud Strip HTML Parser - remove HTML tags and formatting from scraped content to extract clean plain text data
keywords: strip html, html removal, clean text, html tags removal, text extraction, html parser, plain text conversion
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Parser/Virtual column.md
@@ -49,3 +49,6 @@ value. In order to do this:
[replace-parser]: Replace%20text.md
[selectors]: ../../Selectors.md
[regex-parser]: Regex%20match.md

description: Web Scraper Cloud Virtual Column Parser - create custom data columns by combining and transforming data from existing columns with dynamic expressions
keywords: virtual column, data transformation, custom columns, column parser, data manipulation, calculated fields, dynamic columns
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Scheduler.md
@@ -37,3 +37,6 @@ If a very customized and specific scheduler is needed, for which the previous sc

[cloud]: https://www.webscraper.io/cloud-scraper
[cron]: https://en.wikipedia.org/wiki/Cron

description: Learn how to use Web Scraper Cloud Scheduler to automate scraping jobs with daily, interval, and custom cron expression scheduling options
keywords: web scraper, scheduler, automation, scraping jobs, cron expression, daily scheduler, interval scheduler, web scraper cloud
3 changes: 3 additions & 0 deletions docs/Web Scraper Cloud/Sitemap sync.md
@@ -51,3 +51,6 @@ downloaded in extension will be discarded from the sitemap list.
## Sync limitations

User has a limit of 50 sitemap sync actions per 15 minutes.

description: Web Scraper Cloud Sitemap Sync - synchronize sitemaps between browser extension and cloud platform for seamless scraping workflow management
keywords: sitemap sync, web scraper sync, browser extension sync, cloud sync, sitemap synchronization, scraping workflow sync
3 changes: 3 additions & 0 deletions docs/Website State Setup %2F Sign-in.md
@@ -78,3 +78,6 @@ you have explicit written permission to conduct data extraction behind a login.*

[CSS selectors]: https://webscraper.io/documentation/css-selector
[jQuery Contains Selector]: https://webscraper.io/how-to-video/jquery-contains-selector

description: Website State Setup for Web Scraper - configure conditional actions, website sign-in, location changes, and currency settings for automated web scraping workflows
keywords: website state setup, web scraper login, sign-in automation, location change, currency change, conditional scraping, scraper authentication
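
Every docs file in this change ends with the same pair of lines, `description:` followed by `keywords:`. A reviewer who wants to spot-check that pair could read it back with something like the sketch below. How the documentation site actually consumes these lines is not shown in the diff, so `parse_trailer` and its behaviour are assumptions for illustration only.

```python
def parse_trailer(markdown: str) -> dict:
    """Collect the trailing `description:` / `keywords:` lines of a doc."""
    meta = {}
    # Walk upward from the last non-empty line and stop at the first line
    # that is not one of the two metadata keys (assumed layout).
    for line in reversed(markdown.rstrip().splitlines()):
        key, sep, value = line.partition(":")
        if sep and key in ("description", "keywords"):
            meta[key] = value.strip()
        else:
            break
    if "keywords" in meta:
        meta["keywords"] = [k.strip() for k in meta["keywords"].split(",")]
    return meta


sample = (
    "[regexr]: https://regexr.com/\n"
    "\n"
    "description: Web Scraper Text Selector - extract text content from elements within a web page\n"
    "keywords: text selector, text extraction, web scraper selector\n"
)
print(parse_trailer(sample))
```

A file missing either line simply yields a dict without that key, which makes gaps easy to flag.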