Learn how to trigger data collection using the Web Scraper API with options for discovery and PDP scrapers. Customize requests, set delivery options, and retrieve data efficiently.
Inputs can be passed inline as a JSON array of objects.
Example: [{"url":"https://www.airbnb.com/rooms/50122531"}]
data: inputs can also be uploaded from a file.
Example (curl): data=@path/to/your/file.csv
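A minimal sketch of both ways to trigger a collection. The endpoint URL, the dataset_id query parameter, and the multipart form field shown here are assumptions; confirm the exact names against the API reference for your dataset.

# Inline JSON inputs
curl -X POST "https://api.brightdata.com/datasets/v3/trigger?dataset_id=gd_l1vikfnt1wgvvqz95w" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '[{"url":"https://www.airbnb.com/rooms/50122531"}]'

# Inputs uploaded from a file instead of inline JSON
curl -X POST "https://api.brightdata.com/datasets/v3/trigger?dataset_id=gd_l1vikfnt1wgvvqz95w" \
  -H "Authorization: Bearer <token>" \
  -F 'data=@path/to/your/file.csv'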
There are two scraper types, and the expected input differs between them (illustrated below):
PDP (with URL input): the input is always a URL, pointing to the page to be scraped.
Discovery (by method): inputs can vary according to the specific scraper and discovery method, for example keywords, category or best-sellers URLs, and locations.
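For illustration, a PDP input versus a discovery input might look like the sketch below. The keyword field name and its value are assumptions; the actual input fields depend on the specific scraper.

PDP input:       [{"url":"https://www.airbnb.com/rooms/50122531"}]
Discovery input: [{"keyword":"coffee maker"}]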
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Dataset ID for which data collection is triggered. Example: "gd_l1vikfnt1wgvvqz95w"
List of output columns, separated by | (e.g., url|about.updated_on). Filters the response to include only the specified fields. Example: "url|about.updated_on"
Set it to "discover_new" to trigger a collection that includes a discovery phase.
Specifies which discovery method to use. Available options: "keyword", "best_sellers_url", "category_url", "location" and more (according to the specific API). Relevant only for collections that include a discovery phase.
Include an errors report with the results.
Limit the number of results per input (x >= 1). Relevant only for collections that include a discovery phase.
Limit the total number of results (x >= 1).
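A sketch of a discovery-phase trigger. The type, discover_by, and limit_per_input parameter names are assumptions; only the values "discover_new" and "keyword" come from the options described above, and the input field again depends on the specific scraper.

curl -X POST "https://api.brightdata.com/datasets/v3/trigger?dataset_id=gd_l1vikfnt1wgvvqz95w&type=discover_new&discover_by=keyword&limit_per_input=10" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '[{"keyword":"coffee maker"}]'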
URL where the notification will be sent once the collection is finished. Notification will contain snapshot_id and status.
Webhook URL where data will be delivered.
Specifies the format of the data to be delivered to the webhook endpoint. Available options: json, ndjson, jsonl, csv.
Authorization header to be used when sending the notification to the notify URL or delivering data via the webhook endpoint.
By default, the data will be sent to the webhook compressed. Pass true to send it uncompressed.
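A sketch of delivery options on the trigger call. The notify, endpoint, format, and uncompressed_webhook parameter names are assumptions, and https://example.com/... stands in for your own URLs (URL values passed as query parameters should be URL-encoded, as shown).

curl -X POST "https://api.brightdata.com/datasets/v3/trigger?dataset_id=gd_l1vikfnt1wgvvqz95w&notify=https%3A%2F%2Fexample.com%2Fnotify&endpoint=https%3A%2F%2Fexample.com%2Fwebhook&format=json&uncompressed_webhook=true" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '[{"url":"https://www.airbnb.com/rooms/50122531"}]'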
The body is of type object[] (an array of input objects).
Collection job successfully started
The response is of type object.
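As a sketch, a successful response is an object carrying the snapshot identifier referenced above; the exact shape may vary, and the value shown is a placeholder.

{"snapshot_id": "<snapshot_id>"}

The snapshot_id can then be used to check the collection status and retrieve results once the job finishes, as indicated in the notify description above.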