bad_input
error_code=bad_input
blocked
bounding_box
selector
: A valid CSS selector for the element
browser_size
capture_graphql
click
selector
: Element selector
close_popup
Use close_popup() at the top of your code to add a background watcher that will close the popup when it appears. If a popup appears multiple times, it will always be closed
click_inside
: Selector of the parent iframe which contains the close button selector
collect
collect(<data_line>[, <validate_fn>]);
data_line
: An object with the fields you want to collect
validate_fn
: Optional function to check that the line data is valid
console
country
country(<code>);
code
: 2-character ISO country code
dead_page
error_code=dead_page
detect_block
resource
: An object specifying the resource required for the detection
selector
condition
: An object specifying how the resource should be processed for detection
exists
has_text
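As a rough illustration of how the two arguments might fit together, the sketch below builds a resource object and a condition object and passes them to detect_block. The selector strings and the text value are hypothetical, the value types of exists (boolean) and has_text (string) are assumptions, and the calls are guarded so the snippet also runs outside the scraper runtime, where detect_block is not defined.

```javascript
// Hypothetical selectors; replace them with ones from the target page.
const captcha_resource = {selector: '#captcha-form'};
const banner_resource = {selector: '.notice-banner'};

// Assumption: `exists` is a boolean flag and `has_text` is the text to look for.
const captcha_condition = {exists: true};
const banner_condition = {has_text: 'Access denied'};

// detect_block is only available inside the scraper runtime, so guard the calls.
if (typeof detect_block === 'function') {
    detect_block(captcha_resource, captcha_condition);
    detect_block(banner_resource, banner_condition);
}
```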
disable_event_listeners
track_event_listeners() must have been called first
event_types
: Specific event types that should be disabled
el_exists
selector
: Valid CSS selector
timeout
: Timeout duration to wait for the element to appear on the page
el_is_visible
embed_html_comment
comment
: Body of the comment
font_exists
font_exists(<font-family>);
freeze_page
hover
hover(<selector>);
selector
: Element selector
html_capture_options
options
: An object which accepts options defining how HTML capturing should be processed
coordinate_attributes
Image
src
: Image URL or data:image URI string
input
next_stage()
calls
job
next_stage()
calls
load_html
html
: Any HTML string
load_more
selector
: Selector for the element that contains the lazy-loaded items
load_sitemap
location
value
: Amount of money
currency
: Currency code
mouse_to
mouse_to(<x>, <y>);
x
: Target x position
y
: Target y position
navigate
navigate(<url>);
dead_page error by default. Use opt.allow_status to override this
url
: A URL to navigate to
opt
: Navigate options (see examples)
next_stage
input
: Input object to pass to the next browser session
parse
preserve_proxy_session
press_key
proxy_location
country(code) instead
configuration
: Object with a desired proxy location; check the examples for more info
redirect_history
rerun_stage
resolve_url
url
: URL string/instance
response_headers
request
url | options
: The URL to make the request to, or request options (see examples)
right_click
right_click(<selector>);
selector
: Element selector
run_stage
scroll_to
load_more(). Defaults to scrolling in a natural way, which may take several seconds. If you want to jump immediately, use {immediate: true}
syntax: scroll_to(<selector>);
selector
: Selector of the element you want to scroll to
scroll_to_all
scroll_to_all(<selector>);
selector
: Selector of the elements you want to scroll through
select
select(<selector>, <value>);
selector
: Element selector
set_lines
Each call to set_lines() will override previous ones, and only the last set of lines will be added to the dataset (tracked per page crawl). This is a good fit when the scraper is set to collect partial data on errors. You can keep calling set_lines() with the data you gathered so far, and the last call will be used if the page crawl throws an error
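The "keep calling with the data gathered so far" pattern can be sketched roughly as follows. The item data, the validate_line helper, the publish_lines wrapper, and the recorded_calls array are all hypothetical; the assumption that a validation function signals an invalid line by throwing is also not confirmed by this reference. Outside the scraper runtime the sketch falls back to a simple recorder so it still runs.

```javascript
// Hypothetical items standing in for data gathered step by step during a page crawl.
const scraped_items = [
    {title: 'First item', price: 10},
    {title: 'Second item', price: 25},
    {title: 'Third item', price: 40},
];

// Hypothetical validator. Assumption: an invalid line is reported by throwing.
function validate_line(line) {
    if (!line.title)
        throw new Error('Line is missing a title');
}

// Use set_lines when it exists; otherwise record the calls so the sketch stays runnable.
const recorded_calls = [];
const publish_lines = typeof set_lines === 'function'
    ? (lines, validate_fn) => set_lines(lines, validate_fn)
    : (lines, validate_fn) => {
        lines.forEach(line => validate_fn(line));
        recorded_calls.push(lines.slice());
    };

const lines_so_far = [];
for (const item of scraped_items) {
    lines_so_far.push(item);
    // Each call replaces the previous one, so always pass everything gathered so far.
    publish_lines(lines_so_far, validate_line);
}
```

Because every call passes the full array rather than only the newest line, whichever call happens to be the last one before an error still carries all of the data collected up to that point.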
syntax: set_lines(<data_line>[, <validate_fn>]);
lines
: An array of data lines to add to your final dataset
validate_fn
: Optional function to check that the line data is valid (run once per line)
set_session_cookie
set_session_headers
headers
: Object with extra headers in key-value format
solve_captcha
status_code
tag_all_responses
field
: The name of the tagged field
pattern
: The URL pattern to match
options
: Set options.jsonp=true to parse response bodies that are in jsonp format. This will be automatically detected when possible
tag_download
url
: A pattern or a string to match requests against
tag_image
tag_response
tag_response(<field>, <pattern>, <options>);
name
: The name of the tagged field
pattern
: The URL pattern to match
options
: Set options.jsonp=true to parse response bodies that are in jsonp format. This will be automatically detected when possible
tag_screenshot
tag_screenshot(<field>, <options>);
tag_script
tag_script(<field>, <selector>);
tag_serp
field
: The name of the tagged field
type
: Parser type (e.g. bing, google)
tag_video
field
: The name of the tagged field
selector
: A valid CSS selector
opt
: Download options (see example)
tag_window_field
field
: The path to the relevant data
track_event_listeners
disable_event_listeners() later
type
selector
: Element selector
text
: Text to enter
URL
url
: URL string
verify_requests
callback
: A function which will be called on each failed request with an object in format: {url, error, type, response}
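A callback of this shape might look like the sketch below, which records a short summary of each failed request. The names failed_requests and on_failed_request are hypothetical, and the concrete types of error, type, and response are assumptions; only the {url, error, type, response} format comes from the description above. The registration call is guarded so the snippet also runs outside the scraper runtime.

```javascript
// Hypothetical store for a short summary of every failed request.
const failed_requests = [];

// The callback receives an object in the format {url, error, type, response}.
function on_failed_request({url, error, type, response}) {
    failed_requests.push({
        url,
        type,
        reason: String(error),
        // Assumption: response may be missing when the request never completed.
        has_response: response != null,
    });
}

// verify_requests is only defined inside the scraper runtime, so guard the call.
if (typeof verify_requests === 'function')
    verify_requests(on_failed_request);
```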
Video
src
: Video URL
wait
selector
: Element selector
opt
: Wait options (see examples)
wait_any
wait_for_parser_value
field
: The parser value path to wait on
validate_fn
: An optional callback function to validate that the value is correct
opt
: Extra options (e.g. timeout)
wait_for_text
selector
: Element selector
text
: The text to wait for
wait_hidden
selector
: Element selector
wait_network_idle
timeout
: Wait for browser network to be idle for X milliseconds
options
: ignore: an array of patterns to exclude requests from monitoring; timeout: how long the network needs to be idle in milliseconds (default 500)
wait_page_idle
timeout
: Milliseconds to wait for no changes
options
: An object which can accept an ignore argument to exclude some elements from monitoring
wait_visible
selector
: Element selector
$
selector
: Element selector
Clicking on selector based on text
selector
: Element selector
emulate_device
device
: A string with the name of the device
Here is the full list of device names
input
$
location
Image
Video
Money