Scrape dynamic webpages: Selenium
- Dynamic website
  - Changes as the user interacts with it
  - Has reactive elements (e.g. drop-down menus)
- General idea: Control your browser to scrape dynamically rendered web pages
- Selenium was originally developed for automated web testing
- R launches a browser session, and all communication with the website is routed through that session
- PhantomJS: scriptable headless browser (renders pages without displaying a window)
- Capabilities: complete forms, type text, click on buttons or areas of a page, navigate to a new URL…
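
The capabilities listed above can be sketched in R with the RSelenium package. This is a minimal illustration, not a definitive implementation: it assumes RSelenium, a local Firefox installation, and a matching driver are available. The function name `scrape_dynamic_page`, its arguments, and the port number are hypothetical and chosen only for this example.

```r
# Sketch only: assumes the RSelenium package, Firefox, and a matching
# driver are installed. The function name and arguments are illustrative.
scrape_dynamic_page <- function(url, input_css, button_css, text, port = 4545L) {
  # Launch a Selenium server plus a browser session controlled from R
  driver <- RSelenium::rsDriver(browser = "firefox", port = port, verbose = FALSE)
  remote <- driver$client

  # Close the browser and stop the server when the function exits,
  # including when an error occurs
  on.exit({
    remote$close()
    driver$server$stop()
  }, add = TRUE)

  # Navigate to the target URL
  remote$navigate(url)

  # Write text into a form field located by a CSS selector
  field <- remote$findElement(using = "css selector", value = input_css)
  field$sendKeysToElement(list(text))

  # Click a button located by a CSS selector
  button <- remote$findElement(using = "css selector", value = button_css)
  button$clickElement()

  # Crude fixed pause so the page can re-render (an assumption; the
  # right waiting strategy depends on the site)
  Sys.sleep(2)

  # Return the rendered HTML as a single character string
  remote$getPageSource()[[1]]
}
```

A call might look like `html <- scrape_dynamic_page("https://example.com", "#query", "#submit", "selenium")`, where the URL and both selectors are placeholders. The returned HTML string can then be parsed like any static page, for example with `rvest::read_html(html)`.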