How to Scrape Data from a JavaScript Website with R

In September 2017, I was working on a project that required football odds data. At the time I didn’t know about resources such as Football-Data or the odds-api, so I decided to build a scraper to collect the data directly from the bookmakers. However, most of them used JavaScript to render their odds, so I couldn’t collect the data with R and rvest alone. In this article, I’ll demonstrate how to use PhantomJS with R to scrape JavaScript-rendered content from the web.

Building a Repository of Alpine-based Docker Images for R, Part II

In the first article of this series, I built an Alpine-based Docker image with R base packages from Alpine’s native repositories, as well as an image with R compiled from source code. The images are hosted on Docker Hub in the velaco/alpine-r repository.

The next step was either to address the fatal errors I found while testing the installation of R or to proceed to building an image with Shiny Server. The logical choice would have been to pass all tests with R’s base packages before proceeding, but I was a bit impatient and wanted to go through the process of building a Shiny Server image as soon as possible. After two weeks of trial and error, I finally have a container that can start the server and run Shiny apps.

Building a Repository of Alpine-based Docker Images for R, Part I

The Rocker Project maintains the official Docker images of interest to R users. I use their images as a base to deploy containerized Shiny apps, but the virtual size of the images I build tends to fall between 400 and 600 MB. To reduce the size of my images, I decided to try building a Shiny Server image on Alpine Linux as an alternative to Rocker’s Debian-based images. In this series of articles, I’ll document my progress from building a base image with R to building an image with Shiny Server. The Dockerfiles included in this article can be found in the velaco/alpine-r repository.

RtGraph Example - Graphing How a Tweet Can Spread Among Retweeters

“How far can users’ tweets spread thanks to their followers’ retweets?” I asked myself one day. It’s important to note that I was thinking in terms of degrees of separation rather than geographic distances. For example, if one of your followers retweets something you wrote, their followers will see that tweet and are now two steps away from you. If one of them does not follow you, but decides to retweet it further, their followers will be three steps away from you. And so on…

To satisfy my curiosity, I wrote an R script that graphs the relationships between the author of a tweet and its retweeters. The script I used in this example is available in the RtGraph repository.
