Chapter 5 Visualizations with ggplot2

We can plot a sample of this table to get an overview, keeping only the rows where the hour equals 17:
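As a minimal sketch of that filter-then-plot step, the snippet below builds a small toy table and keeps the 17h rows. The table `DT`, its columns (`hour`, `maxspeed`, `mean_speed`) and the choice of axes are all assumptions made for illustration; the real table from the previous chapter may look different.

```r
library(data.table)
library(ggplot2)

# Toy stand-in for the real table; all column names and values are hypothetical.
set.seed(1)
DT <- data.table(
  hour       = sample(0:23, 500, replace = TRUE),
  maxspeed   = sample(c(30, 50, 80), 500, replace = TRUE),
  mean_speed = runif(500, 10, 90)
)

# Keep only the rows recorded at 17h.
DT.17 <- DT[hour == 17]

# Plot the measured speed against the speed limit for that sample.
p <- ggplot(DT.17, aes(x = maxspeed, y = mean_speed)) +
  geom_point(alpha = 0.5) +
  labs(x = "Speed limit (kph)", y = "Mean speed (kph)")
print(p)
```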

This simply confirms what we saw in Chapter 4: some maxspeed values seem to be non-“standard” or very seldom used by Uber. We can check this with a quick histogram:

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
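The message above appears because `geom_histogram()` falls back to 30 bins when no bin width is given. A sketch of a histogram with an explicit `binwidth` could look like this; the `limits` data frame and its values are made up for illustration.

```r
library(ggplot2)

# Hypothetical speed limits standing in for the real maxspeed column.
limits <- data.frame(
  maxspeed = c(rep(30, 40), rep(50, 120), rep(80, 25), 10, 20, 60, 70, 100)
)

# An explicit binwidth of 10 kph silences the stat_bin() message.
p <- ggplot(limits, aes(x = maxspeed)) +
  geom_histogram(binwidth = 10) +
  labs(x = "maxspeed (kph)", y = "Number of rows")
print(p)
```

With speed limits that are multiples of 10, a 10 kph bin width is a natural choice, but other widths may suit the real data better.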

Indeed, the 30, 50 and 80 kph limits are the ones that matter.
We can drop the other rows:
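One way to do this with data.table is a `%in%` filter on the row index. The toy table below is hypothetical (its name, columns and values are assumptions); only the three speed limits come from the text.

```r
library(data.table)

# Toy table with a few non-standard limits mixed in (hypothetical values).
DT <- data.table(
  street   = c("A", "B", "C", "D", "E"),
  maxspeed = c(30, 50, 20, 80, 100)
)

# Keep only the rows whose limit is one of the three common values.
DT <- DT[maxspeed %in% c(30, 50, 80)]
print(DT)
```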

Now let us focus on maxspeed = 50 kph (the color represents a street name). Note that we use the ggplotly function from the plotly package to make our ggplot2 chart interactive (under the hood, d3.js is running; you can be proud of yourself, really).
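The general pattern is to build a regular ggplot2 object and pass it to `ggplotly()`. The sketch below uses a made-up table; the street labels, the `mean_speed` and `rides` columns and the choice of axes are assumptions, not the chapter's actual variables.

```r
library(data.table)
library(ggplot2)
library(plotly)

# Hypothetical data: three streets, all with a 50 kph limit.
set.seed(42)
DT <- data.table(
  street     = rep(c("Street A", "Street B", "Street C"), each = 20),
  maxspeed   = 50,
  mean_speed = runif(60, 20, 70),
  rides      = sample(50:500, 60)
)

# Restrict to the 50 kph rows and colour the dots by street name.
DT.50 <- DT[maxspeed == 50]
p <- ggplot(DT.50, aes(x = mean_speed, y = rides, colour = street)) +
  geom_point()

# ggplotly() converts the static ggplot2 object into an interactive htmlwidget.
p.interactive <- ggplotly(p)
# In an interactive session, printing p.interactive opens the widget.
```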

What could those curves be? How can those dots, whose color corresponds to a street name, form inverse curves although they are unrelated? For instance, the Müllerstraße dot on the plot is next to the Frankfurter Allee dot, and on the same fitted curve, although they are in two different districts of Berlin.

More prosaically, we can plot the over-speed percentage over the day. For this, we take our current table, aggregate it per hour of the day, and compute the weighted over-speed ratio:

##    hour maxspeed ratio.over mean_speed_minus_max
## 1:    0       50   14.80034           -10.042412
## 2:    1       50   17.21446            -9.231080
## 3:    2       50   22.51034            -8.057035
## 4:    3       50   25.58019            -7.430214
## 5:    4       50   31.03792            -6.464708
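An aggregation of this kind can be sketched with data.table's `by` grouping. Everything about the input here is an assumption: the toy rows, the `mean_speed` column, and the use of a hypothetical `n_rides` column as the weight. Only the output column names follow the table above.

```r
library(data.table)

# Hypothetical per-segment rows; n_rides is assumed to be the weight.
DT <- data.table(
  hour       = c(0, 0, 0, 1, 1, 1),
  maxspeed   = 50,
  mean_speed = c(40, 55, 35, 45, 60, 52),
  n_rides    = c(100, 50, 50, 80, 10, 10)
)

# Per hour: weighted share (in %) of rows above the limit, and the
# weighted mean gap between measured speed and the limit.
DT.hour <- DT[, .(
  ratio.over           = 100 * sum(n_rides * (mean_speed > maxspeed)) / sum(n_rides),
  mean_speed_minus_max = weighted.mean(mean_speed - maxspeed, n_rides)
), by = .(hour, maxspeed)]
print(DT.hour)
```

Weighting by ride count keeps a rarely used segment from counting as much as a busy one; whether the original analysis used this exact weight is not stated in the text.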

Here we use the ggthemes package to add some elegance and credibility to our charts, with the help of The Economist theme:
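A minimal sketch of applying that theme is shown below, using the five hourly rows printed earlier. The choice of a line chart and the axis labels are assumptions; `theme_economist()` is the ggthemes function for this style.

```r
library(ggplot2)
library(ggthemes)

# First rows of the hourly over-speed table printed earlier in this chapter.
hourly <- data.frame(
  hour       = 0:4,
  ratio.over = c(14.80034, 17.21446, 22.51034, 25.58019, 31.03792)
)

# A plain line chart, restyled by adding theme_economist() as the last layer.
p <- ggplot(hourly, aes(x = hour, y = ratio.over)) +
  geom_line() +
  geom_point() +
  labs(x = "Hour of the day", y = "Over speed (%)") +
  theme_economist()
print(p)
```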

Hey, but we forgot to look at the overall number of rides by hour of the day! How could we do that? Just reuse the original DT.Uber table and do some aggregation:

##    hour total_rides
## 1:    0      142254
## 2:    1      128207
## 3:    2       57126
## 4:    3       84035
## 5:    4       81893
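A table of this shape comes from a simple grouped sum. The sketch below runs on a toy table rather than on DT.Uber itself; the name `DT.toy` and the `rides` column are hypothetical, since the text does not show the structure of the original table.

```r
library(data.table)

# Toy stand-in for DT.Uber; the rides column is an assumed name.
DT.toy <- data.table(
  hour  = c(0, 0, 1, 1, 2),
  rides = c(10, 5, 7, 3, 4)
)

# Sum the rides within each hour, then sort by hour.
DT.rides <- DT.toy[, .(total_rides = sum(rides)), by = hour][order(hour)]
print(DT.rides)
```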

For the plot, we use another ggplot2 trick, polar coordinates:
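In ggplot2, adding `coord_polar()` to a bar chart wraps the x axis around a circle, which suits clock-like data. The sketch below uses the five hourly totals printed earlier; the choice of `geom_col()` and the labels are assumptions.

```r
library(ggplot2)

# First rows of the hourly ride totals printed earlier in this chapter.
rides <- data.frame(
  hour        = 0:4,
  total_rides = c(142254, 128207, 57126, 84035, 81893)
)

# A bar chart with one bar per hour, bent into a circle by coord_polar().
p <- ggplot(rides, aes(x = factor(hour), y = total_rides)) +
  geom_col() +
  coord_polar() +
  labs(x = "Hour of the day", y = "Total rides")
print(p)
```

With all 24 hours present, the result resembles a clock face where the length of each wedge reflects the ride count.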

This gives us a relative idea of the ride volume over the day.