Amy R. Johnson

Creating a Realtime Graph With Rickshaw and Heroku Scheduler

Rickshaw is a really cool JavaScript graphing library by Shutterstock built on d3. I decided to create a realtime visualization of Citi Bike data from their JSON feed. (It’s still a work in progress; check it out here.)

To set up even a simple graph, you have to download the Rickshaw JavaScript and CSS files and reference them correctly. The more complicated your graph is, the more of these files there will be, so be sure you put everything in the right place. My realtime graph needed many more, but for a basic line graph the links look something like this:

<link type="text/css" rel="stylesheet" href="../src/css/sample.css">
<link type="text/css" rel="stylesheet" href="../src/css/graph.css">
<link type="text/css" rel="stylesheet" href="css/lines.css">

<script src="js/d3.v3.js"></script>

<script src="js/rickshaw.min.js"></script>

<div id="chart"></div>

The next important component is getting the realtime data from the Citi Bike feed. I accomplished this using an AJAX request. The Rickshaw example realtime graph includes a helpful setInterval call that repeats the AJAX request at a specified interval. Once the request completes, I send the data to a callback function that parses it and adds it to the series, and the graph.update() function displays the updated graph.

setInterval( function() {
    newRemoveData(seriesData);
    jQuery.ajax({
        url: citibikeurl,
        type: 'GET',
        dataType: "jsonp",
        crossDomain: true,
        success: function(json){
            // classify the stations and add one new point per neighborhood
            sortNeighborhoods(json);
        },
        error: function(){
            // repeats previous datapoint if bike feed request is unsuccessful
            for (var i = 0; i < 3; i++) {
                newAddData([seriesData[i]], seriesData[i][seriesData[i].length - 1]["y"]);
            }
        }
    });
    // the request is asynchronous, so this redraws with the data received so far
    graph.update();

}, 2000 );
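
The post does not show newAddData or newRemoveData. As a rough sketch of what rolling-window helpers of this kind might look like, here is one possibility. The function names, the stepSeconds parameter, and the {x, y} point shape are assumptions for illustration, not the code behind the graph.

```javascript
// Hypothetical helpers, not the post's actual newAddData/newRemoveData.
// Each series is assumed to be an array of {x, y} points.

// Append one point with value y to every series in seriesList,
// placing it stepSeconds after that series' last point.
function appendPoint(seriesList, y, stepSeconds) {
    seriesList.forEach(function (series) {
        var last = series[series.length - 1];
        var x = last ? last.x + stepSeconds : 0;
        series.push({ x: x, y: y });
    });
}

// Drop the oldest point from every series so the window keeps a fixed length.
function dropOldestPoint(seriesList) {
    seriesList.forEach(function (series) {
        series.shift();
    });
}

// Usage: slide a two-point window forward by one reading.
var demo = [[{ x: 0, y: 5 }, { x: 2, y: 7 }]];
dropOldestPoint(demo);
appendPoint(demo, 9, 2);
```

Removing one point before adding one, as the interval callback above does, keeps the number of points on screen constant.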

Looking at the data, I decided it would be interesting to graph the number of available bikes by area. The callback function classifies each station by its longitude and latitude and adds its bikes to the total count for its respective neighborhood.

function sortNeighborhoods(json){
    var midtownBikes = 0;
    var brooklynBikes = 0;
    var downtownBikes = 0;
    // lat and lng are compared as integers; the thresholds suggest degrees scaled by 1,000,000
    for(var i = 0; i < json.length; i++){
        if(json[i]["lng"] >= -73971000 || (json[i]["lat"] <= 40705311 && json[i]["lng"] >= -73999978)){
            // east of the first longitude cutoff, or south of the latitude cutoff and east of the second
            brooklynBikes += json[i]["bikes"];
        }else if(json[i]["lat"] > 40740895){
            midtownBikes += json[i]["bikes"];
        }else if(json[i]["lat"] <= 40740895){
            downtownBikes += json[i]["bikes"];
        }
    }
    // add one new point per series: midtown, downtown, brooklyn
    newAddData([seriesData[0]], midtownBikes);
    newAddData([seriesData[1]], downtownBikes);
    newAddData([seriesData[2]], brooklynBikes);
}

Once you have the data in the graph, Rickshaw has lots of options you can customize, from color palette to annotations and hover detail. The syntax for these is pretty straightforward and can be found in the documentation and by looking at the example graphs provided on their website.

The next interesting problem I faced was seeding the graph with some recent data before the page rendered so the user didn’t have to wait for the graph to populate. I set up a Postgres database and created a simple rake task to grab the realtime data, just like in my graph, but write it to the database instead of rendering it. That way a rolling log of data would be available to populate the graph.
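
To make the seeding step concrete, here is a minimal sketch of turning stored rows into the three series the graph uses. The row shape (recordedAt, midtown, downtown, brooklyn) and the function name rowsToSeries are hypothetical; the post does not describe the actual table layout.

```javascript
// Hypothetical row shape: { recordedAt: <unix seconds>, midtown, downtown, brooklyn }.
// Returns three arrays of {x, y} points in the order midtown, downtown, brooklyn,
// matching the seriesData[0..2] order used in the post's code.
function rowsToSeries(rows) {
    var names = ["midtown", "downtown", "brooklyn"];
    // Copy before sorting so the caller's array is left untouched.
    var sorted = rows.slice().sort(function (a, b) {
        return a.recordedAt - b.recordedAt;
    });
    return names.map(function (name) {
        return sorted.map(function (row) {
            return { x: row.recordedAt, y: row[name] };
        });
    });
}

// Usage with two made-up rows, given out of order.
var seed = rowsToSeries([
    { recordedAt: 1200, midtown: 40, downtown: 55, brooklyn: 30 },
    { recordedAt: 600, midtown: 42, downtown: 50, brooklyn: 31 }
]);
```

Sorting by timestamp first matters because a line graph expects each series' x values in ascending order.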

In order for the data to be current, the rake task needs to be run automatically at regular intervals, which is where the Heroku Scheduler add-on comes in. This add-on is free and can be added to your existing Heroku application using the command $ heroku addons:add scheduler:standard. Then, in your Heroku dashboard, you can specify the rake task you want run and the interval, as often as every 10 minutes. Because I can’t have the task run as often as my data source updates, I added code to interpolate between the most recent points to make the graph smoother.
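
The interpolation code is not shown in the post. One simple approach, assumed here for illustration, is linear interpolation between two stored readings; the function name interpolate and its steps parameter are hypothetical.

```javascript
// Linear interpolation between two stored readings (an assumed technique,
// not necessarily the one used for the graph).
// a and b are {x, y} points; steps is how many evenly spaced points
// to generate strictly between them.
function interpolate(a, b, steps) {
    var points = [];
    for (var i = 1; i <= steps; i++) {
        points.push({
            x: a.x + (b.x - a.x) * i / (steps + 1),
            y: a.y + (b.y - a.y) * i / (steps + 1)
        });
    }
    return points;
}

// Usage: two readings 10 minutes (600 seconds) apart, filled with two points.
var filled = interpolate({ x: 0, y: 100 }, { x: 600, y: 130 }, 2);
// filled is [{ x: 200, y: 110 }, { x: 400, y: 120 }]
```

With readings 10 minutes apart and a graph that ticks every 2 seconds, a larger steps value would be needed to fill every tick.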