C3 Graphing

Churning Data into Information

I work with a lot of data on the behalf of an agency without a lot of money. Exploring free-to-use and open-source tools is key to being effective in my job.

Recently, I've written a a couple of series on how to use R and SQL to sort through Homeless Management Information System data.

These data are essential to local governments helping individuals experiencing homelessness to be housed quickly and appropriately.

But one area R and SQL have not delivered is on-line interactive dashboards. Data is one thing, but easy to digest information is really key to informing stakeholders how the system is working to end homelessness.

In other projects I've attempted to generate graphs as images and upload to a static link. Then, each time the data change re-generate replace the image. But, most website servers cache the images so it is not ideal.

This has pushed me to try to learn D3.

I'm not going to lie, I've felt confused by languages, IDEs, and libraries. And I've overcome most of the these challenges. But I've never been so confused as by the layout and syntax of D3. The dyslexic feeling I get trying to work in D3 has discouraged me from spending too much time on it.

But recently I decided to take another stab at it-- this time I lucked out and found the C3.js.

Essentially, C3 is a library which greatly simplifies D3. It boils down building a graph into a set of options passed to the C3 graph builder as a JSON object.

This code:

var chart = c3.generate({
    data: {
        x: 'Date',
        y: '# Individuals',
        xFormat: '%Y-%m-%d',
        url: 'https://ladvien.com/projects/d3/data/trendsInTX601.csv',
        type: 'line',
        // colors: {
        //     Count: '#990000'
        // }
        names: {
            NumberHomeless: "Homeless",
            NumberInRRH: "Rapid Rehousing",
            NumberInPSH: "Permanent Supportive Housing"
        }
    },

    title: {
        text: "Homeless or Formerly Homeless in TX-601"
    },

    legend: {
        show: true
    },

    axis: {
        x: {
            type: 'timeseries',
            tick: {
                count: 4,
                format: '%Y-%m-%d',
                // rotate: 90,
                multiline: false,

                culling: {
                    max:5 
                }
            }
        },
        y: {
            max: 3000,
            min: 0,
            label: "# Individuals"
            // Range includes padding, set 0 if no padding needed
            // padding: {top:0, bottom:0}
        },
    },

    point: {
        r: 0
    }
});

Using this CSV:

Produces the following graph:

One Hiccup

I did run into a one hiccup in setup. It seems the most recent version of d3 (version 4.0) has had much of its API overhauled. In such, it will not work with C3. But D3 v3 is still available from the D3 CDN:

<script src="https://d3js.org/d3.v3.min.js"></script>

Calling this library and following the instructions outlined by the C3 site, you can be generating graphs in little time.

Updating Data Securely and On Schedule

Now that I've the ability to use R and SQL to sort through my data, and I could quickly generate graphs using D3 and C3, it'd be really nice if a lot of this could be automated. And luckily, I'd run into a few other tools which made it pretty easy to replace the data on my C3 graphs.

Rsync

Rsync is primarily a Linux tool, but it is available on Windows as well. It is nice since it will allow you to quickly reconcile two file-trees (think of a manual Dropbox).

It will also allow you to sync a local file tree with a server file tree across an SSH connection. For example, I use the following command to sync the data mentioned above to the server

rsync -avz /Users/user/data/js-practice/d3/* ladvien@ladvien.com:/usr/share/nginx/html/projects/d3/

After running this command it will prompt for a password to access the server. Then, it will proceed to sync the two file-trees. Nifty!

This allows me to quickly update the data on the graph. Now, if only there were a way to automatically insert my password, then I could write a script to automate the whole process.

Python Keyring

Python Keyring is a tool which allows you to save and retrieve passwords from your PC's keyring.

It is compatible with:

  • Mac OS X Keychain
  • Freedesktop Secret Service (requires secretstorage)
  • KWallet (requires dbus)
  • Windows Credential Vault

If you have Python installed you can install the Keyring tool with Pip:

$pip install keyring

After, you can store a password in the keyring by using the command-line tool. You will need to replace username with the name of your server login.

$keyring set system username

And retrieve it with:

$keyring get system username

This is great. It means we can store our password in the keyring and retrieve it securely from a script.

Great! Now we could write a script to have Rsync sync the any data changes locally with the server. Right? Well, almost. We needed one more tool.

SSHPass

There is a problem with using Rsync to sync files remotely from a script. When Rsync is called from a script it will not wait for parameters to be passed to the tool. Sigh.

Luckily, I'm not the only with this problem and a tool was created to solve this problem.

If you are on a Mac you'll need to use Brew to install SSHPass.

brew install https://raw.githubusercontent.com/kadwanev/bigboybrew/master/Library/Formula/sshpass.rb 

There we go! Now we can automate the whole process.

I wrote this script to do the dirty work:

#!/bin/sh
PASSWORD=("$(keyring get system ladvien.com)")
ECHO ""
ECHO "****************************"
ECHO "* Updating D3 Projects     *"
ECHO "****************************"
ECHO ""
sshpass -p "$PASSWORD" rsync -avz /Users/user/data/js-practice/d3/* root@ladvien.com:/usr/share/nginx/html/projects/d3/

Cron

Ok! One last bit of sugar on this whole process. Let's create a Cron job. This will run the script in the background at an interval of our choosing.

For me, I've a staff who pulls data and runs a master script every Monday. So, I'll set my automated script to update my C3 graph data on Tuesday, when I know new data is available.

You can use Nano to edit your Cron job list.

env EDITOR=nano crontab -e

To run a Cron job on Tuesday we would set the fifth asterisk to 2.

* * * * 2 /the/path/to/our/update_script.sh

And don't forget to make the update_script.sh executable.

chmod +x update_script.sh

I'm a hacker hacking with a hacksaw!