Ladvien's Lab

Latest Posts

Working with R Excel Libraries

We've worked a bit with Comma Separated Values ( .csv ) files, but it they aren't the only way to store data. There are a lot of data storage formats, each with its strengths and weaknesses. One of the deficits of the CSV format is it cannot store formatting or graphs. This is the reason Excel format ( .xls or .xlsx ) has become another industry standard.

Excel is a program created by Microsoft to allow people to easily work with spreadsheets. With it, they created a way of storing data which allows for formatting and other information to be included. In fact, Excel ...

Comparing Values in R and SQL

Comparative Functions

Comparing two or more values is an extremely important concept when talking to computers. In writing a report script, it is is essential. Comparisons allow us to filter to values within a range, allowing us to provide a report of relevant information.

Take the following data:

######################### Data ###################################
###################### DO NOT CHANGE #############################
peopleDf <- data.frame(PersonalID=c("ZP1U3EPU2FKAWI6K5US5LDV50KRI1LN7", "IA26X38HOTOIBHYIRV8CKR5RDS8KNGHV", "LASDU89NRABVJWW779W4JGGAN90IQ5B2"), 
           FirstName=c("Timmy", "Fela", "Sarah"),
           LastName=c("Tesa", "Falla", "Kerrigan"),
           DOB=c("2010-01-01", "1999-1-1", "1992-04-01"))
##################################################################
##################################################################

If you run the above in R you should get a dataframe called peopleDf which looks like this:

PersonalID FirstName LastName DOB
ZP1U3EPU2FKAWI6K5US5LDV50KRI1LN7 Timmy Tesa ...

Providing Chronically Homeless List

With this work challenge we are going to take the concepts we've learned from the first challenge and build on them. We will combine two dataframes derived from Client.csv and Enrollment.csv. Then, we will apply HUD's formula to get a by-name-list of those who are chronically homeless.

Data Needed

The current definition of chronically homeless is found in HUD's federal register :

A “chronically homeless” individual is defined to mean a homeless individual with a disability who lives either in a place not meant for human habitation, a safe haven, or in an emergency shelter ...

Give me MyFitnessPal Data!

I'm fat. Fatter than I want to be. I've not always been fat, I got down to 180 at back in 2008. It took counting calories and weight religiously. The key piece for me was having a graph which I looked at daily showing my outcomes. Over the course of a year I lost 40 pounds. Well, it's time to do it again. I've gained that 40 back over 10 years--and now it needs to go.

Back in 2008 I was using Google to give me the calories of every item I ate and recording them in an Excel document ...

Filter to Most Recent HUD Assessment

Enrollment.csv

Many of the CSVs in the HMIS CSV may contain multiple rows per client. This can make it difficult when working with HMIS data, as it appears to the non-data person there are duplicates within your data.

Let's look at some dataframes:

enrollmentDf

ProjectEntryID PersonalID FirstName EntryDate
L0TDCLTDEARVHNIQ4F9EDDKXJ764Z65Q ZP1U3EPU2FKAWI6K5US5LDV50KRI1LN7 Bob 10/17/2016
169ACC89JY5SX0U87U7HQ28PMMHNJEXQ IA26X38HOTOIBHYIRV8CKR5RDS8KNGHV Jane 05/05/2015
XB52BYOGJ1YDFESNZVNGDGA58ITDML0A ZP1U3EPU2FKAWI6K5US5LDV50KRI1LN7 Bob 01/01/2013

Notice how Bob has two records? One on 10/17/2016 and 01/01/2013. This represents two HUD Entry Assessments completed on Bob. These HUD Entry Assessments could represent two stays in ...

Read and Write CSVs in R

R let's us work with HMIS data, but there is still the problem of how to load the data into R from a source. R is actually pretty neat regarding data importing. We can load data from a website, SQL database, text file, Excel file, or CSV.

When working with HMIS data the two most common types of data source are Excel and CSV. Therefore, it's going to pay to spend a little time on how to bring those files into R.

Importing CSVs into R

CSV stand for Comma Separated Value format. It is a near ancient file format ...

Working with Dates in SQLdf

Mixing R and SQL is powerful. One of the easiest ways to implement this combination is with the R library SQLdf.

If TL;DR, skip to Coerce Date Types into Strings before Passing to SQLdf at bottom.

SQLdf

The power of SQLdf comes from its ability to convert dataframes into SQLite databases on the fly. To the user, it doesn't appear like anything special is going on, but under the hood R is working together with a SQLite client to create a table which can be queried and manipulated with ANSI SQL calls.

For example,

dataFrame1 <- read.csv(pathToData)
library ...

Creating a List of Domestic Violence Victims

In this first work challenge you will use R and SQL to get a by-name-list of those who are domestic violence victims from an HMIS data pull (5.1).

Data Needed

The HMIS Data Pulls are simply a relational database which are broken into multiple CSVs. These CSVs will change in formatting, as stipulated by HUD. The current version of these CSVs is 5.1. For this work challenge the focus will be on two CSVs.

  1. Client.csv
  2. HealthAndDV.csv

The Client file will contain one row per client and only one row ...

The R and SQL Way

Below is my attempt to describe the method I use to get at HMIS data. In short, I'm mixing two powerful data languages to get answers from a data set quickly.

What is SQL?

SQL stands for Structured Query Language. Which can be translated as, "Asking a question a computer could understand." This computer language was designed to get data off a remote relational database .

A sample of what SQL looks like:

SELECT Data FROM DataSet WHERE Data='BlueEyes'

SQL Pros:

  • Easy to understand
  • Can be learned quickly
  • Powerful in merging data sets

SQL Cons ...

Bluetooth Low Energy in JavaScript

For a long time now I've put off learning JavaScript. It really never interested me. I'd like to say it was the thought, "Oh, JavaScript is for web developers and I'm embedded all the way!" But that wasn't really it. I think it hasn't appealed to me because I couldn't connect it to hardware. Well, at least, that was my assumption.

However, I've recently discovered Google's Web APIs. Specifically, their Bluetooth Low Energy API.

It's pretty amazing. It allows a developer to write asynchronous JavaScript using Promises to get into the hardware of the client's PC ...