Creating Reusable Code
Writing report code which can be reused is critical to being an effective report specialist. By now, hopefully, you see the power of SQL-R, especially around HMIS data. But you may still feel slow. Or have thoughts like, "If I pulled these data into Excel I could manually filter them in 1/10th the time." That's probably true. But, after manually filtering dataset after dataset it becomes apparent finding a way to automate some tasks would save many hours in the long-run. Thus, writing an R scripts for routine work would save countless hours of monotony.
However, one problem remains, each task will usually have a slight variation from the one before it. This causes you to write 95% of the same code with a slight tweak for the current project. And that doesn't save time at all. In the programming world, the 95% code which is the same is known as bolierplate code .
Ok, that's the problem. The solution? Functions.
A function is nothing more than a section of code you save into a variable for easy reuse.
Defining a function looks like this:
myNewFunction <- function(){
# Code you want to run goes here.
}
Then, whenever you want to use this code it can be called like this:
myNewFunction()
If you want to pass the function something to use:
myNewFunction <- function(clientDf){
clientDf$VeteranStatus
}
clientDf <- read.csv(clientCsvPath)
myNewFunction(clientDf)
And the coolest thing about functions is being able to
return
data. Functions return whatever data is on the last line of the function. This can be a tricky concept, but at its root it is simple.
Here, the
clientDf
will be returned.
myNewFunction <- function(clientDf){
clientDf$VeteranStatus[clientDf$VeteranStatus == "1"]
clientDf
}
clientDf <- read.csv(clientCsvPath)
veteranList <- myNewFunction(clientDf)
The result is then passed back out of the function, where it can be assigned to a new variable.
You may notice, this is similar to a lot of code we have been using. Like
read.csv
. That's because
read.csv
is a function written by the makers of R and included for our use.
clientDf <- read.csv(clientCsvPath)
This is how R has become powerful tool. Many smart people have written sets of functions, which are called libraries. Feel the power of open-source.
Time to give back to community and write some of our own functions
Data Needed
For this work challenge you will need:
- Client.csv
- Enrollment.csv
- Project.csv
- Exit.csv
The Goal
Write functions which will do the following:
-
Join
clientDf
,enrollmentDf
,projectDf
,exitDf
and return the combined dataframe. - Make the following columns readable:
- Gender
- VeteranStatus
- DisablingCondition
- RelationshipToHoH
- ResidencePriorLengthOfStay
- LOSUnderThreshold
- PreviousStreetESSH
- TimesHomelessPastThreeYears
- MonthsHomelessPastThreeYears
- Destination
- Get most recent HUD Assessment per PersonalID
- Filter to clients who are active in programs (except Night-by-Night and Street Outreach projects)
-
Write a function to filter
enrollmentDf
based upon a user defined parameter.
BONUS * Write a function which returns a list of Chronically Homeless individuals.
For the last function, here's an example,
clientsWithDisablingCondition <- getSubpopulation(df, "DisablingCondition", "Yes")
The function you'd write would be
getSubpopulation()
. The first parameter would be the dataframe the user is passing into your function. Second parameter is the column to look at. The last is which response the user wants in the column to look in.
The Resources
Below are the resources which should help for each step:
- R Programming A-Z -- Video 21 -- Functions in R
- paste()