Ladvien's Lab

Latest Posts

JPS DSRIP Report V2.0

JPS DSRIP Report V2.0

options(java.parameters = "-Xmx14336m")  ## memory set to 14 GB
library("XLConnect")
library("sqldf")
library("tcltk")

startDate <- "2015-10-01"
endDate <- "2016-09-30"

df <- readWorksheetFromFile("JPS_Raw_Data.xlsx", sheet = 1, startRow = 2)

#sampleVector <- sample(1:nrow(df), 30000)
#df2 <- df[sampleVector,]

#write.csv(df2, file="Sample of JPS_Raw_Data (30000).csv", na="")

#df3 <- read.csv("Sample of JPS_Raw_Data (30000).csv")

### Formatting ###################################
df3[is.na(df3)] <- ""
df3$Participant.Enterprise.Identifier <- gsub("-", "", df3$Participant.Enterprise.Identifier)
colnames(df3)[2] <- "peid"
colnames(df3)[5] <- "CaseNumber"
colnames(df3)[7] <- "Gender"
colnames(df3)[8] <- "Race"
colnames(df3)[9] <- "Ethnicity"
colnames(df3)[10] <- "ProgramName"
colnames(df3)[11 ...
Identifying Chronically Homeless and Veteran Participants throughout a COC

This is my attempt to write SQL against the HMIS 5.1 CSVs.  It includes:

  1. Identifying Chronically Homeless (CHP) Participants enterprise Wide
  2. Identifying Veterans (Vets) enterprise wide
  3. Sorting CHPs and Vets to identify those who've exited the literal homelessness and where they went.
  4. Sorting CHPs and Vets to identify those are still in the literal homelessness
  5. Filtering to Active Participants in Projects using Entry / Exit
  6. Filtering to Active Participants in Projects using NBN
  7. Getting total NBN stays by participant

To actualy get anything done through writing SQL against these CSVs, one will need the HMIS Vendor CSV Specifications

Current HMIS ...

Sampling Large Data

This R function allows sampling of a dataframe.  This is helpful when writing a script which will be used against a large dataframe, however, writing the script is iterative.  Sampling allows the overall reduction in time of testing iterations, without losing the validity of realistic results.

    options(java.parameters = "-Xmx14336m")  ## memory set to 14 GB
    library("sqldf")
    library("XLConnect")
    library("tcltk")

    df <- readWorksheetFromFile("Data_X.xlsx", sheet = 1, startRow = 1)

    sampleVector <- sample(1:nrow(df), 30000)
    df2 <- df[sampleVector,]

    write.csv(df2, file="Sample of Data_X (30000).csv", na="")

Lumi3

This project is meant as stepping stone to implement Lumi2.  The Lumi projects I've been working on are over-the-air uploaders of Arduino / AVR programs to Atmega and Atiny chips which are programmed with the TinySafeBoot bootloader.  The goal is to allow the user to select either WiFi or Bluetooth, create a connection to either an ESP8266 or HM-1X device, and upload whatever program to an AVR connected to the wireless receiving device.

The last iteration of Lumi was written in Windows Universal Apps SDK.  Unfortunately, the code-base turned into spaghetti.  I've diagnosed the issues to be due to God-modules ...

Splitting Program Data

options(java.parameters = "-Xmx14336m")  ## memory set to 14 GB
library("sqldf")
library("XLConnect")
library("tcltk")

# 1. Load all PoS 
# 2. Load all NBN stays
# 3. Find all PEID's who have had a PoS in the last year
# 4. Find all PEID's of those who have stayed in a NBN bed in the last year
# 5. Append the lists and get a distinct set of PEIDs
# 6. Load family and individual demographic information
# 7. Inner join the demographic data to the distinct PEID set.
# 8. Write all information to file.
# 9. Pray.


allNBN <- readWorksheetFromFile("All TCES NBN Bed Data ...
Robber Board

The Robber Board

This board originated with a request from an LPC who was practicing EMDR and wanted to upgrade his feedback machine.  He had requested it be wireless and provide both haptic and visual feedback.  The whole thing fell apart when he sent me an NDA which seemed typical of real professional level projects.  However, it would have prohibited me from sharing anything I discovered, and given the amount I could have made from it, well, it just wasn't worth it.  I thought I'd finish the project and share with everyone, since sharing is the greatest form of payment ...

R Function to Split CSVs

This is an R function written to split a dataset into particular sized sets, then write them as a CSV.  Often, our office is need a quick way to split files for uploading purposes, since our HMIS software doesn't handle large uploads well.

For example:

splitDataAndWriteFiles(df, 500, "My_Data")  

Will produce X number of files named "My_data_X.csv"

options(java.parameters = "-Xmx14336m")  ## memory set to 14 GB
library("XLConnect")

# Function to split files.
splitDataAndWriteFiles <- function(df, chunkSize, nameOfFiles) {
  success <- FALSE
  count <- 0
  while (!success) {
    # If you want 20 samples, put any range of 20 values within the range of number ...
Shaping and Combining HMIS Data from ETO

Continuing to explore R and SQL's usefulness when it comes to HMIS data I decided to start posting HMIS problems and R and SQL solutions.

Problem:  Our HMIS implementation has had three shelters entering data into one partition.  This has been a lot like mixing three different colors of sand into one bucket--much easier to put in then sort out.  It is also a problem since HUD requires Emergency Solution Grant (ESG) recipients to provide an annual standardized report, known as the CAPER, on data defined by the HMIS Data Dictionary . These data elements are referred to as Universal Data ...

Attachment III, aka, The Zombie

Continuing to explore R and SQL's usefulness when it comes to HMIS data I decided to start posting HMIS problems and R and SQL solutions.

Problem: One of our Emergency Solutions Grant (ESG) funders requires the subrecipients to produce a report of all the participants which receive services from the shelter.  This requirement was written into the contract the ESG funders have with the subrecipient.  This worked somewhat in 2011 when it was implemented, however, it is 2016 and the data which is queried against to produce the report is well over 1.3 million entries.  These data are generated ...

HMIS, R, and SQL -- Basics

Hacker Introduction

I'm a hacker .  If you find errors, please leave comments below.  If you have an opinion I'll hear it, but I'm often not likely to agree without some argument.

Joins (Merging Data)

Probably the best part of R and SQL is their ability to quickly combine data around a key.  For example, in HMIS CSVs the Client.csv contains a lot of demographic information and the Enrollment.csv contains a lot of assessment information.  This makes it difficult when needing a count of the total participants who are veterans and disabled, since the veteran information is in Client ...