Continuing to explore the usefulness of R and SQL for HMIS data, I've decided to start posting HMIS problems along with R and SQL solutions.
Problem: Our HMIS implementation has had three shelters entering data into one partition. This has been a lot like mixing three different colors of sand into one bucket: much easier to pour in than to sort back out. It is also a problem because HUD requires Emergency Solutions Grant (ESG) recipients to provide an annual standardized report, known as the CAPER, built on data elements defined by the HMIS Data Dictionary. These data elements are referred to as the Universal Data Elements. With this mixed bucket, responsibility for data errors becomes a big issue. The CAPER allows at most 25% missing data, which makes data quality important. As for data repair, it should be completed by the agency that created the error, which makes communicating data issues imperative.
Unfortunately, when data from several agencies is mixed, creating an error report is problematic, at least for the HMIS software our continuum of care utilizes. The data quality reports our HMIS office produces lump all errors together. This leads to social loafing among the agencies and, in turn, few data repairs.
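To make the gap concrete, here is a minimal R sketch of the kind of per-agency missing-data report the stock reports don't provide. The column names (`Agency`, and whichever Universal Data Element you pass in) are assumptions for illustration, not the actual export headers:

```r
# Hypothetical per-agency error report: percent of records missing a
# given data element, split out by the agency that entered the record.
missing_rate_by_agency <- function(df, element) {
  rates <- tapply(df[[element]], df$Agency,
                  function(x) mean(is.na(x) | x == ""))
  round(rates * 100, 1)  # percent missing, per agency
}
```

A report like this assigns each agency its own error rate, so no one can assume someone else will make the repairs.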
Solution: The solution seems to be to sort the data back out, re-assigning it to the respective agencies. This would allow data quality reports to assign responsibility for repairs. Currently, our COC uses Social Solutions' ETO software to manage our HMIS. The process of moving the data consists of the following steps:
Determine all data which needs to be migrated. For us, this is Demographic, HUD Assessment, and Bed Stay data.
Export these data sets.
Sort the data sets to respective agencies.
Import the data using a predefined template.
This article focuses on the third step. The data has been exported, but how to sort it?
Below is a script written to take a flat file of HUD Assessments and
Filter to the respective program
Filter HUD Assessments to just Project Entry
Repair the COC code (e.g., “tx601” -> “TX-601”)
Re-assign the data to the proper Site (agency’s data) and Program.
Chop data into sets of no more than 500 rows, making the import process easier
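The steps above can be sketched in base R. This is a minimal outline, not the full script: the column names (`ProgramName`, `AssessmentType`, `COCCode`) are assumptions standing in for the actual headers of the ETO export.

```r
# Sketch: sort a combined HUD Assessment export back out by program,
# repair the COC code, and chop the result into import-sized chunks.
sort_assessments <- function(df, program) {
  # Keep only the given program's Project Entry assessments
  out <- df[df$ProgramName == program &
            df$AssessmentType == "Project Entry", ]
  # Repair the COC code, e.g. "tx601" -> "TX-601"
  out$COCCode <- toupper(sub("^([A-Za-z]{2})-?([0-9]{3})$",
                             "\\1-\\2", out$COCCode))
  out
}

# Split a data frame into pieces of at most 500 rows each,
# so each piece can be imported separately.
chunk_rows <- function(df, size = 500) {
  split(df, ceiling(seq_len(nrow(df)) / size))
}
```

Each chunk returned by `chunk_rows()` can then be written out with `write.csv()` and loaded through the predefined import template, with the Site and Program set on the import side.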