Ignorance is Bliss

In June of 2015 I became a Homeless Management Information System Administrator. Going into the job I had no idea what was to be done. I’d been working as a homeless street-outreach specialist for MHMR of Tarrant County for several years before. The reason I landed the job, I think, is I was tech savvy, something rare in the social service world, but more on that later.

I’d become tech savvy because working a street outreach specialist one sees a lot of bad. A lot It will leave scares in your psyche if you are not vigilant to guard against those bad scenes replaying in your head. I found if I filled my head with something complex there was no room for the dissonance created by being helpless to aid the 17-year-old heroine addict and future mother under the unfinished bridge off of I-30. So, I took up robotics. It worked well.

Anyway, I when I started as an HMIS System Administrator I was clueless. Looking back, I’d wished there was someone who was around to teach me. Most of the skills needed are esoteric, and few on-line resources exist. Well, as I step away from the desk, I’m going to do my best to write down everything I learned for my replacement–of course, it’ll be laden with opinion–but hopefully, it’ll provide a spring-board into your new job.

Brand new HMIS System Administrator, this is for you, as you start your new job. The best job in the world.

Get In the Weeds

Hey, by now, you’ve been to a few meetings and you know a few things about the data. A word of warning, don’t let others tell you, “Hey, that’s ‘too in the weeds’ for this conversation.” Bull.

Often, your boss or peer will be saying it with good intent. They want to make sure the content of what your agency is providing to a partner agency or funder is digestable in one meeting, and often, the critique is fair. But don’t let it become the only rule you live by. The weeds are necessary for several reasons.

First, if you are never allowed to talk about the mechanics of your job then then you will not have the vocabulary, analogies, and metaphors worked out to talk about critical system issues when it is necessary to vet HMIS decisions with those outside your office.

Secondly, others will not be primed for a conversation about something they have never heard about until it matters.

In short, getting into the weeds of how an HMIS works is needed for more reasons than there are to prevent going into them.

One thing I’ll encourage you do shortly after you start. Cold-call all the HMIS system administrators in your state. Introduce yourself, ask if they would be willing to chat with you when needing to discuss HMIS stuff no one else will listen to. I’ve found this to be the absolutely most critical piece of advice I can give.

This job is great. But you’ll have no friends. No one will understand you. And no one will want to talk about how HUD has changed the chronically homeless definition yet again–except, other HMIS system admins. When you first make contact with another HMIS Admin, after being in the job a year, it will be as if you’ve discovered a neighboring isle next to the one you’ve been on–and that isle has another human to talk to!

Data Quality is Key

Three months into my job I realized we had to do something about our data quality. We had two major cities and the county both complaining about how reports coming from our HMIS were not reliable. One month they’d produce an “accurate” account of who was in a program the municipalities were funding, then, the next month they would be completely off. (Of course, the municipalities knew they were off because the agencies funded were keeping a separate set of books–more on that later)

We had to do something. To be honest–oh yah, and always be honest–we didn’t have a clue what our system data quality was like. There was data quality detection system in place to determine if it was good or bad.

Luckily, our software vendor had an HMIS data error report which would list out HUD data errors of clients active in any program. Without a better solution, I pulled this report for every program and aggregated the data errors.

No one will trust reports without good data Show Tree Graph

DTR

Find Your Tools

R, SQL, Tablaeu

Power of the Purse

ESG Funders COC Funder

Discourage Separate Sets of Book

Data Quality Goes Down

Try to be HMIS Software Independent

You can only love something if you don’t need it

Departmental Checks and Balances

Don’t scrub data, ever, not even once – don’t do it… seriously

If you give a mouse a cookie

Advocate for everyone to Create Reports

Taking care of the date = cook Pulling reports for you = your cook chewing your food

Get a Help Desk

Automate Everything

Give them What they Need Always, and What they Want, when Possible

Understand How You’re Funded

HUD funds you – they are your boss, kinda’, but your other boss if who provides match for those funds.

Always Be Honest

Eminence Based vs. Data Based Decision Making

##################################################
# Create Occupancy Trends for Emergency Shelters #
# Rapid Rehousing, and Permanet Housing programs #
# by ProjectName and OrganizationName.           #
##################################################
trendsOfOccupancyByProjectAndOrganization <- function(allDataPath, 
                                                      outputFolder, 
                                                      interval = "week", 
                                                      startDate = "2014-01-01"){
  
  library(plyr)
  
  client <- loadClient(allDataPath)
  
  primaryPersonalIDs <- getPrimaryPersonalID(client)
  primaryPersonalIDs <- sqldf("
                            SELECT
                              PrimaryPersonalID,
                              PersonalID 
                            FROM
                              primaryPersonalIDs")
  
  client <- sqldf("
                  SELECT
                    a.PrimaryPersonalID,
                    b.* 
                  FROM
                    primaryPersonalIDs a 
                  LEFT JOIN
                    client b 
                      ON a.PersonalID=b.PersonalID")
  
  client <- within(client, rm(PersonalID))
  colnames(client)[1] <- "PersonalID"
  client <- unique(client)
  
  enrollment <- loadEnrollment(allDataPath)
  enrollment <- sqldf("SELECT
                        a.PrimaryPersonalID,
                        b.* 
                      FROM
                        primaryPersonalIDs a 
                      LEFT JOIN
                        enrollment b 
                          ON a.PersonalID=b.PersonalID")
  
  enrollment <- within(enrollment, rm(PersonalID))
  colnames(enrollment)[1] <- "PersonalID"
  enrollment <- unique(enrollment)
  
  exit <- loadExit(allDataPath)
  exit <- sqldf("SELECT 
                  a.PrimaryPersonalID, 
                  b.* 
                FROM 
                  primaryPersonalIDs a 
                LEFT JOIN 
                  exit b 
                    ON a.PersonalID=b.PersonalID")
  
  exit <- within(exit, rm(PersonalID))
  colnames(exit)[1] <- "PersonalID"
  exit <- unique(exit)
  
  project <- loadProject(allDataPath)
  inventory <- loadInventory(allDataPath)

  organization <- loadOrganization(allDataPath)
  
  # Add all bed inventories into one (HH without children, HH with children, and HH of children only)
  inventory <- sqldf("
                      SELECT
                        ProjectID,
                        SUM(BedInventory) As 'BedInventory' 
                      FROM
                        inventory 
                      GROUP BY
                        ProjectID
                     ")
  
  allData <- sqldf("
                    SELECT  
                      DISTINCT a.PersonalID, 
                      a.EnrollmentID, 
                      c.ProjectName, 
                      a.EntryDate, 
                      b.ExitDate, 
                      c.ProjectType, 
                      d.BedInventory, 
                      e.OrganizationName
                    FROM 
                      enrollment a
                    LEFT JOIN 
                        exit b
                          ON a.EnrollmentID=b.EnrollmentID
                    LEFT JOIN 
                        project c
                          ON a.ProjectID=c.ProjectID
                    LEFT JOIN 
                        inventory d
                          ON c.ProjectID=d.ProjectID
                    LEFT JOIN 
                        organization e
                          ON c.OrganizationID=e.OrganizationID
                    WHERE RelationshipToHoH != 'NA'")
  
  remove(client, enrollment, exit, project, primaryPersonalIDs)
  
  # Gets max and min date
  bfr <- sqldf("
                SELECT 
                  MIN(EntryDate) As MinimumDate
                FROM 
                  allData
               ")
  
  min_date <- ""
  if(startDate == ""){
    min_date <- as.character(bfr[1,1])  
  } else {
    min_date <- startDate
  }
  bfr <- sqldf("
                SELECT 
                  MAX(EntryDate) As MaximumDate 
               FROM 
                  allData
               ")

  max_date <- as.character(bfr)
  
  intervalConstant <- switch(interval,
                             week = 7,
                             month = 30,
                             quarter = 120)
  
  numberOfIntervals <- switch(interval,
                              week = as.integer(getWeeksBetween(min_date, max_date)),
                              month = as.integer(getMonthsBetween(min_date, max_date)),
                              quarter = as.integer(getQuartersBetween(min_date, max_date)))
  
  allData$EntryDate <- as.Date(allData$EntryDate)
  allData$ExitDate <- as.Date(allData$ExitDate)

  projectTypeList <- unique(allData$ProjectType[!is.na(allData$ProjectType)])

  # Calculate occupancy by ProjectName
  for(projectType in projectTypeList) {
    
    projectTypeName <- as.character((makeProjectTypeReadableVector(list(projectType))))
    
    thisProjectTypeData <- allData[allData$ProjectType == projectType,]
    
    # Inialize dataframe with all ProjectNames
    projectEnrollmentsTrend <- unique(data.frame(thisProjectTypeData$ProjectName))
    colnames(projectEnrollmentsTrend)[1] <- "ProjectName"
    
    # Attach project bed data 
    thisProjectBedData <- unique(data.frame(allData$ProjectName, allData$BedInventory))
    colnames(thisProjectBedData)[1] <- "ProjectName"
    colnames(thisProjectBedData)[2] <- "BedInventory"
    
    for(i in 0:numberOfIntervals) {
      intervalStartDate <- as.Date(min_date) + i * intervalConstant
      intervalEndDate <- as.Date(min_date) + (i + 1) * intervalConstant
      
      activeEnrollment <- subset(thisProjectTypeData, 
                                 EntryDate <= as.Date(intervalStartDate) &
                                   (ExitDate >= as.Date(intervalEndDate) |
                                      is.na(ExitDate)))
      
      projectCount <- count(activeEnrollment, ProjectName)
      colnames(projectCount)[2] <- as.character(intervalStartDate)
    
      thisCountWithBeds <- merge(x = projectCount, y = thisProjectBedData, by = "ProjectName", all.x = TRUE)
      thisCountWithBeds$OccupancyPercentage <- round(thisCountWithBeds[,2] / thisCountWithBeds[,3], digits = 4)
      projectCount <- data.frame(thisCountWithBeds$ProjectName, thisCountWithBeds$OccupancyPercentage)
      colnames(projectCount)[1] <- "ProjectName"
      colnames(projectCount)[2] <- "Occupancy"
      
      averageDf <- sqldf("
                         SELECT 
                            'Average' As 'ProjectName',
                            AVG(Occupancy) As 'Occupancy'
                         FROM 
                            projectCount
                         ")
      
      projectCount <- rbind(projectCount, averageDf)
      colnames(projectCount)[2] <- as.character(intervalStartDate)

      projectEnrollmentsTrend <- merge(x = projectEnrollmentsTrend, y = projectCount, by = "ProjectName", all.x = TRUE)
    }

    tmpColMeans <- numcolwise(mean, na.rm = TRUE)(projectEnrollmentsTrend)
    tmpColMeans$ProjectName <- "Average"
    projectEnrollmentsTrend <- rbind(projectEnrollmentsTrend, tmpColMeans)

    projectEnrollmentsTrend <- t(projectEnrollmentsTrend)
    colnames(projectEnrollmentsTrend) <- projectEnrollmentsTrend[1,]
    projectEnrollmentsTrend <- projectEnrollmentsTrend[-1,]
    
    write.csv(projectEnrollmentsTrend, 
              paste(outputPublicPath, 
                    "/ProjectsEnrollmentsTrend_ProjectType_", 
                    projectTypeName, 
                    ".csv", 
                    sep=""), 
              na = "", row.names = TRUE) 
  }


  # Bed occupancy by OrganizationName 
  for(projectType in projectTypeList) {
    
    projectTypeName <- as.character((makeProjectTypeReadableVector(list(projectType))))
    
    thisOrganizationData <- allData[allData$ProjectType == projectType,]
    organizationEnrollmentsTrend <- unique(data.frame(thisOrganizationData$OrganizationName))
    colnames(organizationEnrollmentsTrend)[1] <- "OrganizationName"
    
    thisOrganizationBedData <- unique(data.frame(allData$OrganizationName, allData$BedInventory))
    colnames(thisOrganizationBedData)[1] <- "OrganizationName"
    colnames(thisOrganizationBedData)[2] <- "BedInventory"
    
    for(i in 0:numberOfIntervals) {
      intervalStartDate <- as.Date(min_date) + i * intervalConstant
      intervalEndDate <- as.Date(min_date) + (i + 1) * intervalConstant
      
      activeEnrollment <- subset(thisOrganizationData, 
                                 EntryDate <= as.Date(intervalStartDate) &
                                   (ExitDate >= as.Date(intervalEndDate) |
                                      is.na(ExitDate)))
      
      projectCount <- count(activeEnrollment, OrganizationName)
      colnames(projectCount)[2] <- as.character(intervalStartDate)
    
      thisCountWithBeds <- merge(x = projectCount, y = thisOrganizationBedData, by = "OrganizationName", all.x = TRUE)
      thisCountWithBeds$OccupancyPercentage <- round(thisCountWithBeds[,2] / thisCountWithBeds[,3], digits = 4)
      projectCount <- data.frame(thisCountWithBeds$OrganizationName, thisCountWithBeds$OccupancyPercentage)
      colnames(projectCount)[1] <- "OrganizationName"
      colnames(projectCount)[2] <- "Occupancy"

      projectCount <- sqldf("SELECT 
                              OrganizationName, 
                              AVG(Occupancy) As 'Occupancy'
                            FROM 
                              projectCount 
                            GROUP BY 
                              OrganizationName")
      
      colnames(projectCount)[2] <- as.character(intervalStartDate)

      organizationEnrollmentsTrend <- merge(x = organizationEnrollmentsTrend, y = projectCount, by = "OrganizationName", all.x = TRUE)
    }

    tmpColMeans <- numcolwise(mean, na.rm = TRUE)(organizationEnrollmentsTrend)
    tmpColMeans$OrganizationName <- "Average"
    organizationEnrollmentsTrend <- rbind(organizationEnrollmentsTrend, tmpColMeans)

    organizationEnrollmentsTrend <- t(organizationEnrollmentsTrend)
    colnames(organizationEnrollmentsTrend) <- organizationEnrollmentsTrend[1,]
    organizationEnrollmentsTrend <- organizationEnrollmentsTrend[-1,]
    
    write.csv(organizationEnrollmentsTrend, paste(outputPublicPath, "/OrganizationEnrollmentsTrend_ProjectType_", projectTypeName, ".csv", sep=""), na = "", row.names = TRUE) 
  }
}

trendsOfOccupancyByProjectAndOrganization(allDataPath, outputPublicPath)