Ladvien's Lab

Latest Posts

Creating a List of Domestic Violence Victims

In this first work challenge you will use R and SQL to get a by-name-list of those who are domestic violence victims from an HMIS data pull (5.1).

Data Needed

The HMIS Data Pulls are simply a relational database which are broken into multiple CSVs. These CSVs will change in formatting, as stipulated by HUD. The current version of these CSVs is 5.1. For this work challenge the focus will be on two CSVs.

  1. Client.csv
  2. HealthAndDV.csv

The Client file will contain one row per client and only one row. Each row will contain most of all the Client's demographic information.

The HealthAndDV file will contain a row for every HUD Entry Assessment completed on the participant. Each entry will contain general health information as well the client's domestic violence information, which the client reported during the HUD Entry Assessment.

What ties the two files together is the PersonalID column. This ID is meant to be the grand-mamma of all client IDs. It is 32 characters long and contain both numbers and letter:

B7YIOJIGF9CDP6FV7TANQXLMQRMBTVTB

(note, this ID is called the "Enterprise ID" in our HMIS software)

Both the Client and HealthAndDV contain the PersonalID column. This often referred to as a key when dealing with relational databases. This unique ID in both files will allow us to merge the two data sets together using something called joins .

Now, in the Client.csv the information is readable to us humans. There will be the FirstName , LastName , SSN columns and many more. But in the HealthAndDV.csv the information is trickier. For this challenge we are going to focus on one column DomesticViolenceVictim . When you open the data set you may notice that instead of "Yes" or "No" answers in the columns you will see "1" or "0". That's because computers understand the 1 and 0 much quicker than Yes or No.

This is an important side note for managing data. Make the databases easy for computers to understand and report generation will be much faster. You have to think, if using a 1 instead of Yes could save 0.5 seconds on a calculation, then when you have a dataset which contains 1000 records you just saved 500 seconds 8.3 seconds. Now, multiply by 1,700,000 records. Well, you get the picture.

Ok. Back to the problem at hand. Just know "1" is equal to "Yes" and "0" is equal to "No". So, for this challenge, we will want to find all the clients who have a "1" in the DomesticViolenceVictim column

The Goal

We are going to merge the two data sets and to discover the following:

  1. A list of clients who are victims of domestic violence.
  2. A count of how many clients are fleeing domestic violence.

Really, the second piece of information is counting how many people are in the list of those who are victims.

To get this information we will need to do the following:

  1. Load the Client.csv and HealthAnd.csv
  2. Filter the HealthAndDV dataset to the most recent according to the column DateCreated
  3. Join (merge) the dataframes where their PersonalID are the same
  4. Filter the merged dataframe to those who've reported 1 in DomesticViolenceVictim
  5. Write this data to a file.
  6. Use a function to count how many participants are in this victim list.

The Resources

Below are the resources which should help you understand each step of the process.

Step 1

  • R Programming A-Z -- Video 41 -- Loading and Importing Data in R
  • R Programming A-Z -- Video 21 -- Functions in R

Step 2

Step 3

  • The Complete SQL Bootcamp -- All Videos in Section 8

Step 4 --

Step 5 --

Step 6 --

  • The Complete SQL Bootcamp -- All Videos in Section 5
  • The Complete SQL Bootcamp -- All Videos in Section 6

The R and SQL Way

Below is my attempt to describe the method I use to get at HMIS data. In short, I'm mixing two powerful data languages to get answers from a data set quickly.

What is SQL?

SQL stands for Structured Query Language. Which can be translated as, "Asking a question a computer could understand." This computer language was designed to get data off a remote relational database .

A sample of what SQL looks like:

SELECT Data FROM DataSet WHERE Data='BlueEyes'

SQL Pros:

  • Easy to understand
  • Can be learned quickly
  • Powerful in merging data sets

SQL Cons: * No advanced functions * Does not allow for additional features to be added by users

Would you like to know more?

What is R?

R is a programming language for super nerds. Often, it is used by those working in:

  • Bio-statistic (genetic sequencing)
  • High-level economics
  • Big-data (think all data collected by Facebook)
  • Machine learning

It is extremely fast computationally. The people who designed it wanted something which could crunch big data sets, on their computer, in a matter of minutes.

One of the neatest parts of R is it allows for user written plugins. Want to translate a data set of ZIP codes into their corresponding GPS coordinateds. Their's a plugin for that! And did I mention it's all free!

Would you like to know more?

Mix R and SQL? But why...

Think of our approach to data as Spanglish. We are going to take two major programming languages to get the best of both. When we don't have the exact word to describe something, we will switch to a different language and use its vocabulary.

R is a powerful tool, but often, the syntax is boggish. It is hard to read and figure out what’s going on. SQL on the other hand, it’s pretty simple. And humans have an easier time reading it.

Personally, I'm looking to solve problems as quickly as possible and I’ve found by mixing the two I get to solutions quick. If I'm having a hard time getting Google to tell me how to do something in SQL, I'll switch and see if Google know's how to do it in R. Thus, getting to what I need do much quicker. And let's not fool ourselves--learning a programming language is learning how to Google well.

A second reason to mix SQL is about respect and marketability. R seems to be gaining ground in a lot of data sciences, and seems to be the tool when it comes to economics and statistics, however, most data exchanges have SQL at their heart. Therefore, when I can use my work as an excuse to develop a marketable skill, I’m going to do it.

A third reason to mix the two, most often the servers where our data calculations are being done are extremely limited in their hardware. This results in calculations taking much longer on a server than they would on or personal PC. For example, a query which would take 45 minutes on our HMIS software vendor's server takes around 30 seconds on my personal PC.

A fourth reason, by completing our calculations on our personal computers it reduces the number of times client level information would need to be transmitted back-and-forth from the server. Of course, this doesn't mean the data is more secure, it simply means the opportunities for more secure data are in our hands. This method makes follow proper PC security practices much more important (encrypting hard-drives, not sharing passwords, etc) Localization of data.

Bluetooth Low Energy in JavaScript

For a long time now I've put off learning JavaScript. It really never interested me. I'd like to say it was the thought, "Oh, JavaScript is for web developers and I'm embedded all the way!" But that wasn't really it. I think it hasn't appealed to me because I couldn't connect it to hardware. Well, at least, that was my assumption.

However, I've recently discovered Google's Web APIs. Specifically, their Bluetooth Low Energy API.

It's pretty amazing. It allows a developer to write asynchronous JavaScript using Promises to get into the hardware of the client's PC, all from the browser!

Now, this might sound like it open for security issues--and perhaps it will be. But there are two requirements Google has put in place which hopefully gets around any issues. First, the API can only be called by action. Secondly, the API can only be called from a secured connection (HTTP over SSL).

Ok, there are few other downers to talk about. First this only works in Chrome--but given this is a Google API, well, duh . The other is not all OSes are currently supported. The following I've tested and work right out of the box:

  • Mac OS
  • Android

The others which are supposed to be supported but I've not tested:

  • Linux
  • Windows (with some work)
  • Chromium

Having worked with Bluetooth LE on all of these OSes I can say there is hope for Windows. In fact, I think with the Creator's Update the Microsoft folk opened up the last needed ingredient . The real hold out will be iOS. Apple is not a fan of browser apps. They would much rather browsing be done inside their native apps. If I'm being positive, I'd say this is so Apple can make sure the mobile UX is excellent, and by forcing native apps, they have control by app approval. If I'm being negative, well, Apple takes 30% on app purchases and web apps land them nada. Just my opinion.

If you'd like to stay up to date on compatibility of BLE in the browser there is a an implementation status page on the Web Bluetooth Community Group:

Sadly, right now iOS is the loser.

Moving into the fun part. Below is how to kick things off.

To begin, it will pay to keep the Mozilla Developer Netowork's Web Bluetooth API open for reference.

The documentation is actually pretty robust--and with this guide, the process of subscribing to a device characteristic should be pretty straight forward.

The first piece we need are service IDs to search for.

let optionalServices = document.getElementById('optionalServices').value
    .split(/, ?/).map(s => s.startsWith('0x') ? parseInt(s) : s)
    .filter(s => s && BluetoothUUID.getService);

This takes the text element of the DOM element 'optionalServices', which should be in the in 16 bit hex format, 0x0000. This becomes one of the service IDs searched in the Bluetooth LE search cycle. For the Bluetooth module HM-10, HM-11, HM-16, HM-17 the service ID is 0xFFE0.

Moving on to the search, when the code below is executed the Chrome browser should show a search and pair menu (see image) for pairing a device. When a device has been paired the promise will resolve returning the device which has been paired.

navigator.bluetooth.requestDevice({
        acceptAllDevices: true,
        optionalServices: optionalServices
    })

It is important to note this block must be called by a user action. For example, if set to execute on page load it will refuse to fire. But if called onClick then it will show. This is meant to provide more security to the API.

As stated, the requestDevice will return a device. Using the promise .then we can begin working with the BluetoothDevice

Which is returned after it has been paired by the user. The BluetoothDevice object has three items of interest.

  • name -- which provides the string name of the device
  • id -- the ID string
  • gatt -- a gatt which contains a reference to the BluetoothRemoteGATTServer object

The BluetoothRemoteGATTServer interface contains many of the methods needed to interact with the Bluetooth device. For example,

device.gatt.connect()

Attempts to asynchronously create a connection with the device through a Promise. If .then is attached then the method will return a service object if succesful. If you are just looking to get something done with Bluetooth, feel free to keep hacking through this article (that's what I'd do--TL;DR). However, if you want to know more about Bluetooth 4 protocol here a few useful links:

Back to the code.

.then(device => {
    pairedDevices[device.name] = device;
    return device.gatt.connect();
}).then

Once the connection attempt has been made and returned succesful, the BluetoothRemoteGATTServer object returned can be petitioned for a list of services.

....
    return device.gatt.connect();
})
.then(server => {
    return server.getPrimaryServices();
})

This will fire asynchronously using promises, and if succesful, return a BluetoothRemoteGATTService object. This represents all the services the device has public. Then, the returned service object may be iterated over to identify get characteristics of the device. (Almost to the data, I swear).

....
return server.getPrimaryServices();
    })
    .then(services => {
    services.forEach(service => {

Essentially, the BluetoothRemoteGATTService object is merely an array containing on the services. Using a services.forEach we get each individual service to explore its characteristics.

Now, I'm going to add the whole block which is responsible for iterating over each service and its characteristics, essentially turning on notifications for each device. This will ultimately allow a callback to be fired every every time the device sends data and a reference to a method by which data can be written to the device.

    ....
        let queue = Promise.resolve();
        queue = queue.then(_ => service.getCharacteristics()
            .then(characteristics => {
                characteristics.forEach(characteristic => {
                    writeCharacteristic = characteristic;
                    writeCharacteristic.startNotifications();
                    resolve();
            }); // End enumerating characteristics
        })); // End queue
    }) // End enumerating services
}) // End Service exploration  

The queue is a promise which allows us to loop through services and characteristics without breaking asynchronousity. Since this is my first JavaScript program, I will not try to explain it, but here's another guy's article which attempts to explain it:

Essentially, each service and characteristic contained in the service enumerated. At each characteristic there are two calls. One is to get a reference to the characteristic for writing. This is the global variable writeCharacteristic . Then, notifications for the writeCharacteristic are started. This will assure any time data is made available on the remote device our program is notified.

Now, it should be noted, this above code is hackish. For example, what if there are multiple characteristics and the last one isn't the one we want to know about. Well, we'd have a write reference to the wrong characteristic. So, filtering to the desired characteritic is on my TODO list.

But, let's finish before refactoring.

Let's take a look at how to write data to the device after getting a reference to the desired characteristic.

var write = function (data, string = true) {
    p = new Promise(function (resolve, reject) {
        // See if the device is paired.
        if (pairedDevices) {
            // Has a write reference been discovered.
            if (writeCharacteristic != null) {
                // Don't double encode.
                if (string) {
                    let encoder = new TextEncoder('utf-8');
                    writeCharacteristic.writeValue(encoder.encode(data));
                } else {
                    dataInUint8 = Uint8Array.from(data);
                    writeCharacteristic.writeValue(dataInUint8);
                }
                resolve();

            } else {
                reject("No write characteristic")
            }
        } else {
            reject("No devices paired.")
        }
    }).catch(error => {
    });
    return p;
}

The above method creates a promise and writes to the device asynchoronously. On the way, it checks to make sure the device is paired (not connected, that's on the TODO list). Also, it makes sure we still have a reference to the writeCharacteristic. Then, it will either encode it in utf-8 and write the data, or if the string argument is false it'll just write the data. After it has written the data, the resolve is executed. This would allow the writeMethod to be called like so:

write("Buggers", true).then(_ => {
    // Do something after write has completed.
})

Ok, last bit. Let's setup capturing incoming data. To begin, I created a method which holds a list of all the callback methods to call when data has been received.

var onReceivedDataCallbacks = [];
...
// Adds a function called when a BLE characteristic changes value.
// Mutiple callbacks may be added.
this.addReceivedDataCallback = function (callback) {
    if (writeCharacteristic) {
        writeCharacteristic.addEventListener('characteristicvaluechanged', callback);
        onReceivedDataCallbacks.push({
            key: callback.name,
            value: callback
        })
    }
}

This method allows a method's name to be passed in. It then adds an event listener to this method, which will be called whenever characteristicvaluechanged. Also, it saves this method's name in an array in case I want to stop notifications later (again, not completed, but on the TODO).

The purpose of allowing multiple callbacks is for when I'm working with many modules which all would like to know what's going on with the Bluetooth LE device.

For example, this module is meant to be a piece of a larger project, which is an uploader app using BLE to upload HEX files to AVRs running TinySafeBoot.

Ok, one last piece. Let us see what the onRecievedData callback could looks like:

this.onReceivedData = function (event) {
    // TODO: Handle received data better.  
    // NOTE: the TX buffer for the HM-1X is only 20 bytes.  
    // But other devices differ.
    var receivedData = new Uint8Array(event.target.value.byteLength);
    for (var i = 0; i < event.target.value.byteLength; i++) {
        receivedData[i] = event.target.value.getUint8(i);
    }
}

This is how I've written the notification of data callback. The event.target.value contains the data, which is in an untyped array. I choice to encode it into Uint8 as I'll be working with both ASCII and non-ASCII data.

Well, that's it. This code will allow one to search, connect, write data to, and receive data from Bluetooth Low Energy devices from Chrome browser. Let me know if you have any recommendations.

Here is the full code referenced directly from my project:

Stitching Together HMIS Exports

This is an R script which will take two sets of HMIS 5.1 CSVs and produce a combined set.

A few notes:

  1. A new ExportID will need to be provided.
  2. Each files are deduplicated based upon the Primary Key (ProjectEntryID, PersonalID, etc.)
  3. The Project.csv contains a PITCount which is different based upon the date ranges the two data sets were pulled.  However, the script takes the maximum of the two PITCounts.
  4. It requires HMIS_Functions and dplyr.
    library(dplyr)
    # hmisFunctions <- "/Users/user//Dropbox/HMIS/HMIS_R_Functions/HMIS_R_Functions.R"
    # dataPathOne <- "/Users/user//Dropbox/HMIS/R HMIS CSV Set Merger/HMIS Data 10-01-2016 to 10-31-2016"
    # dataPathTwo <- "/Users/user//Dropbox/HMIS/R HMIS CSV Set Merger/HMIS Data 11-01-2016 to 11-30-2016"
    # pathForCombinedData <- "/Users/user//Dropbox/HMIS/R HMIS CSV Set Merger/"
    # nameOfMergedDirectory <- "Merged"

    hmisFunctions <- "/Users/user/Dropbox/HMIS/HMIS_R_Functions/HMIS_R_Functions.R"
    dataPathOne <- "/Users/user/Dropbox/HMIS/R HMIS CSV Set Merger/HMIS Data 10-01-2016 to 10-31-2016"
    dataPathTwo <- "/Users/user/Dropbox/HMIS/R HMIS CSV Set Merger/HMIS Data 11-01-2016 to 11-30-2016"
    pathForCombinedData <- "/Users/user/Dropbox/HMIS/R HMIS CSV Set Merger/Merged/"
    nameOfMergedDirectory <- "Merged"

    newExportID <- "12345"

    source(hmisFunctions)

    dir.create(file.path(pathForCombinedData, nameOfMergedDirectory), showWarnings = FALSE)
    setwd(file.path(pathForCombinedData, nameOfMergedDirectory))

    ##################################
    # Merge Functions                #
    ##################################
    mergeHmisCsvsWithUniqueIDAndExport <- function(df1, df2, uniqueIDHeader, exportID){
      # Merge the data
      mergedDf <- rbind(df1, df2)
      mergedDf[,uniqueIDHeader] <- as.factor(mergedDf[,uniqueIDHeader])
      # Drop columns which would resist removing duplicates
      drops <- c(uniqueIDHeader, "ExportID")
      mergedDf <- mergedDf[ , !(names(mergedDf) %in% drops)]
      # Get only unique records
      mergedDf <- unique(mergedDf)
      # Creat a PrimaryKey vector.
      xName <- rownames(mergedDf)
      # Add Primary Key back.
      mergedDf <- cbind(xName=xName, mergedDf)
      # Create an ExportID vector.
      exportIDVector <- rep(exportID,length(mergedDf$xName))
      # Add ExportID vector back.
      mergedDf <- cbind(mergedDf, exportIDVector)
      # Rename newly created PrimaryKey and ExportID appropriately.
      colnames(mergedDf)[ncol(mergedDf)] <- "ExportID"
      colnames(mergedDf)[1] <- uniqueIDHeader
      mergedDf
    }

    mergeHmisCsvsWithExportAndPrimaryKey <- function(df1, df2, primaryKey, exportID){
      # Merge the data
      mergedDf <- rbind(df1, df2)

      # Drop columns which would resist removing duplicates
      drops <- c("ExportID")
      mergedDf <- mergedDf[ , !(names(mergedDf) %in% drops)]

      # Get only unique records
      #mergedDf <- unique(mergedDf, incomparables = FALSE)
      mergedDf <- mergedDf[!duplicated(mergedDf[,c(primaryKey)]),]
      # Create an ExportID vector.
      exportIDVector <- rep(exportID,length(mergedDf[1]))
      # Add ExportID vector back.
      mergedDf <- cbind(mergedDf, exportIDVector)
      # Rename newly created PrimaryKey and ExportID appropriately.
      colnames(mergedDf)[ncol(mergedDf)] <- "ExportID"
      mergedDf
    }

    mergeHmisCsvsWithExportId <- function(df1, df2, exportID){
      # Merge the data
      mergedDf <- rbind(df1, df2)
      # Drop columns which would resist removing duplicates
      drops <- c("ExportID")
      mergedDf <- unique(mergedDf[ , !(names(mergedDf) %in% drops)])
      # Create an ExportID vector.
      exportIDVector <- rep(exportID,length(mergedDf[1]))
      # Add ExportID vector back.
      mergedDf <- cbind(mergedDf, exportIDVector)
      # Rename newly created PrimaryKey and ExportID appropriately.
      colnames(mergedDf)[ncol(mergedDf)] <- "ExportID"
      mergedDf
    }

    #####################
    # Merge Affiliation #
    #####################
    affiliationOne <- loadAffiliation(dataPathOne)
    affiliationTwo <- loadAffiliation(dataPathTwo)
    affiliationCombined <- rbind(affiliationOne, affiliationTwo)
    affiliationCombined <- unique(affiliationCombined)
    remove(list=c("affiliationOne", "affiliationTwo"))

    #####################
    # Merge Client      #
    #####################
    clientOne <- loadClient(dataPathOne)
    clientTwo <- loadClient(dataPathTwo)
    clientCombined <- rbind(clientOne, clientTwo)
    clientCombined <- unique(clientCombined)
    remove(list=c("clientOne", "clientTwo"))

    ######################
    # Merge Disabilities #
    ######################
    disabilitiesOne <- loadDisabilities(dataPathOne)
    disabilitiesTwo <- loadDisabilities(dataPathTwo)
    disabilitiesCombined <- mergeHmisCsvsWithUniqueIDAndExport(disabilitiesOne, disabilitiesTwo, "DisabilitiesID", newExportID)
    remove(list=c("disabilitiesOne", "disabilitiesTwo"))

    ##################################
    # Merge Employment and Education #
    ##################################
    employmentEducationOne <- loadEmployementEducation(dataPathOne)
    employmentEducationTwo <- loadEmployementEducation(dataPathTwo)
    employmentEducationCombined <- mergeHmisCsvsWithUniqueIDAndExport(employmentEducationOne, employmentEducationTwo, "EmploymentEducationID", newExportID)
    remove(list=c("employmentEducationOne", "employmentEducationTwo"))

    #####################
    # Merge Enrollment  #
    #####################
    enrollmentOne <- loadEnrollment(dataPathOne)
    enrollmentTwo <- loadEnrollment(dataPathTwo)
    enrollmentCombined <- mergeHmisCsvsWithExportAndPrimaryKey(enrollmentOne, enrollmentTwo, "ProjectEntryID", newExportID)
    remove(list=c("enrollmentOne", "enrollmentTwo"))

    #######################
    # Merge EnrollmentCoC #
    #######################
    enrollmentCocOne <- loadEnrollmentCoc(dataPathOne)
    enrollmentCocTwo <- loadEnrollmentCoc(dataPathTwo)
    enrollmentCocCombined <- mergeHmisCsvsWithUniqueIDAndExport(enrollmentCocOne,
                                                                enrollmentCocTwo,
                                                                "EnrollmentCoCID",
                                                                newExportID)
    remove(list=c("enrollmentCocOne", "enrollmentCocTwo"))

    #####################
    # Merge Exit        #
    #####################
    exitOne <- loadExit(dataPathOne)
    exitTwo <- loadExit(dataPathTwo)
    exitCombined <- mergeHmisCsvsWithExportAndPrimaryKey(exitOne,
                                                         exitTwo,
                                                         "ExitID",
                                                         newExportID)
    remove(list=c("exitOne", "exitTwo"))

    #####################
    # Merge Expot       #
    #####################
    exportOne <- loadExport(dataPathOne)
    exportTwo <- loadExport(dataPathTwo)
    exportCombined <- exportTwo
    remove(list=c("exportOne", "exportTwo"))

    #####################
    # Merge Funder      #
    #####################
    funderOne <- loadFunder(dataPathOne)
    funderTwo <- loadFunder(dataPathTwo)
    funderCombined <- mergeHmisCsvsWithExportId(funderOne, funderTwo, newExportID)
    remove(list=c("funderOne", "funderTwo"))

    #####################
    # Merge Health & DV #
    #####################
    healthAndDVOne <- loadHealthAndDv(dataPathOne)
    healthAndDVTwo <- loadHealthAndDv(dataPathTwo)
    healthAndDVCombined <- mergeHmisCsvsWithUniqueIDAndExport(healthAndDVOne, healthAndDVTwo, "HealthAndDVID", newExportID)
    remove(list=c("healthAndDVOne", "healthAndDVTwo"))

    #############################
    # Merge Income and Benefits #
    #############################
    incomeBenefitsOne <- loadIncomeBenefits(dataPathOne)
    incomeBenefitsTwo <- loadIncomeBenefits(dataPathTwo)
    incomeBenefitsCombined <- mergeHmisCsvsWithUniqueIDAndExport(incomeBenefitsOne,
                                                                 incomeBenefitsTwo,
                                                                 "IncomeBenefitsID",
                                                                 newExportID)
    remove(list=c("incomeBenefitsOne", "incomeBenefitsTwo"))

    #####################
    # Merge Inventory   #
    #####################
    inventoryOne <- loadInventory(dataPathOne)
    inventoryTwo <- loadInventory(dataPathTwo)
    inventoryCombined <- mergeHmisCsvsWithUniqueIDAndExport(inventoryOne, inventoryTwo,
                                                            "InventoryID",
                                                            newExportID)
    remove(list=c("inventoryOne", "inventoryTwo"))

    ######################
    # Merge Organization #
    ######################
    organizationOne <- loadOrganization(dataPathOne)
    organizationTwo <- loadOrganization(dataPathTwo)
    organizationCombined <- mergeHmisCsvsWithExportId(organizationOne, organizationTwo, newExportID)
    remove(list=c("organizationOne", "organizationTwo"))

    #####################
    # Merge Project     #
    #####################
    projectOne <- loadProject(dataPathOne)
    projectTwo <- loadProject(dataPathTwo)
    projectsCombined <- rbind(projectOne, projectTwo)
    # Get only the highest PIT Count
    projectsCombined <- projectsCombined %>% 
      group_by(ProjectID) %>% 
      filter(PITCount==max(PITCount))
    # Remove ExportID column for flattening
    drops <- c("ExportID")
    projectsCombined <- projectsCombined[ , !(names(projectsCombined) %in% drops)]
    projectsCombined <- unique(projectsCombined)
    # Create an ExportID vector.
    exportIDVector <- rep(newExportID,length(projectsCombined$ProjectID))
    # Add ExportID vector back.
    projectsCombined <- as.data.frame(projectsCombined)
    projectsCombined <- cbind(projectsCombined, exportIDVector)
    colnames(projectsCombined)[ncol(projectsCombined)] <- "ExportID"
    remove(list=c("projectOne", "projectTwo"))

    #####################
    # Merge Project CoC #
    #####################
    projectsCoCOne <- loadProjectCoc(dataPathOne)
    projectsCoCTwo <- loadProjectCoc(dataPathTwo)
    projectCoCCombined <- rbind(projectsCoCOne, projectsCoCTwo)
    # Get only the highest PIT Count
    projectCoCCombined <- projectCoCCombined %>% 
      group_by(ProjectID) %>% 
      filter(PITCount==max(PITCount))
    # Remove ExportID column for flattening
    drops <- c("ExportID")
    projectCoCCombined <- projectCoCCombined[ , !(names(projectCoCCombined) %in% drops)]
    projectCoCCombined <- unique(projectCoCCombined)
    # Create an ExportID vector.
    exportIDVector <- rep(newExportID,length(projectCoCCombined$ProjectID))
    # Add ExportID vector back.
    projectCoCCombined <- as.data.frame(projectCoCCombined)
    projectCoCCombined <- cbind(projectCoCCombined, exportIDVector)
    colnames(projectCoCCombined)[ncol(projectCoCCombined)] <- "ExportID"
    remove(list=c("projectsCoCOne", "projectsCoCTwo"))

    #####################
    # Merge Services    #
    #####################
    servicesOne <- loadServices(dataPathOne)
    servicesTwo <- loadServices(dataPathTwo)
    servicesCombined <- mergeHmisCsvsWithUniqueIDAndExport(servicesOne, servicesTwo, "ServicesID", newExportID)
    remove(list=c("servicesOne", "servicesTwo"))

    #####################
    # Merge Site        #
    #####################
    siteOne <- loadSite(dataPathOne)
    siteTwo <- loadSite(dataPathTwo)
    siteCombined <- mergeHmisCsvsWithExportId(siteOne, siteTwo, newExportID)
    remove(list=c("siteOne", "siteTwo"))

    ############################
    # Write combined HMIS CSVs #
    ############################

    write.csv(affiliationCombined, file = paste(pathForCombinedData, "Affiliation.csv", sep=""), na = "", row.names = FALSE)
    write.csv(clientCombined, file = paste(pathForCombinedData, "Client.csv", sep=""), na = "", row.names = FALSE)
    write.csv(disabilitiesCombined, file = paste(pathForCombinedData, "Disabilities.csv", sep=""), na = "", row.names = FALSE)
    write.csv(employmentEducationCombined, file = paste(pathForCombinedData, "EmploymentEducation.csv", sep=""), na = "", row.names = FALSE)
    write.csv(enrollmentCombined, file = paste(pathForCombinedData, "Enrollment.csv", sep=""), na = "", row.names = FALSE)
    write.csv(enrollmentCocCombined, file = paste(pathForCombinedData, "EnrollmentCoC.csv", sep=""), na = "", row.names = FALSE)
    write.csv(exitCombined, file = paste(pathForCombinedData, "Exit.csv", sep=""), na = "", row.names = FALSE)
    write.csv(exportCombined, file = paste(pathForCombinedData, "Export.csv", sep=""), na = "", row.names = FALSE)
    write.csv(funderCombined, file = paste(pathForCombinedData, "Funder.csv", sep=""), na = "", row.names = FALSE)
    write.csv(healthAndDVCombined, file = paste(pathForCombinedData, "HealthAndDV.csv", sep=""), na = "", row.names = FALSE)
    write.csv(incomeBenefitsCombined, file = paste(pathForCombinedData, "IncomeBenefits.csv", sep=""), na = "", row.names = FALSE)
    write.csv(inventoryCombined, file = paste(pathForCombinedData, "Inventory.csv", sep=""), na = "", row.names = FALSE)
    write.csv(organizationCombined, file = paste(pathForCombinedData, "Organization.csv", sep=""), na = "", row.names = FALSE)
    write.csv(projectsCombined, file = paste(pathForCombinedData, "Project.csv", sep=""), na = "", row.names = FALSE)
    write.csv(projectCoCCombined, file = paste(pathForCombinedData, "ProjectCoC.csv", sep=""), na = "", row.names = FALSE)
    write.csv(servicesCombined, file = paste(pathForCombinedData, "Services.csv", sep=""), na = "", row.names = FALSE)
    write.csv(siteCombined, file = paste(pathForCombinedData, "Site.csv", sep=""), na = "", row.names = FALSE)
Lumi4 -- MVVM

One of the issues I've had in the past with the Lumi projects is manageable UI.  The project will start out pretty straight foward, but soon, I'm switching between device types, checking if hardware is ready, and routing callbacks based upon the device selected.  It becomes spaghetti code quick.  On Lumi4, I've decided to bite the bullet and implement MVVM .

After about 20 hours struggling with setting up Lumi4 as an MVVM project I've dervied two conclusions:

  1. It's possible
  2. Apple spoils developers with MVC baked into Xcode

MVVM in C# and UWP isn't simple.  It seems like there is a lot more boiler-plate coding necessary for MVVM than MVC in Xcode.  Eventually, I broke down and downloaded the NuGet package Prism , as a MVVM helper.  This helped alleviate some of the code necesssary for Commanding (which I still don't understand well enough to implement without Prism).

Below I'm going to take a look at a couple of controls I've written MVVM on.  Finding documentation for MVVM in Universal Windows Plateform (UWP) is tricky.  It is similar to XamarinForms and WPF, but overall, there are syntax differences which make generalizing the documentation difficult.

First though, here's how I structured my project:

MainPage.xaml = View MainViewViewModel = ViewModel *[Not Yet Written] = Model

Below is some code for a couple of UI controls.

First, taking a look at the four text boxes which will hold the network IDs and host IDs the user is to type in:

MainPage.xaml

    ....
    <TextBox x:Name="NetworkIDOne" Text="{Binding HostIDOne, Mode=TwoWay}" VerticalAlignment="Center" HorizontalAlignment="Stretch" TextAlignment="Center" Grid.Column="1"/>
    <TextBox x:Name="NetworkIDTwo" Text="{Binding HostIDTwo, Mode=TwoWay}" VerticalAlignment="Center" HorizontalAlignment="Stretch" TextAlignment="Center" Grid.Column="2"/>
    <TextBox x:Name="HostIDOne" Text="{Binding NetworkIDOne, Mode=TwoWay}" VerticalAlignment="Center" HorizontalAlignment="Stretch" TextAlignment="Center" Grid.Column="3"/>
    <TextBox x:Name="HostIDTwo" Text="{Binding NetworkIDTwo, Mode=TwoWay}" VerticalAlignment="Center" HorizontalAlignment="Stretch" TextAlignment="Center" Grid.Column="4"/>
    ....

This sets up the binding to the variables HostIDOne, HostIDTwo, NetworkIDOne, NetworkIDTwo.  However, there's plenty more boilerplate code before things start working.

To setup a ViewModel it's best to setup an abstract class which can be inherited.  This saves on creating boilerplate.  Below is the abstract class the internet told me to make:

MainViewModelBase.cs

    public abstract class MainViewModelBase : INotifyPropertyChanged
    {
        public event PropertyChangedEventHandler PropertyChanged;
        protected virtual void OnPropertyChanged(string propertyName)
        {
            this.PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName));
        }
    }

This code handles the property changing notification for all properties declared in classes which inherit from the MainViewModelBase.  Such as the MainViewViewModel which is used in Lumi4

MainViewViewModel.cs

    ....
        public class MainViewViewModel: MainViewModelBase
        {
    ....

Moving on to the actual implementation of the bound text boxes.  Each text box will have a property associated with the string value in the Text attribute.  However, before this will work, the DataContext must be set for the MainPage.xaml.  This is done in the MainPage.xaml.cs file.

MainPage.xaml.cs

    public MainPage()
    {
        this.InitializeComponent();
        DataContext = new Lumi4App.ViewModels.MainViewViewModel();

(I told you it was a lot of work to setup.  Well, compared to Xcode's MVC.)

Ok, everything should be in place, time to implement the bound properties.  In the MainViewViewModel I've the following code:

    private string _HostIDOne;
    public string HostIDOne
    {
        get { return _HostIDOne; }
        set {
            if (_HostIDOne != value)
            {
                _HostIDOne = value;
            }
        }
    }

    private string _HostIDTwo;
    public string HostIDTwo
    {
        get { return _HostIDTwo; }
        set {
            if (_HostIDTwo != value)
            {
                _HostIDTwo = value;
            }
        }
    }

    private string _NetworkIDOne;
    public string NetworkIDOne
    {
        get { return _NetworkIDOne; }
        set {
            if (_NetworkIDOne != value)
            {
                _NetworkIDOne = value;
            }
        }
    }

    private string _NetworkIDTwo;
    public string NetworkIDTwo
    {
        get { return _NetworkIDTwo; }
        set {
            if (_NetworkIDTwo != value)
            {
                _NetworkIDTwo = value;
            }
        }
    }

The if statement under the setter checks if the value about to be set is the same as the value is currently.  Note, there is a helper in Prism which will prevent one from having to rewrite this for every attribute.  However, I implemented the above code before I downloaded Prism and decided to write this article while it was still fresh in my mind.  I'll probably correct these to use the Prism helper before moving.

The code so far should allow for the properties setter to be called whenever the user types something in one of the four textboxes. On caveat, it seems the setter is not called until the user removes focus from the textbox.  I'll probably need to change the binding of the properties, but for now, it works well enough.

Well, this is all fine and gone, but what about Commands such as Button Click Events?  Pretty simple, but with more boilerplate code.

First, bind to the command in the View

MainPage.xaml

    <Button x:Name="Search" Command="{Binding SearchCommand, Mode=TwoWay}" Padding="2" >

Next, and this is the part I don't understand without Prism helping, there needs to be a delegate which will fire an event whenever the commad is called.  In Prism there is the DelegateCommand type which takes care of a lot of the work.  The DelegateCommand has to be initialized with two event handlers CanExecute and Execute.  These methods will be called in that order every time the DelegateCommand property is accessed.

MainViewViewModel.cs

    public DelegateCommand SearchCommand { get; set; }
    private bool SearchCanExecute()
    {
        return (CentralDeviceTypeSelected == CentralDeviceType.Http) ? true : false;
    }
    private void SearchExecute()
    {
        [Add Code to Do Stuff when the Search Button is pressed here]
    }

    ....

    public MainViewViewModel()
    {
        SearchCommand = new DelegateCommand(SearchExecute, SearchCanExecute);
    }

And that's it for now.  Had to make sure I jotted down my notes while the challenge was still fresh in my head.

Update: 4-16-2017

Apparently, a Prism ViewModel should inherit from BindableBase instead of the custom ViewModelBase the internet told me to write.  Also, to get the prismProp code snippet it looks like you have to download the Prism template packet:

* Prism Templates

And the intellisense phrase for the PrismProperty is "pprop".

Also, Brian Laguna (maintainer of Prism) has a video using Prism 6:

* MVVM Made Simple with Prism

Oh, and to make Prism work you need to get NuGet packages:

  1. Prism.Core
  2. Prism.Windows (contains ViewModelLocator)
Lumi4 -- Unit and Integration Tests

Unit and Integration Testing

I mentioned in an earlier entry that I had the hardest time differentiating between unit and integration tests.  But this distinction was critical for implementing tests which could cover frameworks designed to interact with embedded systems.  At least, in my perspective.  Below is an outline of how I'm structuring tests for the Lumi4 code base.

Lumi4.Tests

The unit tests namespace will contain all tests which cover methods which can be operate independtly, without communication of any other system then the program itself.

For example,

    [TestClass]
    public class Constructor
    {
        [TestMethod]
        public void WifiCentralManagerConstructor_Null_Exception()
        {
            bool ThrewNull = false;
            try
            {

                WifiCentralManager wifiCentralManager = new WifiCentralManager(null);
            }
            catch (Exception ex)
            {
                ThrewNull = true;
            }
            Assert.IsTrue(ThrewNull);
        }
    }

The test above covers a constructor method, which should always be able to execute effectively without any communication with a

Lumi4.IntegrationTests

In an earlier entry I reviewed the epiphanic difference between intergration and unit tests.  The intergration tests are really meant for code which relies on outside systems; database query result, characters from a filestream, or characters from a UART device.  For Lumi4 there are three systems which the program is depedent.

  1. Remote Bluetooth Device(s)
  2. Remote Wifi Device(s)
  3. Intel HEX Filestream

For the first two I've decided to focus on integration testing rather than mocks and stubs.  My reasoning is two fold, I will most likely be tweaking the firmware of the remote devices.  Secondly, I don't understand mocks and stubs yet.  Trying to focus on MVP .

Of course, when I finally put together a plan of action a new struggle arose.  A lot of my Bluetooth and Wifi handling was depedent on asynchronous callbacks.  And this isn't the easiest thing to tackle in MSTesting (which is the testing package I'm using for this iteration).  Eventually though, I hacked together the following logic

    [TestMethod]
    public async Task Search_FindsWebServer_ValidIp()
    {
        var localNetwork = Lumi4IntegrationTestSettings.LocalIP;
        WifiCentralManager wifiCentralManager = new WifiCentralManager(localNetwork);
        bool foundDevice = false;
        wifiCentralManager.DiscoveredDevice += delegate (object obj, DiscoveredDeviceEventArgs args)
        {
            if (args.DiscoveredPeripheral != null) { foundDevice = true; }
        };
        wifiCentralManager.Search(90, 120);
        await Task.Delay(Lumi4IntegrationTestSettings.SearchWifiCallbackDelay);
        Assert.IsTrue(foundDevice);
    }

There are a few inputs which most be manually provided to the test, for example, the LocalIP and the target device's IP.  Scoped at the top of the method is a flag which will identify whether the device was found. It then takes this information, sets up a in method delegate (callback), and attempts to contact the device. Lastly, there is an async delay whose purpose is to allow the search enough time to properly execute. If the test finds the device within the given time, the callback is fired, and the flag set true. Otherwise, it returns failed.

Not sure of the validity, but it's what I got (so far).

source(hmisFunctions)

# Time period: 1/1/2016-12/31/2016
# Include 
# Active in emergency shelter
# Active in transitional housing

setwd(hmisDataPath)
client <- loadClient()
enrollment <- loadEnrollment()
project <- loadProject()
exit <- loadExit()

enrollment$EntryDate <- as.character(enrollment$EntryDate)
exit$ExitDate <- as.character(exit$ExitDate)

targetEnrollments <- sqldf("SELECT *
                        FROM enrollment
                        WHERE EntryDate < '2016-12-31'
                        ")

targetExits <- sqldf("SELECT *
                    FROM exit
                     WHERE ExitDate < '2016-01-01'
                     ")

activeEnrollment <- getActiveHudEnrollments(targetEnrollments, targetExits, project)
activeEnrollmentSelect <- sqldf("SELECT PersonalID, ProjectEntryID, ProjectType, EntryDate, ExitDate FROM activeEnrollment")

transitionalHousing <- sqldf("SELECT *
                                  FROM activeEnrollmentSelect
                                  WHERE ProjectType = 2
                                  ")

thPersonalIDs <- sqldf("SELECT DISTINCT(PersonalID) FROM transitionalHousing")

remove(list=c("targetEnrollments", "targetExits", "activeEnrollment", "activeEnrollmentSelect", "transitionalHousing"))

################
# ES LTB Count #
################
ltbESEnrollment <- sqldf("SELECT *
                        FROM enrollment
                         WHERE EntryDate < '2016-12-31'
                         ")

ltbESEnrollment <- addProjectInfoToEnrollment(ltbESEnrollment, project)
ltbESEnrollment <- sqldf("SELECT *
                         FROM ltbESEnrollment
                         WHERE TrackingMethod = 0
                         ")

ltbESExits <- sqldf("SELECT *
                     FROM exit
                     WHERE ExitDate < '2016-01-01'
                     ")

activeltbESEnrollments <- getActiveHudEnrollments(ltbESEnrollment, ltbESExits, project)
activeltbESEnrollments <- sqldf("SELECT PersonalID, ProjectEntryID, ProjectType, EntryDate, ExitDate FROM activeltbESEnrollments")

ltbESPersonalIDs <- sqldf("SELECT DISTINCT(PersonalID) FROM activeltbESEnrollments")

remove(list=c("ltbESEnrollment", "ltbESExits", "activeltbESEnrollments"))

################
# NBN  Count   #
################

services <- loadServices()

nbnServices <- sqldf("SELECT * 
                     FROM services
                     WHERE RecordType = 200
                     ")

nbnServices$DateProvided <- as.character(nbnServices$DateProvided)

nbnServicesInRange <- sqldf("SELECT *
                            FROM nbnServices
                            WHERE DateProvided > '2016-01-01'
                            AND DateProvided < '2016-12-31'
                            ")

nbnPersonalIDs <- sqldf("SELECT DISTINCT(PersonalID) FROM nbnServicesInRange")

###################
# Outreach  Count #
###################

outreachServices <- sqldf("SELECT * 
                     FROM services
                     WHERE RecordType = 12
                     ")

outreachServices$DateProvided <- as.character(outreachServices$DateProvided)

outreachServicesInRange <- sqldf("SELECT *
                            FROM outreachServices
                            WHERE DateProvided > '2016-01-01'
                            AND DateProvided < '2016-12-31'
                            ")

outreachPersonalIDs <- sqldf("SELECT DISTINCT(PersonalID) FROM outreachServicesInRange")

remove(list=c("outreachServicesInRange", "outreachServices"))

##########
# Totals #
##########

totalHomelessInRange <- rbind(ltbESPersonalIDs, thPersonalIDs, nbnPersonalIDs, outreachPersonalIDs)

totalHomelessInRange <- sqldf("SELECT DISTINCT(PersonalID)
                              FROM totalHomelessInRange
                              ")

#############
# PH Counts #
#############

phTargetEnrollments <- sqldf("SELECT *
                        FROM enrollment
                           WHERE EntryDate < '2016-12-31'
                           ")

phTargetExits <- sqldf("SELECT *
                     FROM exit
                     WHERE ExitDate < '2016-01-01'
                     ")

phActiveEnrollment <- getActiveHudEnrollments(phTargetEnrollments, phTargetExits, project)
phActiveEnrollmentSelect <- sqldf("SELECT PersonalID, ProjectEntryID, ProjectType, EntryDate, ExitDate FROM phActiveEnrollment")

totalPhHousing <- sqldf("SELECT *
                             FROM phActiveEnrollmentSelect
                             WHERE ProjectType = 3
                             OR ProjectType = 13
                             ")

rrhHousing <- sqldf("SELECT *
                    FROM phActiveEnrollmentSelect
                    WHERE ProjectType = 13
                    ")

psh <- sqldf("SELECT *
             FROM phActiveEnrollmentSelect
             WHERE ProjectType = 3
             ")

setwd(executionPath)
Choropleth and Heatmaps for HMIS Data

# Mac PC
nameOfReport <- "Homebase_Report.R"
hmisDataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse/All Projects 2016 -- 10-01-2013 to 02-17-2017 -- HMIS CSV 5.1"
vispdatDataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Coordinated_Entry_Report/VI-SPDAT and HUD Flat Export for SQL -- 3-6-2017.xlsx"
staffInfoDataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Coordinated_Entry_Report/Staff Contact Info for SQL -- 3-6-2017.xlsx"
executionPath <- "C:/Users/Ladvien/Dropbox/HMIS/Coordinated_Entry_Report"
hmisFunctions <- "C:/Users/Ladvien/Dropbox/HMIS/HMIS_R_Functions/HMIS_R_Functions.R"
hmisGraphsPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse/TX-601_Graphs"
source(hmisFunctions)

enrollment <- loadEnrollment(hmisDataPath)

##############################################
############## Point Map #####################
##############################################

# https://thedhrelay.wordpress.com/2014/04/08/creating-a-density-map-in-r-with-zipcodes/

library(plyr)
library(ggmap)
library(zipcode)

personalIDsAndZips <- sqldf("SELECT PersonalID, LastPermanentZIP FROM enrollment")
rm(list = c("enrollment"))

# Clean zips
data(zipcode)
personalIDsAndZips$LastPermanentZIP <- clean.zipcodes(personalIDsAndZips$LastPermanentZIP)
personalIDsAndZips <- merge(personalIDsAndZips, zipcode, by.x = 'LastPermanentZIP', by.y = 'zip')

# Get density
zipCount <- sqldf("SELECT LastPermanentZIP, COUNT(LastPermanentZIP) As Count FROM personalIDsAndZips GROUP BY LastPermanentZIP")
personalIDsAndZips <- sqldf("SELECT * FROM personalIDsAndZips a LEFT JOIN zipCount b ON a.LastPermanentZIP=b.LastPermanentZIP ")
personalIDsAndZips <- subset(personalIDsAndZips)
zipCounts <- sqldf("SELECT DISTINCT LastPermanentZIP, longitude, latitude, COUNT(LastPermanentZIP) As 'Count' FROM personalIDsAndZips GROUP BY LastPermanentZIP")

texas <- get_map(location = c("dfw"), zoom = 9)
mapOfEntrants <- ggmap(texas) +
    geom_point(data = zipCounts,
    aes(x = longitude,
        y = latitude,
        size = Count,
        alpha = Count), color = "red") +
    ylab("Latitude") +
    xlab("Longitude") +
    labs(title = "Residence Prior to Project Entry", size = "Entrants", alpha = "Entrants")

svg(filename = paste(hmisGraphsPath, "/ResidencePriorToProjectEntry.svg", sep = ""),
    width = 5,
    height = 4,
    pointsize = 12)
plot(mapOfEntrants)
dev.off()

##############################################
############## Heat Map ######################
############## County Partition ##############
##############################################
library(ggmap) #Load libraries
library(ggplot2)
hpars <- read.table("https://sites.google.com/site/arunsethuraman1/teaching/hpars.dat?revision=1") #Read in the density data

ggmap(texas, extent = "device") +
geom_density2d(data = zipCounts, aes(x = longitude, y = latitude), size = 0.3) +
stat_density2d(data = zipCounts,
                 aes(x = longitude, y = latitude, fill = ..level.., alpha = 1), size = 0.01,
                 bins = 16, geom = "polygon") + scale_fill_gradient(low = "green", high = "red") +
  scale_alpha(range = c(0, 0.3), guide = FALSE) #Plot

##############################################
############## Choropleth Map ################
##############################################

# https://blogs.uoregon.edu/rclub/2015/10/27/map-maker-map-maker-make-me-a-map/
# https://www.gislounge.com/mapping-county-demographic-data-in-r/
install.packages(c("choroplethr", "choroplethrMaps"))
library(choroplethr)
library(choroplethrMaps)

# Clean up zipcodes.
valid <- read.csv("C:/Users/Ladvien/Dropbox/HMIS/Warehouse/ValidZips.csv")
zipsAndCount <- sqldf("SELECT DISTINCT(LastPermanentZIP) As 'region', COUNT(LastPermanentZIP) As 'value' FROM personalIDsAndZips GROUP BY LastPermanentZIP")
zipsAndCount <- na.omit(zipsAndCount)
zipsAndCount$value <- clean.zipcodes(zipsAndCount$value)
zipsAndCount <- sqldf("SELECT a.* FROM zipsAndCount a INNER JOIN valid b ON a.region=b.ValidZip")
zipsAndCount$value <- as.numeric(zipsAndCount$value)
#zipsAndCount$region <- as.numeric(zipsAndCount$region)

# http://stackoverflow.com/questions/30787877/making-a-zip-code-choropleth-in-r-using-ggplot2-and-ggmap
install.packages("devtools")
library(devtools)
install_github('arilamstein/choroplethrZip@v1.5.0')
library(choroplethrZip)

dallas_zips <- c("75019", "75039", "75038", "75041", "75040", "75043", "75042", "75044", "75049", "75048", "75051", "75050", "75052", "75054", "75061", "75060", "75063", "75062", "75080", "75082", "75081", "75089", "75088", "75099", "75104", "75106", "75115", "75116", "75125", "75134", "75137", "75141", "75146", "75150", "75149", "75154", "75159", "75172", "75181", "75180", "75182", "75202", "75201", "75204", "75203", "75206", "75205", "75208", "75207", "75210", "75209", "75212", "75211", "75215", "75214", "75217", "75216", "75219", "75218", "75220", "75223", "75222", "75225", "75224", "75227", "75226", "75229", "75228", "75231", "75230", "75233", "75232", "75235", "75234", "75237", "75236", "75238", "75241", "75240", "75243", "75242", "75244", "75247", "75246", "75249", "75248", "75251", "75250", "75253", "75254", "75260", "75275", "75283", "75284", "75326", "75359", "75381", "75001", "75390", "75006", "75007", "75397", "75015", "75014", "75016")
dallas_zips <- unique(dallas_zips)
# ec = east coast
texas = c("texas")
zip_choropleth(zipsAndCount,
               state_zoom = texas,
               title = "Residence Prior to Entry",
               legend = "Entrants",
               num_color = 5,
               reference_map = TRUE
               ) +
               coord_map()

############## Get Dallas ZIP codes ######################
dallasClients <- sqldf("SELECT * FROM enrollment 
                        WHERE 
        LastPermanentZIP = '75019' OR
        LastPermanentZIP = '75039' OR
        LastPermanentZIP = '75038' OR
        LastPermanentZIP = '75041' OR
        LastPermanentZIP = '75040' OR
        LastPermanentZIP = '75043' OR
        LastPermanentZIP = '75042' OR
        LastPermanentZIP = '75044' OR
        LastPermanentZIP = '75049' OR
        LastPermanentZIP = '75048' OR
        LastPermanentZIP = '75051' OR
        LastPermanentZIP = '75050' OR
        LastPermanentZIP = '75052' OR
        LastPermanentZIP = '75054' OR
        LastPermanentZIP = '75061' OR
        LastPermanentZIP = '75060' OR
        LastPermanentZIP = '75063' OR
        LastPermanentZIP = '75062' OR
        LastPermanentZIP = '75080' OR
        LastPermanentZIP = '75082' OR
        LastPermanentZIP = '75081' OR
        LastPermanentZIP = '75089' OR
        LastPermanentZIP = '75088' OR
        LastPermanentZIP = '75099' OR
        LastPermanentZIP = '75104' OR
        LastPermanentZIP = '75106' OR
        LastPermanentZIP = '75115' OR
        LastPermanentZIP = '75116' OR
        LastPermanentZIP = '75125' OR
        LastPermanentZIP = '75134' OR
        LastPermanentZIP = '75137' OR
        LastPermanentZIP = '75141' OR
        LastPermanentZIP = '75146' OR
        LastPermanentZIP = '75150' OR
        LastPermanentZIP = '75149' OR
        LastPermanentZIP = '75154' OR
        LastPermanentZIP = '75159' OR
        LastPermanentZIP = '75172' OR
        LastPermanentZIP = '75181' OR
        LastPermanentZIP = '75180' OR
        LastPermanentZIP = '75182' OR
        LastPermanentZIP = '75202' OR
        LastPermanentZIP = '75201' OR
        LastPermanentZIP = '75204' OR
        LastPermanentZIP = '75203' OR
        LastPermanentZIP = '75206' OR
        LastPermanentZIP = '75205' OR
        LastPermanentZIP = '75208' OR
        LastPermanentZIP = '75207' OR
        LastPermanentZIP = '75210' OR
        LastPermanentZIP = '75209' OR
        LastPermanentZIP = '75212' OR
        LastPermanentZIP = '75211' OR
        LastPermanentZIP = '75215' OR
        LastPermanentZIP = '75214' OR
        LastPermanentZIP = '75217' OR
        LastPermanentZIP = '75216' OR
        LastPermanentZIP = '75219' OR
        LastPermanentZIP = '75218' OR
        LastPermanentZIP = '75220' OR
        LastPermanentZIP = '75223' OR
        LastPermanentZIP = '75222' OR
        LastPermanentZIP = '75225' OR
        LastPermanentZIP = '75224' OR
        LastPermanentZIP = '75227' OR
        LastPermanentZIP = '75226' OR
        LastPermanentZIP = '75229' OR
        LastPermanentZIP = '75228' OR
        LastPermanentZIP = '75231' OR
        LastPermanentZIP = '75230' OR
        LastPermanentZIP = '75233' OR
        LastPermanentZIP = '75232' OR
        LastPermanentZIP = '75235' OR
        LastPermanentZIP = '75234' OR
        LastPermanentZIP = '75237' OR
        LastPermanentZIP = '75236' OR
        LastPermanentZIP = '75238' OR
        LastPermanentZIP = '75241' OR
        LastPermanentZIP = '75240' OR
        LastPermanentZIP = '75243' OR
        LastPermanentZIP = '75242' OR
        LastPermanentZIP = '75244' OR
        LastPermanentZIP = '75247' OR
        LastPermanentZIP = '75246' OR
        LastPermanentZIP = '75249' OR
        LastPermanentZIP = '75248' OR
        LastPermanentZIP = '75251' OR
        LastPermanentZIP = '75250' OR
        LastPermanentZIP = '75253' OR
        LastPermanentZIP = '75254' OR
        LastPermanentZIP = '75260' OR
        LastPermanentZIP = '75275' OR
        LastPermanentZIP = '75283' OR
        LastPermanentZIP = '75284' OR
        LastPermanentZIP = '75326' OR
        LastPermanentZIP = '75359' OR
        LastPermanentZIP = '75381' OR
        LastPermanentZIP = '75001' OR
        LastPermanentZIP = '75390' OR
        LastPermanentZIP = '75006' OR
        LastPermanentZIP = '75007' OR
        LastPermanentZIP = '75397' OR
        LastPermanentZIP = '75015' OR
        LastPermanentZIP = '75014' OR
        LastPermanentZIP = '75016'
")

# Clean up zipcodes.
valid <- read.csv("C:/Users/Ladvien/Dropbox/HMIS/Warehouse/ValidZips.csv")
zipsAndCount <- sqldf("SELECT DISTINCT(LastPermanentZIP) As 'region', COUNT(LastPermanentZIP) As 'value' FROM dallasClients GROUP BY LastPermanentZIP")
zipsAndCount <- na.omit(zipsAndCount)
zipsAndCount$value <- clean.zipcodes(zipsAndCount$value)
zipsAndCount <- sqldf("SELECT DISTINCT a.* FROM zipsAndCount a INNER JOIN valid b ON a.region=b.ValidZip")
zipsAndCount$value <- as.numeric(zipsAndCount$value)

zipsOfInterest <- sqldf("SELECT DISTINCT(region) FROM zipsAndCount")
zipsOfInterest <- unique(zipsOfInterest$region)

library(choroplethrZip)
# ec = east coast
texas = c("dallas")
zip_choropleth(zipsAndCount,
               zip_zoom = zipsOfInterest,
               title = "Residence Prior to Entry",
               legend = "Entrants",
               num_color = 5,
               reference_map = TRUE
               ) + coord_map()
Veteran's Report 2.0

#homebaseFunctionFilePath <- "C:/Users/Ladvien/Dropbox/HMIS/Homebase_Function/Homebase_Function.R"
#nameOfReport <- "Homebase_Report.R"
#hmisDataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse/All Programs -- 5.1 -- 12-1-2016 to 2-28-2017"
#vispdatDataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Coordinated_Entry_Report/VI-SPDAT and HUD Flat Export for SQL -- 3-6-2017.xlsx"
#staffInfoDataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Coordinated_Entry_Report/Staff Contact Info for SQL -- 3-6-2017.xlsx"
#executionPath <- "C:/Users/Ladvien/Dropbox/HMIS/Coordinated_Entry_Report"
#hmisFunctionsFilePath <- "C:/Users/Ladvien/Dropbox/HMIS/HMIS_R_Functions/HMIS_R_Functions.R"
#homebaseFunctionFilePath <- "C:/Users/Ladvien/Dropbox/HMIS/Homebase_Function/Homebase_Function.R"
#outputPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse"
#veteranMasterListTemplateFilePath <- "C:/Users/Ladvien/Dropbox/HMIS/Veteran Report 2.0/Veteran_Report_v2/Master-List-Template.csv"

# PC
nameOfReport <- "Homebase_Report.R"
hmisDataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse/All Program 2016 Program Group, 1012013 - 2172017"
vispdatDataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse/VI-SPDAT 1/VI-SPDAT and HUD Flat Export for SQL -- 3-6-2017.xlsx"
vispdat2DataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse/VI-SPDAT 2/VI-SPDAT v2.0 -- 04-05-17 -- TB.xlsx"
staffInfoDataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse/Staff Info/Staff Contact Info for SQL -- 3-6-2017.xlsx"
executionPath <- "C:/Users/Ladvien/Dropbox/HMIS/Veteran Report 2.0"
hmisFunctionsFilePath <- "C:/Users/Ladvien/Dropbox/HMIS/HMIS_R_Functions/HMIS_R_Functions.R"
homebaseFunctionFilePath <- "C:/Users/Ladvien/Dropbox/HMIS/Homebase_Function/Homebase_Function.R"
outputPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse"
veteranMasterListTemplateFilePath <- "C:/Users/Ladvien/Dropbox/HMIS/Veteran Report 2.0/Veteran_Report_v2/Master-List-Template.csv"
outputPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse/Veteran Report Master List/"

# Load HMIS Functions
source(hmisFunctionsFilePath)
# Load Homebase function
source(homebaseFunctionFilePath)

homebase <- homebase(hmisDataPath,
                 vispdatDataPath,
                 staffInfoDataPath,
                 executionPath,
                 hmisFunctionsFilePath,
                 vispdat2DataPath
                 )

client <- loadClient(hmisDataPath)
enrollment <- loadEnrollment(hmisDataPath)
exit <- loadExit(hmisDataPath)
project <- loadProject(hmisDataPath)
projectCoc <- loadProjectCoc(hmisDataPath)

# Elements for the BFZ Veteran's Master List:

    #E1  Veteran Last Name
    #E2  Veteran First Name
    #E3  Veteran HMIS Client Identifier
    #E4  Veterans HOMES Client Identifier
    #E5  List Status
    #E6  Date Veteran Identified
    #E7  Last Review / Update on Master List
    #E8  Last known Location / Provider
    #E9  Confirmed Veteran Status?
    #E10 VHA Eligible
    #E11 SSVF Eligible
    #E12 Permant Housing Plan / Track
    #E13 Expected Permanent Housing Date
    #E14 Client Phone or Email if known
    #E15 Veteran DoB
    #E16 Assessment Score
    #E17 Chronic Status
    #E18 Provider Name and Contact
    #E19 Current Project Enrollment Type
    #E20 Date Permanent Housing Plan Created
    #E21 Permanent Housing Plan Notes
    #E22 Date of Move to TH, including GPD
    #E23 Exit Destination - Permanent Housing
    #E24 Date of Permanent Housing Placement / Exit from Literal Homelessness
    #E25 Exit Destination – Other(non - PH, non - literal homeless exits)
    #E26 Date of Other Exit

    ## Below are elements which will need to be added and rolled from month-to-month.

    #E27 Notes and Additional Information
    #E28 Date of Permanent Housing Intervention Offer
    #E29 Type of PH Intervention Offered
    #E30 Accept or Decline Offer
    #E31 Date of Accept or Decline
    #E32 Date of Permanent Housing Intervention Offer
    #E33 Type of PH Intervention Offered
    #E34 Accept or Decline Offer
    #E35 Date of Accept or Decline
    #E36 Date of Permanent Housing Intervention Offer
    #E37 Type of PH Intervention Offered
    #E38 Accept or Decline Offer
    #E39 Date of Accept or Decline
    #E40 Date of Permanent Housing Intervention Offer
    #E41 Type of PH Intervention Offered
    #E42 Accept or Decline Offer
    #E43 Date of Accept or Decline
    #E44 Date of Permanent Housing Intervention Offer
    #E45 Type of PH Intervention Offered
    #E46 Accept or Decline Offer
    #E47 Date of Accept or Decline
    #E48 Date of Permanent Housing Intervention Offer
    #E49 Type of PH Intervention Offered
    #E50 Accept or Decline Offer
    #E51 Date of Accept or Decline
    #E52 Date of Permanent Housing Intervention Offer
    #E53 Type of PH Intervention Offered
    #E54 Accept or Decline Offer
    #E55 Date of Accept or Decline
    #E56 Days Since Veteran Identified
    #E57 Days from Veteran Identification to Housing Plan Creation
    #E58 Days Since Veteran Permanent Housing Plan Created
    #E59 Days from Identification to Permanent Housing
    #E60 Days Since Permanent Housing Placement / Exit from Literal Homelessness

homebase_vets <- sqldf("SELECT *
                        FROM homebase
                        WHERE VeteranStatus = 'Yes'
                        ")
remove(list = c("homebase"))

# Get the whether the participant is actively homeless.

#E5  List Status
master_list_builder <- sqldf("SELECT *, CASE LastProjectTypeContacted
                        WHEN 'PH - Permanent Supportive Housing' THEN 'Inactive (Permanently Housed)'
                        WHEN 'Emergency Shelter' OR 'Transitional Housing' THEN 'Active - ES/TH'
                        WHEN 'Street Outreach' THEN 'Active - unsheltered'
                        END As ListStatus
                        FROM homebase_vets")

#E12 Permant Housing Plan / Track
master_list_builder <- sqldf("SELECT *, CASE 
              WHEN ChronicallyHomeless = 'Yes' THEN 'Permanent Supportive Housing'
              ELSE 'Rapid Rehousing'
              END As'Permanent Housing Plan / Track'
              FROM master_list_builder
              ")

#E22 Date of Move to TH, including GPD
master_list_builder <- sqldf("SELECT *, CASE
                              WHEN LastProjectTypeContacted = 'Transitional Housing' THEN RecentHUDEntryDate
                              ELSE ''
                              END As THMoveIn
                              FROM master_list_builder
                              ")

#E24 Date of Permanent Housing Placement / Exit from Literal Homelessness
master_list_builder <- sqldf("SELECT *, CASE
                              WHEN ActiveInPH = 'Yes' THEN RecentHUDEntryDate
                              ELSE ''
                              END As PHMoveIn
                              FROM master_list_builder
                              ")
master_list_builder <- subset(master_list_builder)

############ FILTERING ##################

####################################
# Filter:                          #
# Active (in 90 days) in NBN ES    #
####################################
services <- loadServices(hmisDataPath)

# Filter out NBN at TCES, get active NBN list, add them back in.
master_list_builder_bfr <- sqldf("SELECT *
                              FROM master_list_builder
                              WHERE LastProgramInContact != 'Salvation Army' 
                              AND LastProgramInContact != 'Presbyterian Night Shelter'
                              AND LastProgramInContact != 'Day Resource Center'
    ")

nbnServices <- sqldf("SELECT * 
                      FROM services
                      WHERE RecordType = 200
    ")

activeNbnSerivces <- activeFilter(nbnServices, 'DateProvided', 'DateProvided', as.character(Sys.Date() - 90), as.character(Sys.Date()))
filter_Active_in_90_NBN <- sqldf("SELECT DISTINCT(PersonalID) FROM activeNbnSerivces")

activeNbnRecords <- sqldf("SELECT b.*
                           FROM filter_Active_in_90_NBN a
                           INNER JOIN master_list_builder b
                           ON a.PersonalID=b.PersonalID
                        ")

master_list_builder <- rbind(master_list_builder_bfr, activeNbnRecords)

####################################
# Filter:                          #
# Active (in 90 days) in Outreach  #
####################################

# Filter out Outreach, get active Outreach, then add back in.
master_list_builder_bfr <- sqldf("SELECT *
                              FROM master_list_builder
                              WHERE LastProgramInContact != 'SOS' 
                              AND LastProgramInContact != 'SOS Night Time Outreach'
                              AND LastProgramInContact != 'PATH'
    ")

outreachServices <- sqldf("SELECT * 
                      FROM services
                      WHERE RecordType = 12
    ")

activeOutreachSerivces <- activeFilter(outreachServices, 'DateProvided', 'DateProvided', as.character(Sys.Date() - 90), as.character(Sys.Date()))
filter_Active_in_90_Outreach <- sqldf("SELECT DISTINCT(PersonalID) FROM activeOutreachSerivces")

activeOutreachRecords <- sqldf("SELECT b.*
                           FROM filter_Active_in_90_Outreach a
                           INNER JOIN master_list_builder b
                           ON a.PersonalID=b.PersonalID
                        ")

remove(list = c("services"))

master_list_builder <- rbind(master_list_builder_bfr, activeOutreachRecords)
master_list_builder <- sqldf("SELECT DISTINCT * FROM master_list_builder")

#######################################
# Reload Data for Entry / Exit Filter #
#######################################
enrollment <- loadEnrollment(hmisDataPath)
exit <- loadExit(hmisDataPath)
project <- loadProject(hmisDataPath)

####################################
# Filter:                          #
# Active in a Entry Exit ES        #
####################################

activeEEESProgramsFilter_builder <- sqldf("SELECT a.ProjectEntryID, a.PersonalID, a.EntryDate, b.ProjectName, b.ProjectType 
                               FROM enrollment a
                               LEFT JOIN project b
                               ON a.ProjectID=b.ProjectID 
    ")

# 1 = Emergency Shelter
# 11 = Day Shelter
activeEEESProgramsFilter_builder <- sqldf("SELECT * 
                               FROM activeEEESProgramsFilter_builder 
                               WHERE (ProjectType = 1 OR ProjectType = 11)
                               AND (ProjectName != 'Salvation Army' 
                                   AND ProjectName != 'Presbyterian Night Shelter'
                                   AND ProjectName != 'Day Resource Center'
                                   AND ProjectName != 'SOS'
                                   AND ProjectName != 'SOS Night Time Outreach'
                                   AND ProjectName != 'PATH')")

activeEEESProgramsFilter_builder <- sqldf("SELECT a.*, b.ExitDate
                                       FROM activeEEESProgramsFilter_builder a
                                       LEFT JOIN exit b
                                       ON a.ProjectEntryID=b.ProjectEntryID
    ")

activeEEESProgramsFilter_builder <- subset(activeEEESProgramsFilter_builder)
filter_Active_in_EEES <- sqldf("SELECT DISTINCT PersonalID FROM activeEEESProgramsFilter_builder WHERE ExitDate IS NULL")

remove(list = c("activeEEESProgramsFilter_builder"))

####################################
# Filter:                          #
# Active in a Transitional Housing #
####################################

activeTHProgramFilter_builder <- sqldf("SELECT a.ProjectEntryID, a.PersonalID, a.EntryDate, b.ProjectName, b.ProjectType 
                               FROM enrollment a
                               LEFT JOIN project b
                               ON a.ProjectID=b.ProjectID 
    ")

# 2 = Transitional Housing
activeTHProgramFilter_builder <- sqldf("SELECT * 
                               FROM activeTHProgramFilter_builder 
                               WHERE (ProjectType = 2)")

activeTHProgramFilter_builder <- sqldf("SELECT a.*, b.ExitDate
                                       FROM activeTHProgramFilter_builder a
                                       LEFT JOIN exit b
                                       ON a.ProjectEntryID=b.ProjectEntryID
    ")

activeTHProgramFilter_builder <- subset(activeTHProgramFilter_builder)
filter_Active_in_TH <- sqldf("SELECT DISTINCT PersonalID FROM activeTHProgramFilter_builder WHERE ExitDate IS NULL")

remove(list = c("activeTHProgramFilter_builder"))

####################################
# Get First Date in Homelessness   #
####################################

project <- loadProject(hmisDataPath)
enrollment <- loadEnrollment(hmisDataPath)

enrollmentAndProject <- sqldf("SELECT a.PersonalID, a.EntryDate, b.ProjectName, b.ProjectType 
                               FROM enrollment a
                               LEFT JOIN project b
                               ON a.ProjectID=b.ProjectID 
    ")

# 1 = Emergency Shelter, 2 = Transitional Housing, 4 = Street Outreach, 8 = Safe Haven, 11 = Day Shelter, 14 = Coordinated Assessment
startDateInHomelessnessByPersonalID <- sqldf("SELECT PersonalID, MAX(EntryDate) As 'FirstContactDate' 
                                              FROM enrollmentAndProject
                                              WHERE ProjectType = 1 OR ProjectType = 2 OR ProjectType = 4 OR ProjectType = 8 OR ProjectType = 11 OR ProjectType = 14
                                              GROUP BY PersonalID
    ")

# Add FirstContactDate to master list.
master_list_builder <- sqldf("SELECT a.*, b.FirstContactDate As 'DateVeteranIdentified'
                              FROM master_list_builder a
                              LEFT JOIN startDateInHomelessnessByPersonalID b
                              ON a.PersonalID=b.PersonalID
    ")

####################################
# Get Exit Destination Information #
####################################

# Get Homebase information tied back to the record
master_list <- sqldf("SELECT FirstName As 'VeteransLastName',
                              LastName As 'VeteransFirstName',
                              PersonalID As 'Veteran HMIS Client Identifier',
                              'Unknown' As 'Veterans HOMES Client Identifier',
                              ListStatus As 'List Status',
                              DateVeteranIdentified,
                              'Unknown' As 'Last Review / Update On Master List',
                              LastProgramInContact As 'Last Known Location / Provider',
                              'Unknown' As 'Confirmed Veteran Status',
                              'Unknown' As 'VHA Eligible',
                              'Unknown' As 'SSVF Eligible',
                              'Unknown' As 'Permanent Housing Plan',
                              'Unknown' As 'Permanent Housing Plan / Track',
                              'Unknown' As 'Permanent Housing Date',
                              'Unknown' As 'Client Phone or Email',
                              DOB,
                              scoreVISPDAT As 'Assessment Score',
                              ChronicallyHomeless As 'Chronically Homeless',
                              LastProgramInContact As 'Provider Name',
                              StaffName As 'Provider: StaffName',
                              StaffEmail As 'Provider: Staff Email',
                              LastProjectTypeContacted As 'Current Project Enrollment Type',
                              'Unknown' As 'Date Permanent Housing Plan Created',
                              'Unknown' As 'Permanent Housing Plan Notes',
                              THMoveIn As 'Date of Move to TH, including GPD',
                              'Unknown' As 'Exit Destination - Permanent Housing',
                              PHMoveIn As 'Date of Permanent Housing Placement / Exit from Literal Homelessness',
                              'Unknown' As 'Exit Destination – Other(non - PH, non - literal homeless exits)',
                              'Unknown' As 'Date of Other Exit'
                              FROM master_list_builder
") 

master_list_ph <- sqldf("SELECT * FROM master_list_builder WHERE LastProjectTypeContacted = 'PH - Rapid Re-Housing' OR LastProjectTypeContacted = 'PH - Permanent Supportive Housing'")
master_list_es_and_so <- sqldf("SELECT * FROM master_list_builder WHERE LastProjectTypeContacted = 'Day Shelter' OR LastProjectTypeContacted = 'Emergency Shelter' OR LastProjectTypeContacted = 'Street Outreach'")

filter_active_homeless <- rbind(filter_Active_in_90_NBN, filter_Active_in_90_Outreach)
filter_active_homeless <- rbind(filter_active_homeless, filter_Active_in_EEES)
filter_active_homeless <- rbind(filter_active_homeless, filter_Active_in_TH)
filter_active_homeless <- unique(filter_active_homeless)
master_list_es_and_so_and_th <- sqldf("SELECT a.* 
                                       FROM master_list a
                                       INNER JOIN filter_active_homeless b
                                       ON a.'Veteran HMIS Client Identifier'=b.PersonalID")

colnames(master_list_es_and_so_and_th)[5] <- "ListStatus"
master_list_es_and_so_and_th <- sqldf("SELECT * FROM master_list_es_and_so_and_th WHERE ListStatus IS NOT 'Inactive (Permanently Housed)'")
colnames(master_list_es_and_so_and_th)[5] <- "List Status"

filter_active_houseless <- rbind(filter_Active_in_90_NBN, filter_Active_in_90_Outreach)
filter_active_houseless<- rbind(filter_active_houseless, filter_Active_in_EEES)
master_list_es_and_so <- sqldf("SELECT a.* 
                                       FROM master_list a
                                       INNER JOIN filter_active_houseless b
                                       ON a.'Veteran HMIS Client Identifier'=b.PersonalID
                                       ")
colnames(master_list_es_and_so)[5] <- "ListStatus"
master_list_es_and_so <- sqldf("SELECT * FROM master_list_es_and_so WHERE ListStatus IS NOT 'Inactive (Permanently Housed)'")
colnames(master_list_es_and_so)[5] <- "List Status"

detach("package:XLConnect", unload = TRUE)
library(xlsx)
outputPath <- paste(outputPath, "Veteran_Master_List_", Sys.Date(), ".xlsx", sep = "")
masterListSheetName <- paste("TX-601_Master_List_", Sys.Date(), paste = "")
masterListPhName <- paste("PSH_", Sys.Date(), paste = "")
masterListESName <- paste("ES_DS_SO_", Sys.Date(), paste = "")
masterListEAndThSName <- paste("ES_DS_SO_TH_", Sys.Date(), paste = "")

###############################
# Write Sheets                #
###############################

write.xlsx(master_list, file = outputPath, sheetName = masterListSheetName, row.names = FALSE, showNA = FALSE)
write.xlsx(master_list_ph, file = outputPath, sheetName = masterListPhName, row.names = FALSE, showNA = FALSE, append = TRUE)
write.xlsx(master_list_es_and_so, file = outputPath, sheetName = masterListESName, row.names = FALSE, showNA = FALSE, append = TRUE)
write.xlsx(master_list_es_and_so_and_th, file = outputPath, sheetName = masterListEAndThSName, row.names = FALSE, showNA = FALSE, append = TRUE)

ch <- sqldf("SELECT 'NumberOfCH', COUNT(PersonalID) FROM master_list_builder WHERE ChronicallyHomeless = 'Yes' AND LastProjectTypeContacted = 'Emergency Shelter'")
Coordinated Entry By-Name-List using HMIS CSV 5.1, R, and SQL

This is a build out of TX-601's Coordinated Entry By-Name List.  The report is written using SQLDF inside of an R environment.

Homebase Function

homebase <- function(
  hmisDataPath,
  vispdatDataPath,
  staffInfoDataPath,
  executionPath,
  hmisFunctionsFilePath,
  viSpdat2DataPath,
  clientContactInfoPath) {

  library("tcltk")

  # Load the weights for progress bar
  loadingPackagesIncrement <- 2
  loadingHMISDataIncrement <- 10
  addDisabilitiesIncrement <- 5
  householdIdIncrement <- 4
  calculatingAgeIncrement <- 1
  gettingEnrollmentsIncrement <- 10
  gettingStaffInfoIncrement <- 5
  calculatingCHIncrement <- 10
  addVispdatIncrement <- 5
  getFamilyWithChildIncrement <- 5
  loadServicesIncrement <- 15
  nbnStaysIncrement <- 5
  outreachContactsIncrement <- 5
    outreachAndNbnCountIncrement <- 5
    clientContactinfoIncrement <- 3
  makeHmisCodesReadableIncrement <- 2
  formatHomebaseIncrement <- 1

  # Find the progress bar max.
  total <- (loadingPackagesIncrement +
              loadingHMISDataIncrement +
              addDisabilitiesIncrement +
              householdIdIncrement +
              calculatingAgeIncrement +
              gettingEnrollmentsIncrement +
              gettingStaffInfoIncrement +
              calculatingCHIncrement +
              addVispdatIncrement +
              getFamilyWithChildIncrement +
              loadServicesIncrement +
              nbnStaysIncrement +
              outreachAndNbnCountIncrement +
              makeHmisCodesReadableIncrement +
              formatHomebaseIncrement +
              clientContactinfoIncrement
  )
      # Initialize progress bar
  pbCounter = 0
  pb <- tkProgressBar(title = "Homebase Function", min = 0,
                      max = total, width = 300)

  ###### START ##########
  setTkProgressBar(pb, pbCounter, label = "Loading Packages")

  # There needs to be a lot of allocated memory for Java for
  # XLConnect to work.
    options(java.parameters = "-Xmx14336m") ## memory set to 14 GB

  library(sqldf)
  library(XLConnect)
  library(xlsx)


  # Load the HMIS functions.
  source(hmisFunctionsFilePath)

  # Return to execution path.
  setwd(executionPath)

  # Update progress bar
  pbCounter <- pbCounter + loadingPackagesIncrement
  setTkProgressBar(pb, pbCounter, label = "Loading HMIS Data")

  # Load HMIS Data
  client <- loadClient(hmisDataPath)
  enrollment <- loadEnrollment(hmisDataPath)
  disabilities <- loadDisabilities(hmisDataPath)
  exit <- loadExit(hmisDataPath)
  project <- loadProject(hmisDataPath)
  enrollmentCoc <- loadEnrollmentCoc(hmisDataPath)

  # Return to execution path.
  setwd(executionPath)

  pbCounter <- pbCounter + loadingHMISDataIncrement
  setTkProgressBar(pb, pbCounter, label = "Getting Disabilities")

  # Takes the Disabilities.csv and breaks out the disabilities reported
  # for each participants into individual elements with binary resposnes.
  client <- addDisabilityInfoToClient(client, disabilities)

  # Update progress bar
  pbCounter <- pbCounter + addDisabilitiesIncrement
  setTkProgressBar(pb, pbCounter, label = "Getting Household IDs")

  # Gets the PersonalIDs of participants with a HUD Entry or Update in last 90 days.
  enrollmentCoc <- getMostRecentRecordsPerId(enrollmentCoc, "PersonalID", "DateCreated")
  enrollmentCoc$DateCreated <- as.character(enrollmentCoc$DateCreated)

  # Get Household IDs from EnrollmentCoc.csv
  client_HHIDs <- sqldf("SELECT DISTINCT a.PersonalID, b.HouseholdID
                  FROM client a
                  LEFT JOIN enrollmentCoc b
                  ON a.PersonalID=b.PersonalID
                  WHERE HouseholdID != 'NA'
                  ")

  client_HHIDs <- sqldf("SELECT PersonalID, MAX(HouseholdID) As 'HouseholdID' FROM client_HHIDs GROUP BY PersonalID")

  # Update progress bar
  pbCounter <- pbCounter + householdIdIncrement
  setTkProgressBar(pb, pbCounter, label = "Calculating Age")

  # Calculate age
  client <- sqldf("SELECT DISTINCT *, (DATE('NOW') - DATE(DOB)) As 'Age' FROM client")

  pbCounter <- pbCounter + calculatingAgeIncrement
  setTkProgressBar(pb, pbCounter, label = "Getting Enrollments")

  # Filters to most recent HUD Assessment per participant
  df1 <- getMostRecentRecordsPerId(enrollment, "PersonalID", "EntryDate")

  ## Adds a 'MaxEntryDate' flag to enrollment
  #enrollment <- sqldf("SELECT a.*, b.MaxEntryDate
                      #FROM enrollment a
                      #LEFT JOIN df1 b
                      #ON a.ProjectEntryID=b.ProjectEntryID
                      #")


    # Adds a 'MaxEntryDate' flag to enrollment
    enrollment <- sqldf("SELECT a.*, b.MaxEntryDate
                      FROM enrollment a
                      LEFT JOIN df1 b
                      ON a.ProjectEntryID=b.ProjectEntryID
                      WHERE b.MaxEntryDate = 'Yes'
                      ")

  # Get Project Info for Enrollment
  enrollmentCoc <- addProjectInfoToEnrollment(enrollmentCoc, project)

  df3 <- sqldf("SELECT PersonalID, DateCreated As 'MostRecentHUDAssess', UserID As 'StaffID', ProjectName, ProjectType
               FROM enrollmentCoc")
  df3 <- makeProjectTypeReadable(df3)
  pbCounter <- pbCounter + gettingStaffInfoIncrement
  setTkProgressBar(pb, pbCounter, label = "Calculating Chronically Homeless")

  # Adds a ChronicallyHomeless flag
  df4 <- addChronicallyHomelessFlagToClient(client, df1)
  # Adds a ActiveInPh flag
  df5 <- getClientsInPH(loadEnrollment(hmisDataPath), exit, project)

  # Returns flags to clientDf
  targetClient <- sqldf("SELECT DISTINCT a.*, b.'ActiveInPh', c.EntryDate
                        FROM df4 a
                        LEFT JOIN df5 b
                        ON a.PersonalID=b.PersonalID
                        LEFT JOIN enrollment c
                        ON a.PersonalID=c.PersonalID
                        ")

  colnames(targetClient)[which(colnames(targetClient) == "EntryDate")] <- "RecentHUDEntryDate"

  # Load staff information
  staffInfo <- readWorksheetFromFile(staffInfoDataPath, sheet = 1, startRow = 1)
  colnames(staffInfo)[1] <- "StaffID"
  staffInfo <- sqldf("SELECT DISTINCT StaffID, Name, Email FROM staffInfo")

  # Find the staff information for who completed each HUD Assessment.
  df7 <- sqldf("SELECT a.*, b.Name As 'StaffName', b.Email As 'StaffEmail'
               FROM df3 a
               LEFT JOIN staffInfo b
               ON a.StaffID=b.StaffID
               ")
  remove(list = c("staffInfo"))

  # Add the Staff and Project information back to client list.
  targetClient <- sqldf("SELECT a.*, b.MostRecentHUDAssess, b.StaffName, b.StaffEmail, b.ProjectName As 'LastProgramInContact', b.ReadableProjectType As 'LastProjectTypeContacted'
                        FROM targetClient a
                        LEFT JOIN df7 b
                        ON a.PersonalID=b.PersonalID
                        ")

  targetClient <- subset(targetClient)

  # Cleanup
  remove(list = c("df4", "df3", "df5", "df1", "enrollmentCoc"))

  pbCounter <- pbCounter + calculatingCHIncrement
  setTkProgressBar(pb, pbCounter, label = "Adding VI-SPDAT")

  # Load VI-SPDAT information
  viSpdat <- readWorksheetFromFile(vispdatDataPath, sheet = 1, startRow = 1)
  viSpdat2 <- readWorksheetFromFile(viSpdat2DataPath, sheet = 1, startRow = 1)

  # Clean up VI-SPDAT1 formatting
  colnames(viSpdat)[6] <- "DateOfVISPDAT"
  viSpdat$DateOfVISPDAT <- as.character(viSpdat$DateOfVISPDAT)
  viSpdat$Participant.Enterprise.Identifier <- gsub("-", "", viSpdat$Participant.Enterprise.Identifier)
  viSpdat$Family.Enterprise.Identifier <- gsub("-", "", viSpdat$Family.Enterprise.Identifier)
  viSpdat$Family.Enterprise.Identifier <- gsub("\\{", "", viSpdat$Family.Enterprise.Identifier)
    viSpdat$Family.Enterprise.Identifier <- gsub("\\}", "", viSpdat$Family.Enterprise.Identifier)
    colnames(viSpdat)[1] <- "PersonalID"

  # Clean up VI-SPDAT2 formatting
  viSpdat2$DateOfVISPDAT <- as.character(viSpdat2$DateOfVISPDAT)
  viSpdat2$PersonalID <- gsub("-", "", viSpdat2$PersonalID)

    # Get most recent VI-SPDAT per client.
  viSpdat <- getMostRecentRecordsPerId(viSpdat, "PersonalID", "DateOfVISPDAT")
  viSpdat2 <- getMostRecentRecordsPerId(viSpdat2, "PersonalID", "DateOfVISPDAT")

    # Shape VI-SPDAT to look like VI-SPDAT 2
    viSpdat <- sqldf("SELECT PersonalID, DateOfVISPDAT As 'DateOfVISPDAT', '' As VISPDATTotalFamilyScore, '' As VISPDATTotalYouthScore, ScoreVISPDAT As 'VISPDATTotalIndividualScore', 'Individual' As TypeOfViSpdat FROM viSpdat")
    viSpdat2 <- sqldf("SELECT PersonalID, DateOfVISPDAT, VISPDATTotalFamilyScore, VISPDATTotalYouthScore, VISPDATTotalIndividualScore, TypeOfViSpdat FROM viSpdat2")

    allVispdat <- rbind(viSpdat, viSpdat2)

    # Get max between new and old VI-SPDAT
    allVispdat <- getMostRecentRecordsPerId(allVispdat, "PersonalID", "DateOfVISPDAT")

    allVispdat <- unique(allVispdat)


    allVispdat$VISPDATTotalIndividualScore[allVispdat$VISPDATTotalIndividualScore == ''] <- '0'
    allVispdat$VISPDATTotalFamilyScore[allVispdat$VISPDATTotalFamilyScore == ''] <- '0'
    allVispdat$VISPDATTotalYouthScore[allVispdat$VISPDATTotalYouthScore == ''] <- '0'

    allVispdat$VISPDATTotalIndividualScore[is.na(allVispdat$VISPDATTotalIndividualScore)] <- '0'
    allVispdat$VISPDATTotalFamilyScore[is.na(allVispdat$VISPDATTotalFamilyScore)] <- '0'
    allVispdat$VISPDATTotalYouthScore[is.na(allVispdat$VISPDATTotalYouthScore)] <- '0'


    allVispdat$VISPDATTotalIndividualScore <- as.integer(allVispdat$VISPDATTotalIndividualScore)
    allVispdat$VISPDATTotalFamilyScore <- as.integer(allVispdat$VISPDATTotalFamilyScore)
    allVispdat$VISPDATTotalYouthScore <- as.integer(allVispdat$VISPDATTotalYouthScore)

    allVispdat <- subset(allVispdat)
    # Make one column for score.
    #allVispdat <- sqldf("SELECT *, CASE 
                                    #WHEN (VISPDATTotalFamilyScore IS '' 
                                          #AND VISPDATTotalYouthScore IS '') 
                                          #THEN VISPDATTotalIndividualScore
                                    #WHEN (VISPDATTotalFamilyScore IS '' 
                                          #AND VISPDATTotalIndividualScore IS '') 
                                          #THEN VISPDATTotalYouthScore
                                          #ELSE VISPDATTotalFamilyScore
                                   #END As 'ScoreVISPDAT'
                        #FROM allVispdat
                        #ORDER BY ScoreVISPDAT DESC
                      #")

    allVispdat <- sqldf("SELECT *, CASE 
                                    WHEN (VISPDATTotalFamilyScore <= VISPDATTotalIndividualScore
                                          AND VISPDATTotalYouthScore <= VISPDATTotalIndividualScore) 
                                          THEN VISPDATTotalIndividualScore
                                    WHEN (VISPDATTotalFamilyScore <= VISPDATTotalYouthScore
                                          AND VISPDATTotalIndividualScore <= VISPDATTotalYouthScore) 
                                          THEN VISPDATTotalYouthScore
                                          ELSE VISPDATTotalFamilyScore
                                   END As 'ScoreVISPDAT'
                        FROM allVispdat
                        ORDER BY ScoreVISPDAT DESC
                      ")

    allVispdat$ScoreVISPDAT <- as.numeric(allVispdat$ScoreVISPDAT)

  targetClient <- subset(targetClient)


  # Add VI-SPDAT scores and dates to the client list.
  targetClient <- sqldf("SELECT a.*, b.ScoreVISPDAT, b.DateOfVISPDAT, b.TypeOfVispdat
                        FROM targetClient a
                        LEFT JOIN allVispdat b
                        ON a.PersonalID=b.PersonalID
                        ")

  # Just in case, make date SQLite friendly.
  targetClient$DateOfVISPDAT <- as.character(targetClient$DateOfVISPDAT)

  remove(list = c("viSpdat", "viSpdat2", "allVispdat"))

  pbCounter <- pbCounter + addVispdatIncrement
  setTkProgressBar(pb, pbCounter, label = "Getting Family with Child")

 client <- sqldf("SELECT a.*, b.HouseholdID 
                  FROM client a
                  LEFT JOIN client_HHIDs b 
                  ON a.PersonalID=b.PersonalID")
 targetClient <- sqldf("SELECT a.*, b.HouseholdID 
                  FROM targetClient a
                  LEFT JOIN client_HHIDs b 
                  ON a.PersonalID=b.PersonalID")

    remove(client)
    remove(enrollment)

    client <- loadClient(hmisDataPath)
    enrollment <- loadEnrollment(hmisDataPath)

    family_flag_builder <- sqldf("SELECT DISTINCT PersonalID, HouseholdID, RelationshipToHoH FROM enrollment")

    family_flag_builder <- family_flag_builder[!is.na(family_flag_builder$HouseholdID),]
    family_flag_builder <- sqldf("SELECT HouseholdID, 
        (CASE WHEN RelationshipToHoH = 1 THEN 'Yes' ELSE 'No' END) As 'Adult', 
        (CASE WHEN RelationshipToHoH = 2 THEN 'Yes' ELSE 'No' END) As 'Child' 
        FROM family_flag_builder")
    hhids_of_adults <- sqldf("SELECT DISTINCT HouseholdID, Adult FROM family_flag_builder WHERE Adult = 'Yes'")
    hhids_of_child <- sqldf("SELECT DISTINCT HouseholdID, Child FROM family_flag_builder WHERE Child = 'Yes'")
    family_flag_builder <- sqldf("SELECT a.HouseholdID, a.Adult, b.Child
                         FROM hhids_of_adults a
                         INNER JOIN hhids_of_child b
                         ON a.HouseholdID=b.HouseholdID  
                         ")

    family_flag_builder <- sqldf("SELECT HouseholdID, 'Yes' As FamilyWithChildren FROM family_flag_builder WHERE Adult = 'Yes' AND Child = 'Yes' GROUP BY HouseholdID")

  # Add the family flag back to the client list.
  targetClient <- sqldf("SELECT DISTINCT a.*, b.FamilyWithChildren
                        FROM targetClient a
                        LEFT JOIN family_flag_builder b
                        ON a.HouseholdID=b.HouseholdID
                        ")

  remove(list = c("family_flag_builder"))

  targetClient <- subset(targetClient)
  pbCounter <- pbCounter + getFamilyWithChildIncrement
  setTkProgressBar(pb, pbCounter, label = "Loading Services")


  # Free up some memory before attempting Services.csv
  remove(list = c("client", "enrollment", "disabilities", "exit", "project"))

  setwd(hmisDataPath)

  # Load services -- large files.
  services <- loadServices()

  pbCounter <- pbCounter + loadServicesIncrement
  setTkProgressBar(pb, pbCounter, label = "Getting Night-by-Night Stays")

  # Get all of te NBN service entries.
  clientNbn <- sqldf("SELECT *
                     FROM services
                     WHERE RecordType = 200
                     ")
  # Count the number of NBN check-ins per client.  Make sure the date as distinct, due to ETO duplicate bed
  # stays when data is pulled using a program-group including two-shelters.
  daysCheckedInNBN <- sqldf("SELECT DISTINCT(PersonalID), COUNT (DISTINCT(DateProvided)) As 'NonProgramShelterNights'
                            FROM clientNbn
                            GROUP BY PersonalID
                            ")

  pbCounter <- pbCounter + nbnStaysIncrement
  setTkProgressBar(pb, pbCounter, label = "Getting Outreach Contacts")

  # Get all Outreach Contact services.
  clientOutreach <- sqldf("SELECT *
                          FROM services
                          WHERE RecordType = 12
                          ")

  # Count all Outreach Contacts per client.  Count distinct is to get around ETO bug when data is
  # pulled using a program group which includes two outreach agencies.
  outreachContacts <- sqldf("SELECT DISTINCT(PersonalID), COUNT (DISTINCT(DateProvided)) As 'NumberOutreachContacts'
                            FROM clientOutreach
                            GROUP BY PersonalID
                            ")

  pbCounter <- pbCounter + outreachContactsIncrement
  setTkProgressBar(pb, pbCounter, label = "Adding Outreach and NBN Count")

  # Add NBN stays to client list.
  targetClient <- sqldf("SELECT a.*, b.NonProgramShelterNights
                        FROM targetClient a
                        LEFT JOIN daysCheckedInNBN b
                        ON a.PersonalID=b.PersonalID
                        ")

  # Add Outreach Contacts to client list.
  targetClient <- sqldf("SELECT a.*, b.NumberOutreachContacts
                        FROM targetClient a
                        LEFT JOIN outreachContacts b
                        ON a.PersonalID=b.PersonalID
                        ")

  remove(list = c("services", "clientOutreach", "daysCheckedInNBN", "clientNbn", "outreachContacts"))

    pbCounter <- pbCounter + outreachAndNbnCountIncrement
    setTkProgressBar(pb, pbCounter, label = "Add Client Contact Info")

  # Add client contact information
  clientContactInfo <- readWorksheetFromFile(clientContactInfoPath, sheet = 1, startRow = 1)
  colnames(clientContactInfo)[1] <- "PersonalID"
  colnames(clientContactInfo)[2] <- "ProgramStartDate"
  colnames(clientContactInfo)[3] <- "ClientEmail"
    colnames(clientContactInfo)[4] <- "ClientPhone"
    clientContactInfo$PersonalID <- gsub("-", "", clientContactInfo$PersonalID)

  clientContactInfo <- getMostRecentRecordsPerId(clientContactInfo, "PersonalID", "ProgramStartDate")

  targetClient <- sqldf("SELECT a.*, b.ClientEmail, b.ClientPhone 
                        FROM targetClient a
                        LEFT JOIN clientContactInfo b
                        ON a.PersonalID=b.PersonalID
                        ")

  pbCounter <- pbCounter + clientContactinfoIncrement
  setTkProgressBar(pb, pbCounter, label = "Make HMIS Codes Readable")

  setwd(executionPath)

  # Make HMIS codes human readable.
  targetClient <- combineRaceColumnsAndMakeReadable(targetClient)
  targetClient <- makeGenderReadable(targetClient)
  targetClient <- makeEthnicityReadable(targetClient)
  targetClient <- makeVeteranStatusReadable(targetClient)


  pbCounter <- pbCounter + makeHmisCodesReadableIncrement
  setTkProgressBar(pb, pbCounter, label = "Format Homebase Dataframe")

  # Pull and format the client list.
  homebase_all <- sqldf("SELECT DISTINCT
                        PersonalID,
                        FirstName,
                        MiddleName,
                        LastName,
                        ClientEmail,
                        ClientPhone,
                        NameSuffix,
                        SSN,
                        DOB,
                        Age,
                        Gender,
                        Race,
                        Ethnicity,
                        VeteranStatus,
                        HouseholdID,
                        ChronicallyHomeless,
                        NumberOutreachContacts,
                        NonProgramShelterNights,
                        RecentHUDEntryDate,
                        MostRecentHUDAssess,
                        FamilyWithChildren,
                        ScoreVISPDAT,
                        TypeOfViSpdat,
                        DateOfVISPDAT,
                        PhysicalDisability,
                        DevelopmentalDisability,
                        ChronicHealthCondition,
                        a.'HIV/AIDS',
                        MentalHealthProblem,
                        SubstanceAbuse,
                        StaffName,
                        StaffEmail,
                        LastProgramInContact,
                        LastProjectTypeContacted,
                        ActiveInPH
                        FROM targetClient a
                        ")

    pbCounter <- pbCounter + formatHomebaseIncrement
    setTkProgressBar(pb, pbCounter, label = "Homebase Complete")

    # Clean up

    close(pb)
    rm(list = setdiff(ls(), "homebase_all"))
    # Retun list.
    homebase_all
}

Homebase Report

options(java.parameters = "-Xmx14336m") ## memory set to 14 GB

developmentLog <- c("REPORT DETAILS -- DO NOT DELETE THIS TAB",
                    "For development information please visit ladvien.com",

                    "5/30/17",
                    "Added order by for PSH and RRH -- should match operations manual.",
                    "Changed 90-day filter for VI-SPDAT to only need at least one VI-SPDAT to keep participant in 'active'",
                    "Added client contact information",


                    "5/10/2017",
                    "Changed EnrollmentCoC max date function to look at DateCreated not InformationDate",
                    "Changed the Enrollment max date function to look at EntryDate and not DateCreated",
                    "",

                    "05/07/2017",
                    "Automated Formatting",
                    "Added descending on VeteranStatus, FamilyWithChildren, ScoreVISPDAT in respective order.",
                    "Added Graphic Dashboard",
                    "",

                    "05/07/2017",
                    "Switched from using Client.DOB -> Age for establishing family to Enrollment.RelationshipToHoH.  This seems to have addressed the families not flagging.",
                    "",

                    "03/15/2017",
                    "Incorporated VI-SPDAT2",
                    "",

                    "02/01/2017",
                    "Initial Report",
                    ""
                    )
# Argument guide
  # 1 = Name of Report
  # 2 = Folder path for HMIS data
  # 3 = VI-SPDAT File data
  # 4 = Staff Info File data
  # 5 = Executing directory path

# args <- commandArgs(trailingOnly = TRUE)
# nameOfReport <- args[1]
# hmisDataPath <- args[2]
# vispdatDataPath <- args[3]
# staffInfoDataPath <- args[4]
# executionPath <- args[5]
# viSpdat2DataPath <- args[6]
# 
# outputPath <- executionPath
# homebaseFunctionFilePath <- paste(executionPath, "\\Homebase_Function.R", sep ="")
# hmisFunctionsFilePath <- paste(executionPath, "\\HMIS_R_Functions.R", sep = "")

# PC
nameOfReport <- "Homebase_Report.R"
hmisDataPath <- "E:/Dropbox/HMIS/Warehouse/All Program 2016 Program Group  3-1-2017 - 5-30-2017"
vispdatDataPath <- "E:/Dropbox/HMIS/Warehouse/VI-SPDAT and HUD Flat Export for SQL -- 3-6-2017.xlsx"
staffInfoDataPath <- "E:/Dropbox/HMIS/Warehouse/Staff Contact Info for SQL -- 3-6-2017.xlsx"
executionPath <- "E:/Dropbox/HMIS/Coordinated_Entry_Report"
hmisFunctionsFilePath <- "E:/Dropbox/HMIS/HMIS_R_Functions/HMIS_R_Functions.R"
homebaseFunctionFilePath <- "E:/Dropbox/HMIS/Coordinated_Entry_v2_TX-601/Coordinated_Entry_v2_TX-601/Homebase_Function.R"
outputPath <- "E:/Dropbox/HMIS/Warehouse"
viSpdat2DataPath <- "E:/Dropbox/HMIS/Warehouse/VI-SPDAT v2.0 -- 04-05-17 -- TB.xlsx"
clientContactInfoPath <- "E:/Dropbox/HMIS/Warehouse/Client Contact for SQL -- 5-30-17 -- TB.xlsx"
outputPath <- "E:/Dropbox/HMIS/Warehouse"
# Mac PC
 #nameOfReport <- "Homebase_Report.R"
 #hmisDataPath <- "C:/Users/Ladvien/Desktop/Homebase 04-18-2017/data"
 #vispdatDataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse/VI-SPDAT and HUD Flat Export for SQL -- 3-6-2017.xlsx"
 #staffInfoDataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse/Staff Contact Info for SQL -- 3-6-2017.xlsx"
 #executionPath <- "C:/Users/Ladvien/Dropbox/HMIS/Coordinated_Entry_Report"
 #hmisFunctionsFilePath <- "C:/Users/Ladvien/Dropbox/HMIS/HMIS_R_Functions/HMIS_R_Functions.R"
 #homebaseFunctionFilePath <- "C:/Users/Ladvien/Dropbox/HMIS/Coordinated_Entry_v2_TX-601/Coordinated_Entry_v2_TX-601/Homebase_Function.R"
 #viSpdat2DataPath <- "C:/Users/Ladvien/Dropbox/HMIS/Warehouse/VI-SPDAT v2.0 -- 04-05-17 -- TB.xlsx"
 #outputPath <- "C:/Users/Ladvien/Desktop"
# Mac
#nameOfReport <- "Homebase_Report.R"
#hmisDataPath <- "/Users/user/Dropbox/HMIS/Coordinated_Entry_Report/All Programs 12-18-2016 to 03-18-2017"
#vispdatDataPath <- "/Users/user/Dropbox/HMIS/Coordinated_Entry_Report/VI-SPDAT and HUD Flat Export for SQL -- 3-6-2017.xlsx"
#staffInfoDataPath <- "/Users/user/Dropbox/HMIS/Coordinated_Entry_Report/Staff Contact Info for SQL -- 3-6-2017.xlsx"
#executionPath <- "/Users/user/Dropbox/HMIS/Coordinated_Entry_Report"
#hmisFunctions <- "/Users/user/Dropbox/HMIS/HMIS_R_Functions/HMIS_R_Functions.R"
cat("arg1: ")
cat(nameOfReport)
cat("\n") 
cat("arg2: ")
cat(hmisDataPath)
cat("\n") 
cat("arg3: ")
cat(vispdatDataPath)
cat("\n")
cat("arg4: ")
cat(staffInfoDataPath)
cat("\n")
cat("arg5: ")
cat(executionPath)
cat("\n")
cat("arg6: ")
cat(viSpdat2DataPath)
cat("\n")


source(homebaseFunctionFilePath)

homebase_all <- homebase(hmisDataPath,
                 vispdatDataPath,
                 staffInfoDataPath,
                 executionPath,
                 hmisFunctionsFilePath,
                 viSpdat2DataPath,
                 clientContactInfoPath)

######## WIP: Aggregates ##########
numberOfHudAssessmentsPerStaff <- sqldf("SELECT StaffName, COUNT(StaffName) FROM homebase_all GROUP BY StaffName")
#countOfChronicallyHomeless <- sqldf("SELECT COUNT(PersonalID) As CountOfChronic FROM homebase_all WHERE ChronicallyHomeless = 'Yes'")
#countOfGender <- sqldf("SELECT DISTINCT(Gender), COUNT(Gender) FROM homebase_all GROUP BY Gender")
#ageCounts <- sqldf("SELECT CASE
                        #WHEN Age < 5 THEN 1
                        #WHEN Age > 4 AND Age < 13 THEN 2
                        #WHEN Age > 13 AND Age < 18 THEN 3
                        #WHEN Age > 17 AND Age < 25 THEN 4
                        #WHEN Age > 24 AND Age < 35 THEN 5
                        #WHEN Age > 34 AND Age < 45 THEN 6
                        #WHEN Age > 44 AND Age < 55 THEN 7
                        #WHEN Age > 54 AND Age < 62 THEN 8
                        #WHEN Age > 61 THEN 9
                        #END As AgeType
                    #FROM homebase_all 
    #")

#aggregates <- cbind(numberOfHudAssessmentsPerStaff, countOfChronicallyHomeless)
homebase_housed <- sqldf("SELECT *
                         FROM homebase_all
                         WHERE ActiveInPH IS 'Yes'
                         ORDER BY VeteranStatus DESC,
                                  FamilyWithChildren DESC,
                                  ScoreVISPDAT DESC
                         ")

# Filter client to PersonalIDs with Entry or Update in last 90 days
homebase_active <- sqldf("SELECT *
                  FROM homebase_all
                  WHERE (
                          RecentHUDEntryDate > DATE('NOW', '-90 DAY')
                          OR (MostRecentHUDAssess > DATE('NOW', '-90 DAY'))
                        ) 
                        AND
                        (
                          (DateOfVISPDAT != '')
                        )
                  AND
                  ActiveInPH IS NOT 'Yes'
                  ORDER BY VeteranStatus DESC,
                           FamilyWithChildren DESC,
                           ScoreVISPDAT DESC
                  ")

tmp_possibly_active <- sqldf("SELECT *
                             FROM homebase_all
                             WHERE (( MostRecentHUDAssess > DATE('NOW', '-90 DAY')) 
                             OR
                             (
                              (DateOfVISPDAT > DATE('NOW', '-90 DAY'))
                             ))
                             AND
                             ActiveInPH IS NOT 'Yes'
                             ORDER BY VeteranStatus DESC,
                                      FamilyWithChildren DESC,
                                      ScoreVISPDAT DESC
                             ")

homebase_possibly_active <- sqldf("SELECT a.*
                                  FROM tmp_possibly_active a
                                  LEFT JOIN homebase_active b
                                  ON a.PersonalID=b.PersonalID
                                  WHERE b.PersonalID IS NULL
                                  ORDER BY VeteranStatus DESC,
                                           FamilyWithChildren DESC,
                                           ScoreVISPDAT DESC
                                  ")

homebase_psh <- sqldf("SELECT *
                      FROM homebase_active
                      WHERE ChronicallyHomeless = 'Yes'
                      ORDER BY MostRecentHUDAssess DESC                    
                      ")


homebase_psh <- sqldf("SELECT * FROM homebase_psh ORDER BY 
                                          ScoreVISPDAT DESC, 
                                          NonProgramShelterNights DESC
                                        ")


homebase_rrh <- sqldf("SELECT *
                      FROM homebase_active
                      WHERE ChronicallyHomeless IS NOT 'Yes'
                      ")

tmp_rrh_vets <- sqldf("SELECT * FROM homebase_rrh WHERE VeteranStatus = 'Yes'")
tmp_rrh_youth <- sqldf("SELECT * FROM homebase_rrh WHERE Age > 17 AND Age < 26")
tmp_rrh_families <- sqldf("SELECT * FROM homebase_rrh WHERE FamilyWithChildren = 'Yes'")
tmp_rrh_singles <- sqldf("SELECT * FROM homebase_rrh WHERE VeteranStatus IS NOT 'Yes' 
                                                           AND FamilyWithChildren IS NOT 'Yes'
                                                           AND Age < 18 OR Age > 25
                                                          ")


tmp_rrh_vets <- sqldf("SELECT * FROM tmp_rrh_vets ORDER BY
                                                ScoreVISPDAT DESC,
                                                NonProgramShelterNights DESC, 
                                                MostRecentHUDAssess DESC
                                                ")

tmp_rrh_youth <- sqldf("SELECT * FROM tmp_rrh_youth ORDER BY
                                                ScoreVISPDAT DESC,
                                                NonProgramShelterNights DESC, 
                                                MostRecentHUDAssess DESC
                                                ")

tmp_rrh_families <- sqldf("SELECT * FROM tmp_rrh_families ORDER BY
                                                ScoreVISPDAT DESC,
                                                NonProgramShelterNights DESC, 
                                                MostRecentHUDAssess DESC
                                                ")

tmp_rrh_singles <- sqldf("SELECT * FROM tmp_rrh_singles ORDER BY
                                                ScoreVISPDAT DESC,
                                                NonProgramShelterNights DESC, 
                                                MostRecentHUDAssess DESC
                                                ")

homebase_rrh <- rbind(tmp_rrh_vets, tmp_rrh_youth, tmp_rrh_families, tmp_rrh_singles)

homebase_assistToList <- sqldf("SELECT DISTINCT StaffName, COUNT(StaffName) As 'AssistToList'
                    FROM homebase_active
                    WHERE ActiveInPH IS NOT 'Yes'
                    GROUP BY StaffName
                    ORDER BY AssistToList DESC
                    ")


#library("ggplot2")

#df1 <- homebase_all$LastProjectTypeContacted

#ggplot(homebase_all, aes(NumberOutreachContacts, LastProgramInContact)) +
  #geom_raster(aes(fill = Age), interpolate = FALSE)


detach("package:XLConnect", unload = TRUE)
detach("package:xlsx", unload = TRUE)
library(openxlsx)

reportDetails <- data.frame(developmentLog)
colnames(reportDetails)[1] <- "Development Log"

ch_count <- sqldf("SELECT 'Chronically\nHomeless' As 'Category', 
    SUM(CASE ChronicallyHomeless WHEN 'Yes' THEN 1 ELSE 0 END) As 'Count'
    FROM homebase_psh")

vet_count <- sqldf("SELECT 'Veteran' As 'Category', 
    SUM(CASE VeteranStatus WHEN 'Yes' THEN 1 ELSE 0 END) As 'Count'
    FROM homebase_psh")

female_count <- sqldf("SELECT 'Female' As 'Category', 
    SUM(CASE Gender WHEN 'Female' THEN 1 ELSE 0 END) As 'Count'
    FROM homebase_psh")

male_count <- sqldf("SELECT 'Male' As 'Category', 
    SUM(CASE Gender WHEN 'Male' THEN 1 ELSE 0 END) As 'Count'
    FROM homebase_psh")

development_count <- sqldf("SELECT 'Developmental\nDisability' As 'Category', 
    SUM(CASE DevelopmentalDisability WHEN 'Yes' THEN 1 ELSE 0 END) As 'Count'
    FROM homebase_psh")
chronic_health_count <- sqldf("SELECT 'Chronic Health\nCondition' As 'Category', 
    SUM(CASE ChronicHealthCondition WHEN 'Yes' THEN 1 ELSE 0 END) As 'Count'
    FROM homebase_psh")
hiv_aids_count <- sqldf("SELECT 'HIV/AIDS' As 'Category', 
    SUM(CASE 'HIV/AIDS' WHEN 'Yes' THEN 1 ELSE 0 END) As 'Count'
    FROM homebase_psh")
mental_health_count <- sqldf("SELECT 'Mental Health\nProblem' As 'Category', 
    SUM(CASE MentalHealthProblem WHEN 'Yes' THEN 1 ELSE 0 END) As 'Count'
    FROM homebase_psh")
substance_abuse_count <- sqldf("SELECT 'Substance\nAbuse' As 'Category', 
    SUM(CASE SubstanceAbuse WHEN 'Yes' THEN 1 ELSE 0 END) As 'Count'
    FROM homebase_psh")

#fleeing_dv_count <-

family_count <- sqldf("SELECT 'Family With Children' As 'Category', 
    SUM(CASE FamilyWithChildren WHEN 'Yes' THEN 1 ELSE 0 END) As 'Count'
    FROM homebase_psh")


counts <- rbind(ch_count,
                vet_count,
                female_count,
                male_count,
                development_count,
                chronic_health_count,
                hiv_aids_count,
                mental_health_count,
                substance_abuse_count,
                family_count)

counts$val <- seq()
counts$Category <- factor(counts$Category, levels = counts$Category[order(counts$val)])

###########################################
# Calculate Housing Vacancy               #
###########################################

project <- loadProject(hmisDataPath)
enrollment <- loadEnrollment(hmisDataPath)
exit <- loadExit(hmisDataPath)
inventory <- loadInventory(hmisDataPath)

enrollmentAndProject <- addProjectInfoToEnrollment(enrollment, project)
activeInHousing <- sqldf("SELECT a.ProjectEntryID, a.PersonalID, a.ProjectType, a.ProjectName, a.ProjectID, b.ExitDate
                          FROM enrollmentAndProject a
                          LEFT JOIN exit b
                          ON a.ProjectEntryID=b.ProjectEntryID
                          WHERE (a.ProjectType = 2 OR a.ProjectType = 3 OR a.ProjectType = 9 OR a.ProjectType = 13)
                          ")

activeInHousing$ExitDate[is.na(activeInHousing$ExitDate)] <- 0
activeInHousing <- sqldf("SELECT DISTINCT * FROM activeInHousing WHERE ExitDate = 0")
aggregatedActiveInHousing <- sqldf("SELECT ProjectName, ProjectID, COUNT(ProjectName) As 'Enrolled' 
                                    FROM activeInHousing 
                                    GROUP BY ProjectName
                                    ORDER BY Enrolled DESC")

tmp <- sqldf("SELECT ProjectID, SUM(HMISParticipatingBeds) As 'Beds' FROM inventory GROUP BY ProjectID")

aggregatedActiveInHousing <- sqldf("SELECT a.*, b.Beds 
                                    FROM aggregatedActiveInHousing a
                                    LEFT JOIN tmp b
                                    ON a.ProjectID=b.ProjectID
                                   ")

projectVacancy <- sqldf("SELECT ProjectName, Enrolled, Beds, (CAST(Enrolled As FLOAT) / CAST(Beds As Float)) As 'OccupancyPercentage' FROM aggregatedActiveInHousing ORDER BY 'Vacancy %' DESC")
projectVacancy <- projectVacancy[order(projectVacancy$OccupancyPercentage),]
projectVacancy <- subset(projectVacancy)
projectWithVacancy <- sqldf("SELECT * FROM projectVacancy WHERE OccupancyPercentage < 1")
projectWithVacancy <- subset(projectWithVacancy)

###########################################
# Create some Awesome Graphs              #
###########################################

library("ggplot2")
library("RColorBrewer")
png("rrhCountsGraph.png", height = 1200, width = 1200, res = 250, pointsize = 8)
rrhCountsGraph <- ggplot(data = counts, aes(x = Category, y = Count, fill = Category)) +
    geom_bar(stat = "identity") +
    labs(title = "Permanent Supportive Housing Eligibe") +
    theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = "none") +
    scale_fill_brewer(palette = "Spectral")
dev.off()

n <- 60
qual_col_pals = brewer.pal.info[brewer.pal.info$category == 'qual',]
col_vector = unlist(mapply(brewer.pal, qual_col_pals$maxcolors, rownames(qual_col_pals)))
levels <- seq(nrow(projectVacancy))
projectVacancy$Order <- levels
png("projectVacancy.png", height = 1200, width = 1200, res = 250, pointsize = 8)
projectVacancyGraph <- ggplot(data = projectVacancy, aes(x = reorder(ProjectName, -projectVacancy$Order), y = OccupancyPercentage, fill = ProjectName)) +
    geom_bar(stat = "identity") +
    labs(title = "Occupancy by Housing Projects") +
    theme(axis.text.x = element_text(angle = 90, hjust = 1),
        legend.position = "none") +
    coord_flip()
dev.off()
print(projectVacancyGraph)


n <- 60
qual_col_pals = brewer.pal.info[brewer.pal.info$category == 'qual',]
col_vector = unlist(mapply(brewer.pal, qual_col_pals$maxcolors, rownames(qual_col_pals)))
levels <- seq(nrow(projectWithVacancy))
projectWithVacancy$Order <- levels

vacancyPal <- c("#70AD47", "#47A3AD", "#AD5147", "#8447AD")

png("projectWithVacancy.png", height = 1200, width = 1200, res = 250, pointsize = 8)
projectWithVacancyGraph <- ggplot(data = projectWithVacancy,
        aes(x = reorder(ProjectName, Order), y = OccupancyPercentage, fill = OccupancyPercentage)) +
    geom_bar(stat = "identity", width = 1, color = vacancyPal[4]) +
    scale_fill_gradient2(low = "green", mid = vacancyPal[1], high = vacancyPal[2]) +
    labs(title = "Occupancy by Project") +
    xlab("Current Occupancy %") +
    ylab("Occupancy %") +
    theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 16),
          axis.text.y = element_text(hjust = 1),
        legend.position = "none") +
    coord_flip()
dev.off()
ggsave("projectWithVacancy.png")
print(projectWithVacancyGraph)

###########################################
# Format Workbook, Add Graphs, and Write  #`
###########################################
library("openxlsx")
setwd(outputPath)
# Theme
color_column_background <- "#70AD47"
color_header_font_color <- "#FFFFFF"
column_header_font_size <- 18

## Header Styles
headerStyle1 <- createStyle(fgFill = color_column_background,
                            valign = "top",
                            halign = "CENTER",
                            fontColour = color_header_font_color,
                            fontSize = column_header_font_size)

# Create the Workbook
homebaseWorkbook <- createWorkbook()

# Add the needed sheets.
addWorksheet(homebaseWorkbook, sheetName = "Dashboard")
addWorksheet(homebaseWorkbook, sheetName = "EligibleForPSH")
addWorksheet(homebaseWorkbook, sheetName = "EligibleForRRH")
addWorksheet(homebaseWorkbook, sheetName = "Possibly Active")
addWorksheet(homebaseWorkbook, sheetName = "Housed")
addWorksheet(homebaseWorkbook, sheetName = "AssistToList")
addWorksheet(homebaseWorkbook, sheetName = "Report Details")

# Add the data to worksheets.
writeDataTable(homebaseWorkbook, sheet = 2, x = homebase_psh, colNames = TRUE, tableStyle = "TableStyleLight9", headerStyle = headerStyle1)
writeDataTable(homebaseWorkbook, sheet = 3, x = homebase_rrh, colNames = TRUE, tableStyle = "TableStyleLight9", headerStyle = headerStyle1)
writeDataTable(homebaseWorkbook, sheet = 4, x = homebase_possibly_active, colNames = TRUE, tableStyle = "TableStyleLight9", headerStyle = headerStyle1)
writeDataTable(homebaseWorkbook, sheet = 5, x = homebase_housed, colNames = TRUE, tableStyle = "TableStyleLight9", headerStyle = headerStyle1)
writeDataTable(homebaseWorkbook, sheet = 6, x = homebase_assistToList, colNames = TRUE, tableStyle = "TableStyleLight9", headerStyle = headerStyle1)
writeDataTable(homebaseWorkbook, sheet = 7, x = reportDetails, colNames = TRUE, tableStyle = "TableStyleLight9", headerStyle = headerStyle1)

# Format the columns for readability.
setColWidths(homebaseWorkbook, 2, cols = 2:ncol(homebase_psh), widths = "auto")
setColWidths(homebaseWorkbook, 3, cols = 2:ncol(homebase_rrh), widths = "auto")
setColWidths(homebaseWorkbook, 4, cols = 2:ncol(homebase_possibly_active), widths = "auto")
setColWidths(homebaseWorkbook, 5, cols = 2:ncol(homebase_housed), widths = "auto")
setColWidths(homebaseWorkbook, 6, cols = 2:ncol(homebase_assistToList), widths = "auto")
setColWidths(homebaseWorkbook, 7, cols = 2:ncol(reportDetails), widths = "auto")

# Format first column statically.  For some reason, 'auto' clips the PersonalID
setColWidths(homebaseWorkbook, 2, cols = 1:1, widths = 38)
setColWidths(homebaseWorkbook, 3, cols = 1:1, widths = 38)
setColWidths(homebaseWorkbook, 4, cols = 1:1, widths = 38)
setColWidths(homebaseWorkbook, 5, cols = 1:1, widths = 38)
setColWidths(homebaseWorkbook, 6, cols = 1:1, widths = 38)
setColWidths(homebaseWorkbook, 2, cols = 1:1, widths = 38)

# Format rows.
setRowHeights(homebaseWorkbook, 2, rows = 1:1, height = 40)
setRowHeights(homebaseWorkbook, 3, rows = 1:1, height = 40)
setRowHeights(homebaseWorkbook, 4, rows = 1:1, height = 40)
setRowHeights(homebaseWorkbook, 5, rows = 1:1, height = 40)
setRowHeights(homebaseWorkbook, 6, rows = 1:1, height = 40)
setRowHeights(homebaseWorkbook, 7, rows = 1:1, height = 40)

# Freeze the top row
freezePane(homebaseWorkbook, sheet = 2, firstRow = TRUE)
freezePane(homebaseWorkbook, sheet = 3, firstRow = TRUE)
freezePane(homebaseWorkbook, sheet = 4, firstRow = TRUE)
freezePane(homebaseWorkbook, sheet = 5, firstRow = TRUE)
freezePane(homebaseWorkbook, sheet = 6, firstRow = TRUE)

##################################
#### Add Graphs to Workbook ######
##################################  
print(rrhCountsGraph) #plot needs to be showing
insertPlot(homebaseWorkbook, 1, width = 7.87, height = 5.5, fileType = "png", units = "in")

# Openxlsx requires zip.exe to save the workbook.  Weird.
Sys.setenv("R_ZIPCMD" = "C:/Rtools/bin/zip.exe")
today <- Sys.Date()
print(outputPath)
saveWorkbook(homebaseWorkbook, paste(outputPath, "\\", "Homebase_v2_", today, ".xlsx", sep =""), overwrite = TRUE)
cat(getwd())

openXL(paste(outputPath, "\\", "Homebase_v2_", today, ".xlsx", sep = ""))

The whole thing relies on some R functions I've written to assist:

HMIS Functions

  # Thanks SO. 
# http://stackoverflow.com/questions/9341635/check-for-installed-packages-before-running-install-packages
pkgTest <- function(x)
{
  if (!require(x,character.only = TRUE))
  {
    install.packages(x,dep=TRUE)
    library(x, character.only =  TRUE)
    if(!require(x,character.only = TRUE)) stop("Package not found")
  } else {
    library(x, character.only =  TRUE)
  }
}

pkgTest("sqldf")
pkgTest("tcltk")
pkgTest("zoo")
pkgTest("XLConnect")

loadAffiliation <- function(path = getwd()){
  affiliation <- read.csv(paste(path, "/Affiliation.csv", sep = ""))
}

loadClient <- function(path = getwd()){
  client <- read.csv(paste(path, "/Client.csv", sep = ""))
}

loadDisabilities <- function(path = getwd()) {
  disabilities <- read.csv(paste(path,  "/Disabilities.csv", sep = ""))
}

loadEmployementEducation <- function(path = getwd()) {
  employementEducation <- read.csv(paste(path, "/EmploymentEducation.csv", sep = ""))
}

loadEnrollment <- function(path = getwd()) {
  enrollment <- read.csv(paste(path, "/Enrollment.csv", sep = ""))
}

loadEnrollmentCoc <- function(path = getwd()) {
  enrollmentCoc <- read.csv(paste(path, "/EnrollmentCoC.csv", sep = ""))
}

loadExit <- function(path = getwd()) {
  exit <- read.csv(paste(path, "/Exit.csv", sep = ""))
}

loadExport <- function(path = getwd()) {
  export <- read.csv(paste(path, "/Export.csv", sep = ""))
}

loadFunder <- function(path = getwd()){
  funder <- read.csv(paste(path, "/Funder.csv", sep = ""))
}

loadHealthAndDv <- function(path = getwd()){
  healthAndDv <- read.csv(paste(path, "/HealthAndDV.csv", sep = ""))
}

loadIncomeBenefits <- function(path = getwd()){
  incomeBenefits <- read.csv(paste(path, "/IncomeBenefits.csv", sep = ""))
}

loadInventory <- function(path = getwd()){
  inventory <- read.csv(paste(path, "/Inventory.csv", sep = ""))
}

loadOrganization <- function(path = getwd()){
  organization <- read.csv(paste(path, "/Organization.csv", sep = "")) 
}

loadProject <- function(path = getwd()){
  project <- read.csv(paste(path, "/Project.csv", sep = ""))
}

loadProjectCoc <- function(path = getwd()){
  projectCoc <- read.csv(paste(path, "/ProjectCoC.csv", sep = ""))
}

loadServices <- function(path = getwd()){
  services <- read.csv(paste(path, "/Services.csv", sep = ""))
}

loadSite <- function(path = getwd()){
  site <- read.csv(paste(path, "/Site.csv", sep = ""))
}

# loadHMISCsvs51 <- function(path) {
#   affiliation <- read.csv(paste(path, "/Affiliation.csv", sep = ""))
#   client <- read.csv(paste(path, "/Client.csv", sep = ""))
#   disabilities <- read.csv(paste(path,  "/Disabilities.csv", sep = ""))
#   employementEducation <- read.csv(paste(path, "/EmploymentEducation.csv", sep = ""))
#   enrollment <- read.csv(paste(path, "/Enrollment.csv", sep = ""))
#   exit <- read.csv(paste(path, "/Exit.csv", sep = ""))
#   export <- read.csv(paste(path, "/Export.csv", sep = ""))
#   funder <- read.csv(paste(path, "/Funder.csv", sep = ""))
#   healthAndDv <- read.csv(paste(path, "/HealthAndDV.csv", sep = ""))
#   incomeBenefits <- read.csv(paste(path, "/IncomeBenefits.csv", sep = ""))
#   inventory <- read.csv(paste(path, "/Inventory.csv", sep = ""))
#   organization <- read.csv(paste(path, "/Organization.csv", sep = ""))
#   project <- read.csv(paste(path, "/Project.csv", sep = ""))
#   projectCoc <- read.csv(paste(path, "/ProjectCoC.csv", sep = ""))
#   services <- read.csv(paste(path, "/Services.csv", sep = ""))
#   site <- read.csv(paste(path, "/Site.csv", sep = ""))
#   return(c(services, site))
# }



makeDestinationReadable <- function (df) {
  df <- exit
  df <- sqldf("SELECT *, Destination as 'ReadableDestination' FROM df")
  df$ReadableDestination[df$ReadableDestination == "1"] <- "Emergency shelter, including hotel or motel paid for with emergency shelter voucher"
  df$ReadableDestination[df$ReadableDestination == "2"] <- "Transitional housing for homeless persons (including homeless youth)"
  df$ReadableDestination[df$ReadableDestination == "3"] <- "Permanent housing for formerly homeless persons (such as: CoC project; or HUD legacy programs; or HOPWA PH)"
  df$ReadableDestination[df$ReadableDestination == "4"] <- "Psychiatric hospital or other psychiatric facility"
  df$ReadableDestination[df$ReadableDestination == "5"] <- "Substance abuse treatment facility or detox center"
  df$ReadableDestination[df$ReadableDestination == "6"] <- "Hospital or other residential non-psychiatric medical facility"
  df$ReadableDestination[df$ReadableDestination == "7"] <- "Jail, prison or juvenile detention facility"
  df$ReadableDestination[df$ReadableDestination == "8"] <- "Client doesn’t know"
  df$ReadableDestination[df$ReadableDestination == "9"] <- "Client refused"
  df$ReadableDestination[df$ReadableDestination == "10"] <- "Rental by client, no ongoing housing subsidy"
  df$ReadableDestination[df$ReadableDestination == "11"] <- "Owned by client, no ongoing housing subsidy"
  df$ReadableDestination[df$ReadableDestination == "12"] <- "Staying or living with family, temporary tenure (e.g., room, apartment or house)"
  df$ReadableDestination[df$ReadableDestination == "13"] <- "Staying or living with friends, temporary tenure (e.g., room apartment or house)"
  df$ReadableDestination[df$ReadableDestination == "14"] <- "Hotel or motel paid for without emergency shelter voucher"
  df$ReadableDestination[df$ReadableDestination == "15"] <- "Foster care home or foster care group home"
  df$ReadableDestination[df$ReadableDestination == "16"] <- "Place not meant for habitation (e.g., a vehicle, an abandoned building, bus/train/subway station/airport or anywhere outside)"
  df$ReadableDestination[df$ReadableDestination == "17"] <- "Other"
  df$ReadableDestination[df$ReadableDestination == "18"] <- "Safe Haven"
  df$ReadableDestination[df$ReadableDestination == "19"] <- "Rental by client, with VASH housing subsidy"
  df$ReadableDestination[df$ReadableDestination == "20"] <- "Rental by client, with other ongoing housing subsidy"
  df$ReadableDestination[df$ReadableDestination == "21"] <- "Owned by client, with ongoing housing subsidy"
  df$ReadableDestination[df$ReadableDestination == "22"] <- "Staying or living with family, permanent tenure"
  df$ReadableDestination[df$ReadableDestination == "23"] <- "Staying or living with friends, permanent tenure"
  df$ReadableDestination[df$ReadableDestination == "24"] <- "Deceased"
  df$ReadableDestination[df$ReadableDestination == "25"] <- "Long-term care facility or nursing home"
  df$ReadableDestination[df$ReadableDestination == "26"] <- "Moved from one HOPWA funded project to HOPWA PH"
  df$ReadableDestination[df$ReadableDestination == "27"] <- "Moved from one HOPWA funded project to HOPWA TH"
  df$ReadableDestination[df$ReadableDestination == "28"] <- "Rental by client, with GPD TIP housing subsidy"
  df$ReadableDestination[df$ReadableDestination == "29"] <- "Residential project or halfway house with no homeless criteria"
  df$ReadableDestination[df$ReadableDestination == "30"] <- "No exit interview completed"
  df$ReadableDestination[df$ReadableDestination == "99"] <- "Data not collected"

  df
}


makeProjectTypeReadable <- function (df) {
  df <- sqldf("SELECT *, ProjectType as 'ReadableProjectType' FROM df")
  df$ReadableProjectType[df$ReadableProjectType == "1"] <- "Emergency Shelter"
  df$ReadableProjectType[df$ReadableProjectType == "2"] <- "Transitional Housing"
  df$ReadableProjectType[df$ReadableProjectType == "3"] <- "PH - Permanent Supportive Housing"
  df$ReadableProjectType[df$ReadableProjectType == "4"] <- "Street Outreach"
  df$ReadableProjectType[df$ReadableProjectType == "5"] <- "Services Only"
  df$ReadableProjectType[df$ReadableProjectType == "6"] <- "Other"
  df$ReadableProjectType[df$ReadableProjectType == "7"] <- "Safe Haven"
  df$ReadableProjectType[df$ReadableProjectType == "8"] <- "PH – Housing Only"
  df$ReadableProjectType[df$ReadableProjectType == "10"] <- "PH – Housing with Services (no disability required for entry)"
  df$ReadableProjectType[df$ReadableProjectType == "11"] <- "Day Shelter"
  df$ReadableProjectType[df$ReadableProjectType == "12"] <- "Homelessness Prevention"
  df$ReadableProjectType[df$ReadableProjectType == "13"] <- "PH - Rapid Re-Housing"
  df$ReadableProjectType[df$ReadableProjectType == "14"] <- "Coordinated Assessment"

  df
}

makeRaceReadable <- function(client){
  client$AmIndAKNativ[client$AmIndAKNative == "1"] <- "American Indian or Alaska Native"
  client$Asian[client$Asian == "1"] <- "Asian"
  client$BlackAfAmerican[client$BlackAfAmerican == "1"] <- "Black or African American"
  client$NativeHIOtherPacific[client$NativeHIOtherPacific == "1"] <- "Native Hawaiian or Other Pacific Islander"
  client$White[client$White == "1"] <- "White"
  client$RaceNone[client$RaceNone == "8"] <- "Client doesnt know"
  client$RaceNone[client$RaceNone == "9"] <- "Client refused"
  client$RaceNone[client$RaceNone == "99"] <- "Data not collected"
  client
}

makeGenderReadable <- function(client) {

  # HMIS CSV 5.1
  # 0 Female
  # 1 Male
  # 2 Transgender male to female
  # 3 Transgender female to male
  # 4 Doesn’t identify as male, female, or transgender
  # 8 Client doesn’t know
  # 9 Client refused
  # 99 Data not collected

  client$Gender[client$Gender == "0"] <- "Female"
  client$Gender[client$Gender == "1"] <- "Male"
  client$Gender[client$Gender == "2"] <- "Transgender male to female"
  client$Gender[client$Gender == "3"] <- "Transgender female to male"
  client$Gender[client$Gender == "4"] <- "Doesn’t identify as male, female, or transgender"
  client$Gender[client$Gender == "8"] <- "Client doesn’t know"
  client$Gender[client$Gender == "9"] <- "Client refused"
  client$Gender[client$Gender == "99"] <- "Data not collected"

  client
}

makeVeteranStatusReadable <- function(client) {

  # HMIS CSV 5.1
  # 0 No
  # 1 Yes
  # 8 Client doesn’t know
  # 9 Client refused
  # 99 Data not collected

  client$VeteranStatus[client$VeteranStatus == "0"] <- "No"
  client$VeteranStatus[client$VeteranStatus == "1"] <- "Yes"
  client$VeteranStatus[client$VeteranStatus == "8"] <- "Client doesn’t know"
  client$VeteranStatus[client$VeteranStatus == "9"] <- "Client refused"
  client$VeteranStatus[client$VeteranStatus == "99"] <- "Data not collected"

  client
}

makeEthnicityReadable <- function(client) {

  # HMIS CSV 5.1
  # 0 Non-Hispanic/Non-Latino
  # 1 Hispanic/Latino
  # 8 Client doesn’t know
  # 9 Client refused
  # 99 Data not collected

  client$Ethnicity[client$Ethnicity == "0"] <- "Non-Hispanic/Non-Latino"
  client$Ethnicity[client$Ethnicity == "1"] <- "Hispanic/Latino"
  client$Ethnicity[client$Ethnicity == "8"] <- "Client doesn’t know"
  client$Ethnicity[client$Ethnicity == "9"] <- "Client refused"
  client$Ethnicity[client$Ethnicity == "99"] <- "Data not collected"

  client
}

combineRaceColumnsAndMakeReadable <- function(client){
  client$AmIndAKNativ[client$AmIndAKNative == "0"] <- ""
  client$Asian[client$Asian == "0"] <- ""
  client$BlackAfAmerican[client$BlackAfAmerican == "0"] <- ""
  client$NativeHIOtherPacific[client$NativeHIOtherPacific == "0"] <- ""
  client$White[client$White == "0"] <- ""  
  client$RaceNone[client$RaceNone == "0"] <- ""

  client <- makeRaceReadable(client)
  client$Race <- paste(client$AmIndAKNativ, client$Asian, client$BlackAfAmerican, client$NativeHIOtherPacific, client$White, client$RaceNone, sep = " ")
  client$Race <- gsub(" NA", "",client$Race)
  client <- client[,-which(names(client) == "AmIndAKNative")]
  client <- client[,-which(names(client) == "Asian")] 
  client <- client[,-which(names(client) == "BlackAfAmerican")]
  client <- client[,-which(names(client) == "NativeHIOtherPacific")]
  client <- client[,-which(names(client) == "White")]
  client <- client[,-which(names(client) == "RaceNone")]
  client
}

addDisabilityInfoToClient <- function(client, disabilities){

  # From HMIS CSV Programming Specifications 5.1
  # 5 = Physical disability
  clientsWithPhysicalDisability <- sqldf("SELECT *, 'Yes' As PhysicalDisability 
                                         FROM disabilities
                                         WHERE DisabilityType = 5
                                         AND DisabilityResponse = 1
                                         ")

  clientsWithPhysicalDisability <- getMostRecentRecordsPerId(clientsWithPhysicalDisability, "PersonalID", "InformationDate")

  client <- sqldf("SELECT a.*, b.PhysicalDisability
                  FROM client a
                  LEFT JOIN clientsWithPhysicalDisability b
                  ON a.PersonalID=b.PersonalID
                  ")

  client <- subset(client)

  # 6 Developmental disability
  clientDevelopmentalDisability <- sqldf("SELECT *, 'Yes' As DevelopmentalDisability 
                                         FROM disabilities
                                            WHERE DisabilityType = 6
                                            AND DisabilityResponse = 1
                                            ")

  clientDevelopmentalDisability <- getMostRecentRecordsPerId(clientDevelopmentalDisability, "PersonalID", "InformationDate")

  clientDevelopmentalDisability <- subset(clientDevelopmentalDisability)

  client <- sqldf("SELECT a.*, b.DevelopmentalDisability
                  FROM client a
                  LEFT JOIN clientDevelopmentalDisability b
                  ON a.PersonalID=b.PersonalID
                  ")

  client <- subset(client)

  # 7 Chronic health condition
  clientChronicHealthCondition <- sqldf("SELECT *, 'Yes' As ChronicHealthCondition 
                                         FROM disabilities
                                        WHERE DisabilityType = 7
                                        AND DisabilityResponse = 1
                                        ")

  clientChronicHealthCondition <- getMostRecentRecordsPerId(clientChronicHealthCondition, "PersonalID", "InformationDate")

  client <- sqldf("SELECT a.*, b.ChronicHealthCondition
                  FROM client a
                  LEFT JOIN clientChronicHealthCondition b
                  ON a.PersonalID=b.PersonalID
                  ")

  client <- subset(client)

  # 8 HIV/AIDS
  clientHivAids         <- sqldf("SELECT *, 'Yes' As 'HIV/AIDS' 
                                         FROM disabilities
                                        WHERE DisabilityType = 8
                                        AND DisabilityResponse = 1
                                        ")

  clientHivAids <- getMostRecentRecordsPerId(clientHivAids, "PersonalID", "InformationDate")

  client <- sqldf("SELECT a.*, b.'HIV/AIDS'
                  FROM client a
                  LEFT JOIN clientHivAids b
                  ON a.PersonalID=b.PersonalID
                  ")

  client <- subset(client)

  # 9 Mental health problem
  clientMentalHealthProblem <- sqldf("SELECT *, 'Yes' As MentalHealthProblem 
                                         FROM disabilities
                                     WHERE DisabilityType = 9
                                     AND DisabilityResponse = 1
                                     ")

  clientMentalHealthProblem <- getMostRecentRecordsPerId(clientMentalHealthProblem, "PersonalID", "InformationDate")

  client <- sqldf("SELECT a.*, b.MentalHealthProblem
                  FROM client a
                  LEFT JOIN clientMentalHealthProblem b
                  ON a.PersonalID=b.PersonalID
                  ")

  client <- subset(client)

  # 10 Substance abuse
  clientSubstanceAbuse      <- sqldf("SELECT *, 'Yes' As SubstanceAbuse 
                                         FROM disabilities
                                     WHERE DisabilityType = 10
                                     AND DisabilityResponse = 1
                                     ")

  clientSubstanceAbuse <- getMostRecentRecordsPerId(clientSubstanceAbuse, "PersonalID", "InformationDate")

  client <- sqldf("SELECT a.*, b.SubstanceAbuse
                  FROM client a
                  LEFT JOIN clientSubstanceAbuse b
                  ON a.PersonalID=b.PersonalID
                  ")

  client <- subset(client)

  client
}


makeDisabilityTypeReadable <- function(disabilities){
  # From HMIS CSV Programming Specifications 5.1
  # 5 Physical disability
  # 6 Developmental disability
  # 7 Chronic health condition
  # 8 HIV/AIDS
  # 9 Mental health problem
  # 10 Substance abuse
  disabilities$DisabilityType[disabilities$DisabilityType == "5"] <- "Physical disability"
  disabilities$DisabilityType[disabilities$DisabilityType == "6"] <- "Developmental disability"
  disabilities$DisabilityType[disabilities$DisabilityType == "7"] <- "PhysicalChronic health condition"
  disabilities$DisabilityType[disabilities$DisabilityType == "8"] <- "Chronic health condition"
  disabilities$DisabilityType[disabilities$DisabilityType == "9"] <- "HIV/AIDS"
  disabilities$DisabilityType[disabilities$DisabilityType == "10"] <- "Substance abuse"
  disabilities  
}

makeTrackingMethodReadable <- function(df) {
  df <- sqldf("SELECT *, ProjectType as 'ReadableTrackingMethod' FROM df")
  df$ReadableProjectType[df$ReadableProjectType == "0"] <- "Entry/Exit Date"
  df$ReadableProjectType[df$ReadableProjectType == "3"] <- "Night-by-Night"
}

# activeRecords <- activeFilter(df, "occStartDate", "occEndDate", "2017-01-23", '2017-01-26')
activeFilter <- function(df, dateVector1, dateVector2, beginRange, endRange){
  df[is.na(df)] <- ""
  df[dateVector1,] <- as.character(df[dateVector1,])
  df[dateVector2,] <- as.character(df[dateVector2,])
  str <- paste("SELECT * FROM df WHERE (", dateVector1, "< '", endRange, "' AND ", dateVector2, " = '') OR (", dateVector1, "< '", endRange, "' AND ", dateVector2, " > '", beginRange, "')", sep = "")
  sqldf(str)
}

# Get active Enrollments
getActiveHudEnrollments <- function(enrollment, exit, project){
  project <- makeProjectTypeReadable(project)

  project <- subset(project)

  activeEnrollment <- sqldf("SELECT *
                            FROM enrollment a
                            LEFT JOIN exit b
                            ON a.ProjectEntryID=b.ProjectEntryID
                            WHERE b.ProjectEntryID IS NULL")


  activeEnrollment <- subset(activeEnrollment)

  activeEnrollmentsWithProjectInfo <- sqldf("SELECT b.ProjectType, b.ReadableProjectType, b.ProjectID, b.ProjectName, a.*
                                            FROM activeEnrollment a
                                            INNER JOIN project b
                                            ON a.ProjectID=b.ProjectID
                                            ")

  activeEnrollmentsWithProjectInfo <- subset(activeEnrollmentsWithProjectInfo)
  activeEnrollmentsWithProjectInfo
}


addProjectInfoToEnrollment <- function(enrollment, project){
  project <- makeProjectTypeReadable(project)
  enrollmentsWithProjectInfo <- sqldf("SELECT a.*, b.ProjectType, b.ReadableProjectType, b.ProjectName, b.TrackingMethod
                                            FROM enrollment a
                                            INNER JOIN project b
                                            ON a.ProjectID=b.ProjectID
                                            ")

  enrollmentsWithProjectInfo <- subset(enrollmentsWithProjectInfo)
  enrollmentsWithProjectInfo
}


getMostRecentRecordsPerId <- function(df, idHeader, dateHeader){
  str <- paste("SELECT *, MAX(", dateHeader, "), 'Yes' As Max", dateHeader, " FROM df GROUP BY ", idHeader, sep = "")
  bfr <- sqldf(str)
  sqldf(str)
}

getClientsInPH <- function(enrollment, exit, project) {

  ### START ###
  #############################################
  ##### Active in PH Projects #################
  ############## Incomplete ###################
  #############################################

  project <- makeProjectTypeReadable(project)

  project <- subset(project)

  activeEnrollment <- sqldf("SELECT *
                            FROM enrollment a 
                            LEFT JOIN exit b
                            ON a.ProjectEntryID=b.ProjectEntryID
                            WHERE b.ProjectEntryID IS NULL")


  activeEnrollment <- subset(activeEnrollment)

  activeEnrollmentsWithProjectInfo <- sqldf("SELECT b.ProjectType, b.ReadableProjectType, b.ProjectID, b.ProjectName, a.*
                                            FROM activeEnrollment a
                                            INNER JOIN project b
                                            ON a.ProjectID=b.ProjectID
                                            ")

  activeEnrollmentsWithProjectInfo <- subset(activeEnrollmentsWithProjectInfo)

  # PSH = 3
  # RRH = 13
  clientsActiveInPh <- sqldf("SELECT PersonalID, 'Yes' As 'ActiveInPH'
                             FROM activeEnrollmentsWithProjectInfo
                             WHERE ProjectType = 3
                             OR ProjectType = 13
                             ")

  clientsActiveInPh <- subset(clientsActiveInPh)
  clientsActiveInPh
}

getClientsInPHWithinRange <- function(enrollment, exit, project, beginDate, endDate) {

  enrollment$EntryDate <- as.character(enrollment$EntryDate)
  exit$ExitDate <- as.character(exit$ExitDate)

  project <- makeProjectTypeReadable(project)

  project <- subset(project)

  enrollmentWithProjectInfo <- sqldf("SELECT b.ProjectType, b.ReadableProjectType, b.ProjectID, b.ProjectName, a.*
                                     FROM enrollment a
                                     INNER JOIN project b
                                     ON a.ProjectID=b.ProjectID
                                     ")

  enrollmentWithProjectInfo <- subset(enrollmentWithProjectInfo)

  str <- paste("SELECT * FROM enrollmentWithProjectInfo WHERE (ProjectType = 3 OR ProjectType = 13) AND EntryDate < '", beginDate, "'", sep = "")

  phEnrollment <- sqldf(str)

  str2 <- paste("SELECT * FROM exit WHERE ExitDate < '", endDate, "'", sep = "")
  relevantExit <- sqldf(str2)

  activeEnrollment <- sqldf("SELECT *
                            FROM phEnrollment a
                            LEFT JOIN relevantExit b
                            ON a.ProjectEntryID=b.ProjectEntryID
                            WHERE b.ProjectEntryID IS NULL")

  ######################################################################
  ######################################################################
  ######################################################################
  ###################### Left off Here #################################
  ######################################################################
  ######################################################################
  ######################################################################
}

getWeeksBetween <- function(beginDate, endDate){
  difftime(strptime(endDate, format = "%Y-%m-%d"), strptime(beginDate, format = "%Y-%m-%d"), units="weeks") 
}

getMonthsBetween <- function(beginDate, endDate){
  (as.yearmon(strptime(endDate, format = "%Y-%m-%d"))-
     as.yearmon(strptime(beginDate, format = "%Y-%m-%d")))*12
}

getQuartersBetween <- function(beginDate, endDate){
  (as.yearqtr(strptime(endDate, format = "%Y-%m-%d"))-
     as.yearqtr(strptime(beginDate, format = "%Y-%m-%d")))*4
}

getYearsBetween <- function(beginDate, endDate){
  (as.yearmon(strptime(endDate, format = "%Y-%m-%d"))-
     as.yearmon(strptime(beginDate, format = "%Y-%m-%d")))
}

addChronicallyHomelessFlagToClient <- function(client, enrollment){
  #############################################
  ##### Get those with Disabling Condition ###
  #############################################
  enrolledWithDisability <- sqldf("SELECT *
                              FROM enrollment
                              WHERE DisablingCondition = 1")

  #############################################
  ##### Length-of-Stay ########################
  #############################################
  # Participants who meet the length-of-stay in homelessness requirement
  # Either through four or more occurences with cumulative duration exceeding a year
  # Or a consequtive year.
  #                 113 = "12 Months"
  #                 114 = "More than 12 Months"
  chronicityDf <- sqldf("SELECT *
                        FROM enrolledWithDisability
                        WHERE (TimesHomelessPastThreeYears = 4
                          AND (
                            MonthsHomelessPastThreeYears = 113
                            OR MonthsHomelessPastThreeYears = 114)
                          )
                            OR (CAST(JULIANDAY('now') - JULIANDAY(DateToStreetESSH) AS Integer) > 364
                            AND (DateToStreetESSH != '')
                          )
                        ")

  #############################################
  ##### Chronically Homeless ##################
  #############################################
  # Take the distinct PersonalIDs of individuals who meet both chronicity
  # and disabling condition.

  chClient <- sqldf("SELECT DISTINCT(PersonalID), 'Yes' As 'ChronicallyHomeless'
                        FROM chronicityDf
                        ")

  # Get client info for chronically homeless.
  chClient <- sqldf("SELECT a.*, b.'ChronicallyHomeless'
                    FROM client a
                    LEFT JOIN chClient b
                    ON a.PersonalID=b.PersonalID
                    ")

  chClient <- subset(chClient)

  chClient
}