R Function to Split CSVs

This article is part of a series.

Part 1 – Attachment III, aka, The Zombie
Part 2 – This Article
Part 3 – Shaping and Combining HMIS Data from ETO
Part 4 – Splitting Program Data
Part 5 – Sampling Large Data
Part 6 – Identifying Chronically Homeless and Veteran Participants throughout a COC
Part 7 – JPS DSRIP Report V2.0
Part 8 – Coordinated Entry By-Name-List using HMIS CSV 5.1, R, and SQL
Part 9 – Veteran's Report 2.0
Part 10 – Choropleth and Heatmaps for HMIS Data
Part 11 – Annualized Count
Part 12 – Stitching Together HMIS Exports

This is an R function that splits a dataset into chunks of a given size and writes each chunk to its own CSV file. Our office often needs a quick way to split files for upload, since our HMIS software doesn't handle large uploads well.

For example:

splitDataAndWriteFiles(df, 500, "My_Data")

This produces a set of files named "My_Data_X.csv", where X is the chunk number (starting at 0) and each file holds up to 500 rows.
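The number of files follows from the row count and the chunk size. Here is a minimal sketch of that arithmetic, assuming a hypothetical data frame of 1,200 rows. The row count and variable names are illustrative and not taken from the original data.

```r
# Hypothetical example: 1,200 rows split into chunks of 500.
totalRows <- 1200
chunkSize <- 500

# One file per full chunk, plus one more for any leftover rows.
numberOfFiles <- ceiling(totalRows / chunkSize)

# Intended file names, numbered from 0 like the function's counter.
fileNames <- paste0("My_Data_", seq_len(numberOfFiles) - 1, ".csv")

print(numberOfFiles)
print(fileNames)
```

With these assumed numbers, the first two files would hold 500 rows each and the last would hold the remaining 200.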

options(java.parameters = "-Xmx14336m")  ## memory set to 14 GB
library("XLConnect")

# Split a data frame into chunks of chunkSize rows and write each chunk to a CSV.
splitDataAndWriteFiles <- function(df, chunkSize, nameOfFiles) {
  success <- FALSE
  count <- 0
  while (!success) {
    # First and last row of this chunk; the final chunk stops at the last row of df.
    firstRow <- (count * chunkSize) + 1
    lastRow <- min((count + 1) * chunkSize, nrow(df))
    print(paste0(firstRow, "_", lastRow))
    chunk <- df[firstRow:lastRow, , drop = FALSE]
    # paste0() avoids the spaces that paste() inserts, giving e.g. "My_Data_0.csv".
    fileName <- paste0(nameOfFiles, "_", count, ".csv")
    write.csv(chunk, file = fileName, na = "", row.names = FALSE, fileEncoding = "UTF-8")
    count <- count + 1
    # Stop once every row has been written.
    success <- (count * chunkSize) >= nrow(df)
  }
  return(success)
}
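
To see which rows land in which file, the chunk boundaries can be sketched on their own with base R's split(). This is an illustrative alternative way to group row numbers, not the method the function uses, and the 12-row example and variable names are assumptions.

```r
# Hypothetical example: 12 row numbers grouped into chunks of 5.
rowIndex <- seq_len(12)
chunkSize <- 5

# Rows 1-5 get chunk id 1, rows 6-10 get id 2, and so on.
chunkId <- ceiling(rowIndex / chunkSize)

# split() returns one vector of row numbers per chunk id.
chunks <- split(rowIndex, chunkId)
print(chunks)
```

In this sketch the final chunk holds only the leftover rows, so no chunk reaches past the end of the data.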

fileToSplit <- read.csv("UPLOAD -- Sal Men-- TCES Move -- TSA Bed Data Template.csv")

splitDataAndWriteFiles(fileToSplit, 5000, "Sal_Men_NBN")
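
Before uploading, it can be worth checking that the chunk files together hold as many rows as the original. The sketch below is one possible check, not part of the original workflow. The helper countRowsInChunks, the "Demo" prefix, and the temporary directory are all hypothetical, and the prefix is treated as a regular expression, so special characters would need escaping.

```r
# Hypothetical helper: count the rows across all CSV files sharing a prefix.
countRowsInChunks <- function(prefix, directory = ".") {
  files <- list.files(directory,
                      pattern = paste0("^", prefix, ".*\\.csv$"),
                      full.names = TRUE)
  sum(vapply(files, function(f) nrow(read.csv(f)), integer(1)))
}

# Self-contained demonstration in a fresh temporary directory.
demoDir <- tempfile("chunk_check")
dir.create(demoDir)
original <- data.frame(id = 1:7, value = letters[1:7])
write.csv(original[1:5, ], file.path(demoDir, "Demo_0.csv"), row.names = FALSE)
write.csv(original[6:7, ], file.path(demoDir, "Demo_1.csv"), row.names = FALSE)

# Compare the rows found in the chunk files with the original row count.
rowsWritten <- countRowsInChunks("Demo", demoDir)
print(rowsWritten == nrow(original))
```

For the real files, the same helper could be pointed at the working directory with the prefix passed to splitDataAndWriteFiles.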