This article is part of a series.
- Part 1 - Attachment III, aka, The Zombie
- Part 2 - R Function to Split CSVs
- Part 3 - Shaping and Combining HMIS Data from ETO
- Part 4 - Splitting Program Data
- Part 5 - This Article
- Part 6 - Identifying Chronically Homeless and Veteran Participants throughout a COC
- Part 7 - JPS DSRIP Report V2.0
- Part 8 - Coordinated Entry By-Name-List using HMIS CSV 5.1, R, and SQL
- Part 9 - Veteran's Report 2.0
- Part 10 - Choropleth and Heatmaps for HMIS Data
- Part 11 - Annualized Count
- Part 12 - Stitching Together HMIS Exports
This R snippet draws a random sample of rows from a dataframe. That is helpful when developing a script that will eventually run against a large dataframe: writing the script is iterative, and every test run against the full data is slow. Working on a sample shortens each test iteration while still exercising the script on realistic data.
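## Set the Java heap size before XLConnect is loaded (14336 MB = 14 GB)
options(java.parameters = "-Xmx14336m")
library("sqldf")     ## not needed for the sampling itself
library("XLConnect") ## provides readWorksheetFromFile()
library("tcltk")     ## not needed for the sampling itself
## Read the first worksheet of the workbook into a dataframe
df <- readWorksheetFromFile("Data_X.xlsx", sheet = 1, startRow = 1)
## Pick up to 30,000 row numbers at random, without replacement.
## min() avoids an error when the dataframe has fewer than 30,000 rows.
sampleVector <- sample(seq_len(nrow(df)), min(30000, nrow(df)))
## Keep only the sampled rows
df2 <- df[sampleVector, ]
## Save the sample; missing values are written as empty cells
write.csv(df2, file = "Sample of Data_X (30000).csv", na = "")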
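
The same steps could be wrapped in a reusable function. The sketch below is only one possible shape, not part of the original script: the name `sample_rows`, its `seed` argument, and the `demo` dataframe are hypothetical and exist purely for illustration. Passing a seed makes the sample repeatable, which can help when comparing results between test iterations.

```r
# Hypothetical helper (name and arguments are assumptions, not from the original script).
# Returns up to n randomly chosen rows of df, sampled without replacement.
sample_rows <- function(df, n, seed = NULL) {
  # Optionally fix the random seed so the same rows are drawn on every run
  if (!is.null(seed)) set.seed(seed)
  # Never ask for more rows than the dataframe has
  n <- min(n, nrow(df))
  # drop = FALSE keeps the result a dataframe even if df has a single column
  df[sample(seq_len(nrow(df)), n), , drop = FALSE]
}

# Made-up demonstration data standing in for a large export
demo <- data.frame(id = 1:100, value = rnorm(100))

# Draw a repeatable sample of 10 rows
small <- sample_rows(demo, 10, seed = 42)
print(nrow(small))
```

With a real export, the call would look like `df2 <- sample_rows(df, 30000)`, followed by the same `write.csv()` step as above.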