Sampling Large Data

This R script draws a random sample of rows from a dataframe. It is helpful when you are iteratively developing a script that will eventually run against a large dataframe. Testing each iteration on a sample takes far less time, and because the rows are chosen at random, the results stay realistic.

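    options(java.parameters = "-Xmx14336m")  ## give the JVM 14 GB; set before XLConnect loads
    library("XLConnect")

    ## Read the first worksheet of the workbook into a dataframe
    df <- readWorksheetFromFile("Data_X.xlsx", sheet = 1, startRow = 1)

    ## Pick up to 30000 row indices at random, without replacement
    sampleSize <- min(30000, nrow(df))
    sampleVector <- sample(seq_len(nrow(df)), sampleSize)
    df2 <- df[sampleVector, ]

    write.csv(df2, file = "Sample of Data_X (30000).csv", na = "")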
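
If you sample often, the same idea can be wrapped in a small helper. The following is only a sketch: `sample_rows`, its `seed` argument, and the `demo` dataframe are hypothetical names introduced here for illustration and are not part of the original script. Setting a seed is optional, but it makes each test iteration see the same rows.

```r
# Hypothetical helper (not from the original script): return up to `n`
# randomly chosen rows of `df`, sampled without replacement.
sample_rows <- function(df, n, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # optional: makes the sample repeatable
  n <- min(n, nrow(df))               # never ask for more rows than exist
  df[sample.int(nrow(df), n), , drop = FALSE]
}

# Small stand-in dataframe so the sketch runs without Data_X.xlsx
demo <- data.frame(id = 1:100, value = (1:100) * 2)

small <- sample_rows(demo, 10, seed = 42)
nrow(small)
```

Capping `n` at `nrow(df)` means the helper simply returns every row, in shuffled order, when the dataframe is smaller than the requested sample, instead of stopping with an error.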
    
