Well, bummer. I really needed them.
On top of it, I only had read access to the database without the ability to create a stored procedure. Out of desperation, I found myself researching the possibility of creating my own functions using MySQL User Variables.
Slightly tweaking Dante, “Abandon all normal, ye who enter here.” User variables are weird.
I’ve spent some time researching on the interweb and about the best article I’ve found on the subject is:
Which focuses on getting the desired behavior from user variables rather than understanding them. This article is going to stick with the same philosophy–I don’t need to necessarily understand them, however, I want to be able to predict their behavior.
One word of warning, the above article is not really a suggessted reading before this article–it’s more of required reading. If you don’t know how to force the user variables to be evaluated when you need them, then the results are unpredictable.
The TL;DR version: Order of operations matter a lot in user variables and wrap the user variable in a subquery or function to force evaluation.
At this bottom of the article I’ve included the data used in this article. You can insert it into a MySQL or MariaDB database and follow along. The goal is to convert these data into a
stop_date which would greatly reduce the storage needs.
id 1 the
stop_date equivalents would look like:
We want to end up with a table like below.
To transform the data into this table it’s important to know user variables can hold a value from one row to the next.
This should produce the following table:
Pretty cool, right? Now, if only we could get the
row_number to reset whenever the
id changes. No worries, let’s use another variable to store the
id from the previous row so we can compare it to the current.
This should give us the following output:
calc2 are not values you need. They are merely calculations used to reset the
row_number whenever the
id changes. Hmm, this is interesting–and, hopefully, you can see it has many possibilities.
Now, let’s go back and think about our problem a bit more.
We can save a value from one row to the next. Therefore, detecting the breaks in a range of attendance dates can be obtained by comparing the current row’s
date value to the previous row’s
date value. If the previous row is greater than the current row minus one, then we know there was a break in the range.
This should give the following table:
The reason I state “should”, if you modify the order of the user variables, it’ll break. If you change the
ORDER BY, it’ll break. If you add a
HAVING clause, it’ll break. Pretty much, it’s as fragile a query as they come.
However, the clever bunch probably see where we are going with this. Now, it’s simply a matter of taking the
MAX() of of
group by the
Which should provide us with output like:
And there we go. Not too amazing, but I couldn’t find this answer by Googling, so I figure I’d add it to the great Wiki in the Sky.