Sub-queries are just standard queries which are nested inside of other queries. Sub queries are generally isolated and can’t access the parent queries, but certain circumstances allow for us to access parent query data.
I came across a need for such a circumstance when I was trying find missing monthly data for which relate to objects.
When trying to find, this missing data, it sounds easy in your head: “Find the first one which is newer than you”. Once you started the planning you realise that your normal “JOIN” just isn’t going to cut it, especially if you have multiple data rows missing one after the other.
Below you will find a quick write up of some example tables:
Table – Monthly Income
|MonthlyIncomeId||Int, Not Null, Auto Increment|
|StoreId||Int, Not Null|
|Income_date||DateTime2, Not Null|
Table – Store
|StoreId||Int, Not Null, Auto Increment|
|PhoneNumber||nVarchar(15), Not Null|
If you’ve looked over the tables, you’ll have noticed that they are very simple in design. The table [store] isn’t needed, but for some reason I think it helps a mind to work around a problem when you can connect data to the object, which in this case are the stores to which the monthly incomes are related to.
If you look at the monthly income table you’ll notice that “Income_date” isn’t a DateTime, but rather a Int. I’ve done this so that it’s easier to connect the data later on.
The style and data we will be using can be found in the table below:
|201701 — January|
|201702 — February|
|201703 — March|
|201704 — April|
|201707 — July|
|201712 — December|
We could try to join on to “monthly_income” tables like so:
This would run perfectly well, but we’d be missing “April”, “July” because they wouldn’t be able to join on anything.
We could also do a “left join”, but that would only show that the month following the date found was missing and we wouldn’t know about the item which is missing after the first:
|15||1||201704||679211||Null||Null||Null||Null||← We can assume May is missing|
|30||1||201707||106212||Null||Null||Null||Null||← We can assume August is missing|
Obviously with a table this small you could simply use your eyes, but imagine if you had thousands of lines… you simply couldn’t look through it, and so you may look for next_date rows which contain nulls:
Ignoring the fact that if [first_date].[income_date] was December and I incremented by 1, that I’d end up with a date that didn’t make sense, we can see that it just doesn’t cut it and this is exactly where our “special” circumstance can come with handy.
When your joining a table, you can use a sub-query in the “ON” segment of the code, which before I’d used it couldn’t wrap my head around the use. Another great thing about those special sub-queries is that you can also order them! Which is as such unique to “ON JOIN” but rather the fact that we will be using “TOP” in our select statement.
Now we should get the following results:
|15||1||201704||679211||30||1||201707||106212||← We can assume May is missing|
|30||1||201707||106212||55||1||201712||156321||← We can assume August is missing|
From this we can quickly change the query to do a check and filter out the ones we don’t want, and change the SELECT segment to something a little more useful:
Now we can see the next available date, and how many months’ worth of data are missing!
Now we have seen the results, let’ look over the query to see what just happened line by line (starting from the join):
This part is quite obvious, we’ve just linked both of the table version together on the storied
This is where the fun starts. As we’ve used an equals sign we only need 1 result, and make sure we only get the 1 result we use “TOP 1”, which just returns the top most data row. If you never put “TOP 1”.
This part isn’t anything mind blowing we are simply calling the monthly income table and giving it reference name of “specialquery”
This part is what makes the “TOP 1” part so import, we tell the query to pull anything from the table with column income_date which has a greater value than first_date’s income_date… if this isn’t the last month your going to potentially get a whole wack of dates linking to this table. Another important part of this query, this sub-query is that’s referencing its parent query!
The order may not be required but because I want to make sure that I only return the next available date after first_date’s income_date I order it by specialqueries’ income_date ascendingly.