Basics: Lambda Linq

Introduction

As there are many introductions out there that explain what LINQ and lambdas are, I’m not going to delve to deeply into that subject. Essentially, they are quick hand ways of writing instructions on lists. The most common of those instruction are filtering, selecting and ordering.

This tutorial has been written with a combination of LINQ and lambda. The LINQ is structured closer to that of SQL, but using lambda we get an even more streamlined language to work with.

I will be dealing with IEnumerable’s in this tutorial, but you can use those commands on most Arrays / Lists. Once a command has been returned though it will be as a IEnumerable.

Example – Filtering

We have a list of numbers and we only want the ones which are greater than 5.

Filtering – Without LINQ / Lambda

 private static void Main()

        {

            IEnumerable<int> numbers = new[] {9, 3, 7, 4, 8, 1, 1, 5};

            numbers = GreaterThanFive(numbers);

        }

 

        private static IEnumerable<int> GreaterThanFive(IEnumerable<int> numbers)

        {

            foreach (var number in numbers)

            {

                if (number > 5) yield return number;

            }

        }

Filtering – With LINQ / Lambda

                 private static void Main(string[] args)

        {

            IEnumerable<int> numbers = new[] {9, 3, 7, 4, 8, 1, 1, 5};

 

            numbers = numbers.Where(number => number > 5);

        }

Looking at the non LINQ
/ Lambda compared to the one which uses LINQ / Lambda, you can instantly see there are less lines to the code. Now having less lines and being more compressed means nothing if you lose the readability but have a look, it’s actually more readable!

As this is the first of the tutorial I will run through the structure next of a LINQ / Lambda statement

Structure

Explaining the structure of LINQ / Lambda is where I found that most people explained them poorly.

Let use the Linq / Lambda from
Example 1: numbers.Where(number => number > 5)

Key Description
 numbers Is the list of items we will be using
 Where States we want to filter out items from the list which are false in the statement. Which in our case is “number > 5”
number The first appearance of the word “number” is for stating what each item should be named when working on it. It is exactly the same as doing a foreach loop:”var number in numbers

The list doesn’t need to be supplied as the Where method is called from the list.

 Numbers Is the list of items we will be using
 Numbers Is the list of items we will be using

I’m going explain the naming of each item a little further: “number => number > 5”

You must understand that the name can be anything! This is the part that scrambled my brain for so long and there is a reason why it scrambled my brain for so long. You’ll find that most people will constantly use the letter ‘o’ as the name. So when I first came across it you could be reading a list of numbers / letters / words and trying to filter, but everyone would have something like the following in their
tutorials:

words.Where(o => o.StartsWith(“L”))

numbers.Where(o => o % 2)

Now personally I think while your learning, you should stick with full names like I have in the example. Once you are comfortable you should
move to single character names (you don’t have to). Even then it’ll be beneficial to use a letter which correlates to the what the item is (you can nest LINQ / Lambda statements, when dealing with multi-dimensional arrays or classes within classes and getting used to have unique names means your less likely to run into naming conflicts).

Ordering

The example covered how to filtering, now we are going to deal with ordering. There is a reason why I’m dealing with them in this order and that is that I personally filtering easier than ordering and ordering easier than selecting to explain and understand.

If you’ve run the code from example 1 you’ll notice the numbers are output in the following order: 9, 7, 8

Ordering is a very simple process and very similar to the “Where” method.

Ordering in ascending order:

numbers.OrderBy(number => number)

Ordering in descending order:

numbers.OrderByDescending(number => number)

If we were dealing with classes we’d select a property on the second “number”.

The are another two features of LINQ / Lambda that it’s about time I brought up. The first of those features is chaining and what this means is that instead of having to have:

numbers = numbers.Where(o => o > 5);

numbers = numbers.OrderBy(o => o);

We can put them as a single line:

numbers = numbers.Where(o => o > 5).OrderBy(o => o);

We are allowed to do this because LINQ / Lambda commands returns a list of type IEnumerable<T> (in this case it’s a IEnumerable<int> list).

Now for the other piece of information, which is related to the ordering. When you have multiple properties, you may wish to order by one property and then another. In those cases you use the “ThenBy” or “ThenByDescending” method and not by calling the “OrderBy” or “OrderByDescending” method.

Selecting

Select allows you to either only select specific properties of the list of objects you have, or return something with a calculation done to it. First I’ll show you an example dealing with the numbers list and then I’ll show you another example, the second being a little more difficult than the first.

For this example, we are going to multiple each number in the numbers variable by 5. The result will be that we’ll have a list of the results.

IEnumerable<int> numbers = new[] {9, 3, 7, 4, 8, 1, 1, 5};

numbers = numbers.Select(number => number * 5); // Results: 45, 15, 35, 20, 40, 5, 5, 25

The next example will be a list of cars and the select statement will return a list of their names only.

 private class Car

        {

            public string Name { get; set; }

            public int ReleaseYear { getset; }

            public int CostInDollars { getset; }

        }

        private static void Main(string[] args)

        {

        IEnumerable<Car> cars = new[]

            {

                new Car { CostInDollars = 120000, Name = “Wonky Wagon”, ReleaseYear = 1973 },

                new Car { CostInDollars = 90000, Name = “Comfy Steel”, ReleaseYear = 1998 },

                new Car { CostInDollars = 300000, Name = “Compensating Car”, ReleaseYear = 2018 }

            };

            var carNames = cars.Select(car => car.Name);

        }

The first thing you should note is that I haven’t a clue about cars 😊. The second is that this time, we’ve had to
create a new list container for the results. This is because what we’ve selected isn’t of the same type of list. Our first select, simply got each
number and timed it by 5. So we where running the query using a list of numbers and returning a list of numbers. Here we are using a list of cars and returning a list of strings / names. You could create a new object of car in the “Select” and return them and then you’d be able to
use the same container… but that’s a little past the point of this tutorial.

The rest is pretty easy to understand. We are select the name of each object in the cars list.

There are many more methods for LINQ / Lambda out there and if your interested I’d fully recommend learning the other methods. Hopefully this tutorial has given you the understanding of how to use them. I use the following site as a reference, as it has clean and
relatively easier following examples: http://linqsamples.com/

Sub-queries are just standard queries which are nested inside of other queries. Sub queries are generally isolated and can’t access the parent queries, but certain circumstances allow for us to access parent query data.
I came across a need for such a circumstance when I was trying find missing monthly data for which relate to objects.
When trying to find, this missing data, it sounds easy in your head: “Find the first one which is newer than you”. Once you started the planning you realise that your normal “JOIN” just isn’t going to cut it, especially if you have multiple data rows missing one after the other.

Below you will find a quick write up of some example tables:

Table – Monthly Income
Name Properties
MonthlyIncomeId Int, Not Null, Auto Increment
StoreId Int, Not Null
Income_date DateTime2, Not Null
Table – Store
Name Properties
StoreId Int, Not Null, Auto Increment
Name NVarchar(200)
Address NVarchar(512)
PhoneNumber nVarchar(15), Not Null

If you’ve looked over the tables, you’ll have noticed that they are very simple in design. The table [store] isn’t needed, but for some reason I think it helps a mind to work around a problem when you can connect data to the object, which in this case are the stores to which the monthly incomes are related to.
If you look at the monthly income table you’ll notice that “Income_date” isn’t a DateTime, but rather a Int. I’ve done this so that it’s easier to connect the data later on.

The style and data we will be using can be found in the table below:

“yyyymm”


201701 — January
201702 — February
201703 — March
201704 — April
201707 — July
201712 — December

We could try to join on to “monthly_income” tables like so:

SELECT *

FROM monthly_income AS first_date

JOIN monthly_income AS next_date ON next_date.income_date = first_date.income_date

AND next_date.storeid = first_date.storeid

WHERE first_date.storeid = 1

This would run perfectly well, but we’d be missing “April”, “July” because they wouldn’t be able to join on anything.

We could also do a “left join”, but that would only show that the month following the date found was missing and we wouldn’t know about the item which is missing after the first:

First_date


Next_date


Monthly_incomeid


storeid


Income_date


Income


Monthly_incomeid


storeid


Income_date


Income


1 1 201701 901755 5 1 201702 875871
5 1 201702 875871 10 1 201703 895439
10 1 201703 895439 15 1 201704 679211
15 1 201704 679211 Null Null Null Null ← We can assume May is missing
30 1 201707 106212 Null Null Null Null ← We can assume August is missing

Obviously with a table this small you could simply use your eyes, but imagine if you had thousands of lines… you simply couldn’t look through it, and so you may look for next_date rows which contain nulls:

SELECT first_date.*, ISNULL(next_date.income_date, first_date.Income_date + 1)
FROM monthly_income AS first_date
JOIN monthly_income AS next_date ON next_date.income_date = first_date.income_date

AND next_date.storeid = first_date.storeid

WHERE first_date.storeid = 1
AND next_date.monthly_incomeid IS NULL

Ignoring the fact that if [first_date].[income_date] was December and I incremented by 1, that I’d end up with a date that didn’t make sense, we can see that it just doesn’t cut it and this is exactly where our “special” circumstance can come with handy.

When your joining a table, you can use a sub-query in the “ON” segment of the code, which before I’d used it couldn’t wrap my head around the use. Another great thing about those special sub-queries is that you can also order them! Which is as such unique to “ON JOIN” but rather the fact that we will be using “TOP” in our select statement.

SELECT *
FROM monthly_income AS first_date
JOIN monthly_income AS next_date ON next_date.storeid = first_date.storeid

AND next_date.income_date = (SELECT TOP 1 income_date
FROM monthly_income AS specialquery

WHERE specialquery.income_date > first_date.income_date
AND first_date.storeid = 1
ORDER BY specialquery.income_date)

Now we should get the following results:

First_date


Next_date


Monthly_incomeid


storeid


Income_date


Income


Monthly_incomeid


storeid


Income_date


Income


1 1 201701 901755 5 1 201702 875871
5 1 201702 875871 10 1 201703 895439
10 1 201703 895439 15 1 201704 679211
15 1 201704 679211 30 1 201707 106212 ← We can assume May is missing
30 1 201707 106212 55 1 201712 156321 ← We can assume August is missing

From this we can quickly change the query to do a check and filter out the ones we don’t want, and change the SELECT segment to something a little more useful:

SELECT first_date.*, next_date.income_date AS [Next Available Date], next_date.income_date – first_date.income_date [Missing Months]

FROM monthly_income AS first_date
JOIN monthly_income AS next_date ON next_date.storeid = first_date.storeid

AND next_date.income_date = (SELECT TOP 1 income_date
FROM monthly_income AS specialquery
WHERE specialquery.income_date > first_date.income_date
ORDER BY specialquery.income_date)

WHERE first_date.storeid = 1
AND next_date.income_date > (first_date.income_date + 1) – – This will filter out any next dates which are only 1 month ahead of the first income date.

Now we can see the next available date, and how many months’ worth of data are missing!

Now we have seen the results, let’ look over the query to see what just happened line by line (starting from the join):

JOIN monthly_income AS next_date ON next_date.storeid = first_date.storeid

This part is quite obvious, we’ve just linked both of the table version together on the storied

AND next_date.income_date = (SELECT TOP 1 income_date

This is where the fun starts. As we’ve used an equals sign we only need 1 result, and make sure we only get the 1 result we use “TOP 1”, which just returns the top most data row. If you never put “TOP 1”.

FROM monthly_income AS specialquery

This part isn’t anything mind blowing we are simply calling the monthly income table and giving it reference name of “specialquery”

WHERE specialquery.income_date > first_date.income_date

This part is what makes the “TOP 1” part so import, we tell the query to pull anything from the table with column income_date which has a greater value than first_date’s income_date… if this isn’t the last month your going to potentially get a whole wack of dates linking to this table. Another important part of this query, this sub-query is that’s referencing its parent query!

ORDER BY specialquery.income_date)

The order may not be required but because I want to make sure that I only return the next available date after first_date’s income_date I order it by specialqueries’ income_date ascendingly.

I recently came across a post which gave an example of poorly written code, which the person used in job interviews. His strategy was sound, as he didn’t expect people not to be stumped by the code but rather wanted to see how people would act when they came across such code.

The code was the following:

// The Dog class is inherited from the Animal class
for (Dog _dog = _animal as Dog; _dog != null; _dog = null)
{
// Other code runs
}

He goes as far as giving an example of how he would rewrite the above code to make it easier to read, which was as follows:

Dog _dog = _animal as Dog;

if (_dog != null) {
// more code
}

While I imagine, the code would continue to work as intended, I couldn’t help but notice that he hadn’t rewrote the code to exactly match what the original code did. He could of done multiple things to correct this and wouldn’t of taken much more time.

Dog _dog = _animal as Dog;

if (_dog != null) {
// more code
_dog = null;
}

Another way to do this as well (if the variable _dog isn’t used further on and to more accurately mimic the original code) would be to wrap the code in parenthesis. Which would then clean up _dog after it leaves the parenthesis it’s contained within, which I think would be the reason for writing the code the original way.

{

Dog _dog = _animal as Dog;
if (_dog != null) {
// more code
}

}