SQL LAG Window Function - Detailed Explanation with Examples and Code

SQL LAG window function returns the previous value from the current row. Confused? Keep reading and I will clarify what this means with examples.

SQL LAG Window Functions in SQL

SQL LAG Window Function Summary: SQL LAG window function returns the previous value from the current row. Confused? Keep reading and I will clarify what this means with examples.

The easiest way to remember SQL LAG Window function is that it’s the opposite of SQL LEAD Window Function. I did a blog post and video on SQL LEAD Window Function. Go and check it out.

It’s easier to explain what LAG window function is with an example. So, I will

  • Show you the syntax,
  • Explain what the syntax mean,
  • Show you multiple examples, then
  • Use the examples to explain LAG SQL Window function
  • Finally, we end with SQL LAG Window Function USE CASES

If you prefer to watch video instead of reading, here's a video covering the exact same topic. 

Syntax

SQL LAG window functions has this syntax. I know this syntax looks ambiguous, but don’t gloss over it yet. Take the time to actually understand what each component of the syntax is doing. If you don’t understand the syntax or explanations below, take heart cause the followed examples will clarify any ambiguity or confusion you may have.

LAG(return_value, offset, default_value) OVER (
    PARTITION BY expr1, expr2,...
    ORDER BY expr1 [ASC | DESC], expr2,...
)

Here’s a detailed explanation of what the syntax keywords and expressions mean.

LAG = That’s just the SQL keyword clause to identity the function.

return_value = The column or SQL query that you want used as the basis for the LAG Function.

offset = How many rows from the current row should be returned. Default value is 1.

In SQL, the default offset is 0 which means that query results are returned starting from the 1st row. But, you can choose to return query results starting from the 2nd row, or 3rd row, or 4th row, etc. That’s what offset in SQL means, and the same concept applies here. I will show you an example of using/not using SQL Offset Below.

If you want query result returned starting from 2nd row, then set offset = 1, 3rd row? offset = 2. etc.

default = LAG window function returns the “previous” row from the current row within the same partition. If there is no value in the “previous” row, it returns the value defined in “default”. The default value is NULL. You can specify any string or integer as the default value. I will show example of this below.

PARTITION BY expr1 = Window function utilizes partitions. Partitions is a way to group related data together using categorical data from one or more columns (not to be confused with group by). What is defined in expr1 is the column/datapoint to be used for the partition.

When nothing is specified for partition, then the entire dataset is treated like a single partition.

You can use multiple columns/expressions in your partition that’s why you see PARTITION BY expr1, expr2,... The data will be partitioned in the order the columns were specified in. Below, I will show you an example of using/not using a partition.

The simplest way to think of partition is that you are creating subsets of your data based on a category like department.

ORDER BY expr1 = Order by is exactly what it sounds like. How should the resulting data be ordered? expr1 here represents the column to be used for ordering the data.

Once again just like partition, you can have multiple columns as part of the order by clause. And again, it will be ordered sequentially in the order the columns were specified in.

There are some window functions that are order-dependent which means that how the data is ordered impacts the results. Order by is essentially required for these window functions.

But, LAG SQL window function IS NOT order-dependent which means that the order by clause can be omitted and you will still get a result. This is possible because regardless of how the data is ordered, the result is determined by the offset from the current row.

Don’t worry if non of these make sense yet, this syntax and explanations will make more sense once you see the examples below.

How to use SQL LAG Window Function

Below, I have included a few example of how SQL LAG Window Function can be used and most importantly, the use-case for it. I know it’s fun to learn new stuff, but more important than learning new stuff is learning WHEN to use the new stuff you learned.

LAG Window Function Example 1 - Using all Default Values

Let’s start by looking at an example without specifying any partition, offset, or default parameter values.

This query shows an employee hire date followed by the date the previous employee was hired.

SELECT
    first_name,
    last_name,
    hire_date as original_hire_date,
    LAG(hire_date, 1) OVER (
        ORDER BY hire_date
    ) AS hire_date_previous
FROM
    employees;

LAG Window Function no partition, offset by 1, and no default example results.

LAG Window Function no partition, offset by 1, and no default example results.
LAG Window Function no partition, offset by 1, and no default example results.


What does it do?: LAG window function basically gives you the value in the previous row after the current row from the column of interest (What is defined in return_value). The value it provides depends on the offset value.

Let’s use the example above to explain this.

The query above has no partition, so the entire dataset is used as a single partition. You will see the difference between partition and no partition later on. The query above also has 1 as the offset value.

original_hire_date is the original column and hire_date_previous is the LAG window function column.

The value is row 2 hire_date_previous column is the same value as what is in row 1 original_hire_date column

Once again, Row 4 hire_date_previous is the same as what is on row 3 original_hire_date column.

That’s how LAG Window function works. It gives you the value in the previous row depending on the offset value defined. Since the offset value for this query is 1, the “previous value” is one row above.

It could also give you value from the previous 2/3/4/5/6/… rows depending on what value is specified for the offset.

Speaking of offset, let’s look at an example of using Offset of 2

LAG Window Function Example 2 - Offset by 2

This example uses an offset of 2 instead of the default value of 1. This is how you define “2” for the offset LAG(hire_date, 2)

SELECT
    first_name,
    last_name,
    hire_date as original_hire_date,
    LAG(hire_date, 2) OVER (
        ORDER BY hire_date desc
    ) AS hire_date_previous
FROM
    employees;

LAG Window Function OFFSET by 2 Example Results

LAG Window Function OFFSET by 2 Example - Results
LAG Window Function OFFSET by 2 Example - Results


This is the same query as the one above, but now, we are using the offset of 2.

The value in hire_date_previous column row 3 is same as original_hire_date column row 1.

And once again, the value in hire_date_previous column row 5 is same as original_hire_date column row 3.

This is a clear example of the offset determines the value. The offset determines how many rows from the current row the data will be pulled from.

I think you got it by now, but just in case you haven’t, let’s look at one more example, this time using an offset of 5.

LAG Window Function Example 3 - Offset by 5

And This one, an offset by 5

SELECT
    first_name,
    last_name,
    hire_date as original_hire_date,
    LAG(hire_date, 5) OVER (
        ORDER BY hire_date desc
    ) AS hire_date_previous
FROM
    employees;

LAG Window Function OFFSET by 5 Example Results

LAG Window Function OFFSET by 5 Example Results
LAG Window Function OFFSET by 5 Example Results


Same thing again. Single partition, no defaults specified and the value in hire_date_previous column corresponds to the value in original_hire_date column 5 rows above.

LAG Window Function Example 4 - Partitioning Example

So far, we have been using a single partition in our examples. Let’s explore using more than one partition to see what the results look like.

This example partitions by department, uses 1 as the offset, and doesn’t provide any value for “default”.

SELECT
    first_name,
    last_name,
    department_name,
    hire_date as original_hire_date,
    LAG(hire_date) OVER (
        PARTITION by department_name
        ORDER BY hire_date
    ) AS hire_date_previous
FROM
    employees e
INNER JOIN departments d ON
    d.department_id = e.department_id;

Remember what I said above, when default is not provided, the resulting value is NULL (We will get into this below).

LAG Window Function - Partition by Department Example Results.

LAG Window Function Example 4- Partition by Department Example Results.
LAG Window Function Example 4- Partition by Department Example Results.


This is the main thing to remember about partitions. Partitioning is like creating subset of your data. You apply data operations to that subset. At the end of that subset, the data operation starts over. That’s the key thing to remember about partitioning.

What exactly is going on with the yellow & green boxes?

So, there’s a couple of things going on here.

If you look in the department_name column, everything is organized by department.

This is important to remember; with partitioning, the Function restarts with each partition window.

That means that hire_date_previous (Which represents the LAG Window Function results) is only applied to the Accounting department.

When it gets to Administration department, it starts over.

And when it gets to Executive department, it starts over; Same goes for finance department.

That’s what partitioning does. It breaks things into groups and for each groups, the calculation starts over. That’s what you see in the yellow & green boxes.

What’s going on with the green boxes?

If you remember earlier, I mentioned that the LAG Window Function Clause “default” has a default value of NULL. That’s what’s highlighted in the green boxes.

When the function comes to the end of the partition, there is no value in the “previous” row, so it returns whatever is specified in the “default” clause and since we didn’t specify any value for “default”, it returns NULL.

What if you wanted to provide a value for “default” instead of NULL? Let’s discuss that next.

LAG Window Function Example 5 - Specifying a value for “Default”

Like the subheading suggests, this example below shows you how to specify a value for “default

SELECT
    first_name,
    last_name,
    department_name,
    hire_date as original_hire_date,
    LAG(hire_date, 1, 'Not Applicable') OVER (
        PARTITION by department_name
        ORDER BY hire_date
    ) AS hire_date_previous
FROM
    employees e
INNER JOIN departments d ON
    d.department_id = e.department_id;

LAG Window Function - Replace NULL default Value Example Results.

LAG Window Function Example 5 Results- Replace NULL default Value with
LAG Window Function Example 5 Results- Replace NULL default Value with "Not Applicable"


This query and result is the exact same one as in Example 4, but now, we are replacing the NULL Values with “Not Applicable”.

The replacement doesn’t have to be a string. You can could also specify a number as the default value.

So, learning LAG Window Function is cool and all, but…how do you actually know when and why to use it.

See, whenever I learn something new, I want to know the applicability of it. Like…this new thing I learned is cool, but WHEN do I actually use it? and WHY would I want to use it?

Let’s talk about that next.

SQL LAG Window Function USE CASES.

Now you might be wondering, SQL LAG window function sounds cool, but when will I ever use it? What use-case will this be an effective solution? Don’t worry, I got you covered 😆

A good use-case for SQL LAG Window Function is calculating the sum or difference between the current row and the offset row.

Let’s look at an example to clarify this.

This example shows subtracting the hire date of employee in row X from the hire date of employee on row Y as a way to check the length of time between hire dates.


 with hiring as
 (SELECT
    first_name,
    last_name,
    hire_date as original_hire_date,
    LAG(hire_date) OVER (
        ORDER BY hire_date
    ) AS hire_date_previous
FROM
    employees)
select *,
    julianday(original_hire_date) - julianday(hire_date_previous) as hire_diff
    from hiring;

julianday is how sqlite converts date (string-format) into number of days since beginning of the julian period.

LAG Window Function Use Case Example Results.

LAG Window Function Example 6 Results - SQL LAG Window Function Use Case.
LAG Window Function Example 6 Results - SQL LAG Window Function Use Case.


As you can see, the column hire_diff shows the resulting difference between when the previous employee was hired (represented by hire_date_previous) and the hire_date of the current employee which is represented by original_hire_date .

Let’s use row 1 as an example.

In row 2, original_hire_date represents the date Jennifer was hired and hire_date_previous represents when Steven was hired.

Hire_diff is the number of days between when Steven (Row 1) and Jennifer (Row 2) was hired.

I am sure there are many more use cases for SQL LAG window function, but I hope this example gets you excited to try SQL LAG window function.

SQL LAG Window Functions in a Nutshell

SQL LAG Window function gives you the value in the “previous row”. What the previous row is, is determined by the offset value. The default offset value is 1, but you can set it to any number.

When there is no value in the previous row, NULL is returned. But, instead of NULL being returned when there is no value, you can specify any string or number to be returned instead. For example, you can return “Not applicable” or the number “10” instead of NULL.

You can also specify how to divide and group the data using partition. When partition is not specified, the entire dataset is treated like a single partition.

-----------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------

If you want to replicate these results on your local computer,

  1. Download SQLite to Local Computer
  2. Download the Employees Database. When you sign up to get access to my FREE technical resources, you will get access to this database as well as other databases.

Follow me on Youtube

Follow me on Linkedin

Follow me on Github

Follow me on Twitter

If you want to contact me, speak to me here

Want me to create a similar technical content for your brand? Contact me here.

Want to get updates every time I publish new content? Join my FREE Technical Newsletter here

Categories: : SQL, SQL Window Functions