Window functions (part3) - new functions
Posted In:
firebird
.
By Adriano
In the first and the second part of window function series, I talked about the OVER clause, with/without partitions and with/without ordering. Till then, no new function had been introduced, so it has about the already existing aggregate functions.
Now, new window-only functions has been introduced. I'll separate them in two groups: ranking and navigational. Both set of functions can be used with/without partition/ordering, but them does not make much sense without ordering.
Ranking functions (DENSE_RANK, RANK and ROW_NUMBER): with these functions, one can create different type of incremental counters. Think about SUM(1) OVER (ORDER BY SALARY), these functions does this type of thing, but all of them in different ways. Lets see an example query, also comparing with the SUM behavior.
select
id,
id,
salary,
dense_rank() over (order by salary),
rank() over (order by salary),
row_number() over (order by salary),
sum(1) over (order by salary)
from employee
And the result set:rank() over (order by salary),
row_number() over (order by salary),
sum(1) over (order by salary)
from employee
order by salary;
id | salary | dense_rank | rank | row_number | sum |
3 | 8.00 | 1 | 1 | 1 | 1 |
4 | 9.00 | 2 | 2 | 2 | 2 |
1 | 10.00 | 3 | 3 | 3 | 4 |
5 | 10.00 | 3 | 3 | 4 | 4 |
2 | 12.00 | 4 | 5 | 5 | 5 |
As you see, the functions differs when repeated values are found in the order key (the salary of 10.00). With DENSE_RANK, no gaps are created and all repeated values receive the same counter. With RANK, the initial counter is used for all repeated values, but it make gaps so the next non-repeating value (the salary of 12.00) does not consider the previously repeated values as in different positions. With ROW_NUMBER, each line receives an incremental value. And SUM(1) is very like RANK, but the value is computed after all repeated values are summed, so the gap is before the repeated values.
Navigational functions (LAG and LEAD): with these functions, an expression could get the value of a previously (LAG) or a posterior (LEAD) row of the query. Follow the demonstrating query.
select
id,
salary,
lag(salary) over (order by salary),
lead(salary) over (order by salary)
from employee
order by salary;
And the result set:
id | salary | lag | lead |
3 | 8.00 | <null> | 9.00 |
4 | 9.00 | 8.00 | 10.00 |
1 | 10.00 | 9.00 | 10.00 |
5 | 10.00 | 10.00 | 12.00 |
2 | 12.00 | 10.00 | <null> |
The functions have a second argument, that is implicitly 1, which means the number of previously/posterior rows will be queried. If this row does not exist, <null> is returned.
And of course, you can use these functions as expressions and compute, for example, the difference of an employee salary from the one greater and/or lesser.
2 Responses to Window functions (part3) - new functions
Great functionality, the windows functions!
Using lag with no previous row results in , the same goes for lead when there is no next row.
Oracle and postgesql support a default value in that case, it would be a nice-to-have, although COALESCE will do the work also.
@Bert, I added it now. Thanks for the comment!
Something to say?