Pages

Thursday, 12 November 2015

RANKING FUNCTIONS

There are four ranking functions included with SQL Server (starting with SQL Server 2005)

Ranking functions return a ranking value for each row in a partition.
Depending on the function that is used, some rows might receive the same value as other rows. Ranking functions are nondeterministic.

·         OVER clause is required in all the ranking functions and with that you specify the partitioning and ordering of records before the ranking functions are evaluated. If you don't specify it, you will get an error similar to "Incorrect syntax near 'ROW_NUMBER', expected 'OVER'."

·         PARTITION BY clause is not mandatory and if you don't specify it all the records of the result-set will be considered as a part of single record group or a single partition and then ranking functions are applied. When you specify a column/set of columns with PARTITION BY clause then it will divide the result-set into record groups/partitions and then finally ranking functions are applied to each record group/partition separately and the rank will restart from 1 for each record group/partition separately.

·         ORDER BY clause is mandatory and if you don't specify it you will get an error similar to "The ranking function "ROW_NUMBER" must have an ORDER BY clause." With this clause you specify a column or a set of columns which will be used to order the records within a result-set or within record groups/partitions of a result-set. Please note you can specify only those columns which are being made available by the FROM clause of the query. Also, you can not specify an integer to represent the position of a column, if you do so you will get "Windowed functions do not support integer indices as ORDER BY clause expressions." error.

These functions differ from ordinary scalar functions in that the result that they produce for a given row depends on the other rows in the result set.  They also differ from aggregate functions in that they produce exactly one output row for each input row.  Unlike aggregates they do not collapse a set of rows into a single row.

Sample Table:
Create table ExamResult(name varchar(50),Subject varchar(20),Marks int)

insert into ExamResult values('Adam','Maths',70)
insert into ExamResult values ('Adam','Science',80)
insert into ExamResult values ('Adam','Social',60)

insert into ExamResult values('Rak','Maths',60)
insert into ExamResult values ('Rak','Science',50)
insert into ExamResult values ('Rak','Social',70)

insert into ExamResult values('Sam','Maths',90)
insert into ExamResult values ('Sam','Science',90)
insert into ExamResult values ('Sam','Social',80)



RANK (Transact-SQL):
Returns the rank of each row within the partition of a result set. The rank of a row is one plus the number of ranks that come before the row in question.

Syntax
RANK ( ) OVER ( [ partition_by_clause ] order_by_clause )

Arguments
OVER ([partition_by_clause] order_by_clause):
Partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. Order_by_clause determines the order of the data before the function is applied. The order_by_clause is required.

The <rows or range clause> of the OVER clause cannot be specified for the RANK function.

Return Types
Bigint

Remarks
If two or more rows tie for a rank, each tied rows receives the same rank. For example, if the two top salespeople have the same SalesYTD value, they are both ranked one. The salesperson with the next highest SalesYTD is ranked number three, because there are two rows that are ranked higher. Therefore, the RANK function does not always return consecutive integers.
The sort order that is used for the whole query determines the order in which the rows appear in a result set.
RANK is nondeterministic.

select Name,Subject,Marks,
RANK() over(partition by name order by Marks desc)Rank
From ExamResult

Name                                               Subject              Marks       Rank
-------------------------------------------------- -------------------- ----------- --------------------
Adam                                               Science              80          1
Adam                                               Maths                70          2
Adam                                               Social               60          3
Rak                                                Social               70          1
Rak                                                Maths                60          2
Rak                                                Science              50          3
Sam                                                Maths                90          1
Sam                                                Science              90          1
Sam                                                Social               80          3

(9 row(s) affected)
DENSE_RANK (Transact-SQL)

Returns the rank of rows within the partition of a result set, without any gaps in the ranking. The rank of a row is one plus the number of distinct ranks that come before the row in question.

Syntax
DENSE_RANK ( ) OVER ( [ <partition_by_clause> ] < order_by_clause > )

Arguments
<partition_by_clause>
Divides the result set produced by the FROM clause into partitions to which the DENSE_RANK function is applied.
<order_by_clause>
Determines the order in which the DENSE_RANK function is applied to the rows in a partition.

Return Types
Bigint

Remarks
If two or more rows tie for a rank in the same partition, each tied rows receives the same rank. For example, if the two top salespeople have the same SalesYTD value, they are both ranked one. The salesperson with the next highest SalesYTD is ranked number two. This is one more than the number of distinct rows that come before this row. Therefore, the numbers returned by the DENSE_RANK function do not have gaps and always have consecutive ranks.

The sort order used for the whole query determines the order in which the rows appear in a result. This implies that a row ranked number one does not have to be the first row in the partition.

DENSE_RANK is nondeterministic.

select Name,Subject,Marks,
DENSE_RANK() over(partition by name order by Marks desc)Rank
From ExamResult

Name                                               Subject              Marks       Rank
-------------------------------------------------- -------------------- ----------- --------------------
Adam                                               Science              80          1
Adam                                               Maths                70          2
Adam                                               Social               60          3
Rak                                                Social               70          1
Rak                                                Maths                60          2
Rak                                                Science              50          3
Sam                                                Maths                90          1
Sam                                                Science              90          1
Sam                                                Social               80          2

(9 row(s) affected)




NTILE (Transact-SQL)

Distributes the rows in an ordered partition into a specified number of groups. The groups are numbered, starting at one. For each row, NTILE returns the number of the group to which the row belongs.

Syntax
NTILE (integer_expression) OVER ([<partition_by_clause>] < order_by_clause >)

Arguments
integer_expression
Is a positive integer constant expression that specifies the number of groups into which each partition must be divided.
integer_expression can be of type int, or bigint.
<partition_by_clause>
Divides the result set produced by the FROM clause into partitions to which the function is applied. For the PARTITION BY syntax.
<order_by_clause>
Determines the order in which the NTILE values are assigned to the rows in a partition. An integer cannot represent a column when the <order_by_clause> is used in a ranking function.

Return Types
Bigint

Remarks
If the number of rows in a partition is not divisible by integer_expression, this will cause groups of two sizes that differ by one member. Larger groups come before smaller groups in the order specified by the OVER clause. For example if the total number of rows is 53 and the number of groups is five, the first three groups will have 11 rows and the two remaining groups will have 10 rows each. If on the other hand the total number of rows is divisible by the number of groups, the rows will be evenly distributed among the groups. For example, if the total number of rows is 50, and there are five groups, each bucket will contain 10 rows.
NTILE is nondeterministic

select Name,Subject,Marks,
NTILE(2) over(partition by name order by Marks desc)Quartile
From ExamResult

Name                                               Subject              Marks       Quartile
-------------------------------------------------- -------------------- ----------- --------------------
Adam                                               Science              80          1
Adam                                               Maths                70          1
Adam                                               Social               60          2
Rak                                                Social               70          1
Rak                                                Maths                60          1
Rak                                                Science              50          2
Sam                                                Maths                90          1
Sam                                                Science              90          1
Sam                                                Social               80          2

(9 row(s) affected)

select Name,Subject,Marks,
NTILE(3) over(partition by name order by Marks desc)Quartile
From ExamResult

Name                                               Subject              Marks       Quartile
-------------------------------------------------- -------------------- ----------- --------------------
Adam                                               Science              80          1
Adam                                               Maths                70          2
Adam                                               Social               60          3
Rak                                                Social               70          1
Rak                                                Maths                60          2
Rak                                                Science              50          3
Sam                                                Maths                90          1
Sam                                                Science              90          2
Sam                                                Social               80          3

(9 row(s) affected)

ROW_NUMBER (Transact-SQL)

Returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition.
Syntax
ROW_NUMBER ( )
    OVER ([PARTITION BY value_expression, ... [ n ] ] order_by_clause )
Arguments

PARTITION BY value_expression
Divides the result set produced by the FROM clause into partitions to which the ROW_NUMBER function is applied. value_expression specifies the column by which the result set is partitioned. If PARTITION BY is not specified, the function treats all rows of the query result set as a single group.
order_by_clause
The ORDER BY clause determines the sequence in which the rows are assigned their unique ROW_NUMBER within a specified partition. It is required.

Return Types
Bigint

General Remarks
There is no guarantee that the rows returned by a query using ROW_NUMBER() will be ordered exactly the same with each execution unless the following conditions are true.
·         Values of the partitioned column are unique.
·         Values of the ORDER BY columns are unique.
·         Combinations of values of the partition column and ORDER BY columns are unique.

ROW_NUMBER () is nondeterministic.

select Name,Subject,Marks,
ROW_NUMBER() over(order by Name) RowNumber
From ExamResult

Name                                               Subject              Marks       RowNumber
-------------------------------------------------- -------------------- ----------- --------------------
Adam                                               Maths                70          1
Adam                                               Science              80          2
Adam                                               Social               60          3
Rak                                                Maths                60          4
Rak                                                Science              50          5
Rak                                                Social               70          6
Sam                                                Maths                90          7
Sam                                                Science              90          8
Sam                                                Social               80          9


(9 row(s) affected)

No comments:

Post a Comment