Pages

Friday, 9 October 2015

Aggregate Functions

MIN (Transact-SQL)
Applies To: SQL Server 2014, SQL Server 2016 Preview

Returns the minimum value in the expression. May be followed by the OVER clause.

Applies to: SQL Server (SQL Server 2008 through current version), Azure SQL Database, Azure SQL Data Warehouse Public Preview.


MIN ( [ ALL | DISTINCT ] expression )
   OVER ([partition_by_clause] order_by_clause )
ALL
Applies the aggregate function to all values. ALL is the default.

DISTINCT
Specifies that each unique value is considered. DISTINCT is not meaningful with MIN and is available for ISO compatibility only.

EXPRESSION
Is a constant, column name, or function, and any combination of arithmetic, bitwise, and string operators. MIN can be used with numericchar, varcharuniqueidentifier, or datetime columns, but not with bit columns.

Aggregate functions and subqueries are not permitted.

OVER ( [partition_by_clause ] order_by_clause )
partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group.
 Order_by_clause determines the logical order in which the operation is performed. order_by_clause is required.

Returns a value same as expression.

MIN ignores any null values.
With character data columns, MIN finds the value that is lowest in the sort sequence.
MIN is a deterministic function when used without the OVER and ORDER BY clauses. It is nondeterministic when specified with the OVER and ORDER BY clauses.

              When used in the context of a ranking window function, <ORDER BY Clause> can only refer to columns made available by the FROM clause. An integer cannot be specified to represent the position of the name or alias of a column in the select list. 
   
             <ORDER BY Clause> cannot be used with aggregate window functions.

MAX (Transact-SQL)
Applies To: SQL Server 2014, SQL Server 2016 Preview
Returns the maximum value in the expression.
Applies to: SQL Server (SQL Server 2008 through current version), Azure SQL Database, Azure SQL Data Warehouse Public Preview.

Syntax
MAX ( [ ALL | DISTINCT ] expression )
   OVER ( [ partition_by_clause ] order_by_clause )  

Arguments
ALL
Applies the aggregate function to all values. ALL is the default.
DISTINCT
Specifies that each unique value is considered. DISTINCT is not meaningful with MAX and is available for ISO compatibility only.
EXPRESSION
Is a constant, column name, or function, and any combination of arithmetic, bitwise, and string operators.
MAX can be used with
Numeric,
Character,
Uniqueidentifier, and
Datetime columns,
but not with bit columns.
Aggregate functions and subqueries are not permitted.

OVER ( [ partition_by_clause ] order_by_clause )
partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. order_by_clause determines the logical order in which the operation is performed. order_by_clause is required.

Return Types
Returns a value same as expression.

Remarks
MAX ignores any null values.
For character columns, MAX finds the highest value in the collating sequence.

MAX is a deterministic function when used without the OVER and ORDER BY clauses. It is nondeterministic when specified with the OVER and ORDER BY clauses.

SUM (Transact-SQL)

Applies To: SQL Server 2014, SQL Server 2016 Preview

Returns the sum of all the values, or only the DISTINCT values, in the expression. SUM can be used with numeric columns only. Null values are ignored.

Applies to: SQL Server (SQL Server 2008 through current version), Azure SQL Database, Azure SQL Data Warehouse Public Preview.

Syntax
SUM ( [ ALL | DISTINCT ] expression )
   OVER ( [ partition_by_clause ] order_by_clause ) 

Arguments
ALL
Applies the aggregate function to all values. ALL is the default.

DISTINCT
Specifies that SUM return the sum of unique values.

Expression
Is a constant, column, or function, and any combination of arithmetic, bitwise, and string operators. expression is an expression of the exact numeric or approximate numeric data type category, except for the bit data type. Aggregate functions and subqueries are not permitted. For more information, see Expressions (Transact-SQL).

OVER ( [ partition_by_clause ] order_by_clause )
partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. order_by_clause determines the logical order in which the operation is performed. order_by_clause is required.

Return Types
Returns the summation of all expression values in the most precise expression data type.

Expression result
Return type
tinyint
int
smallint
int
int
int
bigint
bigint
decimal category (p, s)
decimal(38, s)
money and smallmoney category
money
float and real category
float

Remarks
SUM is a deterministic function when used without the OVER and ORDER BY clauses. It is nondeterministic when specified with the OVER and ORDER BY clauses.

AVG (Transact-SQL) 
Applies To: SQL Server 2014, SQL Server 2016 Preview

Returns the average of the values in a group. Null values are ignored.
Applies to: SQL Server (SQL Server 2008 through current version), Azure SQL Database.

Syntax
AVG ( [ ALL | DISTINCT ] expression )
OVER ( [ partition_by_clause ] order_by_clause ) 

Arguments
ALL
Applies the aggregate function to all values. ALL is the default.
DISTINCT
Specifies that AVG be performed only on each unique instance of a value, regardless of how many times the value occurs.
Expression
Is an expression of the exact numeric or approximate numeric data type category, except for the bit data type. Aggregate functions and subqueries are not permitted.

OVER ( [ partition_by_clause ] order_by_clause )
partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. order_by_clause determines the logical order in which the operation is performed. order_by_clause is required.
Return Types
The return type is determined by the type of the evaluated result of expression.

Expression result
Return type
tinyint
int
smallint
int
int
int
bigint
bigint
decimal category (p, s)
decimal(38, s)
money and smallmoney category
money
float and real category
float

Remarks
If the data type of expression is an alias data type, the return type is also of the alias data type. However, if the base data type of the alias data type is promoted, for example from tinyint to int, the return value is of the promoted data type and not the alias data type.

AVG () computes the average of a set of values by dividing the sum of those values by the count of nonnull values. If the sum exceeds the maximum value for the data type of the return value an error will be returned.

AVG is a deterministic function when used without the OVER and ORDER BY clauses. It is nondeterministic when specified with the OVER and ORDER BY clauses.


COUNT (Transact-SQL)
Applies To: SQL Server 2014, SQL Server 2016 Preview

Returns the number of items in a group. COUNT works like the COUNT_BIG function. The only difference between the two functions is their return values. COUNT always returns an int data type value. COUNT_BIG always returns a bigint data type value.

Applies to: SQL Server (SQL Server 2008 through current version), Azure SQL Database, Azure SQL Data Warehouse Public Preview.

Syntax

COUNT ( { [ [ ALL | DISTINCT ] expression ] | * } )
    OVER ( [ partition_by_clause ] order_by_clause )
Arguments
ALL
Applies the aggregate function to all values. ALL is the default.
DISTINCT
Specifies that COUNT returns the number of unique nonnull values.
Expression
Is an expression of any type except text, image, or ntext. Aggregate functions and subqueries are not permitted.
*
Specifies that all rows should be counted to return the total number of rows in a table. COUNT(*) takes no parameters and cannot be used with DISTINCT. COUNT(*) does not require an expression parameter because, by definition, it does not use information about any particular column. COUNT(*) returns the number of rows in a specified table without getting rid of duplicates. It counts each row separately. This includes rows that contain null values.

OVER ( [ partition_by_clause ] order_by_clause )
partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. order_by_clause determines the logical order in which the operation is performed. order_by_clause is required.

Return Types
Int

Remarks
COUNT(*) returns the number of items in a group. This includes NULL values and duplicates.
COUNT(ALL expression) evaluates expression for each row in a group and returns the number of nonnull values.
COUNT(DISTINCT expression) evaluates expression for each row in a group and returns the number of unique, nonnull values.
For return values greater than 2^31-1, COUNT produces an error. Use COUNT_BIG instead.

COUNT is a deterministic function when used without the OVER and ORDER BY clauses. It is nondeterministic when specified with the OVER and ORDER BY clauses.


COUNT_BIG (Transact-SQL)

Applies To: SQL Server 2014, SQL Server 2016 Preview

Returns the number of items in a group. COUNT_BIG works like the COUNT function. The only difference between the two functions is their return values. COUNT_BIG always returns a bigint data type value. COUNT always returns an int data type value.

Applies to: SQL Server (SQL Server 2008 through current version), Azure SQL Database, Azure SQL Data Warehouse Public Preview.

Syntax
COUNT_BIG ( { [ ALL | DISTINCT ] expression } | * )
   OVER ( [ partition_by_clause ] order_by_clause )
Arguments
ALL
Applies the aggregate function to all values. ALL is the default.
DISTINCT
Specifies that COUNT_BIG returns the number of unique nonnull values.
Expression
Is an expression of any type. Aggregate functions and subqueries are not permitted.
*
Specifies that all rows should be counted to return the total number of rows in a table. COUNT_BIG(*) takes no parameters and cannot be used with DISTINCT. COUNT_BIG(*) does not require an expression parameter because, by definition, it does not use information about any particular column. COUNT_BIG(*) returns the number of rows in a specified table without getting rid of duplicates. It counts each row separately. This includes rows that contain null values.
OVER ( [ partition_by_clause ] order_by_clause )
partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. order_by_clause determines the logical order in which the operation is performed. order_by_clause is required.

Return Types
bigint

Remarks
COUNT_BIG(*) returns the number of items in a group. This includes NULL values and duplicates.
COUNT_BIG (ALL expression) evaluates expression for each row in a group and returns the number of 
nonnull values.
COUNT_BIG (DISTINCT expression) evaluates expression for each row in a group and returns the number of unique, nonnull values.

        COUNT_BIG is a deterministic function when used without the OVER and ORDER BY clauses. It is                  
        nondeterministic when specified with the OVER and ORDER BY clauses.

STDEV (Transact-SQL)

Applies To: SQL Server 2014, SQL Server 2016 Preview
Returns the statistical standard deviation of all values in the specified expression.

Applies to: SQL Server (SQL Server 2008 through current version), Azure SQL Database, Azure SQL Data Warehouse Public Preview.

Syntax
STDEV ( [ ALL | DISTINCT ] expression )
   OVER ( [ partition_by_clause ] order_by_clause ) 

Arguments
ALL
Applies the function to all values. ALL is the default.
DISTINCT
Specifies that each unique value is considered.
Expression
Is a numeric expression. Aggregate functions and subqueries are not permitted. expression is an expression of the exact numeric or approximate numeric data type category, except for the bit data type.

OVER ( [ partition_by_clause ] order_by_clause )
partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. order_by_clause determines the logical order in which the operation is performed. order_by_clause is required. For more information, see OVER Clause (Transact-SQL).

Return Types
float

Remarks
If STDEV is used on all items in a SELECT statement, each value in the result set is included in the calculation. STDEV can be used with numeric columns only. Null values are ignored.

STDEV is a deterministic function when used without the OVER and ORDER BY clauses. It is nondeterministic when specified with the OVER and ORDER BY clauses.

STDEVP (Transact-SQL)

Applies To: SQL Server 2014, SQL Server 2016 Preview
Returns the statistical standard deviation for the population for all values in the specified expression.

Applies to: SQL Server (SQL Server 2008 through current version), Azure SQL Database, Azure SQL Data Warehouse Public Preview.

STDEVP ( [ ALL | DISTINCT ] expression )
   OVER ( [ partition_by_clause ] order_by_clause )  

ALL
Applies the function to all values. ALL is the default.
DISTINCT
Specifies that each unique value is considered.
Expression
Is a numeric expression. Aggregate functions and subqueries are not permitted. expression is an expression of the exact numeric or approximate numeric data type category, except for the bit data type.

OVER ( [ partition_by_clause ] order_by_clause )
partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. order_by_clausedetermines the logical order in which the operation is performed. order_by_clause is required.
float

If STDEVP is used on all items in a SELECT statement, each value in the result set is included in the calculation. STDEVP can be used with numeric columns only. Null values are ignored.

STDEVP is a deterministic function when used without the OVER and ORDER BY clauses. It is nondeterministic when specified with the OVER and ORDER BY clauses.
 
VAR (Transact-SQL)

Applies To: SQL Server 2014, SQL Server 2016 Preview

Returns the statistical variance of all values in the specified expression. May be followed by the OVER clause.

Applies to: SQL Server (SQL Server 2008 through current version), Azure SQL Database, Azure SQL Data Warehouse Public Preview.
Syntax
VAR ( [ ALL | DISTINCT ] expression )
   OVER ( [ partition_by_clause ] order_by_clause ) 

Arguments
ALL
Applies the function to all values. ALL is the default.
DISTINCT
Specifies that each unique value is considered.
Expression
Is an expression of the exact numeric or approximate numeric data type category, except for the bit data type. Aggregate functions and subqueries are not permitted.
OVER ( [ partition_by_clause ] order_by_clause )
partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. order_by_clause determines the logical order in which the operation is performed. order_by_clause is required.
Return Types
float

Remarks
If VAR is used on all items in a SELECT statement, each value in the result set is included in the calculation. VAR can be used with numeric columns only. Null values are ignored.

VAR is a deterministic function when used without the OVER and ORDER BY clauses. It is nondeterministic when specified with the OVER and ORDER BY clauses.


VARP (Transact-SQL)

Applies To: SQL Server 2014, SQL Server 2016 Preview
Returns the statistical variance for the population for all values in the specified expression.
Applies to: SQL Server (SQL Server 2008 through current version), Azure SQL Database, Azure SQL Data Warehouse Public Preview.

Syntax
VARP ( [ ALL | DISTINCT ] expression )
   OVER ( [ partition_by_clause ] order_by_clause )  nh

Arguments
ALL
Applies the function to all values. ALL is the default.
DISTINCT
Specifies that each unique value is considered.
Expression
Is an expression of the exact numeric or approximate numeric data type category, except for the bit data type. Aggregate functions and subqueries are not permitted.
OVER ( [ partition_by_clause ] order_by_clause )
partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. order_by_clause determines the logical order in which the operation is performed. order_by_clause is required. For more information, see OVER Clause (Transact-SQL).
Return Types
float
Remarks
If VARP is used on all items in a SELECT statement, each value in the result set is included in the calculation. VARP can be used with numeric columns only. Null values are ignored.
VARP is a deterministic function when used without the OVER and ORDER BY clauses. It is nondeterministic when specified with the OVER and ORDER BY clauses.


·         Applying the MIN aggregate function to a BIT field:

create table test (sw bit)
insert into test (sw) values(0)
select min(sw) from test

Msg 8117, Level 16, State 1, Line 1
Operand data type bit is invalid for min operator.

select
    SQL_VARIANT_PROPERTY(sw+0, 'BaseType') as type,
    SQL_VARIANT_PROPERTY(sw+0, 'Precision') as Precision,
    SQL_VARIANT_PROPERTY(sw+0, 'Scale') as Scale,
    SQL_VARIANT_PROPERTY(sw+0, 'MaxLength') as len
from test

type          Precision Scale len
int           10        0     4
int           10        0     4

The Solution is:
SELECT CASE WHEN MIN(sw+0) = 1 THEN 'True' ELSE 'False' END AS SW
FROM test

      When you add the BitField+0 it would automatically becomes like int


·         Applying the SUM aggregate function to a BIT field:

Pinal Dave:

He has very interesting question. He attempted to use BIT filed in the SUM aggregation function and he got following error.
He went ahead with various different datatype (i.e. INT, TINYINT etc) and he was able to do the SUM but with BIT he faced the problem.

Error Received:
Msg 8117, Level 16, State 1, Line 1
Operand data type bit is invalid for sum operator.

USE tempdb
GO
-- Preparing Sample Data
CREATE TABLE TestTable (ID INT, Flag BIT)
GO
INSERT INTO TestTable (ID, Flag)
SELECT 1, 0
UNION ALL
SELECT 2, 1
UNION ALL
SELECT 3, 0
UNION ALL
SELECT 4, 1
GO
SELECT *
FROM TestTable
GO

Following script will work fine:
-- This will work fine
SELECT SUM(ID)
FROM TestTable
GO

However following generate error:
-- This will generate error
SELECT SUM(Flag)
FROM TestTable
GO

The workaround is to convert or cast the BIT to INT:
-- Workaround of error
SELECT SUM(CONVERT(INT, Flag))
FROM TestTable
GO

-- Workaround of error
SELECT SUM(Flag + 0)
FROM TestTable
GO

-- Clean up
DROP TABLE TestTable
GO


·         If i use Aggregate Function MIN () on BIT data type column then i got below error message.
"Operand data type bit is invalid for min operator."

I am clear about how to avoid the error, but i would like to know why bit data type columns are not allowed in MIN Aggregate function? Is there any technical reason behind that?

Reason Is:
There is a technical reason, what is the sum of 1 + 1? Well, 2, obviously! However, 2 cannot fit in a bit! The sum would take on the type of the column being summed, unless you write

Sum (cast mybit as int)

Also a theoretical reason;  Since bits are used to represent Boolean values, it makes no sense to do normal arithmetic on them. e.g. what is True plus False?


·         I have observed that many interviewers asks question to SQL Server Developer about COUNT aggregate function. The question is something like

There is one table which has three fields.

1.)    ID which is integer primary key so it won’t accept null value
2.)    FirstName which varchar and can accept null value
3.)    LastName which varchar and can accept null value

If I execute COUNT(*) in SELECT query and
in other SELECT query, I execute COUNT(ID) and
in third SELECT statement I execute COUNT(FirstName), what would be the results? Whether it is same or not?



·         I have the following schema for a table in SQL.
Owner (ownerid, name, phone, age)
I am asked to write a statement to find the oldest owner(s) in the database without using any aggregate function such as MAX.
  
select * from owner

ownerid     name       age
----------- ---------- -----------
1           a          20
2           b          30
3           c          40
4           d          50
5           e          60

(5 row(s) affected)

MAX:

Option 1:
select TOP-- select the very top entry in result set
     age
from
     Owner
order by
     age DESC

Option 2:

select t1.age
from Owner t1
where not exists (select 1
                  from Owner t2
                  where t1.age < t2.age)

age
-----------
60


Option 3:

SELECT *
FROM owner d1
WHERE age > ALL (SELECT d2.age
                    FROM owner d2
                    WHERE d2.ownerid <> d1.ownerid)

Option 4:

SELECT *
  FROM Owner mna1
LEFT JOIN Owner mna2 ON (mna2.age > mna1.age)
WHERE mna2.ownerid IS NULL;

Option 5:
select * from owner t where 1 = (select count(distinct age) from owner where t.age <= age)
Option 6:

with cte(age,cnt) as
(select age,ROW_NUMBER() over(order by age desc)
from Owner)
select age from cte where cnt = 1

MIN:

Option 1:
select TOP-- select the very top entry in result set
     age
from
     Owner
order by
     age ASC

Option 2:

select t1.age
from Owner t1
where not exists (select 1
                  from Owner t2
                  where t1.age > t2.age)

age
-----------
60


Option 3:

SELECT *
FROM owner d1
WHERE age < ALL (SELECT d2.age
                    FROM owner d2
                    WHERE d2.ownerid <> d1.ownerid)

Option 4:

SELECT *
  FROM Owner mna1
LEFT JOIN Owner mna2 ON (mna2.age < mna1.age)
WHERE mna2.ownerid IS NULL;

Option 5:
select * from owner t where 1 = (select count(distinct age) from owner where t.age >= age)

No comments:

Post a Comment