Firebird Documentation Index → Firebird Null Guide → Aggregate functions |
The aggregate functions – COUNT
, SUM
,
AVG
, MAX
, MIN
and
LIST
– don't handle NULL
in the same way as ordinary
functions and operators. Instead of returning NULL
as soon as a
NULL
operand is encountered, they only take non-NULL
fields into consideration while computing the outcome. That is, if you have this table:
MyTable | ||
---|---|---|
ID | Name | Amount |
1 | John | 37 |
2 | Jack | NULL |
3 | Jim | 5 |
4 | Joe | 12 |
5 | Josh | NULL |
...the statement select sum(Amount) from MyTable
returns 54, which
is 37 + 5 + 12. Had all five fields been summed, the result would have been
NULL
. For AVG
, the non-NULL
fields are summed and the sum divided by the number of non-NULL
fields.
There is one exception to this rule: COUNT(*)
returns the count of
all rows, even rows whose fields are all NULL
. But
COUNT
(FieldName
) behaves like the other aggregate
functions in that it only counts rows where the specified field is not
NULL
.
Another thing worth knowing is that COUNT(*)
and
COUNT(
never return
FieldName
)NULL
: if there are no rows in the set, both functions return 0.
COUNT(
also returns 0 if all
FieldName
)FieldName
fields in the set are NULL
. The other
aggregate functions return NULL
in such cases. Be warned that
SUM
even returns NULL
if used on an empty set, which
is contrary to common logic (if there are no rows, the average, maximum and minimum are
undefined, but the sum is known to be zero).
Now let's put all that knowledge in a table for your easy reference:
Table 7. Aggregate function results with different column states
Function | Results | ||
---|---|---|---|
Empty set | All-null set or column | Other sets or columns | |
COUNT(*) | 0 | Total number of rows | Total number of rows |
COUNT(Field) | 0 | 0 | Number of rows where Field is not NULL |
MAX, MIN | NULL |
NULL |
Max or min value found in the column |
SUM | NULL |
NULL |
Sum of non-NULL values in the column
|
AVG | NULL |
NULL |
Average of non-NULL values in the column. This equals
SUM(Field) / COUNT(Field).[a] |
LIST[b] | NULL |
NULL |
Comma-separated string concatenation of non-NULL values in
the column
|
[a] If Field is of an integer type, AVG is always rounded towards 0. For instance, 6 non-null INT records with a sum of -11 yield an average of -1, not -2. [b] LIST was added in Firebird 2.1 |
A GROUP BY clause doesn't change the aggregate function logic
described above, except that it is now applied to each group individually rather than to the
result set as a whole. Suppose you have a table Employee, with fields Dept and Salary which
both allow NULL
s, and you run this query:
The result may look like this (the row where Dept is <null>
may
be at the top or bottom, depending on your Firebird version):
DEPT SUM ====== ===================== <null> 219465.19 000 266643.00 100 155262.50 110 130442.81 115 13480000.00 120 <null> 121 110000.00 123 390500.00
First notice that the people whose department is unknown (NULL
) are
grouped together, although you can't say that they have the same value in
the Dept field. But the alternative would have been to give each of those records a
“group” of their own. Not only would this possibly add a huge number of lines to
the output, but it would also defeat the purpose of grouping: those lines
wouldn't be aggregates, but simple “SELECT Dept, Salary”
rows. So it makes sense to group the NULL
depts by their state and the
rest by their value.
Anyway, the Dept field is not what interests us most. What does the aggregate
SUM column tell us? That all salaries are non-NULL
,
except in department 120? No. All we can say is that in every department except 120, there is
at least one employee with a known salary in the database. Each department
may contain NULL
salaries; in dept. 120
all the salaries are NULL
.
You can find out more by throwing in one or more COUNT() columns.
For instance, if you want to know the number of NULL
salaries in each
group, add a column “COUNT(*) –
COUNT(Salary)”.
A GROUP BY clause can be used to report the frequencies with
which values occur in a table. In that case you use the same field name several times in the
query statement. Let's say you have a table TT with a column
A whose contents are { 3, 8, NULL
, 6, 8, -1,
NULL
, 3, 1 }. To get a frequencies report, you could use:
SELECT A, COUNT(A) FROM TT GROUP BY A
which would give you this result:
A COUNT ============ ============ -1 1 1 1 3 2 6 1 8 2 <null> 0
Oops – something went wrong with the NULL
count, but what?
Remember that COUNT(
skips all
FieldName
)NULL
fields, so with
COUNT(
the count of the
A
)<null>
group can only ever be 0. Reformulate your query like
this:
SELECT A, COUNT(*) FROM TT GROUP BY A
and the correct value will be returned (in casu 2).
HAVING clauses can place extra restrictions on the output rows of an aggregate query – just like WHERE clauses do in record-by-record queries. A HAVING clause can impose conditions on any output column or combination of columns, aggregate or not.
As far as NULL
is concerned, the following two facts are worth
knowing (and hardly surprising, I would guess):
Rows for which the HAVING condition evaluates to
NULL
won't be included in the result set. (“Only
true
is good enough.”)
“HAVING <col>
IS [NOT] NULL” is a legal and often useful condition, whether
<col>
is aggregate or not. (But if
<col>
is non-aggregate, you may save the engine some work
by changing HAVING to WHERE and placing the
condition before the “GROUP BY” clause. This goes for
any condition on non-aggregate columns.)
For instance, adding the following clause to the example query from the “GROUP BY” paragraph:
...HAVING Dept IS NOT NULL
will prevent the first row from being output, whereas this one:
...HAVING SUM(Salary) IS NOT NULL
suppresses the sixth row (the one with Dept = 120).
Firebird Documentation Index → Firebird Null Guide → Aggregate functions |