What is a NULL value?

Columns that are NULL often confuse people and create results that are wrong. NULL is the absence of anything. NULL is not like or equal to anything, it's also not unequal to anything so you cannot use operators like greater than or equals. Let's take a look at my employee table. The [NULL] text represents a NULL values

+----+------------+-----------+-------------+---------+------------+
| id | first_name | last_name |    dept     | salary  | hire_date  |
+----+------------+-----------+-------------+---------+------------+
|  1 | Joe        | Jones     | Sales       | 1034.23 | 2015-01-15 |
|  2 | Sally      | Smith     | Sales       | 1195.98 | 2015-01-15 |
|  3 | Lisa       | Teal      | Engineering | 2496.78 | 2019-03-10 |
|  4 | Gunner     | Greenburg | Engineering | 2169.32 | 2017-09-22 |
|  5 | Jack       | Doe       | Finance     | 1945.42 | 2018-06-20 |
|  6 | Troy       | Anderson  | Sales       | 1158.34 | 2017-04-15 |
|  7 | Julie      | Lopez     | Finance     | 1455.46 | 2010-03-25 |
|  8 | Sam        | Lopez     |             | 1658.00 | 2014-02-25 |
|  9 | Mark       | Thompson  | [NULL]      | 3638.42 | 2008-11-22 |
+----+------------+-----------+-------------+---------+------------+

To find all the employees in the sales department the SQL is straight forward and the NULL department does not affect the results.

SELECT *
FROM employee
WHERE dept = 'Sales';

+----+------------+-----------+-------+---------+------------+
| id | first_name | last_name | dept  | salary  | hire_date  |
+----+------------+-----------+-------+---------+------------+
|  1 | Joe        | Jones     | Sales | 1034.23 | 2015-01-15 |
|  2 | Sally      | Smith     | Sales | 1195.98 | 2015-01-15 |
|  6 | Troy       | Anderson  | Sales | 1158.34 | 2017-04-15 |
+----+------------+-----------+-------+---------+------------+

Now, what about finding all of the employees that are not in the sales department. The SQL also seems straight forward, but might not work as expected.

SELECT *
FROM mmcguire.employee
WHERE dept != 'Sales';

+----+------------+-----------+-------------+---------+------------+
| id | first_name | last_name |    dept     | salary  | hire_date  |
+----+------------+-----------+-------------+---------+------------+
|  3 | Lisa       | Teal      | Engineering | 2496.78 | 2019-03-10 |
|  4 | Gunner     | Greenburg | Engineering | 2169.32 | 2017-09-22 |
|  5 | Jack       | Doe       | Finance     | 1945.42 | 2018-06-20 |
|  7 | Julie      | Lopez     | Finance     | 1455.46 | 2010-03-25 |
|  8 | Sam        | Lopez     |             | 1658.00 | 2014-02-25 |
+----+------------+-----------+-------------+---------+------------+

Why was Mark Thompson not included? Mark Thompson is not in any department, but then again neither is Sam Lopez. So why is Sam included, but Mark is not? Sam's department is an empty string, but Mark's department is NULL. NULL values are never equal to anything and they are not equal to anything. So how do you locate a NULL row? That's where IS NULL or IS NOT NULL is used.

SELECT *
FROM employee
WHERE dept IS NULL;

+----+------------+-----------+--------+---------+------------+
| id | first_name | last_name |  dept  | salary  | hire_date  |
+----+------------+-----------+--------+---------+------------+
|  9 | Mark       | Thompson  | [NULL] | 3638.42 | 2008-11-22 |
+----+------------+-----------+--------+---------+------------+

So then how do you find everyone not in Sales including those with a NULL department? COALESCE works well in this situation.

SELECT *
FROM employee
WHERE coalesce(dept, 'UKNOWN') != 'Sales';

+----+------------+-----------+-------------+---------+------------+
| id | first_name | last_name |    dept     | salary  | hire_date  |
+----+------------+-----------+-------------+---------+------------+
|  3 | Lisa       | Teal      | Engineering | 2496.78 | 2019-03-10 |
|  4 | Gunner     | Greenburg | Engineering | 2169.32 | 2017-09-22 |
|  5 | Jack       | Doe       | Finance     | 1945.42 | 2018-06-20 |
|  7 | Julie      | Lopez     | Finance     | 1455.46 | 2010-03-25 |
|  8 | Sam        | Lopez     |             | 1658.00 | 2014-02-25 |
|  9 | Mark       | Thompson  | [NULL]      | 3638.42 | 2008-11-22 |
+----+------------+-----------+-------------+---------+------------+

Now Mark Thompson is included in the results. COALESCE says if the column is NULL then substitute the word UKNOWN and since UKNOWN does not equal Sales then the row is returned.

Next: Select DISTINCT Values