Defend your data quality with constraints

In computer science, garbage in, garbage out is where bad, or nonsense input data produces nonsense output or “garbage”.  How do you protect your database from allowing garbage in? Using a constraint is an excellent method for keeping your data clean.  Constraints are essentially rules that your data has to follow.

Datatypes as database constraints

Datatypes are not something people think of as a constraint (because their not), but it’s probably the most basic way we constrain data we use.  If you want to store an integer and only integers then you would choose the INTEGER datatype.  PostgreSQL offers data types to store IPv4, IPv6, and MAC addresses.  You could use TEXT to store these values, but TEXT will not ensure the values are valid network values.

Common constraints and uncommon usage

The most common constraints are not null, check, unique, primary keys, and foreign Keys.  Check out the PostgreSQL constraint documentation for the common usage of these constraints.  There are also less common ways to use these constraints.  For example, partial unique indexes are less common.  Let’s say you want to have a unique constraint on an email address column, but you also want to have a soft delete concept.  You may choose to create a partial unique index like this:

  ON students (email_address)
  WHERE is_deleted is null;

Beyond very basic checks such as checking for positive prices, you could expand that to multiple columns. The PostgreSQL documentation provides a good example (you want to make sure sale prices are cheaper than the original price):

CREATE TABLE products (
    product_no integer,
    name text,
    price numeric CHECK (price > 0),
    sale_price numeric CHECK (sale_price > 0),
    CHECK (price > sale_price)

There are three data rules using a check constraint in this example, the price must be greater than 0, the sale price must also be greater than 0, and the price must be greater than the sale price.   Another uncommon usage of constraints is to use a function to constrain your data.  Take for example a DATE column that you only want to store dates that are the first day of the month.  First, create a function that will do just that.

create function start_of_month(p_date date)
  returns date
   select date_trunc('month', mydate)::date;
language SQL

Then add the function as a constraint.

create unique index on mytable ((start_of_month(mydate)));

Keep your data safe with constraints

Constraining your data is extremely important to your data quality (and sometimes performance).  Don’t let bad data get into your database, it can be very difficult to remove.  Don’t rely on application code to keep your data clean.  Your database is the foundation of your application and the database needs to prevent bad data.

Matt McGuire

Proven technology leader with 25+ years of experience in the installation, administration, development, design, and maintenance of various enterprise data related systems.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.