Tag Archives: multi column

SQL: Finding Duplicate Rows based on Multiple Columns CodeUnit 25 JUN 2010

Locating duplicate rows based on multiple column values with SQL turns out to be a fairly simple exercise.

If we take the usual method for locating duplicate rows based on a single column’s value, we see that locating a duplicate is nothing more than grouping on that column where the instance of that column value appears more than once.

In other words:

SELECT email, COUNT(email)
FROM users
GROUP BY email
HAVING ( COUNT(email) > 1 )

Now to find duplicate rows based on the values held by multiple columns, we simply expand on what we have done before and provide more parameters to work against, resulting in a SQL statement that looks like:

SELECT firstname, COUNT(firstname), surname, COUNT(surname), email, COUNT(email) 
FROM users
GROUP BY firstname, surname, email
HAVING (COUNT(firstname) > 1) AND (COUNT(surname) > 1) AND (COUNT(email) > 1)

And that is seriously how simple finding duplicate rows based on a compound key (multiple columns restraint) really is! :)

SQL: Add a Unique Compound Key to an Existing MySQL Table CodeUnit 21 JUN 2010

A simple question for today: how does one go about preventing duplicate rows based on the value held by more than one column being inserted into an existing MySQL table?

Well the solution lies in the use of a unique compound key which is basically a way of specifing unique rows based on a value given by a combination of one or more columns.

So for example, if we said we wanted to prevent people sharing the same firstname, surname and title from being added to a persons table, we are in fact specifying a unique compound key to act upon the columns firstname, surname and title!

With this in mind, now adding this key constraint to a table is as simple as running:

ALTER TABLE people ADD UNIQUE KEY (firstname,surname,title)

You’ll note that running this SQL statement on a table containing data already breaking this new rule that we are imposing will fail, meaning that one will first manually have to remove all the offending data before running the ALTER statement again.

And that’s it, a pretty simple little tip for today’s CodeUnit SQL tutorial entry! :)