Reprinted with Permission by Quest Software Dec.2007


Complete Set
by Scott Noyes

Occasionally in #mysql the question pops up, “How can I get all the values from a table that satisfy all of multiple conditions?”

Then we beat the questioner with a large stick, until they give an example of what they really mean.

SELECT * FROM quizAnswers;
+-------------+----------+
| studentName | question |
+-------------+----------+
| seekwill    | A        |
| seekwill    | B        |
| seekwill    | C        |
| roxlu       | A        |
| fury        | B        |
| fury        | B        |
+-------------+----------+

Find all the students who have answered both questions ‘A’ and ‘B’.

First shot at it is to group them together, and count their answers:

SELECT studentName FROM quizAnswers GROUP BY studentName HAVING COUNT(*) = 2;

Sometimes that works, but it won’t in this case: fury has 2 answers, but to the same question. seekwill has 3 answers. We’ll get the wrong result.

UPDATE: As Arlen notes, this query works just fine if we change COUNT(*) to COUNT(DISTINCT question). Now back to the regularly scheduled exercise in obfuscation.

Next we turn to subqueries:

SELECT DISTINCT studentName
FROM quizAnswers
WHERE
    studentName IN (
        SELECT studentName FROM quizAnswers WHERE question = 'A'
    )
    AND studentNAme IN (
        SELECT studentName FROM quizAnswers WHERE question = 'B'
);
+-------------+
| studentName |
+-------------+
| seekwill    |
+-------------+

Gets the right answer. But it doesn’t work in 4.0 or earlier (and people on Freenode always seem to use a shared host running 3.23). So we’ll rewrite it to use JOIN instead.

SELECT DISTINCT studentName
FROM
    quizAnswers AS a
    JOIN quizAnswers AS b USING (studentName)
WHERE
    a.question = 'A'
    AND b.question = 'B';
+-------------+
| studentName |
+-------------+
| seekwill    |
+-------------+

“But wait!” they say. “This example only had 2 values. My real data has 22 possible values. I don’t want to self-join the table 22 times (or write 22 subqueries). And the real values are random strings, not nice and sequential, so don’t try building the query in a loop or anything!”

Never fear! Obscure and infrequently used string and aggregate functions to the rescue!

SELECT studentName
FROM quizAnswers
GROUP BY studentName
HAVING BIT_OR(1 << FIELD(question, 'A', 'B') - 1) = 3;
+-------------+
| studentName |
+-------------+
| seekwill    |
+-------------+

Handling the rest of those 22 values is a simple matter of adding them to the parameters to FIELD(), and changing that 3 to (2^number_of_values - 1). We’ll even put the comma-separated values into a user variable and let MySQL figure out the rest (note the switch to FIND_IN_SET here, since now we’re using a single string of values):

SET @q = 'A,B';

SELECT studentName
FROM quizAnswers
GROUP BY studentName
HAVING
        BIT_OR(1 << FIND_IN_SET(question, @q) - 1)
    =
        (1 << LENGTH(@q) - LENGTH(REPLACE(@q, ',', '')) + 1) - 1; -- This is 2^numValues - 1
+-------------+
| studentName |
+-------------+
| seekwill    |
+-------------+

How does it work? I’ll let you look up the functions on your own, but here are the values each generates:

SELECT
    studentName,
    question,
    FIND_IN_SET(question, 'A,B,C') AS position,
    1 << FIND_IN_SET(question, 'A,B,C') - 1 AS bitmap
FROM quizAnswers;
+-------------+----------+----------+--------+
| studentName | question | position | bitmap |
+-------------+----------+----------+--------+
| seekwill    | A        | 1        | 1      |
| seekwill    | B        | 2        | 2      |
| seekwill    | C        | 3        | 4      |
| roxlu       | A        | 1        | 1      |
| fury        | B        | 2        | 2      |
| fury        | B        | 2        | 2      |
+-------------+----------+----------+--------+

SELECT
    studentName,
    BIT_OR(1 << FIND_IN_SET(question, 'A,B,C') - 1) AS groupBitMap
FROM quizAnswers
GROUP BY studentName;
+-------------+-------------+
| studentName | groupBitMap |
+-------------+-------------+
| fury        | 2           |
| roxlu       | 1           |
| seekwill    | 7           |
+-------------+-------------+


Scott is a Support Engineer at MySQL. He has used MySQL professionally since 2000, mostly doing web development work on a LAMP stack. He has a bachelor's degree in Computer Science and Engineering from LeTourneau University. You can learn more about MySQL on Scott's blog site, A Little Noise.