The following query was used to determine what twins should be studied further after a initial screening. This was written around 1997 using MySQL 3.19.
SELECT CONCAT(p1.id, p1.tvab) + 0 AS tvid, CONCAT(p1.christian_name, ' ', p1.surname) AS Name, p1.postal_code AS Code, p1.city AS City, pg.abrev AS Area, IF(td.participation = 'Aborted', 'A', ' ') AS A, p1.dead AS dead1, l.event AS event1, td.suspect AS tsuspect1, id.suspect AS isuspect1, td.severe AS tsevere1, id.severe AS isevere1, p2.dead AS dead2, l2.event AS event2, h2.nurse AS nurse2, h2.doctor AS doctor2, td2.suspect AS tsuspect2, id2.suspect AS isuspect2, td2.severe AS tsevere2, id2.severe AS isevere2, l.finish_date FROM twin_project AS tp /* For Twin 1 */ LEFT JOIN twin_data AS td ON tp.id = td.id AND tp.tvab = td.tvab LEFT JOIN informant_data AS id ON tp.id = id.id AND tp.tvab = id.tvab LEFT JOIN harmony AS h ON tp.id = h.id AND tp.tvab = h.tvab LEFT JOIN lentus AS l ON tp.id = l.id AND tp.tvab = l.tvab /* For Twin 2 */ LEFT JOIN twin_data AS td2 ON p2.id = td2.id AND p2.tvab = td2.tvab LEFT JOIN informant_data AS id2 ON p2.id = id2.id AND p2.tvab = id2.tvab LEFT JOIN harmony AS h2 ON p2.id = h2.id AND p2.tvab = h2.tvab LEFT JOIN lentus AS l2 ON p2.id = l2.id AND p2.tvab = l2.tvab, person_data AS p1, person_data AS p2, postal_groups AS pg WHERE /* p1 gets main twin and p2 gets his/her twin. */ /* ptvab is a field inverted from tvab */ p1.id = tp.id AND p1.tvab = tp.tvab AND p2.id = p1.id AND p2.ptvab = p1.tvab AND /* Just the screening survey */ tp.survey_no = 5 AND /* Skip if partner died before 65 but allow emigration (dead=9) */ (p2.dead = 0 OR p2.dead = 9 OR (p2.dead = 1 AND (p2.death_date = 0 OR (((TO_DAYS(p2.death_date) - TO_DAYS(p2.birthday)) / 365) >= 65)))) AND ( /* Twin is suspect */ (td.future_contact = 'Yes' AND td.suspect = 2) OR /* Twin is suspect - Informant is Blessed */ (td.future_contact = 'Yes' AND td.suspect = 1 AND id.suspect = 1) OR /* No twin - Informant is Blessed */ (ISNULL(td.suspect) AND id.suspect = 1 AND id.future_contact = 'Yes') OR /* Twin broken off - Informant is Blessed */ (td.participation = 'Aborted' AND id.suspect = 1 AND id.future_contact = 'Yes') OR /* Twin broken off - No inform - Have partner */ (td.participation = 'Aborted' AND ISNULL(id.suspect) AND p2.dead = 0)) AND l.event = 'Finished' /* Get at area code */ AND SUBSTRING(p1.postal_code, 1, 2) = pg.code /* Not already distributed */ AND (h.nurse IS NULL OR h.nurse=00 OR h.doctor=00) /* Has not refused or been aborted */ AND NOT (h.status = 'Refused' OR h.status = 'Aborted' OR h.status = 'Died' OR h.status = 'Other') ORDER BY tvid;
Some explanations:
CONCAT(p1.id, p1.tvab) + 0 AS
tvid
We want to sort on the concatenated id
and tvab
in numerical order. Adding
0
to the result causes MySQL to treat the
result as a number.
Column id
This identifies a pair of twins. It is an index in all tables.
Column tvab
This identifies a twin in a pair. It has a value of
1
or 2
.
Column ptvab
This is an inverse of tvab
. When
tvab
is 1
this is
2
, and vice versa. It exists to save
typing and to make it easier for MySQL to optimize the
query.
This query demonstrates, among other things, how to do lookups
on a table from the same table with a join
(p1
and p2
). In the
example, this is used to check whether a twin's partner died
before the age of 65. If so, the row is not returned.
All of the above exist in all tables with twin-related
information. We have an index on both id,
tvab
(all tables), and id, ptvab
(person_data
) to make queries faster.
When we did this work, our production machine was a 200MHz UltraSPARC, and on that old hardware this query returned about 150–200 rows in less than one second. The main table had 70k Rows.
User Comments
Add your own comment.