Common Table Expressions (CTE) are a feature that was first introduced to SQL in about 2005 and provide a method of defining a temporary result set then using that in the query as if it were another table.
Yes, subqueries do the same thing but a CTE can be used repeatedly in the query after it has been declared without further re-declaration whereas a sub-query has to be declared every time it is used. Views could also be created but that involves a physical object within the database being created and remembering to tidy up and delete it later.
The CTE is declared at the start of a query using the WITH construct so I can write:
WITH pers AS
(
Select
NameTable.OwnerID AS RIN1,
NameTable.Surname COLLATE NOCASE AS Surname,
NameTable.Given COLLATE NOCASE AS Given
FROM NameTable
where NameTable.IsPrimary = 1
)
The very first word of the first CTE has to be WITH and this has to be the very start of the query, apart from comments. Then the name of the CTE is declared folowed by AS then its definition in brackets. CTE names are best kept short as they will be used further on in the query. Now I have an item I can use in the rest of my query so that continues:
SELECT
EventID,
EventType,
pers.RIN1,
pers.Surname,
pers.Given
FROM EventTable as ev
LEFT OUTER JOIN pers ON
ev.OwnerID = pers.RIN1
AND ev.OwnerType = 0
so you can see pers is treated as if it were another table or view in the query.
This is just a simple example using a CTE once. A query can contain multiple CTE and once the first has been declared, subsequent CTE are declared by continuing the WITH clause and putting
, ctename AS
(
my cte code
)
rather than another WITH. So I can add another CTE to get both parent details for a family and this can (but does not have to) use the first CTE.
,fam as
(
select FamilyID, RIN1, Surname, Given
from FamilyTable f inner join pers
on f.FatherID = pers.RIN1
where f.FatherID > 0
UNION
select FamilyID, RIN1, Surname, Given
from FamilyTable f inner join pers
on f.MotherID = pers.RIN1
where f.MotherID > 0
)
All this happens before the main query and the main query simply uses those CTE as if they were tables.
So main query could now be
SELECT
EventID,
EventType,
f.Name COLLATE NOCASE AS FactType,
ev.OwnerType,
ev.OwnerTypeN,
case when ev.OwnerType = 1 THEN fam.Surname
ELSE pers.Surname
END as Surname,
case when ev.OwnerType = 1 THEN fam.Given
ELSE pers.Given
END as Given,
case when ev.OwnerType = 1 THEN fam.RIN1
ELSE pers.RIN1
END as RIN1,
ev.Eventdate,
ev.proof,
ev.placeID,
ev.SiteID,
ev.details
from ev /* join in person where owner type is person */
INNER JOIN FactTypeTable f ON
e.EventType = f.FactTypeID
LEFT OUTER JOIN pers ON
ev.OwnerID = pers.RIN1 and ev.OwnerType = 0
/* join in the family persons for owner type = family */
LEFT OUTER JOIN fam ON
ev.OwnerID = fam.FamilyID and ev.OwnerType = 1
WHERE ev.EventType <> 35 /* ignore reference numbers */
/* and (ev.Proof = 0) – not proven (remove surrounding comment markers /* and */ to include this restriction)*/
ORDER BY Surname, given, RIN1, eventtype
The overall layout is
WITH a AS (something) , b AS (something else) , c AS (something more) SELECT ……….
Note the left outer join syntax which allows all records from the left hand table to be included even if there is not a matching record on the right hand side. Missing data is set to null values. Inner JOIN and just plain JOIN would leave out records which do not have a match for the person id and an owner type of 0 and in fact the whole query would just mess up.
A good rule for SQL readability is to always fully specify the join type so never use just JOIN but always INNER JOIN or LEFT OUTER JOIN.
There is also a RIGHT OUTER JOIN and a FULL OUTER JOIN but SQLite does not support these constructs.
Another useful implementation of the CTE is to expand out tables in RM that contain dates and fields that need collation changes. The date expansion code is longwinded and distracting from the main query. Putting it in a CTE gets it out of the way, leaving the main query readable.
My full query example attached is a list of facts and their citations, including the owner (person or family). By altering the where clause this can list facts without citations or facts not marked proven (or both).
With the database changes between RM7 and RM8 the CTE is most useful to convert queries written for version 7 to run against a version 8 database. The CitationTable from version 7 has been split into CitationTable and CitationLinkTable in version 8 but using a CTE we can emulate the version 7 structure then just change the table name in the main query.
So we have:
With Cit8 as
(Select c.CitationID, cl.OwnerType, cl.OwnerID, c.SourceID, cl.Quality, cl.IsPrivate, c.Comments, c.ActualText, c.RefNumber, cl.Flags, c.Fields
from CitationTable c INNER JOIN CitationLinkTable cl
ON c.CitationID = cl.CitationID)
then continue with original query but change all occurrences of CitationTable to Cit8
So a simple list of citations with their sources becomes:
With Cit8 as
(Select c.CitationID, cl.OwnerType, cl.OwnerID, c.SourceID, cl.Quality, cl.IsPrivate, c.Comments, c.ActualText, c.RefNumber, cl.Flags, c.Fields
from CitationTable c INNER JOIN CitationLinkTable cl
ON c.CitationID = cl.CitationID)
SELECT * FROM Cit8 LEFT OUTER JOIN SourceTable ON CitationTable.SourceID = SourceTable.SourceID
I have built queries with many CTE to progressively build up data for complex reporting and find it a far more efficient way of coding which can be easily tested level by level.
My query to list facts and their citations is attached in two versions, cte-facts-citations-RM7.sql for RootsMagic version 7 databases
and cte-facts-citations-RM8.sql for RootsMagic version 8 databases.
Further Reading:
This article is aimed at SQL Server but SQLite uses the same syntax.
https://www.red-gate.com/simple-talk/sql/t-sql-programming/sql-server-cte-basics/
Pat Jones
SQL and Business Intelligence Developer
[…] query does not run on RM8. Pat Jones revised it and applied a technique she describes at Common Table Expressions – The Building Blocks of SQL to streamline […]
more good info— thanks for sharing!