All Citations – Query #citations

CTE Versions for RM7 and RM8

Revised code by Pat Jones using common table expressions:
Version for RM7 and below  AllCitations-RM7.sql  tested 2021-01-20

Version for #RM8 and above  AllCitations-RM8.sql revised 2022-04-27 to include duplicate uses

Unlike the old version, this script does not appear to be able to display “headless citations” or “phantom sources or citations” as Null fields as in the screenshot below from the original script from 2010.

Old version for RM7 and below

AllCitations.sql revised 2022-04-26 to correct a SQLite error detected by later versions of SQLite 

Lists all citations in the database from which citations of non-existent sources (‘phantoms’) and citations for non-existent events or persons (‘headless’) can be found, along with other useful information such as all citations per source.

The query builds a temporary table with index which it then queries with a filter to suppress duplicate reports of citations against Alternate Names. While the query could be made without a temporary table, it would run exponentially more slowly with larger databases without invoking measures that are deprecated by SQLite documentation and incompatible with SQLite versions < 3.6.3. The intermediate table overcomes this problem and may be used for additional queries that will execute very quickly.

SQLiteSpy-AllCitations.png
Screenshot from SQLiteSpy

Discussions & comments from Wikispaces site


ve3meo

 

Slow query – possible improvement w/o temp table

 

ve3meo
05 February 2010 15:26:27

Parking this idea here for reference.

Problem: I’ve encountered really slow query speed on larger databases, usually when both NameTable.IsPrimary and OwnerID are constraints. When just OwnerID is the constraint, the sqlite query optimizer correctly chooses the idxNameTableOwnerID index and it runs fast. Add the IsPrimary constraint and it chooses the idxNameTableIsPrimary index and it runs very slow.

Possible Solutions:
1. Use the INDEXED BY clause to override the optimisier. However, use of this clause to tune performance is a “No-No”, according to the docs and not supported by sqlite < 3.6.3, e.g., by SQLiteman and DBTools DBManager.

2. Temporary table built w/o the IsPrimary constraint but including IsPrimary as a field. Query the temp table with the IsPrimary constraint. Works well.

3. Newest idea from the sqlite newsgroup: when querying with both constraints, make the IsPrimary a formula or expression to trick the query optimiser to ignore it.
SELECT * FROM NameTable … WHERE OwnerID=’x’ AND +IsPrimary=1;

Ref:”Igor Tandetnik” <itandetnik@mvps.org> wrote in message news:hkh4a0$mk8$1@ger.gmane.org…


ve3meo

 

ve3meo
05 February 2010 16:17:49

I meant to post this against the AllFacts4Persons query which is where I first ran into the speed problem. Having just tested the slight modification of the query below by adding the ‘+’ operator to the IsPrimary constraints improved the execution time using sqlite 3.6.22 by a factor of 3883, from 1130s to 0.291s!!!

[code]SELECT FactTypeTable.Name COLLATE NOCASE AS Fact, ‘Principal’ AS ‘Role
Type’, NameTable1.OwnerID AS RIN, NameTable1.Surname COLLATE NOCASE AS
Surname, NameTable1.Suffix COLLATE NOCASE AS Suffix, NameTable1.Prefix
COLLATE NOCASE AS Prefix, NameTable1.Given COLLATE NOCASE AS ‘Given Name’,
NameTable2.OwnerID AS ‘Sharer RIN’, NameTable2.Surname COLLATE NOCASE AS
‘Sharer Surname’, NameTable2.Suffix COLLATE NOCASE AS ‘Sharer Suffix’,
NameTable2.Prefix COLLATE NOCASE AS ‘Sharer Prefix’, NameTable2.Given
COLLATE NOCASE AS ‘Sharer Given Name’, COUNT(1) AS Count
FROM EventTable
INNER JOIN FactTypeTable ON EventTable.EventType = FactTypeTable.FactTypeID
INNER JOIN FamilyTable ON EventTable.OwnerID = FamilyTable.FamilyID
INNER JOIN NameTable AS NameTable1 ON FamilyTable.FatherID =
NameTable1.OwnerID
INNER JOIN NameTable AS NameTable2 ON FamilyTable.MotherID =
NameTable2.OwnerID
WHERE EventTable.OwnerType = 1 AND +NameTable1.IsPrimary = 1 AND
+NameTable2.IsPrimary = 1
GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12[/code]


Geraniums

 

Citations for non-existent events or persons (‘headless’)

Geraniums
05 July 2010 09:36:07

I have a large area of pink in the report.

Do the “CitID” and “CrcID” numbers mean anything? There are no surnames or given names – I assume that’s what’s meant by “headless”.

How can I find out where in the RootsMagic database that these can be fixed.

Thanks,
Debbie


ve3meo

 

ve3meo
05 July 2010 12:17:15

CitID is the internal reference number for a citation and SrcID is the same for a Master Source, both numbers hidden from the RootsMagic user.

If you are seeing pink, I think you may be using SQLiteSpy -that’s how it indicates a Null value. A Null value for both Surname and Given likely indicates a “headless” citation, unless a person was entered in the database without being assigned any names – I think you may have a few examples of that.

Headless citations cannot be fixed in the RootsMagic app except through a GEDCOM export/import.

Tom

One Reply to “All Citations – Query #citations

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.