Forum

Please or Register to create posts and topics.

Counting DNA kits attached to descendants of people in RM trees

Hello,

First, thank you for this site.  It is a tremendous resource for using SQL with RootsMagic.  Recently I implemented some SQL code to count the number of DNA kits attached to descendants of people in RM trees.  The process is clunky at the moment, but it works.  Hopefully people here can give suggestions or feedback.

The problem statement: I have a RM9 tree with about 1700 Ancestry DNA kits attached (and a few from other platforms).  My primary use for these RM kits at the moment is creating DNA Matrix charts in Charting Companion.  Therefore I use the custom "DNA kit" fact supported by CC.   The problem here is I would like a way of evaluating DNA match coverage of various parts of my tree.  The intent is to check questionable parts of my tree (no DNA matches?!), suggest areas of the tree to make a search for more DNA matches (1 match here? let's look at their shared matches), and quantify potential tree connections (this unconnected subtree has 50 matches, must be an ancestor here somewhere).  To evaluate coverage I came up with the idea of presenting descendant DNA kit counts hierarchically allowing a quantitative look at exactly which ancestors/relatives/cousin levels contain matches.  Initially I hoped to present this in a fan chart, but I have been unable to make that work so am using CC ancestor or descendant charts for now.

Solution overview: use SQL code (largely derived from code from this site) to take a list of people in a RM tree (say ancestors, or everyone) and generate a RM custom fact DescDNAKitCnt containing a count of descendants with DNA kits (various conditions possible, currently I am limiting the count to Ancestry kits) for each of those people.  Then display those custom facts hierarchically in CC charts.  Or view them in the RM individual details pane while browsing the tree.

Tools: I started with DB Browser for SQLite, but found it necessary to move to SQLite Expert Personal for recursive query support.  Then found it necessary to move to SQLite Studio for support of functions created from SQL.  All of this done on Windows 10.  At the moment I am mixing between the three tools depending on my task.  Does anyone know of a good feature comparison of the SQLite tools?  This is the best I found, but not nearly comprehensive enough.
https://softwarerecs.stackexchange.com/questions/1057/sqlite-gui-editor

This site has good documentation and examples for recursive queries.  Some other links were helpful for turning SQL code into a function in SQLite Studio:
https://github.com/pawelsalawa/sqlitestudio/discussions/4850
https://github.com/pawelsalawa/sqlitestudio/wiki/User_Manual#custom-sql-functions

SQL code: Multiple steps here which are not well integrated yet.

First, create a table of people and dummy kit counts.  Table named PersonDescKitCountsTable with fields PersonID and DescKitCnt

Here is a version for ancestors of a RIN.  Cribbed from

Ancestors Query – Recursive #recursive


I see the Enlighter plugin for SQL code mentioned in a post, but don't see those options in my menu so using blockquote.

DROP TABLE IF EXISTS PersonDescKitCountsTable;

CREATE TABLE PersonDescKitCountsTable AS
WITH RECURSIVE
parent_of(ChildID, ParentID) AS
(SELECT PersonID, FatherID AS ParentID FROM PersonTable
LEFT JOIN ChildTable ON PersonID=ChildTable.ChildID
LEFT JOIN FamilyTable USING(FamilyID)
WHERE RelFather=0 --birth father (remove WHERE constraint to include all relationships)
UNION
SELECT PersonID, MotherID AS ParentID FROM PersonTable
LEFT JOIN ChildTable ON PersonID=ChildTable.ChildID
LEFT JOIN FamilyTable USING(FamilyID)
WHERE RelMother=0 --birth mother (remove WHERE constraint to include all relationships)
),
ancestor_of_person(AncestorID) AS
(SELECT ParentID FROM parent_of
WHERE ChildID=$Person(RIN) --enter RIN of starting person at runtime
UNION --ALL
SELECT ParentID FROM parent_of
INNER JOIN ancestor_of_person ON ChildID = AncestorID)
SELECT AncestorID AS PersonID, 0 AS DescKitCnt FROM ancestor_of_person, PersonTable
WHERE ancestor_of_person.AncestorID=PersonTable.PersonID

Alternatively, create the table for all RM people.  This version took about an hour to run the kit counting on my 25k person tree.

DROP TABLE IF EXISTS PersonDescKitCountsTable;

CREATE TABLE PersonDescKitCountsTable AS
SELECT PersonID, 0 AS DescKitCnt FROM PersonTable;

Second, do the kit counting.  Here is code for the SQL function I created to count descendant kits.  Scalar function named CountDescDNAKits with argument RIN.  Note dependency on my custom DNA kit fact being EventTpe 1076.  Code cribbed from this site as well.

SELECT COUNT(DISTINCT DescendantID) FROM
(WITH RECURSIVE
child_of(ParentID, ChildID) AS
(SELECT PersonID, ChildTable.ChildID FROM PersonTable
LEFT JOIN FamilyTable ON PersonID=FatherID
LEFT JOIN ChildTable USING(FamilyID)
WHERE
RelFather=0 --birth father (ELSE WHERE 1 to include all relationships)
UNION
SELECT PersonID, ChildTable.ChildID FROM PersonTable
LEFT JOIN FamilyTable ON PersonID=MotherID
LEFT JOIN ChildTable USING(FamilyID)
WHERE
RelMother=0 --birth mother (ELSE WHERE 1 to include all relationships)
),
descendant_of_person(DescendantID) AS
(SELECT ChildID FROM child_of
--WHERE ParentID=$Person(RIN) --enter RIN of starting person at
runtime
WHERE ParentID=:RIN -- Supply RIN of starting person as a function argument
UNION --ALL
SELECT ChildID FROM child_of
INNER JOIN descendant_of_person ON ParentID = DescendantID)
SELECT DescendantID FROM descendant_of_person, PersonTable
WHERE descendant_of_person.DescendantID=PersonTable.PersonID
)
JOIN EventTable
WHERE EventTable.EventType = 1076 AND EventTable.OwnerID = DescendantID
-- AND EventTable.Details LIKE '________-%-%-%-%' -- Ancestry kits without leading x
AND EventTable.Details LIKE '%-____-%-%-%' -- Ancestry kits
;

I found it easiest to run this as a query in SQLiteStudio exporting the results to a CSV which I then imported to DB Browser.

SELECT PersonID, CountDescDNAKits(PersonID) AS DescKitCnt FROM PersonDescKitCountsTable

Third, now we have the PersonDescKitCountsTable populated.  Use that to add the custom facts DNADescKitCnt (EventType 1080 for me) in RM.

INSERT INTO EventTable (EventType, OwnerType, OwnerID, FamilyID, PlaceID, SiteID, SortDate, isPrimary, isPrivate, Proof, Status, Details, UTCModDate)
SELECT 1080, 0, PersonID, 0, 0, 0, 9223372036854775807, 0, 0, 0, 0, DescKitCnt, julianday('now') - 2415018.5
FROM PersonDescKitCountsTable;

/*
Remove duplicate entries keeping the most recent.
*/

DELETE FROM EventTable
WHERE EventType = 1080 AND ROWID NOT IN (
SELECT MAX(ROWID)
FROM EventTable
WHERE EventType = 1080
GROUP BY OwnerID
);

Lastly, now we can use this information.  Many ways to do that.  For example, here I use a similar AncestorKitCountsTable to pull out all non-ancestors with DNA kits attached to descendants.  I then order by DescKitCnt to look for unattached subtrees with many matches.

SELECT PersonID, DescKitCnt FROM PersonDescKitCountsTable
WHERE NOT EXISTS (
SELECT 1
FROM AncestorKitCountsTable
WHERE AncestorKitCountsTable.AncestorID = PersonDescKitCountsTable.PersonID
) AND DescKitCnt>0;

My primary uses so far have been:

  1. To create a CC Ancestors chart to display the counts hierarchically.  This lets me easily see the number of matches at each ancestor level.  Looking at the difference between two levels shows how many cousins are specifically through the older generation.
  2. Browsing the RM tree.  For example, given an ancestor with many descendant matches look at children to see how each fits in.
  3. Queries.  For example, identify ancestors with the most matches.
  4. Targeting DNA matrix efforts and helping with interpretation.

Thanks for reading this far.  I would appreciate any comments or suggestions.

kevync has reacted to this post.
kevync

you have done some really creative stuff.

Power BI already does hierarchy and could be done based of the child - parents relationship I believe (though I have yet to explore that part).  I uses a OBDC connection to directly query data from the RM database base.

This is a beta version of what I started  (which also links) to  myRootsMagic

 

https://app.powerbi.com/groups/me/reports/76bd38ea-c181-41c6-b7d3-6d8bcc5f7cd1/cd92cf2949bb5d8d8d8b?experience=power-bi

 

Richard Seiter has reacted to this post.
Richard Seiter

Thanks.  I tried accessing that and failed.  Are you using Power BI as an individual?  It seems very much group oriented and my attempt to set up an account ended up with authentication issues.

How well does Power BI hierarchy map to genealogy?  Looking at this page

What a Power BI Hierarchy Is, and How to Use it?


I am not seeing analogs to marriage or multiple parent connections.  Though I do see how ancestor or descendant trees would map.

How are you using Power BI with RM?  I searched for forum posts about that, but did not see anything?