Query – All Names in Tree

Trees

Two related problems
1) Finding all names in a selected person’s tree
2) Count Trees

1) Finding all names in selected person’s tree
Does anyone have any idea how to emulate the function that is in the RM Mark Group of “Everyone in the highlighted person’s tree”? RM does this very efficiently/quickly.I break the relationship between parents and then use it to create groups of maternal and paternal lines. It works ok, but would be a lot more efficient if I could automate it using SQL.

Tree-HourglassMembers.sql This produces a subset of the full tree; not the solution… (Tom, May 8)

2) Count Trees
I sometimes forget about this function, but it is very useful in finding orphans

Hopefully I did this right. Thanks in advance
marc

Discussions & comments from Wikispaces site


MLeroux84

A solution – not perfect but workable

MLeroux84
19 September 2016 20:31:55

I finally got back to this, and can get a solution by combining the ancestor/descendent scripts with a bit of procedural logic. And adding in the missing link – the spouses of descendants. It seems to work, although it could use a lot of optimization I’m sure – it runs fairly quickly on my computer.

I tested this on my database. First, I broke the connection between my father and mother. That left me with two “trees”, my French side with 7126 entries, the Irish side with 1091 (from RM/Tools/Count Trees).

Next I created 3 tables:
tTree – contains the RIN of all individuals in the tree (final result)
tAncWork – list of RIN to traverse backwards to get the ancestors of
tDecWork – List of RIN of descendants from a list of ancestors

I populate tAncWork with my seed RIN – in my case “1″ – myself

Then I modified the Ancestor and Descendent scripts
Ancestor reads tAncWork for a list of RINs rather than a single RIN parameter and stores the result back in to tAncWork. After doing this it adds new RINS to tTree

Descendant script reads from tTree (not the most efficient way, but it was easiest) rather than a single RIN parameter and stores the result to tDecWork. The results are moved back to tAncWork and tTree. Spouse (from the familyTable) are added to tAncWork

Then the procedural part. I check tAncWork to see if there are any entries. If there are, then there are spouses that have not been counted and I run the scripts again. As expected, the new entries in tAncWork comes down quickly, and after 4 iterations I have no entries left.

That gives me one tree. Then it’s a simple matter to find the lowest Id in the remaining entries and repeat.

To generalize this, or replicate the “Count Trees” function I would have to add a “treeId” column to tTrees and populate it with the seedId for each “tree”. In my case I’m happy with the results as they are.

I don’t think it’s possible to use loops inside sqlLite but pretty simple in a procedural script. In my case I wrapped it into a VB.net app.

Thanks for the great starting point. This saves me a great deal of time – although if I think about the time invested I could probably have done this in a manual fashion for the next 80 years and still come out ahead. But it is a creative exercise that needed solving.

The table create scripts:

— Drop and recreate temp tables

drop table if exists [tTree] ;
CREATE temp TABLE [tTree] (
[RIN] BIGINT);
drop table if exists [tAncWork] ;
CREATE temp TABLE [tAncWork] (
[RIN] BIGINT);
drop table if exists [tDecWork] ;
CREATE temp TABLE [tDecWork] (
[RIN] BIGINT);

— Seed with the starting RIN

insert into tAncWork (RIN) values (1) ;

The Get Ancestors portion

— Get ancesters of list of individuals in file tAncWork
— Results stored back into tAncWork and also moved to tTree

— Based on the original script by Tom Holden ve3meo


WITH RECURSIVE
parent_of(ChildID, ParentID) AS
(SELECT PersonID, FatherID AS ParentID FROM PersonTable
LEFT JOIN ChildTable ON PersonID=ChildTable.ChildID
LEFT JOIN FamilyTable USING(FamilyID)
UNION
SELECT PersonID, MotherID AS ParentID FROM PersonTable
LEFT JOIN ChildTable ON PersonID=ChildTable.ChildID
LEFT JOIN FamilyTable USING(FamilyID)
),
ancestor_of_person(AncestorID) AS
(SELECT ParentID FROM parent_of
WHERE ChildID in (select RIN from tAncWork)
UNION
SELECT ParentID FROM parent_of
INNER JOIN ancestor_of_person ON ChildID = AncestorID)

insert into tAncWork SELECT DISTINCT AncestorID FROM ancestor_of_person, PersonTable
WHERE ancestor_of_person.AncestorID=PersonTable.PersonID AND ancestor_of_person.AncestorID > 0
;
insert into tTree
select DISTINCT RIN from tAncWork where RIN not in (select RIN from tTree) and RIN > 0
;

The get Descendants portion. This needs to run for every iteration of the get ancestors – I just split them to make it easier to debug
— get descendents of everyone in ancester table
— uses tTree as source
— puts results in tDecWork, then moves to tTree and tAncWork
— tTree contains the final result

— Based on the original script by Tom Holden ve3meo

WITH RECURSIVE
child_of(ParentID, ChildID) AS
(SELECT PersonID, ChildTable.ChildID FROM PersonTable
LEFT JOIN FamilyTable ON PersonID=FatherID
LEFT JOIN ChildTable USING(FamilyID)
WHERE
RelFather=0
UNION
SELECT PersonID, ChildTable.ChildID FROM PersonTable
LEFT JOIN FamilyTable ON PersonID=MotherID
LEFT JOIN ChildTable USING(FamilyID)
WHERE
RelMother=0
),
descendant_of_person(DescendantID) AS
(SELECT ChildID FROM child_of

WHERE ParentID in (select RIN from tTree)
UNION –ALL
SELECT ChildID FROM child_of
INNER JOIN descendant_of_person ON ParentID = DescendantID)
insert into tDecWork SELECT distinct DescendantID FROM descendant_of_person, PersonTable
WHERE descendant_of_person.DescendantID=PersonTable.PersonID AND descendant_of_person.DescendantID > 0 ;

— Move results to tAncWork

delete from tAncWork ;
insert into tAncWork
select distinct RIN from tDecWork where RIN not in (select RIN from tTree) AND RIN > 0
;

— Move results to tTree

insert into tTree
select distinct RIN from tAncWork where RIN not in (select RIN from tTree) and RIN > 0
;

— Add their spouses to tAncWork to prapare for next itiration

insert into tAncWork
select distinct motherId from familyTable
where fatherId in (select RIN from tTree)
and motherId > 0
and motherId not in (select RIN from tTree)
UNION
select distinct fatherId from familyTable
where motherId in (select RIN from tTree)
and fatherId > 0
and fatherId not in (select RIN from tTree);

— Finally, clean out decWork

delete from tDecWork ;


ve3meo

Inline comment: “1) Finding all names in a selected person’s tree‍”

ve3meo
04 September 2018 01:25:23

ve3meo May 8, 2016

Good question for which I have no ready answer. It was difficult to do lineage before SQLite gained recursive query functionality. Doing recursions in SQLite does not come naturally to me. Your goal requires not only ancestors and descendants but also collateral lines. While it does not require kinship results which should make it easier, I confess I do not have a strategy in mind.
MLeroux84 May 8, 2016

Thanks. recursive SQL is not that easy for me either. I was thinking that it would not be that difficult, since it executes so quickly from the RM application. I haven’t seen it take very long to do “Count Trees” – well under a second for 4K names, so I was hoping that this was something obvious that I missed.


ve3meo

Inline comment: “2) Count trees”

ve3meo
04 September 2018 01:26:30

ve3meo May 8, 2016

This one mystifies me, too. It seems that it needs to be a recursion of “Finding all persons in a given person’s tree” iterating through all those remaining persons after subtracting those trees that have been counted.
MLeroux84 May 8, 2016

I wouldn’t think that it was iterating through remaining names, just because of the challenge of recursing through each name. Given the performance I’m seeing (albeit on a machine with reasonable performance) It would seem more likely that there is a way of identifying a node of each tree, then counting from there.

That said, since I haven’t been able to solve the first part of the problem it is all pure speculation


ve3meo

Inline comment: “produces a ‍subset‍ of the full tree”

ve3meo
04 September 2018 01:27:36

ve3meo May 8, 2016

This script includes all the ancestors of the starting person and their descendants but the full tree includes all the descendants of all the ancestors of all the descendants of all the ancestors of…
MLeroux84 May 8, 2016

Thanks, Tom. I played with this a bit today. I’m hampered by my lack of knowledge/experience with recursive queries. I get a bit further (but not very much) than you did by including spouses in the ancestor query. I’m thinking that each iteration needs to include siblings and spouses as seeds for the next iteration, but have not been able to find a way that seems to work.
I’ll keep plugging at it – it has gone from something “I’d like to do” to “It’s really bugging me now”

Inline comments


ve3meo

Comment: Good question for which I have no rea…

ve3meo
08 May 2016 17:31:19

Good question for which I have no ready answer. It was difficult to do lineage before SQLite gained recursive query functionality. Doing recursions in SQLite does not come naturally to me. Your goal requires not only ancestors and descendants but also collateral lines. While it does not require kinship results which should make it easier, I confess I do not have a strategy in mind.


MLeroux84

MLeroux84
09 May 2016 02:01:42

Thanks. recursive SQL is not that easy for me either. I was thinking that it would not be that difficult, since it executes so quickly from the RM application. I haven’t seen it take very long to do “Count Trees” – well under a second for 4K names, so I was hoping that this was something obvious that I missed.


ve3meo

Comment: This one mystifies me, too. It seems …

ve3meo
08 May 2016 17:31:20

This one mystifies me, too. It seems that it needs to be a recursion of “Finding all persons in a given person’s tree” iterating through all those remaining persons after subtracting those trees that have been counted.


MLeroux84

MLeroux84
09 May 2016 02:07:11

I wouldn’t think that it was iterating through remaining names, just because of the challenge of recursing through each name. Given the performance I’m seeing (albeit on a machine with reasonable performance) It would seem more likely that there is a way of identifying a node of each tree, then counting from there.

That said, since I haven’t been able to solve the first part of the problem it is all pure speculation


ve3meo

Comment: This script includes all the ancestor…

ve3meo
09 May 2016 03:58:26

This script includes all the ancestors of the starting person and their descendants but the full tree includes all the descendants of all the ancestors of all the descendants of all the ancestors of…


MLeroux84

MLeroux84
09 May 2016 23:11:22

Thanks, Tom. I played with this a bit today. I’m hampered by my lack of knowledge/experience with recursive queries. I get a bit further (but not very much) than you did by including spouses in the ancestor query. I’m thinking that each iteration needs to include siblings and spouses as seeds for the next iteration, but have not been able to find a way that seems to work.
I’ll keep plugging at it – it has gone from something “I’d like to do” to “It’s really bugging me now”

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.