Delete people who are not related
Quote from GlenB on 2019-03-05, 8:42 pmI seem to have a number of people in my database who are not related to the root person. This was not my intent so I seem to have wandered off on some non-familial branch at some point. I'm looking for a way to prune the tree.
I used RM's Set Relationships tool so when I look at a person I can see their relationship to the root person. Is that relationship stored in some field? If so, I could use that to delete all people without that field set to something.
But what about spouses? Fred Smith is a blood relation to the root person but I would not want to delete all of Fred's 3 wives from the database.
Maybe it's easier to export to a new database all the people who ARE related, and all of their spouses too?
I searched for a script on this site and did not see one.
I seem to have a number of people in my database who are not related to the root person. This was not my intent so I seem to have wandered off on some non-familial branch at some point. I'm looking for a way to prune the tree.
I used RM's Set Relationships tool so when I look at a person I can see their relationship to the root person. Is that relationship stored in some field? If so, I could use that to delete all people without that field set to something.
But what about spouses? Fred Smith is a blood relation to the root person but I would not want to delete all of Fred's 3 wives from the database.
Maybe it's easier to export to a new database all the people who ARE related, and all of their spouses too?
I searched for a script on this site and did not see one.
Quote from Tom Holden on 2019-03-05, 10:16 pmThere isn't a script that does exactly what you want.
You could mark or color code everyone in the same tree which includes spouses and export them if you are satisfied that the GEDCOM losses are not a problem.
If GEDCOM losses are objectionable the Delete Many page can help. If the number of trees is not too great (RM's Count Trees tool), you could work your way through each unwanted tree and color code everyone in the same tree or add them to a Group named as described on that page. Then use the corresponding script to batch delete.
There isn't a script that does exactly what you want.
You could mark or color code everyone in the same tree which includes spouses and export them if you are satisfied that the GEDCOM losses are not a problem.
If GEDCOM losses are objectionable the Delete Many page can help. If the number of trees is not too great (RM's Count Trees tool), you could work your way through each unwanted tree and color code everyone in the same tree or add them to a Group named as described on that page. Then use the corresponding script to batch delete.
Quote from GlenB on 2019-03-06, 2:07 pmGEDCOM is better than nothing, but I'd prefer to do it all within the database if I can.
I'm thinking that I could colour code everyone who is related using People Selected from a List> Mark Group > Ancestors > Ancestors and Descendants and then use Delete Many to delete anyone who is not coloured and then cleanup phantoms 3.
Just need to check if that marked group includes the spouses or not. Might need some extra steps!
Alternatively - where does Tools > Set Relationships store the relationship information? Could I use that to include/exclude people?
GEDCOM is better than nothing, but I'd prefer to do it all within the database if I can.
I'm thinking that I could colour code everyone who is related using People Selected from a List> Mark Group > Ancestors > Ancestors and Descendants and then use Delete Many to delete anyone who is not coloured and then cleanup phantoms 3.
Just need to check if that marked group includes the spouses or not. Might need some extra steps!
Alternatively - where does Tools > Set Relationships store the relationship information? Could I use that to include/exclude people?
Quote from Tom Holden on 2019-03-06, 9:23 pmThe results of the RM Set Relationships tool are stores in PersonTable.Relate1 and Relate2. If both are 0, RM found no blood relationship. You can learn more about the meaning of these fields on the Relationships 1 & 2 tab on Database Design Spreadsheets. Some spouses are included - certainly the closest ones where you would "-in-law"; not sure if all are.
Color-coding is easier to set than is the group outside of RM because there is a PersonTable.Color field.
The results of the RM Set Relationships tool are stores in PersonTable.Relate1 and Relate2. If both are 0, RM found no blood relationship. You can learn more about the meaning of these fields on the Relationships 1 & 2 tab on Database Design Spreadsheets. Some spouses are included - certainly the closest ones where you would "-in-law"; not sure if all are.
Color-coding is easier to set than is the group outside of RM because there is a PersonTable.Color field.
Quote from GlenB on 2019-03-07, 2:00 pmInvestigation...
Database has 14493 persons, pick the base person;
Set relationships results in 4785 people with both Relate1 and Relate2 = 0;
Meaning 14493-4785=9708 people are directly related to the base person.Run the Relationship List report and import into Excel;
There are 13607 records in the report;
3926 records contain the text "Spouse" in them;
Meaning 13607-3926=9681 people are directly related to the base person.That's a difference of 27 people who seem to be directly related but I can't figure out how or why the are not in both lists. Clearly the Relationship List report does some extra figuring out!
Maybe there is a one-to-many factor because some people have more than one relationship to the base person? However there are no duplicated people in the Relationship List so that alone cannot explain it.
There are 27 people that Set Relationship says are related but they are not in the Relationship list - time for some more analysis.
Investigation...
Database has 14493 persons, pick the base person;
Set relationships results in 4785 people with both Relate1 and Relate2 = 0;
Meaning 14493-4785=9708 people are directly related to the base person.
Run the Relationship List report and import into Excel;
There are 13607 records in the report;
3926 records contain the text "Spouse" in them;
Meaning 13607-3926=9681 people are directly related to the base person.
That's a difference of 27 people who seem to be directly related but I can't figure out how or why the are not in both lists. Clearly the Relationship List report does some extra figuring out!
Maybe there is a one-to-many factor because some people have more than one relationship to the base person? However there are no duplicated people in the Relationship List so that alone cannot explain it.
There are 27 people that Set Relationship says are related but they are not in the Relationship list - time for some more analysis.
Quote from GlenB on 2019-03-07, 9:26 pmIt looks like there are 2 factors that complicate things and I havent figured out how to handle them all yet:
1 - people with alternate names
2 - people with multiple spouses
Here's the SQL I'm using so far to make my version of the relationship list of who is related to the selected person when Set Relationships is done:
-- all the people who are related and have no spouse
-- union all the males who are related and their wife's name
-- union all the females who are related and the husband's name
select P.PersonID as PersonID, N.Surname as Surname, N.Given as Given, P.Relate1 as Relate1, P.Relate2 as Relate2,
P.Sex as Gender, P.SpouseID as FamilyID, null as FatherID, null as MotherID, null as SpouseSurname, null as SpouseGiven
from PersonTable as P, NameTable as N
where ( P.Relate1 > 0 or P.Relate2 > 0 ) and P.SpouseID = 0
and N.OwnerID = P.PersonIDUNION
select P.PersonID, N.Surname, N.Given, P.Relate1, P.Relate2, P.Sex, P.SpouseID, F.FatherID, F.MotherID, N2.Surname, N2.Given from PersonTable as P, NameTable as N, FamilyTable as F, NameTable as N2
where ( P.Relate1 > 0 or P.Relate2 > 0 ) and P.SpouseID > 0 and P.Sex = 0
and N.OwnerID = P.PersonID and F.FamilyID = P.SpouseID
and N2.OwnerID = F.MotherIDUNION
select P.PersonID, N.Surname, N.Given, P.Relate1, P.Relate2, P.Sex, P.SpouseID, F.FatherID, F.MotherID, N2.Surname, N2.Given from PersonTable as P, NameTable as N, FamilyTable as F, NameTable as N2
where ( P.Relate1 > 0 or P.Relate2 > 0 ) and P.SpouseID > 0 and P.Sex = 1
and N.OwnerID = P.PersonID and F.FamilyID = P.SpouseID
and N2.OwnerID = F.FatherIDORDER BY P.PersonID
It looks like there are 2 factors that complicate things and I havent figured out how to handle them all yet:
1 - people with alternate names
2 - people with multiple spouses
Here's the SQL I'm using so far to make my version of the relationship list of who is related to the selected person when Set Relationships is done:
-- all the people who are related and have no spouse
-- union all the males who are related and their wife's name
-- union all the females who are related and the husband's name
select P.PersonID as PersonID, N.Surname as Surname, N.Given as Given, P.Relate1 as Relate1, P.Relate2 as Relate2,
P.Sex as Gender, P.SpouseID as FamilyID, null as FatherID, null as MotherID, null as SpouseSurname, null as SpouseGiven
from PersonTable as P, NameTable as N
where ( P.Relate1 > 0 or P.Relate2 > 0 ) and P.SpouseID = 0
and N.OwnerID = P.PersonID
UNION
select P.PersonID, N.Surname, N.Given, P.Relate1, P.Relate2, P.Sex, P.SpouseID, F.FatherID, F.MotherID, N2.Surname, N2.Given from PersonTable as P, NameTable as N, FamilyTable as F, NameTable as N2
where ( P.Relate1 > 0 or P.Relate2 > 0 ) and P.SpouseID > 0 and P.Sex = 0
and N.OwnerID = P.PersonID and F.FamilyID = P.SpouseID
and N2.OwnerID = F.MotherID
UNION
select P.PersonID, N.Surname, N.Given, P.Relate1, P.Relate2, P.Sex, P.SpouseID, F.FatherID, F.MotherID, N2.Surname, N2.Given from PersonTable as P, NameTable as N, FamilyTable as F, NameTable as N2
where ( P.Relate1 > 0 or P.Relate2 > 0 ) and P.SpouseID > 0 and P.Sex = 1
and N.OwnerID = P.PersonID and F.FamilyID = P.SpouseID
and N2.OwnerID = F.FatherID
ORDER BY P.PersonID
Quote from Tom Holden on 2019-03-07, 10:45 pmBreak it down into simpler queries saved to VIEWs. JOIN the TreeMembers VIEW from the following to NameTable to fetch names (note that IsPrimary=0 for Alt Names) and to PersonTable to fetch sex and Relate1 et al.
DROP VIEW IF EXISTS Bloods; -- the blood relatives + close in-laws
CREATE TEMP VIEW Bloods AS
SELECT PersonID FROM PersonTable WHERE Relate1 OR Relate2;
DROP VIEW IF EXISTS BloodsWives; -- wives of bloods
CREATE TEMP VIEW BloodsWives AS
SELECT MotherID FROM FamilyTable
JOIN Bloods ON PersonID = FatherID
WHERE MotherID; -- must be non-zero
DROP VIEW IF EXISTS BloodsHusbands; -- husbands of bloods
CREATE TEMP VIEW BloodsHusbands AS
SELECT FatherID FROM FamilyTable
JOIN Bloods ON PersonID = MotherID
WHERE FatherID; -- must be non-zero
DROP VIEW IF EXISTS TreeMembers; --bloods and spouses altogether
CREATE TEMP VIEW TreeMembers AS
SELECT * FROM Bloods
UNION
SELECT * FROM BloodsHusbands
UNION
SELECT * FROM BloodsWives
;
Break it down into simpler queries saved to VIEWs. JOIN the TreeMembers VIEW from the following to NameTable to fetch names (note that IsPrimary=0 for Alt Names) and to PersonTable to fetch sex and Relate1 et al.
DROP VIEW IF EXISTS Bloods; -- the blood relatives + close in-laws
CREATE TEMP VIEW Bloods AS
SELECT PersonID FROM PersonTable WHERE Relate1 OR Relate2;
DROP VIEW IF EXISTS BloodsWives; -- wives of bloods
CREATE TEMP VIEW BloodsWives AS
SELECT MotherID FROM FamilyTable
JOIN Bloods ON PersonID = FatherID
WHERE MotherID; -- must be non-zero
DROP VIEW IF EXISTS BloodsHusbands; -- husbands of bloods
CREATE TEMP VIEW BloodsHusbands AS
SELECT FatherID FROM FamilyTable
JOIN Bloods ON PersonID = MotherID
WHERE FatherID; -- must be non-zero
DROP VIEW IF EXISTS TreeMembers; --bloods and spouses altogether
CREATE TEMP VIEW TreeMembers AS
SELECT * FROM Bloods
UNION
SELECT * FROM BloodsHusbands
UNION
SELECT * FROM BloodsWives
;
Quote from GlenB on 2019-03-08, 2:01 amMore elegant. I'll also incorporate IsPrimary. Going away for Fri/Sat/Sun so I'll get back to this next Monday. Thanks!
More elegant. I'll also incorporate IsPrimary. Going away for Fri/Sat/Sun so I'll get back to this next Monday. Thanks!
Quote from GlenB on 2019-03-12, 9:52 pmIn the end I used RM to colour code everyone I wanted to keep - and it seemed to have included the spouses I expected. And then colour coded another bunch who were not actually blood line but were close enough to it as near family after some remarriages. And then I deleted everyone who had no colour, cleaned up the phantoms, reindexed and compressed and now I'm correcting a few stray results that count trees identified for me.
So - no new fancy SQL, and perhaps no confident new understanding of how the RM tables implement the many varieties of family connections. But I like my result (so far).
In the end I used RM to colour code everyone I wanted to keep - and it seemed to have included the spouses I expected. And then colour coded another bunch who were not actually blood line but were close enough to it as near family after some remarriages. And then I deleted everyone who had no colour, cleaned up the phantoms, reindexed and compressed and now I'm correcting a few stray results that count trees identified for me.
So - no new fancy SQL, and perhaps no confident new understanding of how the RM tables implement the many varieties of family connections. But I like my result (so far).