Forum

Please or Register to create posts and topics.

Delete people who are not related

I seem to have a number of people in my database who are not related to the root person. This was not my intent so I seem to have wandered off on some non-familial branch at some point. I'm looking for a way to prune the tree.

I used RM's Set Relationships tool so when I look at a person I can see their relationship to the root person. Is that relationship stored in some field? If so, I could use that to delete all people without that field set to something.

But what about spouses? Fred Smith is a blood relation to the root person but I would not want to delete all of Fred's 3 wives from the database.

Maybe it's easier to export to a new database all the people who ARE related, and all of their spouses too?

I searched for a script on this site and did not see one.

There isn't a script that does exactly what you want.

You could mark or color code everyone in the same tree which includes spouses and export them if you are satisfied that the GEDCOM losses are not a problem.

If GEDCOM losses are objectionable the Delete Many page can help. If the number of trees is not too great (RM's Count Trees tool), you could work your way through each unwanted tree and color code everyone in the same tree or add them to a Group named as described on that page. Then use the corresponding script to batch delete.

GEDCOM is better than nothing, but I'd prefer to do it all within the database if I can.

I'm thinking that I could colour code everyone who is related using People Selected from a List> Mark Group > Ancestors > Ancestors and Descendants and then use Delete Many to delete anyone who is not coloured and then cleanup phantoms 3.

Just need to check if that marked group includes the spouses or not. Might need some extra steps!

Alternatively - where does Tools > Set Relationships store the relationship information? Could I use that to include/exclude people?

The results of the RM Set Relationships tool are stores in PersonTable.Relate1 and Relate2. If both are 0, RM found no blood relationship. You can learn more about the meaning of these fields on the Relationships 1 & 2 tab on Database Design Spreadsheets. Some spouses are included - certainly the closest ones where you would "-in-law"; not sure if all are.

Color-coding is easier to set than is the group outside of RM because there is a PersonTable.Color field.

Investigation...

Database has 14493 persons, pick the base person;
Set relationships results in 4785 people with both Relate1 and Relate2 = 0;
Meaning 14493-4785=9708 people are directly related to the base person.

Run the Relationship List report and import into Excel;
There are 13607 records in the report;
3926 records contain the text "Spouse" in them;
Meaning 13607-3926=9681 people are directly related to the base person.

That's a difference of 27 people who seem to be directly related but I can't figure out how or why the are not in both lists. Clearly the Relationship List report does some extra figuring out!

Maybe there is a one-to-many factor because some people have more than one relationship to the base person? However there are no duplicated people in the Relationship List so that alone cannot explain it.

There are 27 people that Set Relationship says are related but they are not in the Relationship list - time for some more analysis.

It looks like there are 2 factors that complicate things and I havent figured out how to handle them all yet:

1 - people with alternate names

2 - people with multiple spouses

Here's the SQL I'm using so far to make my version of the relationship list of who is related to the selected person when Set Relationships is done:

-- all the people who are related and have no spouse
-- union all the males who are related and their wife's name
-- union all the females who are related and the husband's name
select P.PersonID as PersonID, N.Surname as Surname, N.Given as Given, P.Relate1 as Relate1, P.Relate2 as Relate2,
P.Sex as Gender, P.SpouseID as FamilyID, null as FatherID, null as MotherID, null as SpouseSurname, null as SpouseGiven
from PersonTable as P, NameTable as N
where ( P.Relate1 > 0 or P.Relate2 > 0 ) and P.SpouseID = 0
and N.OwnerID  = P.PersonID

UNION

select P.PersonID, N.Surname, N.Given, P.Relate1, P.Relate2, P.Sex, P.SpouseID, F.FatherID, F.MotherID, N2.Surname, N2.Given from PersonTable as P, NameTable as N, FamilyTable as F, NameTable as N2
where ( P.Relate1 > 0 or P.Relate2 > 0 ) and P.SpouseID > 0 and P.Sex = 0
and N.OwnerID  = P.PersonID and F.FamilyID = P.SpouseID
and N2.OwnerID = F.MotherID

UNION

select P.PersonID, N.Surname, N.Given, P.Relate1, P.Relate2, P.Sex, P.SpouseID, F.FatherID, F.MotherID, N2.Surname, N2.Given from PersonTable as P, NameTable as N, FamilyTable as F, NameTable as N2
where ( P.Relate1 > 0 or P.Relate2 > 0 ) and P.SpouseID > 0 and P.Sex = 1
and N.OwnerID  = P.PersonID and F.FamilyID = P.SpouseID
and N2.OwnerID = F.FatherID

ORDER BY P.PersonID

 

Break it down into simpler queries saved to VIEWs. JOIN the TreeMembers VIEW from the following to NameTable to fetch names (note that IsPrimary=0 for Alt Names) and to PersonTable to fetch sex and Relate1 et al.

DROP VIEW IF EXISTS Bloods; -- the blood relatives + close in-laws

CREATE TEMP VIEW Bloods AS

SELECT PersonID FROM PersonTable WHERE Relate1 OR Relate2;

 

DROP VIEW IF EXISTS BloodsWives; -- wives of bloods

CREATE TEMP VIEW BloodsWives AS

SELECT MotherID FROM FamilyTable

JOIN Bloods ON PersonID = FatherID

WHERE MotherID; -- must be non-zero

 

DROP VIEW IF EXISTS BloodsHusbands; -- husbands of bloods

CREATE TEMP VIEW BloodsHusbands AS

SELECT FatherID FROM FamilyTable

JOIN Bloods ON PersonID = MotherID

WHERE FatherID; -- must be non-zero

 

DROP VIEW IF EXISTS TreeMembers; --bloods and spouses altogether

CREATE TEMP VIEW TreeMembers AS

SELECT * FROM Bloods

UNION

SELECT * FROM BloodsHusbands

UNION

SELECT * FROM BloodsWives

;

 

More elegant. I'll also incorporate IsPrimary. Going away for Fri/Sat/Sun so I'll get back to this next Monday. Thanks!

In the end I used RM to colour code everyone I wanted to keep - and it seemed to have included the spouses I expected. And then colour coded another bunch who were not actually blood line but were close enough to it as near family after some remarriages. And then I deleted everyone who had no colour, cleaned up the phantoms, reindexed and compressed and now I'm correcting a few stray results that count trees identified for me.

So - no new fancy SQL, and perhaps no confident new understanding of how the RM tables implement the many varieties of family connections. But I like my result (so far).

Tom Holden has reacted to this post.
Tom Holden