Contents
- 1Intro
- 2Copy Description from Cause of Death Event to Death Event Description
- 2.1List of Death events with corresponding Cause of Death events
- 2.1.1Explicit JOIN
- 2.1.2Implicit JOIN using WHERE
- 2.1.3Inspect the results
- 2.1.4COUNTing, GROUPing and green light
- 2.2Construct for new Death Description
- 2.2.1Eliminate empty Cause Descriptions
- 2.2.2Cause of Death event Description for any given Death event
- 2.2.3Test new Death event Description
- 2.3UPDATE EventTable with new Death event Descriptions
- 3Merging Events
- 3.1What do we need to do with the data from the secondary event?
- 4Discussions & comments from Wikispaces site
- 4.1Does not work anymore
- 4.2Reassigning Master Place to place details
Intro
John_James posted these requests for help in a post to the Home page:
- What I would like to do now is clean some data where indi’s have a death event and the gedcom contributor has used a custom “Cause of Death” event so in other words “Details” from “EventType 1021″ into “EventType 2″ where OwnerID is the same.
- The other thing I could achieve myself at present would be to change all EventType 1021 to EventType 2 but then I would have duplicate events to deal with and that is another query I would find most useful in the future.
Let’s see what we might do.
Copy Description from Cause of Death Event to Death Event Description
The Descriptions are contained in the Details column of EventTable, a TEXT type. We know the EventTable.EventType is 2 and 1021, respectively, for Death and Cause of Death (the second a custom FactType whose ID number may vary with subsequent or other imports. The only Death events we wish to modify are those for which there is a corresponding Casue of Death event, i.e., for a common person, designated by the common OwnerID. Because both Death and Cause of Death are solely individual FactTypes, not Family or other, we can safely ignore OwnerType. It’s possible there might be more than one Death event and/or Cause of Death event for a person but we will assume that is not the case. Let’s see how many pairs of the two events we have. Here is where the JOIN command is necessary.
List of Death events with corresponding Cause of Death events
Explicit JOIN
-- List of Death events with corresponding Cause of Death events SELECT Death.OwnerID AS [RIN Death] ,Cause.OwnerID AS [RIN Cause] ,Death.DATE AS [DATE Death] ,Cause.DATE AS [DATE Cause] ,Death.Details AS [Details Death] ,Cause.Details AS [Details Cause] FROM EventTable AS Death INNER JOIN EventTable AS Cause USING (OwnerID) WHERE Death.EventType = 2 AND Cause.EventType = 1021 ;
EventTable is given two aliases so that it can be JOINed to itself, each alias acting as an independent table with the exact same contents. The INNER JOIN of the two tables constrains the result set to only those records from the first table for which the criteria match with the second table. A LEFT JOIN would include all records from the first table.
Implicit JOIN using WHERE
Instead of the explicit JOIN command, we could have written this with an implicit JOIN as:
... FROM EventTable AS Death ,EventTable AS Cause WHERE Death.EventType = 2 AND Cause.EventType = 1021 AND Death.OwnerID = Cause.OwnerID ;
It would perform just as well either way.
Inspect the results
Now we can inspect the result set to see if we have any issues to be concerned about, primarily, more than one Death or Cause of Death event per person. This can be done by browsing the results but this could be tiring if the set is very large. A query of the query could be faster and make it more obvious:
SELECT COUNT() ,* FROM ( -- List of Death events with corresponding Cause of Death events SELECT Death.OwnerID AS [RIN Death] ,Cause.OwnerID AS [RIN Cause] ,Death.DATE AS [DATE Death] ,Cause.DATE AS [DATE Cause] ,Death.Details AS [Details Death] ,Cause.Details AS [Details Cause] FROM EventTable AS Death INNER JOIN EventTable AS Cause USING (OwnerID) WHERE Death.EventType = 2 AND Cause.EventType = 1021 ) GROUP BY [RIN Death] ,[RIN Cause] ORDER BY COUNT() DESC ;
COUNTing, GROUPing and green light
If the COUNT() column is all 1’s (and any values > 1 will be at the top of the list), we’re good to go! The original query is wrapped inside an outer query which asks for all the columns of the inner query (the * does that) plus a count of all the records in the result set having each unique combination of the RIN for Death event and the RIN for the Cause event, the GROUP BY clause. This grouping could be simplified to count just one of the columns since the two are forced to be identical by the JOIN criterion USING (OwnerID) but I thought it would be instructive to show that multiple columns can be used to define the grouping. To put the largest counts at the beginning of the list, the query is sorted on the COUNT() column in descending order.
Construct for new Death Description
Supposing that there is but one Death fact with but one Cause fact for each person, we can then proceed to copy the value from Cause Description to the Death Description. We will want a space character between the original Death description and the appended Cause description. If the Cause Description is empty, there is no point in appending anything. If the original Death description is empty, there is no need for the space character.
Eliminate empty Cause Descriptions
We can eliminate any empty Cause descriptions by extending the constraints in the above query to include “AND Cause.Details NOT LIKE ””. To get a list of the EventIDs for Death events with mating non-empty Cause descriptions, the first query is revised thusly:
-- List of Death events having non-empty Cause of Death event Descriptions SELECT Death.EventID FROM EventTable AS Death INNER JOIN EventTable AS Cause USING (OwnerID) WHERE Death.EventType = 2 AND Cause.EventType = 1021 AND Cause.Details NOT LIKE '';
Cause of Death event Description for any given Death event
The Cause of Death description for any particular Death event can be found:
-- Cause of Death description for a given Death event SELECT Cause.Details FROM EventTable AS Cause WHERE Cause.OwnerID = 567 --(an example of the Death event's OwnerID) AND Cause.EventType = 1021;
Test new Death event Description
Put together the Death Description and the Cause of Death Description with a space character between and let’s see what we get. LTRIM will clear out the space character if the Death Details field is empty. The double bars || are SQLite’s concatenate operator. Single quotes surround text so ‘ ‘ is one space character.
-- Test new Death description SELECT Death.OwnerID AS RIN ,LTRIM(Death.Details || ' ' || Cause.Details) AS "New Death Description" FROM EventTable AS Death ,EventTable AS Cause WHERE Death.EventType = 2 AND Cause.EventType = 1021 AND Death.OwnerID = Cause.OwnerID AND Cause.Details NOT LIKE '';
Look up some of the persons by RIN in RootsMagic to review how the existing Death fact Description compares with the new one that will replace it.
UPDATE EventTable with new Death event Descriptions
Now let’s revise the Death descriptions (make a backup of your database first!). We have to tinker with our queries to fit within the rules of the UPDATE command:
UPDATE EventTable SET Details = LTRIM(Details || ' ' || ( -- Cause of Death description for a given Death event SELECT Cause.Details FROM EventTable AS Cause WHERE Cause.OwnerID = EventTable.OwnerID --(the OwnerID in the record being updated) AND Cause.EventType = 1021 )) WHERE EventID IN ( -- List of Death events with corresponding Cause of Death events SELECT Death.EventID FROM EventTable AS Death INNER JOIN EventTable AS Cause USING (OwnerID) WHERE Death.EventType = 2 AND Cause.EventType = 1021 AND Cause.Details NOT LIKE '' );
The OwnerID from EventTable for the record being updated is passed to the query that returns the Cause description from the record with a matching OwnerID. Only those records in EventTable whose EventID is in the list of Death EventIDs that have related non-empty Cause of Death descriptions are updated.
Merging Events
The first half of this page addressed simply the appending of the Description (Details field) from one type of event to another. If that is all that is wanted from the one type of event, then it is a simple matter to delete all the records for that type of event. However, what if there are other elements of that event that we would like to bring over to the target event, e.g., Notes, Sources, Images? To do so involves a much more comprehensive and complicated procedure if they are not to be lost when the secondary event is deleted.
John is quite right that simply changing the event type from Cause of Death to Death could result in another problem to be addressed – now there will be two Death events where there was one before. They are more likely not complete duplicates, differing in some minor or major way, and that will make more difficult the identification of the pairs of events to be merged. Accurate and reliable pairing is fundamental to a successful automatic merging process. A general solution may require a procedure akin to RootsMagic’s Duplicate Search Merge for People, i.e., a weighted scoring for similarities between events, manual selection of the primary, and manual initiation of the merge, one pair at a time.
Let’s set that aside for John’s case where we have but one Death event with but one Cause of Death event per person. We have demonstrated above that they can be readily paired.
What do we need to do with the data from the secondary event?
- Date – discard in favour of Primary’s
- Place – discard
- Place details – discard?
- Description – append as above
- Proof – discard or take the lower of the two
- Primary – set flag
- Sort Date – discard
- Private – discard? note that merging Death and Cause of Death forces them to share one privacy setting
- Note – append as we did for Description
- Sources – add to Primary (which opens up a whole additional investigation into merging sources!)
- Images – add to Primary if not duplicate (easier to do than Sources I think)
- Shares – add to Primary
Just a bit mind-boggling…
—–more to come—–
Discussions & comments from Wikispaces site
Does not work anymore
23 November 2014 00:08:15
This does not work anymore. Was it for a older version of RootsMagic.
ve3meo
23 November 2014 02:12:40
The queries on this page were developed on a RootsMagic 6 database. Can you be more specific about which one(s) do not work? Have you edited the queries with the EventType (FactTypeID) of the Cause of Death fact type in your database in place of the 1021 in the examples?
John_James
Reassigning Master Place to place details
20 February 2018 00:33:57
Rootsmagic has a way to split out place details but no way to correct a place details being attached to the wrong parent. I have been changing the Master ID in the Place Table to reflect the correct parent using F2 in Sqlitespy and entering the correct MasterID. This should just change the Place or even give me a duplicate place detail to deal with but even after running all the database tools in RM the changes are not reflected. I am sure I have done this before and have the RMNOCASE in place, when I check the table again the changes are fixed, what on earth am I doing wrong?
John_James
20 February 2018 19:43:16
I found out where I was going wrong and now working to recover things
I created a RM10 database by merging my old Genbox database and multiple Ancestry databases. I manually merged the multitude of duplicate individuals – about 1000. This left me with many duplicated events. I decided to merge duplicated births, marriages, deaths, and burials. Here is a brief summary of my technique in case it helps others. I used Microsoft Access to write the queries, largely because I struggle a bit with complex SQLLite query syntax despite (or perhaps because of) my years of MS SQL programming. My technique was specific to my data and my needs. There are no doubt other ways to tackle this.
1. Make table query “FirstEvent” grouping EventTable by EventType, OwnerType, OwnerID, counting EventID, and the minimum EventID (“FirstEventID”) where the EventType is FactType for Birth/ Marriage / Death / Burial and the count >1. This provides the duplicates I want to merge; I chose to consider the Event with the lowest ID the one I will consider the “Primary”.
2. Update query for EventTable setting IsPrimary to 1 for records in “FirstEvent”. (I have not previously made use of the IsPrimary flag.)
3. Make table query “DupeEvent” from EventTable where EventType, OwnerType, OwnerID equals records in “FirstEvent” but EventIDs do not match; include in the table Date, PlaceID, SiteID, Details, and Note. Add blank fields to table for DateNote, PlaceNote, SiteNote, and FirstEventID.
4. Update query on DupeEvent where EventType, OwnerType, OwnerID equals records in “FirstEvent”; set FirstEventID in DupeEvent.
5. Write a couple queries to update the “primary” event date (in EventTable) with the date from the DupeEvent date if the DupeEvent date is “better”; perhaps the Primary Event Date is blank or only has a year, and the Dupe Event has a full date with a matching year.
6. Write a query to compare dates between the primary event and the duplicates; if the date is off more than a year (my threshold), remove the record from the DupeEvent table; these “duplicate” events I want to keep and research further.
7. For the remaining DupeEvent records whose date is not blank and which are not equal to the date in EventTable, create a note in DupeEvent.DateNote: “Alternate Date: ” & Date (formated for easy reading).
8. Process Place in similar way to dates; update EventTable with DupeEvent.PlaceID if blank or if Place Name from EventTable (linked to PlaceTable) is a subet (e.g. “Indiana”) of the PlaceName from DupeEent (e.g. “Posey County, Indiana”).
9. If Places are “too different” (e.g. “Virginia” vs “Indiana”), delete the record from DupeEvent.
10. For the remaining DupeEvent records, if the PlaceID does not match the PlaceID for the primary event in EventTable, construct a PlaceNote in DupeEvent.
11. Process Site IDs. Since I only had a handful (less than 5) at issue, I processed manually.
12. Write query to update Details in EventTable for primary events with EventTable.Details & ” ” & DupeEvent.Details.
Continuing my method of merging citations:
13. Write query to update note in EventTable for primary events, combing EventTable.Note with DupeEvent.Note, DupeEvent.DateNote, DupeEvent.PlaceNote, and DupeEvent.SiteNote.
14. For CitationLinkTable, MeidaLinkTable, WitnessTable, and TaskTable, update OwnerID to DupeEvent.FirstEventID where OwnerType =2 and xxxLinkTable.OwnerID = DupeEvent.EventID, thus moving the citations, media, witnesses, and tasks to the primary event.
15. I chose to add a task (using a query) to all the primary events, noting this is a “merged” event.
16. Delete query to remove from EventTable all events in DupeEvent.
17. Reset EventTable,IsPrimary to 0.
It goes without saying I backed up (multiple times) during this process, checking my work at each step. All together it was a couple afternoons of effort. I am quite pleased to have all the sources together.