Events – Merge #events #merge

Intro

John_James posted these requests for help in a post to the Home page:

  1. What I would like to do now is clean some data where indi’s have a death event and the gedcom contributor has used a custom “Cause of Death” event so in other words “Details” from “EventType 1021″ into “EventType 2″ where OwnerID is the same.
  2. The other thing I could achieve myself at present would be to change all EventType 1021 to EventType 2 but then I would have duplicate events to deal with and that is another query I would find most useful in the future.

Let’s see what we might do.

Copy Description from Cause of Death Event to Death Event Description

The Descriptions are contained in the Details column of EventTable, a TEXT type. We know the EventTable.EventType is 2 and 1021, respectively, for Death and Cause of Death (the second a custom FactType whose ID number may vary with subsequent or other imports. The only Death events we wish to modify are those for which there is a corresponding Casue of Death event, i.e., for a common person, designated by the common OwnerID. Because both Death and Cause of Death are solely individual FactTypes, not Family or other, we can safely ignore OwnerType. It’s possible there might be more than one Death event and/or Cause of Death event for a person but we will assume that is not the case. Let’s see how many pairs of the two events we have. Here is where the JOIN command is necessary.

List of Death events with corresponding Cause of Death events

Explicit JOIN

-- List of Death events with corresponding Cause of Death events
SELECT Death.OwnerID AS [RIN Death]
 ,Cause.OwnerID AS [RIN Cause]
 ,Death.DATE AS [DATE Death]
 ,Cause.DATE AS [DATE Cause]
 ,Death.Details AS [Details Death]
 ,Cause.Details AS [Details Cause]
FROM EventTable AS Death
INNER JOIN EventTable AS Cause USING (OwnerID)
WHERE Death.EventType = 2
 AND Cause.EventType = 1021
;

EventTable is given two aliases so that it can be JOINed to itself, each alias acting as an independent table with the exact same contents. The INNER JOIN of the two tables constrains the result set to only those records from the first table for which the criteria match with the second table. A LEFT JOIN would include all records from the first table.

Implicit JOIN using WHERE

Instead of the explicit JOIN command, we could have written this with an implicit JOIN as:

...
FROM EventTable AS Death
 ,EventTable AS Cause
WHERE Death.EventType = 2
 AND Cause.EventType = 1021
 AND Death.OwnerID = Cause.OwnerID
;

It would perform just as well either way.

Inspect the results

Now we can inspect the result set to see if we have any issues to be concerned about, primarily, more than one Death or Cause of Death event per person. This can be done by browsing the results but this could be tiring if the set is very large. A query of the query could be faster and make it more obvious:

SELECT COUNT()
 ,*
FROM (
 -- List of Death events with corresponding Cause of Death events
 SELECT Death.OwnerID AS [RIN Death]
 ,Cause.OwnerID AS [RIN Cause]
 ,Death.DATE AS [DATE Death]
 ,Cause.DATE AS [DATE Cause]
 ,Death.Details AS [Details Death]
 ,Cause.Details AS [Details Cause]
 FROM EventTable AS Death
 INNER JOIN EventTable AS Cause USING (OwnerID)
 WHERE Death.EventType = 2
 AND Cause.EventType = 1021
 )
GROUP BY [RIN Death]
 ,[RIN Cause]
ORDER BY COUNT() DESC
;

COUNTing, GROUPing and green light

If the COUNT() column is all 1’s (and any values > 1 will be at the top of the list), we’re good to go! The original query is wrapped inside an outer query which asks for all the columns of the inner query (the * does that) plus a count of all the records in the result set having each unique combination of the RIN for Death event and the RIN for the Cause event, the GROUP BY clause. This grouping could be simplified to count just one of the columns since the two are forced to be identical by the JOIN criterion USING (OwnerID) but I thought it would be instructive to show that multiple columns can be used to define the grouping. To put the largest counts at the beginning of the list, the query is sorted on the COUNT() column in descending order.


Construct for new Death Description

Supposing that there is but one Death fact with but one Cause fact for each person, we can then proceed to copy the value from Cause Description to the Death Description. We will want a space character between the original Death description and the appended Cause description. If the Cause Description is empty, there is no point in appending anything. If the original Death description is empty, there is no need for the space character.

 

Eliminate empty Cause Descriptions

We can eliminate any empty Cause descriptions by extending the constraints in the above query to include “AND Cause.Details NOT LIKE ””. To get a list of the EventIDs for Death events with mating non-empty Cause descriptions, the first query is revised thusly:

-- List of Death events having non-empty Cause of Death event Descriptions
SELECT Death.EventID
FROM EventTable AS Death
INNER JOIN EventTable AS Cause USING (OwnerID)
WHERE Death.EventType = 2
    AND Cause.EventType = 1021
    AND Cause.Details NOT LIKE '';

Cause of Death event Description for any given Death event

The Cause of Death description for any particular Death event can be found:

-- Cause of Death description for a given Death event
SELECT Cause.Details
FROM EventTable AS Cause
WHERE Cause.OwnerID = 567 --(an example of the Death event's OwnerID)
    AND Cause.EventType = 1021;

Test new Death event Description

Put together the Death Description and the Cause of Death Description with a space character between and let’s see what we get. LTRIM will clear out the space character if the Death Details field is empty. The double bars || are SQLite’s concatenate operator. Single quotes surround text so ‘ ‘ is one space character.

-- Test new Death description
SELECT Death.OwnerID AS RIN
    ,LTRIM(Death.Details || ' ' || Cause.Details) AS "New Death Description"
FROM EventTable AS Death
    ,EventTable AS Cause
WHERE Death.EventType = 2
    AND Cause.EventType = 1021
    AND Death.OwnerID = Cause.OwnerID
    AND Cause.Details NOT LIKE '';

Look up some of the persons by RIN in RootsMagic to review how the existing Death fact Description compares with the new one that will replace it.

 

UPDATE EventTable with new Death event Descriptions

Now let’s revise the Death descriptions (make a backup of your database first!). We have to tinker with our queries to fit within the rules of the UPDATE command:

UPDATE EventTable
SET Details = LTRIM(Details || ' ' || (
            -- Cause of Death description for a given Death event
            SELECT Cause.Details
            FROM EventTable AS Cause
            WHERE Cause.OwnerID = EventTable.OwnerID --(the OwnerID in the record being updated)
                AND Cause.EventType = 1021
            ))
WHERE EventID IN (
        -- List of Death events with corresponding Cause of Death events
        SELECT Death.EventID
        FROM EventTable AS Death
        INNER JOIN EventTable AS Cause USING (OwnerID)
        WHERE Death.EventType = 2
            AND Cause.EventType = 1021
            AND Cause.Details NOT LIKE ''
        );

The OwnerID from EventTable for the record being updated is passed to the query that returns the Cause description from the record with a matching OwnerID. Only those records in EventTable whose EventID is in the list of Death EventIDs that have related non-empty Cause of Death descriptions are updated.

Merging Events

The first half of this page addressed simply the appending of the Description (Details field) from one type of event to another. If that is all that is wanted from the one type of event, then it is a simple matter to delete all the records for that type of event. However, what if there are other elements of that event that we would like to bring over to the target event, e.g., Notes, Sources, Images? To do so involves a much more comprehensive and complicated procedure if they are not to be lost when the secondary event is deleted.

John is quite right that simply changing the event type from Cause of Death to Death could result in another problem to be addressed – now there will be two Death events where there was one before. They are more likely not complete duplicates, differing in some minor or major way, and that will make more difficult the identification of the pairs of events to be merged. Accurate and reliable pairing is fundamental to a successful automatic merging process. A general solution may require a procedure akin to RootsMagic’s Duplicate Search Merge for People, i.e., a weighted scoring for similarities between events, manual selection of the primary, and manual initiation of the merge, one pair at a time.

Let’s set that aside for John’s case where we have but one Death event with but one Cause of Death event per person. We have demonstrated above that they can be readily paired.

What do we need to do with the data from the secondary event?

  1. Date – discard in favour of Primary’s
  2. Place – discard
  3. Place details – discard?
  4. Description – append as above
  5. Proof – discard or take the lower of the two
  6. Primary – set flag
  7. Sort Date – discard
  8. Private – discard? note that merging Death and Cause of Death forces them to share one privacy setting
  9. Note – append as we did for Description
  10. Sources – add to Primary (which opens up a whole additional investigation into merging sources!)
  11. Images – add to Primary if not duplicate (easier to do than Sources I think)
  12. Shares – add to Primary

Just a bit mind-boggling…

—–more to come—–

Discussions & comments from Wikispaces site


anzenketh

Does not work anymore

anzenketh
23 November 2014 00:08:15

This does not work anymore. Was it for a older version of RootsMagic.


ve3meo

ve3meo
23 November 2014 02:12:40

The queries on this page were developed on a RootsMagic 6 database. Can you be more specific about which one(s) do not work? Have you edited the queries with the EventType (FactTypeID) of the Cause of Death fact type in your database in place of the 1021 in the examples?


John_James

Reassigning Master Place to place details

John_James
20 February 2018 00:33:57

Rootsmagic has a way to split out place details but no way to correct a place details being attached to the wrong parent. I have been changing the Master ID in the Place Table to reflect the correct parent using F2 in Sqlitespy and entering the correct MasterID. This should just change the Place or even give me a duplicate place detail to deal with but even after running all the database tools in RM the changes are not reflected. I am sure I have done this before and have the RMNOCASE in place, when I check the table again the changes are fixed, what on earth am I doing wrong?


John_James

John_James
20 February 2018 19:43:16

I found out where I was going wrong and now working to recover things

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.