Forum

Please or Register to create posts and topics.

Identical Rows in EventTable

I have been working on an SQLite script to do a detailed comparison between an RM7 database and an RM9 database. This is to help to verify that if I convert to RM9 I will be able to convert back to RM7 if necessary. In the process of developing that script, I discovered that the EventTable in my RM7 database had a few rows that were identical except for the EventID itself. Because those identical rows have identical OwnerType and OwnerID values, the identical rows represent multiple facts of the same type for the same person or family that are identical. All but one of such multiple facts need to be deleted.

In my case, the number of such rows was small enough that the fastest and easiest way to accomplish the task was to write a script to identify the rows that are identical except for the EventID field and then to delete the duplicate facts manually. Here follows the script.

Uploaded files:

I'm curious, did the people these facts were attached to show fact duplicates in the RM user interface?

Does deleting these via SQLite etc cause any issues in the RM database?

I have several hundred rows to delete so considering a batch to delete them all.

Possibly. Jerry's preamble in the script file says:

This script does not attempt to identify cases where the facts are identical
except for citations and media files. There were few enough identical facts
that was easier to identify such situations by hand.

Bulk deletion could lose some citations and media belonging to the deleted events that may be the requisite evidence for the remaining one. They will remain as detritus in their respective tables.

Likewise for Tasks for an event.

kevync has reacted to this post.
kevync

"I'm curious, did the people these facts were attached to show fact duplicates in the RM user interface?"

Sorry for the delay in replying. The answer is yes. What I meant by "deleting the duplicates manually" was that I went in to the edit person screen to do the deletes. When I was doing the deletes, I obviously could see the duplicates.

By the way, RM's user interface does allow you to find cases where there is more than one instance of the same fact for a person. But it's a fact type by fact type search. So my script was still important to me because I had no idea in advance which facts might have multiple instances. Also, it's perfectly reasonable to have more than one fact of the same type for a person if the duplicate fact types differ in some way. The tool in RM's user interface does not make this distinction, but my script does make this distinction.

Quote from RobertFrance on 2024-06-24, 12:27 am

Does deleting these via SQLite etc cause any issues in the RM database?

I have several hundred rows to delete so considering a batch to delete them all.

The use of SQLite for this particular task is not a problem, per se. The only concern would be that an overly aggressive script might not take enough conditions into consideration and might delete too many rows.

Some of RM's tables are indexed by columns which use RM's proprietary RMNOCASE collation. After using SQLite to do an INSERT or a DELETE from such tables, it's essential to follow up by running RM's FILE => Tools => Rebuild Indexes tool. EventTable is not indexed by such a column, so re-indexing is not necessary after it is updated. But it's a good idea to run RM's re-indexing tool on a regular basis anyway.

kevync has reacted to this post.
kevync