Convert RM8 Database to RM7 #rm8 #rm7

Issue

As of writing, still no Book Publisher feature in RM8 and along with other persistent issues, some users wish to return to the RM7.5 database from which they had ‘upgraded’. However, the only ways back using the RM user interface are lossy:

  • #GEDCOM which, along with other attendant losses (see GEDCOM & DnD transfer losses) does not import #namedgroup and custom #reports.
  • RM8’s File > Export Data > DropBox translates the active .RMTREE file to a RM7 compatible .RMGC file intended for use with the RootsMagic Mobile app (now retired for Android) but it loses shared events, and likely more.

A more complete conversion is wanted.

2023-05-04: Update re #RM9: despite the inclusion of the Book Publisher feature missing from RM8, there are RM9 users who seek a way to return to RM7 for other reasons. A tweak to a RM9 database to change its version number to RM8 opens up the solution below to carry through to RM7. See Database Revert RM9 to RM8 to RM7.

Solution

Both RM7 and RM8 use similar SQLite databases so, obviously, a transfer of data through SQLite operations is the most direct means possible. However, “similar” is not “identical” and there are challenges with transferring Tasks, TreeShare, FamilySearch, Custom Reports and Report Settings and other File Settings, History, Bookmarks, et al. The script above provides the most thorough and complete conversion achieved to date and is built on and supersedes:

For a full understanding of what has gone into the current procedure, they remain useful references.

Procedure

  1. In RM8, open and prepare your RM8 database file for export by running the set of File>Tools>Database Tools. Note the file’s full path and name for entry into the SQLite script. Close the database.
  2. In RM7, create a new empty database or open the database you wish to overwrite (I’ll leave it to you to backup or make a copy!). If an existing file, run the RM7 set of File>Database Tools. Note the file’s full path and name so you can find it with your SQLite manager. Close the database.
  3. Open the target RM7 database with your SQLite manager (one with a fake RMNOCASE collation – see #rmnocase).
  4. Open your SQLite manager’s SQL Editor on this database file and load into it the script listed above under Solution.
  5. Edit this script to change the path and name in the ATTACH DATABASE line near the beginning to that of your source RM8 .rmtree database file. Edit 2 more lines near the end flagged by **** to explicit media paths.
  6. Execute the script.
  7. On reopening the target RM7 database with RM7, run File>Database Tools.
  8. If Media links are broken despite the edits you made to the script, use the Media Gallery’s Fix Broken Media Links tool and/or, as appropriate, Search & Replace (Ctrl+H) on Media filenames to fix them.

Notes

  1. Copies over the ConfigTable records from RM8 which can include Custom Reports and Book Publisher settings that originated in the RM7 database which had been upgraded to RM8, thereby making the round trip from RM7 to RM8 back to RM7 pretty much intact, except for possible losses in the area of Research Logs and Folders due to structural differences.
  2. This procedure is orders-of-magnitude faster than GEDCOM export-import, approximately 2 minutes from a 330MB RM8 database file, a few seconds for a small one. 
  3. A surprising outcome from this script development is that I’ve used the script to demonstrate  that RM 8.1.8 TreeShare falsely reports mismatches between Ancestry Sources and RM’s copies after applying ‘Merge all duplicate citations ‘. See: TreeShare mysteries from RM8.1.8 ‘Merge all duplicate citations’ (likewise for RM8.2).
  4. At some point, a RootsMagic 8 update or its successor will change the database in a way that breaks the script or causes an incompatibility.
  5. RootsMagic 7 is no longer being developed so features such as TreeShare and FamilySearch integration will break when those services change their APIs (the Application Program Interface with which RM7 interacts).
  6. Color-coding suffers in translation because RM8 has 28 colours, 13 of which are outside the range for RM7 and those that are within do not all map to the corresponding color.
  7. I was ‘lazy’ with data typing despite there having been changes between the two versions. SQLite itself is also ‘loose’ with enforcement. For example, many fields in RM7 that were type BLOB became type TEXT in RM8. My prior experience with RM in the past was that it did not care when the content was textual so I’ve made no attempt to CAST these TEXT fields back to BLOB on import. Yet, it’s possible it may give rise to some obscure error.
  8. I get a memory access error in citations using Find Everywhere on one file that originated in RM7 and imported back from RM8, yet there is no such error in the original nor in the RM8 upgrade. That’s an obscure error not present in other files I’ve converted.
  9. Mac users probably cannot run RM7 any more so this procedure is of little interest except for its potential to solve some RM8 problems by bouncing the data down to RM7 and back to RM8.
  10. Please let me know what errors you encounter, discoveries you’ve made, benefits you’ve realised…, probably best through the Forum, given its message editor is superior to the Comment editor.
  11. Good luck!

RMTREE Properties Query for RM8

This is an ‘upgrade’ from RMGC_Properties – Query #database for pre-RM8 databases. It provides an expansion over the information provided on the RM8 Home – File Properties screen. With a SQLite manager that supports runtime variables, a summary report can be given (just those variables with no leading hyphen).

It ATTACHes a reference empty, unused RM8 database that you provide from your up-to-date version of RM8 against which certain built-in items are compared, e.g., Fact Types and Source Templates, from your database under test. You’ll need to edit a line in the script to point to that empty database on your system.

Here’s an example of the script’s output for a database that was created around the time of the first public release:

ValueVariableRemark
6000Versionvs Control version 8000
– pre/post RM8 releaseWARNING! Database from a pre-release version of RM8
1608Peopleall records in PersonTable
0– Nameless Peopleno record in NameTable for that RIN
153– Unresolved Duplicate Name Pairspairs of Given and Surnames, not flagged as “Not a Problem”
20– Resolved* Duplicate Name Pairsflagged as “Not a Problem” – flags lost on transfer
5– Unresolved Duplicates with Media Linkssecondary persons’ links lost on merge
257Alternate namesall records in NameTable where IsPrimary=0
0– Orphaned Alternate names*no Primary name record found
551Familiesall records in FamilyTable
69Fact Typesno. of records from FactTypeTable
5– Custom Fact Typesno. of custom Fact Types
9– Customised Built-in Fact Typesno. of built-in Fact Types modified
34– Unused Fact Typesno. of Fact Types not used
0– Blank Fact Type NamesFactTypes must be named
0– Blank FactType SentencesFactType needing definition
6733Eventsall records of EventTable
0– Orphaned Eventsevents for which no person or family match in respective tables
138– Event WitnessesAll records in WitnessTable of persons sharing events
17— Nominal Witnessesnot Persons from database, but named in WitnessTable as sharing an event
0— Headless WitnessesPersonID (RIN) in WitnessTable missing from PersonTable
0— Witnesses to Lost EventsEventID in WitnessTable cannot be found in EventTable
0— Witnesses with blank Roleno role has been assigned from RoleTable or the RoleTable role is empty
5— Witnesses with Custom Sentencea custom sentence has been assigned, unique to this witness
26— Witnesses with Notea note has been entered for this witness to an event
84— Rolesno. of records from RoleTable
26— Custom Rolesno. of custom roles
1— Customised Built-in Rolesno. of built-in roles modified
54— Unused Rolesno. of roles not used
0— Blank Role NamesRoles needing definition
2— Blank Role SentencesRoles needing definition
1485Total Placesall records in PlaceTable incl Places and Place Details (Sites)
177– System Placessystem supplied Places: LDS Temples
1081– User Placesuser defined Places excl Sites
330— Used, having Geo-coordinatesnon-empty Lat or Long
42— Unused User Places*not used by EventTable, will be dropped in a transfer
227— User Place Detailsuser defined Sites
22— Used, having Place Detail Notes*Site Notes will be lost in a transfer
60— Used, having Geo-coordinatesnon-empty Lat or Long
13— Unused Place Details*Sites will be lost in a transfer
833Source Templates# of records from SourceTemplateTable
418– Custom Source Templates# of custom Source Templates
415– Unupdated Built-in SourceTemplates# not matching reference database
392– Unused Custom SourceTemplates*lost on transfer
1– Incomplete Source Templatesmissing part of definition
256Total Sourcesall records from SourceTable
3– Unused Sources*SourceTable records unused by CitationTable
6080Total Citationsall records from CitationTable
6080Total Citation Linksall records from CitationLinkTable
11– Duplicate Citationsidentical in most respects, cluttering reports
0– Sourceless Citations*no SourceTable record for this CitationTable record
54– Headless Citations*CitationTable records for which no Person, Event, Family, AltName found; cleaned on transfer
18Repositoriesall records from AddressTable of type Repository
827Multimedia itemsall records from MultimediaTable
7– lacking thumbnailprobably an imported reference to an image file that has yet to be found
0– duplicate multimedia filenamesprobably having different paths
25– with Date & Description*(TBC) if a record has both, the Description is lost in a transfer
3894Multimedia linksall records from MediaLinkTable
0– duplicate multimedia linksimage appears multiple times for person, fact
26Addressesall records from AddressTable of type Address
0– blank namesName field of AddressTable record is blank
43Tasksall records from TaskTable
2– Research Logall records from TaskTable of Type 1
41– ToDoall records from TaskTable of Type 2
0– Correspondenceall records from TaskTable of Type 3
1Foldersall records from TagTable of Type 1
34Groups*all records from TagTable of Type 0
* NOT TRANSFERABLEvia GEDCOM or Drag&Drop to another RM database

Future additions could include information about the FamilySearch and Ancestry linkages.

Returning to RM7.5 custom reports, groups and Publisher from RM8 via GEDCOM

UPDATE 2022-05-14: better than GEDCOM is to convert directly from RM8 to RM7 for which there are a couple of possibilities:

  1. RM8’s File > Export Data > DropBox creates a copy of the active .RMTREE file to a RM7 compatible .RMGC file intended for use with the RootsMagic Mobile app (now retired for Android). It is a complete conversion except, in my test, for Roles for shared events, which should be an easy fix for the developers. In my test, the RoleTable was empty which invalidates every shared event.
  2. Use one of my SQLite scripts that were developed subsequent to this original post:
    Direct Import of RM8 database into RM7 – Part 1
    Direct Import of RM8 database into RM7 – Part 2

Once the RM8 Export to DropBox is fixed, it would be operationally more attractive and is supported by RM Inc.

Issue

As of writing, still no Book Publisher feature in RM8 and along with other persistent issues, some users wish to return to the RM7.5 database from which they had ‘upgraded’. However, the only way back using the RM user interface is via #GEDCOM which, along with other attendant losses (see GEDCOM & DnD transfer losses) does not import #namedgroup and custom #reports. The original #RM7 database contains the custom reports and book specifications and possibly useful groups as they existed at the time of committing to the upgrade but importing into it results in an overload of duplicates, as described by PatrickR in his Apr 9 post to the RootsMagic User Community: Is there a way to copy custom reports to another database?

A solution

Years ago, I had an approach to creating a quasi-empty shell database file from a populated database that I thought might be adapted to this particular case: Depopulate but keep Customs, Places, Sources. Maybe a variant that depopulated but instead kept groups, custom reports, and book definitions could be used to import the GEDCOM from RM8 to good effect. I’m pleased to report that my experiment was successful, to the extent that GEDCOM transfer allows.

Procedure

  1. Review GEDCOM & DnD transfer losses and, in your RM8 database, ensure that your Fact Types are all set to export to GEDCOM. Run the RM8 database tools and Export the database having chosen to export everyone and everything.
  2. In RM7, make a copy or backup of the original database from which the RM8 ‘child’ was created. Run the RM7 database tools. Close the RM7 database file.
  3. Open the RM7 database file with your SQLite manager (one with a fake #RMNOCASE extension), load and execute this SQL file:
  1. Reopen the now-depopulated database file with RM7.
  2. Run the RM7 Database Tools.
  3. Import the GEDCOM file from RM8 choosing the option
    Preserve record numbers
  4. Done! Review what you’ve got. Note that the Media Gallery has to build thumbnails anew.

Possibilities

It should be possible to transfer data more completely by directly copying data from the RM8 database to the RM7.5 database via SQLite. However, because of changes in database structure, it’s not straightforward for several tables. Also, there may be serious incompatibility between the ConfigTable content of the two versions which may preclude the carryover of custom reports and other settings.

Privatise Extra BMD Events #events #reports #private

Problem

Many persons in my database have multiple Birth events created from different sources; likewise for other vital facts. Even though I may have marked one as Primary, the others still clutter reports: narratives look like repetitious gobbledygook; calendars have the same person multiple times, even on the same day. I want to clean up the publishing mess without losing the evidence.

This is especially the case for databases downloaded from an Ancestry Member Tree where events are generated by accepting or assigning a source and choosing to make the data into an alternate fact because it varies in some way from an existing fact.

Solution

Many report settings in RootsMagic have an option to include or exclude private events. I could reduce report clutter by setting the non-Primary Birth events to Private without losing the data that led to their creation. The Primary event should have whatever consideration and logic went into it being deemed the most accurate. I haven’t set a Primary in every case where my database has multiples so I want two things:

  1. A batch process that sets Private all the non-Primary events of a set having a Primary.
  2. A list of all the sets of multiple events that do not have a Primary so I can follow up.

Individual Summary with multiple Birth events, highlighted.

Individual Summary excluding Private events, in this case, the extra Birth events.

Implementation

BMD_private_multiples.sql 2020-02-15 Click to view; right-click to download.

The above SQLite script builds a series of temporary SQLite Views (stored queries), does the possible privatising and exits displaying the View “BMDmultstofix” which lists the persons needing follow up attention to set Primary one of the plural (multiple) events. The Views are all in memory and are lost when the SQLite manager exits the database.

Usage

The script executes very quickly on small files so it is possible to keep the SQLite manager open on the database and to re-execute the script at intervals while editing the database in RootsMagic. One could simply set the Primary event out of a group of multiple events for a person in RM’s Edit Person screen and then run the script to set the others Private or manually set them. Re-running the script will shrink the list needing Primaries by those you have set.

For family-type events (Marriage), the View “BMDmultstofix” will list both spouses separately. When the Marriage event is set to Primary on either spouse, the two separate listings will go.

Possible Enhancements

Colour Coding

The list of persons needing attention could be quite large, depending how assiduously you have been setting the Primary flag and the number of people and events in your database. So it might be nice to extend the last query or even last View to pick up the color code for each person which could be used to sort or filter the list.

Private as Default

Instead of privatising only those non-Primary events in groups having a Primary, another approach would be to privatise all non-Primaries. That would really clean out the reports and require user editing to get the desired outputs. This may be preferable.

Enhanced Version

BMD_private_multiples2.sql 2020-02-16

Both of the hypothesized enhancements have been incorporated. To ‘fix’ a group of multiples, one of the events must be set Primary and un-Privatised or another event added that consolidates one’s research into an assertion. By default, a new event is not Private and need only be made Primary.

In SQLiteSpy, any column can be sorted by clicking on the heading. Some other software, SQLite Expert is one example, also support filtering by column. These tools enable grouping color code or by surname without having to modify the script or write your own custom query of BMDmultsToFixClr view.

Screenshot

Screenshot from SQLiteSpy of results from running BMD_private_multiples2.sql

Trying to find duplicate spouses

I am trying to find duplicate spousess.
I created the following query:
select fatherid,
(select n1.surname || “, ” || n1.given from nametable n1 where fatherid = ownerid) as fathername,
motherid,
(select n2.surname || “, ” || n2.given from nametable n2 where motherid = ownerid) as mothername
from familytable
order by fatherid, motherid;

1. Should there be blank fatherid’s for a motherid in the result?
2. Or likewise blank motherids for a fatherid in the results?
3. Am I safe to think that if there are 2 motherid’s for the same fatherid that it is a duplicate?

Backup database open in both SQLite and RootsMagic #backup

Problem

Neither the RootsMagic File>Backup nor File>Copy can proceed because the database file “is being used by another process” (RootsMagic’s own or your SQLite manager or other outboard SQLite connection). And if you’ve chosen Backup, RM gracelessly blocks you from further work with an always-on-top fake news message window “Creating backup file…” requiring closure of RM to clear.

If you were using a SQLite manager that’s connected to the database file, you have to close it in order to use RM’s utilities and then reopen it with numerous steps to finally execute the SQL you’ve been carefully crafting to make some important changes to your data. That disruption takes time, may lose some valuable history from the SQLite manager and, perhaps, derail your train of thought. Same goes for ordinary users getting thrown off track after running Compact and other unknown things that screw up the RM backup.

Outboard zip utilities such as 7Zip, RAR, WinZip,… all, to the best of my knowledge, require the closure of all other processes for them to proceed but do handle such a blockage gracefully, unlike RM4-7. It’s not sufficient that RM get to handle the blockage gracefully; it shouldn’t be blocked at all!

Some Workarounds

Looking into this issue in 2019, I’ve discovered some useful alternatives to RM’s backup and copy utilities that have arisen since the days of RM4 and Windows XP in 2009/10 when the original SQLite Tools for RootsMagic 4 Wiki began. SQLite Expert Professional lists “Backup” as a feature that likely uses the Online Backup API, perhaps the best suited to the task but at a price.

This post presents four free, fast backup methods that allow the database to remain open in RootsMagic and in one (or more) outboard SQLite connections. They have been tested with a database open in RM7.5.9 and simultaneously in three SQLite managers: SQLite Expert Personal 4.2.0.784, SQLiteSpy 1.9.12 and SQLiteStudio  3.2.1.

  1. Windows 10 File Explorer
  2. SQLite Command: VACUUM INTO
  3. SQLite Archive File
  4. Export to SQL Text file

Windows 10 File Explorer

My first surprise was to discover that the Zip function integrated in File Explorer (possibly from even before Win10) does not object to the database file being open. Nor does its copy function. The latter may be the easiest and fastest method.

Copy

With the File Explorer window open and the “filename.rmgc” database file selected, Copy (Ctrl+C) and Paste (Ctrl+V) results in “filename – Copy.rmgc”. Repeat and the copy is automatically serialized as

filename – Copy (2).rmgc”
filename – Copy (3).rmgc”

No manual file naming is needed to prevent overwriting of the last copy and any copy can be immediately opened by RM or any of the SQLite apps. The Created Date timestamp for the copy is the Date Modified timestamp of the original, so its value puts the copy’s version of the data into context, provided you have some log (recorded or remembered) of what you did before and after.

Zip

Right-click on the selected “filename.rmgc” database file and, from the popup context menu, click on Send to > Compressed (zipped) folder. This will popup a dialog box to allow you to change the default name “filename.zip”.

Repeat and File Explorer automatically serializes the name:
filename (2).zip”
filename (3).zip”

You could change the zip filename extension to “.rmgb” at any of these backups so that RootsMagic could find and restore from it but I wouldn’t bother for two reasons:

  1. It interferes with serializing the filename to preserve successive backups.
  2. It’s easy to extract from the Zip folder using File Explorer which gives you great control over where the extracted file goes with opportunity to rename and move it afterward.

Using File Explorer from within the apps

All of the mentioned apps launch a basic Windows File Explorer dialog Window to Open a database or carry out some other operation. All support Copy, Paste and Send to Compressed (zipped) Folder. Because RM’s window filters for .rmgc files, it is quicker selecting the database file. Presumably, the speed of the subsequent operation is independent of the application that opened the File Explorer dialog. When the copy of zip is complete, simply Cancel the window to back out to where you were with the database.

Copying or Zipping from the Command-Line or PowerShell

If File Explorer could copy or zip the database file while open in RootsMagic and SQLite, it seemed reasonable that these other Windows utilities could, too. The Zip is only available in PowerShell but both can copy files. Unfortunately, both the CLI and PS detected another process and refused to proceed.

SQLite command: VACUUM INTO

The VACUUM command was provided in SQLite 3 and all of the referenced SQLite managers from the get-go but it operated directly on the database in memory. About two months ago (ver 3.27), the command was extended with an INTO clause that writes the result of the VACUUM to a new database file without modifying the original. Effectively, this means that one can make a vacuumed copy of the currently opened database from a SQLite manager as a backup. If the original has not been recently vacuumed by either RootsMagic or by an outboard SQLite manager, then the vacuumed copy will be smaller in size.

The copy can be arbitrarily named and, if named with the .rmgc extension, can be opened directly by RootsMagic without going through a Restore process.

Typing the command and filename and ensuring the latter is unique is less efficient than the File Explorer method. A potential productivity gain would be to write a Windows command or Powershell script to launch the current sqlite3.exe command line shell program to open the database file to VACUUM INTO a filename defined by the script; it could incorporate date and time in the filename. Alternatively, the batch or command script could also zip the resulting .rmgc file and give the zip file the unique name.

After some time, the referenced SQLite managers will undoubtedly update to SQLite 3.27 to support the command and some may incorporate some higher level controls such as a file explorer dialog to facilitate the naming of the vacuumed file.

This command should work in SQLite for both Windows and MacOS.

For the SQLite 3 command line shell program, look for sqlite-tools-win32-x86-3270200.zip or later at https://www.sqlite.org/download.html. Also available for MacOS: sqlite-tools-osx-x86-3270200.zip.

SQLite Archive File

Since 2014, there has been development of a SQLite Archiver utility “sqlar” using SQLite and zlib to archive files of any type to a SQLite database container. The file contents are zipped to a Blob-type field in a table with one row for each archived file. In 2018, it has been made into an optional extension to the SQLite source code and is now integrated in the SQLite command-line shell tool mentioned above. It is possible for a SQLite manager to be compiled with this extension but I don’t know of any yet.

The SQLite command-line tool from 3.22 can be opened and at its prompt given the “.ar” command to  create a SQLite Archive file, add|update files to it, list contents, and extract files from it. From 3.23, the recommended use is from the OS CLI to call sqlite3.exe with the -A option.

In this example in sqlite3.exe, I updated the SQLite Archive file “_LGS_pres.db” with the database “_LGS_pres.rmgc” that was open in RM and the 3 SQLite managers.

sqlite> .ar -uvf _LGS_pres.db _LGS_pres.rmgc
_LGS_pres.rmgc

The Archive file could contain many other files, e.g., the media files or other unrelated files, just as a Zip folder can.

Moreover, the SQLite Archiver can operate as a zip utility producing a .zip file (“compressed folder” in Win10) instead of a SQLite Archive file, skipping over the locks that trip up RootsMagic’s backup and 7Zip.

For more, see SQLite Archive Files.

Export to SQL Text file

This is an old method that has merit because it reads the SQLite database, not just copying it, to create SQL commands which can later be executed to build the identical database. Some SQLite managers support it (SQLiteStudio) and it is available in the command-line shell:

Use the “.dump” command to convert the entire contents of a database into a single ASCII text file. This file can be converted back into a database by piping it back into sqlite3.

The resulting text file is bigger than the database file but it can be piped into a zip utility for compression. Whether there is a limit that a large database might run into remains to be seen.

For more, see Converting An Entire Database To An ASCII Text File

Observations

  1. There are multiple ways to backup a RootsMagic database while it is open in both the RootsMagic application and one or more outboard SQLite connections.
  2. Using these other ways instead of the RootsMagic backup function may also avoid the recurring situation when RootsMagic blocks itself from completing a backup; a forced shutdown is the only way to resume working in RootsMagic.
  3. For the less technically proficient, Windows File Explorer provides a convenient alternative to the RM backup, even from within the RootsMagic user interface.
  4. RootsMagic developers should look into some of these alternatives that work with SQLite to resolve the conflict with outboard SQLite connections. Especially appropriate could be the Online Backup API because it is native to SQLite and aware of other users of the database, allowing them to “continue uninterrupted while a backup of an online database is made”.

Caveats

  1. Testing these has been necessarily superficial so there may be issues as yet undetected. Please comment below if you come across any.
  2. Make sure that the database is not being modified by one of the connections while the ‘backup’ is going on. When the backup is opened, there could be unexpected results. This is especially true of those methods that work without a SQLite connection to the database, i.e., File Explorer, SQLite Archiver.
  3. Those methods that require a SQLite connection to the database, i.e., VACUUM INTO, Export and Online Backup API, need the RMNOCASE collation (see RMNOCASE; faking it in SQLite Expert, command-line shell et al). It is possible that the resulting database will fail the RootsMagic database integrity test; simply use its Rebuild Indexes tool to correct.

Media Duplicates – Reports and Remedies

Problem

TreeShare update operations that add images from Ancestry to the RootsMagic database can result in successive downloads of the same image file saved to a revised name, each added to the Media Gallery and independently tagged to their respective subjects. I don’t think this is either necessary or desirable (maybe there is a good reason yet to be revealed by the RootsMagician!). Moreover, such duplication (or worse, multiplication) bloats usage of storage space and inflates the Media Gallery making it harder to manage, navigate, find things. The few respondents to my survey  reported 10-20% of their media files were duplicates while an exercise with a 10-person subset of my 420 person Ancestry Tree after many TreeShare ops trying to resolve differences was far higher. The discussion and scripts that follow help understand the magnitude of the bloat and offer surgical tools to reduce it. These tools reduced the 92 media in the Gallery to 49.

Background

The issue predates TreeShare and could arise from combining databases but it is likely that it had not been noticed as frequently or to the same degree. Earlier effort dating back to RM4 (see Media Repair Queries) concerned itself with duplicate file names in different folders, not with duplicate content within differently named files.

RootsMagic places all TreeShare media from Ancestry in one folder for a given database. It does not attempt to substitute an existing duplicate media file for the one to be or is being downloaded; rather, it modifies the name of the downloading duplicate file with a unique suffix. Consequently, the old script would not even detect these differently named files with identical content, only those having the same name in separate folders. Moreover, it could generate false positives for files of the same name with different content leading to the deletion of potentially valuable information.

Finding Duplicates

By comparing RootsMagic thumbnails

RootsMagic’s database has a property for each image file that can be used as a basis for finding duplicate image content – the thumbnail image it creates of the image file. This thumbnail is stored in the database in the Thumbnail field in the MultiMediaTable. However, because it is a tiny thumbnail, it is a crude approximation of the original and it may be possible that similar images, such as some pairs of Census pages or BMD Indexes could give rise to identical thumbnails. Experience will tell. It’s a very convenient parameter to exploit as seen in this query result from a tiny, test database.

Examining the database’s MultimediaTable shows the duplicate file’s revised name:
 In this case, we can confidently surmise from the common part of the filenames AND the identical thumbnails that these undoubtedly have duplicate content.

By using CCleaner

Other files may not have strongly similar filenames despite identical thumbnails and, given the risk of false positives from the latter, a further test is required. The free CCleaner and other utilities can resolve if they have the same or different content; we’ll continue here with CCleaner and its Tools > Duplicate Finder feature set to search the folders of interest for image files that match each other only by Size and by Content. Here is a screenshot of its results:
The highlighted names are of those in the database’s Media Gallery. The others are copies made using Windows File Explorer and are unused by (and therefore unrecorded in) the RM database. The script correctly reported the two matched files used by the database from their identical thumbnails.

CCleaner exports its results as a tab-delimited text file which can be imported into a spreadsheet:

CCleaner Duplicate Files report imported into Google Sheets; headers added by user.

Combining the data from the RootsMagic database with that from the CCleaner Duplicate Files report should provide what is needed to identify and replace duplicates in the Media Gallery and to delete the then unused duplicate files from computer storage. Combining may be done by importing the CCleaner data into the RM file, temporarily, or by exporting RM data to the spreadsheet. What follows is the first approach.

Migrate CCleaner Duplicate Search results to SQLite

With the CCleaner data in the spreadsheet, formulas can be applied to create a SQLite INSERT statement for each reported file. We will create an empty temporary table with a corresponding structure into which the generated statements will insert values from the spreadsheet. I’ve created a publicly accessible Google Sheet which anyone can copy to their own spreadsheet to work on or to use directly.

Scripts for finding Duplicates

Before running any of these scripts, you should use RM’s File > Database Tools:

  1. Test Database Integrity
  2. If #1 fails, Rebuild Indexes and repeat #1
  3. Delete Phantoms

And it is probably advisable to run the latest script from Delete Phantoms to clear out any detritus that may just confuse or exaggerate comparisons of Database Properties before and after the Merge Duplicates script below. Use your OS File Explorer/Manager to get the properties of the media folder(s) used by the database.

Thumbnails

Media-DupThumbnailsList.sql Make sure you open the RM Media Gallery and let thumbnail creation complete before running this script against your database.  It creates a temp table named zDupMediaTable which is displayed at the end of script execution to show you the duplicates found. The script makes no modifications to the database; this table will be used by the merge script.  Note that it ignores media files that are not image files or any file type that is not in the subject database’s Media Gallery. And beware that there is a risk of false positives because thumbnail images are of such low resolution. If you intend to proceed to merge duplicates, do not close the database from the SQLite manager as that will delete the temp table needed.

Files

Media-DupFilesList.sql You have to edit this script to include the INSERT commands you will copy from the spreadsheet before running it on the database. It creates an identically named temp zDupMediaTable to be used by the merge script and makes no changes to the database. Because Ccleaner does a bit-wise or byte-wise comparison, there is no chance of a false match between different images. However, it will not match files having exactly the same image data but a difference in just one character of metadata. If you intend to proceed to merge duplicates, do not close the database from the SQLite manager as that will delete the temp table needed.

Merging Duplicates

Media-MergeDuplicates.sql This is where the rubber hits the road and database changes are afoot. You have taken safety precautions? And your SQLite manager still has the temp zDupMediaTable in the database from either of the above scripts? This script won’t do anything without it. Fire away!

When you next run TreeShare after script execution, all the people whose media has been affected by the merge and is tagged to anything related to the person will be marked as Changed and all previous Changed people will first have been cleared. Nothing material should have been changed by the script but flagging them so helps you to confirm that is the case.

Cleanup

Repeat the initial steps of RM Database Tools and Delete Phantoms  to clean out new detritus and then RM Database Properties for comparison of Before and After.

Delete surplus duplicate media files

The last step in the script lists the commands for deleting duplicate files from the media folder(s) that are no longer used by the database. Copy and paste them into a text editor for review and to avoid their loss if you have to close the database in the SQLite manager (some RootsMagic operations such as Backup and Restore cannot proceed if the database is open in another application). If you are confident that any or all are unused by any other database, copy and paste the respective ERASE commands into the CMD shell. They will execute immediately without asking for confirmation; the script could be modified to require a confirmation of each ERASE. For a large set, break up the command set into chunks.

Repeat getting folder properties for the After results to compare with Before, i.e., number of media files and total size.

Miscellaneous

  1. How well this procedure will scale with the size of the database is unknown. Certainly, once some large numbers of INSERTs and ERASEs is reached, there will be issues with copying and pasting and perhaps the size of the script for the INSERTs. Nor can we predict the speed with which the scripts will execute.
  2. How TreeShare functions remains a mystery in part. Events for a couple, such as Marriage, are especially problematic because we are advised that Ancestry handles them as separate individual events. It seems advisable, if not necessary, to have on the Ancestry Tree identical sources for the Marriage event for each person to be assured that RM will create the couple’s Marriage event with all the sources. And updating via TreeShare sometimes is a tail-chasing exercise in which ostensibly identical citations or media on both sides are perceived as new for the other with the update one way provoking the offer of a new one the other.
  3. The flagging of Changed People is done by setting the Modified field to 1 for records in LinkAncestryTable whose LinkType is 0. The script also sets the Modified field to 1 for records of LinkType 4 (citations) and 11 (media) but these have no effect on anything apparent in RM 7590.
  4. This procedure does not provide ERASE commands for media files that are unique (not duplicates) and unused by the database. The procedure on this early page could be adapted to do so: Scrapbook Files Status

Updates responding to TreeShare

Working out the wrinkles in a new script addressing some TreeShare issues, I was using some other scripts to help me understand what was going on and ran into a couple of issues that I’ve fixed:

Events – Add to all Children a Parent event shared with parents #facttypes #sharedevent #relationships #rm8

TBA: Rev 2021-03-0x above script adapted for RM8 by Tom Holden from the #RM7 script described below by the original author, Jerry Bryan. Note that the custom fact type ‘Parents’ and the role ‘Birth_of_Child’ must be created separately in advance of executing this script.

This is a query I wrote just for myself and I didn’t feel as if it were of sufficiently general interest to post here. Also, it is not as polished as I like queries to be that I share publicly. But a request came up on the RootsMagic forums, so here it is. The query adds a Parent fact to everyone in the database who is not a dummy person, who has parents, and who has a birth fact with a valid date. The dummy person definition is for my own database, where my dummy people have an asterisk somewhere in their name. The birth fact with a valid date is required because the Parent fact needs a valid date, and in my database at least it needs to be placed immediately after the Birth fact. The name of the fact is singular “Parent” but it supports both parents. As written, the parent information is placed two places: 1) in the Description field for the Parent fact, and 2) the Parent fact is shared with both parents.

My sentence template for the Parent fact itself is setup as follows:

<b>Parents:</b>< [Birth_of_Child].>

I use point form sentences with the sentence for each fact starting on a new line. The new line is a part of the template. Most RM users would need to set up the sentence template differently to meet their own needs. The [Birth_of_Child] variable looks funny, but that’s the role I share with the parents and it is what gets the parents listed in the sentence. Using the [Desc] variable would produce the same sentence, but using the [Birth_of_Child] variable causes the parents to be listed in the index for this page and the [Desc] variable does not. I mostly use Descendant Narrative reports. For the descendants, the parents appear in the report anyway, a generation earlier. The [Parent] variable makes the parents explicit. For the spouses of the descendant, their parents already appear by default in what I call the spouse sentence. So the [Parent] variable for the spouses represents a duplication. The spouse sentence is not under control of source templates (a major failing of RM), so I solve the problem by running a Notepad++ script on the RTF file to remove the parents from the spouse sentence before printing.

My sentence template for the [Birth_of_Child] role is as follows.

<b>Birth of Child:</b> <[Date:plain]><, [Person:Full]><, parent age: [ThisPerson:age:plain]>.

This has the salutatory effect of including the birth of each child on the timeline of each parent in a narrative report. Upon further review, I think the sentence would look better using a gender switch to say mother’s age or father’s age as appropriate. Gender switches work on the [Person] variable. I can’t remember if they work on the [ThisPerson] variable or not.

And here follows the script, warts and all

-- This script writ large creates a Parent
-- fact for every person in the RM database
-- who has parents. It loads the parents'
-- names into the Description field for the
-- Parent fact. It also shares the Parent
-- fact with the parents themselves where it
-- shows up as a Birth of Child event.
 
-- This script was not developed with the intent
-- of being generally distributed. Hence, it has
-- not been refined and tested the way I would do
-- for general distribution of a script. Please
-- use it and adapt it on an "as is" basis or for
-- ideas alone.
 
-- I can't remember for sure, but sometimes scripts
-- that insert this much data into into an RM data
-- base can create indexing errors. If so, run the
-- RM database tool in order immediately after running
-- this script.
 
-- Jerry Bryan 7/31/2018
 
 
-- Create a view of the EventTable
-- containing only Parents events.
-- This view will be used to prevent
-- creating a Parents event for
-- any individuals who already have
-- a Parents event. The main data needed
-- is the PersonID of the individuals
-- who already have a Parents event.
-- The PersonID manifests itself in
-- the EventTable as OwnerId.
 
 
DROP VIEW IF EXISTS ParentEventView;
CREATE TEMP VIEW ParentEventView AS
SELECT E.*
FROM EventTable AS E
       JOIN
     FactTypeTable AS FT  ON FT.FactTypeID = E.EventType
     WHERE FT.Name LIKE('Parents');
 
 
 
-- Create a view of the EventTable
-- containing only Birth events.
-- These events provides an initial list
-- of candidates of people to receive
-- a Parents event. Also, many of the
-- fields for newly created Parents
-- events will be replicated from the
-- corresponding Birth event.
 
DROP VIEW IF EXISTS BirthEventView;
CREATE TEMP VIEW BirthEventView AS
SELECT Birth.*
FROM EventTable AS Birth
       JOIN
     FactTypeTable AS FT  ON FT.FactTypeID = Birth.EventType
     WHERE FT.Name LIKE('Birth');
 
 
-- Create a view of Birth events for people
-- that don't have a Parent event. This is the
-- first filtering on the list of candidates
-- of people to receive a Parents event.
 
 
DROP VIEW IF EXISTS BirthWithoutParentsEventView;
CREATE TEMP VIEW BirthWithoutParentsEventView AS
SELECT Birth.*
FROM BirthEventView AS Birth
          LEFT JOIN
     ParentEventView AS Parent ON Parent.Ownerid = Birth.OwneriD
WHERE Parent.OwnerID IS NULL;
 
-- Create a view of the NameTable
-- containing only primary names.
-- This view will be a source of
-- the names needed for the newly
-- created Parents events and will
-- prevent any names other than the
-- primary ames from being loaded
-- into -- Parents events. This
-- view also prevents Parents events
-- from being created for any dummy
-- people, designated with an asterisk
-- in the name.
 
DROP VIEW IF EXISTS NameView;
CREATE TEMP VIEW NameView AS
SELECT N.*
FROM NameTable AS N
WHERE N.IsPrimary = 1
  AND N.Surname NOT LIKE('%*%')
  AND N.Given   NOT LIKE('%*%');
 
-- This view performs most of the
-- main processing for this project.
-- It determines which people actually
-- do have parents. Actually having
-- parents is based on being in the
-- ChildTable and has nothing to do
-- with whether a Parents event exists
-- or not. This list of people with parents
-- is then matched against people who are not
-- dummy people, who do have birth
-- events, and who don't already have
-- Parents events. This view also
-- develops the data that will need to
-- be stored in the newly created
-- Parents events.
 
 
DROP VIEW IF EXISTS ChildParentsView;
CREATE TEMP VIEW ChildParentsView AS
SELECT Child.ChildID,  Child.Given  || ' ' || Child.Surname  AS ChildName,
       Father.FatherID, Father.Given || ' ' || Father.Surname AS FatherName,
       Mother.MotherID, Mother.Given || ' ' || Mother.Surname AS MotherName,
       CASE
       WHEN Father.FatherID = 0 THEN Mother.Given || ' ' || Mother.Surname
       WHEN Mother.MotherID = 0 THEN Father.Given || ' ' || Father.Surname
       ELSE Father.Given || ' ' || Father.Surname || ' and ' || Mother.Given || ' ' || Mother.Surname
       END CombinedNames,
       B.Date,
       B.SortDate,
       B.EditDate
 
FROM  (  SELECT C.RecID,
                C.ChildID,
                N.Given,
                N.Surname
         FROM   ChildTable AS C
                  JOIN
                NameView AS N ON N.Ownerid = C.ChildID
      ) AS Child
 
           JOIN
 
      (  SELECT C.RecID,
                FM.FatherID,
                N.Given,
                N.Surname
         FROM ChildTable AS C
                JOIN
              FamilyTable AS FM ON C.FamilyID =  FM.FamilyID
                LEFT JOIN
              NameView AS N ON N.OwnerID = FM.FatherID
      ) AS Father ON Father.RecID = Child.RecID
 
           JOIN
 
      (  SELECT C.RecID,
                FM.MotherID,
                N.Given,
                N.Surname
         FROM   ChildTable AS C
                  JOIN
                FamilyTable AS FM ON C.FamilyID =  FM.FamilyID
                  LEFT JOIN
             NameView AS N ON N.OwnerID = FM.MotherID
       ) AS Mother ON Mother.RecID = Father.RecID
 
            JOIN
 
       BirthWithoutParentsEventView AS B ON B.OwnerID = Child.ChildID;
 
-- The data from the ChildParentsView needs to be used
-- twice, once to load Parents events into the EventTable
-- and again to load roles into the WitnessTable. The
-- Parents events have to be loaded into the EventsTable
-- first, and one Insert statment in SQLite cannot load
-- data into two tables. However, loading Parents Events
-- into the EventTable will cause the ChildParentsView
-- not to produce the correct results the second time it
-- is used. Therefore, the results from applying the
-- ChildParentsView will be saved into a temporary table
-- called ChildParentsTable. As a table instead of a view,
-- the data in ChildParentsTable can be used two different
-- times and remain the same data both times it is used.
 
DROP TABLE IF EXISTS ChildParentsTable;
CREATE TEMP TABLE ChildParentsTable
(ChildID,ChildName,FatherID,FatherName,MotherID,Mothername,CombinedNames,Date,SortDate,EditDate);
INSERT INTO ChildParentsTable
(ChildID,ChildName,FatherID,FatherName,MotherID,Mothername,CombinedNames,Date,SortDate,EditDate)
SELECT CP.*
FROM ChildParentsView AS CP;
 
 
 
-- Load new Parents facts into the EventTable.
 
INSERT OR ROLLBACK INTO EventTable
SELECT NULL AS EventID
    ,(SELECT FT.FactTypeID FROM FactTypeTable AS FT WHERE FT.Name LIKE('Parents') ) AS EventType
    ,0 AS OwnerType
    ,ChildID AS OwnerID
    ,0 AS FamilyID
    ,0 AS PlaceID
    ,0 AS SiteID
    ,Date AS Date
    ,SortDate AS SortDate
    ,0 AS IsPrimary
    ,0 AS IsPrivate
    ,0 AS Proof
    ,0 AS STATUS
    ,EditDate AS EditDate
    ,CAST('' AS TEXT) AS Sentence
    ,CAST(CombinedNames AS TEXT) AS Details
    ,CAST('' AS TEXT) AS Note
 FROM ChildParentsTable;
 
 
 
 --   ===========================================================================================================
 --     The following is adapted from Tom Holden to rank same date sort dates.
 --
 --     There are a number of changes by Jerry Bryan that are specific to his
 --     use case.
 --
 --        * The list of fact types which are supported is greatly increased.
 --        * Sort dates including ranked sort dates (date-n) are ranked and even re-ranked
 --          whether or not they match the date from from the fact itself, provided only
 --          that the sort dates in question match each other.
 --             . This allows same "ABT" dates to be ranked.
 --             . This allows same "year only" dates to be ranked if the sort date is
 --               1 July of the year.
 --             . This allows same "year and month only" dates to be ranked if the
 --             . sort date is the 15th of the month.
 --
 --   ===========================================================================================================
 
/* SortDateSameDayOrderCustom.sql
   2011-12-20 ve3meo
   Alters SortDates of any set of Fact types
   to a natural order when any pair or more occur on the same date.
   Could be extended to order other facts also. SortDates are effectively assigned
   (by arithmetical offsets) an absolute suffix -1, -2, ... related to the rank of the FactType.
   Affects only those events whose SortDates correspond to the Fact Date, as computed
   by a Date encoding algorithm. The algorithm does not handle Date modifiers so not all
   Event dates are handled, e.g. "Bef 1960".
*/
DROP TABLE IF EXISTS TmpFactOrder
;
CREATE TEMP TABLE IF NOT EXISTS TmpFactOrder
(Rank INTEGER PRIMARY KEY, FactName TEXT)
;
 
/* list of Fact Names, standard and custom, to be sorted, in rank order.
   Revise the list to suit your needs */
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Birth');
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Parents');             -- added by JB
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Birth Certificate');   -- added by JB
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Christen');
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Baptism');
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Death');
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Obituary');            -- moved up by JB
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Death Certificate');   -- added by JB
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Funeral');
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Cremation');
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Burial');
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Burial Inscription');  -- Added by JB
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Burial GPS');          -- Added by JB
INSERT INTO TmpFactOrder (Rank, FactName) VALUES (NULL, 'Memorial');
/* revise SortDates */
UPDATE EventTable
SET SortDate = SortDate
  -6692012023  -- this offset goes to Date-1 if the event is a ranked event
  *( (
      SELECT Rank
      FROM TmpFactOrder,
           FactTypeTable
      WHERE FactName LIKE Name
      AND FactTypeID = EventType
      )>0
    )
  +1048576  -- this offset adds steps of 1 to Date-1 multiplied by (rank-1)
  *( (
      SELECT Rank
      FROM TmpFactOrder,
           FactTypeTable
      WHERE FactName LIKE Name
      AND FactTypeID = EventType
      )-1
    ) -- maps the FactType to its order
WHERE EventID
IN (SELECT EventID FROM EventTable
    INNER JOIN
    (SELECT -- matching dates
     SortDate, OwnerID, COUNT()-1 AS Matches
     FROM EventTable
     INNER JOIN FactTypeTable
     ON EventType = FactTypeID
     WHERE EventTable.OwnerType = 0
     AND Name IN (SELECT FactName FROM TmpFactOrder)
/*     AND    -- commented out by JB to handle sort dates not matching fact date when sort dates are equal to each other.
     SortDate =   -- equals encoded event Date (if not a match, suggests that user has modified SortDate so don't touch it)
       (CASE
        WHEN DATE LIKE '.%'
        THEN 1
        ELSE Substr(DATE,3,5) END
        +10000
        )*562949953421312
        + Substr(DATE,8,2)*35184372088832
        + Substr(DATE,10,2)*549755813888
        + 17178820620 */
      GROUP BY SortDate, OwnerID, EventTable.OwnerType
     )
     USING (OwnerID, SortDate)
     INNER JOIN FactTypeTable
     ON EventType = FactTypeID
     WHERE Matches
     AND EventTable.OwnerType = 0
     AND Name IN (SELECT FactName FROM TmpFactOrder)
    )
;
 
 
-- We now add Parent roles to each Parents event.
-- It is most convenient to add the Parent role
-- for the father in one INSERT and to add the
-- Parent role for the mother as a separate insert.
-- So we first create a view that joins the
-- Parent events with the ChildTable and FamilyTable
-- to get a list of Parent events and their
-- respective fathers and mothers.
 
DROP VIEW IF EXISTS ParentRoleView;
CREATE TEMP VIEW ParentRoleView AS
SELECT PEV.*,
       FM.FatherID, FM.MotherID
FROM ParentEventView AS PEV
        JOIN
     ChildTable AS CT ON CT.ChildID = PEV.OwnerID
        JOIN
     FamilyTable AS FM ON CT.FamilyID =  FM.FamilyID;
 
 
-- Load Parents roles for fathers into the WitnessTable
 
INSERT OR ROLLBACK INTO WitnessTable
SELECT NULL AS WitnessID
    ,PE.EventID AS EventID
    ,Parents.FatherID AS PersonID
    ,0       AS WitnessOrder
    ,(SELECT R.RoleID FROM RoleTable AS R WHERE R.RoleName LIKE ('Parent') ) AS Role
    ,CAST('' AS TEXT) AS Sentence
    ,CAST('' AS TEXT) AS Note
    ,CAST('' AS TEXT) AS Given
    ,CAST('' AS TEXT) AS Surname
    ,CAST('' AS TEXT) AS Prefix
    ,CAST('' AS TEXT) AS Suffix
 FROM ChildParentsTable AS Parents
         JOIN
      ParentEventView AS PE ON PE.OwnerID =  Parents.ChildID AND PE.Details = Parents.CombinedNames;
 
-- Load Parents roles for mothers into the WitnessTable
 
INSERT OR ROLLBACK INTO WitnessTable
SELECT NULL AS WitnessID
    ,PE.EventID AS EventID
    ,Parents.MotherID AS PersonID
    ,0       AS WitnessOrder
    ,(SELECT R.RoleID FROM RoleTable AS R WHERE R.RoleName LIKE ('Parent') ) AS Role
    ,CAST('' AS TEXT) AS Sentence
    ,CAST('' AS TEXT) AS Note
    ,CAST('' AS TEXT) AS Given
    ,CAST('' AS TEXT) AS Surname
    ,CAST('' AS TEXT) AS Prefix
    ,CAST('' AS TEXT) AS Suffix
 FROM ChildParentsTable AS Parents
         JOIN
      ParentEventView AS PE ON PE.OwnerID =  Parents.ChildID AND PE.Details = Parents.CombinedNames;
;
 

DAT File Question #ancestrycom #treeshare #database

I found some discrepancies in RootsMagic and posted a question. This was in an answer:

C:\Users\Tom\AppData\Roaming\RootsMagic\AMT (change the “Tom ” to your Windows username). In this folder is a .DAT file for each connected Ancestry Member Tree (AMT), named “AT-AncestryTreeID.DAT”, where AncestryTreeID is an integer. This file is a SQLite database containing tables named identically to a subset of the TreeShared RM database file.

Is there any data description documentation on that file? Or any documentation on its purpose in life and what it does.

Discussions & comments from Wikispaces site


ve3meo

 

Very preliminary answer

ve3meo
26 July 2018 14:26:47

Nothing published. Your description is accurate. We surmise that it is created by TreeShare, contains relationships between RM items and Ancestry items that are sync’d. It might be an intermediate file that TreeShare creates before transmitting changes in one or the other directions.