Please join me in welcoming Kevin McLarnon ( @kmclarnon ) in the role of Editor on this site and, especially, for volunteering to lead a renewed initiative to improve the documentation we have on the design of the RootsMagic v9 #database. He is counting on users better-versed than himself in sqlite3 and #RM9 (#RM10) to collaborate, contribute to and comment on the #datadefinitions but brings to the process experience with Google Workspace teams, websites and pedagogy.
The locus of activity is the page RM10 Data Dictionary which links to the collaborative spreadsheet on which all are invited to comment. Some of you will be approached for special contributions and you can also contact him directly. You will hear more from Kevin about the project in blog posts and/or Forum messages.
Richard Otter is a skilled, professional developer who is also a RootsMagic user and has been building a library of scripts that are worthwhile checking out for possible solutions for your own needs. He publishes them on GitGub, list of them screenshotted below 2023-05-19. Check for updates and fuller descriptions. He has posted notices of significant ones in the Forum and, hopefully, will continue to do so.
This project arose from a request Fix & Merge Hundreds of Newspapers.com Sources in the Forum. The poster is a heavy consumer of the Newspapers.com sources through RM’s TreeShare with Ancestry.com and had issues with:
A long Source List in the application and repetitiously long report Bibliographies due to a different Master Source for each page of a newspaper.
Repetitious listing of “Newspapers.com” in Source Names and in the Title in Bibliographies and in Footnotes. Her approach was to manually delete it in every Master Source but still had hundreds to do.
Leading punctuation in the Footnotes and Bibliographies because the Author value is empty in sources from this Ancestry collection.
“N.p.” and “n.d.” notations in Footnotes and Bibliographies when a value for Publisher, Publish Place or Publish Date is empty.
Solution
Because the sources were imported via TreeShare, they are Ancestry Record type, i.e., they are created using the built-in Ancestry Record Source Template. Built-in Source Templates are uneditable through the RM user interface but are defined in the same table that holds user-defined templates. Thus, the built-in templates can be modified by using SQLite to edit entries in the SourceTemplateTable. We can address Issues #3 and #4 by modifying the Footnote and Bibliography sentence templates in the Ancestry Record template. That will be of benefit also to citations having empty values from some other Ancestry Collections (see Ancestry TreeShare – Impact).
Issues #1 and #2 are more challenging because the values of the source and citation variables that appear in the Footnote and Bibliography sentences are stored in a XML data structure. To solve #1, we want to “lump” all citations of a given newspaper Title under one Master Source. That requires that the data that differentiates the Master Sources for a common newspaper must be deleted or transferred from the Master Source to the Citation Details. For example, the Page # must be extracted from the Source Name in the SourceTable and moved to the Detail ([Page] variable in XML) for each Citation of that Source in the CitationTable. There are more steps than that alone for each of that one newspaper’s multiple Master Sources and Citations.
Once all the data manipulations are complete, there will be multiple identical Master Sources for a given newspaper Title. RM’s AutoMerge Sources function can finish the job.
Before/After Screenshots
The database undergoing modification was from RM7, hence the screenshots are of RM7. However, the solution also works with RM8 and RM9.
On returning to RM, run Rebuild Indexes in Database Tools.
In RM, open the Source List and run AutoMerge.
If you have two or so remaining sources for the same newspaper using the Ancestry Record template and you wish to have only one, use RM’s Manual Merge for Sources.
Repeat after you have added more Newspapers.com sources via TreeShare.
Notes
Should you have reason to revert the Ancestry Record source template to the format supplied by the application, load and execute in your SQLite manager SourceTemplate-AncestryRecord-Reset.sql, edited to find a RM database file of the same major version number to fetch the built-in format.
Should you upgrade or drag’n’drop to another database, the “Ancestry Record – cleaned” template will revert to the built-in format. Run step #4 on the target database to restore it.
The user reported that TreeShare does not report any change as a consequence of this procedure; it would seem to rely solely on the link to the Ancestry Record stored in the RM7 LinkAncestryTable (AncestryTable in RM8, RM9).
The procedures should work also on RM8 and RM9.
The main script is not what I would call ‘elegant’. It grew like Topsy as I explored the database and evolved the process through a sequence of building blocks. Someone cleverer than I with SQLite might well produce a better, faster version.
Conversion from #RM9 to #RM8 is pretty trivial with little loss except for RM9’s new features. From RM8 to #RM7 is much more complex given the significant structural changes between the two. To convert from RM9 to RM7 is simply a two-stage process with a RM8 compatible file as an intermediary. It’s possible to combine the two into one procedure but that’s for another time.
Losses
That there should be a variance in Tasks is unsurprising because of the structural changes between RM7 and RM9; Task Folders from RM9 could be non-existent in RM7.
The tiny variance in Citations is perhaps more surprising as it might be expected to go in the opposite direction, e.g., citations for Associations in the original RM9 database would be absent in RM7 and unrecoverable in the re-upgrade to RM9. In this case, however, it is because a few of the original RM9 Citations were reused and there were 6131 Citation Links which were transformed to 6131 individual Citations in the downgrade from RM8 to RM7.
The loss of Associations in the downgrade has no effect on the Events count as they are records in a different table, only reported in RM9’s Enhanced Properties List which also reveals the loss of Task folders.
Of course, the database properties lists do not tell the whole story. It would also be unsurprising to find that Notes, especially formatting, may not be preserved in a downgrade or upgrade due to the major reiterations of the Note Editor code. Certainly, new features in RM8 or RM9 will not be supported in a lower version and those that are database dependent, including Associations, Color Code Sets and Labels, and Saved Search Criteria cannot survive a round trip from a higher version to lower and back.
Stage 1: From RM9 to RM8
The only barrier to RM8 opening a RM9 .rmtree file is the version number of the database. For RM8, it cannot deal with a version higher than 8000; RM9’s is 9000 while RM7’s is 6000 (as it is for RM6). This statement opens the door to RM8:
UPDATE ConfigTable
SET DataRec=
CAST(
REPLACE(DataRec,'<Root><Version>9000</Version>',
'<Root><Version>8000</Version>')
AS BLOB
)
WHERE RecID=1
;
The next step is only needed to prevent RM8 from offering the Association fact type (which it cannot support) as a conventional individual type which one could use to add such events, leading to confusion should the database be reopened in RM9.
DELETE FROM FactTypeTable WHERE FactTypeID=902; -- Association Fact Type
--DELETE FROM EventTable WHERE EventType=902; -- Association individual event, if any
The second statement is only needed should such Association individual events have been created before upgrading back to RM9.
A possible enhancement to this down-conversion from RM9 to RM8 might be to convert Associations in RM9 to either Shared Events or simple individual type events with the name of the other person in the event description.
Stage 2: From RM8 to RM7
The heavy work in developing a procedure for this down-conversion was covered in Convert RM8 Database to RM7.
In response to a request in the Forum, Adding Surname to Truncated Source Names in RM9, posted here is a script that may be of use to others. The user’s objective is to facilitate distinguishing which Master Sources (among many similar source names derived from imports from FamilySearch, Ancestry, et al) are relevant to families or persons of interest without clicking down through Citations, Citation Uses and dead ends from which he needs to back out to go onto the next. With 32k Sources, 32k Citations (he’s a ‘source-splitter’) and 113k ‘uses’ among 7k people and 3k families, one can appreciate the scope of his challenge.
Description
This batch script appends to the Source Name the unique surnames of all persons in the database who have a citation or ‘use’ of a Master Source in their profile. The list of surnames is enclosed in drawing symbols: ╣surnamelist╠ acting as bookends. The resulting extended name is gracefully truncated at 256 characters if caused by the operation, the maximum RM9 supports in a drag’n’drop transfer; the value is easily changed in four places in the same statement.
The script execution creates a series of temporary Views (in-memory queries) to build the final View “TruncNewName” from which the SourceTable is updated with the ╣surnamelist╠ appended. These Views are lost when the SQLite manager closes the database.
At the start of the script after REINDEXing against the fake RMNOCASE collation, the SourceTable is updated with Names stripped of previous ╣surnamelist╠. Therefore, the script can be rerun again after changes have been made in the database.
Because the database has been REINDEXed against the fake RMNOCASE, the RM Rebuild Indexes tool must be run on returning to work on the database with the RootsMagic app.
Execution time for the script on the sample database with 32k sources is 20-25 seconds on a 5-yr old i5, middling laptop with HDD so it is not prohibitively long to use repeatedly. I do not know how the time would scale with larger databases.
Requires
Requires a SQLite manager that has a “fake RMNOCASE collation” and supports REGEXP_REPLACE(). Script was developed and tested with SQLiteSpy 1.9.16 64-bit and the fake RMNOCASE extension from RMNOCASE – faking it in SQLiteSpy. Backup before using!
For source ‘lumpers’, the string of surnames might well be too long for the Source Name. Even for ‘splitters’, some source such as a census page for a residential school, hospital, prison could have a long list of surnames. The current Sources view in RM9 is amenable to showing only 100-150 characters depending on screen size and the position of the divider while a drag’n’drop truncates at 256. That may call for two things:
A similar procedure to append surnames to the Citation Name instead of the Surname which still risks field overflow and truncation for heavily ‘reused’ Citations, e.g., the above example and those sometimes resulting from the “Merge all duplicate citations” tool.
An enhancement to RM to provide a popout of the full content of the Source and Citation Names when hovered over or selected.
Working on this project has got me thinking about a means to differentiate what RM thinks are duplicate citations when they differ in Media or Web tags. Stay tuned…
And please comment here or in the Forum on whether this script does anything positive or negative for you!
They say a picture is worth a thousand words. This page compares pictures of the #RM9 and Heredis 2023 databases (both SQLite 3) as automatically created by the same tool.
Entity Relationship Diagrams
An Entity-relationship Diagram (ERD) says a lot about a database. @patjones recently posted a RM9 Database Diagram with many of the linkages or relationships among its tables that are made on-the-fly by the SQL queries from the RM software or the ones found on this site. What a database Entity Relationship Diagram editor generates from a database’s self-definition includes the tables, fields and types, stored Triggers (actions on 1 or more tables by some database event) and stored Views (stored queries). So it is interesting to compare the automatically generated ERDs of the RM9 database against that of the Heredis 2023, from the same editor, DBeaver.
The comparison table and diagram show that the Heredis database is considerably more complex than RootsMagic 9 with more than double the number of tables and nearly triple the number of indexes which accelerate the association of rows across tables and speed up searches. And more divergent is the extensive use of Triggers by H and their complete absence in RM. What do these numbers say about the two databases?
Tables
That there are more tables in H than RM is from a couple of causes:
data normalisation: I think H is more rigorous in applying the principal, e.g., it has a separate table for the Alternate Name while RM has but one for all types of names and uses a field to flag which is the Primary record for a given person. Moreover, H has names in multiple tables: the Primary appears to be in the equivalent of RM’s PersonTable (Individus) while there is another for Surnames (Noms) and one for Given Names (Prenoms). Similarly, RM’s MediaLinkTable which relates a media item to a Person, Couple, Event, Source, Citation, Place… (according to an OwnerType value) is achieved in H by separate tables for each ‘OwnerType’.
a larger feature set? Hard to say. I’m not using H sufficiently to identify the differences and my understanding of table and field names in French is not thorough.
Views
Neither has stored views (stored queries) so that tells us nothing. I’ve used temporary Views on RM databases, lasting until the database is closed by the sqlite manager. It’s possible that both applications do, too, but we cannot see them from outside the app’s sqlite engine.
Indexes
With double the number of tables, it stands to reason that there could be double the number of indexes. I don’t know why there should be triple but, in the past, I’ve suggested an index be added or revised in RM. SQLite is itself very good at optimising queries and may spawn an index that it thinks is appropriate unless there is a pre-defined one that includes all the key fields of the query. Maybe H developers have attempted to cover all the possible bases with explicit indexes.
What I have yet to find out is if there is any custom collation sequence in H as there is in RM which has given us database explorers so much grief.
Triggers
This is the most pronounced difference between RM and H. H has many routine operations embedded in the database, being executed by the database engine, rather than in the higher level code which then has to talk to the database engine to make the same thing happen. For example, H has a trigger named “MajDateMediasEvenementInsert” which automatically updates the ‘Date modified’ value in its ‘EventTable’ when a link has been added to their ‘MediaLinkTable’. Looks like this:
CREATE TRIGGER "MajDateMediasEvenementInsert"
AFTER INSERT ON LiensMediaEvenement
FOR EACH ROW
BEGIN
UPDATE Evenements SET DateModification=MAX(NEW.DateModification, Evenements.DateModification) WHERE Evenements.CodeID=NEW.XrefEvenement;
END;
Maybe not the best example for comparison because I don’t think RM would update the EventTable when a media item is tagged to an event.
DBeaver
Having searched around for a (free) application that could create an Entity Relationship Diagram (ERD) from a SQLite 3 database, I landed on one that might be described as an Integrated Development Environment (IDE) for databases, the DBeaver Community Edition. It is self-described as a “Universal Database Tool” and I’ve only superficially scratched its surface figuring out how to connect a SQLite database, generate the ERD from its definition and edit it to illustrate relationships. But it is much more powerful than that as it looks to be a multi-database manager with which one can query the database data. And there is a Pro version which adds many more capabilities.
RootsMagic 9.0.0.0 was released on Feb 27, 2023 with some new features necessitating changes to the database design. All previous version #database files undergo an upgrade procedure to the RM9 structure and are not backward compatible. However, #RM9 and #RM8 table structures are identical with the following exceptions:
PersonTable
9 new fields named color1, …, color9 have been added in support of the ColorCode sets 1 through 10. The original “color” field is set 1.
FanTable, FanTypeTable
New tables added in support of the new Associations feature. They bear a resemblance, respectively, to the EventTable and FactTypeTable.
PayloadTable
New table added in support of Saved Criteria Search and Saved Criteria Group. So far:
OwnerType
New OwnerType values have been added in support of the new tables. These will be identified as encountered through usage and testing.
Table
RecType
OwnerType
OwnerID
Payload
1 (SavedCriteriaSearch)
8
0
Payload
2 (SavedCriteriaGroup)
20
TagTable.TagValue, GroupTable.GroupID
FANTable
19
CitationLinkTable, MediaLinkTable,…
Table Specs
This is a tab-delimited result of a query of the tables.
This script was prompted by a discussion in the RootsMagic Community on Feb 10 ’23: Notes are duplicated. The duplicate begins with MERGED NOTE. It is not uncommon, viz this Google search for the MERGED NOTE flag generated by RootsMagic. Its results return many pages, from sites.rootsmagic.com and from other sites with content from RM databases. There must be many more unpublished cases among users who have used RM’s Merge functions.
Problem
From RM4 on, merging of persons preserved the Person (General) Note value of the secondary person by concatenating it with that of the primary person separated by the string ‘– MERGED NOTE ————‘ and some whitespace. In some cases, both persons had the same Note but RM concatenated them anyway. And in others, there have been multiple merges into a person with duplication of notes occurring in any pattern.
Complicating the test for duplication is that some notes are identical except for the loss of trailing whitespace (when GEDCOM is involved) or the dropping of a symbol such as “<” or perhaps some punctuation.
Solution
This script parses the PersonTable.Note into the parts between the MERGED NOTE flags, keeps the alphanumerically distinctly different parts and concatenates them into a new Note with a symbolic demarcation ‘-+-+-+-+-+-+-+-‘ that is less brazen but still useful. It presents a table with both the NewNote and the OldNote for inspection.
Updating the PersonTable with the new Note is done by executing a selected statement within a comments block. The original flagged Notes are copied into a temporary table from which they can be restored, prior to re-executing the script after updating PersonTable with the revised Note.
Requirements
Uses Common Table Expression (CTE) and the REGEXP_REPLACE() function and was tested on SQLiteSpy 1.9.15 Win32. Any current SQLite manager that has been compiled with a relatively recent SQLite source code (CTE was introduced a few years ago) and has an extension or includes the REGEXP_REPLACE() function using the same syntax as SQLiteSpy should work.
Usage
Close the database in RM with a backup.
Open the database in SQLiteSpy or equivalent.
Load script and execute. Executing the whole script (F9 in Spy) does not change the data in the PersonTable; it presents you with a table to compare the draft new Note to the original|current Note. You have to select|highlight the UPDATE statement in the comments at the bottom of the script and execute just it (Ctrl+F9) to apply the new Note to the PersonTable. I set it up that way so that unwanted outcomes might be found and mitigated by editing the original Note BEFORE the database is changed.
Inspect the draft NewNote against the OldNote.
If satisfied with the results, select the first UPDATE statement at the end of the script and execute it (Ctrl+F9 in Spy). See screenshot below. You could add a constraint to the query to update only to the last Person you have reviewed, assuming you are working in order of ascending RIN: AND PersonID <= last_PersonID_checked;
If you wish to revert to the original note, select the second UPDATE statement at the end of the script before running the script again (which will update the temporary table PersonNoteOriginal) or before closing the database out of sqlite.
The three temporary views and the temporary table created by the script won’t empty until the script is run a second time, reflecting that the table has been emptied of those with the MERGED NOTE flag; they will all be dropped on closing the database from the SQLite manager.
With all the excitement surrounding how well Artificial Intelligence is doing to write papers that can fool teachers and having seen someone post a message about using it to generate some software coding, I was curious to see what the ChatGPT interactive AI research tool could say about creating a SQLite query of a RootsMagic database. The answer, at the moment, is ‘not much’ but it does tell you how to get started with the SQLite Command-Line Interface (CLI). The composition itself is very good!
How do I query my RootsMagic sqlite database?
You can use the SQLite command-line interface to query your RootsMagic database. To do this, follow these steps:
Open a terminal or command prompt window and navigate to the directory where the SQLite executable is located.
Connect to your RootsMagic database by typing the following command, replacing “database_file” with the path to your database file:
sqlite3 database_file
Once you are connected to the database, you can run SQL queries against it by typing them into the command prompt and hitting enter. For example, to get a list of all tables in the database, you can run the following query:
SELECT name FROM sqlite_master WHERE type='table';
To see the schema for a particular table, you can use the .schema command, followed by the name of the table. For example:
.schema people
To see the contents of a table, you can use the SELECT statement. For example:
SELECT * FROM people;
This will display all rows and columns in the “people” table. You can use various clauses in your SELECT statement, such as WHERE, GROUP BY, and ORDER BY, to filter and sort the results.
I hope this helps! Let me know if you have any questions. (How polite and generous is that!?)
What database engine is used for a RootsMagic 7 database?
RootsMagic 7 uses the Firebird (Oops! got that wrong- that’s pre-RM4) open-source database engine to store and manage data in its databases. Firebird is a powerful and reliable database engine that is well-suited for use in genealogy software like RootsMagic. It is capable of handling large amounts of data and can support a wide range of data types and structures. Firebird is also designed to be easy to use and maintain, making it a good choice for software like RootsMagic that is intended for use by non-technical users.
Firebird is based on the InterBase database engine, which was developed by Borland in the 1980s. Firebird was created as an open-source fork of InterBase in the early 2000s, and has since become a popular choice for a variety of applications that require a robust and reliable database engine.