Unlocking Your RootsMagic 10 Data: A Deep Dive into SQLite Queries (with AI)

I just finished a session with the free Google Gemini Flash 2.5 model to develop a quite complex query to identify couples in a RootsMagic database who are cousins of each other (see the discussion that led to it in the Forum at https://sqlitetoolsforrootsmagic.com/forum/topic/cousin-marriages/). That was preceded a few months back by a session with DeepSeek trying to develop a query that could act as an alternative to RM’s Set Relationships tool – it was not as successful but it might be more so now as its model evolves and grows. Anyway, that caused me to ask Gemini some questions about how to effectively use it to develop SQLite3 queries for RootsMagic databases. That “conversation” can be read in full at https://g.co/gemini/share/b9ef037e35c6 but I asked it to generate a summary for this post, pasted below.


Unlocking Your RootsMagic 10 Data: A Deep Dive into SQLite Queries

RootsMagic 10, like its recent predecessors, stores all your precious genealogical data in a robust SQLite database file (typically with a .rmtree extension). This self-contained, file-based relational database is embedded directly within the RootsMagic software, making it incredibly efficient for personal use.

While RootsMagic’s interface is fantastic for managing your family tree, sometimes you need to dig deeper – to run custom reports, analyze specific data patterns, or extract information in ways the software doesn’t natively offer. This is where direct SQLite querying comes in.

Key takeaways for RootsMagic users:

  1. SQLite is Your Data’s Foundation: Your entire family tree, including persons, events, sources, citations, and media, resides within a structured SQLite database.
  2. Access with Caution (and Backups!): You can explore your .rmtree file using third-party SQLite tools (like DB Browser for SQLite or SQLiteSpy). Always create a full backup of your database before attempting any direct access or modification!
  3. Understanding the Schema is Key: To write effective queries, you need to know the specific table names (e.g., PersonTable, EventTable, SourceTable) and column names (e.g., FirstName, LastName, Date1, Title) within the RootsMagic schema. These are the “building blocks” of your queries.
  4. How AI Can Help (with your input): AI models like Gemini can assist in generating complex SQLite queries. However, because they don’t have real-time access to your specific database’s schema, you’ll need to provide the relevant table and column names to ensure the queries are accurate.
  5. Leverage Context in Conversations: If you’re working with an AI, you can generally continue to refine and develop multiple queries within the same chat session, as the AI will retain context of the schema details you’ve already provided. For new chat sessions, a quick reminder of the core table/column names will be helpful.

By understanding the SQLite foundation of RootsMagic and learning how to craft (or have an AI craft!) custom queries, you gain unparalleled power to analyze and utilize your genealogical data in new and exciting ways. Just remember the golden rule: Always back up your database before you begin!


That summary of the responses to my prompts up to that point of the session really missed a lot of what Gemini said so it’s worth delving into the conversation for more insight into points 4 & 5 and how an AI might become repetitive or forget or veer off due to an error in input or interpretation, as happened in my session with DeepSeek and even to a degree with Gemini that I was able to catch and correct.

I went on in the conversation with Gemini to discover that content lifted from the RM10 Data Dictionary in CSV format and pasted into the Chat could provide it with important parts of the database schema that it readily understood. Until now, I have been typing in the bare minimum description to provide the context with which the AI engine is to work. If a comprehensive description of the schema can be easily provided to the AI engine, that will make development of custom queries much easier. At the moment, the prompt capacity is not large enough to accept the entire sheet named “RM10 Data Def Project” as CSV text.

The raison d’etre of this website is to “Exploit your RootsMagic family tree database with SQLite Tools”. AI assistance in creating new and more powerful “SQLite Tools” looks to be within reach at no or low $cost for those with a good understanding of the RootsMagic database design and SQL.

7 Replies to “Unlocking Your RootsMagic 10 Data: A Deep Dive into SQLite Queries (with AI)”

  1. Coincidently, I have been using ChatGPT to come up with Python functions to help clean up the PlaceTable. My PlaceTable has nearly 7000 rows. Along the way, I’m normalizing the names, stripping nonsensical data from the name and street addresses, and merging the duplicates. (Taking care not (mostly) lose an data and updating referencing tables).

    Along the way, I have discovered over 200 PlaceIDs that are not used. That is, referenced elsewhere. A few of them are mine, but over 95% of them are like this:

    This PlaceID 180 is not referenced: name: “SNOWF – Snowflake Arizona”
    This PlaceID 181 is not referenced: name: “SPAUL – São Paulo Brazil”
    This PlaceID 182 is not referenced: name: “SPEDR – San Pedro Sula Honduras”
    This PlaceID 183 is not referenced: name: “SPMIN – St. Paul Minnesota”
    This PlaceID 184 is not referenced: name: “SPOKA – Spokane Washington”
    This PlaceID 185 is not referenced: name: “STARV – Star Valley Wyoming”

    I’m going to delete them. Might these be inherent to the RM database, or maybe the debris came from elsewhere, like ancestry.com? The “CODE” seems to suggest that thes may have come from some GEDCOM along the way.

    • Great to hook up with a fellow traveller! I’m (almost) illiterate with Python and other higher level languages which have been suggested by both Gemini and DeepSeek when they’ve gone through multiple attempts directly in SQLite that I’ve replied to as unsuccessful. But their SQLite code creation is so far beyond my skills, I may have to acquiesce and resort to one of them.

      Not at my computer and not sure what “referenced” means. Is that from your script? Anyway, those look like LDS Temples. There is a much longer set in Place table and the set gets updated occasionally in RM updates. So as not to cause a possible problem with updates, it would be logical to keep them.

      • My development tree is here: https://github.com/mooredan/rmtree_py
        ChatGPT authored much of the code. rmutils.py contains some general functions.

        I program, but am new to Python. Seems to be all the rage nowadays.

        “Referenced” means that another Table has a column that refers to a PlaceID. For instance, the EventTable has a column: “PlaceID” which is a foreign key. If it contains 0, then there is no PlaceID associated with the event. Non-zero refers to a PlaceID in the PlaceTable. If a PlaceID is deleted, then the EventTable entry that referenced the PlaceID is set to 0. Additionally, the UTCModDate column in the EventTable is updated as well. Other tables that may reference a PlaceID are: MediaLinkTable, TaskLinkTable, and URLTable (that’s all that I, and ChatGPT, could find).

        As for the uppercase coded place names, you are quite right. I haven’t checked all yet, but they are PlaceType = 1, which is “LDS Temples, Built-In”: e.g.

        This PlaceID 181 is not referenced: name: “SPAUL – São Paulo Brazil”

        ==== PLACE USAGE REPORT FOR PlaceID: 181 ====

        — PlaceTable:
        PlaceID: 181
        PlaceType: 1
        Name: SPAUL – São Paulo Brazil
        Abbrev: SPAUL
        Normalized:
        Latitude: -235854500
        Longitude: -467230694
        LatLongExact: 0
        MasterID: 0
        Note:
        Reverse: SPAUL – São Paulo Brazil
        fsID: 0
        anID: 0
        UTCModDate: 45829.61855771991

        ==== END REPORT ====

        It may be prudent to keep them around, but ignore them when it comes to name normalization, or PlaceTable row deletion, reporting, etc… (keep ’em hidden). Thanks for the heads up.

        BTW: The report above was generated by ChatGPT’s Python code — I didn’t touch it at all, just dropped it in and worked.

  2. My current process is reduced the number of PlaceIDs by nearly 2000:

    # wc (line count)
    $ wc -l places-before.log places-after.log
    6829 places-before.log
    4837 places-after.log

    Sample results:

    places-before.log:[280] Rock Island, Illinois
    places-before.log:[1786] Lamont, Whitman County, Washington, USA
    places-before.log:[2047] Tribune, Missouri
    places-before.log:[2900] Fort Lauderdale, Broward County, Florida, United States of America
    places-before.log:[3803] Carthage, MO
    places-before.log:[5467] 1209 21st Ave., Rock Island
    places-after.log:[280] Rock Island, Illinois, USA
    places-after.log:[1786] Lamont, Whitman, Washington, USA
    places-after.log:[2047] Tribune, Missouri, USA
    places-after.log:[2900] Fort Lauderdale, Broward, Florida, USA
    places-after.log:[3803] Carthage, Missouri, USA

    Note: 5467 had its address stripped and was merged into 280.

    There’s still work to do, I have about 200 entries to address that are a bit “weird”.

  3. It’s a work in progress and hasn’t been tested on Windows yet. Richard Otter has some instructions getting Python running on Windows on his github, but I haven’t gone down that path.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.