Breakthrough! Something I've wanted for as long as I can remember is a means of detecting lineage loops (an ancestor is their own descendant) in family tree databases created with RootsMagic. This is my second major interaction with Google Gemini Flash 2.5 which took 18 attempts (only 18!) with lots of testing to nail down what I think is amazingly smart code that I could never have conceived and don't fully understand. It's robust and highly efficient.
Screenshot from SQLiteSpy of two loops from a large database.
Child in Problematic Family Link
Full Loop Path (IDs)
Problematic Father (Descendant of Child)
Problematic Mother (Descendant of Child)
Family Where Loop Occurs
John Zzzzz-42789
42789->42789
N/A
N/A
John Zzzzz-42789=N/A
Duane Michael Xxxxxx-44066
44066->44078->44066
Anthony Michael Xxxxxx-44078
N/A
Anthony Michael Xxxxxx-44078=Verna Mildred Yyyyy-44058
Gemini generated the following description, covering what the script does, its design principles, and expected performance:
RM-LineageLoopDetector.sql: Uncovering Lineage Loops in RootsMagic Databases
Genealogical databases, especially those built over many years or compiled from various sources, can sometimes contain logical inconsistencies known as "lineage loops" or "pedigree loops." These occur when an individual is recorded as being their own ancestor, or an ancestor of their parent, creating an impossible cycle in the family tree. While RootsMagic is designed to prevent some direct entry errors, more complex or indirect loops can slip through.
The RM-LineageLoopDetector.sql script is a powerful SQLite query specifically designed to identify and report these problematic lineage loops within your RootsMagic database.
What it Does
This script scans your RootsMagic database to find instances where:
A person is recorded as their own parent/ancestor (self-loop).
A person's parent is also a descendant of that person (a more complex cycle).
For each loop detected, the script provides clear, actionable information:
Child in Problematic Family Link: The individual whose family relationship initiates the discovered loop.
Problematic Father / Mother (Descendant of Child): Identifies which parent in the problematic family is also a descendant of the child.
Family Where Loop Occurs: Provides a user-friendly representation of the family (e.g., "Father's Name-ID=Mother's Name-ID") that contains the specific link causing the loop. This is crucial for RootsMagic users, as FamilyIDs are not visible in the application's interface.
Full Loop Path (IDs): A sequence of PersonIDs illustrating the exact path taken to complete the loop, making it easy to trace the issue within RootsMagic.
The script is carefully designed to report only one canonical instance for each unique lineage loop, avoiding redundant results even if multiple paths within the loop could technically trigger detection.
How it Was Designed
The development of this script involved an iterative process to balance accuracy, performance, and user-friendliness:
Recursive Common Table Expressions (CTEs): The core of the solution leverages SQLite's WITH RECURSIVE functionality. A CTE named CycleFinder performs a depth-first search through the family tree's parent-child relationships.
Targeted Loop Detection: Instead of building a complete ancestor list for every person (which is inefficient for large databases), CycleFinder starts its search from every direct child-parent link in your database. It then attempts to trace ancestors from the parent's side, looking for the original "child" that initiated that specific path.
Efficient Cycle Prevention: Early iterations struggled with performance due to string manipulation (INSTR function) within the recursive path tracking. The current design judiciously uses INSTR to prevent infinite recursion and detect general path revisiting, but the primary loop detection is based on the current_person_id eventually equalling the start_person_id (the child who began the trace). This balance significantly improved performance.
Canonical Loop Reporting: A RankedLoops CTE was introduced to address the challenge of reporting a single, meaningful instance for each unique loop. This CTE uses a sophisticated PARTITION BY clause that extracts all unique PersonIDs involved in a detected loop, sorts them, and concatenates them into a consistent "canonical loop identifier." ROW_NUMBER() then selects the shortest or lowest-FamilyID loop from that canonical set, ensuring clean, non-redundant output.
User-Centric Output: Recognizing that RootsMagic users don't see FamilyIDs, the final SELECT statement includes intelligent joins to the NameTable to convert FamilyIDs into human-readable "FatherName-ID=MotherName-ID" format, directly assisting users in locating and correcting errors within the RootsMagic application.
What Performance to Expect
The script is optimized for performance within the constraints of SQLite's recursive CTE capabilities:
Small to Medium Databases: For databases up to tens of thousands of records, you can expect the script to execute very quickly, often in a matter of milliseconds to a few seconds.
Large Databases: For very large databases (e.g., hundreds of thousands of Child and Family records), the execution time will naturally increase. Recent tests on a database with 274,000 ChildTable records and 159,000 FamilyTable records completed in approximately 3 minutes 45 seconds on a 7-year-old Intel i5 processor. While this isn't instantaneous, it is a very reasonable time for performing a comprehensive graph cycle detection on such a scale.
Factors Affecting Performance: Your computer's processing speed (CPU, RAM, SSD vs. HDD) and the depth/complexity of your family tree (very long, unbroken lineages will require deeper recursion) will influence actual execution times.
In summary, the RM-LineageLoopDetector.sql script provides an efficient and user-friendly way to maintain the integrity of your RootsMagic family history data by accurately pinpointing and describing otherwise hidden lineage loops.
If you've read this far and want more detail on the development process, you should view the dialog I had with Gemini at https://g.co/gemini/share/2cdd5db67e6f.
Breakthrough! Something I've wanted for as long as I can remember is a means of detecting lineage loops (an ancestor is their own descendant) in family tree databases created with RootsMagic. This is my second major interaction with Google Gemini Flash 2.5 which took 18 attempts (only 18!) with lots of testing to nail down what I think is amazingly smart code that I could never have conceived and don't fully understand. It's robust and highly efficient.
Screenshot from SQLiteSpy of two loops from a large database.
Child in Problematic Family Link
Full Loop Path (IDs)
Problematic Father (Descendant of Child)
Problematic Mother (Descendant of Child)
Family Where Loop Occurs
John Zzzzz-42789
42789->42789
N/A
N/A
John Zzzzz-42789=N/A
Duane Michael Xxxxxx-44066
44066->44078->44066
Anthony Michael Xxxxxx-44078
N/A
Anthony Michael Xxxxxx-44078=Verna Mildred Yyyyy-44058
Gemini generated the following description, covering what the script does, its design principles, and expected performance:
RM-LineageLoopDetector.sql: Uncovering Lineage Loops in RootsMagic Databases
Genealogical databases, especially those built over many years or compiled from various sources, can sometimes contain logical inconsistencies known as "lineage loops" or "pedigree loops." These occur when an individual is recorded as being their own ancestor, or an ancestor of their parent, creating an impossible cycle in the family tree. While RootsMagic is designed to prevent some direct entry errors, more complex or indirect loops can slip through.
The RM-LineageLoopDetector.sql script is a powerful SQLite query specifically designed to identify and report these problematic lineage loops within your RootsMagic database.
What it Does
This script scans your RootsMagic database to find instances where:
A person is recorded as their own parent/ancestor (self-loop).
A person's parent is also a descendant of that person (a more complex cycle).
For each loop detected, the script provides clear, actionable information:
Child in Problematic Family Link: The individual whose family relationship initiates the discovered loop.
Problematic Father / Mother (Descendant of Child): Identifies which parent in the problematic family is also a descendant of the child.
Family Where Loop Occurs: Provides a user-friendly representation of the family (e.g., "Father's Name-ID=Mother's Name-ID") that contains the specific link causing the loop. This is crucial for RootsMagic users, as FamilyIDs are not visible in the application's interface.
Full Loop Path (IDs): A sequence of PersonIDs illustrating the exact path taken to complete the loop, making it easy to trace the issue within RootsMagic.
The script is carefully designed to report only one canonical instance for each unique lineage loop, avoiding redundant results even if multiple paths within the loop could technically trigger detection.
How it Was Designed
The development of this script involved an iterative process to balance accuracy, performance, and user-friendliness:
Recursive Common Table Expressions (CTEs): The core of the solution leverages SQLite's WITH RECURSIVE functionality. A CTE named CycleFinder performs a depth-first search through the family tree's parent-child relationships.
Targeted Loop Detection: Instead of building a complete ancestor list for every person (which is inefficient for large databases), CycleFinder starts its search from every direct child-parent link in your database. It then attempts to trace ancestors from the parent's side, looking for the original "child" that initiated that specific path.
Efficient Cycle Prevention: Early iterations struggled with performance due to string manipulation (INSTR function) within the recursive path tracking. The current design judiciously uses INSTR to prevent infinite recursion and detect general path revisiting, but the primary loop detection is based on the current_person_id eventually equalling the start_person_id (the child who began the trace). This balance significantly improved performance.
Canonical Loop Reporting: A RankedLoops CTE was introduced to address the challenge of reporting a single, meaningful instance for each unique loop. This CTE uses a sophisticated PARTITION BY clause that extracts all unique PersonIDs involved in a detected loop, sorts them, and concatenates them into a consistent "canonical loop identifier." ROW_NUMBER() then selects the shortest or lowest-FamilyID loop from that canonical set, ensuring clean, non-redundant output.
User-Centric Output: Recognizing that RootsMagic users don't see FamilyIDs, the final SELECT statement includes intelligent joins to the NameTable to convert FamilyIDs into human-readable "FatherName-ID=MotherName-ID" format, directly assisting users in locating and correcting errors within the RootsMagic application.
What Performance to Expect
The script is optimized for performance within the constraints of SQLite's recursive CTE capabilities:
Small to Medium Databases: For databases up to tens of thousands of records, you can expect the script to execute very quickly, often in a matter of milliseconds to a few seconds.
Large Databases: For very large databases (e.g., hundreds of thousands of Child and Family records), the execution time will naturally increase. Recent tests on a database with 274,000 ChildTable records and 159,000 FamilyTable records completed in approximately 3 minutes 45 seconds on a 7-year-old Intel i5 processor. While this isn't instantaneous, it is a very reasonable time for performing a comprehensive graph cycle detection on such a scale.
Factors Affecting Performance: Your computer's processing speed (CPU, RAM, SSD vs. HDD) and the depth/complexity of your family tree (very long, unbroken lineages will require deeper recursion) will influence actual execution times.
In summary, the RM-LineageLoopDetector.sql script provides an efficient and user-friendly way to maintain the integrity of your RootsMagic family history data by accurately pinpointing and describing otherwise hidden lineage loops.
If you've read this far and want more detail on the development process, you should view the dialog I had with Gemini at https://g.co/gemini/share/2cdd5db67e6f.
When I post to the Blog, I can set the editor to cross-post to the Forum so that's why you see the Blog post here. Commenting on the Blog is more restrictive than in the Forum so that's why I've cross-posted.
I'm really impressed with what Gemini is able to do for me in generating such complex queries from my instructions and feedback on results of tests. And it's probably learning a thing or two about RootsMagic in the process that may help others.
Do I know that RM-LineageLoopDetector finds all loops? Nope. It has found the loops I knowingly created and many that others did, unknowingly. So I'd welcome your feedback on results from databases you have.
When I post to the Blog, I can set the editor to cross-post to the Forum so that's why you see the Blog post here. Commenting on the Blog is more restrictive than in the Forum so that's why I've cross-posted.
I'm really impressed with what Gemini is able to do for me in generating such complex queries from my instructions and feedback on results of tests. And it's probably learning a thing or two about RootsMagic in the process that may help others.
Do I know that RM-LineageLoopDetector finds all loops? Nope. It has found the loops I knowingly created and many that others did, unknowingly. So I'd welcome your feedback on results from databases you have.
Very impressive. It ran in under 4 seconds on my database of 40,000 people and detected no loops, which is what I would have expected.
I have obviously not yet played with this Gemini thing and SQLite. But I have been playing with ChatGPT in association with the programming I do in C++. It produces some pretty amazing results. However (and I don't think this can be prevented), the quality of the results I get seem to be highly dependent on the skill I have in composing the problems and questions. Also, I do have to "correct it" from time to time, which not all users would have the background to be able to do. But when I do so, it immediately seems to "understand" and to make the needed mid-course corrections in what it is doing.
I really wish Bruce might someday support a "plug-in" view, where the view would run "plug-ins" which would be useful SQLite queries such as this one. Presumably, by running in the friendly confines of the actual SQLite environment, there shouldn't even be any strangeness about a fake RMNOCASE collating sequence. The plug-ins would be using the real one.
Very impressive. It ran in under 4 seconds on my database of 40,000 people and detected no loops, which is what I would have expected.
I have obviously not yet played with this Gemini thing and SQLite. But I have been playing with ChatGPT in association with the programming I do in C++. It produces some pretty amazing results. However (and I don't think this can be prevented), the quality of the results I get seem to be highly dependent on the skill I have in composing the problems and questions. Also, I do have to "correct it" from time to time, which not all users would have the background to be able to do. But when I do so, it immediately seems to "understand" and to make the needed mid-course corrections in what it is doing.
I really wish Bruce might someday support a "plug-in" view, where the view would run "plug-ins" which would be useful SQLite queries such as this one. Presumably, by running in the friendly confines of the actual SQLite environment, there shouldn't even be any strangeness about a fake RMNOCASE collating sequence. The plug-ins would be using the real one.
I really wish Bruce might someday support a "plug-in" view, where the view would run "plug-ins" which would be useful SQLite queries such as this one. Presumably, by running in the friendly confines of the actual SQLite environment, there shouldn't even be any strangeness about a fake RMNOCASE collating sequence. The plug-ins would be using the real one.
we could hope -- not sure Bruce/Team would support that. Family Historian uses plugins.
Also, I do have to "correct it" from time to time, which not all users would have the background to be able to do. But when I do so, it immediately seems to "understand" and to make the needed mid-course corrections in what it is doing.
yes -- I would agree you need some basic understanding. The better your understanding you should be able to "work with AI" to improve it to get the desired result. When I first tried it last week have Tom introduce "us". I asked it to based off the RootsMagic database -- it made a couple wrong assumptions -- like the birth date info is not on PersonTable but it got really close on first try.
I really wish Bruce might someday support a "plug-in" view, where the view would run "plug-ins" which would be useful SQLite queries such as this one. Presumably, by running in the friendly confines of the actual SQLite environment, there shouldn't even be any strangeness about a fake RMNOCASE collating sequence. The plug-ins would be using the real one.
we could hope -- not sure Bruce/Team would support that. Family Historian uses plugins.
Also, I do have to "correct it" from time to time, which not all users would have the background to be able to do. But when I do so, it immediately seems to "understand" and to make the needed mid-course corrections in what it is doing.
yes -- I would agree you need some basic understanding. The better your understanding you should be able to "work with AI" to improve it to get the desired result. When I first tried it last week have Tom introduce "us". I asked it to based off the RootsMagic database -- it made a couple wrong assumptions -- like the birth date info is not on PersonTable but it got really close on first try.
Requires a RM group named “SQLite Lineage Loop Detector” to contain the people (from 1 to everyone) for whom an ancestral loop is to be sought. Faster than the original when the group is small; better for tackling a very large database. Also returns a “No loop detected” message. Compatible with RM8-RM10 databases. 2025-07-27
Requires a RM group named “SQLite Lineage Loop Detector” to contain the people (from 1 to everyone) for whom an ancestral loop is to be sought. Faster than the original when the group is small; better for tackling a very large database. Also returns a “No loop detected” message. Compatible with RM8-RM10 databases. 2025-07-27