Full length blog

Query to Search Parents by Children’s Given Names #search #family

The following script was developed in response to the discussion Search for family by given names of children (was RM-8)

-- three_childrens_names.sql
-- Jerry Bryan, 22 May 2020
-- This query searches for parents based on given names of their children.
-- It gets a hit when three or more children have any primary or alternate
-- given names that are in a hardwired list. The report lists the parents.

SELECT COUNT(C.ChildID) AS fam_size,
F.FatherID, father.Given, father.Surname, F.MotherID, mother.Given, mother.Surname
FROM ChildTable AS C
JOIN
(SELECT DISTINCT NN.OwnerID
FROM NameTable AS NN
WHERE NN.Given LIKE("%Mary%") OR NN.Given LIKE ("%John%") OR NN.Given LIKE("%Elizabeth%"))
AS N ON N.OwnerID = C.ChildID
JOIN
FamilyTable AS F ON F.FamilyID = C.FamilyID
JOIN
NameTable AS father ON father.OwnerID = F.FatherID
JOIN
NameTable AS mother ON mother.OwnerID = F.MotherID
GROUP BY C.FamilyID
HAVING fam_size >= 3
ORDER BY father.Surname, father.Given, mother.Surname, mother.Given;

Jerry

Updating to WordPress 5.4.1

I’m going to attempt an update from WordPress 4.9.14 to 5.4.1 sometime today. The disruption will either be minimal or disastrous. I’ve made a backup of the files and database. But I’m no WP expert and things may get broken. I think all the plugins have been tested to the higher level. Wish me luck!

Aargh! Systemic Error re: GroupID

I've just discovered that I caused a systemic error in several scripts dealing with Groups in relating the LabelTable.LabelID to the GroupTable.GroupID when the latter is actually related to LabelTable.LabelValue. It's of no consequence when groups are consecutively created but not deleted because the LabelID and LabelValue match. The problem arises when a group before the latest created is deleted; the LabelID for a subsequently created group could point to the wrong or a non-existent GroupID.

Over the next while, I will try to find and correct those scripts. A starting point for identifying them would be a site search using both terms "LabelID" and "GroupID".

UPDATE 2020-05-16: not as bad as I feared. The original error was corrected in many but not all scripts in 2011. Here are the exceptions that have been found and are now corrected:

Copy RIN to REFN and embedded script
Copy Fact to Group and CopyFact2Group.sql
Census Needed: Named Group and CensusNeededGroup2.sql

Names – Set Alternate of type Birth to Primary #names #alternatenames #birth

This script does two things:

  1. For those persons having one Alternate Name of type “Birth” and a Primary name of “undefined” type, it sets the Alt name as the Primary name and changes the previous Primary name to an Alternate Name.
  2. For those persons having more than one Name of type “Birth”, it lists the Record Number (RIN) and Surname and Given Names so that the user may resolve the ambiguity down to one so that the script may swap it on the next pass.

If the conclusion is that the current Primary (birth) name is the preferred name, the user may choose to delete the Alternate Name, modify its name type or leave it as is. Either way, the script will not change the Primary to Alternate unless the Primary type is “undefined”.

Sources and media remain attached to the name, whatever its Primary/Alternate state is. However, until RM8, those attached to the Primary Name are essentially inaccessible and unusable (see Citations Invisible – Reveal and Revert).

Download

Names-Swap_AltBirth_to_Primary.sql

 

Combining the contents of 2 queries on the same file

I am having trouble setting up this query.

 

I want the results of the first query to merge/join/union with the second query where the ownerid from the first query equals the ownerid from the second query. The bold line should not be part of the query. I put it in to get the records from the second that equal the ownerid from the first query.

Select n1.ownerid, n1.surname, n1.given, n1.nametype, n1.prefix, n1.isprimary

from nametable n1

where n1.nametype = 1

and n1.isprimary = 0

order by surname, given

I get the results in Pic 1

 

select n2.ownerid, n2.surname, n2.given, n2.nametype, n2.prefix, n2.isprimary

from nametable n2

where n2.nametype = 0

and n2.isprimary = 1

and ownerid in(10518, 10519, 11626, 11976, 3536, 11805, 10773, 11655, 16605, 11802, 255, 129, 130, 15721, 9046, 11495, 11531)

order by surname, given

I get the results in Pic 2

I want the results of query 1 to be combined with the results of query 2 where the ownerids from the first query are only the ownerids of the second query.

So I get something like Pic 3

Privatise Extra BMD Events #events #reports #private

Problem

Many persons in my database have multiple Birth events created from different sources; likewise for other vital facts. Even though I may have marked one as Primary, the others still clutter reports: narratives look like repetitious gobbledygook; calendars have the same person multiple times, even on the same day. I want to clean up the publishing mess without losing the evidence.

This is especially the case for databases downloaded from an Ancestry Member Tree where events are generated by accepting or assigning a source and choosing to make the data into an alternate fact because it varies in some way from an existing fact.

Solution

Many report settings in RootsMagic have an option to include or exclude private events. I could reduce report clutter by setting the non-Primary Birth events to Private without losing the data that led to their creation. The Primary event should have whatever consideration and logic went into it being deemed the most accurate. I haven’t set a Primary in every case where my database has multiples so I want two things:

  1. A batch process that sets Private all the non-Primary events of a set having a Primary.
  2. A list of all the sets of multiple events that do not have a Primary so I can follow up.

Individual Summary with multiple Birth events, highlighted.

Individual Summary excluding Private events, in this case, the extra Birth events.

Implementation

BMD_private_multiples.sql 2020-02-15 Click to view; right-click to download.

The above SQLite script builds a series of temporary SQLite Views (stored queries), does the possible privatising and exits displaying the View “BMDmultstofix” which lists the persons needing follow up attention to set Primary one of the plural (multiple) events. The Views are all in memory and are lost when the SQLite manager exits the database.

Usage

The script executes very quickly on small files so it is possible to keep the SQLite manager open on the database and to re-execute the script at intervals while editing the database in RootsMagic. One could simply set the Primary event out of a group of multiple events for a person in RM’s Edit Person screen and then run the script to set the others Private or manually set them. Re-running the script will shrink the list needing Primaries by those you have set.

For family-type events (Marriage), the View “BMDmultstofix” will list both spouses separately. When the Marriage event is set to Primary on either spouse, the two separate listings will go.

Possible Enhancements

Colour Coding

The list of persons needing attention could be quite large, depending how assiduously you have been setting the Primary flag and the number of people and events in your database. So it might be nice to extend the last query or even last View to pick up the color code for each person which could be used to sort or filter the list.

Private as Default

Instead of privatising only those non-Primary events in groups having a Primary, another approach would be to privatise all non-Primaries. That would really clean out the reports and require user editing to get the desired outputs. This may be preferable.

Enhanced Version

BMD_private_multiples2.sql 2020-02-16

Both of the hypothesized enhancements have been incorporated. To ‘fix’ a group of multiples, one of the events must be set Primary and un-Privatised or another event added that consolidates one’s research into an assertion. By default, a new event is not Private and need only be made Primary.

In SQLiteSpy, any column can be sorted by clicking on the heading. Some other software, SQLite Expert is one example, also support filtering by column. These tools enable grouping color code or by surname without having to modify the script or write your own custom query of BMDmultsToFixClr view.

Screenshot

Screenshot from SQLiteSpy of results from running BMD_private_multiples2.sql

New RootsMagic-Users Mailing List/Forum

RootsWeb is deactivating its Mailing List service on March 2, 2020, but leaving the archive of each List online, for now. A successor has been set up on Groups.io, operated by users. The old and the new list appeal to those who refuse to participate in the now 21,000 member facebook group and others for various reasons. Moreover, RootsMagic Inc. has barred the door to new subscribers to its Forums.

Trying to find duplicate spouses

Contents

    I am trying to find duplicate spousess.
    I created the following query:
    select fatherid,
    (select n1.surname || “, ” || n1.given from nametable n1 where fatherid = ownerid) as fathername,
    motherid,
    (select n2.surname || “, ” || n2.given from nametable n2 where motherid = ownerid) as mothername
    from familytable
    order by fatherid, motherid;

    1. Should there be blank fatherid’s for a motherid in the result?
    2. Or likewise blank motherids for a fatherid in the results?
    3. Am I safe to think that if there are 2 motherid’s for the same fatherid that it is a duplicate?

    Backup database open in both SQLite and RootsMagic #backup

    Problem

    Neither the RootsMagic File>Backup nor File>Copy can proceed because the database file “is being used by another process” (RootsMagic’s own or your SQLite manager or other outboard SQLite connection). And if you’ve chosen Backup, RM gracelessly blocks you from further work with an always-on-top fake news message window “Creating backup file…” requiring closure of RM to clear.

    If you were using a SQLite manager that’s connected to the database file, you have to close it in order to use RM’s utilities and then reopen it with numerous steps to finally execute the SQL you’ve been carefully crafting to make some important changes to your data. That disruption takes time, may lose some valuable history from the SQLite manager and, perhaps, derail your train of thought. Same goes for ordinary users getting thrown off track after running Compact and other unknown things that screw up the RM backup.

    Outboard zip utilities such as 7Zip, RAR, WinZip,… all, to the best of my knowledge, require the closure of all other processes for them to proceed but do handle such a blockage gracefully, unlike RM4-7. It’s not sufficient that RM get to handle the blockage gracefully; it shouldn’t be blocked at all!

    Some Workarounds

    Looking into this issue in 2019, I’ve discovered some useful alternatives to RM’s backup and copy utilities that have arisen since the days of RM4 and Windows XP in 2009/10 when the original SQLite Tools for RootsMagic 4 Wiki began. SQLite Expert Professional lists “Backup” as a feature that likely uses the Online Backup API, perhaps the best suited to the task but at a price.

    This post presents four free, fast backup methods that allow the database to remain open in RootsMagic and in one (or more) outboard SQLite connections. They have been tested with a database open in RM7.5.9 and simultaneously in three SQLite managers: SQLite Expert Personal 4.2.0.784, SQLiteSpy 1.9.12 and SQLiteStudio  3.2.1.

    1. Windows 10 File Explorer
    2. SQLite Command: VACUUM INTO
    3. SQLite Archive File
    4. Export to SQL Text file

    Windows 10 File Explorer

    My first surprise was to discover that the Zip function integrated in File Explorer (possibly from even before Win10) does not object to the database file being open. Nor does its copy function. The latter may be the easiest and fastest method.

    Copy

    With the File Explorer window open and the “filename.rmgc” database file selected, Copy (Ctrl+C) and Paste (Ctrl+V) results in “filename – Copy.rmgc”. Repeat and the copy is automatically serialized as

    filename – Copy (2).rmgc”
    filename – Copy (3).rmgc”

    No manual file naming is needed to prevent overwriting of the last copy and any copy can be immediately opened by RM or any of the SQLite apps. The Created Date timestamp for the copy is the Date Modified timestamp of the original, so its value puts the copy’s version of the data into context, provided you have some log (recorded or remembered) of what you did before and after.

    Zip

    Right-click on the selected “filename.rmgc” database file and, from the popup context menu, click on Send to > Compressed (zipped) folder. This will popup a dialog box to allow you to change the default name “filename.zip”.

    Repeat and File Explorer automatically serializes the name:
    filename (2).zip”
    filename (3).zip”

    You could change the zip filename extension to “.rmgb” at any of these backups so that RootsMagic could find and restore from it but I wouldn’t bother for two reasons:

    1. It interferes with serializing the filename to preserve successive backups.
    2. It’s easy to extract from the Zip folder using File Explorer which gives you great control over where the extracted file goes with opportunity to rename and move it afterward.

    Using File Explorer from within the apps

    All of the mentioned apps launch a basic Windows File Explorer dialog Window to Open a database or carry out some other operation. All support Copy, Paste and Send to Compressed (zipped) Folder. Because RM’s window filters for .rmgc files, it is quicker selecting the database file. Presumably, the speed of the subsequent operation is independent of the application that opened the File Explorer dialog. When the copy of zip is complete, simply Cancel the window to back out to where you were with the database.

    Copying or Zipping from the Command-Line or PowerShell

    If File Explorer could copy or zip the database file while open in RootsMagic and SQLite, it seemed reasonable that these other Windows utilities could, too. The Zip is only available in PowerShell but both can copy files. Unfortunately, both the CLI and PS detected another process and refused to proceed.

    SQLite command: VACUUM INTO

    The VACUUM command was provided in SQLite 3 and all of the referenced SQLite managers from the get-go but it operated directly on the database in memory. About two months ago (ver 3.27), the command was extended with an INTO clause that writes the result of the VACUUM to a new database file without modifying the original. Effectively, this means that one can make a vacuumed copy of the currently opened database from a SQLite manager as a backup. If the original has not been recently vacuumed by either RootsMagic or by an outboard SQLite manager, then the vacuumed copy will be smaller in size.

    The copy can be arbitrarily named and, if named with the .rmgc extension, can be opened directly by RootsMagic without going through a Restore process.

    Typing the command and filename and ensuring the latter is unique is less efficient than the File Explorer method. A potential productivity gain would be to write a Windows command or Powershell script to launch the current sqlite3.exe command line shell program to open the database file to VACUUM INTO a filename defined by the script; it could incorporate date and time in the filename. Alternatively, the batch or command script could also zip the resulting .rmgc file and give the zip file the unique name.

    After some time, the referenced SQLite managers will undoubtedly update to SQLite 3.27 to support the command and some may incorporate some higher level controls such as a file explorer dialog to facilitate the naming of the vacuumed file.

    This command should work in SQLite for both Windows and MacOS.

    For the SQLite 3 command line shell program, look for sqlite-tools-win32-x86-3270200.zip or later at https://www.sqlite.org/download.html. Also available for MacOS: sqlite-tools-osx-x86-3270200.zip.

    SQLite Archive File

    Since 2014, there has been development of a SQLite Archiver utility “sqlar” using SQLite and zlib to archive files of any type to a SQLite database container. The file contents are zipped to a Blob-type field in a table with one row for each archived file. In 2018, it has been made into an optional extension to the SQLite source code and is now integrated in the SQLite command-line shell tool mentioned above. It is possible for a SQLite manager to be compiled with this extension but I don’t know of any yet.

    The SQLite command-line tool from 3.22 can be opened and at its prompt given the “.ar” command to  create a SQLite Archive file, add|update files to it, list contents, and extract files from it. From 3.23, the recommended use is from the OS CLI to call sqlite3.exe with the -A option.

    In this example in sqlite3.exe, I updated the SQLite Archive file “_LGS_pres.db” with the database “_LGS_pres.rmgc” that was open in RM and the 3 SQLite managers.

    sqlite> .ar -uvf _LGS_pres.db _LGS_pres.rmgc
    _LGS_pres.rmgc

    The Archive file could contain many other files, e.g., the media files or other unrelated files, just as a Zip folder can.

    Moreover, the SQLite Archiver can operate as a zip utility producing a .zip file (“compressed folder” in Win10) instead of a SQLite Archive file, skipping over the locks that trip up RootsMagic’s backup and 7Zip.

    For more, see SQLite Archive Files.

    Export to SQL Text file

    This is an old method that has merit because it reads the SQLite database, not just copying it, to create SQL commands which can later be executed to build the identical database. Some SQLite managers support it (SQLiteStudio) and it is available in the command-line shell:

    Use the “.dump” command to convert the entire contents of a database into a single ASCII text file. This file can be converted back into a database by piping it back into sqlite3.

    The resulting text file is bigger than the database file but it can be piped into a zip utility for compression. Whether there is a limit that a large database might run into remains to be seen.

    For more, see Converting An Entire Database To An ASCII Text File

    Observations

    1. There are multiple ways to backup a RootsMagic database while it is open in both the RootsMagic application and one or more outboard SQLite connections.
    2. Using these other ways instead of the RootsMagic backup function may also avoid the recurring situation when RootsMagic blocks itself from completing a backup; a forced shutdown is the only way to resume working in RootsMagic.
    3. For the less technically proficient, Windows File Explorer provides a convenient alternative to the RM backup, even from within the RootsMagic user interface.
    4. RootsMagic developers should look into some of these alternatives that work with SQLite to resolve the conflict with outboard SQLite connections. Especially appropriate could be the Online Backup API because it is native to SQLite and aware of other users of the database, allowing them to “continue uninterrupted while a backup of an online database is made”.

    Caveats

    1. Testing these has been necessarily superficial so there may be issues as yet undetected. Please comment below if you come across any.
    2. Make sure that the database is not being modified by one of the connections while the ‘backup’ is going on. When the backup is opened, there could be unexpected results. This is especially true of those methods that work without a SQLite connection to the database, i.e., File Explorer, SQLite Archiver.
    3. Those methods that require a SQLite connection to the database, i.e., VACUUM INTO, Export and Online Backup API, need the RMNOCASE collation (see RMNOCASE; faking it in SQLite Expert, command-line shell et al). It is possible that the resulting database will fail the RootsMagic database integrity test; simply use its Rebuild Indexes tool to correct.

    Media Duplicates – Reports and Remedies

    Problem

    TreeShare update operations that add images from Ancestry to the RootsMagic database can result in successive downloads of the same image file saved to a revised name, each added to the Media Gallery and independently tagged to their respective subjects. I don’t think this is either necessary or desirable (maybe there is a good reason yet to be revealed by the RootsMagician!). Moreover, such duplication (or worse, multiplication) bloats usage of storage space and inflates the Media Gallery making it harder to manage, navigate, find things. The few respondents to my survey  reported 10-20% of their media files were duplicates while an exercise with a 10-person subset of my 420 person Ancestry Tree after many TreeShare ops trying to resolve differences was far higher. The discussion and scripts that follow help understand the magnitude of the bloat and offer surgical tools to reduce it. These tools reduced the 92 media in the Gallery to 49.

    Background

    The issue predates TreeShare and could arise from combining databases but it is likely that it had not been noticed as frequently or to the same degree. Earlier effort dating back to RM4 (see Media Repair Queries) concerned itself with duplicate file names in different folders, not with duplicate content within differently named files.

    RootsMagic places all TreeShare media from Ancestry in one folder for a given database. It does not attempt to substitute an existing duplicate media file for the one to be or is being downloaded; rather, it modifies the name of the downloading duplicate file with a unique suffix. Consequently, the old script would not even detect these differently named files with identical content, only those having the same name in separate folders. Moreover, it could generate false positives for files of the same name with different content leading to the deletion of potentially valuable information.

    Finding Duplicates

    By comparing RootsMagic thumbnails

    RootsMagic’s database has a property for each image file that can be used as a basis for finding duplicate image content – the thumbnail image it creates of the image file. This thumbnail is stored in the database in the Thumbnail field in the MultiMediaTable. However, because it is a tiny thumbnail, it is a crude approximation of the original and it may be possible that similar images, such as some pairs of Census pages or BMD Indexes could give rise to identical thumbnails. Experience will tell. It’s a very convenient parameter to exploit as seen in this query result from a tiny, test database.

    Examining the database’s MultimediaTable shows the duplicate file’s revised name:
     In this case, we can confidently surmise from the common part of the filenames AND the identical thumbnails that these undoubtedly have duplicate content.

    By using CCleaner

    Other files may not have strongly similar filenames despite identical thumbnails and, given the risk of false positives from the latter, a further test is required. The free CCleaner and other utilities can resolve if they have the same or different content; we’ll continue here with CCleaner and its Tools > Duplicate Finder feature set to search the folders of interest for image files that match each other only by Size and by Content. Here is a screenshot of its results:
    The highlighted names are of those in the database’s Media Gallery. The others are copies made using Windows File Explorer and are unused by (and therefore unrecorded in) the RM database. The script correctly reported the two matched files used by the database from their identical thumbnails.

    CCleaner exports its results as a tab-delimited text file which can be imported into a spreadsheet:

    CCleaner Duplicate Files report imported into Google Sheets; headers added by user.

    Combining the data from the RootsMagic database with that from the CCleaner Duplicate Files report should provide what is needed to identify and replace duplicates in the Media Gallery and to delete the then unused duplicate files from computer storage. Combining may be done by importing the CCleaner data into the RM file, temporarily, or by exporting RM data to the spreadsheet. What follows is the first approach.

    Migrate CCleaner Duplicate Search results to SQLite

    With the CCleaner data in the spreadsheet, formulas can be applied to create a SQLite INSERT statement for each reported file. We will create an empty temporary table with a corresponding structure into which the generated statements will insert values from the spreadsheet. I’ve created a publicly accessible Google Sheet which anyone can copy to their own spreadsheet to work on or to use directly.

    Scripts for finding Duplicates

    Before running any of these scripts, you should use RM’s File > Database Tools:

    1. Test Database Integrity
    2. If #1 fails, Rebuild Indexes and repeat #1
    3. Delete Phantoms

    And it is probably advisable to run the latest script from Delete Phantoms to clear out any detritus that may just confuse or exaggerate comparisons of Database Properties before and after the Merge Duplicates script below. Use your OS File Explorer/Manager to get the properties of the media folder(s) used by the database.

    Thumbnails

    Media-DupThumbnailsList.sql Make sure you open the RM Media Gallery and let thumbnail creation complete before running this script against your database.  It creates a temp table named zDupMediaTable which is displayed at the end of script execution to show you the duplicates found. The script makes no modifications to the database; this table will be used by the merge script.  Note that it ignores media files that are not image files or any file type that is not in the subject database’s Media Gallery. And beware that there is a risk of false positives because thumbnail images are of such low resolution. If you intend to proceed to merge duplicates, do not close the database from the SQLite manager as that will delete the temp table needed.

    Files

    Media-DupFilesList.sql You have to edit this script to include the INSERT commands you will copy from the spreadsheet before running it on the database. It creates an identically named temp zDupMediaTable to be used by the merge script and makes no changes to the database. Because Ccleaner does a bit-wise or byte-wise comparison, there is no chance of a false match between different images. However, it will not match files having exactly the same image data but a difference in just one character of metadata. If you intend to proceed to merge duplicates, do not close the database from the SQLite manager as that will delete the temp table needed.

    Merging Duplicates

    Media-MergeDuplicates.sql This is where the rubber hits the road and database changes are afoot. You have taken safety precautions? And your SQLite manager still has the temp zDupMediaTable in the database from either of the above scripts? This script won’t do anything without it. Fire away!

    When you next run TreeShare after script execution, all the people whose media has been affected by the merge and is tagged to anything related to the person will be marked as Changed and all previous Changed people will first have been cleared. Nothing material should have been changed by the script but flagging them so helps you to confirm that is the case.

    Cleanup

    Repeat the initial steps of RM Database Tools and Delete Phantoms  to clean out new detritus and then RM Database Properties for comparison of Before and After.

    Delete surplus duplicate media files

    The last step in the script lists the commands for deleting duplicate files from the media folder(s) that are no longer used by the database. Copy and paste them into a text editor for review and to avoid their loss if you have to close the database in the SQLite manager (some RootsMagic operations such as Backup and Restore cannot proceed if the database is open in another application). If you are confident that any or all are unused by any other database, copy and paste the respective ERASE commands into the CMD shell. They will execute immediately without asking for confirmation; the script could be modified to require a confirmation of each ERASE. For a large set, break up the command set into chunks.

    Repeat getting folder properties for the After results to compare with Before, i.e., number of media files and total size.

    Miscellaneous

    1. How well this procedure will scale with the size of the database is unknown. Certainly, once some large numbers of INSERTs and ERASEs is reached, there will be issues with copying and pasting and perhaps the size of the script for the INSERTs. Nor can we predict the speed with which the scripts will execute.
    2. How TreeShare functions remains a mystery in part. Events for a couple, such as Marriage, are especially problematic because we are advised that Ancestry handles them as separate individual events. It seems advisable, if not necessary, to have on the Ancestry Tree identical sources for the Marriage event for each person to be assured that RM will create the couple’s Marriage event with all the sources. And updating via TreeShare sometimes is a tail-chasing exercise in which ostensibly identical citations or media on both sides are perceived as new for the other with the update one way provoking the offer of a new one the other.
    3. The flagging of Changed People is done by setting the Modified field to 1 for records in LinkAncestryTable whose LinkType is 0. The script also sets the Modified field to 1 for records of LinkType 4 (citations) and 11 (media) but these have no effect on anything apparent in RM 7590.
    4. This procedure does not provide ERASE commands for media files that are unique (not duplicates) and unused by the database. The procedure on this early page could be adapted to do so: Scrapbook Files Status