Diving into UTCModDate | Date Last Edited statistics
data:image/s3,"s3://crabby-images/0fd8a/0fd8a4ec88b57d96871bb13fba2578b831c38030" alt="Tom Holden"
Quote from Tom Holden on 2025-02-12, 10:47 pmPicking up on @thejerrybryan comment about doing 'real genealogy' rather than developing SQLite scripts, I'm guilty of the latter. My recent dive into @cherylc 's problem with drag'n'drop also got me wanting to explore the UTCModDate column that is found in the RM database structure since RM8 was introduced. Examination of it could tell something about the history of a database and be of possible value in identifying when major changes occurred, such as a mass import or drag'n'drop or an automatic merge. Perhaps also useful in picking through backups to find the last one before a data catastrophe.
The attached script DatesLastEdited.sql collects the UTCModDate for all or selected records from the tables having greatest bearing on each person into a temporary table that can then be queried to aid one's analysis.
Included in the script file are a number of such queries that can be selectively executed:
- LIST EARLIEST Record DateTime by PID (PersonID)
- LIST LATEST Record DateTime by PID
- LIST MEDIAN record DateTime for each person (requires sqlean extension for stats_median(); the extension also has functions for standard deviation and variance)
The above three are stored as VIEWs for subsequent queries.
- COUNT RECORDS for each person [PID]
- LIST Range of DateTimes for each person's records
- LIST Range of DateTimes + Median DateTime for each person's records
- LIST DateLastEdited for all records for a given person
- LIST DISTRIBUTION BY DateTime of all Records
- LIST DISTRIBUTION BY DATE of all Records
Date COUNT() 6514 1899-12-30 1856 1999-03-14 1 2001-03-02 1 2001-05-22 1 2002-06-30 1 2005-03-18 1 2008-03-10 2 2009-03-31 279 2009-10-15 5 This is an example from a small, play database that originated in FamilyOrigins 9 if I recall correctly which might explain the early dates. However, EditDate and UTCModDate were not widely available in the database until RM8 which accounts for the large number of null values or 0.0 which translates to 1899-12-30.
PID Table Record UTCModDate DateTime 428 UT 3195 428 UT 3196 428 NT 454 0.0 1899-12-30 00:00:00 428 NT 1712 0.0 1899-12-30 00:00:00 428 ET 928 40122.4496844213 2009-11-05 10:47:32 428 ET 929 40122.4496844213 2009-11-05 10:47:32 This is a subset of the records associated with the Person having a RIN (PID or PersonID) = 428. Two WebTag records with null dates from the URLTable UT, followed by two Name records from the NameTable NT having the 0.0 date and then two records from the EventTable ET with valid dates.
PID Table Earliest Median Latest Table 658 ET 2009-11-05 10:47:32 2012-11-06 17:52:37 2024-12-31 01:13:58 NT The earliest record for person 658 (apart from null and 0.0) is from the EventTable in 2009, the median (half of the records are below and half are above this value) is late 2012 and the latest the end of last year from the NameTable NT. That the Median is so much closer to the Earliest than to the Latest indicates that most of the records are earlier. Other statistical calculations are supported by the sqlean extension.
With the PID or RIN for the person and table and record identified, it is possible to dig further into the person's profile. A useful adjunct would be one of the views in Search - wayfinding from data tables to RootsMagic screens
Picking up on @thejerrybryan comment about doing 'real genealogy' rather than developing SQLite scripts, I'm guilty of the latter. My recent dive into @cherylc 's problem with drag'n'drop also got me wanting to explore the UTCModDate column that is found in the RM database structure since RM8 was introduced. Examination of it could tell something about the history of a database and be of possible value in identifying when major changes occurred, such as a mass import or drag'n'drop or an automatic merge. Perhaps also useful in picking through backups to find the last one before a data catastrophe.
The attached script DatesLastEdited.sql collects the UTCModDate for all or selected records from the tables having greatest bearing on each person into a temporary table that can then be queried to aid one's analysis.
Included in the script file are a number of such queries that can be selectively executed:
- LIST EARLIEST Record DateTime by PID (PersonID)
- LIST LATEST Record DateTime by PID
- LIST MEDIAN record DateTime for each person (requires sqlean extension for stats_median(); the extension also has functions for standard deviation and variance)
The above three are stored as VIEWs for subsequent queries.
- COUNT RECORDS for each person [PID]
- LIST Range of DateTimes for each person's records
- LIST Range of DateTimes + Median DateTime for each person's records
- LIST DateLastEdited for all records for a given person
- LIST DISTRIBUTION BY DateTime of all Records
- LIST DISTRIBUTION BY DATE of all Records
Date | COUNT() |
---|---|
6514 | |
1899-12-30 | 1856 |
1999-03-14 | 1 |
2001-03-02 | 1 |
2001-05-22 | 1 |
2002-06-30 | 1 |
2005-03-18 | 1 |
2008-03-10 | 2 |
2009-03-31 | 279 |
2009-10-15 | 5 |
This is an example from a small, play database that originated in FamilyOrigins 9 if I recall correctly which might explain the early dates. However, EditDate and UTCModDate were not widely available in the database until RM8 which accounts for the large number of null values or 0.0 which translates to 1899-12-30.
PID | Table | Record | UTCModDate | DateTime |
---|---|---|---|---|
428 | UT | 3195 | ||
428 | UT | 3196 | ||
428 | NT | 454 | 0.0 | 1899-12-30 00:00:00 |
428 | NT | 1712 | 0.0 | 1899-12-30 00:00:00 |
428 | ET | 928 | 40122.4496844213 | 2009-11-05 10:47:32 |
428 | ET | 929 | 40122.4496844213 | 2009-11-05 10:47:32 |
This is a subset of the records associated with the Person having a RIN (PID or PersonID) = 428. Two WebTag records with null dates from the URLTable UT, followed by two Name records from the NameTable NT having the 0.0 date and then two records from the EventTable ET with valid dates.
PID | Table | Earliest | Median | Latest | Table |
---|---|---|---|---|---|
658 | ET | 2009-11-05 10:47:32 | 2012-11-06 17:52:37 | 2024-12-31 01:13:58 | NT |
The earliest record for person 658 (apart from null and 0.0) is from the EventTable in 2009, the median (half of the records are below and half are above this value) is late 2012 and the latest the end of last year from the NameTable NT. That the Median is so much closer to the Earliest than to the Latest indicates that most of the records are earlier. Other statistical calculations are supported by the sqlean extension.
With the PID or RIN for the person and table and record identified, it is possible to dig further into the person's profile. A useful adjunct would be one of the views in Search - wayfinding from data tables to RootsMagic screens
Uploaded files:data:image/s3,"s3://crabby-images/0fd8a/0fd8a4ec88b57d96871bb13fba2578b831c38030" alt="kevync"
Quote from kevync on 2025-02-13, 9:21 pmThere are few things that do not touch the UTCmodeDate that some user might confused by. For example. FamilySearch and Ancestry depending on hints etc.might be deceiving.
To you point I have found some points of reference when reviewing database & moddates that told me the "history"
There are few things that do not touch the UTCmodeDate that some user might confused by. For example. FamilySearch and Ancestry depending on hints etc.might be deceiving.
To you point I have found some points of reference when reviewing database & moddates that told me the "history"