Ques - Query to copy place(placeID) from Fact/Event A to Fact/Event B for same / all person

Quote from kevync on 2024-01-28, 9:32 pmQuestion
Looking for a Query to copy place(placeID) from Fact/Event A to Fact/Event B for same person / all persons that have both facts
FactType A = Burial (#4)
FactTypeB = FindaGrave (#18)
Not sure of the best safest approach to write an update query like that
Kevin
Question
Looking for a Query to copy place(placeID) from Fact/Event A to Fact/Event B for same person / all persons that have both facts
FactType A = Burial (#4)
FactTypeB = FindaGrave (#18)
Not sure of the best safest approach to write an update query like that
Kevin

Quote from Tom Holden on 2024-01-29, 12:05 pmI would create a query that gives me a list with the key data which I can inspect to see if it has what I want. Something along these lines:
SELECT
FG.EventID AS GraveID
, B.EventID AS BurialID
, B.PlaceID AS BurialPlaceID
, B.SiteID AS BurialSiteID
, FG.PlaceID AS GravePlaceID
, FG.SiteID AS GraveSiteID
, OwnerID AS RIN
FROM EventTable FG
JOIN EventTable B USING(OwnerType, OwnerID)
WHERE FG.EventType=18 --CHANGE THIS TO THE Find-a-Grave CUSTOM FactTypeID
AND B.EventType=4 -- FactTypeID for Burial
;Once satisfied that it or some version of it does, I would then make it a temporary VIEW (BurialView) or even a temp TABLE to be called from the Update query which needs to be constrained to operate only on the FaG records. Something like:
UPDATE EventTable
SET
PlaceID=
(Select BurialPlaceID
FROM BurialView B
WHERE EventTable.EventID=GraveID
),
SiteID=
(Select BurialSiteID
FROM BurialView B
WHERE EventTable.EventID=GraveID
)
WHERE EventTable.EventID
IN (SELECT GraveID FROM BurialView)
;You will need to carefully check this out on a play copy of your database. You can open BurialView to verify that all the FindaGrave facts now match the Burial facts for Place and Site.
Note that you specified the FindaGrave FactTypeID as #18 but that is Census.
I would create a query that gives me a list with the key data which I can inspect to see if it has what I want. Something along these lines:
SELECT
FG.EventID AS GraveID
, B.EventID AS BurialID
, B.PlaceID AS BurialPlaceID
, B.SiteID AS BurialSiteID
, FG.PlaceID AS GravePlaceID
, FG.SiteID AS GraveSiteID
, OwnerID AS RIN
FROM EventTable FG
JOIN EventTable B USING(OwnerType, OwnerID)
WHERE FG.EventType=18 --CHANGE THIS TO THE Find-a-Grave CUSTOM FactTypeID
AND B.EventType=4 -- FactTypeID for Burial
;
Once satisfied that it or some version of it does, I would then make it a temporary VIEW (BurialView) or even a temp TABLE to be called from the Update query which needs to be constrained to operate only on the FaG records. Something like:
UPDATE EventTable
SET
PlaceID=
(Select BurialPlaceID
FROM BurialView B
WHERE EventTable.EventID=GraveID
),
SiteID=
(Select BurialSiteID
FROM BurialView B
WHERE EventTable.EventID=GraveID
)
WHERE EventTable.EventID
IN (SELECT GraveID FROM BurialView)
;
You will need to carefully check this out on a play copy of your database. You can open BurialView to verify that all the FindaGrave facts now match the Burial facts for Place and Site.
Note that you specified the FindaGrave FactTypeID as #18 but that is Census.

Quote from kevync on 2024-01-29, 12:25 pmThanks Tom
I figure a temp table was probably what was needed.
This is actually someone else's DB so I did the other fact from memory/guess.
Of course backup db to test!
Kevin
Thanks Tom
I figure a temp table was probably what was needed.
This is actually someone else's DB so I did the other fact from memory/guess.
Of course backup db to test!
Kevin

Quote from thejerrybryan on 2024-01-29, 3:38 pmI have a solution. I haven't done much checking to see how it compare's to Tom's except that as usual Tom's solution seems simpler than mine.
There are two files. The first file just runs a query to see what's going to be updated. The second file uses the first file as a subquery and does the actual update. You can run the second file without running the first file, but I would recommend running the first file before running the second file. Both files need be updated with the FactTypeID of your two facts.
I have a solution. I haven't done much checking to see how it compare's to Tom's except that as usual Tom's solution seems simpler than mine.
There are two files. The first file just runs a query to see what's going to be updated. The second file uses the first file as a subquery and does the actual update. You can run the second file without running the first file, but I would recommend running the first file before running the second file. Both files need be updated with the FactTypeID of your two facts.
Uploaded files:

Quote from thejerrybryan on 2024-01-29, 6:10 pmI thought I might write a little more about the structure of my script. It's basically the following.
UPDATE EventTable
SET PlaceID = (subquery)
WHERE EventID IN (subquery)As I take a quick look at Tom's version, it appears to be pretty much the same structure except that he used view rather than a subquery. The view and the subquery are essentially accomplishing the same thing.
Using a view has the advantage that having defined the view, you can use it more than one time without having to type it in each time (or copy and paste it in each time). The advantage of a subquery is that it tends to run a lot faster, even though you have to replicate the code each time you use it. I wish SQL had some way of running your code so that it ran at subquery speed without having to replicate the same code each time you need it.
My two subqueries are really identical except that in the first usage I need to extract the PlaceID field of the first fact from the subquery. The first fact in this use case would be the Burial fact. And in the second usage of the same subquery, I need to extract the EventID of the second fact from the subquery. The second fact in this use case would be the FaG fact. The same subquery develops both pieces of information. I developed the subquery first as a standalone query and only later included it into the UPDATE script. That's why I submitted my script as two separate files.
I used to write all my queries with views. I now write them all with subqueries. It's much easier to write them with views, but I like the speed of subqueries.
I thought I might write a little more about the structure of my script. It's basically the following.
UPDATE EventTable
SET PlaceID = (subquery)
WHERE EventID IN (subquery)
As I take a quick look at Tom's version, it appears to be pretty much the same structure except that he used view rather than a subquery. The view and the subquery are essentially accomplishing the same thing.
Using a view has the advantage that having defined the view, you can use it more than one time without having to type it in each time (or copy and paste it in each time). The advantage of a subquery is that it tends to run a lot faster, even though you have to replicate the code each time you use it. I wish SQL had some way of running your code so that it ran at subquery speed without having to replicate the same code each time you need it.
My two subqueries are really identical except that in the first usage I need to extract the PlaceID field of the first fact from the subquery. The first fact in this use case would be the Burial fact. And in the second usage of the same subquery, I need to extract the EventID of the second fact from the subquery. The second fact in this use case would be the FaG fact. The same subquery develops both pieces of information. I developed the subquery first as a standalone query and only later included it into the UPDATE script. That's why I submitted my script as two separate files.
I used to write all my queries with views. I now write them all with subqueries. It's much easier to write them with views, but I like the speed of subqueries.

Quote from Tom Holden on 2024-01-29, 10:28 pmYour observations and explications are really useful, Jerry. I used the View in the sub-query because it does make it easier to follow the code. Since it's a one-time operation, speed didn't seem important. I've not compared the performance of the two approaches - it would be interesting to know how big the penalty might be but it is likely highly dependent on the complexity of the query that is stored in the View.
Where our scripts also differ is in that core query. Yours is the rigorous, full disclosure version that uses multiple tables; mine is minimal, using just the EventTable. All of the data needed to carry out the process is found in that one table. Indeed, the critical fields needed are the EventIDs of both the FaG and Burial Events 'owned' by the same person and the PlaceIDs for the Burial Place and Place Detail (Site). The rest of the columns are for info only, e.g., the RIN or OwnerID to be used to inspect a person using RM.
Because the two fact types have to be owned by Individuals, i.e. they have the same OwnerType, all one needs to pair up a person's FaG and Burial Events is to JOIN the EventTable to itself for matching OwnerType and OwnerID and filter the results to the FaG event on the left and the corresponding Burial event(s) on the right. Likewise, it could work for pairs of Family-type facts.
Note that my query allows there to be more than one Burial event or, for that matter, two or more FaG events whereas I think your query restricts the results to persons having only a singular event of each type. Mine will Update all FaG facts with the Place and Site of the last-created Burial EventID paired with a given FaG EventID.
Also, if Place Detail (Site) is used and wanted, your procedure would need to be revised to cover them. I know that you abandoned Place Details years ago after having converted to them initially but others haven't. We don't know what Kevin has.
Your observations and explications are really useful, Jerry. I used the View in the sub-query because it does make it easier to follow the code. Since it's a one-time operation, speed didn't seem important. I've not compared the performance of the two approaches - it would be interesting to know how big the penalty might be but it is likely highly dependent on the complexity of the query that is stored in the View.
Where our scripts also differ is in that core query. Yours is the rigorous, full disclosure version that uses multiple tables; mine is minimal, using just the EventTable. All of the data needed to carry out the process is found in that one table. Indeed, the critical fields needed are the EventIDs of both the FaG and Burial Events 'owned' by the same person and the PlaceIDs for the Burial Place and Place Detail (Site). The rest of the columns are for info only, e.g., the RIN or OwnerID to be used to inspect a person using RM.
Because the two fact types have to be owned by Individuals, i.e. they have the same OwnerType, all one needs to pair up a person's FaG and Burial Events is to JOIN the EventTable to itself for matching OwnerType and OwnerID and filter the results to the FaG event on the left and the corresponding Burial event(s) on the right. Likewise, it could work for pairs of Family-type facts.
Note that my query allows there to be more than one Burial event or, for that matter, two or more FaG events whereas I think your query restricts the results to persons having only a singular event of each type. Mine will Update all FaG facts with the Place and Site of the last-created Burial EventID paired with a given FaG EventID.
Also, if Place Detail (Site) is used and wanted, your procedure would need to be revised to cover them. I know that you abandoned Place Details years ago after having converted to them initially but others haven't. We don't know what Kevin has.

Quote from thejerrybryan on 2024-01-29, 11:15 pm"Yours is the rigorous, full disclosure version that uses multiple tables; mine is minimal, using just the EventTable." My initial version used only EventTable. I added the JOIN's to FactTypeTable and PlaceTable shortly before I posted the final version to this forum. It was interesting to see how little data was actually necessary to solve this little problem.
"Note that my query allows there to be more than one Burial event or, for that matter, two or more FaG events whereas I think your query restricts the results to persons having only a singular event of each type" Correct. I should have made note of that. I was aware of the possibility, and I did a manual check for non-singular events when I was running the subquery by itself. This issue is something to be concerned with when you are using a GROUP BY and a HAVING COUNT(*) = 2.
"Also, if Place Detail (Site) is used and wanted, your procedure would need to be revised to cover them. I know that you abandoned Place Details years ago after having converted to them initially but others haven't. " Yes, I was aware of this one also and I saw it being addressed on your version. Again, I should have made note of it, especially because a user with using this kind of script with the Burial fact may well be using Place Details for the cemetery name. I assume that if you were using Place Details for the Burial fact you would want the Place Details also for the FaG fact.
I really enjoy seeing alternative solutions to the same problem to get ideas to improve my future scripts. To a certain extent form follows function so that there have to be similarities in different scripts that solve the same problem. But that still leaves room for creativity and imagination, of which I feel confident when I am writing C++ but no so much when I am writing SQL.
"Yours is the rigorous, full disclosure version that uses multiple tables; mine is minimal, using just the EventTable." My initial version used only EventTable. I added the JOIN's to FactTypeTable and PlaceTable shortly before I posted the final version to this forum. It was interesting to see how little data was actually necessary to solve this little problem.
"Note that my query allows there to be more than one Burial event or, for that matter, two or more FaG events whereas I think your query restricts the results to persons having only a singular event of each type" Correct. I should have made note of that. I was aware of the possibility, and I did a manual check for non-singular events when I was running the subquery by itself. This issue is something to be concerned with when you are using a GROUP BY and a HAVING COUNT(*) = 2.
"Also, if Place Detail (Site) is used and wanted, your procedure would need to be revised to cover them. I know that you abandoned Place Details years ago after having converted to them initially but others haven't. " Yes, I was aware of this one also and I saw it being addressed on your version. Again, I should have made note of it, especially because a user with using this kind of script with the Burial fact may well be using Place Details for the cemetery name. I assume that if you were using Place Details for the Burial fact you would want the Place Details also for the FaG fact.
I really enjoy seeing alternative solutions to the same problem to get ideas to improve my future scripts. To a certain extent form follows function so that there have to be similarities in different scripts that solve the same problem. But that still leaves room for creativity and imagination, of which I feel confident when I am writing C++ but no so much when I am writing SQL.

Quote from Tom Holden on 2024-01-30, 12:12 pm
Quote from thejerrybryan on 2024-01-29, 6:10 pm...The advantage of a subquery is that it tends to run a lot faster, even though you have to replicate the code each time you use it. I wish SQL had some way of running your code so that it ran at subquery speed without having to replicate the same code each time you need it.
Quote from Tom Holden on 2024-01-29, 10:28 pm...I've not compared the performance of the two approaches - it would be interesting to know how big the penalty might be but it is likely highly dependent on the complexity of the query that is stored in the View...
Surprise! The penalty is negligible in my test on a very large database of over 400,000 people to which I added a "PING" fact and a "PONG" fact to every one. I tried my query using a VIEW and then with the VIEW code as explicit sub-queries. I could see no difference. Whichever one I started first after changing data took about 3.2 seconds to copy the Place and Site IDs from PING to PONG. SQLite automatic optimisation improved that to about 2.5s on repeated runs. If there is any difference, it would be in the 10-20 milliseconds range, i.e., <1%.
It certainly is my recollection that there was a real and significant penalty using VIEWs but that could have been my conclusion from experience a decade or so ago!
Quote from thejerrybryan on 2024-01-29, 6:10 pm...The advantage of a subquery is that it tends to run a lot faster, even though you have to replicate the code each time you use it. I wish SQL had some way of running your code so that it ran at subquery speed without having to replicate the same code each time you need it.
Quote from Tom Holden on 2024-01-29, 10:28 pm...I've not compared the performance of the two approaches - it would be interesting to know how big the penalty might be but it is likely highly dependent on the complexity of the query that is stored in the View...
Surprise! The penalty is negligible in my test on a very large database of over 400,000 people to which I added a "PING" fact and a "PONG" fact to every one. I tried my query using a VIEW and then with the VIEW code as explicit sub-queries. I could see no difference. Whichever one I started first after changing data took about 3.2 seconds to copy the Place and Site IDs from PING to PONG. SQLite automatic optimisation improved that to about 2.5s on repeated runs. If there is any difference, it would be in the 10-20 milliseconds range, i.e., <1%.
It certainly is my recollection that there was a real and significant penalty using VIEWs but that could have been my conclusion from experience a decade or so ago!

Quote from kevync on 2024-01-30, 12:47 pmIt certainly is my recollection that there was a real and significant penalty using VIEWs but that could have been my conclusion from experience a decade or so ago! (tomH)
this was my understanding although my SQLITE experience is limited vs the two of you. Very interesting to see that at least with this type scenario the diff is negligible . My friends database is less than 3000 people and about 1000 facts at most that need copied so this would have no difference.
Subquery can be great for many things -- currently I mainly use them for creating groups that current RM9 does not allow for. I am anticipating some new features ahead of RootsTech curious what we will see.
It certainly is my recollection that there was a real and significant penalty using VIEWs but that could have been my conclusion from experience a decade or so ago! (tomH)
this was my understanding although my SQLITE experience is limited vs the two of you. Very interesting to see that at least with this type scenario the diff is negligible . My friends database is less than 3000 people and about 1000 facts at most that need copied so this would have no difference.
Subquery can be great for many things -- currently I mainly use them for creating groups that current RM9 does not allow for. I am anticipating some new features ahead of RootsTech curious what we will see.