Forums

Please or Register to create posts and topics.

Ques - Query to copy place(placeID) from Fact/Event A to Fact/Event B for same / all person

PreviousPage 2 of 2

I wonder if perhaps I need to do some new performance testing. A great deal depends on the complexity of the query and the size of the database. It was certainly easier for me to develop new queries using views than using subqueries.

Just saw your last post and have run into some issues with your two scripts when I thought I would take a few minutes to compare their performance to mine last night. That turned into a couple of hours and leaving me very confused and wondering if I've tickled a bug in sqlite.

  1. I was getting no results from your queries, as is, until I realised that the JOIN to P2 has to be LEFT. If E2.PlaceID=0 (as I think is the default value when no Place has been selected on data entry), there is no corresponding rowid in P2. It makes sense not to LEFT JOIN P1 as nothing is gained by copying a 0 or missing PlaceID. In my test database with ~407,000 each PING and PONG events, about 6,000 are thus ignored by your queries while mine unnecessarily includes them.
  2. Having observed longer execution times with your script with LEFT JOIN P2 than with mine, I proceeded to strip out the non-essentials to make the comparison fairer. That's when things went strange on SQLiteSpy and subsequently on SQLite Expert Personal, both of which I believe to be current. Both raise no syntax error but process away until they become unresponsive and have to be end-tasked. When I get a chance back on my laptop, I'll share my stripped version of your script - maybe your eyes will catch what I've done wrong to it!

@thejerrybryan

I think I've identified that sqlite choked because the number of steps it has to do rises exponentially with the number of events for which the PlaceID has to be copied, when using my stripped-down version of your script. I pulled my big EventTable into another database and confirmed the same endless crunching.

Then I shrunk the table to just 100 each PING and PONG events - the stripped version completed this time. I ran my query on the same table and compared what SQLiteSpy reports as the effort involved in each query in the attached image. On all the reported parameters, the stripped version of your query invokes much greater effort.

I thought your unstripped script had worked on the big EventTable once I had modified the JOIN to P2 to LEFT so I'm mystified that the effort invoked by my stripped version should be fatally greater when I was expecting it to be less. It's as though one or more of the JOINs to non-essential tables is constraining the needed effort. I'll have to import them into this test database to run your original version and compare the effort.

This is of academic interest, of course, and I really should be getting on with the downsizing I'm supposed to be doing in preparation for a move!

Uploaded files:

Confirmed again that your original full script with LEFT added to the JOINs with PlaceTable AS P2 does complete Updating the 400k target events for which the source event has a PlaceID in the PlaceTable and does so on my old i5 laptop with SSD in around 9-10s today, just a few seconds slower than my VIEW-based script. That's not going to give anyone grief.

But it remains a puzzle why my stripped-down version of your script is less efficient than the original+LEFT and craps out on the large dataset. And that is opposite to the relative performance of the two core subqueries as evidenced in the expanded graphic: the stripped version is slightly more efficient in VM steps. I don't know what the relative effects are of Steps, Sorts, AutoIDX and VM Steps except that more is worse. Comparing JB-stripped to JB-Left Join P2 full scripts, Steps & VM Steps favour the latter while it's opposite for Sorts and AutoIDX.

Uploaded files:
  • Copy_Event1PlaceID_to_Event2PlaceID-SQLite-EFFORT.jpg
kevync has reacted to this post.
kevync

This is a little beyond my knowledge area on the back-end -- I am guessing index and size of database, as well as the data itself might all play into the performance. I think I would lean to sub-query vs temp table or view for lots of things.  I am guessing the database engine always make some guesses based expected performance.  Have to wonder results SQlite vs SQL, vs  MySql etc.

 

 

 

 

 

 

I've been away from the computer all day until just now, so I just wanted to let everybody know that I wasn't ignoring them. I'll study all the messages and get back to everybody in a day or two.

PreviousPage 2 of 2