Forum

Forum breadcrumbs - You are here:ForumGeneral: Chit-chatCommon Table Expression
Please or Register to create posts and topics.

Common Table Expression

I have just learned a new SQL trick that might be useful in a number of ways. It is called Common Table Expression or CTE. It's sort of like a very temporary table or very temporary view, except that a CTE only exists in SQL and never becomes a part of the database schema, not even briefly.

After it's defined, a CTE can be accessed in a query as if it were a table or a view. In and of itself, that aspect of a CTE doesn't really provide any new functionality. But the trick is that a CTE can invoke itself, thereby creating recursive queries. This recursive capability is built in to the SQL and doesn't depend on any particular SQLite manager that might provide recursive capability. For example, I could picture using a CTE to create an XML parser with SQL that could be used to parse and display the data elements that RM stores as XML such as sentence templates and source templates and sources and citations based on templates. I don't have a sample query to offer that could parse XML using a CTE, but I think it shouldn't be too difficult.

I don't find most of the documentation of CTE's to be very useful, but I found one site that has a very good tutorial. It may be found at https://blog.expensify.com/2015/09/25/the-simplest-sqlite-common-table-expression-tutorial/

I also developed a very simple query that uses a CTE to find all PersonID values in the PersonTable that are NOT being used. It follows here.

WITH CTE(id) AS
(
SELECT 1
UNION ALL
SELECT id + 1
FROM CTE
WHERE id < (SELECT MAX(PersonID) FROM PersonTable )
)
SELECT id
FROM CTE
LEFT JOIN PersonTable AS P ON P.PersonID = id
WHERE P.PersonID IS NULL;

There is nothing about the name CTE that makes it into a CTE. I could have called it XYX or anything else that is equally meaningless. What makes it into a CTE is the WITH keyword.

Defining a CTE doesn't do anything until it is used. In the case of this query, the CTE is used in the SELECT id FROM CTE clause.

Any CTE has to include a WHERE clause that eventually will become false. Otherwise, the recursion will be infinite and will eventually crash.

In a certain sense, this is a bad example. That's because after I wrote the query, I discovered that SQLite supports a function that allows me to write the same query without using a CTE. The function is the generate_series() function and my sample query using the generate_series() function to find all PersonID values that are unused is as follows.

SELECT value
FROM generate_series(1,(SELECT MAX(PersonID) FROM PersonTable ),1)
LEFT JOIN PersonTable AS P ON P.PersonID = value
WHERE P.PersonID IS NULL;

Notice that with either version of the query, I had to put an upper limit on how many times the recursion went around or on how many numbers were generated by the generate_series() function. In both cases, I accomplished this via SELECT MAX(PersonID) FROM PersonTable. I therefore think there is a minor hole in that logic that I don't presently know how to fix. For example, suppose you made a new RM database and added five people with PersonID 1 through PersonID 5. Then delete PersonID 2 and PersonID 5. The missing PersonID's are now 2 and 5, but if you run my query I think it will only list PersonID 2 as missing and not PersonID 5 because the maximum PersonID is now 4. Is there a function or some other way to query an SQLite table to determine what the next number is that will be assigned automatically?

Jerry

Tom Holden has reacted to this post.
Tom Holden

The generate_series() function is news to me but the recursive CTE has been in SQLite3 since 2014. I explored it back then, never achieving a level of confidence with it that could result in ready application but did manage to cobble together a few scripts that can be found under the #recursive tag. See https://sqlitetoolsforrootsmagic.com/tag/recursive/

Thanks for posting about both. The way you explain SQLite and RM always gives me fresh insights!

Is there a function or some other way to query an SQLite table to determine what the next number is that will be assigned automatically?

I always thought it is just incremented by the program from the current maximum value when it is the Integer Primary Key. Checking further at https://sqlite.org/autoinc.html#background there would be an internal table sqlite_sequence that would answer your question if any of the RM tables had the AUTOINCREMENT option applied. I queried a couple of RM databases for that table and it does not exist. Looking at the schema, none of the tables have AUTOINCREMENT. So MAX(rowid) is the base from which the software will assign the rowid (PersonID) for the next row. That deleted 5th and last row is not going to result in a gap.

 

I did a bit more testing. As I feared, if you delete the highest numbered person and then add a new person, it skips the number you just deleted. For example, create a new RM database with person #1 and person #2 and then delete person #2. The next person you add will be person #3 rather than person #2. But if you run either of my scripts immediately after deleting person #2 and before adding person #3, the script will not identify any missing numbers.

I have read through the documentation that Tom cited about auto-increment. Just from reading the documentation and also from looking in my little test RM database with only two people in it, I can't figure out any way that RM or SQLite knows that if you delete person #2 and then add a new person that it should add the new person as person #3 instead of as person #2. So it's a mystery to me and my scripts will have a minor glitch until the mystery is solved. There surely is a way to query the "next number to be added" to an auto-increment field. I just don't know what that way is.

Jerry

There surely is a way to query the "next number to be added" to an auto-increment field.

None of the tables are defined as AUTOINCREMENT. RM must be storing the last number assigned somewhere. I'll bet it is in the ConfigTable.

 

I've been looking at the problem some more for the last few minutes. I concur that none of the tables are defined as AUTOINCREMENT, so the auto incrementing must be being accomplished by RM itself, not by SQLite. Further confirmation is that with RM shut down, I did an INSERT INTO PeopleTable DEFAULT VALUES; after deleting the last person in a test database. The row that was created was basically garbage from the point of view of RM itself, but the INSERT did re-use the PersonID I had just deleted which confirms that the auto incrementing is not being done by SQLite.

I have also been looking in the ConfigTable, which is the logical place to store the highest used PersonID. But so far, I have not been able to find it.

Jerry

Had a look too. The answer is that RM stores the last person added in RAM. Delete that person while in the same session as it was added and the next person added is incremented from the value stored in RAM. However, if instead you close the db after deleting the last person and then reopen the database to add a new person the record number of the deleted person is reused. I guess that register gets filled with maxrowid if it's empty and is incremented without reference to maxrowid.

Reminds me I must find and finish my article for this site on CTE - I did it on my laptop then lost it but I''m sure there's a backup somewhere.