Database backup with WAL
Quote from thejerrybryan on 2025-11-19, 10:00 pmMy reaction also was "why didn't Bruce just do a commit before the backup?"
But I suspect his real reason for changing back was that he saw the messages from users asking about the -wal file and the -shm file, and he was afraid users would delete them rather than just leaving them be for the use of future commits.
Now that we are back in journal mode, the -journal file can cause similar but not identical issues if users mess around with it. But the -journal file doesn't seem to be as visible as were the -wal and -shm files. Also, there is a subtle point about the -journal file which deserves a brief mention.
Namely, it is emptied immediately after each completed commit, and RM operates in automatic commit mode. But there are two ways to empty the -journal file. One way is to delete it. The other way is to leave it allocated and set its length to zero. The latter is much more efficient because allocating and deleting files is fairly slow and setting a file to zero length is extremely fast. But Bruce is using the delete method, perhaps to try to keep the -journal file as invisible to the user as possible so they will never mess around with it. That means that an RM user will never see a -journal file except immediately after an RM crash or after a computer crash while RM is running.
My reaction also was "why didn't Bruce just do a commit before the backup?"
But I suspect his real reason for changing back was that he saw the messages from users asking about the -wal file and the -shm file, and he was afraid users would delete them rather than just leaving them be for the use of future commits.
Now that we are back in journal mode, the -journal file can cause similar but not identical issues if users mess around with it. But the -journal file doesn't seem to be as visible as were the -wal and -shm files. Also, there is a subtle point about the -journal file which deserves a brief mention.
Namely, it is emptied immediately after each completed commit, and RM operates in automatic commit mode. But there are two ways to empty the -journal file. One way is to delete it. The other way is to leave it allocated and set its length to zero. The latter is much more efficient because allocating and deleting files is fairly slow and setting a file to zero length is extremely fast. But Bruce is using the delete method, perhaps to try to keep the -journal file as invisible to the user as possible so they will never mess around with it. That means that an RM user will never see a -journal file except immediately after an RM crash or after a computer crash while RM is running.
Quote from Richard Otter on 2025-11-19, 10:06 pmminor point of clarification-
don't confuse
commit;
with
PRAGMA wal_checkpoint;
minor point of clarification-
don't confuse
commit;
with
PRAGMA wal_checkpoint;
Quote from thejerrybryan on 2025-11-19, 11:30 pmdon't confuse
commit;
with
PRAGMA wal_checkpoint;Commit and checkpoint clearly are not the same in an important and technical sense. But I would argue that commit in journal mode and checkpoint in WAL mode are extremely analogous the sense that they are both the point in time when the data is successfully written to the RMTREE file. And they are both the point in time time that a backup of the RMTREE file by a zip utility or cloud sync software or cloud backup software or anything else external to the RM app can be accomplished. And of course, if RM is still running, those external backups have no way of knowing that RM might be doing new commits and/or checkpoints at the same time.
That being said, there is still a hole in my way of thinking. Namely, commits can continue to take place in WAL mode even during a checkpoint. A given checkpoint will copy from the WAL file to the database based only on those commits that were completed before the checkpoint started. Thus, an RMTREE file after a commit in journal mode is guaranteed to be up to date at least briefly in a sense that an RMTREE file after a checkpoint is not guaranteed to be up to date because new commits could have taken place during the checkpoint.
But I still think that from a very high level point of view the analogy is apt that commits in journal mode and checkpoints in WAL mode are the point in time where the data is written to the RMTREE file. That's a user's view of "the database". But from a very technical point of view of the SQLite developers, "the database" in WAL mode is the RMTREE file plus the RMTREE-wal file. I don't think Bruce took that distinction adequately into account when he went to WAL mode. I think he had to go back to journal mode.
don't confuse
commit;
with
PRAGMA wal_checkpoint;
Commit and checkpoint clearly are not the same in an important and technical sense. But I would argue that commit in journal mode and checkpoint in WAL mode are extremely analogous the sense that they are both the point in time when the data is successfully written to the RMTREE file. And they are both the point in time time that a backup of the RMTREE file by a zip utility or cloud sync software or cloud backup software or anything else external to the RM app can be accomplished. And of course, if RM is still running, those external backups have no way of knowing that RM might be doing new commits and/or checkpoints at the same time.
That being said, there is still a hole in my way of thinking. Namely, commits can continue to take place in WAL mode even during a checkpoint. A given checkpoint will copy from the WAL file to the database based only on those commits that were completed before the checkpoint started. Thus, an RMTREE file after a commit in journal mode is guaranteed to be up to date at least briefly in a sense that an RMTREE file after a checkpoint is not guaranteed to be up to date because new commits could have taken place during the checkpoint.
But I still think that from a very high level point of view the analogy is apt that commits in journal mode and checkpoints in WAL mode are the point in time where the data is written to the RMTREE file. That's a user's view of "the database". But from a very technical point of view of the SQLite developers, "the database" in WAL mode is the RMTREE file plus the RMTREE-wal file. I don't think Bruce took that distinction adequately into account when he went to WAL mode. I think he had to go back to journal mode.
Quote from kevync on 2025-11-20, 8:11 pmDo you think this means RM will abandon for the foreseeable future?
Also do you think, RM might be better with WAL than journal?
Do you think this means RM will abandon for the foreseeable future?
Also do you think, RM might be better with WAL than journal?
Quote from thejerrybryan on 2025-11-20, 10:22 pmI suspect that Bruce will view this experience with WAL mode as being sufficiently negative that he will never return to it.
It seems to me that WAL mode offers a potential of better performance in general than journal mode. It also offers the potential for better performance with multiple readers in multi-threaded apps being able to avoid blocking each other. But WAL mode does run the risk of exposing the RMTREE + RMTREE-wal files to the vagaries of external actors such as Dropbox or BackBlaze which would have no way to be sure the copies of the RMTREE file and the RMTREE-wal file being made were coordinated together properly, and to the vagaries of external actors such as RM users who might wonder what an RMTREE-wal file is and delete it.
I have a feeling that some developers avoid WAL mode for these same reasons.
I suspect that Bruce will view this experience with WAL mode as being sufficiently negative that he will never return to it.
It seems to me that WAL mode offers a potential of better performance in general than journal mode. It also offers the potential for better performance with multiple readers in multi-threaded apps being able to avoid blocking each other. But WAL mode does run the risk of exposing the RMTREE + RMTREE-wal files to the vagaries of external actors such as Dropbox or BackBlaze which would have no way to be sure the copies of the RMTREE file and the RMTREE-wal file being made were coordinated together properly, and to the vagaries of external actors such as RM users who might wonder what an RMTREE-wal file is and delete it.
I have a feeling that some developers avoid WAL mode for these same reasons.