Corrupt Database Recovery #database

Risk

Despite its reputation as a robust database system, SQLite is nevertheless vulnerable to disk degradation and disruptions to its disk write operations by power glitches, system crashes, etc. Thus there have been reports of the dreaded error message: SQLite Error 11 – database disk image is malformed. Other error messages may arise from corruption but this is the most common one. See the full list of SQLite result codes.

Backups may not help

RootsMagic does not exploit SQLite’s journal and rollback capabilities that could recover from some failures. Corruption may occur without an error message arising for some time because it occurred in an area of the database not currently in use. Moreover, the RootsMagic backup process won’t detect it and will faithfully preserve the corruption. Hence, when this error pops up, the question naturally follows “What do I do now?”. Winding through the backups may prove very frustrating.

Some recovery techniques

Here are some techniques I have used to recover all or some of a database, more or less in ascending order of the severity of the problem and the complexity of the recovery:

  1. Drag and drop (selecting Everyone) to a new database. This can work if the corruption affects only a part of the database that is used for a certain kind of report or display, for example an index for a certain table.
  2. Alternatively, export Everyone to GEDCOM and import that to a new database; that is equivalent to (1) in terms of the data transferred but we do not know if it is in any way a different SQLite call. If (1) fails and this export fails, then you can exclude many kinds of data for simpler export using fewer aspects of the database. Start with the most basic GEDCOM export; if it fails, skip to (3). Otherwise, progressively add elements to export until it fails – you may be lucky and have exported everything you want. Remember that there are GEDCOM settings in the FactType List that affect Drag’n’Drop the same as GEDCOM; you’ll need to check all FactTypes for inclusion in GEDCOM so you don’t lose data that is perfectly recoverable. Set the database you have created from DnD or from GEDCOM aside for later comparison with the results of the next steps.
  3. Even if the DnD/Export succeeds, and certainly if it doesn’t, we have to resort to SQLite to be sure to recover the most complete set of data . Create a new, empty RootsMagic database file and open it with SQLiteSpy with the RMNOCASE extension, or equivalent. Then ATTACH the corrupted database. For example: “ATTACH ‘C:UserTomFamily TreeMyFamily.rmgc’ AS Old ;”. Open each table of the corrupted database one at a time and note which ones cause an error message.
  4. Arguably, this step should follow the next – if it were to show that a critical table could not be sufficiently recovered, there is no point going further. For each table that does NOT cause an error message, copy its records into the corresponding table of the new, main database using the command (e.g. for the NameTable): “INSERT INTO NameTable SELECT * FROM Old.NameTable ;”. Revise and repeat for each table except for some which have built-in records such as the FactTypeTable, PlaceTable, RoleTable and SourceTemplateTable; the latter will require a command which includes a WHERE constraint or change to “INSERT OR IGNORE INTO …” to skip over the existing ones. Perhaps “INSERT OR REPLACE INTO…” is better as that will capture any customisations of FactTypes and Roles. The new database now has a perfect replica of the old, uncorrupted tables.
  5. For each table that DOES cause an error message, only partial recovery is possible: from the first record to the last contiguously uncorrupted record, and, possibly, by further iterations, some records beyond the corruption. Open a corrupted table and ignore/close the error message. Scroll down the results to where corruption is encountered. Select and copy to clipboard all above that point. Paste the copied records into a text editor which supports regular expression search & replace, such as PSPad, or into a spreadsheet such as Excel. For very large results it may be necessary to do this in chunks to respect the limits of the text editing or spreadsheet program. Using the editing or spreadsheet tools, we need to transform this extract from the NameTable:

 

NameID OwnerID Surname Given Prefix Suffix Nickname NameType Date SortDate IsPrimary IsPrivate Proof EditDate Sentence Note BirthYear DeathYear
...
17 16 Hansen Frances Lorraine 0 . 9223372036854775807 1 0 0 0.0 1937 1939
18 17 Gibson Chester Herbert 0 . 9223372036854775807 1 0 0 0.0 1895 1949
19 18 Straub Florence Amelia 0 . 9223372036854775807 1 0 0 0.0 1896 1942
...

into this SQLite script:

BEGIN TRANSACTION;
...
INSERT INTO NameTable VALUES(17,16,'Hansen','Frances Lorraine','','','',0,'.',9223372036854770000,1,0,0,0,NULL,NULL,1937,1939);
INSERT INTO NameTable VALUES(18,17,'Gibson','Chester Herbert','','','',0,'.',9223372036854770000,1,0,0,0,NULL,NULL,1895,1949);
INSERT INTO NameTable VALUES(19,18,'Straub','Florence Amelia','','','',0,'.',9223372036854770000,1,0,0,0,NULL,NULL,1896,1942);
...
COMMIT TRANSACTION;

By examining the CREATE statement for the table, we can tell which fields are numeric (no quotes required), text (single quotes required) and binary or BLOB (requires CAST). In the above example, the Sentence and Note fields are binary text but there were no values so null could be used and was easier. Here’s an extract from another table which did have values in some BLOB fields:

PersonID    UniqueID    Sex    EditDate    ParentID    SpouseID    Color    Relate1    Relate2    Flags    Living    IsPrivate    Proof    Bookmark    Note
...
1165    979205DC09A94799B253CDBE51D90C137599    1    40853.0    0    0    0    0    0    0    0    0    0    0
1166    F79C9B38AFE44F619D4809724B3BAB134DFB    0    40853.0    0    0    0    0    0    0    0    0    0    0    Church records have Thomas living in 1850.
Could there have been 2 Thomas's in this family?
Bap. 26 feb 1971 SL. end. 1 apr 1971 Sl.Children's work done from 1980 census.
 
1167    E55E52812ECA44E8874FD0DB88750F4D14F3    0    40853.0    0    0    0    0    0    0    0    0    0    0
 
...

These need to be transformed into this SQLite script:

BEGIN TRANSACTION;
...
INSERT INTO PersonTable VALUES(1165,'979205DC09A94799B253CDBE51D90C137599',1,40853.0,0,0,0,0,0,0,0,0,0,0,CAST('' AS BLOB));
INSERT INTO PersonTable VALUES(1166,'F79C9B38AFE44F619D4809724B3BAB134DFB',0,40853.0,0,0,0,0,0,0,0,0,0,0,CAST('Church records have Thomas living in 1850.
Could there have been 2 Thomas''s in this family?
Bap. 26 feb 1971 SL. end. 1 apr 1971 Sl.Children''s work done from 1980 census.
 
' AS BLOB));
INSERT INTO PersonTable VALUES(1167,'E55E52812ECA44E8874FD0DB88750F4D14F3',0,40853.0,0,0,0,0,0,0,0,0,0,0,CAST('' AS BLOB));
...
COMMIT TRANSACTION;

Converting data to SQlite script

This is not a tutorial on regular expression search & replace. You can find those elsewhere. Whatever regular expression tool you use, there is a general order that needs to be followed:

  1. Because the apostrophe demarks a string in SQLite syntax, possessive words such as “John’s” or other literal uses of the apostrophe requires it to be escaped (double apostrophes) before you wrap strings with single apostrophes. So the first step is a normal search for ‘ and replace with ” (that is two apostrophes, not a double quote).
  2. The data values are separated by the Tab character. I tend to proceed then with replacements from left to right, using the Tab character to define the pattern and how it gets replaced at each position. You can find the type of each column by inspecting the SQLite command that creates the table; with SQLiteSpy, right-click on the table name in the left sidebar and select “Show CREATE statement”. It may take several iterations of search and replace for a table with many columns to arrive at a complete and syntactically correct INSERT statement from each row of the original table.
  3. A multi-line field (with embedded new line codes as in the above example) is not manageable with PSPad’s search & replace so requires manual editing. There are other tools which can handle it – my favourite being Bill Klein’s ReplaceText. Regrettably, it is no longer supported and reportedly has some problems with certain Windows 7 installations, although I have not witnessed any.
  4. Any remaining tab characters can be replaced globally with commas.

 

Finishing

Having used a fake RMNOCASE collation to build the data in the new database, the indexes may not jibe with RootsMagic’s inherent RMNOCASE. Open the new database with RootsMagic 5 (yes 5, not 4) and use its File > Database Tools > Rebuild Indexes tool. Then look at what you got from the DnD/Export versus the SQLite copying and choose to carry on with the one that seems most complete.

Discussions & comments from Wikispaces site


ve3meo

Drag ‘n’ Drop vs Export “everything”

ve3meo
10 March 2012 19:40:35
Added by Jerry Bryan 3/10/2012

(Tom, I’m sometimes still figuring out how to work the Wiki. I was just wanting in some sense to REPLY to this page and not really edit it, but I couldn’t figure out a way to do so. Please free to delete my comments from here and move them somewhere else.)

This page is a wonderful cautionary tale. It made me think that maybe once a week or so I should just drag and drop my entire database somewhere else, just to see what happens. I wouldn’t really use the copy, but maybe the drag and drop of the entire database might expose invisible database corruption that might otherwise remain invisible. Is that what you are saying? If so, then I would have one very minor quibble. Namely, in item (2) under recovery techniques, you say: “If you check off to include everything possible in the export, it is equivalent to (1).”

Item (2) is the export and item (1) is the drag and drop. The way I read the sentence I’m quoting, there is an expectation that drag and drop will include everything and that export will include everything only if you check to include everything possible. But my understanding is that drag and drop is fully equivalent to export, and to do a proper drag and drop of “everything” you would have to “check off to include everything possible” just like you would if you were doing an export of “everything”.

You can post messages “behind” the page. You can also Edit the page, select a key phrase and add a comment linked to the phrase. The phrase will be highlighted in yellow and the comments sidebar can be opened and closed to view the comment. The comment will also appear as a message “behind” the page.

I realise now after Geranium’s msg that Exporting Everyone with all the inclusions checked and exclusions unchecked is not as complete as Drag’n’Drop Everyone because some Fact settings exclude elements from export but not from DnD. To be truly equal, you would also have to go through the FactType List and include everything in it for export.

Tom


thejerrybryan
thejerrybryan
11 March 2012 01:14:29
  • I realise now after Geranium’s msg that
  • Exporting Everyone with all the inclusions
  • checked and exclusions unchecked is not
  • as complete as Drag’n’Drop Everyone
  • because some Fact settings exclude
  • elements from export but not from DnD.

My experience is that DnD honors Fact settings and therefore causes certain Facts to be omitted, just like Export. I don’t think there’s any way to make DnD get everything without messing around with lots of Fact settings. Am I missing something obvious?

Jerry

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.