Sources – Newspapers.com – Cleaner Footnotes and Simpler Bibliography

Issues

This project arose from a request in the Forum, “Fix & Merge Hundreds of Newspapers.com Sources”. The poster is a heavy consumer of Newspapers.com sources through RM’s TreeShare with Ancestry.com and had issues with:

  1. A long Source List in the application and repetitiously long report Bibliographies due to a different Master Source for each page of a newspaper.
  2. Repetitious listing of “Newspapers.com” in Source Names and in the Title in Bibliographies and in Footnotes. Her approach was to delete it manually from every Master Source, but she still had hundreds to do.
  3. Leading punctuation in the Footnotes and Bibliographies because the Author value is empty in sources from this Ancestry collection.
  4. “N.p.” and “n.d.” notations in Footnotes and Bibliographies when a value for Publisher, Publish Place or Publish Date is empty.

Solution

Because the sources were imported via TreeShare, they are Ancestry Record type, i.e., they are created using the built-in Ancestry Record Source Template. Built-in Source Templates are uneditable through the RM user interface but are defined in the same table that holds user-defined templates. Thus, the built-in templates can be modified by using SQLite to edit entries in the SourceTemplateTable. We can address Issues #3 and #4 by modifying the Footnote and Bibliography sentence templates in the Ancestry Record template. That will be of benefit also to citations having empty values from some other Ancestry Collections (see Ancestry TreeShare – Impact).
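As a rough illustration of the kind of edit involved, the sketch below uses Python’s built-in sqlite3 module on an in-memory stand-in for SourceTemplateTable. The reduced column set, the TemplateID value and the sentence strings are assumptions made up for the example; they are not the real Ancestry Record definitions, nor the contents of the downloadable scripts.

```python
import sqlite3

# Toy stand-in for RM's SourceTemplateTable; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SourceTemplateTable ("
    " TemplateID INTEGER PRIMARY KEY, Name TEXT,"
    " Footnote TEXT, Bibliography TEXT)"
)
# Placeholder sentences, NOT the real built-in ones.
conn.execute(
    "INSERT INTO SourceTemplateTable VALUES (?, ?, ?, ?)",
    (1, "Ancestry Record", "OLD FOOTNOTE SENTENCE", "OLD BIBLIOGRAPHY SENTENCE"),
)
conn.commit()


def apply_cleaned_template(conn, footnote, bibliography):
    """Swap in new sentences once; the name suffix marks the template as done."""
    cur = conn.execute(
        "UPDATE SourceTemplateTable"
        " SET Footnote = ?, Bibliography = ?, Name = Name || ' – cleaned'"
        " WHERE Name = 'Ancestry Record'",
        (footnote, bibliography),
    )
    conn.commit()
    return cur.rowcount  # number of templates changed by this call


first = apply_cleaned_template(conn, "NEW FOOTNOTE SENTENCE", "NEW BIBLIOGRAPHY SENTENCE")
second = apply_cleaned_template(conn, "NEW FOOTNOTE SENTENCE", "NEW BIBLIOGRAPHY SENTENCE")
row = conn.execute("SELECT Name, Footnote FROM SourceTemplateTable").fetchone()
print(first, second, row)
```

Renaming the template in the same statement makes the update safe to repeat: the second call finds no row still named “Ancestry Record” and changes nothing.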

Issues #1 and #2 are more challenging because the values of the source and citation variables that appear in the Footnote and Bibliography sentences are stored in an XML data structure. To solve #1, we want to “lump” all citations of a given newspaper Title under one Master Source. That requires that the data that differentiates the Master Sources for a common newspaper be deleted or transferred from the Master Source to the Citation Details. For example, the Page # must be extracted from the Source Name in the SourceTable and moved to the Detail ([Page] variable in XML) for each Citation of that Source in the CitationTable. That is only one of several steps needed for each of a newspaper’s multiple Master Sources and their Citations.
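The page-moving step can be sketched in Python as follows. Both the Source Name pattern and the Root/Fields/Field/Name/Value layout of the XML are assumptions chosen for illustration, and the helper names are hypothetical; the actual script does this work in SQL with REGEXP_REPLACE() and may differ in detail.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical Source Name layout; names in a real database may differ.
NAME_RE = re.compile(
    r"^Newspapers\.com - (?P<title>.+?) - (?P<date>.+?) - Page (?P<page>\S+)$"
)


def split_source_name(name):
    """Return (title, date, page) from a Source Name, or None if it does not match."""
    m = NAME_RE.match(name)
    if m is None:
        return None
    return m.group("title"), m.group("date"), m.group("page")


def set_field(xml_text, field_name, value):
    """Set (or add) one named field in an assumed Root/Fields/Field XML string."""
    root = ET.fromstring(xml_text)
    fields = root.find("Fields")
    if fields is None:
        fields = ET.SubElement(root, "Fields")
    for field in fields.findall("Field"):
        if field.findtext("Name") == field_name:
            value_el = field.find("Value")
            if value_el is None:
                value_el = ET.SubElement(field, "Value")
            value_el.text = value
            break
    else:
        # No such field yet: append a new Name/Value pair.
        field = ET.SubElement(fields, "Field")
        ET.SubElement(field, "Name").text = field_name
        ET.SubElement(field, "Value").text = value
    return ET.tostring(root, encoding="unicode")


title, date, page = split_source_name(
    "Newspapers.com - Arizona Republic - 12 Mar 1950 - Page 5"
)
citation_xml = (
    "<Root><Fields><Field><Name>Page</Name>"
    "<Value>obituary</Value></Field></Fields></Root>"
)
updated = set_field(citation_xml, "Page", "p. " + page + "; obituary")
print(title, "|", updated)
```

After this step the Source Name can be reduced to the newspaper title alone, which is what leaves the Master Sources for one newspaper identical and ready for merging.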

Once all the data manipulations are complete, there will be multiple identical Master Sources for a given newspaper Title. RM’s AutoMerge Sources function can finish the job.

Before/After Screenshots

The database undergoing modification was from RM7, hence the screenshots are of RM7. However, the solution also works with RM8 and RM9.

Before

Example of one source in the original database. Note that the Source Name and the Title variable (italics) are identical and contain the unwanted “Newspapers.com”, the title of the newspaper, the publish date and the page number. The [Page] variable at the Citation level contains a description of the item of interest and the publish date (repeated from the [Title] variable). All three sentences have unwanted leading punctuation and white space.
A Master Source for each page cited from a given newspaper. This example of ‘extreme splitting’ of sources is perfectly acceptable for some users while, for others, the long Source List and report Bibliographies are objectionable and ‘lumping’ to one Master Source per newspaper is preferred.

Transition

These Before/After shots of the Edit Source window show the operations needed to prepare Sources and Citations for lumping Sources by Newspaper Title and the resulting sentence previews from an improved Ancestry Record source template.

After

There is now just one Arizona Republic in the Source List instead of many individual Page #’s. In some cases, such as the Arizona Daily Star at the top of the list, RM’s Source AutoMerge leaves two Master Sources that look identical, and a Manual Merge of the two is needed to end with just one. Although the fields look identical in the Source Editor, AutoMerge compares the full XML strings of each source, and there is no match if the order of otherwise identical fields differs.
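The effect of field order can be seen in a small Python sketch. The XML layout is assumed for illustration and canonical_fields is a hypothetical helper; it is not something RM or the scripts provide, only a way to show why two sources that display identically can fail a string comparison.

```python
import xml.etree.ElementTree as ET


def canonical_fields(xml_text):
    """Return the (name, value) pairs of every Field element, sorted by name."""
    root = ET.fromstring(xml_text)
    pairs = [
        (f.findtext("Name", ""), f.findtext("Value", ""))
        for f in root.iter("Field")
    ]
    return sorted(pairs)


# Same two fields, stored in a different order.
a = (
    "<Root><Fields>"
    "<Field><Name>Title</Name><Value>Arizona Daily Star</Value></Field>"
    "<Field><Name>Publisher</Name><Value></Value></Field>"
    "</Fields></Root>"
)
b = (
    "<Root><Fields>"
    "<Field><Name>Publisher</Name><Value></Value></Field>"
    "<Field><Name>Title</Name><Value>Arizona Daily Star</Value></Field>"
    "</Fields></Root>"
)

print(a == b)                                      # raw strings differ
print(canonical_fields(a) == canonical_fields(b))  # contents are the same
```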

Download Scripts

Procedure

  1. Back up your database in case you need to revert to it.
  2. Open your database with a SQLite manager that provides a substitute for the RMNOCASE collation (see “RMNOCASE – faking it in SQLiteSpy” or “RMNOCASE – faking it in SQLite Expert, command-line shell et al”) and that supports the REGEXP_REPLACE() function.
  3. Load and execute Sources-NewspapersCom-LumpClean.sql.
  4. If the Ancestry Record source template does not have “ – cleaned” appended to it, load and execute SourceTemplate-AncestryRecord-cleaned.sql.
  5. On returning to RM, run Rebuild Indexes in Database Tools.
  6. In RM, open the Source List and run AutoMerge.
  7. If you have two or so remaining sources for the same newspaper using the Ancestry Record template and you wish to have only one, use RM’s Manual Merge for Sources.
  8. Repeat after you have added more Newspapers.com sources via TreeShare.
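For readers who would rather drive SQLite from Python than from one of the managers named in step 2, the two requirements can be approximated with the standard sqlite3 module. This is a sketch under assumptions: the stand-in collation is a plain case-insensitive comparison, not RM’s actual ordering, and the argument order (text, pattern, replacement) and Python regex syntax of this REGEXP_REPLACE may not match what the scripts expect from SQLiteSpy or SQLite Expert.

```python
import re
import sqlite3


def regexp_replace(text, pattern, replacement):
    """Assumed signature: REGEXP_REPLACE(text, pattern, replacement)."""
    if text is None:
        return None
    return re.sub(pattern, replacement, text)


def fake_rmnocase(a, b):
    """Case-insensitive comparison standing in for RM's own collation."""
    a, b = a.casefold(), b.casefold()
    return (a > b) - (a < b)


def open_rm_database(path):
    """Open a database with the stand-in collation and function registered."""
    conn = sqlite3.connect(path)
    conn.create_collation("RMNOCASE", fake_rmnocase)
    conn.create_function("REGEXP_REPLACE", 3, regexp_replace)
    return conn


conn = open_rm_database(":memory:")
cleaned = conn.execute(
    r"SELECT REGEXP_REPLACE('Newspapers.com - Arizona Republic',"
    r" '^Newspapers\.com - ', '')"
).fetchone()[0]
print(cleaned)
```

Because the stand-in collation may not sort exactly as RM does, it is prudent to follow any such edits with Rebuild Indexes, as step 5 already directs.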

Notes

  1. Should you have reason to revert the Ancestry Record source template to the format supplied by the application, load and execute SourceTemplate-AncestryRecord-Reset.sql in your SQLite manager. First edit the script so that it points to an RM database file of the same major version number, from which it fetches the built-in format.
  2. Should you upgrade or drag’n’drop to another database, the “Ancestry Record – cleaned” template will revert to the built-in format. Run step #4 on the target database to restore it.
  3. The user reported that TreeShare does not report any change as a consequence of this procedure; it would seem to rely solely on the link to the Ancestry Record stored in the RM7 LinkAncestryTable (AncestryTable in RM8, RM9).
  4. The procedures should work also on RM8 and RM9.
  5. The main script is not what I would call ‘elegant’. It grew like Topsy as I explored the database and evolved the process through a sequence of building blocks. Someone cleverer than I with SQLite might well produce a better, faster version.
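The idea behind the reset in Note 1, copying the built-in sentences from a second database file, can be sketched with SQLite’s ATTACH DATABASE. The reduced table layout, the TemplateID of 1, the file name and the function name are all assumptions for the demonstration; the Reset script itself may go about it differently.

```python
import os
import sqlite3
import tempfile

SCHEMA = (
    "CREATE TABLE SourceTemplateTable ("
    " TemplateID INTEGER PRIMARY KEY, Name TEXT,"
    " Footnote TEXT, Bibliography TEXT)"
)


def reset_template(conn, reference_path, template_id):
    """Copy one template's name and sentences from a reference database file."""
    conn.execute("ATTACH DATABASE ? AS ref", (reference_path,))
    cur = conn.execute(
        "UPDATE SourceTemplateTable"
        " SET (Name, Footnote, Bibliography) ="
        "  (SELECT Name, Footnote, Bibliography"
        "   FROM ref.SourceTemplateTable WHERE TemplateID = :id)"
        " WHERE TemplateID = :id"
        "  AND EXISTS (SELECT 1 FROM ref.SourceTemplateTable"
        "              WHERE TemplateID = :id)",
        {"id": template_id},
    )
    conn.commit()  # must commit before the reference file can be detached
    conn.execute("DETACH DATABASE ref")
    return cur.rowcount


with tempfile.TemporaryDirectory() as folder:
    # Reference file holding the (placeholder) built-in definition.
    ref_path = os.path.join(folder, "reference.db")
    ref = sqlite3.connect(ref_path)
    ref.execute(SCHEMA)
    ref.execute(
        "INSERT INTO SourceTemplateTable VALUES"
        " (1, 'Ancestry Record', 'BUILT-IN FOOTNOTE', 'BUILT-IN BIBLIOGRAPHY')"
    )
    ref.commit()
    ref.close()

    # Working database holding the modified template.
    work = sqlite3.connect(":memory:")
    work.execute(SCHEMA)
    work.execute(
        "INSERT INTO SourceTemplateTable VALUES"
        " (1, 'Ancestry Record – cleaned', 'NEW FOOTNOTE', 'NEW BIBLIOGRAPHY')"
    )
    work.commit()

    changed = reset_template(work, ref_path, 1)
    missing = reset_template(work, ref_path, 99)
    restored = work.execute(
        "SELECT Name, Footnote, Bibliography FROM SourceTemplateTable"
    ).fetchone()

print(changed, missing, restored)
```

The EXISTS guard leaves the working template untouched when the reference file has no matching row, rather than overwriting it with empty values.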
