Forum

Please or Register to create posts and topics.

Posting for friend – odd database issue - duplicate filenames

Page 1 of 2Next

A friend @sbankscharles of mine who uses Roots Magic has been having odd issues with his database.

Examining a copy of his database – I have noticed that there are many duplicate filenames (indexed field). He has ~70 media items - at least 7 have duplicate MediaID with same filename / path. One has triplicates. Trying to gather more info before sending to RM support.

I can’t figure out how what steps a user would do that would cause this – Database tools have been run – possible something corrupted db before tools were run.Since the filename is  indexed field is should be near impossible to get duplicates unless some weird changes were done. He might also have problems with Sources / citations and increase in used Place (unconfirmed by me)  I have confirmed MultiMedia Table  has issues.

Here are a few examples:

Uploaded files:
  • s3.png
  • s1.png

I am hoping we can identify what potentially (or actual) cause of this issue and how to prevent. Also hoping we can come up a with a fix that does not involve needing to export/ import GEDCOM to fix  the mess.

Kevin

 

As I am answering on my phone,  I've not been able to check the RM9 database definition, only the page RM6 Version Monitoring. Maybe there has been a big change in the Multimediatable but I've no recollection of it.

There is no UNIQUE constraint on the MediaPath and FileName fields or on the Index so the same file may be added repeatedly. MediaID is a PRIMARY KEY so it is inherently UNIQUE; your examples show no duplication of MediaID.

So there is nothing wrong with the database. That RM allows the same file to be added repeatedly is arguably poor design.

Re the second question, merging duplicate media files was an enhancement requested over a decade ago.

The following pages don't directly provide sqlite tools for merging in RM9 as they were for earlier versions but the concepts may be helpful:

Media Repair Queries - RM4, major changes since then

Media Duplicates – Reports and Remedies - Treeshare issue

 

Excuse my lack of knowledge -- I thought indexes prevented (or should prevent duplicates) .  When insert is used (I assumed this would done by RM when adding media) -- wouldn't this be prevented by the index?

CREATE TABLE MultimediaTable (MediaID INTEGER PRIMARY KEY, MediaType INTEGER, MediaPath TEXT, MediaFile TEXT COLLATE RMNOCASE, URL TEXT, Thumbnail BLOB, Caption TEXT COLLATE RMNOCASE, RefNumber TEXT COLLATE RMNOCASE, Date TEXT, SortDate BIGINT, Description TEXT, UTCModDate FLOAT );

CREATE INDEX idxMediaFile ON MultimediaTable (MediaFile);

CREATE INDEX idxMediaURL ON MultimediaTable (URL);

Without the UNIQUE constraint, no. The Surname field is INDEXed but not constrained by UNIQUE nor would we want it to be.

A table definition that constrains to unique values would be:

CREATE TABLE Tablename (..., UNIQUE FileName..., ..., ...)

kevync has reacted to this post.
kevync

Ah! okay thanks-- your explanation makes sense. I guess there is need for unique and non-unique indexes ... still begs the question if it really makes sense for MediaFileName (in same path) to not be forced to be unique other than thejerrybryan's thought that it had to do with captions (and need for more than one which I now vaguely  remember seeing  a post about in  ver 8 if I recall correctly).  For example, maybe a shared Headstone the caption might be unique for each person  (wife, husband, daughter). There is logically reasoning for that purpose -- however  the lack of documentation on the wiki (maybe I missed it ) -- that should be called out and clearer.

well .. I guess its unlikely RM will change the way "Add New Media" works for the foreseeable future regardless of the intentions how it was suppose to work -- if they made it Unique it would break many users existing databases so... that will unlikely to happen before ver 10. They should add a merge duplicate media tool and updated  what its called and update documentation in meantime.

 

if they made it Unique it would break many users existing databases

True, if the UNIQUE constraint was applied to the database definition without a preceding data transformation step, i.e., the merge process. However, the app software already enforces non-duplication in the media drag'n'drop process in RM9 and did so for the Add New Media process from RM5 (maybe earlier) to RM7. It's only RM9 (and likely RM8) in which that app constraint was lost. Of course, the app constraint does not prevent new duplicates from being created outside of RM using SQLite nor does it consolidate existing duplicates created by its omission in RM9.

A Merge Duplicate Media tool would be desirable for some users. It should have options to compare not only file name, but also captions, thumbnails, Descriptions, Reference Number because some users may not want to combine all media tags under just the one media. Of course, that's an argument favouring a return to the RM4 structure or an enhancement thereof along the lines of the change in structure for Citations.

Well I only limited experience with RM 7 as I did both preview 8  & RM 7 at same time around Feb 2021.   I checked my database and found about 4-5 duplicates of over 12 K  media items.  What my issue is -- depending on how  user my use RM -- it is very easy to duplicate media unintentionally create duplicate. Anyway --- just venting.  You are correct the Merge Tool would need to have more advanced options to satisfy many users needs.

But I guess that discussion has little impact as in wrong forum

This was posted by Tom on the community RM board.

The change I made to the MultimediaTable was to add the UNIQUE constraint:
image

Page 1 of 2Next