Question with Media links and GEDCOM export (RM8)

I seem to have something "weird" going on. I am new to RM (both 7 & 8). I did not have issues with RM7. However, after I moved the file to RM8, I had issues when exporting to a GEDCOM with media links: the file size balloons to nearly 1 GB (vs. 20 MB). It is almost as if something is multiplying the links. I am attaching screenshots. I am not sure if there is something odd in my database, so my question is how to identify what, if anything, is wrong and how to resolve it. I also posted on the FB group and submitted feedback.

When the Media links option is checked (see screenshot1), the export takes a long time and balloons in size to over 800 MB. The citation and media link counts also go up (about 4 million vs. 8,000; compare screenshot3 and screenshot4). What I did was export to a GEDCOM and then import it into a new file. I then ran the file tools (Integrity, Index, and Phantoms). However, the compact usually fails (due to a vacuum error). Version 7.9.170.0 seems to have minimized the Compact issue. I did run Merge duplicate sources and citations on the file version.

Does anyone have any thoughts on what to check? I have used SQL (MySQL) on my website, but I am new to SQLite. Since this is a preview version, I am not sure if there are any missing connector tables. Finally, the size without the media links option checked is fine. I did try once to merge the duplicates on the RM8 file that was created from the GEDCOM export with the media links, but it crashed/hung. Here is the FB post for reference; let me know if you need anything else: RM8 Preview post

 

 

Uploaded files:
  • FB_screenshot1.jpg
  • FB_screenshot2.jpg
  • FB_screenshot3.jpg
  • FB_screenshot4.jpg

P.S. One other thing: the original file was from FTM (Family Tree Maker) and was synced to Ancestry, then exported to GEDCOM, then imported into RM7 before being imported into RM8.

 

You imported into RM8 from the RM7 database file, not through GEDCOM, correct?

And your issue is the ballooning of the GEDCOM exported from RM8, which you then imported into a new database for the purpose of comparing the properties of the two RM8 databases?

And there is no such ballooning issue with the export from the RM7 database?

  • Have you imported the RM7 GEDCOM into a new RM7 db and compared properties?
  • Have you imported the RM7 GEDCOM into a new RM8 db and compared properties? Does this RM8 version also balloon on export to GEDCOM?

I'm guessing that there is something about your data that is triggering a bug in the RM8 Export and it is very likely related to the structural change around sharing citations.

One thing that has been an issue with TreeShare and FTM's equivalent is the duplication of media files (Media Duplicates - Reports and Remedies). There might be some compounding effect having to do with repeated base filenames in the duplicates along with the structural change for citations that dramatically manifests itself in a file like yours.

I would suggest you copy a small subset of your database to a new database file and retest. Examine the GEDCOM with a text editor such as Notepad++ and look for extraordinarily large counts of GEDCOM tags. Likewise, you can explore the database's MediaLinkTable, although it may take a much more complex analysis to identify the root cause. If this small database reliably exports a ballooned GEDCOM, then it should be submitted to Feedback with a description of the problem.
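If the GEDCOM is too large to inspect comfortably in an editor, the tag counts can also be tallied with a short script. The following is only a minimal sketch: the function name `count_gedcom_tags` is made up for illustration, and it assumes the standard GEDCOM line layout of `level [@xref@] TAG [value]`.

```python
from collections import Counter
from typing import Iterable


def count_gedcom_tags(lines: Iterable[str]) -> Counter:
    """Count how often each GEDCOM tag occurs, reading one line at a time."""
    counts: Counter = Counter()
    for line in lines:
        parts = line.strip().split(maxsplit=2)
        # A valid GEDCOM line starts with a numeric level followed by a tag.
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        tag = parts[1]
        # Record lines such as "0 @I1@ INDI" carry an xref before the tag.
        if tag.startswith("@") and len(parts) > 2:
            tag = parts[2].split(maxsplit=1)[0]
        counts[tag] += 1
    return counts


# Usage (hypothetical file name); a file object is iterated lazily,
# so the whole export never has to fit in memory:
# with open("export.ged", encoding="utf-8-sig", errors="replace") as handle:
#     for tag, n in count_gedcom_tags(handle).most_common(15):
#         print(f"{tag:8} {n}")
```

Running this on both the normal-sized export and the ballooned one and comparing the two tallies should show which tags (for example `OBJE` or `FILE`) account for the growth.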

 

Hi Tom,

I ran a test on an earlier GEDCOM I had from about 2 weeks prior. I imported that GEDCOM into RM8 and then ran an export. The exported GEDCOM was the expected size. Then I ran Merge Sources followed by Merge Citations, and then a new GEDCOM export. That one is currently running very slowly. So it sounds like the issue is related to the Merge process in RM8. (I haven't tried RM7 yet, but I will later to see whether the result is the same with the same GEDCOM file.) I am also fairly certain there were some lingering data issues from FTM and Ancestry, which is one of the reasons I have switched to RM. If the results are the same for both RM7 and RM8, then it's probably mostly that my data had hidden issues. If not, then RM8 may have some issues and my data is likely a contributing cause.

The exported GEDCOM came to 980 MB vs. 23 MB (after vs. before the merge). The media links increased from ~7,000 to 3.977 million. The rest was about the same.

I will have to familiarize myself with the DB structure. Which tables are related to media besides MediaTable and MediaLinks, and which tables do you think are likely the ones that are "messed up"?

Kevin

Looking quickly at the DB structure, MediaLinkTable connects to about 7 tables. My guess is that part of the issue comes from the events and/or citations tables. My question is how to determine whether there is something wrong with the data itself (and where), or whether it is mostly coming from the export process (the database tools are not recognizing any problems).
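One way to start separating "bad data" from "bad export" is to count the media links directly in the database and compare that with the count in the exported GEDCOM. The sketch below is a rough starting point only. It assumes MediaLinkTable has `MediaID`, `OwnerType`, and `OwnerID` columns, which should be checked against the actual schema, and the function name and file name are hypothetical. Run it against a copy of the database, never the working file.

```python
import sqlite3


def media_link_summary(conn: sqlite3.Connection) -> dict:
    """Summarize MediaLinkTable: rows per OwnerType and repeated links.

    Column names are assumptions about the RootsMagic schema.
    """
    # How many links point at each kind of owner (person, event, citation, ...).
    per_owner_type = conn.execute(
        "SELECT OwnerType, COUNT(*) FROM MediaLinkTable "
        "GROUP BY OwnerType ORDER BY COUNT(*) DESC"
    ).fetchall()
    # The same media item linked to the same owner more than once.
    duplicates = conn.execute(
        "SELECT MediaID, OwnerType, OwnerID, COUNT(*) FROM MediaLinkTable "
        "GROUP BY MediaID, OwnerType, OwnerID "
        "HAVING COUNT(*) > 1 ORDER BY COUNT(*) DESC LIMIT 20"
    ).fetchall()
    return {"per_owner_type": per_owner_type, "duplicates": duplicates}


# Usage (hypothetical file name):
# conn = sqlite3.connect("copy-of-my-tree.rmtree")
# print(media_link_summary(conn))
```

If the table holds only a few thousand rows while the export contains millions of media links, that would point toward the export process rather than the stored data. The numeric `OwnerType` codes would need to be looked up in a description of the database structure.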

 

This is a shot in the dark, but if you have a lot of media files linked to citations, running RM8's Merge Duplicate Citations might help. I know that in my database, doing so greatly reduces the number of media links.

Sadly, that appears to be what broke it; I proved that. What I do not know is whether it has to do with the RM8 structure or something else. Is there a similar tool in RM7? I only saw one for (Merge) Sources.

P.S. The large GEDCOM file was too large for Notepad++. Is there another editor that will open files of ~1 GB?

I suggested you play with a subset copy of your database starting back at RM7 because of the gargantuan size of the Export. I'm surprised Notepad++ couldn't handle it; it is supposed to support 4 GB, if you have the RAM. Alternatives here.
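When no editor will open the file, a window of lines can still be pulled out without loading the whole thing. This is a minimal sketch; `read_line_window` is a made-up helper name, and the `utf-8-sig` encoding is an assumption about how the GEDCOM was written.

```python
from itertools import islice
from typing import Iterator


def read_line_window(path: str, start: int, count: int,
                     encoding: str = "utf-8-sig") -> Iterator[str]:
    """Yield `count` lines beginning at 0-based line `start`, streaming from disk."""
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        # islice skips the first `start` lines without keeping them in memory.
        for line in islice(handle, start, start + count):
            yield line.rstrip("\r\n")


# Usage (hypothetical file name): show 40 lines starting at line 1,000,000.
# for text in read_line_window("export.ged", 1_000_000, 40):
#     print(text)
```

Sampling a few windows from different parts of a ballooned export is often enough to see which block of records is being repeated.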

My test file that ballooned on export-import also originated via TreeShare so this could possibly be a fairly common problem that RM Inc needs to address. I'm thinking there is nothing wrong with our data that is up to us to find and correct. This is most likely a system problem.

Tom,

Yeah, I was surprised Notepad++ could not handle it. I have 8 GB of RAM, but maybe there was not enough free RAM for editing purposes. I guess we have demonstrated there is an issue. Hopefully they are listening and reviewing. I am going to export a branch from FTM as a GEDCOM and work/test with that. Thanks, Kevin
