Decoding a Gedcom

  • Decoding a Gedcom

    I am decoding a Gedcom to extract the individuals and relationships.
    I have the format worked out but there are a few oddities in certain records.
    Some of the individual records have two level 2 RIN numbers.
    Is this because they have 2 spouses even though, in most cases, there is only 1 listed?
    http://www.hodgkinsonclark.co.uk

  • #2
    I am not an expert on GEDCOM but that structure sounds unusual. The only ged files I have with level 2 RIN numbers were generated by MyHeritage Family Tree Builder and used to enter a MH: reference for the level 1 tag, e.g.

    1 BIRT
    2 RIN MH:IF1

    Even then I can find no more than one 2 RIN attached to a level 1 tag. Can you describe your situation by including enough structure (no need to show sensitive data) so that we can see the context of the tags?

    David

    • #3
      Thanks David
      I suspect it is down to dirty data but I thought I would ask in case I was missing something.
      I am wondering why there are separate RINs for birth, death and occupation and if it is not meant to be, why the software permitted it.

      0 @I5@ INDI
      1 RIN MH:I5
      1 _UID 5696445C6752033B1CCAF78BA7FD31C8
      1 NAME xxx /xxx/
      2 GIVN xxx
      2 SURN xxx
      1 SEX M
      1 BIRT
      2 _UID 5696445C657F033ADCCAF78BA7FD31C8
      2 RIN MH:IF8
      2 DATE 1854
      2 PLAC xxx,xxx, England
      1 DEAT
      2 _UID 5696445C65E2033AECCAF78BA7FD31C8
      2 RIN MH:IF9
      2 DATE ??/04/1925
      1 OCCU
      2 _UID 5696445C65FF033AFCCAF78BA7FD31C8
      2 RIN MH:IF10
      2 PLAC xxx
      1 FAMS @F75@
      1 FAMC @F77@
      1 RFN 112130388
      1 NOTE @N1@

      This is the first encounter I have had with Gedcom files and I am just checking through someone else's research for matches in our own.
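
A minimal Python sketch of how the record above can be read, assuming only that each GEDCOM line has the shape "level [@xref@] tag [value]". The helper names (parse_line, subtags_by_event) and the shortened SAMPLE are illustrative, not part of any genealogy program. It lists which level 1 tag each level 2 RIN sits under, which suggests the RINs label the individual events rather than spouses.

```python
import re

# One GEDCOM line: level, optional @XREF@ identifier, tag, optional value.
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?:\s(.*))?$")


def parse_line(line):
    """Split one GEDCOM line into (level, xref, tag, value)."""
    m = LINE_RE.match(line.rstrip("\r\n"))
    if m is None:
        raise ValueError(f"not a GEDCOM line: {line!r}")
    level, xref, tag, value = m.groups()
    return int(level), xref, tag, value or ""


def subtags_by_event(lines):
    """Return [(level-1 tag, value, [(level-2 tag, value), ...]), ...]."""
    events = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        level, _xref, tag, value = parse_line(line)
        if level == 1:
            events.append((tag, value, []))
        elif level == 2 and events:
            events[-1][2].append((tag, value))
    return events


# Shortened copy of the record quoted above (placeholder data only).
SAMPLE = """\
0 @I5@ INDI
1 RIN MH:I5
1 NAME xxx /xxx/
2 GIVN xxx
1 BIRT
2 RIN MH:IF8
2 DATE 1854
1 DEAT
2 RIN MH:IF9
2 DATE ??/04/1925
1 OCCU
2 RIN MH:IF10
2 PLAC xxx
1 FAMS @F75@
"""

# Pair every level-2 RIN with the level-1 tag that owns it.
rins = [
    (event, value)
    for event, _, subs in subtags_by_event(SAMPLE.splitlines())
    for tag, value in subs
    if tag == "RIN"
]
print(rins)
```

Each level 2 RIN in this record belongs to a different event (BIRT, DEAT, OCCU), so no single level 1 tag carries more than one of them.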

      • #4
        Speaking as a non techie, I'm wondering why you can't open the file in your own software program (or get a free trial of one for this purpose)?

        Anne

        • #5
          Judging by the references in the RINs, these are related to MyHeritage; whether or not they were produced by Family Tree Builder I could not say. I would follow Anne's advice and import the gedcom into an empty family file using software that you already have or one of the free versions, of which there are a number: RootsMagic Essentials, Legacy Standard and Family Tree Builder, amongst others. If you take this route it is worth examining the import log file to see what the software makes of it. RootsMagic, for example, creates a log file with the same name as the GEDCOM and an extension of LST; as far as I recall, Legacy will point out problems as the import is processed. I don't know about the others.

          Unless you have some pressing reason to use these references I would ignore them. If you cannot interpret their meaning they are irrelevant.

          Unfortunately, although GEDCOM may be a standard, few, if any, software programs stick to it rigorously, and who can blame them? It is so 20th Century!

          David

          • #6
            Thanks for that information.
            What is the point of having a file standard if nobody sticks to it?
            I am not sure what you mean by it being so 20th Century; it is a file format and nothing more. If you change it radically then you have the same problem as Microsoft created when they updated Word to the latest version: old documents are no longer compatible with the new version and cannot be opened, so you have to use an old version. That is not 20th century, that is just ridiculous.

            I have inspected the log file of the free program I use, GenoPro 2016, which doesn't seem to detect a problem with the file.
            The additional entries may be legitimate but as they don't make sense, I thought I would ask you guys.

            I was inspecting the Gedcom file in the hope of finding matching individuals in our web based family history site and needed to decode the file so that I could extract the names and relationships into a comparison file.
            Opening the file in a tree maker program would mean I had to compare the entries manually rather than within my existing environment.
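
A rough Python sketch of the comparison step described above, assuming the names have already been pulled out of the GEDCOM NAME lines (written as "Given /Surname/") and out of the website data. The function names and the sample names are invented for illustration, and exact matching after normalisation is only one possible matching rule.

```python
def normalise_name(name):
    """Lower-case a name, drop GEDCOM surname slashes, collapse spaces."""
    return " ".join(name.replace("/", " ").lower().split())


def matching_names(gedcom_names, site_names):
    """Return the normalised names that occur in both lists, sorted."""
    site = {normalise_name(n) for n in site_names}
    return sorted({normalise_name(n) for n in gedcom_names} & site)


# Invented sample data: one list from the GEDCOM, one from the web site.
matches = matching_names(
    ["John /Smith/", "Mary  /Jones/"],
    ["john smith", "Ann Brown"],
)
print(matches)
```

Names alone will produce false matches in any sizeable tree, so in practice a birth year or place would probably be compared as well.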

            • #7
              The gedcom standard has evolved over time and has never been a strict standard that has to be followed rigorously.
              Individual software designers have often added bells and whistles to allow gedcom to support additional details their specific program records that other programs do not record.
              If the gedcom containing these changes is opened in the same program on a different computer, the information will transfer; if it is opened in a different program, the additional data will be imported into an errors file which can be viewed or discarded.

              Check out these helpful pages-

              http://tinyurl.com/has8q9j

              http://tinyurl.com/ydxuwrb

              Link to a pdf file
              http://tinyurl.com/j39tmle

              Gedcom X http://tinyurl.com/hpnz32t

              Cheers
              Guy
              Guy passed away October 2022

              • #8
                PS you might also find the Wiki about the history of Gedcom interesting
                Last edited by Guy; 16-01-16, 09:08.
                Guy passed away October 2022

                • #9
                  My apologies, I should have known better than to try to make a joke, and you are right, it is ridiculous, but it is the only data transfer format supported to some extent by most genealogy software. The reference to 20th century refers to the fact that the most recently 'approved' GEDCOM standard is GEDCOM 5.5, released in January 1996. The 'standard' used by many programs is GEDCOM 5.5.1, which was issued as a draft specification in 1999 and has never been formally approved. These standards differ, and exported GEDCOM files show which version they are using in the GEDCOM HEAD section to allow correct import - but this correct import frequently does not happen, resulting in data corruption, particularly in long notes.
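
As a hedged illustration of reading that version, the Python sketch below assumes the usual header layout in which the GEDCOM version is a level 2 VERS line under the level 1 GEDC line of the HEAD record. The function name and the sample header are made up for the example; a VERS line under SOUR is the version of the exporting program and is deliberately skipped.

```python
def gedcom_version(lines):
    """Return the value of HEAD > GEDC > VERS, or None if it is absent."""
    in_head = False
    parent = None  # most recent level-1 tag inside HEAD
    for raw in lines:
        # Drop a leading byte-order mark and surrounding whitespace.
        parts = raw.lstrip("\ufeff").strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        level, tag = int(parts[0]), parts[1]
        value = parts[2].strip() if len(parts) > 2 else ""
        if level == 0:
            if in_head:
                break  # HEAD has ended without a GEDC/VERS pair
            in_head = tag == "HEAD"
            continue
        if not in_head:
            continue
        if level == 1:
            parent = tag
        elif level == 2 and parent == "GEDC" and tag == "VERS":
            return value
    return None


# Made-up header for illustration; real exports carry more lines.
HEADER = [
    "0 HEAD",
    "1 SOUR EXAMPLE_APP",
    "2 VERS 8.0",
    "1 GEDC",
    "2 VERS 5.5.1",
    "2 FORM LINEAGE-LINKED",
    "1 CHAR UTF-8",
    "0 @I1@ INDI",
]
print(gedcom_version(HEADER))
```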

                  Like Word most genealogy programs use their own database formats and in my experience when the software is updated they provide the means to open and convert files from previous versions of their own software so that they are still usable. The developers recognise that if they fail to provide a means to transfer data between users using different family tree programs then it will impact on their business so they provide GEDCOM transfer. It is a lot better than nothing but is not perfect.

                  I realise this discussion does not help your task, sorry.

                  David

                  • #10
                    But I agree with DavidNewton that there could be easier ways to search for the names/compare the files.

                    One way would be to upload the gedcom to its own file within the genealogy software that you use, then run a report of all persons in the file. Ditto for the comparison file. Then use, say, a spreadsheet to upload and compare.

                    Another way would be to ask your genealogy software to do the comparison. Family Historian will do that, and suggest the matches. PAF will also do it - if you don't have a copy of it, you would need to locate one, and upload two gedcoms to it.

                    IMO - sooooo, much easier than trying to take apart a gedcom manually!

                    • #11
                      Thanks for that, if I was a bit harsh there, I apologise.
                      I have written the program to extract the data from the Gedcom file and the data from all but 30 records, or so, is fine.
                      I have various oddities which are all down to the 'dodgy' records being quite different from the others.
                      It is more a mystery why the creation software allows this to happen rather than the layout of the file.
                      2 or in 1 case, 3 level 2 RIN numbers being an example.
                      I have 4 records which do not have the DEAT keyword in so that really messes up my system for extracting DOB and DOD.
                      As the file format appears to have a random layout, the only way to decode it is to trap the keywords; if main keywords are missing, it messes the whole file up.
                      There are a few others which I can program to ignore but I would prefer to do a full import of all the records.
                      I tried to find out from the file format spec whether there is a purpose for multiple RINs but I couldn't see it mentioned. Having said that, life is too short to read the full spec docs.
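
One way to stop a missing DEAT from upsetting the extraction is to key everything off the level numbers instead of the order of the keywords. The Python sketch below is only an illustration under that assumption: the function name, the dictionary layout and the sample names are invented. Each level 2 DATE is credited to whichever level 1 tag came most recently, so an absent event simply leaves its date as None, and level 2 RIN or _UID lines are ignored.

```python
def extract_individuals(lines):
    """Map each INDI identifier to its name, birth date and death date."""
    people = {}
    person = None  # record being filled, or None outside an INDI record
    event = None   # most recent level-1 tag in the current record
    for raw in lines:
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        level = int(parts[0])
        if level == 0:
            event = None
            if len(parts) == 3 and parts[2].strip() == "INDI":
                person = {"name": None, "birth": None, "death": None}
                people[parts[1]] = person
            else:
                person = None  # HEAD, FAM, NOTE, TRLR and so on
            continue
        if person is None:
            continue
        tag = parts[1]
        value = parts[2].strip() if len(parts) > 2 else ""
        if level == 1:
            event = tag
            if tag == "NAME":
                person["name"] = value
        elif level == 2 and tag == "DATE":
            # Keep only the first DATE seen under each event.
            if event == "BIRT" and person["birth"] is None:
                person["birth"] = value
            elif event == "DEAT" and person["death"] is None:
                person["death"] = value
    return people


# Invented sample: the second person has no DEAT tag at all.
RECORDS = [
    "0 @I1@ INDI", "1 NAME John /Smith/",
    "1 BIRT", "2 RIN MH:IF1", "2 DATE 1854",
    "1 DEAT", "2 DATE 1925",
    "0 @I2@ INDI", "1 NAME Mary /Smith/",
    "1 BIRT", "2 DATE 1860",
    "1 OCCU", "2 PLAC xxx",
    "0 TRLR",
]
people = extract_individuals(RECORDS)
print(people)
```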

                      • #12
                        Is it possible to make my posts editable on here?

                        • #13
                          Originally posted by StudioSoft
                          Is it possible to make my posts editable on here?
                          I am afraid that following a great deal of abuse of the system it's no longer possible. Sorry. :(

                          (I don't see anything wrong with your post!! You are quite entitled to your own opinion.)
                          Caroline
                          Caroline's Family History Pages
                          Meddle not in the affairs of Dragons, for you are crunchy and good with ketchup.

                          • #14
                            To add to the feeling of frustration and annoyance, I'm still thinking about this 2 RIN problem. I am not convinced that I can interpret the documented standard, so I constructed a very short ged file with 1 individual, a level 1 RIN attached to the individual and a level 2 RIN attached to the BIRT. I installed the Chronoplex GEDCOM validator and, after a few complaints about my header, the file was passed with no problems. I then uploaded the same file to the GED Inline website and requested validation. This elicited two further comments about the header and the following:


                            *** Line 20: Tag RIN is not allowed under BIRT


                            The conclusion here is that even the experts producing GEDCOM validation software cannot agree whether or not a 2-level RIN is valid.


                            Regarding the missing DEAT tags there are two possibilities that come to mind: the obvious is that the individual is still living but the more likely is that the elusive date of death has not been found. Putting in estimated DoDs is probably not a good idea and in any case may not be feasible during the process of extraction.




                            David

                            • #15
                              Caroline - Re. editing, I wanted to add something to a post but found I wasn't able to. It isn't a problem because like everyone else, I write perfect code and never make spilling mistooks.

                              David - I thought about the no DEAT tag but I believe that if there is no DEAT tag then there is no level 2 DATE following it either, which would make sense.

                              The record in question has a BIRT tag and then a second level 2 DATE without a DEAT tag - I assume that the file has been corrupted somehow but was looking to establish a standard.
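
Records like that one can be picked out automatically before the main import. The Python sketch below is a hypothetical check, not a validator: it counts how often a chosen level 2 tag (DATE by default, but RIN works too) appears under each level 1 line and reports any event that holds it more than once. The function name and sample identifiers are invented.

```python
from collections import Counter


def repeated_subtags(lines, subtag="DATE"):
    """Return (record id, event tag, event line no, count) where count > 1."""
    counts = Counter()
    record = None  # identifier of the current level-0 record
    event = None   # (tag, line number) of the current level-1 line
    for line_no, raw in enumerate(lines, 1):
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        level = int(parts[0])
        if level == 0:
            record, event = parts[1], None
        elif level == 1:
            event = (parts[1], line_no)
        elif level == 2 and event is not None and parts[1] == subtag:
            counts[(record, event[0], event[1])] += 1
    return [key + (n,) for key, n in counts.items() if n > 1]


# Invented sample: @I9@ has two DATE lines under BIRT and no DEAT tag.
SUSPECT = [
    "0 @I9@ INDI",
    "1 BIRT",
    "2 DATE 1854",
    "2 DATE 1925",
    "0 @I10@ INDI",
    "1 BIRT",
    "2 DATE 1860",
    "1 DEAT",
    "2 DATE 1930",
]
print(repeated_subtags(SUSPECT))
```

Whether the second DATE in such a record was meant to be a death date cannot be decided by the program, so flagged records are probably best reviewed by hand.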

                              • #16
                                I have just put the Gedcom file through the validator you suggested and I get 477 warnings.
                                I think we have just solved the problem, thanks.

                                • #17
                                  Originally posted by StudioSoft
                                  I have just put the Gedcom file through the validator you suggested and I get 477 warnings.
                                  I think we have just solved the problem, thanks.
                                  Which software app produced the GEDCOM file?

                                  • #18
                                    The Gedcom was created via the genesreunited website.
                                    I am not sure if they use a standard tree program or whether individual members use their own choice. The user in question readily admits that he is not really computer savvy but I find it strange that any software could save a file containing so many errors.