[TSC-public] Sources

c.n.maclean at gmail.com
Sat Sep 6 05:15:48 CDT 2014


Overview: I propose a Source as a core entity in the data model.

Importantly, a Source is defined as BOTH the standard identifiers that allow the source to be identified and located AND the information contained within the source.
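
As a rough illustration of this two-part definition, a Source could be sketched in Python as below. This is only a sketch under stated assumptions: the class names, the field names (title, repository, reference) and the sample record are all hypothetical, and none of them come from any FHISO document.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SourceIdentifiers:
    """Hypothetical fields that let a source be identified and located."""
    title: str
    repository: str
    reference: str


@dataclass(frozen=True)
class Source:
    """A Source carries BOTH its identifiers AND the information it contains."""
    identifiers: SourceIdentifiers
    # Transcribed information, kept as (field, value) pairs exactly as read.
    content: Tuple[Tuple[str, str], ...]


# Illustrative, made-up record; not taken from any real register.
birth = Source(
    identifiers=SourceIdentifiers(
        title="Register of births (illustrative)",
        repository="Example Record Office",
        reference="vol. 1, entry 23",
    ),
    content=(("child", "John Smith"), ("father", "William Smith")),
)
```

The point of the sketch is only that the identifiers and the transcribed content travel together as one unit; how the content is best structured is a separate question.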


When I started genealogy (fairly recently), I used spreadsheets to capture the information within sources. I did this to make it easier to find information, and so that I didn't have to continually try to decipher the handwriting within the original sources. The deciphering would largely be a one-off job. Of course, I would still have the images of the sources to refer back to if required.

I thought it would be useful to use genealogy software instead of spreadsheets. However, when I looked for a way to copy my spreadsheet information to the software, I found that this wasn't available. Instead, I had to attach all information to individuals (possibly with references to sources). There was no way for me to enter all the information from a single source independently of other sources; or for the software to show me directly the information which I had captured from an individual source.

On browsing the internet, I came across BetterGedcom and other related material, which emphasised the distinction between evidence and conclusion. I realised that this was what I had been trying to achieve.


My desire is that my genealogy software will allow me to enter all of the information from an individual source in one place, independently of any interpretation which I might later place on that information (e.g. this is my gg-grandfather), and independently of all other information which I have entered into the software. Once I have entered this information, I should never have to modify it again, except in the rare case where I find that I have made a transcription error, for example.

The interpretations (conclusions) which I place on top of the sources will continually be changing. However, the underlying sources will remain unchanged for ever.
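
A minimal sketch of this separation, assuming Python dataclasses: the Source is frozen once entered, while a Conclusion that refers to it can be revised at will. The names Source, Conclusion and try_edit, and the sample data, are hypothetical and only for illustration.

```python
from dataclasses import dataclass, field, replace, FrozenInstanceError
from typing import List


@dataclass(frozen=True)
class Source:
    """Evidence: entered once, then treated as read-only."""
    source_id: str
    transcription: str


@dataclass
class Conclusion:
    """Interpretation: refers to sources by id and may change over time."""
    statement: str
    supporting_source_ids: List[str] = field(default_factory=list)


def try_edit(source: Source) -> bool:
    """Return True if an in-place edit of the source succeeded."""
    try:
        source.transcription = "changed"
    except FrozenInstanceError:
        return False
    return True


census = Source("S1", "William Smith, head, aged 40")
claim = Conclusion("S1 records my gg-grandfather", ["S1"])

# The interpretation is revised; the source itself is untouched.
claim.statement = "S1 records my gg-grandfather's brother"

# Exceptional case: a transcription error is fixed by creating a corrected copy.
corrected = replace(census, transcription="William Smyth, head, aged 40")
```

Here try_edit(census) returns False because the frozen dataclass rejects the assignment, whereas the correction of a transcription error is made deliberately through replace, which leaves the original object as it was.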

Correspondingly, I would (personally) hope that the FHISO work will support, encourage, promote and even prefer this style of working.


When developing a data model, we have to determine the correct entities and concepts which will form the core of the model.

For me, a Source (incorporating both the identifiers for the source *and* the information contained therein) is an essential candidate for being a core entity.

Conceptually (when defined in the way I suggested above), a Source is a very coherent unit. If you have such a Source for a birth record, for example, no-one is going to misunderstand what is meant. It is important for the core concepts to be clear, coherent, cohesive and not open to misunderstanding.

If a genealogist works by entering their sources first (presumably this is the “recommended” approach), then they immediately have a coherent structure into which they can put their source information. They only need to enter this information once, at the start; they should then never need to modify it again. It can also be entered fully independently of all other information in their database; no external information or interpretation is needed.

Also, we all have to extract information from our sources, whether explicitly, or implicitly when we attach it to the conclusion-level people in our existing genealogy software. So I'm not really proposing any extra work for the genealogist; instead, I'm proposing to structure the extraction in a more useful manner.

For information from data suppliers (Ancestry, etc.), this seems to tie in directly with the records which they provide to consumers. Similarly, users could share Sources as (more-or-less) interpretation-free constructs, with no conclusions involved.

So I think that there are many benefits to this Source approach.


BTW, I think that one consequence of this approach would be that a suitable manner to encode the information within a Source would be personae and “eventa”. Each Source would be like a mini family tree, with structure very similar to the conclusion-level persons and events. However, I don't want to get into this directly within this posting.
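
Purely as a hint of what such an encoding might look like (and no more than that), the following Python sketch treats the content of one Source as a small set of personae and eventa. Every name in it (Persona, Eventum, SourceContent, names_by_role) and the sample data are assumptions made for illustration, not a proposed design.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Persona:
    """A person exactly as they appear in this one source."""
    persona_id: str
    name_as_written: str


@dataclass(frozen=True)
class Eventum:
    """An event as recorded in this one source."""
    kind: str
    date_as_written: str
    roles: Tuple[Tuple[str, str], ...]  # (role, persona_id) pairs


@dataclass(frozen=True)
class SourceContent:
    """The 'mini family tree' held inside a single Source."""
    personae: Tuple[Persona, ...]
    eventa: Tuple[Eventum, ...]

    def names_by_role(self, eventum: Eventum) -> Dict[str, str]:
        """Map each role in the eventum to the name written in the source."""
        by_id = {p.persona_id: p for p in self.personae}
        return {role: by_id[pid].name_as_written for role, pid in eventum.roles}


# Illustrative, made-up data.
birth_event = Eventum("birth", "3 May 1850", (("child", "p1"), ("father", "p2")))
content = SourceContent(
    personae=(Persona("p1", "John Smith"), Persona("p2", "Wm. Smith")),
    eventa=(birth_event,),
)
roles = content.names_by_role(birth_event)
```

Nothing here links a persona to a conclusion-level person; that link would live in the conclusion layer, outside the Source.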


Calum