Re: GEDCOM to JSON-LD: Request for Feedback

Thank you, everyone, for responding. I think our primary objective at
this point is finding a relatively clean way of mapping GEDCOM fields
(which are problematic) to RDF URIs, and maybe it is through per-node
@context(s), as you've suggested, Nicholas.
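
To make the per-node @context idea concrete, here is a rough Python sketch.
It is only an illustration under stated assumptions, not our actual
converter: the http://example.org/ IRIs, the SCOPED_TERMS table, and the
helper names parse and to_jsonld are all hypothetical. It shows how an
overloaded tag such as DATE could resolve to a different IRI depending on
whether it sits under BIRT or DEAT.

```python
import json
import re

# Hypothetical IRIs for illustration only; not a published ontology.
VOCAB = "http://example.org/gedcom#"
BASE = "http://example.org/id/"

# Hypothetical per-parent term maps: the same GEDCOM tag gets a different
# IRI depending on the tag of the node that contains it.
SCOPED_TERMS = {
    "INDI": {"NAME": VOCAB + "name", "BIRT": VOCAB + "birth", "DEAT": VOCAB + "death"},
    "BIRT": {"DATE": VOCAB + "birthDate", "PLAC": VOCAB + "birthPlace"},
    "DEAT": {"DATE": VOCAB + "deathDate", "PLAC": VOCAB + "deathPlace"},
}

# level, optional @XREF@, tag, optional value
LINE = re.compile(r"^(\d+)\s+(?:(@[^@]+@)\s+)?(\w+)(?:\s+(.*))?$")


def parse(lines):
    """Turn GEDCOM lines into a tree of {tag, xref, value, children} dicts."""
    root = {"tag": "ROOT", "children": []}
    stack = [root]  # stack[i] is the open node at level i - 1
    for raw in lines:
        m = LINE.match(raw.strip())
        if not m:
            continue  # skip lines this sketch cannot read
        level, xref, tag, value = int(m.group(1)), m.group(2), m.group(3), m.group(4)
        node = {"tag": tag, "xref": xref, "value": value, "children": []}
        del stack[level + 1:]  # close nodes deeper than the new node's parent
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


def to_jsonld(node):
    """Emit one JSON-LD node object carrying its own local @context."""
    terms = SCOPED_TERMS.get(node["tag"], {})
    context = {c["tag"]: terms[c["tag"]] for c in node["children"] if c["tag"] in terms}
    out = {"@context": context}
    if node["xref"]:
        out["@id"] = BASE + node["xref"].strip("@")
    for child in node["children"]:
        if child["tag"] not in terms:
            continue  # unmapped tags are dropped in this sketch
        # Simplification: repeated tags overwrite each other, and a line
        # that has both a value and children keeps only the children.
        out[child["tag"]] = to_jsonld(child) if child["children"] else child["value"]
    return out


SAMPLE = [
    "0 @I1@ INDI",
    "1 NAME John /Smith/",
    "1 BIRT",
    "2 DATE 1 JAN 1800",
    "2 PLAC Vermont",
    "1 DEAT",
    "2 DATE 1870",
]

doc = to_jsonld(parse(SAMPLE)[0])
print(json.dumps(doc, indent=2))
```

The trade-off is verbosity: every nested node repeats a small @context, but
the GEDCOM tags themselves stay untouched, which should make it easier to
check a round trip against the source files.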

We have data on a few hundred thousand individuals stored in GEDCOM that
we'd like to open up to the web and use for other analyses (eliminating
duplicates/merges, etc.). In relation to what Tom Morris mentioned about
adoption and the legacy of GEDCOM... yeah, lots to sort through, but our
team consists of FHISO members and we have contacts with FamilySearch and
the teams working on GEDCOMX. Tom, you are right that a new data model is
in order, not fiddling with legacy GEDCOM. You can actually take a look
at some of the data modeling being done by our group here (as of a few days
ago): https://github.com/earlysaints/database/tree/master/ontology. Again,
this is very early work, but we appreciate the dialog.

If you're interested in contributing more directly, contact us via
earlysaints@gmail.com.


Cheers!


PS: Nicholas, here's the GEDCOMX RDF Serialization page on their wiki:
https://github.com/FamilySearch/gedcomx/wiki/RDF-Serialization


On Thu, Apr 23, 2015 at 10:13 AM, Nicholas Bollweg <nick.bollweg@gmail.com>
wrote:

>
> Here is an interesting middle ground: a json format (apparently without
> formal json schema) for the GEDCOMX model:
>
> https://github.com/FamilySearch/gedcomx/blob/master/specifications/json-format-specification.md
>
> I found a link to an rdf integration page, but it 404ed.
>
> I'm just going out on a limb here to think that GEDCOMX somehow maps to
> plain GEDCOM. If that's the case, this is an interesting opportunity to see
> how to use linked data to elevate an existing document model to something
> more computable, and be able to validate the round trip with a large corpus
> of documents.
>
> As we saw with FHIR, turning an existing spec into linked data wholesale
> might fail due to some concept too alien to linked data to map cleanly,
> but could guide what a future version might provide. Id templates, etc.
>
> As has been discussed, "GEDCOMX-LD" would likely require per-node
> @context, as the terms are overloaded... Or thinking about a single GEDCOM
> source document as a number of linked data documents, with uri references
> between them. Not ideal, but more concise, perhaps.
>
>
> On 10:04, Thu, Apr 23, 2015 Dave Longley <dlongley@digitalbazaar.com>
> wrote:
>
> On 04/22/2015 12:36 PM, todd.d.robbins@gmail.com wrote:
> > Hello all,
> >
> > I'm new to the list but would love your feedback on our effort to
> > convert and serialize GEDCOM data [1] as JSON-LD. Take a look at our
> > research notes and source code here:
> >
> > https://github.com/earlysaints/database/blob/master/gedcom2jsonld.md
> >
> > We're particularly interested in approaches to representing content
> > and nested nodes. This is the beginning of our effort, but we wanted
> > to get the larger community involved now to get a better sense of the
> > challenges other groups have faced when fitting certain data models
> > into JSON-LD.
>
> I likely won't have much time to review the document above, but I will
> make a recommendation:
>
> Don't keep any allegiance to however the data was previously modeled.
> Model it in a proper Linked Data fashion moving forward and create a
> tool that can perform whatever mappings are necessary. It's not going to
> be worth your time if you just try to take the existing GEDCOM data
> model and port it to JSON-LD. I think that would be a mistake because:
>
> 1. I don't know what's to gain by keeping it, a simpler converter tool?
> You have to write a tool anyway, so focus on creating good output, not
> the complexity of the tool. If you need to, in the interim, write a tool
> that converts back to GEDCOM for legacy applications.
>
> 2. It will inhibit your ability to move quickly.
>
> 3. It may result in something unnatural to people familiar with Linked
> Data -- and make them shy away from it.
>
> 4. You have a one time opportunity to make improvements and fix data
> modeling issues from the past. It's not like you're taking legacy JSON
> and just adding an LD layer -- you're switching the format entirely.
>
> 5. The more natural it feels as Linked Data, and the more you reuse
> existing vocabularies (e.g., schema.org) where appropriate, the more
> adoption you'll see -- and the more innovation on top of it! This can be an
> exciting change that opens many doors to accessing and improving
> genealogical data, or it can be "GEDCOM as JSON-LD".
>
> I recommend you talk to 23andme and see if they'd be interested in a new
> JSON-LD format for genealogical data, and what it might mean for Linked
> Data (or Big Data) on the Web and their research.
>
> --
> Dave Longley
> CTO
> Digital Bazaar, Inc.
> http://digitalbazaar.com
>
>
>


-- 
Tod Robbins
Digital Asset Manager, MLIS
todrobbins.com | @todrobbins <http://www.twitter.com/#!/todrobbins>

Received on Thursday, 23 April 2015 22:38:19 UTC