- From: Manu Sporny <msporny@digitalbazaar.com>
- Date: Tue, 18 Sep 2012 13:24:16 -0400
- To: Linked JSON <public-linked-json@w3.org>
- CC: RDF WG <public-rdf-wg@w3.org>
Thanks to François for scribing! The minutes from last week's call are
now available here:
http://json-ld.org/minutes/2012-09-11/
Full text of the discussion follows including a link to the audio
transcript:
--------------------
JSON-LD Community Group Telecon Minutes for 2012-09-11
Agenda:
http://lists.w3.org/Archives/Public/public-linked-json/2012Sep/0004.html
Topics:
1. ISSUE-159: Add specifying @language to expanded form
Chair:
Manu Sporny
Scribe:
François Daoust
Present:
François Daoust, Manu Sporny, Markus Lanthaler, Niklas Lindström,
Stéphane Corlosquet, Lin Clark
Audio:
http://json-ld.org/minutes/2012-09-11/audio.ogg
François Daoust: [Manu going through the agenda. A couple of
issues may not be resolved today as there are too many proposals
on the table]
François Daoust is scribing.
Topic: ISSUE-159: Add specifying @language to expanded form
Manu Sporny: https://github.com/json-ld/json-ld.org/issues/159
Manu Sporny: Issue has to do with round-tripping language-map
stuff.
... We added support for Drupal community and Wikidata
community.
... No context in expanded form, otherwise we'd have to
interpret this in very weird ways.
... Question I asked the Wikidata community was "Why not work
in compact form?"
... Having languages as keys gives direct access to data
... The problem is now to define how the expanded form is
generated from the compact form so that we can get back to the
compact form afterwards.
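To make the round-tripping problem concrete, here is a minimal Python sketch. It assumes, purely for illustration, that expansion turns each language-map entry into a value object carrying @value and @language. The helper names expand_language_map and compact_to_language_map and the sample values are hypothetical and not part of any JSON-LD API.

```python
def expand_language_map(lang_map):
    """Turn {"en": "Foo", "de": "Bar"} into a list of value objects."""
    return [
        {"@value": value, "@language": lang}
        for lang, value in sorted(lang_map.items())
    ]


def compact_to_language_map(values):
    """Rebuild a language map; values without @language cannot be placed."""
    lang_map, leftovers = {}, []
    for item in values:
        if isinstance(item, dict) and "@language" in item:
            lang_map[item["@language"]] = item["@value"]
        else:
            leftovers.append(item)
    return lang_map, leftovers


original = {"en": "Foo", "de": "Bar"}  # sample data, assumed
expanded = expand_language_map(original)
rebuilt, leftovers = compact_to_language_map(expanded)
```

Nothing in the expanded list records that the values came from a language map. That missing piece of information is what the rest of the discussion tries to preserve, or to do without.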
Markus Lanthaler:
https://github.com/json-ld/json-ld.org/issues/159#issuecomment-8455585
Markus Lanthaler: If you have @language in expanded form, there
might be collisions with @language that are already there or with
properties that are of other types and do not accept @language.
... See comment in github issue
... One option to solve this would be to keep a @context in
expanded form, but not what we'd like to have.
Niklas Lindström: Precedence is good in any case. Even in
compact form.
Manu Sporny: Yes. If we have precedence, does it address your
concern Markus?
Stéphane Corlosquet: are you guys saying that in any case, any
typed value could not have a language?
Markus Lanthaler: for a plain literal, it wouldn't because you
cannot add @language to a plain literal.
Niklas Lindström: we understand we're diverging from RDF here
[scribe assist by Stéphane Corlosquet]
... It's strange to have language information in expanded
form. The only way to describe this in RDF is to have a named
graph.
... (scribe missed details)
Manu Sporny: "term": { "@language": {"en": ..., "de": ...}}
Manu Sporny: "http://foo.bar/vocab#term": { "@language": {"en":
..., "de": ...}}
Manu Sporny: wondering if we could do something like the snippet
I just pasted
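A rough Python sketch of the 1-to-1 idea in the snippet above: only the term is expanded to its IRI, and the language map is carried over untouched under an "@language" key. The function names, the "Foo"/"Bar" sample values, and the use of "@container": "@language" to mark a term as holding a language map are assumptions made for this illustration.

```python
def expand_one_to_one(node, context):
    """Expand term keys to IRIs; keep language maps intact under @language."""
    expanded = {}
    for term, value in node.items():
        definition = context[term]
        if definition.get("@container") == "@language":
            value = {"@language": dict(value)}
        expanded[definition["@id"]] = value
    return expanded


def compact_one_to_one(node, context):
    """Reverse the mapping: IRI back to term, unwrap the @language wrapper."""
    iri_to_term = {d["@id"]: t for t, d in context.items()}
    compacted = {}
    for iri, value in node.items():
        if isinstance(value, dict) and set(value) == {"@language"}:
            value = dict(value["@language"])
        compacted[iri_to_term[iri]] = value
    return compacted


# Hypothetical context and data for the sketch.
context = {
    "term": {"@id": "http://foo.bar/vocab#term", "@container": "@language"}
}
compact = {"term": {"en": "Foo", "de": "Bar"}}
expanded = expand_one_to_one(compact, context)
```

Because the wrapper survives expansion, compaction needs no guessing about which values belong in a language map.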
Markus Lanthaler: The problem is that we're trying to express
data that is not there. It's metadata.
Niklas Lindström: The expanded form is an abstract triple
representation and what we do with language maps (and id-maps for
that matter) is just reify indexing.
... Only if we stay within JSON-LD and expand/compact would
you get round-tripping.
Manu Sporny: The concern in the Drupal community is that you
could get something different out.
Niklas Lindström: The only thing expanded are terms. That's the
only expansion we've talked about. Perhaps that's a good concept.
Manu Sporny: I don't know if it ends up becoming a different type
of form for JSON-LD.
Stéphane Corlosquet: Niklas, you were talking about
round-tripping in RDF.
... It wouldn't be a concern in Drupal because it's never used
internally.
... Our goal is not necessarily to output RDF in the end.
... What we'd like to do is use the compact form, expand it
and process it.
... We just want to have the language in the expanded form.
... Getting the same data from compaction is not exactly our
use case.
... You guys may want to recompact it again and get the same
data, but not exactly what we need in practice.
Niklas Lindström: I can understand your use case. I touched upon
it during a RDFa to JSON-LD workshop.
... If we want to support it, we should do it via the notion
of term expansion, not full expansion.
Manu Sporny: Just a quick explanation about the Drupal use case.
Every Drupal site has a slightly different context.
... Tags can have different information associated with them
across Drupal sites.
Stéphane Corlosquet: can be anything, 'tags' is just an example
Manu Sporny: Those tags are kind of b-nodes.
... When two Drupal sites share data, one of them is going to
export data as JSON-LD, using its context, probably expanding it.
... The targeted Drupal site will process the received data,
using the expanded form as input and compacting using the target
context.
... The idea that we need to reconstruct the language map is a
pretty strong requirement.
... I also think that both Niklas and Markus have very strong
points.
Manu Sporny: "http://foo.bar/vocab#term": { "@language": {"en":
..., "de": ...}}
... The only solution that I can see working that doesn't have
the issue Markus raised in the beginning is the idea I shared on
IRC.
... I don't see any issue with this, but I may miss something.
Markus Lanthaler: alternative: { "@context": { "langmap":
"example.com/vocab/term#" }, "langmap:de": ..., "langmap:en": ...
}
Markus Lanthaler: perhaps additionally define "langmap:de": {
"@language": "de" } in context or add context inline
Markus Lanthaler: I don't see an issue with that but proposing
another alternative on IRC for Stéphane.
Lin Clark: hey all, I'm on call now as well
Markus Lanthaler: langmap:de - example.com/vocab/term#de
Markus Lanthaler: langmap:it - example.com/vocab/term#it
Markus Lanthaler: example.com/vocab/term#it
Markus Lanthaler: basically, you'd have different terms for
different properties.
Stéphane Corlosquet: How would you re-compact this in the end?
Markus Lanthaler: { "@context": { "langmap":
"example.com/vocab/term#" }
Markus Lanthaler: langmap:LANGUAGE
Markus Lanthaler: with the context just pasted on IRC, you would
just re-generate the initial data
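As a sketch of why this alternative round-trips, the following Python treats "langmap" as an ordinary prefix, so each language becomes its own property IRI and compaction is plain prefix substitution. The helper names and the "Foo"/"Bar" values are hypothetical; the prefix IRI is copied as written on IRC.

```python
PREFIXES = {"langmap": "example.com/vocab/term#"}


def expand_keys(node, prefixes):
    """Replace 'prefix:suffix' keys with the prefix IRI plus the suffix."""
    expanded = {}
    for key, value in node.items():
        prefix, sep, suffix = key.partition(":")
        if sep and prefix in prefixes:
            key = prefixes[prefix] + suffix
        expanded[key] = value
    return expanded


def compact_keys(node, prefixes):
    """Shorten keys that start with a known prefix IRI."""
    compacted = {}
    for key, value in node.items():
        for prefix, base in prefixes.items():
            if key.startswith(base):
                key = prefix + ":" + key[len(base):]
                break
        compacted[key] = value
    return compacted


data = {"langmap:de": "Bar", "langmap:en": "Foo"}  # sample values, assumed
expanded = expand_keys(data, PREFIXES)
```

The cost, raised later in the call, is that every language turns into a distinct predicate.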
Lin Clark: That sounds a lot like the proposal Manu had made
initially
Manu Sporny: There's a downside (missed by scribe) to that, which
explains why we dropped the idea in the end.
... The only reason why we want it in expanded form is to be
able to recompact it in lossless form.
... This idea of being able to tell whether something came
from a language map is to reconstruct the same structure in the
end.
... There may be times that you express values in expanded
form where you didn't want them to be necessarily put back in
language maps.
Niklas Lindström: The question is whether data coming from
language-based data can be reconstructed. Any deviation from
that should not use language maps to compact because that would
always give weird results.
... If you start mixing from various sources, you may have
titles in English but description in Italian, then properties
would fall in different buckets if you use language maps.
Manu Sporny: I'm proposing this: "http://foo.bar/vocab#term": {
"@language": {"en": ..., "de": ...}} because... 1-to-1 mapping
term
[Markus and Manu discussing examples of expansion/compaction]
Niklas Lindström: I wonder if the expanded form you're proposing
here would solve the problem of combining two sources.
... It seems to require things from the compaction algorithm.
Manu Sporny: Let's say you have two documents that use the same
IRI term and you expand.
... Without a flag and with the rank algorithm that we have,
there wouldn't be any problem.
... The term with the language map would be separated from the
term without the language map.
... That's for when we don't flatten.
... If we do flatten, (scribe missed that), that would address
the issue.
Niklas Lindström: I'd rather we put information in the different
buckets in expanded form so that compaction can be done
deterministically
Manu Sporny: and it's a fairly expensive operation when the data
gets bigger. I agree with you Niklas. If we could simplify, we
should.
... It turns out that, each time we need to look into details,
we end up with things that are fairly complex. The ranking
algorithm is a good example of this. It becomes impossible to
know what will happen without understanding the algorithm itself.
... All that to say that I agree in principle, but I'm worried
that the algorithm will become more complex than expressing a
1-to-1 mapping with language maps.
Niklas Lindström: The problem is that we're trying to express
something that we cannot even express in our data model.
Lin Clark: Are there differences between RDF data model and
JSON-LD data model?
... I saw discussions from Gregg
Manu Sporny: This is kind of a corner case. We don't make use of
the differences for the time being, although there is a tiny
difference, indeed.
... We just have to be very careful if we say JSON-LD uses RDF
data model since that's not entirely true.
Niklas Lindström: {'@language': 'en', '@id':
'http://example.com/tags/foo', 'label': ' Foo'}
Niklas Lindström: {'@language': 'de', '@id':
'http://example.com/tags/baz', 'label': ' Baz'}
Niklas Lindström: Example on IRC. Different resources because
different IDs.
... The node themselves have not, in RDF terms, any language
expressed.
Niklas Lindström: {'@language': 'en', '@id':
'http://example.com/tags/foo', 'label': ' Foo'}
Niklas Lindström: {'@language': 'de', '@id':
'http://example.com/tags/foo', 'label': ' Foo'}
... You can infer that "Foo" seems to be in English.
... but that's all.
... Now consider the second example, where IDs are the same.
Niklas Lindström: {'dc:language': 'en', '@id':
'http://example.com/tags/foo', 'label': ' Foo'}
... We have a problem here because it's not clear whether we
want to reify the language. If we want to say that the node is
somehow intrinsically associated with English, then you should
use 'dc:language'.
Manu Sporny: "term": {"en": "Foo", "de": "Bar"}
... That is quite different from saying that there is an English
label for this.
Manu Sporny: On the other hand, we need to account for very simple
examples such as the one I just pasted.
Niklas Lindström: {'dc:language': 'en', '@id':
'http://example.com/tags/foo', 'label': ' Foo'}
Niklas Lindström: {'@language': 'en', '@id':
'http://example.com/tags/foo', 'label': ' Foo'}
Niklas Lindström: {'@language': 'de', '@id':
'http://example.com/tags/foo', 'label': ' Foo'}
Niklas Lindström: actually, that's simple and straightforward.
Stéphane Corlosquet: I just wanted to jump on Niklas's comments.
... When you use 'dc:language', you say that the resource is
in English
... (scribe missed description because of noise)
Niklas Lindström: you have two different resources, one being a
translation of the other.
Lin Clark: No, they don't want to have separate graphs.
... Different properties in different languages.
... You would have an author field on the node. That field
could point to Stéphane for the French version and to myself in
the English version.
... I understand that in the RDF model, it would be understood
as two different graphs.
... If we start to introduce complex syntax, people will get
lost, and it's just not worth it for the 2-3 people that
understand this.
Manu Sporny: We understand the need to have simple ways of
accessing the data.
Niklas Lindström: I object to this. This has nothing to do with
simplicity of accessing the data, but with simplicity of modeling
the data.
Manu Sporny: I don't think it applies to the Drupal use case. I
don't think they should have to change data modeling for this.
Niklas Lindström: I am not a fundamentalist here, we have to
find a pragmatic solution to the issue.
Lin Clark: The translation to us is not a different resource.
Niklas Lindström: but there are two different translations.
Manu Sporny: aside - "http://foo.bar/vocab#term": { "@language":
{"en": ..., "de": ...}} also allows the Drupal folks to work w/
expanded form, if they need to.
Niklas Lindström: ... {"@id": "/resource", "translation": {"en":
{"author": {"@id": "/lin"}}, "de": {"author": {"@id":
"/stephane"}}}}
Niklas Lindström: you don't have to describe the translation in
any more detail than in the code I just pasted.
Markus Lanthaler: alternative {"@id": "/resource", "en":
{"author": {"@id": "/lin"}}, "de": {"author": {"@id":
"/stephane"}}}
Markus Lanthaler: where "en" is a property like
example.com/vocab/languages/en
... You can have a property that combines translations
Markus Lanthaler: along the same lines as Niklas
Markus Lanthaler: alternative {"@id": "/resource", "en":
{"author": {"@id": "/lin"}}, "de": {"author": {"@id":
"/stephane"}}}
Lin Clark: I actually suggested that to our multilingual
initiative, but they put so much work in it and it's already
almost done that I don't think that we can or we should change
our data model at this point.
Niklas Lindström: From an implementation perspective, it's more
or less the same.
Lin Clark: They're doing a lot of stuff in the multilingual
initiative that I'm not involved with, so I can't speak
particularly to all the details.
... I don't think we can convince everyone that it's worth it
because of JSON-LD.
Markus Lanthaler: how would it help to turn the structure
around? (assuming we could) [scribe assist by Stéphane Corlosquet]
Markus Lanthaler: but you wouldn't mind if "en" would not expand
to full IRIs in expanded form.
Markus Lanthaler: {"@id": "/resource", "author": {"en": {"@id":
"/lin"}, "de": {"@id": "/stephane"}}}
Markus Lanthaler: {"@id": "/resource", "author": {
"http://example.com/en": {"@id": "/lin"},
"http://example.com/de": {"@id": "/stephane"}}}
Markus Lanthaler: Something like this would work for you, right?
... No big deal if it becomes something like this in expanded
form, right?
Lin Clark: Then can it compact back to the other form?
Markus Lanthaler: yes, you wouldn't even need language map for
that.
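The two shapes under discussion here, languages before properties and properties before languages, hold the same information. A small Python sketch of turning one into the other, under the assumption that the set of language keys (or property keys) is known in advance; the function names are hypothetical.

```python
def to_property_first(node, languages):
    """{"en": {"author": x}} becomes {"author": {"en": x}}."""
    result = {k: v for k, v in node.items() if k not in languages}
    for lang in languages:
        for prop, value in node.get(lang, {}).items():
            result.setdefault(prop, {})[lang] = value
    return result


def to_language_first(node, properties):
    """{"author": {"en": x}} becomes {"en": {"author": x}}."""
    result = {k: v for k, v in node.items() if k not in properties}
    for prop in properties:
        for lang, value in node.get(prop, {}).items():
            result.setdefault(lang, {})[prop] = value
    return result


language_first = {
    "@id": "/resource",
    "en": {"author": {"@id": "/lin"}},
    "de": {"author": {"@id": "/stephane"}},
}
property_first = to_property_first(language_first, ("en", "de"))
```

Neither direction loses data, so the choice between them is a modeling preference rather than a technical constraint.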
Manu Sporny: The one concern is that we're going to have terms
for each language.
Markus Lanthaler: is that really an issue?
Manu Sporny: If you're expressing languages as predicates, the
data is jammed.
Markus Lanthaler: right, but that's what you have. It's a
predicate, not a language.
Manu Sporny: My only concern is that if Drupal wants to move to
RDF in the future, then that direction might be problematic
longer term.
Stéphane Corlosquet: probably not a real concern for the time
being.
Markus Lanthaler: Here's how the example could work today -
completely round-trippable: http://bit.ly/P8i7h7
Lin Clark: When we come to that, we could update what's needed
to move things to the RDF data modeling
Lin Clark: I got bumped
Manu Sporny: OK, it definitely works. I don't know if it's good
to model data in that way. I feel uneasy about it.
... The other concern I have is that if it works for Drupal
folks, and if that works as well for Wikidata folks, then there's
a question about supporting language maps in the end.
Manu Sporny: it didn't pick up first try [scribe assist by Lin
Clark]
Markus Lanthaler: I wonder if language map couldn't be
restricted to simple values such as "title.en" resolves to the
English title
Lin Clark: now it is busy
Markus Lanthaler: Wikidata apparently just uses simple language
maps: http://meta.wikimedia.org/wiki/Wikidata/Data_model_in_JSON
Manu Sporny: Ok, we spent an hour on this. We should step back
and think a bit more about it.
Lin Clark: dialing in
... We have two proposals on the table.
... 1) Languages become IRIs, 2) 1-to-1 between
compact/expanded form for language maps.
Niklas Lindström: I wonder where the title would end up in the
example Markus wrote up. Would there be a similar map for each
thing or would we want to group them in language buckets?
Markus Lanthaler: that's what I suggested initially but Lin
suggested they would rather have properties before languages.
Manu Sporny: any objection to move on to next issue and track
this up in github comments?
Markus Lanthaler: Do we want to support complex language maps?
Manu Sporny: My gut feeling is that, if we're going to support
language maps, we need to support all of Drupal's needs. I don't
know if it's worth the complexity to add language maps for
literal values only.
... We could associate the language with a term in the
context. If we go with the approach Markus proposed, I don't
think we need language maps in the end. In the context, you would
have term definitions for languages.
Manu Sporny: "en": {"@id": "http://purl.org/bcp47#en",
"@language": "en"}
... That would expand to:
Manu Sporny: "en": "Foo" - "http://purl.org/bcp47#en": {"@value":
"Foo", "@language": "en"}
Manu Sporny: The way we're modeling this does not really map to
RDF, that's what I'm concerned about.
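A minimal Python sketch of the term-definition approach just described: each language code is a term whose definition supplies both the property IRI and the language tag, so expanding a plain string yields a language-tagged value object. The function name is hypothetical, and this is an illustration of the idea rather than the specified expansion algorithm.

```python
def expand_with_term_definitions(node, context):
    """Expand keys via the context; tag plain strings with the term's language."""
    expanded = {}
    for term, value in node.items():
        definition = context.get(term)
        if definition is None:
            expanded[term] = value  # unknown term: left as-is in this sketch
            continue
        if isinstance(value, str) and "@language" in definition:
            value = {"@value": value, "@language": definition["@language"]}
        expanded[definition["@id"]] = value
    return expanded


context = {
    "en": {"@id": "http://purl.org/bcp47#en", "@language": "en"},
}
expanded = expand_with_term_definitions({"en": "Foo"}, context)
```

The language ends up encoded twice, once in the predicate IRI and once in the tag, which is one way to see why the result does not map cleanly to RDF.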
Niklas Lindström: I do think that things such as freebase may
benefit from data exported by Drupal sites
Stéphane Corlosquet: I don't think we should be blocking things
here. We could create IRIs for each translation and so on if we
really need to.
Lin Clark: hmm, I can't hear what was said but Crell specifically
requested we not create IRIs for each translation
Lin Clark: was talking about how to handle things in RDF [scribe
assist by Stéphane Corlosquet]
Stéphane Corlosquet: not in JSON-LD
Lin Clark: what we've discussed before is that you lose the
language handling for objects that are resources
Lin Clark: I don't think we want to have different subject IRIs
between JSON-LD and other RDF formats
Manu Sporny: I'm not convinced that we need to model the data in
the way Markus and Niklas are proposing. It works for Drupal
folks but I don't think it's the right way to model it as RDF.
Lin Clark: yes - we said we could discuss this outside the call
[scribe assist by Stéphane Corlosquet]
Stéphane Corlosquet: we haven't decided or changed anything since
when you dropped
... The other concern that I have is that JSON-LD should be
able to cope with data as modeled, especially in cases such as
Drupal when it's difficult to identify a right/wrong way of
modeling data.
Lin Clark: yeah, I was able to get back in now
Manu Sporny: I think these are the options available to us right
now:
Manu Sporny: 1) Ask Drupal to change the data model
(non-starter),
Manu Sporny: 2) Adopt a 1-to-1 mapping between compact/expanded
form for language maps, (adds complexity to syntax)
Manu Sporny: 3) Adopt a complex algorithm to reconstruct language
maps from expanded form, (adds complexity to API, and may be
non-deterministic)
Manu Sporny: 4) Model the data using BCP47 language code IRIs.
(problematic from an RDF data model standpoint)
Manu Sporny: each has annoying down-sides.
-- manu
--
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
Received on Tuesday, 18 September 2012 17:24:44 UTC