Re: Question on expressing translations of terms

From: Richard Wallis <richard.wallis@dataliberate.com>
Date: Thu, 24 Nov 2016 11:50:06 +0000
Message-ID: <CAD47Kz6gSToeCY5Ga_A6Z2+4OZsRLyM5QC-PgjP5iBgKbr=-aA@mail.gmail.com>
To: Felix Sasaki <fsasaki@w3.org>
Cc: Thad Guidry <thadguidry@gmail.com>, Alexandre Bertails <bertails@apple.com>, Thomas Francart <thomas.francart@sparna.fr>, Dan Brickley <danbri@google.com>, "schema.org Mailing List" <public-schemaorg@w3.org>

Felix,

What you describe is a fairly standard pattern in Schema.org and the world
of authoritative linked data sets.

Several organisations/people may have their own understanding of a thing,
concept, person etc. plus related supporting information, comments, local
resources, etc.  They also recognise that other authoritative sources
exist and use sameAs links to share the fact that they are describing the
same entity as others are.

They publish their own individual identifier for the Person, Place,
Product, etc. and then sameAs links to others’ descriptions of the same
Person, Place, Product, etc.
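
As a minimal sketch of this pattern (the example.com identifier and the
describe_entity helper are hypothetical, chosen only for illustration), a
publisher might keep its own identifier and point at other descriptions of
the same entity like this:

```python
import json

# Hypothetical publisher identifier; example.com is a placeholder.
OWN_ID = "http://example.com/id/screwdriver"


def describe_entity(own_id, name, same_as):
    """Build a JSON-LD description that keeps the publisher's own identifier
    and links to other sources describing the same entity."""
    return {
        "@context": "http://schema.org",
        "@id": own_id,
        "@type": "Product",
        "name": name,
        # sameAs takes one or more URLs; a list keeps every link.
        "sameAs": list(same_as),
    }


doc = describe_entity(
    OWN_ID,
    "Screwdriver",
    [
        "https://en.wikipedia.org/wiki/Screwdriver",
        "https://en.wiktionary.org/wiki/screwdriver",
    ],
)
print(json.dumps(doc, indent=2))
```

The identifier stays under the publisher's control, while the sameAs list
says only that the linked pages are about the same entity. How a consumer
uses those links is up to the consumer.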

It is up to the consumers of this data, mostly the search engines, to take
this data, interpret it as they wish (potentially merging it into aggregated
descriptions of entities in their knowledge graphs), and use it as they feel
appropriate to satisfy their users’ needs.

Where I see your proposal differing, if my understanding is correct, is
that you are trying to identify an individual description of a
particular entity (as against the entity itself) and then say it is the
sameAs another description. (By description I mean all attributes such as
reviews etc.)  In this pattern you are saying that two different
descriptions are sameAs each other. Since, as you say, they are not direct
translations of each other, they obviously are not.  What *is* the same is
the entity being described.

As an external observer, I see the data consumers looking for multiple
structured data descriptions of individual Things (entities) for them to
aggregate together in knowledge graphs and then use them to guide users to
appropriate views of those entities.

By concentrating on the description identities, as against the entities
they describe, I believe you are introducing confusion into the data patterns
and the results may well be unpredictable.  I believe you are making
assumptions about the way the search engines use this data, and are trying
to push, or even game, them into operating in a specific way.  History
shows us that such initiatives usually have only short-term positive
benefit.

Although I have great sympathy with your objectives of helping users see
the appropriate description of a resource in an appropriate language, I
believe a message to implementers about ensuring that the text in their
descriptions is correctly language tagged would be as, or possibly even
more, effective.
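
To illustrate what correctly language tagged text could look like, here is
a small sketch that reuses the screwdriver names from earlier in this
thread. The tagged and name_for helpers are hypothetical names introduced
only for this example, and the lookup is just one way a consumer might
read the tags:

```python
import json


def tagged(value, language):
    """Return a JSON-LD value object: a string plus its language tag."""
    return {"@value": value, "@language": language}


description = {
    "@context": "http://schema.org/",
    "@id": "http://example.com/my-term-data-base-entry-1",
    "@type": "Thing",
    "name": [tagged("screwdriver", "en"), tagged("schraubendreher", "de")],
}


def name_for(doc, language):
    """Pick the first name tagged with the requested language, or None."""
    for entry in doc.get("name", []):
        if entry.get("@language") == language:
            return entry["@value"]
    return None


print(json.dumps(description, ensure_ascii=False, indent=2))
print(name_for(description, "de"))
```

With the tags in place, a consumer can select the text for a user's
language without guessing from the content of the strings.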

Forgive me if my understanding of what you are proposing is not correct.

~Richard.



Richard Wallis
Founder, Data Liberate
http://dataliberate.com
Linkedin: http://www.linkedin.com/in/richardwallis
Twitter: @rjw

On 24 November 2016 at 08:08, Felix Sasaki <fsasaki@w3.org> wrote:

> Hi Thad,
>
> we may have a misunderstanding about the motivation. Let me try to explain
> with a comparison. If I want to give my own ratings of a product,
> Schema.org <http://schema.org> allows me to do that, e.g. via
> https://schema.org/Rating . And ratings are then taken up in search
> result previews.
>
> I can give my own translations (with your example, the one from Richard or
> the one from Alexandre in an earlier mail), but there is no guidance on
> how translations will or should be taken up.
>
> By "translations" I don’t mean general translations based on world
> knowledge. That would be feasible, as you pointed out, with global lexical
> data bases.
>
> If you look at the slides I linked to below, you will see an example.
> Companies have their own multilingual terminologies. They own and govern
> these terminologies and don’t want them to be a part of a global lexical
> data base. Still they want to put them on the Web, to ease cross lingual
> access to their data.
>
> It would not make sense to put a highly company specific name like "Easy
> Graphics Framework" and its Chinese translation "图形提供商" into a general
> lexical data base. Still the companies want to have their multilingual data
> taken up in web search. See the attached mockup of what that could look like.
>
> So this topic is mostly about motivations and benefits. For ratings and their
> uptake in web search this relation is clear. For company specific
> multilingual data it is not yet clear to me.
>
> Best,
>
> Felix
>
>
>
>
> On 23.11.2016 at 17:52, Thad Guidry <thadguidry@gmail.com> wrote:
>
> Felix,
>
> Let the Web work for you.  Do not try to "game" Languages for SEO or even
> enrichment purposes.
> Instead, to encourage enrichment, invest in improving translations
> themselves at sites like Wiktionary, translate.Google.com
> <http://translate.google.com/>, bing.com/translator , etc.
>
> To handle your use case and others, we already support translation of
> "name" by simply using "sameAs".
>
> Use sameAs property
>
> (URL of a reference Web page that unambiguously indicates the item's
> identity. E.g. the URL of the item's Wikipedia page, Freebase page, or
> official website.)
>
> to point to a URL that includes more information about that name to reap
> the benefits of a global translation community, instead of rolling your
> own.  (But if you want to roll your own, then you can use sameAs as well,
> but there might be limited understanding from search engines, since there
> is already investment against the major lexical databases and wikis out
> there in the world from the likes of Google, Bing, Yahoo, and Yandex.  They
> already can handle most translations and have an understanding using those
> lexical databases and wikis as well as their own.)
>
> Example:
>
> {
>   "@context": "http://schema.org",
>   "@type": "ProductModel",
>   "description": "Our extra long, elongated screwdriver allows turning even in the tightest of confined previously unreachable spaces!",
>   "name": "Screwdriver",
>   "sameAs": [
>     "https://en.wikipedia.org/wiki/Screwdriver",
>     "https://en.wiktionary.org/wiki/screwdriver"
>   ],
>   "image": "xyz_screwdriver-32in.jpg",
>   "brand": "XYZ",
>   "manufacturer": "XYZ"
> }
>
>
>
> On Wed, Nov 23, 2016 at 3:43 AM Felix Sasaki <fsasaki@w3.org> wrote:
>
>> I want to do what Alexandre described in his example here:
>> https://lists.w3.org/Archives/Public/public-schemaorg/2016Mar/0055.html
>> In that thread, we already discussed the use of name properties or
>> translationOfWork. Name properties don’t allow attaching additional
>> information to the language specific name. But that additional information
>> is the reason why a terminology data base exists: to express name variants
>> within one language, to express that a name belongs to a certain (company
>> specific) terminology in a certain version, to connect the name to a topic
>> domain (e.g. screwdriver in manufacturing processes of company XYZ) etc.
>>
>> So to achieve this you need two separate things. But translationOfWork
>> seems to be tailored towards CreativeWorks, which seem to mean things like
>> books, films, pieces of music. If one subsumes terms (from terminology data
>> bases) as creative works, there are a lot of confusing properties.
>>
>> The whole reason for this exercise is to allow users to influence
>> cross-lingual search. Something like the mock up on slide 14 would be nice.
>> Search engines allow for cross lingual search, see slide 29; but a user
>> cannot influence that with Schema.org <http://schema.org/> markup.
>>
>> - Felix
>>
>> On 22.11.2016 at 01:06, Richard Wallis <
>> richard.wallis@dataliberate.com> wrote:
>>
>> Scanning your slides I am not clear (in the Schema.org
>> <http://schema.org/> markup) if you are describing two separate things
>> the contents of which are in different languages or a single thing with
>> names in different languages.
>>
>> The definition of inLanguage <http://schema.org/inLanguage> indicates “The
>> language of the content..”
>>
>> If it is the former, they are not the same thing and they probably should
>> be related with translationOfWork
>> <http://bib.schema.org/translationOfWork> and workTranslation
>> <http://bib.schema.org/workTranslation>, not *sameAs.*
>>
>> If it is the latter, surely the use of two *name* properties, one in
>> each language, with language labels would suffice.
>>
>> ~Richard.
>>
>> Richard Wallis
>> Founder, Data Liberate
>> http://dataliberate.com
>> Linkedin: http://www.linkedin.com/in/richardwallis
>> Twitter: @rjw
>>
>> On 21 November 2016 at 14:27, Felix Sasaki <fsasaki@w3.org> wrote:
>>
>> Hello Alexandre and all,
>>
>> I had the pleasure to explore the topic of how to express translation of
>> terms further in a presentation at the Tekom / TCWorld conference. See the
>> announcement and slides (including an extended abstract at the end) here
>>
>> http://conferences.tekom.de/conference/tcworld16/
>> conference-program/conference-program/program/sv_1486_IN21/
>> http://conferences.tekom.de/fileadmin/tx_doccon/slides/
>> 1486_Summit_Meeting_Search_Meets_Terminology.pdf
>>
>> The presentation was well received and it seems that there is an interest
>> in using existing terminology assets to foster cross lingual search use
>> cases. It would be interesting to explore this further in the context of
>> Schema.org <http://schema.org/>
>>
>> Any comments on this topic & the presentation slides are very welcome.
>>
>> Kind regards,
>>
>> Felix
>>
>> On 17.03.2016 at 15:35, Alexandre Bertails <bertails@apple.com> wrote:
>>
>> Felix,
>>
>> We are currently trying to solve a very similar problem. My plan is to
>> use schema:sameAs for that. Applied to your example:
>>
>> {
>>  "@id": "http://example.com/my-term-data-base-entry-1",
>>  "@type": "schema:Term",
>>  "schema:inLanguage": "en",
>>  "schema:name": "screwdriver",
>>  "schema:sameAs": {
>>    "@id": "http://example.com/my-term-data-base-entry-2",
>>    "schema:inLanguage": "de",
>>    "schema:name": "schraubendreher"
>>  }
>> }
>>
>> Conceptually, the 2 entities really denote the same thing. Granted, our
>> usage of schema:sameAs is not exactly what's described in
>> https://schema.org/sameAs but there are reasons why we prefer to stay
>> within the schema.org realm. And owl:sameAs would bring a lot of baggage
>> with it which we are not interested in.
>>
>> Also, I think schema:translation would be too specific. Personally, I
>> would be happy if the definition of schema:sameAs was less about web pages.
>>
>> Best,
>> Alexandre
>>
>> On Mar 17, 2016, at 6:22 AM, Felix Sasaki <fsasaki@w3.org> wrote:
>>
>>
>> On 17.03.2016 at 13:56, Thomas Francart <thomas.francart@sparna.fr> wrote:
>>
>> I don't think the original question was about translating the terms of
>> schema.org itself (classes and properties); it was about the possibility
>> to describe terms/words, similar to what SKOS-XL proposes.
>> For me the original proposition makes sense; it would allow stating
>> things like "this term/word A is used by the general public", "that other
>> word/term B is used by the scientific community", "the words/terms A and B
>> are both used to refer to concept C", "word/term A is an acronym of
>> word/term B", "word/term D is slang, while word/term E is formal language",
>> etc.
>>
>>
>> Yes, that was the original question. A further comment below.
>>
>>
>> Thomas
>>
>> 2016-03-17 13:38 GMT+01:00 Dan Brickley <danbri@google.com>:
>> Yes, I tend to agree with Chaals & Richard here: for translated labels
>> of structured data vocabulary terms (schema.org's and others), we
>> should look towards the underlying W3C standards: RDF/S and perhaps
>> sometimes SKOS, SKOS-XL. It is usual to stick to a single URL for
>> types and properties rather than proliferate them by having different
>> URLs for each language.
>>
>>
>>
>> In my use case (see below) I need to differentiate uniquely (= via URIs)
>> between
>>
>> 1) terms in language X,Y,Z
>> 2) common = language agnostic concepts that they denote
>> 3) domains (= topics) that they belong to
>>
>> Richard wrote :
>>
>> [
>> As to proposing a general purpose term definition / relationship
>> structure such as you describe, I can see the need for such a capability
>> but wonder if in most cases SKOS-like existing solutions would suffice for
>> detailed description.  Whereas I would require some convincing as to the
>> potential take up in a broad general purpose vocabulary such as
>> Schema.org <http://schema.org/>.
>> ]
>>
>> The use case is a Japanese buyer of items who knows how something is
>> expressed in his language. He wants to be able to make a search for
>> スクリュードライバー
>> and say: give me pages about screwdrivers that express the concept of a
>> screwdriver in my domain and denote the concept I want to buy (= take up
>> the information provided by 1,2,3 above). The buyer does not want to buy
>> screwdrivers in general, and he does not want to buy everything with the
>> label screwdriver in English; but he wants to buy a specific screwdriver in
>> a given domain, e.g. automotive manufacturing. The buyer also wants to take
>> variants of how terms are expressed into account, e.g. differences in
>> spelling, abbreviations etc.
>>
>> Such searches are quite common in search of multilingual terminology data
>> bases. In these data bases terms are uniquely identified first class
>> citizens. More and more companies put such data bases on the web but don’t
>> have a way yet to do that with structured HTML markup. So search for
>> multilingual terminology, taking 1,2,3 into account, is not yet possible on
>> the Web.
>>
>> - Felix
>>
>>
>>
>> Here is an example btw of RDFa+RDFS definitions that do this, from
>> https://github.com/schemaorg/schemaorg/blob/sdo-deimos/
>> data/l10n/zh-cn/schema_org_zhcn.html
>>
>> <div typeof="rdfs:Class" resource="http://schema.org/Audience">
>> <span class="h" property="rdfs:label">Audience</span>
>> <span class="h" property="rdfs:label" xml:lang="zh-cn">听众</span>
>> <span property="rdfs:comment">Intended audience for an item, i.e. the
>> group for whom the item was created.</span>
>> <span property="rdfs:comment" xml:lang="zh-cn">听众,观众, 读者</span>
>> <span>Subclass of: <a property="rdfs:subClassOf"
>> href="http://schema.org/Intangible">Intangible</a></span>
>> </div>
>>
>> Does this approach do what you have in mind, Felix?
>>
>> Dan
>>
>> On 17 March 2016 at 10:56, Richard Wallis
>> <richard.wallis@dataliberate.com> wrote:
>>
>> Not sure I understand your definition of a term, but the ability to handle
>> names, or any other text based properties, of things in multiple languages
>> is already possible:
>>
>> {
>>  "@context": "http://schema.org/",
>>  "@id": "http://example.com/my-term-data-base-entry-1",
>>  "@type": "schema:Thing",
>>  "schema:name": [
>>    {
>>      "@language": "en",
>>      "@value": "screwdriver"
>>    },
>>    {
>>      "@language": "de",
>>      "@value": "schraubendreher"
>>    }
>>  ]
>> }
>>
>>
>> or in RDFa:
>>
>>
>> <div typeof="schema:Thing"
>> about="http://example.com/my-term-data-base-entry-1">
>>    <div property="schema:name" xml:lang="en" content="screwdriver"></div>
>>    <div property="schema:name" xml:lang="de"
>> content="schraubendreher"></div>
>>  </div>
>>
>>
>> ~Richard
>>
>> Richard Wallis
>> Founder, Data Liberate
>> http://dataliberate.com
>> Linkedin: http://www.linkedin.com/in/richardwallis
>> Twitter: @rjw
>>
>> On 17 March 2016 at 09:04, Felix Sasaki <fsasaki@w3.org> wrote:
>>
>>
>> Hi all,
>>
>> It seems that schema.org as of writing does not allow expressing the
>> relations "A is a translation of B" or "A is an abbreviation of B" for
>> terms. It is already possible to express that A is a translation of B, see
>>
>> http://bib.schema.org/translationOfWork
>>
>> but this is specific to works, not translated terms. Would the below make
>> sense? It is adapted from
>> https://schema.org/translator
>>
>> note: schema:Term and schema:translation do not exist in schema.org, I
>> made them up for the example.
>>
>> {
>>  "@id": "http://example.com/my-term-data-base-entry-1",
>>  "@type": "schema:Term",
>>  "schema:inLanguage": "en",
>>  "schema:name": "screwdriver",
>>  "schema:translation": {
>>    "@id": "http://example.com/my-term-data-base-entry-2",
>>    "schema:inLanguage": "de",
>>    "schema:name": "schraubendreher"
>>  }
>> }
>>
>> - Felix
>>
>>
>>
>>
>>
>>
>>
>> --
>>
>> Thomas Francart - SPARNA
>> Web de données | Architecture de l'information | Accès aux connaissances
>> blog : blog.sparna.fr, site : sparna.fr, linkedin : fr.linkedin.com/in/
>> thomasfrancart
>> tel :  +33 (0)6.71.11.25.97, skype : francartthomas
>>
>>
>>
>>
>>
>>
>>
>>
>>
>


(image/tiff attachment: PastedGraphic-1.tiff)

Received on Thursday, 24 November 2016 11:50:42 UTC
