Re: Eurocentrism, incorrect unit abbreviations, and proprietary Royalist English terms

Thank you, Martin Hepp, for providing a very detailed and useful answer to
these points!  I'm a big admirer of your work introducing the UN/CEFACT
Common Code into schema.org, and I feel that it is underutilized, perhaps
because the current documentation doesn't explain this approach to
newcomers as well as it might.  I also appreciate you addressing the
differences in terminology -- even within English.  I've lived in 4
English-speaking countries, and can attest that English is not a uniform
language.  Maybe schema.org can do more to explain what a property might
mean to various audiences. The property name and URI are just a common
reference to a concept that might have many different labels within a
language, or among languages.

Maybe I can work with Thad to reuse these explanations to improve schema.org
documentation.  I imagine both the issues raised by the question, and the
answers explaining the status and rationale of choices, will interest a
broader audience.

On Wed, Jul 4, 2018 at 2:27 PM, Martin Hepp <mfhepp@gmail.com> wrote:

> Dear Joe:
>
> thanks for your email!
>
> > On 04 Jul 2018, at 09:54, Joe Duarte <songofapollo@gmail.com> wrote:
> >
> > Hi all,
> >
> > I have a few threads of feedback:
> >
> > 1. The schema is littered with incorrect abbreviations of American
> > units. Examples:
> >       • In the Vehicle schema, for cargoVolume, it gives FTQ as the
> >         abbreviation for cubic feet.
> >       • For fuelCapacity, it gives GLL for gallons.
> > I couldn't find any reference on the web that gave these abbreviations,
> > so I'm stumped where they came from. Cubic feet can just be written as
> > cubic feet, or cubic ft., or ft³.
> >
> > And gallon is abbreviated gal.
> >
>
> As for the unit codes you criticize: These are the official UN/CEFACT Common
> Codes
>
> http://wiki.goodrelations-vocabulary.org/Documentation/UN/CEFACT_Common_Codes
> https://www.unece.org/cefact/codesfortrade/codes_index.html
>
> This is a global standard, and was selected after a careful analysis of
> alternative standard representations for unit-of-measurement codes. A bit of
> background is on p. 20 and p. 26 of
>
> http://www.heppnetz.de/projects/goodrelations/GoodRelations-TR-final.pdf
>
> Computers benefit from unique codes for meanings, so the UN/CEFACT Common
> Code strings are better suited for schema.org. Some users of schema.org
> data may be able to handle other encodings for units of measurement, but
> it is better to not rely on that.
>
>
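The codes questioned in this thread can act as machine-readable keys, while a consumer of the data displays whatever label its audience expects. A minimal Python sketch of that split, assuming a hand-written lookup table (`DISPLAY_LABELS` and `format_quantity` are hypothetical names for illustration, not part of schema.org or UN/CEFACT):

```python
# Hypothetical lookup table: the UN/CEFACT Common Codes mentioned in this
# thread, mapped to common human-readable abbreviations.
DISPLAY_LABELS = {
    "FTQ": "ft³",  # cubic foot
    "GLL": "gal",  # gallon (US)
    "KGM": "kg",   # kilogram
    "LBR": "lb",   # pound
}


def format_quantity(value: float, unit_code: str) -> str:
    """Render a value with a display label; fall back to the raw code."""
    label = DISPLAY_LABELS.get(unit_code, unit_code)
    return f"{value:g} {label}"


print(format_quantity(15.5, "GLL"))  # 15.5 gal
print(format_quantity(75, "KGM"))    # 75 kg
```

The stored data keeps the unambiguous code; only the presentation layer needs to know local conventions.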
> > For fuelConsumption, the schema doesn't even try to account for the US,
> > and only refers to a European measure (liters per 100 km). (The US measure
> > is miles per gallon, abbreviated as MPG.)
>
> Please read the spec carefully:
>
>     https://schema.org/fuelConsumption
>
> clearly says:
>
> "Note 2: There are two ways of indicating the fuel consumption,
> fuelConsumption (e.g. 8 liters per 100 km) and fuelEfficiency (e.g. 30
> miles per gallon). They are reciprocal."
>
> So simply use
>
>     https://schema.org/fuelEfficiency
>
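The reciprocal relationship between the two properties can be made concrete with a small conversion. This is only a sketch, assuming US gallons and international miles; the function names are hypothetical and not part of schema.org:

```python
# Exact definitions: 1 international mile = 1.609344 km,
# 1 US gallon = 3.785411784 liters (assumption: US, not imperial, gallons).
KM_PER_MILE = 1.609344
LITERS_PER_US_GALLON = 3.785411784

# The product (miles per gallon) * (liters per 100 km) is this constant,
# roughly 235.21, so each measure is the constant divided by the other.
MPG_TIMES_L_PER_100KM = 100.0 / KM_PER_MILE * LITERS_PER_US_GALLON


def l_per_100km_to_mpg_us(liters_per_100km: float) -> float:
    """Convert a fuelConsumption-style value to a fuelEfficiency-style value."""
    return MPG_TIMES_L_PER_100KM / liters_per_100km


def mpg_us_to_l_per_100km(mpg: float) -> float:
    """Convert a fuelEfficiency-style value to a fuelConsumption-style value."""
    return MPG_TIMES_L_PER_100KM / mpg


print(round(l_per_100km_to_mpg_us(8.0), 1))   # 29.4
print(round(mpg_us_to_l_per_100km(30.0), 1))  # 7.8
```

So the two examples in the quoted note (8 liters per 100 km and 30 miles per gallon) describe nearly the same vehicle.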
> > 2. Which leads to a related observation. The schema is vividly
> > Eurocentric, in that it seems designed around European norms rather than
> > American ones. This is odd, since Schema.org is sponsored mostly by American
> > search companies, and there are 325 million people in the US vs. 65 million
> > in Britain, for example.
>
> Schema.org is a global, international standard. It serves 7.6 billion
> people on the planet.
>
> Also, as for the term "sponsored": Schema.org heavily relies on community
> contributions. People have dedicated a lot of time to this standard without
> being paid a single cent by the "American" search companies. Also, Yandex
> has been a schema.org sponsor (= endorser and user of respective data)
> from early on.
>
> > Since the schema is in the English language, and the user base will be
> > overwhelmingly American, I think it's more appropriate that we use American
> > English by default, unless there's a contextual reason to use Royalist
> > English. Here are some examples:
>
> In general, schema.org tries to use a "global" form of English. Due to
> the many contributors, this may be neither perfect American nor perfect
> British English. As for spelling, we mostly try to use the American form.
> If there are spelling inconsistencies, the most effective approach is to
> file a pull request on GitHub.
>
> > CampingPitch is not a term Americans will be familiar with. It's a
> > Royalist* term.
>
> Please propose a better term for the area you can rent temporarily on a
> campground to put your tent on.
>
> > Under Car, there are but two properties (from Car). I would expect these
> > to be important, fundamental properties at the right level of ontological
> > abstraction,
>
> The properties also relevant for other vehicles are one layer up the
> hierarchy, see
>
>     https://schema.org/Vehicle
>
> > but rather they are:
> >
> > acrissCode – this is a coding system used by European car rental
> > businesses. The description is so deeply Eurocentric that it doesn't even
> > mention Europe, or the fact these codes are only used in Europe. It's as if
> > the rest of the world doesn't exist. Any codes that are only used in
> > certain countries or continents should be identified as so encumbered.
>
> ACRISS is used in Europe, Middle East and Africa, and respective data can
> be valuable in rental car mark-up in those markets.
>
> It is a principle of schema.org to provide properties that may not be
> applicable in all contexts, i.e. the existence of a property does not
> imply that it must or should be used. Such properties are simply an option.
>
> If there is another coding system for rental car categories in the US or
> other parts of the world, we could add additional properties or rename this
> one and expand the range of values.
>
> > roofLoad – this is the second of the two core properties from Car. And
> > again it has unit errors, this time across the board. It claims that a
> > kilogram is abbreviated KGM, and pounds as LBR. The SI abbreviation for
> > kilogram is kg, and for American customary units, pounds are lbs. This can
> > be confirmed in any appropriate reference, including Wikipedia.
>
> It may not be obvious from the documentation, but the range of this
> property is
>
>     https://auto.schema.org/QuantitativeValue
>
> The unit is encoded using the property
>
>     https://auto.schema.org/unitCode
>
> So you can use any UN/CEFACT Common Code unit code for a weight.
>
> The spec tries to explain that by
>
> "The unit of measurement given using the UN/CEFACT Common Code (3
> characters) or a URL. Other codes than the UN/CEFACT Common Code may be
> used with a prefix followed by a colon."
>
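As an illustration of how the unit code travels with the value, here is a minimal Python sketch that builds JSON-LD markup for a roofLoad. It assumes the plain https://schema.org context, and the value 75 is an invented number, not taken from any real vehicle:

```python
import json

# A Car whose roofLoad is a QuantitativeValue; the unit is carried by
# unitCode as a UN/CEFACT Common Code rather than a display abbreviation.
car = {
    "@context": "https://schema.org",
    "@type": "Car",
    "roofLoad": {
        "@type": "QuantitativeValue",
        "value": 75,        # illustrative number only
        "unitCode": "KGM",  # UN/CEFACT Common Code for kilogram
    },
}

print(json.dumps(car, indent=2))
```

A site publishing the same limit in pounds would keep the structure unchanged and set unitCode to LBR.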
> > Note also that roofLoad is likely to be a European concern – Americans
> > don't talk about it, and it's never advertised by carmakers here. In any
> > case, we have properties for Car, and they are a European rental coding
> > system and roofLoad.
>
> It seems that loading stuff on the roof is more popular in Europe than in
> North America, since cars in Europe tend to be smaller so the need for
> additional storage capacity is a more frequent consumer need. But the laws
> of physics will also apply to roof loads in North America, and a quick
> Google search reveals that user manuals of American cars contain respective
> load limits.
>
> So just omit it when you have no such data or do not need the information.
>
> And as said, most of the properties you seem to miss are at the level of
> Vehicle, since they apply to motorbikes, coaches, and maybe boats and
> airplanes alike.
>
> > The Car schema is in pretty bad shape.
>
> I think that is a bold statement, and inappropriate.
>
>
> > There are a lot more errors in the Schema, including repetitions of the
> > bizarre, incorrect unit abbreviations.
>
> The UN/CEFACT Common Codes are neither bizarre nor incorrect but instead
> widely used in e-commerce data exchange.
>
> Alternative unit coding schemes can be used by a prefix followed by a
> colon.
>
> > 3. The English-only instantiation of Schema.org also raises some
> > important long-term questions. Do you plan to expand or mirror Schema.org
> > into other major languages (French, Spanish, German, Russian, Simplified
> > Chinese, Japanese, Korean, etc.)? Or is it meant to be mostly Western
> > focused? That still implicates some European languages. Moreover, if we
> > create Schema.org for country-specific languages like French and German,
> > we'll need to be sure to avoid the mistakes in the current schema, and avoid
> > having the French version littered with British assumptions, for example.
>
> There have been many discussions about translations of global data
> schemas in general and schema.org in particular.
>
> First, keep in mind that IT systems should be able to process the data
> independent of the location and language of the respective Web sites. So we
> strive to find classes and properties that strike a compromise between
> familiarity (which is often bound to cultural contexts) and cross-cultural
> usefulness.
>
> Translations make it easier for developers to use the standard, but they
> are difficult to maintain and can introduce additional ambiguity.
>
> Most programming languages and most other standards in the history of
> computing have been maintained in English, and so far this has worked out
> well.
>
> Cultural bias can be a problem, but we can only minimize, never avoid it.
> And there are tradeoffs. So the most effective way to contribute is to
> file tangible change requests on GitHub.
>
> > In summary, I think there are lots of problems with the schema right
> > now, from bizarrely incorrect units, sections that do not contemplate the
> > existence of the United States, and messy structures and hierarchies that
> > do not meet normal ontological – or just logical – standards.
>
> I have tried to explain that I do not share your assessment here.
>
> >
> > Is the team too small? Is it perhaps based in Europe? I'm happy to help.
> > I'm working on a metadata schema for scientific research right now, and a
> > couple of other schemas that I might propose for inclusion in Schema.org.
> > In any case, I'm happy to help. I can look for pull request opportunities,
> > and you might want to just hire me – I'm a social scientist who specializes
> > in intellectual and cultural diversity and how it helps science and teams. It
> > would probably help to have someone on the team who knows mainstream
> > American norms, with a rural background, who isn't white, who loves and
> > knows cars very well, as well as ontology in general. Schema.org won't
> > reach its full potential if it's run by a handful of urbanites in the Bay
> > Area, Europe, etc. There would be too much cultural sameness and bias, and
> > giving semantics to the web is a job for a culturally diverse team.
>
> Joining the team is easy, and you have already made the first step ;-)
>
> Just get a GitHub account, fork schema.org, create proposals for
> improvement, and submit a pull request to the main repository. My advice
> would be to start small, with commits that fix typos or wording before
> investing a lot of time into major structural modifications. The latter are
> more challenging than they appear at first glance, because there are many
> aspects to consider when designing globally shared data schemas.
>
> Except for a few people employed by the big search engine companies, all
> others are volunteers. So I think turning it into a paid occupation will be
> a rather thorny road.
>
> Best wishes
>
> Martin
>
>
>
> >
> > Cheers,
> >
> > Joe Duarte, PhD
> > Phoenix, AZ, USA
> >
> > * By Royalist English, I mean that which is spoken in countries where
> > they bend the knee to the underemployed fashion models of the House of
> > Windsor.
> >
>
>
>

Received on Wednesday, 4 July 2018 15:26:40 UTC