Re: Permit external JSON-LD files?

It's a good idea, not just for Schema, but for *all* metadata not used by
browsers or similar clients. That would include the social media metadata
like Facebook's Open Graph and Twitter Cards, Google site verification,
the markup for search result snippets, and so on.

I mapped out an approach for this with a separate file extension for the
non-browser metadata: .meta

Bots would request the .meta file, in most cases in addition to the actual
page. In some cases they might only need the .meta file (for example,
social media link previews, which just need a title, a description and an
image URL). The .meta file URL would exactly match the page URL, except
for the extension.
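
To make the mapping concrete, here's a minimal sketch in TypeScript; the
function name and the extension-swapping rule are just illustrative, not a
spec:

    // Sketch only: derive the hypothetical .meta URL from a page URL by
    // swapping the extension. Real rules (trailing slashes, query strings,
    // content negotiation) would need to be pinned down in a spec.
    function metaUrlFor(pageUrl: string): string {
      const url = new URL(pageUrl);
      // Drop any existing extension on the last path segment, then add .meta.
      url.pathname = url.pathname.replace(/\.[a-z0-9]+$/i, "") + ".meta";
      return url.toString();
    }

    // metaUrlFor("https://example.com/cooking/best-cookies.html")
    //   -> "https://example.com/cooking/best-cookies.meta"

A bot that only needs the metadata would fetch just that URL; everything
else stays a normal page fetch plus at most one extra request.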

As you noted, the status quo is quite wasteful. It's not just the useless
metadata: users are forced to download enormous amounts of CSS that the
page never uses (typically 90%+ of it), split across separate files, which
makes it even worse. The same goes for JS, again in separate files, most
of it unused and much of the rest unnecessary for the functionality of,
say, an article with comments and a few ads. There's never been a good
explanation for this; the caching argument was always a mere assertion,
and it doesn't survive actual measurement. So there's a lot of room for
improvement: the web is much slower than it could be and should be, given
the ridiculous power of modern computers and the fat pipes we now have.
It's amazing how slow even multimillion-dollar websites are.

The metadata bloat isn't the biggest culprit, but it's worth sorting out
along with the other sources of bloat. I sketched out a framework with .web
and .meta files, where .web replaces HTML and CSS. It would be equivalent
to, but more compact than, a formally specified minification format for
HTML and CSS (something we could really use), combined with:

- tree-shaken CSS by default, so only the CSS actually used by the page is
  in the source – trivially easy to achieve by simple selector matching
  (see the sketch below);
- minified 1-2 byte selector, class and ID names – no more ponderous
  30-byte class names, which the browser neither needs nor does anything
  with;
- an efficient link format with standardized URLs: roughly 2-3 bytes
  before the hostname (e.g. H: instead of https://), 3-4 bytes of link
  markup, and never more than 25 bytes after the hostname.
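
The selector-matching part is easy to show. Here's a rough sketch in
TypeScript, assuming it runs against the rendered page in a real or
headless browser at build time; at-rules, pseudo-classes like :hover and
classes added later by JS would all need extra handling:

    // Sketch of per-page CSS tree-shaking by selector matching.
    // Not a complete tool: it keeps non-style rules untouched and keeps any
    // rule whose selector the engine can't evaluate.
    function shakeStylesheet(sheet: CSSStyleSheet): string {
      const kept: string[] = [];
      for (const rule of Array.from(sheet.cssRules)) {
        if (rule instanceof CSSStyleRule) {
          try {
            // Keep the rule only if something on the current page matches it.
            if (document.querySelector(rule.selectorText) !== null) {
              kept.push(rule.cssText);
            }
          } catch {
            kept.push(rule.cssText); // unqueryable selector: keep it
          }
        } else {
          kept.push(rule.cssText); // @media, @font-face, etc.: kept as-is here
        }
      }
      return kept.join("\n");
    }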

The metadata format itself could also be much more compact. There's no
reason for machine-readable syntax to be human-readable and so bloated; we
could easily flip between machine- and human-readable forms with tooling,
so it has never made sense to chase both in one bloated format. Most keys
could be just one or two bytes, and a standardized field order could
eliminate some keys or other bytes entirely. The format could also be
optimized for compression by design (against a specific compressor such as
Brotli or Zstandard, though it might be possible to optimize for both at
once). JSON is bloated by the quoted keys, long key names and excessive
punctuation. Simple newline separation solves two of those, and a binary
format could have richer separators and markers just by using the
forgotten 1-byte control codes in the ASCII / Basic Latin range, alongside
1-2 byte key names.
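
As a rough illustration only (the one-byte keys and the separators below
are invented for the example, not a proposal):

    // Sketch: a control-code-delimited metadata record instead of
    // quoted-and-braced JSON. US (0x1F) separates key from value, RS (0x1E)
    // separates fields; both are ordinary ASCII control codes. There's no
    // escaping here, so this is a sketch, not a format.
    const US = "\x1f"; // unit separator: key | value
    const RS = "\x1e"; // record separator: field | field

    function encodeMeta(fields: Record<string, string>): string {
      return Object.entries(fields)
        .map(([key, value]) => key + US + value)
        .join(RS);
    }

    // encodeMeta({ t: "Page title", d: "Short description",
    //              i: "H:example.com/hero.jpg" })
    // comes to under 60 bytes, versus several times that for the equivalent
    // JSON-LD once you count the long property names, quoting and @context
    // boilerplate.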

Cheers,

Joe

On Sat, Aug 27, 2022, 11:11 Roger Rogerson <tesconda2@hotmail.com> wrote:

> I appreciate that things like MicroData are inlined,
> and utilise the HTML Markup to associate data with content.
>
> But JSON-LD Schema is embedded.
> In many cases, this additional code serves no "human" purpose,
> and is provided for "machines" (typically Google).
>
> A shining example is the following web page (remove spaces after periods):
> https://www. delish. com/cooking/g1956/best-cookies/
>
> That page has approximately 35Kb of Schema.
> That is loaded for every single human visitor.
>
> In the case of popular pages - this means a large amount of unnecessary
> code is transferred (Gigabytes or higher per year).
>
> If the JSON-LD could be externalised into a referred to file,
> then this could reduce bandwidth consumption for users,
> help speed up some page load times/improve performance
> and help towards "going green".
>
>
> I appreciate that technically,
> this isn't about "Schema" directly,
> but about how Browsers and Parsers can recognise and handle
> an externalised version - but I'm hoping this is the right place
> to get it considered and the right people to see it/push it to browser
> vendors.
>
>
> Thank you.
> Autocrat.
>

Received on Wednesday, 31 August 2022 16:33:13 UTC