Re: scientific publishing process (was Re: Cost and access)

On 10/7/14 1:14 PM, Norman Gray wrote:
> Sarven, hello.
>
> On 2014 Oct 7, at 13:13, Sarven Capadisli <info@csarven.ca> wrote:
>
>> On 2014-10-07 11:39, Norman Gray wrote:
>>> The original spark to the thread was a lament that SW and LD conferences don't mandate something XMLish for submissions because X(HT)ML is clearly better for... well ... dammit, it's Better.
>> Straw man argument. Please stop that now!
>>
>> I will spell out the main proposal and purpose for you because it sounds like you are completely oblivious to them. Let me know if anything is unclear.
> My remark was intended as facetious rather than fractious, but if you feel I misjudged the balance, I apologise.
>
> I want to clarify what I meant, because on reflection it explains (at least to me) why I'm participating in this thread at such length.  My intention was to indicate that I don't feel that HTML is as central as you, amongst others, seem to assert it is.
>
> I characterise the web as:
>
>    1. URIs for addressing things,
>    2. HTTP for retrieving things (other protocols exist, but...),
>    3. a downloadable format which clients can parse to obtain more URIs, with a 'follow this' semantic.

How about:

1. HTTP URIs for naming (or identifying) things -- basically, the 
combined effects of denotation (signification) and connotation 
(perceptible description)
2. RDF abstract language for describing things -- systematic use of 
signs, syntax, and role semantics for communication
3. Notations for inscribing RDF language based descriptions to documents 
-- where notations serve the medium-specific purpose of representing the 
words of a language.

Once you have the base RDF Document in place, written in a preferred 
notation, you transform it into other document types (HTML, PDF, etc.) 
in line with viewer preferences.
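
A minimal sketch of that transformation step, assuming the description is 
already held as a list of statements. The example.org IRIs, the TRIPLES 
data, and the function names are hypothetical, and only two target 
document types are covered (there is no PDF writer here):

```python
from html import escape

# Hypothetical description of one document, held as (subject, predicate,
# object) statements. Objects are ("iri", value) or ("literal", value).
TRIPLES = [
    ("http://example.org/paper#this",
     "http://purl.org/dc/terms/title",
     ("literal", "Cost & access")),
    ("http://example.org/paper#this",
     "http://purl.org/dc/terms/creator",
     ("iri", "http://example.org/people/alice#this")),
]


def as_ntriples(triples):
    """Write the statements in N-Triples notation (simple literals only)."""
    lines = []
    for s, p, (kind, value) in triples:
        if kind == "iri":
            obj = f"<{value}>"
        else:
            text = value.replace("\\", "\\\\").replace('"', '\\"')
            text = text.replace("\n", "\\n").replace("\r", "\\r")
            obj = f'"{text}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines) + "\n"


def as_html(triples):
    """Render the same statements as an HTML table with followable links."""
    def link(iri):
        return f'<a href="{escape(iri)}">{escape(iri)}</a>'

    rows = []
    for s, p, (kind, value) in triples:
        obj = link(value) if kind == "iri" else escape(value)
        rows.append(f"<tr><td>{link(s)}</td><td>{link(p)}</td><td>{obj}</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>\n"


# Document types this sketch can produce, keyed by media type.
WRITERS = {
    "application/n-triples": as_ntriples,
    "text/html": as_html,
}


def render(triples, preferred):
    """Return (media_type, document) for the first preference we support."""
    for media_type in preferred:
        if media_type in WRITERS:
            return media_type, WRITERS[media_type](triples)
    raise ValueError("none of the preferred document types is supported")
```

Calling render(TRIPLES, ["application/pdf", "text/html"]) falls through to 
the HTML writer, since that is the first preference the sketch supports. 
The statements themselves are untouched by the choice of output document.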

>
> Now, the obvious candidate for (3) is of course HTML; but on the web, and _especially_ on the Semantic Web, it can be anything: RDF in one or other format, XML+GRDDL, some discipline-specific format which has a link semantic in it, or even a PDF file with a standardised lump of RDF/XMP inside it.

The trouble with the paragraph above is that RDF isn't a format. The 
presumption that it is one is the root of mass confusion.

> That RDF may be immediately present, or it may require some sort of heuristic or deterministic extraction (as Kingsley has discussed).
>
> All of these are web-native technologies, and I'd go as far as to say that the _least_ interesting thing you can find at the end of a URI is an HTML file.

For sure!

>
> The big deal, for me, in the idea of the Semantic Web, and the RDF world, is the realisation that the RDF model is sufficiently general that you can turn almost any structured data into RDF, put it into a big bucket, and start inferencing, querying, linking, and so on.  That generation/extraction of RDF is probably easier if the stuff is already pointy-bracketed for you, but that's only a detail.

Yes, which is why we have to think of RDF (accurately) as a Language, 
and never as a format. The format issue should have been addressed 
years ago in W3C literature, i.e., the notion of abstract and concrete 
syntaxes leads to the misconception that RDF is about document content 
formats. The loose coupling of language (signs, syntax, and semantics) 
and notations (representation of the words of a language) isn't 
visible, and as a result it is lost or overlooked (on a good day).
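
To make the loose coupling concrete, here is a rough sketch that writes 
one set of statements in two notations (N-Triples and expanded JSON-LD) 
and reads each back to the same set. The STATEMENTS data, the example.org 
IRIs, and all function names are hypothetical. The reader and writer 
handle only IRIs and plain string literals (no blank nodes, language 
tags, or datatypes), so this is nowhere near a full parser:

```python
import json
import re

# Hypothetical statements: (subject IRI, predicate IRI, object), where the
# object is ("iri", value) or ("literal", value).
STATEMENTS = {
    ("http://example.org/doc#this",
     "http://purl.org/dc/terms/title",
     ("literal", "Cost and access")),
    ("http://example.org/doc#this",
     "http://xmlns.com/foaf/0.1/maker",
     ("iri", "http://example.org/people/alice#this")),
}

# One N-Triples line: <s> <p> <o> .  or  <s> <p> "literal" .
_NT_LINE = re.compile(
    r'^<([^>]*)> <([^>]*)> (?:<([^>]*)>|"((?:[^"\\]|\\.)*)") \.$'
)


def write_ntriples(statements):
    lines = []
    for s, p, (kind, value) in sorted(statements):
        if kind == "iri":
            obj = f"<{value}>"
        else:
            text = value.replace("\\", "\\\\").replace('"', '\\"')
            text = text.replace("\n", "\\n").replace("\r", "\\r")
            obj = f'"{text}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def read_ntriples(text):
    out = set()
    for line in text.splitlines():
        match = _NT_LINE.match(line)
        if match is None:
            raise ValueError(f"unsupported line: {line!r}")
        s, p, iri, literal = match.groups()
        if iri is not None:
            obj = ("iri", iri)
        else:
            # The escapes emitted above are also valid JSON string escapes.
            obj = ("literal", json.loads(f'"{literal}"', strict=False))
        out.add((s, p, obj))
    return out


def write_jsonld(statements):
    # Expanded JSON-LD: a list of node objects keyed by full property IRIs.
    nodes = {}
    for s, p, (kind, value) in sorted(statements):
        key = "@id" if kind == "iri" else "@value"
        nodes.setdefault(s, {"@id": s}).setdefault(p, []).append({key: value})
    return json.dumps(list(nodes.values()), indent=2)


def read_jsonld(text):
    out = set()
    for node in json.loads(text):
        for p, values in node.items():
            if p == "@id":
                continue
            for v in values:
                obj = ("iri", v["@id"]) if "@id" in v else ("literal", v["@value"])
                out.add((node["@id"], p, obj))
    return out
```

Both read_ntriples(write_ntriples(STATEMENTS)) and 
read_jsonld(write_jsonld(STATEMENTS)) give back STATEMENTS. The two 
documents look nothing alike, yet they carry the same sentences.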

JSON-LD and TURTLE are both accurately pitched (across all related 
collateral) as Notations. Funnily enough, each is also associated with 
a significant RDF uptake initiative: TURTLE with the LOD Cloud, and 
JSON-LD with Google, Bing!, Yandex, and possibly Yahoo!, as major RDF 
supporters and adopters that are driving mass production of HTML 
documents that include RDF-language based structured data (inline or via 
structured data islands using <script/>).
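
As an illustration of the <script/> island approach, the sketch below 
embeds a JSON-LD description in an HTML document and pulls it back out 
using only the Python standard library. The DESCRIPTION data, the 
example.org IRI, and the names embed and IslandExtractor are 
hypothetical. The only assumptions about real-world practice are the 
"application/ld+json" script type and the schema.org context:

```python
import json
from html.parser import HTMLParser

# Hypothetical description of a paper, using schema.org terms.
DESCRIPTION = {
    "@context": "https://schema.org",
    "@type": "ScholarlyArticle",
    "@id": "http://example.org/paper#this",
    "name": "Cost and access",
}


def embed(description, body_html):
    """Return an HTML page carrying the description as a data island."""
    # Escape "</" so the JSON can never close the script element early;
    # "\/" is a legal JSON escape for "/".
    island = json.dumps(description, indent=2).replace("</", "<\\/")
    return (
        "<!DOCTYPE html>\n<html><head>\n"
        f'<script type="application/ld+json">\n{island}\n</script>\n'
        f"</head><body>{body_html}</body></html>"
    )


class IslandExtractor(HTMLParser):
    """Collect every JSON-LD island found in an HTML document."""

    def __init__(self):
        super().__init__()
        self._in_island = False
        self._buffer = []
        self.islands = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_island = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_island:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_island:
            self.islands.append(json.loads("".join(self._buffer)))
            self._in_island = False
```

Feeding the output of embed(DESCRIPTION, "<p>...</p>") to an 
IslandExtractor leaves the original description in its islands list. The 
visible HTML and the structured data travel in one document, yet neither 
depends on the other.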

>
> The interesting thing, for me, is just how the web as a whole can go about collectively managing or facilitating this generation/extraction in a way which balances faithfulness to the original with interoperable meaning (Dublin Core and FOAF are truly wonderful things).  That is why I do feel that -- especially in this SW/LD community --
>
>      HTML is a bit of a sideshow.

Yes, it is, but I think Sarven uses it as a simple starting point i.e., 
a point of least distraction, so to speak.
>
> HTML is a splendid thing for all the reasons that you know and I know, but if it's seen as central, if all questions turn into "what does that look like in HTML?", if it's so in-our-face that we can't see round it, then we miss the interesting questions.

Yes!

> So it's not that I've a particular downer on HTML, or a particular enthusiasm for PDF, but I think that "what does that look like in PDF?" and "what does that look like in FITS?" (the format of choice in my area) are more interesting.

Yes.

>
> (or put another way, I don't think that HTML is the SW/LD community's dogfood to eat -- for WHATWG, yes; us no)
>
> The sub-threads here about practicalities are amongst those questions, because they pick up the questions of "how does semantics get attached to documents in practice?", "why would authors bother?", "how does that information get passed around faithfully?"  It would be more interesting and productive if (and I don't mean this completely unseriously) the SW/LD community _forbade_ HTML from its conferences and journals.

Tricky, that one :-)

>
> So, this is where the opposite end of the spectrum is, from your position.  This may make a little more sense of what I've been saying.
>
> Best wishes,
>
> Norman
>
>


-- 
Regards,

Kingsley Idehen	
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this

Received on Tuesday, 7 October 2014 18:23:35 UTC