- From: Johannes Wilm <johanneswilm@gmail.com>
- Date: Sat, 9 Sep 2017 22:48:36 +0200
- To: Sarven Capadisli <info@csarven.ca>
- Cc: Scholarly HTML community group <public-scholarlyhtml@w3.org>
- Message-ID: <CABkgm-TarxPvzuQW5cCorZus=aSW9O-z3mMd8GPturF8tThdaQ@mail.gmail.com>
The problem with having full HTML5 is that there are way too many tags, attributes, CSS ways of doing things, and other ways of adding meaning that may seem clear to a human reader but are very difficult for a computer to understand. Anyone who wants to reuse the data has to fish in a soup of HTML to try to find out what is going on. All the WYSIWYG-ish editors I have come across had to accept only a small number of tag/attribute combinations for the entire thing to be viable. Apparently Dokieli is able to handle any kind of HTML, but I wouldn't be able to create an editor that can do that, and I cannot see how that HTML can be converted into other formats without more human intervention.

I really don't think creating a common format is impossible. Over the past two years we have worked with mainly IT-focused scientists on Fidus Writer. Even though FW was made with social sciences/humanities in mind, the IT academics had very few problems writing their papers in it, and several of those papers have been published. Yes, there were occasional minor "cultural" issues (such as IT people not getting why one would reference several works in the same citation), and of course there are always more things that would be nice to have, but we're far from needing 500 new tags just to start.

I don't know this, but maybe part of the reason it is easier to find a solution now than 30 years ago is that publishers are more willing to make compromises in terms of layout, as they are no longer dealing just with print and they are facing academics shopping around for lower APCs?

I think the proposed way forward sounds good: create something that generally works for the sciences, and then allow further specification for very subject-specific things via RDFa or the like. The formats that focus on a limited tag set and have already been developed (RASH and Scholarly HTML) may have just about everything we need.

The reason it seems to me a good idea to look at combining efforts is that RASH seems so far to have been developed by just one organization (and some not entirely intuitive decisions have been made, for example banning h2-h6 and putting author information in the head rather than the body), while ScholarlyHTML sounds like it lacks manpower and openly available tooling.

It sounds to me like Dokieli is heading in a different direction, where it's less important to reuse the text in other applications and the web is the final place where the document lands. That probably works for a lot of people, but not for everyone. We already have students working on the dokieli export filter, and we'll support them with that, but in addition I think it would be good to have an exporter to a limited HTML that can be processed and transformed into other formats. That's what I hope comes out of this process.

On 9 Sep 2017 9:45 pm, "Sarven Capadisli" <info@csarven.ca> wrote:

On 2017-09-09 21:08, Bruce Miller wrote:
> It would seem more productive to focus on the metadata magic
> (RDFa, Aria,...) to indicate the data's meaning & purpose,
> rather than to try to make HTML actually semantic after-the-fact.

I agree completely. It is precisely the approach that dokieli takes with RDF(a) meanwhile making the best of what HTML5 has to offer.

-Sarven
http://csarven.ca/#i
Received on Saturday, 9 September 2017 20:49:02 UTC