microformats 2 and Linked Data from henry.story@bblfish.net on 2015-03-13 (public-socialweb@w3.org from March 2015)

From: <henry.story@bblfish.net>
Date: Fri, 13 Mar 2015 13:19:47 +0100
To: Social Web Working Group <public-socialweb@w3.org>
Message-Id: <5DA7FD21-79CF-4C60-80A9-3439CCB08EE8@bblfish.net>

Since we now have to browse the whole web for discussions that might be relevant to the group, held by those who don't follow the e-mail list (see ISSUE-19), I thought I'd relay here a discussion by Tantek. You'll have to find his channel to respond to him.

http://logs.glob.uno/?c=freenode%23microformats&s=26+Jan+2015&e=26+Jan+2015

26 Jan 2015
09:40 csarven Can someone point me to why microformats2? For instance, what was at stake with microformats1 in for example hCard that needed a revisit for parsing? I'm sure this is documented somewhere. Would appreciate a pointer.
09:50 csarven I should clarify: IIRC, in microformats, the storyline was to simplify authoring, and think of scripts later. In microformats2, however, the story appears to have changed a little i.e., authoring is slightly more complex or involved (depending on how you look at it), in order to improve how machines parse the information.
09:50 tantek csarven: yup - all documented at the "obvious" place :)
09:50 tantek http://microformats.org/wiki/microformats2
09:50 Loqi +1
09:50 tantek csarven - the "storyline" for microformats, was authors before parsers
09:51 tantek for microformats2, the basic question we asked was, could we make things simpler for BOTH authors and parsers
09:51 tantek and that's what we ended up doing
09:51 tantek the first part of the microformats2 page is more like a spec now rather than "story"
09:51 tantek but the background is still there - let me get a fragment
09:52 tantek here we go: http://microformats.org/wiki/microformats2#Background
09:54 tantek your questions "what was at stake with microformats1" - if you mean what were the problems - are documented there
09:54 tantek HTH and definitely let me know if you have any specific follow-ups - happy to improve the documentation accordingly
09:55 tantek (but am calling it a night soon - will check the logs - or I'll be back in the morning PST)
09:58 csarven tantek Thanks!
09:58 csarven Perhaps that was wording. I didn't necessarily mean "problems". It was more about the cause/initiative to move towards microformats2.
09:59 csarven re: "all microformats are simply an object with a set of properties with values." from http://microformats.org/wiki/microformats2#Background . That's pretty much EAV model. Which is used by RDF as well.
10:01 tantek nah - RDF complicates the model unnecessarily with basing it on "triples" http://microformats.org/wiki/triples
10:01 tantek there's no "pretty much" about it
10:01 tantek more like "ugly much"
10:02 csarven http://microformats.org/wiki/linked-data is misleading :)
10:02 csarven As well as /triples
10:03 csarven In fact, it is very opinionated.
10:03 tantek nope - such challenges are based on tons of experience
10:03 csarven "unnecessarily complicated" ? C'mon
10:03 tantek yes
10:03 tantek takes longer to explain = unnecessarily complicated
10:04 tantek anyway - no interest in arguing about RDF because that's a useless waste of time - since it's unnecessary
10:04 tantek it does seem to float some plumbing boats, but so does plenty of backend futzing
10:04 csarven First of all, the problem space for LD is completely different than mf. It is misleading to suggest that LD/triples is "unnecessarily complicated" and that mf should be preferable.
10:05 tantek csarven: no one who does LD bothers to actually document specific problem spaces with open research, so I categorically reject your statement
10:05 csarven tantek This is not about arguing. We are discussing. I have championed mf for a long time and still do. So, I don't like being positioned somewhere as if I'm fighting against mf.
10:05 csarven Supporting LD doesn't mean that I don't support mf.
10:05 tantek not saying you're fighting mf
10:05 tantek I'm saying that LD is a waste of time, that's a separate problem
10:06 tantek plenty of people support both, that's their hobby
10:06 csarven Well, I disagree. LD solves problems that mf can't.
10:06 csarven Neither is it that mf is intended to solve those problems either.
10:06 tantek csarven - well, when you find some actual scientific documentation of such problems then let me know
10:06 tantek because they're usually framed in terms of abstractions and what ifs, and but if I want tos
10:07 csarven I and many would argue that SW/LD is more "scientific" than mf :)
10:07 csarven It is not at all about what ifs.
10:07 tantek so that's the essence of the problem. I ask you for "research" (e.g. URLs pointing to), and you say you would "argue"
10:07 csarven We have all sorts of data. Data is not only bound to what exists on Web pages that should be easy to author for Web developers. That's a very narrow POV.
10:07 tantek you offering to "argue" is "what ifs"
10:08 csarven Well, how about we back up and try to back up the statement "unnecessarily complicated" scientifically?
10:08 csarven Can you provide surveys?
10:08 tantek again, you're arguing hypotheticals, like I said, let me know when you have documented research about specific problems at a public URL, until then, you're wasting time with handwaving
10:08 csarven I'm simply asking for documentation on "unnecessarily complicated".
10:09 tantek yup - microformats are useful (plenty of specific use-cases on the wiki), microformats solve those problems without any need for triples
10:09 tantek ergo, no need for concepts of triples for those use-cases
10:09 csarven It is a strong claim. I'd like to know what type of research went into concluding that.
10:09 tantek ergo any use of triples would add *unnecessary* complexity
10:10 tantek triples vs. property: value
10:10 csarven No, you are unfairly comparing the problem.
10:10 tantek nope, I'm comparing a documented problem
10:10 tantek vs. no documented problems that you need triples for
10:10 tantek the burden of proof is on needing triples, not on not needing them
10:10 csarven What you are saying is that, given the problem space of mf, triples/LD complicates the problem. I'm saying that, well, that's not an accurate picture.
10:10 tantek I'm saying you have no picture
10:11 tantek you have no documented research of specific problems
10:11 tantek you have handwavings about what a picture could be
10:11 tantek like I said, let me know when you have URLs to specific documentation about specific problems / use-cases
10:11 tantek until then - you're just wasting time arguing
10:11 csarven Documented problem? Based on what? Information based on the microformats wiki about LD? And that you conclude based on that documentation, LD is complicated?
10:11 csarven Ok
10:12 tantek no - real world problems for users
10:12 csarven You are repeating yourself. DRY.
10:12 csarven You need to revisit your axioms.
10:12 csarven "real world problems for users"
10:12 tantek occam's razor - triples/LD unnecessary
10:12 tantek until proven otherwise. hence burden of proof
10:13 csarven I see. So, you arbitrarily come up with a simple view of what triples/LD is, .. then go ahead and document that in the wiki and call it a victory for mf?
10:13 tantek no it's more a defense against wasting time
10:14 tantek victories for mf have nothing to do with LD being good or not
10:14 csarven Why bother with the documentation on the alternative in the wiki any way?
10:14 tantek victories come from solving real world use-cases
10:14 csarven mf is victorious in its own right.
10:14 tantek right
10:14 csarven There is no need to bash the alternatives.
10:14 tantek the /triples etc. documentation is because people keep bringing them up
10:14 tantek like an FAQ
10:14 tantek so it's a summary answer
10:14 tantek and it's usually sufficient
10:14 csarven Well, I appreciate your POV.
10:14 csarven I agree, it is sufficient for many.
10:15 tantek there is actually a need to filter out crap
10:15 tantek in everything
10:15 tantek and filter out inefficiencies
10:15 csarven But I disagree on the approach taken "against" LD
10:15 tantek it's a trivial debunking, that's all
10:15 tantek if you disagree - you can provide research that substantiates LD
10:15 tantek until then - there is no point to it
10:16 csarven That's trivial. Data exists outside of Web pages that are not "common".
10:16 tantek if it's so trivial, point me to a URL to research
10:16 tantek barring that, the research doesn't exist, because it's not trivial, or the problems don't actually exist that *require* LD
10:16 csarven You want me to point you to some research that says "data exists everywhere... not only on web pages"?
10:16 tantek no to specific such data
10:17 tantek that somehow has a specific aspect that *requires* LD
10:17 tantek point to actual research, not meta research
10:17 csarven LD is a pretty good candidate. How about that? If there is an alternative approach (and often there are, being argued) that can be compared.
10:18 csarven That's an axiom.
10:18 tantek nah - you have no problems being solved, so it's just theoretical handwaving
10:18 tantek it's philosophy, not science
10:18 csarven RDF is a good candidate for the problem space. And it is based on EAV. .. Just as mf2. The fact that they differ on syntax/namespaces or not.. or whatever, it is a very minor
10:19 csarven Are you serious?
10:19 tantek you sound like you're actually asking for extensible vocabulary though, not triples, by your referencing "data exists outside"
10:19 csarven Do you expect CERN to output their data from LHC into Web pages?
10:19 tantek so now you're approaching a problem statement - so that's better
10:19 csarven (I'm not arguing about LD here.. but that data exists elsewhere and that needs to be captured and modelled..)
10:20 tantek can you point to a URL documenting the specific problems of CERN needing to output the data from the LHC?
10:20 csarven Uhm.. they already do! http://opendata.cern.ch/
10:20 csarven And there is more to it. One can't expect all roads to lead to mf.
10:21 tantek that's a strawman
10:21 tantek no one said all roads
10:21 tantek I'm just saying I don't accept any "this is a solution!" statements without documentation of the problem
10:21 csarven mf is not intended to deal with all those "problems". And that is perfectly fine. Just because mf can't, it doesn't mean that others are irrelevant or are "unnecessarily complex".
10:21 tantek still don't see any documentation of any such problems
10:22 tantek sorry - you're not providing *any* actual problem documentation
10:22 csarven Well, if you want an occam's razor, then EAV, RDF are good candidates.
10:22 tantek therefore you can't argue about it
10:22 csarven We are discussing!
10:22 csarven You want me to address all your issue with URLs on the spot?
10:22 tantek nope, occam's razor is property:value works, don't need triples
10:22 tantek yes
10:22 csarven Especially when you leave an unscientific statement like "unnecessarily complex" up on the wiki?
10:22 csarven But then go ahead and argue for something scientific?
10:22 tantek if you can't back up your claims about problems with documentation of specific problems, your arguments are baseless
10:23 csarven .. for LD?
10:23 csarven C'mon.
10:23 tantek unnecessarily complex -> occam's razor
10:23 tantek already answered, quit asking same question
10:23 csarven I've already explained to you that data exists everywhere. That's trivial. That's an axiom. Can we not agree to that?
10:23 tantek nope. document a specific problem.
10:24 tantek not some handwaving about data everywhere
10:24 csarven We have a lot of data, and we want to "connect" this data with each other so we can have interesting insights about societies, build better systems, make better decisions...
10:24 csarven Ok.
10:24 csarven That's *good enough*
10:25 tantek again you're speaking in generalities
10:25 csarven Not at all.
10:25 tantek stop describing, and start providing URLs to documentation of specific research
10:25 csarven Very concrete.
10:25 csarven Did you skip over the whole Data Science trend nowadays?
10:25 tantek don't care. specific URL or stop talking.
10:25 csarven You are asking me to justify the problem for the users for CERN's data.
10:25 csarven .. practically.
10:26 csarven :)
10:26 tantek "societies", "systems", = generic
10:26 tantek nothing specific
10:26 csarven Okay, let's leave it at that.
10:26 csarven The moment you are telling me to stop talking ... well, there is no discussion.
10:26 tantek right, no point in any discussion since you cannot provide a specific URL to specific research about a specific problem
10:27 csarven I think TimBL made a pretty good case about "linking data" 25 years ago aka Web.
10:27 tantek barring that, no need for triples/RDF etc.
10:27 csarven Do we need to revisit that?
10:27 tantek the web didn't need RDF/LD
10:27 tantek and succeeded without it
10:27 tantek more occam's razor
10:27 tantek thanks for the proof
10:27 csarven Web didn't need HTML5+JS+Flash... either
10:27 csarven Web succeeded because of HTML.
10:28 csarven More generally about linking documents.
10:28 tantek yup - and the features added to HTML5 were all added one at a time based on documented use-cases
10:28 csarven Linking "things" is a specialization of that.
10:28 tantek web succeeded because HTML was *simple*
10:28 csarven Agreed.
10:28 tantek TimBL said so himself
10:28 csarven Yes, and that he decided on HTML instead of something like TeX
10:29 tantek generalizing and building abstractions without a problem to solve is philosophy not science
10:29 csarven But the point is that, HTML opened up the idea for linking stuff across the globe. If you have some data and put it up somewhere, we can link to it.
10:29 tantek here's the difference
10:29 tantek HTML5 audio and video tags - clear documented use-cases
10:29 tantek LD/RDF abstractions - no clear documented use-cases
10:30 csarven I'm sorry to say but, I strongly dislike your position on mf being somehow "scientific", but that upper-case SW or LD is not.
10:30 tantek science involves documenting your problems, and research
10:30 tantek SW/LD advocates don't actually bother with that - they just invent stuff and prescribe it
10:30 tantek no homework, no showing of steps
10:31 tantek and frankly, there were areas where we didn't do enough documentation with microformats (classic) either
10:31 tantek and most of those failed
10:31 csarven I will entertain your idea for a moment. But, have you heard of "stamp collecting"?
10:31 tantek we were not *strict* enough
10:31 tantek with asking for documented research
10:31 tantek the irony of LD advocates - they can't provide links to back up their statements
10:32 csarven If mf was so "scientific", I'd expect a proper methodology. Starting from hypothesis and null hypothesis, and moving up. Certainly that's not the case. Did mf reject a null hypothesis somewhere? Is that in the wiki?
10:32 tantek yes!
10:32 csarven mf is "stamp collecting" just as much as SW/LD/RDF
10:32 tantek http://microformats.org/wiki/process
10:32 csarven Information Science.
10:33 csarven Where is the hypothesis?
10:33 tantek you start with not needing anything
10:33 tantek and then documentation is the first step - of the problem etc.
10:33 tantek you're leading with hypothesis and that's your problem
10:33 tantek with science, you lead with *observation*
10:34 tantek i.e. research
10:34 tantek then you document it
10:34 csarven That's an axiom. I'm looking for a hypothesis. And that at some point, mf rejected the null hypothesis and went along with the alternative. Where is that mentioned clearly?
10:34 tantek only after you have documented observations do you go to a hypothesis
10:34 tantek that's scientific method 101
10:34 csarven ".... hence, we reject the null hypothesis "
10:34 tantek LD/RDF advocates skipped the observation and documentation steps
10:35 tantek so thus, unscientific
10:35 csarven What you are talking about is stamp collecting. Not some brute force testing.
10:35 csarven tantek Like I said, where is the blurb on rejecting the null hypothesis in the mf wiki?
10:35 tantek scientific method doesn't need reject null hypothesis
10:36 tantek thus we don't need it
10:36 tantek we document existing real world user problems through observation
10:36 csarven "All science is either physics or stamp collecting" -- Lord Rutherford
10:36 csarven Thanks. So, again, mf is as "scientific" as SW/LD.
10:37 tantek nope because we document our problems
10:37 tantek and require it in our method
10:37 tantek http://microformats.org/wiki/process#Why.3F
10:37 csarven Unless you want to show me that hypothesis, then we can classify mf as taking on the "hard-science" approach.
10:37 tantek whereas SW/LD folks make up vocabularies first, then try to apply them
10:37 csarven Ok. I stand by my position. I don't think we are disagreeing.
10:38 tantek it's ok - eventually made-up stuff without documented problems / use-cases withers and falls by the wayside
10:38 csarven They didn't come up with a vocab out of thin air. Surely that's based on observing patterns or needs. You may argue that their documentation sucks (and I won't necessarily disagree with that). However, it is wrong to suggest that they are somehow doing something that's not scientific.
10:39 tantek why are you assuming "based on observing patterns or needs"?
10:39 tantek that's your flaw
10:39 tantek I'm asking for proof in the form of a URL to documentation of observing patterns or needs
10:39 tantek but you're willing to accept it on faith
10:39 csarven No, that's your flaw. Just because you don't know it, it doesn't mean that it doesn't exist.
10:39 tantek so without that documentation I say it's a waste of time
10:39 csarven That's a clear distinction to be made.
10:39 tantek it doesn't exist until evidence is provided
10:39 csarven Personally, I am a pragmatist.
10:40 csarven I don't see a flaw in there :)
10:40 tantek you're taking it on faith
10:40 tantek I'm saying I don't believe it until you give me a URL to the documentation
10:40 csarven We all start with axioms.
10:40 tantek not how you do science sorry
10:40 tantek you start with observation and documentation
10:40 tantek philosophers start with axioms
10:41 tantek hence my point about SW/LD being philosophy, not science
10:41 csarven You may not believe it because you haven't seen a documentation, yet, you come up with a belief that something is "unnecessarily complex" because that's ... occam's razor?
10:41 csarven Do you realize how absurd that sounds?
10:41 tantek no that's the default
10:41 tantek without evidence, something is unnecessary
10:41 tantek do you know how absurd it is to suggest otherwise?
10:41 tantek to suggest you need something without evidence?
10:41 tantek that's called marketing
10:41 csarven The fact that there are "observable" 65 billion triples + across ... is not some "philosophy". It exists. Deal with it.
10:41 tantek lol
10:43 csarven Sorry, ran out of battery :) ... And you need to go to bed :)
10:43 csarven (if you have not already)
10:44 csarven Any way.. I appreciate the chat regardless
10:45 tantek csarven, again, I'll leave you with, why is it so hard for LINKed data advocates to actually provide LINKs to substantiate their arguments? ;)
10:48 csarven I tried to explain.. but I probably didn't do a good job. I'm fairly certain that you are quite aware of the SW/LD position. I suspect that issues are not due to technical differences. Some of the arguments against SW/LD (from the mf position) have different roots - some of which I'm aware but that's not the point.
10:48 tantek this is not unique to SW/LD btw
10:48 csarven So, when a debate arises, it is not essentially about the technical differences. It gets philosophical.
10:48 tantek most standards (web or otherwise) don't provide documentation of their problems and use-cases
10:49 tantek which means they get bloated and political
10:49 csarven I agree.
10:49 tantek instead of simple and pragmatic
10:49 tantek SW/LD is just one example
10:49 tantek a specific example
10:49 tantek but there are many (most?) others
10:49 csarven That's all valid. But, poor communication on that front doesn't equate to problem existing. Communicating well is an art.
10:49 csarven So, don't let the SW/LD "research" "papers" get in the way.
10:50 tantek not even asking for good documentation
10:50 tantek just *some* real world documentation
10:50 tantek yeah the research paper problem
10:50 csarven IMO, this is a solid documentation as it gets: http://www.w3.org/History/1989/proposal.html
10:50 tantek the fact that they're not publishing on the web at stable URLs with open access
10:50 csarven In there, I can see SW/LD/mf... all coexisting, and they do!
10:51 tantek can coexist doesn't mean must
10:51 tantek that's the point
10:51 tantek from a pragmatic minimalist viewpoint, everything must be justified
10:51 tantek not just by political statements like "coexist"
10:52 csarven I think that's the point. What you just said.. Many see SW/LD as sufficiently justified.
10:52 tantek you're right that there's a lot of specific problems described in http://www.w3.org/History/1989/proposal.html
10:52 tantek every standard / spec developed is seen as sufficiently justified by "many"
10:52 tantek that doesn't mean they are actually justified, by documented research
10:54 tantek it would be an interesting exercise to extract the specific problems mentioned/described in http://www.w3.org/History/1989/proposal.html and document them at their own URLs
10:55 csarven You know, many in the LD community disagree on what LD is too. There is an RDF-only camp vs. RDF is one of many.
10:56 tantek is that a syntax argument? or a model argument?
11:03 Loqi [[to-do]] http://microformats.org/wiki/inde...amp;oldid=64630&rcid=101170 * Tantek * (+316) more documentation and research, extract from TimBL's 1989 proposal
11:05 tantek csarven thanks for the reminder about and URL for TimBL's paper and his documentation or at least referencing of specific use-cases
11:07 csarven Essentially syntax on the surface but I would say both. Some view HTML/mf/Microdata to all belong to the LD goal.
11:07 csarven All "linked data". As opposed to whatever "Linked Data" is.
11:08 csarven TimBL's http://www.w3.org/DesignIssues/LinkedData originally didn't mention RDF/SPARQL. There was an update to include them. So, some confusion arises from that as well.
11:08 tantek yeah - most use of microformats2 is in a very lowercase "linked" data way - with emphasis on URLs
11:08 tantek except without all the formalities with URLs for predicates
11:09 tantek links for the *data* that is, not the predicates/relationships/vocab
11:09 csarven Yeap. All URIs are welcome in RDF, but for LD, HTTP is most *useful*
11:09 tantek oh that distinction. yeah the URN thing was hilarious.
11:09 tantek Urns are what you put dead things into ;)
11:11 csarven Pretty much. Just a string. Essentially as good as any other unique string
11:15 tantek right - and thus not as valuable / useful as an actual *link*
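
For readers following the "triples vs. property: value" exchange above, here is a minimal sketch of the two shapes being compared. It is only an illustration under stated assumptions: the `card` object, its `id` key, and the helper names `to_triples` and `from_triples` are hypothetical, loosely modelled on "an object with a set of properties with values", and are not taken from the microformats2 or RDF specifications.

```python
# Hypothetical object in the "object with a set of properties with values" shape.
# The "id" key is an assumption added here so the object has an identifier.
card = {
    "id": "http://example.org/alice",
    "type": "h-card",
    "properties": {
        "name": ["Alice Example"],
        "url": ["http://example.org/alice"],
    },
}


def to_triples(obj):
    """Flatten one object into (entity, attribute, value) tuples."""
    subject = obj["id"]
    triples = [(subject, "type", obj["type"])]
    for prop, values in obj["properties"].items():
        for value in values:
            triples.append((subject, prop, value))
    return triples


def from_triples(triples):
    """Rebuild a single object from tuples that all share one entity."""
    subject = triples[0][0]
    obj = {"id": subject, "type": None, "properties": {}}
    for entity, attribute, value in triples:
        if entity != subject:
            raise ValueError("this sketch handles one entity at a time")
        if attribute == "type":
            obj["type"] = value
        else:
            obj["properties"].setdefault(attribute, []).append(value)
    return obj


triples = to_triples(card)
for triple in triples:
    print(triple)
```

In this sketch the same information survives a round trip between the two shapes, which is roughly csarven's EAV point. The tuple form repeats the entity on every row and needs an identifier for it, which is roughly Tantek's point about extra machinery. Note that a property literally named "type" would collide with the type marker here; a real model would need to handle that.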

Social Web Architect
http://bblfish.net/

Received on Friday, 13 March 2015 12:20:18 UTC