microformats 2 and Linked Data

Since we now have to browse the whole web for discussions that might be relevant to the group, held by those who don't follow the e-mail list (see ISSUE-19), I thought I'd relay here a discussion with Tantek. You'll have to find his channel to respond to him.


26 Jan 2015
09:40	csarven	Can someone point me to why microformats2? For instance, what was at stake with microformats1 in for example hCard that needed a revisit for parsing? I'm sure this is documented somewhere. Would appreciate a pointer.
09:50	csarven	I should clarify: IIRC, in microformats, the storyline was to simplify authoring, and think of scripts later. In microformats2, however, the story appears to have changed a little i.e., authoring is slightly more complex or involved (depending on how you look at it), in order to improve how machines parse the information.
09:50	tantek	csarven: yup - all documented at the "obvious" place :)
09:50	tantek	http://microformats.org/wiki/microformats2
09:50	Loqi	+1
09:50	tantek	csarven - the "storyline" for microformats, was authors before parsers
09:51	tantek	for microformats2, the basic question we asked was, could we make things simpler for BOTH authors and parsers
09:51	tantek	and that's what we ended up doing
09:51	tantek	the first part of the microformats2 page is more like a spec now rather than "story"
09:51	tantek	but the background is still there - let me get a fragment
09:52	tantek	here we go: http://microformats.org/wiki/microformats2#Background
09:54	tantek	your questions "what was at stake with microformats1" - if you mean what were the problems - are documented there
09:54	tantek	HTH and definitely let me know if you have any specific follow-ups - happy to improve the documentation accordingly
09:55	tantek	(but am calling it a night soon - will check the logs - or I'll be back in the morning PST)
09:58	csarven	tantek Thanks!
09:58	csarven	Perhaps that was wording. I didn't necessarily mean "problems". It was more about the cause/initiative to move towards microformats2.
09:59	csarven	re: "all microformats are simply an object with a set of properties with values." from http://microformats.org/wiki/microformats2#Background . That's pretty much EAV model. Which is used by RDF as well.
10:01	tantek	nah - RDF complicates the model unnecessarily with basing it on "triples" http://microformats.org/wiki/triples
10:01	tantek	there's no "pretty much" about it
10:01	tantek	more like "ugly much"
10:02	csarven	http://microformats.org/wiki/linked-data is misleading :)
10:02	csarven	As well as /triples
10:03	csarven	In fact, it is very opinionated.
10:03	tantek	nope - such challenges are based on tons of experience
10:03	csarven	"unnecessarily complicated" ? C'mon
10:03	tantek	yes
10:03	tantek	takes longer to explain = unnecessarily complicated
10:04	tantek	anyway - no interest in arguing about RDF because that's a useless waste of time - since it's unnecessary
10:04	tantek	it does seem to float some plumbing boats, but so does plenty of backend futzing
10:04	csarven	First of all, the problem space for LD is completely different than mf. It is misleading to suggest that LD/triples is "unnecessarily complicated" and that mf should be preferable.
10:05	tantek	csarven: no one who does LD bothers to actually document specific problem spaces with open research, so I categorically reject your statement
10:05	csarven	tantek This is not about arguing. We are discussing. I have championed mf for a long time and still do. So, I don't like being positioned somewhere as if I'm fighting against mf.
10:05	csarven	Supporting LD doesn't mean that I don't support mf.
10:05	tantek	not saying you're fighting mf
10:05	tantek	I'm saying that LD is a waste of time, that's a separate problem
10:06	tantek	plenty of people support both, that's their hobby
10:06	csarven	Well, I disagree. LD solves problems that mf can't.
10:06	csarven	Neither is it that mf is intended to solve those problems either.
10:06	tantek	csarven - well, when you find some actual scientific documentation of such problems then let me know
10:06	tantek	because they're usually framed in terms of abstractions and what ifs, and but if I want tos
10:07	csarven	I and many would argue that SW/LD is more "scientific" than mf :)
10:07	csarven	It is not at all about what ifs.
10:07	tantek	so that's the essence of the problem. I ask you for "research" (e.g. URLs pointing to), and you say you would "argue"
10:07	csarven	We have all sorts of data. Data is not only bound to what exists on Web pages that should be easy to author for Web developers. That's a very narrow POV.
10:07	tantek	you offering to "argue" is "what ifs"
10:08	csarven	Well, how about we back up and try to back up the statement "unnecessarily complicated" scientifically?
10:08	csarven	Can you provide surveys?
10:08	tantek	again, you're arguing hypotheticals, like I said, let me know when you have documented research about specific problems at a public URL, until then, you're wasting time with handwaving
10:08	csarven	I'm simply asking for documentation on "unnecessarily complicated".
10:09	tantek	yup - microformats are useful (plenty of specific use-cases on the wiki), microformats solve those problems without any need for triples
10:09	tantek	ergo, no need for concepts of triples for those use-cases
10:09	csarven	It is a strong claim. I'd like to know what type of research went into concluding that.
10:09	tantek	ergo any use of triples would add *unnecessary* complexity
10:10	tantek	triples vs. property: value
10:10	csarven	No, you are unfairly comparing the problem.
10:10	tantek	nope, I'm comparing a documented problem
10:10	tantek	vs. no documented problems that you need triples for
10:10	tantek	the burden of proof is on needing triples, not on not needing them
10:10	csarven	What you are saying is that, given the problem space of mf, triples/LD complicates the problem. I'm saying that, well, that's not an accurate picture.
10:10	tantek	I'm saying you have no picture
10:11	tantek	you have no documented research of specific problems
10:11	tantek	you have handwavings about what a picture could be
10:11	tantek	like I said, let me know when you have URLs to specific documentation about specific problems / use-cases
10:11	tantek	until then - you're just wasting time arguing
10:11	csarven	Documented problem? Based on what? Information based on the microformats wiki about LD? And that you conclude based on that documentation, LD is complicated?
10:11	csarven	Ok
10:12	tantek	no - real world problems for users
10:12	csarven	You are repeating yourself. DRY.
10:12	csarven	You need to revisit your axioms.
10:12	csarven	"real world problems for users"
10:12	tantek	occam's razor - triples/LD unnecessary
10:12	tantek	until proven otherwise. hence burden of proof
10:13	csarven	I see. So, you arbitrarily come up with a simple view of what triples/LD is, .. then go ahead and document that in the wiki and call it a victory for mf?
10:13	tantek	no it's more a defense against wasting time
10:14	tantek	victories for mf have nothing to do with LD being good or not
10:14	csarven	Why bother with the documentation on the alternative in the wiki any way?
10:14	tantek	victories come from solving real world use-cases
10:14	csarven	mf is victorious in its own right.
10:14	tantek	right
10:14	csarven	There is no need to bash the alternatives.
10:14	tantek	the /triples etc. documentation is because people keep bringing them up
10:14	tantek	like an FAQ
10:14	tantek	so it's a summary answer
10:14	tantek	and it's usually sufficient
10:14	csarven	Well, I appreciate your POV.
10:14	csarven	I agree, it is sufficient for many.
10:15	tantek	there is actually a need to filter out crap
10:15	tantek	in everything
10:15	tantek	and filter out inefficiencies
10:15	csarven	But I disagree on the approach taken "against" LD
10:15	tantek	it's a trivial debunking, that's all
10:15	tantek	if you disagree - you can provide research that substantiates LD
10:15	tantek	until then - there is no point to it
10:16	csarven	That's trivial. Data exists outside of Web pages that are not "common".
10:16	tantek	if it's so trivial, point me to a URL to research
10:16	tantek	barring that, the research doesn't exist, because it's not trivial, or the problems don't actually exist that *require* LD
10:16	csarven	You want me to point you to some research that says "data exists everywhere... not only on web pages"?
10:16	tantek	no to specific such data
10:17	tantek	that somehow has a specific aspect that *requires* LD
10:17	tantek	point to actual research, not meta research
10:17	csarven	LD is a pretty good candidate. How about that? If there is an alternative approach (and often there are, and they are being argued), that can be compared.
10:18	csarven	That's an axiom.
10:18	tantek	nah - you have no problems being solved, so it's just theoretical handwaving
10:18	tantek	it's philosophy, not science
10:18	csarven	RDF is a good candidate for the problem space. And it is based on EAV. .. Just as mf2. The fact that they differ on syntax/namespaces or not.. or whatever, it is a very minor
10:19	csarven	Are you serious?
10:19	tantek	you sound like you're actually asking for extensible vocabulary though, not triples, by your referencing "data exists outside"
10:19	csarven	Do you expect CERN to output their data from LHC into Web pages?
10:19	tantek	so now you're approaching a problem statement - so that's better
10:19	csarven	(I'm not arguing about LD here.. but that data exists elsewhere and that needs to be captured and modelled..)
10:20	tantek	can you point to a URL documenting the specific problems of CERN needing to output the data from the LHC?
10:20	csarven	Uhm.. they already do! http://opendata.cern.ch/
10:20	csarven	And there is more to it. One can't expect all roads to lead to mf.
10:21	tantek	that's a strawman
10:21	tantek	no one said all roads
10:21	tantek	I'm just saying I don't accept any "this is a solution!" statements without documentation of the problem
10:21	csarven	mf is not intended to deal with all those "problems". And that is perfectly fine. Just because mf can't, it doesn't mean that others are irrelevant or are "unnecessarily complex".
10:21	tantek	still don't see any documentation of any such problems
10:22	tantek	sorry - you're not providing *any* actual problem documentation
10:22	csarven	Well, if you want an occam's razor, then EAV, RDF are good candidates.
10:22	tantek	therefore you can't argue about it
10:22	csarven	We are discussing!
10:22	csarven	You want me to address all your issue with URLs on the spot?
10:22	tantek	nope, occam's razor is property:value works, don't need triples
10:22	tantek	yes
10:22	csarven	Especially when you leave an unscientific statement like "unnecessarily complex" up on the wiki?
10:22	csarven	But then go ahead and argue for something scientific?
10:22	tantek	if you can't back up your claims about problems with documentation of specific problems, your arguments are baseless
10:23	csarven	.. for LD?
10:23	csarven	C'mon.
10:23	tantek	unnecessarily complex -> occam's razor
10:23	tantek	already answered, quit asking same question
10:23	csarven	I've already explained to you that data exists everywhere. That's trivial. That's an axiom. Can we not agree to that?
10:23	tantek	nope. document a specific problem.
10:24	tantek	not some handwaving about data everywhere
10:24	csarven	We have a lot of data, and we want to "connect" this data with each other so we can have interesting insights about societies, build better systems, make better decisions...
10:24	csarven	Ok.
10:24	csarven	That's *good enough*
10:25	tantek	again you're speaking in generalities
10:25	csarven	Not at all.
10:25	tantek	stop describing, and start providing URLs to documentation of specific research
10:25	csarven	Very concrete.
10:25	csarven	Did you skip over the whole Data Science trend nowadays?
10:25	tantek	don't care. specific URL or stop talking.
10:25	csarven	You are asking me to justify the problem for the users for CERN's data.
10:25	csarven	.. practically.
10:26	csarven	:)
10:26	tantek	"societies", "systems", = generic
10:26	tantek	nothing specific
10:26	csarven	Okay, let's leave it at that.
10:26	csarven	The moment you are telling me to stop talking ... well, there is no discussion.
10:26	tantek	right, no point in any discussion since you cannot provide a specific URL to specific research about a specific problem
10:27	csarven	I think TimBL made a pretty good case about "linking data" 25 years ago aka Web.
10:27	tantek	barring that, no need for triples/RDF etc.
10:27	csarven	Do we need to revisit that?
10:27	tantek	the web didn't need RDF/LD
10:27	tantek	and succeeded without it
10:27	tantek	more occam's razor
10:27	tantek	thanks for the proof
10:27	csarven	Web didn't need HTML5+JS+Flash... either
10:27	csarven	Web succeeded because of HTML.
10:28	csarven	More generally about linking documents.
10:28	tantek	yup - and the features added to HTML5 were all added one at a time based on documented use-cases
10:28	csarven	Linking "things" is a specialization of that.
10:28	tantek	web succeeded because HTML was *simple*
10:28	csarven	Agreed.
10:28	tantek	TimBL said so himself
10:28	csarven	Yes, and that he decided on HTML instead of something like TeX
10:29	tantek	generalizing and building abstractions without a problem to solve is philosophy not science
10:29	csarven	But the point is that, HTML opened up the idea for linking stuff across the globe. If you have some data and put it up somewhere, we can link to it.
10:29	tantek	here's the difference
10:29	tantek	HTML5 audio and video tags - clear documented use-cases
10:29	tantek	LD/RDF abstractions - no clear documented use-cases
10:30	csarven	I'm sorry to say but, I strongly dislike your position on mf being somehow "scientific", but that upper-case SW or LD is not.
10:30	tantek	science involves documenting your problems, and research
10:30	tantek	SW/LD advocates don't actually bother with that - they just invent stuff and prescribe it
10:30	tantek	no homework, no showing of steps
10:31	tantek	and frankly, there were areas where we didn't do enough documentation with microformats (classic) either
10:31	tantek	and most of those failed
10:31	csarven	I will entertain your idea for a moment. But, have you heard of "stamp collecting"?
10:31	tantek	we were not *strict* enough
10:31	tantek	with asking for documented research
10:31	tantek	the irony of LD advocates - they can't provide links to back up their statements
10:32	csarven	If mf was so "scientific", I'd expect a proper methodology. Starting from hypothesis and null hypothesis, and moving up. Certainly that's not the case. Did mf reject a null hypothesis somewhere? Is that in the wiki?
10:32	tantek	yes!
10:32	csarven	mf is "stamp collecting" just as much as SW/LD/RDF
10:32	tantek	http://microformats.org/wiki/process
10:32	csarven	Information Science.
10:33	csarven	Where is the hypothesis?
10:33	tantek	you start with not needing anything
10:33	tantek	and then documentation is the first step - of the problem etc.
10:33	tantek	you're leading with hypothesis and that's your problem
10:33	tantek	with science, you lead with *observation*
10:34	tantek	i.e. research
10:34	tantek	then you document it
10:34	csarven	That's an axiom. I'm looking for a hypothesis. And that at some point, mf rejected the null hypothesis and went along with the alternative. Where is that mentioned clearly?
10:34	tantek	only after you have documented observations do you go to a hypothesis
10:34	tantek	that's scientific method 101
10:34	csarven	".... hence, we reject the null hypothesis "
10:34	tantek	LD/RDF advocates skipped the observation and documentation steps
10:35	tantek	so thus, unscientific
10:35	csarven	What you are talking about is stamp collecting. Not some brute force testing.
10:35	csarven	tantek Like I said, where is the blurb on rejecting the null hypothesis in the mf wiki?
10:35	tantek	scientific method doesn't need reject null hypothesis
10:36	tantek	thus we don't need it
10:36	tantek	we document existing real world user problems through observation
10:36	csarven	"All science is either physics or stamp collecting" -- Lord Rutherford
10:36	csarven	Thanks. So, again, mf is as "scientific" as SW/LD.
10:37	tantek	nope because we document our problems
10:37	tantek	and require it in our method
10:37	tantek	http://microformats.org/wiki/process#Why.3F
10:37	csarven	Unless you want to show me that hypothesis, then we can classify mf as taking on the "hard-science" approach.
10:37	tantek	whereas SW/LD folks make up vocabularies first, then try to apply them
10:37	csarven	Ok. I stand by my position. I don't think we are disagreeing.
10:38	tantek	it's ok - eventually made-up stuff without documented problems / use-cases withers and falls by the wayside
10:38	csarven	They didn't come up with a vocab out of thin air. Surely that's based on observing patterns or needs. You may argue that their documentation sucks (and I won't necessarily disagree with that). However, it is wrong to suggest that they are somehow doing something that's not scientific.
10:39	tantek	why are you assuming "based on observing patterns or needs"?
10:39	tantek	that's your flaw
10:39	tantek	I'm asking for proof in the form of a URL to documentation of observing patterns or needs
10:39	tantek	but you're willing to accept it on faith
10:39	csarven	No, that's your flaw. Just because you don't know it, it doesn't mean that it doesn't exist.
10:39	tantek	so without that documentation I say it's a waste of time
10:39	csarven	That's a clear distinction to be made.
10:39	tantek	it doesn't exist until evidence is provided
10:39	csarven	Personally, I am a pragmatist.
10:40	csarven	I don't see a flaw in there :)
10:40	tantek	you're taking it on faith
10:40	tantek	I'm saying I don't believe it until you give me a URL to the documentation
10:40	csarven	We all start with axioms.
10:40	tantek	not how you do science sorry
10:40	tantek	you start with observation and documentation
10:40	tantek	philosophers start with axioms
10:41	tantek	hence my point about SW/LD being philosophy, not science
10:41	csarven	You may not believe it because you haven't seen a documentation, yet, you come up with a belief that something is "unnecessarily complex" because that's ... occam's razor?
10:41	csarven	Do you realize how absurd that sounds?
10:41	tantek	no that's the default
10:41	tantek	without evidence, something is unnecessary
10:41	tantek	do you know how absurd it is to suggest otherwise?
10:41	tantek	to suggest you need something without evidence?
10:41	tantek	that's called marketing
10:41	csarven	The fact that there are "observable" 65 billion triples + across ... is not some "philosophy". It exists. Deal with it.
10:41	tantek	lol
10:43	csarven	Sorry, ran out of battery :) ... And you need to go to bed :)
10:43	csarven	(if you have not already)
10:44	csarven	Any way.. I appreciate the chat regardless
10:45	tantek	csarven, again, I'll leave you with, why is it so hard for LINKed data advocates to actually provide LINKs to substantiate their arguments? ;)
10:48	csarven	I tried to explain.. but I probably didn't do a good job. I'm fairly certain that you are quite aware of the SW/LD position. I suspect that issues are not due to technical differences. Some of the arguments against SW/LD (from the mf position) have different roots - some of which I'm aware but that's not the point.
10:48	tantek	this is not unique to SW/LD btw
10:48	csarven	So, when a debate arises, it is not essentially about the technical differences. It gets philosophical.
10:48	tantek	most standards (web or otherwise) don't provide documentation of their problems and use-cases
10:49	tantek	which means they get bloated and political
10:49	csarven	I agree.
10:49	tantek	instead of simple and pragmatic
10:49	tantek	SW/LD is just one example
10:49	tantek	a specific example
10:49	tantek	but there are many (most?) others
10:49	csarven	That's all valid. But, poor communication on that front doesn't equate to the problem not existing. Communicating well is an art.
10:49	csarven	So, don't let the SW/LD "research" "papers" get in the way.
10:50	tantek	not even asking for good documentation
10:50	tantek	just *some* real world documentation
10:50	tantek	yeah the research paper problem
10:50	csarven	IMO, this is as solid documentation as it gets: http://www.w3.org/History/1989/proposal.html
10:50	tantek	the fact that they're not publishing on the web at stable URLs with open access
10:50	csarven	In there, I can see SW/LD/mf... all coexisting, and they do!
10:51	tantek	can coexist doesn't mean must
10:51	tantek	that's the point
10:51	tantek	from a pragmatic minimalist viewpoint, everything must be justified
10:51	tantek	not just by political statements like "coexist"
10:52	csarven	I think that's the point. What you just said.. Many see SW/LD as sufficiently justified.
10:52	tantek	you're right that there's a lot of specific problems described in http://www.w3.org/History/1989/proposal.html
10:52	tantek	every standard / spec developed is seen as sufficiently justified by "many"
10:52	tantek	that doesn't mean they are actually justified, by documented research
10:54	tantek	it would be an interesting exercise to extract the specific problems mentioned/described in http://www.w3.org/History/1989/proposal.html and document them at their own URLs
10:55	csarven	You know, many in the LD community disagree on what LD is too. There is an RDF-only camp vs. RDF is one of many.
10:56	tantek	is that a syntax argument? or a model argument?
11:03	Loqi	[[to-do]] http://microformats.org/wiki/inde...amp;oldid=64630&rcid=101170 * Tantek * (+316) more documentation and research, extract from TimBL's 1989 proposal
11:05	tantek	csarven thanks for the reminder about and URL for TimBL's paper and his documentation or at least referencing of specific use-cases
11:07	csarven	Essentially syntax on the surface but I would say both. Some view HTML/mf/Microdata to all belong to the LD goal.
11:07	csarven	All "linked data". As opposed to whatever "Linked Data" is.
11:08	csarven	TimBL's http://www.w3.org/DesignIssues/LinkedData originally didn't mention RDF/SPARQL. There was an update to include them. So, some confusion arises from that as well.
11:08	tantek	yeah - most use of microformats2 is in a very lowercase "linked" data way - with emphasis on URLs
11:08	tantek	except without all the formalities with URLs for predicates
11:09	tantek	links for the *data* that is, not the predicates/relationships/vocab
11:09	csarven	Yeap. All URIs are welcome in RDF, but for LD, HTTP is most *useful*
11:09	tantek	oh that distinction. yeah the URN thing was hilarious.
11:09	tantek	Urns are what you put dead things into ;)
11:11	csarven	Pretty much. Just a string. Essentially as good as any other unique string
11:15	tantek	right - and thus not as valuable / useful as an actual *link*
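
For anyone skimming the log, here is a rough sketch of the model point argued above ("triples vs. property: value"). It is only an illustration, not something taken from either spec: the item is loosely shaped like microformats2 parsed JSON (a "type" list plus "properties" whose values are lists), the h-card contents and the example.org URL are made up, and to_triples is a hypothetical helper. It shows that an object with properties and values can be flattened into entity-attribute-value rows once a subject identifier is chosen; it leaves out what the two sides actually disagree on, namely whether the attributes themselves need to be URLs.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical item, loosely shaped like microformats2 parsed JSON.
item: Dict[str, Any] = {
    "type": ["h-card"],
    "properties": {
        "name": ["Example Person"],
        "url": ["http://example.org/"],
    },
}


def to_triples(obj: Dict[str, Any], subject: str) -> List[Tuple[str, str, str]]:
    """Flatten one property/value object into (entity, attribute, value) rows.

    Attribute names are kept as plain strings here; an RDF-style model would
    use URLs for them instead (an assumption left out of this sketch).
    """
    rows: List[Tuple[str, str, str]] = []
    for type_name in obj["type"]:
        rows.append((subject, "type", type_name))
    for prop, values in obj["properties"].items():
        for value in values:
            rows.append((subject, prop, value))
    return rows


triples = to_triples(item, "http://example.org/")
for row in triples:
    print(row)
```

Going the other way (grouping rows by subject back into one object) is just as mechanical, which is roughly csarven's EAV point; tantek's point is that the object form alone is enough for the use cases documented on the wiki.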

Social Web Architect

Received on Friday, 13 March 2015 12:20:18 UTC