Re: Review of alternate2.html from Gregg Kellogg on 2013-02-11 (public-linked-json@w3.org from February 2013)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Mon, 11 Feb 2013 18:34:39 -0500
To: Dave Longley <dlongley@digitalbazaar.com>
CC: "public-linked-json@w3.org" <public-linked-json@w3.org>
Message-ID: <28925C26-0BA8-42BA-8510-275635CA8D0F@kellogg-assoc.com>
On Feb 11, 2013, at 1:49 PM, "Dave Longley" <dlongley@digitalbazaar.com> wrote:

> Thanks for the detailed review, Markus. I've got some feedback on your
> comments below:
>
> On 02/11/2013 07:53 AM, Markus Lanthaler wrote:
>> Hi folks,
>>
>> I'm not convinced that splitting the Context Processing algorithm into two
>> (Context Processing and Create Term Definition Subalgorithm) is really such
>> a good idea - especially considering that the Create Term Definition
>> Subalgorithm is just as long as the complete Context Processing algorithm in
>> index.html. The cost of having two algorithms is that, when you are reading
>> it, you have to jump between the two (probably first reading the intro of
>> the second and forgetting where you've been before in the meantime).
>
> I don't know that we should be too concerned with the length of the
> algorithm if it better explains what's going on. I do sympathize with
> the idea that it is an annoyance to be jumping between two algorithms.
> However, I'm not sure that most people would do that; they may just put
> a placeholder in their head for "now create a term definition" whenever
> they see it, without having to jump down and look at all the details
> each time. Furthermore, it makes it more obvious where you might want to
> break out separate functions when implementing because of common behavior.

I tend to agree with Dave: the main algorithm is the driver for creating term definitions. Also, given that creating term definitions is recursive, it makes sense to call this out.

>> The compaction algorithms look to be very similar at the first sight but the
>> devil lies in the details. I've found a number of bugs for which I added
>> tests to the test suite and I saw that Dave already fixed at least one of
>> them in his implementation.
>
> Hopefully they are all resolved now (I already updated the
> implementations and spec text, it was only a few of lines of text). I
> added a couple more test cases as well; there may be some more I need to
> add related to bugs I found when doing the algorithm rewrite (based off
> of the older algorithms), I'll need to double check. Perhaps we can get
> some more implementors to add in test cases that they have found or that
> seem likely to trip up implementations based on a review of the
> algorithms. There was also at least one new test case added that I'm not
> entirely sure I agree with what the output should be (related to term
> selection w/null language mappings).

We should have a process of reviewing contentious test cases.

>> Alternatively, we could
>> also try to get rid of the Inverse Context altogether as I proposed on the
>> mailing list some time ago, see:
>> http://lists.w3.org/Archives/Public/public-linked-json/2013Jan/0047.html
>
> IMO, I don't think it's too difficult to understand with the new text
> and algorithm.

I've come around to it. It is inevitable that a reverse lookup will be used. The main change for me is to go to a term definition abstraction, which is good modeling anyway.

>> 2. Features
>> ===========================
>>
>> Considering the discussion on the mailing list, I would propose to keep
>> using FOAF in these examples.
>
> I agree that we should add FOAF back into some examples.

+1

>> 2.1 Expansion
>> ---------------------------
>> 1st paragraph: Emphasize removal of context, not switching between contexts
>
> Maybe. I like how the main purpose of removing contexts is explained
> now. JSON-LD is often really, IMO, about being able to express the same
> data using different contexts, so it makes sense to talk about
> everything in that ... context.

Hmm; the purpose of expansion is clearly to remove a context. Alternative contexts only come in when re-compacting.

>> Under Example2: "The difference is in their context information" - that's
>> wrong, the contexts are the same.
>> I would thus propose to replace the paragraph with the intro from the
>> corresponding paragraph in index.html
>
> Are you sure? The two contexts look different to me, but we may be
> talking about something else. The context in Example 1 has the term
> "name", whilst Example 2's context does not.
>
>> 2.3 Flattening
>> ---------------------------
>> I would like to mention that all blank nodes will be labeled. This is an
>> integral part of flattening. It makes the examples slightly more difficult
>> but certainly helps to understand the algorithms later on.
>
> This was removed because blank nodes and blank node label is a fairly
> complex issue, IMO. I didn't think it was appropriate to be in the
> introduction. Perhaps it needs more highlighting later on?

All blank nodes are already relabeled in expansion, AKAIKR, so I'm not sure why this stands out as being any different.

>> 5. Algorithms
>> ===========================
>>
>> 5.2 Remote Context Resolution
>> ---------------------------
>> I really like the idea in general but would change it slightly. Instead of
>> pre-processing the whole document I would call the algorithm as first step
>> whenever a local context is being processed. So it would be a sub-algorithm
>> of context processing. It should also be described more formally and include
>> error handling, mention that other data in the remote context document is
>> ignored, etc. Slightly adapting the steps under Context Processing 2.2 in
>> index.html will do the job.
>
> Well, one of the reasons for getting it all out of the way to begin
> with, is to avoid introducing a lot of asynchronous complexity with the
> other algorithms. If someone were implementing a new processor in an
> asynchronous environment, and they tried to follow the algorithms, they
> might find control flow to be a pain. Hopefully they'd notice that if
> they got remote context resolution out of the way first their work would
> be easier, but why not give them a headstart in solving that problem for
> them? I do think both approaches should be mentioned either way.

For my part,  do synchronous processing anyway, and HTTP caching can eliminate most sources of latency; I find it better to keep it in the cor Expansion algorithm, which works well in my case. I'm agnostic about separating the algorithm out, but it sounds like an implementation-specific optimization.

>> 5.3 Context Processing
>> ---------------------------
>> Steps 3.4.4 and 3.5.4 are unclear. Instead of removing the keys, they are
>> set to true. This is confusing till you read the Create Term Definition
>> Subalgorithm.
>
> I don't know that it's confusing to mark those keys as having been
> defined (their entries in the defined map are set to true). I will say
> that the reason why those keys aren't removed is because, in this
> rewrite, I tried to take the approach of not modifying inputs as well as
> not having to say "copy the input" everywhere. This means it takes an
> implementor less mental effort to have to tease out where they can
> optimize away copying and where they can't.
>
>> 5.4 Create Term Definition Subalgorithm
>> ---------------------------
>> I'm not sure that it's a good idea to separate these steps from Context
>> Processing. I also find the introduction quite confusing. Is it talking
>> about the term (the left hand side) or its IRI mapping?
>
> Is the confusion in the choice of the word "term"? I have found some of
> the terminology to be confusing myself; we create "term definitions" for
> keys that are CURIEs ... are these "terms" or not? The terminology in
> the spec indicates that a term is any "short word" defined in a context
> that may be expanded to an IRI. Perhaps we should be saying "keys" (in
> the local context) before terms are defined? I think this is a general
> problem we have in the specification; I know it was there when I was
> looking over the old spec during the rewrite. I think we should get some
> consensus on our terminology for this moving forward.

If a CURIE is on the LHS, it is a term, IMO.

>> Step 3 seems to suggest that it is the left hand side
>> but it is not clear how to proceed after that step.
>
> This may just arise from the general confusion w/"term". All you do is
> invoke the create term definition algorithm recursively to define the
> dependency (as step 3 states), and then you move onto step 4. There's
> nothing special going on there.

The algorithm also causes curies to inherit the term definitions of the prefix, which isn't called out in the Syntax spec, AFAIKR.

>> Step 5 suggests that keyword aliases are managed separately. Why is then
>> just the alias removed and not a previous IRI mapping?
>
> Because the IRI mapping will be overwritten when you define the term.
> I'm not sure if this needs to be made more clear with a note?
>
>> Step 8.1 Remove @preserve
>
> Yes, this is for supporting the framing algorithm and it should be
> removed at this time.
>
>> Step 8.2 vocabRelative must be set to true; the result might be a keyword,
>> but it won't be put handled properly as keyword alias as in 8.1
>
> I thought you are not allowed to use keyword aliases in the context. Has
> this changed?
>
>> Step 8.4 is empty
>
> Fixed.
>
>> Step 11, really? Including type mapping etc. What's the point?
>
> Inheritance from dependencies. You may add/override a container, type,
> or language mapping. For example, if you have a term that is a CURIE
> that defines only a container, language, or type mapping, you want to
> inherit the IRI mapping from the prefix's definition. It sounds like we
> should add several test cases to the test-suite for this.
>
>> Step 12.2.2.2 Property generators should just expand to IRIs, not @type,
>> this should also be checked.
>
> This has been fixed ("@type" is now disallowed as a property generator
> IRI mapping).

Good.

>> Step 12.3 Why special case for @type? vocabRelative flag missing again
>
> The special case has been removed, it was an artifact from a previous
> algorithm. The vocabRelative flag is missing for the same reason as
> above, no keyword aliases are permitted in the context.
>
>> Step 14 Where comes the prefix mapping from? Active ctx, local ctx?
>
> From the active context (the step text is correct). All dependencies
> are resolved at step 3.
>
>> The order of the algorithms is a bit confusing IMHO. It clearly makes sense
>> to put context processing first, but then the IRI expansion algorithm should
>> directly follow it. Alternatively the order of index.html could be used,
>> i.e., the order of their dependency: expansion -> context processing -> iri
>> expansion
>
> I definitely think the order should not be what is in index.html.
> However, if it makes more sense to juggle the order some more in
> alternate2.html I'm all for that. Context processing should definitely
> come first. We should get some consensus on the order afterwards.

I think IRI/Value Compaction/Expansion should come after context processing. For my, EvaluationContext is a class, which implements these algorithms as methods, this is logical to me.

>> 5.6 IRI Expansion
>> ---------------------------
>> The defined map may be more efficient but I believe it's easier to
>> understand how cycles are detected if you pass an array of the currently
>> being processed terms, e.g., [ A, B, C ] if C then requires B you can simply
>> detect the recursion if it's already in the algorithm. Don't know, maybe
>> it's also just a personal preference. I don't have a strong opinion about
>> this but wanted to mention it.
>
> The map was also used because of the way the algorithm was rewritten --
> it stores a tri-state not defined/defining/defined, not just
> defining/not defined. This is used to avoid infinite recursion and is
> simpler to understand, IMO. I thought showing that there were three
> states was more clear than just having an array. Clearly we differ here;
> I don't really mind which way it is explained in the algorithm so long
> as the consensus is that it is easiest to understand (or maybe it just
> doesn't matter that much).

I sort of liked this too.

>> Problem desc: The algorithm doesn't care if it's already an absolute IRI, it
>> tries nevertheless. The given value might be an absolute IRI, we don't know
>> if it is at this point.
>>
>> General solution: I would shorten it, maybe just keeping the first
>> paragraph. The rest is just as complicated to read as the algorithm itself.
>
> I remember wanting to shorten that particular general solution as well.
>
>> Algorithm
>> Step 1: Also keywords should be returned as is
>
> This is handled in step 5 to better consolidate, after keyword aliasing
> has been handled.
>
>> Step 2 & 3: This should be made a bit clearer. It's quite difficult to
>> understand what's going on. The point is, that terms/prefixes in the local
>> ctx that haven't been processed yet, are processed first.
>
> Agree. We could change the language a bit here to clarify.
>
>> Step 4: no need to say "explicitly ignore value", this is confusing. Just
>> return null.
>
> I disagree. I wanted to know why null was being returned and I wanted to
> know it right there in the algorithm without having to go find out what
> was going on via another document or statement somewhere. I think it's
> important to have notes within these algorithms about what the special
> meaning of things, like a "null mapping", are.
>
>> Step 6 onwards: isAbsoluteIri is confusing, could be a bNode ID as well.
>
> There are probably some other temporary variable names that could be
> better, in addition to this one.
>
>> Step 7: Why not just return the mapping (given vocabRelative is true)?
>
> It may be a blank node, which needs renaming. This is consolidated into
> a single step below.
>
>> The whole branching around isAbsoluteIri is really confusing.
>> Maybe it's simpler to use the algorithm index.html and invoke the Create
>> Term Def. algorithm there instead of recursing into itself (even if I find
>> that simpler since the algorithm is self-contained).
>
> I definitely prefer keeping the algorithm self-contained. Maybe we can
> clean up the branching in another way. I would prefer that we don't
> repeat ourselves too, if we can avoid it. I'm ok with the general idea:
> get the value associated with the active context, if it's a keyword or
> null return, if you got something else don't look for a CURIE, otherwise
> you got nothing, so you have to check for a CURIE. Once you've dealt
> with the possible CURIE, do blank node renaming or apply vocab or base.
> I think that flow is easy to follow and it's what we should go with, we
> might just need to clean up the algorithm text a bit.

Another thing, it should be clear what the document base is when running this. When expanding a context IRI, it should be the referencing context.

>> 5.5 Expansion
>> ---------------------------
>> Step 3.1: The @context member is not removed
>
> Yeah, I see that now. Perhaps we want to instead skip that key further
> down (rather than removing it, to keep with the notion of not having to
> copy inputs).
>
>> Step 3.2: This wouldn't be needed if @graph would be handled accordingly
>> under 3.4.3 (see index.html 4.3.2.8) and would simplify 3.4.6, 3.9, and 4.1
>
> We should look into this. From what I recall, the algorithm in
> index.html was buggy. One of the test cases I added might be related to
> this as well (it didn't work w/your playground). There may be another
> that is required to further expose the problem.
>
>> Step 3.4.2: ... or a bnode ID
>
> Yes, just another case related to bnode IDs not being IRIs. It would be
> nice to have a catch-all identifier name that includes bnode IDs. Either
> that or we can use "if the expanded property is a string that contains a
> colon", but I find that somewhat ugly and doesn't do as good of a job
> explaining what's going on.

Yes, let's mint a "term" for this.

>> Steps 3.4.3.x: IMHO it would be much simpler to expand the values directly
>> here (just as in steps 4.3.2.x in index.html) instead of having to expand
>> the property again in value expansion to find out that it is @id and then
>> expand using the IRI expansion algorithm. All keywords would then be
>> completely processed after this step.
>
> The split was made to separate validation from the general processing
> and to avoid repeating the same steps for array processing of @type,
> etc. In this way @type isn't handled differently from @list or @graph,
> both of which recurse. These array-handling steps are not present in the
> index.html version (so it doesn't work if you implement it according to
> the text because IRI expansion doesn't handle arrays). You can either
> recurse to handle arrays or you can repeat yourself; I tried to keep
> things consistent with recursing for @type ... which also meant it
> seemed cleaner to keep validation consistent and separate from general
> processing. I think this is all a matter of style. Of course, if we can
> simplify it all without repeating ourselves somewhere I'm all for that.
>
> It may be that it's just always a tradeoff with where we repeat
> ourselves. In that case, I'd prefer to just recurse through the
> expansion algorithm to let it handle arrays vs. other values and let all
> values run through "value expansion". The difference in approach may
> again arise from the somewhat amorphous nature of our terminology (eg:
> "value").
>
>> Step 3.4.4.2: "and value language value in value" is confusing
>
> Agree. We should comb over the spec and read it without special markup
> like italics, etc. -- for similar issues.
>
>> Step 3.4.6: might need some clarification. It might be easier to put it
>> under 3.4.3
>
> Agree, needs clarification.
>
>> Step 3.4.11: also this step would be simpler if everything keyword-related
>> would be done under 3.4.3
>> Step 3.5.3: already verified in 3.4.3.5, but @type must be a string and not
>> an array
>> Step 3.9: could be simplified if 3.2 is changed (see index.html 4.6)
>
> Again, these simpifications come with tradeoffs. As usual, we should do
> whatever the consensus is.
>
>> insideList flag can be eliminated by checking looking instead into active
>> property's container mapping or active property=@list
>
> We need to check this. That may not be true -- unless a side-effect
> somewhere else handles a particular case, perhaps through recursion. I
> think we need more test cases in the test-suite for this.
>
>> 5.7 Value Expansion
>> ---------------------------
>> As already said above, I think it would be simpler to handle @id and @type
>> expansion directly in the expansion algo. Fewer jumps, fewer branches,
>> shorter algorithm(s) - not a deal breaker though.
>>
>> Step 5: not needed, it's done again in step 9.
>> Step 10: default language missing
>> Step 12: missing, return result
>
> The result is returned in step 11, but we can break that out into a
> separate step if necessary. The other two steps have been fixed.
>
>> 5.8 Label Blank Nodes Subalgorithm
>> ---------------------------
>> The isId flag isn't needed, @type is relabeled already in IRI expansion
>> When will step 4 ever be reached (and do something, not reassigning the same
>> bnodeId it already is)?
>
> This has been removed. It used to be used when blank node relabeling
> could be "triggered" to be turned on when encountering a property
> generator instead of having to always be on.
>
>> 5.9 Generate Blank Node Identifier
>> ---------------------------
>> Proposal: Get rid of the configurable prefix, we never change it anyway.
>> Just hardcode _:t or, even better because it's easier to remember, _:b.
>>
>> Briefly describe what the identifier map is used for, i.e., keeping track of
>> blank node reassignments.
>
> I agree with this. The reason why the prefix was configurable is because
> this algorithm is reused by RDF graph normalization (no longer part of
> this document).
>
>> 5.10 Compaction Algorithm
>> ---------------------------
>> I didn't mention this for the other algorithms, but I think it would be
>> simpler to reorder the branches so that the longest comes last. In this case
>> e.g., first process scalars, then arrays, and finally objects (instead of
>> array, object, scalar which results in a lonely one-liner at the end).
>
> I think it's easier to think of "scalar" as "not the other things". I'm
> not sure if other developers will tend to agree with me or not. I have a
> vague feeling (no hard data), based on my experience with reading other
> people's code, that they prefer to check for arrays and objects first
> ... and then fall back to primitives. This is cleaner to do in certain
> languages. However, I do prefer that shortest steps come first. I could
> go either way here.
>
>> The algorithm needs to be passed the inverse context otherwise it can't pass
>> it to the subalgorithms (or they have to re-create it constantly from the
>> active context)
>
> I may have been intentionally vague on how you get to the inverse
> context. Perhaps that should be changed.
>
>> Step 1.3: The compactArrays option is not used
>
> This needs to be added.
>
>> Step 2.2: I don't think we need to explain a shallow copy is created and
>> certainly we don't need to distinguish between array values and non-array
>> values, the point is that each value is copied but the references contained
>> in that copy remain the same references.
>
> The reason for the distinction was to highlight that the shallow copy is
> actually two-levels deep. If we can clean this up without losing what's
> going on, that would be great.
>
>> Step 2.4: Say keys from shallow and mention that this has to be done because
>> keys can be removed from shallow in the following loop
>
> I think it says from "element", not from "shallow".
>
>> Step 2.5.3.1: The IRI compaction algorithm has no documentRelative flag; it
>> is also not needed as vocabRelative can be used to determine if the value is
>> being used in @type (true) or @id (false). All IRIs are document-relative.
>
> This has been corrected.
>
>> Step 2.5.5.1: Is activeProperty the same as "active property" or not? It's
>> confusing to have two variables with the same name
>
> Not the same. This has been fixed.
>
>> Step 2.5.5.1 (and in general): Passing parent to the IRI compaction algo
>> which then passes it to term selection just to eliminate *some* of the
>> potential property generators because not all IRIs are in shallow doesn't
>> make sense to me since it doesn't optimize much but makes the algorithms
>> much harder to understand. You can easily do that check in the Find and
>> Remove Property Generator Duplicates Subalgorithm and I think that's the
>> place where it belongs.
>
> I think it makes more sense to consolidate how term selection happens.
> It's easier, IMO, to understand that you pass in some parameters to a
> term selection algorithm -- which then gives you back the term you are
> supposed to use. I think that's better than it maybe returning you a
> list of terms to try out yourself.
>
>> Step 2.5.5 and substeps: There's a bug that's a bit difficult to explain,
>> please see compact-0046 which I just added.
>
> This is fixed.
>
>> Step 2.5.6.2: Active property could be an array of potential property
>> generators, how do you choose the container then? Actually I went on and saw
>> that property generators are not handled correctly at all. I added another
>> test, compact-0047 which shows the problem. Given this bug, the rest of the
>> algorithm works under the wrong assumptions. So it doesn't make much sense
>> to review the remaining steps.
>
> This has been fixed.
>
>> 5.12 Inverse Context Creation Subalgorithm
>> ---------------------------
>> Good idea to sort the terms before processing them! Overall, the algorithm
>> is 4 times as long as the one in index.html. While I think we should clarify
>> the one in index.html I don't think it's necessary to blow it up that much.
>
> I'm not sure if the algorithm is actually longer or if the other one did
> not really do enough of an explanation for what was going on. Also, the
> newer approach seems to simplify understanding the inverse context overall.
>
>> Overall: I find it confusing to "initialize [variable] to the value
>> associated with the [key] key in [map]. What you do here is to store a
>> reference to that member, you don't assign its value to [variable].
>
> I'm not sure what language we want to use to clarify what's going on
> here. We could easily be more verbose to hide what is really an
> implementation detail, but it is also just used to help people
> understand the steps by breaking them down a bit. I'm not sure what we
> want to do with these situations in general.
>
>> Step 3.3.3: It would be more efficient to do this outside of the loop
>> Step 3.3.4: No need to initialize the defaultLanguage key here, this is done
>> again in 3.3.6.3.1
>> Step 3.3.6: I would find it easier to understand if the checks would be "If
>> there's a type mapping.", "Otherwise, if there's a language mapping", and
>> finally just "Otherwise". This would also eliminate some nestings.
>
> These sound like good suggestions.
>
>> There's a bug since the algorithm doesn't distinguish between a term with no
>> language (and no type) mapping and a term with a language mapping set to
>> null. I've added compact-0048.
>
> I think it does make that distinction, but I think it makes in an
> opposite direction from the test case. I'm not sure that test-case has
> the desired output. It seems like we'd prefer to select the null
> language term over the no-language term because it is more specific.
>
>> 5.11 IRI Compaction Algorithm
>> ---------------------------
>> As already said in the Compaction algorithm, I would suggest dropping the
>> "parent" parameter. Instead I would pass the inverse context explicitly.
>> Depending of the outcome of ISSUE-204, also the base IRI is required.
>>
>> Step 5.4: This is related to the bug described above, numbers, e.g., have no
>> language mapping. This is different than strings with a language that equals
>> null.
>
> I'm not sure I agree with this approach. I'll defer to the group,
> though. It's a very easy change if we decide to go the route you're
> proposing.
>
>> Step 5.4.4 and substeps: At least @id objects are not handled properly; I've
>> added compact-0049 for this bug. I won't review the rest of these sub-steps
>> but continue directly with compact IRIs.
>> Step 7.4 and 7.5: Why introduce the variable isUsableCurie here?
>
> The bug has been fixed. That variable is meant to help make clear when a
> CURIE can be used and when it can't. We could do the same in prose and
> remove the variable.
>
>> I can't see how the algorithm works if there are potential property
>> generators that turn out to not match because the data doesn't match. I've
>> added compact-0050 as a minimal test. The point is that the fallback terms
>> are never returned. As I understand the algorithm, in this case only "gen"
>> would be returned.
>
> This has been fixed.
>
>> Depending on the outcome of ISSUE-204, compaction to relative IRIs is
>> missing.
>>
>>
>> 5.13 Term Selection Subalgorithm
>> ---------------------------
>> Drop parent parameter and the corresponding "optimization"
>>
>> Sometimes @null is used, sometimes @none (also in the other algorithms). I
>> find this confusing as it is very difficult to remember what's in use where.
>
> I thought it was very important to distinguish between the two. Saying
> something is null in JSON-LD explicitly means you don't want to use it;
> whereas "@none" is used when you haven't said anything at all.
>
>> Step 3: Missing space "underwhich"
>> Step 4: Typo in typeOrLangageValue. In step 3 typeOrLanguageValue is
>> modified, in step 4 then a new variable "preferred" is introduced. Why not
>> reuse typeOrLanguageValue? Why not do those steps directly in IRI
>> compaction? Same question for step 1
>
> We could modify typeOrLanguageValue to be an array when the algorithm
> begins if people think that's clearer than putting it into an array of
> preferred values to iterate over. Perhaps renaming the variables, in
> general (you also suggest this), would help resolve this issue.
>
>> 5.14 Value Compaction
>> ---------------------------
>> Substeps of 1: The whole preserveIndex stuff is quite complicated. Why not
>> just create a copy of value and then modify that copy as the algorithm in
>> index.html does?
>
> I didn't think it was that complicated. I'll have to compare the two
> algorithms again.
>
>> Step 2: What if there's also @index? Why expand the value? Why IRI expand an
>> object? It's already expanded, we are compacting here. I think it should
>> mean expand active property. Typo "passing passing"
>>
>> PROPOSAL: Keep the algorithm in index.html which is half as long. Modify it
>> to work on a copy of value if necessary or construct a copy step-by-step
>> (this should require just minimal changes).
>
> I think the case where @index is in a subject reference may not be
> handled. Sounds like we need a test-case for that. Also, the algorithm
> in index.html doesn't seem to handle compacting the IRI in that case.
> The length of the algorithm may have more to do with
> understandability/clarify than anything, but I'll have to go back and
> look. Obviously, if the extra verbosity adds nothing we should shorten it.
>
>> 5.15 Find and Remove Property Generator Duplicates Subalgorithm
>
> This algorithm has been patched up a bit since you wrote your review.
> Hopefully it functions appropriately now. It is also reused (used in two
> places) now.
>
>> ---------------------------
>> In contrast to the algorithm in index.html, the algorithm accepts just a
>> single property generator. So they do something different.
>>
>> The algorithm in index.html does the following:
>>   - Check for *every potential* property generator if all of its IRIs are
>> included as properties in element
>>   - If so, check if they all contain the passed value
>>   - If so remove those duplicate values (and potentially the whole property
>> if no other value is left)
>>   - Return the first matching property generator for which all duplicates
>> have been found
>>   - If no matching property generator was found, return the fallback term,
>> compact IRI or IRI.
>>
>> The algorithm in alternate2.html does the following:
>>   - It checks for a *single's* property generator IRIs if element contains
>> the duplicates
>>   - If a duplicate is found, it is deleted straight ahead (not checking
>> first verifying whether all duplicates are really there) - this algorithm
>> doesn't seem to work on a copy of element so data is potentially lost
>>   - It doesn't return anything
>>
>> Since the algorithm is just invoked in the compaction algorithm the
>> remaining work must be done there but it isn't. Here's what done there:
>>
>>   If there is a term definition for activeProperty in active
>>   context that is a property generator, then invoke the Find
>>   and Remove Property Generator Duplicates algorithm, passing
>>   active context, shallow for element, expanded property,
>>   expanded item for value, and activeProperty.
>>
>> So I cannot see how this would possibly work. I would thus propose to keep
>> the algorithm of index.html but include steps 2.3.1.X from alternate2.html
>> to describe in detail what's a duplicate and what's not.
>>
>> The remaining algorithms haven't been changed.
>>
>>
>>
>> --
>> Markus Lanthaler
>> @markuslanthaler
>
>
> --
> Dave Longley
> CTO
> Digital Bazaar, Inc.
> http://digitalbazaar.com

I'll have more later as I work my way through a re-write.

Gregg
Received on Monday, 11 February 2013 23:35:25 UTC