Re: [xbiblio-devel] Syntax proposal: conditions [take two]

Sylvester Keil Mon, 13 May 2013 07:45:20 -0700

On May 13, 2013, at 2:44 PM, Frank Bennett wrote:

> On Mon, May 13, 2013 at 7:33 PM, Sylvester Keil <[email protected]> wrote:
>> 
>> On May 13, 2013, at 11:22 AM, Charles Parnot wrote:
>> 
>>> The conclusion on the original proposal was that it was hard to read (from 
>>> a human perspective) and had the problem of moving some of the logic to the 
>>> attribute values (which from an implementation point of view, and for 
>>> validation purposes, is a problem).
>> 
>> Isn't this problem eliminated by introducing the nand matcher?
>> 
>> As I understood it the new proposal does not support the individual negation 
>> either but uses the 'nand' matcher for this purpose – you can apply this to 
>> the original proposal just as easily. The example becomes:
>> 
>> <if match="all" type="book" variable-nand="edition">
>> 
>>> This proposal is more verbose, but also more readable.
>>> 
>>>> ['all', ['type', 'all', ['book']], ['variable', 'nand', ['edition']]]
>>> 
>>> 
>>> Only programmers can parse that ;-)
>> 
>> Indeed – it wasn't supposed to be a new syntax proposal, just to explain 
>> what sort of information we as programmers need to extract from the style. : )
>> 
>>>> This is the information I need to process the conditions – I personally 
>>>> think that using 4 XML nodes to describe this list makes testing a 
>>>> nightmare.
>>> 
>>> It's again a trade-off between being friendly/readable to style 
>>> implementors, and being friendly to the processors. Though I am not sure 
>>> where the nightmare comes from in this case. Did you mean it's harder to 
>>> build test cases? In which way?
>> 
>> Is it being friendly to style implementors though? Personally, I don't enjoy 
>> writing so much XML. Nightmare is perhaps a little strong, granted, but 
>> look: I want to have test cases for each individual element – it helps limit 
>> errors when I am able to do that. With the new syntax the sole purpose of 
>> the if or conditions element is to fetch the conditions from its child 
>> elements – that's something you cannot test in isolation unless you mock all 
>> the child elements. However, since the logic involved here is so trivial, it 
>> isn't really worth the trouble to write mocks – and your tests will, in all 
>> likelihood, not be isolated anymore.
> 
> That's true. But the problem isn't limited to this syntax. cs:label
> inside cs:names can't be tested without its parent and sibling
> elements, and children of a sibling cs:substitute node. Implicit
> suppression in cs:group can't be tested without its children. cs:macro
> can't be tested without children.


Of course you need integration tests, too – my point is that switching from 
attributes to elements has a cost: condition evaluation (note that I'm not 
talking about actual rendering/processing, but just evaluating the condition) 
now requires mocks for unit testing too.

In trying to write a modular processor, every inter-dependency between elements 
hurts. Obviously we have this elsewhere, too, but this adds one more instance 
to the list – I'm not against it on principle, but if all we get from this is 
improved legibility in a handful of complex styles I wonder if it is really 
worth it. You have a much better grasp of both sides – styles and processor – 
so I'll gladly take your word on it, if you say that it is worth it after all. 
I just wanted to emphasize that I do feel it is a steep price to pay.

This is not even considering the issue of dealing with two syntax variants, 
which adds a considerable burden I think.

>> Furthermore, I am writing a processor specifically to be able to deal with 
>> CSL fragments, because sometimes users want to overrule a specific element 
>> on the fly or because they don't need a full CSL style, but just want to 
>> format, say a title, or a name – for these cases, and for testing, it is 
>> extremely useful to just create a new node on the fly and feed it to the 
>> processor. Having child elements always complicates matters because you need 
>> to create those first.
>> 
>> Of course, in many cases this is necessary – but is that true for a 
>> condition? More so, as conditions are something that, conceivably, you might 
>> want to quickly change on the fly if you're working with a CSL processor 
>> programmatically.
>> 
>> I have the feeling that CSL processors tend to become these monolithic 
>> black-boxes that take the style, and the locale, and the citation data and 
>> give you the output – but are extremely difficult for other developers to 
>> work with from the outside – this is certainly what happened to 
>> citeproc-ruby and this is why I started to re-write it from scratch. Having 
>> to handle an XML sub-tree to hold a simple condition is precisely something 
>> that encourages this tendency in my opinion.
> 
> I agree that it is less cumbersome to have the entire logic on the
> cs:if or cs:else-if node, but working through some initial cleanup in
> one of the legal styles has taught me that the second proposal is
> necessary to produce more compact code in a complex style.
> 
> Here is what the title macro from the American Law style now looks like:
> 
>  https://gist.github.com/fbennett/5567599
> 
> Welcome to my nightmare: as you can see, the underlying style is quite
> a disaster zone, but we're stuck with it (and not just us -- pity the
> poor paralegals who are currently tasked with typing this stuff out by
> hand). The conditions seem to cope: the current set of 1760
> integration tests for the style now pass.

Ouch! This reminds me of the BibTeX style manual – 'taming the beast' : )

> Through this consolidation I was able to remove four macros and 60
> lines of code. Any mis-rendered title in the style will now originate
> from this macro, which should save a lot of debugging time going
> forward. With the initial proposal, the result would not have been
> this compact.

Looking at this style I do see the advantages. It is mostly that this allows 
you to include different combinations of condition-types in the same statement, 
right? (The one on line 130 is a good example) Whereas with the original 
proposal you would have to split that into individual else-ifs all targeting 
the same macro?

How many styles like that are there, though? There's no doubt that this makes 
the syntax more powerful – if this outweighs the costs in your opinion, I'm all 
for adding it.

> Implementing the syntax wasn't too much work, since it was just a
> matter of nesting functions that had already been coded. But that's
> just one experience: I hear you when you say that spreading logic
> across nodes has costs. I just don't see how logic of several of the
> conditions in the sample could be squeezed onto a single node.

I don't think that implementing this syntax is a problem: my worry was that it 
adds unwarranted overhead. If there are enough benefits for enough styles, I 
agree that it is worth it.
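
For what supporting both variants could look like on the parsing side, here is
a minimal sketch in Python using the standard library's ElementTree. Several
things are assumptions made for illustration only: the function names, the
rule that a `-all`/`-any`/`-none`/`-nand` suffix on an attribute name selects
the inner matcher (generalized from the `variable-nand` example in this
thread), "all" as the default matcher, and the absence of the CSL namespace.

```python
import xml.etree.ElementTree as ET

MATCHERS = ("all", "any", "none", "nand")


def from_attributes(node):
    """Attribute form: <if match="all" type="book" variable-nand="edition"/>."""
    groups = []
    for name, value in node.attrib.items():
        if name == "match":
            continue
        # Assumption: a trailing "-<matcher>" selects the inner matcher;
        # other hyphenated names (e.g. "is-numeric") are left untouched.
        head, sep, tail = name.rpartition("-")
        if sep and tail in MATCHERS:
            kind, matcher = head, tail
        else:
            kind, matcher = name, "all"
        groups.append([kind, matcher, value.split()])
    return [node.get("match", "all"), *groups]


def from_elements(node):
    """Element form: <if><conditions match="all"><condition .../></conditions></if>."""
    conditions = node.find("conditions")
    groups = []
    for cond in conditions.findall("condition"):
        matcher = cond.get("match", "all")
        for kind, value in cond.attrib.items():
            if kind != "match":
                groups.append([kind, matcher, value.split()])
    return [conditions.get("match", "all"), *groups]


OLD = ET.fromstring('<if match="all" type="book" variable-nand="edition"/>')
NEW = ET.fromstring(
    '<if><conditions match="all">'
    '<condition type="book"/>'
    '<condition variable="edition" match="nand"/>'
    '</conditions></if>'
)

print(from_attributes(OLD) == from_elements(NEW))  # True
```

Both readers produce the same list for this example, so the parsing itself is
cheap. The cost under discussion lies in carrying two readers, and in the
extra nodes the element form needs.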

>> 
>> I realize that at the end of the day it is largely a matter of taste: do you 
>> prefer more attributes or more elements? – But to actively support both 
>> approaches is a really bad idea in my opinion. Most processors parse the 
>> style and convert it into some kind of internal representation – but all 
>> tools that also want to convert the internal representation back into a valid 
>> style are in a really bad position now: the easy way out is of course, to 
>> always convert to the more expressive syntax – but if, as you say, the idea 
>> is to keep the old one for the simple cases because it is shorter and more 
>> readable, this defeats the purpose.
> 
> I confess that the idea of converting the internal representation back
> to XML never occurred to me, and I'm curious. What would be the use
> case for that?

For example, Rintze asked me the other day if we could detect styles that had 
superfluous strip-periods attributes. This involves a kind of mini cite 
processor because you need to translate all possible combinations of term 
translations. You can see the barebones of how this could work here: 
https://gist.github.com/inukshuk/5508232 – this is much easier to do with a 
dedicated library than in pure XML. Once you detect the changes, it would be 
great to simply save them back as XML but for that you need to convert the 
internal representation back to XML. When doing that, there are already many 
subtleties (e.g., how to treat attributes which are set to default values) – 
having to keep track of different output variants doesn't make the job easier.
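
To illustrate the round trip, here is a minimal sketch in Python that writes
the same internal list back out in either variant. It is hypothetical
throughout: the function names, the choice to omit the inner matcher when it
equals an assumed default of "all", and the `<kind>-<matcher>` attribute
naming are illustrative assumptions, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

DEFAULT_MATCH = "all"  # assumption: the matcher a reader would fall back to


def to_attributes(conditions):
    """Serialize to the attribute form, e.g. variable-nand="edition"."""
    matcher, *groups = conditions
    node = ET.Element("if", {"match": matcher})
    for kind, inner, values in groups:
        # Leave the suffix off when the inner matcher is the default.
        name = kind if inner == DEFAULT_MATCH else f"{kind}-{inner}"
        node.set(name, " ".join(values))
    return node


def to_elements(conditions):
    """Serialize to the element form with cs:conditions/cs:condition children."""
    matcher, *groups = conditions
    node = ET.Element("if")
    wrapper = ET.SubElement(node, "conditions", {"match": matcher})
    for kind, inner, values in groups:
        cond = ET.SubElement(wrapper, "condition", {kind: " ".join(values)})
        # Leave the match attribute off when it is the default.
        if inner != DEFAULT_MATCH:
            cond.set("match", inner)
    return node


CONDITIONS = ["all", ["type", "all", ["book"]], ["variable", "nand", ["edition"]]]

print(ET.tostring(to_attributes(CONDITIONS), encoding="unicode"))
print(ET.tostring(to_elements(CONDITIONS), encoding="unicode"))
```

A tool that saves styles would have to pick one of the two writers for every
condition, and would need some rule for when a condition that was read in one
form may be written out in the other.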

Sylvester



>>> 
>>> On May 13, 2013, at 10:40 AM, Sylvester Keil <[email protected]> wrote:
>>> 
>>>> 
>>>> On May 13, 2013, at 9:50 AM, Charles Parnot wrote:
>>>> 
>>>>> When you put it this way, I agree "backward-compatibility" is more of a 
>>>>> trick than a real thing.
>>>>> 
>>>>> However, it got me thinking: do we want to get rid of the "old" syntax, 
>>>>> where the predicates are in the `<if>` element? I would think no, we want 
>>>>> to keep them. This is still the shortest and most readable way to write 
>>>>> conditionals in most cases. So maybe in this case, we are in fact truly 
>>>>> backward-compatible :-)
>>>> 
>>>> Charles, you're right, if we still want to encourage its use that's a 
>>>> compelling reason to keep the old way around, of course. (I wanted to make 
>>>> the case firmly against keeping old features around just for the sake of 
>>>> compatibility.)
>>>> 
>>>> But if this is the case, it gives me even more reason to believe that the 
>>>> original proposal was actually the better one – if only a handful of 
>>>> styles need the advanced syntax anyway it seems much more reasonable to me 
>>>> for it to be an extension of the normal syntax rather than a completely 
>>>> different one. Especially since you decided to lose the 'not:' in favour 
>>>> of 'nand' matchers anyway.
>>>> 
>>>> Do we really want to add two new elements for an alternate syntax when 
>>>> most styles use attributes on cs:if anyway?
>>>> 
>>>> Thinking about how to implement this, my best approach was to simply 
>>>> convert the new conditions to the lists which can be used with the 
>>>> original/extended rendering algorithm. But this just tells me that all 
>>>> we're accomplishing here is introducing a bloated XML syntax to describe 
>>>> simple lists.
>>>> 
>>>> Looking at the example:
>>>> 
>>>> <choose>
>>>> <if>
>>>> <conditions match="all">
>>>>   <condition type="book"/>
>>>>   <condition variable="edition" match="nand"/>
>>>> </conditions>
>>>> <text macro="some-text-macro"/>
>>>> </if>
>>>> </choose>
>>>> 
>>>> 
>>>> What I would do here, in order to evaluate the statement is translate the 
>>>> conditions to something that can be reduced efficiently. Something like 
>>>> this:
>>>> 
>>>> ['all', ['type', 'all', ['book']], ['variable', 'nand', ['edition']]]
>>>> 
>>>> This is the information I need to process the conditions – I personally 
>>>> think that using 4 XML nodes to describe this list makes testing a 
>>>> nightmare.
>>>> 
>>>> Also, think of the headaches you're creating for tools that create styles 
>>>> like the style editor: if you have a condition in the 'old' way and you 
>>>> change a little detail – does it stay in the old way or is it converted? 
>>>> If you try to keep it in the old way then once you add a feature that 
>>>> can't be expressed in the old way you need to convert everything etc. – or 
>>>> if tools cannot handle this sort of thing, as a style author you're forced 
>>>> to convert all conditions to the new syntax if you want to add something 
>>>> that was not supported previously. With the original approach you can 
>>>> always add new conditions without having to change the old ones.
>>>> 
>>>> Sylvester
>>>> 
>>>> 
>>>> 
>>>>> On May 13, 2013, at 9:11 AM, Sylvester Keil <[email protected]> wrote:
>>>>> 
>>>>>> 
>>>>>> On May 8, 2013, at 9:19 PM, Rintze Zelle wrote:
>>>>>> 
>>>>>>> On Wed, May 8, 2013 at 3:06 PM, Bruce D'Arcus <[email protected]> wrote:
>>>>>>>> I'd like to ask an orthogonal question. See below ...
>>>>>>>> 
>>>>>>>>> As this is backward-compatible, deployment in a CSL 1.0.2 version
>>>>>>>>> should be possible (but I have no strong opinion).
>>>>>>>> 
>>>>>>>> Do we have a common understanding of how we define
>>>>>>>> "backward-compatible" for CSL?
>>>>>>>> 
>>>>>>>> Do we maybe need to make that explicit if it's not now?
>>>>>>> 
>>>>>>> Didn't we have this discussion already when I prepared the release of 
>>>>>>> CSL 1.0.1?
>>>>>>> 
>>>>>>> From http://en.wikipedia.org/wiki/Backward_compatibility: "If products
>>>>>>> designed for the new standard can receive, read, view or play older
>>>>>>> standards or formats, then the product is said to be
>>>>>>> backward-compatible". So for us, I would define backward-compatibility
>>>>>>> as the ability of CSL processors written for one specific version
>>>>>>> (e.g. "1.0.1") to process CSL styles written in an earlier CSL version
>>>>>>> (e.g. "1.0").
>>>>>> 
>>>>>> I think the following are two separate things:
>>>>>> 
>>>>>> - the spec is backwards compatible
>>>>>> - a CSL processor is backwards compatible
>>>>>> 
>>>>>> Note that no matter what you write in the spec a processor can 
>>>>>> *always* opt to be backwards compatible (unless you outright forbid it) 
>>>>>> by analyzing a given style and choosing between alternative algorithms.
>>>>>> 
>>>>>> Note also that you can *always* make new features backwards compatible 
>>>>>> in the spec by doing exactly what you propose below: explicitly 
>>>>>> specifying alternatives by saying there is a 'new' way and an 'old' way. 
>>>>>> In my opinion this should not be part of a spec – by all means, you can 
>>>>>> add a note urging implementers to also support an 'old' way, but why 
>>>>>> force them? I find this extremely disrespectful – sure, it's easy for an 
>>>>>> established implementation to switch to a new feature and keep the old 
>>>>>> one around for compatibility's sake, but think for a moment what you're 
>>>>>> saying to developers working on a new implementation (I understand that 
>>>>>> this is something to be encouraged) – essentially, to implement a 
>>>>>> feature twice, or to wrap their head around how they can reconcile two 
>>>>>> approaches into a single algorithm – precisely what the spec fails to do 
>>>>>> in the first place.
>>>>>> 
>>>>>> If a feature differs so much that the underlying syntax changes or that 
>>>>>> new elements need to be created, I don't think it's fair to call it 
>>>>>> backwards compatible. Such a feature should go into a major release and 
>>>>>> the legacy feature should be dropped completely. By doing this you're 
>>>>>> actually *helping* implementers to provide backwards compatibility, 
>>>>>> because they can simply look at a style's version and tell whether they 
>>>>>> need to follow the 'new' or 'old' way.
>>>>>> 
>>>>>> A new feature that can be implemented in such a way that the *same* 
>>>>>> algorithm can still render old styles – this is something I would call 
>>>>>> backwards-compatible (as far as the spec is concerned). The original 
>>>>>> proposal for the feature in question was just that by the way – I would 
>>>>>> see no problem to include it in a minor release. This is not the case 
>>>>>> with the current proposal.
>>>>>> 
>>>>>> Sylvester
>>>>>> 
>>>>>> 
>>>>>>> If this proposal is accepted, there would be two ways to encode
>>>>>>> conditions in CSL styles: the "old" way, where all conditions and the
>>>>>>> "match" attribute are loaded on cs:if's and cs:else-if's, and the
>>>>>>> "new" way, where we use the XML structure from the current proposal.
>>>>>>> Styles would be able to mix & match at the level of individual cs:if's
>>>>>>> and cs:else-if's (e.g., using the "old" structure for one cs:if, and
>>>>>>> the "new" structure for the following cs:else-if). Since we keep the
>>>>>>> "old" structure around, CSL 1.0.1 styles will keep working. Hence the
>>>>>>> change would be backward-compatible, hence the change would qualify
>>>>>>> for a 1.0.2 release.
>>>>>>> 
>>>>>>> Rintze
>>>>>>> 
>>>>>>> ------------------------------------------------------------------------------
>>>>>>> Learn Graph Databases - Download FREE O'Reilly Book
>>>>>>> "Graph Databases" is the definitive new guide to graph databases and
>>>>>>> their applications. This 200-page book is written by three acclaimed
>>>>>>> leaders in the field. The early access version is available now.
>>>>>>> Download your free book today! http://p.sf.net/sfu/neotech_d2d_may
>>>>>>> _______________________________________________
>>>>>>> xbiblio-devel mailing list
>>>>>>> [email protected]
>>>>>>> https://lists.sourceforge.net/lists/listinfo/xbiblio-devel
>>>>>> 
>>>>>> 
>>>>> 
>>>>> --
>>>>> Charles Parnot
>>>>> [email protected]
>>>>> twitter: @cparnot
>>>>> http://mekentosj.com
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> Charles Parnot
>>> [email protected]
>>> twitter: @cparnot
>>> http://mekentosj.com
>>> 
>>> 
>>> 
>> 
>> 
> 

