> For those reasons, the cull function can't work on the output string:
> it needs to analyse the nested structure before collapsing to identify
> "adjacent" punctuation. With content strings, delimiters and affixes in
> the mix, it's pretty hair-raising.

W3C specifications often include pseudo-algorithms that
implementations should follow.
Perhaps it would make sense to try and do the same in the CSL spec?
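
To make the idea concrete, here is a rough sketch of what such a
pseudo-algorithm might look like, written in Python. It is only an
illustration under simplifying assumptions, not what citeproc-js does:
it works on a flat list of fragments (content strings, delimiters and
affixes) before they are concatenated, and it ignores formatting,
nesting and quote marks entirely. The name join_fragments is made up
for this example.

```python
# Characters that should not be doubled where two fragments meet.
SUPPRESSIBLE = ". "


def join_fragments(fragments):
    """Concatenate fragments, dropping a repeated "." or " " at each seam.

    Only the boundary between fragments is examined, so punctuation
    inside a content string (for example an ellipsis) is left alone.
    """
    out = ""
    for frag in fragments:
        # Strip leading characters that would repeat the period or
        # space that the output built so far already ends with.
        while frag and out and frag[0] in SUPPRESSIBLE and frag[0] == out[-1]:
            frag = frag[1:]
        out += frag
    return out


# Mikko's example: author ending in ".", the ". " group delimiter,
# then the " (" prefix of the "issued" macro.
fragments = ["Cadogan, J. W., & Lee, N.", ". ", " (", "Forthcoming", ")"]

naive = "".join(fragments)
culled = join_fragments(fragments)
print(naive)
print(culled)
```

Plain concatenation reproduces the double period and double space from
Mikko's message, while the seam-aware join yields "Cadogan, J. W., &
Lee, N. (Forthcoming)". A real processor would have to do the
equivalent on the nested structure, as Frank describes.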

Regards,
Rob.

On 21 September 2012 10:22, Frank Bennett <[email protected]> wrote:
> On Fri, Sep 21, 2012 at 4:16 PM, Rönkkö Mikko <[email protected]> wrote:
>> Hi
>>
>> Thanks for the response.
>>
>> On Sep 21, 2012, at 0:20 , Frank Bennett wrote:
>>
>>> On Fri, Sep 21, 2012 at 4:47 AM, Rönkkö Mikko <[email protected]> wrote:
>>>> Hi
>>>>
>>>> I decided to develop a simple CSL processor to convert Zotero JSON strings
>>>> to APA citations. The code will be used in ZotPad and after the processor
>>>> works, I will publish the code as a separate project on GitHub.
>>>>
>>>> I am using JSON strings from the Zotero server as data and validating the
>>>> output against formatted citations from the Zotero server. The citations are
>>>> formatted using the APA style from
>>>> https://github.com/citation-style-language/styles/blob/master/apa.csl using
>>>> the CSL 1.0.1 specification.
>>>>
>>>> I am using the following bibliography item as my test data
>>>>
>>>> Cadogan, J. W., & Lee, N. (Forthcoming). Improper Use of Endogenous
>>>> Formative Variables. <i>Journal of Business Research</i>.
>>>>
>>>> There is one thing that I do not understand. In the APA style (lines
>>>> 429-434) there is a group
>>>>
>>>>        <group delimiter=". ">
>>>>          <text macro="author"/>
>>>>          <text macro="issued"/>
>>>>          <text macro="title" prefix=" "/>
>>>>          <text macro="container"/>
>>>>        </group>
>>>>
>>>>
>>>> The macro "author" has a names element with initialize-with=". " and the
>>>> macro "issued" contains a group with prefix " (". Now to my understanding,
>>>> this means that
>>>>
>>>> - The "author" macro will end with ". "    [Cadogan, J. W., & Lee, N.]
>>>> - The "issued" macro will start with " ("   [ (Forthcoming)]
>>>> - The macros are delimited with ". "
>>>>
>>>> This results in a bibliographic item that starts by
>>>>
>>>> Cadogan, J. W., & Lee, N..  (Forthcoming).
>>>>
>>>> This is obviously not correct. There should not be a double period followed
>>>> by a double space, but I do not understand which part of the formatting
>>>> logic is incorrect.
>>>>
>>>> Mikko
>>>
>>> Mikko,
>>>
>>> Below I've assumed that the output is from your project code. If I
>>> have it backwards, let me know.
>>
>> You are correct.
>>
>> The problem was that my implementation produces incorrect bibliography items
>> even though the implementation follows the CSL specification (or rather a
>> subset of the CSL specification that is sufficient to produce bibliography
>> items in the APA style). I did not know that strictly following the
>> specification will not result in correct formatting, and that the processor
>> needs to "be smart" about spaces and punctuation. I could not find this in
>> the documentation. But now that I know this, it should not be difficult to fix.
>>
>> I posted my code to https://github.com/mronkko/CSLProcessor
>>
>> At this point the goal is to format single citations and single bibliography 
>> items using the APA style. In the future I may make it more generic.
>>
>> Mikko
>
> This may be more distraction than you need at this point, but just in case ...
>
> There is a set of test fixtures covering space-suppression in the
> citeproc-js sources (scroll down to the fixtures prefixed with
> "spaces_"):
>
>   https://bitbucket.org/fbennett/citeproc-js/src/5cc7cff350ee/tests/fixtures/local
>
> I didn't put the tests into the main test suite, because the
> discussion I linked above was inconclusive about whether it would be
> appropriate to recognise space-suppression in the official
> specification. The main test suite is here:
>
>   https://bitbucket.org/bdarcus/citeproc-test
>
> The future of CSL processor testing probably lies in work by Sylvester
> Keil, which is here:
>
>   https://github.com/citation-style-language/test-suite
>
> (The repository above hasn't been updated in a while, but Sylvester
> recently indicated that there will be activity there once he has
> reached a milestone in his current work on citeproc-ruby.)
>
> Frank
>
>>
>>
>>>
>>> You have the logic right. That's the literal result you will get from
>>> flattening the structure without anything more:
>>>
>>>  [author ending in "."] + ". "{delimiter} + " ("{prefix} + [issued]
>>>
>>> Double punctuation needs to be culled by the processor. It's a little
>>> tricky, since formatting (italics etc) might lie between the two
>>> periods, depending on the style. There is also potential interaction
>>> with quote marks, depending on whether or not the style has
>>> punctuation-in-quotes set true or false. For those reasons, the cull
>>> function can't work on the output string: it needs to analyse the
>>> nested structure before collapsing to identify "adjacent" punctuation.
>>> With content strings, delimiters and affixes in the mix, it's pretty
>>> hair-raising. The citeproc-js code for this is heavily tested and
>>> seems to work quite well, but I would be hard-pressed to explain
>>> exactly how it works.
>>>
>>> Concerning spaces, there was a long discussion a couple of years back
>>> concerning whether extraneous spaces added by affixes should be
>>> considered style bugs:
>>>
>>>  http://xbiblio-devel.2463403.n2.nabble.com/how-much-bugged-a-style-may-be-tt5784767.html#none
>>>
>>> That thread does not reflect well on me, I'm afraid. The point made by
>>> Andrea (and, I think, Bruce) is perfectly valid: double-space issues
>>> *can* be eliminated by more careful construction of CSL code, and
>>> should be. It is also true that masking double spaces in the processor
>>> gives a green light to sloppy coding. That said, the amount of work
>>> required to eliminate all potential extra spaces from the CSL
>>> repository would be pretty staggering. At the end of the day, we're
>>> kind of stuck with this problem.
>>>
>>> Double spaces are hard to catch in the processor for the same reason:
>>> you have to work on the nested structure before it is flattened into
>>> an output string. It's a little simpler because you can assume input
>>> strings will not have leading or trailing spaces; but tracking spaces
>>> across affix and delimiter attributes across multiple nested layers is
>>> still a challenge.
>>>
>>> If you are only going to process one style in one output format and a
>>> single locale, you may be able to fix things up by running a regular
>>> expression over the output string. That wouldn't work as a general
>>> solution, though.
>>>
>>> Sorry for the long response. Hope it helps!
>>>
>>> Frank
>>>
>>>
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Everyone hates slow websites. So do we.
>>>> Make your web apps faster with AppDynamics
>>>> Download AppDynamics Lite for free today:
>>>> http://ad.doubleclick.net/clk;258768047;13503038;j?
>>>> http://info.appdynamics.com/FreeJavaPerformanceDownload.html
>>>> _______________________________________________
>>>> xbiblio-devel mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/xbiblio-devel
>>>>
>>>
>>
>>
>

