> For those reasons, the cull function can't work on the output string:
> it needs to analyse the nested structure before collapsing to identify
> "adjacent" punctuation. With content strings, delimiters and affixes
> in the mix, it's pretty hair-raising.

W3C specifications often include pseudo-algorithms that implementations
should follow. Perhaps it would make sense to try and do the same in the
CSL spec?

Regards,
Rob.

On 21 September 2012 10:22, Frank Bennett <[email protected]> wrote:
> On Fri, Sep 21, 2012 at 4:16 PM, Rönkkö Mikko <[email protected]> wrote:
>> Hi
>>
>> Thanks for the response.
>>
>> On Sep 21, 2012, at 0:20 , Frank Bennett wrote:
>>
>>> On Fri, Sep 21, 2012 at 4:47 AM, Rönkkö Mikko <[email protected]> wrote:
>>>> Hi
>>>>
>>>> I decided to develop a simple CSL processor to convert Zotero JSON strings
>>>> to APA citations. The code will be used in ZotPad and after the processor
>>>> works, I will publish the code as a separate project on GitHub.
>>>>
>>>> I am using JSON strings from Zotero server as data and validating the
>>>> output against formatted citations from Zotero server. The citations are
>>>> formatted using the APA style from
>>>> https://github.com/citation-style-language/styles/blob/master/apa.csl using
>>>> the CSL 1.0.1 specification.
>>>>
>>>> I am using the following bibliography item as my test data
>>>>
>>>> Cadogan, J. W., & Lee, N. (Forthcoming). Improper Use of Endogenous
>>>> Formative Variables. <i>Journal of Business Research</i>.
>>>>
>>>> There is one thing that I do not understand. In the APA style (lines
>>>> 429-434) there is a group
>>>>
>>>> <group delimiter=". ">
>>>>   <text macro="author"/>
>>>>   <text macro="issued"/>
>>>>   <text macro="title" prefix=" "/>
>>>>   <text macro="container"/>
>>>> </group>
>>>>
>>>> The macro "author" has a names element with initialize-with=". " and the
>>>> macro "issued" contains a group with prefix " (". Now to my understanding,
>>>> this means that
>>>>
>>>> - The "author" macro will end with ". " [Cadogan, J. W., & Lee, N.]
>>>> - The "issued" macro will start with " (" [ (Forthcoming)]
>>>> - The macros are delimited with ". "
>>>>
>>>> This results in a bibliographic item that starts with
>>>>
>>>> Cadogan, J. W., & Lee, N.. (Forthcoming).
>>>>
>>>> This is obviously not correct. There should not be a double period followed
>>>> by a double space, but I do not understand which part of the formatting
>>>> logic is incorrect.
>>>>
>>>> Mikko
>>>
>>> Mikko,
>>>
>>> Below I've assumed that the output is from your project code. If I
>>> have it backwards, let me know.
>>
>> You are correct.
>>
>> The problem was that my implementation produces incorrect bibliography items
>> even though the implementation follows the CSL specification. (Or a subset
>> of the CSL specification that is sufficient to produce bibliography items
>> in the APA style.) I did not know that strictly following the specification
>> will not result in correct formatting, but that the processor needs to "be smart"
>> about spaces and punctuation. I could not find this in the documentation.
>> But now that I know this, it should not be difficult to fix.
>>
>> I posted my code to https://github.com/mronkko/CSLProcessor
>>
>> At this point the goal is to format single citations and single bibliography
>> items using the APA style. In the future I may make it more generic.
>>
>> Mikko
>
> This may be more distraction than you need at this point, but just in case ...
>
> There is a set of test fixtures covering space-suppression in the
> citeproc-js sources (scroll down to the fixtures prefixed with
> "spaces_"):
>
> https://bitbucket.org/fbennett/citeproc-js/src/5cc7cff350ee/tests/fixtures/local
>
> I didn't put the tests into the main test suite, because the
> discussion I linked above was inconclusive about whether it would be
> appropriate to recognise space-suppression in the official
> specification.
> The main test suite is here:
>
> https://bitbucket.org/bdarcus/citeproc-test
>
> The future of CSL processor testing probably lies in work by Sylvester
> Keil, which is here:
>
> https://github.com/citation-style-language/test-suite
>
> (The repository above hasn't been updated in a while, but Sylvester
> recently indicated that there will be activity there once he has
> reached a milestone in his current work on citeproc-ruby.)
>
> Frank
>
>>
>>>
>>> You have the logic right. That's the literal result you will get from
>>> flattening the structure without anything more:
>>>
>>> [author ending in "."] + ". "{delimiter} + " ("{prefix} + [issued]
>>>
>>> Double punctuation needs to be culled by the processor. It's a little
>>> tricky, since formatting (italics etc) might lie between the two
>>> periods, depending on the style. There is also potential interaction
>>> with quote marks, depending on whether or not the style has
>>> punctuation-in-quotes set true or false. For those reasons, the cull
>>> function can't work on the output string: it needs to analyse the
>>> nested structure before collapsing to identify "adjacent" punctuation.
>>> With content strings, delimiters and affixes in the mix, it's pretty
>>> hair-raising. The citeproc-js code for this is heavily tested and
>>> seems to work quite well, but I would be hard-pressed to explain
>>> exactly how it works.
>>>
>>> Concerning spaces, there was a long discussion a couple of years back
>>> concerning whether extraneous spaces added by affixes should be
>>> considered style bugs:
>>>
>>> http://xbiblio-devel.2463403.n2.nabble.com/how-much-bugged-a-style-may-be-tt5784767.html#none
>>>
>>> That thread does not reflect well on me, I'm afraid. The point made by
>>> Andrea (and, I think, Bruce) is perfectly valid: double-space issues
>>> *can* be eliminated by more careful construction of CSL code, and
>>> should be.
>>> It is also true that masking double spaces in the processor
>>> gives a green light to sloppy coding. That said, the amount of work
>>> required to eliminate all potential extra spaces from the CSL
>>> repository would be pretty staggering. At the end of the day, we're
>>> kind of stuck with this problem.
>>>
>>> Double spaces are hard to catch in the processor for the same reason:
>>> you have to work on the nested structure before it is flattened into
>>> an output string. It's a little simpler because you can assume input
>>> strings will not have leading or trailing spaces; but tracking spaces
>>> across affix and delimiter attributes across multiple nested layers is
>>> still a challenge.
>>>
>>> If you are only going to process one style in one output format and a
>>> single locale, you may be able to fix things up by running a regular
>>> expression over the output string. That wouldn't work as a general
>>> solution, though.
>>>
>>> Sorry for the long response. Hope it helps!
>>>
>>> Frank
>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Everyone hates slow websites. So do we.
>>>> Make your web apps faster with AppDynamics
>>>> Download AppDynamics Lite for free today:
>>>> http://ad.doubleclick.net/clk;258768047;13503038;j?
>>>> http://info.appdynamics.com/FreeJavaPerformanceDownload.html
>>>> _______________________________________________
>>>> xbiblio-devel mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/xbiblio-devel
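
A rough illustration of the two points made in the thread: naive flattening of the group yields the doubled period and space, and a regular expression over the output string can tidy one simple case but misses punctuation separated by formatting. This is only a sketch under stated assumptions, not how citeproc-js or CSLProcessor work. The helper names flatten_group and regex_cleanup are hypothetical, and the rendered strings are assumed from Mikko's description of the APA macros.

```python
import re


def flatten_group(rendered_parts, delimiter):
    """Join the non-empty rendered children of a group with its delimiter.

    Hypothetical helper: no punctuation or space suppression is applied.
    """
    return delimiter.join(part for part in rendered_parts if part)


def regex_cleanup(text):
    """Collapse runs of periods and runs of spaces in a flattened string.

    Hypothetical helper: a single-style workaround, not a general solution
    (it would also damage a legitimate "..." in a title, for example).
    """
    text = re.sub(r"\.{2,}", ".", text)  # ".." -> "."
    text = re.sub(r" {2,}", " ", text)   # "  " -> " "
    return text


# Rendered macro output, assumed from the description in the thread.
parts = [
    "Cadogan, J. W., & Lee, N.",                        # "author" macro
    " (Forthcoming)",                                   # "issued", prefix " ("
    " Improper Use of Endogenous Formative Variables",  # "title", prefix " "
    "<i>Journal of Business Research</i>",              # "container" macro
]

raw = flatten_group(parts, ". ")
clean = regex_cleanup(raw)
print(repr(raw))
print(repr(clean))

# Formatting between the two periods hides them from the regular expression,
# which is why a cull has to look at the nested structure instead.
tricky = flatten_group(["<i>A Title Ending in a Period.</i>", "Next part"], ". ")
print(repr(regex_cleanup(tricky)))
```

The last call leaves ".</i>. " untouched, which is the case Frank describes where italics lie between the two periods.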
