Frank recently added some tests to catalog the current citeproc-js behavior when it comes to punctuation suppression:

https://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/punctuation_FullMontyPlain.txt
https://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/punctuation_FullMontyQuotesIn.txt
https://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/punctuation_FullMontyQuotesOut.txt
https://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/punctuation_FullMontyField.txt

They don't cover suppression of duplicated spaces (as discussed below, there are already older "spaces_..." unit tests), and they only cover punctuation added by prefixes and suffixes, plus punctuation that is part of the variable field content (punctuation added by group delimiters, for example, isn't tested).

I tried to pull these new results together in a spreadsheet:

https://docs.google.com/spreadsheet/ccc?key=0AoKgWUjfrk4_dE9vNmVQeElmQkhPbzFKd1ZVWUticVE&usp=sharing

With this as a starting point, I hope we can agree on specific rules for punctuation suppression so that we can include some guidance on this topic in the CSL specification. These rules will likely have to be very precise, and take into account the origin of the punctuation (variable field content, affixes, group delimiters, group affixes, etc.).

Sebastian, could you remind me whether CMoS has any clear rules on punctuation suppression?

Rintze

On Fri, Sep 21, 2012 at 5:22 AM, Frank Bennett <[email protected]> wrote:
> On Fri, Sep 21, 2012 at 4:16 PM, Rönkkö Mikko <[email protected]> wrote:
>> The problem was that my implementation produces incorrect bibliography items
>> even though the implementation follows the CSL specification. (Or a subset
>> of the CSL specification, that is sufficient to produce bibliography items
>> in the APA style). I did not know that strictly following the specification
>> will not result in correct formatting, but the processor needs to "be smart"
>> about spaces and punctuation. I could not find this in the documentation.
>> But now that I know this, it should not be difficult to fix.
>>
>> I posted my code to https://github.com/mronkko/CSLProcessor
>>
>> At this point the goal is to format single citations and single bibliography
>> items using the APA style. In the future I may make it more generic.
>>
>> Mikko
>
> This may be more distraction than you need at this point, but just in case ...
>
> There is a set of test fixtures covering space-suppression in the
> citeproc-js sources (scroll down to the fixtures prefixed with
> "spaces_"):
>
>
> https://bitbucket.org/fbennett/citeproc-js/src/5cc7cff350ee/tests/fixtures/local
>
> I didn't put the tests into the main test suite, because the
> discussion I linked above was inconclusive about whether it would be
> appropriate to recognise space-suppression in the official
> specification. The main test suite is here:
>
> https://bitbucket.org/bdarcus/citeproc-test
>
> The future of CSL processor testing probably lies in work by Sylvester
> Keil, which is here:
>
> https://github.com/citation-style-language/test-suite
>
> (The repository above hasn't been updated in awhile, but Sylvester
> recently indicated that there will be activity there once he has
> reached a milestone in his current work on citeproc-ruby.)
>
> Frank
>
>>
>>
>>>
>>> You have the logic right. That's the literal result you will get from
>>> flattening the structure without anything more:
>>>
>>> [author ending in "."] + ". "{delimiter} + " ("{prefix} + [issued]
>>>
>>> Double punctuation needs to be culled by the processor. It's a little
>>> tricky, since formatting (italics etc) might lie between the two
>>> periods, depending on the style. There is also potential interaction
>>> with quote marks, depending on whether or not the style has
>>> punctuation-in-quotes set true or false. For those reasons, the cull
>>> function can't work on the output string: it needs to analyse the
>>> nested structure before collapsing to identify "adjacent" punctuation.
>>> With content strings, delimiters and affixes in the mix, it's pretty
>>> hair-raising. The citeproc-js code for this is heavily tested and
>>> seems to work quite well, but I would be hard-pressed to explain
>>> exactly how it works.
>>>
>>> Concerning spaces, there was a long discussion a couple of years back
>>> concerning whether extraneous spaces added by affixes should be
>>> considered style bugs:
>>>
>>>
>>> http://xbiblio-devel.2463403.n2.nabble.com/how-much-bugged-a-style-may-be-tt5784767.html#none
>>>
>>> That thread does not reflect well on me, I'm afraid. The point made by
>>> Andrea (and, I think, Bruce) is perfectly valid: double-space issues
>>> *can* be eliminated by more careful construction of CSL code, and
>>> should be. It is also true that masking double spaces in the processor
>>> gives a green light to sloppy coding. That said, the amount of work
>>> required to eliminate all potential extra spaces from the CSL
>>> repository would be pretty staggering. At the end of the day, we're
>>> kind of stuck with this problem.
>>>
>>> Double spaces are hard to catch in the processor for the same reason:
>>> you have to work on the nested structure before it is flattened into
>>> an output string. It's a little simpler because you can assume input
>>> strings will not have leading or trailing spaces; but tracking spaces
>>> across affix and delimiter attributes across multiple nested layers is
>>> still a challenge.
>>>
>>> If you are only going to process one style in one output format and a
>>> single locale, you may be able to fix things up by running a regular
>>> expression over the output string. That wouldn't work as a general
>>> solution, though.
>>>
>>> Sorry for the long response. Hope it helps!
>>>
>>> Frank
>>>
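
For anyone who wants to see the quoted example concretely, here is a minimal Python sketch of the literal flattening Frank describes, followed by the string-level regex clean-up he mentions for the one-style, one-format, one-locale case. The function names, the sample author and date, and the closing ")" suffix are assumptions made for illustration; none of this is taken from citeproc-js or from Mikko's code.

```python
import re


def flatten_literal(author: str, issued: str) -> str:
    """Concatenate the pieces with no punctuation suppression at all."""
    delimiter = ". "   # group delimiter from the quoted example
    prefix = " ("      # prefix on the date from the quoted example
    suffix = ")"       # assumed closing suffix; not given in the quoted example
    return author + delimiter + prefix + issued + suffix


def regex_cleanup(text: str) -> str:
    """Naive string-level fix-up; only plausible for one style, format and locale."""
    text = re.sub(r"\.{2,}", ".", text)   # collapse runs of periods (would also eat a real "...")
    text = re.sub(r" {2,}", " ", text)    # collapse runs of spaces
    return text


raw = flatten_literal("Doe, J.", "2012")
print(repr(raw))                 # 'Doe, J..  (2012)'
print(repr(regex_cleanup(raw)))  # 'Doe, J. (2012)'
```

Note that this regex only sees periods that are directly adjacent in the output string. If the output format puts markup between them (for example "<i>Title.</i>."), the pattern no longer matches, which is the reason given in the quoted message for doing the analysis on the nested structure instead.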
_______________________________________________
xbiblio-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xbiblio-devel
