Hi Rob,

While the idea of pseudo-algorithms is attractive, I very much like the idea of fixtures being the specification instead. In the case of bibliographic software, this seems like a really good fit, as it's easy to discuss the output and compare it to what's actually in books and articles (or to common sense). The fixtures can be read by people who are not programmers, and this is a big plus as well: you can show them to, and discuss them with, non-technical people who know the field. Discussing the "algorithms" to get to the actual results is not as useful IMO. Or at least it should not come first, and should only be formalized when enough examples of the issue at hand have been produced. The disambiguation process is another one of those hair-raising issues.

My 2 cents :-)

Charles

On Sep 21, 2012, at 11:44 AM, Robert Knight <[email protected]> wrote:

>> For those reasons, the cull function can't work on the output string:
>> it needs to analyse the nested structure before collapsing to identify
>> "adjacent" punctuation. With content strings, delimiters and affixes in
>> the mix, it's pretty hair-raising.
>
> W3C specifications often include pseudo-algorithms that
> implementations should follow.
> Perhaps it would make sense to try and do the same in the CSL spec?
>
> Regards,
> Rob.
>
> On 21 September 2012 10:22, Frank Bennett <[email protected]> wrote:
>> On Fri, Sep 21, 2012 at 4:16 PM, Rönkkö Mikko <[email protected]> wrote:
>>> Hi
>>>
>>> Thanks for the response.
>>>
>>> On Sep 21, 2012, at 0:20 , Frank Bennett wrote:
>>>
>>>> On Fri, Sep 21, 2012 at 4:47 AM, Rönkkö Mikko <[email protected]>
>>>> wrote:
>>>>> Hi
>>>>>
>>>>> I decided to develop a simple CSL processor to convert Zotero JSON strings
>>>>> to APA citations. The code will be used in ZotPad and after the processor
>>>>> works, I will publish the code as a separate project on GitHub.
>>>>>
>>>>> I am using JSON strings from Zotero server as data and validating the
>>>>> output against formatted citations from Zotero server. The citations are
>>>>> formatted using the APA style from
>>>>> https://github.com/citation-style-language/styles/blob/master/apa.csl
>>>>> using the CSL 1.0.1 specification.
>>>>>
>>>>> I am using the following bibliography item as my test data
>>>>>
>>>>> Cadogan, J. W., & Lee, N. (Forthcoming). Improper Use of Endogenous
>>>>> Formative Variables. <i>Journal of Business Research</i>.
>>>>>
>>>>> There is one thing that I do not understand. In the APA style (lines
>>>>> 429-434) there is a group
>>>>>
>>>>> <group delimiter=". ">
>>>>>   <text macro="author"/>
>>>>>   <text macro="issued"/>
>>>>>   <text macro="title" prefix=" "/>
>>>>>   <text macro="container"/>
>>>>> </group>
>>>>>
>>>>>
>>>>> The macro "author" has a names element with initialize-with=". " and the
>>>>> macro "issued" contains a group with prefix " (". Now to my understanding,
>>>>> this means that
>>>>>
>>>>> - The "author" macro will end with ". " [Cadogan, J. W., & Lee, N.]
>>>>> - The "issued" macro will start with " (" [ (Forthcoming)]
>>>>> - The macros are delimited with ". "
>>>>>
>>>>> This results in a bibliographic item that starts with
>>>>>
>>>>> Cadogan, J. W., & Lee, N..  (Forthcoming).
>>>>>
>>>>> This is obviously not correct. There should not be a double period
>>>>> followed by a double space, but I do not understand which part of the
>>>>> formatting logic is incorrect.
>>>>>
>>>>> Mikko
>>>>
>>>> Mikko,
>>>>
>>>> Below I've assumed that the output is from your project code. If I
>>>> have it backwards, let me know.
>>>
>>> You are correct.
>>>
>>> The problem was that my implementation produces incorrect bibliography
>>> items even though the implementation follows the CSL specification. (Or a
>>> subset of the CSL specification that is sufficient to produce bibliography
>>> items in the APA style.) I did not know that strictly following the
>>> specification would not result in correct formatting, and that the
>>> processor needs to "be smart" about spaces and punctuation. I could not
>>> find this in the documentation. But now that I know this, it should not be
>>> difficult to fix.
>>>
>>> I posted my code to https://github.com/mronkko/CSLProcessor
>>>
>>> At this point the goal is to format single citations and single
>>> bibliography items using the APA style. In the future I may make it more
>>> generic.
>>>
>>> Mikko
>>
>> This may be more distraction than you need at this point, but just in case
>> ...
>>
>> There is a set of test fixtures covering space-suppression in the
>> citeproc-js sources (scroll down to the fixtures prefixed with
>> "spaces_"):
>>
>> https://bitbucket.org/fbennett/citeproc-js/src/5cc7cff350ee/tests/fixtures/local
>>
>> I didn't put the tests into the main test suite, because the
>> discussion I linked above was inconclusive about whether it would be
>> appropriate to recognise space-suppression in the official
>> specification. The main test suite is here:
>>
>> https://bitbucket.org/bdarcus/citeproc-test
>>
>> The future of CSL processor testing probably lies in work by Sylvester
>> Keil, which is here:
>>
>> https://github.com/citation-style-language/test-suite
>>
>> (The repository above hasn't been updated in a while, but Sylvester
>> recently indicated that there will be activity there once he has
>> reached a milestone in his current work on citeproc-ruby.)
>>
>> Frank
>>
>>>
>>>
>>>>
>>>> You have the logic right. That's the literal result you will get from
>>>> flattening the structure without anything more:
>>>>
>>>> [author ending in "."] + ". "{delimiter} + " ("{prefix} + [issued]
>>>>
>>>> Double punctuation needs to be culled by the processor. It's a little
>>>> tricky, since formatting (italics etc) might lie between the two
>>>> periods, depending on the style. There is also potential interaction
>>>> with quote marks, depending on whether or not the style has
>>>> punctuation-in-quotes set true or false. For those reasons, the cull
>>>> function can't work on the output string: it needs to analyse the
>>>> nested structure before collapsing to identify "adjacent" punctuation.
>>>> With content strings, delimiters and affixes in the mix, it's pretty
>>>> hair-raising. The citeproc-js code for this is heavily tested and
>>>> seems to work quite well, but I would be hard-pressed to explain
>>>> exactly how it works.
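
The idea of culling on the nested structure, before it is flattened to a string, can be illustrated with a minimal sketch. This is not how citeproc-js does it. The Node class, the runs/cull/render helpers, and the simplified entry below are all hypothetical, and the sketch assumes that a node's affixes sit outside its own italics and that only "." and " " are culled. Quote marks and punctuation-in-quotes are not handled at all.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical rendering node: either a text leaf or a group of children."""
    text: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    prefix: str = ""
    suffix: str = ""
    delimiter: str = ""
    italic: bool = False


Run = Tuple[str, bool]  # (text, is_italic)


def runs(node: Node, inherited_italic: bool = False) -> List[Run]:
    """Linearise the tree into (text, italic) runs without emitting any markup."""
    italic = inherited_italic or node.italic
    body: List[Run] = []
    if node.text is not None:
        if node.text:
            body.append((node.text, italic))
    else:
        for child in node.children:
            child_runs = runs(child, italic)
            if not child_runs:
                continue  # empty children contribute no delimiter
            if body and node.delimiter:
                body.append((node.delimiter, italic))
            body.extend(child_runs)
    if not body:
        return []
    out: List[Run] = []
    if node.prefix:
        # Assumption of this sketch: affixes are not italicised by the node itself.
        out.append((node.prefix, inherited_italic))
    out.extend(body)
    if node.suffix:
        out.append((node.suffix, inherited_italic))
    return out


CULLABLE = ". "  # characters that must not repeat across a run boundary


def cull(run_list: List[Run]) -> List[Run]:
    """Drop a leading "." or " " that repeats the last character already kept.

    Only run boundaries are examined, so an ellipsis inside a content string
    is left alone, and formatting between two runs does not hide the repeat.
    """
    out: List[Run] = []
    last = ""
    for text, italic in run_list:
        while text and text[0] in CULLABLE and text[0] == last:
            text = text[1:]
        if text:
            out.append((text, italic))
            last = text[-1]
    return out


def render(run_list: List[Run]) -> str:
    """Flatten runs to an HTML-like string; markup is added only at this step."""
    return "".join(f"<i>{text}</i>" if italic else text for text, italic in run_list)


# Simplified stand-in for the APA group discussed in this thread.
entry = Node(
    delimiter=". ",
    suffix=".",
    children=[
        Node(text="Cadogan, J. W., & Lee, N."),
        Node(prefix=" (", suffix=")", children=[Node(text="Forthcoming")]),
        Node(text="Improper Use of Endogenous Formative Variables", prefix=" "),
        Node(text="Journal of Business Research", italic=True),
    ],
)

# Made-up case where italics lie between the two periods.
abbreviated = Node(suffix=".", children=[Node(text="Proc. Natl. Acad. Sci.", italic=True)])

print(render(runs(entry)))        # literal flattening, with the doubled period and spaces
print(render(cull(runs(entry))))  # culled before markup is applied
print(render(cull(runs(abbreviated))))  # -> <i>Proc. Natl. Acad. Sci.</i>
```

Because the repeat is detected on the runs rather than on the rendered string, the closing italic tag in the last case never gets in the way. A real processor would also need to handle other punctuation marks, quotes, and locale rules.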
>>>>
>>>> Concerning spaces, there was a long discussion a couple of years back
>>>> concerning whether extraneous spaces added by affixes should be
>>>> considered style bugs:
>>>>
>>>> http://xbiblio-devel.2463403.n2.nabble.com/how-much-bugged-a-style-may-be-tt5784767.html#none
>>>>
>>>> That thread does not reflect well on me, I'm afraid. The point made by
>>>> Andrea (and, I think, Bruce) is perfectly valid: double-space issues
>>>> *can* be eliminated by more careful construction of CSL code, and
>>>> should be. It is also true that masking double spaces in the processor
>>>> gives a green light to sloppy coding. That said, the amount of work
>>>> required to eliminate all potential extra spaces from the CSL
>>>> repository would be pretty staggering. At the end of the day, we're
>>>> kind of stuck with this problem.
>>>>
>>>> Double spaces are hard to catch in the processor for the same reason:
>>>> you have to work on the nested structure before it is flattened into
>>>> an output string. It's a little simpler because you can assume input
>>>> strings will not have leading or trailing spaces; but tracking spaces
>>>> across affix and delimiter attributes across multiple nested layers is
>>>> still a challenge.
>>>>
>>>> If you are only going to process one style in one output format and a
>>>> single locale, you may be able to fix things up by running a regular
>>>> expression over the output string. That wouldn't work as a general
>>>> solution, though.
>>>>
>>>> Sorry for the long response. Hope it helps!
>>>>
>>>> Frank
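
For the narrow case Frank describes (one style, one output format, one locale), the regular-expression fix-up could look roughly like the sketch below. The function name tidy is made up, and the patterns assume HTML-like output in which <i> and </i> are the only markup that can sit between two periods.

```python
import re


def tidy(output: str) -> str:
    """Post-process one rendered entry; hypothetical helper for a single style."""
    # Drop a period that directly follows another one, allowing italic tags
    # in between. Note that this would also damage a genuine ellipsis.
    output = re.sub(r"\.((?:</?i>)*)\.", r".\1", output)
    # Collapse runs of spaces into a single space.
    output = re.sub(r" {2,}", " ", output)
    return output


print(tidy("Cadogan, J. W., & Lee, N..  (Forthcoming)."))
# -> Cadogan, J. W., & Lee, N. (Forthcoming).
print(tidy("<i>Proc. Natl. Acad. Sci.</i>."))
# -> <i>Proc. Natl. Acad. Sci.</i>
```

The first pattern has to know the markup of the output format, which is one reason this approach does not generalise: a different output format, other kinds of markup between the periods, or a style whose content legitimately contains repeated periods would each need a different expression.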

--
Charles Parnot
[email protected]
twitter: @cparnot
http://mekentosj.com

_______________________________________________
xbiblio-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xbiblio-devel
