On Fri, Apr 6, 2012 at 8:56 PM, Charles Parnot <[email protected]> wrote:
> Hi all,
>
> It's not a subject where we get too many complaints, but on occasion. In
> Papers2, we have hard-coded a number of name particles and have tried to
> decide what rule to apply to each (dropping or non-dropping) based on usage.
> I realize the rule can change for the same particle, as some particles are
> the same in different languages, and even worse, the rules can differ when
> used in different countries. In any case, I was curious to hear your feedback
> on that topic. Please let me know if it's been beaten to death in a previous
> thread. I have seen a few threads in searching the mailing list, but no
> extensive discussion.

citeproc-js relies on the input for the semantic dropping/non-dropping distinction. With two-field input, a particle that precedes the "family" name element is non-dropping, and one that is attached to the "given" name with a comma is dropping. Some parsing clutter is used to cover special cases, such as name suffixes (Jr, III), particles that form a fixed part of the family name, and a few cases that have come up where a particle is capitalized in the input. Apart from those bits, which are essentially workarounds, we don't try to interpret what a given fragment means in its own right.

>
> I am listing here all the particles Papers2 detect. The particles are
> decomposed in the dropping part + non-dropping part (either can be empty of
> course). Note we also correct the capitalization.
>
> I think we have the 'al', 'el', wrong.
>
>
> // spain(??) / arabic
> al          al
> dos         dos
> el          el
> de las      de Las
> lo          lo
> les         les
>
> // italy(??)
> il          il
> del         del
> dela        dela
> della       della
> dello       dello
> di          Di
> da          Da
> do          Do
> des         Des
> lou         Lou
> pietro      Pietro
>
> // france
> de          de
> de la       de La
> du          du
> d'          d'
> le          Le
> la          La
> l'          L'
> saint       Saint
> sainte      Sainte
> st.         Saint
> ste.        Sainte
>
> // holland
> van         van
> van de      vande
> van der     vander
> van den     vanden
> vander      vander
> v.d.        vander
> vd          vander
> van het     van het
> ver         ver
> ten         ten
> ter         ter
> te          te
> op de       op de
> in de       in de
> in 't       in 't
> in het      in het
> uit de      uit de
> uit den     uit den
>
> // germany / austria
> von         von
> von der     von der
> von dem     von dem
> von zu      von zu
> v.          von
> v           von
> vom         vom
> das         das
> zum         zum
> zur         zur
> den         den
> der         der
> des         des
> auf den     auf den
>
> // scotland(?)
> mac         Mac
>
>
> // arabic
> ben         Ben
> bin         Bin
> sen         sen
>
> // what to do with these??
> // mc Mc
> // o' O'
> // au
> // af
>
>
>
> ------------------------------------------------------------------------------
> For Developers, A Lot Can Happen In A Second.
> Boundary is the first to Know...and Tell You.
> Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
> http://p.sf.net/sfu/Boundary-d2dvs2
> _______________________________________________
> xbiblio-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/xbiblio-devel
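
For illustration only, the two-field convention described in the reply (a particle before the family name is non-dropping, one attached to the given name with a comma is dropping) could be sketched roughly as below. This is not citeproc-js code: `split_name`, the `SUFFIXES` set, and the rule "leading all-lowercase words are particles" are assumptions made for the sketch, and the real parser handles more special cases. Only the output keys follow the CSL name-part names.

```python
# Hypothetical sketch of the two-field convention; not the citeproc-js parser.

# Assumed suffix list, used only to tell "Jr"/"III" apart from particles.
SUFFIXES = {"jr", "jr.", "sr", "sr.", "ii", "iii", "iv"}


def split_name(family, given):
    """Split two-field name input into CSL-style name parts."""
    parts = {"family": family.strip(), "given": given.strip()}

    # Leading all-lowercase words in the family field are treated as a
    # non-dropping particle; the last word is always kept as the family name.
    words = parts["family"].split()
    i = 0
    while i < len(words) - 1 and words[i] == words[i].lower() and words[i][0].isalpha():
        i += 1
    if i:
        parts["non-dropping-particle"] = " ".join(words[:i])
        parts["family"] = " ".join(words[i:])

    # Text after a comma in the given field is a suffix if it is in the
    # assumed list, otherwise it is treated as a dropping particle.
    if "," in parts["given"]:
        head, tail = [s.strip() for s in parts["given"].split(",", 1)]
        parts["given"] = head
        if tail.lower() in SUFFIXES:
            parts["suffix"] = tail
        elif tail:
            parts["dropping-particle"] = tail

    return parts


if __name__ == "__main__":
    print(split_name("van Gogh", "Vincent"))
    print(split_name("Humboldt", "Alexander, von"))
    print(split_name("King", "Martin Luther, Jr."))
```

Note that the sketch never looks a particle up in a table: it takes the dropping/non-dropping decision from where the user placed the particle, which is the point of the reply. A capitalized particle in the input (for example "Van Gogh") is left inside the family name here, whereas the reply mentions that citeproc-js has workarounds for some such cases.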
