A shorter-term goal would be to incorporate this into Mendeley's
CSL editor. It would probably be more user-friendly if users no longer
had to reformat a fixed, predefined set of item metadata (as is currently
required for http://editor.citationstyles.org/searchByExample/ ), but
could instead just copy and paste some references that already exist in
the desired format and have the tool show CSL styles that give
similar output.

Rintze

On Sun, May 18, 2014 at 8:27 PM, Bruce D'Arcus <bdar...@gmail.com> wrote:
> So does this get us one step closer to the magic style finder and generator?
>
> Bruce
>
> On May 17, 2014 4:07 PM, "Sylvester Keil" <sylves...@keil.or.at> wrote:
>>
>> Dear all,
>>
>> I've recently completed a long-term project of mine by writing a web
>> application that exposes the AnyStyle parser library for ML-powered
>> parsing of bibliographies. The web application (and API) is available at
>> http://anystyle.io (SSL available too) and is very exciting (if I may
>> say so myself) for mainly two reasons:
>>
>> 1. The parsing process is split into two steps, showing you the output
>> of the ML-driven step in an editor that allows you to make changes to
>> the parse result.
>>
>> 2. These changes can be recorded and used directly to train the ML
>> model.
>>
>> This is exciting, because so far it required a lot of effort and
>> know-how to prepare training data. Now there is a single public model
>> that everyone can help improve. Obviously, this part is still very
>> experimental; it will be interesting to see whether the model starts to
>> deteriorate at some point if fed too much training data. Meanwhile, we
>> now have a publicly available parser that should be fairly easy to train
>> to recognize, for example, new styles or languages. Please do take a look
>> if you're interested! I imagine most of you will be interested in the
>> 'CiteProc' output format (the 'JSON' format is less interesting, because
>> it does not apply as much post-processing to individual fields).
>>
>> The parser is also accessible via a JSON API; I wrote a very quick
>> prototype for a style-predictor (Rintze's idea!) similar to the one in
>> the CSL editor. You give the predictor a reference; the reference is
>> parsed and the parsed result is rendered in all independent CSL
>> styles. These formatted references are then compared with the original
>> one using the Levenshtein distance, and the best matches are reported.
>> It's just a quick prototype; you can take a look at it here:
>>
>> https://gist.github.com/inukshuk/f1d47aeab1f778bca8ce
>>
>> The parsing is very fast, but the rendering using citeproc-ruby takes
>> quite some time :) But since the parsing API is so simple, it should be
>> very easy to recast this example in JavaScript, Haskell or Python.
>>
>> I thought this might be of interest to some of you on this list. Just
>> let me know if you have any questions!
>>
>> Sylvester
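
For readers who want to try the comparison step of the style-predictor
described in the quoted message, here is a minimal sketch in Python. It
covers only the last part (score each rendered reference against the
original by Levenshtein distance and report the best matches). It does not
call the AnyStyle API or a CSL processor: the parsing and rendering steps
are assumed to have happened already, and the function names
(levenshtein, rank_styles) and the style labels in the usage example are
hypothetical, not taken from the linked gist.

```python
from typing import Dict, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b; insert, delete and substitute cost 1 each."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                # delete ca
                current[j - 1] + 1,             # insert cb
                previous[j - 1] + (ca != cb),   # substitute, or free if equal
            ))
        previous = current
    return previous[-1]


def rank_styles(original: str,
                rendered: Dict[str, str],
                top: int = 3) -> List[Tuple[str, int]]:
    """Return the `top` (style, distance) pairs closest to `original`.

    `rendered` maps a style name to the reference as formatted in that
    style; producing it (parse, then render per style) is assumed to be
    done elsewhere.
    """
    scored = [(style, levenshtein(original, text))
              for style, text in rendered.items()]
    # Sort by distance first, then by name so ties are reported stably.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:top]


if __name__ == "__main__":
    # Made-up example data: three hypothetical styles rendering one reference.
    original = "Doe, J. (2014). A title. Journal, 1(2), 3-4."
    rendered = {
        "hypothetical-author-date": "Doe, J. (2014). A title. Journal, 1(2), 3-4.",
        "hypothetical-numeric": "[1] J. Doe, A title, Journal 1 (2014) 3-4.",
        "hypothetical-note": "J. Doe, 'A title', Journal 1, no. 2 (2014): 3-4.",
    }
    for style, distance in rank_styles(original, rendered):
        print(distance, style)
```

The quadratic cost of the distance computation is small for single
references; as the quoted message notes, rendering in every style is the
slow part, so any speed-up would have to come from there.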

_______________________________________________
xbiblio-devel mailing list
xbiblio-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xbiblio-devel
