Francesco Mondora wrote:
Hello, I'm building an Italian recipe XML db full of
20000 recipes.
Each recipe is structured this way:

<ricetta>
        <titolo></titolo>
        <tipo></tipo>

</ricetta>

I've created a collection and imported my XML files.
Now I'm running my XPath query to look for a particular
type of dish:

xindice xpath_query -c /cucina/ricette -q
"/ricetta[contains(tipo, 'Pollame')]/titolo"

This query takes 1m15s.

I've tried to set up indexes on the elements using the
command:

xindiceadmin ai -c /cucina/ricette -n ricette2 -p '*'

The index creation time is about 3 sec.

And the query is still slow.

Yep. With contains() you are running a full text search, which is painfully slow. This is a known issue, and some work is underway to speed things up, but don't hold your breath: it won't be fast.


Anyway, you can quite easily improve your performance by moving your "data oriented" stuff to attributes instead of element content. If you use a different markup, say:

<ricetta titolo="Pollo alla cacciatora" tipo="Pollame">
        <ingredienti>
                <ingrediente nome="pollo" principale="true"/>
                <ingrediente nome="cipolla"/>
        </ingredienti>
        <persone numero="6"/>
        <note>(blah...)</note>
[...]

you will speed up the queries a lot, with the added benefit of a more structured XML that will likely be faster to process, say, using XSLT.
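
If it helps, converting the existing files could be sketched along these lines in Python. This is only a rough sketch under a few assumptions, not something Xindice provides: the function name to_attribute_markup is made up for illustration, and it assumes <titolo> and <tipo> are direct children of <ricetta>, as in your message. Any other child elements are carried over unchanged.

```python
import xml.etree.ElementTree as ET

# Elements whose text should become attributes of <ricetta>.
# (Assumption: these are the "data oriented" fields from the original message.)
DATA_FIELDS = ("titolo", "tipo")


def to_attribute_markup(old_xml: str) -> str:
    """Hypothetical helper: move <titolo>/<tipo> text into attributes."""
    old = ET.fromstring(old_xml)
    new = ET.Element("ricetta")
    for child in old:
        if child.tag in DATA_FIELDS:
            # Element content becomes an attribute value.
            new.set(child.tag, (child.text or "").strip())
        else:
            # Everything else (notes, ingredients, ...) is kept as-is.
            new.append(child)
    return ET.tostring(new, encoding="unicode")


sample = (
    "<ricetta>"
    "<titolo>Pollo alla cacciatora</titolo>"
    "<tipo>Pollame</tipo>"
    "<note>(blah...)</note>"
    "</ricetta>"
)
converted = to_attribute_markup(sample)
print(converted)
```

With the data in attributes, the query no longer needs contains(): an exact match such as /ricetta[@tipo='Pollame']/@titolo expresses the same lookup.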

Also, you'd better split your db into multiple collections. This will also give you a substantial speed increase.

Ciao,

--
Gianugo Rabellino


