You could try removing the "ORDER BY" and/or "path LIKE" pieces of the query, see what that does for your performance.
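
A rough way to try this outside of zim is sketched below. It is only an illustration, not zim's code: the simplified `files` table, the value of `STATUS_NEED_UPDATE`, the `node_type` value and the helper names `make_test_db` and `time_variants` are all assumptions made up for the test. It fills an in-memory SQLite database with dummy rows and times the query with and without the "ORDER BY" and "path LIKE" pieces.

```python
import sqlite3
import time

# Hypothetical value, for illustration only; zim defines its own constant.
STATUS_NEED_UPDATE = 1

BASE = "SELECT id, path, node_type FROM files WHERE index_status = ?"

# Each variant: (SQL text, function building the query arguments from a prefix)
QUERIES = {
    "original": (
        BASE + " AND path LIKE ? ORDER BY node_type, id",
        lambda prefix: (STATUS_NEED_UPDATE, prefix + "%"),
    ),
    "no ORDER BY": (
        BASE + " AND path LIKE ?",
        lambda prefix: (STATUS_NEED_UPDATE, prefix + "%"),
    ),
    "no LIKE": (
        BASE + " ORDER BY node_type, id",
        lambda prefix: (STATUS_NEED_UPDATE,),
    ),
    "neither": (
        BASE,
        lambda prefix: (STATUS_NEED_UPDATE,),
    ),
}


def make_test_db(n_files):
    """Create an in-memory database with a simplified, assumed 'files' table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE files ("
        " id INTEGER PRIMARY KEY,"
        " path TEXT,"
        " node_type INTEGER,"
        " index_status INTEGER)"
    )
    db.executemany(
        "INSERT INTO files (path, node_type, index_status) VALUES (?, ?, ?)",
        (("data/file%06d.dat" % i, 2, STATUS_NEED_UPDATE) for i in range(n_files)),
    )
    db.commit()
    return db


def time_variants(db, prefix="", repeats=5):
    """Return the average seconds per fetchone() call for each query variant."""
    results = {}
    for name, (sql, make_args) in QUERIES.items():
        start = time.perf_counter()
        for _ in range(repeats):
            db.execute(sql, make_args(prefix)).fetchone()
        results[name] = (time.perf_counter() - start) / repeats
    return results


if __name__ == "__main__":
    db = make_test_db(50000)
    for name, seconds in time_variants(db).items():
        print("%-12s %.6f s per query" % (name, seconds))
```

The numbers will depend on the machine and on the real schema and indexes, so this only hints at which piece of the query is expensive. Timing the same variants against a copy of the actual index database would be more telling.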
If this turns out to help, I can split the loop into several, yielding in between so that the rest of the application can "tick".

-- Jaap

On Sun, Mar 25, 2018 at 11:14 PM <[email protected]> wrote:

> At brief glance, I'm guessing this is the issue:
>
> https://github.com/jaap-karssenberg/zim-desktop-wiki/blob/d96b3509890f4c9b9af9119f64b64947337d8da7/zim/notebook/index/files.py
> line 89
>
>     def _update_iter_inner(self, prefix=''):
>         # sort folders before files: first index structure, then contents
>         # this makes e.g. index links more efficient and robust
>         # sort by id to ensure parents are found before children
>         while True:
>             row = self.db.execute(
>                 'SELECT id, path, node_type FROM files'
>                 ' WHERE index_status = ? AND path LIKE ?'
>                 ' ORDER BY node_type, id',
>                 (STATUS_NEED_UPDATE, prefix + '%')
>             ).fetchone()
>
>             if row:
>                 node_id, path, node_type = row
>                 #print ">> UPDATE", node_id, path, node_type
>             else:
>                 break
>
> It seems like the whole database is being re-loaded and re-ordered
> again for the import of every single file. As the number of files in
> a notebook increases, this per-file database operation seems not to
> scale linearly, but at some much higher order. Something like globbing
> for the entire notebook-subdirectory structure and then db importing
> on a loop through that glob would be vastly more efficient for large
> file numbers.
>
>
> On Sun, 25 Mar 2018 14:30:33 -0500
> <[email protected]> wrote:
>
> > Would you mind pointing me to the source file(s) that manage this
> > indexing? I'd like to see if there is any way to speed the process
> > up for large numbers of files.
> >
> >
> > On Mon, 3 Jul 2017 17:42:07 +0000
> > <[email protected]> wrote:
> >
> > > Yes, for medium sized notebooks, and those with a "normal" amount of
> > > files, indexing is still under 5 minutes. I also use Zim to manage a
> > > notebook under which there are lots of small work data files
> > > (>350,000).
> > >
> > > The progress bar suggests there is some part of the parsing process
> > > that slows down over time, as does a cursory check on the contents
> > > of the database updating over time. There are many more files added
> > > within the first few minutes, and many fewer over time, such that
> > > after a while, only one or two files are added every several minutes.
> > > It suggests to me that the whole list is being re-processed or
> > > re-opened as part of the indexing loop, perhaps re-opening the
> > > sqlite file for every new file or something. Ultimately, I don't
> > > think that exponential slowdown is a necessity, but I have not had
> > > a free moment to familiarize myself with the source yet.
> > >
> > > Thanks!
> > >
> > >
> > > On Mon, 03 Jul 2017 08:04:36 +0000
> > > Jaap Karssenberg <[email protected]> wrote:
> > >
> > > > Yes, zim does indeed now build a table of all files in the
> > > > notebook folder, not just text files. However it doesn't access
> > > > them, it just stores file names and mtime.
> > > >
> > > > Despite this change, the indexing is faster than with 0.65 in most
> > > > of my test cases. The behavior you describe suggests a huge amount
> > > > of files under the notebook folder, is this the case?
> > > >
> > > > -- Jaap
> > > >
> > > > On Sun, Jul 2, 2017 at 8:43 PM <[email protected]> wrote:
> > > >
> > > > > The notebooks that used to take me about 5 minutes to re-index
> > > > > are taking close to 40 hours for me (they are larger notebooks).
> > > > >
> > > > > It looks like the sql database is indexing every file under the
> > > > > root directory of the notebook, even those not associated with
> > > > > Zim directly, like zip or data files. I'm not sure if that was
> > > > > happening with earlier versions.
> > > > >
> > > > >
> > > > > On Sat, 1 Jul 2017 23:32:16 +0200
> > > > > Olivier Boesch <[email protected]> wrote:
> > > > >
> > > > > > 6 minutes to reindex. Pretty long in comparison with 0.65.
> > > > > >
> > > > > >
> > > > > > On 01/07/2017 at 23:24, Olivier Boesch wrote:
> > > > > > >
> > > > > > > I seem to experience the same issue...
> > > > > > >
> > > > > > > I clicked the "cancel" button after several minutes...
> > > > > > >
> > > > > > > testing now how long it takes to re-index...
> > > > > > >
> > > > > > >
> > > > > > > On 01/07/2017 at 23:04, [email protected] wrote:
> > > > > > >> After this latest upgrade came through (it looks great),
> > > > > > >> notebooks that took me several minutes to re-index are now
> > > > > > >> taking multiple days of time, and it seems like an
> > > > > > >> exponential slowdown with the number (and maybe size) of
> > > > > > >> files under the notebook root directory. Has anyone else
> > > > > > >> experienced this?
> > > > > > >>
> > > > > > >>
> > > > > > >> _______________________________________________
> > > > > > >> Mailing list: https://launchpad.net/~zim-wiki
> > > > > > >> Post to     : [email protected]
> > > > > > >> Unsubscribe : https://launchpad.net/~zim-wiki
> > > > > > >> More help   : https://help.launchpad.net/ListHelp
_______________________________________________
Mailing list: https://launchpad.net/~zim-wiki
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~zim-wiki
More help   : https://help.launchpad.net/ListHelp

