https://bugzilla.wikimedia.org/show_bug.cgi?id=52435

       Web browser: ---
            Bug ID: 52435
           Summary: Find previous/next links in the navbar more
                    efficiently
           Product: MediaWiki extensions
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Unprioritized
         Component: BookManagerv2
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected]
    Classification: Unclassified
   Mobile Platform: ---

Each time a page is loaded, currently, the prev/next links in the navbar are
generated by looping through the entire sections block of the JSON and finding
the current page. This is probably quite slow for very large books.

Relevant IRC discussion:
14:33 < GorillaWarfare> mwalker: If you look at BookManagerv2.hooks.php on line
399, you'll see it's getting the prev/next sections by looping through each
section
14:33 < GorillaWarfare> mwalker: But some books have literally 40,000 sections
14:33 < GorillaWarfare> mwalker: Any thoughts on how to deal wiht that better?
14:33 < mwalker> looking
14:34 < GorillaWarfare> Sure
14:38 < mwalker> GorillaWarfare: so two things come to mind -- first that this
if you keep it this way could be cached
14:38 < mwalker> the second thing is; why is it unsuitable to use the list
order given in the JSON?
14:39 < mwalker> that way it could be numerically indexed and you wouldn't have
this problem
14:39 < GorillaWarfare> Use the list order?
14:40 < mwalker> the order in which they were listed in the raw file
14:42 < mwalker> you have an implicit ordering there which you can actually use
as an index -- and if you wanted on page save; you could turn that into an
explicit ordering which you can then pass down to JS
14:43 < mwalker> or... because I'm an idiot and now see what you're trying to
do
14:43 < mwalker> this hook gets run for a specific page
14:44 < mwalker> so... find the page you're rendering for; and then use that
implicit index to construct just those forward/backward links?
14:54 < GorillaWarfare> How would it know which page was which index?
14:56 < mwalker> GorillaWarfare: for more generic cases it's in_array; but in
this case you can use array_walk()
14:57 < mwalker> or
14:59 < mwalker> hurm... that's not impressively efficient; your best bet might
just be to make an array of index => section name on json page save
14:59 < mwalker> save that in cache
14:59 < mwalker> and then retrieve it
15:00 < mwalker> this is where I wish php had proper trees and tree searching
15:03 < GorillaWarfare> Wait, how is index => section name different from what
it's doing now?
15:05 < mwalker> it's not; except for the bit where you can't search for a
section name to find it's index
15:05 < mwalker> what you're doing right now is actually probably totally fine
so long as you cache it
15:05 < mwalker> it's just that you dont want to do this operation for every
page
15:05 < GorillaWarfare> Even for 40,000 sections?
15:05 < GorillaWarfare> Ohh; it is doing it for every page
15:06 < GorillaWarfare> And the navbars aren't currently cached
15:06 < mwalker> check
15:06 < GorillaWarfare> So it could be slooow
15:06 < mwalker> indeed!
15:06 < GorillaWarfare> Hmm... so how would we do this..
15:06 < GorillaWarfare> Create each navbar at once and cache them sometime?
15:06 < mwalker> the smaller advantage of just having a simple lookup table
without the rest of the metadata is that then you dont have to store all the
metadata into memcache
15:07 < mwalker> well; pre-generate the metadata required for each navbar
15:08 < mwalker> but with 20k sections you're going to run into other problems
-- like how do you even display that
15:09 < GorillaWarfare> Yeaaaah
15:09 < GorillaWarfare> When Raylton was talking about "big books", I don't
think I quite envisioned the true scale
15:09 < mwalker> do they have subsections?
15:09 < mwalker> because then you could cache it sort of as a tree
15:11 < GorillaWarfare> We can't count on it
15:12 < mwalker> grrr
15:12 < mwalker> ok -- so we know we have some limited structured data here
15:13 < mwalker> it might make sense to put this into a database table and then
it's persistent and queryable
15:13 < GorillaWarfare> Eek
15:13 < GorillaWarfare> That... doesn't sound simple
15:13 < mwalker> probably easier than you think; but it will force you to make
some modifications to your algorithms
15:14 < mwalker> because you'll pull in the section ordering data from the
database; and the rest of the book metadata from json
15:14 < mwalker> this might help
http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/
15:14 < mwalker> just for a concept

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l

Reply via email to