daniel added a comment.

Hi @jcrespo!

Performance and scalability. We need a way to efficiently track and query page names across all Wiktionaries.

Why not solve the problem forever by making title a first-class entity on core, solving title and *link, page_assessment, etc. space issues at the same time?

Can you elaborate? I don't quite understand what you are proposing. Title is already a first-class entity, that's what the page table is. But that is per-wiki. What we need here is cross-wiki.

Are you proposing to use numeric IDs instead of title strings, even for pages that do not exist? That's worth considering. I have done this before on occasion, for instance for my thesis. Maintaining the mapping is a pain, though. I found that using a 64-bit hash of the title is a better solution than auto-increment: the mapping is implicit in at least one direction (title to ID), and collisions are nearly impossible.
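To illustrate the hashing idea, here is a minimal sketch. It assumes SHA-256 truncated to 64 bits, which is only one possible choice; the names `title_hash64` and `collision_probability` are made up for this example and are not part of MediaWiki or any extension. The second function is the usual birthday-bound approximation, which gives a rough feel for how unlikely collisions are.

```python
import hashlib


def title_hash64(title: str) -> int:
    """Derive an unsigned 64-bit integer ID from a title string.

    Assumption: SHA-256 truncated to its first 8 bytes. Any well-mixed
    hash would do; this one is just available in the standard library.
    """
    digest = hashlib.sha256(title.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def collision_probability(n_titles: int, bits: int = 64) -> float:
    """Birthday-bound approximation p ~ n^2 / 2^(bits + 1).

    Only meaningful while the result is much smaller than 1.
    """
    return n_titles * n_titles / float(2 ** (bits + 1))


# The title -> ID direction needs no stored mapping: just recompute it.
print(title_hash64("Haus"))
# Rough collision estimate for a hundred million distinct titles.
print(collision_probability(10**8))
```

The reverse direction (ID to title) still needs a stored mapping wherever the title string has to be shown.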

Do you mean the global lookup table for Cognate should be blocked on having a numeric representation for titles, to reduce the space needed? Actually, we could immediately use 64-bit hashes for the normalized keys... all we are interested in is equality anyway. That could work.
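Roughly what I have in mind, as an in-memory sketch rather than a schema proposal. Everything here is hypothetical: `normalize` is a placeholder (underscores to spaces plus Unicode NFC), not the real normalization rules, and `CrossWikiIndex` stands in for the global lookup table. The point is only that lookups compare hashed keys for equality and never need the key string itself.

```python
import hashlib
import unicodedata
from collections import defaultdict


def normalize(title: str) -> str:
    # Placeholder normalization (assumption, not the real rules).
    return unicodedata.normalize("NFC", title.replace("_", " ").strip())


def key_hash64(title: str) -> int:
    # 64-bit key: first 8 bytes of SHA-256 over the normalized title.
    digest = hashlib.sha256(normalize(title).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class CrossWikiIndex:
    """Toy stand-in for a global table keyed on the hashed normalized title."""

    def __init__(self) -> None:
        self._by_key = defaultdict(set)

    def add(self, wiki: str, title: str) -> None:
        self._by_key[key_hash64(title)].add((wiki, title))

    def matches(self, wiki: str, title: str) -> set:
        # Equality on the 64-bit key is the only comparison performed.
        entries = self._by_key.get(key_hash64(title), set())
        return {entry for entry in entries if entry[0] != wiki}


index = CrossWikiIndex()
index.add("enwiktionary", "foo_bar")
index.add("dewiktionary", "foo bar")
print(index.matches("enwiktionary", "foo_bar"))
```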

But what does this have to do with page_assessment? That seems unrelated.


