Michaelcochez added a comment.

  An update on the current status, mainly regarding the index file:
  
  First, a correction to my response above: the index file is much smaller
than I stated. The binary version is currently around 75 MB (not 1.5 GB).
  
  Progress:
  
  - We implemented serialization using protocol buffers. Initial experiments
seem promising: store and load times appear to be only slightly slower than
with the native format. The on-disk size grew from 75 MB to 250 MB. However,
with bzip2 compression the protocol buffer version can be squeezed into 51 MB,
so that seems to go well.
  - While rewriting the serialization code, I noticed that it was hard to
maintain it in a separate project. Hence, I integrated a minimal version of
the index creation into the codebase.
  - The previous index creation used the RDF dump as its data source. The
new version uses the JSON dump. That has several benefits, mainly that most of
the preprocessing steps needed before are avoided.
  - The Go version has been bumped from 1.17 to 1.18.
  
  These changes are not merged into main yet. Development is ongoing in
https://github.com/martaannaj/RecommenderServer/tree/protobuffer_serialization
  The following still needs to be done before merging:
  
  - Test coverage for the new serialization format.
  - Checking whether gokart can somehow be used with the latest Go. It does
not support Go's new generics capabilities. Most of its functionality is
covered by the other checking tools used.
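
  For context on the generics point, this is the kind of Go 1.18 syntax a
checking tool has to understand. The `Keys` helper is a hypothetical example
and is not taken from the RecommenderServer codebase.

```go
package main

import "fmt"

// Keys returns the keys of any map, in unspecified order.
// The type parameters K and V are the Go 1.18 generics syntax.
func Keys[K comparable, V any](m map[K]V) []K {
	keys := make([]K, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	return keys
}

func main() {
	counts := map[string]int{"P31": 3, "P18": 1}
	fmt.Println(len(Keys(counts)))
}
```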

TASK DETAIL
  https://phabricator.wikimedia.org/T301471
