daniel added a comment.

  In T198341#5044442 <https://phabricator.wikimedia.org/T198341#5044442>, 
@BPirkle wrote:
  
  > To make sure I'm heading in the right direction:
  >
  > - Postgres currently uses columns in specific tables (pagecontent.textvector and page.titlevector, both of type tsvector) to store search index information
  > - Postgres also currently uses triggers/procedures to maintain these tables
  > - And directly related to this task, searchPostgres.php currently and undesirably references fields rev_text_id and old_id
  
  
  That sounds right to me, but then, I don't actually know how each works on 
postgres.
  
  > I should:
  >
  > - add a searchindex table, similar to the MySQL one, but adapted to Postgres data types
  > - add related indexes/triggers/procedures to maintain this table
  > - add necessary documentation and install/update support for these db changes
  > - modify searchPostgres.php to use the new db changes
  
  That sounds like "completely change how postgres does search". Now, my 
initial impression is that the way postgres does search is broken by design, 
since a database trigger has no way to correctly interpret the contents of the 
text table, which may be a serialized object, or compressed, or JSON, or a 
reference to an external storage system, or all of these. This has been the 
case for over ten years now.
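  
  To illustrate the concern, here is a minimal Python sketch. It is not MediaWiki code: `store_blob` and `naive_trigger_sees` are hypothetical helpers, `zlib.compress` is only a stand-in for whatever compression the text table actually uses, and the flag strings and the external-store address are assumed examples. It only shows that a trigger doing plain matching on the stored bytes cannot find words once the blob is compressed or replaced by a pointer.

```python
import zlib


def store_blob(text: str, compress: bool) -> tuple[bytes, str]:
    """Return (stored_bytes, flags) for a page text.

    Hypothetical helper: the flag strings are illustrative assumptions.
    """
    raw = text.encode("utf-8")
    if compress:
        # zlib is a stand-in for the real compression scheme.
        return zlib.compress(raw), "utf-8,gzip"
    return raw, "utf-8"


def naive_trigger_sees(stored: bytes, word: str) -> bool:
    """What a trigger that reads the stored bytes directly could match."""
    return word.encode("utf-8") in stored


page_text = "searchable wiki text about postgres triggers. " * 20

plain_blob, _ = store_blob(page_text, compress=False)
packed_blob, _ = store_blob(page_text, compress=True)
# Assumed example of a row that only points at external storage.
external_blob = b"DB://cluster1/12345"

print(naive_trigger_sees(plain_blob, "postgres"))
print(naive_trigger_sees(packed_blob, "postgres"))
print(naive_trigger_sees(external_blob, "postgres"))
```

  Only the uncompressed row is matchable as stored; the other two would need application-level decoding before any text could be indexed.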
  
  But maybe I'm wrong about what it's trying to do. In any case, I'd recommend 
simply keeping the current functionality as-is, without trying to fix its 
semantics.
  
  So, my (possibly completely wrong) understanding of how this currently works 
is: pagecontent.textvector contains the searchable version of the page text. 
We can keep that as it is; the trigger just needs to use the slots and content 
tables to connect rows in pagecontent to rows in revision, instead of using 
rev_text_id. From my current understanding, that should be it. But perhaps Brad 
has different ideas.
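
  A rough sketch of that lookup, with several assumptions: it uses an in-memory SQLite database as a stand-in for Postgres, the table and column names (slot_revision_id, slot_content_id, content_address) follow my understanding of the slots/content schema, and it assumes content_address has the form "tt:<old_id>". The helper names `build_demo_db` and `revisions_for_text` are made up for illustration; the real trigger would be written in PL/pgSQL.

```python
import sqlite3

# Join path assumed here: revision -> slots -> content -> pagecontent,
# replacing the old direct link via rev_text_id.
REVISIONS_FOR_TEXT_SQL = """
    SELECT r.rev_id
    FROM pagecontent AS t
    JOIN content AS c ON c.content_address = 'tt:' || t.old_id
    JOIN slots AS s ON s.slot_content_id = c.content_id
    JOIN revision AS r ON r.rev_id = s.slot_revision_id
    WHERE t.old_id = ?
    ORDER BY r.rev_id
"""


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory schema with illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER);
        CREATE TABLE slots (slot_revision_id INTEGER, slot_content_id INTEGER);
        CREATE TABLE content (content_id INTEGER PRIMARY KEY,
                              content_address TEXT);
        CREATE TABLE pagecontent (old_id INTEGER PRIMARY KEY, old_text TEXT);

        INSERT INTO pagecontent VALUES (7, 'first text'), (8, 'second text');
        INSERT INTO content VALUES (1, 'tt:7'), (2, 'tt:8');
        INSERT INTO revision VALUES (100, 1), (101, 1), (102, 1);
        -- Revisions 100 and 101 share content 1; 102 has its own content.
        INSERT INTO slots VALUES (100, 1), (101, 1), (102, 2);
        """
    )
    return conn


def revisions_for_text(conn: sqlite3.Connection, old_id: int) -> list[int]:
    """Return the ids of revisions whose slot content points at this text row."""
    rows = conn.execute(REVISIONS_FOR_TEXT_SQL, (old_id,)).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
print(revisions_for_text(conn, 7))
print(revisions_for_text(conn, 8))
```

  One thing the sketch makes visible: a single pagecontent row can now belong to more than one revision, so the trigger can't assume a one-to-one mapping the way rev_text_id suggested.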

TASK DETAIL
  https://phabricator.wikimedia.org/T198341
