BPirkle added a comment.
In T198341#5024626 <https://phabricator.wikimedia.org/T198341#5024626>, @Anomie wrote: > In T198341#5022412 <https://phabricator.wikimedia.org/T198341#5022412>, @daniel wrote: > > > In T198341#5021347 <https://phabricator.wikimedia.org/T198341#5021347>, @BPirkle wrote: > > > > > - includes/search/SearchPostgres.php > > > > > > odd stuff, maybe @anomie can help. Two uses seem completely pointless, the third one should probably use getQueryInfo? > > > Note in the PostgreSQL schema the "text" table is named "pagecontent". That was done in rSVN15791 <https://phabricator.wikimedia.org/rSVN15791>; I don't know why quoting wasn't just used instead. The plan for T164898 <https://phabricator.wikimedia.org/T164898>/T191231 <https://phabricator.wikimedia.org/T191231> includes renaming it back to "text". > > It looks like that file needs some #technical-debt <https://phabricator.wikimedia.org/tag/technical-debt/> cleanup to use the IDatabase querying methods rather than building SQL strings directly, but that alone would probably be out of scope here. > > What is in scope is that it's trying to search directly against the `text` table, having added an extra column for that purpose, and will need to be converted to use a `searchindex` table like MySQL does. To make sure I'm heading in the right direction: - Postgres currently uses columns in specific tables (pagecontent.textvector and page.titlevector, both of type tsvector) to store search index information - Postgres also currently uses triggers/procedures to maintain these tables - And directly related to this task, searchPostgres.php currently and undesirably references fields rev_text_id and old_id I should: - add a searchindex table, similar to the MySQL one, but adapted to Postgres data types - add related indexes/triggers/procedures to maintain this table - add necessary documentation and install/update support for these db changes - modify searchPostgres.php to use the new db changes Questions: - I'm assuming we are using triggers/procedures to maintain the existing tsvector search columns because that works well in Postgres, and we want to retain that technique with the searchindex table. Am I right, or do we want to do more of this at the PHP level per SearchMySQL.php? - should I remove/refactor the existing columns/triggers/procedures and have all Postgres searches work off (only) the new searchindex table? Or should I leave that alone for now and use $wgMultiContentRevisionSchemaMigrationStage gating in searchPostgres.php to continue to use the existing columns if we're reading the old schema? Leaving the old and maintaining two sets of indexes at the db level sounds inefficient. But maybe there's a reason I'm missing to retain it. - there are a. bunch of "ts2_" prefixes floating around in postgres/tables.sql (ex. "CREATE TRIGGER ts2_page_text..."). Is that a meaningful naming convention I should be aware of and retain, or is that just "text search 2" to indicate it was a "new thing" at some point? - is there developer documentation on update.php/DatabaseUpdater.php/PostgresUpdater.php? I've looked through that code (and postgres/tables.sql). I have a formative understanding, but I need to be more solid on how all this works. TASK DETAIL https://phabricator.wikimedia.org/T198341 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: BPirkle Cc: tstarling, gerritbot, Tgr, Jdforrester-WMF, Anomie, Addshore, aude, Aklapper, daniel, alaa_wmde, EvanProdromou, CucyNoiD, Nandana, NebulousIris, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, Gq86, Baloch007, Ramsey-WMF, Darkminds3113, Bsandipan, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, LawExplorer, Lewizho99, JJMC89, Maathavan, _jensen, rosalieper, Agabi10, Wikidata-bugs, Mbch331, Ltrlg
_______________________________________________ Wikidata-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
