Ladsgroup added a comment.

  In T242366#5800757 <https://phabricator.wikimedia.org/T242366#5800757>, 
@WMDE-leszek wrote:
  
  > To make sure I understand the current state, is the following correct?
  >
  > 1. rebuildItemTerms.php is run with input being a file containing a million 
Q-IDs
  
  It's a file covering a one-million range of Q-IDs, e.g. from Q2M to Q3M; the 
actual number of rows is usually around 300K.
  
  > 2. All output of the script is recorded to a file (in your home directory - 
who can access it then?).
  
  Output is separated into stderr and stdout. That actually makes checking 
things easier: you just look at stderr.
  > This output contains all kind of log - is there an automated way to extract 
Q-IDs for which rebuilding failed, hence
  
  > 3. Somehow (how?) Q-IDs for which rebuilding has to happen again are 
extracted to a file in a format consumed by a script (i.e. similar to the file 
used in step 1)
  
  Yes. You take the file, check at which stage it failed, split the file based 
on that number, and save it to a new file.
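  
  A minimal sketch of that split step, assuming the input file holds one Q-ID 
per line and that the point of failure has been read off stderr as a Q-ID. The 
names `remaining_ids` and `write_ids` are hypothetical helpers for illustration 
and are not part of rebuildItemTerms.php.

```python
from typing import Iterable, List


def remaining_ids(lines: Iterable[str], failed_id: str) -> List[str]:
    """Return the Q-IDs from failed_id onward (inclusive), in file order.

    Assumption: one Q-ID per line; blank lines are ignored.
    Raises ValueError if failed_id does not appear in the input.
    """
    ids = [line.strip() for line in lines if line.strip()]
    try:
        start = ids.index(failed_id)
    except ValueError:
        raise ValueError(f"{failed_id} not found in input") from None
    return ids[start:]


def write_ids(path: str, ids: Iterable[str]) -> None:
    """Write the Q-IDs to a new input file, one per line."""
    with open(path, "w", encoding="utf-8") as out:
        out.writelines(f"{qid}\n" for qid in ids)
```

  The failed Q-ID is kept in the new file so that it is retried; whether that 
is the desired behaviour depends on the kind of failure.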
  
  > 4. The input file used in step 1 is removed.
  >
  > 5. The cycle repeats for the next input file, containing up to one 
million Q-IDs.
  
  Nope, you just make a new file and feed it to the system.
  
  The important thing is that the failures do not occur on every Q-ID; they 
happen occasionally and halt the whole range. The only time that happened was 
around Q2.1M, which made it jump to Q3M. I had to feed the range from 2.1M to 
3M to the script again, but nothing has happened since then. For other types of 
failure (which can happen as well), it just skips them, but checking the 
sqooped data says it's too small to be of worry. (Don't get me wrong, 7% is not 
small, but it can also be caused by drift of the sqooped tables, issues in 
wb_terms instead, and so many other factors.)
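  
  Re-feeding a partial range such as 2.1M to 3M amounts to filtering the 
input by the numeric part of each Q-ID. The sketch below assumes one Q-ID per 
line in the form `Q<number>`; the helper name `ids_in_range` and the half-open 
bounds are illustrative choices, not something the script defines.

```python
from typing import Iterable, Iterator


def ids_in_range(lines: Iterable[str], low: int, high: int) -> Iterator[str]:
    """Yield Q-IDs whose numeric part n satisfies low <= n < high.

    Assumption: each non-blank line is "Q" followed by digits; any other
    line makes int() raise ValueError instead of being skipped silently.
    """
    for line in lines:
        qid = line.strip()
        if not qid:
            continue
        number = int(qid[1:])  # drop the leading "Q"
        if low <= number < high:
            yield qid
```

  For the case described above, the call would be 
`ids_in_range(lines, 2_100_000, 3_000_000)`.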
  
  > What I also don't know if all of the above are automated, are they run on 
certain times, is maybe each iteration (i.e. starting the new go from the step 
1) triggered manually?

TASK DETAIL
  https://phabricator.wikimedia.org/T242366

To: Ladsgroup
Cc: Aklapper, Addshore, WMDE-leszek, Ladsgroup, Iflorez, darthmon_wmde, 
alaa_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, Lydia_Pintscher, 
Mbch331