https://bugzilla.wikimedia.org/show_bug.cgi?id=68506

--- Comment #6 from Aaron Schulz <[email protected]> ---
(In reply to Fæ from comment #4)
> (In reply to Aaron Schulz from comment #3)
> > Aside from the failed "claimed" jobs, the queue is empty atm.
> 
> In which case I don't understand how "ghost jobs" can be arising. There may
> be a deeper problem here.
> 

I should note that failed claimed jobs can be retried after an hour or so (up
to 3 total tries, including the first). Nevertheless, that isn't the issue here.
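
As a rough illustration of the retry rule just described, here is a minimal Python sketch. It is not MediaWiki code: the function name, the constant names, and the exact 3600-second claim timeout are assumptions chosen to match "an hour or so" and "3 total tries, including the first".

```python
import time

# Assumptions for illustration only, taken from the wording of the comment.
CLAIM_TTL_SECONDS = 3600  # "an hour or so" (exact value is a guess)
MAX_ATTEMPTS = 3          # 3 total tries, including the first


def can_retry(attempts, claimed_at, now=None):
    """Return True if an abandoned claimed job may be handed out again.

    attempts   -- how many times the job has been tried so far
    claimed_at -- UNIX timestamp of the most recent claim
    now        -- current UNIX timestamp (defaults to the wall clock)
    """
    if now is None:
        now = time.time()
    claim_expired = (now - claimed_at) >= CLAIM_TTL_SECONDS
    return claim_expired and attempts < MAX_ATTEMPTS
```

Under these assumptions, a job on its first attempt becomes retryable once its claim is an hour old, while a job that has already used all three attempts is never handed out again.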

However, from the description above it seems like the old jobs never come back
until new ones are added (e.g. hours pass with nothing happening otherwise).
Even right now I see:

aaron@terbium:~$ mwscript showJobs.php commonswiki --group | grep gwtoolset
gwtoolsetUploadMediafileJob: 0 queued; 138 claimed (0 active, 138 abandoned); 0 delayed
gwtoolsetUploadMetadataJob: 8 queued; 0 claimed (0 active, 0 abandoned); 0 delayed

aaron@terbium:~$ mwscript showJobs.php commonswiki --list | grep gwtoolset
gwtoolsetUploadMetadataJob User:Fæ/GWToolset/Metadata_Batch_Job/53d2a118b0f90 attempts=1 user-name=Fæ whitelisted-post=array(45) jobReleaseTimestamp=1406312788 status=unclaimed
gwtoolsetUploadMetadataJob User:Fæ/GWToolset/Metadata_Batch_Job/53d2a1152a052 attempts=1 user-name=Fæ whitelisted-post=array(45) jobReleaseTimestamp=1406312785 status=unclaimed
gwtoolsetUploadMetadataJob User:Fæ/GWToolset/Metadata_Batch_Job/53d2a1154d9b3 attempts=1 user-name=Fæ whitelisted-post=array(45) jobReleaseTimestamp=1406312785 status=unclaimed
gwtoolsetUploadMetadataJob User:Ayaita/GWToolset/Metadata_Batch_Job/53d2a113495f1 attempts=1 user-name=Ayaita whitelisted-post=array(31) jobReleaseTimestamp=1406312783 status=unclaimed
gwtoolsetUploadMetadataJob User:Fæ/GWToolset/Metadata_Batch_Job/53d2a11345628 attempts=1 user-name=Fæ whitelisted-post=array(45) jobReleaseTimestamp=1406312783 status=unclaimed
gwtoolsetUploadMetadataJob User:Fæ/GWToolset/Metadata_Batch_Job/53d2a11654ce5 attempts=1 user-name=Fæ whitelisted-post=array(45) jobReleaseTimestamp=1406312786 status=unclaimed
gwtoolsetUploadMetadataJob User:Fæ/GWToolset/Metadata_Batch_Job/53d2a11316624 attempts=1 user-name=Fæ whitelisted-post=array(45) jobReleaseTimestamp=1406312783 status=unclaimed
gwtoolsetUploadMetadataJob User:Fæ/GWToolset/Metadata_Batch_Job/53d2a113296a5 attempts=1 user-name=Fæ whitelisted-post=array(45) jobReleaseTimestamp=1406312783 status=unclaimed
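
The jobReleaseTimestamp values in the listing above are UNIX timestamps. A small Python sketch for pulling them out of pasted showJobs.php output and reading them as UTC dates might look like this; the helper name release_times is hypothetical and the script is not part of MediaWiki.

```python
import re
from datetime import datetime, timezone


def release_times(listing):
    """Extract every jobReleaseTimestamp=<digits> value from pasted
    job listing text and convert each one to a UTC datetime."""
    stamps = re.findall(r"jobReleaseTimestamp=(\d+)", listing)
    return [datetime.fromtimestamp(int(ts), tz=timezone.utc) for ts in stamps]


if __name__ == "__main__":
    sample = (
        "attempts=1 jobReleaseTimestamp=1406312788 status=unclaimed "
        "attempts=1 jobReleaseTimestamp=1406312783 status=unclaimed"
    )
    for when in release_times(sample):
        print(when.isoformat())
```

Because the pattern matches the key=value pair anywhere in the text, it works whether or not the listing has been re-wrapped by a mail client.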

So there are 8 jobs waiting to be claimed, yet nothing has claimed them. This
is odd, since we have a dedicated runner for gwt jobs on each of 17 servers. It
could happen if the queue is not listed in JobQueueAggregator, which would
leave the runners unaware that the queue is ready. Checking that gives:

aaron@terbium:~$ mwscript eval.php testwiki
> print_r( JobQueueAggregator::singleton()->getAllReadyWikiQueues() );
Array
(
    [ParsoidCacheUpdateJobOnEdit] => Array
        (
            [0] => fawiki
            [1] => warwiki
            [2] => wikidatawiki
            [3] => ruwiki
            [4] => hewiki
            [5] => eswiki
            [6] => arwiki
            [7] => frwiki
            [8] => plwiki
            [9] => mgwiktionary
            [10] => hywiki
            [11] => mediawikiwiki
            [12] => enwiktionary
            [13] => frwiktionary
            [14] => svwiki
            [15] => zhwiki
        )

    [refreshLinks] => Array
        (
            [0] => itwiki
            [1] => frwiki
            [2] => commonswiki
            [3] => enwiki
            [4] => enwiktionary
            [5] => ruwiki
        )

    [cirrusSearchLinksUpdatePrioritized] => Array
        (
            [0] => metawiki
            [1] => enwiktionary
            [2] => svwiki
            [3] => eswiki
            [4] => hewiki
            [5] => fawiki
            [6] => ruwiki
            [7] => arwiki
            [8] => hywiki
            [9] => frwiki
            [10] => plwiki
            [11] => warwiki
            [12] => wikidatawiki
            [13] => commonswiki
            [14] => dewiki
            [15] => itwiki
        )

    [ParsoidCacheUpdateJobOnDependencyChange] => Array
        (
            [0] => eswiki
            [1] => warwiki
            [2] => cawiktionary
            [3] => wikidatawiki
            [4] => plwiki
            [5] => shwiktionary
            [6] => mediawikiwiki
            [7] => svwiki
            [8] => frwiki
            [9] => enwiktionary
            [10] => mgwiktionary
            [11] => hewiki
            [12] => itwiki
            [13] => frwiktionary
            [14] => arwiki
            [15] => nlwiki
            [16] => commonswiki
            [17] => fawiki
            [18] => enwiki
            [19] => ruwiki
            [20] => hywiki
            [21] => shwiki
            [22] => dewiki
        )

    [cirrusSearchLinksUpdate] => Array
        (
            [0] => mediawikiwiki
            [1] => enwiki
            [2] => frwiki
            [3] => enwiktionary
            [4] => lawiki
            [5] => euwiki
            [6] => itwiki
            [7] => commonswiki
            [8] => ruwiki
        )

    [cirrusSearchOtherIndex] => Array
        (
            [0] => enwiki
        )

    [cirrusSearchLinksUpdateSecondary] => Array
        (
            [0] => wikidatawiki
            [1] => frwiktionary
            [2] => dewiki
            [3] => svwiki
            [4] => enwiki
            [5] => shwiki
            [6] => frwiki
            [7] => ptwiki
        )

    [htmlCacheUpdate] => Array
        (
            [0] => warwiki
            [1] => shwiki
            [2] => dewiki
            [3] => enwiktionary
            [4] => frwiki
            [5] => ruwiki
            [6] => enwiki
        )

    [enotifNotify] => Array
        (
            [0] => plwiki
            [1] => commonswiki
            [2] => dewiki
            [3] => eswiki
        )

    [MessageUpdateJob] => Array
        (
            [0] => mediawikiwiki
        )

)

No entries for gwt jobs. Since the above jobs had jobReleaseTimestamp=<X>, they
must have started off delayed and then became available. Maybe the aggregator
wasn't notified at that point, which would explain some of these problems. I
don't see any bugs in executeReadyPeriodicTasks() in redisJobRunner offhand,
but I'll see if I can find anything.
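
The suspected failure mode can be shown with a toy model. The Python sketch below is not the MediaWiki or redisJobRunner implementation; every class, method, and parameter name in it is hypothetical. It only illustrates the hypothesis: if delayed jobs are promoted to the unclaimed list without the aggregator being told, a runner that consults only the aggregator never sees them.

```python
class ToyAggregator:
    """Toy stand-in for an aggregator: remembers which (type, wiki)
    queues have been reported as having ready jobs."""

    def __init__(self):
        self.ready = {}  # job type -> set of wiki ids

    def notify_queue_non_empty(self, wiki, job_type):
        self.ready.setdefault(job_type, set()).add(wiki)

    def get_all_ready_wiki_queues(self):
        return {t: sorted(wikis) for t, wikis in self.ready.items()}


class ToyQueue:
    """Toy queue with a delayed list and an unclaimed list."""

    def __init__(self, wiki, job_type, aggregator):
        self.wiki = wiki
        self.job_type = job_type
        self.aggregator = aggregator
        self.delayed = []    # list of (release_timestamp, job)
        self.unclaimed = []  # jobs ready to be claimed

    def push_delayed(self, job, release_timestamp):
        self.delayed.append((release_timestamp, job))

    def release_delayed(self, now, notify=True):
        """Move due jobs from delayed to unclaimed. With notify=False this
        models the suspected bug: jobs become ready silently."""
        due = [job for ts, job in self.delayed if ts <= now]
        self.delayed = [(ts, job) for ts, job in self.delayed if ts > now]
        self.unclaimed.extend(due)
        if due and notify:
            self.aggregator.notify_queue_non_empty(self.wiki, self.job_type)
        return len(due)


def runner_pop(queue, aggregator):
    """A runner that only looks at queues the aggregator lists as ready."""
    ready = aggregator.get_all_ready_wiki_queues()
    if queue.wiki not in ready.get(queue.job_type, []):
        return None  # the runner does not know this queue has work
    return queue.unclaimed.pop(0) if queue.unclaimed else None
```

In this model, calling release_delayed(now, notify=False) leaves jobs sitting in the unclaimed list while runner_pop keeps returning None. A later notification for the same queue, for example when a new job is pushed, makes the old jobs claimable again, which would match the report that old jobs only come back once new ones are added.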
