I think the latter case is a likely candidate, as we see a couple hundred
apache worker segfaults daily, in both php5 and libxml2 space. The first
case is likely a bug in php core and it's worth checking whether we'd see
the same behavior running a current php release. Core analysis may help us
determine if a reproduceable state always leads to the crash. Similarly
with libxml2. I suppose we'd have to patch apc with additional logging to
really know for sure that this is the cause.
>From your understanding of the apc source, when would such items ever
actually be freed? Only on apache restart?
On Tuesday, October 2, 2012, Patrick Reilly wrote:
> Time to time, we receive a strange warning message
> in fenari:/home/wikipedia/log/syslog/apache.log
>
> Oct 3 01:01:03 10.0.11.59 apache2[20535]: PHP Warning: * require() [<a
> href='function.require'>function.require</a>]: GC cache entry
> '/usr/local/apache/common-local/wmf-config/ExtensionMessages-1.20wmf12.php'
> (dev=2049 ino=10248005) was on gc-list for 601 seconds in
> /usr/local/apache/common-local/php-1.20wmf12/includes/AutoLoader.php* on
> line 1150
>
> Definitely this issue comes from *APC*, source code from package
> apc-3.1.6-r1.
> When item is inserted into user cache or file cache, this function is
> called.
>
> static void process_pending_removals(apc_cache_t* cache TSRMLS_DC)
> {
> slot_t** slot;
> time_t now;
>
> /* This function scans the list of removed cache entries and deletes any
> * entry whose reference count is zero (indicating that it is no longer
> * being executed) or that has been on the pending list for more than
> * cache->gc_ttl seconds (we issue a warning in the latter case).
> */
>
> if (!cache->header->deleted_list)
> return;
>
> slot = &cache->header->deleted_list;
> now = time(0);
>
> while (*slot != NULL) {
> int gc_sec = cache->gc_ttl ? (now - (*slot)->deletion_time) : 0;
>
> if ((*slot)->value->ref_count <= 0 || gc_sec > cache->gc_ttl) {
> slot_t* dead = *slot;
>
> if (dead->value->ref_count > 0) {
> switch(dead->value->type) {
> case APC_CACHE_ENTRY_FILE:
> apc_warning("GC cache entry '%s' (dev=%d ino=%d)
> was on gc-list for %d seconds" TSRMLS_CC,
> dead->value->data.file.filename,
> dead->key.data.file.device, dead->key.data.file.inode, gc_sec);
> break;
> case APC_CACHE_ENTRY_USER:
> apc_warning("GC cache entry '%s'was on gc-list for
> %d seconds" TSRMLS_CC, dead->value->data.user.info, gc_sec);
> break;
> }
> }
> *slot = dead->next;
> free_slot(dead TSRMLS_CC);
> }
> else {
> slot = &(*slot)->next;
> }
> }
> }
>
> From APC configuration (
> http://us.php.net/manual/en/apc.configuration.php#ini.apc.gc-ttl )
>
> *apc.gc_ttl integer*
>
> The number of seconds that a cache entry may remain on the
> garbage-collection list. This value provides a fail-safe in the event that
> a server process dies while executing a cached source file; if that source
> file is modified, the memory allocated for the old version will not be
> reclaimed until this TTL reached. Set to zero to disable this feature.
>
> We get messages "GC cache entry '%s' (dev=%d ino=%d) was on gc-list for %d
> seconds" or "GC cache entry '%s'was on gc-list for %d seconds" in this
> condition:
>
> (gc_sec > cache->gc_ttl) && (dead->value->ref_count > 0)
>
> First condition means, item was deleted later then apc.gc_ttl seconds ago
> and its still in garbage collector list.
> Seconds condition means, item is still referenced.
>
> e.g., when a process unexpectedly died, reference is not decreased. First
> apc.ttl seconds is active in APC cache, then is deleted (there isn't next
> hit on this item). Now item is on garbage collector list (GC) and
> apc.gc_ttl timeout is running.
> When apc.gc_ttl is less then (now - item_deletion_time), warning is written
> and item is finally completely flushed.
>
> So what should we do?
> _______________________________________________
> Wikitech-l mailing list
> [email protected] <javascript:;>
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l