Hi,
We're swamped with other work and Abhishek has gone back to school.
Sean
Roger Mason wrote:
> Hi Daniel,
>
> "Daniel Gruner" <[email protected]> writes:
>
>> I haven't seen any postings on the list for about a month - well,
>> except for my own. What is going on with the xcpu project?
>>
>> I've reported a bunch of bugs, specifically with bjs, but never heard
>> anything from anybody (Abhishek?). I've gone "production" (with
>> trepidation) with my cluster, and these bugs are quite a nuisance.
>> Furthermore, yesterday two of my nodes crashed - in fact appeared to
>> be completely powered off - while running stuff. No warnings or
>> apparent problems, but I am still investigating. One of the worst
>> issues with that is that bjs ends up in a weird state, hanging up
>> rather than reporting that there are two fewer nodes available, and it
>> does not recover until xstat shows that all the nodes are back online.
>>
>> Then there is the MPI problem...
>
> I'm still here. Not much use to you I know.
> I haven't had time to do any further debugging because term started and
> I'm running to keep up. Soon, I hope.
>
> Cheers,
> Roger
>