On Tue, 2008-09-16 at 15:49 -0400, Daniel Gruner wrote:
> Hi Hugh,
> 
> I am still having some weird problems with moab/torque on my test xcpu
> cluster.  I mentioned some of these in a previous e-mail, but the
> query went unanswered, and since you wrote the script(s) perhaps you
> could help me debug this?  Here is the issue:
> 
> I have 2 compute nodes, each with 2 cpus.  I submit several jobs to
> the queue using qsub:
> 
> [EMAIL PROTECTED] xcpu]$ xstat
> n0000   tcp!10.10.0.10!6667     /Linux/x86_64   up      0
> n0001   tcp!10.10.0.11!6667     /Linux/x86_64   up      0
> 
> [EMAIL PROTECTED] xcpu]$ showq
> 
> active jobs------------------------
> JOBID                     USERNAME      STATE PROCS   REMAINING            STARTTIME
> 
> 25.dgk3.chem.utoronto.ca     danny    Running     1    00:58:30  Mon Sep 15 10:11:07
> 26.dgk3.chem.utoronto.ca     danny    Running     1    00:58:30  Mon Sep 15 10:11:07
> 27.dgk3.chem.utoronto.ca     danny    Running     1    00:58:30  Mon Sep 15 10:11:07
> 28.dgk3.chem.utoronto.ca     danny    Running     1    00:58:30  Mon Sep 15 10:11:07
> 
> 4 active jobs               4 of 4 processors in use by local jobs (100.00%)
>                             2 of 2 nodes active      (100.00%)
> 
> eligible jobs----------------------
> JOBID              USERNAME      STATE PROCS     WCLIMIT            QUEUETIME
> 
> 
> 0 eligible jobs
> 
> blocked jobs-----------------------
> JOBID              USERNAME      STATE PROCS     WCLIMIT            QUEUETIME
> 
> 
> 0 blocked jobs
> 
> Total jobs:  4
> 
> The job script is:
> #!/bin/bash
> #PBS -l nodes=1
          ^^^^^^^
Isn't this supposed to mean "run date on 1 node"?

> #XCPU -p
> 
> date
> 
> 
> The weird thing is that all the jobs end up being executed on node
> n0000, as per the output:
> 
> [EMAIL PROTECTED] xcpu]$ cat script.cmd.o25
> n0000: Mon Sep 15 10:11:42 UTC 2008
> [EMAIL PROTECTED] xcpu]$ cat script.cmd.o26
> n0000: Mon Sep 15 10:11:26 UTC 2008
> [EMAIL PROTECTED] xcpu]$ cat script.cmd.o27
> n0000: Mon Sep 15 10:13:03 UTC 2008
> [EMAIL PROTECTED] xcpu]$ cat script.cmd.o28
> n0000: Mon Sep 15 10:12:02 UTC 2008
> 
[snip]
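
To check the node placement across more than a handful of jobs, something like the following might help. It is only a sketch: it assumes the output files are named script.cmd.o<jobid> and that the first line of each starts with the node name followed by a colon, as in the output quoted above. The function name tally_nodes is made up for illustration and is not part of xcpu, torque, or moab.

```shell
#!/bin/bash
# Hypothetical helper: count how many job output files report each node.
# Assumption: the first line of every file looks like "n0000: <date output>".
tally_nodes() {
    # FNR == 1 selects the first line of each file; $1 is the text before ":".
    awk -F: 'FNR == 1 { count[$1]++ }
             END { for (node in count) print node, count[node] }' "$@" | sort
}

# Run on any files passed on the command line,
# e.g. ./tally.sh script.cmd.o*
if [ "$#" -gt 0 ]; then
    tally_nodes "$@"
fi
```

Each output line is a node name followed by the number of jobs whose output came from it, so a cluster that spreads jobs evenly should show roughly equal counts per node.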
