Dear users and developers,

I ran my job through a SLURM job file on a remote server (2 nodes/64 cores). Everything went fine up to the DOS step, but when I ran "x optic -p" through the job file, the following message appeared:

[1] 1371
ssh: connect to host nid01855 port 204: Connection refused^M
[1] + Exit 255 ( $remote $machine[$p] "cd $PWD;$t $taskset0 $exe ${def}_${loop}.def;rm -f .lock_$lockfile[$p]" ) >> .timeop_$loop
[1] 1375
ssh: connect to host nid01855 port 204: Connection refused^M
[1] + Exit 255 ( $remote $machine[$p] "cd $PWD;$t $taskset0 $exe ${def}_${loop}.def;rm -f .lock_$lockfile[$p]" ) >> .timeop_$loop
[1] 1379
ssh: connect to host nid01855 port 204: Connection refused^M
[1] + Exit 255 ( $remote $machine[$p] "cd $PWD;$t $taskset0 $exe ${def}_${loop}.def;rm -f .lock_$lockfile[$p]" ) >> .timeop_$loop
*** OPTIC crashed!*
0.840u 1.800s 1:50.21 2.3% 0+0k 82495+1135io 4pf+0w
error: command /usr/common/software/wien2k-ccm/14.2/opticpara optic.def failed
...............

I searched the list and found a couple of related threads, but none of them resolved this error. Please have a look. The same job completed successfully on a local two-CPU cluster (4 GB RAM each).

The job file was:

--------------------------------------------------------
#!/bin/bash -l
#SBATCH -N 2
#SBATCH -n 64
#SBATCH -t 00:20:00
#SBATCH -p regular
#SBATCH -J orthorhombic_1
#SBATCH --ccm
#module load wien2k-ccm

#generating .machines file for k-point and mpi parallel lapw1/2
let ntasks_per_kgroup=1
gen.machines -m $ntasks_per_kgroup

#need to disable SLURM envs hereafter
unset `env|grep SLURM_|awk -F= '{print $1}'`

#put your Wien2k command here
x optic -p

#remove leftover .machines file
rm -fr .machine
--------------------------------------------------------

Regards,
Bhamu
_______________________________________________
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at: http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html