As the comparison at [1] shows, different cluster systems use different
commands for job submission.
Your post does not say how the job was submitted. Did you use something
like the example at [2]:
$ sbatch MyJobScript.sh
*What command creates your .machines file?*
In your MyJobScript.sh below, I'm not seeing any lines that create a
.machines file.
MyJobScript.sh
--------------------------------------------------------------------------------------------------------
#!/bin/sh
#SBATCH -J test #job name
#SBATCH -p 44core #partition name
#SBATCH -N 1 #node
#SBATCH -n 18 #core
#SBATCH -o %x.o%j
#SBATCH -e %x.e%j
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so #Do not change here!!
srun ~/soft/qe66/bin/pw.x < case.in > case.out
--------------------------------------------------------------------------------------------------------
The job files available in the FAQs are not working. They give me only
.machine0, .machines, and .machines_current files,
where .machines contains only # and the other two are empty.
The Slurm documentation at [3] describes an environment variable that
holds the list of nodes allocated to the job. A job script can use it to
write the .machines file on the fly:
SLURM_JOB_NODELIST (and SLURM_NODELIST for backwards compatibility)
Your MyJobScript.sh does not use this variable, whereas other job
scripts found on the Internet do, for example [4-7].
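As a rough illustration of the idea, and not a tested job script for your
cluster, the sketch below expands SLURM_JOB_NODELIST with
"scontrol show hostnames" and writes one k-point parallel line per job
into .machines. The helper name write_machines is hypothetical, and the
choice of SLURM_NTASKS_PER_NODE as the number of jobs per host is an
assumption; check the .machines section of the UG for the layout your
calculation actually needs.

```shell
#!/bin/sh
# Sketch only: write a WIEN2k .machines file (k-point parallel) from a host list.
# write_machines is a hypothetical helper name, not part of WIEN2k or Slurm.
write_machines() {
    # $1 = output file, $2 = k-point jobs per host, remaining args = host names
    out=$1
    per_host=$2
    shift 2
    : > "$out"                          # create or truncate the file
    for host in "$@"; do
        i=0
        while [ "$i" -lt "$per_host" ]; do
            echo "1:$host" >> "$out"    # one k-point parallel job on this host
            i=$((i + 1))
        done
    done
    echo "granularity:1" >> "$out"
    echo "extrafine:1" >> "$out"
}

# Inside a Slurm job, expand the compact node list (e.g. node[09-10]) into
# one host name per line and build .machines from it.
if [ -n "${SLURM_JOB_NODELIST:-}" ] && command -v scontrol >/dev/null 2>&1; then
    hosts=$(scontrol show hostnames "$SLURM_JOB_NODELIST")
    # Assumption: SLURM_NTASKS_PER_NODE is set only when --ntasks-per-node
    # is given, so fall back to one job per host.
    # $hosts is left unquoted on purpose so each host becomes one argument.
    write_machines .machines "${SLURM_NTASKS_PER_NODE:-1}" $hosts
fi
```

Outside a Slurm job the guarded part is skipped, so the helper can be tried
by hand, for example: write_machines .machines 2 node09 node10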
[1] https://slurm.schedmd.com/rosetta.pdf
[2] https://hpc-uit.readthedocs.io/en/latest/jobs/examples.html
[3] https://slurm.schedmd.com/sbatch.html
[4] https://itp.uni-frankfurt.de/wiki-it/index.php/Wien2k
[5] https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg15511.html
[6] https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg07097.html
[7] https://www.nsc.liu.se/software/installed/tetralith/wien2k/
On 11/13/2020 3:37 AM, Laurence Marks wrote:
N.B., example mid-term questions:
1. What SBATCH command will give you 3 nodes?
2. What command creates your .machines file?
3. What are your fastest and slowest nodes?
4. Which nodes have the best communications?
N.B., please don't post your answers -- just understand!
_____
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what
nobody else has thought", Albert Szent-Gyorgi
www.numis.northwestern.edu <http://www.numis.northwestern.edu>
On Fri, Nov 13, 2020, 04:21 Laurence Marks <laurence.ma...@gmail.com
<mailto:laurence.ma...@gmail.com>> wrote:
Much of what you are requesting is problem/cluster specific, so
there is no magic answer -- it will vary. Suggestions:
1) Read the UG sections on .machines and parallel operation.
2) Read the man page for your cluster job command (srun)
3) Reread the UG sections.
4) Read the example scripts, and understand (lookup) all the
commands so you know what they are doing.
It is really not that complicated. If you cannot master this by
yourself, I will wonder whether you are in the right profession.
On Fri, Nov 13, 2020, 03:24 Dr. K. C. Bhamu <kcbham...@gmail.com
<mailto:kcbham...@gmail.com>> wrote:
Dear All
I need your extensive help.
I have tried to provide full details that can help you
understand my requirement. In case I have missed something,
please let me know.
I am looking for a job file for our cluster. The job files
available in the FAQs are not working. They give me only
.machine0, .machines, and .machines_current files,
where .machines contains only # and the other two are empty.
The script that works fine for Quantum Espresso on the
44core partition is below:
#!/bin/sh
#SBATCH -J test #job name
#SBATCH -p 44core #partition name
#SBATCH -N 1 #node
#SBATCH -n 18 #core
#SBATCH -o %x.o%j
#SBATCH -e %x.e%j
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so #Do not change here!!
srun ~/soft/qe66/bin/pw.x < case.in > case.out
I have compiled Wien2k_19.2 on a CentOS cluster whose head node
runs Linux kernel 3.10.0-1127.19.1.el7.x86_64.
I used compilers_and_libraries_2020.2.254, fftw-3.3.8, and
libxc-4.34 for the installation.
The details of the nodes that I can use are as follows (I can
log in to these nodes with my user password):
NODELIST  NODES  PARTITION  STATE      CPUS  S:C:T   MEMORY  TMP_DISK  WEIGHT  AVAIL_FE  REASON
elpidos   1      master     idle       4     4:1:1   15787   0         1       (null)    none
node01    1      72core     allocated  72    72:1:1  515683  0         1       (null)    none
node02    1      72core     allocated  72    72:1:1  257651  0         1       (null)    none
node03    1      72core     allocated  72    72:1:1  257651  0         1       (null)    none
node09    1      44core     mixed      44    44:1:1  128650  0         1       (null)    none
node10    1      44core     mixed      44    44:1:1  128649  0         1       (null)    none
node11    1      52core*    allocated  52    52:1:1  191932  0         1       (null)    none
node12    1      52core*    allocated  52    52:1:1  191932  0         1       (null)    none
The other nodes run a mixture of kernels, as listed below.
OS=Linux 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020
OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
OS=Linux 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016
OS=Linux 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019
Your extensive help will improve my research productivity.
Thank you very much.
Regards
Bhamu
_______________________________________________
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html