Common errors

From HPC
Revision as of 13:12, 15 March 2017 by Cwmoller (talk | contribs) (Submit script format)

SSH keys

When your job completes, any output that it generated is copied to your work directory on the head node via SSH. If you don't have SSH keys set up, it can't copy the output without your password (which it never has, so it always fails). You'll get an email similar to the one below.

PBS Job Id: 26
Job Name:   myJOB
Exec host:  comp028/38+comp028/37+comp028/36+comp028/35+comp028/34+comp028/33+comp028/32+comp028/31+comp028/30+comp028/29+comp028/28+comp028/27+comp028/10+comp028/9+comp028/8
An error has occurred processing your job, see below.
Post job file processing error; job 26 on host comp028/38+comp028/37+comp028/36+comp028/35+comp028/34+comp028/33+comp028/32+comp028/31+comp028/30+comp028/29+comp028/28+comp028/27+comp028/10+comp028/9+comp028/8

Unable to copy file /var/spool/torque/spool/26.OU to username@launch:/export/home/username/out
*** error from copy
Permission denied (publickey,keyboard-interactive).

lost connection
*** end error output
Output retained on that host in: /var/spool/torque/undelivered/26.OU

Unable to copy file /var/spool/torque/spool/26.ER to username@launch:/export/home/username/err
*** error from copy
Permission denied (publickey,keyboard-interactive).

lost connection
*** end error output
Output retained on that host in: /var/spool/torque/undelivered/26.ER

To fix this, create a set of SSH keys.

$ ssh-keygen -t dsa

Accept all the defaults; they're fine.

Then, have your own account trust your own keys. This will allow you to SSH from yourself to yourself without a password, no matter which node you're on.

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
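
If you run the step above more than once, the same key can end up in authorized_keys several times, and loose file permissions can stop the key from being accepted. The following is a minimal sketch of a safer version, not part of the cluster's tooling: the function name trust_own_key is hypothetical, and it assumes the public key is at ~/.ssh/id_dsa.pub, which is the default location for a DSA key created with the defaults.

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already present, then tighten permissions.
trust_own_key() {
    pub="$1"     # public key file, e.g. ~/.ssh/id_dsa.pub (assumed default)
    auth="$2"    # authorized_keys file, e.g. ~/.ssh/authorized_keys
    [ -r "$pub" ] || { echo "no public key at $pub" >&2; return 1; }
    touch "$auth"
    # -F fixed strings, -x whole-line match, -q quiet, -f read patterns from file
    if ! grep -qxFf "$pub" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    # sshd usually refuses keys kept in files that other users can write to
    chmod 700 "$(dirname "$auth")"
    chmod 600 "$auth"
}
```

It could be called as trust_own_key ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys. Running it a second time leaves the file unchanged.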

Each machine you want to SSH to without a password must then be added to the ~/.ssh/known_hosts file by connecting to it manually once and accepting its host key.

$ ssh launch
$ ssh launch.hpc
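
To confirm that the setup works before submitting a job, you can ask ssh to fail rather than prompt. This is a rough sketch under the assumption that the host names used above (launch, launch.hpc) are the ones your jobs copy output to; the function name check_passwordless is hypothetical.

```shell
# Hypothetical helper: report whether non-interactive login to each host works.
check_passwordless() {
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password
        # or for confirmation of an unknown host key.
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
        fi
    done
}
```

For example, check_passwordless launch launch.hpc prints one line per host. A FAILED line suggests that either the key is not trusted or the host is not yet in known_hosts.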

Submit script format

Files created on Windows machines usually end each line with a carriage return followed by a line feed (CRLF), whereas Linux uses a line feed alone. The extra carriage return (displayed as ^M) is unprintable and may be misinterpreted by Linux command interpreters (shells). If your submit script is Windows formatted, you will get the following error when trying to submit it:

qsub:  script is written in DOS/Windows text format

or this error when it tries to execute:

/bin/bash^M: bad interpreter: No such file or directory

If this happens, use the dos2unix utility to convert the text file from DOS/Windows formatting to Linux formatting.

$ dos2unix myscript.sub
dos2unix: converting file myscript.sub to UNIX format ...
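
If dos2unix is not available, the same check and conversion can be approximated with standard tools. The sketch below is one possible approach, not what dos2unix itself does internally; the function names has_crlf and strip_cr are hypothetical, and strip_cr removes every carriage return in the file, not only those at line ends.

```shell
# Hypothetical helper: succeed if the file contains any carriage return.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Hypothetical helper: delete all carriage returns from the file in place.
strip_cr() {
    tmp=$(mktemp) || return 1
    if tr -d '\r' < "$1" > "$tmp"; then
        # Write back into the original file so its permissions are kept
        cat "$tmp" > "$1"
    fi
    rm -f "$tmp"
}
```

A possible use before submitting: has_crlf myscript.sub && strip_cr myscript.sub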