Common errors
SSH keys
When your job completes, any output it generated is copied to your work directory on the head node via SSH. If you don't have SSH keys set up, the copy always fails, because the job has no way to supply your password. You'll get an email similar to the one below.
PBS Job Id: 26
Job Name: myJOB
Exec host: comp028/38+comp028/37+comp028/36+comp028/35+comp028/34+comp028/33+comp028/32+comp028/31+comp028/30+comp028/29+comp028/28+comp028/27+comp028/10+comp028/9+comp028/8
An error has occurred processing your job, see below.
Post job file processing error; job 26 on host comp028/38+comp028/37+comp028/36+comp028/35+comp028/34+comp028/33+comp028/32+comp028/31+comp028/30+comp028/29+comp028/28+comp028/27+comp028/10+comp028/9+comp028/8
Unable to copy file /var/spool/torque/spool/26.OU to username@launch:/export/home/username/out
*** error from copy
Permission denied (publickey,keyboard-interactive).
lost connection
*** end error output
Output retained on that host in: /var/spool/torque/undelivered/26.OU
Unable to copy file /var/spool/torque/spool/26.ER to username@launch:/export/home/username/err
*** error from copy
Permission denied (publickey,keyboard-interactive).
lost connection
*** end error output
Output retained on that host in: /var/spool/torque/undelivered/26.ER
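To fix this, create a set of SSH keys. DSA keys are disabled by default in current OpenSSH releases, so generate an RSA key.
$ ssh-keygen -t rsa
Accept all the defaults, including the empty passphrase. A key protected by a passphrase would need the passphrase typed in, which the job cannot do.
Then have your own account trust your own key. This allows you to SSH from yourself to yourself without a password, no matter which node you're on. Use >> rather than > so that any keys already in the file are kept.
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys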
Finally, connect manually once to each machine you want to reach without a password, so that its host key is added to your ~/.ssh/known_hosts file. Answer yes when asked whether to accept the host key.
$ ssh launch
$ ssh launch.hpc
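The key-trust step above can be made safe to repeat. The following is a minimal sketch, assuming bash and a standard OpenSSH layout: it appends a public key to an authorized_keys file only if that exact key is not already listed. The function name trust_key is a hypothetical helper, not an existing tool on the cluster.

```shell
#!/bin/bash
# trust_key PUBKEY [AUTHFILE]
# Append PUBKEY to AUTHFILE unless that exact line is already present.
# trust_key is an illustrative name, not part of OpenSSH or Torque.
trust_key() {
    local pubkey="$1"
    local authfile="${2:-$HOME/.ssh/authorized_keys}"

    [ -r "$pubkey" ] || { echo "cannot read $pubkey" >&2; return 1; }

    # Create the file if needed and restrict it to the owner; sshd
    # typically rejects an authorized_keys file that others can write to.
    touch "$authfile"
    chmod 600 "$authfile"

    # -x: match whole lines, -F: fixed strings, -f: read patterns from file.
    if grep -qxF -f "$pubkey" "$authfile"; then
        echo "already trusted"
    else
        cat "$pubkey" >> "$authfile"
        echo "added"
    fi
}
```

Call it with the .pub file that ssh-keygen created, for example trust_key ~/.ssh/<your key>.pub. Once the host keys have been accepted, running ssh -o BatchMode=yes launch true is one way to confirm the setup: BatchMode disables password prompts, so the command fails instead of asking if the key is not being used.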
Submit script format
Files created on Windows machines usually end each line with a carriage return followed by a line feed (CRLF). Linux command interpreters (shells) expect only the line feed, so the extra carriage return, displayed as ^M, is treated as part of the line and may be misinterpreted. If your submit script is in Windows format, you will get the following error when you try to submit it:
qsub: script is written in DOS/Windows text format
or this error when the job tries to execute:
/bin/bash^M: bad interpreter: No such file or directory
If this happens, use the dos2unix utility to convert the text file from DOS/Windows formatting to Linux formatting.
$ dos2unix myscript.sub
dos2unix: converting file myscript.sub to UNIX format ...
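If dos2unix is not installed on the machine where you edit your script, the same check and conversion can be approximated with standard tools. This is a minimal sketch, assuming bash; the function names has_crlf and strip_cr are hypothetical helpers introduced here for illustration.

```shell
#!/bin/bash
# has_crlf FILE: succeed if FILE contains a carriage return character.
has_crlf() {
    grep -q $'\r' "$1"
}

# strip_cr FILE: remove a carriage return at the end of each line, in place.
# The result is written back with cat so the file keeps its permissions.
strip_cr() {
    local tmp
    tmp=$(mktemp) || return 1
    sed $'s/\r$//' "$1" > "$tmp" && cat "$tmp" > "$1"
    local status=$?
    rm -f "$tmp"
    return $status
}
```

For example, has_crlf myscript.sub && strip_cr myscript.sub converts the script only when it needs converting, which is a cheap check to run before qsub.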