Difference between revisions of "Main Page"

From HPC
m (Rhasatsha High Performance Computer)
m (Specifications)
 
(64 intermediate revisions by 3 users not shown)
Line 1: Line 1:
= Rhasatsha High Performance Computer =
+
The University of Stellenbosch hosts multiple HPCs (High Performance Computing clusters). This wiki provides information on the two largest systems, HPC1 and HPC2.
  
'''rhasatsha''': (the rha is pronounced as gaan in Afrikaans)
+
= HPC1 (also known as Rhasatsha) =
a clever person/object; highly intelligent; something that acts
 
promptly; a wide awake person/object who/that is always on the spot; a
 
versatile person/object that can tackle anything successfully.
 
  
The cluster currently has the following [[#Specifications|specifications]].
+
HPC1 is available to all users registered on campus. In essence, if you have a network login, you can use this HPC.
 +
 
 +
All users are granted 1000 CPU hours to test the system and determine its usefulness. Once the 1000 hour quota is depleted, users are required to pay a registration fee to gain unlimited access.
 +
 
 +
Free users are granted 1000 CPU hours and a 10GB disk quota. Paid users are granted unlimited CPU and a 1TB disk quota.
 +
 
 +
See [[HOWTO register]] for details on how to register.
  
 
== General information ==
 
  
Feel free to contact [mailto:cwmoller@sun.ac.za Charl Möller] (x9490) with any queries regarding the cluster.
+
Please direct enquiries to help@sun.ac.za.
  
 
* [[HOWTO register]]
 
Line 20: Line 23:
 
* [[Common errors]]
 
  
* [[Getting Started]]
+
'''rhasatsha''': (the rha is pronounced as gaan in Afrikaans) a clever person/object; highly intelligent; something that acts promptly; a wide awake person/object who/that is always on the spot; a versatile person/object that can tackle anything successfully.
  
 
=== Monitoring tools ===
 
  
* [http://hpc1.sun.ac.za:40253/ganglia/ Ganglia Cluster Monitor] (only available on campus)
+
* [https://hpc1-manager.sun.ac.za/ganglia/ Ganglia cluster monitor] (only available on campus)
* [https://hpc1.sun.ac.za:8444 XDMoD Portal] (only available on campus)
+
* [https://hpc1-manager.sun.ac.za/munin/ Munin health monitor] (only available on campus)
 +
* [https://hpc1-manager.sun.ac.za:8444/ XDMoD usage explorer] (only available on campus)
  
 
=== Specifications ===
 
  
 
The HPC currently has the following compute specifications:
 
* 20x 8-core Intel Xeon E5440 with 16GB RAM
+
* 11x 8-core Intel Xeon E5440 @ 2.83GHz with 16GB RAM
* 1x 24-core Intel Xeon X5650 with 24GB RAM
+
* 17x 48-core AMD Opteron 6172 @ 2.10GHz with 96GB RAM, Infiniband interconnect
* 1x 16-core Intel Xeon X5550 with 48GB RAM, dual NVIDIA GT200GL
+
* 2x 64-core AMD Opteron 6274 @ 2.20GHz with 128GB RAM, Infiniband interconnect
* 17x 48-core AMD Opteron 6172 with 96GB RAM, Infiniband interconnect
+
* 8x 8-core Intel Xeon X5450 @ 3.00GHz with 32GB RAM
* 2x 64-core AMD Opteron 6274 with 128GB RAM, Infiniband interconnect
+
* 2x 8-core Intel Xeon X5450 @ 3.00GHz with 24GB RAM
 +
* 1x 64-core AMD Opteron 6366 HE @ 1.8GHz with 128GB RAM, Infiniband interconnect
 +
* 2x 48-core Intel Xeon E5-2670 v3 @ 2.30GHz with 512GB RAM, Infiniband interconnect
 
   
 
   
The total is 1144 available cores.
+
The total is 1272 available cores. Note we have managed to increase the total core count to 2344 since this was published.
 +
 
 +
=== Job priorities ===
 +
 
 +
The HPC currently has 5 queues into which jobs are automatically divided based on walltime requested.
 +
* '''short''' - queue for jobs running up to 2 hours (#PBS -l walltime=2:00:00)
 +
* '''day''' - queue for jobs running up to 24 hours (#PBS -l walltime=24:00:00)
 +
* '''week''' - queue for jobs running up to 7 days (#PBS -l walltime=168:00:00)
 +
* '''month''' - queue for jobs running up to 31 days (#PBS -l walltime=744:00:00)
 +
* '''long''' - queue for jobs running longer than 31 days
 +
 
 +
At any given time every queue is only allowed a maximum number of cores to ensure quick jobs aren't unnecessarily blocked by long running jobs.
 +
* '''short''' - unlimited cores, highest priority
 +
* '''day''' - unlimited cores
 +
* '''week''' - maximum of 1000 cores
 +
* '''month''' - maximum of 600 cores, maximum of 20 jobs per user, maximum of 400 cores per user
 +
* '''long''' - maximum of 500 cores, maximum of 20 jobs per user, maximum of 300 cores per user
  
== HOWTOs for different programs ==
+
It is imperative that you accurately estimate your job's running time. Estimate too high, and you may find yourself in an unfavourable queue. Estimate too low and the job will be killed by the system when the walltime is reached. Once a job is running, only the administrator can increase the walltime. All jobs are required to specify a walltime.
  
[[ R ]]
+
Furthermore, all interactive jobs will be satisfied by the '''test''' queue and will be limited to a maximum of 8 cores and walltime of 24 hours.
  
 
== Acceptable usage ==
 
This system may only be used for '''bona fide academic purposes'''. Any attempt to use it for consultancy work or for any commercial purpose may result in the user being permanently banned from using the system.
 
  
This system may only be used for '''bona fide academic work'''.
+
This system may only be used for '''bona fide academic work'''. Any effort to use it for consultancy work or any other commercial purpose may lead to the permanent banning of the user from the system.
Any effort to use it for consultancy work or any other
+
 
commercial purpose may lead to the permanent
+
== Citations ==
banning of the user from the system.
+
 
 +
We require an acknowledgement in any thesis, paper, publication or presentation that references results computed on this system. In addition we would like to be able to reference these published works.
 +
 
 +
Suggested form of acknowledgement:
 +
<pre>
 +
Computations were performed using the University of Stellenbosch's HPC1 (Rhasatsha): http://www.sun.ac.za/hpc
 +
</pre>
 +
 
 +
= HPC2 =
 +
 
 +
HPC2 is only available to registered users.
 +
 
 +
See [[HOWTO register]] for details on how to register.
 +
 
 +
== General information ==
 +
 
 +
Feel free to contact [mailto:gerhardv@sun.ac.za Gerhard Van Wageningen] (x4554) with any queries regarding the cluster.
 +
 
 +
* [[HOWTO register]]
 +
* [[HOWTO login]]
 +
* [[HOWTO submit jobs]]
 +
* [[HOWTO check up on jobs]]
 +
* [[Useful commands]]
 +
 
 +
* [[Common errors]]
 +
 
 +
=== Monitoring tools ===
 +
 
 +
Not currently available
 +
 
 +
=== Specifications ===
 +
 
 +
The HPC currently has the following compute specifications:
 +
* 1x 80-core Intel Xeon E7-4850 @ 2.00GHz with 1024GB RAM, Infiniband interconnect
 +
* 3x 48-core Intel Xeon E5-2650 v4 @ 2.20GHz with 512GB RAM, Infiniband interconnect
 +
* 2x 64-core AMD Opteron 6274 @ 2.20GHz with 128GB RAM, Infiniband interconnect
 +
* 2x 24-core Intel Xeon X5650 @ 2.67GHz with 48GB RAM, Infiniband interconnect
 +
* 3x 16-core Intel Xeon E5530 @ 2.40GHz with 24GB RAM
 +
* 1x Dell R910 80-core Intel Xeon 4850 @ 2.0GHz 1024GB RAM, Infiniband
 +
* 4x Dell R730 48-core Intel Xeon 2650 @ 2.2GHz 256GB, 504GB, 504GB, 756GB RAM, Infiniband interconnect
 +
* 1x Dell R740 72-core Intel Xeon 6254 @ 3.1GHz 1.5TB RAM, Infiniband interconnect
 +
* 1x Dell R640 72-core Intel Xeon 6254 @ 3.1GHz 1.5TB RAM, Infiniband interconnect
 +
 
 +
The total is 672 available cores.
 +
 
 +
=== Job priorities ===
 +
 
 +
The HPC currently has 5 general CPU queues into which jobs are automatically divided based on walltime requested.
 +
* '''short''' - queue for jobs running up to 2 hours (#PBS -l walltime=2:00:00)
 +
* '''day''' - queue for jobs running up to 24 hours (#PBS -l walltime=24:00:00)
 +
* '''week''' - queue for jobs running up to 7 days (#PBS -l walltime=168:00:00)
 +
* '''month''' - queue for jobs running up to 31 days (#PBS -l walltime=744:00:00)
 +
* '''long''' - queue for jobs running longer than 31 days
 +
 
 +
At any given time every queue is only allowed a maximum number of cores to ensure quick jobs aren't unnecessarily blocked by long running jobs.
 +
* '''short''' - unlimited, highest priority
 +
* '''day''' - unlimited
 +
* '''week''' - 450 cores, burstable to 500 if cluster is idle
 +
* '''month''' - 200 cores (burstable to 300), maximum of 3 jobs per user (burstable to 5), maximum of 100 cores per user (burstable to 200)
 +
* '''long''' - 100 cores (burstable to 200), maximum of 3 jobs per user (burstable to 5), maximum of 50 cores per user (burstable to 100)
 +
 
 +
It is imperative that you accurately estimate your job's running time. Estimate too high, and you may find yourself in an unfavourable queue. Estimate too low and the job will be killed by the system when the walltime is reached. Once a job is running, only the administrator can increase the walltime.
 +
 
 +
Any job that does not specify a walltime will be assigned a default of '''5 minutes'''.
 +
 
 +
 
 +
== Citations ==
 +
 
 +
We require an acknowledgement in any thesis, paper, publication or presentation that references results computed on this system. In addition we would like to be able to reference these published works.
 +
 
 +
Suggested form of acknowledgement:
 +
<pre>
 +
Computations were performed using the University of Stellenbosch's HPC2: http://www.sun.ac.za/hpc
 +
</pre>

Latest revision as of 15:49, 10 May 2023

The University of Stellenbosch hosts multiple HPCs (High Performance Computing clusters). This wiki provides information on the two largest systems, HPC1 and HPC2.

HPC1 (also known as Rhasatsha)

HPC1 is available to all users registered on campus. In essence, if you have a network login, you can use this HPC.

All users are granted 1000 CPU hours to test the system and determine its usefulness. Once the 1000 hour quota is depleted, users are required to pay a registration fee to gain unlimited access.

Free users are granted 1000 CPU hours and a 10GB disk quota. Paid users are granted unlimited CPU and a 1TB disk quota.

See HOWTO register for details on how to register.

General information

Please direct enquiries to help@sun.ac.za.

rhasatsha: (the rha is pronounced as gaan in Afrikaans) a clever person/object; highly intelligent; something that acts promptly; a wide awake person/object who/that is always on the spot; a versatile person/object that can tackle anything successfully.

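Monitoring tools

  • Ganglia cluster monitor: https://hpc1-manager.sun.ac.za/ganglia/ (only available on campus)
  • Munin health monitor: https://hpc1-manager.sun.ac.za/munin/ (only available on campus)
  • XDMoD usage explorer: https://hpc1-manager.sun.ac.za:8444/ (only available on campus)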

Specifications

The HPC currently has the following compute specifications:

  • 11x 8-core Intel Xeon E5440 @ 2.83GHz with 16GB RAM
  • 17x 48-core AMD Opteron 6172 @ 2.10GHz with 96GB RAM, Infiniband interconnect
  • 2x 64-core AMD Opteron 6274 @ 2.20GHz with 128GB RAM, Infiniband interconnect
  • 8x 8-core Intel Xeon X5450 @ 3.00GHz with 32GB RAM
  • 2x 8-core Intel Xeon X5450 @ 3.00GHz with 24GB RAM
  • 1x 64-core AMD Opteron 6366 HE @ 1.8GHz with 128GB RAM, Infiniband interconnect
  • 2x 48-core Intel Xeon E5-2670 v3 @ 2.30GHz with 512GB RAM, Infiniband interconnect

The total is 1272 available cores. Note that the total core count has since been increased to 2344; the list above has not yet been updated to reflect this.
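The 1272 figure can be reproduced from the node list above. The following is a small sketch in shell; the helper name total_cores and the "nodes:cores" notation are made up for this illustration and are not part of any cluster tooling.

```shell
# Sum nodes * cores_per_node over a list of "nodes:cores" pairs.
total_cores() {
  total=0
  for entry in "$@"; do
    # ${entry%%:*} is the node count, ${entry##*:} the cores per node.
    total=$(( $total + ${entry%%:*} * ${entry##*:} ))
  done
  echo "$total"
}

# Node groups copied from the HPC1 specification list above.
total_cores 11:8 17:48 2:64 8:8 2:8 1:64 2:48   # prints 1272
```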

Job priorities

The HPC currently has 5 queues into which jobs are automatically divided based on walltime requested.

  • short - queue for jobs running up to 2 hours (#PBS -l walltime=2:00:00)
  • day - queue for jobs running up to 24 hours (#PBS -l walltime=24:00:00)
  • week - queue for jobs running up to 7 days (#PBS -l walltime=168:00:00)
  • month - queue for jobs running up to 31 days (#PBS -l walltime=744:00:00)
  • long - queue for jobs running longer than 31 days
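The scheduler assigns the queue itself, but the thresholds listed above can be sketched as a small shell function to see where a job would probably land. The function name queue_for_hours is hypothetical, it only handles whole hours, and it assumes each limit is inclusive ("up to").

```shell
# Hypothetical helper: map a requested walltime in whole hours to the
# queue described in the list above. Limits are assumed to be inclusive.
queue_for_hours() {
  hours=$1
  if   [ "$hours" -le 2 ];   then echo short
  elif [ "$hours" -le 24 ];  then echo day
  elif [ "$hours" -le 168 ]; then echo week
  elif [ "$hours" -le 744 ]; then echo month
  else                            echo long
  fi
}

queue_for_hours 36    # prints week
queue_for_hours 745   # prints long
```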

At any given time, each queue is limited to a maximum number of cores so that quick jobs are not blocked unnecessarily by long-running jobs.

  • short - unlimited cores, highest priority
  • day - unlimited cores
  • week - maximum of 1000 cores
  • month - maximum of 600 cores, maximum of 20 jobs per user, maximum of 400 cores per user
  • long - maximum of 500 cores, maximum of 20 jobs per user, maximum of 300 cores per user
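As an illustration of the per-user limits above, the sketch below checks whether one more job would still fit within them. It is not a cluster tool: the function name and its arguments are hypothetical, the current usage figures must be supplied by the user, and the queue-wide core limits are not checked.

```shell
# Hypothetical pre-submission check against the HPC1 per-user limits
# listed above for the month and long queues.
# Usage: fits_user_limits QUEUE CORES_IN_USE JOBS_IN_USE NEW_JOB_CORES
fits_user_limits() {
  case $1 in
    month) max_cores=400; max_jobs=20 ;;
    long)  max_cores=300; max_jobs=20 ;;
    *)     return 0 ;;   # no per-user limit is listed for the other queues
  esac
  # Both the core total and the job count must stay within the limits.
  [ $(( $2 + $4 )) -le "$max_cores" ] && [ $(( $3 + 1 )) -le "$max_jobs" ]
}

# A user with 352 cores and 5 jobs in the month queue adds a 48-core job.
if fits_user_limits month 352 5 48; then
  echo "within limits"
else
  echo "would exceed a per-user limit"
fi
# prints: within limits
```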

It is imperative that you accurately estimate your job's running time. Estimate too high, and you may find yourself in an unfavourable queue. Estimate too low and the job will be killed by the system when the walltime is reached. Once a job is running, only the administrator can increase the walltime. All jobs are required to specify a walltime.
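A minimal job script that sets a walltime might look like the sketch below. It assumes a PBS/Torque-style scheduler, as the #PBS directives above suggest. The job name, the 24 hour estimate and the placeholder workload are illustrative assumptions; see the HOWTO pages for the options this cluster actually expects.

```shell
#!/bin/bash
#PBS -N example_job
#PBS -l walltime=24:00:00
# The directives above set a job name (assumed) and request 24 hours of
# walltime, which would place the job in the "day" queue.

# PBS sets PBS_O_WORKDIR to the directory the job was submitted from.
# Fall back to the current directory so the script also runs by hand.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

echo "Job started on $(uname -n) in $workdir"
# Placeholder: replace this line with the real workload.
echo "workload goes here"
```

On PBS-style systems a script like this is typically submitted with qsub. If the job looks likely to overrun, contact the administrator before the walltime is reached.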

Furthermore, all interactive jobs are served by the test queue and are limited to a maximum of 8 cores and a walltime of 24 hours.

Acceptable usage

This system may only be used for bona fide academic work. Any effort to use it for consultancy work or any other commercial purpose may lead to the permanent banning of the user from the system.

Citations

We require an acknowledgement in any thesis, paper, publication or presentation that references results computed on this system. In addition we would like to be able to reference these published works.

Suggested form of acknowledgement:

Computations were performed using the University of Stellenbosch's HPC1 (Rhasatsha): http://www.sun.ac.za/hpc

HPC2

HPC2 is only available to registered users.

See HOWTO register for details on how to register.

General information

Feel free to contact Gerhard Van Wageningen (gerhardv@sun.ac.za, x4554) with any queries regarding the cluster.

Monitoring tools

Not currently available

Specifications

The HPC currently has the following compute specifications:

  • 1x 80-core Intel Xeon E7-4850 @ 2.00GHz with 1024GB RAM, Infiniband interconnect
  • 3x 48-core Intel Xeon E5-2650 v4 @ 2.20GHz with 512GB RAM, Infiniband interconnect
  • 2x 64-core AMD Opteron 6274 @ 2.20GHz with 128GB RAM, Infiniband interconnect
  • 2x 24-core Intel Xeon X5650 @ 2.67GHz with 48GB RAM, Infiniband interconnect
  • 3x 16-core Intel Xeon E5530 @ 2.40GHz with 24GB RAM
  • 1x Dell R910 80-core Intel Xeon 4850 @ 2.0GHz with 1024GB RAM, Infiniband interconnect
  • 4x Dell R730 48-core Intel Xeon 2650 @ 2.2GHz with 256GB, 504GB, 504GB and 756GB RAM respectively, Infiniband interconnect
  • 1x Dell R740 72-core Intel Xeon 6254 @ 3.1GHz with 1.5TB RAM, Infiniband interconnect
  • 1x Dell R640 72-core Intel Xeon 6254 @ 3.1GHz with 1.5TB RAM, Infiniband interconnect

The total is 672 available cores.

Job priorities

The HPC currently has 5 general CPU queues into which jobs are automatically divided based on walltime requested.

  • short - queue for jobs running up to 2 hours (#PBS -l walltime=2:00:00)
  • day - queue for jobs running up to 24 hours (#PBS -l walltime=24:00:00)
  • week - queue for jobs running up to 7 days (#PBS -l walltime=168:00:00)
  • month - queue for jobs running up to 31 days (#PBS -l walltime=744:00:00)
  • long - queue for jobs running longer than 31 days
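The walltime values in the list above are whole days expressed in hours. A short sketch of the conversion follows; the helper name days_to_walltime is hypothetical.

```shell
# Convert a number of whole days to an HH:MM:SS walltime string.
days_to_walltime() {
  echo "$(( $1 * 24 )):00:00"
}

days_to_walltime 7    # prints 168:00:00
days_to_walltime 31   # prints 744:00:00
```

The result can be pasted after "#PBS -l walltime=" in a job script.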

At any given time, each queue is limited to a maximum number of cores so that quick jobs are not blocked unnecessarily by long-running jobs.

  • short - unlimited, highest priority
  • day - unlimited
  • week - 450 cores, burstable to 500 if cluster is idle
  • month - 200 cores (burstable to 300), maximum of 3 jobs per user (burstable to 5), maximum of 100 cores per user (burstable to 200)
  • long - 100 cores (burstable to 200), maximum of 3 jobs per user (burstable to 5), maximum of 50 cores per user (burstable to 100)

It is imperative that you accurately estimate your job's running time. Estimate too high, and you may find yourself in an unfavourable queue. Estimate too low and the job will be killed by the system when the walltime is reached. Once a job is running, only the administrator can increase the walltime.

Any job that does not specify a walltime will be assigned a default of 5 minutes.
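Because a missing walltime silently becomes 5 minutes, it can help to check a job script before submitting it. The sketch below is one possible check, assuming the walltime is set on a "#PBS -l" line inside the script rather than on the command line. The function name has_walltime is hypothetical.

```shell
# Succeed if the given job script has a "#PBS -l" line that sets walltime.
has_walltime() {
  grep -Eq '^#PBS[[:space:]]+-l[[:space:]].*walltime=' "$1"
}

# Demonstration with a throwaway script.
script=$(mktemp)
printf '#!/bin/bash\n#PBS -l walltime=2:00:00\necho hello\n' > "$script"
if has_walltime "$script"; then
  echo "walltime set"
else
  echo "no walltime: the job would get the 5 minute default"
fi
rm -f "$script"
# prints: walltime set
```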


Citations

We require an acknowledgement in any thesis, paper, publication or presentation that references results computed on this system. In addition we would like to be able to reference these published works.

Suggested form of acknowledgement:

Computations were performed using the University of Stellenbosch's HPC2: http://www.sun.ac.za/hpc