Newsletter September 2004

New Support for Data Intensive Projects

APAC has approved an initiative to provide selected research groups with assistance to manage large-scale data sets of national significance.

The initiative will support a number of data-intensive projects that can use the Mass Data Storage System (MDSS) of the National Facility and provide researchers with easier access to their large-scale data sets. This is not meant to provide Merit Allocation Scheme users with more storage for their current computational projects, but is aimed at supporting projects which are fundamentally data intensive and require little if any computational resources.

APAC's support will be a grant of storage capacity on the MDSS and, to the extent possible with available technical staff, advice on the access and management of the data.

Further details and the web application are available at http://nf.apac.edu.au/policies/massdata_data_intensive.php

Call for Applications to the Merit Allocation Scheme

The next call for applications for time on the APAC machines will go out in early October. This call will be affected by APAC's current assessment of proposals from several vendors to supply a replacement for the SC. Details of this replacement are not yet available, but the new machine is expected to be installed in the early part of 2005 and, as a result, many more hours of compute time will be available for users. Details on how this will affect the call for time will be provided when the call for proposals is made.

Summer Intern positions

ANUSF is offering several Internships over the summer break (December 2004-February 2005) to students with skills relevant to our programs. Each student would be employed for around six weeks as a casual employee at ANU level 4-5.

Details on some of the possible projects can be found at: http://anusf.anu.edu.au/Summer_Internships

Candidates are encouraged to submit applications by 18th Oct, after which date we will begin to match people to available projects.

To apply, email Bob Gingold at the address below. Describe your interests (with reference to the projects on the web where possible), attach your resume, and indicate the dates during the summer period when you would be available.

Tracking job characteristics and project usage

As part of the APAC National Facility Computational Tools and Techniques program, the ANU has been developing software for tracking the characteristics of jobs that have been run on the SC and LC systems. The software, known as Scope, has a web interface that can be accessed via http://nf.apac.edu.au/facilities/scope. The login details are validated against your LC (Linux cluster) username and password. If you are unable to log in, please contact help@nf.apac.edu.au.

Scope enables a user to monitor current jobs or look up historical jobs, either by naming a job id explicitly or by combining search filters. For example, by combining software flags, special keywords in job scripts, and other resource usage flags, it is possible to select a particular subset of current or historical jobs.

Selecting a particular job produces a number of graphs. Graphs of key indicators are shown over the walltime of the job, and they include average and sampled CPU utilisation, virtual memory and physical memory use, and jobfs (job scratch) usage. In addition, graphs are produced for nodes that are either directly or indirectly influenced by the selected job. These characteristics currently include load on the node, several forms of memory usage, and paging rates. The graphs also include a summary of job turn-around times.

In addition to job characteristics, users can access graphs and other details regarding their grants, which may help people better understand the usage of the system under their connected project. We have been tracking the details of jobs since around the start of the year, and some users may be interested in looking back at the details of previous jobs.

Scope is continually being updated and features added. We have included documentation and helpful comments for most options to assist in getting the best use from the facility. For assistance, more information, or additional feature requests, please contact help@nf.apac.edu.au.

Recent software acquisitions and updates

There have been several updates to software packages on the SC and LC. These include:
  • a trial of the Accelrys DISCOVER package on the SC
  • a trial of the Computational Fluid Dynamics software CFD++
  • an update of ScaLAPACK to version 1.7 on the SC
  • installation of the Cactus framework on the LC
  • installation of the CSIRO Scientific Desktop (SCD) on both the SC and LC.

The VASP software package is available for users who already hold a valid licence themselves. If you wish to use it but do not have a valid licence, you must arrange one first. Similarly, we are currently testing an installation of the Wien2K package for current licence holders. Please contact help@nf.apac.edu.au to register your interest in having access to these installations.

Remember to set the relevant PBS software flag to ensure that the package is available when your job runs on a node. The PBS software keyword flag for each package is listed on the appropriate software web page. In the near future, batch jobs that lack a necessary software flag will not be able to run. Contact help@nf.apac.edu.au if you have any problems running jobs. All available software is listed here.
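Because jobs without the software flag described above will eventually be refused, it may help to check a job script before submitting it. The following is only a minimal sketch in Bourne shell. It assumes the flag is written as a `#PBS -l software=<keyword>` directive, and the keyword `vasp` and the demonstration job script are hypothetical; the real keyword for each package is the one given on its software web page.

```shell
#!/bin/sh
# Hypothetical job script, written to a temporary file for demonstration only.
JOBSCRIPT=$(mktemp)
cat > "$JOBSCRIPT" <<'EOF'
#!/bin/csh
#PBS -q express
#PBS -l walltime=2:00
#PBS -l software=vasp
./a.out
EOF

# Assumption: the software flag appears as "#PBS -l software=<keyword>".
# Look for such a directive at the start of a line in the job script.
if grep -q '^#PBS .*software=' "$JOBSCRIPT"; then
  STATUS="software flag found"
else
  STATUS="no software flag: add one before submitting"
fi
echo "$STATUS"
```

To check one of your own scripts, set JOBSCRIPT to its path instead of creating the temporary file.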

Parallel programming courses

Staff of the APAC National Facility give two courses: one on MPI Programming and one on Applications and Optimisation of MPI programming. Both are one-day courses which involve intensive "hands-on" exercises and provide a good introduction to both the advantages and the pitfalls of parallel programming. We are prepared to send several staff members to any partner organisation interested in hosting these courses. Also, if any research group has a number of researchers interested in learning MPI programming, we can provide these courses for a smaller group. You can register your interest in hosting a course at http://nf.apac.edu.au/training.

We can also provide an introductory course on using the National Facility, which covers details of the queuing system, filesystems and compilers, and gives a brief introduction to running parallel programs. As well as giving the formal course material, NF staff can provide user advice and support on specific problems in less formal one-on-one discussions.

APAC Summer School

APAC will be running a Summer School in Advanced Computation at the ANU from January 10 to January 21, 2005. Funding will be provided for successful students and will cover some or all of their travel and accommodation.

This course is aimed at honours level or postgraduate students in any computational research area and will cover topics from techniques for optimising code and writing parallel programs to mathematical methods for numerical linear algebra and solving PDEs. There will be extensive hands-on sessions using the APAC SC and LC and lectures on topics such as the future of grid computing.

More information and application procedures will appear soon on the ANU Mathematical Sciences Institute events web page.

MDSS news

A number of packages and other services have been installed on the MDSS system over the past few months. Recent updates and changes include:
  • installation of OPeNDAP, particularly for groups interested in oceanographic data
  • installation of the SDSC Storage Resource Broker (SRB)
  • a change to the ftp daemon to remove the 2 Gbyte limit.

Given the increased interest in software for data analysis we have now included references to such software on our National Facility software pages: http://nf.apac.edu.au/facilities/software/. If there is a special package not listed, please fill in the software request form.

Filesystems on the SC and LC

Are you always over your /home quota?

The APAC National Facility systems have a number of filesystems available for your use. Each has particular features to play different roles in your work cycle. The "File Systems" section of the National Facility User Guide explains these features and roles in detail.

The quota on /home directories is quite small because /home is NOT intended to be used for running batch jobs from. Job data should live in /short where every project has at least a 20GB quota. /short has a 28 day file expiry limit on it because we expect you to be responsible for archiving the data you wish to keep, either to your local system or to the massdata system. If you have not done so, we strongly recommend that you create for yourself a /short directory and a massdata directory for housing and managing your job input and output. For those not strong in Unix, below is an example session involving the use of /short and /massdata.

  mkdir /short/$PROJECT/$USER
  cd /short/$PROJECT/$USER
  mkdir jobdir
  cd jobdir
  # place your input files in this directory
  # run your job from this directory
  qsub -wd ....
 
  # if you wish to archive the results, first clean up unwanted files,
  # then tar up the results and send them to massdata
  cd ..
  tar cf jobout.tar ./jobdir
  gzip jobout.tar
  # create your own massdata directory in your project's area
  # if you haven't already done so
  mdss mkdir $USER
  mdss put jobout.tar.gz $USER/jobout.tar.gz
  mdss ls $USER
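Since files on /short expire after 28 days, it can be useful to list the files that are getting close to that limit before deciding what to archive. The following is a rough sketch in Bourne shell using only the standard find command. It assumes that expiry is judged on modification time, which is not stated above, and the 21 day warning threshold, the SCAN_DIR and WARN_DAYS variables and the demonstration files are all hypothetical.

```shell
#!/bin/sh
# Warn about files older than this many days (hypothetical threshold,
# chosen to leave about a week before the 28 day limit).
WARN_DAYS=${WARN_DAYS:-21}

# Directory to scan. On the facility this would be something like
# /short/$PROJECT/$USER. If SCAN_DIR is not set, fall back to a temporary
# directory holding one new and one old demonstration file.
if [ -z "${SCAN_DIR:-}" ]; then
  SCAN_DIR=$(mktemp -d)
  touch "$SCAN_DIR/new_result.dat"
  touch -t 200401010000 "$SCAN_DIR/old_result.dat"
fi

# Assumption: expiry is based on modification time, so select regular
# files last modified more than WARN_DAYS days ago.
OLD_FILES=$(find "$SCAN_DIR" -type f -mtime +"$WARN_DAYS" -print)

echo "Files older than $WARN_DAYS days in $SCAN_DIR:"
echo "$OLD_FILES"
```

Any files listed are candidates for the tar and mdss put steps shown in the session above.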

Is I/O a problem?

Another very important filesystem is the job scratch filesystem, jobfs. If your job requires frequent I/O, for example if it writes frequently to an output file, then you may find that performance in the batch queues on the SC and LC suffers compared with the code's performance on a single workstation. You can see this in the %CPU column of your nqstat output. It is common for this entry to be much less than 100% at initial start-up of the code as data is read in. However, if this entry remains well under 100% for any length of time, then it is quite likely that your code is doing frequent I/O to /home or /short, which is slowing it down. The solution is to use jobfs, a file system which is local to the node on which your batch job is running.

Details on jobfs are given in the user guide and one important thing to remember is that your files in jobfs exist only whilst the batch job is running. Thus, if you require them as output, they must be copied back to /home or /short at the end of the batch job.

Here is a simple example, taken from our introductory course on using the National Facility, of a batch job that shows how to access jobfs.

#!/bin/csh
#PBS -q express
#PBS -l walltime=2:00
#PBS -l jobfs=10mb
#PBS -l vmem=30mb

cd $PBS_JOBFS

# Copy input files from the home directory to the local jobfs directory
cp $HOME/INTRO_COURSE/input.1 .
cp $HOME/INTRO_COURSE/a.out .

# Run the program and write an output file to the local disk.
./a.out < input.1 >& output$PBS_JOBID

# Move output data from jobfs to /short.
mv output$PBS_JOBID /short/$PROJECT/

If you wish to monitor the files that are being created on jobfs as the batch job executes, you can use the command qls jobid to list the files in the jobfs directory and the command qcp jobid/file to copy a file from the jobfs directory while the job is running. Read the man pages for these two commands for further information.
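When a job leaves several output files in jobfs, one option is to bundle them into a single archive as the last step of the job, since anything left in jobfs is lost when the job ends. The following is only a sketch of that idea, written in Bourne shell rather than the csh of the batch script above. The DEST variable, the archive name and the two stand-in output files are hypothetical, and the fallbacks for PBS_JOBFS and PBS_JOBID exist only so the sketch can be tried outside a batch job.

```shell
#!/bin/sh
# Outside a batch job PBS_JOBFS and PBS_JOBID are unset, so fall back to
# temporary stand-ins (assumption, for trying the sketch locally only).
JOBFS=${PBS_JOBFS:-$(mktemp -d)}
JOBID=${PBS_JOBID:-demo}
# Hypothetical destination; on the facility this might be /short/$PROJECT/$USER.
DEST=${DEST:-$(mktemp -d)}

cd "$JOBFS" || exit 1

# Stand-in for the real program: writes two output files to the local disk.
echo "step 1" > output.1
echo "step 2" > output.2

# Bundle everything in jobfs into one archive written directly to DEST,
# then compress it. Report failure loudly, because the files in jobfs
# do not survive the end of the job.
TARFILE="$DEST/results_$JOBID.tar"
if tar cf "$TARFILE" . && gzip -f "$TARFILE"; then
  echo "saved $TARFILE.gz"
else
  echo "copy back failed; results remain only in $JOBFS" >&2
fi
```

Writing the archive straight to the destination avoids needing extra jobfs space for a second copy of the results.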

Email problems, suggestions, questions to