acpbs system load
Reload this page in your browser to see the latest information. Some browsers may require you to reload the image in a new window.
The Y axis shows the nodes (individual computers) in the cluster.
The bottom two bars represent the front-end nodes that you connect to when you log in to the machine, and are generally lightly loaded as users' jobs don't run on them.
Memory usage
- User. This is the amount of physical memory currently being used by users' jobs. It is the total of all currently resident pages, so memory belonging to jobs that are swapped or paged out is not counted.
- Buffers. This is memory allocated by the system to I/O buffers to speed up file access. If it needs to, the system will use most of the memory not being used by jobs for buffers.
- Paging. This is an indicator of how fast the node is copying pages to or from disc. The scale for this graph is somewhat arbitrary, as there is no fixed maximum paging rate, but anything over 50% indicates a lot of paging activity.
- Physical memory key. Some nodes have more memory than others, indicated by colour codes at the far left of the memory plot. The key has two entries: largest memory and second largest.
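As a rough illustration of the paging indicator described above, the sketch below scales a paging rate against a reference rate and clamps the result to the range of the graph. The function names and the reference rate of 1000 pages per second are assumptions made for this example; the page does not say how the plot is actually scaled.

```python
def paging_indicator(pages_per_sec, reference_rate=1000.0):
    """Return a 0-100 indicator for a paging rate.

    reference_rate is a hypothetical rate treated as full scale. There is
    no fixed maximum paging rate, so any such choice is arbitrary.
    """
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    percent = 100.0 * max(pages_per_sec, 0.0) / reference_rate
    return min(percent, 100.0)  # clamp so the bar never leaves the graph


def is_heavy_paging(indicator_percent):
    """Anything over 50% on the indicator counts as a lot of paging."""
    return indicator_percent > 50.0
```

With these assumed values, a node paging at 600 pages per second would read 60% and be treated as paging heavily.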
CPU usage
- User. This is the amount of CPU time used by jobs. It is summed across all the CPUs in a node, so a figure of 100% means all CPUs are going flat out. A node may appear idle or underutilised for a number of reasons:
  - There are no jobs in the queue that can fit in the set of currently free CPUs.
  - A job is using fewer CPUs than its user asked for when submitting the job.
  - The job is paging or swapping (memory being transferred to or from disc), or is doing a lot of I/O.
  - The last update of the plot occurred in the brief period between one job finishing and the next one starting.
- System. This is the amount of CPU time used by system calls in users' jobs (things like memory allocation and I/O operations). In general this should be a fairly low figure, as a well-behaved job should spend most of its time crunching numbers and not making system calls.
- Down. This indicates that a node is unavailable. The node may be suffering a hardware problem, or may have been deliberately configured out for some reason. If one of these 'down' nodes is busy running tests you may see memory usage figures for it.
- Node status flag. Nodes that have been deliberately put offline, have disabled themselves due to problems, or have not recently reported in are flagged at the far right of the CPU plot.
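To make the CPU figures above concrete, here is a minimal sketch of how per-CPU readings might be combined into node-wide user and system percentages. The input format (one pair of user and system fractions per CPU) and the function name are assumptions for illustration, not the method the plot actually uses.

```python
def node_cpu_percent(per_cpu):
    """Combine per-CPU (user, system) fractions into node-wide percentages.

    per_cpu is a list of (user, system) pairs, each a fraction of one CPU
    between 0.0 and 1.0 (a hypothetical input format). The result is scaled
    so that 100% means every CPU in the node is fully busy.
    """
    if not per_cpu:
        raise ValueError("node has no CPUs")
    n = len(per_cpu)
    user = 100.0 * sum(u for u, _ in per_cpu) / n
    system = 100.0 * sum(s for _, s in per_cpu) / n
    return user, system
```

Under this sketch, a four-CPU node running a job that keeps only two CPUs busy would show 50% user time, which is one of the ways a node can look underutilised.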