Open XDMoD
Open XDMoD is an NSF-funded open-source tool to facilitate the management of
high-performance computing (HPC) resources. It is widely deployed at academic, industrial
and governmental HPC centers. Open XDMoD's management capabilities include
monitoring standard metrics such as utilization, providing quality of service
metrics designed to proactively identify underperforming system hardware and
software, and reporting job-level performance data for every job running on the
HPC system without the need to recompile applications.
Open XDMoD is designed to meet the following objectives:
- provide the user community with a tool to more effectively and efficiently use their allocations and optimize their use of HPC resources,
- provide operational staff with the ability to monitor, diagnose, and tune system performance as well as measure the performance of all applications running on their system,
- provide software developers with the ability to easily obtain detailed analysis of application performance to aid in optimizing code performance,
- provide stakeholders with a diagnostic tool to facilitate HPC planning and analysis, and
- provide publication and external award metrics to help measure return on investment.
In addition, the operational characteristics of the HPC environment can be analyzed at different levels of granularity: per job, per user, or system-wide.
How can I install Open XDMoD at my center?
Open XDMoD is available at no cost to the user. The XDMoD team fully supports its installation and configuration.
Please see the open.xdmod.org site for detailed documentation on installing and configuring the software.