Open XDMoD is an NSF-funded, open-source tool for managing high-performance computing (HPC) resources. It is widely deployed at academic, industrial, and governmental HPC centers. Its management capabilities include monitoring standard metrics such as utilization, providing quality-of-service metrics designed to proactively identify underperforming system hardware and software, and reporting job-level performance data for every job running on the HPC system without the need to recompile applications.
Open XDMoD is designed to meet the following objectives:
In addition, the operational characteristics of the HPC environment can be analyzed at different levels of granularity: by job, by user, or system-wide.
Open XDMoD is available at no cost to the user. The XDMoD team fully supports its installation and configuration.
Please see the open.xdmod.org site for detailed documentation on installing and configuring the software.