Distributed infrastructures that provide cloud services, as well as compute and storage clusters, are notoriously difficult to administer and optimize. Management applications benefit from up-to-date data on system performance. This paper describes the Performance Monitoring and Management System (PMMS), a lightweight and versatile performance monitoring tool that collects hundreds of thousands of metrics per second and delivers this information as time series in near real time.
At its core lies an in-memory database that scales through federation to large clusters of several hundred nodes. Sensors gather metric data and periodically send it to the PMMS collectors. Designed as a component that can be embedded into a larger system, PMMS has little impact on system resources and is easy to install and configure.
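
To make the sensor-to-collector data flow concrete, the following is a minimal sketch of the general pattern and not the PMMS implementation. The names `Collector`, `Sensor`, `ingest`, `query`, and `tick`, the `node/metric` key format, and the use of a bounded ring buffer per metric are all assumptions introduced for illustration; federation across collectors and network transport are omitted.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, metric value)


class Collector:
    """Hypothetical in-memory store: one bounded ring buffer per metric."""

    def __init__(self, capacity: int = 1024) -> None:
        # deque(maxlen=...) silently drops the oldest sample when full.
        self._series: Dict[str, Deque[Sample]] = defaultdict(
            lambda: deque(maxlen=capacity)
        )

    def ingest(self, node: str, metric: str, ts: float, value: float) -> None:
        """Append one sample to the time series of node/metric."""
        self._series[f"{node}/{metric}"].append((ts, value))

    def query(self, node: str, metric: str, start: float, end: float) -> List[Sample]:
        """Return all retained samples with start <= timestamp <= end."""
        series = self._series.get(f"{node}/{metric}", ())
        return [(t, v) for t, v in series if start <= t <= end]


class Sensor:
    """Hypothetical sensor: reads one metric and pushes it to a collector."""

    def __init__(
        self,
        node: str,
        metric: str,
        read: Callable[[], float],
        collector: Collector,
    ) -> None:
        self.node = node
        self.metric = metric
        self.read = read
        self.collector = collector

    def tick(self, now: Optional[float] = None) -> None:
        """Take one sample; a real sensor would call this on a timer."""
        ts = time.time() if now is None else now
        self.collector.ingest(self.node, self.metric, ts, self.read())


# Usage: four samples into a buffer of capacity 3, then a range query.
collector = Collector(capacity=3)
values = iter([0.5, 0.7, 0.9, 0.4])
sensor = Sensor("node01", "cpu_load", lambda: next(values), collector)
for t in (1.0, 2.0, 3.0, 4.0):
    sensor.tick(now=t)

recent = collector.query("node01", "cpu_load", 2.0, 3.0)
```

Bounding each series keeps memory use predictable at the cost of discarding the oldest samples, which is one plausible trade-off for an embeddable, in-memory design.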