Stress testing of a data center during operation is often viewed with caution. Doubts boil down to the eternal question of whether it is worth interfering with a mechanism that already works. We’re sure it’s worth it – we’ll explain why.
Before a data center goes into service, its equipment is commissioned and then tested under load. Only after each system has been tested separately, and all of them together, is the data center put into operation.
Then it runs for a year, two, three or more. Over time, vibration, temperature and the human factor can degrade the condition of the systems, and component wear, loosening of contacts and similar processes are inevitable. It is therefore recommended to repeat the tests regularly, ideally once a year.
But this is where a dilemma arises. The data center is running successfully, which in itself seems to confirm that it is in working order. At the same time, testing carries a small but non-zero risk of provoking a failure. Why take the risk while everything is fine?
Why do you need stress testing during operation?
The simplest analogy is the scheduled maintenance of a car. It is carried out even when the car appears to be in perfect working order. It is a preventive measure whose purpose is to detect a potential problem in advance, so that the car does not break down halfway through a journey.
So it is with the data center. If it is not tested regularly, a failure can occur suddenly, when no one is ready for it. Eliminating a problem in an emergency is always harder: finding the cause of the breakdown, calling in the necessary specialists and delivering spare parts all take a lot of time and often lead to large losses.
With specially organized tests, the situation is planned and controlled. The conditions and the sequence of actions are thought out in advance and specialized teams are involved, so that even if something goes wrong, operation is restored immediately.
Thus, the question should be not whether to carry out stress tests, but how to organize them so that diagnostics are effective and risks are minimal.
How do you prepare stress tests in a working data center?
What mainly distinguishes tests in a working data center is the detailed methodology prepared for them. It is developed for a specific data center and, if necessary, adjusted before each test. In it, specialists set out the testing sequence and prescribe the actions to take in case of an emergency. Tests are always carried out segment by segment, on individual parts of the systems: this ensures that if some equipment is switched off, the data center as a whole continues to work.
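As an illustration only, the two requirements of such a methodology (every step touches a single segment that the data center can do without, and every step has a prescribed emergency action) can be written down as a simple check. This is a minimal sketch, not our actual methodology format: the names PlanStep, validate_plan and redundant_segments, as well as the example segments, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlanStep:
    """One step of a hypothetical segment-by-segment test plan."""
    name: str
    segment: str            # the single part of a system under test
    emergency_action: str   # what to do if the step goes wrong


def validate_plan(steps, redundant_segments):
    """Return a list of problems found in the plan (empty if none).

    redundant_segments is an assumed set of segments that can be
    switched off while the rest of the data center keeps working.
    """
    problems = []
    for index, step in enumerate(steps, start=1):
        if step.segment not in redundant_segments:
            problems.append(
                f"step {index}: segment '{step.segment}' has no backup"
            )
        if not step.emergency_action.strip():
            problems.append(f"step {index}: no emergency action prescribed")
    return problems


# Example plan with made-up segment names.
plan = [
    PlanStep("Load test UPS A", "ups_a", "switch load back to UPS B"),
    PlanStep("Load test chiller 1", "chiller_1", ""),
]
problems = validate_plan(
    plan, redundant_segments={"ups_a", "ups_b", "chiller_1"}
)
for problem in problems:
    print(problem)
```

In this example the second step is flagged because no emergency action has been prescribed for it. A real methodology is a much more detailed document, but the same two checks apply to every step in it.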
Stress tests are based on the test procedures that should be part of the data center's documentation set. In practice, however, we often find that these procedures do not exist. The reasons vary: the documents were not received from the contractor when the data center was put into operation, they were lost, or they were not handed over when the operations service changed. In such situations there is no clear understanding of the data center's structure or of how its processes are organized, which greatly complicates preparing and conducting the tests.
Our specialists are ready to help resolve these issues. We can develop an individual test methodology for your data center. If the documentation set is missing, we can audit the facility, which makes it possible not only to carry out the tests correctly but also to optimize the operation of the data center and increase its efficiency.
If required, we will carry out the tests ourselves or support your operations department in carrying them out. The principles and subtleties of organizing them are a big topic that deserves a separate article.