Technology with the Environment in Mind
Dynamic Data Center Power Management: Trends, Issues, and Solutions
CASE STUDIES AND RESULTS
To demonstrate the value of a policy-based dynamic power management approach using a platform-resident PM, we implemented the PM and conducted two sets of experiments: one set at a pilot data center, and the other in our internal labs. The rest of this section describes these experiments and the early results that demonstrated the value of this approach.
We conducted a proof of concept (PoC) as a pilot project at the data center of a top Internet portal customer. The objective of the PoC was to maximize the number of servers allowed in a single rack within a given power envelope while maintaining maximum application performance.
Table 1 lists the use cases we developed for the PoC. At the beginning of the test, we recorded the nameplate power of the servers (~350W) that the customer uses to populate its data centers today. We then installed servers with PMs in the rack and used the PM to measure the actual maximum power consumption of each server at peak search workload (~310W). We used this observed maximum as the baseline for setting power limits for the servers and for determining the number of servers per rack. We allowed 10% headroom to make sure that power consumption at the rack level did not exceed the power envelope for a prolonged period of time (10 min. or longer). The PM automatically adjusts power consumption toward this target while continuing to deliver maximum performance for the given workload.
The data center management system communicates with the PM using IPMI [14] to continuously monitor the actual power consumption of each server. It then aggregates the power measurements at the rack level to make sure that the rack-level power envelope is not violated. The data center manager sets power limits dynamically for each server as needed to implement the IT management policy. If the PM cannot maintain the limit set, or the data center manager observes a trend towards violating the rack-level power envelope, the data center manager resets the limits to ensure that the rack power envelope is not violated. Through this interaction between the data center manager and the PM, the customer can safely fit the maximum number of servers into a given rack-level power budget, thereby increasing the density of servers in a rack.
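The rack-level check described above can be sketched as follows. This is a minimal illustration, not the data center manager used in the PoC: the `ServerReading` type, the `rebalance_limits` function, and the proportional scale-down policy are all assumptions made for the example, and the readings are assumed to have already been collected (no IPMI calls are shown).

```python
from dataclasses import dataclass


@dataclass
class ServerReading:
    """Hypothetical record of one server's state as seen by the manager."""
    name: str
    power_w: float  # latest power reading reported by the server's PM
    limit_w: float  # power limit currently set on the server


def rebalance_limits(readings, rack_envelope_w):
    """Return new per-server limits if the rack total exceeds the envelope.

    Assumed policy: give each server a share of the envelope proportional
    to its current draw, and never raise an existing limit.
    Returns an empty dict when the rack is within its envelope.
    """
    total_w = sum(r.power_w for r in readings)
    if total_w <= rack_envelope_w:
        return {}
    scale = rack_envelope_w / total_w
    return {r.name: min(r.limit_w, r.power_w * scale) for r in readings}


if __name__ == "__main__":
    rack = [ServerReading("node-1", 300.0, 310.0),
            ServerReading("node-2", 300.0, 310.0)]
    # Two servers drawing 600 W in total against a 540 W envelope.
    print(rebalance_limits(rack, rack_envelope_w=540.0))
```

A real manager would also need the trend detection mentioned in the text; this sketch reacts only to the current aggregate reading.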

Table 1: Rack-level power optimization use cases
Table 2 summarizes the initial result from the PoC described above. It shows the performance and power reduction observed for a single server in the PoC. The server has two Intel® Xeon® processors, 16 GB of memory, and a PM. It was running an actual search workload at the customer site in a near-production test environment.
It is interesting to note that when the workload is around 1,500 concurrent searches (an above-average workload) and the PM imposes a power cap of around 270W, the server's CPU utilization and throughput remain virtually the same, i.e., ~67% and ~4.7ms per search respectively. This means that with a PM and a proper power limit, we could save 40W per server without performance loss when the CPU is not fully loaded. This is a 13% power reduction relative to the ~310W measured maximum. Under these circumstances, we could add one more server to a rack that is populated with 6 servers while staying within the same power envelope.
It is important to understand that the value of the PM depends on the actual application running on the server, the typical workload, and the configuration of the server itself. For each combination of server and workload, the user should determine the control points that reduce power consumption with minimal or no performance impact.
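The arithmetic behind these figures can be checked with a short sketch. The text does not state how the rack's power envelope was sized; the example assumes, purely for illustration, that it was sized for six servers at the ~350W nameplate value. The function names are likewise hypothetical.

```python
def power_saving_fraction(uncapped_w, capped_w):
    """Fraction of power saved by capping, relative to the uncapped draw."""
    return (uncapped_w - capped_w) / uncapped_w


def servers_within_budget(rack_budget_w, per_server_w):
    """Whole number of servers that fit in the rack budget."""
    return int(rack_budget_w // per_server_w)


saving = power_saving_fraction(uncapped_w=310, capped_w=270)
print(f"saving: {saving:.1%}")  # 40 W out of 310 W, about 13%

# Assumption: the envelope was sized for 6 servers at nameplate (6 x 350 W).
rack_budget_w = 6 * 350
print(servers_within_budget(rack_budget_w, 310))  # at the measured maximum
print(servers_within_budget(rack_budget_w, 270))  # at the 270 W cap
```

Under that assumption, the budget holds six servers at the measured maximum and seven at the cap, which matches the one additional server reported above.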
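One way to pick such a control point from measurements is sketched below. The function name, the tolerance parameter, and the selection rule (lowest cap whose response time stays within tolerance of the baseline) are assumptions for illustration; the only data used are the two operating points reported for the PoC server.

```python
def lowest_acceptable_cap(measurements, baseline_ms, max_slowdown=0.0):
    """Return the lowest power cap (W) whose response time stays within tolerance.

    measurements: iterable of (cap_w, ms_per_search) pairs measured for one
    server/workload combination. Returns None if no cap qualifies.
    """
    allowed_ms = baseline_ms * (1.0 + max_slowdown)
    acceptable = [cap_w for cap_w, ms in measurements if ms <= allowed_ms]
    return min(acceptable) if acceptable else None


# Points from the PoC: ~310 W uncapped and a ~270 W cap,
# both at ~4.7 ms per search.
print(lowest_acceptable_cap([(310, 4.7), (270, 4.7)], baseline_ms=4.7))
```

In practice the measurement list would cover several caps per workload, since the acceptable point differs for each server and application.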

Table 2: PM Test Result on a Single Node
In the second set of experiments, which we conducted in our labs, we further explored the value of a PM by setting policies at different levels. We populated a rack with Intel® Bensley servers, each equipped with an Intel® Xeon® processor and configured with a PM. We integrated the servers under test with a management console, so that the console could get real-time server power-consumption data and define policies that set a power-limiting target while maintaining the best possible performance at that limit.

Figure 5: Policy Manager case study
Four different test scenarios were considered:
- Populate the rack with nameplate power consumption (current customer practice).
- Populate the rack based on power capping at maximum performance power measured for given search workloads (~280W).
- Populate the rack based on power capping at 99% of maximum power measured for given search workloads (~275W).
- Populate the rack based on power capping at 90% of maximum power measured for given search workloads (~250W).
As shown in Figure 5, Rack #1 is populated with servers in a typical configuration running representative search workloads. The rack is provisioned at nameplate power with a derating factor of 60%.
Using the PM, as shown in Rack #2, provisioning servers with the power corresponding to maximum performance allows up to 21% more servers to be populated in a rack. Similarly, Rack #3 shows a 28% increase in server density when servers are provisioned with power that impacts peak performance 1% of the time. Rack #4 shows an exception scenario that applies when power is severely constrained, for example, due to bad weather. In this case, the PM limits power to the individual servers, and hence to the whole data center, but allows the data center to keep operating in a stable environment. Each additional server defers data center capital expenditure by ~$2,000 [3].
Another example of the value of the PM is shown in Figure 6. We show the power consumption and throughput measured with and without the PM at utilizations between 60% and 100% for the WebBench load, a benchmark for Web traffic. For this workload, when the PM policy limits power to 20% below the uncapped level at 100% utilization, the impact on throughput is only 10%. The numbers are better at 80% utilization, where a power reduction of 18% reduces throughput by only 3%. It should be noted that typical servers in data centers run at utilizations well below 60% and, as seen from the chart, the performance impact for a given power reduction is even smaller. Therefore, using the PM, we could safely craft power-capping policies that limit server platform power consumption with little impact on the peak performance of applications.
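The WebBench figures can also be read as a change in throughput per watt. The sketch below is an illustrative calculation, not part of the original study; it assumes that "impact to throughput" means a fractional drop in throughput relative to the uncapped server, and the function name is hypothetical.

```python
def relative_perf_per_watt(power_reduction, throughput_loss):
    """Throughput per watt under the cap, relative to the uncapped server.

    Both arguments are fractions (0.20 means 20%). A result of 1.0 means
    efficiency is unchanged; values above 1.0 mean the capped server
    delivers more throughput per watt.
    """
    return (1.0 - throughput_loss) / (1.0 - power_reduction)


# (utilization, power reduction, throughput loss) as reported for WebBench.
reported = [(1.00, 0.20, 0.10), (0.80, 0.18, 0.03)]
for utilization, power_reduction, throughput_loss in reported:
    ratio = relative_perf_per_watt(power_reduction, throughput_loss)
    print(f"{utilization:.0%} utilization: {ratio:.2f}x throughput per watt")
```

Under that reading, the capped server delivers roughly 12% more throughput per watt at 100% utilization and roughly 18% more at 80% utilization.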

Figure 6: Power/performance with the Policy Manager
