J2EE* Performance Optimization, Part 1

Implementing a Top-Down, Iterative Tuning Methodology
Application-server configurations involve multiple computers interconnected over a network. Given the complexity involved, ensuring an adequate level of performance in this environment requires a systematic approach. There are many factors that may impact the overall performance and scalability of the system. Examples of these performance and scalability factors include network and system topology and configurations, as well as hardware and software configurations of all computers involved. As illustrated in Figure 2, examples of software components, from top to bottom in the software stack, are as follows: the deployed application, the application server, the Java Virtual Machine, and the operating system.







Figure 2. The performance stack: examples of software and hardware components.
Examples of the hardware components include the hardware platform (e.g., network and disk I/O), the memory system, the microprocessor, and the microarchitecture of the processor. Each component usually has configuration settings, or tunables, that can be adjusted to optimize performance. Because there are so many configurable components, each with potentially many tunables, it is imperative to follow a systematic approach to performance tuning.

A top-down, data-driven, and iterative approach is the proper way to improve performance. 'Top-down' refers to addressing system-level issues first, followed by application-level issues, and finally issues at the microarchitectural level (although tuning efforts may be limited to the system level only, or to the system and application levels). Issues are addressed in this order because higher-level issues may mask issues that originate at lower levels. The top-down approach is illustrated in Figure 3.

Figure 3. Top-down tuning methodology.
'Data-driven' means that performance data must be measured, and 'iterative' means the process is repeated multiple times until the desired level of performance is reached.

Figure 4 illustrates the iterative methodology. The first set of performance data is the baseline data, against which future configurations are compared. The baseline data should be established in a test environment that mimics production as closely as is practical. The test environment should be configured based on the estimated capacity needed to sustain the desired load, including network bandwidth and topology, processor and memory sizes, disk capacity, and physical database layout. In addition, the baseline configuration should incorporate the basic initial configuration recommendations given by the application server, database server, JVM, and hardware-platform vendors. These recommendations should include tunable parameter settings, choice of database connectivity (JDBC) drivers, and the appropriate level of product versions, service packs, and patches.

Figure 4. The iterative nature of the methodology for performance tuning and optimizations.
Prior to baseline data collection, one should also define performance goals for the system. Performance goals are usually defined in terms of desired throughput within certain response time constraints. An example of such a goal might be 'the system needs to be able to process 500 operations per second with 90% or more of the operations taking less than one second.' In the case of SPECjAppServer2002, the performance goal can be expressed in TOPS.
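
A goal of this kind can be checked mechanically against measured data. The following is a minimal Java sketch, not part of the original methodology: the class name PerformanceGoalCheck, its method names, and the synthetic sample data are assumptions made for illustration, and the numbers simply mirror the example goal above.

```java
import java.util.Arrays;

// Hypothetical helper (not from the article): checks measured response times
// against a goal of the form "N operations per second, with at least P percent
// of operations completing in under T milliseconds".
class PerformanceGoalCheck {

    /** Fraction of operations whose response time is strictly below the limit. */
    static double fractionBelow(long[] responseTimesMillis, long limitMillis) {
        if (responseTimesMillis.length == 0) {
            return 0.0;
        }
        long count = Arrays.stream(responseTimesMillis)
                           .filter(t -> t < limitMillis)
                           .count();
        return (double) count / responseTimesMillis.length;
    }

    /** Operations completed per second over the measurement interval. */
    static double throughput(int completedOps, double intervalSeconds) {
        return completedOps / intervalSeconds;
    }

    /** True only if both the throughput and the response-time constraint hold. */
    static boolean meetsGoal(long[] responseTimesMillis, double intervalSeconds,
                             double minOpsPerSecond, long limitMillis,
                             double minFraction) {
        return throughput(responseTimesMillis.length, intervalSeconds) >= minOpsPerSecond
                && fractionBelow(responseTimesMillis, limitMillis) >= minFraction;
    }

    public static void main(String[] args) {
        // Synthetic data: 1,000 operations measured over 2 seconds,
        // 950 of them at 200 ms and 50 of them at 1,500 ms.
        long[] samples = new long[1000];
        Arrays.fill(samples, 0, 950, 200L);
        Arrays.fill(samples, 950, 1000, 1500L);

        boolean ok = meetsGoal(samples, 2.0, 500.0, 1000L, 0.90);
        System.out.println("Goal met: " + ok);
    }
}
```

Stating the goal as two separate conditions keeps throughput and response time from being traded against each other unnoticed: a configuration that raises throughput while pushing too many operations past the time limit still fails the check.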

The steps in the iterative process, as illustrated in Figure 4, are as follows:

  1. Collect performance data. Use stress tests and performance-monitoring tools to capture performance data as the system is exercised. In the case of this workload, one should collect not only the key performance metric (TOPS), but also performance data that can aid tuning and optimization.

  2. Identify bottlenecks. Analyze the collected data to identify performance bottlenecks. Some examples of bottlenecks are data-access inefficiencies, significant disk I/O activities on the database server, and so on.

  3. Identify alternatives. Identify, explore, and select alternatives to address the bottlenecks. For example, if disk I/O is a problem on the database back-end, consider using a high-performance disk array to overcome the bottleneck.

  4. Apply solution. Apply the proposed solution. Sometimes applying the solution requires only a change to a single parameter, while other solutions can be as involved as reconfiguring the entire database to use a disk array and raw partitions.

  5. Test. Evaluate the performance effect of the change by comparing data collected before and after it was applied. Performance data for a given workload can fluctuate from one measurement to the next, so make sure the observed change is significant relative to that fluctuation.

If the proposed solution does not remove the bottleneck, try an alternative solution. Once a given bottleneck is addressed, additional bottlenecks may appear, so the process starts over: the performance engineer must collect performance data and repeat the cycle until the desired level of performance is attained. Two very important points to keep in mind during this process are to let the available data drive performance-improvement actions, and to apply only one performance-improvement action at a time, so that a performance change can be associated with a specific action. Note, however, that there are cases where multiple changes must be applied at the same time (e.g., when a new software release requires a patch to the operating system).
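
One way to judge whether a before-and-after difference stands out from run-to-run fluctuation is to repeat each measurement several times and compare the difference in means with the spread of the runs. The sketch below uses Welch's t statistic for that purpose. This is a technique chosen here for illustration, not one prescribed by the methodology, and the class name ChangeEvaluation, the threshold of 2.0, and the throughput values are all assumptions.

```java
// Hypothetical helper (not from the article): compares repeated measurements
// taken before and after a single tuning change.
class ChangeEvaluation {

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Unbiased sample variance (divides by n - 1); needs at least two runs. */
    static double sampleVariance(double[] xs) {
        double m = mean(xs);
        double sumSquares = 0.0;
        for (double x : xs) {
            sumSquares += (x - m) * (x - m);
        }
        return sumSquares / (xs.length - 1);
    }

    /**
     * Welch's t statistic for the difference of means (after minus before).
     * If every run in both sets is identical the standard error is zero and
     * the result is infinite or NaN; callers should handle that case.
     */
    static double welchT(double[] before, double[] after) {
        double standardError = Math.sqrt(
                sampleVariance(before) / before.length
              + sampleVariance(after) / after.length);
        return (mean(after) - mean(before)) / standardError;
    }

    public static void main(String[] args) {
        // Synthetic throughput measurements from five runs each.
        double[] before = {480, 495, 488, 502, 490};
        double[] after  = {520, 531, 515, 527, 522};

        double threshold = 2.0; // assumed cut-off, chosen only for illustration
        double t = welchT(before, after);
        System.out.printf("mean before=%.1f, mean after=%.1f, t=%.2f%n",
                mean(before), mean(after), t);
        System.out.println(Math.abs(t) > threshold
                ? "Change is large relative to run-to-run fluctuation"
                : "Change is within run-to-run fluctuation");
    }
}
```

Because only one change is applied per iteration, a statistic like this can be attributed to that change. With several changes applied together, the same comparison would show that something helped without showing what.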

As the quantity and variety of collected data can be overwhelming, and bottlenecks often come from many interrelated sources, cost considerations can be a useful guide. For example, if poor performance is due to a lack of physical memory, adding memory is a good, cost-effective solution. Other examples include adding hard drives or processors, since spending on hardware can be an alternative to spending programmers' time. Adding processors is not a good solution, however, when performance is poor while CPU utilization is low. In that case, a different software or network configuration may solve the problem.
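
The reasoning in the preceding paragraph can be written down as a simple decision rule. The Java sketch below is only an illustration: the class name BottleneckTriage, the boolean memory indicator, and the 85% CPU threshold are hypothetical and would need to be replaced by whatever the collected data actually supports.

```java
// Hypothetical triage rule (not from the article) mirroring the cost-based
// reasoning above: add memory if memory is short, add processors only if the
// CPUs are actually busy, otherwise look at software or network configuration.
class BottleneckTriage {

    enum Suggestion { ADD_MEMORY, ADD_PROCESSORS, REVIEW_SOFTWARE_OR_NETWORK_CONFIG }

    // Assumed threshold: CPU utilization at or above this fraction counts as "high".
    static final double HIGH_CPU_UTILIZATION = 0.85;

    /**
     * @param memoryConstrained true if the data shows a lack of physical memory
     *                          (for example, sustained paging)
     * @param cpuUtilization    average CPU utilization as a fraction in [0, 1]
     */
    static Suggestion suggest(boolean memoryConstrained, double cpuUtilization) {
        if (memoryConstrained) {
            return Suggestion.ADD_MEMORY;
        }
        if (cpuUtilization >= HIGH_CPU_UTILIZATION) {
            return Suggestion.ADD_PROCESSORS;
        }
        // Poor performance with idle CPUs: more processors would not help.
        return Suggestion.REVIEW_SOFTWARE_OR_NETWORK_CONFIG;
    }

    public static void main(String[] args) {
        System.out.println(suggest(true, 0.40));
        System.out.println(suggest(false, 0.95));
        System.out.println(suggest(false, 0.30));
    }
}
```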
