Intel® VTune™ Profiler Performance Analysis Cookbook

ID 766316
Date 9/05/2023
Public


Profiling Java Applications as a Remote User (NEW)

Use a wrapper script with Intel® VTune™ Profiler to profile Java applications as a remote user.

Typically, to profile a running Java process with VTune Profiler, you run a hardware event-based sampling analysis and select Attach to Process as the target type. In Linux* environments, VTune Profiler uses the Linux perf tool to collect sampling data. For this purpose, you must run VTune Profiler as the same user who is running the Java process. If the collector runs as any other user (even as root), it cannot attach to the Java process. However, in many cases an arbitrary account runs these applications, which makes it challenging to profile them as a remote user.

This recipe shows how to use a wrapper script to run VTune Profiler and profile Java applications as a remote user.

Ingredients

Here are the hardware and software tools you need for this recipe.

  • Application: Pi. This Java application is used as a demo and is not available for download. The application uses a Monte Carlo algorithm to estimate the value of Pi with multiple threads.

  • Analysis Tool: VTune Profiler version 2022 or newer - Hotspots Analysis using Hardware Event-Based Sampling (with Collect Stacks enabled)

    NOTE:
    • Starting with the 2020 release, Intel® VTune™ Amplifier has been renamed to Intel® VTune™ Profiler.

    • Most recipes in the Intel® VTune™ Profiler Performance Analysis Cookbook are flexible. You can apply them to different versions of Intel® VTune™ Profiler. In some cases, minor adjustments may be required.

    • Get the latest version of Intel® VTune™ Profiler:

  • Java Development Kit (JDK) version: OpenJDK 11.0.8 or newer

  • Local Operating System: Windows* 10 Enterprise

  • Remote Operating System/Amazon Machine Image (AMI): Ubuntu Server 20.04 LTS (HVM)

  • Remote CPU/Instance Type: AWS EC2 c5.9xlarge (Intel® microarchitecture code named Skylake with 36 logical CPUs)

Create the Java Application on the Remote Target

Prerequisite: Ensure that you have installed the JDK.

  1. Create the Java file. In this example, we add an infinite loop around the body of main (while(true)) to simulate a long-running process:

    *Pi.java*

  2. Compile the Java file with symbols:

    $ javac -g Pi.java 
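
The listing for Pi.java is not reproduced in this recipe, and the demo application is not available for download. As a stand-in for step 1, the following is a minimal sketch of a multithreaded Monte Carlo estimator wrapped in while(true). The method name estimate, the sample count, and the thread count are assumptions chosen for illustration and are not taken from the original application.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for the Pi demo; not the original source.
public class Pi {
    // Estimate Pi by sampling random points in the unit square and
    // counting how many fall inside the quarter circle of radius 1.
    static double estimate(long samplesPerThread, int threads) throws InterruptedException {
        LongAdder inside = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                long hits = 0;
                for (long i = 0; i < samplesPerThread; i++) {
                    double x = rnd.nextDouble();
                    double y = rnd.nextDouble();
                    if (x * x + y * y <= 1.0) {
                        hits++;
                    }
                }
                // Publish this thread's count once to limit contention.
                inside.add(hits);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return 4.0 * inside.sum() / ((double) samplesPerThread * threads);
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = Runtime.getRuntime().availableProcessors();
        // The infinite loop keeps the process alive so that a profiler can attach to it.
        while (true) {
            System.out.println("Pi ~ " + estimate(10_000_000L, threads));
        }
    }
}
```

Each worker keeps a local hit counter and adds it to the shared total only once, so the hot loop spends its time on random number generation and arithmetic rather than on synchronization.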

Run the Java Application as a Remote User

Start the Java application as a user other than the SSH user. In this example, the default user for the AWS instance is ubuntu, so we create a new user (named intel) to run the application.

$ sudo adduser intel 
$ su intel
$ java -Xcomp -Djava.library.path=native_lib/ia32 -cp ./ Pi
 

Verify that user intel is running the Java process, and note the process ID (PID):

$ top

Configure VTune Profiler on the Local Machine for Remote Collection

  1. Open the VTune Profiler GUI on your local machine.

  2. In the WHERE pane, select Remote Linux (SSH).

  3. Configure the SSH destination for the target system with a user different from the one running the Java application. In this example, the SSH user is ubuntu.

  4. Deploy the VTune Profiler target package.

  5. In the WHAT pane, select Attach to Process.

  6. Enter the process ID for the Java application.

Create the Wrapper Script to Run VTune Collector

  1. Use a text editor to create a file named vtune_wrapper.sh.

  2. Populate the wrapper file with this text:

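    #!/bin/bash
    # VTune Profiler passes the collector command line as arguments ("$@").
    echo "Target result dir: $VTUNE_RESULT_DIR"
    # Let the application user (intel) write to the temporary and result directories.
    chmod -R o+w "$VTUNE_TEMP_DIR"
    chmod -R o+w "$VTUNE_RESULT_DIR"
    # Run the collector as the user who owns the Java process.
    sudo -A -u intel "$@"
    # Return ownership of the results to the SSH user (ubuntu).
    sudo -A chown -R ubuntu "$VTUNE_RESULT_DIR"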
    
  3. In the WHAT pane, under the Advanced section, scroll to the Wrapper script text box.

  4. Select vtune_wrapper.sh.

  5. In the text box, place the cursor at the end of the last line and press Enter to add a line feed. This helps ensure that the collector recognizes the script.

  6. In the Advanced section, scroll up and locate the option marked Automatically stop collection after (sec). Enable this option and set a value of 30 to stop the collection after 30 seconds.

Run Hotspots Analysis with Hardware Event-Based Sampling and Stack Collection

  1. In the HOW pane, select the Hotspots analysis type in the Algorithm group.

  2. Enable Hardware Event-Based Sampling.

  3. Enable the Collect stacks option and set the Stack size to 4096.

  4. Click Start to run the analysis.

Review Analysis Results

When the collection completes, the Summary tab displays CPU performance information along with a list of hotspots found in the application. You can ignore warnings in the Collection Log about locating debugging information.
