5.2.2. Running the Hostless DDR-Free Design Example
Procedure
- Download and prepare the ResNet-18 PyTorch Model with the OpenVINO™ Open Model Zoo tools with the following commands:
source ~/build-openvino-dev/openvino_env/bin/activate
omz_downloader --name resnet-18-pytorch \
    --output_dir $COREDLA_WORK/demo/models/
omz_converter --name resnet-18-pytorch \
    --download_dir $COREDLA_WORK/demo/models/ \
    --output_dir $COREDLA_WORK/demo/models/
Important: The OpenVINO™ Open Model Zoo (OMZ) PyTorch models do not include a softmax operation at the end of the model.
- Generate the parameter ROMs as .mif files by running the FPGA AI Suite compiler with the following command:
dla_compiler \
    --batch-size=1 \
    --network-file <path/to/graph> \
    --march $COREDLA_ROOT/example_architectures/AGX7_Streaming_Ddrfree_Resnet18.arch \
    --foutput-format=open_vino_hetero \
    --o <compiler output .bin file name> \
    --fplugin HETERO:FPGA \
    --dumpdir $COREDLA_WORK/resnet-18-dlac-out/
The .mif files are created in the parameter_rom subdirectory of the directory specified by the --dumpdir option.
For details about creating the .mif files required for DDR-free operation, refer to Generating Artifacts for Hostless DDR-Free Operation.
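Before building the design example, it can help to confirm that the compiler produced the parameter ROM files. The following is a minimal Python sketch that relies only on what is stated above: the .mif files are in a parameter_rom subdirectory of the --dumpdir directory. The helper name find_mif_files is hypothetical and is not part of the FPGA AI Suite, and the search is recursive because the layout inside parameter_rom is not assumed.

```python
import os
from pathlib import Path


def find_mif_files(dump_dir):
    """Return the sorted .mif files found under <dump_dir>/parameter_rom.

    Hypothetical helper, not part of the FPGA AI Suite. The search is
    recursive because the layout inside parameter_rom is not assumed.
    """
    rom_dir = Path(dump_dir) / "parameter_rom"
    if not rom_dir.is_dir():
        raise FileNotFoundError(f"No parameter_rom directory in {dump_dir}")
    return sorted(rom_dir.rglob("*.mif"))


if __name__ == "__main__":
    # Assumes COREDLA_WORK is set as in the commands above and that
    # --dumpdir was $COREDLA_WORK/resnet-18-dlac-out/.
    work = os.environ.get("COREDLA_WORK", ".")
    dump_dir = Path(work) / "resnet-18-dlac-out"
    try:
        mif_files = find_mif_files(dump_dir)
    except FileNotFoundError as err:
        print(err)
    else:
        print(f"Found {len(mif_files)} .mif file(s)")
        for path in mif_files:
            print(" ", path)
```

If the script reports zero files or a missing directory, rerun the compiler step before starting the build.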
- Build the design example with the following command:
dla_build_example_design.py build \
    --output-dir <path/to/build/dir> \
    --num-instances 1 \
    --seed 1 \
    --parameter-rom-dir $COREDLA_WORK/resnet-18-dlac-out/parameter_rom/ \
    agx7_iseries_ddrfree \
    $COREDLA_ROOT/example_architectures/AGX7_Streaming_Ddrfree_Resnet18.arch
Building the design example creates the bitstream needed to program the FPGA device.
For more information about the dla_build_example_design command, refer to The dla_build_example_design Command.
- Program the FPGA device with the Quartus® Prime Programmer.
The bitstream used to program the device is <path/to/build/dir>/hw/output_files/top.sof.
Program the FPGA device with the following command:
quartus_pgm -c 1 -m jtag -o "p;top.sof@1"
For more information about the Quartus® Prime Programmer, refer to Quartus® Prime Pro Edition User Guide: Programmer.
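The programming command refers to top.sof by a relative name, while the bitstream is located under the build directory. As a minimal sketch, the following Python helper assembles the same quartus_pgm arguments with the full bitstream path so that the command can be run from any directory. The function name programmer_command and the my_build placeholder are hypothetical; the flags are the ones used in the command above.

```python
import shlex
from pathlib import Path


def programmer_command(build_dir, cable="1", device_index=1):
    """Build the quartus_pgm argument list for <build_dir>/hw/output_files/top.sof.

    Hypothetical helper; the flags mirror the command shown in this step.
    """
    sof = Path(build_dir) / "hw" / "output_files" / "top.sof"
    operation = f"p;{sof.as_posix()}@{device_index}"
    return ["quartus_pgm", "-c", str(cable), "-m", "jtag", "-o", operation]


if __name__ == "__main__":
    # "my_build" is a placeholder for the --output-dir used when building.
    cmd = programmer_command("my_build")
    # Print a shell-quoted command line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```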
- To make the JTAG connection stable, set the JTAG clock speed to 16 MHz or lower with the following command:
jtagconfig --setparam 1 JtagClock 16M
- Use the Quartus® Prime System Console to run inference on the design example.
Because this design example is hostless, operations that typically come from the host are performed through the Quartus® Prime System Console instead. For more information about the Quartus® Prime System Console, refer to "Analyzing and Debugging Designs with System Console" in Quartus® Prime Pro Edition User Guide: Debug Tools.
Use the System Console to complete the following steps:
- (Optional) Update the graph parameters and instructions using the CSR interface.
- Store input features in the FPGA on-chip memory.
- Prime the FPGA AI Suite IP registers for inference.
- Configure an ingress Modular Scatter-Gather DMA (mSGDMA) core to read the input features from on-chip memory and stream data into the FPGA AI Suite IP.
- Configure an egress mSGDMA core to stream data from the FPGA AI Suite IP into on-chip memory.
- Read the inference results from on-chip memory.
The design example provides a System Console script to automate these operations for you. You can find the script in the $COREDLA_ROOT/runtime/streaming/ed0_streaming_example folder.
To use the design example System Console script:
- (Optional) To update the graph parameters and configurations, generate the updated .mif files by running the FPGA AI Suite compiler as in step 2 (generating the parameter ROMs), then send the new .mif files to the FPGA AI Suite IP with the following command:
system-console --script=system_console_script.tcl \
    --online-reconfiguration <path-to-MIF-directory>
- Run the following command to run inference on the FPGA device:
system-console --script=system_console_script.tcl \
    --input <path-to-img.bin> \
    --num_inferences <#-of-inferences> \
    --output_shape <[C H W]> \
    --functional \
    --arch=<path-to-architecture-description-file>
The design example Quartus® Prime System Console script generates a file called output.bin that contains the raw inference results.
- (Optional) To measure the performance of the design example, run the following command:
system-console --script=system_console_script.tcl \
    --input <path-to-img.bin> \
    --output_shape <[C H W]> \
    --core_ip_performance \
    --arch=<path-to-architecture-description-file>
- Postprocess the raw inference output for readability with the following command:
python3 $COREDLA_ROOT/bin/streaming_post_processing.py <path-to-output.bin>
This script cleans the raw output binary file by stripping some invalid bytes, and stores the FP16-formatted results in a file called result_hw.txt for readability.
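Because the OMZ PyTorch model has no final softmax operation, the postprocessed values are raw scores rather than probabilities. The following Python sketch shows one way to convert such scores into probabilities and pick the highest-ranked classes. It assumes that result_hw.txt can be read as whitespace-separated decimal numbers, which this document does not confirm, so check the file layout before relying on read_scores. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, k=5):
    """Return the k (class_index, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probabilities), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def read_scores(path):
    """Read scores from a text file.

    Assumption: the file holds whitespace-separated decimal numbers.
    The actual layout of result_hw.txt is not described in this document.
    """
    with open(path) as handle:
        return [float(token) for token in handle.read().split()]


if __name__ == "__main__":
    # Illustrative scores only, not real inference output.
    demo_scores = [1.0, 3.0, 0.5, 2.0]
    for index, prob in top_k(softmax(demo_scores), k=2):
        print(f"class {index}: {prob:.4f}")
```

Subtracting the maximum score before exponentiating leaves the result unchanged and avoids overflow for large scores.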