RapidWright Report Timing Example

Reports the critical path within an example design (e.g. “microblaze4.dcp”).

Background

Please see our FPT‘19 paper, "An Open-source Lightweight Timing Model for RapidWright" (Presentation) for background details on our RapidWright Timing Model.

Steps to Run

  1. Download the example microblaze4.dcp design:

wget http://www.rapidwright.io/docs/_downloads/microblaze4.dcp
  1. Invoke RWRoute via gradle (this will ensure code is compiled before running):

rapidwright ReportTimingExample microblaze4.dcp

The source code for ReportTimingExample.java is provided below for easy reference:

public static void main(String[] args) {
    if(args.length != 1) {
        System.out.println("USAGE: <dcp_file_name>");
        return;
    }
    CodePerfTracker t = new CodePerfTracker("Report Timing Example");
    t.useGCToTrackMemory(true);

    // Read in an example placed and routed DCP
    t.start("Read DCP");
    Design design = Design.readCheckpoint(args[0], CodePerfTracker.SILENT);

    // Instantiate and populate the timing manager for the design
    t.stop().start("Create TimingManager");
    TimingManager tim = new TimingManager(design);

    // Get and print out worst data path delay in design
    t.stop().start("Get Max Delay");
    GraphPath<TimingVertex, TimingEdge> criticalPath = tim.getTimingGraph().getMaxDelayPath();

    // Print runtime summary
    t.stop().printSummary();
    System.out.println("\nCritical path: "+ ((int)criticalPath.getWeight())+ " ps");
    System.out.println("\nPath details:");
    System.out.println(criticalPath.toString().replace(",", ",\n")+"\n");
}

Example Output

Please refer to the timing library Javadoc and code for more implementation details. The Java source code for the timing library is located in: RapidWright/src/com/xilinx/rapidwright/timing/.

Example output using the microblaze4.dcp design is included below:

==============================================================================
    ==                          Report Timing Example                           ==
    ==============================================================================
                        Read DCP:     6.275s    436.922MBs
            Create TimingManager:     1.838s     19.600MBs
                   Get Max Delay:     0.087s      0.213MBs
    ------------------------------------------------------------------------------
                         *Total*:     8.200s    456.734MBs

    Critical path: 1921 ps

    Path details:
    [superSource -> microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Op2_reg[31]/Q,
     microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Op2_reg[31]/Q -> microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[31].ALU_Bit_I1/Not_Last_Bit.I_ALU_LUT_V5/Using_FPGA.Native/LUT6/I0,
     microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[31].ALU_Bit_I1/Not_Last_Bit.I_ALU_LUT_V5/Using_FPGA.Native/LUT6/I0 -> microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[31].ALU_Bit_I1/Not_Last_Bit.I_ALU_LUT_V5/Using_FPGA.Native/LUT6/O,
     microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[31].ALU_Bit_I1/Not_Last_Bit.I_ALU_LUT_V5/Using_FPGA.Native/LUT6/O -> microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Use_Carry_Decoding.CarryIn_MUXCY/Using_FPGA.Native_CARRY4_CARRY8/S[1],
     microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Use_Carry_Decoding.CarryIn_MUXCY/Using_FPGA.Native_CARRY4_CARRY8/S[1] -> microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Use_Carry_Decoding.CarryIn_MUXCY/Using_FPGA.Native_CARRY4_CARRY8/CO[7],
     microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Use_Carry_Decoding.CarryIn_MUXCY/Using_FPGA.Native_CARRY4_CARRY8/CO[7] -> microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[24].ALU_Bit_I1/Not_Last_Bit.MUXCY_XOR_I/Using_FPGA.Native_I1_CARRY4_CARRY8/CI,
     microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[24].ALU_Bit_I1/Not_Last_Bit.MUXCY_XOR_I/Using_FPGA.Native_I1_CARRY4_CARRY8/CI -> microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[24].ALU_Bit_I1/Not_Last_Bit.MUXCY_XOR_I/Using_FPGA.Native_I1_CARRY4_CARRY8/O[2],
     microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[24].ALU_Bit_I1/Not_Last_Bit.MUXCY_XOR_I/Using_FPGA.Native_I1_CARRY4_CARRY8/O[2] -> microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/Using_FPGA.Native_i_1__73/I0,
     microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/Using_FPGA.Native_i_1__73/I0 -> microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/Using_FPGA.Native_i_1__73/O,
     microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/Using_FPGA.Native_i_1__73/O -> microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/PreFetch_Buffer_I1/Instruction_Prefetch_Mux[33].Gen_Instr_DFF/Using_FPGA.Native_i_1__33/I0,
     microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/PreFetch_Buffer_I1/Instruction_Prefetch_Mux[33].Gen_Instr_DFF/Using_FPGA.Native_i_1__33/I0 -> microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/PreFetch_Buffer_I1/Instruction_Prefetch_Mux[33].Gen_Instr_DFF/Using_FPGA.Native_i_1__33/O,
     microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/PreFetch_Buffer_I1/Instruction_Prefetch_Mux[33].Gen_Instr_DFF/Using_FPGA.Native_i_1__33/O -> microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Branch_CMP_Op1_reg[22]/D,
     microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Branch_CMP_Op1_reg[22]/D -> superSink]

The GraphPath<TimingVertex, TimingEdge> object contains a path/set of edges. From each TimingEdge you can access the net delay and/or the logic delay values for more detail.

This example was designed to illustrate the default way to use the timing library to report the critical path and its associated data path delay.

Compare with Vivado

To compare the output of the RapidWright timing model to Vivado, run the following at the prompt:

vivado -mode tcl microblaze4.dcp

and then run the following command at the Tcl prompt:

report_timing

In Vivado 2020.1, the timing report shows a data path delay of 1.846 ns (1846 ps). Which has an error of 30 ps or ~1.6%. The full Vivado timing report is shown below:

-----------------------------------------------------------------------------------------
| Tool Version      : Vivado v.2020.1 (lin64) Build 2902540 Wed May 27 19:54:35 MDT 2020
| Date              : Mon Nov  8 22:17:03 2021
| Host              : yun-Latitude-3470 running 64-bit Ubuntu 16.04.7 LTS
| Command           : report_timing
| Design            : design_1
| Device            : xcvu3p-ffvc1517
| Speed File        : -2  PRODUCTION 1.27 02-28-2020
| Temperature Grade : E
-----------------------------------------------------------------------------------------

Timing Report

Slack (MET) :             0.051ns  (required time - arrival time)
Source:                 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Branch_CMP_Op1_reg[8]/C
                            (rising edge-triggered cell FDRE clocked by TS_clk  {rise@0.000ns fall@1.000ns period=2.000ns})
Destination:            microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/single_step_count_reg[0]/CE
                            (rising edge-triggered cell FDRE clocked by TS_clk  {rise@0.000ns fall@1.000ns period=2.000ns})
Path Group:             TS_clk
Path Type:              Setup (Max at Slow Process Corner)
Requirement:            2.000ns  (TS_clk rise@2.000ns - TS_clk rise@0.000ns)
Data Path Delay:        1.846ns  (logic 0.730ns (39.545%)  route 1.116ns (60.455%))
Logic Levels:           7  (CARRY8=4 LUT2=1 LUT4=1 LUT6=1)
Clock Path Skew:        -0.007ns (DCD - SCD + CPR)
Destination Clock Delay (DCD):    0.021ns = ( 2.021 - 2.000 )
Source Clock Delay      (SCD):    0.028ns
Clock Pessimism Removal (CPR):    0.000ns
Clock Uncertainty:      0.035ns  ((TSJ^2 + TIJ^2)^1/2 + DJ) / 2 + PE
Total System Jitter     (TSJ):    0.071ns
Total Input Jitter      (TIJ):    0.000ns
Discrete Jitter          (DJ):    0.000ns
Phase Error              (PE):    0.000ns

Location             Delay type                Incr(ns)  Path(ns)    Netlist Resource(s)
-------------------------------------------------------------------    -------------------
                         (clock TS_clk rise edge)     0.000     0.000 r
                                                      0.000     0.000 r  Clk_0 (IN)
                         net (fo=835, unset)          0.028     0.028    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/Clk
SLICE_X75Y108        FDRE                                         r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Branch_CMP_Op1_reg[8]/C
-------------------------------------------------------------------    -------------------
SLICE_X75Y108        FDRE (Prop_DFF2_SLICEL_C_Q)
                                                      0.081     0.109 f  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Branch_CMP_Op1_reg[8]/Q
                         net (fo=1, routed)           0.300     0.409    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Zero_Detect_I/Using_FPGA.Native_0[21]
SLICE_X75Y108        LUT6 (Prop_C6LUT_SLICEL_I2_O)
                                                      0.088     0.497 r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Zero_Detect_I/S0_inferred__3/i_/O
                         net (fo=1, routed)           0.010     0.507    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Zero_Detect_I/Part_Of_Zero_Carry_Start/lopt_5
SLICE_X75Y108        CARRY8 (Prop_CARRY8_SLICEL_S[2]_CO[7])
                                                      0.155     0.662 f  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Zero_Detect_I/Part_Of_Zero_Carry_Start/Using_FPGA.Native_CARRY4_CARRY8/CO[7]
                         net (fo=1, routed)           0.026     0.688    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/jump_logic_I1/MUXCY_JUMP_CARRY2/jump_carry1
SLICE_X75Y109        CARRY8 (Prop_CARRY8_SLICEL_CI_CO[1])
                                                      0.042     0.730 f  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/jump_logic_I1/MUXCY_JUMP_CARRY2/Using_FPGA.Native_CARRY4_CARRY8/CO[1]
                         net (fo=1, routed)           0.117     0.847    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/jump_logic_I1/MUXCY_JUMP_CARRY3/ex_jump_wanted
SLICE_X74Y109        LUT4 (Prop_D6LUT_SLICEM_I0_O)
                                                      0.051     0.898 r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/jump_logic_I1/MUXCY_JUMP_CARRY3/Using_FPGA.Native_i_1__100/O
                         net (fo=1, routed)           0.025     0.923    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/mem_wait_on_ready_N_carry_or/MUXCY_I/lopt_9
SLICE_X74Y109        CARRY8 (Prop_CARRY8_SLICEM_S[3]_CO[7])
                                                      0.163     1.086 r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/mem_wait_on_ready_N_carry_or/MUXCY_I/Using_FPGA.Native_CARRY4_CARRY8/CO[7]
                         net (fo=1, routed)           0.026     1.112    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/Use_MuxCy[7].OF_Piperun_Stage/MUXCY_I/of_PipeRun_carry_5
SLICE_X74Y110        CARRY8 (Prop_CARRY8_SLICEM_CI_CO[4])
                                                      0.099     1.211 r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/Use_MuxCy[7].OF_Piperun_Stage/MUXCY_I/Using_FPGA.Native_CARRY4_CARRY8/CO[4]
                         net (fo=324, routed)         0.472     1.683    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/of_piperun_for_ce
SLICE_X80Y99         LUT2 (Prop_F6LUT_SLICEM_I1_O)
                                                      0.051     1.734 r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/single_step_count[0]_i_1/O
                         net (fo=2, routed)           0.140     1.874    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/single_step_count[0]_i_1_n_0
SLICE_X80Y99         FDRE                                         r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/single_step_count_reg[0]/CE
-------------------------------------------------------------------    -------------------

                         (clock TS_clk rise edge)     2.000     2.000 r
                                                      0.000     2.000 r  Clk_0 (IN)
                         net (fo=835, unset)          0.021     2.021    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/Clk
SLICE_X80Y99         FDRE                                         r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/single_step_count_reg[0]/C
                         clock pessimism              0.000     2.021
                         clock uncertainty           -0.035     1.986
SLICE_X80Y99         FDRE (Setup_HFF2_SLICEM_C_CE)
                                                     -0.061     1.925    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/single_step_count_reg[0]
-------------------------------------------------------------------
                         required time                          1.925
                         arrival time                          -1.874
-------------------------------------------------------------------
                         slack                                  0.051