Attention

You are viewing an older version of the documentation. The latest version is v3.3.

Performance Sanity Check Testing

The following section is applicable to:

The sanity checks in this section include:

Sanity Check 0 - Cyclictest Workload
Sanity Check 1 - LMbench Workload
Sanity Check 2 - Jitter Workload
Sanity Check 3 - Caterpillar Workload
Sanity Check 4 - Latency Workload
Sanity Check 5 - MSI Latency
Sanity Check 6 - MSI Jitter
Sanity Check 7 - Rhealstone Workload
Sanity Check 8 - MMIO Latency Workload
Sanity Check 9 - CODESYS PlcLogic Workload
Sanity Check 10 - OpenGL glxgears Workload
Sanity Check 11 - Smokey
Sanity Check 12 - Smokey Net Server

Sanity Check 0 - Cyclictest Workload

Script

rt_bmark.py

Information

The benchmark will use the following input configuration:

One thread per core
Use clock_nanosleep instead of posix interval timers
Set the number of threads to the number of CPUs, with the same priority for all threads
Priority = 99
Timer interval on core 0: 100[us]
Interval increment for each core: 20[us]
Number of loops (on core 0): 30000
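
As a rough illustration of how the interval settings above combine, the following sketch computes the timer interval for the thread on each core, assuming the 20 us increment is applied once per core index (core N uses 100 + 20 * N us). The function name `thread_interval_us` and the 4-core loop are illustrative only and are not part of rt_bmark.py.

```shell
# Hypothetical helper: timer interval (in us) for the thread on a given core,
# assuming a 100 us base on core 0 and a 20 us increment per core.
thread_interval_us() {
    core=$1
    echo $((100 + 20 * core))
}

# Print the interval for each core of an assumed 4-core target.
for core in 0 1 2 3; do
    echo "core $core: $(thread_interval_us "$core") us"
done
```

Staggering the intervals this way keeps the threads from always waking at the same instant.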

Step 0

Install the Cyclictest Workload.

Step 1

Run the workload with the following command in target shell (locally or remotely over SSH or UART console):

$ /opt/benchmarking/rt-tests/rt_bmark.py

Script

start-cyclic.py

Information

The benchmark will use the following input configuration:

Core affinity = 3
Priority = 95 (chrt -f 95)
Total loops = 100000
Thread distance = 1
Interval = 1000
SSH session (not using integrated GPU)

Step 0

Install the Cyclictest Workload.

Step 1

Run the workload with the following command in target shell (locally or remotely over SSH or UART console):

$ /opt/benchmarking/rt-tests/start-cyclic.py

Sanity Check 1 - LMbench Workload

Information

The benchmark will use the following input configuration:

Core affinity ($CORE_AFFINITY) = 1
Memory total ($MEMRD_SIZE) = 192M
Stride size ($MEMRD_CHUNCKS) = 512
SSH session (not using integrated GPU)

Step 1

Run the workload with the following command in target shell (locally or remotely over SSH or UART console):

$ cd ~/;taskset -c $CORE_AFFINITY /usr/bin/lat_mem_rd -P 1 $MEMRD_SIZE $MEMRD_CHUNCKS &> /tmp/result_lat_mem_rd.txt
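
The command above expects three shell variables to be set beforehand. A minimal sketch, using the values listed under Information, that sets them and previews the resulting command line before anything is run (the `preview` variable is illustrative; the spelling `MEMRD_CHUNCKS` follows the command above):

```shell
# Values taken from the input configuration listed above.
CORE_AFFINITY=1
MEMRD_SIZE=192M
MEMRD_CHUNCKS=512

# Build and print the command line that would be executed; nothing is run here.
preview="taskset -c $CORE_AFFINITY /usr/bin/lat_mem_rd -P 1 $MEMRD_SIZE $MEMRD_CHUNCKS"
echo "$preview"
```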

Sanity Check 2 - Jitter Workload

Information

The benchmark will use the following input configuration:

Core affinity = 3
Priority = 95 (chrt -f 95)
Noisy Neighbor stress-ng affinity = 0
CAT used to assign 0x0f to COS0
CAT used to assign 0xf0 to COS1
Logging output to jitter_with_cat.log
Running for 7200 samples
SSH session (not using integrated GPU)
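
The two capacity bitmasks above do not share any bits, so COS0 and COS1 are given separate portions of the cache. A quick check with shell arithmetic, as a sketch: the mask values come from the list above, the variable names are illustrative, and each bit is assumed to represent one portion of the cache.

```shell
# Capacity bitmasks from the configuration above.
cos0=0x0f
cos1=0xf0

# A zero overlap means the two classes of service do not share cache portions.
overlap=$(( $cos0 & $cos1 ))
combined=$(( $cos0 | $cos1 ))
printf 'overlap=0x%02x combined=0x%02x\n' "$overlap" "$combined"
```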

Step 0

Install the Jitter Workload.

Step 1

Run the workload with Cache Allocation Technology using the following command in target shell (locally or remotely over SSH or UART console):

$ test_core=$(cat /sys/devices/system/cpu/isolated | rev | cut -d '-' -f1 | cut -d ',' -f1 | rev)
$ sudo /opt/pqos/pqos-helper.py --cos0 0x0f --cos1 0xf0 --assign_cos "0=0 0=1 0=2 1=${test_core:-3}" --pqos_rst --pqos_msr --command "/opt/benchmarking/jitter/start-benchmark.py --jitter_args '-c ${test_core:-3} -l jitter_with_cat.log -s 7200'"

The test results will be saved as jitter_with_cat.log.
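
The `test_core` pipeline above keeps only the last CPU listed in /sys/devices/system/cpu/isolated, and `${test_core:-3}` falls back to core 3 when that file is empty. The sketch below reproduces the same text processing on a few sample strings instead of the sysfs file. The function name `last_isolated_cpu` and the sample inputs are illustrative only.

```shell
# Same rev/cut/rev pipeline as above, applied to a string argument
# instead of /sys/devices/system/cpu/isolated.
last_isolated_cpu() {
    printf '%s\n' "$1" | rev | cut -d '-' -f1 | cut -d ',' -f1 | rev
}

# Try a range, a list, a mixed form, and an empty (no isolated CPUs) value.
for list in "1-3" "1,3" "2-3,5" ""; do
    core=$(last_isolated_cpu "$list")
    echo "isolated='$list' -> test core ${core:-3}"
done
```

Reversing the string first lets `cut` pick the last field without knowing how many ranges or commas the list contains.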

Step 2

Run the workload without Cache Allocation Technology using the following command in target shell (locally or remotely over SSH or UART console):

$ test_core=$(cat /sys/devices/system/cpu/isolated | rev | cut -d '-' -f1 | cut -d ',' -f1 | rev)
$ sudo /opt/benchmarking/jitter/start-benchmark.py --jitter_args "-c ${test_core:-3} -l jitter_without_cat.log -s 7200"

The test results will be saved as jitter_without_cat.log.

Step 3

Compare the results of the runs with and without Cache Allocation Technology. The run with Cache Allocation Technology should measure less jitter.

Sanity Check 3 - Caterpillar Workload

Information

The benchmark will use the following input configuration:

Core affinity = 3
Priority = 95 (chrt -f 95)
Noisy Neighbor stress-ng affinity = 0
CAT used to assign 0x0f to COS0
CAT used to assign 0xf0 to COS1
Logging output to caterpillar_with_cat.log
Running for 7200 samples (about 3 hours)
SSH session (not using integrated GPU)

Step 0

Install the Caterpillar Workload.

Step 1

Run the workload with Cache Allocation Technology using the following command in target shell (locally or remotely over SSH or UART console):

$ test_core=$(cat /sys/devices/system/cpu/isolated | rev | cut -d '-' -f1 | cut -d ',' -f1 | rev)
$ sudo /opt/pqos/pqos-helper.py --cos0 0x0f --cos1 0xf0 --assign_cos "0=0 0=1 0=2 1=${test_core:-3}" --pqos_rst --pqos_msr --command "/opt/benchmarking/caterpillar/start-benchmark.py --caterpillar_args '-c ${test_core:-3} -l caterpillar_with_cat.log -s 7200'"

The test results will be saved as caterpillar_with_cat.log.

Step 2

Run the workload without Cache Allocation Technology using the following command in target shell (locally or remotely over SSH or UART console):

$ test_core=$(cat /sys/devices/system/cpu/isolated | rev | cut -d '-' -f1 | cut -d ',' -f1 | rev)
$ sudo /opt/benchmarking/caterpillar/start-benchmark.py --caterpillar_args "-c ${test_core:-3} -l caterpillar_without_cat.log -s 7200"

The test results will be saved as caterpillar_without_cat.log.

Step 3

Compare the results of the runs with and without Cache Allocation Technology. The run with Cache Allocation Technology should measure less jitter.

Sanity Check 4 - Latency Workload

Information

The benchmark will use the following input configuration:

Core affinity = 1
Noisy Neighbor stress-ng affinity = 2
Test time = 10800 seconds
SSH session (not using integrated GPU)

Step 0

Install the Latency Workload.

Step 1

Run the workload with the following command in target shell (locally or remotely over SSH or UART console):

$ cd /opt/benchmarking/rhealstone && python run_xlatency.py -T 10800 --stress
# On images without glxgears, exclude the graphics stress:
$ python run_xlatency.py -T 10800 --stress --no-gfx

The test results will be saved as latency_test_results.txt.

Sanity Check 5 - MSI Latency

Information

The benchmark will use the following input configuration:

coreSpecIRQ = 1
coreSpecWQ = 1
irqSpec = 1
irqPeriod = 10
irqCount = 2000000
verbosity = 1
offsetStart = 0
blockIRQ = 0

Step 0

Install MSI Latency.

Step 1

Run the workload with the following command in target shell (locally or remotely over SSH or UART console):

$ cd /opt/benchmarking/msi-latency
$ ./msiLatencyTest.sh 1 10 2000000
$ cat /sys/kernel/debug/msi_latency_test/current_value   # Get current result

Sanity Check 6 - MSI Jitter

Information

The benchmark will use the following input configuration:

core affinity = 1
runtime = 21600(s)
interval = 100(ms)
unbind_igb_id = ID of the igb device to unbind (look it up with lspci)
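
To put the runtime and interval above in perspective, the following sketch converts the runtime to hours and counts how many intervals fit into it. It assumes one measurement per interval, which this page does not state. The variable names are illustrative.

```shell
# Values from the configuration above.
runtime_s=21600
interval_ms=100

# Convert the runtime to hours and count the intervals it contains.
hours=$(( runtime_s / 3600 ))
intervals=$(( runtime_s * 1000 / interval_ms ))
echo "runtime: ${hours} h, intervals: ${intervals}"
```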

Step 0

Install MSI Jitter.

Step 1

Run the workload with the following command in target shell (locally or remotely over SSH or UART console):

$ cd /opt/benchmarking/msi-jitter
$ ./irq_rcu.sh
$ ./run_msijitter.sh <unbind_igb_id> 1 100 21600
# Get the current test result
$ cat /sys/kernel/debug/msi_jitter_test/current_value

Sanity Check 7 - Rhealstone Workload

Information

The benchmark will use the following input configuration:

Core affinity = 1
Total loops = 100
Noisy Neighbor stress-ng affinity = 2
SSH session (not using integrated GPU)

Step 0

Install the Rhealstone Workload.

Step 1

Run the workload with the following command in target shell (locally or remotely over SSH or UART console):

$ cd /opt/benchmarking/rhealstone && ./run_rhealstone_bmark_stress.py 100

The test result will be saved as a file named rhealstone_test_result.txt.

Sanity Check 8 - MMIO Latency Workload

Step 0

Set up the workload with the following input configuration:

Install the MMIO Latency Workload.

Restrict tasks that are not real-time related to core 0:

$ /opt/benchmarking/mmio-latency/configRTcores.sh

Find the physical MMIO address using either of the following options:

Option 1:

# Find the BDF of the device to test, then read its physical MMIO address with `lspci -vvv -s $BDF`.
$ lspci -nn                # e.g. 00:02.0 SATA controller
$ lspci -vvv -s 00:02.0    # e.g. Region 0: Memory at 80002000

Option 2:

$ lspci -k
$ cat /proc/bus/pci/devices | grep <name> | awk '{print $4}'
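
In /proc/bus/pci/devices, the fourth whitespace-separated field is the first base address register (BAR0) in hexadecimal, with flag bits in its lowest four bits. The sketch below runs the same `awk` extraction on a made-up sample line (device 00:02.0, hypothetical IDs and driver name) and clears the flag bits to get the base address. Whether mmioLatency.sh expects the flag bits cleared is not stated here, so treat the masking step as an assumption.

```shell
# Hypothetical sample line in the /proc/bus/pci/devices layout (not from a real system):
# bus/devfn, vendor+device, IRQ, BAR0..BAR5 + ROM, their sizes, driver name.
sample='0010 80860000 12 80002004 0 0 0 0 0 0 2000 0 0 0 0 0 0 ahci'

# The fourth field is BAR0, including its flag bits.
bar0=$(printf '%s\n' "$sample" | awk '{print $4}')

# Clear the low four flag bits to obtain the base address.
base=$(printf '%x' $(( 0x$bar0 & ~0xf )))
echo "BAR0 field: $bar0, base address: $base"
```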

Step 1

Run the workload with the following command in target shell (locally or remotely over SSH or UART console):

$ /opt/benchmarking/mmio-latency/mmioLatency.sh <mmio-address>

Sanity Check 9 - CODESYS PlcLogic Workload

Information

The benchmark will use the following input configuration:

Step 1

Run the workload with the following command in target shell (locally or remotely over SSH or UART console):

$ systemctl restart codesyscontrol.service
$ /opt/benchmarking/codesys/start_codesys_native.sh

You can consider the sanity check passed if the stdout results are similar to the data provided in the following links:

Sanity Check 10 - OpenGL glxgears Workload

Step 0

Make sure that mesa is installed on the target, which must have been set up with the Graphical Windowing System option.

Install the Cyclictest Workload.

Step 1

Run the workload with the following command in target shell (locally or remotely over SSH or UART console):

$ cd /opt/benchmarking/rhealstone && python3 run_opengl_glxgears.py -T 60 --cyclictest

The test results will be saved as a file named OpenGL_glxgears_result.txt. You can consider the sanity check passed if the test results are similar to the following:

==========modify date:Mon Jul  6 10:18:22 2020==========#==============================================================================
#  Test case (1/1): Test.glxgears.OpenGL.cyclictest
#..............................................................................
#Starting stress(cyclictest)
#  Command: 'taskset -c 1 /opt/benchmarking/rt-tests/cyclictest -p 99 -i 250 -m -N'
#Starting test
#  Command: taskset -c 0 glxgears -geometry 600x800
#Hung task detection not supported
#  (File /proc/sys/kernel/hung_task_timeout_secs not found)
#10:18:22: Start of execution
#10:19:22:  1/ 1: min: 59.997
#10:19:22: Test completed. Actual execution time:0:01:00
#Terminated stress
#Min FPS: 59.997
#PASS

Test.OpenGL.cyclictest[FPS]:
59.997
PASS:Test.OpenGL.cyclictest

==========modify date:Mon Jul  6 10:19:33 2020==========#==============================================================================
#  Test case (1/1): Test.glxgears.OpenGL.latency
#..............................................................................
#Starting stress(latency)
#  Command: 'taskset -c 1 /usr/bin/latency -c 1 -p 250'
#Starting test
#  Command: taskset -c 0 glxgears -geometry 600x800
#Hung task detection not supported
#  (File /proc/sys/kernel/hung_task_timeout_secs not found)
#10:19:33: Start of execution
#10:20:33:  1/ 1: min: 59.998
#10:20:33: Test completed. Actual execution time:0:01:00
#Terminated stress
#Min FPS: 59.998
#PASS

Test.OpenGL.latency[FPS]:
59.998
PASS:Test.OpenGL.latency

Sanity Check 11 - Smokey

Step 0

Set up the workload with the following input configuration:

Set up rtnet:

# Find RT_DRIVER
$ lspci -v | grep ' Ethernet controller: Intel Corporation I210 Gigabit Network Connection ' -A 15

For example:

01:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
        Subsystem: Intel Corporation I210 Gigabit Network Connection
        Flags: bus master, fast devsel, latency 0, IRQ 18
        Memory at a1100000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 3000 [size=32]
        Memory at a1120000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 00-07-32-ff-ff-6f-ee-90
        Capabilities: [1a0] Transaction Processing Hints
        Kernel driver in use: rt_igb
        Kernel modules: igb, rt_igb

$ sudo vim /etc/rtnet.conf
# Set the RT driver reported by lspci ("Kernel driver in use")
RT_DRIVER="rt_igb"
# Change to the device address you found; this is an example address
REBIND_RT_NICS="0000:01:00.0"
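
The REBIND_RT_NICS value above is the bus address from the first column of the lspci output with a PCI domain prefix. A minimal sketch of that conversion, assuming the device sits in PCI domain 0000 (running `lspci -D` prints the domain explicitly). The variable names are illustrative.

```shell
# First line of the lspci example output shown above.
lspci_line='01:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)'

# Keep everything before the first space: the bus:device.function address.
bdf=${lspci_line%% *}

# Assumption: the device is in PCI domain 0000.
rebind="0000:$bdf"
echo "REBIND_RT_NICS=\"$rebind\""
```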

Start rtnet:

$ sudo /usr/sbin/rtnet master
$ sudo /usr/sbin/rtifconfig rteth0 up
$ sudo /usr/sbin/rtifconfig

List the available test cases:

$ sudo /usr/lib/xenomai/testsuite/smokey --list

Step 1

Run the workload with the following command in target shell (locally or remotely over SSH or UART console):

# Default
$ sudo /usr/lib/xenomai/testsuite/smokey --run=12 --verbose=2
# Optional
$ sudo /usr/lib/xenomai/testsuite/smokey --run=12 --verbose=2 rtnet_driver=rt_loopback rtnet

Parameters	Description
--run	run [portion of] the test list, 12 = `net_udp`
--verbose	set verbosity to desired level, default = 1
`rtnet_driver`	choose network driver, default = `rt_loopback`
`rtnet_interface`	choose network interface, default = `rteth0`
`rtnet_rate`	choose packet rate, default = 1000, meaning one UDP datagram is sent/received every 1000000000/1000 ns = 1 ms
`rtnet_duration`	choose test duration, default = 10, meaning the test lasts 10 seconds
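
The rate and duration parameters above translate into a packet period and an expected packet count. A small sketch of that arithmetic, assuming `rtnet_rate` is expressed in packets per second, as the 1 ms example implies. The shell variable names are illustrative.

```shell
# Default values from the parameter table above.
rtnet_rate=1000
rtnet_duration=10

# Period between datagrams in nanoseconds, and datagrams sent over the whole test.
period_ns=$(( 1000000000 / rtnet_rate ))
packets=$(( rtnet_rate * rtnet_duration ))
echo "period: ${period_ns} ns, packets in ${rtnet_duration} s: ${packets}"
```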

You can consider the sanity check passed if the stdout results are as shown in the following figure, that is, they contain no zero or n/a values.

Sanity Check 12 - Smokey Net Server

Step 0

Set up the workload with the following input configuration:

Set up rtnet on the server and the client:

# Find the values of RT_DRIVER and REBIND_RT_NICS for /etc/rtnet.conf, and the MAC address for /tmp/rtnet_smokey.log
$ lspci -v | grep ' Ethernet controller: Intel Corporation I210 Gigabit Network Connection ' -A 15

For example:

01:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
      Subsystem: Intel Corporation I210 Gigabit Network Connection
      Flags: bus master, fast devsel, latency 0, IRQ 18
      Memory at a1100000 (32-bit, non-prefetchable) [size=128K]
      I/O ports at 3000 [size=32]
      Memory at a1120000 (32-bit, non-prefetchable) [size=16K]
      Capabilities: [40] Power Management version 3
      Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
      Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
      Capabilities: [a0] Express Endpoint, MSI 00
      Capabilities: [100] Advanced Error Reporting
      Capabilities: [140] Device Serial Number 00-07-32-ff-ff-6f-ee-90
      Capabilities: [1a0] Transaction Processing Hints
      Kernel driver in use: rt_igb
      Kernel modules: igb, rt_igb

# Modify these values in the configuration file according to your board (on both server and client)
$ sudo vim /etc/rtnet.conf

# For the I210, the RT driver is "rt_igb"
RT_DRIVER="rt_igb"
# Change to the device address of your board; this is an example address
REBIND_RT_NICS="0000:01:00.0"

# Clear IPADDR and NETMASK; these values are configured in the following steps
IPADDR=""
NETMASK=""

# Set TDMA_MODE to master on both boards (server and client); this value is overwritten in the following steps
TDMA_MODE="master"

# Clear TDMA_SLAVES; the client address is configured through the file /tmp/rtnet_smokey.log
TDMA_SLAVES=""

Create rtnet_smokey.log for the server:

# Create the file /tmp/rtnet_smokey.log on the server and record the MAC and IP address of your client board in it; for example:
$ sudo vim /tmp/rtnet_smokey.log
00:07:32:6B:A7:FE 192.208.1.101

List the available test cases:

$ sudo /usr/lib/xenomai/testsuite/smokey --list

Step 1

Run the workload with the following command in target shell (locally or remotely over SSH or UART console):

Start rtnet and configure the IP address on server:

# Load all necessary modules
$ sudo /usr/sbin/rtnet start

# The server is reconfigured by smokey_net_server, so detach it first; TDMA is not enabled in this case
$ sudo /usr/sbin/rtcfg rteth0 detach

# Configure the server IP address
$ sudo /usr/sbin/rtifconfig rteth0 up 192.208.1.100 netmask 255.255.255.0

Start rtnet and configure the IP address on client:

# Load all necessary modules
$ sudo /usr/sbin/rtnet start

# The client is reconfigured in a later step, so detach it first; TDMA is not enabled in this case
$ sudo /usr/sbin/rtcfg rteth0 detach

# Configure the client IP address
$ sudo /usr/sbin/rtifconfig rteth0 up 192.208.1.101 netmask 255.255.255.0

Start smokey_net_server on the server:

$ sudo /usr/lib/xenomai/testsuite/smokey_net_server rteth0 --file /tmp/rtnet_smokey.log

Start smokey on the client:

# Configure client mode
$ sudo /usr/sbin/rtcfg rteth0 client -c

# Announce the client to the server
$ sudo /usr/sbin/rtcfg rteth0 announce

# Run the net_udp test case
$ sudo /usr/lib/xenomai/testsuite/smokey --run=12 --verbose=2 rtnet_driver=rt_igb rtnet_interface=rteth0 rtnet_rate=1000 rtnet_duration=10

Parameters	Description
--run	run [portion of] the test list, 12 = `net_udp`
--verbose	set verbosity to desired level, default = 1
`rtnet_driver`	choose network driver, default = `rt_loopback`
`rtnet_interface`	choose network interface, default = `rteth0`
`rtnet_rate`	choose packet rate, default = 1000, meaning one UDP datagram is sent/received every 1000000000/1000 ns = 1 ms
`rtnet_duration`	choose test duration, default = 10, meaning the test lasts 10 seconds

You can consider the sanity check passed if the stdout results are similar to the following, that is, they contain no zero or n/a values:

Note: It is recommended to use the onboard Ethernet ports on the target to achieve the best real-time performance.