Overview of IEEE 802.1AS Generalized Precision Time Protocol (gPTP)¶
Precision Time Protocol (PTP) is defined in IEEE 1588 as Precision Clock Synchronization for Networked Measurements and Control Systems. The IEEE 802.1AS standard specifies the use of IEEE 1588 specifications, where applicable, in the context of IEEE Std 802.1D-2004 and IEEE Std 802.1Q-2005. It includes distributed device clocks of varying precision and stability.
Generalized Precision Time Protocol (gPTP) is designed specifically for industrial control systems and is optimal for use in distributed systems because it requires minimal bandwidth and very little CPU processing overhead.
An IEEE 802.1AS capable TSN domain is made up of gPTP enabled Ethernet endpoints and switches.
The following figure illustrates the PTP clocks in a primary and secondary Ethernet port hierarchy within a TSN domain.
gPTP Clock Types¶
The Egress and Ingress timestamp precision is of paramount importance for robust IEEE 802.1AS time synchronization.
It is best obtained by carrying PTP directly over Layer 2 Ethernet (EtherType 0x88F7), with VLAN tags and PTP Hardware Clock (PHC) assisted timestamping, on the following MAC multicast addresses:
01-1B-19-00-00-00 for all messages except peer delay measurement
01-80-C2-00-00-0E for peer delay measurement
A PTP network may comprise the following clock types:
Ordinary Clock¶
An Ordinary Clock (OC) functions as a single PTP port and can be selected by the Best Master Clock Algorithm (BMCA) to serve as a primary port or secondary port within an IEEE 802.1 TSN domain.
OCs are the most common clock type because they are used as Ethernet endpoints on a network that is connected to devices requiring synchronization.
Best Master Clock Algorithm
The Best Master Clock Algorithm (BMCA) is the foundation of the PTP functionality. It provides a means to establish the best master clock in its subdomain out of all advertised IEEE 802.1AS clocks on the network, using PTP unicast or multicast packets.
The BMCA must run locally on each Ethernet port of the IEEE 802.1 TSN network to continuously monitor PTP packets at every Announce interval to quickly adjust for changes in time synchronization configuration.
BMCA based on IEEE 1588-2008 uses Announce PTP general messages for advertising clock properties.
BMCA assesses the best master clock in the subdomain using the following criteria:
Clock quality (GPS is considered the highest quality)
Accuracy of the clock’s time base (decimal from 0-255)
Stability of the local oscillator
Closest clock to the grandmaster
For synchronizing a local free-running clock, BMCA based on IEEE 1588-2008 collects several set-points to determine the best clock using the following attributes, in the indicated order:
Priority1 - User-assigned priority to each clock. The range is from 0 to 255.
Class - Class to which the clock belongs. Each class has its own priority.
Accuracy - Precision between clock and UTC in nanoseconds.
Variance - Variability of the clock (ECI default 0xFFFF).
Priority2 - Final-defined priority. The range is from 0 to 255.
Extended Unique Identifier (EUI) - 64-bit Unique Identifier.
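The ordered comparison above can be sketched as a tuple comparison in which the lowest value wins at each step. This is a simplified illustration, not the full IEEE 1588 data set comparison (which also considers, for example, the number of hops to the grandmaster). The class name `AnnouncedClock`, the function `best_master`, and the second clock identity are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnouncedClock:
    """Subset of the attributes a clock advertises in Announce messages."""
    priority1: int                   # 0..255, lower wins
    clock_class: int                 # lower wins
    clock_accuracy: int              # value as configured, lower wins
    offset_scaled_log_variance: int  # lower wins
    priority2: int                   # 0..255, lower wins
    clock_identity: int              # 64-bit EUI, final tie-breaker

    def rank(self) -> tuple:
        # Attributes in comparison order; tuples compare element by element.
        return (self.priority1, self.clock_class, self.clock_accuracy,
                self.offset_scaled_log_variance, self.priority2,
                self.clock_identity)


def best_master(clocks):
    """Return the advertised clock with the lowest rank tuple."""
    return min(clocks, key=AnnouncedClock.rank)


# Values borrowed from the profiles in this document; the switch identity
# is a made-up placeholder.
gm = AnnouncedClock(248, 248, 0xFE, 0xFFFF, 248, 0x001395FFFE3462A0)
switch = AnnouncedClock(254, 248, 0xFE, 0xFFFF, 248, 0x003A98FFFE000001)
print(best_master([switch, gm]) is gm)
```

Because `priority1` is compared first, lowering it on one node is the simplest way to force that node to win the election.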
Transparent Clock¶
The role of Transparent Clocks (TC) in an IEEE 802.1 TSN domain with TSN switch hops is to update the time-interval field that is part of the PTP event message.
TC provides correction that compensates for TSN switch propagation delay on downstream data-link to all secondary ports receiving the PTP event message sequences during time-synchronization cycles.
There are two types of TCs:
End-to-end (E2E) TC measures the transit time (also called residence time) of the PTP event message. It does not provide compensation for the propagation delay of the link itself.
Peer-to-peer (P2P) TC measures the residence time (same as for E2E TC) of the PTP event message and provides the compensation for the link propagation delay.
P2P delay measurement is very useful when the network is reconfigured by a redundancy protocol mechanism utilized by IEEE 802.1AS for gPTP.
Peer-to-Peer Delay Measurement Mechanism
The P2P delay request-response mechanism improves the accuracy of the transit time offset (also known as residence time) by including the link delay measured between two clock ports implementing the P2P TC.
P2P uses the following PTP general and event messages to generate and communicate link delay information:
Pdelay_Req
Pdelay_Resp
Pdelay_Resp_Follow_Up
The upstream link delay is the estimated packet propagation delay between the upstream neighbor P2P TC and the P2P TC under consideration.
Both residence time and upstream link delay are added to the correction field of the PTP event message. As a result, the correction field of the message received by the secondary port contains the sum of all residence times and link delays along the path.
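A minimal sketch of the arithmetic behind the peer delay exchange and the correction field follows. It assumes a symmetric link and ignores the rate-ratio compensation a real gPTP implementation applies. The function names and the timestamp values are hypothetical.

```python
def mean_link_delay(t1, t2, t3, t4):
    """Mean one-way link delay from one Pdelay exchange, in nanoseconds.

    t1: Pdelay_Req sent by the requester (requester clock)
    t2: Pdelay_Req received by the responder (responder clock)
    t3: Pdelay_Resp sent by the responder (responder clock)
    t4: Pdelay_Resp received by the requester (requester clock)
    """
    # Round-trip time minus the responder turnaround time, halved.
    return ((t4 - t1) - (t3 - t2)) / 2


def accumulate_correction(hops):
    """Sum residence time and upstream link delay over each P2P TC hop."""
    correction = 0
    for residence_ns, link_delay_ns in hops:
        correction += residence_ns + link_delay_ns
    return correction


# Illustrative numbers only: the responder clock is offset from the
# requester clock, which cancels out of the calculation.
print(mean_link_delay(t1=1000, t2=5300, t3=6300, t4=2600))
print(accumulate_correction([(1200, 300), (900, 250)]))
```

The offset between the two clocks cancels because `t4 - t1` and `t3 - t2` are each measured on a single clock.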
Grandmaster Clock¶
The Grandmaster (GM) clock is the primary source of time for clock synchronization using the PTP protocol. The GM clock usually has a very precise time source, such as an external GPS (for example, over the UART NMEA protocol) or an atomic clock with an accurate pulse-per-second (PPS) signal.
When the Industrial IEEE 802.1 TSN domain does not require any external time reference and only needs to be synchronized with a single time reference, the GM clock can be a free-running oscillator.
Boundary Clock¶
A Boundary Clock (BC) in an IEEE 802.1 TSN domain operates in place of a standard network switch. The BC typically provides an interface between TSN domains. Such a device needs more than one PTP enabled Ethernet port, and each port provides access to a separate PTP communication path.
The BC uses the BMCA to select the best clock seen by any port. The selected Ethernet port is then set as a secondary port. The primary port synchronizes the clocks connected downstream, while the secondary port synchronizes with the upstream master clock.
Linux PTP Stack 802.1AS gPTP Profile¶
Intel® Ethernet Controller provides hardware offloading capability to synchronize the clocks in packet-based networks as defined in IEEE 802.1AS PTP event message sequences.
The open source Linux PTP 3.1 is the essential ingredient to set up an IEEE 802.1AS-2011 defined gPTP Profile on Intel® Ethernet Controller, since it:
Supports IEEE 802.1AS-2011 in the role of TSN endpoint asCapable
Implements OC, TC, and BC
Transports PTPv2 messages over UDP/IPv4, UDP/IPv6, and Layer 2 Ethernet (EtherType 0x88f7)
Implements Unicast, Multicast, and Hybrid mode operations
Implements P2P and E2E delay measurement mechanisms (one-step or two-step)
Supports hardware offloading and software time-stamping via the Linux SO_TIMESTAMPING socket option
Supports the Linux PHC subsystem by using the clock_gettime family of calls, including the clock_adjtimex system call
Adds ts2phc pin control and GPS NMEA external time-reference source
Supports VLAN interfaces
IEEE 802.1as gPTP Profile Essential¶
The following section is applicable to:

The gPTP profiles can be installed from the ECI APT repository to match IEEE standard 802.1AS:
[Ethernet PCI 8086:7aac and 8086:7aad] 12th Gen Intel® Core™ S-Series [Alder Lake] Ethernet GbE Time-Sensitive Network Controller
[Ethernet PCI 8086:a0ac] 11th Gen Intel® Core™ U-Series and P-Series [Tiger Lake] Ethernet GbE Time-Sensitive Network Controller
[Ethernet PCI 8086:4b32 and 8086:4ba0] Intel® Atom™ x6000 Series [Elkhart Lake] Ethernet GbE Time-Sensitive Network Controller
[Ethernet PCI 8086:15f2] Intel® Ethernet Controller I225-LM for Time-Sensitive Networking (TSN)
[Ethernet PCI 8086:125b] Intel® Ethernet Controller I226-LM for Time-Sensitive Networking (TSN)
[Ethernet PCI 8086:157b, 8086:1533,…] Intel® Ethernet Controller I210-IT for Time-Sensitive Networking (TSN)
Set up the ECI APT repository, then run the following command to install this component:
Install from individual Deb package
$ sudo apt install iotg-gptp-configs
Examples: gPTP profiles
The ptp4l daemon establishes gPTP Global Time Reference assuming either the GM or the OC role:
taskset -c 1 ptp4l -mP2Hi enp1s0.vlan -f /opt/intel/iotg_tsn_ref_sw/common/gPTP.cfg --step_threshold=2 --socket_priority 2 > /var/log/ptp4l.log 2>&1 &
Synchronizes the PTP Hardware Clock (PHC) from the Intel® Ethernet Controller:
-f - Specifies one of the /opt/intel/iotg_tsn_ref_sw/common/gPTP*.cfg template files
-i - Specifies the network interface that this instance of ptp4l is controlling
--step_threshold - Is set so that ptp4l converges faster when a time jump occurs
-ml [1-7] - Enables log-level messages on standard output
For more information on the ptp4l configuration options, refer to the ptp4l man page.
The following table exemplifies the IEEE 802.1as time domain configurations with or without the TSN switch.
Note: Some of the following gPTP profiles require a TSN switch, such as the Kontron Kbox C-102-2 TSN StarterKit.
gPTP Profiles
ECI Endpoints (Master-only Port): ptp4l -Hi <eth> -f <.cfg> --socket_priority=
Kontron PCIE-0400-TSN (Switch Port): ptp4l -f <.cfg> -ml 7
ECI Endpoints OCx (Secondary-state Port): ptp4l -Hi <eth> -f <.cfg> --socket_priority=

DeepCascade 802.1as
ECI GM clock (Master-only Port):
[global]
gmCapable 1
priority1 248
priority2 248
logAnnounceInterval 1
logSyncInterval -3
syncReceiptTimeout 3
neighborPropDelayThresh 800
min_neighbor_prop_delay -20000000
assume_two_step 1
path_trace_enabled 1
follow_up_info 1
ptp_dst_mac 01:80:C2:00:00:0E
network_transport L2
delay_mechanism P2P
tx_timestamp_timeout 100
transportSpecific 0x1
clockClass 248
clockAccuracy 0xfe
offsetScaledLogVariance 0xffff
timeSource 0xa0
Kontron TC clock (Switch Port):
[global]
gmCapable 0
priority1 254
tc_spanning_tree 1
summary_interval 1
assume_two_step 1
follow_up_info 1
ptp_dst_mac 01:80:C2:00:00:0E
network_transport L2
delay_mechanism P2P
tx_timestamp_timeout 10
clock_type P2P_TC
productDescription Kontron;
manufacturerIdentity 00:3a:98
[CE01]
transportSpecific 0x1
[CE02]
transportSpecific 0x1
[CE03]
transportSpecific 0x1
[CE04]
transportSpecific 0x1
ECI OCx clock (Secondary-state Port):
[global]
gmCapable 0
priority1 248
priority2 248
logAnnounceInterval 1
logSyncInterval -3
syncReceiptTimeout 3
neighborPropDelayThresh 8000
min_neighbor_prop_delay -20000000
assume_two_step 1
path_trace_enabled 1
follow_up_info 1
ptp_dst_mac 01:80:C2:00:00:0E
network_transport L2
delay_mechanism P2P
tx_timestamp_timeout 100
transportSpecific 0x1

Star 802.1as
Kontron GM clock (Switch Port):
[global]
gmCapable 1
priority1 248
priority2 248
logAnnounceInterval 1
logSyncInterval -3
syncReceiptTimeout 3
neighborPropDelayThresh 800
min_neighbor_prop_delay -20000000
assume_two_step 1
path_trace_enabled 1
follow_up_info 1
ptp_dst_mac 01:80:C2:00:00:0E
network_transport L2
delay_mechanism P2P
tx_timestamp_timeout 100
summary_interval 0
productDescription Kontron;
manufacturerIdentity 00:3a:98
[CE01]
transportSpecific 0x1
[CE02]
transportSpecific 0x1
[CE03]
transportSpecific 0x1
[CE04]
transportSpecific 0x1
ECI OC clocks (Secondary-state Port):
[global]
gmCapable 0
priority1 248
priority2 248
logAnnounceInterval 1
logSyncInterval -3
syncReceiptTimeout 3
neighborPropDelayThresh 8000
min_neighbor_prop_delay -20000000
assume_two_step 1
path_trace_enabled 1
follow_up_info 1
ptp_dst_mac 01:80:C2:00:00:0E
network_transport L2
delay_mechanism P2P
tx_timestamp_timeout 100
transportSpecific 0x1

Direct 802.1as
ECI GM clock (Master-only Port):
[global]
gmCapable 1
priority1 248
priority2 248
logAnnounceInterval 1
logSyncInterval -3
syncReceiptTimeout 3
neighborPropDelayThresh 800
min_neighbor_prop_delay -20000000
assume_two_step 1
path_trace_enabled 1
follow_up_info 1
ptp_dst_mac 01:80:C2:00:00:0E
network_transport L2
delay_mechanism P2P
tx_timestamp_timeout 100
transportSpecific 0x1
clockClass 248
clockAccuracy 0xfe
offsetScaledLogVariance 0xffff
timeSource 0xa0
ECI OC clocks (Secondary-state Port):
[global]
gmCapable 0
priority1 248
priority2 248
logAnnounceInterval 1
logSyncInterval -3
syncReceiptTimeout 3
neighborPropDelayThresh 8000
min_neighbor_prop_delay -20000000
assume_two_step 1
path_trace_enabled 1
follow_up_info 1
ptp_dst_mac 01:80:C2:00:00:0E
network_transport L2
delay_mechanism P2P
tx_timestamp_timeout 100
transportSpecific 0x1
ptp4l[338318.346]: selected /dev/ptp1 as PTP clock
ptp4l[338318.394]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[338318.394]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[338322.332]: port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[338322.332]: selected local clock 001395.fffe.3462a0 as best master
ptp4l[338322.332]: port 1: assuming the grand master role
Use the pmc utility to configure ptp4l at runtime:
$ pmc -u -b 0 -t 1 "SET GRANDMASTER_SETTINGS_NP clockClass 248 clockAccuracy 0xfe offsetScaledLogVariance 0xffff currentUtcOffset 37 leap61 0 leap59 0 currentUtcOffsetValid 1 ptpTimescale 1 timeTraceable 1 frequencyTraceable 0 timeSource 0xa0"
This utility incrementally modifies certain clock parameters at runtime using the SET GRANDMASTER_SETTINGS_NP runtime API:
currentUtcOffset - Time delta between TAI and UTC (default is 37 seconds).
Class - Class to which the clock belongs. Each class has its own priority.
Accuracy - Precision between clock and UTC, in nanoseconds.
Variance - Variability of the clock (default 0xFFFF).
timeSource - TimeSource field used in announce messages.
Extended Unique Identifier (EUI) - A 64-bit unique identifier.
The phc2sys daemon synchronizes the 802.1as network time (the so-called PHC) and the Linux System clock (CLOCK_REALTIME, CLOCK_TAI, and so on):
$ taskset -c 1 phc2sys -c CLOCK_REALTIME --step_threshold=1 -s enp1s0 --transportSpecific=1 -O 0 -w -ml 7 > /var/log/phc2sys.log 2>&1 &
Initiates periodic System clock adjustment from the 802.1AS time-domain reference:
-s - Specifies the PHC of the network interface (enp1s0 in this example) as the primary clock
-c - Specifies the system clock as the secondary clock
--transportSpecific - Is required when running phc2sys in a gPTP domain
--step_threshold - Is set so that phc2sys converges faster when a time jump occurs
-w - Makes phc2sys wait until ptp4l is synchronized
-ml [1-7] - Enables log messages on standard output
For more information on the phc2sys configuration options, refer to the phc2sys man page.
Note: Optionally, enp1s0.vlan can be set with Virtual LANs (VLANs) on egress Ethernet L2/Ethernet PTP event message (EtherType 0x88f7) to set hardware queue affinity.
ts2phc¶
ts2phc is used to synchronize one or more PTP hardware clocks using external timestamps.
Usage: ts2phc -f [configuration file]
Note: When using pulse-per-second (PPS) and auxiliary timestamping (AUXTS) signals, the pin mapping depends on the hardware pin headers provisioned on the motherboard and the discrete PCIe boards:
Board | PPS | AUXTS/EXTTS | Comments
i210-IT | SDP0 | SDP1 | Server i210-IT x1 PCIe board
i225-LM | SDP0 | SDP1 |
i226-LM | SDP0 | SDP1 |
TGL-mGBE | J2H4 pin 1 | J2H4 pin 6 | Tiger Lake UP3 Intel® RVP
EHL-mGBE0 | J2E1 Pin 18 | J2E1 Pin 9 | Elkhart Lake Intel® CRB
EHL-mGBE1 | J2E1 Pin 9 | J2E1 Pin 11 | Elkhart Lake Intel® CRB
ADL-mGBE0 | J7H8 pin 3 | J7H8 pin 2 | Alder Lake Intel® RVP
ADL-mGBE1 | J7H8 pin 9 | J7H8 pin 8 | Alder Lake Intel® RVP
Attention
PPS and AUXTS/EXTTS pins may be physically accessible on commercial off-the-shelf industrial PC products. For more information, contact the hardware vendor.
echo 0 0 0 1 0 > /sys/class/ptp/ptpX/period
Usage:
echo <idx> <ts> <tns> <ps> <pns> > /sys/class/ptp/ptpX/period
Where:
<idx> - PPS number
<ts> - Start time (second), based on PTP time
<tns> - Start time (nanosecond), based on PTP time
<ps> - Period (s)
<pns> - Period (ns)
ptpX - PTP device on Ethernet secondary or primary port
Configuration File
The configuration file is divided into sections. Each section starts with a line containing its name enclosed in brackets and followed by settings. Each setting is placed on a separate line and it contains the name of the option and the value separated by whitespace characters. Empty lines and lines starting with # are ignored.
There are two different section types:
Global section (indicated as [global]) sets the program options and default secondary clock options. Other sections are clock-specific sections, and they override the default options.
Secondary clock section provides the name of the configured clock (for example, [eth0]). Secondary clocks specified in the configuration file need not be specified with the -c command line option.
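The file layout described above (bracketed section names, whitespace-separated name and value pairs, comments starting with #) can be read with a few lines of code. The sketch below is an illustration of the format only, not the parser that ts2phc itself uses. The function names `parse_config` and `effective_settings` are hypothetical.

```python
def parse_config(text):
    """Parse sectioned 'name value' settings into {section: {name: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # empty lines and comment lines are ignored
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
            continue
        if current is None:
            raise ValueError(f"setting outside of a section: {line!r}")
        parts = line.split(None, 1)  # option name, then the value
        sections[current][parts[0]] = parts[1] if len(parts) > 1 else ""
    return sections


def effective_settings(sections, clock):
    """Global defaults overridden by the clock-specific section."""
    merged = dict(sections.get("global", {}))
    merged.update(sections.get(clock, {}))
    return merged


SAMPLE = """
# PPS input on SDP0
[global]
logging_level 6
ts2phc.pulsewidth 100000000
[eth6]
ts2phc.channel 0
ts2phc.extts_polarity both
"""

config = parse_config(SAMPLE)
print(effective_settings(config, "eth6"))
```

Keeping the values as strings avoids guessing the type of each option; a caller can convert the ones it needs.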
Examples
#
# This example uses a PPS signal from a GPS receiver as an input to
# the SDP0 pin of an Intel i210 card. The pulse from the receiver has
# a width of 100 milliseconds.
#
# Important! The polarity is set to "both" because the i210 always
# time stamps both the rising and the falling edges of the input
# signal.
#
[global]
use_syslog 0
verbose 1
logging_level 6
ts2phc.pulsewidth 100000000
[eth6]
ts2phc.channel 0
ts2phc.extts_polarity both
ts2phc.pin_index 0

#
# This example shows ts2phc keeping a group of three Intel i210 cards
# synchronized to each other in order to form a Transparent Clock.
# The cards are configured to use their SDP0 pins connected in
# hardware. Here eth3 and eth4 will be slaved to eth6.
#
# Important! The polarity is set to "both" because the i210 always
# time stamps both the rising and the falling edges of the input
# signal.
#
[global]
use_syslog 0
verbose 1
logging_level 6
ts2phc.pulsewidth 500000000
[eth6]
ts2phc.channel 0
ts2phc.master 1
ts2phc.pin_index 0
[eth3]
ts2phc.channel 0
ts2phc.extts_polarity both
ts2phc.pin_index 0
[eth4]
ts2phc.channel 0
ts2phc.extts_polarity both
ts2phc.pin_index 0
Overview of IEEE 802.1Q-2018 Enhancements for Scheduled Traffic (EST)¶
IEEE 802.1Q-2018 has enforced predictable time of delivery by dividing Ethernet traffic into different classes, thus ensuring that at specific times only one traffic class (or set of traffic classes) has access to the network.
TSN endpoints and TSN bridges need time-aware traffic scheduling to enable quality-of-service (QoS) for time-sensitive stream communication between Talkers and Listeners. To enable time-aware scheduling, bridges support the mechanisms defined by the IEEE 802.1Q-2018 Enhancements for Scheduled Traffic (EST) feature (formerly known as 802.1Qbv).

By introducing a hardware differentiator, Intel® Linux Ethernet controllers support OT network QoS administration by enforcing a Gate Control List (GCL) that defines the traffic queues permitted to transmit at a specific time within a control network cycle:
Differentiate the traffic between high priority and low priority, or best-effort traffic (PCP)
Manage transmission hardware queues that need to be switched ON or OFF according to a global time-aware scheduling-policy, which indicates the duration for which an entry will be active on each port of each network device.

Virtual LANs (VLANs)¶
Virtual LANs (VLANs) QoS is the central pillar of the IEEE 802.1Q standard for all endpoints and bridges to support Forwarding and Queuing Enhancements for Time-Sensitive Streams (FQTSS) mechanisms to:
Recognize VLAN Priority Information (PCP)
Identify Stream Reservation (SR) traffic classes
The VLAN interface is created using the ip-link command from the iproute2 project, which is pre-installed in ECI.
PCP value | Priority | Acronym | Traffic Types
1 | 0 (lowest) | BK | Background
0 | 1 (default) | BE | Best effort
2 | 2 | EE | Excellent effort
3 | 3 | CA | Critical applications
4 | 4 | VI | Video, < 100 ms latency and jitter
5 | 5 | VO | Voice, < 10 ms latency and jitter
6 | 6 | IC | Inter-network control
7 | 7 (highest) | NC | Network control
The egress-qos-map argument defines a mapping of Linux internal packet priority (SO_PRIORITY) to VLAN header PCP field for outgoing frames.
The following example creates a VLAN interface for traffic classes. Socket egress messages with SO_PRIORITY=2 map to VLAN PCP=2, while those with SO_PRIORITY=3 map to VLAN PCP=3:
$ sudo ip link add link enp5s0 name enp5s0.vlan type vlan id 5 egress-qos-map 2:2 3:3 && cat /proc/net/vlan/enp5s0.vlan
enp5s0.vlan VID: 5 REORDER_HDR: 1 dev->priv_flags: 1021
total frames received 0
total bytes received 0
Broadcast/Multicast Rcvd 0
total frames transmitted 0
total bytes transmitted 0
Device: enp5s0
INGRESS priority mappings: 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0
EGRESS priority mappings: 2:2 3:3
For further information on the command arguments, refer to the ip-link(8) man page.
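The effect of an egress-qos-map string can be sketched as a simple lookup. This is a model of the mapping only, under the assumption that priorities without an explicit entry are sent with PCP 0. The function names are hypothetical.

```python
def parse_egress_qos_map(spec):
    """Turn 'FROM:TO FROM:TO ...' into {socket_priority: pcp}."""
    mapping = {}
    for pair in spec.split():
        prio_text, pcp_text = pair.split(":")
        prio, pcp = int(prio_text), int(pcp_text)
        if not 0 <= pcp <= 7:
            raise ValueError(f"PCP is a 3-bit field, got {pcp}")
        mapping[prio] = pcp
    return mapping


def pcp_for(mapping, so_priority):
    """PCP written into the VLAN tag for a frame with this socket priority."""
    # Assumption: unmapped priorities fall back to PCP 0.
    return mapping.get(so_priority, 0)


egress = parse_egress_qos_map("2:2 3:3")
for priority in (0, 2, 3, 5):
    print(priority, "->", pcp_for(egress, priority))
```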
Linux Traffic Control (TC)¶
Linux Traffic Control (TC) provides various packet scheduling policies to ensure that the inter-packet transmission latency on Intel® Ethernet Controllers deterministically meets a specific industrial network deadline, a user-defined cycle deadline, or both.
Every ECI node is capable of supporting the transmission algorithms specified in the FQTSS chapter of IEEE 802.1Q-2018 via TC Queuing Discipline (QDisc) usages.
Linux Network QDisc presents several offload mechanism options, leveraging multiple Ethernet hardware-queues, to realize 802.1Q-2018 Enhancements for Scheduled Traffic (EST) mechanism (formerly known as 802.1Qbv).

For more information, refer to the TSN Documentation Project for Linux.
Earliest TxTime First (NET_SCHED_ETF) QDisc¶
While not an actual FQTSS feature, Intel® Ethernet Linux drivers also provide the Earliest TxTime First (ETF) QDisc, which enables the LaunchTime feature present in Intel® Ethernet Controller I210, Intel® Ethernet Controller I225-LM/I226-LM, and TGL mGBE.
In Linux, this hardware feature is enabled through the SO_TXTIME socket option and the ETF QDisc.
The SO_TXTIME socket option allows applications to configure the transmission time for each frame, while the ETF QDisc ensures that the frames coming from multiple sockets are sent to the hardware ordered by transmission time.
The following steps describe how an application sends a time-scheduled packet:
Open a raw and low-level packet interface socket:
int fd = socket(AF_PACKET, SOCK_RAW, IPPROTO_RAW);
Set the socket priority option (SO_PRIORITY) corresponding to the desired VLAN's QoS:
setsockopt(fd, SOL_SOCKET, SO_PRIORITY, &priority, sizeof(priority));
Set the socket transmit time option (SO_TXTIME):
sk_txtime.clockid = CLOCK_TAI;
sk_txtime.flags = (use_deadline_mode | receive_errors);
if (setsockopt(fd, SOL_SOCKET, SO_TXTIME, &sk_txtime, sizeof(sk_txtime))) {
    exit_with_error("setsockopt SO_TXTIME");
}
Note: The flags take bit-wise fields: report_error at bit 1 and deadline_mode at bit 0. For details on these fields, refer to Add a new socket option for a future transmit time and Make etf report drops on error_queue.
For every TX packet, the user space application specifies the per-packet transmit time in the socket control message (cmsg) ancillary data before sending it:
char control[CMSG_SPACE(sizeof(__u64))] = {0}; // buffer holding the ancillary data
struct msghdr msg = {0};
struct cmsghdr *cmsg;
struct iovec iov;
iov.iov_base = rawpktbuf;               // the transmit packet
iov.iov_len = sizeof(rawpktbuf);        // the size of the transmit packet
msg.msg_iov = &iov;                     // internal scatter/gather array for transmit packet
msg.msg_iovlen = 1;                     // one buffer in the scatter/gather array
msg.msg_control = control;              // CMSG_FIRSTHDR() returns NULL without a control buffer
msg.msg_controllen = sizeof(control);
cmsg = CMSG_FIRSTHDR(&msg);             // obtain the control message
cmsg->cmsg_level = SOL_SOCKET;          // Set to socket level
cmsg->cmsg_type = SCM_TXTIME;           // Set ancillary data is TXTIME socket control message type
cmsg->cmsg_len = CMSG_LEN(sizeof(__u64));
*((__u64 *) CMSG_DATA(cmsg)) = txtime;  // Set per-packet transmit time
sendmsg(fd, &msg, 0);
Note: A TX packet sent from the user-space copies data when the packet enters the kernel-space. The copied packets are stored in the data buffer that is pointed to by sk_buff, the socket buffer structure inside the Linux kernel that tracks network packets. For additional details, refer to socket interface in the Linux networking subsystem.
Data copying becomes the bottleneck when a 100us network cycle with very low-latency injection time is needed.
The ETF QDisc operates on a per-queue basis, so either a TAPRIO or an MQPRIO QDisc configuration is also required to expose the hardware transmission queues.
Both MQPRIO and TAPRIO also define how Linux network priorities map into traffic classes and how traffic classes map into hardware queues.
The following example illustrates queue configuration using the MQPRIO QDisc for Intel® Ethernet Controller I210, which has four transmission queues:
$ sudo tc qdisc add dev eth0 parent root handle 6666 mqprio \
    num_tc 3 \
    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
    queues 1@0 1@1 2@2 \
    hw 0
After running this command:
MQPRIO is installed as root QDisc on the eth0 interface with the handle ID 6666.
Three different traffic-classes are defined (from 0 to 2), where Linux priority 3 maps into traffic class 0, Linux priority 2 maps into traffic class 1, and all other Linux priorities map into traffic class 2.
Packets belonging to traffic class 0 go into one queue at offset 0 (that is, queue index 0 or TXQ0), packets from traffic class 1 go into one queue at offset 1 (that is, queue index 1 or TXQ1), and packets from traffic class 2 go into two queues at offset 2 (that is, queues index 2 and 3, or TXQ2 and TXQ3).
No hardware offload is enabled.
Note: By configuring MQPRIO, Stream Reservation (SR) Class A (Priority 3) is enqueued on TXQ0, the highest priority transmission queue in Intel® Ethernet Controller, while SR Class B (Priority 2) is enqueued on TXQ1, the second priority. All best-effort traffic goes into TXQ2 or TXQ3.
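The priority to traffic class to queue resolution described for the MQPRIO example can be reproduced with a short lookup. This sketch only models the `map` and `queues` arguments of the command above. The function names are hypothetical.

```python
def parse_queues(spec):
    """Parse 'count@offset' groups (one per traffic class) into queue lists."""
    groups = []
    for item in spec.split():
        count_text, offset_text = item.split("@")
        count, offset = int(count_text), int(offset_text)
        groups.append(list(range(offset, offset + count)))
    return groups


def queues_for_priority(prio_map, queue_groups, priority):
    """Return (traffic_class, [queue indexes]) for a Linux priority."""
    traffic_class = prio_map[priority]
    return traffic_class, queue_groups[traffic_class]


# 'map' and 'queues' values taken from the MQPRIO command in this section.
PRIO_MAP = [2, 2, 1, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
QUEUES = parse_queues("1@0 1@1 2@2")

for priority in (3, 2, 0):
    print(priority, queues_for_priority(PRIO_MAP, QUEUES, priority))
```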
In the following example, the ETF QDisc is installed on TXQ0 and the offload feature is enabled, since the Intel® Ethernet Controller I210 driver supports the LaunchTime feature:
$ ethtool -K eth0 hw-tc-offload on
$ sudo tc qdisc add dev eth0 parent 6666:1 etf \
    clockid CLOCK_TAI \
    delta 500000 \
    offload
After running this command:
The clockid parameter specifies the clock that is used to set the transmission timestamps of frames (only CLOCK_TAI is supported). Moreover, ETF requires the system clock to be in sync with the PTP Hardware Clock.
The delta parameter specifies how long before the transmission timestamp the ETF QDisc sends the frame to hardware. That value depends on multiple factors and can vary from system to system. This example uses 500us.
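To make the relationship between the per-packet transmit time and the delta parameter concrete, the sketch below computes a transmit time aligned to a network cycle and the moment at which ETF would hand the frame to the hardware. The cycle alignment is an assumption about how an application might choose its transmit times, and all function names and numbers are hypothetical.

```python
def next_txtime(now_ns, base_ns, cycle_ns, offset_ns=0):
    """First transmit time >= now_ns that lies offset_ns into a cycle."""
    first = base_ns + offset_ns
    if now_ns <= first:
        return first
    elapsed = now_ns - first
    cycles = -(-elapsed // cycle_ns)  # ceiling division on integers
    return first + cycles * cycle_ns


def etf_release_time(txtime_ns, delta_ns):
    """Time at which the ETF QDisc passes the frame to the hardware."""
    return txtime_ns - delta_ns


def is_schedulable(now_ns, txtime_ns, delta_ns):
    """True if the frame is enqueued before ETF needs to release it."""
    return now_ns <= etf_release_time(txtime_ns, delta_ns)


# 1 ms cycle, frame sent 100 us into each cycle, delta of 500 us as above.
txtime = next_txtime(now_ns=1_000_250_000, base_ns=1_000_000_000,
                     cycle_ns=1_000_000, offset_ns=100_000)
print(txtime, etf_release_time(txtime, 500_000))
```

In a real application `now_ns` would be read from CLOCK_TAI, the same clock given to the ETF QDisc, so that the socket and the QDisc agree on the time base.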
Important
Developer Tips about ethtool -K eth0 hw-tc-offload on:
[Ethernet PCI 8086:7aac and 8086:7aad] 12th Gen Intel® Core™ S-Series [Alder Lake] Ethernet GbE Time-Sensitive Network Controller: only TXQ[2] and TXQ[3] can achieve hardware TC ETF offload.
[Ethernet PCI 8086:a0ac] 11th Gen Intel® Core™ U-Series and P-Series [Tiger Lake] Ethernet GbE Time-Sensitive Network Controller: only TXQ[2] and TXQ[3] can achieve hardware TC ETF offload.
[Ethernet PCI 8086:4b32 and 8086:4ba0] Intel® Atom™ x6000 Series [Elkhart Lake] Ethernet GbE Time-Sensitive Network Controller: only TXQ[5] and TXQ[6] can achieve hardware TC ETF offload.
[Ethernet PCI 8086:15f2] Intel® Ethernet Controller I225-LM for Time-Sensitive Networking (TSN): all TXQ[0-3] can achieve hardware TC ETF offload.
[Ethernet PCI 8086:125b] Intel® Ethernet Controller I226-LM for Time-Sensitive Networking (TSN): all TXQ[0-3] can achieve hardware TC ETF offload.
[Ethernet PCI 8086:157b, 8086:1533,…] Intel® Ethernet Controller I210-IT for Time-Sensitive Networking (TSN): only TXQ[0] and TXQ[1] can achieve hardware TC ETF offload.
For more information on command arguments, refer to the tc-etf man page.
Time Aware Priority (NET_SCHED_TAPRIO) QDisc¶
IEEE 802.1Q-2018 introduces the Enhancements for Scheduled Traffic (EST) feature (formerly known as 802.1Qbv), which allows packet transmission from each Endpoint and Bridge hardware queue to be scheduled relative to a known time-slice Gate Control List (GCL) within the TSN domains.
The Linux GCL abstraction is simple:
A gate is associated with each transmission queue (TXQ).
The Open or Closed state of the transmission gate determines the queued frames policy.
Each port is associated with a GCL, which contains an ordered list of gate operations.
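The GCL abstraction can be modelled as a cyclic list of (gate mask, interval) entries. The sketch below finds the entry that is active at a given time and the gates it opens. It assumes that bit N of the gate mask controls gate N and that gates map one-to-one onto TXQs. The function names and the schedule values are illustrative.

```python
def open_gates(gatemask, num_gates):
    """Indexes of the gates that a mask leaves open (bit N controls gate N)."""
    return [gate for gate in range(num_gates) if gatemask & (1 << gate)]


def active_entry(schedule, base_ns, now_ns):
    """Return (index, gatemask) of the GCL entry active at now_ns.

    schedule is an ordered list of (gatemask, interval_ns) tuples that
    repeats every sum(interval_ns) nanoseconds, starting at base_ns.
    """
    cycle_ns = sum(interval for _, interval in schedule)
    if cycle_ns <= 0:
        raise ValueError("schedule must cover a non-zero cycle time")
    position = (now_ns - base_ns) % cycle_ns
    for index, (gatemask, interval) in enumerate(schedule):
        if position < interval:
            return index, gatemask
        position -= interval
    raise AssertionError("unreachable: position is always within the cycle")


# Five 5 ms windows: one queue open per window, then all gates closed.
SCHEDULE = [(0x01, 5_000_000), (0x02, 5_000_000), (0x04, 5_000_000),
            (0x08, 5_000_000), (0x00, 5_000_000)]

index, mask = active_entry(SCHEDULE, base_ns=0, now_ns=7_000_000)
print(index, hex(mask), open_gates(mask, 4))
```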

For more details on the EST algorithm, refer to section 8.6.8.4 of IEEE 802.1Q standard.
The EST feature is supported in Linux via the TAPRIO QDisc. Similar to MQPRIO, the QDisc defines how Linux networking stack priorities map into traffic classes and how traffic classes map into hardware queues. This feature also enables you to configure the GCL for a given interface.

For more information on command arguments, refer to the tc-taprio man page.
Also, refer to the user guides - Intel(R) Ethernet Controller I210 Time-Sensitive Networking (TSN) Linux Reference Software and Intel® Tiger Lake UP3 (TGL) Ethernet MAC Controller Time-Sensitive Networking (TSN) Reference Software.
TAPRIO Enhancements for Scheduled Traffic (EST) TXQ GCL Offload Mode
The IEEE 802.1Q-2018 standard introduced the Enhancements for Scheduled Traffic (EST) (formerly known as 802.1Qbv) hardware feature to enforce a packet transmission scheduling policy from each Endpoint and Bridge hardware queue within the predefined TSN domain cycle time, possibly with preemption at the time-window transitions.
ethtool -K eth0 hw-tc-offload on
BASE=$(expr $(date +%s) + 5)000000000
echo "$BASE"
tc qdisc add dev eth0 parent root handle 100 taprio \
    num_tc 4 \
    map 0 1 2 3 3 3 3 3 3 3 3 3 3 3 3 3 \
    queues 1@0 1@1 1@2 1@3 \
    base-time $BASE \
    sched-entry S 01 5000000 \
    sched-entry S 02 5000000 \
    sched-entry S 04 5000000 \
    sched-entry S 08 5000000 \
    sched-entry S 00 5000000 \
    flags 0x2 \
    txtime-delay 0
Following parameters are added to support GCL offload-mode:
flags - Enables GCL hwoffload value=0x2 capability and TX and RX of traffic-shaping frames automatically (Note: These are transparent to the user or kernel).
txtime-delay - The value zero indicates that the packet scheduling and preemption are entirely hardware managed.
preempt - Enables FPE hwoffload queue-bitmask (for example, with 1110, TXQ[1-3] are preemptible and TXQ[0] is an express queue and not preemptible, as the tc output in this section shows) (Note: These are transparent to the user or kernel).
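Two values in these commands are easy to get wrong by hand: the base-time, which the shell snippets build as "now plus five seconds" in nanoseconds, and the preempt bit string. The sketch below reproduces both. It assumes that the right-most bit of the preempt string corresponds to TXQ[0], which is how the `preemptible` lines of the tc output in this section read. The function names are hypothetical.

```python
def base_time_ns(now_s, lead_s=5):
    """Whole-second start time in nanoseconds, lead_s seconds from now.

    Same arithmetic as BASE=$(expr $(date +%s) + 5)000000000.
    """
    return (now_s + lead_s) * 1_000_000_000


def preemptible_flags(bitmask, num_queues):
    """Per-queue flags (index 0 = TXQ[0]) decoded from a binary string."""
    value = int(bitmask, 2)
    # Assumption: the least-significant (right-most) bit is TXQ[0].
    return [(value >> queue) & 1 for queue in range(num_queues)]


def cycle_time_ns(intervals_ns):
    """Cycle time implied by the sched-entry intervals."""
    return sum(intervals_ns)


print(base_time_ns(1694087903))
print(preemptible_flags("1110", 4))
print(cycle_time_ns([5_000_000] * 5))
```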
IEEE 802.1Qbu Frame-Preemption (FPE) hwoffload uses the TAPRIO flags GCL offload-mode: tc qdisc .. taprio ... preempt 1110
Linux traffic-class software defines which hardware queues are set as preemptible and which as express (that is, non-preemptible):
On Intel® Atom™ x6000 Series [Elkhart Lake] Ethernet GbE Time-Sensitive Network Controller [Ethernet PCI 8086:4b32 and 8086:4ba0].
Intel® Ethernet Controller Linux driver supports IEEE 802.1Qbu Frame-Preemption TXQ[0-3] configuration via SET_AND_HOLD and SET_AND_RELEASE GCL hardware offload.
ethtool -K enp0s29f1 hw-tc-offload on
BASE=$(expr $(date +%s) + 5)000000000
echo "$BASE"
tc qdisc add dev enp0s29f1 parent root handle 100 taprio \
    num_tc 8 \
    map 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 \
    queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
    base-time $BASE \
    sched-entry S 01 5000000 \
    sched-entry H 01 5000000 \
    sched-entry R 02 5000000 \
    flags 0x2 \
    txtime-delay 0 \
    preempt 1110
High priority traffic classes (TC) would map to non-preemptible queues (for example, Queue 0), as express queue packets are not preempted.
TXQ[0] is express by default (that is, non-preemptible); the other TXQ[1-7] are preemptible.
qdisc taprio 100: root refcnt 9 tc 8 map 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0
queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count 1 offset 4 count 1 offset 5 count 1 offset 6 count 1 offset 7 count 1
preemptible 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
clockid invalid flags 0x2 base-time 1694087908000000000 cycle-time 1000000 cycle-time-extension 0
index 0 cmd S gatemask 0x1 interval 500000
index 1 cmd H gatemask 0x1 interval 500000
index 2 cmd R gatemask 0x2 interval 500000
On Intel® Ethernet Controller I225-LM for Time-Sensitive Networking (TSN) [Ethernet PCI 8086:15f2] and Intel® Ethernet Controller I226-LM for Time-Sensitive Networking (TSN) [Ethernet PCI 8086:125b]
Intel® Ethernet Controller Linux driver supports IEEE 802.1Qbu Frame-Preemption TXQ[0-3] configuration via bitmask; that is, it does NOT support SET_AND_HOLD and SET_AND_RELEASE GCL based hardware offload.
ethtool -K enp2s0 hw-tc-offload on
ethtool --set-frame-preemption enp2s0 fp on
BASE=$(expr $(date +%s) + 5)000000000
echo "$BASE"
tc qdisc add dev enp2s0 parent root handle 100 taprio \
    num_tc 4 \
    map 0 1 2 3 3 3 3 3 3 3 3 3 3 3 3 3 \
    queues 1@0 1@1 1@2 1@3 \
    base-time $BASE \
    sched-entry S 01 5000000 \
    sched-entry S 0e 5000000 \
    flags 0x2 \
    txtime-delay 0 \
    preempt 0001
Note
To double-check that GCL FPE offload-mode is enabled:
$ ethtool --show-frame-preemption enp2s0 fp on
Frame preemption settings for enp2s0:
enabled: enabled
additional fragment size: 68
verified: 0
verification disabled: 1
Low priority traffic classes (TC) would map to preemptible queues (for example, Queue 1-3), whereas express queue packets are not preempted.
qdisc taprio 100: root refcnt 5 tc 4 map 0 1 2 3 3 3 3 3 3 3 3 3 3 3 3 3
queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count 1
preemptible 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
clockid invalid flags 0x2 base-time 1694101561000000000 cycle-time 10000000 cycle-time-extension 0
index 0 cmd S gatemask 0x1 interval 5000000
index 1 cmd S gatemask 0xe interval 5000000
Important
Consider the known limitations of TXQ GCL offload-mode:
TXQ GCL not explicitly defined
For a GCL without an explicit TXQ definition, the Intel® I22x-LM/igc offload-mode applies nothing for the user-undefined gate. So, the behavior falls back onto the Intel® I22x-LM default GCL: ALL Opened Gates. For more details, refer to the implementation of the igc_save_qbv_schedule and igc_tsn_clear_schedule functions in kernel-source/drivers/net/ethernet/intel/igc/igc_main.c.
The workaround is to always explicitly define gate behavior in TAPRIO. With the following schedule, you would expect all traffic to be blocked because no gate was set to open.
    qdisc taprio 100: root refcnt 5 tc 4 map 0 0 0 1 0 3 0 2 0 0 0 0 0 0 0 0
    queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count 1
    clockid invalid flags 0x2 base-time 1658915097000000000 cycle-time 2500000 cycle-time-extension 0
    index 0 cmd S gatemask 0 interval 2500000
However, enabling verbose debug shows that the traffic classes are not blocked according to the EST TXQ GCL policies.
    rmmod igc
    insmod /lib/modules/$(uname -r)/kernel/drivers/net/ethernet/intel/igc/igc.ko debug=16
To block traffic classes according to the EST TXQ GCL policies, explicitly define a time of 0 for all gates. Then, to pass the check that the total of the time slots equals the cycle time, assign the rest of the cycle time to the NULL gate.
    qdisc taprio 100: root refcnt 5 tc 4 map 0 0 0 1 0 3 0 2 0 0 0 0 0 0 0 0
    queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count 1
    clockid invalid flags 0x2 base-time 1658915097000000000 cycle-time 2500000 cycle-time-extension 0
    index 0 cmd S gatemask 0xf interval 0
    index 1 cmd S gatemask 0x0 interval 250000
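The check that the time slots add up to the cycle time can be reproduced by hand before loading a schedule. The following is a minimal sketch, not the kernel or driver logic: `gcl_cycle_ok` is a hypothetical helper name, and the values in the usage line are the cycle time and intervals shown in the I225/I226 listing earlier in this section.

```shell
# Hypothetical helper: sum the sched-entry intervals (ns) and compare
# the total with the intended cycle time (ns).
# usage: gcl_cycle_ok CYCLE_NS INTERVAL_NS...
gcl_cycle_ok() {
    gc_cycle=$1
    shift
    gc_total=0
    for gc_interval in "$@"; do
        gc_total=$((gc_total + gc_interval))
    done
    if [ "$gc_total" -eq "$gc_cycle" ]; then
        echo "ok: intervals sum to $gc_total ns"
    else
        echo "mismatch: intervals sum to $gc_total ns, cycle-time is $gc_cycle ns"
        return 1
    fi
}

# Two 5 ms windows against a 10 ms cycle.
gcl_cycle_ok 10000000 5000000 5000000
```

A zero-length entry, as used in the workaround above, contributes nothing to the sum, so the remaining entry has to cover the whole cycle for the check to pass.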
Packet drop with TxLaunchTime (SO_TXTIME)
Besides the ETF QDisc, the TAPRIO QDisc also checks whether the TxTime field of a packet from a socket with SO_TXTIME has timed out. If the TxTime is not located in any GCL transmission window, the TAPRIO QDisc policy drops the L2 packet. For more details, refer to the implementation of the find_entry_to_transmit function in kernel-source/net/sched/sch_taprio.c.
The workaround is that if the socket is of SO_TXTIME type, set the TxLaunchTime of each packet being sent, and always locate the TxLaunchTime in a transmission window.
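To make the "locate the TxLaunchTime in a transmission window" rule concrete, the sketch below checks a candidate TxTime against a gate schedule before the application commits to it. It is an illustration under stated assumptions, not a port of find_entry_to_transmit: `txtime_in_window` is a hypothetical helper, the schedule is passed as GATEMASK:INTERVAL pairs, the cycle time is assumed to be the sum of the intervals, and bit N of the gate mask is assumed to control traffic class N.

```shell
# Hypothetical helper: succeed if TXTIME_NS falls in a window whose gate
# mask has the bit for traffic class TC set.
# usage: txtime_in_window BASE_NS TXTIME_NS TC GATEMASK:INTERVAL_NS...
txtime_in_window() {
    tw_base=$1
    tw_txtime=$2
    tw_tc=$3
    shift 3
    # First pass: the cycle time is taken as the sum of all intervals.
    tw_cycle=0
    for tw_entry in "$@"; do
        tw_cycle=$((tw_cycle + ${tw_entry#*:}))
    done
    [ "$tw_cycle" -gt 0 ] || return 2
    [ "$tw_txtime" -ge "$tw_base" ] || return 2
    # Position of the TxTime inside the repeating cycle.
    tw_offset=$(( ($tw_txtime - $tw_base) % $tw_cycle ))
    # Second pass: find the entry covering that position, test the gate bit.
    tw_start=0
    for tw_entry in "$@"; do
        tw_mask=${tw_entry%%:*}
        tw_len=${tw_entry#*:}
        if [ "$tw_offset" -lt $((tw_start + tw_len)) ]; then
            if [ $(( ($tw_mask >> $tw_tc) & 1 )) -eq 1 ]; then
                return 0
            fi
            return 1
        fi
        tw_start=$((tw_start + tw_len))
    done
    return 1
}

# Schedule and base-time taken from the I225/I226 listing in this section.
B=1694101561000000000
if txtime_in_window "$B" $((B + 7000000)) 1 0x01:5000000 0x0e:5000000; then
    echo "TC 1 gate open at TxTime"
else
    echo "TC 1 gate closed at TxTime"
fi
```

With that schedule, a TxTime 7 ms after the base time lands in the second window (gate mask 0x0e), so traffic class 1 passes while traffic class 0 would not.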
MAC does not inherit TxLaunchTime from the application directly (SO_TXTIME)
Intel® I22x-LM/igc calculates LaunchTime according to the equation: LaunchTime = TxTime - BaseT - StQT[q]. The upstream Linux fix "igc: Fix sending packets too early", which corrects the wrong equation LaunchTime = TxTime - BaseT, has been backported only to Linux v5.15. Consequently, without the fix the MAC will postpone the packet transmission for the StQT[q] duration and possibly miss the right transmission window. For more details, refer to the implementation of the igc_tx_launchtime function in kernel-source/drivers/net/ethernet/intel/igc/igc_main.c.
As a workaround, for Linux Intel v5.10/lts, use ONLY the first GCL transmission window for transmitting packets with SO_TXTIME, until the upstream driver is fixed.
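The effect of the missing StQT[q] term can be seen with plain arithmetic. The sketch below only evaluates the two equations quoted above; it does not reproduce igc_tx_launchtime. The function names are hypothetical, and StQT[q] is assumed here to be the start offset of queue q's window within the cycle (5 ms for the second window of the I225/I226 schedule shown earlier).

```shell
# Corrected equation: LaunchTime = TxTime - BaseT - StQT[q]
# usage: launch_time TXTIME_NS BASET_NS STQT_NS
launch_time() {
    echo $(( $1 - $2 - $3 ))
}

# Equation without the StQT[q] term: LaunchTime = TxTime - BaseT
# usage: launch_time_unfixed TXTIME_NS BASET_NS
launch_time_unfixed() {
    echo $(( $1 - $2 ))
}

BASET=1694101561000000000            # base-time from the listing above
TXTIME=$((BASET + 7000000))          # application asks for 7 ms into the cycle
STQT=5000000                         # assumed window start of the queue

launch_time "$TXTIME" "$BASET" "$STQT"       # prints 2000000
launch_time_unfixed "$TXTIME" "$BASET"       # prints 7000000
```

The two results differ by exactly StQT[q], which is the extra postponement described above. For the first window StQT[q] is 0 and both equations agree, which is consistent with the workaround of using only the first window.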
Inaccurate reported TxLaunchTime
tcpdump only applies its timestamp when the packet leaves the QDisc, not when it leaves the Ethernet HMAC. So, the reported time is earlier than the effective TxLaunchTime.
As a workaround, report the most precise timestamping by enabling the SOF_TIMESTAMPING_TX_HARDWARE socket option. The timestamp is the epoch time of the L2 packet when leaving HMAC.
For support on limitations and open issues, contact Intel Support (Log in with Intel® account) and fill a new Intel Edge Software Recipes Case under the Category Software/Driver/OS and the Subcategory Industrial Edge Control Software.
TAPRIO Enhancements for Scheduled Traffic (EST) TXQ GCL Assisted-mode
Certain Intel® Ethernet Controllers do not provide the GCL hw-tc-offload feature.
However, Linux 802.1Q-2018 EST can still be leveraged on an ECI node since it provides the latest version of the TAPRIO QDisc with the SO_TXTIME assisted-mode, which combines skb_data with SO_TXTIME as provided by the ETF QDisc at kernel-level (reducing packet scheduling corner-cases).
In the SO_TXTIME assisted-mode, the LaunchTime feature in the Intel® Ethernet Controller I210 is used to schedule packet transmissions, emulating the EST algorithm.
For all the skb_data packets that do not have the SO_TXTIME field set, the TAPRIO QDisc will:
Set the transmit timestamp (set skb->tstamp).
Ensure that the transmit time for the packet is set to when the gate is open.
Validate whether the timestamp (in skb->tstamp) occurs when the gate corresponding to skb’s traffic class is open (when SO_TXTIME is set).
This mechanism reduces the risk, on Intel® Ethernet Controllers, of non-time-critical packets being transmitted outside of their time slice, whether because of delay induced in the 802.3 and PHY or because high-priority hardware queues starve the low-priority queues.
    ethtool -K eth0 hw-tc-offload on
    BASE=$(expr $(date +%s) + 5)000000000
    echo "$BASE"
    tc -d qdisc replace dev enp1s0 parent root handle 100 taprio \
        num_tc 4 \
        map 0 1 2 3 3 3 3 3 3 3 3 3 3 3 3 3 \
        queues 1@0 1@1 1@2 1@3 \
        base-time $BASE \
        sched-entry S 01 5000000 \
        sched-entry S 02 5000000 \
        sched-entry S 04 5000000 \
        sched-entry S 08 5000000 \
        sched-entry S 00 5000000 \
        clockid CLOCK_TAI \
        flags 0x1 \
        txtime-delay 5000000
    tc qdisc replace dev enp1s0 parent 100:1 \
        etf clockid CLOCK_TAI delta 5000000 \
        offload skip_sock_check
The following parameters are added to support Enhancements for Scheduled Traffic (EST) TXQ GCL SO_TXTIME assisted-mode:
flags
- A value of 0x1 enables the Enhancements for Scheduled Traffic (EST) SO_TXTIME assisted-mode, in which the kernel sets the TXTIME timestamp in the DMA descriptor of every egress packet (that is, skb_data) sent through the ETF hardware offload queues.
txtime-delay
- Indicates the minimum time it will take for the packet to hit the wire. This is useful in determining whether the packet can be transmitted in the time remaining while its gate is open.
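The role of txtime-delay can be pictured as a simple comparison between the time left in the current gate window and the time a packet needs to reach the wire. The sketch below is a simplified model, not the TAPRIO implementation: `fits_in_gate` is a hypothetical helper and the offsets are nanoseconds measured from the start of the cycle.

```shell
# Hypothetical helper: decide whether a packet can still be sent in the
# current window, given the time it needs to hit the wire.
# usage: fits_in_gate NOW_OFFSET_NS GATE_CLOSE_OFFSET_NS TXTIME_DELAY_NS
fits_in_gate() {
    fg_remaining=$(( $2 - $1 ))
    if [ "$fg_remaining" -ge "$3" ]; then
        echo "send now: $fg_remaining ns left, needs $3 ns"
    else
        echo "defer: $fg_remaining ns left, needs $3 ns"
        return 1
    fi
}

# 1 ms into a 5 ms window with a 2 ms delay budget.
fits_in_gate 1000000 5000000 2000000
```

With the values used in the command above (5 ms windows and a 5 ms txtime-delay), this model would accept a packet only at the very start of a window, which illustrates why txtime-delay is best kept close to the real transmit latency.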
Important
Consider the known limitations of TXQ GCL SO_TXTIME assisted-mode:
MAC does not inherit TxTime from application directly (SO_TXTIME)
Intel® I22x-LM/igc calculates LaunchTime according to the equation: LaunchTime = TxTime - BaseT - StQT[q]. However, the current Linux v5.15 and v5.10 upstream igc driver uses a wrong equation: LaunchTime = TxTime - BaseT. Consequently, the MAC will postpone packet transmission for the StQT[q] duration and possibly miss the right transmission window. For more details, refer to the implementation of the igc_tx_launchtime function in kernel-source/drivers/net/ethernet/intel/igc/igc_main.c.
As a workaround, use ONLY the first GCL transmission window for transmitting packets with SO_TXTIME, until the upstream driver is fixed.
Inaccurate reported TxLaunchTime
tcpdump only applies its timestamp when the packet leaves the QDisc, not when it leaves the Ethernet HMAC. So, the reported time is earlier than the effective TxLaunchTime.
As a workaround, report the most precise timestamping by enabling the SOF_TIMESTAMPING_TX_HARDWARE socket option. The timestamp is the epoch time of the L2 packet when leaving HMAC.
TAPRIO Enhancements for Scheduled Traffic (EST) TXQ GCL Software-fallback
This mechanism reduces the risk, on Intel® Ethernet Controllers, of non-time-critical packets being transmitted outside of their time slice, whether because of delay induced in the 802.3 and PHY or because high-priority hardware queues starve the low-priority queues.
    BASE=$(expr $(date +%s) + 5)000000000
    echo "$BASE"
    tc qdisc add dev eth0 parent root handle 100 taprio \
        num_tc 4 \
        map 0 1 2 3 3 3 3 3 3 3 3 3 3 3 3 3 \
        queues 1@0 1@1 1@2 1@3 \
        base-time $BASE \
        sched-entry S 01 5000000 \
        sched-entry S 02 5000000 \
        sched-entry S 04 5000000 \
        sched-entry S 08 5000000 \
        sched-entry S 00 5000000 \
        clockid CLOCK_TAI \
        flags 0x0 \
        txtime-delay 5000000
The following two parameters are added to support Enhancements for Scheduled Traffic (EST) software fallback:
flags
- A value of 0x0 enables the Enhancements for Scheduled Traffic (EST) software fallback, which schedules packet queueing using the Linux high-resolution timer interrupt and soft IRQ kernel work-queues.
txtime-delay
- Indicates the minimum time it will take for the packet to hit the wire. This is useful in determining whether the packet can be transmitted in the time remaining while its gate is open.
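The taprio commands in this section build base-time by appending nine zeros to an epoch value in seconds, which turns "now plus 5 seconds" into nanoseconds. The sketch below spells out the same computation with shell arithmetic; `base_time_ns` is a hypothetical helper name, and the epoch value is passed in as an argument so that the result can be checked by hand.

```shell
# Hypothetical helper: convert (epoch seconds + delay seconds) to nanoseconds,
# equivalent to BASE=$(expr $(date +%s) + 5)000000000.
# usage: base_time_ns EPOCH_SECONDS DELAY_SECONDS
base_time_ns() {
    echo "$(( ($1 + $2) * 1000000000 ))"
}

# Start the schedule 5 seconds from now.
BASE=$(base_time_ns "$(date +%s)" 5)
echo "$BASE"
```

For example, `base_time_ns 1694087903 5` prints 1694087908000000000, the same form as the base-time values shown in the qdisc listings of this section.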
Important
Developer tips on ethtool -K eth0 hw-tc-offload on:
[Ethernet PCI 8086:7aac and 8086:7aad] 12th Gen Intel® Core™ S-Series [Alder Lake] Ethernet GbE Time-Sensitive Network Controller: all TXQ[0-3] can achieve hardware TC TAPRIO offload.
[Ethernet PCI 8086:a0ac] 11th Gen Intel® Core™ U-Series and P-Series [Tiger Lake] Ethernet GbE Time-Sensitive Network Controller: all TXQ[0-3] can achieve hardware TC TAPRIO offload.
[Ethernet PCI 8086:4b32 and 8086:4ba0] Intel® Atom™ x6000 Series [Elkhart Lake] Ethernet GbE Time-Sensitive Network Controller: all TXQ[0-7] can achieve hardware TC TAPRIO offload.
[Ethernet PCI 8086:15f2] Intel® Ethernet Controller I225-LM for Time-Sensitive Networking (TSN): all TXQ[0-3] can achieve hardware TC TAPRIO offload.
[Ethernet PCI 8086:125b] Intel® Ethernet Controller I226-LM for Time-Sensitive Networking (TSN): all TXQ[0-3] can achieve hardware TC TAPRIO offload.
[Ethernet PCI 8086:157b, 8086:1533,…] Intel® Ethernet Controller I210-IT for Time-Sensitive Networking (TSN): TXQ[0-3] cannot achieve hardware TC TAPRIO offload, but TC TAPRIO tx-assisted on TXQ[0] and TXQ[1] leverages hardware TC ETF offload.
Packet Classifier (CONFIG_NET_CLS_FLOWER)¶
The Traffic-Class (TC) Flower classifier allows matching packets against pre-defined flow key fields:
Packet headers: for example, IPv6 source address
Tunnel metadata: for example, Tunnel Key ID
Metadata: Input port
Flower classifier actions allow packets to be modified, forwarded, dropped, and so on:
pedit: Modify packet data
mirred: Output packet
vlan: Push, pop or modify VLAN
Hardware packet filters are used to achieve the lowest ingress traffic latency on Intel® Ethernet controllers.
Enable netdev hardware filter offload capabilities:
    $ ethtool -K eth0 hw-tc-offload on
    $ ethtool -K eth0 ntuple-filters on
For example, on Intel® Ethernet Controller I210-IT for Time-Sensitive Networking (TSN), steering ingress traffic to RXQ[1] by filtering on EtherType UADP ETH (EtherType 0xb62c) at Ethernet L2 frame-level in hardware can be set using the iproute2 tc filter ... flower command:
Set skip_sw to add the rule to the hardware filters (by default skip_hw otherwise):
    $ tc filter add dev eth0 parent ffff: proto 0xb62c flower \
        src_mac cc:cc:cc:cc:cc:cc \
        hw_tc 1 skip_sw
To show the ingress filters applied by traffic control:
    $ tc filter show dev eth0 ingress
The output reveals in_hw or not_in_hw, which confirms whether the skip_sw rule is effectively applied in hardware offload.
    filter parent ffff: protocol ip pref 49152 flower chain 0 handle 0x1 eth_type ipv4 ip_proto sctp dst_port 80 skip_sw in_hw
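Checking for in_hw by eye is easy to get wrong because not_in_hw contains the same substring. The sketch below is one way to script the check; `flower_offload_state` is a hypothetical helper that reads `tc filter show` output on standard input and reports the last offload flag it sees, so it is best suited to an interface with a single rule.

```shell
# Hypothetical helper: report "in_hw", "not_in_hw" or "unknown" for the
# filter listing read from stdin (the last flag seen wins).
flower_offload_state() {
    fo_state=unknown
    while IFS= read -r fo_line || [ -n "$fo_line" ]; do
        # Pad with spaces so that only whole words match;
        # not_in_hw is tested first because it contains "in_hw".
        case " $fo_line " in
            *" not_in_hw "*) fo_state=not_in_hw ;;
            *" in_hw "*) fo_state=in_hw ;;
        esac
    done
    echo "$fo_state"
}

# Demo on a captured line; on a live system the input would come from:
#   tc filter show dev eth0 ingress | flower_offload_state
printf '%s\n' 'filter parent ffff: protocol ip pref 49152 flower chain 0 handle 0x1 skip_sw in_hw' \
    | flower_offload_state
```

A result of not_in_hw or unknown for a rule added with skip_sw suggests that the rule did not reach the hardware filters.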
As another example, on Intel® Atom™ x6000 Series [Elkhart Lake] Ethernet GbE Time-Sensitive Network Controller [Ethernet PCI 8086:4b32 and 8086:4ba0], steering ingress traffic to RXQ[2] by filtering on EtherType PTPv2 (EtherType 0x88f7) at Ethernet L2 frame-level in hardware can be set using the iproute2 tc filter ... flower command:
Set another hardware filter to steer all ingress PTPv2 messages to traffic-class 2:
    $ tc filter add dev eno1 parent ffff: protocol 0x88f7 flower \
        hw_tc 2 skip_sw
To show the ingress filters applied by traffic control:
    $ tc filter show dev eno1 ingress
The output reveals in_hw or not_in_hw, which confirms whether the skip_sw rule is effectively applied in hardware offload.
    filter parent ffff: protocol [35063] pref 49152 flower chain 0 handle 0x1 hw_tc 2 eth_type 88f7 in_hw in_hw_count 1
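The listing above shows the protocol as the decimal value [35063] while the filter was added with the hexadecimal EtherType 0x88f7. The small sketch below converts between the two so that a rule can be matched against its listing; `ethertype_hex` is a hypothetical helper name.

```shell
# Hypothetical helper: print a decimal EtherType as 0x-prefixed hex.
# usage: ethertype_hex DECIMAL
ethertype_hex() {
    printf '0x%04x\n' "$1"
}

ethertype_hex 35063      # prints 0x88f7 (PTPv2)
echo $((0xb62c))         # prints 46636 (UADP ETH, hex to decimal)
```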
As a final example, for Intel® Ethernet Controller I225-LM for Time-Sensitive Networking (TSN) [Ethernet PCI 8086:15f2] and Intel® Ethernet Controller I226-LM for Time-Sensitive Networking (TSN) [Ethernet PCI 8086:125b], where tc filter ... flower capabilities are limited for complex ingress traffic scenarios or not supported by the Intel® Ethernet controller, use ethtool -U flow-type as an alternative.
Set a flow type filter to steer onto RXQ[3] all UADP ETH EtherType 0xb62c ingress Layer 2 Ethernet frames:
    $ ethtool -U enp3s0 flow-type ether proto 0xb62c queue 3 && ethtool -u enp3s0
The output contents should be similar to the following:
    4 RX rings available
    Total 1 rules
    Filter: 63
        Flow Type: Raw Ethernet
        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Ethertype: 0xB62C mask: 0x0
        Action: Direct to queue 3
Set a flow type filter to steer onto RXQ[2] all L2/PTPv2 messages EtherType 0x88f7 ingress Layer 2 Ethernet frames:
    $ ethtool -U enp3s0 flow-type ether proto 0x88f7 queue 2 && ethtool -u enp3s0
The output contents should be similar to the following:
    Added rule with ID 62
    4 RX rings available
    Total 2 rules
    Filter: 62
        Flow Type: Raw Ethernet
        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Ethertype: 0x88F7 mask: 0x0
        Action: Direct to queue 2
    Filter: 63
        Flow Type: Raw Ethernet
        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Ethertype: 0xB62C mask: 0x0
        Action: Direct to queue 3
Set a flow type filter to steer onto RXQ[3] all ingress Layer 2 Ethernet frames that are VLAN tagged with PCP=3 (see IEEE_802.1Q header format):
    $ ethtool -U enp3s0 flow-type ether proto 0x8100 vlan 0x6000 m 0x1FFF queue 3 && ethtool -u enp3s0
The output contents should be similar to the following:
    Added rule with ID 61
    4 RX rings available
    Total 3 rules
    Filter: 61
        Flow Type: Raw Ethernet
        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Ethertype: 0x8100 mask: 0x0
        Action: Direct to queue 3
    Filter: 62
        Flow Type: Raw Ethernet
        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Ethertype: 0x88F7 mask: 0x0
        Action: Direct to queue 2
    Filter: 63
        Flow Type: Raw Ethernet
        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Ethertype: 0xB62C mask: 0x0
        Action: Direct to queue 3
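The vlan 0x6000 m 0x1FFF arguments in the PCP=3 rule follow from the layout of the 16-bit 802.1Q tag control field: the PCP occupies the top 3 bits and the remaining 13 bits (DEI and VLAN ID) are masked out. In these listings a mask bit of 1 marks a bit that is ignored, as the all-FF masks on the unmatched MAC addresses show. The sketch below derives the value and mask for any PCP; `vlan_pcp_tci` is a hypothetical helper name.

```shell
# Hypothetical helper: print the "VALUE MASK" pair for matching a VLAN PCP
# with ethtool's `vlan VALUE m MASK` syntax.
# usage: vlan_pcp_tci PCP
vlan_pcp_tci() {
    if [ "$1" -lt 0 ] || [ "$1" -gt 7 ]; then
        echo "PCP must be in the range 0..7" >&2
        return 1
    fi
    # PCP sits in bits 15..13; bits 12..0 (DEI + VID) are ignored.
    printf '0x%04X 0x%04X\n' $(( $1 << 13 )) $(( 0x1FFF ))
}

vlan_pcp_tci 3    # prints 0x6000 0x1FFF
```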
Steer to RXQ[0] all ingress packets from a specified MAC source address, for example cc:cc:cc:cc:cc:cc:
    $ ethtool -U enp3s0 flow-type ether src cc:cc:cc:cc:cc:cc queue 0 && ethtool -u enp3s0
The output contents should be similar to the following:
    Added rule with ID 60
    4 RX rings available
    Total 4 rules
    Filter: 60
        Flow Type: Raw Ethernet
        Src MAC addr: CC:CC:CC:CC:CC:CC mask: 00:00:00:00:00:00
        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Ethertype: 0x0 mask: 0xFFFF
        Action: Direct to queue 0
    Filter: 61
        Flow Type: Raw Ethernet
        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Ethertype: 0x8100 mask: 0x0
        Action: Direct to queue 3
    Filter: 62
        Flow Type: Raw Ethernet
        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Ethertype: 0x88F7 mask: 0x0
        Action: Direct to queue 2
    Filter: 63
        Flow Type: Raw Ethernet
        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Ethertype: 0xB62C mask: 0x0
        Action: Direct to queue 3
Important
Developer tips about ethtool -K eth0 hw-tc-offload on and ethtool -K eth0 ntuple-filters on:
[Ethernet PCI 8086:7aac and 8086:7aad] 12th Gen Intel® Core™ S-Series [Alder Lake] Ethernet GbE Time-Sensitive Network Controller: all TXQ[0-3] can achieve hardware TC TAPRIO offload. It is recommended to use tc filter ... flower to steer traffic in_hw for the ingress 802.1Q traffic steering use case.
[Ethernet PCI 8086:a0ac] 11th Gen Intel® Core™ U-Series and P-Series [Tiger Lake] Ethernet GbE Time-Sensitive Network Controller: all TXQ[0-3] can achieve hardware TC TAPRIO offload. It is recommended to use tc filter ... flower to steer traffic in_hw for the ingress 802.1Q traffic steering use case.
[Ethernet PCI 8086:4b32 and 8086:4ba0] Intel® Atom™ x6000 Series [Elkhart Lake] Ethernet GbE Time-Sensitive Network Controller: ethtool -U flow-type is NOT supported. It is recommended to use tc filter ... flower to steer traffic in_hw for the ingress 802.1Q traffic steering use case.
[Ethernet PCI 8086:15f2] Intel® Ethernet Controller I225-LM for Time-Sensitive Networking (TSN): It is recommended to use ethtool -U flow-type ntuples filters for the ingress 802.1Q traffic steering use case.
[Ethernet PCI 8086:125b] Intel® Ethernet Controller I226-LM for Time-Sensitive Networking (TSN): It is recommended to use ethtool -U flow-type ntuples filters for the ingress 802.1Q traffic steering use case.
[Ethernet PCI 8086:157b, 8086:1533,…] Intel® Ethernet Controller I210-IT for Time-Sensitive Networking (TSN): tc filter ... flower steering of traffic in_hw is limited to 1. It is recommended to use ethtool -U flow-type ntuples filters for the ingress 802.1Q traffic steering use case.
For more information on command arguments, refer to the tc-flower and ethtool man pages.