1 Overview
With the rapid development of industrial Ethernet technology and relevant standards, Ethernet has gradually become the mainstream transmission technology for industrial networks in lieu of traditional field bus technology, providing flexible, high-speed and standardized transmission services. Given more stringent requirements of reliability in industrial application scenarios, it is crucial to improve the reliability and protection capability of Ethernet transmission in order to promote the development and widespread adoption of industrial Ethernet technology. This White Paper relates to the FRRP (Fast Ring Recovery Protocol) technology, a link layer ring network technology that aims to greatly improve Ethernet reliability and robustness. Capable of preventing broadcast storms on the Ethernet ring, the FRRP technology enables rapid convergence in the event of a link failure to ensure continuous network availability.
2 FRRP Protocol Introduction
The FRRP protocol is a link layer protocol designed for Ethernet rings. When a link or device on the ring fails, it quickly switches traffic to the backup link so that services recover rapidly. FRRP classifies traffic in the ring network by VLAN and carries control information in a dedicated control channel (the protocol VLAN), which keeps control information isolated from service data. The protocol supports single-ring, double-ring, and multi-ring topologies; multiple domains on a single ring for ring network load sharing; flexible scaling of ring bandwidth through link aggregation; and real-time ring health checks that detect faults quickly and trigger fast switchover away from the faulty link. The ring network convergence time is ≤ 20 ms and is independent of the number of nodes on the ring.
2.1 FRRP Basic Concepts
■ FRRP Domain
An FRRP domain is a group of devices that share the same protocol VLAN and service VLAN. A device can be configured with several FRRP domains, but the same VLAN cannot be assigned to more than one domain. Domains in the same network are distinguished by their Domain IDs. Each FRRP domain forwards the traffic of its own VLANs, so different VLANs can take different forwarding paths around the Ethernet ring, which achieves load sharing. A maximum of 16 FRRP domains can be configured for a network.
■ FRRP Ring
The devices in an FRRP domain are connected by links into a ring topology, forming an FRRP ring. All nodes on an FRRP logical ring must belong to the same domain. Rings are distinguished by ring IDs, and up to 16 FRRP logical rings can be configured for each FRRP domain. A domain that contains multiple FRRP rings can have only one primary ring, which is chosen by the user. Once the primary ring is selected, the remaining FRRP rings are sub-rings.
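The domain and ring rules above can be checked mechanically. The following is a minimal sketch in Python, not part of any FRRP product; the class names, field names and the `validate` helper are hypothetical and only encode the limits stated in the text (16 domains per network, 16 rings per domain, no VLAN shared between domains, at most one primary ring per domain).

```python
from dataclasses import dataclass, field

MAX_DOMAINS = 16           # stated limit: FRRP domains per network
MAX_RINGS_PER_DOMAIN = 16  # stated limit: FRRP rings per domain


@dataclass
class Ring:
    ring_id: int
    is_primary: bool = False


@dataclass
class Domain:
    domain_id: int
    vlans: set                      # protocol and service VLAN IDs (hypothetical values)
    rings: list = field(default_factory=list)


def validate(domains):
    """Return a list of rule violations; an empty list means the config passes."""
    errors = []
    if len(domains) > MAX_DOMAINS:
        errors.append(f"more than {MAX_DOMAINS} domains configured")
    seen_domain_ids = set()
    vlan_owner = {}  # VLAN ID -> domain ID that first claimed it
    for d in domains:
        if d.domain_id in seen_domain_ids:
            errors.append(f"duplicate domain ID {d.domain_id}")
        seen_domain_ids.add(d.domain_id)
        for vlan in sorted(d.vlans):
            if vlan in vlan_owner:
                errors.append(
                    f"VLAN {vlan} is in domains {vlan_owner[vlan]} and {d.domain_id}"
                )
            else:
                vlan_owner[vlan] = d.domain_id
        if len(d.rings) > MAX_RINGS_PER_DOMAIN:
            errors.append(f"domain {d.domain_id} has more than {MAX_RINGS_PER_DOMAIN} rings")
        ring_ids = [r.ring_id for r in d.rings]
        if len(ring_ids) != len(set(ring_ids)):
            errors.append(f"domain {d.domain_id} has duplicate ring IDs")
        primaries = sum(1 for r in d.rings if r.is_primary)
        if primaries > 1:
            errors.append(f"domain {d.domain_id} has {primaries} primary rings")
    return errors
```

For example, two domains that both claim VLAN 100 would be reported as a conflict, while two domains with disjoint VLAN sets and one primary ring each would pass.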
The FRRP protocol supports single-ring, double-ring, multi-ring, tangent-ring, intersecting-ring and other ring topologies, and prevents the loops that a multi-ring network could otherwise form. In an intersecting-ring topology, one of the two intersecting rings must be selected as the primary ring. The protocol packets of the sub-ring are transmitted through the primary ring, while the protocol packets of the primary ring are transmitted only within the primary ring. The two rings use different, independent protocol VLANs.
Fig. 1 FRRP Networking Diagram
As shown in Fig. 1, the FRRP Domain 1 contains two intersecting Ethernet rings, i.e., Ring 1 and Ring 2. Set Ring 1 as the primary ring, and Ring 2 as the sub-ring. Then Ring 1 and Ring 2 will calculate a loop-free topology respectively, eliminating loops in the network and ensuring full connectivity of each node.
In the FRRP protocol, the primary ring as a whole is treated as a single forwarding node of the sub-ring. Sub-ring packets, except the COMMON-HEALTH packet, are treated as data packets and transmitted through the primary ring. As a result, neither data packets nor sub-ring protocol packets (other than COMMON-HEALTH) can pass through a port on the primary ring that is blocked.
■ FRRP Protocol VLAN
The protocol VLAN is the VLAN that carries FRRP protocol packets. To support intersecting-ring networking, each FRRP domain has a primary protocol VLAN and a sub-protocol VLAN, which carry the protocol packets of the primary ring and the sub-ring respectively. Primary ring protocol packets and the sub-ring COMMON-HEALTH packet are transmitted in the primary protocol VLAN, while all other sub-ring protocol packets are transmitted in the sub-protocol VLAN.
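The VLAN selection rule in this paragraph fits in a few lines. This is an illustrative sketch only; the function name and the VLAN IDs used with it are hypothetical, and the packet names are those used in this White Paper.

```python
def protocol_vlan(ring_is_primary, packet_type, primary_vlan, sub_vlan):
    """Pick the protocol VLAN that carries a given FRRP protocol packet.

    Primary ring packets and the sub-ring COMMON-HEALTH packet use the
    primary protocol VLAN; every other sub-ring packet uses the
    sub-protocol VLAN.
    """
    if ring_is_primary or packet_type == "COMMON-HEALTH":
        return primary_vlan
    return sub_vlan
```

With hypothetical protocol VLANs 4000 (primary) and 4001 (sub), a sub-ring HEALTH packet would be carried in VLAN 4001, while a sub-ring COMMON-HEALTH packet would be carried in VLAN 4000.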
■ FRRP Service VLAN
The service VLAN is used to transmit data packets. Depending on the configuration of its FRRP domain, service VLAN traffic can be forwarded through both FRRP ports and non-FRRP ports. Each device on an FRRP ring has two or more interfaces connected to the ring, which are collectively referred to as FRRP ports. Different FRRP domains in a ring network can be assigned different service VLANs. Each FRRP domain is responsible for calculating the forwarding ports in its own ring.
■ FRRP Port Roles
There are three types of FRRP ports: primary port, secondary port, and edge port. In the primary ring, a forwarding node makes no distinction between its primary port and its secondary port. A control node sends protocol packets from its primary port and receives them at its secondary port, which is then blocked.
1) Primary port and secondary port
The control node and each forwarding node have two ports connected to the FRRP ring. One is the primary port and the other is the secondary port; users may assign these roles at their discretion.
For a control node, the primary port and the secondary port are functionally different. The control node sends the HEALTH packet from its primary port. If the same packet arrives at the secondary port, the FRRP ring is a closed loop, so the secondary port is blocked to prevent data loops. If the packet does not arrive at the secondary port within a specified time, there is a fault on the ring network, so the secondary port is released to keep all nodes on the ring in communication.
For a forwarding node, there is no functional difference between the primary port and the secondary port.
2) Public port and edge port
The ports that connect the primary public node (or the secondary public node) to the sub-ring are edge ports, and the two ports that connect it to the primary ring are public ports. The link between the public port on the primary public node and the public port on the secondary public node is the public link. Users may assign the public and edge port roles at their discretion. Because the FRRP protocol treats the whole primary ring as a single forwarding node on the sub-ring, the public link is considered an internal link of the primary ring, and changes in its link state are reported to the control node of the primary ring for processing.
■ Role of FRRP Nodes
1) Control node
A device on the FRRP ring is called a node. There should be one and only one control node on each FRRP ring. As shown in Fig. 1, SW1 is the control node of the primary ring, and SW4 is the control node of the sub-ring. Control nodes will send detection packets to check the link state. There are two states available for control nodes:
SUCCESSFUL State
When all links on the ring network are in the UP state, the control node receives at its secondary port the HEALTH packet that it sent, which means the links are trouble-free. In this case, the control node blocks the secondary port to prevent data packets from forming a broadcast loop on the ring topology.
FAULT State
When there is a faulty link on the ring network, the control node is in FAULT state. In this case, the control node will stop blocking data packets to ensure uninterrupted communication on the ring network.
2) Forwarding node
All nodes on the FRRP ring other than the control node are forwarding nodes. A forwarding node transmits protocol packets and monitors the state of its neighboring links. If it detects a faulty link, it sends a protocol packet into the ring network to report the link DOWN event to the control node. There are three states available for forwarding nodes:
Link-Up State
When the primary port and the secondary port of the forwarding node are in the UP state, the forwarding node is in the Link-Up state.
Link-Down State
When the primary port or the secondary port of the forwarding node is in the Down state, the forwarding node is in the Link-Down state.
PRE-RECOVER State
When the primary or secondary port of a forwarding node is blocked, the forwarding node is in the PRE-RECOVER state.
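The three forwarding node states can be derived from the two port states. The sketch below is one possible reading, not the protocol's definitive state machine: it assumes that a single port in the Down state is enough for Link-Down, and that Down takes precedence over blocked. Both choices are assumptions, and the `PortState` enum and function name are hypothetical.

```python
from enum import Enum


class PortState(Enum):
    UP = "up"
    DOWN = "down"
    BLOCKED = "blocked"


def forwarding_node_state(primary, secondary):
    """Derive the forwarding node state from its two FRRP port states."""
    ports = (primary, secondary)
    if PortState.DOWN in ports:
        return "Link-Down"      # assumption: one Down port is sufficient
    if PortState.BLOCKED in ports:
        return "PRE-RECOVER"    # a port is blocked while waiting for release
    return "Link-Up"            # both ports are UP
```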
3) Primary public node and secondary public node
There are two intersections between the primary ring and the sub-ring. One of them is called the primary public node, and the other is called the secondary public node. The primary and secondary public nodes should be configured in pairs. The primary or secondary public node is the role of the device on the sub-ring, and its role on the primary ring is control node or forwarding node. The primary and secondary public nodes are special forwarding nodes, and there are three states available:
Link-Up State
Edge port is a port that is neither directly connected to any switch, nor indirectly connected to any switch through its connected network.
Therefore, when the edge port is in UP state, it means the primary public node (secondary public node) is in the Link-Up state.
Link-Down State
When the edge port is in Down state, it means the primary public node (secondary public node) is in the Link-Down state.
PRE-RECOVER State
When the edge port is blocked, it means the primary public node (secondary public node) is in the PRE-RECOVER state.
The main types of FRRP packets are as follows:
Packet Type | Description
HEALTH | Health-check packet, initiated by the control node to check the integrity of the ring network.
LINK-OK | Link UP packet, initiated by a forwarding node, primary public node or secondary public node whose directly connected link has come UP, to notify the control node that a link on the ring has recovered.
LINK-FAULT | Link DOWN packet, initiated by a forwarding node, primary public node or secondary public node whose directly connected link has gone DOWN, to notify the control node that a link on the ring is DOWN and the physical loop no longer exists.
LINK-DOWN-FLUSH-FDB | FDB (forwarding database) refresh packet, initiated by the control node to notify forwarding nodes, the primary public node and the secondary public node to update their MAC address forwarding tables.
SUCCESSFUL-FLUSH-FDB | FDB (forwarding database) refresh packet sent after the ring network recovers, initiated by the control node to notify forwarding nodes, the primary public node and the secondary public node to update their MAC address forwarding tables and, at the same time, to release their PRE-RECOVER ports.
COMMON-HEALTH | Primary ring integrity check packet, initiated by the primary public node of the sub-ring and received by the secondary public node of the sub-ring, which uses it to check the integrity of the primary ring in its domain.
MOTHER-FAULT | Packet used to report a primary ring failure. When the secondary public node of the sub-ring fails to receive the COMMON-HEALTH packet sent by the primary public node within the specified time, it reports the failure of the primary ring in its domain.
Table 1 Packet Type List of FRRP Protocol
2.2 FRRP Protocol Features
Health monitoring
The control node on the ring periodically sends a HEALTH packet from its primary port, at the interval set by the Health timer on the control node. The packet passes through every forwarding node on the ring to probe the state of the ring network. If the control node receives the HEALTH packet at its secondary port within the specified time, all links are working properly and the ring is complete. To prevent a broadcast loop from forming, the control node blocks its secondary port, as shown in Fig. 2 below.
Fig. 2 Schematic diagram of ring network integrity check
If no HEALTH packet is received within the specified time, a link on the ring is considered to be faulty. The control node switches to the FAULT state, releases the secondary port, and sends a LINK-DOWN-FLUSH-FDB packet from both the primary and secondary ports to inform all forwarding nodes on the ring to refresh their MAC and ARP entries.
In this way, all abnormal conditions on the ring network can be detected, ensuring the connectivity and quick processing after the ring network is found faulty.
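The timer-driven part of health monitoring can be sketched as follows. This is a simplified model under stated assumptions, not the actual implementation: time is passed in explicitly instead of being read from a clock, the timeout value is hypothetical (the text does not give one), and the node is assumed to start with its secondary port blocked so that the sketch never opens a loop at start-up.

```python
class ControlNode:
    """Simplified control node that reacts to HEALTH packets and a fault timeout."""

    def __init__(self, fault_timeout):
        self.fault_timeout = fault_timeout  # hypothetical value, in seconds
        self.state = "SUCCESSFUL"           # assumption: start as if the ring is complete
        self.secondary_blocked = True
        self.last_health_rx = 0.0
        self.sent = []                      # log of (time, packet, egress ports)

    def on_health_received(self, now):
        """The HEALTH packet came back on the secondary port: the ring is complete."""
        self.last_health_rx = now
        if self.state != "SUCCESSFUL":
            self.state = "SUCCESSFUL"
            self.secondary_blocked = True

    def tick(self, now):
        """Periodic check: declare a fault if no HEALTH packet arrived in time."""
        timed_out = now - self.last_health_rx > self.fault_timeout
        if self.state == "SUCCESSFUL" and timed_out:
            self.state = "FAULT"
            self.secondary_blocked = False
            self.sent.append((now, "LINK-DOWN-FLUSH-FDB", ("primary", "secondary")))
```

With a hypothetical 3-second timeout, a node that last saw its HEALTH packet at t = 1.0 stays in the SUCCESSFUL state at t = 2.0, switches to FAULT and releases the secondary port at t = 4.5, and blocks the port again as soon as the next HEALTH packet returns.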
High-speed port event processing
Port status on the link is monitored by all nodes. In case of any change to the port status, the node quickly disconnects and reports the event to the control plane, which will then handle the event with top priority. The entire process, as shown below, lasts for no more than 1ms to ensure timely processing of events and speed up the convergence of the ring network.
Port Down event:
When the primary port of the control node goes DOWN, the control node immediately releases the secondary port and sends a LINK-DOWN-FLUSH-FDB packet from the secondary port to inform all forwarding nodes on the ring to refresh their MAC and ARP entries.
When an FRRP port on a forwarding node goes DOWN, the node sends a LINK-FAULT packet from its other FRRP ports that are in the UP state (see Fig. 3 for the LINK-FAULT reporting process). When the control node receives the LINK-FAULT packet, it releases the secondary port and switches to the FAULT state. Because the network topology has changed, the control node refreshes its own MAC and ARP entries to prevent packets from being forwarded in the wrong direction, and sends the LINK-DOWN-FLUSH-FDB packet from both the primary and the secondary ports to inform all forwarding nodes to refresh theirs.
Fig. 3 Schematic diagram of reporting a link disconnection on forwarding nodes
Port Up event:
When a port on a forwarding node recovers, the node immediately sends a LINK-OK packet from its other FRRP ports that are in the UP state to inform the control node (as shown in Fig. 4). When the control node receives the packet, it blocks the secondary port and switches back to the SUCCESSFUL state. Because the FRRP ring topology has changed, the control node also refreshes its MAC and ARP entries.
Fig. 4 Schematic diagram of the link recovery process of the forwarding node
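The Port Down and Port Up handling described above can be summarized as a small rule table. The sketch below is illustrative only: the event name `PRIMARY_PORT_DOWN`, the action strings and both function names are hypothetical, and the move to the FAULT state when the control node's own primary port goes DOWN is an assumption, since the text only states the port and packet actions for that case.

```python
# Control node reactions, keyed by event. Packet names follow Table 1.
CONTROL_NODE_RULES = {
    "PRIMARY_PORT_DOWN": ("FAULT", [           # FAULT here is an assumption
        "release secondary port",
        "send LINK-DOWN-FLUSH-FDB from secondary port",
    ]),
    "LINK-FAULT": ("FAULT", [
        "release secondary port",
        "refresh MAC and ARP entries",
        "send LINK-DOWN-FLUSH-FDB from primary and secondary ports",
    ]),
    "LINK-OK": ("SUCCESSFUL", [
        "block secondary port",
        "refresh MAC and ARP entries",
    ]),
}


def control_node_on_event(state, event):
    """Return (new_state, actions) for one event at the control node."""
    new_state, actions = CONTROL_NODE_RULES.get(event, (state, []))
    return new_state, list(actions)


def forwarding_node_on_port_change(changed_port, new_status, port_status):
    """Return the (packet, egress port) reports a forwarding node sends.

    port_status maps each FRRP port name to "UP" or "DOWN" after the change.
    The report leaves through every other FRRP port that is still UP.
    """
    packet = "LINK-FAULT" if new_status == "DOWN" else "LINK-OK"
    return [
        (packet, port)
        for port, status in port_status.items()
        if port != changed_port and status == "UP"
    ]
```

Keeping the rules in a table rather than in nested conditionals makes it easy to compare each entry against the corresponding sentence of the text.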
Fast delivery of protocol packets
In traditional solutions, protocol packets must be processed on the control plane before they are forwarded. To accelerate convergence, FRRP forwards protocol packets at high speed on the data plane and delivers a copy to the control plane in parallel. If a node needs to process a certain type of protocol packet, it configures a policy that copies packets of that type to the control plane. Meanwhile, the original packet continues to other nodes through the high-speed forwarding mechanism on the data plane. Both the LINK-DOWN-FLUSH-FDB packet and the SUCCESSFUL-FLUSH-FDB packet of the FRRP protocol are handled in this manner to ensure high-speed convergence.
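The forward-and-copy behavior can be modeled per node as below. This is a conceptual sketch, not a description of the switching hardware: the function name, port names and action tuples are hypothetical, and the copy policy is modeled as a plain set of packet type names.

```python
def handle_frame(packet_type, copy_policy, ingress_port, ring_ports):
    """Model how one node treats an FRRP protocol frame.

    The frame is forwarded on the data plane without waiting for the
    control plane. A copy goes to the control plane only if the node has
    a copy policy for that packet type.
    """
    actions = []
    for port in ring_ports:
        if port != ingress_port:
            actions.append(("data_plane_forward", port))
    if packet_type in copy_policy:
        actions.append(("copy_to_control_plane", packet_type))
    return actions
```

In this model, forwarding never depends on the control plane having finished with the copy, which is the property the mechanism relies on for fast convergence.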
The aforesaid mechanism enables a ring network convergence time ≤ 20ms, which is independent of the number of nodes on the ring network. Besides, there is no limit on the maximum number of nodes.
2.3 FRRP Protocol Working Mechanism
In actual networking, a sub-ring is often added as a backup link for the primary ring. In this case, if the primary link is faulty, the sub-ring can detect primary ring abnormality within a short time. The blocked port is released to transmit service packets through the sub-ring and make sure the network environment works properly. The FRRP protocol working mechanism is illustrated in the networking shown in Fig. 5 below:
Regular sub-ring detection to ensure the ring network works properly
In the topology, the primary ring intersects with the sub-ring to form an intersecting ring network. However, the sub-ring acts as an independent FRRP ring logically, so the self-check mechanism works the same way as that of a single ring network.
In the networking shown in Fig. 5, Ring 1 is the primary ring, composed of SW1-SW2-SW3. The two sub-rings, Ring 2 and Ring 3, each form a ring by connecting through the primary public node and the secondary public node. The protocol packets of each sub-ring have two paths through the primary ring, namely SW2-SW3 and SW2-SW1-SW3. When the primary ring is complete, the secondary port of the Ring 2 control node is blocked, and the SW2-SW3 path is active. If the primary ring has a fault on the SW2-SW1-SW3 path, the SW2-SW3 path is unblocked and active. If the fault is on the SW2-SW3 path, the SW2-SW1-SW3 path is unblocked and active. In other words, only one of the two paths of Ring 2 is active at any time, which prevents sub-ring protocol packets from forming a loop in the primary ring. If both paths of Ring 2 are interrupted, the control node of Ring 2 cannot receive the HEALTH packet it sends out. The Fault timer then expires, and the control node releases its secondary port to give Ring 2 as many communication paths as possible without forming a loop.
Fig. 5 Double sub-ring networking
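The path selection described for Fig. 5 can be reproduced with a small graph search. The sketch below is illustrative and rests on assumptions: the primary ring is modeled as the three links SW1-SW2, SW2-SW3 and SW3-SW1, and the port blocked while the ring is complete is assumed to sit on the SW3-SW1 link, which is one placement consistent with the SW2-SW3 path being active in that case. The function names are hypothetical.

```python
from collections import deque

RING_LINKS = [("SW1", "SW2"), ("SW2", "SW3"), ("SW3", "SW1")]
BLOCKED_WHEN_COMPLETE = ("SW3", "SW1")  # assumption: location of the blocked port


def usable_links(failed):
    """Links that can carry sub-ring packets, given the failed links."""
    failed = {frozenset(link) for link in failed}
    links = [link for link in RING_LINKS if frozenset(link) not in failed]
    if not failed:
        # Ring complete: the blocked port keeps one link out of use.
        blocked = frozenset(BLOCKED_WHEN_COMPLETE)
        links = [link for link in links if frozenset(link) != blocked]
    return links


def sub_ring_path(failed=()):
    """Breadth-first search for the active path from SW2 to SW3, or None."""
    adjacency = {}
    for a, b in usable_links(failed):
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    queue = deque([["SW2"]])
    seen = {"SW2"}
    while queue:
        path = queue.popleft()
        if path[-1] == "SW3":
            return path
        for neighbor in adjacency.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

In every case the search finds at most one usable path between SW2 and SW3, which matches the statement that only one of the two paths is active at any time.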
Sub-ring assists the detection of the primary ring to guarantee the ring network is working properly
(1) Check the status of the primary ring:
The primary public node of the sub-ring periodically sends a COMMON-HEALTH packet into the primary ring through its two ports connected to the primary ring. The packet passes through the nodes on the ring to the secondary public node, as shown in Fig. 6. If the secondary public node receives the COMMON-HEALTH packet within the specified time, at least one sub-ring path through the primary ring is working properly and sub-ring packets can pass through. If it does not, both sub-ring paths through the primary ring are blocked and sub-ring packets cannot pass through.
Fig. 6 Schematic diagram of COMMON-HEALTH sent from the primary public node to the secondary public node
(2) Abnormal primary link:
When the secondary public node detects that all sub-ring paths through the primary ring are disconnected, it sends a MOTHER-FAULT packet to the primary public node over the sub-ring. As shown in Fig. 7, if there is no fault in the sub-ring, the primary public node receives the MOTHER-FAULT packet and immediately blocks its own edge port. If there is a fault in the sub-ring, the edge port of the primary public node is not blocked. The MOTHER-FAULT packet is sent periodically. As long as the primary public node keeps receiving it, its edge port stays blocked. If no packet is received within the specified time, the edge port is released automatically.
Fig. 7 Sub-ring failure due to a link abnormality on the primary ring
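The interaction between the two public nodes in steps (1) and (2) can be sketched with two small classes. This is a simplified model under assumptions: time is supplied by the caller, the timeout and hold-time values are hypothetical because the text only says "the specified time", and the class and method names are invented for illustration.

```python
class SecondaryPublicNode:
    """Watches for COMMON-HEALTH and decides when to report MOTHER-FAULT."""

    def __init__(self, timeout):
        self.timeout = timeout            # hypothetical value, in seconds
        self.last_common_health = 0.0

    def on_common_health(self, now):
        self.last_common_health = now

    def should_send_mother_fault(self, now):
        # No COMMON-HEALTH in time: the sub-ring paths in the primary ring are down.
        return now - self.last_common_health > self.timeout


class PrimaryPublicNode:
    """Blocks its edge port while MOTHER-FAULT packets keep arriving."""

    def __init__(self, hold_time):
        self.hold_time = hold_time        # hypothetical value, in seconds
        self.edge_blocked = False
        self.last_mother_fault = None

    def on_mother_fault(self, now):
        self.last_mother_fault = now
        self.edge_blocked = True

    def tick(self, now):
        # Release automatically once MOTHER-FAULT stops arriving.
        if self.edge_blocked and now - self.last_mother_fault > self.hold_time:
            self.edge_blocked = False
```

If the sub-ring itself is broken, the MOTHER-FAULT packet never reaches the primary public node, `on_mother_fault` is never called, and the edge port stays unblocked, which matches the behavior described above.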
(3) Sub-ring failure, and the state is changed:
Since both sub-ring paths through the primary ring are disconnected, sub-ring protocol packets cannot be transmitted over the primary ring, and the control node of the sub-ring cannot receive the HEALTH packet it sends out. It therefore releases its secondary port and switches to the FAULT state.
(4) Primary ring recovers:
When the primary ring recovers, the sub-ring path through the primary ring recovers as well, and the secondary public node stops sending the MOTHER-FAULT packet. If there is no fault on the sub-ring, its control node again receives the HEALTH packet it sent, blocks its secondary port and switches to the SUCCESSFUL state, as shown in Fig. 8. After the sub-ring recovers, the control node sends a SUCCESSFUL-FLUSH-FDB packet from its primary port. Upon receipt, the primary public node releases the edge port that was blocked, and network communication is restored, as shown in Fig. 9.
If the sub-ring itself has a fault when the sub-ring path through the primary ring recovers, the sub-ring cannot recover. In this case, the control node of the sub-ring does not send a SUCCESSFUL-FLUSH-FDB packet, and a blocked edge port on the primary public node is not released until the Fault timer expires.
Fig. 8 Schematic diagram of primary ring recovery
Fig. 9 Schematic diagram of the primary public node of the sub-ring releasing edge ports
3 FRRP Typical Networking
3.1 Single ring and multiple domains to realize ring network load sharing
A network topology with only one ring may still carry multiple VLANs. In this case, configure multiple FRRP domains on the ring, each with its own service VLANs. Different FRRP domains are responsible for transmitting the traffic of different VLANs, so each VLAN sees its own topology on the same FRRP network. The VLANs protected by an FRRP domain are the same as the VLANs configured on its FRRP ports, so a packet is matched to a port by its VLAN before forwarding, and the matching FRRP domain is selected automatically. As shown in Fig. 10, the ring network made up of SW1-SW5 is configured with two domains. Traffic from SW2 is sent out through SW4. If the traffic matches the service VLAN of Domain 1, it goes clockwise; if it matches the service VLAN of Domain 2, it goes counterclockwise. In this way, load sharing is achieved.
Fig. 10 Schematic diagram of single ring and multiple domains
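The VLAN-based selection in the Fig. 10 example amounts to a lookup from service VLAN to domain, and from domain to forwarding direction. The sketch below is illustrative only; the VLAN IDs are hypothetical, and the directions are those given in the example above.

```python
# Hypothetical service VLAN assignment for the two domains of Fig. 10.
DOMAIN_SERVICE_VLANS = {1: {10, 20}, 2: {30, 40}}
DOMAIN_DIRECTION = {1: "clockwise", 2: "counterclockwise"}


def direction_for_vlan(vlan):
    """Return (domain ID, direction) for a service VLAN, or None if unprotected."""
    for domain, vlans in DOMAIN_SERVICE_VLANS.items():
        if vlan in vlans:
            return domain, DOMAIN_DIRECTION[domain]
    return None
```

Because a VLAN can belong to only one domain, the lookup is unambiguous: each service VLAN maps to exactly one direction around the ring.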
3.2 Tangent ring networking to provide highly reliable access
In network topology, two rings are connected via a node to form a tangent ring network. The two tangent rings should belong to different FRRP domains. This type of networking is commonly found in highly reliable access, for example, connecting the aggregation ring and the access ring in an enterprise ring network. The tangent ring network can act as a backup link for ring networks. A faulty link can be switched off immediately to ensure the normal forwarding of traffic between ring networks.
Fig. 11 Typical networking of tangent ring
3.3 Intersecting ring networking to provide backup of key links
The network topology contains two or more rings, and intersecting rings share two public nodes. When two intersecting rings are in the same FRRP domain, one of them must be made the primary ring. This type of networking is commonly used to back up key links. As shown in Fig. 12, traffic from SW2 is sent out through SW4. If the SW3-SW5 link fails, the sub-ring detects the abnormality within a short time and releases the secondary port to keep traffic forwarding normally.
Fig. 12 Typical networking of intersecting ring