Internet-Draft | IOAM Path Protection | March 2022 |
Li | Expires 21 September 2022 | [Page] |
In-situ operation and maintenance management (IOAM, In-situ OAM), as a network performance monitoring technology, is based on the principle of path-associated detection to perform specific field marking/coloring and identification on actual service flows, and perform packet loss and delay measurement. It can quickly perceive network performance-related faults, and accurately delimit boundaries and do troubleshooting. However, the current IOAM solution has shortcomings too. For example, after the service traffic path switching, the IOAM cannot continue working. This paper proposes a scheme to achieve automatic performance monitoring through service path switching and linkage with IOAM, which enhances the feasibility of the IOAM scheme in large-scale deployment and the completeness of IOAM technology.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 21 September 2022.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
In-situ operation and maintenance management (In-situ OAM, IOAM) is a flow monitoring technology with high accuracy. It does not need to use out-of-band monitoring messages, and measures network KPIs such as packet loss and delay directly. But there are also shortcomings: in the current solution, performance monitoring can only be performed based on traffic quintuple information (pre-configuration or learning from traffic flow). If the path of this flow changes, it cannot working in most cases. However, In real network , the service flow path is not stable. There are many reasons for the change of the flow path, such as the interruption of the working fiber link in the network and the error code exceeding the threshold, or switching traffic to the backup link temporarily because of the equipments' upgrade. Regardless of the cause of the service traffic path switching, it is of great significance to monitor the performance on the new path after the switching automatically. Service path switching is a key event in the network. If the switched service path is not monitored in real time, it is impossible to ensure that the switched path can meet the requirements of the upper-layer service; on the contrary, if the IOAM performance monitoring of the switched path can be used to detect the deterioration of the network KPI after the switch in time , the operator may optimize and adjust the service path as soon as possible. Except for the manual and planned switching, it is difficult to predict the time for other switching caused by network failures, which will also cause the network operator to be unable to redeploy and start IOAM performance monitoring in time after the switching.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].¶
IOAM:In-situ Operation And Maintenance Management¶
KPI:Key Performance Indicator¶
HSB:Hot Standby¶
APS:Automatic Protect Switch¶
Ti-LFA:Topology Independent Loop Free Alternate¶
SD:Signal Degrade¶
UNI:User Network Interface¶
NNI:Network Network Interface¶
IOAM Data collection and analysis process is shown in the figure1:¶
IOAM Data collection and analysis process¶
If it is an automatic switching triggered by a network failure, it can be divided into signal failure (SF, often caused by line fiber break, equipment power failure), signal degradation (SD, line error or packet loss over the threshold of performance availability, due to aging of fiber. Switching occurs when the error rate or the accumulated packet loss rate reaches the detection threshold). Fiber breakage, power failure of P node, and SD error codes will trigger HSB or APS switching (for SR-TE tunnel) or Ti-LFA protection (for SR-BE tunnel), the tail node power failure will trigger VPN FRR(Fast reroute) protection.¶
If the switching is triggered because of network expansion, upgrade, etc., the switching mechanism is basically the same as the network failure trigger, and the impact on IOAM is also the same, so it will not be analyzed separately.¶
As shown in Figure2 below, network equipments A, C, and G are PE devices; B ,E and F are P devices, and D is CE devices. Under normal condition, services are forwarded through the working tunnel path, which is A-B-C-D ; The protection tunnel path is A-E-F-G-C-D (one by one protection path of the tunnel), and the tail node protection path is A-E-F-G-D.¶
5G Bearer Network with Backup Path¶
From the above, it can be seen that the change in the flow direction caused by the switching of the active and standby service paths will directly affect the data collection and reporting of the IOAM monitoring instance. Based on the existing solution, one way to continue monitoring after the switch is to deploy IOAM monitoring on all nodes of the active and standby paths. When there is no traffic on the standby path, the nodes along the way do not report monitoring data; whenever traffic reaches, the monitoring data will be reported again. There are two issues with this solution: the first one is that after the traffic is switched, because there is no linkage, the upper-layer statistical analysis module in the controller does not perceive the change of the service path, and does not know what the real service path is, so it may not be able to calculate the result of delay and packet loss normally; the second one is that it will cause waste of IOAM resources configured on the device that no traffic passes through. Therefore, if a certain linkage mechanism can be established between the IOAM and the service path to dynamically perceive this path change, and reconfigure in time, continuous IOAM monitoring will be performed automatically when the service path switches and recovers(except a short interruption during the switching process), and no additional IOAM resources are occupied.¶
When the service path changes, the IOAM management module in the network controller can be notified through the alarms or events reported by the device; in addition, after the IGP on the device detects the network topology change, it will also notify network controller to perform the topology refresh through the BGP-LS protocol.¶
Information and sources to be configured, as shown below:¶
Process of IOAM and service path linkage scheme¶
TBD¶
This memo includes no request to IANA.¶