RFC 9251 | IGMP and MLD Proxies for EVPN | May 2022 |
Sajassi, et al. | Standards Track |
This document describes how to efficiently support endpoints running the Internet Group Management Protocol (IGMP) or Multicast Listener Discovery (MLD) for multicast services over an Ethernet VPN (EVPN) network by incorporating IGMP/MLD proxy procedures on EVPN Provider Edges (PEs).¶
This is an Internet Standards Track document.¶
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.¶
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9251.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
In data center (DC) applications, a point of delivery (POD) can consist of a collection of servers supported by several top-of-rack (ToR) and spine switches. This collection of servers and switches is self-contained and may have its own control protocol for intra-POD communication and orchestration. However, EVPN is used as a standard way of inter-POD communication for both intra-DC and inter-DC. A subnet can span across multiple PODs and DCs. EVPN provides a robust multi-tenant solution with extensive multihoming capabilities to stretch a subnet (VLAN) across multiple PODs and DCs. There can be many hosts (several hundred) attached to a subnet that is stretched across several PODs and DCs.¶
These hosts express their interest in multicast groups on a given subnet/VLAN by sending IGMP/MLD Membership Reports for the multicast group(s) of interest. Furthermore, an IGMP/MLD router periodically sends membership queries to find out if there are hosts on that subnet that are still interested in receiving multicast traffic for that group. The IGMP/MLD Proxy solution described in this document accomplishes three objectives:¶
The first two objectives are achieved by using the IGMP/MLD proxy on the PE. The third objective is achieved by setting up a multicast tunnel among only the PEs that have interest in the multicast group(s) based on the trigger from IGMP/MLD proxy processes. The proposed solutions for each of these objectives are discussed in the following sections.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document also assumes familiarity with the terminology of [RFC7432], [RFC3376], and [RFC2236]. Although this document mostly uses the term "IGMP Membership Report", the text applies equally to MLD Membership Reports. Similarly, text for IGMPv2 applies to MLDv1, and text for IGMPv3 applies to MLDv2. The IGMP/MLD version encoding in the BGP update is described in Section 9.¶
It is important to note that when there is text considering whether a PE indicates support for IGMP proxying, the corresponding behavior has a natural analog for indicating support for MLD proxying, and the analogous requirements apply as well.¶
The IGMP Proxy mechanism is used to reduce the flooding of IGMP messages over an EVPN network, similar to the ARP proxy used in reducing the flooding of ARP messages over EVPN. It also provides a triggering mechanism for the PEs to set up their underlay multicast tunnels. The IGMP Proxy mechanism consists of two components:¶
The goal of IGMP and MLD proxying is to make the EVPN behave seamlessly for the tenant systems with respect to multicast operations while using a more efficient delivery system for signaling and delivery across the VPN. Accordingly, group state must be tracked synchronously among the PEs serving the VPN, with join and leave events propagated to the peer PEs and each PE tracking the state of each of its peer PEs with respect to whether there are locally attached group members (and in some cases, senders), what version(s) of IGMP/MLD are in use for those locally attached group members, etc. In order to perform this translation, each PE acts as an IGMP router for the locally attached domain, maintains the requisite state on locally attached nodes, sends periodic membership queries, etc. The role of EVPN Selective Multicast Ethernet Tag (SMET) route propagation is to ensure that each PE's local state is propagated to the other PEs so that they share a consistent view of the overall IGMP Membership Request and Leave Group state. It is important to note that the need to keep such local state can be triggered by either local IGMP traffic or BGP EVPN signaling. In most cases, a local IGMP event will need to be signaled over EVPN, though state initiated by received EVPN traffic will not always need to be relayed to the locally attached domain.¶
When IGMP is used between hosts and their first hop EVPN router (EVPN PE), Proxy-reporting is used by the EVPN PE to summarize (when possible) reports received from downstream hosts and propagate them in BGP to other PEs that are interested in the information. This is done by terminating the IGMP Reports in the first hop PE and translating and exchanging the relevant information among EVPN BGP speakers. The information is again translated back to an IGMP message at the recipient EVPN speaker. Thus, it helps create an IGMP overlay subnet using BGP. In order to facilitate such an overlay, this document also defines a new EVPN route type Network Layer Reachability Information (NLRI), the EVPN SMET route, along with its procedures to help exchange and register IGMP multicast groups; see Section 9.¶
When a PE wants to advertise an IGMP Membership Report using the BGP EVPN route, it follows the rules below (BGP encoding is stated in Section 9). The first four rules are applicable to the originator PE, and the last three rules are applicable to remote PE processing SMET routes:¶
Processing at the BGP route originator:¶
Processing at the BGP route receiver:¶
When a PE wants to withdraw an EVPN SMET route corresponding to an IGMPv2 Leave Group or IGMPv3 "Leave" equivalent message, it follows the rules below. The first rule defines the procedure at the originator PE, and the last two rules talk about procedures at the remote PE:¶
Processing at the BGP route originator:¶
Processing at the BGP route receiver:¶
As mentioned in the previous sections, each PE MUST have proxy querier functionality for the following reasons:¶
Consider the EVPN network in Figure 1, where there is an EVPN instance configured across the PEs (namely PE1, PE2, and PE3). Let's consider that this EVPN instance consists of a single bridge domain (single subnet) with all the hosts and sources and the multicast router connected to this subnet. PE1 only has hosts (host denoted by Hx) connected to it. PE2 has a mix of hosts and a multicast source. PE3 has a mix of hosts, a multicast source (source denoted by Sx), and a multicast router (router denoted by Rx). Furthermore, let's consider that for (S1,G1), R1 is used as the multicast router. The following subsections describe the IGMP proxy operation in different PEs with regard to whether the locally attached devices for that subnet are:¶
When PE1 receives an IGMPv2 Membership Report from H1, it does not forward this Membership Report to any of its other ports (for this subnet) because all these local ports are associated with the hosts. PE1 sends an EVPN Multicast Group route corresponding to this Membership Report for (*,G1) with the v2 flag set. This EVPN route is received by PE2 and PE3, which are the members of the same BD (i.e., same EVI in case of a VLAN-based service or EVI and VLAN in case of a VLAN-aware bundle service). PE3 reconstructs the IGMPv2 Membership Report from this EVPN BGP route and only sends it to the port(s) with multicast routers attached to it (for that subnet). In this example, PE3 sends the reconstructed IGMPv2 Membership Report for (*,G1) only to R1. Furthermore, even though PE2 receives the EVPN BGP route, it does not send it to any of its ports for that subnet (viz., ports associated with H6 and H7).¶
When PE1 receives the second IGMPv2 Membership Report from H2 for the same multicast group (*,G1), it only adds that port to its OIF list, but it doesn't send any EVPN BGP routes because there is no change in information. However, when it receives the IGMPv3 Membership Report from H3 for the same (*,G1), besides adding the corresponding port to its OIF list, it re-advertises the previously sent EVPN SMET route with the v3 and exclude flag set.¶
Finally, when PE1 receives the IGMPv3 Membership Report from H4 for (S2,G2), it advertises a new EVPN SMET route corresponding to it.¶
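The proxy-reporting behavior walked through above can be sketched as a small Python model. This is only an illustration under stated assumptions, not part of the specification: the class `ProxyReporter`, its attributes, and the `advertised` log are hypothetical names, the flag bit values are taken from the Flags field layout (v1 as the least significant bit), and a set IE bit is assumed to stand for the "exclude" flag mentioned in the text.

```python
# Hypothetical flag constants, following the Flags field bit layout.
FLAG_V1, FLAG_V2, FLAG_V3, FLAG_IE = 0x01, 0x02, 0x04, 0x08
VERSION_FLAG = {1: FLAG_V1, 2: FLAG_V2, 3: FLAG_V3}


class ProxyReporter:
    """Toy model of a PE summarizing local IGMP Membership Reports."""

    def __init__(self):
        self.oif = {}         # (source, group) -> set of local ports
        self.flags = {}       # (source, group) -> flags last advertised
        self.advertised = []  # log of (source, group, flags) SMET routes

    def on_report(self, port, source, group, version, exclude=False):
        """Record a report; return True if a SMET route is (re-)advertised."""
        key = (source, group)
        self.oif.setdefault(key, set()).add(port)
        new_flags = self.flags.get(key, 0) | VERSION_FLAG[version]
        if exclude:
            new_flags |= FLAG_IE
        if new_flags == self.flags.get(key):
            return False  # no change in information: nothing sent in BGP
        self.flags[key] = new_flags
        self.advertised.append((source, group, new_flags))
        return True


pe1 = ProxyReporter()
results = [
    pe1.on_report("H1", "*", "G1", 2),                # first report
    pe1.on_report("H2", "*", "G1", 2),                # same info, no route
    pe1.on_report("H3", "*", "G1", 3, exclude=True),  # adds v3 + exclude
    pe1.on_report("H4", "S2", "G2", 3),               # new (S2,G2) route
]
```

In this model, the report from H2 only extends the OIF list, while the report from H3 changes the aggregated flags and therefore triggers a re-advertisement.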
The main difference in this case is that when PE2 receives the IGMPv3 Membership Report from H7 for (S2,G2), it advertises it in BGP to support the source moving, even though PE2 knows that S2 is attached to its local AC. PE2 adds the port associated with H7 to its OIF list for (S2,G2). The processing for IGMPv2 received from H6 is the same as the IGMPv2 Membership Report described in the previous section.¶
The main difference in this case relative to the previous two sections is that IGMPv2/v3 Membership Report messages received locally need to be sent to the port associated with router R1. Furthermore, the Membership Reports received via BGP (SMET) need to be passed to the R1 port but filtered for all other ports.¶
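The port-filtering rule common to the three cases can be sketched in Python as follows. The function name `report_egress_ports`, the port labels, and the role strings are hypothetical; the sketch assumes that each port of the subnet has already been classified as facing a host, a source, or a multicast router.

```python
def report_egress_ports(port_roles, ingress_port=None):
    """Return the ports a Membership Report is relayed to.

    port_roles maps a port name to "host", "source", or "router".
    ingress_port is the local port the report arrived on, or None when
    the report was reconstructed from a received SMET route.
    """
    return sorted(
        port
        for port, role in port_roles.items()
        if role == "router" and port != ingress_port
    )


# Illustrative port layouts (names are made up for this sketch).
pe2_ports = {"H6": "host", "H7": "host", "src": "source"}
pe3_ports = {"host-a": "host", "src-a": "source", "R1": "router"}

from_bgp_on_pe2 = report_egress_ports(pe2_ports)
from_bgp_on_pe3 = report_egress_ports(pe3_ports)
local_on_pe3 = report_egress_ports(pe3_ports, ingress_port="host-a")
```

A PE with no router port relays nothing, whereas a PE with a router port relays both locally received and BGP-learned reports to that port only.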
Because the Link Aggregation Group (LAG) flow hashing algorithm used by the CE is unknown at the PE, in an All-Active redundancy mode, it must be assumed that the CE can send a given IGMP message to any one of the multihomed PEs, either Designated Forwarder (DF) or non-DF, i.e., different IGMP Membership Request messages can arrive at different PEs in the redundancy group. Furthermore, their corresponding Leave messages can arrive at PEs that are different from the ones that received the Membership Report. Therefore, all PEs attached to a given Ethernet Segment (ES) must coordinate the IGMP Membership Request and Leave Group (x,G) state, where x may be either "*" or a particular source S for each BD on that ES. Each PE has a local copy of that state, and the EVPN signaling serves to synchronize that state across PEs. This allows the DF for that (ES,BD) to correctly advertise or withdraw a SMET route for that (x,G) group in that BD when needed. All-Active multihoming PEs for a given ES MUST support IGMP synchronization procedures described in this section if they need to perform IGMP proxy for hosts connected to that ES.¶
When a PE, either DF or non-DF, receives an IGMP Membership Report for (x,G) on a given multihomed ES operating in All-Active redundancy mode, it determines the BD to which the IGMP Membership Report belongs. If the PE doesn't already have the local IGMP Membership Request (x,G) state for that BD on that ES, it MUST instantiate that local IGMP Membership Request (x,G) state and MUST advertise a BGP IGMP Membership Report Synch route for that (ES,BD). The local IGMP Membership Request (x,G) state refers to the IGMP Membership Request (x,G) state that is created as a result of processing an IGMP Membership Report for (x,G).¶
The IGMP Membership Report Synch route MUST carry the ES-Import Route Target (RT) for the ES on which the IGMP Membership Report was received. Thus, it MUST only be imported by the PEs attached to that ES and not any other PEs.¶
When a PE, either DF or non-DF, receives an IGMP Membership Report Synch route, it installs that route, and if it doesn't already have the IGMP Membership Request (x,G) state for that (ES,BD), it MUST instantiate that IGMP Membership Request (x,G) state, i.e., the IGMP Membership Request (x,G) state is the union of the local IGMP Membership Report (x,G) state and the installed IGMP Membership Report Synch route. If the DF did not already advertise (originate) a SMET route for that (x,G) group in that BD, it MUST do so now.¶
When a PE, either DF or non-DF, deletes its local IGMP Membership Request (x,G) state for that (ES,BD), it MUST withdraw its BGP IGMP Membership Report Synch route for that (ES,BD).¶
When a PE, either DF or non-DF, receives the withdrawal of an IGMP Membership Report Synch route from another PE, it MUST remove that route. When a PE has no local IGMP Membership Request (x,G) state and it has no installed IGMP Membership Report Synch routes, it MUST remove that IGMP Membership Request (x,G) state for that (ES,BD). If the DF no longer has the IGMP Membership Request (x,G) state for that BD on any ES for which it is the DF, it MUST withdraw its SMET route for that (x,G) group in that BD.¶
In other words, a PE advertises a SMET route for that (x,G) group in that BD when it has the IGMP Membership Request (x,G) state on at least one ES for which it is the DF, and it withdraws that SMET route when it does not have an IGMP Membership Request (x,G) state in that BD on any ES for which it is the DF.¶
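The resulting decision rule can be sketched in Python. This is a simplified model under stated assumptions: the class `MultihomedPE` and its method names are hypothetical, BGP transport is omitted, and DF election is taken as a given input (the set of (ES,BD) pairs for which the PE is the DF).

```python
class MultihomedPE:
    """Toy model of Membership Request state on multihomed ESs."""

    def __init__(self, df_for):
        self.df_for = set(df_for)  # (es, bd) pairs where this PE is DF
        self.local = set()         # (es, bd, x, g) learned from local reports
        self.synch = {}            # (es, bd, x, g) -> peers with Synch routes

    def local_join(self, es, bd, x, g):
        self.local.add((es, bd, x, g))

    def local_leave(self, es, bd, x, g):
        self.local.discard((es, bd, x, g))

    def install_synch(self, es, bd, x, g, peer):
        self.synch.setdefault((es, bd, x, g), set()).add(peer)

    def withdraw_synch(self, es, bd, x, g, peer):
        self.synch.get((es, bd, x, g), set()).discard(peer)

    def has_state(self, es, bd, x, g):
        # State is the union of local state and installed Synch routes.
        key = (es, bd, x, g)
        return key in self.local or bool(self.synch.get(key))

    def should_advertise_smet(self, bd, x, g):
        # Advertise when state exists on at least one ES where we are DF.
        return any(
            self.has_state(es, bd, x, g)
            for (es, df_bd) in self.df_for
            if df_bd == bd
        )


pe = MultihomedPE(df_for={("ES1", "BD1")})
pe.local_join("ES2", "BD1", "*", "G1")            # not DF for ES2
before = pe.should_advertise_smet("BD1", "*", "G1")
pe.install_synch("ES1", "BD1", "*", "G1", "PE2")  # peer saw the report
during = pe.should_advertise_smet("BD1", "*", "G1")
pe.withdraw_synch("ES1", "BD1", "*", "G1", "PE2")
after = pe.should_advertise_smet("BD1", "*", "G1")
```

The DF advertises the SMET route even when the report itself arrived at a non-DF peer, because the installed Synch route contributes to its state.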
When a PE, either DF or non-DF, receives an IGMP Leave Group message for (x,G) from the attached CE on a given multihomed ES operating in All-Active redundancy mode, it determines the BD to which the IGMPv2 Leave Group belongs. Regardless of whether it has the IGMP Membership Request (x,G) state for that (ES,BD), it initiates the (x,G) leave group synchronization procedure, which consists of the following steps:¶
When a PE, either DF or non-DF, receives an IGMP Leave Synch route, it installs that route and it starts a timer for (x,G) on the specified (ES,BD), whose value is set to the Maximum Response Time in the received IGMP Leave Synch route. Note that subsequent IGMPv2 Leave Group messages or BGP Leave Synch routes for (x,G) that are received while the Maximum Response Time timer is running do not change its value and are ignored by the PE.¶
If a PE attached to the multihomed ES receives an IGMP Membership Report for (x,G) before the Maximum Response Time timer expires, it advertises a BGP IGMP Membership Report Synch route for that (ES,BD). If it doesn't already have the local IGMP Membership Request (x,G) state for that (ES,BD), it instantiates that local IGMP Membership Request (x,G) state. If the DF is not currently advertising (originating) a SMET route for that (x,G) group in that BD, it does so now.¶
If a PE attached to the multihomed ES receives an IGMP Membership Report Synch route for (x,G) before the Maximum Response Time timer expires, it installs that route, and if it doesn't already have the IGMP Membership Request (x,G) state for that BD on that ES, it instantiates that IGMP Membership Request (x,G) state. If the DF has not already advertised (originated) a SMET route for that (x,G) group in that BD, it does so now.¶
When the Maximum Response Time timer expires, a PE that has advertised an IGMP Leave Synch route withdraws it. Any PE attached to the multihomed ES that started the Maximum Response Time timer and has no local IGMP Membership Request (x,G) state and no installed IGMP Membership Report Synch routes removes the IGMP Membership Request (x,G) state for that (ES,BD). If the DF no longer has the IGMP Membership Request (x,G) state for that BD on any ES for which it is the DF, it withdraws its SMET route for that (x,G) group in that BD.¶
A PE that has received an IGMP Membership Request would have synced the IGMP Membership Report by the procedure defined in Section 6.1. If a PE with the local Membership Report state goes down or the PE-to-CE link goes down, it would lead to a mass withdrawal of multicast routes. Remote PEs (PEs where these routes were remote IGMP Membership Reports) SHOULD NOT remove the state immediately; instead, a General Query SHOULD be generated to refresh the states. There are several ways to detect failure at a peer, e.g., using IGP next-hop tracking or ES route withdrawal.¶
Note that to facilitate state synchronization after failover, the PEs attached to a multihomed ES operating in Single-Active redundancy mode SHOULD also coordinate the IGMP Membership Report (x,G) state. In this case, all IGMP Membership Report messages are received by the DF and distributed to the non-DF PEs using the procedures described above.¶
If an ingress PE uses ingress replication, then for a given (x,G) group in a given BD:¶
This document defines three new BGP EVPN routes to carry IGMP Membership Reports. The route types are known as:¶
The detailed encoding and procedures for these route types are described in subsequent sections.¶
A SMET route-type-specific EVPN NLRI consists of the following:¶
+---------------------------------------+
|  RD (8 octets)                        |
+---------------------------------------+
|  Ethernet Tag ID (4 octets)           |
+---------------------------------------+
|  Multicast Source Length (1 octet)    |
+---------------------------------------+
|  Multicast Source Address (variable)  |
+---------------------------------------+
|  Multicast Group Length (1 octet)     |
+---------------------------------------+
|  Multicast Group Address (variable)   |
+---------------------------------------+
|  Originator Router Length (1 octet)   |
+---------------------------------------+
|  Originator Router Address (variable) |
+---------------------------------------+
|  Flags (1 octet)                      |
+---------------------------------------+
For the purpose of BGP route key processing, all the fields are considered to be part of the prefix in the NLRI, except for the 1-octet Flags field. The Flags fields are defined as follows:¶
  0  1  2  3  4  5  6  7
+--+--+--+--+--+--+--+--+
| reserved  |IE|v3|v2|v1|
+--+--+--+--+--+--+--+--+
This section describes the procedures used to construct the SMET route.¶
The Route Distinguisher (RD) SHOULD be a Type 1 RD [RFC4364]. The value field comprises an IP address of the PE (typically, the loopback address), followed by a number unique to the PE.¶
The Ethernet Tag ID MUST be set, as per the procedure defined in [RFC7432].¶
The Multicast Source Length MUST be set to the length of the Multicast Source Address in bits. If the Multicast Source Address field contains an IPv4 address, then the value of the Multicast Source Length field is 32. If the Multicast Source Address field contains an IPv6 address, then the value of the Multicast Source Length field is 128. In case of a (*,G) Membership Report, the Multicast Source Length is set to 0.¶
The Multicast Source Address is the source IP address from the IGMP Membership Report. In case of a (*,G) Membership Report, this field is not used.¶
The Multicast Group Length MUST be set to the length of the Multicast Group Address in bits. If the Multicast Group Address field contains an IPv4 address, then the value of the Multicast Group Length field is 32. If the Multicast Group Address field contains an IPv6 address, then the value of the Multicast Group Length field is 128.¶
The Multicast Group Address is the group address from the IGMP or MLD Membership Report.¶
The Originator Router Length is the length of the Originator Router Address in bits.¶
The Originator Router Address is the IP address of the router originating this route. The SMET Originator Router IP address MUST match that of the IMET (or S-PMSI Auto-Discovery (A-D)) route originated for the same EVI by the same downstream PE.¶
The Flags field indicates the version of IGMP protocol from which the Membership Report was received. It also indicates whether the multicast group had the INCLUDE or EXCLUDE bit set.¶
Reserved bits MUST be set to 0. They can be defined by other documents in the future.¶
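The field-by-field construction above can be sketched in Python. This is a minimal illustration, not a BGP implementation: the helper names (`encode_rd_type1`, `encode_smet`) and the sample addresses are hypothetical, only the route-type-specific part of the NLRI is produced (the enclosing EVPN route type and length octets defined in [RFC7432] are omitted), and the Type 1 RD layout (2-octet type, 4-octet IPv4 address, 2-octet number) follows [RFC4364].

```python
import ipaddress
import struct


def encode_rd_type1(pe_ip, number):
    """Type 1 RD: type (2 octets) = 1, IPv4 address, PE-unique number."""
    return struct.pack("!H4sH", 1, ipaddress.IPv4Address(pe_ip).packed, number)


def _len_and_addr(addr):
    """Encode a 1-octet length in bits followed by the address octets."""
    if addr is None:          # (*,G): length 0 and no address octets
        return b"\x00"
    packed = ipaddress.ip_address(addr).packed
    return bytes([len(packed) * 8]) + packed   # 32 for IPv4, 128 for IPv6


def encode_smet(rd, eth_tag, source, group, originator, flags):
    """Return the route-type-specific SMET NLRI octets."""
    return (
        rd
        + struct.pack("!I", eth_tag)
        + _len_and_addr(source)
        + _len_and_addr(group)
        + _len_and_addr(originator)
        + bytes([flags])
    )


rd = encode_rd_type1("192.0.2.1", 10)
# (*,G) report received via IGMPv2 (v2 flag = 0x02); values are examples.
nlri_v4 = encode_smet(rd, 0, None, "239.1.1.1", "192.0.2.1", 0x02)
# (S,G) route with IPv6 addresses and the second least significant bit set.
nlri_v6 = encode_smet(rd, 0, "2001:db8::1", "ff3e::1", "2001:db8::ff", 0x02)
route_key = nlri_v4[:-1]   # every field except the 1-octet Flags field
```

Keeping the Flags octet outside `route_key` mirrors the rule that all fields except Flags are part of the prefix for BGP route key processing.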
ToRs use IGMP to receive group membership information from hosts. Upon receiving the host's expression of interest in a particular group membership, this information is then forwarded using the SMET route. The NLRI also keeps track of the receiver's IGMP protocol version and any source filtering for a given group membership. All EVPN SMET routes are announced per EVI Route Target extended communities (EVI-RT ECs).¶
This section describes the procedures used to reconstruct IGMP/MLD Membership Reports from the SMET route.¶
If there is a multicast router connected behind the EVPN domain, the PE MAY originate a default SMET (*,*) to get all multicast traffic in the domain.¶
Consider the EVPN network in Figure 2, where there is an EVPN instance configured across the PEs. Let's consider that PE2 is connected to multicast router R1 and there is a network running PIM ASM behind R1. If there are receivers behind the PIM ASM network, the PIM Join would be forwarded to the PIM Rendezvous Point (RP). If receivers behind the PIM ASM network are interested in a multicast flow originated by multicast source S2 (behind PE1), it is necessary for PE2 to receive multicast traffic. In this case, PE2 MUST originate a (*,*) SMET route to receive all of the multicast traffic in the EVPN domain. To generate wildcard (*,*) routes, the procedure from [RFC6625] MUST be used.¶
This EVPN route type is used to coordinate the IGMP Membership Report (x,G) state for a given BD between the PEs attached to a given ES operating in All-Active (or Single-Active) redundancy mode, and it consists of the following:¶
+--------------------------------------------------+
|  RD (8 octets)                                   |
+--------------------------------------------------+
|  Ethernet Segment Identifier (10 octets)         |
+--------------------------------------------------+
|  Ethernet Tag ID (4 octets)                      |
+--------------------------------------------------+
|  Multicast Source Length (1 octet)               |
+--------------------------------------------------+
|  Multicast Source Address (variable)             |
+--------------------------------------------------+
|  Multicast Group Length (1 octet)                |
+--------------------------------------------------+
|  Multicast Group Address (variable)              |
+--------------------------------------------------+
|  Originator Router Length (1 octet)              |
+--------------------------------------------------+
|  Originator Router Address (variable)            |
+--------------------------------------------------+
|  Flags (1 octet)                                 |
+--------------------------------------------------+
For the purpose of BGP route key processing, all the fields are considered to be part of the prefix in the NLRI, except for the 1-octet Flags field, whose fields are defined as follows:¶
  0  1  2  3  4  5  6  7
+--+--+--+--+--+--+--+--+
| reserved  |IE|v3|v2|v1|
+--+--+--+--+--+--+--+--+
The Flags field assists in distributing the IGMP Membership Report of a given host for a given multicast route. The version bits help associate the IGMP version of receivers participating within the EVPN domain. The include/exclude bit helps in creating filters for a given multicast route.¶
If the route is being prepared for IPv6 (MLD), then bit 7 indicates support for MLD version 1. The second least significant bit (bit 6) indicates support for MLD version 2. Since there is no MLD version 3, in case of the IPv6 route, the third least significant bit MUST be 0. In case of the IPv6 route, the fourth least significant bit MUST be ignored if bit 6 is not set.¶
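The interpretation of the Flags octet for IGMP and MLD routes can be sketched in Python. The function `decode_flags` and the keys of its result are hypothetical names chosen for this illustration; the bit values follow the Flags layout with bit 7 as the least significant bit. The sketch does not interpret what a set IE bit means, and it only reports whether the must-be-zero bits are clear rather than prescribing how a receiver reacts when they are not.

```python
FLAG_V1 = 0x01          # bit 7, least significant bit
FLAG_V2 = 0x02          # bit 6
FLAG_V3 = 0x04          # bit 5
FLAG_IE = 0x08          # bit 4
RESERVED_MASK = 0xF0    # bits 0-3


def decode_flags(flags, is_ipv6):
    """Decode a Flags octet for an IGMP (IPv4) or MLD (IPv6) route."""
    prefix = "MLDv" if is_ipv6 else "IGMPv"
    bits = [(FLAG_V1, 1), (FLAG_V2, 2)]
    if not is_ipv6:
        bits.append((FLAG_V3, 3))   # there is no MLD version 3
    versions = {f"{prefix}{n}" for bit, n in bits if flags & bit}

    zero_bits_ok = not (flags & RESERVED_MASK)
    ie_bit = bool(flags & FLAG_IE)
    if is_ipv6:
        if flags & FLAG_V3:
            zero_bits_ok = False    # third least significant bit must be 0
        if not flags & FLAG_V2:
            ie_bit = None           # ignored unless bit 6 (MLDv2) is set
    return {"versions": versions, "ie_bit": ie_bit, "zero_bits_ok": zero_bits_ok}


mld_v2 = decode_flags(0x0A, is_ipv6=True)
mld_v1_with_ie = decode_flags(0x09, is_ipv6=True)
mld_bad = decode_flags(0x04, is_ipv6=True)
igmp_mixed = decode_flags(0x0E, is_ipv6=False)
```

Returning `None` for the IE bit makes the "ignored" case explicit to the caller instead of silently reporting a value that carries no meaning.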
This section describes the procedures used to construct the IGMP Membership Report Synch route. Support for these route types is optional. If a PE does not support this route, then it MUST NOT indicate that it supports "IGMP proxy" in the Multicast Flags extended community for the EVIs corresponding to its multihomed ESs.¶
An IGMP Membership Report Synch route MUST carry exactly one ES-Import Route Target extended community, i.e., the one that corresponds to the ES on which the IGMP Membership Report was received. It MUST also carry exactly one EVI-RT EC, i.e., the one that corresponds to the EVI on which the IGMP Membership Report was received. See Section 9.5 for details on how to encode and construct the EVI-RT EC.¶
The RD SHOULD be Type 1 [RFC4364]. The value field comprises an IP address of the PE (typically, the loopback address), followed by a number unique to the PE.¶
The Ethernet Segment Identifier (ESI) MUST be set to the 10-octet value defined for the ES.¶
The Ethernet Tag ID MUST be set, as per the procedure defined in [RFC7432].¶
The Multicast Source Length MUST be set to the length of the Multicast Source Address in bits. If the Multicast Source field contains an IPv4 address, then the value of the Multicast Source Length field is 32. If the Multicast Source field contains an IPv6 address, then the value of the Multicast Source Length field is 128. In case of a (*,G) Membership Report, the Multicast Source Length is set to 0.¶
The Multicast Source is the Source IP address of the IGMP Membership Report. In case of a (*,G) Membership Report, this field does not exist.¶
The Multicast Group Length MUST be set to the length of the Multicast Group Address in bits. If the Multicast Group field contains an IPv4 address, then the value of the Multicast Group Length field is 32. If the Multicast Group field contains an IPv6 address, then the value of the Multicast Group Length field is 128.¶
The Multicast Group is the group address of the IGMP Membership Report.¶
The Originator Router Length is the length of the Originator Router Address in bits.¶
The Originator Router Address is the IP address of the router originating the prefix.¶
The Flags field indicates the version of IGMP protocol from which the Membership Report was received. It also indicates whether the multicast group had the INCLUDE or EXCLUDE bit set.¶
Reserved bits MUST be set to 0.¶
This section describes the procedures used to reconstruct IGMP/MLD Membership Reports from the Multicast Membership Report Synch route.¶
This EVPN route type is used to coordinate the IGMP Leave Group (x,G) state for a given BD between the PEs attached to a given ES operating in an All-Active (or Single-Active) redundancy mode, and it consists of the following:¶
+--------------------------------------------------+
|  RD (8 octets)                                   |
+--------------------------------------------------+
|  Ethernet Segment Identifier (10 octets)         |
+--------------------------------------------------+
|  Ethernet Tag ID (4 octets)                      |
+--------------------------------------------------+
|  Multicast Source Length (1 octet)               |
+--------------------------------------------------+
|  Multicast Source Address (variable)             |
+--------------------------------------------------+
|  Multicast Group Length (1 octet)                |
+--------------------------------------------------+
|  Multicast Group Address (variable)              |
+--------------------------------------------------+
|  Originator Router Length (1 octet)              |
+--------------------------------------------------+
|  Originator Router Address (variable)            |
+--------------------------------------------------+
|  Reserved (4 octets)                             |
+--------------------------------------------------+
|  Maximum Response Time (1 octet)                 |
+--------------------------------------------------+
|  Flags (1 octet)                                 |
+--------------------------------------------------+
For the purpose of BGP route key processing, all the fields are considered to be part of the prefix in the NLRI, except for the Reserved, Maximum Response Time, and 1-octet Flags fields, which are defined as follows:¶
  0  1  2  3  4  5  6  7
+--+--+--+--+--+--+--+--+
| reserved  |IE|v3|v2|v1|
+--+--+--+--+--+--+--+--+
The Flags field assists in distributing the IGMP Membership Report of a given host for a given multicast route. The version bits help associate the IGMP version of the receivers participating within the EVPN domain. The include/exclude bit helps in creating filters for a given multicast route.¶
If the route is being prepared for IPv6 (MLD), then bit 7 indicates support for MLD version 1. The second least significant bit (bit 6) indicates support for MLD version 2. Since there is no MLD version 3, in case of the IPv6 route, the third least significant bit MUST be 0. In case of the IPv6 route, the fourth least significant bit MUST be ignored if bit 6 is not set.¶
Reserved bits in the flag MUST be set to 0. They can be defined by other documents in the future.¶
This section describes the procedures used to construct the IGMP Leave Synch route. Support for these route types is optional. If a PE does not support this route, then it MUST NOT indicate that it supports "IGMP proxy" in the Multicast Flags extended community for the EVIs corresponding to its multihomed Ethernet Segments.¶
An IGMP Leave Synch route MUST carry exactly one ES-Import Route Target extended community, i.e., the one that corresponds to the ES on which the IGMP Leave was received. It MUST also carry exactly one EVI-RT EC, i.e., the one that corresponds to the EVI on which the IGMP Leave was received. See Section 9.5 for details on how to form the EVI-RT EC.¶
The RD SHOULD be Type 1 [RFC4364]. The value field comprises an IP address of the PE (typically, the loopback address), followed by a number unique to the PE.¶
The ESI MUST be set to the 10-octet value defined for the ES.¶
The Ethernet Tag ID MUST be set, as per the procedure defined in [RFC7432].¶
The Multicast Source Length MUST be set to the length of the Multicast Source Address in bits. If the Multicast Source field contains an IPv4 address, then the value of the Multicast Source Length field is 32. If the Multicast Source field contains an IPv6 address, then the value of the Multicast Source Length field is 128. In case of a (*,G) Membership Report, the Multicast Source Length is set to 0.¶
The Multicast Source is the Source IP address of the IGMP Membership Report. In case of a (*,G) Membership Report, this field does not exist.¶
The Multicast Group Length MUST be set to the length of the Multicast Group Address in bits. If the Multicast Group field contains an IPv4 address, then the value of the Multicast Group Length field is 32. If the Multicast Group field contains an IPv6 address, then the value of the Multicast Group Length field is 128.¶
The Multicast Group is the group address of the IGMP Membership Report.¶
The Originator Router Length is the length of the Originator Router Address in bits.¶
The Originator Router Address is the IP address of the router originating the prefix.¶
The Reserved field is not part of the route key. The originator MUST set the Reserved field to 0; the receiver SHOULD ignore it, and if it needs to be propagated, it MUST propagate it unchanged.¶
The Maximum Response Time is the value to be used while sending a query, as defined in [RFC2236].¶
The Flags field indicates the version of IGMP protocol from which the Membership Report was received. It also indicates whether the multicast group had an INCLUDE or EXCLUDE bit set.¶
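The construction of the IGMP Leave Synch route can be sketched in Python, with the route key kept separate from the trailing non-key fields. The function name `encode_leave_synch`, the sample ESI, and the sample addresses are hypothetical; as in any such sketch, only the route-type-specific part of the NLRI is built, and the Maximum Response Time value used below is an arbitrary example.

```python
import ipaddress
import struct


def _len_and_addr(addr):
    """Encode a 1-octet length in bits followed by the address octets."""
    if addr is None:          # (*,G): length 0 and no address octets
        return b"\x00"
    packed = ipaddress.ip_address(addr).packed
    return bytes([len(packed) * 8]) + packed


def encode_leave_synch(rd, esi, eth_tag, source, group, originator,
                       max_resp_time, flags):
    """Return (route_key, nlri) for an IGMP Leave Synch route."""
    if len(rd) != 8 or len(esi) != 10:
        raise ValueError("RD must be 8 octets and ESI 10 octets")
    key = (
        rd
        + esi
        + struct.pack("!I", eth_tag)
        + _len_and_addr(source)
        + _len_and_addr(group)
        + _len_and_addr(originator)
    )
    # Non-key tail: Reserved (4 octets, set to 0), Maximum Response Time,
    # and Flags (1 octet each).
    tail = struct.pack("!IBB", 0, max_resp_time, flags)
    return key, key + tail


# Type 1 RD built by hand; the ESI value is made up for this example.
rd = struct.pack("!H4sH", 1, ipaddress.IPv4Address("192.0.2.1").packed, 10)
esi = bytes.fromhex("00112233445566778899")
key, nlri = encode_leave_synch(
    rd, esi, 0, None, "239.1.1.1", "192.0.2.1", max_resp_time=10, flags=0x02
)
```

Because two routes that differ only in the tail share the same key, a re-advertisement with a different Maximum Response Time replaces the earlier route instead of adding a second one.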
This section describes the procedures used to reconstruct IGMP/MLD Leave from the Multicast Leave Synch route.¶
The Multicast Flags extended community is a new EVPN extended community. EVPN extended communities are transitive extended communities with a Type Value of 0x06. IANA has assigned 0x09 to Multicast Flags Extended Community in the "EVPN Extended Community Sub-Types" subregistry.¶
A PE that supports the IGMP and/or MLD Proxy on a given BD MUST attach this extended community to the IMET route it advertises for that BD, and it MUST set the IGMP and/or MLD Proxy Support flags to 1. Note that a PE compliant with [RFC7432] will not advertise this extended community, so its absence indicates that the advertising PE does not support either IGMP or MLD Proxies.¶
The advertisement of this extended community enables a more efficient multicast tunnel setup from the source PE, especially for ingress replication. For example, if an egress PE supports the IGMP proxy but has no interest in a given (x,G), it advertises its IGMP proxy capability using this extended community, but it does not advertise any SMET route for that (x,G). When the source PE (ingress PE) receives such advertisements from the egress PE, it does not replicate the multicast traffic to that egress PE; however, it does replicate the multicast traffic to the egress PEs that do not advertise such capability, even if they have no interest in that (x,G).¶
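The replication decision described above can be sketched as follows. The per-PE state layout (a "proxy_capable" flag learned from the IMET route and a set of (source, group) keys learned from SMET routes) is a hypothetical data model chosen for illustration, not something this document defines.

```python
def replication_targets(egress_pes, flow):
    """Return the egress PEs to which the ingress PE replicates `flow`.

    egress_pes: dict mapping a PE name to a dict with
      "proxy_capable": True if the PE advertised the Multicast Flags
                       extended community on its IMET route
      "smet":          set of (source, group) keys received in SMET
                       routes from that PE
    flow: the (source, group) key of the traffic, e.g. ("*", "239.1.1.1")
    """
    targets = []
    for pe, state in egress_pes.items():
        if not state["proxy_capable"]:
            # No capability advertised: replicate regardless of interest.
            targets.append(pe)
        elif flow in state["smet"]:
            # Proxy-capable PE that signaled interest in this flow.
            targets.append(pe)
        # Proxy-capable PE without a matching SMET route: skip it.
    return targets
```

With three egress PEs, one proxy-capable and interested, one proxy-capable and not interested, and one that does not advertise the capability, only the first and third receive the traffic.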
A Multicast Flags extended community is encoded as an 8-octet value as follows:¶
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type=0x06   | Sub-Type=0x09 |      Flags (2 Octets)     |M|I|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Reserved=0                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+¶
The low-order (least significant) 2 bits are defined as the "IGMP Proxy Support" and "MLD Proxy Support" bits (see Section 12.3). The absence of this extended community also means that the PE does not support the IGMP proxy, where:¶
Flags are 2-octet values.¶
If a router does not support this specification, it MUST NOT add the Multicast Flags Extended Community in the BGP route. When a router receives a BGP update, if both M and I flags are 0, the router MUST treat this update as malformed. The receiver of such an update MUST ignore the extended community.¶
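As an illustration of the encoding and of the receive-side rule, the following sketch packs and parses the 8-octet value. It takes the I flag as the least significant bit of the Flags field and the M flag as the next bit, which matches the diagram and the bit numbering in the IANA registry (bits 15 and 14, counted from the high-order end). The function names and the choice to return None for a community that is to be ignored are assumptions of this sketch.

```python
import struct

EVPN_EC_TYPE = 0x06
MCAST_FLAGS_SUBTYPE = 0x09
FLAG_IGMP_PROXY = 0x0001  # I: bit 15 when numbered from the high-order bit
FLAG_MLD_PROXY = 0x0002   # M: bit 14 when numbered from the high-order bit


def encode_multicast_flags(igmp_proxy, mld_proxy):
    """Build the 8-octet Multicast Flags extended community."""
    flags = 0
    if igmp_proxy:
        flags |= FLAG_IGMP_PROXY
    if mld_proxy:
        flags |= FLAG_MLD_PROXY
    if flags == 0:
        # A PE supporting neither proxy does not attach the community.
        raise ValueError("at least one of the I and M flags must be set")
    # Type (1), Sub-Type (1), Flags (2), Reserved (4), network byte order.
    return struct.pack("!BBHI", EVPN_EC_TYPE, MCAST_FLAGS_SUBTYPE, flags, 0)


def decode_multicast_flags(ec):
    """Return (igmp_proxy, mld_proxy), or None if the community is ignored."""
    if len(ec) != 8:
        return None
    ec_type, sub_type, flags, _reserved = struct.unpack("!BBHI", ec)
    if ec_type != EVPN_EC_TYPE or sub_type != MCAST_FLAGS_SUBTYPE:
        return None
    igmp = bool(flags & FLAG_IGMP_PROXY)
    mld = bool(flags & FLAG_MLD_PROXY)
    if not (igmp or mld):
        # Both flags 0: the receiver ignores the extended community.
        return None
    return igmp, mld
```

The Reserved field is read but not interpreted on receipt; unassigned Flags bits are likewise ignored by this sketch.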
In EVPN, every EVI is associated with one or more Route Targets. These RTs serve two functions:¶
An IGMP Membership Report Synch or IGMP Leave Synch route is associated with a particular combination of ES and EVI. These routes need to be distributed only to PEs that are attached to the associated ES. Therefore, these routes carry the ES-Import RT for that ES.¶
Since an IGMP Membership Report Synch or IGMP Leave Synch route does not need to be distributed to all the PEs on which the associated EVI exists, these routes cannot carry the RT associated with that EVI. Therefore, when such a route arrives at a particular PE, the route's RTs cannot be used to identify the EVI to which the route applies. Some other means of associating the route with an EVI must be used.¶
This document specifies four new ECs that can be used to identify the EVI with which a route is associated but do not have any effect on the distribution of the route. These new ECs are known as "Type 0 EVI-RT EC", "Type 1 EVI-RT EC", "Type 2 EVI-RT EC", and "Type 3 EVI-RT EC".¶
Each IGMP Membership Report Synch or IGMP Leave Synch route MUST carry exactly one EVI-RT EC. The EVI-RT EC carried by a particular route is constructed as follows. Each such route is the result of having received an IGMP Membership Report or an IGMP Leave message from a particular BD. The route is said to be associated with that BD. For each BD, there is a corresponding RT that is used to ensure that routes "about" that BD are distributed to all PEs attached to that BD. So suppose a given IGMP Membership Report Synch or Leave Synch route is associated with a given BD, say BD1, and suppose that the corresponding RT for BD1 is RT1. Then:¶
An IGMP Membership Report Synch or Leave Synch route MUST carry exactly one EVI-RT EC.¶
Suppose a PE receives a particular IGMP Membership Report Synch or IGMP Leave Synch route, say R1, and suppose that R1 carries an ES-Import RT that is one of the PE's Import RTs. If R1 has no EVI-RT EC or has more than one EVI-RT EC, the PE MUST apply the "treat-as-withdraw" procedure per [RFC7606].¶
Note that an EVI-RT EC is not a Route Target extended community, is not visible to the RT Constrain mechanism [RFC4684], and is not intended to influence the propagation of routes by BGP.¶
                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type=0x06   |  Sub-Type=n   |     RT associated with EVI    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              RT associated with the EVI (cont.)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+¶
The value of "n" is 0x0A, 0x0B, 0x0C, or 0x0D, corresponding to EVI-RT types 0, 1, 2, or 3, respectively.¶
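A minimal sketch of the EVI-RT EC encoding and of the "exactly one" check on receipt is given below. It treats the 6 octets after the Sub-Type as an opaque value supplied by the caller. How that value is derived from the RT of the BD is not covered by this sketch, and the function names are hypothetical.

```python
EVPN_EC_TYPE = 0x06
# EVI-RT type (0..3) -> Sub-Type code point
EVI_RT_SUBTYPES = {0: 0x0A, 1: 0x0B, 2: 0x0C, 3: 0x0D}


def encode_evi_rt(evi_rt_type, rt_value):
    """Build an 8-octet EVI-RT EC from its type and 6-octet value field."""
    if len(rt_value) != 6:
        raise ValueError("the value field must be exactly 6 octets")
    return bytes([EVPN_EC_TYPE, EVI_RT_SUBTYPES[evi_rt_type]]) + rt_value


def is_evi_rt(ec):
    """True if the 8-octet extended community is an EVI-RT EC."""
    return (len(ec) == 8
            and ec[0] == EVPN_EC_TYPE
            and ec[1] in EVI_RT_SUBTYPES.values())


def needs_treat_as_withdraw(ext_communities):
    """True if a Synch route carries zero or more than one EVI-RT EC."""
    count = sum(1 for ec in ext_communities if is_evi_rt(ec))
    return count != 1
```

A receiver would run the check only on routes whose ES-Import RT matches one of its Import RTs; other extended communities on the route, such as the Multicast Flags extended community, do not count toward the total.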
There are certain situations in which an ES is attached to a set of PEs that are not all in the same AS, or not all operated by the same provider. In this situation, the RT that corresponds to a particular EVI may be different in each AS. If a route is propagated from AS1 to AS2, an ASBR at the AS1/AS2 border may be configured with a policy that replaces the EVI RTs for AS1 with the corresponding EVI RTs for AS2. This is known as RT-rewriting.¶
If an ASBR is configured to perform RT-rewriting of the EVI RTs in EVPN routes, it MUST be configured to perform RT-rewriting of the corresponding EVI-RT extended communities in IGMP Membership Report Synch and IGMP Leave Synch routes.¶
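One way to picture this requirement is a policy table that is applied to EVI-RT ECs as well as to the EVI RTs. The sketch below shows only the EVI-RT side and assumes a hypothetical table keyed by the 6-octet value field; real ASBR policy languages differ, and nothing here describes an actual implementation.

```python
EVPN_EC_TYPE = 0x06
EVI_RT_SUBTYPE_MIN = 0x0A
EVI_RT_SUBTYPE_MAX = 0x0D


def rewrite_evi_rt(ext_communities, value_map):
    """Rewrite the value field of EVI-RT ECs according to value_map.

    ext_communities: list of 8-octet extended communities (bytes)
    value_map: hypothetical policy table mapping the 6-octet value used
               in the upstream AS to the value used in the downstream AS
    Communities that are not EVI-RT ECs, or that have no entry in the
    table, are passed through unchanged.
    """
    rewritten = []
    for ec in ext_communities:
        is_evi_rt = (len(ec) == 8
                     and ec[0] == EVPN_EC_TYPE
                     and EVI_RT_SUBTYPE_MIN <= ec[1] <= EVI_RT_SUBTYPE_MAX)
        if is_evi_rt and ec[2:] in value_map:
            # Keep Type and Sub-Type, replace only the 6-octet value.
            ec = ec[:2] + value_map[ec[2:]]
        rewritten.append(ec)
    return rewritten
```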
If a received BGP update contains Flags not in accordance with the IGMP/MLD version-X expectation, the PE MUST apply the "treat-as-withdraw" procedure per [RFC7606].¶
If a received BGP update is malformed such that BGP route keys cannot be extracted, then the BGP update MUST be considered invalid. The receiving PE MUST apply the "session reset" procedure per [RFC7606].¶
This document does not provide any detail about IGMPv1 processing. Implementations are expected to only use IGMPv2 and above for IPv4 and MLDv1 and above for IPv6. IGMPv1 routes are considered invalid, and the PE MUST apply the "treat-as-withdraw" procedure per [RFC7606].¶
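The three error cases above map onto two actions, which can be summarized in a short sketch. The inputs are hypothetical results of earlier parsing steps; in particular, how a PE decides that the Flags are consistent with the IGMP/MLD version is left to the caller.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    TREAT_AS_WITHDRAW = "treat-as-withdraw"
    SESSION_RESET = "session reset"


def classify_update(keys_extracted, igmp_version, flags_consistent):
    """Pick the error-handling action for a received route.

    keys_extracted:   False if the route keys could not be parsed
    igmp_version:     IGMP version signaled in the Flags field, or None
                      for an MLD route
    flags_consistent: False if the Flags do not match what is expected
                      for the signaled IGMP/MLD version
    """
    if not keys_extracted:
        # Without route keys there is nothing to withdraw.
        return Action.SESSION_RESET
    if igmp_version == 1:
        # IGMPv1 routes are considered invalid.
        return Action.TREAT_AS_WITHDRAW
    if not flags_consistent:
        return Action.TREAT_AS_WITHDRAW
    return Action.ACCEPT
```

The key-extraction check comes first because the other two checks, and the treat-as-withdraw action itself, presuppose that the route can be identified.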
This document describes a means to efficiently operate IGMP and MLD on a subnet constructed across multiple PODs or DCs via an EVPN solution. The security considerations for the operation of the underlying EVPN and BGP substrates are described in [RFC7432], and specific multicast considerations are outlined in [RFC6513] and [RFC6514]. The EVPN and associated IGMP proxy provide a single broadcast domain, so the same security considerations of IGMPv2 [RFC2236], IGMPv3 [RFC3376], MLD [RFC2710], or MLDv2 [RFC3810] apply.¶
IANA has allocated the following codepoints in the "EVPN Extended Community Sub-Types" subregistry under the "Border Gateway Protocol (BGP) Extended Communities" registry.¶
Sub-Type Value | Name | Reference |
---|---|---|
0x09 | Multicast Flags Extended Community | RFC 9251 |
0x0A | EVI-RT Type 0 | RFC 9251 |
0x0B | EVI-RT Type 1 | RFC 9251 |
0x0C | EVI-RT Type 2 | RFC 9251 |
0x0D | EVI-RT Type 3 | RFC 9251 |
IANA has allocated the following EVPN route types in the "EVPN Route Types" subregistry.¶
IANA has created and now maintains a new subregistry called "Multicast Flags Extended Community" under the "Border Gateway Protocol (BGP) Extended Communities" registry. The registration procedure is First Come First Served [RFC8126]. For the 16-bit Flags field, the bits are numbered 0-15, from high order to low order. The registry was initialized as follows:¶
Bit | Name | Reference | Change Controller |
---|---|---|---|
0-13 | Unassigned | | |
14 | MLD Proxy Support | RFC 9251 | IETF |
15 | IGMP Proxy Support | RFC 9251 | IETF |
The authors would like to thank Stephane Litkowski, Jorge Rabadan, Anoop Ghanwani, Jeffrey Haas, Krishna Muddenahally Ananthamurthy, and Swadesh Agrawal for their reviews and valuable comments.¶