Network Working Group                        S. Poretsky
Internet Draft                               Allot Communications
Expires: June 2010                           Rajiv Papneja
Intended Status: Informational               Isocore
                                             J. Karthik
                                             S. Vapiwala
                                             Cisco Systems
                                             December 2009

Benchmarking Terminology for Protection Performance

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on April 15, 2010.

Copyright Notice

Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

Abstract

This document provides common terminology and metrics for benchmarking the performance of sub-IP layer protection mechanisms. The performance benchmarks are measured at the IP-Layer, avoiding dependence on specific sub-IP protection mechanisms.
The benchmarks and terminology can be applied in methodology documents for different sub-IP layer protection mechanisms such as Automatic Protection Switching (APS), Virtual Router Redundancy Protocol (VRRP), Stateful High Availability (HA), and Multi-Protocol Label Switching Fast Reroute (MPLS-FRR).

Poretsky, Papneja, Karthik, Vapiwala    Expires June 2010    [Page 1]
Internet-Draft    Benchmarking Terminology for Protection Performance    December 2009

Table of Contents

   1. Introduction
   2. Existing definitions
   3. Test Considerations
      3.1. Paths
         3.1.1. Path
         3.1.2. Working Path
         3.1.3. Primary Path
         3.1.4. Protected Primary Path
         3.1.5. Backup Path
         3.1.6. Standby Backup Path
         3.1.7. Dynamic Backup Path
         3.1.8. Disjoint Paths
         3.1.9. Point of Local Repair (PLR)
         3.1.10. Shared Risk Link Group (SRLG)
      3.2. Protection Mechanisms
         3.2.1. Link Protection
         3.2.2. Node Protection
         3.2.3. Path Protection
         3.2.4. Backup Span
         3.2.5. Local Link Protection
         3.2.6. Redundant Node Protection
         3.2.7. State Control Interface
         3.2.8. Protected Interface
      3.3. Protection Switching
         3.3.1. Protection Switching System
         3.3.2. Failover Event
         3.3.3. Failure Detection
         3.3.4. Failover
         3.3.5. Restoration
         3.3.6. Reversion
      3.4. Nodes
         3.4.1. Protection-Switching Node
         3.4.2. Non-Protection Switching Node
         3.4.3. Headend Node
         3.4.4. Backup Node
         3.4.5. Merge Node
         3.4.6. Primary Node
         3.4.7. Standby Node
      3.5. Benchmarks
         3.5.1. Failover Packet Loss
         3.5.2. Reversion Packet Loss
         3.5.3. Failover Time
         3.5.4. Reversion Time
         3.5.5. Additive Backup Delay
      3.6. Failover Time Calculation Methods
         3.6.1. Time-Based Loss Method
         3.6.2. Packet-Loss Based Method
         3.6.3. Timestamp-Based Method
   4. Acknowledgments
   5. IANA Considerations
   6. Security Considerations
   7. References
   8. Authors' Addresses

1. Introduction

The IP network layer provides route convergence to protect data traffic against planned and unplanned failures in the internet. Fast convergence times are critical to maintain reliable network connectivity and performance.
Convergence Events [7] are recognized at the IP Layer so that Route Convergence [7] occurs. Technologies that function at sub-IP layers can be enabled to provide further protection of IP traffic by providing the failure recovery at the sub-IP layers so that the outage is not observed at the IP-layer. Such sub-IP protection technologies include, but are not limited to, High Availability (HA) stateful failover, Virtual Router Redundancy Protocol (VRRP) [11], Automatic Protection Switching (APS) for SONET/SDH, Resilient Packet Ring (RPR) for Ethernet, and Fast Reroute for Multi-Protocol Label Switching (MPLS-FRR) [8].

1.1 Scope

Benchmarking terminology was defined for IP-layer convergence in [7]. Different terminology and methodologies specific to benchmarking sub-IP layer protection mechanisms are required. The metrics for benchmarking the performance of sub-IP protection mechanisms are measured at the IP layer, so that the results are always measured in reference to IP and independent of the specific protection mechanism being used. The purpose of this document is to provide a single terminology for benchmarking sub-IP protection mechanisms.

A common terminology for Sub-IP layer protection mechanism benchmarking enables different implementations of a protection mechanism to be benchmarked and evaluated. In addition, implementations of different protection mechanisms can be benchmarked and evaluated. It is intended that there can exist unique methodology documents for each sub-IP protection mechanism based upon this common terminology document. The terminology can be applied to methodologies that benchmark sub-IP protection mechanism performance with a single stream of traffic or multiple streams of traffic. The traffic flow may be uni-directional or bi-directional, as indicated in the methodology.

1.2 General Model

The sequence of events to benchmark the performance of Sub-IP Protection Mechanisms is as follows:

   1. Failover Event - Primary Path fails
   2. Failure Detection - Failover Event is detected
   3. Failover - Backup Path becomes the Working Path due to Failover Event
   4. Restoration - Primary Path recovers from a Failover Event
   5. Reversion (optional) - Primary Path becomes the Working Path

These terms are further defined in this document.

Figures 1 through 5 show models that MAY be used when benchmarking Sub-IP Protection mechanisms, which MUST use a Protection Switching System that consists of a minimum of two Protection-Switching Nodes, an Ingress Node known as the Headend Node and an Egress Node known as the Merge Node. The Protection Switching System MUST include either a Primary Path and Backup Path, as shown in Figures 1 through 4, or a Primary Node and Standby Node, as shown in Figure 5. A Protection Switching System may provide link protection, node protection, path protection, local link protection, and high availability, as shown in Figures 1 through 5 respectively. A Failover Event occurs along the Primary Path or at the Primary Node. The Working Path is the Primary Path prior to the Failover Event and the Backup Path after the Failover Event. A Tester is set outside the two paths or nodes as it sends and receives IP traffic along the Working Path. The Tester MUST record the IP packet sequence numbers, departure time, and arrival time so that the metrics of Failover Time, Additive Latency, Packet Reordering, Duplicate Packets, and Reversion Time can be measured. The Tester may be a single device or a test system. If Reversion is supported then the Working Path is the Primary Path after Restoration (Failure Recovery) of the Primary Path.

Link Protection, as shown in Figure 1, provides protection when a Failover Event occurs on the link between two nodes along the Primary Path.
Node Protection, as shown in Figure 2, provides protection when a Failover Event occurs at a Node along the Primary Path.

Path Protection, as shown in Figure 3, provides protection for link or node failures for multiple hops along the Primary Path.

Local Link Protection, as shown in Figure 4, provides Sub-IP Protection of a link between two nodes, without a Backup Node. An example of such a Sub-IP Protection mechanism is SONET APS.

High Availability Protection, as shown in Figure 5, provides protection of a Primary Node with a redundant Standby Node. State Control is provided between the Primary and Standby Nodes. Failure of the Primary Node is detected at the Sub-IP layer to force traffic to switch to the Standby Node, which has state maintained for zero or minimal packet loss.

                  +-----------+
   +--------------|  Tester   |<-----------------------+
   |              +-----------+                        |
   | IP Traffic         | Failover         IP Traffic  |
   |                    | Event                        |
   |     ------------   |                ----------    |
   +--->| Ingress/   |  V               | Egress/  |---+
        |Headend Node|------------------|Merge Node|   Primary
         ------------                    ----------    Path
               |                               ^
               |         --------              |       Backup
               +--------| Backup |-------------+       Path
                        |  Node  |
                         --------

   Figure 1. System Under Test (SUT) for Sub-IP Link Protection

                        +-----------+
   +--------------------|  Tester   |<-----------------+
   |                    +-----------+                  |
   | IP Traffic               | Failover   IP Traffic  |
   |                          | Event                  |
   |                          V                        |
   |     ------------      --------      ----------    |
   +--->| Ingress/   |    |MidPoint|    | Egress/  |---+
        |Headend Node|----|  Node  |----|Merge Node|   Primary
         ------------      --------      ----------    Path
               |                               ^
               |         --------              |       Backup
               +--------| Backup |-------------+       Path
                        |  Node  |
                         --------

   Figure 2. System Under Test (SUT) for Sub-IP Node Protection

                               +-----------+
   +---------------------------|  Tester   |<----------------------+
   |                           +-----------+                       |
   | IP Traffic                      | Failover        IP Traffic  |
   |                                 | Event                       |
   |            Primary Path         |                             |
   |     ------------      --------  |  --------     ----------    |
   +--->| Ingress/   |    |MidPoint| V |Midpoint|   | Egress/  |---+
        |Headend Node|----|  Node  |---|  Node  |---|Merge Node|
         ------------      --------     --------     ----------
               |                                        ^
               |         --------      --------         |   Backup
               +--------| Backup |----| Backup |--------+   Path
                        |  Node  |    |  Node  |
                         --------      --------

   Figure 3. System Under Test (SUT) for Sub-IP Path Protection

                        +-----------+
   +--------------------|  Tester   |<-------------------+
   |                    +-----------+                    |
   | IP Traffic               | Failover     IP Traffic  |
   |                          | Event                    |
   |                Primary   |                          |
   |    +--------+  Path      v            +--------+    |
   |    |        |------------------------>|        |    |
   +--->| Ingress|                         | Egress |----+
        |  Node  |- - - - - - - - - - - - >|  Node  |
        +--------+       Backup Path       +--------+
            ^                                   ^
            |        IP-Layer Forwarding        |
            +-----------------------------------+

   Figure 4. System Under Test (SUT) for Sub-IP Local Link Protection

                     +-----------+
   +-----------------|  Tester   |<--------------------+
   |                 +-----------+                     |
   | IP Traffic            | Failover      IP Traffic  |
   |                       | Event                     |
   |                       V                           |
   |     ---------      --------      ----------       |
   +--->| Ingress |    |Primary |    | Egress/  |------+
        |  Node   |----|  Node  |----|Merge Node|   Primary
         ---------      --------      ----------    Path
             |       State |Control        ^
             |   Interface |(Optional)     |
             |          ---------          |
             +---------| Standby |---------+
                       |  Node   |
                        ---------

   Figure 5. System Under Test (SUT) for Sub-IP Redundant Node Protection

Some protection switching technologies may use a series of steps that differ from the general model. The specific differences SHOULD be highlighted in each technology-specific methodology.
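As an illustration of the Tester requirement in the General Model (recording IP packet sequence numbers, departure times, and arrival times), the following Python sketch shows one way such records might be reduced to counts of lost, duplicate, and reordered packets. The record layout, the names Sent, Received, and summarize, and the simplified reordering rule (a packet counts as reordered when it arrives after a packet with a higher sequence number) are assumptions made for this example only; they are not defined by this document or by any methodology document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sent:
    seq: int          # IP packet sequence number assigned by the Tester
    departure: float  # departure time, in seconds (hypothetical unit)


@dataclass(frozen=True)
class Received:
    seq: int          # sequence number read from the received packet
    arrival: float    # arrival time, in seconds (hypothetical unit)


def summarize(sent, received):
    """Count lost, duplicate, and reordered packets from Tester records.

    Simplified rules assumed for this sketch:
      - lost:       sent sequence numbers that never arrived
      - duplicate:  an arrival whose sequence number was already seen
      - reordered:  a first arrival whose sequence number is lower than
                    the highest sequence number seen so far
    """
    sent_seqs = {s.seq for s in sent}
    seen = set()
    duplicates = 0
    reordered = 0
    highest = None
    # Walk the arrivals in the order the Tester observed them.
    for rec in sorted(received, key=lambda r: r.arrival):
        if rec.seq in seen:
            duplicates += 1
            continue
        seen.add(rec.seq)
        if highest is not None and rec.seq < highest:
            reordered += 1
        else:
            highest = rec.seq
    lost = len(sent_seqs - seen)
    return {"lost": lost, "duplicates": duplicates, "reordered": reordered}


if __name__ == "__main__":
    # Six packets offered; packet 3 is dropped, packet 4 arrives late
    # and is then duplicated.
    sent = [Sent(seq=i, departure=0.01 * i) for i in range(1, 7)]
    received = [
        Received(1, 0.10), Received(2, 0.20), Received(5, 0.50),
        Received(4, 0.60), Received(4, 0.70), Received(6, 0.80),
    ]
    print(summarize(sent, received))
    # prints {'lost': 1, 'duplicates': 1, 'reordered': 1}
```

Keeping the departure and arrival times alongside the sequence numbers, rather than only counters, is what would allow a methodology to derive time-based metrics from the same records.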
Note that some protection switching technologies are endowed with the ability to re-optimize the Working Path after a node or link failure.

2. Existing definitions

This document uses existing terminology defined in other BMWG work. Examples include, but are not limited to:

   Latency                     [Ref.[2], section 3.8]
   Frame Loss Rate             [Ref.[2], section 3.6]
   Throughput                  [Ref.[2], section 3.17]
   Device Under Test (DUT)     [Ref.[3], section 3.1.1]
   System Under Test (SUT)     [Ref.[3], section 3.1.2]
   Offered Load                [Ref.[3], section 3.5.2]
   Out-of-order Packet         [Ref.[4], section 3.3.2]
   Duplicate Packet            [Ref.[4], section 3.3.3]
   Forwarding Delay            [Ref.[4], section 3.2.4]
   Jitter                      [Ref.[4], section 3.2.5]
   Packet Loss                 [Ref.[7], section 3.5]
   Packet Reordering           [Ref.[10], section 3.3]

This document has the following frequently used acronyms:

   DUT   Device Under Test
   SUT   System Under Test

This document adopts the definition format in Section 2 of RFC 1242 [2]. Terms defined in this document are capitalized when used within this document.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14, RFC 2119 [5]. RFC 2119 defines the use of these key words to help make the intent of standards track documents as clear as possible. While this document uses these key words, this document is not a standards track document.

3. Test Considerations

3.1. Paths

3.1.1. Path

Definition:
   A unidirectional sequence of nodes and links with the following properties:

   a. R1 is the ingress node and forwards IP packets, which are input into the DUT/SUT, to R2 as sub-IP frames over link L12.

   b. Ri is a node which forwards data frames to R[i+1] over Link Li[i+1] for all i, 1