Internet Engineering Task Force I. Jarvinen INTERNET-DRAFT M. Kojo draft-ietf-tcpm-sack-recovery-entry-00.txt University of Helsinki Intended status: Standards Track 19 October 2009 Expires: April 2010 Using TCP Selective Acknowledgement (SACK) Information to Determine Duplicate Acknowledgements for Loss Recovery Initiation Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on April 2010. Copyright Notice Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Jarvinen/Kojo [Page 1] INTERNET-DRAFT Expires: April 2010 October 2009 Abstract This document describes a TCP sender algorithm to trigger loss recovery based on the TCP Selective Acknowledgement (SACK) information gathered on a SACK scoreboard instead of simply counting the number of arriving duplicate acknowledgements (ACKs) in the traditional way. The given algorithm is more robust to ACK losses, ACK reordering, missed duplicate acknowledgements due to delayed acknowledgements, and extra duplicate acknowledgements due to duplicated segments and out-of-window segments. The algorithm allows not only a timely initiation of TCP loss recovery but also reduces false fast retransmits. It has a low implementation cost on top of the SACK scoreboard defined in RFC 3517. Jarvinen/Kojo [Page 2] INTERNET-DRAFT Expires: April 2010 October 2009 Table of Contents 1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . 5 1.1. Conventions and Terminology. . . . . . . . . . . . . . . 6 1.2. Definitions. . . . . . . . . . . . . . . . . . . . . . . 6 2. Algorithm Details . . . . . . . . . . . . . . . . . . . . . . 6 3. Discussion. . . . . . . . . . . . . . . . . . . . . . . . . . 8 3.1. Small Segment Sender . . . . . . . . . . . . . . . . . . 8 3.2. One Segment is Small . . . . . . . . . . . . . . . . . . 10 3.3. SACK Capability Misbehavior. . . . . . . . . . . . . . . 10 3.4. Compatibility with Duplicate ACK based Loss Recovery Algorithms . . . . . . . . . . . . . . . . . . . . . 10 4. Security Considerations . . . . . . . . . . . . . . . . . . . 10 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 6. Acknowledgements. . . . . . . . . . . . . . . . . . . . . . . 11 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 A. Scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . 12 A.1. Basic Case . . . . . . . . . . . . . . . . . . . . . . . 12 A.2. Delayed ACK. . . . . . . . . . . . . . . . . . . . . . . 13 A.3. ACK Losses . . . . . . . . . . . . . . . . . . . . . . . 14 A.4. ACK Reordering . . . . . . . . . . . . . . . . . . . . . 14 A.5. Packet Duplication . . . . . . . . . . . . . . . . . . . 15 A.6. Mitigation of Blind Throughput Reduction Attack. . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 References . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 Normative References . . . . . . . . . . . . . . . . . . . . . . 16 Informative References . . . . . . . . . . . . . . . . . . . . . 16 AUTHORS' ADDRESSES . . . . . . . . . . . . . . . . . . . . . . . 17 Jarvinen/Kojo [Page 3] INTERNET-DRAFT Expires: April 2010 October 2009 TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION: Changes from draft-jarvinen-tcpm-sack-recovery-entry-01.txt * Clarified issues that based on feedback may cause confusion for the reader. * Incorporated handling of cumulative ACKs into the algorithm * 2581 refs -> 5681 * Added early-rexmt ID as a related one, it uses SACK information similar to this algorithm (Thanks to Anna Brunstrom). * More cases added where this algorithm is beneficial in taking advantage of SACK block redundancy (thanks to Anna Brunstrom). * Discuss on differences how duplicate ACK counter is managed (traditional vs. this algorithm) * Added ref and couple of words about blind throughput reduction attack * Wrote SACK splitting attacks. These attacks are quite close to the edge in significance. Should consider just dropping (rather insignificant). Changes from draft-jarvinen-tcpm-sack-recovery-entry-00.txt * TODO items embedded: Improvements with window update, clarify dupack counting * Modified ACK reordering scenario in appendix, shows now a scenario where recovery is triggered in a more timely manner. * IDnits * Handle small segments case using duplicate ACKs counter paraller to the SACK blocks based detection. * Add a placeholder for SACK splitting * Mentioned FACK as some ideas are inherited from there END OF SECTION TO BE DELETED. Jarvinen/Kojo [Page 4] INTERNET-DRAFT Expires: April 2010 October 2009 1. Introduction The Transmission Control Protocol (TCP) [RFC793] has two methods for triggering retransmissions. First, the TCP sender relies on incoming duplicate acknowledgements (ACKs) [RFC5681], indicating receipt of out-of-order segments at the TCP receiver. After receiving a required number of duplicate ACKs (usually three), the TCP sender retransmits the first unacknowledged segment and continues with a fast recovery algorithm such as Reno [RFC5681], NewReno [RFC3782] or SACK-based loss recovery [RFC3517]. Second, the TCP sender maintains a retransmission timer that triggers retransmission of segments, if the retransmission timer expires before the segments have been acknowledged. While the conservative loss recovery algorithm defined in [RFC3517] takes full advantage of SACK information during a loss recovery, it does not consider the very same information during the pre-recovery detection phase. Instead, it simply counts the number of arriving duplicate ACKs and leans on the number of duplicate ACKs in deciding when to enter loss recovery. However, this traditional heuristics of simply counting the number of duplicate ACKs to trigger a loss recovery fails in several cases to determine correctly the actual number of valid out-of-order segments the receiver has successfully received. First, trusting on duplicate ACKs alone utterly fails to get hold of the whole picture in case of ACK losses and ACK reordering, resulting in delayed or missed initiation of fast retransmit and fast recovery. Similarly, the delayed ACK mechanism tends to conceal the first duplicate ACK as the delayed cumulative ACK becomes combined with the first duplicate ACK when the first out-of-order segment arrives at the receiver (in case of an enlarged ACK ratio such as with ACK congestion control [FARI08], even more significant portion is affected). Second, segment duplication or out-of-window segments increase the risk of falsely triggering loss recovery as they trigger duplicate ACKs. At worst, this legitimate behavior on out-of-window segments can be turned into a blind throughput reduction attack [CPNI09]. Third, receiver window updates or opposite direction data segments cannot be counted as duplicate ACKs with the traditional approach but can still contain redundant SACK information that the sender could benefit from in a scenario where the actual duplicate ACKs where lost. The algorithm specified in this document uses TCP Selective Acknowledgement Option [RFC2018] to determine duplicate ACKs and to trigger loss recovery based on the information gathered on the SACK scoreboard [RFC3517]. It works in the pre-recovery state giving a more accurate heuristic for determining the number of out-of-order segments arrived at the TCP receiver. The information gathered on the scoreboard reveals missing ACKs and allows detecting duplicate Jarvinen/Kojo Section 1. [Page 5] INTERNET-DRAFT Expires: April 2010 October 2009 events. Therefore, the algorithm enables a timely triggering of Fast Retransmit. In addition, it allows the use of Limited Transmit [RFC3042] regardless of lost ACKs and also in the cases where the SACK information is piggybacked to a cumulative ACK due to delayed ACKs. This, in turn, allows keeping the ACK clock running more accurately. This algorithm is close to what Linux TCP implementation has used for a very long time when in conservative SACK mode. A similar approach is briefly mentioned along ACK congestion control [FARI08] but as the usefulness of the algorithm in this document is more general and not limited to ACK congestion control we specify it separately. We also note that the definition of a duplicate acknowledgement already suggests that an incoming ACK can be considered as a duplicate ACK if it "contains previously unknown SACK information" [RFC5681]. In addition, SACK information is used, whenever available, for similar purpose by Early Retransmit [AAA+09]. This algorithm also resembles Forward Acknowledgement (FACK) [MM96] but they differ in how the quantity of data outstanding in the network is determined. FACK always assumes that every non-SACKed octet below the highest SACKed octet is lost which is only true if no reordering occurs. Thus it would simply trigger loss recovery whenever the highest SACKed octet is more than dupThresh segments above SND.UNA. 1.1. Conventions and Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14, RFC 2119 [RFC2119] and indicate requirement levels for protocols. 1.2. Definitions The reader is expected to be familiar with the definitions given in [RFC5681], [RFC2018], and [RFC3517]. 2. Algorithm Details In order to use this algorithm, a TCP sender MUST have TCP Selective Acknowledgement Option [RFC2018] enabled and negotiated for the TCP connection. A TCP sender MUST maintain SACK information in an appropriate data structure such as scoreboard defined in [RFC3517]. Jarvinen/Kojo Section 2. [Page 6] INTERNET-DRAFT Expires: April 2010 October 2009 This algorithm uses functions IsLost (SeqNum), Update(), and SetPipe () and variables DupThresh, HighData, HighRxt, Pipe, and RecoveryPoint, as defined in [RFC3517]. A TCP sender using this algorithm MUST take following steps: 1) Upon the receipt of any ACK containing SACK information: If no previous loss event has occurred on the connection OR RecoveryPoint is less than SND.UNA (the oldest unacknowledged sequence number [RFC793]), continue with the other steps of this algorithm. Otherwise, continue the ongoing loss recovery. 2) Update the scoreboard via the Update () function as outlined in [RFC3517]. 3) If ACK is a cumulative ACK, reset duplicate ACK counter to zero. 4) If ACK contains SACK blocks with previously unknown in-window (i.e., between SND.UNA and HighData, assuming SND.UNA has been updated from the acknowledgment number of the ACK) SACK information, increase duplicate ACK counter. 5) Determinate if a loss recovery should be initiated: If IsLost(SND.UNA) returns false AND the sender has received less than DupThresh duplicate ACKs, goto step 6A. Otherwise goto step 6B. 6A) Invoke optional Limited Transmit: Set HighRxt to SND.UNA and run SetPipe(). The TCP sender MAY transmit previously unsent data segments according the guidelines of Limited Transmit [RFC3042], with the exception that the amount of octets that can be send is determined by Pipe and cwnd. If cwnd - pipe >= 1 SMSS, the TCP sender can transmit one or more segments as follows: Send Loop: a) If available unsent data exists and the receiver's advertised window allows, transmit one segment of up to SMSS octets of previously unsent data starting with sequence number HighData+1 and update HighData to reflect the transmission of the data segment. Otherwise, exit Send Loop. Jarvinen/Kojo Section 2. [Page 7] INTERNET-DRAFT Expires: April 2010 October 2009 b) Run SetPipe() to re-calculate the number of outstanding octets in the network. If cwnd - pipe >= 1 SMSS, go to step a) of Send Loop. Otherwise, exit Send Loop. 6B) Invoke Fast Retransmit and enter loss recovery: Initiate a loss recovery phase, per the fast retransmit algorithm outlined in [RFC5681] and continue with a fast recovery algorithm, such as the SACK-based loss recovery algorithm outlined in [RFC3517]. 3. Discussion In scenarios where no ACK losses nor reordering occur and the first acknowledgement with SACK information is not the ACK held due to delayed acknowledgements mechanism, the new SACK information with each duplicate ACK covers a single segment. In such a case, this algorithm will trigger loss recovery after three duplicate acknowledgements and will allow transmission of a single new segment using Limited Transmit on the first and second duplicate ACK. This is identical to the behavior that would occur without this algorithm (assuming DupThresh is 3 and that all segments are SMSS sized). This scenario together with other scenarios describing the behavior of the algorithm are depicted in Appendix A. This algorithm SHOULD be used also with an ACK that contains a window update or opposite direction data that could not be considered as a duplicate ACK in the traditional algorithm. Such behavior is safe because the SACK information can only add more information to the current state of the sender; at worst, all received information is just redundant. Setting HighRxt to SND.UNA in Step 6A has no direct relation to this algorithm. Yet it is included in the algorithm to avoid confusion in how to implement SetPipe() correctly because it depends on having a valid HighRxt value [RFC3517]. A set of potential issues to consider with the algorithm are discussed in the following. 3.1. Small Segment Sender If a TCP sender is sending small segments (usually intentionally overriding Nagle algorithm [RFC896]), the IsLost(SND.UNA) used in step 5 of the algorithm might fail to detect the need for loss Jarvinen/Kojo Section 3.1. [Page 8] INTERNET-DRAFT Expires: April 2010 October 2009 recovery on the third duplicate acknowledgement because not enough octets have been SACKed to cover DupThresh * SMSS bytes above SND.UNA. Therefore, the traditional duplicate ACK algorithm is needed as a fallback. Steps 3, 4 and the latter condition of step 5 implement the traditional algorithm in paraller to the SACK block based detection. The number of duplicate ACKs is an artificial metric to estimate the number of segments the receiver has already in its receive buffer. How accurately they match depends on the scenario. Because of that, the goal of the duplicate ACK counter included into this algorithm is not to achieve bug-to-bug compatibility with the plain duplicate ACK counter but to estimate how many out-of-order segments the receiver has already queued in a more accurate way. Therefore, the duplicate ACK counter used as a fallback mechanism in this algorithm differs from the plain duplicate ACK counter. However, such differences indicate a scenario where the plain counter was not able to accurately keep track of the receiver state. While the fallback algorithm itself does not look into acknowledgment field in order to make a decision whether ACK is a "duplicate ACK", the duplicate ACK counter is not renamed in this document as in practice most of ACKs that increment the counter would still contain a duplicate acknowledgment number. In contrast to the traditional approach, only condition that must be satisfied to increment the duplicate ACK counter with this algorithm is that the acknowledgement MUST contain at least one in-window SACK block that covers octets that where not previously SACKed [RFC5681]. In cases with ACK losses or delayed ACKs this condition can also match to cumulative ACKs, receiver window updates and opposite direction data segments but still the counter can safely be incremented. Alternatively to the fallback algorithm, a TCP sender that is able to discern segment boundaries accurately can consider full segments in IsLost() regardless of segment size. Therefore, such a TCP sender can avoid the problem with small segments using IsLost(SND.UNA) check alone which means that Steps 3, 4 and the latter condition of step 5 are redundant and do not have to be implemented. Note: the small segments problem is not unique to this algorithm but also the SACK-based loss recovery [RFC3517] encounters it because of how IsLost() is defined. Jarvinen/Kojo Section 3.1. [Page 9] INTERNET-DRAFT Expires: April 2010 October 2009 3.2. One Segment is Small A variant of small segment sender case is the case where only one of the SACKed segments is smaller than SMSS (possible even with Nagle enabled). If TCP sender lacks ability to use the improved method by discerning segment boundaries but still wants robustness against ACK losses in this case, it MAY extend the condition in Step 5 with the test: SACKed octets > SMSS * (DupThresh - 1) 3.3. SACK Capability Misbehavior If the receiver represents such a SACK misbehavior that it advertises SACK capability but never sends any SACK blocks when it should, this algorithm fails to enter loss recovery and retransmission timeout is required for recovery. However, such misbehavior does not allow SACK-based loss recovery [RFC3517] to work either, and a TCP sender will anyway require a timeout to recover. 3.4. Compatibility with Duplicate ACK based Loss Recovery Algorithms This algorithm SHOULD NOT be used together with a fast recovery algorithm that determines the segments that have left the network based on the number of arriving duplicate acknowledgements (e.g., NewReno [RFC3782]), instead of the actual segments reported by SACK. In presence of ACK reordering such an algorithm will count the delayed duplicate acknowledgements during the fast recovery algorithm as extra while determining the number of packets that have left the network. In general there should be very little reason to combine this algorithm with a loss recovery algorithm that is based on inferior, non-SACK based information only. 4. Security Considerations A malicious TCP receiver may send false SACK information for sequence number ranges which it has not received in order to trigger Fast Retransmit sooner. Such behavior would only be useful when out- of-order segments have arrived because otherwise the flow undergoes a loss recovery with a window reduction. This kind of lying involves guessing which segments will arrive later. In case the guess was wrong, the performance of the flow is ruined because the TCP sender Jarvinen/Kojo Section 4. [Page 10] INTERNET-DRAFT Expires: April 2010 October 2009 will need a retransmission timeout as it will not retransmit the segments until it assumes SACK reneging. On a successful guess the attacker is able to trigger the recovery slightly earlier. The later segments would have allowed reporting the very same regions with SACK anyway. Therefore, the gain from this attack is small, hardly justifiable considering the drastic effect of a misguess. Also, a similar attack can be made with the duplicate acknowledgment based algorithm (even if the new SACK information rule is applied) by sending false duplicate acknowledgements with false SACK ranges, and trivially without the new SACK information rule. A variation of the lying attack discards reliability of the flow but as soon as the reliability is not a concern of the receiver, a number of simpler ways exist to attack TCP independently of this algorithm. Thus this algorithm is not considered to weaken TCP security properties against false information. Splitting SACK blocks into a smaller than the received segment sized chunks allows the receiver to enable recovery to start sooner because of IsLost() discontiguous check. However, by doing so the receiver neglects the possiblity of reordering for a little gain. If the segment was just reordered, the sender performs unnecessary window reduction and unnecessary retransmission of the reordered segment. Another variant of SACK block splitting simply tries to increase consumption of bandwidth but with small dupThresh value such as three the difference between sending three duplicate ACKs (traditional algorithm) and a single ACK with SACK blocks will not offer significant benefits to make such attack practical. In case the sender keeps track of segment boundaries and applies them in IsLost(), these attack will not succeed as the sender cannot be mislead to believe that a segment was split into multiple chunks. 5. IANA Considerations This document has no actions for IANA. 6. Acknowledgements The authors would like to thank Alexander Zimmermann and Anna Brunstrom for the comments on this document. Appendix Jarvinen/Kojo [Page 11] INTERNET-DRAFT Expires: April 2010 October 2009 A. Scenarios A.1. Basic Case In this scenario no Delayed ACK, ACK losses, reordering or other "abnormal" behavior happens. For simplicity all the segments are SMSS sized. Once the TCP receiver gets first out-of-order segment, it sends a duplicate ACK with SACK information about the received octets. The following two out-of-order segments trigger a duplicate ACK each, with the corresponding range SACKed in addition to the previously know information. The sender gets those duplicate ACKs in-order, each of them will SACK a new previously unknown segment. This algorithm triggers loss recovery on third duplicate ACK because IsLost returns true as DupThresh * SMSS bytes became SACKed above the SND.UNA on the same acknowledgement, thus the behavior is identical to that of a sender which is using duplicate acknowledgments. If Limited Transmit is in use, two first duplicate ACKs allow a single segment to be sent with either of the algorithms (Pipe is decremented by SMSS by the SACKed octets per ACK allowing SMSS worth of new octets). ACK Transmitted Received ACK Sent Received Segment Segment (Including SACK Blocks) 1000 3000-3499 3000-3499 (delayed ACK) 3500-3999 3500-3999 4000 2000 4000-4499 (dropped) 4500-4999 4500-4999 4000, SACK=4500-5000 3000 5000-5499 5000-5499 4000, SACK=4500-5500 5500-5999 5500-5999 4000, SACK=4500-6000 4000 6000-6499 6000-6499 4000, SACK=4500-6500 6500-6999 6500-6999 4000, SACK=4500-7000 4000, SACK=4500-5000 7000-7499 7000-7499 4000, SACK=4500-7500 4000, SACK=4500-5500 7500-7999 7500-7999 4000, SACK=4500-8000 4000, SACK=4500-6000 4000-4499 4000-4499 8000 4000, SACK=4500-6500 Jarvinen/Kojo Section A.1. [Page 12] INTERNET-DRAFT Expires: April 2010 October 2009 A.2. Delayed ACK A basic case with delayed ACK send the first ACK with SACK information but since the previous ACK was sent with a lower sequence number because an acknowledgment is held by delayed ACK, the sender will not considered it as duplicate ACK. Because the segment contains SACK information that is identical to the basic case, the sender can use Limited Transmit with the same segments as in the basic case and will start loss recovery at the third acknowledgment, i.e., with the second duplicate acknowledgment. In the same situation the duplicate ACK based sender will have to wait for one more duplicate ACK to arrive to do the same as the first acknowledgment is fully "wasted". Technically an acknowledgement with a sequence number higher than what was previously acknowledged is not a duplicate acknowledgement but a presence of the SACK block tells another story revealing the receiver which used delayed ACK, and thus the missing duplicate acknowledgement in between. The response of a TCP sender taking advantage of such inferred duplicate acknowledgements is well within the guidelines of packet conservation principle [Jac88] as it still sends only when segments have left the network. ACK Transmitted Received ACK Sent Received Segment Segment (Including SACK Blocks) 1500 3000-3499 3000-3499 3500 3500-3999 3500-3999 (delayed ACK) 2500 4000-4499 (dropped) 4500-4999 4500-4999 4000, SACK=4500-5000 3500 5000-5499 5000-5499 4000, SACK=4500-5500 5500-5999 5500-5999 4000, SACK=4500-6000 4000, SACK=4500-5000 6000-6499 6000-6499 4000, SACK=4500-6500 6500-6999 6500-6999 4000, SACK=4500-7000 4000, SACK=4500-5500 7000-7499 7000-7499 4000, SACK=4500-7500 4000, SACK=4500-6000 4000-4499 4000-4499 7500 4000, SACK=4500-6500 Jarvinen/Kojo Section A.2. [Page 13] INTERNET-DRAFT Expires: April 2010 October 2009 A.3. ACK Losses This case with ACK loss shares much behavior with the case with delayed ACK. If hole at rcv.nxt is filled, the sender will notice that cumulative ACK advanced. In case of out-of-order segments the first ACK which gets through to the sender includes SACK blocks up to the quantity the SACK block redundancy is able to cover. With this algorithm the sender immediately takes use of all the information that is made available by the incoming ACK. ACK Transmitted Received ACK Sent Received Segment Segment (Including SACK Blocks) 1000 3000-3499 3000-3499 (delayed ACK) 3500-3999 3500-3999 4000 2000 4000-4499 (dropped) 4500-4999 4500-4999 4000, SACK=4500-5000 (dropped) 3000 5000-5499 5000-5499 4000, SACK=4500-5500 5500-5999 5500-5999 4000, SACK=4500-6000 4000 6000-6499 6000-6499 4000, SACK=4500-6500 6500-6999 6500-6999 4000, SACK=4500-7000 4000, SACK=4500-5500 (two segments left the network) 7000-7499 7000-7499 4000, SACK=4500-7500 7500-7999 7500-7999 4000, SACK=4500-8000 4000, SACK=4500-6000 4000-4499 4000-4499 8000 4000, SACK=4500-6500 A.4. ACK Reordering With ACK reordering an ACK is postponed. Due to redundancy the next ACK after postponed one contains not only its own information but also the information of the reordered ACK (similar to the ACK losses case). Then when the reordered ACK arrives, the sender already knows about the information it provides and therefore no actions are taken with this algorithm. ACK Transmitted Received ACK Sent Received Segment Segment (Including SACK Blocks) 1000 3000-3499 3000-3499 (delayed ACK) Jarvinen/Kojo Section A.4. [Page 14] INTERNET-DRAFT Expires: April 2010 October 2009 3500-3999 3500-3999 4000 2000 4000-4499 (dropped) 4500-4999 4500-4999 4000, SACK=4500-5000 (delayed) 3000 5000-5499 5000-5499 4000, SACK=4500-5500 5500-5999 5500-5999 4000, SACK=4500-6000 4000 6000-6499 6000-6499 4000, SACK=4500-6500 6500-6999 6500-6999 4000, SACK=4500-7000 4000, SACK=4500-5500 7000-7499 7000-7499 4000, SACK=4500-7500 7500-7999 7500-7999 4000, SACK=4500-8000 4000, SACK=4500-6000 4000-4499 4000-4499 8000 4000, SACK=4500-5000 (has only redundant information) 4000, SACK=4500-6500 A.5. Packet Duplication Packet duplication happens either due to unnecessary retransmission or hardware duplication. It adds a redundant ACK which has only redundant information or a data segment to the stream which will triggers a redundant duplicate ACK (possibly with SACK and/or DSACK [RFC2883] information). Because neither adds any new SACKed octets at the sender, this algorithm will not do anything while duplicate ACK based receiver would falsely consider it as a duplicate ACK. If one of the redundant ACKs is lost, the effect of duplication is just negated. It is possible for the sender to detect this case using DSACK alone. A.6. Mitigation of Blind Throughput Reduction Attack In case an attacker knows or is able to guess 4-tuple of a TCP connection, it may apply a blind throughput reduction attack [CPNI09]. In this attack TCP is tricked to send duplicate ACK to the other endpoint using out-of-window segments which it is considerably easier to achieve than a match with sequence numbers. If more than dupThresh duplicate ACKs can be triggered in row without any legimate segment that advances acknowledged sequence number, the other end acts according that false congestion signal and halves the window. Jarvinen/Kojo Section A.6. [Page 15] INTERNET-DRAFT Expires: April 2010 October 2009 With this algorithm such duplicate ACKs are filtered because they do not have any new in-window SACK blocks (DSACK [RFC2883] might be present though). References Normative References [RFC793] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, September 1981. [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP Selective Acknowledgment Options", RFC 2018, October 1996. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC3042] Allman, M., Balakrishnan, H., and S. Floyd, "Enhancing TCP's Loss Recovery Using Limited Transmit", RFC 3042, January 2001. [RFC3517] Blanton, E., Allman, M., Fall, K., and L. Wang, "A Conservative Selective Acknowledgment (SACK)-based Loss Recovery Algorithm for TCP", RFC 3517, April 2003. [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion Control", RFC 5681, September 2009. Informative References [AAA+09] Allman, M., Avrachenkov, K., Ayesta, U., Blanton, J., and P. Hurtig, "Early Retransmit for TCP and SCTP", Internet-Draft, draft-ietf-tcpm-early-rexmt-01, January 2009. [CPNI09] Security Assessment of the Transmission Control Protocol (TCP). Available at: http://www.cpni.gov.uk/Docs/tn-03-09-security-assessment- TCP.pdf [FARI08] Floyd, S., Arcia, A., Ros, D., and J. Iyengar, "Adding Acknowledgement Congestion Control to TCP", Internet-Draft, draft-floyd-tcpm-ackcc-06, July 2009. Jarvinen/Kojo [Page 16] INTERNET-DRAFT Expires: April 2010 October 2009 [Jac88] Jacobson, V., "Congestion Avoidance and Control", In Proc. ACM SIGCOMM 88. [MM96] M. Mathis, J. Mahdavi, "Forward Acknowledgment: Refining TCP Congestion Control," Proceedings of SIGCOMM'96, August 1996, Stanford, CA. [RFC896] Nagle, J., "Congestion Control in IP/TCP Internetworks", RFC 896, January 1984. [RFC2883] Floyd, S., Mahdavi, J., Mathis, M., and M. Podolsky, "An Extension to the Selective Acknowledgement (SACK) Option for TCP", RFC 2883, July 2000. [RFC3782] Floyd, S., Henderson, T., and A. Gurtov, "The NewReno Modification to TCP's Fast Recovery Algorithm", RFC 3782, April 2004. AUTHORS' ADDRESSES Ilpo Jarvinen University of Helsinki P.O. Box 68 FI-00014 UNIVERSITY OF HELSINKI Finland Email: ilpo.jarvinen@helsinki.fi Markku Kojo University of Helsinki P.O. Box 68 FI-00014 UNIVERSITY OF HELSINKI Finland Email: kojo@cs.helsinki.fi Jarvinen/Kojo [Page 17]