The Need for Congestion Exposure in the Internet

   T. Moncaster, BT <toby.moncaster@bt.com>
   L. Burness, BT <louise.burness@bt.com>
   M. Menth, University of Wuerzburg <menth@informatik.uni-wuerzburg.de>
   J. Araujo, UCL <j.araujo@ee.ucl.ac.uk>
   S. Blake, Extreme Networks <sblake@extremenetworks.com>
   R. Woundy, Comcast <richard_woundy@cable.comcast.com>

   Area: Transport
   Keywords: Congestion Exposure
Today's Internet is a product of its history. TCP is the main transport protocol
responsible for sharing out bandwidth and preventing a recurrence of congestion
collapse while packet drop is the primary signal of congestion at bottlenecks.
Since packet drop (and increased delay) impacts all their customers negatively,
network operators would like to be able to distinguish between overly aggressive
congestion control and a confluence of many low-bandwidth, low-impact flows.
But they are unable to see the actual congestion signal, so they have to impose
bandwidth and/or usage limits based on the only information they can see or
measure: the contents of the packet headers and the rate of the traffic. Such
measures don't solve the packet-drop problem effectively and are leading to
calls for government regulation (which also won't solve the problem).
We propose congestion exposure as a possible solution: packets carry an
accurate prediction of the congestion they expect to cause downstream, making
that congestion visible to ISPs and network operators. This memo sets out the
motivations for congestion exposure and introduces a strawman protocol
designed to achieve it.
The Internet has grown from humble origins to become a global phenomenon with
billions of end-users able to share the network and exchange data and more.
One of the key elements in this success has been the use of distributed algorithms
such as TCP that share capacity while avoiding congestion collapse. These algorithms
rely on the end-systems altruistically reducing their transmission rate in response
to any congestion they see.
In recent years ISPs have seen a minority of users taking a
larger share of the network by using applications that transfer data
continuously for hours or even days at a time and even opening multiple
simultaneous TCP connections. This issue became prevalent with the
advent of "always on" broadband connections. Peer-to-peer protocols have
frequently been held responsible, but streaming video
traffic is becoming increasingly significant. In order to improve
the network experience for the majority of their customers, many ISPs
have chosen to impose controls on how their network's capacity is shared
rather than continually buying more capacity. They calculate that most customers will
be unwilling to contribute to the cost of extra shared
capacity if that will only really benefit a minority of users.
Approaches include volume counting or charging, and application rate
limiting. Typically these traffic controls, whilst not impacting
most customers, set a restriction on a customer's level of network
usage, as defined in a "fair usage policy".
We believe that such traffic controls seek to control the wrong
quantity. What matters in the network is neither the volume of
traffic nor the rate of traffic, it is the contribution to congestion over
time - congestion means that your traffic impacts other users, and
conversely that their traffic impacts you. So if there is no congestion
there need not be any restriction on the amount a user can
send; restrictions only need to apply when others are sending
traffic such that there is congestion. In fact some of the current work at
the IETF and IRTF already reflects
this thinking. For example, an application intending to transfer large
amounts of data could use LEDBAT [LEDBAT] to try to reduce its transmission
rate before any competing TCP flows do, by detecting an increase in end-to-end
delay (as a measure of incipient congestion). However these techniques
rely on voluntary, altruistic action by end users and their application
providers. ISPs cannot enforce their use. This leads to our second point.
The Internet was designed so that end-hosts detect and control congestion. We
believe that congestion needs to be visible to network nodes as well,
not just to the end hosts. More specifically, a network needs to be able to measure how much congestion traffic causes
between the monitoring point in the network and the destination ("rest-of-path congestion"). This would be a new capability;
today a network can use explicit congestion notification (ECN) to
detect how much congestion traffic has suffered between the source and a monitoring point in the
network, but not beyond. Such a capability would enable an ISP to give incentives for the unrestricted
use of LEDBAT-like applications whilst perhaps restricting excessive use of TCP and UDP ones.
So we propose a new approach which we call congestion exposure. We
propose that congestion information should be made visible at the IP
layer, so that any network node can measure the contribution to congestion of
an aggregate of traffic as easily as straight volume can be measured today.
Once the information is exposed in this way, it is then
possible to use it to measure the true impact of any traffic on the
network. Lacking the ability to see congestion, some ISPs count the volume
each user transfers. On this basis LEDBAT applications would get blamed for
hogging the network given the large amount of volume they transfer. However,
because they yield rather than hog, they actually contribute very little to
congestion. One use of exposed congestion information would be to measure the
congestion attributable to a given user, and thereby incentivise the use of
protocols such as LEDBAT [LEDBAT] which aim to reduce the congestion caused
by bulk data transfers.
Creating the incentive to deploy low-congestion protocols such as LEDBAT is just one of many motivations for
congestion exposure. In general, congestion exposure gives ISPs a principled way
to hold their customers accountable for the impact on others of their network
usage and their choices of applications. It can measure the impact of an
individual consumer, a large enterprise network or the traffic crossing a border
from another ISP – anywhere where volume is used today as a (poor) measure of
usage. Later in this document a range of potential use cases for congestion exposure is given,
showing that it is possible to imagine a wide range of other ways to use the exposed congestion information.
Throughout this document we refer to congestion repeatedly. Congestion has a
wide range of definitions. For the purposes of this document it is defined by
the simplest way that it can be measured: the instantaneous fraction of loss.
More precisely, congestion is bits lost divided by bits sent, taken over any
brief period. By extension, if explicit congestion notification (ECN) is
being used, the fraction of bits marked (rather than lost) gives a useful
metric that can be thought of as analogous to congestion. Strictly,
congestion should measure impairment, whereas ECN aims to avoid any loss or
delay impairments due to congestion. But for the purposes of this document,
both will be called congestion.
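As a worked illustration (function and counter names are ours, not part of
any protocol), the congestion level over an interval can be computed directly
from byte counters, treating ECN marks the same way as losses:

   # Congestion as defined in this document: bytes lost (or ECN-marked)
   # divided by bytes sent, taken over a brief measurement interval.

   def congestion_fraction(bytes_sent: int, bytes_lost: int,
                           bytes_marked: int = 0) -> float:
       """Instantaneous congestion level over one measurement interval."""
       if bytes_sent == 0:
           return 0.0
       return (bytes_lost + bytes_marked) / bytes_sent

   # Example: a queue that handles 1,250,000 bytes in 100 ms and drops
   # 2,500 of them is 0.2% congested over that interval.
   print(congestion_fraction(bytes_sent=1_250_000, bytes_lost=2_500))  # 0.002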
We also need to define two specific terms carefully:
Upstream congestion is defined as the congestion
that has already been experienced by a packet as it travels along its path. In other words at
any point on the path it is the congestion between that point and the source of the packet.
Downstream congestion is defined as the congestion
that a packet still has to experience on the remainder of its path. In other words, at any
point it is the congestion still to be experienced as the packet travels between that point
and its destination.
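For illustration (with invented numbers): if the whole-path congestion level
for a flow is 2% and, at some monitoring point, its packets have so far
accumulated upstream congestion of 0.5%, then the downstream congestion from
that point is 2% - 0.5% = 1.5%. This subtraction is the basis of the strawman
protocol described later in this document.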
Abstract re-written again following comments from John Leslie.
Use Cases Section re-written.
Security Considerations section improved.
This ChangeLog added.
NOTE: there are still sections needing more work, especially the Use Cases. The whole document also needs
trimming in places and checking for repetition or omission.
Extensive changes throughout the document:
Abstract and Introduction re-written.
The Problem section re-written and extended significantly.
Why Now? Section re-written and extended.
Requirements extended.
Security Considerations expanded.
Other less major changes throughout.
Significant changes throughout including re-organising the main structure.
New Abstract and changes to Introduction.
The problem is not congestion itself. The problem is how best to share available
capacity. When too much traffic meets too little capacity, congestion occurs.
Then we have to share out what capacity there is. But we should not (and cannot) solve the
capacity sharing problem by trying to make it go away, by insisting that there
should somehow be no congestion, whether through slower traffic or more capacity. That
misses the whole point of the Internet: to multiplex or
share available capacity at maximum utilisation.
So as we say, the problem is not congestion in itself. Every elastic data
transfer should (and usually will) congest a healthy data network. If it doesn't,
its transport protocol is broken. There should always be periods approaching
100% utilisation at some link along every data path through the Internet,
implying that frequent periods of congestion are a healthy sign. If transport
protocols are too weak to congest capacity, they are under-utilising it and
hanging around longer than they need to, reducing the capacity available for the
next data transfers that might be about to start.
Some say the problem is that ISPs should invest in more capacity. Certainly
increasing capacity should make the congested periods during data transfers
shorter and the non-congested gaps between them longer. The argument goes that
if capacity were large enough it would make the periods when there is a capacity
sharing problem insignificant and not worth solving.
Yet, ISPs are facing a quandary - traffic is growing rapidly and traffic
patterns are changing significantly (see [Ofcom08] and
[CiscoVNI]). They know that any increases in capacity will
have to be paid for by all their customers but capacity growth will be of most
benefit to the heaviest users. Faced with these problems, some ISPs are seeking to
reduce what they regard as "heavy usage" in order to improve the service experienced
by the majority of their customers.
If done properly, managing traffic should be a valid alternative to increasing
capacity. An ISP's customers can vote with their feet if the ISP chooses the
wrong balance between making the service more expensive or managing heavy
traffic. Current traffic management techniques
fight against the capacity shares that TCP is aiming for. Ironically, they try to
impose something approaching LEDBAT-like behaviour on heavier flows. But as
we have seen, they cannot give LEDBAT the credit for doing this itself –
the network just sees a LEDBAT flow as a large amount of volume.
Thus the problem for the IETF is to ensure that ISPs and their equipment
suppliers have appropriate protocol support – not just to impose good capacity
sharing themselves, but to encourage end-to-end protocols to share out capacity
in everyone's best interests.
Unfortunately ISPs are only able to see limited information about the
traffic they forward. As we will see below, they are forced to use
the only information they do have available, which leads to myopic control
that has scant regard for the actual impact of the traffic or the underlying network
conditions. All their approaches are unsound because they cannot measure the most useful
metric. The volume or rate of a given flow or aggregate doesn’t directly affect other users,
but the congestion it causes does. This can be seen with a simple illustration.
A 5Mbps flow in an otherwise empty 10Mbps bottleneck causes no congestion and so
affects no other users. By contrast a 1Mbps flow entering a 10Mbps bottleneck that
is already fully occupied causes significant congestion and impacts every other user
sharing that bottleneck as well as suffering impairment itself. So the real problem that needs to be addressed is how to
close this information gap. How can we expose congestion at the IP layer so that it
can be used as the basis for measuring the impact of any traffic on the network
as a whole?
Explicit Congestion Notification (ECN) allows routers to
explicitly tell end-hosts that they are approaching the point of
congestion. ECN builds on active queue management (AQM) mechanisms such as random
early discard (RED) by allowing the router to mark a packet
with a Congestion Experienced (CE) codepoint, rather than dropping
it. The probability of a packet being marked increases with the
length of the queue and thus the rate of CE marks is a guide to the
level of congestion at that queue. This CE codepoint travels forward
through the network to the receiver which then informs the sender
that it has seen congestion. The sender is then required to respond
as if it had experienced a packet loss. Because the CE codepoint is
visible in the IP layer, this approach reveals the upstream
congestion level for a packet.
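To make the mechanism concrete, here is a much-simplified sketch of RED-style
ECN marking (thresholds and probabilities are invented, and real AQM works on
a smoothed average queue length with more machinery than shown):

   import random

   # Simplified RED-style AQM: the marking probability grows with the
   # queue length, so the rate of CE marks reflects the level of
   # congestion at this queue. ECN-capable packets are marked rather
   # than dropped.

   MIN_THRESH = 5_000    # bytes of queue below which nothing is marked
   MAX_THRESH = 15_000   # bytes of queue above which everything is marked
   MAX_PROB = 0.1        # marking probability as queue reaches MAX_THRESH

   def handle_packet(queue_bytes: int, ecn_capable: bool) -> str:
       """Return 'forward', 'mark' (set CE codepoint) or 'drop'."""
       if queue_bytes < MIN_THRESH:
           return "forward"
       if queue_bytes >= MAX_THRESH:
           return "mark" if ecn_capable else "drop"
       p = MAX_PROB * (queue_bytes - MIN_THRESH) / (MAX_THRESH - MIN_THRESH)
       if random.random() < p:
           return "mark" if ecn_capable else "drop"
       return "forward"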
So Is ECN the Solution? Alas not - ECN does allow downstream nodes to measure the
upstream congestion for any flow, but this is not enough. This can make a receiver
accountable for the congestion caused by incoming traffic. But a receiver can only control incoming
congestion indirectly, by politely asking the sender to control it. A receiver
cannot make a sender install an adaptive codec, or install LEDBAT instead of TCP.
And a receiver cannot ask an attacker to stop flooding it with traffic.
What is needed is knowledge of the downstream congestion level for which you need
additional information that is still concealed from the network - by design.
Existing approaches intended to address the problems outlined above can be broadly divided into
two groups: those that passively monitor traffic, and can thus measure the apparent impact of a
given flow of packets, and those that actively discriminate against certain
packets, flows, applications or users based on various characteristics or metrics.
L3 measurement of traffic relies on using the information that
can be measured directly or is revealed in the IP header of the
packet (or lower layers). Architecturally, L3 measurement is best since it
fits with the idea of the hourglass design of the Internet [RFC3439].
This asserts that "the complexity of the Internet belongs at the
edges, and the IP layer of the Internet should remain as simple as possible."
Volume accounting is a technique that is often used to discriminate between heavy and
light users. The volume of traffic sent by a given user or network is one of the easiest
pieces of information to monitor in a network. Measuring the size of every packet from the
header and adding them up is a simple operation. Consequently this has long been a favoured
measure used by operators to control their customers.
The precise manner in which this volume information is used may vary. Typically ISPs may
impose an overall volume cap on their customers (perhaps 10Gbytes a month). Alternatively
they may decide that the heaviest users each month are subjected to some sanction.
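A minimal sketch of per-customer volume accounting (the cap and all names are
hypothetical) shows how little information this technique uses: only packet
sizes are examined.

   from collections import defaultdict

   # Per-customer volume accounting: add up packet sizes and compare the
   # total against a monthly cap. Note that nothing here reflects whether
   # the network was congested when the bytes were sent.

   MONTHLY_CAP_BYTES = 10 * 10**9  # hypothetical 10 Gbyte monthly cap

   volume = defaultdict(int)       # customer id -> bytes so far this month

   def account(customer: str, packet_len: int) -> bool:
       """Count one packet; return False once the customer exceeds the cap."""
       volume[customer] += packet_len
       return volume[customer] <= MONTHLY_CAP_BYTES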
Volume is naively thought to indicate the impact that one party's traffic has
on others. But the same volume can cause very different impacts on others if it
is transferred at slightly different times, or between slightly different endpoints.
Also the impact on others greatly depends on how responsive the transport is to
congestion, whether responsive (TCP), very responsive (LEDBAT), aggressive (multiple TCPs)
or totally unresponsive.
Traffic rates are often used as the basis of accounting at borders between ISPs. For
instance a contract might specify a charge based on the 95th percentile of the peak
rate of traffic crossing the border every month. Such bulk rate measurements are usually
simplified by measuring traffic volume over a short period, such as 15 minutes. With a little extra
effort it is possible to measure the rate of a given flow by using the 3-tuple of source
and destination IP address and protocol number.
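As an illustration of such bulk rate measurement (the interval length and all
names are ours), the 95th-percentile rate can be derived from per-interval
volume samples as follows:

   # 95th-percentile billing sketch: record the volume crossing a border
   # in fixed intervals (here 15 minutes), convert each sample to a rate,
   # then read off the highest rate after discarding the top 5% of samples.

   INTERVAL_SECS = 15 * 60

   def percentile_95_rate(volumes_bytes: list[int]) -> float:
       """Return the 95th-percentile rate in bits per second."""
       rates = sorted(v * 8 / INTERVAL_SECS for v in volumes_bytes)
       index = max(int(len(rates) * 0.95) - 1, 0)
       return rates[index]

   # A month of 15-minute samples gives about 2880 values; the charge is
   # then based on the rate at index 2735, ignoring the 144 busiest
   # intervals.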
Rate measurements might be thought indicative of the impact of one aggregate of traffic on
others, and rate is often limited to avoid impact on others. However such limits
generally constrain everyone much more than they need to, just in case most
parties send fast at the same time. And such limits constrain everyone too
little at other times, when everyone actually does send fast at the same time.
Over recent years a number of traffic management techniques have emerged that explicitly differentiate
between different traffic types, applications and even users. This is done because ISPs and operators
feel they have a need to use such techniques to better control a new raft of applications that
break some of the implicit design assumptions behind TCP (short-lived flows, limited flows per connection,
generally between server and client).
Bottleneck flow rate policers such as XCHOKe [XCHOKe]
have been proposed as approaches for rate policing traffic. But they must be deployed at bottlenecks in order to work.
Unfortunately, a large proportion of traffic traverses at least two bottlenecks which limits the utility of this
approach. Such rate policers also make an assumption about what constitutes acceptable per-flow behaviour. If these
bottleneck policers were widely deployed, the Internet could find itself with one universal rate adaptation
policy embedded throughout the network. With TCP's congestion control algorithm approaching its scalability
limits as the network bandwidth continues to increase, new algorithms are being developed for high-speed
congestion control. Embedding assumptions about acceptable rate adaptation would make evolution to such
new algorithms extremely painful.
Some operators use deep packet inspection (DPI) and traffic analysis
to identify certain applications they believe to have an excessive impact
on the network. ISPs generally pick on applications that they judge as
low value to the customer in question and high impact on other customers. A
common example is peer-to-peer file-sharing. Having identified a
flow as belonging to such an application, the operator uses differential
scheduling to limit the impact of that flow on others, which usually limits its
throughput as well. This has fuelled the on-going battle between application developers
and DPI vendors.
When operators first started to limit the throughput of P2P,
it soon became common knowledge that turning on encryption could boost your throughput. The DPI vendors
then improved their equipment so that it could identify P2P traffic by the pattern of packets it sends.
This risks becoming an endless vicious cycle - an arms race that neither side can win. Furthermore such
techniques may put the operator in direct conflict with the customers, regulators and content providers.
The accountability and capacity sharing problems highlighted so far have always
characterised the Internet to some extent. In 1988 Van Jacobson coded capacity
sharing into TCP's e2e congestion control algorithms [VJCC]. But
fair queuing algorithms were already being written for network operators to ensure each
active user received an equal share of a link and couldn't game the
system. The two approaches have divergent objectives, but
they have co-existed ever since.
The main new factor has been the introduction of residential broadband, making
'always-on' available to all, not just campuses and enterprises. Both TCP and
approaches like fair queuing only share out capacity instantaneously; they don't
take account of how much of each user's data is occupying a link over time,
which can significantly reduce the capacity available to lighter usage.
Therefore residential ISPs have been introducing new traffic management
equipment that can prioritise based on each customer's usage profile. Otherwise
capacity upgrades get eaten up by transfers of large amounts of data, with
little gain for interactive usage [BBincent].
In campus networks, capacity upgrades are the easiest way to mitigate the
inability of TCP or FQ to take account of activity over time. But
capacity upgrades are much more expensive in residential broadband networks
that are spread over large geographic areas and customers will only be happy
to pay more for their service if they can see a significant benefit.
However, these traffic management techniques fight the capacity
shares e2e protocols are aiming at, rather than working together in unison. And,
the more optimal ISPs try to make their controls, the more they need application
knowledge within the network – which isn't how the Internet was designed to work.
Congestion exposure hasn't been considered before, because the depth of the
problem has only recently been understood. We now understand that both networks
and end-systems need to focus on contribution to congestion, not volume or rate. Then
application knowledge is only needed on the end-system, where it should be. But
the reason this isn't happening is that the network cannot see the information it needs
(congestion).
As long as ISPs continue to use rate and volume as the key metrics for determining
when to control traffic there is no incentive to use LEDBAT or other low-congestion protocols that
improve the performance of competing interactive traffic. We believe that
congestion exposure gives ISPs the information they need to be able to discriminate in
favour of such low-congestion transports. In turn this will give users a direct benefit
from using such transports and so encourage their wider use.
This section proposes some requirements for any solution to this problem. We believe that a
solution that meets most of these requirements is likely to be better than one that doesn't, but we recognise
that if a working group is established in this area, it may have to make tradeoffs.
Allow both upstream and downstream congestion to be visible at the IP layer -- visibility at the IP layer allows congestion in the heart of
the network to be monitored at the edges and without deploying complicated and intrusive equipment such as DPI boxes. This gives several advantages:
It enables policing of flows based on the congestion they are actually going to cause in the network.
It allows the flow of congestion across ISP borders to be monitored.
It supports a diversity of intra-domain and inter-domain congestion management practices.
It allows the contribution to congestion over time to be counted as easily as volume can be counted today.
It supports contractual arrangements for managing traffic (acceptable use policies, SLAs etc) between just
the two parties exchanging traffic across their point of attachment, without involving others.
Avoid making assumptions about the behavior of specific applications (e.g. be agnostic to application and transport behaviour).
Support the widest possible range of transport protocols for the widest range of data types
(elastic, inelastic, real-time, background, etc) -- don't force a "universal rate adaptation policy"
such as TCP-friendliness [FairnessPS].
Be responsive to real-time congestion in the network.
Allow incremental deployment of the solution and ideally permit permanent partial deployment to increase chances of successful deployment.
Ensure packets supporting congestion exposure are distinguishable from others, so that each transport can control when it chooses to deploy congestion
exposure, and ISPs can manage the two types of traffic distinctly.
Support mechanisms that ensure the integrity of congestion notifications, thus making it hard for a user or network to distort the congestion signal.
Be robust in the face of DoS attacks, so that congestion information can be used to identify and limit DoS traffic and to protect the hosts and
network elements implementing congestion exposure.
Many of these requirements are by no means unique to the problem of congestion exposure. Incremental deployment for instance is a critical requirement for any
new protocol that affects something as fundamental as IP. Being robust under attack is also a pre-requisite for any protocol to succeed in the real Internet
and this is covered in more detail in the Security Considerations section.
In this section we explore a simple strawman protocol that would solve the congestion exposure problem. This protocol
neatly illustrates how a solution might work. A practical implementation of this protocol has been produced and both simulations and
real-life testing show that it works. The protocol is based on a concept known as re-feedback [ReFeedback] and builds on existing
active queue management techniques like RED and ECN
that network elements can already use to measure and expose congestion.
Re-feedback, standing for re-inserted feedback, is a system designed to allow end-hosts to reveal to the network information
about their network path that they have received via conventional feedback (for instance congestion).
In our strawman protocol we imagine that packets have two "congestion" fields in their IP header:
The first is a congestion experienced field to record the upstream congestion level along the path.
Routers indicate their current congestion level by updating this field in every packet. As the packet traverses the network it
builds up a record of the overall congestion along its path in this field. This data is sent back to the sender who uses it to
determine its transmission rate.
The other is a whole-path congestion field that uses re-feedback to record the total congestion along the path. The sender
does this by re-inserting the current congestion level for the path into this field for every packet it transmits.
Thus at any node downstream of the sender you can see the upstream congestion for the packet (the congestion thus far), the whole-path
congestion (with a time lag of one RTT) and can calculate the downstream congestion by subtracting one from the other.
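The following minimal sketch (field and function names are ours; real
proposals encode these signals far more compactly in the IP header)
illustrates the three roles involved and the subtraction that reveals
downstream congestion:

   from dataclasses import dataclass

   @dataclass
   class Packet:
       upstream: float = 0.0    # congestion-experienced field, updated en route
       whole_path: float = 0.0  # re-inserted by the sender from feedback

   def router_forward(pkt: Packet, local_congestion: float) -> None:
       # Each router adds its current congestion level, so this field
       # accumulates the congestion the packet has seen so far.
       pkt.upstream += local_congestion

   def sender_send(last_feedback: float) -> Packet:
       # The sender re-inserts the whole-path congestion level it learned
       # from receiver feedback one RTT ago ("re-feedback").
       return Packet(upstream=0.0, whole_path=last_feedback)

   def downstream_congestion(pkt: Packet) -> float:
       # Any node can estimate rest-of-path congestion by subtraction.
       return pkt.whole_path - pkt.upstream

   # Example: the path congestion fed back was 2%; at a node where packets
   # have accumulated 0.5%, downstream congestion is about 1.5%.
   pkt = sender_send(last_feedback=0.02)
   router_forward(pkt, 0.005)
   assert abs(downstream_congestion(pkt) - 0.015) < 1e-9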
So congestion exposure can be achieved by coupling congestion notification from routers with
the re-insertion of this information by the sender. This establishes information symmetry between users
and network providers.
Once downstream congestion information is revealed in the IP header it can be used for a number of purposes. Precise details
of how the information might be used are beyond the scope of this document but this section will give an overview
of some possible uses. {ToDo: write up the rest of this section properly. Concentrate on a couple of the most useful
potential use cases (traffic management and accountability?) and mention a couple of more arcane uses (traffic engineering
and e2e QoS). The key thing is to clarify that Congestion Exposure is a tool that can be used for many other things...}
It allows an ISP to accurately identify which traffic is having the greatest impact on the network and either police
directly on that basis or use it to determine which users should be policed. It can form the basis of inter-domain
contracts between operators. It could even be used as the basis for inter-domain routing, thus encouraging operators
to invest appropriately in improving their infrastructure.
From Rich Woundy: "I would add a section about use cases. The primary use case would seem to be an "incentive environment that ensures
optimal sharing of capacity", although that could use a better title. Other use cases may include "DDoS mitigation",
"end-to-end QoS", "traffic engineering", and "inter-provider service monitoring". (You can see I am stealing liberally
from the motivation draft here. We'll have to see whether the other use cases are "core" to this group, or "freebies"
that come along with re-ECN as a particular protocol.)"
My take on this is we need to concentrate on one or two major use cases. The most obvious one is using this to control user-behaviour and
encourage the use of "congestion friendly" protocols such as LEDBAT.
{Comments from Louise Krug:} simply say that operators MUST turn off any kind of rate limitation for LEDBAT traffic and what
they might mean for the amount of bandwidth they see compared to a throttled customer? You could then extend that to say how
it leads to better QoS differentiation under the assumption that there is a broad traffic mix any way? Not sure how much detail
you want to go into here though?
{ToDo: better incorporate this text from Mirja into Michael's text below.}
Congestion exposure can enable ISPs to give end-systems an incentive to
respond to congestion in a way that leads to a better share of the available
capacity. For example, the introduction of a per-user congestion volume
allowance might motivate heavy users to back off with their high-bandwidth
traffic (when congestion occurs) so as to save their congestion volume for
more time-critical traffic. If every end-system reacts to congestion in such
a way that it avoids congestion for non-critical traffic and allows a certain
level of congestion for the more important traffic (from the user's point of
view), the overall user experience will improve. Moreover, the network might
be utilised more evenly as less-important traffic is shifted to less
congested time slots.
As described earlier in this document, ISPs throttle traffic not because it causes congestion
in the network but because users have exceeded their traffic profile or because individual
applications or flows are suspected to cause congestion. This is done because it is not
possible to police only the traffic that is causing congestion. Congestion exposure allows
new possibilities for rate policing.
A straightforward application of congestion exposure is per-flow or per-aggregate
congestion policing. Instead of limiting flows or aggregates because they have
exceeded certain rate thresholds, they can be throttled if they cause too much
congestion in the network. This is throttling on evidence instead of suspicion.
The assumption is that every customer has an allowance of congestion per second.
If he causes more congestion than this throughout the network, his traffic can be
policed or shaped to ensure he stays within his allowance. The nice features of
this approach are that it sets incentives for the use of congestion-minimising
transport protocols such as LEDBAT and allows tariffs that better reflect the
relative impact of customers on each other.
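One way such a policer might work (purely illustrative; all parameters and
names are invented) is as a token bucket that fills at the customer's
congestion allowance and is drained by the congestion-volume of their
packets, i.e. bytes weighted by the downstream congestion they cause:

   import time

   class CongestionPolicer:
       """Token bucket drained by congestion-volume, not by plain volume."""

       def __init__(self, fill_rate: float, bucket_depth: float):
           self.fill_rate = fill_rate   # allowance: congestion-bytes per second
           self.depth = bucket_depth
           self.tokens = bucket_depth
           self.last = time.monotonic()

       def allow(self, packet_len: int, downstream_congestion: float) -> bool:
           now = time.monotonic()
           self.tokens = min(self.depth,
                             self.tokens + (now - self.last) * self.fill_rate)
           self.last = now
           # Congestion-volume of this packet: bytes x rest-of-path congestion.
           cost = packet_len * downstream_congestion
           if cost <= self.tokens:
               self.tokens -= cost
               return True
           return False  # over allowance: drop, delay or deprioritise

   # Example: with an allowance of 1000 congestion-bytes per second, a
   # 1500-byte packet facing 1% downstream congestion costs 15 tokens.
   policer = CongestionPolicer(fill_rate=1000.0, bucket_depth=10_000.0)
   print(policer.allow(1500, 0.01))  # True while within allowance

Note that traffic facing no congestion never drains the bucket, which matches
the earlier observation that restrictions only need to apply when there is
congestion.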
A user generates foreground and background traffic. Foreground traffic needs to
go fast while background traffic can afford to go slow. With per-customer
congestion policing, users can optimise their network experience by using
congestion-minimising transport protocols for background traffic and normal
TCP-like or even high-speed transport protocols for foreground traffic. Doing so means
background traffic causes only minimal congestion, so foreground traffic
can go faster than if both were transmitted over the same transport protocol.
Hence, per-customer congestion policing sets incentives for selfish users to utilise
congestion-minimising transport protocols.
Currently customers are offered tariffs with all manner of differentiators,
from peak access rate to volume limit and even specific application rate
limits. Congestion policing offers a better means of distinguishing between
tariffs. Heavy users and light users will get equal access in terms of speed
and short-term throughput, but customers that cause more congestion and thus
have a bigger impact on others will have to pay for the privilege or suffer
reduced throughput during periods of heavy congestion. However, tariffs are a
subject best left to the market to determine, not the IETF.
This document makes no request to IANA.
One intended use of exposed congestion information is to hold the e2e transport and the network accountable to each other.
Therefore, any congestion exposure protocol will have to provide the necessary hooks to mechanisms that can assure the
integrity of this information. The network cannot be relied on to report information to the receiver against its interest,
and the same applies to the information the receiver feeds back to the sender, and that the sender reports back to the
network. Looking at each in turn:
The Network. In general it is not in any network's interest to under-declare congestion
since this will have potentially negative consequences for all users of that network. It may
be in its interest to over-declare congestion if, for instance, it wishes to force traffic
to move away to a different network or indeed simply wants to reduce the amount of traffic it
is carrying. Congestion Exposure itself shouldn't significantly alter the incentives
for and against honest declaration of congestion by a network, but it is possible to
imagine applications of Congestion Exposure that will change these incentives.
There is a general perception among networks that their level of congestion is a business
secret. Actually in the Internet architecture congestion is one of the worst-kept secrets a
network has, because end-hosts can see congestion better than networks can. Nonetheless, one goal
of a congestion exposure protocol is to allow networks to pinpoint whether congestion is in one
side or the other of a border. Although this extra transparency should be good for ISPs with low
congestion, those with underprovisioned networks may try to obstruct deployment.
The Receiver. Receivers generally have an incentive to under-declare congestion since they
generally wish to receive the data from the sender as rapidly as
possible. [Savage99] explains how a receiver can significantly improve its
throughput by failing to declare congestion. This is a problem with or without Congestion Exposure.
[RcvPrevent] explains one possible technique to encourage receivers to be honest
in their declaration of congestion.
The Sender. One proposed mechanism for congestion exposure adds a requirement for a sender
to let the network know how much congestion it has suffered or caused. Although most senders
currently respond to congestion they are informed of, one use of exposed congestion information
might be to encourage sources of excessive congestion to respond more than previously. Then clearly
there may be an incentive for the sender to under-declare congestion. This will be a particular
problem with sources of flooding attacks.
In addition there are potential problems from source spoofing. A malicious sender can pretend to
be another user by spoofing the source address. A congestion exposure protocol will need to be
robust against injection of false congestion information into the forward path that could distort
or disrupt the integrity of the congestion signal.
Congestion exposure is the idea that traffic itself indicates to all nodes on
its path how much congestion it causes on the entire path. It enables network
operators to police traffic only where it really causes congestion in the
Internet, instead of applying blind rate caps independent of the congestion
situation. This change would give users incentives to adopt new transport
protocols such as LEDBAT [LEDBAT], which try to avoid congestion more than
TCP does. Requirements for congestion exposure in the IP header were
summarised, one technical solution was presented, and additional use cases
for congestion exposure were discussed.
A number of people other than authors have provided text and comments for this memo. The document is being produced in support of a
BoF on Congestion Exposure as discussed extensively on the <re-ecn@ietf.org> mailing list.
[RFC3439] Bush, R. and Meyer, D., "Some Internet Architectural Guidelines
and Philosophy", RFC 3439, December 2002.

[XCHOKe] "XCHOKe: Malicious Source Control for Congestion Avoidance at
Internet Gateways".

[E2eCC] Floyd, S. and Fall, K., "Promoting the Use of End-to-End Congestion
Control in the Internet", IEEE/ACM Transactions on Networking, 1999.

[ReFeedback] Briscoe, B., Jacquet, A., Di Cairano-Gilfedder, C., Salvatori,
A., Soppera, A. and Koyabe, M., "Policing Congestion Response in an
Internetwork Using Re-Feedback", Proc. ACM SIGCOMM, 2005.

[Kelly98] Kelly, F., Maulloo, A. and Tan, D., "Rate control for
communication networks: shadow prices, proportional fairness and stability",
Journal of the Operational Research Society, 1998.

[FairnessPS] "Problem Statement: Transport Protocols Don't Have To Do
Fairness". Argues that fair resource sharing can only be resolved at
run-time between users and ISPs, rather than being fixed into transport
protocol designs.

[LEDBAT] "Low Extra Delay Background Transport (LEDBAT)". An experimental
congestion control algorithm that minimises the extra delay it induces in
the bottleneck while saturating it, implementing an end-to-end scavenger
service. It has been implemented in BitTorrent DNA (as the exclusive
congestion control mechanism) and in uTorrent (as an experimental
mechanism), and deployed in the wild with favourable results.

[CCopen] "Open Research Issues in Internet Congestion Control". Describes
open problems in Internet congestion control, including new challenges that
are becoming important as the network grows as well as issues that have been
known for many years.

[Ofcom08] Ofcom: Office of Communications, "UK Broadband Speeds 2008:
Research report".

[CiscoVNI] Cisco Systems, Inc., "Cisco Visual Networking Index: Forecast and
Methodology, 2008–2013".

[BBincent] MIT Communications Futures Program (CFP) and Cambridge University
Communications Research Network, "The Broadband Incentive Problem".

[VJCC] Jacobson, V. and Karels, M., "Congestion Avoidance and Control",
Proc. ACM SIGCOMM, 1988.

[Savage99] Savage, S., Cardwell, N., Wetherall, D. and Anderson, T., "TCP
Congestion Control with a Misbehaving Receiver", ACM SIGCOMM Computer
Communication Review, 1999.

[RcvPrevent] "Incrementally Deployable Prevention to TCP Attack with
Misbehaving Receivers", Computer Science Department, Carnegie Mellon
University.