Presence Interdomain Scaling Analysis for SIP/SIMPLE

Authors' addresses:

- IBM, Science Park, Rehovot, Israel; avshalom@il.ibm.com
- AOL LLC, 401 Ellis St., Mountain View, CA 94043, USA; aoki@aol.net
- Microsoft Corporation, One Microsoft Way, Redmond, WA 98052, USA; Sriram.Parameswar@microsoft.com
- Microsoft Corporation, One Microsoft Way, Redmond, WA 98052, USA; timrang@microsoft.com
- Columbia University, Department of Computer Science, 450 Computer Science Building, New York, NY 10027, US; vs2140@cs.columbia.edu; http://www.cs.columbia.edu/~vs2140
- Columbia University, Department of Computer Science, 450 Computer Science Building, New York, NY 10027, US; +1 212 939 7004; hgs+ecrit@cs.columbia.edu; http://www.cs.columbia.edu/~hgs

Real Time
SIMPLE WG
Keywords: I-D, Internet-Draft, SIMPLE, problem statement

This document analyzes the traffic that is generated by presence
subscriptions between domains and shows that the amount of traffic can be
extremely large. This document also analyzes
the effects of a large presence system on the memory footprint and the CPU load.
Approved and in-work optimizations to the Session Initiation Protocol are analyzed,
considering the possible impact on the load. Separate documents contain the
requirements for optimizations and suggestions for new optimizations.

The document analyzes the traffic that is generated by the Session Initiation Protocol (SIP) for presence (as defined by the SIMPLE working group) due to presence subscriptions between domains.
It shows that the number of messages and the amount of data generated can be
extremely large. This document also
analyzes the effects of a large presence system on the memory footprint
and the CPU load. Approved and in-work optimizations to SIP
are analyzed, considering the possible impact on the load. Another
document provides requirements for optimizations,
while other documents contain suggestions for new
optimizations.

This document is intended to drive work on possible solutions that
will make the deployment of a SIP-based presence server a less challenging
task. Deployment of highly scalable presence systems is challenging by its
nature, and protocol developers design their own techniques for
optimizing their protocols. Comparing
protocols is beyond the scope of this document.

This document discusses the following areas, showing the
complexity and load that the presence server handles in order to
provide its service:

- Message load - By computing the number of messages that are required for
  connecting presence systems, the document shows that the number of messages and
  the required bandwidth are large, and that it is quite obvious that
  optimizations are needed.
- State management - Due to the nature of the service that the presence server
  provides, the presence server has to manage a relatively large and complex state.
  Some computations are provided in the document.
- Processing complexities - The presence server maintains many small objects
  and performs frequent operations on these objects. We show that these operations,
  as well as the optimizations that are intended to reduce the amount of data sent
  between watchers and presence servers, are not so simple and may create a heavy
  processing load on the presence server.
- Groups - Resource List Servers optimize the number
  of sessions that are created between the watchers and the presence server.
  However, this optimization may create an exponential increase in the size of subscriptions
  as a result of the minimal effort of subscribing to large groups.

The terms "presence domain" or "presence system" appear in this document frequently.
These terms refer to SIP-based presence servers that provide presence
subscription and notification services to their users. A presence system can be
deployed in a small enterprise or in a large consumer network.

Some optimizations are approved or are being defined for the SIP presence
protocol, but even with these optimizations, a large number of messages and
wide bandwidth are needed in order to establish federation between presence
systems of large communities. Further thinking is needed in order to make large
deployment of presence systems less resource demanding.

Note that even though this document talks about inter-domain traffic, the
introduction of resource list servers (RLSs) introduces
similar traffic patterns intra-domain and inter-domain. See the detailed
discussion on resource lists in .

The current optimizations that were approved as RFCs
or are approved as working group items
in the SIMPLE working group can be divided into two categories:

- Dialog-saving optimizations - Here we refer to optimizations such as the
  resource list RFC or to the URI list subscriptions RFC. These documents
  define ways to reduce the number of dialogs that are required between the
  subscriber and the presence system.
  Note that the terms "dialog optimization" or "RLS usage", as used in this
  document, refer to the usage of a URI that represents a list of URI lists
  between domains and not within the same domain. An example is a user Alice in
  domain example.org who subscribes to the URI external-reps-list at
  example.com or uses a URI list to subscribe to her watch list in example.com.
  Note also that, when calculating the traffic due to the RLS within a domain,
  the traffic between the RLS and the presence agents should also be considered.
  However, because we are mostly concerned with inter-domain traffic, we are not
  taking into account the traffic between the RLS and the presence agents.
- Notification optimizations - Here we refer to the optimizations
  covered in the subnot-etags draft,
  which describes the suppression of unnecessary NOTIFYs when
  subscriptions are refreshed. There are other drafts that reduce the size of
  messages by using partial notifications or filtering. This document shows that
  partial optimizations can reduce the bandwidth but do not reduce the number of
  messages.

One optimization that was not considered
is the reception of presence information
outside of SIP. An example of this is the ability to download
persistent presence information directly from a web site.
The calculations assume that all presence
data is carried within SIP and not by other means.
These out-of-band optimizations may improve the number of messages and number
of bytes significantly, but they are out of scope for this document.

In this document, several assumptions are used regarding size of messages, rate
of presence change and more. It should be noted that these assumptions are not
directly based on rigorous statistics from actual SIP-based deployments of
presence systems but more on some experience with other types of presence-based systems.

The following numbers are given more as examples from real deployments and they are not intended to be
complete.

In a large consumer network, we have seen the following patterns:

- There are approximately 110 users on a watch list on average.
- There are approximately 12 billion status changes a day (139k/second) across
  the network. When a proprietary binary protocol conveys the
  status changes, the average message size is about 188 bytes. When a SIP NOTIFY is
  used, the average is about 1228 bytes.
- The average number of logins/logouts in the system is about 2000 logins per second and about 4000 logouts per second.
  When a promotion, contest, or network
  hiccup causes many users to log in and log out simultaneously, there are about 20,000 logins per second.
- The peak number of instant messages sent is about 50,000 messages per second.

In an enterprise deployment, we have seen the following patterns:

- Average watch list size was 200 users.
- About half of the registered users were online at peak time.
- Status change rate was 2 changes per hour.
- The average logins/logouts in the system was about 5 logins per second with
  an additional 15 logins/logouts during start/end of day rush hours.

Even though the assumptions in this document are not based on rigorous
statistical data, the target here is not to analyze a specific system but
to show that even with VERY moderate assumptions (which are even less than
the observations mentioned above), the number of messages, the network
bandwidth, the required state management, and the CPU load are
high. Real-life systems could have much larger scalability challenges.
For example, the presence state change that we assumed (one presence state
change per hour) is maybe one of the most moderate assumptions that we
have taken. Experience from consumer networks shows that the frequency can be much higher, especially with the younger generation using more
presence attributes like mood, etc. In an environment where a user may
have several devices and other resources for presence information such as
geographical location and calendar, the frequency of presence state changes
will be much higher.

It is hard to measure presence load because it depends heavily
on the behavior of users, and the behavior of users differs widely.
Some users will have a small number of presentities in their watch
list, while others may have hundreds, or even thousands. Some users will
change their state frequently and have many sources of presence information,
while others may have a small number of changes during the day. In
addition, the "rush hour" calculations of when the day starts and ends were
not included in this document. Rush hour differs between different
enterprises and is different still in the consumer presence systems. It is
hard, if not impossible, to include in a static document all the possible
combinations.

Throughout the calculations, a certain number of users are assumed for the
different models. That does not mean that in actual deployments all the
users of the domain are actually subscribed to presence documents and/or have
published their presence documents. Observing actual deployments shows that, in
the consumer market, the number of users that use presence services may be
10 percent or less of the registered users. In the enterprise market, the
numbers tend to be around 50 percent of the actual enterprise registered
users.

The same is true for the number of watched presentities per
watcher. If only some percentage of the domain users are online at a given
time, then this number should match that percentage. However, adding
this assumption to the calculations will make the calculations more
complex because the effect of the watched offline presentities
would need to be considered. This means that empty
NOTIFYs would be sent for offline presentities when the subscription is created and there
are no updates on them. In order to make the computations less complex
(they are complex enough as they are), the number of the watched
presentities used in the calculations is the number of the
federated presentities from the watcher list that are online.

The basic SIP subscription dialog involves the following message transfer:

- SUBSCRIBE/200
- Initial NOTIFY/200
- (j) NOTIFY/200 where "j" is the number of presence changes seen by the watcher
- (k) SUBSCRIBE/200 where "k" is the number of subscription dialog refresh periods
- SUBSCRIBE/200 with Expires = 0 to terminate the dialog
- NOTIFY/200 ending the dialog

An individual watcher will generate X number of SIP subscription
dialogs corresponding to the number of presentities it chooses to watch. The
amount of traffic generated is significantly affected by several factors:

- Number of watchers connected to the system.
- Number of presentities connected to the system.
- Frequency of changes to presence information.

This document contains several calculations that show the expected
message rate and bandwidth between presence domains. The following sections explain the
assumptions and methods behind the calculations.

The following are the "constants" that we use in the calculations. Some of the
constants are used throughout the calculations while others change between use cases.

(C01) Subscription lifetime (hours) - The assumed lifetime of a subscription,
in hours. We assume 8 hours for all calculations. Note that the term "day" that
is used in the document and C01 are synonymous.

(C02) Presence state changes / hour - The average number of times that a presentity
changes his/her status in one hour. We assumed 3 times per hour for most
calculations. Note that for some users in consumer messaging systems, the actual
number of changes is likely to be much higher.

(C03) Subscription refresh interval / hour - The duration of the SUBSCRIBE
session, after which it needs to be refreshed. We assumed that the duration is
one hour.

(C04) Total federated presentities per watcher - The number of presentities
that the watcher is watching. The number here changes in this document according
to the type of the specific deployment.

(C05) Number of dialogs to maintain per watcher - The number of the SUBSCRIBE
dialogs that are maintained per watcher. If a dialog optimization is not assumed,
this number is equal to C04, otherwise it is 1.

(C06) Total number of watchers in the federated presence domains. The number
here is the number of all watchers in all the federated domains.

(C07) SUBSCRIBE message size, in bytes. We assume 450 bytes in all
calculations. The size is based on a typical SUBSCRIBE taken from RFCs.

(C08) 200 OK for SUBSCRIBE message size, in bytes. We assume 370 bytes in all
calculations. The size is based on a typical 200 OK taken from RFCs.

(C09) NOTIFY message size, not including the presence document. The size of this
message for a single presentity is assumed to be 500 bytes for the NOTIFY
message itself (based on sizes from examples in RFCs).

(C10) 200 OK for NOTIFY message size, in bytes. We assume 370 bytes in all
calculations. The size is based on a typical 200 OK taken from RFCs.

(C11) Size of an average presence document, in bytes. Two sizes of average presence
document are used. One is the minimal size of the PIDF
document, assumed to be 350 bytes based on examples from RFCs, and the other is
3000 bytes for a rich presence document. It should be
noted that 3000 bytes for a presence document is relatively modest if we take
into account multiple devices and location information.

(C12) The size of a NOTIFY, in bytes, when partial notification is used.
We have taken this size to be 200 bytes, much smaller than the
example given in , which assumes
multiple changes in the presence document. Here we assume a single
change.

When dialog optimization is used, an RLMI document,
which contains the presence documents for the users on the watch list, is sent.
In a previous version of this document, we had omitted the overhead of the RLMI
document. This "bug" was found by Victoria Beltran-Martinez and is fixed in this
version by adding the following constants C13, C14 and C15 to the
calculations.

(C13) Item size per each contact in RLMI document, 160 bytes.

(C14) The size of the multipart boundary in RLMI notifications, 144 bytes.

(C15) The size of the XML root node in RLMI document (once per notification), 144 bytes.

The following are the calculations for the messages in the initial phase of
the establishment of the subscriptions. The calculations contain both the number of
messages and the number of bytes.

(I01) Number of initial SUBSCRIBE messages per watcher = C05.

(I02) Number of initial 200 OK messages for SUBSCRIBE messages per watcher = C05.

(I03) Number of initial NOTIFY messages per watcher = C05.

(I04) Number of initial 200 OK messages for NOTIFY messages per watcher = C05.

(I05) Total number and bytes of initial SUBSCRIBE messages for all watchers =
Number: I01*C06, Bytes: I01*C06*C07.

(I06) Total number and bytes of initial 200 OK for SUBSCRIBE messages for all
watchers = Number: I01*C06, Bytes: I01*C06*C08.

(I07) Total number and bytes of initial NOTIFY messages for all watchers =
Number: I01*C06.
The calculation for the size in bytes is different depending on the use of dialog
optimization:

- When dialog optimization is not applied, the number of
  bytes is calculated by (I01*C06*C09)+(I01*C06*C11).
- When dialog optimization is applied, the number of bytes is calculated by
  (I01*C06*(C09+C14+C15))+(I01*C06*C04*(C11+C13+C14)).

(I08) Total number and bytes of initial 200 OK for NOTIFY messages for all
watchers = Number: I04*C06, Bytes: I04*C06*C10.

(I09) Total number and bytes of initial messages per day = Number: numbers
in I05+I06+I07+I08, Size: sizes in I05+I06+I07+I08.

Here we describe the calculations for steady state messages. Steady state
is the time between the initial subscription and the teardown of the
subscription. It contains the NOTIFYs due to state change and the subscription refreshes.

(S01) NOTIFY messages due to state change per watched presentity per day
(less 2, because the NOTIFYs for initial and terminating states are included
in the initial and terminating calculations) = (C02*C01-2).

(S02) 200 messages (for NOTIFYs due to state change) per watched presentity
per day (less 2, because the NOTIFYs for initial and terminating states are included
in the initial and terminating calculations) = (C02*C01-2).

(S03) Total number and size of messages due to state change per day = Number: (S01+S02)*C06*C04.
The calculation for the size in bytes depends on the use of
dialog optimization:

- When dialog optimization is not applied, the
  number of bytes is calculated by (C06*C04)*((S01*(C09+C11))+(S02*C10)).
- When dialog optimization is applied, the number of bytes is calculated by
  (C06*C04)*((S01*(C09+C11+C13+C14+C15+C14))+(S02*C10)).
  This includes the
  multipart boundary of the resource list. Note that for dialog optimization it is
  assumed that only a single presentity is changed and partial state notification
  is used.

(S04) Number of SUBSCRIBE messages for refreshes per watcher per day =
((C01/C03)-1)*C05. One is subtracted because the termination is calculated
separately. For example, if there are 8 hours in the day and a refresh should
occur every hour, there are 7 refreshes during the day and not 8.

(S05) Number of 200 OK messages for SUBSCRIBE messages for refreshes per watcher per day =
((C01/C03)-1)*C05.

(S06) Number of NOTIFY messages for refreshes per watcher per day =
((C01/C03)-1)*C05. If NOTIFY optimization is used, there is no
need to send NOTIFYs for refreshes, and S06 will be zero.

(S07) Number of 200 OK messages for NOTIFY messages for refreshes per watcher
per day = ((C01/C03)-1)*C05. If NOTIFY optimization is used,
there is no need to send NOTIFYs for refreshes, and S07 will be zero.

(S08) Total number and size of messages due to SUBSCRIBE refreshes per day =
Number: (S04+S05+S06+S07)*C06.
The size in bytes is calculated by adding the
SUBSCRIBE bytes (S04*C06*C07), the OK bytes for the SUBSCRIBE (S05*C06*C08), the
NOTIFY bytes C06*(S06*(C09+C11)) and the OK bytes for the NOTIFY
(S07*C06*C10).
Note that the formula for the NOTIFY bytes assumes that dialog
optimization is not used. When dialog optimization is used, the formula is:
C06*(S06*((C09+C14+C15)+(C04*(C11+C13+C14)))).
Note that a full state should be
given in SUBSCRIBE refreshes in resource lists. See section 5.2 in .
The fact that the full state needs to be returned in a
NOTIFY response to refresh makes the NOTIFY optimization more efficient in
conjunction with the dialog optimization.

(S09) Total number and bytes of steady messages per day = Number: numbers
in S03+S08, Bytes: sizes in S03+S08.

The following are the calculations for the messages in the termination phase of
the subscriptions. The calculations contain both the number of
messages and the number of bytes.

(T01) Number of terminating SUBSCRIBE messages per watcher = C05.

(T02) Number of terminating 200 OK messages for SUBSCRIBE messages per watcher = C05.

(T03) Number of terminating NOTIFY messages per watcher = C05. If NOTIFY optimization is used,
there is no need to send a NOTIFY for terminations, and T03 will be zero.

(T04) Number of terminating 200 OK messages for NOTIFY messages per watcher =
C05. If NOTIFY optimization is used,
there is no need to send a NOTIFY for terminations, and T04 will be zero.

(T05) Total number and bytes of terminating SUBSCRIBE messages for all watchers = Number: T01*C06, Bytes: T01*C06*C07.

(T06) Total number and bytes of terminating 200 OK for SUBSCRIBE messages for
all watchers = Number: T01*C06, Bytes: T01*C06*C08.

(T07) Total number and bytes of terminating NOTIFY messages for all watchers
= Number: T03*C06. The number of bytes is calculated to be:

- (T03*C06*(C09+C11)) when dialog optimization is not used, and
- (T03*C06*(C09+C14+C15))+(T03*C06*C04*(C11+C13+C14)) when dialog optimization
  is used.

Note that a full state should be given in SUBSCRIBE refreshes in resource
lists. See section 5.2 in .

(T08) Total number and bytes of terminating 200 OK for NOTIFY messages for all
watchers = Number: T04*C06, Bytes: T04*C06*C10.

(T09) Total number and bytes of terminating messages per day = Number: numbers
in T05+T06+T07+T08, Size: sizes in T05+T06+T07+T08.

The following are the calculations of several totals that are based on the
above calculations.

(B01) Total number of messages and bytes during the day = Messages: number
of messages in I09+S09+T09, Bytes: number of bytes in I09+S09+T09.

(B02) Total number of messages and bytes per second = Messages: number of
messages in B01/(C01*3600), Bytes: number of bytes in B01/(C01*3600).

(B03) Total number of messages and bytes per user per day = Messages: number
of messages in B01/C06, Bytes: number of bytes in B01/C06.

With the way that the calculations are built, it is relatively easy to see the
effect of rush hours at the beginning and the end of the day. For the beginning
of the day, we should look at the numbers of "(I09) Total number and bytes of
initial messages per day" and for the end of the day we should look at the
number of "(T09) Total number and bytes of terminating messages per day". Taking
these numbers with some assumed percentage of the number of users logging in
at the same hour should give a good indication of the rush hour load.

The following table uses some common presence characteristics to demonstrate
the effect these factors have on state and message rate within a presence domain
using base SIP without any proposed optimizations. In this
example, there are two presence domains with a total of 40,000 federating users
with an average of 4 contacts per user in the peer domain. Note that the main calculation
is done for a presence document size of 350 bytes, which is the base PIDF
document size, but the bottom-line calculation is also given for a presence
document size for rich presence, which is assumed to be 3000
bytes, based on the examples given in the RFCs. This two-fold calculation is
done for every use case in this document.

The same analysis provided above is repeated here with the assumption
that the dialog optimization is applied. Note that while the sign-in (ramp
up) and sign-out message flows are positively affected, the steady state
rates are not.

The analysis provided above is
repeated here with the assumption that the notification optimization is applied. The
optimization saves the need for NOTIFY upon refreshing a SUBSCRIBE if there was
no change since the last NOTIFY. It is assumed here that there are no NOTIFY
messages for SUBSCRIBE refreshes and terminations. As expected, this
optimization affects the steady and termination states and does not affect the
initial state.

Here, both optimizations are combined. In all the subsequent use cases, we will
show only the analysis with no optimizations and with both optimizations
combined.

While scalability issues exist in any large deployment, certain
deployments have characteristics that are conducive to the existing
optimizations, and others have characteristics that are not. What follows
is a list of federation scenarios that have varying usage
characteristics. For each, a message rate and bandwidth table is
provided reflecting typical message rates. Those
characteristics can alter the overall effectiveness of existing
optimizations.

Note that the number of users considered is not the total number of users
in the domains but the number of actual logged-in users. As was mentioned before,
not all the domain users will use the presence service at the same
time. The numbers used for watchers and watched
presentities are for online users.

In some environments, presence federation may be common, perhaps
even more common than intra-domain presence. An example of this type of
environment is a small ISV or public server. Users in that small ISV
are not likely to subscribe to the presence of other users in their
server because they do not necessarily have any relationship with each
other aside from receiving service from the same provider. They are
much more likely to be subscribed to the presence of users in one of the
federated domains (whether in consumer domains, academic domains, other ISVs,
etc). Common characteristics of this deployment are these:

- Federated subscriptions are the majority of subscription traffic.
- Individual users are likely to subscribe to multiple users in any one
  domain.
- The intersection of users in the deployment watching the same
  presentities is quite small (that is, the probability that multiple watchers in the
  domain are subscribed to the same presentity is low).

To account for the extraordinarily high percentage of federation
traffic, the number of federated presentities is increased to 20. Although the
number of watchers in the domain could also be adjusted to allow for
an expected larger community of users being peered with, it is omitted
here for simplification.

The first table below provides the calculations without optimizations.
The second table provides the calculations with optimizations.

In this type of environment, the domain is a collection of associated
users, such as an enterprise. Here, federation is once again
common. However, there is also a strong association between some users
in the deployment. These associations make it somewhat more likely that
users in that domain are watchers of the same presentity. This can
occur because of business relationships (for example, two co-workers on a project
federating with a partner company).

Common characteristics of this deployment are these:

- Federated subscriptions are a large minority or small majority of
  subscription traffic.
- Individual users are likely to subscribe to multiple users in any one
  domain, especially their own.
- The intersection of users in the deployment watching the same
  presentities increases.

This federation type has traffic rates similar to the previous examples
but with different levels of association of the users.

In this environment, two or more large networks create a peering
relationship, allowing their users to subscribe to presence in the other
domains. Whereas the number of users in other deployment types ranges
from hundreds to several hundred thousand, these large networks host up
to hundreds of millions of users. Examples of these networks are large
wireless carriers and consumer instant messaging networks.

Common characteristics of this deployment are these:

- As users become accustomed to network
  boundaries disappearing, federated subscriptions become as common as
  subscriptions within the same domain.
- Individual users are highly
  likely to want to see presence of multiple presentities in the peer
  network.
- The intersection of users in the deployment watching the
  same presentities is high (that is, two or more users in network A are
  extremely likely to be watching the same user in network B).
- Status changes increase greatly due to typical observed consumer behavior.

The first table below provides the calculations without optimizations;
the second table provides the calculations with optimizations. Even
though the optimizations help a lot (cut the number of messages almost in
half), the numbers are still high. Note also that the bandwidth required is high.

Within a particular domain, multiple presence infrastructures are
deployed, with users split between them. This scenario is unique, in
that federated messages do not pass outside the administrative domain's
network. The two infrastructures peer directly inside the domain. A
common example of this is an enterprise IT system that has deployed multiple,
independent-vendor presence solutions (for example, a presence solution
for desktop messaging deployed alongside a presence solution for IP
telephony).

Common characteristics of this deployment are these:

- The differences between subscriptions to presentities in one system versus
  the other are completely arbitrary. Any one presentity is as likely to
  be homed on one infrastructure as on the other.
- Active users are almost guaranteed to subscribe to many users in the
  peer infrastructure.
- The level of intersection of presentities is extremely high.

The first table below provides the calculations without optimizations.
The second table provides the calculations with optimization. Although
relatively conservative numbers are used, the number of
messages is still high even though optimization may cut the
traffic by more than half.

RFC 5263 defines a way for the
watcher to request getting only what was changed in the presence document. The
following calculation shows the bandwidth saved in the large
peering network case when we add partial notification to the
dialog and NOTIFY optimizations. It is assumed that, except for the initial NOTIFY,
all other NOTIFY messages will be partial. It is also assumed that only a single attribute in the presence
document will be changed each time, thus the size of the partial presence document is assumed to be 200 bytes.

SIP is a network-agnostic protocol; therefore, the protocol carries
additional messages like "200 OK" that would be redundant in a
protocol that is TCP-based only.

The following calculation assumes an imaginary TCP-only version of SIP that optimizes the following:

- There is no "200 OK" for each message. Because only TCP would be supported, there is no need to compensate
  for issues arising with other transport protocols.
- There is no refresh for subscriptions.
- There is no NOTIFY upon termination of the subscription.
- The size of each message is smaller, because there is no need for the various header fields that SIP uses for routing,
  etc. So we need to assume smaller message sizes, while keeping the size of
  the presence document the same.

As noted above, the calculations in this document do not assume offline
means of receiving presence information. Therefore, in addition
to the above optimizations, the other optimizations that were assumed in
the document are assumed here also. These include partial
notifications and dialog optimizations. The NOTIFY optimization is not
relevant here, because there are no refreshes of subscriptions.

The following is a calculation for the large network peering scenario
assuming an imaginary TCP-only SIP. It is interesting to note that
the dialog optimization does not reduce the number of bytes when partial
notification optimization is applied (on the contrary), due to the RLMI overhead.

In previous sections, we discussed the large number of messages that need to
be sent to/from a presence server. In this section, we will analyze the state that needs to be
maintained by a presence server and will show it to be far from
trivial.

The presence server has two parallel tasks:

o  Maintain the state of the presentities to which watchers subscribe.

o  Maintain the state of the subscriptions of watchers and provide timely
   updates to the watchers.

For a single subscription from a single watcher on a presentity, the presence
server has to maintain the following state:

o  Subscription state, including all the parameters that are needed to
   maintain the subscription, such as timers.

o  Optional filtering information that was requested by the watcher, which
   includes information needed for filtering. If partial notification is supported for the subscription,
   additional information has to be maintained.

o  Optional rate management information, such as throttling.

o  Watcher information (, ) that
   is the result of the subscription, in order to allow watched presentities to know
   who is watching them.

For each presentity with subscriptions in the presence server, the presence server has to maintain the
following state:

o  A list of the subscriptions for the presentity. Note that the size calculation
   is already handled by the subscription state above.

o  Authorization information for the presentity.

For each presentity that has published a state other than a default value,
the presence server has to maintain the current value of the presentity's state.

Let's assume the following sizes:

o  Subscription size - 2K bytes. This includes watcher information that
   the presence server creates for each subscription. This is
   for every subscription created by a watcher to each presentity that
   the watcher is watching. So, if we have 10K watchers, we should have 10K of
   these.

o  Subscribed-to resource - 1K bytes for privacy information and other
   management information. This is for each watched presentity, regardless of the number of its watchers.
   The subscriptions themselves are
   already calculated in the previous bullet.

o  Resource with a state - 6K bytes. This is a moderate assumption if we
   consider the amount of data, including calendar and geographical information, placed in a presence
   document by multiple devices. This
   is for each presentity, watched or not, that has state other than the default empty state.
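
The size assumptions above can be combined into a single estimate per deployment. The following is a minimal sketch of that arithmetic, not part of the original analysis: the function and constant names are hypothetical, and reading "K" as 1024 and "M" as 1024 * 1024 bytes is an assumption. Results may differ slightly from the figures quoted in the text because of rounding.

```python
# Sketch of the state-size arithmetic. The per-item sizes are the
# assumptions listed in the text; treating "K" as 1024 bytes is our
# own assumption, as are all names used here.
KIB = 1024
MIB = 1024 * KIB

SUBSCRIPTION_BYTES = 2 * KIB          # per watcher-to-presentity subscription
SUBSCRIBED_RESOURCE_BYTES = 1 * KIB   # per watched presentity
RESOURCE_WITH_STATE_BYTES = 6 * KIB   # per presentity with non-default state


def state_size_bytes(subscriptions, subscribed_presentities,
                     presentities_with_state):
    """Return the estimated dynamic state, in bytes, for one deployment."""
    return (
        subscriptions * SUBSCRIPTION_BYTES
        + subscribed_presentities * SUBSCRIBED_RESOURCE_BYTES
        + presentities_with_state * RESOURCE_WITH_STATE_BYTES
    )


def to_mib(num_bytes):
    """Convert a byte count to units of 1024 * 1024 bytes."""
    return num_bytes / MIB


if __name__ == "__main__":
    # (subscriptions, subscribed-to presentities, presentities with state)
    scenarios = {
        "10K subscriptions": (10_000, 5_000, 10_000),
        "100K subscriptions": (100_000, 50_000, 100_000),
        "6M subscriptions": (6_000_000, 3_000_000, 4_000_000),
        "150M subscriptions": (150_000_000, 75_000_000, 100_000_000),
    }
    for name, counts in scenarios.items():
        total = to_mib(state_size_bytes(*counts))
        print(f"{name}: about {total:,.0f}M bytes of state")
```

Keeping the three per-item sizes as named constants makes it easy to rerun the estimate with less moderate assumptions, for example a larger presence document.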

o  10K subscriptions = 19M bytes.
o  5K subscribed-to presentities = 5M bytes.
o  10K presentities with state = 58M bytes.
o  Total is 82M bytes.

o  100K subscriptions = 195M bytes.
o  50K subscribed-to presentities = 49M bytes.
o  100K presentities with state = 586M bytes.
o  Total is 830M bytes.

o  6M subscriptions = 11,718M bytes.
o  3M subscribed-to presentities = 2,929M bytes.
o  4M presentities with state = 23,437M bytes.
o  Total is 38G bytes.

o  150M subscriptions = 292,969M bytes.
o  75M subscribed-to presentities = 73,242M bytes.
o  100M presentities with state = 585,937M bytes.
o  Total is 952G bytes, which is a large number for dynamic storage as needed by the presence server.

Although the numbers above may seem moderate enough for the sizes that the presence server is
handling, we should consider the following:

o  Dynamic state - Although the state may not seem so large for databases, even for
   the larger system, we need to remember that this state is dynamic.
   Subscriptions come and go all the time, the statuses of presentities are being
   updated, and so forth. This means that the presence server has to manage its
   state in a dynamic medium, and for such large sizes, this task is not
   trivial.

o  Interlinked state - The subscriptions and the subscribed-to presentities are
   dependent on each other. There needs to be a link from the presentity to the
   subscriptions and vice versa. See about the
   interlinkage that is created due to resource lists.

o  Moderate assumptions - The size assumptions that were made above are quite
   moderate. Presence is becoming more a core middleware functionality that holds
   much of the user's data. In real life, the numbers above may be even higher, and
   the presence server can have additional overhead such as managing the SIP sessions,
   networking, and more.

Although the above calculations do not show that there is a real issue with
state management of presence in medium systems or even in large systems, because
state could be divided among different machines, the state
size is still large. A bigger issue with the state involves resource lists,
which create an interlinked state between many servers. In that case, the division of large state to
multiple servers becomes less trivial.

The basic presence paradigm comprises a watcher and a presentity that the
watcher watches. It sounds simple enough, but the presence server has to manage many additions and extensions,
which makes processing complex.

In this section, we show that in addition to the large number of messages and the large state
that the presence server has to handle, it also has to handle quite intensive processing for aggregation,
partial notification and publish, filtering, and privacy. This adds complexity to the presence server on the CPU front in
addition to the network and memory fronts that were described before.

A presence document may contain multiple resources. These resources can be devices of
the presentity, information from external providers of presence information such as geographical location, calendars, and more.

The presence server needs to receive the updates from all the resources and
to aggregate them correctly into a single presence document. Although this is just an "XML processing" task,
the number of updates that the presence server may receive, the need to keep the presence document
aligned with its schema, and the need to notify the users as soon as possible create a significant
processing burden on the presence server.

Partial notification and partial publish
define a way for the watcher to request
notification only on what was changed in the presence document, and for the publisher of
presence information to publish only what was changed in the presence document
since the last publish. Although these optimizations help reduce the amount
of the data that is sent from/to the presence server, these optimizations
create additional processing burden on the presence server.

When a partial publish arrives at the presence server, the presence
server has to be able to process the partial publish and change only what is
indicated in the partial publish, while keeping the presence document well-formed
according to the schema.

In partial notification the processing is even more complex, because each
watcher needs to get the partial update based on the last update that was
received by that watcher. Therefore, the partial notification specification
defines a versioning mechanism that enables the watcher to get the
updates based on the previous state that it has seen. This versioning
mechanism has to be maintained by the presence server for each watcher that is
subscribed to a presentity and requires partial notification.

Filtering, as defined in RFCs and , enables a watcher to request to be notified only when the
presence document fulfills certain conditions. Although this is a
convenient feature for watchers, the burden put on the presence server
is quite large. For each change in the presence document, the presence server
needs to compute the filtering expressions, which can be complex, and to decide
whether and what to send to the watcher that has requested filtering.

RFC defines presence authorization rules
that allow presentities to specify what each watcher can see in their presence documents.
The processing that the presence server performs here is similar to filtering. When a
presence document with defined authorization rules changes, the presence server
creates different notifications for different watchers according to those rules.

RFC defines a way to subscribe to a single URI
that represents a list of resources that are subscribed to by a
single subscription. Although this quite useful mechanism
significantly reduces the number of sessions between the watcher and the
presence server (as we show in the calculations of messages), this feature has
the potential to complicate the scaling of presence systems.

The reasons that resource lists may make the scaling of the
presence server even more complex are these:

o  Subscriptions and state - The resource list may contain references to many
   other presence servers in many other domains. This requires the RLS to create
   subscriptions to other presence servers and buffer the state of all presentities
   so that it can provide the full state of the presentities in the list
   when needed. So in the overall system, the number of subscriptions reduced between
   the watcher and the presence server is moved to the backend system, while state
   is duplicated among the various presence servers that serve the various
   presentities and the RLSs. This issue could have been mitigated if there were a
   way for the RLS to retrieve the presence information for many watchers, while
   adhering to privacy when sending the actual notifications to the watchers.

o  Interlinkage - The resource list subscription will reach one RLS that will
   open it and send it to many presence servers and to other RLSs (if there is a
   subgroup inside the list). This creates a complex linkage among the states of many
   components. This linkage makes state management and other
   maintenance of presence systems quite complex.

o  Big lists are easy - There are two types of groups that may be used with this
   feature: private groups that are defined by/for each watcher, and public groups
   that are defined in the system and can be used by any watcher. Although we should
   expect IT administrators to be cautious when creating public groups, this may
   not be the case in real life. The connection between the size of the public
   group and the load on the presence system may not be apparent to everyone.
   Furthermore, many public groups may have been
   created for other purposes, such as email systems (where the size of the lists is not
   as important), and are now used in the presence systems. For example,
   a public group including all the users in
   the enterprise is used by many users in the enterprise, thus overwhelming
   the presence server. Note that this is not a protocol or design
   issue, but more a usage issue that may have a real impact on the presence
   system.

o  Stopping notifications - A watcher may accidentally subscribe to an extensive list
   and be overwhelmed by the number of NOTIFYs that it receives from the presence
   server. There is no current way to stop this stream of NOTIFYs, and even
   canceling the subscription may take time before being effective.

These issues show how an optimization can help in
one part of the system, but create even bigger problems in the overall system.
There is a need to think about the problems listed above, but, more than that,
there is a need to make sure that introducing an optimization does not
create issues in other places.

This section highlights several optimizations that either are
already part of SIP or have been suggested in various drafts.
Several other optimizations that have been suggested but have not been discussed
in any working group yet are summarized in
and in . Note that
trials with the batched NOTIFYs optimization, described in ,
showed an improvement
of 117% in the overall throughput of presence traffic.

o  Subnot-etags - . This draft
   suggests ways to suppress the sending of unnecessary NOTIFYs when, for example, a
   subscription is refreshed. This suggestion seems to be an efficient
   optimization, because it reduces both the number of messages sent and the
   processing time of the presence server.

o  Resource List Service - enables creating a single subscription
   session between the watcher and the presence server for subscribing to a list of users.
   This reduces the number of sessions that are created between watchers and presence servers.
   However, this mechanism also makes it possible for relatively few clients to create
   a large number of subscriptions between presence servers and RLSs, especially if large public groups
   are used. It seems that, in order to really optimize in this area, the usage of large public
   groups should not be considered a BCP, and there should be a way for an RLS to create a single
   subscription for multiple occurrences of the same resource in resource lists. See consolidated
   subscriptions below.

o  Partial notify/publish - and
   define a way for the
   subscriber to request getting only what was changed in the presence document, and
   for the publisher of presence information to publish only what was changed in
   the presence document since the last publish. Although these optimizations
   reduce the amount of actual data sent from/to the presence server,
   these optimizations create additional processing burden on the presence
   server, as was discussed above.

o  Filtering, as defined in and , enables a watcher to request to be notified only when the
   presence document fulfills certain conditions. Although this optimization
   reduces the number of messages that are sent from the presence server to the watcher, this
   optimization burdens the processing time of the presence server, as
   was discussed above.

o  Throttling - defines a
   mechanism by which a watcher requests updates only at certain intervals.
   Although this mechanism may add some extra load on the
   presence server, that load is negligible and the reduction of the number of
   messages sent from the presence server to the watchers is significant. This
   optimization is even more important with resource lists, which can contain many
   resources, because the watcher may receive a large number of notifications
   if the traffic caused by updates on a resource list is not regulated.

o  Presence-specific SigComp dictionary -
   defines a SigComp dictionary for
   presence. This optimization reduces the number of bytes that are
   transferred in presence systems by compressing the textual SIP messages. By using
   the specialized presence dictionary, the compression may be more significant than
   just using SigComp as-is. Note that the number of actual messages will remain the
   same, and a calculation of the number of bytes that will be saved may be
   useful here.

o  Content Indirection - enables the sending of only the URI of the
   presence document to the watcher, thus relieving the presence server from sending the entire
   presence document to the watcher. This optimization may be useful in some cases, especially when
   a large number of users receive the same presence document.

The following summary of the various calculations is provided here
to help support the conclusions listed below.

The following table summarizes the constants that are used in all calculations:

The following table summarizes the results of various optimization factors for the basic use case.

The following table summarizes the results of various optimization factors
for the widely distributed inter-domain use case.

The following table summarizes the results of various optimization factors
for the intra-domain peering use case.

The following table summarizes the results of various optimization factors for the large-scale peering networks use case.

The following conclusions can be drawn from the above numbers:

o  Due to the overhead of RLMI, the dialog optimization reduces neither the number of
   bytes nor the number of messages. It seems to be more important from the
   point of view of the user, because it enables convenient management of
   his/her watch list on, for example, a web page.

o  The notify optimization significantly reduces both the number of messages and the number
   of bytes.

o  Partial notification saves a large number of bytes, especially when the
   presence document is a rich presence document, which is relatively large.

o  Extremely optimized SIP (imaginary TCP-only SIP) cuts the
   number of messages by about half. The number of bytes is also reduced
   by about half.

From the perspective of the number of bytes that
a user "consumes" per day, the numbers may not look so large. Nevertheless, we
should remember that the overall effect on the network may be quite large, because
the network will have to convey dozens of gigabytes of presence traffic per day for the modest
use cases that are described in this document. Considering
that presence is only an enabler for other media, these numbers are not so easy
to handle.

This document analyzes the scalability of presence systems in general and of SIP-based presence systems
in particular. It is apparent that the scalability of these systems is far from
being trivial from several perspectives: number of messages, network bandwidth,
state management, and CPU load.

As part of the analysis, we assessed several optimizations and showed the
effect of these optimizations on the number of messages and the number of bytes
that are sent between the federating domains.

We have also computed the number of messages and bytes for a large-scale
peering network while assuming a protocol that has much less overhead than SIP.
Even with that protocol, we calculated relatively high numbers.

It is possible that the issues described in this document are
inherent to presence systems in general and not specific to the SIMPLE protocol.
Organizations need to be prepared to invest a lot in network and hardware in
order to create really large systems. However, it is apparent that not all the
possible optimizations have been done yet, and further work is needed in the IETF in
order to provide better scalability.

Nevertheless, we should remember that SIP was originally designed for end-to-end session creation, and that the number and size of messages are of secondary importance
for end-to-end session negotiation. For large scale and especially for
large scale presence, the number of messages that are needed and the size of each
message are of extreme importance. It seems that we need to think about the
problem in a different way. We need to think about scalability as part of the
protocol design. The IETF tends not to think about actual deployments when
designing a protocol, but in this case it seems that if we do not think about
scalability as part of the protocol design, it will be hard to scale.

We should also consider whether using the same protocol between clients and
servers and between servers is a good choice for this problem. It may be that
in interdomain or even between servers in the same domain (as between RLSs and
presence servers) there is a need to have a different protocol that will be extremely
optimized for the load and that can make some assumptions about the network (for example,
use only TCP, and not an unreliable protocol such as UDP).

When a server connects to another server using the current protocol, there
will be an extremely large number of redundant messages due to the overhead of
supporting UDP and the need to send multiple presence documents for the same
watched user due to privacy issues. A server-to-server protocol will have to
address these issues. Some initial work to address these issues can be found in
, and

Another issue that concerns protocol design is whether NOTIFY
messages should be considered as media, the way audio, video and even text
messaging are considered. The SUBSCRIBE method can be extended to create a three-way
handshake similar to INVITE, and negotiate where the NOTIFY messages should go, rate, and
other parameters. This way, the load can be shifted to specialized NOTIFY
"relays", and taken off the control path of SIP. One of the possible ideas
(suggested by Marc Willekens) is to use the SIP stack for the client/server NOTIFY, but make
use of a more optimized and controllable protocol for the server-to-server
interface. Another possibility is to use the MSRP (,
) protocol for the NOTIFYs.

This document discusses scalability issues with the existing SIP/SIMPLE
presence protocol and model. Therefore, there are no security considerations
for this document itself. However, many of the possible
optimizations that should emerge as a result of this document will have
security implications that will need to be solved.

This document has no actions for IANA.

o  Added clarifications, fixed typos and language usage issues raised during the IETF last call.

o  Updated to conform with new IETF IPR boilerplate and updated references.

o  Fixed mistakes in calculations that were found by Victoria Beltran-Martinez;
   both relate to dialog optimizations. One mistake was not including the multipart
   boundary of the resource list itself in S03 when dialog optimizations were used.
   The other one was assuming in the T07 calculation that only a single presentity is returned in
   termination.

o  Fixed nits that were referred to me by Robert Sparks.

o  Fixed mistake in the formula of I07 and S08 (RLMI was not included).
   Effect on total number of bytes was infinitesimal.

o  Fixed mistake in the text of the calculation of number of bytes for S08
   for non dialog optimization. No actual change in number of bytes, because the
   Excel file calculations were done correctly.

o  Removed general references throughout the text to "other protocols".
   This was done in order to avoid the impression that the document tries
   to compare SIP protocol with any other presence base protocol.

o  Several other editorial and clarification changes.

o  Added some input from real life deployments and input on a test with batched notifies.

o  Added Calculations of messages and bytes per user.

o  Calculations are now done both for minimal size of presence document and for an average size of rich presence document.

o  Comparison with other protocol is now done using small, tiny and rich presence document sizes.

o  Removed dialog optimization with partial notification, because it is not relevant.

o  Fixed a few issues in calculations that were found by Victoria Beltran-Martinez:

o  Added overhead for RLMI for dialog optimizations (list subscription). This
   calculation fix actually shows that dialog optimization is not a real
   optimization from the point of view of bytes and number of messages.

o  When NOTIFY optimizations are applied no need for final NOTIFY.

o  The usage of RLS between domains was clarified.

o  Significantly enhanced the conclusions section.

o  Several typo fixes.

o  Fixed a bug in the calculations. Thanks to Marc Willekens for finding the bug.

o  Clarifications and corrections of the computation model and the computations.

o  Added several more computations to show the influence of different optimizations.

o  The requirements were moved to

o  The new suggestions for optimizations were moved to

We would like to thank Jonathan Rosenberg, Ben Campbell, Robert Sparks,
Markus Isomaki, Piotr Boni, David Viamonte, Aki Niemi and Peter Saint-Andre
for ideas and input. Special thanks to Marc Willekens and Victoria Beltran-Martinez
for finding several issues in the calculations. Additional special
thanks to A. Jean Mahoney, Joel M. Halpern and Barry Leiba for their dedicated review as
part of the IETF last call.