RFC 4585: Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF)

Posted anonymously on 2019-04-17 under Essays

Network Working Group                                           J. Ott
Request for Comments: 4585           Helsinki University of Technology
Category: Standards Track                                    S. Wenger
                                                                 Nokia
                                                               N. Sato
                                                                   Oki
                                                         C. Burmeister
                                                                J. Rey
                                                            Matsushita
                                                             July 2006

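Extended RTP Profile for Real-time Transport Control Protocol
(RTCP)-Based Feedback (RTP/AVPF)

Status of This Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Abstract

Real-time media streams that use RTP are, to some degree, resilient
against packet losses. Receivers may use the base mechanisms of the
Real-time Transport Control Protocol (RTCP) to report packet
reception statistics and thus allow a sender to adapt its
transmission behavior in the mid-term. This is the sole means for
feedback and feedback-based error repair (besides a few
codec-specific mechanisms). This document defines an extension to
the Audio-visual Profile (AVP) that enables receivers to provide,
statistically, more immediate feedback to the senders and thus allows
for short-term adaptation and efficient feedback-based repair
mechanisms to be implemented. This early feedback profile (AVPF)
maintains the AVP bandwidth constraints for RTCP and preserves
scalability to large groups.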

Ott, et al. Standards Track [Page 1]

RFC 4585 RTP/AVPF July 2006

Table of Contents

1. Introduction
   1.1. Definitions
   1.2. Terminology
2. RTP and RTCP Packet Formats and Protocol Behavior
   2.1. RTP
   2.2. Underlying Transport Protocols
3. Rules for RTCP Feedback
   3.1. Compound RTCP Feedback Packets
   3.2. Algorithm Outline
   3.3. Modes of Operation
   3.4. Definitions and Algorithm Overview
   3.5. AVPF RTCP Scheduling Algorithm
      3.5.1. Initialization
      3.5.2. Early Feedback Transmission
      3.5.3. Regular RTCP Transmission
      3.5.4. Other Considerations
   3.6. Considerations on the Group Size
      3.6.1. ACK Mode
      3.6.2. NACK Mode
   3.7. Summary of Decision Steps
      3.7.1. General Hints
      3.7.2. Media Session Attributes
4. SDP Definitions
   4.1. Profile Identification
   4.2. RTCP Feedback Capability Attribute
   4.3. RTCP Bandwidth Modifiers
   4.4. Examples
5. Interworking and Coexistence of AVP and AVPF Entities
6. Format of RTCP Feedback Messages
   6.1. Common Packet Format for Feedback Messages
   6.2. Transport Layer Feedback Messages
      6.2.1. Generic NACK
   6.3. Payload-Specific Feedback Messages
      6.3.1. Picture Loss Indication (PLI)
      6.3.2. Slice Loss Indication (SLI)
      6.3.3. Reference Picture Selection Indication (RPSI)
   6.4. Application Layer Feedback Messages
7. Early Feedback and Congestion Control
8. Security Considerations
9. IANA Considerations
10. Acknowledgements
11. References
   11.1. Normative References
   11.2. Informative References

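1. Introduction

Real-time media streams that use RTP are, to some degree, resilient
against packet losses. RTP [1] provides all the necessary mechanisms
to restore the ordering and timing present at the sender so that a
media stream can be properly reproduced at a recipient. RTP also
provides continuous feedback about the overall reception quality from
all receivers, thereby allowing the sender(s) in the mid-term (in the
order of several seconds to minutes) to adapt their coding scheme and
transmission behavior to the observed network quality of service
(QoS). However, except for a few payload-specific mechanisms [6],
RTP makes no provision for timely feedback that would allow a sender
to repair the media stream immediately: through retransmissions,
retroactive Forward Error Correction (FEC) control, or media-specific
mechanisms for some video codecs, such as reference picture
selection.

Current mechanisms available with RTP to improve error resilience
include audio redundancy coding [13], video redundancy coding [14],
RTP-level FEC [11], and general considerations on more robust media
stream transmission [12]. These mechanisms may be applied
proactively (thereby increasing the bandwidth of a given media
stream). Alternatively, in sufficiently small groups with small
round-trip times (RTTs), the senders may perform repair on demand,
using the above mechanisms and/or media-encoding-specific approaches.
Note that "small group" and "sufficiently small RTT" are both highly
application dependent.

This document specifies a modified RTP profile for audio and video
conferences with minimal control based upon [1] and [2] by means of
two modifications/additions: Firstly, to achieve timely feedback, the
concept of Early RTCP messages as well as algorithms allowing for
low-delay feedback in small multicast groups (and preventing feedback
implosion in large ones) are introduced. Special consideration is
given to point-to-point scenarios. Secondly, a small number of
general-purpose feedback messages as well as a format for codec- and
application-specific feedback information are defined for
transmission in the RTCP payloads.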

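1.1. Definitions

The definitions from RTP/RTCP [1] and the "RTP Profile for Audio and
Video Conferences with Minimal Control" [2] apply. In addition, the
following definitions are used in this document: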

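Early RTCP mode:
The mode of operation in which a receiver of a media stream is often
(but not always) capable of reporting events of interest back to the
sender close to their occurrence. In Early RTCP mode, RTCP packets
are transmitted according to the timing rules defined in this
document.

Early RTCP packet:
An Early RTCP packet is a packet that is transmitted earlier than
would be allowed if following the scheduling algorithm of [1], the
reason being an "event" observed by a receiver. Early RTCP packets
may be sent in Immediate Feedback and in Early RTCP mode. Sending an
Early RTCP packet is also referred to as sending Early Feedback in
this document.

Event:
An observation made by the receiver of a media stream that is
(potentially) of interest to the sender -- such as a packet loss or
packet reception, frame loss, etc. -- and thus useful to be reported
back to the sender by means of a feedback message.

Feedback (FB) message:
An RTCP message as defined in this document is used to convey
information about events observed at a receiver -- in addition to the
long-term receiver status information that is carried in RTCP
receiver reports (RRs) -- back to the sender of the media stream.
For the sake of clarity, feedback message is referred to as FB
message throughout this document.

Feedback (FB) threshold:
The FB threshold indicates the transition between Immediate Feedback
and Early RTCP mode. For a multiparty scenario, the FB threshold
indicates the maximum group size at which, on average, each receiver
is able to report each event back to the sender(s) immediately, i.e.,
by means of an Early RTCP packet without having to wait for its
regularly scheduled RTCP interval. This threshold is highly
dependent on the type of feedback to be provided, network QoS (e.g.,
packet loss probability and distribution), codec and packetization
scheme in use, the session bandwidth, and application requirements.
Note that the algorithms do not depend on all senders and receivers
agreeing on the same value for this threshold. It is merely intended
to provide conceptual guidance to application designers and is not
used in any calculations. For the sake of clarity, the term feedback
threshold is referred to as FB threshold throughout this document.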

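Immediate Feedback mode:
A mode of operation in which each receiver of a media stream is,
statistically, capable of reporting each event of interest
immediately back to the media stream sender. In Immediate Feedback
mode, RTCP FB messages are transmitted according to the timing rules
defined in this document.

Media packet:
A media packet is an RTP packet.

Regular RTCP mode:
Mode of operation in which no preferred transmission of FB messages
is allowed. Instead, RTCP messages are sent following the rules of
[1]. Nevertheless, such RTCP messages may contain feedback
information as defined in this document.

Regular RTCP packet:
An RTCP packet that is not sent as an Early RTCP packet.

RTP sender:
An RTP sender is an RTP entity that transmits media packets as well
as RTCP packets and receives Regular as well as Early RTCP (i.e.,
feedback) packets. Note that the RTP sender is a logical role and
that the same RTP entity may at the same time act as an RTP receiver.

RTP receiver:
An RTP receiver is an RTP entity that receives media packets as well
as RTCP packets and transmits Regular as well as Early RTCP (i.e.,
feedback) packets. Note that the RTP receiver is a logical role and
that the same RTP entity may at the same time act as an RTP sender.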

1.2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [5].

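2. RTP and RTCP Packet Formats and Protocol Behavior

2.1. RTP

The rules defined in [2] also apply to this profile except for those
rules mentioned in the following:

RTCP packet types:
Two additional RTCP packet types are registered and the corresponding
FB messages to convey feedback information are defined in Section 6
of this memo.

RTCP report intervals:
This document describes three modes of operation that influence the
RTCP report intervals (see Section 3.2 of this memo). In Regular
RTCP mode, all rules from [1] apply except for the recommended
minimal interval of five seconds between two RTCP reports from the
same RTP entity. In both Immediate Feedback and Early RTCP modes,
the minimal interval of five seconds between two RTCP reports is
dropped and, additionally, the rules specified in Section 3 of this
memo apply if RTCP packets carrying FB messages (as defined in
Section 4 of this memo) are transmitted.

The rules set forth in [1] may be overridden by session descriptions
specifying different parameters (e.g., for the bandwidth share
assigned to RTCP for senders and receivers, respectively). For
sessions defined using the Session Description Protocol (SDP) [3],
the rules of [4] apply.

Congestion control:
The same basic rules as detailed in [2] apply. Beyond this, in
Section 7, further consideration is given to the impact of feedback
and a sender's reaction to FB messages.

2.2. Underlying Transport Protocols

RTP is intended to be used on top of unreliable transport protocols,
including UDP and the Datagram Congestion Control Protocol (DCCP).
This section briefly describes the specifics beyond plain RTP
operation introduced by RTCP feedback as specified in this memo.

UDP: UDP provides best-effort delivery of datagrams for
point-to-point as well as for multicast communications. UDP does not
support congestion control or error repair. The RTCP-based feedback
defined in this memo is able to provide minimal support for limited
error repair. As RTCP feedback is not guaranteed to operate on
sufficiently small timescales (in the order of RTT), RTCP feedback is
not suitable to support congestion control. This memo addresses both
unicast and multicast operation.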

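DCCP: DCCP [19] provides a congestion-controlled but unreliable
datagram stream for unicast communications. With TCP Friendly Rate
Control (TFRC)-based [20] congestion control (CCID 3), DCCP is
particularly suitable for audio and video communications. DCCP's
acknowledgement messages may provide detailed feedback reporting
about received and missed datagrams (and thus about congestion).

When running RTP over DCCP, congestion control is performed at the
DCCP layer and no additional mechanisms are required at the RTP
layer. Furthermore, an RTCP-feedback-capable sender may leverage the
more frequent DCCP-based feedback, and thus a receiver may refrain
from using (additional) Generic Feedback messages where appropriate.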

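3. Rules for RTCP Feedback

3.1. Compound RTCP Feedback Packets

Two components constitute RTCP-based feedback as described in this
document:

o Status reports are contained in sender report (SR)/receiver report
(RR) packets and are transmitted at regular intervals as part of
compound RTCP packets (which also include source description (SDES)
and possibly other messages); these status reports provide an overall
indication of the recent reception quality of a media stream.

o FB messages as defined in this document indicate loss or reception
of particular pieces of a media stream (or provide some other form of
rather immediate feedback on the data received). Rules for the
transmission of FB messages are newly introduced in this document.

RTCP FB messages are just another RTCP packet type (see Section 4).
Therefore, multiple FB messages MAY be combined in a single compound
RTCP packet and they MAY also be sent combined with other RTCP
packets.

Compound RTCP packets containing FB messages as defined in this
document MUST contain RTCP packets in the order as defined in [1]:

o OPTIONAL encryption prefix that MUST be present if the RTCP
packet(s) is to be encrypted according to Section 9.1 of [1].
o MANDATORY SR or RR.
o MANDATORY SDES, which MUST contain the CNAME item; all other SDES
items are OPTIONAL.
o One or more FB messages.

The FB message(s) MUST be placed in the compound packet after the RR
and SDES RTCP packets defined in [1]. The ordering with respect to
other RTCP extensions is not defined.

Two types of compound RTCP packets carrying feedback packets are used
in this document:

a) Minimal compound RTCP feedback packet

A minimal compound RTCP feedback packet MUST contain only the
mandatory information as listed above: encryption prefix if
necessary, exactly one RR or SR, exactly one SDES with only the CNAME
item present, and the FB message(s). This is to minimize the size of
the RTCP packet transmitted to convey feedback and thus to maximize
the frequency at which feedback can be provided while still adhering
to the RTCP bandwidth limitations.

This packet format SHOULD be used whenever an RTCP FB message is sent
as part of an Early RTCP packet. This packet type is referred to as
minimal compound RTCP packet in this document.

b) (Full) compound RTCP feedback packet

A (full) compound RTCP feedback packet MAY contain any additional
number of RTCP packets (additional RRs, further SDES items, etc.).
The above ordering rules MUST be adhered to.

This packet format MUST be used whenever an RTCP FB message is sent
as part of a Regular RTCP packet or in Regular RTCP mode. It MAY
also be used to send RTCP FB messages in Immediate Feedback or Early
RTCP mode. This packet type is referred to as full compound RTCP
packet in this document.

RTCP packets that do not contain FB messages are referred to as
non-FB RTCP packets. Such packets MUST follow the format rules in
[1].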
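
The ordering rules of this section can be sketched as a small
validity check. This is an illustration only, not part of the
specification: packets are modeled as a list of type labels, and the
function names and the labels "PREFIX" and "FB" are hypothetical
names introduced here, not wire-format values.

```python
def is_valid_fb_compound(packets, encrypted=False):
    """Check the packet order of a compound RTCP packet carrying FB messages.

    `packets` is a list of hypothetical type labels, for example
    ["RR", "SDES", "FB"]; a real implementation would parse the packets first.
    """
    items = list(packets)
    # The encryption prefix must be present when the packet is encrypted.
    if encrypted and (not items or items[0] != "PREFIX"):
        return False
    if items and items[0] == "PREFIX":
        items = items[1:]
    # The first RTCP packet must be an SR or an RR.
    if not items or items[0] not in ("SR", "RR"):
        return False
    # An SDES packet and at least one FB message are mandatory.
    if "SDES" not in items or "FB" not in items:
        return False
    # Every FB message must come after the SR/RR and SDES packets.
    first_fb = items.index("FB")
    last_report = max(i for i, p in enumerate(items) if p in ("SR", "RR", "SDES"))
    return first_fb > last_report


def is_minimal_fb_compound(packets, encrypted=False):
    """Check for exactly one SR/RR, exactly one SDES, and FB messages only."""
    if not is_valid_fb_compound(packets, encrypted):
        return False
    body = [p for p in packets if p != "PREFIX"]
    reports = [p for p in body if p in ("SR", "RR")]
    only_known = all(p in ("SR", "RR", "SDES", "FB") for p in body)
    return len(reports) == 1 and body.count("SDES") == 1 and only_known
```

The sketch does not inspect SDES contents, so the requirement that
SDES carries the CNAME item is left to the caller.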

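3.2. Algorithm Outline

FB messages are part of the RTCP control streams and are thus subject
to the RTCP bandwidth constraints. This means, in particular, that
it may not be possible to report an event observed at a receiver
immediately back to the sender. However, the value of feedback given
to a sender typically decreases over time -- in terms of the media
quality as perceived by the user at the receiving end and/or the cost
required to achieve media stream repair.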

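RTP [1] and the commonly used RTP profile [2] specify rules for when
compound RTCP packets should be sent. This document modifies those
rules in order to allow applications to report events in a timely
manner (e.g., loss or reception of RTP packets) and to accommodate
algorithms that use FB messages.

The modified RTCP transmission algorithm can be outlined as follows:
As long as no FB messages have to be conveyed, compound RTCP packets
are sent following the rules of RTP [1] -- except that the
five-second minimum interval between RTCP reports is not enforced.
Hence, the interval between RTCP reports is only derived from the
average RTCP packet size and the RTCP bandwidth share available to
the RTP/RTCP entity. Optionally, a minimum interval between Regular
RTCP packets may be enforced.

If a receiver detects the need to send an FB message, it may do so
earlier than the next regular RTCP reporting interval (for which it
would be scheduled following the above regular RTCP algorithm).
Feedback suppression is used to avoid feedback implosion in
multiparty sessions: The receiver waits for a (short) random
dithering interval to check whether it sees a corresponding FB
message from any other receiver reporting the same event. Note that
for point-to-point sessions there is no such delay. If a
corresponding FB message from another member is received, this
receiver refrains from sending the FB message and continues to follow
the Regular RTCP transmission schedule. In case the receiver has not
yet seen a corresponding FB message from any other member, it checks
whether it is allowed to send Early feedback. If sending Early
feedback is permissible, the receiver sends the FB message as part of
a minimal compound RTCP packet. The permission to send Early
feedback depends on the type of the previous RTCP packet sent by this
receiver and the time the previous Early feedback message was sent.

FB messages may also be sent as part of full compound RTCP packets,
which are transmitted as per [1] (except for the five-second lower
bound) in regular intervals.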

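3.3. Modes of Operation

RTCP-based feedback may operate in one of three modes (Figure 1) as
described below. The mode of operation is just an indication of
whether or not the receiver will, on average, be able to report all
events to the sender in a timely fashion; the mode does not influence
the algorithm used for scheduling the transmission of FB messages.
And, depending on the reception quality and the locally monitored
state of the RTP session, individual receivers may not (and do not
have to) agree on a common perception of the current mode of
operation.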

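a) Immediate Feedback mode: In this mode, the group size is below the
FB threshold, which gives each receiving party sufficient bandwidth
to transmit the RTCP feedback packets for the intended purpose. This
means that, for each receiver, there is enough bandwidth to report
each event by means of a virtually "immediate" RTCP feedback packet.

The group size threshold is a function of a number of parameters
including (but not necessarily limited to): the type of feedback used
(e.g., ACK vs. NACK), bandwidth, packet rate, packet loss probability
and distribution, media type, codec, and the (worst case or observed)
frequency of events to report (e.g., frame received, packet lost).

As a rough estimate, let N be the average number of events to be
reported per interval T by a receiver, B the RTCP bandwidth fraction
for this particular receiver, and R the average RTCP packet size,
then the receiver operates in Immediate Feedback mode as long as
N<=B*T/R.

b) Early RTCP mode: In this mode, the group size and other parameters
no longer allow each receiver to react to each event that would be
worth reporting (or that needed reporting). But feedback can still
be given sufficiently often so that it allows the sender to adapt the
media stream transmission accordingly and thereby increase the
overall media playback quality.

Using the above notation, Early RTCP mode can be roughly
characterized by N > B*T/R as "lower bound". An estimate for an
upper bound is more difficult. Setting N=1, we obtain for a given R
and B the interval T = R/B as average interval between events to be
reported. This information can be used as a hint to determine
whether or not early transmission of RTCP packets is useful.

c) Regular RTCP mode: From some group size upwards, it is no longer
useful to provide feedback for individual events from receivers at
all -- because of the time scale in which the feedback could be
provided and/or because in large groups the sender(s) have no chance
to react to individual feedback anymore.

No precise group size threshold can be specified at which this mode
starts but, obviously, this boundary matches the upper bound of the
Early RTCP mode as specified in item b) above.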

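As the feedback algorithm described in this document scales smoothly,
there is no need for an agreement among the participants on the
precise values of the respective FB thresholds within the group.
Hence, the borders between all these modes are soft.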
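
The rough estimates N<=B*T/R and T = R/B from items a) and b) can be
tried out numerically. The sketch below is illustrative only; the
function names are hypothetical, and the choice of units (B in bit/s,
T in seconds, R in bits) is an assumption made here so that B*T/R is
a plain packet count.

```python
def can_report_immediately(n_events, rtcp_bandwidth_bps, interval_s, avg_rtcp_bits):
    """Return True when N <= B*T/R, the rough Immediate Feedback condition."""
    return n_events <= rtcp_bandwidth_bps * interval_s / avg_rtcp_bits


def avg_report_interval(avg_rtcp_bits, rtcp_bandwidth_bps):
    """Return T = R/B, the average spacing between reports for N = 1."""
    return avg_rtcp_bits / rtcp_bandwidth_bps


# A receiver with 1,600 bit/s of RTCP bandwidth and 768-bit RTCP packets
# can report two events per second, but not three.
two_ok = can_report_immediately(2, 1600, 1.0, 768)
three_ok = can_report_immediately(3, 1600, 1.0, 768)
```

Because the mode borders are soft, such a check is only a hint for
application designers and does not change the scheduling algorithm.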

    ACK
  feedback
     V
     :<- - - -  NACK feedback - - - ->//
     :
     :   Immediate   ||
     : Feedback mode ||Early RTCP mode   Regular RTCP mode
     :<=============>||<=============>//<=================>
     :               ||
    -+---------------||---------------//------------------> group size
     2               ||
      Application-specific FB Threshold
         = f(data rate, packet loss, codec, ...)

Figure 1: Modes of operation

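As stated before, the respective FB thresholds depend on a number of
technical parameters (of the codec, the transport, the type of
feedback used, etc.) but also on the respective application
scenarios. Section 3.6 provides some useful hints (but no precise
calculations) on estimating these thresholds.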

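3.4. Definitions and Algorithm Overview

The following pieces of state information need to be maintained per
receiver (largely taken from [1]). Note that all variables (except
in item h) below) are calculated independently at each receiver.
Therefore, their local values may differ at any given point in time.

a) Let "senders" be the number of active senders in the RTP session.

b) Let "members" be the current estimate of the number of receivers
in the RTP session.

c) Let tn and tp be the time for the next (last) scheduled RTCP RR
transmission calculated prior to timer reconsideration.

d) Let Tmin be the minimal interval between RTCP packets as per [1].
Unlike in [1], the initial Tmin is set to 1 second to allow for some
group size sampling before sending the first RTCP packet. After the
first RTCP packet is sent, Tmin is set to 0.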

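e) Let T_rr be the interval after which, having just sent a regularly
scheduled RTCP packet, a receiver would schedule the transmission of
its next Regular RTCP packet. This value is obtained following the
rules of [1] but with Tmin as defined in this document: T_rr = T (the
"calculated interval" as defined in [1]) with tn = tp + T. T_rr
always refers to the last value of T that has been computed (because
of reconsideration or to determine tn). T_rr is also referred to as
Regular RTCP interval in this document.

f) Let t0 be the time at which an event that is to be reported is
detected by a receiver.

g) Let T_dither_max be the maximum interval for which an RTCP
feedback packet MAY be additionally delayed to prevent implosions in
multiparty sessions; the value for T_dither_max is dynamically
calculated based upon T_rr (or may be derived by means of another
mechanism common across all RTP receivers to be specified in the
future). For point-to-point sessions (i.e., sessions with exactly
two members with no change in the group size expected, e.g., unicast
streaming sessions), T_dither_max is set to 0.

h) Let T_max_fb_delay be the upper bound within which feedback to an
event needs to be reported back to the sender to be useful at all.
This value is application specific, and no values are defined in this
document.

i) Let te be the time for which a feedback packet is scheduled.

j) Let T_fd be the actual (randomized) delay for the transmission of
an FB message in response to an event at time t0.

k) Let allow_early be a Boolean variable that indicates whether the
receiver currently may transmit FB messages prior to its next
regularly scheduled RTCP interval tn. This variable is used to
throttle the feedback sent by a single receiver. allow_early is set
to FALSE after Early feedback transmission and is set to TRUE as soon
as the next Regular RTCP transmission takes place.

l) Let avg_rtcp_size be the moving average of the RTCP packet size as
defined in [1].

m) Let T_rr_interval be an OPTIONAL minimal interval to be used
between Regular RTCP packets.
If T_rr_interval == 0, then this variable does not have any impact on
the overall operation of the RTCP feedback algorithm.
If T_rr_interval != 0, then the next Regular RTCP packet will not be
scheduled T_rr after the last Regular RTCP transmission (i.e., at
tp + T_rr). Instead, the next Regular RTCP packet will be delayed
until at least T_rr_interval after the last Regular RTCP
transmission, i.e., it will be scheduled at or later than
tp + T_rr_interval. Note that T_rr_interval does not affect the
calculation of T_rr and tp; instead, Regular RTCP packets scheduled
for transmission before tp + T_rr_interval will be suppressed if, for
example, they do not contain any FB messages. T_rr_interval does not
affect transmission scheduling of Early RTCP packets.

Note: Providing T_rr_interval as an independent variable is meant to
minimize Regular RTCP feedback (and thus bandwidth consumption) as
needed by the application while additionally allowing the use of more
frequent Early RTCP packets to provide timely feedback. This goal
could not be achieved by reducing the overall RTCP bandwidth, as RTCP
bandwidth reduction would also impact the frequency of Early
feedback.

n) Let t_rr_last be the point in time at which the last Regular RTCP
packet has been scheduled and sent, i.e., has not been suppressed due
to T_rr_interval.

o) Let T_retention be the time window for which past FB messages are
stored by an AVPF entity. This is to ensure that feedback
suppression also works for entities that have received FB messages
from other entities prior to noticing the feedback event itself.
T_retention MUST be set to at least 2 seconds.

p) Let M*Td be the timeout value for a receiver to be considered
inactive (as defined in [1]).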
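
The per-receiver state listed in items a) through p) can be grouped
into one record. The sketch below is only one possible layout: the
class name, field names, and all default values are assumptions
introduced for illustration, except for the 2-second lower bound on
T_retention, the initial allow_early of TRUE, and the initial
t_rr_last of NaN, which the document states.

```python
import math
from dataclasses import dataclass


@dataclass
class AvpfReceiverState:
    """Hypothetical container for the state variables of Section 3.4."""
    senders: int = 0               # a) active senders in the session
    members: int = 2               # b) estimated number of receivers
    tp: float = 0.0                # c) last scheduled RTCP RR transmission time
    tn: float = 0.0                # c) next scheduled RTCP RR transmission time
    t_min: float = 1.0             # d) initial Tmin in seconds (multiparty)
    t_rr: float = 0.0              # e) Regular RTCP interval
    t_dither_max: float = 0.0      # g) maximum additional dithering delay
    t_max_fb_delay: float = 0.0    # h) application specific, placeholder value
    allow_early: bool = True       # k) Early feedback currently permitted
    avg_rtcp_size: float = 0.0     # l) moving average of RTCP packet size
    t_rr_interval: float = 0.0     # m) optional minimum Regular RTCP spacing
    t_rr_last: float = math.nan    # n) time of last Regular RTCP packet sent
    t_retention: float = 2.0       # o) FB message retention window in seconds

    def __post_init__(self):
        # T_retention MUST be set to at least 2 seconds.
        if self.t_retention < 2.0:
            raise ValueError("T_retention must be at least 2 seconds")
```

Event-specific values such as t0, te, and T_fd are left out because
they belong to a single feedback event rather than to the receiver.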

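The feedback situation for an event to report at a receiver is
depicted in Figure 2. At time t0, such an event (e.g., a packet
loss) is detected at the receiver. The receiver decides -- based
upon current bandwidth, group size, and other application-specific
parameters -- that an FB message needs to be sent back to the sender.

To avoid an implosion of feedback packets in multiparty sessions, the
receiver MUST delay the transmission of the RTCP feedback packet by a
random amount of time T_fd (with the random number evenly distributed
in the interval [0, T_dither_max]). Transmission of the compound
RTCP packet MUST then be scheduled for te = t0 + T_fd.

The T_dither_max parameter is derived from the Regular RTCP interval,
T_rr, which, in turn, is based upon the group size. A future
document may also specify other calculations for T_dither_max (e.g.,
based upon RTT) if it can be assured that all RTP receivers will use
the same mechanism for calculating T_dither_max.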

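For a certain application scenario, a receiver may determine an upper
bound for the acceptable local delay of FB messages: T_max_fb_delay.
If an a priori estimation or the actual calculation of T_dither_max
indicates that this upper bound MAY be violated (e.g., because
T_dither_max > T_max_fb_delay), the receiver MAY decide not to send
any feedback at all because the achievable gain is considered
insufficient.

If an Early RTCP packet is scheduled, the time slot for the next
Regular RTCP packet MUST be updated accordingly to have a new tn
(tn = tp + 2 * T_rr) and a new tp (tp = tp + T_rr) afterwards. This
is to ensure that the short-term average RTCP bandwidth used with
Early feedback does not exceed the bandwidth used without Early
feedback.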
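
The timing relations te = t0 + T_fd, tn = tp + 2 * T_rr, and
tp = tp + T_rr can be sketched as follows. The function names and
the injectable `rng` parameter are hypothetical conveniences
introduced here; `random.random` draws from [0, 1), which is used as
an approximation of the evenly distributed random number.

```python
import random


def schedule_early_feedback(t0, t_dither_max, rng=random.random):
    """Return te = t0 + T_fd, with T_fd drawn evenly from [0, T_dither_max]."""
    t_fd = rng() * t_dither_max
    return t0 + t_fd


def shift_regular_slot(tp, t_rr):
    """Return (new_tp, new_tn) after an Early RTCP packet was scheduled.

    new_tn = tp + 2 * T_rr and new_tp = tp + T_rr, both computed from the
    old tp, so one regular slot is skipped.
    """
    return tp + t_rr, tp + 2 * t_rr


def feedback_worthwhile(t_dither_max, t_max_fb_delay):
    """Return False when dithering alone may exceed the useful delay bound."""
    return t_dither_max <= t_max_fb_delay
```

Passing a fixed `rng` (for example `lambda: 0.5`) makes the schedule
deterministic, which is convenient in unit tests.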

         event to
          report
         detected
             |
             |  RTCP feedback range
             |   (T_max_fb_delay)
             vXXXXXXXXXXXXXXXXXXXXXXXXXXX     ) )
|---+--------+-------------+-----+------------| |--------+--->
    |        |             |     |            ( (        |
    |        t0            te                            |
    tp                                                   tn
             \_______  ________/
                     \/
                T_dither_max

Figure 2: Event report and parameters for Early RTCP scheduling

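3.5. AVPF RTCP Scheduling Algorithm

Let S0 be an active sender (out of S senders) and let N be the number
of receivers with R being one of these receivers.

Assume that R has verified that using feedback mechanisms is
reasonable in the current constellation (which is highly application
specific and hence not specified in this document).

Assume further that T_rr_interval is 0, if no minimal interval
between Regular RTCP packets is to be enforced, or T_rr_interval is
set to some meaningful value, as given by the application. This
value then denotes the minimal interval between Regular RTCP packets.

With this, a receiver R MUST use the following rules for transmitting
one or more FB messages as a minimal or full compound RTCP packet.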

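3.5.1. Initialization

Initially, R MUST set allow_early = TRUE and t_rr_last = NaN
(Not-a-Number, i.e., some invalid value that can be distinguished
from a valid time).

Furthermore, the initialization of the RTCP variables as per [1]
applies except for the initial value for Tmin. For a point-to-point
session, the initial Tmin is set to 0. For a multiparty session,
Tmin is initialized to 1.0 seconds.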

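3.5.2. Early Feedback Transmission

Assume that R had scheduled the last Regular RTCP RR packet for
transmission at tp (and sent or suppressed this packet at tp) and has
scheduled the next transmission (including possible reconsideration
as per [1]) for tn = tp + T_rr. Assume also that the last Regular
RTCP packet transmission has occurred at t_rr_last.

The Early Feedback algorithm then comprises the following steps:

1. At time t0, R detects the need to transmit one or more FB
messages, e.g., because media "units" need to be ACKed or NACKed, and
finds that providing the feedback information is potentially useful
for the sender.

2. R first checks whether there is already a compound RTCP packet
containing one or more FB messages scheduled for transmission (either
as Early or as Regular RTCP packet).

2a) If so, the new FB message MUST be appended to the scheduled
packet; the scheduling of the waiting compound RTCP packet MUST
remain unchanged. When doing so, the available feedback information
SHOULD be merged to produce as few FB messages as possible. This
completes the course of immediate actions to be taken.

2b) If no compound RTCP packet is already scheduled for transmission,
a new (minimal or full) compound RTCP packet MUST be created and the
minimal interval for T_dither_max MUST be chosen as follows:

i) If the session is a point-to-point session, then

T_dither_max = 0.

ii) If the session is a multiparty session, then

T_dither_max = l * T_rr

with l=0.5.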

The value for T_dither_max MAY be calculated differently
(e.g., based upon RTT), which MUST then be specified in a
future document. Such a future specification MUST ensure that
all RTP receivers use the same mechanism to calculate
T_dither_max.

The values given above for T_dither_max are minimal values.
Application-specific feedback considerations may make it
worthwhile to increase T_dither_max beyond this value. This
is up to the discretion of the implementer.

3. Then, R MUST check whether its next Regular RTCP packet would be
within the time bounds for the Early RTCP packet triggered at t0,
i.e., if t0 + T_dither_max > tn.

3a) If so, an Early RTCP packet MUST NOT be scheduled; instead,
the FB message(s) MUST be stored to be included in the Regular
RTCP packet scheduled for tn. This completes the course of
immediate actions to be taken.

3b) Otherwise, the following steps are carried out.

4. R MUST check whether it is allowed to transmit an Early RTCP
packet, i.e., allow_early == TRUE, or not.

4a) If allow_early == FALSE, then R MUST check the time for the
next scheduled Regular RTCP packet:

1. If tn - t0 < T_max_fb_delay, then the feedback could still
be useful for the sender, despite the late reporting.
Hence, R MAY create an RTCP FB message to be included in
the Regular RTCP packet for transmission at tn.

2. Otherwise, R MUST discard the RTCP FB message.

This completes the immediate course of actions to be taken.

4b) If allow_early == TRUE, then R MUST schedule an Early RTCP
packet for te = t0 + RND * T_dither_max with RND being a
pseudo random function evenly distributed between 0 and 1.

5. R MUST detect overlaps in FB messages received from other members
of the RTP session and the FB messages R wants to send.
Therefore, while a member of the RTP session, R MUST continuously
monitor the arrival of (minimal) compound RTCP packets and store
each FB message contained in these RTCP packets for at least
T_retention. When scheduling the transmission of its own FB
message following steps 1 through 4 above, R MUST check each of
the stored and newly received FB messages from the RTCP packets
received during the interval [t0 - T_retention ; te] and act as
follows:

5a) If R understands the received FB message's semantics and the
message contents is a superset of the feedback R wanted to
send, then R MUST discard its own FB message and MUST re-
schedule the next Regular RTCP packet transmission for tn (as
calculated before).

5b) If R understands the received FB message's semantics and the
message contents is not a superset of the feedback R wanted to
send, then R SHOULD transmit its own FB message as scheduled.
If there is an overlap between the feedback information to
send and the feedback information received, the amount of
feedback transmitted is up to R: R MAY leave its feedback
information to be sent unchanged, R MAY as well eliminate any
redundancy between its own feedback and the feedback received
so far from other session members.

5c) If R does not understand the received FB message's semantics,
R MAY keep its own FB message scheduled as an Early RTCP
packet, or R MAY re-schedule the next Regular RTCP packet
transmission for tn (as calculated before) and MAY append the
FB message to the now regularly scheduled RTCP message.

Note: With 5c), receiving unknown FB messages may not lead to
feedback suppression at a particular receiver. As a
consequence, a given event may cause M different types of FB
messages (which are all appropriate but not mutually
understood) to be scheduled, so that a "large" receiver group
may effectively be partitioned into at most M groups. Among
members of each of these M groups, feedback suppression will
occur following 5a and 5b but no suppression will happen
across groups. As a result, O(M) RTCP FB messages may be
received by the sender. Hence, there is a chance for a very
limited feedback implosion. However, as sender(s) and all
receivers make up the same application using the same (set of)
codecs in the same RTP session, only little divergence in
semantics for FB messages can safely be assumed and,
therefore, M is assumed to be small in the general case.

Given further that the O(M) FB messages are randomly
distributed over a time interval of T_dither_max, we find that
the resulting limited number of extra compound RTCP packets
(a) is assumed not to overwhelm the sender and (b) should be
conveyed as all contain complementary pieces of information.

6. If R's FB message(s) was not suppressed by other receiver FB
messages as per 5, when te is reached, R MUST transmit the
(minimal) compound RTCP packet containing its FB message(s). R
then MUST set allow_early = FALSE, MUST recalculate tn = tp +
2*T_rr, and MUST set tp to the previous tn. As soon as the newly
calculated tn is reached, regardless whether R sends its next
Regular RTCP packet or suppresses it because of T_rr_interval, it
MUST set allow_early = TRUE again.
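
The decisions of steps 2b through 4 can be condensed into one
function. This is a sketch under assumptions: it covers only the
case where no compound RTCP packet is already pending, it leaves out
the suppression logic of step 5, and the function name, the action
labels, and the `rng` parameter are hypothetical. Where the text
says R MAY include the message in the Regular RTCP packet (step 4a),
the sketch picks that permitted option.

```python
import random


def plan_feedback(t0, tn, t_rr, allow_early, t_max_fb_delay,
                  point_to_point=False, rng=random.random):
    """Return (action, time) for a new FB message detected at time t0."""
    # Step 2b: T_dither_max is 0 for point-to-point sessions and
    # l * T_rr with l = 0.5 for multiparty sessions.
    t_dither_max = 0.0 if point_to_point else 0.5 * t_rr

    # Step 3: if the next Regular RTCP packet falls inside the dither
    # window, store the FB message for that packet instead.
    if t0 + t_dither_max > tn:
        return ("regular", tn)

    # Step 4a: Early feedback is not allowed right now.
    if not allow_early:
        if tn - t0 < t_max_fb_delay:
            return ("regular", tn)   # still useful despite late reporting
        return ("discard", None)

    # Step 4b: schedule an Early RTCP packet at te = t0 + RND * T_dither_max.
    return ("early", t0 + rng() * t_dither_max)
```

After an "early" packet is actually sent at te, step 6 still applies:
allow_early becomes FALSE, tn is recalculated as tp + 2*T_rr, and tp
is set to the previous tn.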

3.5.3. Regular RTCP Transmission

Full compound RTCP packets MUST be sent in regular intervals. These
packets MAY also contain one or more FB messages. Transmission of
Regular RTCP packets is scheduled as follows:

If T_rr_interval == 0, then the transmission MUST follow the rules as
specified in Sections 3.2 and 3.4 of this document and MUST adhere to
the adjustments of tn specified in Section 3.5.2 (i.e., skip one
regular transmission if an Early RTCP packet transmission has
occurred). Timer reconsideration takes place when tn is reached as
per [1]. The Regular RTCP packet is transmitted after timer
reconsideration. Whenever a Regular RTCP packet is sent or
suppressed, allow_early MUST be set to TRUE and tp, tn MUST be
updated as per [1]. After the first transmission of a Regular RTCP
packet, Tmin MUST be set to 0.

If T_rr_interval != 0, then the calculation for the transmission
times MUST follow the rules as specified in Sections 3.2 and 3.4 of
this document and MUST adhere to the adjustments of tn specified in
Section 3.5.2 (i.e., skip one regular transmission if an Early RTCP
transmission has occurred). Timer reconsideration takes place when
tn is reached as per [1]. After timer reconsideration, the following
actions are taken:

1. If no Regular RTCP packet has been sent before (i.e., if t_rr_last
== NaN), then a Regular RTCP packet MUST be scheduled. Stored FB
messages MAY be included in the Regular RTCP packet. After the
scheduled packet has been sent, t_rr_last MUST be set to tn. Tmin
MUST be set to 0.

2. Otherwise, a temporary value T_rr_current_interval is calculated
as follows:

T_rr_current_interval = RND*T_rr_interval

with RND being a pseudo random function evenly distributed between
0.5 and 1.5. This dithered value is used to determine one of the
following alternatives:

2a) If t_rr_last + T_rr_current_interval <= tn, then a Regular
RTCP packet MUST be scheduled. Stored RTCP FB messages MAY be
included in the Regular RTCP packet. After the scheduled
packet has been sent, t_rr_last MUST be set to tn.

2b) If t_rr_last + T_rr_current_interval > tn and RTCP FB messages
have been stored and are awaiting transmission, an RTCP packet
MUST be scheduled for transmission at tn. This RTCP packet
MAY be a minimal or a Regular RTCP packet (at the discretion
of the implementer), and the compound RTCP packet MUST include
the stored RTCP FB message(s). t_rr_last MUST remain
unchanged.

2c) Otherwise (if t_rr_last + T_rr_current_interval > tn but no
stored RTCP FB messages are awaiting transmission), the
compound RTCP packet MUST be suppressed (i.e., it MUST NOT be
scheduled). t_rr_last MUST remain unchanged.

In all the four cases above (1, 2a, 2b, and 2c), allow_early MUST be
set to TRUE (possibly after sending the Regular RTCP packet) and tp
and tn MUST be updated following the rules of [1] except for the five
second minimum.
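
The four cases (1, 2a, 2b, and 2c) for T_rr_interval != 0 can be
sketched as a single decision taken at time tn after timer
reconsideration. The function name, the action labels, and the `rng`
parameter are hypothetical; `0.5 + random.random()` is used here as
an approximation of a pseudo random value evenly distributed between
0.5 and 1.5.

```python
import math
import random


def regular_rtcp_decision(tn, t_rr_last, t_rr_interval, has_stored_fb,
                          rng=random.random):
    """Return (action, new_t_rr_last) for the Regular RTCP slot at tn."""
    # Case 1: no Regular RTCP packet has been sent before.
    if math.isnan(t_rr_last):
        return ("send_regular", tn)

    # Case 2: dither T_rr_interval with RND between 0.5 and 1.5.
    t_rr_current_interval = (0.5 + rng()) * t_rr_interval

    # Case 2a: the minimum spacing has elapsed, send a Regular packet.
    if t_rr_last + t_rr_current_interval <= tn:
        return ("send_regular", tn)

    # Case 2b: too early for a Regular packet, but stored FB messages
    # are waiting, so a packet carrying them goes out at tn.
    if has_stored_fb:
        return ("send_with_fb", t_rr_last)

    # Case 2c: nothing to report, suppress the packet.
    return ("suppress", t_rr_last)
```

In every case the caller still has to set allow_early to TRUE and
update tp and tn, which this sketch leaves out.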

3.5.4. Other Considerations

If T_rr_interval != 0, then the timeout calculation for RTP/AVPF
entities (Section 6.3.5 of [1]) MUST be modified to use T_rr_interval
instead of Tmin for computing Td and thus M*Td for timing out RTP
entities.

Whenever a compound RTCP packet is sent or received -- minimal or
full compound, Early or Regular -- the avg_rtcp_size variable MUST be
updated accordingly (see [1]) and subsequent computations of tn MUST
use the new avg_rtcp_size.

3.6. Considerations on the Group Size

This section provides some guidelines to the group sizes at which the
various feedback modes may be used.

3.6.1. ACK Mode

The RTP session MUST have exactly two members and this group size
MUST NOT grow, i.e., it MUST be point-to-point communications.
Unicast addresses SHOULD be used in the session description.

For unidirectional as well as bi-directional communication between
two parties, 2.5% of the RTP session bandwidth is available for RTCP
traffic from the receivers including feedback. For a 64-kbit/s
stream this yields 1,600 bit/s for RTCP. If we assume an average of
96 bytes (=768 bits) per RTCP packet, a receiver can report 2 events
per second back to the sender. If acknowledgements for 10 events are
collected in each FB message, then 20 events can be acknowledged per
second. At 256 kbit/s, 8 events could be reported per second; thus,
the ACKs may be sent in a finer granularity (e.g., only combining
three ACKs per FB message).

From 1 Mbit/s upwards, a receiver would be able to acknowledge each
individual frame (not packet!) in a 30-fps video stream.
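
The figures in this section follow from a short calculation, sketched
below. The function name and parameter names are hypothetical; the
2.5% receiver share and the 96-byte average packet size are the
values assumed in the text.

```python
def ack_reports_per_second(session_kbps, rtcp_share=0.025, packet_bytes=96):
    """Return how many RTCP packets per second a receiver can send."""
    rtcp_bps = session_kbps * 1000 * rtcp_share   # receiver RTCP bandwidth
    return rtcp_bps / (packet_bytes * 8)          # packets per second


rate_64k = ack_reports_per_second(64)      # roughly 2 reports per second
rate_256k = ack_reports_per_second(256)    # roughly 8 reports per second
rate_1m = ack_reports_per_second(1000)     # above 30, one per video frame
```

Multiplying the report rate by the number of acknowledgements packed
into each FB message gives the number of events acknowledged per
second, for example about 20 at 64 kbit/s with 10 events per message.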

ACK strategies MUST be defined to work properly with these bandwidth
limitations. An indication whether or not ACKs are allowed for a
session and, if so, which ACK strategy should be used, MAY be
conveyed by out-of-band mechanisms, e.g., media-specific attributes
in a session description using SDP.

3.6.2. NACK Mode

Negative acknowledgements (and the other types of feedback exhibiting
similar reporting characteristics) MUST be used for all sessions with
a group size that may grow larger than two. Of course, NACKs MAY be
used for point-to-point communications as well.

Whether or not the use of Early RTCP packets should be considered
depends upon a number of parameters including session bandwidth,
codec, special type of feedback, and number of senders and receivers.

The most important parameters when determining the mode of operation
are the allowed minimal interval between two compound RTCP packets
(T_rr) and the average number of events that presumably need
reporting per time interval (plus their distribution over time, of
course). The minimum interval can be derived from the available RTCP
bandwidth and the expected average size of an RTCP packet. The
number of events to report (e.g., per second) may be derived from the
packet loss rate and sender's rate of transmitting packets. From
these two values, the allowable group size for the Immediate Feedback
mode can be calculated.

As stated in Section 3.3:

Let N be the average number of events to be reported per interval
T by a receiver, B the RTCP bandwidth fraction for this particular
receiver, and R the average RTCP packet size, then the receiver
operates in Immediate Feedback mode as long as N<=B*T/R.

The upper bound for the Early RTCP mode then solely depends on the
acceptable quality degradation, i.e., how many events per time
interval may go unreported.

As stated in Section 3.3:

Using the above notation, Early RTCP mode can be roughly
characterized by N > B*T/R as "lower bound". An estimate for an
upper bound is more difficult. Setting N=1, we obtain for a given
R and B the interval T = R/B as average interval between events to
be reported. This information can be used as a hint to determine
whether or not early transmission of RTCP packets is useful.

Example: If a 256-kbit/s video with 30 fps is transmitted through a
network with an MTU size of some 1,500 bytes, then, in most cases,
each frame would fit into one packet, leading to a packet rate of
30 packets per second. If 5% packet loss occurs in the network
(equally distributed, no inter-dependence between receivers), then
each receiver will, on average, have to report 3 packets lost each
two seconds. Assuming a single sender and more than three receivers,
the receivers are allocated 3.75% of the session bandwidth for RTCP
and thus 9.6 kbit/s. Assuming further a size of 120 bytes for the
average compound RTCP packet allows 10 RTCP packets to be sent per
second or 20 in two seconds. If every receiver needs to report three
lost packets per two seconds, this yields a maximum group size of 6-7
receivers if all loss events are reported. The rules for
transmission of Early RTCP packets should provide sufficient
flexibility for most of this reporting to occur in a timely fashion.
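
The arithmetic of this example can be reproduced with a few lines.
The function and parameter names are hypothetical; the default values
are the ones used in the example, and one RTCP packet is assumed to
report one loss event.

```python
def max_group_size(session_kbps=256, packet_rate=30, loss_rate=0.05,
                   receiver_rtcp_share=0.0375, rtcp_packet_bytes=120,
                   window_s=2.0):
    """Return the group size at which every loss event can still be reported."""
    # Losses each receiver has to report per window (3 in the example).
    losses_per_window = packet_rate * loss_rate * window_s
    # RTCP bandwidth shared by all receivers (9.6 kbit/s in the example).
    rtcp_bps = session_kbps * 1000 * receiver_rtcp_share
    # RTCP packets that fit into the window (20 in the example).
    reports_per_window = rtcp_bps * window_s / (rtcp_packet_bytes * 8)
    return reports_per_window / losses_per_window


group = max_group_size()   # between 6 and 7 receivers
```

Lowering the number of losses that must be reported per window
raises the supportable group size in direct proportion.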

Extending this example to determine the upper bound for Early RTCP
mode could lead to the following considerations: assume that the
underlying coding scheme and the application (as well as the tolerant
users) allow on the order of one loss without repair per two seconds.
Thus, the number of packets to be reported by each receiver decreases
to two per two seconds and increases the group size to 10. Assuming
further that some number of packet losses are correlated, feedback
traffic is further reduced and group sizes of some 12 to 16 (maybe
even 20) can be reasonably well supported using Early RTCP mode.
Note that all these considerations are based upon statistics and will
fail to hold in some cases.

3.7. Summary of Decision Steps

3.7.1. General Hints

Before even considering whether or not to send RTCP feedback
information, an application has to determine whether this mechanism
is applicable:

1) An application has to decide whether -- for the current ratio of
packet rate with the associated (application-specific) maximum
feedback delay and the currently observed round-trip time (if
available) -- feedback mechanisms can be applied at all.

This decision may be based upon (and dynamically revised
following) RTCP reception statistics as well as out-of-band
mechanisms.

2) The application has to decide -- for a certain observed error
rate, assigned bandwidth, frame/packet rate, and group size --
whether (and which) feedback mechanisms can be applied.

Regular RTCP reception statistics provide valuable input to this
step, too.

3) If the application decides to send feedback, the application has
to follow the rules for transmitting Early RTCP packets or Regular
RTCP packets containing FB messages.

4) The type of RTCP feedback sent should not duplicate information
available to the sender from a lower layer transport protocol.
That is, if the transport protocol provides negative or positive
acknowledgements about packet reception (such as DCCP), the
receiver should avoid repeating the same information at the RTCP
layer (i.e., abstain from sending Generic NACKs).

3.7.2. Media Session Attributes

Media sessions are typically described using out-of-band mechanisms
to convey transport addresses, codec information, etc., between
sender(s) and receiver(s). Such a mechanism is two-fold: a format
used to describe a media session and another mechanism for
transporting this description.

In the IETF, the Session Description Protocol (SDP) is currently used
to describe media sessions while protocols such as SIP, Session
Announcement Protocol (SAP), Real Time Streaming Protocol (RTSP), and
HTTP (among others) are used to convey the descriptions.

A media session description format MAY include parameters to indicate
that RTCP feedback mechanisms are supported in this session and which
of the feedback mechanisms MAY be applied.

To do so, the profile "AVPF" MUST be indicated instead of "AVP".
Further attributes may be defined to show which type(s) of feedback
are supported.

Section 4 contains the syntax specification to support RTCP feedback
with SDP. Similar specifications for other media session description
formats are outside the scope of this document.

4. SDP Definitions

This section defines a number of additional SDP parameters that are
used to describe a session. All of these are defined as media-level
attributes.

4.1. Profile Identification

The AV profile defined in [4] is referred to as "AVP" in the context
of, e.g., the Session Description Protocol (SDP) [3]. The profile
specified in this document is referred to as "AVPF".

Feedback information following the modified timing rules as specified
in this document MUST NOT be sent for a particular media session
unless the description for this session indicates the use of the
"AVPF" profile (exclusively or jointly with other AV profiles).

4.2. RTCP Feedback Capability Attribute

A new payload format-specific SDP attribute is defined to indicate
the capability of using RTCP feedback as specified in this document:
"a=rtcp-fb". The "rtcp-fb" attribute MUST only be used as an SDP
media attribute and MUST NOT be provided at the session level. The
"rtcp-fb" attribute MUST only be used in media sessions for which the
"AVPF" is specified.

The "rtcp-fb" attribute SHOULD be used to indicate which RTCP FB
messages MAY be used in this media session for the indicated payload
type. A wildcard payload type ("*") MAY be used to indicate that the
RTCP feedback attribute applies to all payload types. If several
types of feedback are supported and/or the same feedback shall be
specified for a subset of the payload types, several "a=rtcp-fb"
lines MUST be used.

If no "rtcp-fb" attribute is specified, the RTP receivers MAY send
feedback using other suitable RTCP feedback packets as defined for
the respective media type. The RTP receivers MUST NOT rely on the
RTP senders reacting to any of the FB messages. The RTP sender MAY
choose to ignore some feedback messages.

If one or more "rtcp-fb" attributes are present in a media session
description, the RTCP receivers for the media session(s) containing
the "rtcp-fb"

o MUST ignore all "rtcp-fb" attributes of which they do not fully
understand the semantics (i.e., where they do not understand the
meaning of all values in the "a=rtcp-fb" line);

o SHOULD provide feedback information as specified in this document
using any of the RTCP feedback packets as specified in one of the
"rtcp-fb" attributes for this media session; and

o MUST NOT use other FB messages than those listed in one of the
"rtcp-fb" attribute lines.

When used in conjunction with the offer/answer model [8], the offerer
MAY present a set of these AVPF attributes to its peer. The answerer
MUST remove all attributes it does not understand as well as those it
does not support in general or does not wish to use in this
particular media session. The answerer MUST NOT add feedback
parameters to the media description and MUST NOT alter values of such
parameters. The answer is binding for the media session, and both
offerer and answerer MUST only use feedback mechanisms negotiated in
this way. Both offerer and answerer MAY independently decide to send
RTCP FB messages of only a subset of the negotiated feedback
mechanisms, but they SHOULD react properly to all types of the
negotiated FB messages when received.
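
The answerer's side of this rule can be sketched as a pure filter. This is a minimal illustration, not part of the specification: the function name answer_rtcp_fb, the (payload type, value) pair representation, and the "supported" set are assumptions made for the example.

```python
def answer_rtcp_fb(offered, supported):
    """Return the offered "rtcp-fb" entries that the answerer keeps.

    offered   -- list of (payload_type, value) pairs taken from the offer,
                 e.g. ("98", "nack rpsi")
    supported -- set of values the answerer understands and wishes to use
    """
    # Only removal is allowed: nothing is added and no value is altered.
    return [(pt, val) for (pt, val) in offered if val in supported]


offer = [("*", "nack"), ("98", "nack rpsi"), ("98", "ack rpsi")]
answer = answer_rtcp_fb(offer, supported={"nack", "nack rpsi"})
```

Here the answer retains the two "nack" entries unchanged and drops "ack rpsi", which the hypothetical answerer does not wish to use.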

RTP senders MUST be prepared to receive any kind of RTCP FB messages
and MUST silently discard all those RTCP FB messages that they do not
understand.

The syntax of the "rtcp-fb" attribute is as follows (the feedback
types and optional parameters are all case sensitive):

(In the following ABNF, fmt, SP, and CRLF are used as defined in
[3].)

   rtcp-fb-syntax     = "a=rtcp-fb:" rtcp-fb-pt SP rtcp-fb-val CRLF

   rtcp-fb-pt         = "*"   ; wildcard: applies to all formats
                      / fmt   ; as defined in SDP spec

   rtcp-fb-val        = "ack" rtcp-fb-ack-param
                      / "nack" rtcp-fb-nack-param
                      / "trr-int" SP 1*DIGIT
                      / rtcp-fb-id rtcp-fb-param

   rtcp-fb-id         = 1*(alpha-numeric / "-" / "_")

   rtcp-fb-param      = SP "app" [SP byte-string]
                      / SP token [SP byte-string]
                      / ; empty

   rtcp-fb-ack-param  = SP "rpsi"
                      / SP "app" [SP byte-string]
                      / SP token [SP byte-string]
                      / ; empty

   rtcp-fb-nack-param = SP "pli"
                      / SP "sli"
                      / SP "rpsi"
                      / SP "app" [SP byte-string]
                      / SP token [SP byte-string]
                      / ; empty
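
As an informal illustration of the grammar above, the following Python sketch splits an "a=rtcp-fb" line into its parts. It is deliberately looser than the ABNF (for instance, it does not validate fmt or byte-string), and the function name parse_rtcp_fb and the dictionary layout are assumptions made for this example only.

```python
import re

# Feedback type: "ack", "nack", "trr-int", or any other rtcp-fb-id.
_ID = r"[A-Za-z0-9_-]+"
_LINE = re.compile(
    r"^a=rtcp-fb:(?P<pt>\*|\S+) (?P<type>" + _ID + r")(?: (?P<rest>.+))?$"
)


def parse_rtcp_fb(line):
    """Parse one "a=rtcp-fb" line; return a dict, or None if it does not match."""
    m = _LINE.match(line.rstrip("\r\n"))
    if m is None:
        return None
    pt, fb_type, rest = m.group("pt"), m.group("type"), m.group("rest")
    if fb_type == "trr-int":
        # "trr-int" SP 1*DIGIT: minimum Regular RTCP interval in milliseconds.
        if rest is None or re.fullmatch(r"[0-9]+", rest) is None:
            return None
        return {"pt": pt, "type": fb_type, "interval_ms": int(rest)}
    # First token after the type is the parameter; the remainder, if any,
    # is kept verbatim (it corresponds to the optional byte-string).
    param, _, extra = (rest or "").partition(" ")
    return {"pt": pt, "type": fb_type,
            "param": param or None, "extra": extra or None}
```

A receiver following the rules of this section would then ignore any parsed line whose type or parameter it does not fully understand.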

The literals of the above grammar have the following semantics:

Feedback type "ack":

This feedback type indicates that positive acknowledgements for
feedback are supported.

The feedback type "ack" MUST only be used if the media session is
allowed to operate in ACK mode as defined in Section 3.6.1.

Parameters MUST be provided to further distinguish different types
of positive acknowledgement feedback.

The parameter "rpsi" indicates the use of Reference Picture
Selection Indication feedback as defined in Section 6.3.3.

If the parameter "app" is specified, this indicates the use of
application layer feedback. In this case, additional parameters
following "app" MAY be used to further differentiate various types
of application layer feedback. This document does not define any
parameters specific to "app".

Further parameters for "ack" MAY be defined in other documents.

Feedback type "nack":

This feedback type indicates that negative acknowledgements for
feedback are supported.

The feedback type "nack", without parameters, indicates use of the
Generic NACK feedback format as defined in Section 6.2.1.

The following three parameters are defined in this document for
use with "nack" in conjunction with the media type "video":

o "pli" indicates the use of Picture Loss Indication feedback as
defined in Section 6.3.1.

o "sli" indicates the use of Slice Loss Indication feedback as
defined in Section 6.3.2.

o "rpsi" indicates the use of Reference Picture Selection
Indication feedback as defined in Section 6.3.3.

"app" indicates the use of application layer feedback. Additional
parameters after "app" MAY be provided to differentiate different
types of application layer feedback. No parameters specific to
"app" are defined in this document.

Further parameters for "nack" MAY be defined in other documents.

Other feedback types <rtcp-fb-id>:

Other documents MAY define additional types of feedback; to keep
the grammar extensible for those cases, the rtcp-fb-id is
introduced as a placeholder. A new feedback scheme name MUST
be unique (and thus MUST be registered with IANA). Along with a
new name, its semantics, packet formats (if necessary), and rules
for its operation MUST be specified.

Regular RTCP minimum interval "trr-int":

The attribute "trr-int" is used to specify the minimum interval
T_rr_interval between two Regular (full compound) RTCP packets in
milliseconds for this media session. If "trr-int" is not
specified, a default value of 0 is assumed.

Note that it is assumed that more specific information about
application layer feedback (as defined in Section 6.4) will be
conveyed as feedback types and parameters defined elsewhere. Hence,
no further provision for any types and parameters is made in this
document.

Further types of feedback as well as further parameters may be
defined in other documents.

It is up to the recipients whether or not they send feedback
information and up to the sender(s) (how) to make use of feedback
provided.

4.3. RTCP Bandwidth Modifiers

The standard RTCP bandwidth assignments as defined in [1] and [2] MAY
be overridden by bandwidth modifiers that explicitly define the
maximum RTCP bandwidth. For use with SDP, such modifiers are
specified in [4]: "b=RS:<bw>" and "b=RR:<bw>" MAY be used to assign a
different bandwidth (measured in bits per second) to RTP senders and
receivers, respectively. The precedence rules of [4] apply to
determine the actual bandwidth to be used by senders and receivers.

Applications operating knowingly over highly asymmetric links (such
as satellite links) SHOULD use this mechanism to reduce the feedback
rate for high bandwidth streams to prevent deterministic congestion
of the feedback path(s).

4.4. Examples

Example 1: The following session description indicates a session made
up from audio and DTMF [18] for point-to-point communication in which
the DTMF stream uses Generic NACKs. This session description could
be contained in a SIP INVITE, 200 OK, or ACK message to indicate that
its sender is capable of and willing to receive feedback for the DTMF
stream it transmits.

v=0
o=alice 3203093520 3203093520 IN IP4 host.example.com
s=Media with feedback
t=0 0
c=IN IP4 host.example.com
m=audio 49170 RTP/AVPF 0 96
a=rtpmap:0 PCMU/8000
a=rtpmap:96 telephone-event/8000
a=fmtp:96 0-16
a=rtcp-fb:96 nack

This allows sender and receiver to provide reliable transmission of
DTMF events in an audio session. Assuming a 64-kbit/s audio stream
with one receiver, the receiver has 2.5% RTCP bandwidth available for
the negative acknowledgement stream, i.e., 250 bytes per second or
some 2 RTCP feedback messages every second. Hence, the receiver can
individually communicate up to two missing DTMF audio packets per
second.
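
One plausible way to arrive at the 250 bytes per second figure is sketched below. The text gives only the 64-kbit/s stream and the 2.5% share; the 20 ms packetization interval and the 40 bytes of IPv4/UDP/RTP headers per packet are assumptions of this sketch, chosen because they yield an 80-kbit/s session bandwidth.

```python
AUDIO_BITRATE = 64_000    # bit/s of PCMU payload (from the example)
PACKETS_PER_SECOND = 50   # assumption: one RTP packet every 20 ms
HEADER_BYTES = 40         # assumption: IPv4 (20) + UDP (8) + RTP (12)

# Session bandwidth in bytes per second, including header overhead.
session_bytes_per_s = AUDIO_BITRATE // 8 + PACKETS_PER_SECOND * HEADER_BYTES

# Receiver share of 2.5%, kept in integer arithmetic (25 / 1000).
rtcp_bytes_per_s = session_bytes_per_s * 25 // 1000
```

Under these assumptions the session bandwidth is 10000 bytes per second and the receiver's RTCP share is 250 bytes per second.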

Example 2: The following session description indicates a multicast
video-only session (using either H.261 or H.263+) with the video
source accepting Generic NACKs for both codecs and Reference Picture
Selection for H.263. Such a description may have been conveyed using
the Session Announcement Protocol (SAP).

v=0
o=alice 3203093520 3203093520 IN IP4 host.example.com
s=Multicast video with feedback
t=3203130148 3203137348
m=audio 49170 RTP/AVP 0
c=IN IP4 224.2.1.183
a=rtpmap:0 PCMU/8000
m=video 51372 RTP/AVPF 98 99
c=IN IP4 224.2.1.184
a=rtpmap:98 H263-1998/90000
a=rtpmap:99 H261/90000
a=rtcp-fb:* nack
a=rtcp-fb:98 nack rpsi

The sender may use an incoming Generic NACK as a hint to send a new
intra-frame as soon as possible (congestion control permitting).
Receipt of a Reference Picture Selection Indication (RPSI) message
allows the sender to avoid sending a large intra-frame; instead it
may continue to send inter-frames, however, choosing the indicated
frame as new encoding reference.

Example 3: The following session description defines the same media
session as example 2 but allows for mixed-mode operation of AVP and
AVPF RTP entities (see also next section). Note that both media
descriptions use the same addresses; however, two m= lines are needed
to convey information about both applicable RTP profiles.

v=0
o=alice 3203093520 3203093520 IN IP4 host.example.com
s=Multicast video with feedback
t=3203130148 3203137348
m=audio 49170 RTP/AVP 0
c=IN IP4 224.2.1.183
a=rtpmap:0 PCMU/8000
m=video 51372 RTP/AVP 98 99
c=IN IP4 224.2.1.184
a=rtpmap:98 H263-1998/90000
a=rtpmap:99 H261/90000
m=video 51372 RTP/AVPF 98 99
c=IN IP4 224.2.1.184
a=rtpmap:98 H263-1998/90000
a=rtpmap:99 H261/90000
a=rtcp-fb:* nack
a=rtcp-fb:98 nack rpsi

Note that these two m= lines SHOULD be grouped by some appropriate
mechanism to indicate that both are alternatives actually conveying
the same contents. A sample framework by which this can be
achieved is defined in [10].

In this example, the RTCP feedback-enabled receivers will gain an
occasional advantage to report events earlier back to the sender
(which may benefit the entire group). On average, however, all RTP
receivers will provide the same amount of feedback. The
interworking between AVP and AVPF entities is discussed in depth in
the next section.

5. Interworking and Coexistence of AVP and AVPF Entities

The AVPF profile defined in this document is an extension of the
AVP profile as defined in [2]. Both profiles follow the same basic
rules (including the upper bandwidth limit for RTCP and the
bandwidth assignments to senders and receivers). Therefore,
senders and receivers using either of the two profiles can be
mixed in a single session (see Example 3 in Section 4.4).

AVP and AVPF are defined in a way that, from a robustness point of
view, the RTP entities do not need to be aware of entities of the
respective other profile: they will not disturb each other's
functioning. However, the quality of the media presented may
suffer.

The following considerations apply to senders and receivers when
used in a combined session.

o AVP entities (senders and receivers)

AVP senders will receive RTCP feedback packets from AVPF
receivers and ignore these packets. They will see occasional
closer spacing of RTCP messages (e.g., violating the five-second
rule) by AVPF entities. As the overall bandwidth constraints
are adhered to by both types of entities, they will still get
their share of the RTCP bandwidth. However, while AVP entities
are bound by the five-second rule, depending on the group size
and session bandwidth, AVPF entities may provide more frequent
RTCP reports than AVP ones will. Also, the overall reporting
may decrease slightly as AVPF entities may send bigger compound
RTCP packets (due to the extra RTCP packets).

If T_rr_interval is used as lower bound between Regular RTCP
packets, T_rr_interval is sufficiently large (e.g., T_rr_interval
> M*Td as per Section 6.3.5 of [1]), and no Early RTCP packets
are sent by AVPF entities, AVP entities may accidentally time
out those AVPF group members and hence underestimate the group
size. Therefore, if AVP entities may be involved in a media
session, T_rr_interval SHOULD NOT be larger than five seconds.

o AVPF entities (senders and receivers)

If the dynamically calculated T_rr is sufficiently small (e.g.,
less than one second), AVPF entities may accidentally time out
AVP group members and hence underestimate the group size.
Therefore, if AVP entities may be involved in a media session,
T_rr_interval SHOULD be used and SHOULD be set to five seconds.

In conclusion, if AVP entities may be involved in a media
session and T_rr_interval is to be used, T_rr_interval SHOULD be
set to five seconds.

o AVPF senders

AVPF senders will receive feedback information only from AVPF
receivers. If they rely on feedback to provide the target media
quality, the quality achieved for AVP receivers may be suboptimal.

o AVPF receivers

AVPF receivers SHOULD send Early RTCP feedback packets only if
all sending entities in the media session support AVPF. AVPF
receivers MAY send feedback information as part of regularly
scheduled compound RTCP packets following the timing rules of
[1] and [2] also in media sessions operating in mixed mode.
However, the receiver providing feedback MUST NOT rely on the
sender reacting to the feedback at all.

6. Format of RTCP Feedback Messages

This section defines the format of the low-delay RTCP feedback
messages. These messages are classified into three categories as
follows:

- Transport layer FB messages
- Payload-specific FB messages
- Application layer FB messages

Transport layer FB messages are intended to transmit general purpose
feedback information, i.e., information independent of the particular
codec or the application in use. The information is expected to be
generated and processed at the transport/RTP layer. Currently, only
a generic negative acknowledgement (NACK) message is defined.

Payload-specific FB messages transport information that is specific
to a certain payload type and will be generated and acted upon at the
codec "layer". This document defines a common header to be used in
conjunction with all payload-specific FB messages. The definition of
specific messages is left either to RTP payload format specifications
or to additional feedback format documents.

Application layer FB messages provide a means to transparently convey
feedback from the receiver's to the sender's application. The
information contained in such a message is not expected to be acted
upon at the transport/RTP or the codec layer. The data to be
exchanged between two application instances is usually defined in the
application protocol specification and thus can be identified by the
application so that there is no need for additional external
information. Hence, this document defines only a common header to be
used along with all application layer FB messages. From a protocol
point of view, an application layer FB message is treated as a
special case of a payload-specific FB message.

Note: Proper processing of some FB messages at the media sender
side may require the sender to know which payload type the FB
message refers to. Most of the time, this knowledge can likely be
derived from a media stream using only a single payload type.
However, if several codecs are used simultaneously (e.g., with
audio and DTMF) or when codec changes occur, the payload type
information may need to be conveyed explicitly as part of the FB
message. This applies to all
payload-specific as well as application layer FB messages. It is
up to the specification of an FB message to define how payload
type information is transmitted.

This document defines one transport layer and three (video) payload-
specific FB messages as well as a single container for application
layer FB messages. Additional transport layer and payload-specific
FB messages MAY be defined in other documents and MUST be registered
through IANA (see Section 9, "IANA Considerations").

The general syntax and semantics for the above RTCP FB message types
are described in the following subsections.

6.1. Common Packet Format for Feedback Messages

All FB messages MUST use a common packet format that is depicted in
Figure 3:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|   FMT   |       PT      |          length               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  SSRC of packet sender                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  SSRC of media source                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :            Feedback Control Information (FCI)                 :
   :                                                               :

Figure 3: Common Packet Format for Feedback Messages

The fields V, P, SSRC, and length are defined in the RTP
specification [1], the respective meaning being summarized below:

version (V): 2 bits
This field identifies the RTP version. The current version is 2.

padding (P): 1 bit
If set, the padding bit indicates that the packet contains
additional padding octets at the end that are not part of the
control information but are included in the length field.

Feedback message type (FMT): 5 bits
This field identifies the type of the FB message and is
interpreted relative to the type (transport layer, payload-
specific, or application layer feedback). The values for each of
the three feedback types are defined in the respective sections
below.

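Packet type (PT): 8 bits
This is the RTCP packet type that identifies the packet as being
an RTCP FB message. Two values are defined by the IANA:

      Name   | Value | Brief Description
   ----------+-------+------------------------------------
      RTPFB  |  205  | Transport layer FB message
      PSFB   |  206  | Payload-specific FB message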

Length: 16 bits
The length of this packet in 32-bit words minus one, including the
header and any padding. This is in line with the definition of
the length field used in RTCP sender and receiver reports [1].

SSRC of packet sender: 32 bits
The synchronization source identifier for the originator of this
packet.

SSRC of media source: 32 bits
The synchronization source identifier of the media source that
this piece of feedback information is related to.

Feedback Control Information (FCI): variable length
The following three sections define which additional information
MAY be included in the FB message for each type of feedback:
transport layer, payload-specific, or application layer feedback.
Note that further FCI contents MAY be specified in further
documents.

Each RTCP feedback packet MUST contain at least one FB message in the
FCI field. Sections 6.2 and 6.3 define for each FCI type, whether or
not multiple FB messages MAY be compressed into a single FCI field.
If this is the case, they MUST be of the same type, i.e., same FMT.
If multiple types of feedback messages, i.e., several FMTs, need to
be conveyed, then several RTCP FB messages MUST be generated and
SHOULD be concatenated in the same compound RTCP packet.
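
The common packet format of Figure 3 can be illustrated with a short Python sketch that builds and parses the 12-octet header. The function names are hypothetical, the padding bit is always left at 0, and the packet type constants 205 (RTPFB) and 206 (PSFB) are the values registered with IANA for RTCP FB messages.

```python
import struct

RTPFB, PSFB = 205, 206  # RTCP packet types for transport layer / payload-specific FB


def build_fb_packet(fmt, pt, sender_ssrc, media_ssrc, fci=b""):
    """Build one RTCP FB message: common header followed by the FCI."""
    if not 0 <= fmt <= 31:
        raise ValueError("FMT is a 5-bit field")
    if len(fci) % 4:
        raise ValueError("FCI must be a multiple of 32 bits")
    first_octet = (2 << 6) | fmt        # V=2, P=0, FMT
    length = (12 + len(fci)) // 4 - 1   # 32-bit words minus one
    header = struct.pack("!BBHII", first_octet, pt, length,
                         sender_ssrc, media_ssrc)
    return header + fci


def parse_fb_header(packet):
    """Split an RTCP FB message into its header fields and FCI."""
    first_octet, pt, length, sender_ssrc, media_ssrc = struct.unpack(
        "!BBHII", packet[:12])
    return {
        "version": first_octet >> 6,
        "padding": bool(first_octet & 0x20),
        "fmt": first_octet & 0x1F,
        "pt": pt,
        "length": length,
        "sender_ssrc": sender_ssrc,
        "media_ssrc": media_ssrc,
        "fci": packet[12:4 * (length + 1)],
    }


# A message with PT=PSFB, FMT=1 and an empty FCI; its length field is 2.
pli = build_fb_packet(fmt=1, pt=PSFB,
                      sender_ssrc=0x11111111, media_ssrc=0x22222222)
```

The sketch produces a single FB message only; assembling it into a compound RTCP packet is outside its scope.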

6.2. Transport Layer Feedback Messages

Transport layer FB messages are identified by the value RTPFB as RTCP
message type.

A single general purpose transport layer FB message is defined in
this document: Generic NACK. It is identified by means of the FMT
parameter as follows:

0: unassigned
1: Generic NACK
2-30: unassigned
31: reserved for future expansion of the identifier number space

The following subsection defines the formats of the FCI field for
this type of FB message. Further generic feedback messages MAY be
defined in the future.

6.2.1. Generic NACK

The Generic NACK message is identified by PT=RTPFB and FMT=1.

The FCI field MUST contain at least one and MAY contain more than one
Generic NACK.

The Generic NACK is used to indicate the loss of one or more RTP
packets. The lost packet(s) are identified by the means of a packet
identifier and a bit mask.

Generic NACK feedback SHOULD NOT be used if the underlying transport
protocol is capable of providing similar feedback information to the
sender (as may be the case, e.g., with DCCP).

The Feedback Control Information (FCI) field has the following syntax
(Figure 4):

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            PID                |             BLP               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 4: Syntax for the Generic NACK message

Packet ID (PID): 16 bits
The PID field is used to specify a lost packet. The PID field
refers to the RTP sequence number of the lost packet.

bitmask of following lost packets (BLP): 16 bits
The BLP allows for reporting losses of any of the 16 RTP packets
immediately following the RTP packet indicated by the PID. The
BLP's definition is identical to that given in [6]. Denoting the
BLP's least significant bit as bit 1, and its most significant bit
as bit 16, then bit i of the bit mask is set to 1 if the receiver
has not received RTP packet number (PID+i) (modulo 2^16) and
indicates this packet is lost; bit i is set to 0 otherwise. Note
that the sender MUST NOT assume that a receiver has received a
packet because its bit mask was set to 0. For example, the least
significant bit of the BLP would be set to 1 if the packet
corresponding to the PID and the following packet have been lost.
However, the sender cannot infer that packets PID+2 through PID+16
have been received simply because bits 2 through 16 of the BLP are
0; all the sender knows is that the receiver has not reported them
as lost at this time.

The length of the FB message MUST be set to 2+n, with n being the
number of Generic NACKs contained in the FCI field.

The Generic NACK message implicitly references the payload type
through the sequence number(s).
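
The PID/BLP encoding can be illustrated as follows. This is a minimal sketch with hypothetical function names; it greedily groups sorted sequence numbers and does not try to find the most compact grouping across a sequence number wraparound.

```python
import struct


def encode_generic_nacks(lost_seqs):
    """Pack lost RTP sequence numbers into Generic NACK FCI entries."""
    fci = b""
    remaining = sorted({s & 0xFFFF for s in lost_seqs})
    while remaining:
        pid = remaining[0]
        blp = 0
        rest = []
        for seq in remaining[1:]:
            offset = (seq - pid) & 0xFFFF
            if 1 <= offset <= 16:
                blp |= 1 << (offset - 1)  # bit i (LSB = bit 1) marks PID+i
            else:
                rest.append(seq)
        fci += struct.pack("!HH", pid, blp)
        remaining = rest
    return fci


def decode_generic_nacks(fci):
    """Return the sequence numbers reported as lost in a Generic NACK FCI."""
    lost = []
    for pos in range(0, len(fci), 4):
        pid, blp = struct.unpack_from("!HH", fci, pos)
        lost.append(pid)
        lost.extend((pid + i) & 0xFFFF
                    for i in range(1, 17) if blp & (1 << (i - 1)))
    return lost


# Packets 100, 101, 103, and 116 fit into one entry; 200 needs a second one.
fci = encode_generic_nacks([100, 101, 103, 116, 200])
```

With two Generic NACK entries in the FCI, the length field of the enclosing FB message is 2+2 = 4.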

6.3. Payload-Specific Feedback Messages

Payload-Specific FB messages are identified by the value PT=PSFB as
RTCP message type.

Three payload-specific FB messages are defined so far plus an
application layer FB message. They are identified by means of the
FMT parameter as follows:

0: unassigned
1: Picture Loss Indication (PLI)
2: Slice Loss Indication (SLI)
3: Reference Picture Selection Indication (RPSI)
4-14: unassigned
15: Application layer FB (AFB) message
16-30: unassigned
31: reserved for future expansion of the identifier number space

The following subsections define the FCI formats for the payload-
specific FB messages; Section 6.4 defines the FCI format for the
application layer FB message.

6.3.1. Picture Loss Indication (PLI)

The PLI FB message is identified by PT=PSFB and FMT=1.

There MUST be exactly one PLI contained in the FCI field.

6.3.1.1. Semantics

With the Picture Loss Indication message, a decoder informs the
encoder about the loss of an undefined amount of coded video data
belonging to one or more pictures. When used in conjunction with any
video coding scheme that is based on inter-picture prediction, an
encoder that receives a PLI becomes aware that the prediction chain
may be broken. The sender MAY react to a PLI by transmitting an
intra-picture to achieve resynchronization (making this message
effectively similar to the FIR message as defined in [6]); however,
the sender MUST consider congestion control as outlined in Section 7,
which MAY restrict its ability to send an intra frame.

Other RTP payload specifications such as RFC 2032 [6] already define
a feedback mechanism for certain codecs. An application
supporting both schemes MUST use the feedback mechanism defined in
this specification when sending feedback. For backward compatibility
reasons, such an application SHOULD also be capable of receiving and
reacting to the feedback scheme defined in the respective RTP payload
format, if this is required by that payload format.

6.3.1.2. Message Format

PLI does not require parameters. Therefore, the length field MUST be
2, and there MUST NOT be any Feedback Control Information.

The semantics of this FB message is independent of the payload type.

6.3.1.3. Timing Rules

The timing follows the rules outlined in Section 3. In systems that
employ both PLI and other types of feedback, it may be advisable to
follow the Regular RTCP RR timing rules for PLI, since PLI is not as
delay critical as other FB types.

6.3.1.4. Remarks

PLI messages typically trigger the sending of full intra-pictures.
Intra-pictures are several times larger than predicted (inter-)
pictures. Their size is independent of the time they are generated.
In most environments, especially when employing bandwidth-limited
links, the use of an intra-picture implies an allowed delay that is a
significant multiple of the typical frame duration. An example: If
the sending frame rate is 10 fps, and an intra-picture is assumed to
be 10 times as big as an inter-picture, then a full second of latency
has to be accepted. In such an environment, there is no need for a
particularly short delay in sending the FB message. Hence, waiting for
the next possible time slot allowed by RTCP timing rules as per [2]
with Tmin=0 does not have a negative impact on the system
performance.

6.3.2. Slice Loss Indication (SLI)

The SLI FB message is identified by PT=PSFB and FMT=2.

The FCI field MUST contain at least one and MAY contain more than one
SLI.

6.3.2.1. Semantics

With the Slice Loss Indication, a decoder can inform an encoder that
it has detected the loss or corruption of one or several consecutive
macroblock(s) in scan order (see below). This FB message MUST NOT be
used for video codecs with non-uniform, dynamically changeable
macroblock sizes such as H.263 with enabled Annex Q. In such a case,
an encoder cannot always identify the corrupted spatial region.

6.3.2.2. Format

The Slice Loss Indication uses one additional FCI field, the content
of which is depicted in Figure 6. The length of the FB message MUST
be set to 2+n, with n being the number of SLIs contained in the FCI
field.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            First        |        Number           | PictureID |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 6: Syntax of the Slice Loss Indication (SLI)

First: 13 bits
The macroblock (MB) address of the first lost macroblock. The MB
numbering is done such that the macroblock in the upper left
corner of the picture is considered macroblock number 1 and the
number for each macroblock increases from left to right and then
from top to bottom in raster-scan order (such that if there is a
total of N macroblocks in a picture, the bottom right macroblock
is considered macroblock number N).

Number: 13 bits
The number of lost macroblocks, in scan order as discussed above.

PictureID: 6 bits
The six least significant bits of the codec-specific identifier
that is used to reference the picture in which the loss of the
macroblock(s) has occurred. For many video codecs, the PictureID
is identical to the Temporal Reference.

The applicability of this FB message is limited to a small set of
video codecs; therefore, no explicit payload type information is
provided.
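
The 13/13/6-bit layout of one SLI entry can be sketched in Python as shown below. The function names are hypothetical; the sketch only checks field widths and leaves it to the caller to respect the macroblock numbering, which starts at 1.

```python
import struct


def pack_sli(first, number, picture_id):
    """Pack one SLI entry: First (13 bits), Number (13 bits), PictureID (6 bits)."""
    if not 0 <= first < (1 << 13) or not 0 <= number < (1 << 13):
        raise ValueError("First and Number are 13-bit fields")
    # Only the six least significant bits of the picture identifier are sent.
    word = (first << 19) | (number << 6) | (picture_id & 0x3F)
    return struct.pack("!I", word)


def unpack_sli(entry):
    """Return (first, number, picture_id) from one 32-bit SLI entry."""
    (word,) = struct.unpack("!I", entry)
    return word >> 19, (word >> 6) & 0x1FFF, word & 0x3F


# 22 macroblocks lost starting at macroblock 1, in a picture whose
# codec-specific identifier is 0x45 (its six low bits are 5).
sli = pack_sli(first=1, number=22, picture_id=0x45)
```

Several such 4-octet entries may be concatenated in the FCI field; with n entries, the length field of the FB message is 2+n.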

6.3.2.3. Timing Rules

The efficiency of algorithms using the Slice Loss Indication is
reduced greatly when the Indication is not transmitted in a timely
fashion. Motion compensation propagates corrupted pixels that are
not reported as being corrupted. Therefore, the use of the algorithm
discussed in Section 3 is highly recommended.

6.3.2.4. Remarks

The term Slice is defined and used here in the sense of MPEG-1 -- a
consecutive number of macroblocks in scan order. More recent video
coding standards sometimes have a different understanding of the term
Slice. In H.263 (1998), for example, a concept known as "rectangular
slice" exists. The loss of one rectangular slice may lead to the
necessity of sending more than one SLI in order to precisely identify
the region of lost/damaged MBs.

The first field of the FCI defines the first macroblock of a picture
as 1 and not, as one could suspect, as 0. This was done to align
this specification with the comparable mechanism available in ITU-T
Rec. H.245 [24]. The maximum number of macroblocks in a picture
(2**13 or 8192) corresponds to the maximum picture sizes of most of
the ITU-T and ISO/IEC video codecs. If future video codecs offer
larger picture sizes and/or smaller macroblock sizes, then an
additional FB message has to be defined. The six least significant
bits of the Temporal Reference field are deemed to be sufficient to
indicate the picture in which the loss occurred.

The reaction to an SLI is not part of this specification. One
typical way of reacting to an SLI is to use intra refresh for the
affected spatial region.

Algorithms were reported that keep track of the regions affected by
motion compensation, in order to allow for a transmission of Intra
macroblocks to all those areas, regardless of the timing of the FB
(see H.263 (2000) Appendix I [17] and [15]). Although the timing of
the FB is less critical when those algorithms are used than if they
are not, it has to be observed that those algorithms correct large
parts of the picture and, therefore, have to transmit much higher
data volume in case of delayed FBs.

6.3.3. Reference Picture Selection Indication (RPSI)

The RPSI FB message is identified by PT=PSFB and FMT=3.

There MUST be exactly one RPSI contained in the FCI field.

6.3.3.1. Semantics

Modern video coding standards such as MPEG-4 visual version 2 [16] or
H.263 version 2 [17] allow using older reference pictures than the
most recent one for predictive coding. Typically, a first-in-first-
out queue of reference pictures is maintained. If an encoder has
learned about a loss of encoder-decoder synchronicity, a known-as-
correct reference picture can be used. As this reference picture is
temporally further away than usual, the resulting predictively coded
picture will use more bits.

Both MPEG-4 and H.263 define a binary format for the "payload" of an
RPSI message that includes information such as the temporal ID of the
damaged picture and the size of the damaged region. This bit string
is typically small (a couple of dozen bits), of variable length, and
self-contained, i.e., contains all information that is necessary to
perform reference picture selection.

Both MPEG-4 and H.263 allow the use of RPSI with positive feedback
information as well. That is, pictures (or Slices) are reported that
were decoded without error. Note that any form of positive feedback
MUST NOT be used when in a multiparty session (reporting positive
feedback about individual reference pictures at RTCP intervals is not
expected to be of much use anyway).

6.3.3.2. Format

The FCI for the RPSI message follows the format depicted in Figure 7:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      PB       |0| Payload Type|    Native RPSI bit string     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   defined per codec          ...                | Padding (0) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 7: Syntax of the Reference Picture Selection Indication (RPSI)

PB: 8 bits
The number of unused bits required to pad the length of the RPSI
message to a multiple of 32 bits.

0: 1 bit
MUST be set to zero upon transmission and ignored upon reception.

Payload Type: 7 bits
Indicates the RTP payload type in the context of which the native
RPSI bit string MUST be interpreted.

Native RPSI bit string: variable length
The RPSI information as natively defined by the video codec.

Padding: #PB bits
A number of bits set to zero to fill up the contents of the RPSI
message to the next 32-bit boundary. The number of padding bits
MUST be indicated by the PB field.
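
The field layout above can be exercised with a short sketch. This is
an illustration only: the function names are hypothetical, and the
native RPSI bit string is represented as a Python string of '0' and
'1' characters (most significant bit first) purely for readability.

```python
def build_rpsi_fci(payload_type, native_bits):
    """Pack PB, the zero bit, the payload type, and the native bit string."""
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type must fit in 7 bits")
    if set(native_bits) - {"0", "1"}:
        raise ValueError("native_bits must contain only '0' and '1'")
    header_bits = 16  # PB (8) + zero bit (1) + Payload Type (7)
    # Number of padding bits needed to reach the next 32-bit boundary.
    pad_bits = -(header_bits + len(native_bits)) % 32
    bits = (format(pad_bits, "08b") + "0" + format(payload_type, "07b")
            + native_bits + "0" * pad_bits)
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


def parse_rpsi_fci(fci):
    """Return (payload_type, native_bits) from an RPSI FCI field."""
    if len(fci) < 4 or len(fci) % 4:
        raise ValueError("FCI length must be a non-zero multiple of 4 octets")
    total_bits = len(fci) * 8
    pad_bits = fci[0]
    if pad_bits > 31 or pad_bits > total_bits - 16:
        raise ValueError("invalid PB value")
    payload_type = fci[1] & 0x7F  # the leading zero bit is ignored
    bits = format(int.from_bytes(fci, "big"), "0{}b".format(total_bits))
    return payload_type, bits[16:total_bits - pad_bits]
```

For example, a 7-bit native string with payload type 96 needs 9
padding bits, so the FCI occupies exactly one 32-bit word.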

6.3.3.3. Timing Rules

RPSI is even more delay-critical than algorithms using SLI. This
is because the older the RPSI message is, the more bits the encoder
has to spend to re-establish encoder-decoder synchronicity. See [15]
for some information about the overhead of RPSI for certain bit
rate/frame rate/loss rate scenarios.

Therefore, RPSI messages should typically be sent as soon as
possible, employing the algorithm of Section 3.

6.4. Application Layer Feedback Messages

Application layer FB messages are a special case of payload-specific
messages and are identified by PT=PSFB and FMT=15. There MUST be
exactly one application layer FB message contained in the FCI field,
unless the application layer FB message structure itself allows for
stacking (e.g., by means of a fixed size or explicit length
indicator).

These messages are used to transport application-defined data
directly from the receiver's to the sender's application. The data
that is transported is not identified by the FB message. Therefore,
the application MUST be able to identify the message payload.

Usually, applications define their own set of messages, e.g., NEWPRED
messages in MPEG-4 [16] (carried in RTP packets according to RFC 3016
[23]) or FB messages in H.263/Annex N, U [17] (packetized as per RFC
2429 [14]). These messages do not need any additional information
from the RTCP message. Thus, the application message is simply
placed into the FCI field as follows and the length field is set
accordingly.

Application Message (FCI): variable length
This field contains the original application message that should
be transported from the receiver to the source. The format is
application dependent. The length of this field is variable. If
the application data is not 32-bit aligned, padding bits and bytes
MUST be added to achieve 32-bit alignment. Identification of
padding is up to the application layer and not defined in this
specification.
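
The following sketch shows one way an application layer FB message
might be assembled. It assumes the common FB packet header defined in
Section 6.1 (version, padding bit, FMT, PT, length, SSRC of packet
sender, SSRC of media source). The function name is hypothetical, and
padding with zero octets is an assumption of this example, since the
identification of padding is left to the application.

```python
import struct

PT_PSFB = 206  # payload-specific FB message
FMT_AFB = 15   # application layer FB message


def build_afb_packet(sender_ssrc, media_ssrc, app_message):
    """Wrap an application message (bytes) in a PSFB packet with FMT=15."""
    # Assumption: zero octets are used to reach 32-bit alignment.
    pad_len = -len(app_message) % 4
    fci = app_message + b"\x00" * pad_len
    # The RTCP length field counts 32-bit words minus one.
    length_words = (12 + len(fci)) // 4 - 1
    first_octet = (2 << 6) | FMT_AFB  # V=2, P=0, FMT=15
    header = struct.pack("!BBHII", first_octet, PT_PSFB, length_words,
                         sender_ssrc, media_ssrc)
    return header + fci
```

A 5-octet application message is padded to 8 octets, giving a 20-octet
packet whose length field is 4.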

The application layer FB message specification MUST define whether or
not the message needs to be interpreted specifically in the context
of a certain codec (identified by the RTP payload type). If a
reference to the payload type is required for proper processing, the
application layer FB message specification MUST define a way to
communicate the payload type information as part of the application
layer FB message itself.

7. Early Feedback and Congestion Control

The previous sections defined the FB messages as well as the timing
rules for sending them. How to react to the feedback received
depends on the application using the feedback mechanisms and hence
is beyond the scope of this document.

However, across all applications, there is a common requirement for
(TCP-friendly) congestion control on the media stream as defined in
[1] and [2] when operating in a best-effort network environment.

It should be noted that RTCP feedback itself is insufficient for
congestion control purposes as it is likely to operate at much slower
timescales than other transport layer feedback mechanisms (that
usually operate in the order of RTT). Therefore, additional
mechanisms are required to perform proper congestion control.

A congestion control algorithm that shares the available bandwidth
reasonably fairly with competing TCP connections, e.g., TFRC [7],
MUST be used to determine the data rate for the media stream within
the bounds of the RTP sender's and the media session's capabilities
if the RTP/AVPF session is transmitted in a best-effort environment.
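
As a rough illustration of the requirement above, the sketch below
evaluates the TCP throughput equation given in the TFRC specification
[7] and caps the result at a session limit. The function and parameter
names are hypothetical. Setting b = 1 and t_RTO = 4 * RTT follows the
simplifications suggested in [7]. How the loss event rate and RTT are
measured is outside the scope of this example.

```python
import math


def tfrc_rate(s, rtt, p, b=1, t_rto=None):
    """Allowed sending rate in bytes/second per the TFRC throughput equation.

    s: packet size in bytes; rtt: round-trip time in seconds;
    p: loss event rate (0 < p <= 1); b: packets acknowledged per ACK.
    """
    if not 0 < p <= 1:
        raise ValueError("loss event rate must be in (0, 1]")
    if t_rto is None:
        t_rto = 4 * rtt  # simplification suggested in the TFRC specification
    denominator = (rtt * math.sqrt(2 * b * p / 3)
                   + t_rto * (3 * math.sqrt(3 * b * p / 8)
                              * p * (1 + 32 * p ** 2)))
    return s / denominator


def media_rate(s, rtt, p, max_rate):
    """Keep the rate within the sender's and session's capabilities."""
    return min(tfrc_rate(s, rtt, p), max_rate)
```

For 1000-byte packets, an RTT of 100 ms, and a loss event rate of 1%,
the equation yields roughly 112 kilobytes per second before the
session limit is applied.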

8. Security Considerations

RTP packets transporting information with the proposed payload format
are subject to the security considerations discussed in the RTP
specification [1] and in the RTP/AVP profile specification [2]. This
profile does not specify any additional security services.

This profile modifies the timing behavior of RTCP: it eliminates the
minimum RTCP interval of five seconds and allows receivers to provide
earlier feedback. Group members of the associated RTP session
(possibly pretending to represent a large number of entities) may
disturb the operation of RTCP by sending large numbers of RTCP
packets, thereby reducing the RTCP bandwidth available for Regular
RTCP reporting as well as for Early FB messages. (Note that an
entity need not be a member of a multicast group to cause these
effects.) Similarly, malicious members may send very large RTCP
messages, thereby increasing the avg_rtcp_size variable and reducing
the effectively available RTCP bandwidth.

Feedback information may be suppressed if unknown RTCP feedback
packets are received. This introduces the risk of a malicious group
member reducing Early feedback by simply transmitting payload-
specific RTCP feedback packets with random contents that are not
recognized by any receiver (so they will suppress feedback) or by the
sender (so no repair actions will be taken).

A malicious group member can also report arbitrarily high loss rates
in the feedback information to make the sender throttle the data
transmission and increase the amount of redundancy information or
take other action to deal with the purported packet loss (e.g., send
fewer frames or decrease audio/video quality). This may result in a
degradation of the quality of the reproduced media stream.

Finally, a malicious group member can act as a large number of group
members and thereby obtain an artificially large share of the Early
feedback bandwidth and reduce the reactivity of the other group
members -- possibly even causing them to no longer operate in
Immediate or Early feedback mode and thus undermining the whole
purpose of this profile.

Senders as well as receivers SHOULD behave conservatively when
observing strange reporting behavior. For excessive failure
reporting from one or a few receivers, the sender MAY decide to no
longer consider this feedback when adapting its transmission behavior
for the media stream. In any case, senders and receivers SHOULD
still adhere to the maximum RTCP bandwidth but make sure that they
are capable of transmitting at least regularly scheduled RTCP
packets. Senders SHOULD carefully consider how to adjust their
transmission bandwidth when encountering strange reporting behavior;
they MUST NOT increase their transmission bandwidth even if ignoring
suspicious feedback.

Attacks using false RTCP packets (Regular as well as Early ones) can
be avoided by authenticating all RTCP messages. This can be achieved
by using the AVPF profile together with the Secure RTP profile as
defined in [22]; as a prerequisite, an appropriate combination of
those two profiles (an "SAVPF") is being specified [21]. Note that,
when employing group authentication (as opposed to source
authentication), the aforementioned attacks may be carried out by
malicious or malfunctioning group members in possession of the right
keying material.

9. IANA Considerations

The following contact information shall be used for all registrations
included here:

Contact: Joerg Ott
mailto:jo@acm.org
tel:+358-9-451-2460

The feedback profile as an extension to the profile for audio-visual
conferences with minimal control has been registered for the Session
Description Protocol (specifically the type "proto"): "RTP/AVPF".

SDP Protocol ("proto"):

Name: RTP/AVPF
Long form: Extended RTP Profile with RTCP-based Feedback
Type of name: proto
Type of attribute: Media level only
Purpose: RFC 4585
Reference: RFC 4585

SDP Attribute ("att-field"):

Attribute name: rtcp-fb
Long form: RTCP Feedback parameter
Type of name: att-field
Type of attribute: Media level only
Subject to charset: No
Purpose: RFC 4585
Reference: RFC 4585
Values: See this document and registrations below

A new registry has been set up for the "rtcp-fb" attribute, with the
following registrations created initially: "ack", "nack", "trr-int",
and "app" as defined in this document.

Initial value registration for the attribute "rtcp-fb"

Value name: ack
Long name: Positive acknowledgement
Reference: RFC 4585.

Value name: nack
Long name: Negative Acknowledgement
Reference: RFC 4585.

Value name: trr-int
Long name: Minimal receiver report interval
Reference: RFC 4585.

Value name: app
Long name: Application-defined parameter
Reference: RFC 4585.

Further entries may be registered on a first-come, first-served basis.
Each new registration needs to indicate the parameter name and the
syntax of possible additional arguments. For each new registration,
it is mandatory that a permanent, stable, and publicly accessible
document exists that specifies the semantics of the registered
parameter, the syntax and semantics of its parameters as well as
corresponding feedback packet formats (if needed). The general
registration procedures of [3] apply.

For use with both "ack" and "nack", a joint sub-registry has been set
up that initially registers the following values:

Initial value registration for the attribute values "ack" and "nack":

Value name: sli
Long name: Slice Loss Indication
Usable with: nack
Reference: RFC 4585.

Value name: pli
Long name: Picture Loss Indication
Usable with: nack
Reference: RFC 4585.

Value name: rpsi
Long name: Reference Picture Selection Indication
Usable with: ack, nack
Reference: RFC 4585.

Value name: app
Long name: Application layer feedback
Usable with: ack, nack
Reference: RFC 4585.

Further entries may be registered on a first-come, first-served basis.
Each registration needs to indicate the parameter name, the syntax of
possible additional arguments, and whether the parameter is
applicable to "ack" or "nack" feedback or both or some different
"rtcp-fb" attribute parameter. For each new registration, it is
mandatory that a permanent, stable, and publicly accessible document
exists that specifies the semantics of the registered parameter, the
syntax and semantics of its parameters as well as corresponding
feedback packet formats (if needed). The general registration
procedures of [3] apply.
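
The initial registrations above can be captured in a small validator
for "a=rtcp-fb:" lines. This is a sketch under stated assumptions: the
function name is hypothetical, the line syntax follows the SDP
definitions earlier in this document, and rejecting unregistered
tokens is a deliberate simplification (the grammar itself permits
extension tokens).

```python
# Initial "rtcp-fb" values and the joint "ack"/"nack" sub-registry.
RTCP_FB_VALUES = {"ack", "nack", "trr-int", "app"}
SUB_VALUES = {
    "sli": {"nack"},
    "pli": {"nack"},
    "rpsi": {"ack", "nack"},
    "app": {"ack", "nack"},
}


def parse_rtcp_fb(line):
    """Return (payload_type, value, params) for an 'a=rtcp-fb:' line."""
    prefix = "a=rtcp-fb:"
    if not line.startswith(prefix):
        raise ValueError("not an rtcp-fb attribute")
    parts = line[len(prefix):].split()
    if len(parts) < 2:
        raise ValueError("payload type and value are required")
    pt, value, params = parts[0], parts[1], parts[2:]
    if pt != "*" and not pt.isdigit():
        raise ValueError("payload type must be '*' or a number")
    if value not in RTCP_FB_VALUES:
        raise ValueError("value is not among the initial registrations")
    if value == "trr-int":
        if len(params) != 1 or not params[0].isdigit():
            raise ValueError("trr-int takes one numeric argument")
    elif value in ("ack", "nack") and params:
        allowed = SUB_VALUES.get(params[0])
        if allowed is None or value not in allowed:
            raise ValueError("parameter is not usable with " + value)
    return pt, value, params
```

For instance, "a=rtcp-fb:98 nack pli" is accepted, whereas
"a=rtcp-fb:98 ack sli" is rejected because "sli" is registered for
use with "nack" only.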

Two RTCP Control Packet Types have been registered: one for the class
of transport layer FB messages ("RTPFB") and one for the class of
payload-specific FB messages ("PSFB"). Per Section 6, RTPFB=205 and
PSFB=206 have been added to the RTCP registry.

RTP RTCP Control Packet types (PT):

Name: RTPFB
Long name: Generic RTP Feedback
Value: 205
Reference: RFC 4585.

Name: PSFB
Long name: Payload-specific
Value: 206
Reference: RFC 4585.

As AVPF defines additional RTCP payload types, the corresponding
"reserved" RTP payload type space (72-76, as defined in [2]) has
been expanded accordingly.

A new sub-registry has been set up for the FMT values for both the
RTPFB payload type and the PSFB payload type, with the following
registrations created initially:

Within the RTPFB range, the following two format (FMT) values are
initially registered:

Name: Generic NACK
Long name: Generic negative acknowledgement
Value: 1
Reference: RFC 4585.

Name: Extension
Long name: Reserved for future extensions
Value: 31
Reference: RFC 4585.

Within the PSFB range, the following five format (FMT) values are
initially registered:

Name: PLI
Long name: Picture Loss Indication
Value: 1
Reference: RFC 4585.

Name: SLI
Long name: Slice Loss Indication
Value: 2
Reference: RFC 4585.

Name: RPSI
Long name: Reference Picture Selection Indication
Value: 3
Reference: RFC 4585.

Name: AFB
Long name: Application Layer Feedback
Value: 15
Reference: RFC 4585.

Name: Extension
Long name: Reserved for future extensions.
Value: 31
Reference: RFC 4585.

Further entries may be registered following the "Specification
Required" rules as defined in RFC 2434 [9]. Each registration needs
to indicate the FMT value, if there is a specific FB message to go
into the FCI field, and whether or not multiple FB messages may be
stacked in a single FCI field. For each new registration, it is
mandatory that a permanent, stable, and publicly accessible document
exists that specifies the semantics of the registered parameter as
well as the syntax and semantics of the associated FB message (if
any). The general registration procedures of [3] apply.
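
The registered packet types and FMT values can be summarized as
lookup tables. The sketch below is illustrative only: the function
name is hypothetical, and it assumes the common FB header of Section
6.1, in which the first octet carries the version (2 bits), the
padding bit, and the FMT (5 bits), and the second octet carries the
packet type.

```python
RTCP_PT_NAMES = {205: "RTPFB", 206: "PSFB"}
FMT_NAMES = {
    "RTPFB": {1: "Generic NACK", 31: "Extension"},
    "PSFB": {1: "PLI", 2: "SLI", 3: "RPSI", 15: "AFB", 31: "Extension"},
}


def classify_feedback(packet):
    """Return (packet type name, FMT name), or None if not an FB packet."""
    if len(packet) < 2:
        raise ValueError("packet too short")
    if packet[0] >> 6 != 2:
        raise ValueError("unexpected RTP version")
    fmt = packet[0] & 0x1F  # low five bits of the first octet
    pt_name = RTCP_PT_NAMES.get(packet[1])
    if pt_name is None:
        return None  # some other RTCP packet type (e.g., SR or RR)
    return pt_name, FMT_NAMES[pt_name].get(fmt, "unassigned")
```

A packet beginning with the octets 0x81 0xCE (FMT=1, PT=206) is
therefore classified as a PSFB Picture Loss Indication.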

10. Acknowledgements

This document is a product of the Audio-Visual Transport (AVT)
Working Group of the IETF. The authors would like to thank Steve
Casner and Colin Perkins for their comments and suggestions as well
as for their responsiveness to numerous questions. The authors would
also like to particularly thank Magnus Westerlund for his review and
his valuable suggestions and Shigeru Fukunaga for the contributions
on FB message formats and semantics.

We would also like to thank Andreas Buesching and people at Panasonic
for their simulations and the first independent implementations of
the feedback profile.

11. References

11.1. Normative References

[1] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
"RTP: A Transport Protocol for Real-Time Applications", STD 64,
RFC 3550, July 2003.

[2] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video
Conferences with Minimal Control", STD 65, RFC 3551, July 2003.

[3] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, July 2006.

[4] Casner, S., "Session Description Protocol (SDP) Bandwidth
Modifiers for RTP Control Protocol (RTCP) Bandwidth", RFC 3556,
July 2003.

[5] Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", BCP 14, RFC 2119, March 1997.

[6] Turletti, T. and C. Huitema, "RTP Payload Format for H.261 Video
Streams", RFC 2032, October 1996.

[7] Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP Friendly
Rate Control (TFRC): Protocol Specification", RFC 3448, January
2003.

[8] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
Session Description Protocol (SDP)", RFC 3264, June 2002.

[9] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
Considerations Section in RFCs", BCP 26, RFC 2434, October 1998.

11.2. Informative References

[10] Camarillo, G., Eriksson, G., Holler, J., and H. Schulzrinne,
"Grouping of Media Lines in the Session Description Protocol
(SDP)", RFC 3388, December 2002.

[11] Perkins, C. and O. Hodson, "Options for Repair of Streaming
Media", RFC 2354, June 1998.

[12] Rosenberg, J. and H. Schulzrinne, "An RTP Payload Format for
Generic Forward Error Correction", RFC 2733, December 1999.

[13] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., Handley, M.,
Bolot, J., Vega-Garcia, A., and S. Fosse-Parisis, "RTP Payload
for Redundant Audio Data", RFC 2198, September 1997.

[14] Bormann, C., Cline, L., Deisher, G., Gardos, T., Maciocco, C.,
Newell, D., Ott, J., Sullivan, G., Wenger, S., and C. Zhu, "RTP
Payload Format for the 1998 Version of ITU-T Rec. H.263 Video
(H.263+)", RFC 2429, October 1998.

[15] Girod, B. and N. Faerber, "Feedback-based error control for
mobile video transmission", Proceedings IEEE, Vol. 87, No. 10, pp.
1707-1723, October 1999.

[16] ISO/IEC 14496-2:2001/Amd.1:2002, "Information technology -
Coding of audio-visual objects - Part 2: Visual", 2001.

[17] ITU-T Recommendation H.263, "Video Coding for Low Bit Rate
Communication", November 2000.

[18] Schulzrinne, H. and S. Petrack, "RTP Payload for DTMF Digits,
Telephony Tones and Telephony Signals", RFC 2833, May 2000.

[19] Kohler, E., Handley, M., and S. Floyd, "Datagram Congestion
Control Protocol (DCCP)", RFC 4340, March 2006.

[20] Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP Friendly
Rate Control (TFRC): Protocol Specification", RFC 3448, January
2003.

[21] Ott, J. and E. Carrara, "Extended Secure RTP Profile for RTCP-
based Feedback (RTP/SAVPF)", Work in Progress, December 2005.

[22] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC
3711, March 2004.

[23] Kikuchi, Y., Nomura, T., Fukunaga, S., Matsui, Y., and H.
Kimata, "RTP Payload Format for MPEG-4 Audio/Visual Streams",
RFC 3016, November 2000.

[24] ITU-T Recommendation H.245, "Control protocol for multimedia
communication", May 2006.

Authors' Addresses

Joerg Ott
Helsinki University of Technology (TKK)
Networking Laboratory
PO Box 3000
FIN-02015 TKK
Finland

EMail: jo@acm.org

Stephan Wenger
Nokia Research Center
P.O. Box 100
33721 Tampere
Finland

EMail: stewe@stewe.org

Noriyuki Sato
Oki Electric Industry Co., Ltd.
1-16-8 Chuo, Warabi-city, Saitama 335-8510
Japan

Phone: +81 48 431 5932
Fax: +81 48 431 9115
EMail: sato652@oki.com

Carsten Burmeister
Panasonic R&D Center Germany GmbH

EMail: carsten.burmeister@eu.panasonic.com

Jose Rey
Panasonic R&D Center Germany GmbH
Monzastr. 4c
D-63225 Langen, Germany

EMail: jose.rey@eu.panasonic.com

Full Copyright Statement

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.

Acknowledgement

Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).
