1.1 Introduction
Multimedia applications that combine audio, video, and/or data to provide communications services at a distance over networks are termed networked multimedia services. These applications can be distributed across networks, and their communication connectivity can be one-to-one, one-to-many, many-to-one, or many-to-many. It is the networking function embedded in a multimedia application that makes it a networked multimedia service, and the network architecture that supports such services is termed the networked multimedia services architecture.
1.2 Functional Characteristics
The functional characteristics of networked multimedia applications can be very complex. Multimedia communications for conversational applications must not only operate in real time and meet stringent performance requirements; a simple point-to-point audio call may also evolve, at the users' discretion, into a multimedia call, or into a multipoint multimedia call if more participants are added [1,2]. If a data-sharing application is added to the same call, as is expected to be the usual case for real-time multimedia collaboration, the situation becomes more complicated still. The connectivity between participants can vary from point-to-point to many-to-many, usually with symmetric traffic flows. In video-on-demand (VOD) applications, the connectivity is primarily one-to-many with highly asymmetric traffic flows, because video is usually distributed from centralized multimedia servers to multiple users; the performance constraints, however, may be somewhat less stringent than those of conversational applications. Some multimedia applications, such as multimedia messaging, can be both one-to-many and many-to-many in connectivity, with traffic flows that are symmetric or asymmetric depending on the needs of the participants. In general, the functional characteristics of all other multimedia applications typically fall within the three categories described here.
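The three categories described above can be summarized as a small lookup table. The following is a minimal sketch only: the names `Connectivity`, `TrafficFlow`, `FunctionalProfile`, `PROFILES`, and `supports` are illustrative assumptions rather than part of any standard, and the entries simply restate the typical connectivity and traffic-flow patterns given in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Connectivity(Enum):
    POINT_TO_POINT = "point-to-point"
    ONE_TO_MANY = "one-to-many"
    MANY_TO_MANY = "many-to-many"


class TrafficFlow(Enum):
    SYMMETRIC = "symmetric"
    ASYMMETRIC = "asymmetric"


@dataclass(frozen=True)
class FunctionalProfile:
    """Typical connectivity and traffic patterns of one application category."""
    connectivity: FrozenSet[Connectivity]
    traffic: FrozenSet[TrafficFlow]


# Hypothetical table: keys and structure are assumptions for illustration;
# the values restate the typical cases described in the text.
PROFILES = {
    "conversational": FunctionalProfile(
        frozenset({Connectivity.POINT_TO_POINT, Connectivity.MANY_TO_MANY}),
        frozenset({TrafficFlow.SYMMETRIC}),
    ),
    "video-on-demand": FunctionalProfile(
        frozenset({Connectivity.ONE_TO_MANY}),
        frozenset({TrafficFlow.ASYMMETRIC}),
    ),
    "messaging": FunctionalProfile(
        frozenset({Connectivity.ONE_TO_MANY, Connectivity.MANY_TO_MANY}),
        frozenset({TrafficFlow.SYMMETRIC, TrafficFlow.ASYMMETRIC}),
    ),
}


def supports(category: str, connectivity: Connectivity) -> bool:
    """Return True if the category typically uses the given connectivity."""
    return connectivity in PROFILES[category].connectivity
```

For instance, `supports("video-on-demand", Connectivity.ONE_TO_MANY)` holds under this table, whereas many-to-many connectivity is not listed for VOD because the text describes it as primarily one-to-many.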
1.3 Performance Characteristics
The audio and video of multimedia applications can be continuous, while data consisting of text, still images, and/or graphics can be discrete. However, animation, which is also considered a part of data, is continuous; it consists of audio, video, still images, graphics, and/or text, and needs inter- and intramedia synchronization. Audio, video, and still images are usually captured from the real world, while text, graphics, and animation are synthesized by computers.
On the basis of the performance characteristics of communications, multimedia applications can be categorized as real-time (RT), near-real-time (near-RT), or non-real-time (non-RT). RT applications have strict bounds on packet loss, packet delay, and delay jitter, while near-RT applications have less strict bounds on these performance parameters. For example, teleconferencing (TC) and video teleconferencing (VTC)/videoconferencing (VC) are considered RT services because they involve real-time, two-way, point-to-point or multipoint conversations between users, and their audio and video performance requirements can be stated as follows [1]:
■ One-way end-to-end delay (including propagation, network, and equipment) for audio or video should be between 100 and 150 ms.
■ Mean-opinion-score (MOS) level for audio should be between 4.0 and 5.0.
■ MOS level for video should be between 3.5 and 5.0.
■ End-to-end delay jitter should be very short, less than 250 μs in some cases.
■ Bit error rate (BER) should be very low for good quality audio or video, although some BER can be tolerated.
■ Intermedia and intramedia synchronization need to be maintained using suitable algorithms.
■ Differential delay between audio and video transmission should stay within −20 ms to +40 ms to maintain proper intermedia synchronization.
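The numeric requirements in the list above can be expressed as a simple conformance check. This is a minimal sketch under stated assumptions: the names `Measurement` and `check_rt_requirements` are hypothetical, 150 ms is treated as the upper limit on one-way delay, and the audio/video skew is taken as audio delay minus video delay, a sign convention the text does not specify. The BER and synchronization-algorithm items are omitted because the text gives no numeric bounds for them.

```python
from dataclasses import dataclass
from typing import List

# Thresholds restated from the list above (constant names are illustrative).
MAX_ONE_WAY_DELAY_MS = 150.0    # assumption: upper end of the 100-150 ms range
MIN_AUDIO_MOS = 4.0
MIN_VIDEO_MOS = 3.5
MAX_JITTER_US = 250.0
SKEW_RANGE_MS = (-20.0, 40.0)   # assumption: skew = audio delay - video delay


@dataclass
class Measurement:
    """Hypothetical end-to-end measurements for one conferencing session."""
    one_way_delay_ms: float
    audio_mos: float
    video_mos: float
    jitter_us: float
    audio_delay_ms: float
    video_delay_ms: float


def check_rt_requirements(m: Measurement) -> List[str]:
    """Return the names of the requirements that the measurement violates."""
    violations = []
    if m.one_way_delay_ms > MAX_ONE_WAY_DELAY_MS:
        violations.append("one-way delay")
    if m.audio_mos < MIN_AUDIO_MOS:
        violations.append("audio MOS")
    if m.video_mos < MIN_VIDEO_MOS:
        violations.append("video MOS")
    if m.jitter_us >= MAX_JITTER_US:
        violations.append("delay jitter")
    skew_ms = m.audio_delay_ms - m.video_delay_ms
    if not SKEW_RANGE_MS[0] <= skew_ms <= SKEW_RANGE_MS[1]:
        violations.append("audio/video skew")
    return violations
```

An empty list means that all of the checked bounds are met; otherwise the returned names identify which requirement to investigate.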
One-way VOD [2], which is considered a near-RT communication, can have much less stringent performance requirements than TC or VTC. Text and graphics are non-RT applications, for which the one-way delay requirement can be of the order of a few seconds; however, unlike audio or video, they cannot tolerate any bit errors.
The synchronization requirements between different media of multimedia applications impose a heavy burden on multimedia transport networks, especially packet networks such as those based on the Internet Protocol (IP). RT applications are also considered live multimedia applications, in which live audio, video, and/or data are generated from live sources such as microphones, video cameras, and/or application sharing by humans or machines, while near-RT applications are usually retrieved from databases and can be considered retrieval multimedia applications. Consequently, the synchronization requirements of RT and near-RT applications also differ significantly. The transmission side of RT applications does not require much control, while near-RT applications must have some defined relationships between media and require some scheduling mechanisms for guaranteed synchronizat...