Detailed Technology Discussion
Background of H.323 and SIP
Here's the story: The establishment of end-to-end communication can either be a function of the communication protocols carrying the data, or it can be a separate function. With network protocols like TCP the setup of the connection is inherent in the protocol itself. The drawback to this approach is that, ultimately, a connection is either established or rejected, and little more information is conveyed than that. There are limited capabilities for varying types of connections and very few indications of why a particular connection might have failed, or how to go about correcting a failure. This is fine for TCP; it's the basis of web browsing and email transfer. It's simple: either connect, or not; and if you connect, transfer some data.
With a voice call there are many things that happen during the connection setup process. The target station may be busy ("busy signal") or the circuit may be overloaded ("all circuits are busy..."). The intended target may have their phone forwarded to another number, and they may be using caller ID. These are reasons why a signaling protocol is used for voice communication. In the course of a voice call a virtual circuit is first established between the communicators, then data is sent across the established circuit.
In the world of telephone communications (and the associated data-over-the-phone-system "ISDN" network) a popular call setup protocol is called "Q.931". It implements requests and replies like "call setup", "call proceeding", "connected", and "call rejected". The world of Asynchronous Transfer Mode (ATM) uses a protocol family called "Q.2931", similar to Q.931 but with extensions to support the various types of ATM circuits that might be used not only for voice, but for real-time video and time-sensitive data.
With the advent of Voice-over-IP the first approach taken to perform call setup and management was to create a protocol that was, in many ways, similar to Q.931/Q.2931. This is the H.323 protocol. Because H.323 was very telephone-company-centric, an alternative protocol was created, the Session Initiation Protocol (SIP), that was more consistent with the way the routed Internet worked.
SIP is a set of standards defined by IETF RFCs (Requests for Comments) in the same way IP, TCP, DNS, and other Internet protocols are standardized. The first SIP RFC (RFC 2543) was published in 1999. SIP implements a series of request and reply messages that allow a device to establish a voice, video, or specialized data connection across the Internet. Because SIP is based on RFC standards there is general interoperability between SIP-enabled devices from different vendors.
In a SIP-based VoIP system the intelligence and implementation of interesting features (conference calling, call forwarding, etc.) reside in the edge handset device, as opposed to the central PBX switch model of non-VoIP telephony in which the switch has the intelligence. This makes a SIP-based system highly scalable, since functionality is not vested in a central switch with an upper limit on circuit management, call handling, and feature implementation.
H.323 is a set of standards defined by the International Telecommunication Union (ITU) for the transmission of real-time audio, video, and data over packet-switched networks, including the Internet (using IP packets). The original standard was released in 1996 and, prior to the introduction of SIP, was the emerging standard for Voice-over-IP telephony. H.323 fit nicely into the PSTN switching world and was consistent, in many ways, with protocols like Q.931 and Q.2931 used for call setup and management in ISDN and ATM respectively.
SIP is similar to HTTP in that it exchanges information in an ASCII text format and it works in harmony with DNS to resolve domain names and service names into IP addresses. SIP response codes are modeled on HTTP status codes (e.g., 401 = Unauthorized). H.323 exchanges binary information and is well suited to operation in the PSTN ISDN environment but cannot take advantage of DNS for name resolution.
SIP was designed to be a flexible, general-purpose way to set up real-time multimedia sessions between groups of participants. For example, in addition to simple telephone calls, SIP can also be used to set up video and audio multicast meetings, or instant messaging conferences. Once SIP has established a session (a circuit), RTP (Real-time Transport Protocol), carried by UDP over IP, supplies a header that conveys control information regarding the data content being transmitted.
SIP does more than just handle call setup and tear down. The table below shows the five major functions within SIP from a VoIP point of view.
|User location and registration||End points (telephones) notify SIP proxies of their location; SIP determines which end points will participate in a call.
|User availability||SIP is used by end points to determine whether they will “answer” a call.
|User capabilities||SIP is used by end points to negotiate media capabilities, such as agreeing on a mutually supported voice codec.
|Session setup||SIP tells the end point that its phone should be “ringing”; SIP is used to agree on session attributes used by the calling and called party.
|Session management||SIP is used to transfer calls, terminate calls, and change call parameters in mid-session (such as adding a 3-way conference).
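The capability negotiation mentioned in the table can be illustrated with a small sketch. In practice the codec lists travel in a session description (commonly SDP) in the body of SIP messages; the function below models only the outcome, picking the first codec in the caller's preference order that the called party also supports. The name `negotiate_codec` and the sample codec lists are hypothetical and chosen purely for illustration.

```python
from typing import Optional, Sequence


def negotiate_codec(offered: Sequence[str], supported: Sequence[str]) -> Optional[str]:
    """Return the first offered codec that the answering side also supports.

    `offered` is the caller's list in preference order; `supported` lists the
    codecs the called end point can handle. Returns None when the two sides
    have no codec in common.
    """
    supported_set = {name.upper() for name in supported}
    for codec in offered:
        if codec.upper() in supported_set:
            return codec
    return None


caller_offer = ["G729", "PCMU", "PCMA"]  # caller would prefer G.729
callee_codecs = ["PCMA", "PCMU"]         # callee has no G.729 support
print(negotiate_codec(caller_offer, callee_codecs))
```

When no codec is shared the function returns `None`, which corresponds to a call that cannot be set up with the media capabilities on offer.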
- Provides Presence and Mobility
- Protocol Primitives: Setup, terminate, and modify a session
- Redirect calls from unknown callers to the receptionist
- Reply with a web page if a user is unavailable
- Send a JPEG on invitation
- Send Instant Messages
- Exchange any MIME type (e.g., email messages with attachments)
- A SIP "User Agent Client (UA Client)" originates calls; a "User Agent Server (UA Server)" listens for incoming calls.
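Because SIP messages are plain text, the division of labor between the UA Client and the UA Server can be sketched in a few lines. This is a simplified illustration, not a conforming implementation: the request omits headers that real SIP requires (such as Via, Max-Forwards, and Contact), and the function names `build_invite` and `handle_request` are hypothetical.

```python
def build_invite(caller: str, callee: str, call_id: str) -> str:
    """UA Client side: build a simplified INVITE request as CRLF-separated text."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        f"From: <sip:{caller}>",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        "Content-Length: 0",
        "",  # the two empty strings produce the blank line that ends the headers
        "",
    ]
    return "\r\n".join(lines)


def handle_request(raw: str, busy: bool = False) -> str:
    """UA Server side: parse the request line and choose a status line."""
    request_line = raw.split("\r\n", 1)[0]
    parts = request_line.split(" ")
    if len(parts) != 3 or parts[2] != "SIP/2.0":
        return "SIP/2.0 400 Bad Request"
    method = parts[0]
    if method != "INVITE":
        return "SIP/2.0 501 Not Implemented"
    return "SIP/2.0 486 Busy Here" if busy else "SIP/2.0 180 Ringing"


request = build_invite("alice@example.com", "bob@example.com", "abc123")
print(handle_request(request))
print(handle_request(request, busy=True))
```

The "486 Busy Here" reply is the SIP counterpart of the busy signal: unlike a rejected TCP connection, the response says why the call did not go through.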
SIP registration establishes the presence of the user and binds an IP address to the user's present location.
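The binding that registration creates can be pictured as a lookup table kept by the registrar. The sketch below is a toy model under stated assumptions: the class name `LocationService`, its methods, and the sample addresses are hypothetical, and a real registrar would also authenticate the user and handle multiple contacts per user.

```python
import time
from typing import Dict, Optional, Tuple


class LocationService:
    """Toy registrar: maps a user's public SIP address to a current contact."""

    def __init__(self) -> None:
        # public address -> (contact address, expiry time in seconds)
        self._bindings: Dict[str, Tuple[str, float]] = {}

    def register(self, user: str, contact: str, expires: int = 3600,
                 now: Optional[float] = None) -> None:
        """Record (or refresh) where the user can currently be reached."""
        now = time.time() if now is None else now
        self._bindings[user] = (contact, now + expires)

    def lookup(self, user: str, now: Optional[float] = None) -> Optional[str]:
        """Return the current contact, or None if unregistered or expired."""
        now = time.time() if now is None else now
        binding = self._bindings.get(user)
        if binding is None or binding[1] <= now:
            return None
        return binding[0]


service = LocationService()
service.register("sip:alice@example.com", "sip:alice@192.0.2.10:5060",
                 expires=60, now=0)
print(service.lookup("sip:alice@example.com", now=30))
print(service.lookup("sip:alice@example.com", now=61))
```

Passing `now` explicitly keeps the example deterministic; the second lookup falls after the 60-second lifetime, so the binding is treated as expired.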
RTP is carried on UDP and conveys control information about the media being transmitted, including:
- Media content type
- Encoding method and compression type
- Sender identification
- Encryption type
- Segmentation and reassembly
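Some of the items above map directly onto fields of the 12-byte fixed RTP header defined in RFC 3550: the payload type identifies the media encoding, and the SSRC identifies the sender. The following sketch unpacks that fixed header from a UDP payload. The names `RtpHeader` and `parse_rtp_header` are hypothetical, the sample packet is fabricated for illustration, and the sketch does not attempt to model the other items in the list.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header at the start of a UDP payload."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    # Network byte order: two single bytes, a 16-bit field, two 32-bit fields.
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=first >> 6,
        padding=bool(first & 0x20),
        extension=bool(first & 0x10),
        csrc_count=first & 0x0F,
        marker=bool(second & 0x80),
        payload_type=second & 0x7F,
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
    )


# Fabricated packet: version 2, payload type 0, followed by 160 payload bytes.
sample = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1234ABCD) + b"\x00" * 160
header = parse_rtp_header(sample)
print(header.version, header.payload_type, header.sequence_number, hex(header.ssrc))
```

The sequence number and timestamp let the receiver reorder packets and reconstruct timing, which UDP itself does not provide.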
- SIP can be programmed into CGI scripts and Java servlets
- CPL (Call Processing Language) provides an abstraction for both SIP and H.323
- The SIP Proxy allows SIP calls to traverse the firewall
- To be reachable from a PSTN device, a SIP destination must have an E.164 number as its address. ENUM translates the phone number into a SIP URL at a SIP Gateway.
- SIP has been adopted for 3G cellular networks
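The ENUM translation mentioned in the list above relies on DNS: the E.164 number is rewritten as a domain name under `e164.arpa`, and the DNS records stored at that name can supply a SIP address. The sketch below covers only the first step, building the domain name, and performs no DNS query. The function name `enum_domain` is hypothetical.

```python
def enum_domain(e164_number: str, suffix: str = "e164.arpa") -> str:
    """Convert an E.164 phone number into its ENUM domain name.

    Non-digit characters ("+", spaces, dashes) are dropped, the digits are
    reversed, separated by dots, and the suffix is appended.
    """
    digits = [ch for ch in e164_number if ch in "0123456789"]
    if not digits:
        raise ValueError("number contains no digits")
    return ".".join(reversed(digits)) + "." + suffix


print(enum_domain("+44 20 7946 0148"))
```

Reversing the digits places the country code closest to the suffix, so delegation in DNS can follow the hierarchy of the numbering plan.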