Ports 5060 and 5061, both on TCP and UDP, are associated to the Session Initiation Protocol (SIP) by IANA. In particular, port 5060 is assigned to clear text SIP, and port 5061 is assigned to encrypted SIP, also known as SIP-TLS (SIP over a TLS, Transport Layer Security, encrypted channel). Unfortunately, the standard TLS (successor of SSL) can only be established over TCP. Does this mean SIP UDP-based communications have to travel unencrypted? Fortunately not. An IETF draft covering SIP-DTLS is on the queue (DTLS is Datagram TLS, that is, UDP)... but... what is SIP?
The Session Initiation Protocol (SIP) is a signaling protocol for Internet multimedia conferencing and communications, such as Voice over IP (VoIP). It is extensively used on VoIP architectures, and is becoming the IETF alternative and "natural" replacement to the complex H.323 signaling protocol. As a signaling protocol, SIP is mainly focused on managing and negotiating all the features of VoIP sessions (or other multimedia-based protocols), such as setting up and tearing down VoIP calls.
SIP is probably one of the most complex protocols based on the length of its RFC, RFC 3261, and its flexibility and extensibility, what has caused the publication of multiple standards that update and add new functionality to it. Today, a search by SIP on RFC Editor returns 157 references.
However, it is a text-based request and response protocol, very similar to HTTP (web): SIP messages are made of headers and body, it allows multiple request methods (REGISTER, INVITE, ACK, BYE, etc), it defines entities as URI's (Uniform Resource Indicators), and the set of responses is categorized in different groups (success (2xx), redirection (3xx), etc); all them common features available in HTTP. SIP messages are ASCII encoded and can make use of the standard Multipurpose Internet Mail Extensions (MIME) capabilities. This protocol is tightly integrated with the most commonly used media protocol, RTP (Real-time Transport Protocol), used to carry the streaming and real-time multimedia sessions. The ports used for RTP communications are dynamically negotiated by SIP when a new VoIP or multimedia session is established. This behavior is very similar to FTP, where the control channel manage the communication (TCP/21), and negotiates the ports for the data channel, the one that transports the information (TCP/20 or dynamically negotiated). For those readers interested on inspecting a SIP packet capture, check this sample from the Wireshark repository, where SIP Digest authentication is used.
Why SIP is (or can be) an important protocol? As VoIP and other multimedia capabilities, such as video on demand (VoD) and triple play communications, are being widely deployed by service providers and companies, SIP is silently entering in our networks. However, some providers do not use SIP over the standard ports, and select their own set of ports; this obviously breaks communication capabilites between providers as it is like setting up your web server on a port different than TCP/80 (you need to know in what port it is listening on). Additionally, if SIP P2P (peer to peer) communications get adoption, where intermediate proxies and servers are not required for two (or more) users to establish direct multimedia communications, we might end up opening these ports (TCP & UDP / 5060 & 5061) in most firewalls, and specially, on SOHO environments.
You can start today looking for SIP traffic on your networks. Capture some traffic and analyze it. Using Wireshark you can apply different display filters to confirm if SIP is there:
udp.port == 5060 or udp.port == 5061
tcp.port == 5060 or tcp.port == 5061
sip or sdp
frame contains "SIP"
ssl or dtls
Let's get ready for the upcoming SIP and VoIP wave!
Oct 20th 2009
1 decade ago