Failed WebRTC connections can be caused by restrictive networks: symmetric NATs, blocked ports and even protocol blocking at the application and transport layers. We will walk through the process of establishing a peer-to-peer WebRTC connection and lay out the mechanisms that can lead to failed connections.
Establishing a peer-to-peer WebRTC connection has three steps:
- Signaling
- Discovery (STUN and TURN)
- Establishing the connection
Problems can appear at any part of the process.
Step One: Signaling
Signaling is the first step in establishing a peer-to-peer WebRTC connection.
Signaling is the back channel the two parties use to exchange the initial information needed to establish that connection.
The following information is exchanged:
- Each party’s IP and port where they can be reached (ICE candidates)
- Media capabilities
- Session control messages
WebSockets are widely used for signaling. Popular WebRTC media servers like Kurento use them. Secure WebSockets (wss://) can also be used and are recommended if you want signaling data to be transported securely.
Signaling is also one of the first points where the WebRTC connection process can fail.
Kurento, for example, listens on port 8888 for WebSocket connections and on port 8443 for secure WebSocket connections. This default configuration allows Kurento to run in parallel with your web server (Apache, nginx), but because these are not commonly used ports like 80 or 443, there is a high chance that computers and devices on more restrictive networks will not be able to reach them on your signaling server.
Running your signaling over port 80 or 443 is one of the first things you can do to ensure high connection rates for WebRTC.
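As a minimal sketch of that advice, the helper below builds a signaling URL that stays on port 443 (wss) or port 80 (ws). The function name `signalingUrl`, the host `signaling.example.com` and the `/signaling` path are assumptions made for illustration; they are not part of Kurento or any other server.

```typescript
// Hypothetical helper: keep signaling on the ports that restrictive
// networks almost always allow (443 for wss://, 80 for ws://).
function signalingUrl(host: string, secure: boolean, path: string = "/signaling"): string {
  const scheme = secure ? "wss" : "ws";
  const port = secure ? 443 : 80;
  return `${scheme}://${host}:${port}${path}`;
}

// "signaling.example.com" is a placeholder host.
const secureUrl: string = signalingUrl("signaling.example.com", true);
console.log(secureUrl);
// In a browser the result would be passed to: new WebSocket(secureUrl)
```

If the media server has to keep its default ports, a reverse proxy on 443 that forwards WebSocket traffic to it is one common way to get the same effect.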
Step Two: Discovery (STUN and TURN)
Once a signaling connection is established between the two WebRTC endpoints and the signaling server, information can be exchanged.
A very important piece of information is the public IP and port at which each endpoint can be reached. Finding the IP is not a problem for computers that are connected directly to the Internet: the OS knows its own public IP, and the browser (or other WebRTC client) can easily query it. It can be an issue for computers and devices that are part of a local network (behind a router), including mobile devices connected through 3G/4G, because the only IP they hold is the one assigned by the local network.
Such devices are only aware of their local network IP so they have to use the STUN protocol to:
- communicate with a STUN server and find out the public IP of their network and the port at which they can be reached
- punch a two way hole through the implicit NAT function of the network’s router.
Furthermore, devices behind a symmetric NAT can only receive data from remote addresses they have already sent data to, so the other endpoint cannot open a direct connection through the symmetric NAT to our device. In that case a TURN server is needed to relay the data from the other endpoint.
Each WebRTC endpoint will ask the STUN/TURN server for its own public IP and the port at which it can be reached. Once a response is received, the WebRTC endpoint sends the pair to the other party through the signaling channel. These ip:port pairs are called ICE candidates.
There are three types of ICE candidates:
- Host: This is the preferred type of candidate. It is represented by a random port and the device’s local IP:
  - For computers which are directly connected to the Internet the host candidate will be their public Internet IP
  - For a computer connected to a local network the host candidate will be their local network IP (e.g. 192.168.1.12)
  - For a mobile device connected to a 3G/4G network the host candidate will be their local 3G/4G network IP (e.g. 10.135.12.31)
- Server Reflexive (srflx): Computers that are connected to a local network do not know the public Internet-facing IP of the network, so they ask a STUN/TURN server (that’s connected directly to the Internet) for their public Internet IP and the port opened in the router.
- Relay: These are generated the same way as a Server Reflexive candidate. The query message is sent to the TURN server, creating a NAT binding (local IP and port + remote IP and port pair) in the router. Because the NAT is symmetric, only the TURN server can communicate back with the initiating computer/device, so the TURN server returns its own IP (and a port) as an ICE candidate. Thus the other WebRTC endpoint will attempt to connect to the IP of the TURN server and not to the actual IP of the other endpoint, which is why it’s called a relay candidate.
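To make the three candidate types concrete, here is a small sketch that reads the type, address and port out of a candidate line as it appears in SDP or in a browser's `candidate` string. The helper names (`parseCandidate`, `toCandidateType`) and the sample addresses are illustrative assumptions. The parser only handles the common single-line form and is not a full SDP implementation.

```typescript
// Candidate types defined by ICE. "prflx" (peer reflexive) is not covered in
// the article but can appear in real candidate lists, so it is accepted here.
type CandidateType = "host" | "srflx" | "prflx" | "relay";

interface ParsedCandidate {
  transport: string; // usually "udp" or "tcp"
  address: string;
  port: number;
  type: CandidateType;
}

const KNOWN_TYPES: CandidateType[] = ["host", "srflx", "prflx", "relay"];

function toCandidateType(text: string): CandidateType | null {
  for (const known of KNOWN_TYPES) {
    if (known === text) return known;
  }
  return null;
}

// Expected layout of a candidate line:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function parseCandidate(line: string): ParsedCandidate | null {
  const fields: string[] = line.trim().replace(/^a=/, "").split(/\s+/);
  const [foundation = "", , transport = "", , address = "", portText = "", keyword = "", typeText = ""] = fields;
  if (foundation.indexOf("candidate:") !== 0 || keyword !== "typ" || !/^\d+$/.test(portText)) {
    return null;
  }
  const type = toCandidateType(typeText);
  if (type === null) return null;
  return { transport: transport.toLowerCase(), address, port: parseInt(portText, 10), type };
}

// Sample lines with made-up addresses (documentation ranges and the
// article's 192.168.1.12 example).
const samples: string[] = [
  "candidate:1 1 udp 2122260223 192.168.1.12 54321 typ host",
  "candidate:2 1 udp 1686052607 203.0.113.7 54321 typ srflx raddr 192.168.1.12 rport 54321",
  "candidate:3 1 udp 41885439 198.51.100.20 50000 typ relay raddr 203.0.113.7 rport 54321",
];
for (const sample of samples) {
  const parsed = parseCandidate(sample);
  if (parsed !== null) {
    console.log(`${parsed.type}: ${parsed.address}:${parsed.port} over ${parsed.transport}`);
  }
}
```

In a browser, the same strings are available from the `candidate` property of the events fired by `RTCPeerConnection.onicecandidate`.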
Communicating with the STUN/TURN servers is the second point where the WebRTC connection process might fail. We’ve encountered three possible reasons this could happen:
- The default STUN/TURN ports are blocked
- All UDP ports are blocked
- The STUN/TURN protocols are blocked
Remember we recommended running signaling over port 80 or 443? STUN and TURN have their own (different) default ports:
- The default port for sending (or listening to) STUN/TURN requests is 3478.
- The default port for sending (or listening to) STUN/TURN over TLS is 5349.
- Some servers like Google’s generic STUN servers use other ports like 19305 and 19307.
Any of the ports mentioned above could be blocked for either of the two peers trying to connect to each other. In such a case the STUN/TURN servers cannot be reached resulting in no srflx or relay candidates.
To avoid such issues, one could run the STUN/TURN servers on common ports (443/80), but UDP and protocol blocking still remain.
By default STUN/TURN messages travel over UDP, which means a (corporate) firewall that allows little more UDP traffic than DNS queries on port 53 will not let STUN/TURN messages through.
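Luckily, STUN/TURN servers can also be reached over TCP by appending a transport parameter to the server URL, for example ?transport=tcp.
This tells the WebRTC client “for this TURN/STUN server, connect over TCP instead of UDP”. You can also specify udp (the default value). For TLS, the TURN URI syntax (RFC 7065) uses the turns: scheme rather than a transport value, so check what your client accepts before relying on tls here.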
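As a concrete illustration of the transport parameter, the sketch below builds TURN URLs for UDP and TCP and puts them in an `iceServers` list. The host `turn.example.com`, the credentials and the helper `turnUrl` are placeholders invented for this example. The `urls`, `username` and `credential` field names follow the browser's ICE server configuration.

```typescript
type TurnTransport = "udp" | "tcp";

// Local shape mirroring the fields a browser expects for an ICE server entry.
interface IceServerEntry {
  urls: string[];
  username: string;
  credential: string;
}

// Hypothetical helper: format a TURN URL with an explicit transport.
function turnUrl(host: string, port: number, transport: TurnTransport): string {
  return `turn:${host}:${port}?transport=${transport}`;
}

// Placeholder host and credentials; replace with your own TURN server.
const iceServers: IceServerEntry[] = [
  {
    urls: [
      turnUrl("turn.example.com", 3478, "udp"),
      turnUrl("turn.example.com", 3478, "tcp"),
    ],
    username: "demo-user",
    credential: "demo-password",
  },
];

console.log(JSON.stringify(iceServers, null, 2));
// In a browser this list would be passed as: new RTCPeerConnection({ iceServers })
```

Listing both URLs lets the client try UDP first and still have a TCP path when UDP is blocked.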
An even worse scenario is when the STUN/TURN protocol messages are blocked altogether. For example, we’ve found that the TunnelBear VPN blocks STUN/TURN packets because they can expose your real IP even if you’re connecting through a VPN. In this case there is not much you can do except correctly identify the issue and instruct the user to disable such apps during WebRTC calls.
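One rough way to identify these failures is to look at which candidate types a client managed to gather. The sketch below encodes the reasoning from this section as a hypothetical helper, `diagnoseGathering`. The hints are heuristics, not a definitive diagnosis: a machine connected directly to the Internet, for instance, may legitimately produce no srflx candidate because it would duplicate its host candidate.

```typescript
type GatheredType = "host" | "srflx" | "relay";

// Hypothetical helper: map the set of gathered candidate types to hints.
function diagnoseGathering(types: GatheredType[]): string[] {
  const has = (wanted: GatheredType): boolean => types.indexOf(wanted) !== -1;
  const hints: string[] = [];
  if (!has("srflx") && !has("relay")) {
    // Neither STUN nor TURN answered.
    hints.push(
      "no srflx or relay candidates: the STUN/TURN servers were probably unreachable " +
        "(blocked port, blocked UDP, or blocked STUN/TURN protocol)"
    );
  } else if (!has("relay")) {
    // STUN answered but no TURN allocation was obtained.
    hints.push(
      "no relay candidates: the TURN server was not reached or refused the request, " +
        "so peers behind symmetric NATs may fail to connect"
    );
  }
  return hints;
}

console.log(diagnoseGathering(["host"]));
console.log(diagnoseGathering(["host", "srflx"]));
console.log(diagnoseGathering(["host", "srflx", "relay"]));
```

The input list could come from a tool such as Trickle ICE or from the candidates your own client logs during gathering.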
Step Three: Establishing the Connection
Once each WebRTC endpoint learns where the other party can be found (ip:port ICE candidates), the peer-to-peer connection can be established.
With some WebRTC use cases like video recording, the endpoint (in our case Kurento) will act as both a signaling server and a WebRTC endpoint.
Each client will send the data through UDP to the other endpoint:
- if it’s sending directly to the other party (to a host or srflx candidate) it will send to any port in the 0-65535 range
- if it’s sending to a TURN server (to a relay candidate) it will send to a port between 49152-65535
There’s no way to control these ports: they are allocated during the discovery phase and are part of the ICE candidates.
- The port number used for signaling is not necessarily the same as the port number used to communicate with the STUN or TURN servers, and it is not the port number used to send data between WebRTC peers once the connection is established.
- To ensure signaling will always work make sure it takes place over common ports like 443/80
- Communication with STUN/TURN servers can be blocked using port blocking, UDP blocking and STUN/TURN protocol blocking
- The STUN and TURN protocols can be blocked more easily than WebSocket communications, which are just upgraded HTTP calls
- The ultimate fallback method that can be used for very restrictive networks (e.g. UDP blocked and symmetric NAT) is to configure a TURN server to be accessible over TLS on port 443 or TCP over port 80.
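The fallback described in the last point can be sketched as a peer connection configuration that offers TURN over UDP on the default port, TURN over TCP on port 80 and TURN over TLS on port 443. The function `restrictiveNetworkConfig`, the host and the credentials are assumptions for illustration, and the TURN server itself has to be configured to listen on those ports. `iceTransportPolicy: "relay"` is the browser setting that restricts ICE to relay candidates, which is handy for testing the TURN path in isolation.

```typescript
// Local shapes mirroring the browser's peer connection configuration fields.
interface RelayServer {
  urls: string[];
  username: string;
  credential: string;
}

interface PeerConfig {
  iceServers: RelayServer[];
  iceTransportPolicy: "all" | "relay";
}

// Hypothetical helper: one TURN server offered over three transports.
function restrictiveNetworkConfig(
  host: string,
  username: string,
  credential: string,
  forceRelay: boolean
): PeerConfig {
  return {
    iceServers: [
      {
        urls: [
          `turn:${host}:3478?transport=udp`, // default port, works on open networks
          `turn:${host}:80?transport=tcp`, // TCP on a common port when UDP is blocked
          `turns:${host}:443?transport=tcp`, // TLS on 443 for the most restrictive networks
        ],
        username,
        credential,
      },
    ],
    iceTransportPolicy: forceRelay ? "relay" : "all",
  };
}

// Placeholder host and credentials.
const config: PeerConfig = restrictiveNetworkConfig("turn.example.com", "demo-user", "demo-password", true);
console.log(JSON.stringify(config, null, 2));
// In a browser: new RTCPeerConnection(config)
```

With `forceRelay` set to false, the same server list still lets ICE prefer direct host or srflx paths and use the relay only when those fail.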
Useful tools for debugging WebRTC connections:
The WebRTC connection test is a very useful tool for checking everything from discovered ICE candidates (and thus network restrictions) to supported camera resolutions. We send it out to clients and analyze the text report it generates to troubleshoot problems.
We used the Trickle ICE tool to gather various ICE candidates in different (simulated) network environments. You can use the default STUN server from Google or add your own STUN/TURN servers.
CoTURN is a TURN server that is very easy to set up and use.
Varun Singh’s state of WebRTC video from Twilio 2016 breaks down the monitored connection failures by type. Types Of NAT Explained, Symmetric NAT and It’s Problems, and Ilya Grigorik’s Building Blocks of UDP and WebRTC chapters from the High Performance Browser Networking book are all great resources that will help you better understand how WebRTC works and identify issues related to other aspects like latency and video quality.