Introducing WebSocket
WebSocket is an application-layer protocol for real-time communication between a web browser and a WebSocket server. It is bidirectional (either the client or the server can send a message to the other party whenever a message is available) and full-duplex (both the client and the server can send messages to each other simultaneously).
WebSocket uses a compact binary framing format. Once a connection is established, each message carries far less overhead than an HTTP/1.1 request, which repeats its text-based headers on every exchange.
WebSocket has gained popularity and is already used by many websites because of its real-time and full-duplex features. Comet techniques carry too much overhead to be suitable for real-time bidirectional message transfer, and they cannot provide full-duplex communication between a web browser and a web server. That is, comet techniques give us only a half-duplex communication system (either the client or the server can send messages at a given time, but not both at once).
WebSocket is designed to facilitate bidirectional communication between a web browser and WebSocket server, but it can be used by any client. In this chapter, we will only concentrate on how it's implemented in a web browser.
Note
What is the WebSocket API?
Web browsers provide an API for creating and managing a WebSocket connection to a WebSocket server as well as for sending and receiving data on the connection. We won't use this API for implementing WebSocket; instead, we will use the Socket.IO library.
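For reference, here is a minimal sketch of what using the browser API directly might look like. The helper name openChatSocket and the URL are hypothetical; the WebSocket constructor, the send method, and the onopen, onmessage, and onclose handlers belong to the standard browser API. The constructor is passed in as a parameter only so that the sketch can also be exercised outside a browser.

```javascript
// Hypothetical helper: opens a connection, sends a greeting once the
// connection is open, and collects every message the server sends back.
function openChatSocket(url, WebSocketImpl = globalThis.WebSocket) {
  const socket = new WebSocketImpl(url);
  const received = [];

  // Fired once the handshake has completed and data can be sent.
  socket.onopen = () => socket.send("Hello, server!");
  // Fired for every message pushed by the server.
  socket.onmessage = (event) => received.push(event.data);
  // Fired when either side closes the connection.
  socket.onclose = () => console.log("Connection closed");

  return { socket, received };
}

// In a browser (the URL is an assumption):
// const { socket, received } = openChatSocket("ws://localhost:8080/chat");
```

Note that neither side has to wait for the other: the server can push a message at any time, and the page can call send at any time after the connection opens.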
The relationship between WebSocket and HTTP
The only relationship between WebSocket and HTTP is that the WebSocket handshake between a web browser and a WebSocket server is done using HTTP. Therefore, a WebSocket server must also be able to speak HTTP, at least for the handshake. Once the handshake succeeds, the same TCP connection is used for WebSocket communication; that is, communication switches to the bidirectional binary protocol, which does not conform to the HTTP protocol. The default port for WebSocket is 80 for unencrypted connections and 443 for encrypted connections, the same as for HTTP and HTTPS.
Note
Why is the default WebSocket port 80?
The main reason for integrating HTTP and WebSocket so tightly, and for making WebSocket share the HTTP port, is to keep firewalls that only allow web traffic from blocking WebSocket connections.
Although you could negotiate the connection differently when using WebSocket outside a web browser environment, the WebSocket specification (RFC 6455) only defines the HTTP handshake mechanism because WebSocket is designed to enable bidirectional communication between web browsers and WebSocket servers.
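To make the HTTP handshake a little more concrete, the following is a minimal Node.js sketch of the server's side of it, based on RFC 6455 rather than on anything specific to this chapter. The client sends a Sec-WebSocket-Key header with its upgrade request, and the server proves it understood the request by returning a derived Sec-WebSocket-Accept value. The helper names are hypothetical, and a real server would also validate the request headers.

```javascript
const { createHash } = require("node:crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Computes the Sec-WebSocket-Accept value for the Sec-WebSocket-Key
// the client sent: base64(SHA-1(key + GUID)).
function computeAcceptKey(clientKey) {
  return createHash("sha1").update(clientKey + WEBSOCKET_GUID).digest("base64");
}

// Hypothetical helper: builds the server's handshake reply as raw HTTP text.
// After this reply, the TCP connection stops carrying HTTP.
function buildHandshakeResponse(clientKey) {
  return [
    "HTTP/1.1 101 Switching Protocols",
    "Upgrade: websocket",
    "Connection: Upgrade",
    `Sec-WebSocket-Accept: ${computeAcceptKey(clientKey)}`,
    "",
    "",
  ].join("\r\n");
}
```

The 101 status code is what tells the browser that the connection has switched from HTTP to the WebSocket protocol.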
You can integrate a WebSocket server into your main web server that serves your HTML pages, or you can use a separate web server for WebSocket communication.
Sending and receiving data on a WebSocket connection
Data is transferred through a WebSocket connection as messages, each of which consists of one or more frames containing the data you are sending (called the payload). To ensure that the message can be properly reconstructed when it reaches the other party, each frame is prefixed with a short header (2 to 14 bytes in RFC 6455) describing the payload. This frame-based messaging system reduces the amount of non-payload data that is transferred, which significantly reduces latency and makes it possible to build real-time components.
We won't get into the exact data format and other details of the WebSocket handshake, data framing, and sending and receiving data, as this is only required if you are planning to create your own WebSocket server. We will use the Socket.IO JavaScript library to implement WebSocket in our application; it takes care of all the internal details of WebSocket and provides an easy-to-use API.
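Even though a library handles framing for us, a tiny sketch can make the idea of "a small header followed by the payload" concrete. The function below is a simplified illustration based on RFC 6455, not something we will use in the application: it encodes a single unmasked text frame (the kind a server sends to a client) using Node's Buffer, and it assumes the payload is shorter than 65,536 bytes. The name encodeTextFrame is hypothetical.

```javascript
// Encodes a single, unmasked text frame. Assumes a payload shorter than
// 65536 bytes to keep the sketch short; longer payloads use a 64-bit length.
function encodeTextFrame(text) {
  const payload = Buffer.from(text, "utf8");
  if (payload.length > 0xffff) {
    throw new RangeError("payload too large for this sketch");
  }

  // First byte: FIN bit (0x80) plus opcode 0x1 (text frame).
  const firstByte = 0x80 | 0x1;
  let header;
  if (payload.length < 126) {
    // Short payloads store their length directly in the second byte.
    header = Buffer.from([firstByte, payload.length]);
  } else {
    // The value 126 signals that a 16-bit big-endian length follows.
    header = Buffer.alloc(4);
    header[0] = firstByte;
    header[1] = 126;
    header.writeUInt16BE(payload.length, 2);
  }
  return Buffer.concat([header, payload]);
}
```

Frames sent by a browser additionally carry a 4-byte masking key, which this sketch leaves out.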
WebSocket schemes
The WebSocket protocol specification introduces two new URL schemes, called ws and wss. ws represents an unencrypted connection, whereas wss represents an encrypted connection. Encrypted connections use TLS to encrypt messages.
So, when making a WebSocket handshake request using HTTP, we need to use ws or wss instead of http or https, respectively.
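As a small illustration of this mapping, the sketch below derives a WebSocket URL from the URL of the page that opens the connection. The helper name toWebSocketUrl and the example hosts are hypothetical; URL is the standard URL class available in browsers and Node.js.

```javascript
// Hypothetical helper: derives a WebSocket URL from the URL of the current page.
function toWebSocketUrl(pageUrl, path = "/") {
  const page = new URL(pageUrl);
  // Pages served over https should use wss so the WebSocket traffic is
  // encrypted as well; everything else falls back to ws.
  const scheme = page.protocol === "https:" ? "wss:" : "ws:";
  // page.host includes the port when it is not the default one.
  return `${scheme}//${page.host}${path}`;
}

// In a browser, one might call: toWebSocketUrl(window.location.href, "/chat")
```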
Note
Why ws and wss instead of http and https?
You must be wondering what the point of introducing a new scheme is, instead of just using http. Well, the reason behind this is that WebSocket can also be used outside a web browser environment, and a handshake can be negotiated via a non-HTTP server. Therefore, a different scheme is required when not using HTTP for the handshake.
The interaction of WebSocket with proxy servers, load balancers, and firewalls
The WebSocket protocol is unaware of proxy servers by itself. When a WebSocket connection is established behind a proxy server, the connection can fail or work properly, depending on whether the proxy server is transparent or explicit and on whether the connection is encrypted or unencrypted.
If the browser is configured to use an explicit proxy server, then it will first issue an HTTP CONNECT method to that proxy server when establishing the WebSocket connection. The CONNECT method tells a proxy to make a connection to another host and simply relay the content, without attempting to parse or cache it. A browser issues the HTTP CONNECT method regardless of whether the connection is encrypted or unencrypted.
If we are using a transparent proxy server (that is, a proxy server that the web browser is unaware of) and the connection is encrypted, then the browser doesn't issue an HTTP CONNECT method because it's unaware of the proxy server. But as the connection is encrypted, the proxy server will most probably let all the encrypted data through, therefore causing no problems for the WebSocket connection.
If we are using a transparent proxy server and the connection is unencrypted, then the browser doesn't issue an HTTP CONNECT method because it's unaware of the proxy server. But as the connection is unencrypted, the proxy server is likely to try to cache, parse, or block the data, therefore causing issues for the WebSocket connection. In this case, the proxy server should be upgraded or explicitly configured to support WebSocket connections.
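The cases above can be summarised as a small decision function. This is only a restatement of the reasoning in code form; the function name and the strings it returns are informal labels chosen for this sketch and are not part of any API.

```javascript
// Summarises how a proxy is expected to treat a WebSocket connection.
// proxyType is "explicit" or "transparent"; encrypted is true for wss.
function expectedProxyBehaviour(proxyType, encrypted) {
  if (proxyType === "explicit") {
    // The browser knows about the proxy and tunnels through it with
    // HTTP CONNECT, whether or not the connection is encrypted.
    return "tunnelled via CONNECT";
  }
  if (proxyType === "transparent") {
    // The browser is unaware of the proxy, so everything depends on
    // whether the proxy can see inside the traffic.
    return encrypted
      ? "likely passes through"
      : "may be cached, parsed, or blocked";
  }
  throw new Error(`unknown proxy type: ${proxyType}`);
}
```

In practice, this is one reason to prefer wss: encrypted traffic is the case that intermediaries are least likely to interfere with.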
The WebSocket protocol is unaware of load balancers by itself. If you are using a TCP load balancer, it is unlikely to cause any problems for a WebSocket connection. But if you are using an HTTP load balancer, it's likely to cause problems; therefore, it needs to be upgraded or explicitly configured to handle WebSocket connections.
The WebSocket protocol is unaware of firewalls by itself. Firewalls are unlikely to cause any problems for a WebSocket connection.
The same-origin policy for WebSocket
WebSocket connections opened from a web page can be cross-domain; that is, the browser does not restrict them with the same-origin policy.
While making an HTTP request for a handshake, the browser sends an Origin header set to the origin of the web page.
If a WebSocket server wants to restrict communication to a particular domain, it can read the Origin HTTP header of the handshake HTTP request and block or allow the handshake accordingly.
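A server-side check of this kind might look like the following sketch. The allow-list contents and the function name are hypothetical, and the headers object is assumed to use lower-cased header names, which is how Node's http module exposes them.

```javascript
// Hypothetical allow-list of origins permitted to open a WebSocket connection.
const ALLOWED_ORIGINS = new Set([
  "https://example.com",
  "https://app.example.com",
]);

// Returns true when the handshake request's Origin header is on the allow-list.
// A missing Origin header is rejected as well.
function isOriginAllowed(headers, allowedOrigins = ALLOWED_ORIGINS) {
  const origin = headers["origin"];
  return typeof origin === "string" && allowedOrigins.has(origin);
}

// Usage during the handshake (request is the incoming HTTP request):
// if (!isOriginAllowed(request.headers)) { /* reject the handshake */ }
```

Keep in mind that only browsers are guaranteed to send an honest Origin header; a non-browser client can set it to any value, so this check is not a substitute for authentication.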