Reverse Proxy

In computer networks, a reverse proxy is a type of proxy server that retrieves resources on behalf of a client from one or more servers. These resources are then returned to the client as though they originated from the proxy server itself. While a forward proxy acts as an intermediary for its associated clients to contact any server, a reverse proxy acts as an intermediary for its associated servers to be contacted by any client.

Common uses for a reverse proxy server

  • Load balancing – A reverse proxy server can act as a “traffic cop,” sitting in front of your back-end servers and distributing client requests across a group of servers in a manner that maximizes speed and capacity utilization while ensuring no one server is overloaded, which can degrade performance. If a server goes down, the load balancer redirects traffic to the remaining online servers.
  • Web acceleration – Reverse proxies can compress inbound and outbound data, as well as cache commonly requested content, both of which speed up the flow of traffic between clients and servers. They can also take on additional tasks such as TLS/SSL encryption and decryption (TLS termination) to take load off of your web servers, thereby boosting their performance; a minimal sketch of this offload appears after this list.
  • Security and anonymity – By intercepting requests headed for your back-end servers, a reverse proxy server protects their identities and acts as an additional defense against security attacks. It also ensures that multiple servers can be accessed from a single record locator or URL regardless of the structure of your local area network.
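To make the TLS offload idea above concrete, here is a minimal sketch in Go of a reverse proxy that terminates TLS and forwards plain HTTP to a single back-end. The back-end address backend.internal:8080 and the certificate file names are placeholder assumptions, not values from the text above; caching and compression would be layered on by wrapping the same handler.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// Hypothetical back-end server; the proxy forwards plain HTTP to it.
	backend, err := url.Parse("http://backend.internal:8080")
	if err != nil {
		log.Fatal(err)
	}

	// NewSingleHostReverseProxy rewrites each incoming request so it is
	// sent to the back-end, then copies the response back to the client.
	proxy := httputil.NewSingleHostReverseProxy(backend)

	// The proxy terminates TLS itself (cert.pem and key.pem are placeholder
	// file names), so the back-end never performs the cryptographic work.
	log.Fatal(http.ListenAndServeTLS(":443", "cert.pem", "key.pem", proxy))
}
```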

How it works

As its name implies, a reverse proxy does the exact opposite of what a forward proxy does. While a forward proxy acts on behalf of clients (or requesting hosts), a reverse proxy acts on behalf of servers. A reverse proxy accepts requests from external clients on behalf of the servers stationed behind it, as the figure below illustrates.

Reverse Proxy

To the client in our example, it is the reverse proxy that is providing file transfer services. The client is oblivious to the file transfer servers behind the proxy, which are actually providing those services. In effect, whereas a forward proxy hides the identities of clients, a reverse proxy hides the identities of servers.

An Internet-based attacker would therefore find it considerably more difficult to acquire the data held on those file transfer servers than if no reverse proxy stood in the way.
Just like forward proxy servers, reverse proxies provide a single point of access and control. You typically set one up to work alongside one or two firewalls to control the traffic and requests directed to your internal servers.

In most cases, reverse proxy servers also act as load balancers for the servers behind them. Load balancers play a crucial role in providing high availability to network services that receive large volumes of requests. When a reverse proxy performs load balancing, it distributes incoming requests to a cluster of servers that all provide the same kind of service. So, for instance, a reverse proxy load balancing FTP services will have a cluster of FTP servers behind it.
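As a minimal sketch of that load-balancing role, a reverse proxy can rotate incoming requests across a small pool of identical back-ends. The addresses below are assumptions for illustration, and a production balancer would also need health checks so traffic is redirected away from servers that go down.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

// mustParse is a small helper for the hypothetical back-end addresses.
func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		log.Fatal(err)
	}
	return u
}

func main() {
	// Hypothetical pool of identical back-end servers.
	backends := []*url.URL{
		mustParse("http://10.0.0.11:8080"),
		mustParse("http://10.0.0.12:8080"),
		mustParse("http://10.0.0.13:8080"),
	}

	var next uint64
	proxy := &httputil.ReverseProxy{
		// The Director picks a back-end for each incoming request in
		// simple round-robin order, so no single server takes all the load.
		Director: func(req *http.Request) {
			target := backends[atomic.AddUint64(&next, 1)%uint64(len(backends))]
			req.URL.Scheme = target.Scheme
			req.URL.Host = target.Host
		},
	}

	log.Fatal(http.ListenAndServe(":80", proxy))
}
```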

Both types of proxy servers relay requests and responses between source and destination machines. But in the case of reverse proxy servers, client requests that go through them normally originate from the Internet, while, in the case of forward proxies, client requests normally come from the internal network behind them.

What's the difference?

These terms rarely get thrown around until you are deep into discussing a solution with a customer. There is a difference, however, and you should know it. Reverse Proxies broker connections coming from the internet to your app servers. Forward Proxies filter connections going out to the internet from clients sitting behind the firewall. Reverse Proxies take origin connections from the internet and connect them to one server or a server farm, meaning multiple inbound connections from the internet are pooled into one or more connections to the server(s). This is known as TCP Multiplexing, and it is often used with Load Balancing techniques to optimize and accelerate application delivery. Reverse Proxies measure load based on the ratio of incoming to outgoing connections: the higher the ratio, the better the performance.

A key capability of Reverse Proxies is TCP Multiplexing: incoming connections are terminated and pooled, and new connections are established on the back-end using a smaller number of server connections, yielding a TCP Multiplexing ratio. A typical TCP Mux ratio is 10:1 – ten incoming connections to one back-end connection. Another benefit is that the back-end connections to the servers are kept open even after the incoming connections terminate, so they can be reused when new incoming connections arrive – reducing the time spent establishing server connections and hence improving performance.
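Go's standard HTTP machinery does not expose a fixed n:1 multiplexing ratio, but the underlying pooling idea can be sketched by having the proxy keep a small set of persistent back-end connections and reuse them across many client connections. The back-end address and the pool limits below are illustrative assumptions, not any vendor's defaults.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"time"
)

func main() {
	// Hypothetical back-end server behind the proxy.
	backend, err := url.Parse("http://backend.internal:8080")
	if err != nil {
		log.Fatal(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(backend)

	// Keep a small pool of persistent back-end connections and reuse them
	// for many incoming client connections, instead of opening a new
	// back-end connection for every client.
	proxy.Transport = &http.Transport{
		MaxIdleConns:        100,
		MaxIdleConnsPerHost: 10,               // size of the pool toward the back-end
		IdleConnTimeout:     90 * time.Second, // keep idle connections open for reuse
	}

	log.Fatal(http.ListenAndServe(":80", proxy))
}
```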

How Does a Reverse Proxy Work?

When a browser makes a request, it normally sends that request directly to the origin server. When Traffic Server is in reverse proxy mode, it intercepts the request before it reaches the origin server. Typically, this is done by setting up the DNS entry for the origin server (i.e., the origin server’s advertised hostname) so it resolves to the Traffic Server IP address. Because Traffic Server is advertised as the origin server, the browser connects to Traffic Server rather than to the real origin server.

In reverse proxy mode, Traffic Server serves HTTP requests on behalf of a web server. The figure below illustrates how Traffic Server in reverse proxy mode serves an HTTP request from a client browser.

HTTP Request via Reverse Proxy

The figure above demonstrates the following steps:

  1. A client browser sends an HTTP request addressed to a host called www.host.com on port 80.
  2. Traffic Server receives the request because it is acting as the origin server (the origin server’s advertised hostname resolves to Traffic Server).
  3. Traffic Server locates a map rule in the remap.config file and remaps the request to the specified origin server (realhost.com).
  4. If the request cannot be served from cache, Traffic Server opens a connection to the origin server (or more likely, uses an existing connection it has pre-established), retrieves the content, and optionally caches it for future use.
  5. If the request was a cache hit and the content is still fresh in the cache, or the content is now available through Traffic Server because of step 4, Traffic Server sends the requested object directly to the client.
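The mapping in step 3 can be pictured with a small sketch: a proxy that accepts requests addressed to www.host.com and rewrites them to the real origin, realhost.com, before forwarding. This only illustrates the remap idea; it is not Traffic Server's implementation, and it omits the caching described in steps 4 and 5.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
)

func main() {
	proxy := &httputil.ReverseProxy{
		Director: func(req *http.Request) {
			// Equivalent in spirit to a remap rule from www.host.com to
			// realhost.com: forward the request to the real origin server.
			req.URL.Scheme = "http"
			req.URL.Host = "realhost.com"
			req.Host = "realhost.com"
		},
	}

	// The advertised hostname www.host.com resolves to this proxy, so
	// clients connect here on port 80 without ever seeing realhost.com.
	log.Fatal(http.ListenAndServe(":80", proxy))
}
```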