Transparent proxy

Also known as an intercepting proxy, inline proxy, or forced proxy, a transparent proxy intercepts normal communication at the network layer without requiring any special client configuration. Clients need not be aware of the existence of the proxy. A transparent proxy is normally located between the client and the Internet, with the proxy performing some of the functions of a gateway or router.

Transparent proxies act as intermediaries between a user and a web service. When a user connects to a service, the transparent proxy intercepts the request before passing it on to the provider. Transparent proxies are considered transparent because the user isn’t aware of them. On the other hand, the servers hosting the service recognize that the proxied traffic is coming from a proxy and not directly from the user.

Use of Transparent Proxies

Transparent proxies are extremely versatile. The following list contains common examples of how transparent proxies are used.

  • Proxy caches create copies of the data stored on a server and serve the cached content to users. This reduces the strain on the web service by having the proxy provide the content instead of the service itself.
  • Filtering proxies prevent access to certain websites or web services. These are commonly implemented by organizations to prevent users from accessing resources that are unrelated or disruptive to the organization.
  • Gateway proxies modify or block network traffic based on certain rules. Locations that offer public Wi-Fi often implement gateways that require users to register or accept an agreement before they can use the service.
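The three roles in the list above can be sketched as a single decision function. This is only an illustration, not how any particular proxy product works: the class name `TransparentProxy`, the blocked host, the `/accept-terms` page, and the `fetch` callback are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field

# Hypothetical filter list; a real deployment would load its own policy.
BLOCKED_HOSTS = {"blocked.example"}


@dataclass
class TransparentProxy:
    cache: dict = field(default_factory=dict)            # (host, path) -> body
    accepted_clients: set = field(default_factory=set)   # clients that accepted the terms

    def handle(self, client_ip, host, path, fetch):
        # Gateway role: clients that have not accepted the agreement
        # are sent to a (hypothetical) registration page.
        if client_ip not in self.accepted_clients:
            return "REDIRECT /accept-terms"
        # Filtering role: refuse hosts on the block list.
        if host in BLOCKED_HOSTS:
            return "403 Forbidden"
        # Caching role: contact the origin server only on a cache miss.
        key = (host, path)
        if key not in self.cache:
            self.cache[key] = fetch(host, path)
        return self.cache[key]
```

Here `fetch` stands in for whatever actually contacts the origin server. Passing it in as a parameter keeps the sketch free of real network code.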

Benefits of Transparent Proxies

Transparent proxies are an unobtrusive way to add features and functionality to a user’s browsing experience.

  • Enterprises gain greater control over how their customers interact with their services by routing and modifying requests as they come in.
  • Users interact with web services more easily since their connections are seamlessly and invisibly passed through the proxy, leaving configuration to the service providers.

The Problem with Transparency

When a proxy transparently caches a site, the source IP address of the connection changes: the request comes from the cache server rather than the client machine. This can play havoc with websites that use IP-address authentication (such sites only allow requests from a small set of IP addresses, rather than authenticating requests with a name and password).
Since the cache changes the source IP address of the connection, some servers may deny legitimate users access. Where access to such a site is paid for, this can cost users money.
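To make the failure concrete, here is a minimal sketch of the kind of check an IP-authenticated site might perform. The allowed network and both addresses are made-up documentation ranges, and `is_authorized` is a hypothetical name. Real sites implement this in their own way.

```python
import ipaddress

# Hypothetical range that the site has agreed to serve (assumption for the example).
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/28")]


def is_authorized(source_ip: str) -> bool:
    """Return True if the connection's source address is inside an allowed network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


client_ip = "192.0.2.5"      # the licensed user's own machine
proxy_ip = "198.51.100.10"   # the cache server that now makes the request

print(is_authorized(client_ip))  # direct connection: True
print(is_authorized(proxy_ip))   # same user through the cache: False
```

The server sees only the source address of the connection, so the same legitimate user is accepted when connecting directly and refused when the cache makes the request on their behalf.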

If you know your network inside out, and know exactly who would be accessing a site like this, there is probably no problem with using transparent caching. If this is the case, though, it might be easier to simply change all of your users' browser settings so that they use the proxy explicitly.

The Transparent Caching Process

Let's look at what happens when you use transparency. First, though, you need to know something of what happens to IP packets at the Ethernet level.

Some Routing Basics

An IP packet carried in an Ethernet frame contains four addresses:

  1. The destination MAC address. When a packet is transmitted down the Ethernet wire, all Ethernet cards on the network segment check the destination MAC address value. Each Ethernet card has a (supposedly) unique MAC address. If the card's MAC address matches the destination MAC address of the packet, the card passes the packet to the operating system, which then deals with the contents of the packet.
  2. The source MAC address: set by the sending Ethernet card.
  3. The destination IP address: set by the application sending the packet.
  4. The source IP address: set by the operating system of the source host (or, in some circumstances, by the application on the source machine). Ordinary routers along the way do not change this value: they forward the contents of the packet intact and rewrite only the MAC addresses (the destination becomes the next hop, and the source becomes the router's own interface). If each router changed the source IP address, it would have to keep state for every connection passing through it. This way, it can simply forward packets and forget about them.
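The four addresses can be picked out of a raw frame with a few lines of code. The sketch below assumes an untagged Ethernet II frame (no VLAN tag) carrying IPv4. The function names and the sample addresses are invented for the example.

```python
import socket
import struct


def format_mac(raw: bytes) -> str:
    """Render six raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_addresses(frame: bytes) -> dict:
    # Ethernet header: 6-byte destination MAC, 6-byte source MAC, 2-byte EtherType.
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != 0x0800:
        raise ValueError("not an IPv4 packet")
    # The IPv4 header starts at byte 14; source and destination addresses
    # sit at offsets 12 and 16 inside it.
    src_ip, dst_ip = struct.unpack("!4s4s", frame[26:34])
    return {
        "dst_mac": format_mac(dst_mac),
        "src_mac": format_mac(src_mac),
        "dst_ip": socket.inet_ntoa(dst_ip),
        "src_ip": socket.inet_ntoa(src_ip),
    }


# Build a sample frame by hand: Ethernet header plus a minimal 20-byte IPv4 header.
ethernet_header = (
    bytes.fromhex("00005e005301")    # destination MAC (the router's interface)
    + bytes.fromhex("00005e005310")  # source MAC (the client's card)
    + struct.pack("!H", 0x0800)      # EtherType: IPv4
)
ip_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 20, 0, 0, 64, 6, 0,     # version/IHL, TOS, length, id, flags, TTL, protocol, checksum
    socket.inet_aton("10.0.0.5"),    # source IP
    socket.inet_aton("203.0.113.7"), # destination IP
)
print(parse_addresses(ethernet_header + ip_header))
```

In the sample frame the destination MAC address belongs to the local router while the destination IP address belongs to the remote host, which is the situation the routing discussion relies on.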

When a host wants to communicate with a machine that isn't on the local network, it uses a router to find the path to that network. To send a packet through a router, the client sets the destination MAC address of the packet to that of the router's interface, and sets the destination IP address to the required end host. It's important to know that the destination IP address of the packet isn't set to the router's IP address; only the destination MAC address points at the router. When a router accepts a packet, it decides which host to forward it to, based on its routing tables. The router then sets the destination MAC address of the packet to the next-hop router's Ethernet address, and sends the packet to that machine. The remote host then repeats this process: if it's the destination machine, it uses the packet, but if it's another router, it will try to move the packet closer to its final destination.
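The forwarding step described above can be sketched as follows. This is a deliberately simplified model: the routing table maps each network straight to a next-hop MAC address (a real router would look up a next-hop IP address and resolve it separately), and the `Packet` class, `ROUTES` table, and all addresses are hypothetical. Real routers also put their own interface's address in the source MAC field, so the sketch does the same.

```python
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    dst_mac: str
    src_mac: str
    dst_ip: str
    src_ip: str


# Hypothetical routing table: (destination network, next-hop MAC address).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "00:00:5e:00:53:01"),        # default route
    (ipaddress.ip_network("203.0.113.0/24"), "00:00:5e:00:53:02"),   # more specific route
]


def forward(packet: Packet, router_mac: str) -> Packet:
    """Choose the most specific matching route and rewrite only the MAC addresses."""
    dst = ipaddress.ip_address(packet.dst_ip)
    matches = [(net, mac) for net, mac in ROUTES if dst in net]
    _, next_hop_mac = max(matches, key=lambda m: m[0].prefixlen)
    # Both IP addresses are left untouched; only the link-layer addresses change.
    return replace(packet, dst_mac=next_hop_mac, src_mac=router_mac)


incoming = Packet(
    dst_mac="00:00:5e:00:53:aa",   # this router's interface
    src_mac="00:00:5e:00:53:10",   # the client's card
    dst_ip="203.0.113.7",
    src_ip="10.0.0.5",
)
outgoing = forward(incoming, router_mac="00:00:5e:00:53:aa")
print(outgoing)
```

Because the router keeps no record of the packet after `forward` returns, it can handle each packet independently, which is the "forward and forget" behaviour the text describes.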