Formal Service Agreements for Scaling Internet Caching
or
Does ICP Make Long-Term Sense?

Michael Schwartz, @Home Network

Two distinct models of proxy cache service seem to be evolving: free exchange, and formal agreement. Under the free exchange model, caches running at interexchange points agree to share access to cached data via the Internet Cache Protocol (ICP), providing a network of servers through which to funnel lower-level accesses. In the formal agreement model, private caching hierarchies are established within a single enterprise network (as in the case at @Home), or across institutions via formal agreement.

In the Harvest research project we defined ICP at a time when many parts of the Internet were still primarily run as cooperative infrastructure. As the Internet continues to grow and commercialize, I believe the free cache service exchange model will give way to a system of formal agreements, similar to how backbone service providers have imposed peering restrictions to reduce transit traffic they carry for one another.

Formal agreements can remove vague or lopsided subsidies from the economic environment. For example, if three organizations provide cache peering and one has a significantly faster link than the other two, the cache connected to the faster link may become populated with more of the popular objects, and the peering organizations therefore get an implicit network cost subsidy. In contrast, formal arrangements codify the network and server costs.

Formal caching arrangements also have potential performance advantages. ICP allows loosely cooperating servers to aggregate accesses at upper-level cache servers, to increase the hit rate before crossing more expensive network links, and to provide redundancy in case of server failures. In contrast, in the @Home network architecture (see Figure 1) we can achieve these goals with higher performance by funneling misses from head end-resident proxy servers to proxies or replicated Web servers running in our Regional Data Centers (RDCs). Avoiding the additional latency from ICP query/response round trips is important in a broadband environment, because relatively large objects can be transferred in an RPC round-trip time.
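
To make the round-trip argument concrete, the following Python sketch computes how many bytes a link can carry in the time one ICP query/response exchange takes. The link speeds and the 50 ms round-trip time are assumptions chosen only for illustration; they are not measurements from the @Home network.

```python
def bytes_per_round_trip(link_bits_per_second, rtt_seconds):
    """Bytes a link can carry during one round-trip time."""
    return link_bits_per_second * rtt_seconds / 8


# Assumed values for illustration only (not figures from the @Home network).
ASSUMED_RTT_SECONDS = 0.050  # one ICP query/response exchange
ASSUMED_LINKS = [
    ("28.8 kbit/s modem", 28_800),
    ("10 Mbit/s broadband", 10_000_000),
]

for name, bits_per_second in ASSUMED_LINKS:
    capacity = bytes_per_round_trip(bits_per_second, ASSUMED_RTT_SECONDS)
    print(f"{name}: {capacity:.0f} bytes per round trip")
```

Under these assumptions the modem link moves roughly 180 bytes per round trip, while the broadband link moves roughly 62,500 bytes, enough for many whole Web objects. The faster the link, the larger the share of total retrieval time an extra query round trip represents.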

Figure 1: @Home Network Architecture

In terms of availability, we achieve redundant cache service at the head ends by running multiple proxy servers, each of which answers a subset of proxy requests. Browsers execute proxy auto-configuration scripts that hash on URLs to select a proxy, with mechanisms for timeout/failover. This approach removes object location queries from retrieval time, optimizing for the common, non-failure case. A similar approach can be used between head end and RDC-resident proxies.
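
The selection logic described above can be sketched as follows. Real proxy auto-configuration scripts are written in JavaScript and run inside the browser; this Python version only illustrates the idea. The hash function (MD5), the function names, and the try_proxy callback are all hypothetical, and this is not the script used at @Home.

```python
import hashlib


def rank_proxies(url, proxies):
    """Return proxies in preference order for this URL (primary first)."""
    digest = hashlib.md5(url.encode("utf-8")).digest()
    start = int.from_bytes(digest[:4], "big") % len(proxies)
    # Rotate the list so the hashed proxy comes first; the rest are failover.
    return proxies[start:] + proxies[:start]


def fetch_via_proxies(url, proxies, try_proxy):
    """Try each proxy in hashed order.

    try_proxy(proxy, url) is a hypothetical callback that returns the
    response or raises TimeoutError/ConnectionError on failure.
    """
    for proxy in rank_proxies(url, proxies):
        try:
            return try_proxy(proxy, url)
        except (TimeoutError, ConnectionError):
            continue  # fall through to the next proxy in the ranking
    raise ConnectionError("all proxies failed for " + url)
```

Because every browser computes the same ranking for a given URL, each object tends to be cached on one proxy, and no location query is needed before the request is sent. A failed proxy costs a timeout only for the URLs that hash to it.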

Another advantage of the formal agreement approach to caching is that it can permit enterprise-specific invalidation mechanisms before such mechanisms can be standardized. For example, @Home replicates partner content to RDCs. Updates to this content could trigger invalidation multicasts to proxy servers. It would be more difficult to provide such functionality with informally cooperative proxy service. Formal ties can also provide the capability to partition accesses as a function of geographic or shared interest boundaries, potentially increasing hit rates if it turns out that content accesses divide on such boundaries.

I believe that for the next one to two years it will make sense to continue using ICP as an informal glue mechanism between cache systems where formal arrangements are infeasible and bandwidth is quite limited. After this time I believe commercial demands will force the Internet to a less grassroots approach to providing shared infrastructure. In the interim, I believe it would be worthwhile to supplement the measurements performed by Wessels and Claffy with some additional measurements concerning the impact of ICP on latency and network traffic among the deployed servers. First, it would be useful to measure the proportion of time spent in ICP queries relative to remote retrieval times across a range of operational link speeds. Second, it would be interesting to measure the number of bytes of traffic caused by ICP vs. the number of bytes of data transmitted for object retrievals. Current uses of ICP trade traffic at the regional/local level for savings across long-haul links, but understanding the costs at each level of the hierarchy would be illuminating. Finally, it would be worthwhile to measure how heterogeneous link speeds and proxy disk space affect how evenly cache misses are satisfied across a set of peers.
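
As a rough illustration of how these three measurements might be computed, the Python sketch below assumes a per-request record with hypothetical fields. No deployed cache is known to log data in exactly this form; the record layout and function names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """Hypothetical per-request log record (field names are assumptions)."""
    icp_seconds: float    # time spent waiting on ICP query/response
    fetch_seconds: float  # time spent retrieving the object itself
    icp_bytes: int        # bytes of ICP messages sent and received
    object_bytes: int     # bytes of object data transferred


def icp_overhead(records):
    """Return ICP's share of total time and its byte ratio to object data."""
    icp_time = sum(r.icp_seconds for r in records)
    total_time = icp_time + sum(r.fetch_seconds for r in records)
    icp_bytes = sum(r.icp_bytes for r in records)
    object_bytes = sum(r.object_bytes for r in records)
    return {
        "time_fraction": icp_time / total_time if total_time else 0.0,
        "byte_ratio": icp_bytes / object_bytes if object_bytes else 0.0,
    }


def peer_shares(misses_satisfied):
    """Fraction of misses satisfied by each peer, given a {peer: count} map."""
    total = sum(misses_satisfied.values())
    return {peer: (n / total if total else 0.0)
            for peer, n in misses_satisfied.items()}
```

Computing these figures separately for each link speed and each level of the hierarchy would show where ICP's query cost is small relative to the retrieval it avoids and where it is not. A strongly uneven result from peer_shares would be consistent with the implicit subsidy described earlier.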


The comments in this position paper are those of the author and do not represent the official position of @Home Network or its equity partners.