IRCache FAQ and Users Guide

1 About IRCACHE, this FAQ, and other sources of information

1.1 What is IRCACHE?

IRCACHE is the NLANR Web Caching project, originally funded by the National Science Foundation, Directorate for Computer and Information Science and Engineering.

From 1996 to 2000, the IRCache project was administered by the University of California San Diego, in cooperation with the San Diego Supercomputer Center and the National Laboratory for Applied Network Research.

1.2 What is Web Caching?

Web Caching is the act of storing copies of Web pages on a ``local'' system. If the same pages are requested at a later time, and the cached copy is still valid, there is no need to contact the origin server again. These cache hits can significantly reduce latency and network bandwidth consumption.
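
As a rough illustration of the hit/miss behavior described above, here is a minimal Python sketch. It is a toy, not how Squid works: the `TinyCache` class, its `fetch` callable, and the fixed `ttl` lifetime are all assumptions made for this example. Real caches decide whether a copy is still valid from HTTP expiration and validation headers.

```python
import time


class TinyCache:
    """Toy in-memory cache keyed by URL (illustrative only)."""

    def __init__(self, fetch, ttl=60.0, clock=time.monotonic):
        self.fetch = fetch    # hypothetical callable: url -> body, contacts the origin
        self.ttl = ttl        # assumed fixed lifetime, in seconds
        self.clock = clock    # injectable clock, which makes testing easier
        self.store = {}       # url -> (expires_at, body)
        self.hits = 0
        self.misses = 0

    def get(self, url):
        now = self.clock()
        entry = self.store.get(url)
        if entry is not None and entry[0] > now:
            # Cache hit: the stored copy is still valid, so skip the origin.
            self.hits += 1
            return entry[1]
        # Cache miss (or expired copy): fetch from the origin and store it.
        self.misses += 1
        body = self.fetch(url)
        self.store[url] = (now + self.ttl, body)
        return body
```

Requesting the same URL twice within `ttl` seconds contacts the origin only once; the second request is served from `store`.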

1.3 Where can I find more information about Web Caching?

Web Sites:

Books:

1.4 Do you have any mailing lists?

Yes, we have one, but it is mostly inactive now.

Send subscription requests to: ircache-request @ ircache.net. Send messages to: ircache @ ircache.net. Martin Hamilton is keeping an HTML archive of the mailing list.

1.5 About This Document

This document is copyrighted © 2001 by Duane Wessels.

Sorry, we do not have a postscript version available at this time.

Please send questions, comments, and corrections to wessels @ ircache.net.


2 The Caches

2.1 What is the IRCache Mesh?

As a part of this project, we operate several medium/large caches in the U.S.

We encourage anyone operating a Web cache to ``plug in'' to the global cache system. When looking for other caches to establish parent or sibling relationships with, we recommend that you look for caches already operating within your country or region. A good place to start is the Cache Tracker service. If there don't seem to be any caches nearby, consider asking your Internet Service Provider to operate a Web cache for you and their other customers.

2.2 What are the IRCache server names and locations?

The current list is:

pb.us.ircache.net Pittsburgh, Pennsylvania
uc.us.ircache.net Urbana-Champaign, Illinois
bo.us.ircache.net Boulder, Colorado
sv.us.ircache.net Silicon Valley, California (FIX-West)
sd.us.ircache.net San Diego, California
pa.us.ircache.net Palo Alto, California
sj.us.ircache.net MAE-West San Jose, California
rtp.us.ircache.net Research Triangle Park, North Carolina
ny.us.ircache.net New York, NY

2.3 Can anyone use the caches as a parent or sibling?

Probably, yes! Our service is open for anyone to use, so long as you do not abuse it.

2.4 How do I sign up?

First, finish reading this document, and then visit the registration form.

2.5 How do I know if my registration is accepted?

Registration is handled automatically. You should get an email reply shortly before the cache is reconfigured to accept your credentials.

2.6 Do you have any statistics?

Sure do, see our Cache Statistics page.

2.7 What hardware do you use?

Currently, we have two hardware platforms in use. Some of the caches are Digital AlphaServer 1000s (circa 1995) with 512 MB of RAM and about 24 GB of disk space. The newer systems are custom-built Intel-based PCs with about the same amount of RAM and disk.

2.8 What operating system do you use?

The Digital boxes run Digital Unix.

The PCs run FreeBSD.

2.9 What software do you use?

All caches run Squid, which was also supported by the IRCache project.

2.10 What are the top ten servers from your cache system?

Well, it changes over time. Feel free to browse our archive of top 50 servers.

2.11 Is your cache down?

Outages and other news are posted to our status page.

3 Information for Cache Administrators

3.1 Which of your caches should I peer with?

We allow you to peer with any of our caches. We are not able to offer advice on which one would be best for you. You can use standard tools such as ping and traceroute yourself to find out which caches are close to you and give good response times. You can get a reverse traceroute from any of the cache machines by opening a telnet session to port 3121.

We also have a service that will show you traceroute output from each of our caches to your site. Based on that, you may choose the caches which are closest or fastest to your server. To use the reverse traceroute service, visit the reverse traceroute page.
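
The telnet-based reverse traceroute mentioned above can also be scripted. The following Python sketch opens a TCP connection to port 3121 and collects whatever the server sends. It assumes that the service writes its output and then closes the connection without waiting for input, which this FAQ does not confirm. The `reverse_traceroute` function name and the timeout value are made up for this example.

```python
import socket


def reverse_traceroute(host, port=3121, timeout=60.0):
    """Return the text sent by the reverse-traceroute port on `host`.

    Assumption: the server sends its output unprompted and then closes
    the connection, as an interactive telnet session would suggest.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:          # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")
```

For example, `print(reverse_traceroute("sd.us.ircache.net"))` would print the path from the San Diego cache back to your host, provided the service is reachable.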

If you are using ICP and/or Cache Digests, please do not configure more than 3 or 4 IRCache proxies as your peers.

3.2 How should I configure my cache to peer with yours?

Configuring a neighbor cache is not a trivial matter. There are a number of decisions you need to make, and other factors to consider.

3.2.1 HTTP port number

Our caches accept HTTP requests on port 3128.

3.2.2 Parent or Sibling?

You need to decide if our cache will be a parent, or a sibling, to your cache. If you use a sibling relationship, then your cache requests only those objects that are already in our cache. The sibling relationship requires you to use one of the inter-cache protocols described below. If you use a parent relationship, then your cache may request objects that are not already in our cache.

3.2.3 Inter-cache protocols

If you have a sibling relationship, then you must use one of the inter-cache protocols described here. If you have a parent relationship, then the inter-cache protocol is optional.

ICP

Our caches accept ICP queries on port 3130. ICP is mature, lightweight, and relatively efficient. ICP may cause some false hits, but that won't be an issue in our configuration.

Cache Digests

A Cache Digest represents the contents of a cache at a particular point in time. Our caches have relatively large digests (500 KB to 1 MB). If you have a small amount of traffic, ICP would be a better choice. If you want to use Cache Digests, you must compile Squid with the --enable-cache-digests option to the configure script. See the Cache Digest section of the Squid FAQ for more information.

HTCP

Our caches accept HTCP queries on port 4827. HTCP is newer and more complicated than ICP, but it should be better at correctly predicting cache hits. In other words, it may be better for a sibling relationship.
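
The rules from sections 3.2.2 and 3.2.3 can be summarized in a few lines of Python. This is only a sketch of the decision logic: the port numbers come from this FAQ, while the `check_peering` function and the protocol labels `"icp"`, `"htcp"`, and `"digest"` are names invented for the example.

```python
HTTP_PORT = 3128                            # HTTP requests (section 3.2.1)
QUERY_PORTS = {"icp": 3130, "htcp": 4827}   # query ports given in this FAQ


def check_peering(relationship, protocol=None):
    """Return the query port for a peering setup, or None if there is none.

    Raises ValueError for combinations that the FAQ rules out.
    """
    if relationship not in ("parent", "sibling"):
        raise ValueError("relationship must be 'parent' or 'sibling'")
    if protocol not in (None, "icp", "htcp", "digest"):
        raise ValueError("unknown inter-cache protocol: %r" % (protocol,))
    if relationship == "sibling" and protocol is None:
        # A sibling may only be asked for objects it already holds, so
        # some inter-cache protocol is needed to find out what it has.
        raise ValueError("a sibling relationship requires an inter-cache protocol")
    # This FAQ lists no separate query port for Cache Digests, so they map to None.
    return QUERY_PORTS.get(protocol)
```

A parent with no protocol is accepted and has no query port; a sibling with no protocol is rejected.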

3.3 Do you have any sample cache_peer lines?

If you are using Squid, you need to modify squid.conf and add some lines like these.

For a parent, using ICP:

cache_peer pb.us.ircache.net parent 3128 3130 login=you@your.domain:password

For a sibling, using HTCP:

cache_peer uc.us.ircache.net sibling 3128 4827 htcp login=you@your.domain:password

For a parent with no inter-cache protocol:

cache_peer bo.us.ircache.net parent 3128 0 no-query default login=you@your.domain:password
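
If you manage several peers, lines like the ones above can be generated rather than typed by hand. The Python sketch below reproduces the three sample lines from their parts. The `cache_peer_line` helper and its parameters are hypothetical, and the placeholder login must be replaced with your registered credentials. It mirrors only the options shown in the samples and is not a general squid.conf generator.

```python
HTTP_PORT = 3128  # HTTP port of the IRCache proxies (section 3.2.1)


def cache_peer_line(host, relationship, protocol=None,
                    login="you@your.domain:password"):
    """Build one cache_peer line in the style of the samples above."""
    if protocol == "icp":
        tail = ["3130"]                       # ICP query port
    elif protocol == "htcp":
        tail = ["4827", "htcp"]               # HTCP query port plus the htcp option
    elif protocol is None:
        tail = ["0", "no-query", "default"]   # no inter-cache protocol
    else:
        raise ValueError("unsupported protocol: %r" % (protocol,))
    parts = ["cache_peer", host, relationship, str(HTTP_PORT)]
    parts += tail
    parts.append("login=" + login)
    return " ".join(parts)


if __name__ == "__main__":
    print(cache_peer_line("pb.us.ircache.net", "parent", "icp"))
    print(cache_peer_line("uc.us.ircache.net", "sibling", "htcp"))
    print(cache_peer_line("bo.us.ircache.net", "parent"))
```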

3.4 What type of requests are forbidden?