You probably SHOULD NOT send non-GET
requests to our caches. POST, PUT, and other non-GET
request methods are not cachable by default, so there
is little benefit for either you or us if you send
those requests through our system.
You MUST NOT send HTTPS (aka SSL)
requests through the IRCache proxies.
We used to accept SSL requests, but some dishonest people
abused our service by relaying bogus transactions through
our caches. Because of these transactions, we received many
complaints about credit card fraud and threats of
FBI involvement.
Thus, we now must deny all SSL requests.
You MUST NOT send requests containing
only an IP address. For example:
http://172.16.12.13/
We are now forced to reject such requests under the assumption
that they come from HTTP-intercepted scanning activity. We
have received some abuse complaints
related to such requests.
Note that we only reject IP-based URLs that have no pathname
component.
The example squid.conf section below demonstrates how
to configure Squid
so that only GET requests are sent to our caches, and so
that IP-address-only URLs are never sent:
acl IpAddressOnly url_regex ^http://[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/$
acl IpAddressOnly url_regex ^http://[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$
acl GETONLY method GET
cache_peer_access sd.us.ircache.net deny IpAddressOnly
cache_peer_access sd.us.ircache.net allow GETONLY
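If you want to check which requests the rules above would let through, the following Python sketch may help. It reuses the two IpAddressOnly patterns verbatim and assumes that a request matching no rule is denied (the opposite of the last rule, which is an allow). The function name would_forward is made up for this illustration and is not part of Squid.

```python
import re

# The same two patterns as the IpAddressOnly url_regex lines above.
IP_ONLY_PATTERNS = [
    re.compile(r"^http://[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/$"),
    re.compile(r"^http://[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$"),
]


def would_forward(method, url):
    """Mimic the two cache_peer_access rules, checked in order.

    Hypothetical helper for illustration only; Squid itself does this
    matching internally.
    """
    if any(p.search(url) for p in IP_ONLY_PATTERNS):
        return False  # deny IpAddressOnly
    # allow GETONLY; anything else matches no rule and is assumed denied
    return method == "GET"
```

For example, would_forward("GET", "http://172.16.12.13/") is false, while the same address with a pathname, such as http://172.16.12.13/index.html, is allowed.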
We do not recommend using multicast ICP, but if you really
want to try it, we will allow it. Be sure that you read
the multicast section
of the Squid FAQ first! To use
multicast, you need to enter these lines in your squid.conf
file. If you prefer, parent could be replaced with
sibling:
cache_peer nlanr.mcast.ircache.net multicast 3128 3130 ttl=128
cache_peer pb.us.ircache.net parent 3128 3130 multicast-responder
cache_peer uc.us.ircache.net parent 3128 3130 multicast-responder
cache_peer bo.us.ircache.net parent 3128 3130 multicast-responder
cache_peer sv.us.ircache.net parent 3128 3130 multicast-responder
cache_peer sd.us.ircache.net parent 3128 3130 multicast-responder
After completing the registration process, as described
above, you can configure your browser. Note that
it takes us a while to process your registration, so you
may need to wait a few hours before using our caches.
The preferred method is to use one of our proxy autoconfiguration
files. This is a URL that you type into your browser. Your
browser uses this URL to configure itself for our proxy
service.
The URL that you'll use depends on which IRCache proxy,
or proxies, you want to use. If you don't really care
which one you use, try the random PAC file:
http://www.ircache.net/pac/random.pac
Browse the PAC directory for
more options.
If you're using Netscape Navigator version 4:
Select Edit from the top menu
Select Preferences
Click on the small triangle next to Advanced
Click on Proxies
Select the Automatic proxy configuration button.
Type the above URL into the Configuration location box.
If you're using MSIE:
Select View from the top menu
Select Internet Options
Click on the Connection tab
Select the Automatic configuration sub-window
Click on the Configure button.
Type the above URL into the URL box.
If you're using KDE Konqueror:
Select Settings from the top menu
Select Configure Konqueror...
Click on the Proxy button on the left side of the configuration window
Select the Use the following proxy configuration URL option
Type the above URL into the box below that.
If you're using Opera:
Select Tools from the top menu
Select Preferences...
Click on the Network line on the left side of the preferences window
Click on the Proxy servers button
Select the Use automatic proxy configuration checkbox
Type the above URL into the box below that.
If you can't use proxy autoconfiguration, then you need to
manually configure your browser.
If you're using Netscape Navigator version 4:
Select Edit from the top menu
Select Preferences
Click on the small triangle next to Advanced
Click on Proxies
Select Manual proxy configuration and click on
the View button.
Type in the cache hostname (sd.us.ircache.net) and port
number (3128) for FTP Proxy, Gopher
Proxy, and HTTP Proxy.
Leave Security Proxy blank!!
If you're using MSIE:
Select View from the top menu
Select Internet Options
Click on the Connection tab
Select the Proxy server sub-window
Select Manual proxy configuration and click on
the View button.
Type in the cache hostname (sd.us.ircache.net) and port
number (3128) for HTTP and FTP. Leave Secure blank!!
Do NOT select Use the same proxy server for all protocols!!
The ``:443'' in the URL indicates that this is an SSL, or ``https'' request.
Our caches do not accept SSL requests for two reasons:
They are not cachable.
Hackers have abused our service in the past by routing SSL
requests through our caches.
Most likely, your ISP has not followed the instructions
outlined in the Forbidden
Requests section above. Your ISP must not send SSL
requests to our caches.
You are welcome to browse our Squid configuration files.
If you have specific questions (like the following), please ask!
Not really. We used to, but that configuration has recently
been dropped.
To understand how requests are routed through the IRCache proxies,
you need to be aware of the following:
We don't limit requests coming into the caches.
Our cache clients pick one or two caches to use, and send
all of their requests to those caches. So requests
are not partitioned before hitting one of the IRCache proxies.
Strict partitioning based on TLD doesn't work very well because
the .com domain, for example, is so much larger than the
others.
Cache-routing decisions are based on various inter-cache
protocols such as ICP, HTCP, and Cache Digests. These
protocols first select neighbors where the request
would be a hit, and secondly prefer neighbor caches that
are close to the origin server.
Some of the IRCache proxies peer with each other, and some
do not.
``Routing'' is separate from ``caching'' and the IRCache
proxies are currently configured to store any cachable response
they receive. Thus, a response that passes through two caches
is stored in both of them.
To figure out the cache routing configuration, you first
have to look at the cache_peer and
cache_peer_access lines. Since the configuration
changes from time to time, your best bet is to look at the
current configuration.
Hard to say. Sometimes years, but recently more often.
Yes, Squid supports the HTTP/1.1 Via header.
Yes, Squid supports HTTP/1.1 persistent connections, subject
to a few restrictions:
Squid doesn't block any request to wait for a busy connection
to become idle. It always opens another connection.
Idle connections time out after a short (configurable)
amount of time. Currently set to 15 seconds.
Squid does not yet support chunked encoding. A response
without a Content-length header always closes
a connection.
The access(es) that you noticed came from a Proxy Web Cache.
We (IRCache) operate a number of
proxy caches as a part of our
Information Resource caching project,
originally funded by the National
Science Foundation.
These caches are located
at various educational and commercial institutions throughout the U.S.
Some of these caches are:
128.182.72.190 pb.us.ircache.net
141.142.121.5 uc.us.ircache.net
192.43.244.42 bo1.us.ircache.net, aka harvest.ucar.edu
192.43.217.35 bo2.us.ircache.net, aka boo.ucar.edu
192.203.230.19 sv.us.ircache.net
198.17.46.58 sd.us.ircache.net
204.123.7.2 pa.us.ircache.net, aka cache2.nlanr.pa-x.dec.com
204.29.239.20 sj.us.ircache.net
128.109.131.47 rtp.us.ircache.net
216.66.24.58 ny.us.ircache.net
Our caches are open to any institution to use. Any organization
may join our mesh of caches simply by asking.
We apologize if someone accessed your site through our caches
in a way that makes you upset or uncomfortable. We are aware that the use of
Web caches has both advantages and disadvantages. However, we sincerely
hope that the benefits outweigh the negative aspects.
Note, all requests from our caches include information in the
request headers which may allow you to track down the originating
party. Specifically, the Via and X-Forwarded-For
headers look something like this:
Via: 1.0 CollegeSherbrooke.qc.ca:8080 (Squid/1.1.10), 1.0 pb.us.ircache.net:3128 (Squid/1.1.18)
X-Forwarded-For: 192.219.72.22, 192.219.75.2
When a cache forwards an HTTP request it appends its own information to the
Via header, and it appends the requesting client's IP address to the X-Forwarded-For
header.
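If you process these headers from your own server logs, a small parser along the following lines may be enough. This is only a sketch: it splits on commas naively, so it assumes that the comment part of a Via entry contains no commas. The function names are made up for this example.

```python
def parse_via(value):
    """Split a Via header value into (protocol, received_by, comment) tuples.

    Naive sketch: assumes no commas inside the parenthesized comments.
    """
    hops = []
    for entry in value.split(","):
        parts = entry.strip().split(None, 2)
        if len(parts) < 2:
            continue  # skip malformed entries
        comment = parts[2].strip("()") if len(parts) > 2 else ""
        hops.append((parts[0], parts[1], comment))
    return hops


def parse_forwarded_for(value):
    """Return the addresses in X-Forwarded-For, earliest client first."""
    return [addr.strip() for addr in value.split(",") if addr.strip()]
```

Because each cache appends to the end of both headers, the first X-Forwarded-For address is the one reported by the cache nearest to the originating client, and the last Via entry is the cache that contacted your server.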
The IRCache proxies do not proxy SSL connections. We
decided to make this change back in September 1998 when
many people were abusing our service. Currently, the only
way anyone could make a credit card purchase through our
proxies is if the origin server accepts such transactions
over insecure, unencrypted connections.
We shouldn't need to tell you that accepting credit card
transactions via insecure, unencrypted connections is
a really bad idea.
If your service accepts such transactions, we are more
than happy to block your site for IRCache proxy users.
To do so, we simply need to know the IP addresses and/or
hostnames of your servers.
Since our proxies handle millions of requests per day,
it is insufficient to simply say ``someone from your
site made a fraudulent purchase on day X.'' We'll need
as much information as possible, including:
- The exact time of the transaction.
- Your server's hostname.
- The URL that was used for the transaction.
Squid
sends ICMP echo requests (aka ``pings'') to origin servers to
measure network proximity.
These network RTT (round-trip time) measurements are used by
Squid in cache meshes to forward requests to the cache which
is closest to the origin server.
In short, we ping your site in order to know which cache
will deliver your web pages most quickly and efficiently.
Consider a cache located in the U.K. which has
(at least) two neighbor caches to which it can forward
requests. One of these is located in the U.S., the other in
Germany. The U.K. cache might be configured to forward all
.com requests to the U.S., but that would be sub-optimal
for requests to www.mercedes-benz.com which is located
in Germany.
As shown in the above figure, both the Germany and U.S. caches
have measured their network RTT to various
origin servers. If we include these RTT measurements in
the ICP replies to the U.K. cache, then the U.K. cache
can select the neighbor which is closest to the source.
In this case, mercedes-benz.com requests are
forwarded to Germany, and chrysler.com requests
are forwarded to the U.S.
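The selection step in this example can be sketched in a few lines of Python. The RTT values and the neighbor names us-cache and de-cache below are invented for illustration; they are not real measurements, and this is not Squid's actual code.

```python
def closest_neighbor(rtts):
    """Return the neighbor that reported the smallest RTT (milliseconds)."""
    return min(rtts, key=rtts.get)


# Hypothetical RTT values (ms) as each neighbor might report them
# in its ICP reply to the U.K. cache.
replies = {
    "www.mercedes-benz.com": {"us-cache": 140, "de-cache": 20},
    "www.chrysler.com": {"us-cache": 25, "de-cache": 150},
}

for host, rtts in replies.items():
    print(host, "->", closest_neighbor(rtts))
```

With these made-up numbers, the mercedes-benz.com request goes to the German neighbor and the chrysler.com request goes to the U.S. neighbor.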
Instead of keeping one measurement per hostname or address,
we aggregate the measurements by subnetworks (/24) under
the assumption that two hosts on the same subnetwork will
have approximately the same network RTT. Measured values
are updated and averaged over time. Squid sends
pings to the same subnet no more than once per 30 seconds.
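The bookkeeping described above might look something like the following Python sketch. It is not Squid's implementation. In particular, the plain running mean is an assumption (the text only says that values are averaged over time), the class and method names are hypothetical, and the sketch handles IPv4 addresses only.

```python
import ipaddress

PING_INTERVAL = 30.0  # seconds between pings to one subnet, per the text above


class RttTable:
    """Keep one averaged RTT per /24 subnet (hypothetical sketch)."""

    def __init__(self):
        # /24 network -> {"rtt": mean in ms, "count": samples, "last_ping": time}
        self._entries = {}

    @staticmethod
    def subnet(addr):
        # strict=False masks off the host bits: 192.0.2.17 -> 192.0.2.0/24
        return ipaddress.ip_network(f"{addr}/24", strict=False)

    def may_ping(self, addr, now):
        """True if this subnet has not been pinged in the last 30 seconds."""
        entry = self._entries.get(self.subnet(addr))
        return entry is None or now - entry["last_ping"] >= PING_INTERVAL

    def record(self, addr, rtt_ms, now):
        """Fold a new measurement into the subnet's running mean."""
        entry = self._entries.setdefault(
            self.subnet(addr), {"rtt": 0.0, "count": 0, "last_ping": now}
        )
        entry["count"] += 1
        entry["rtt"] += (rtt_ms - entry["rtt"]) / entry["count"]
        entry["last_ping"] = now

    def rtt(self, addr):
        """Return the averaged RTT for the address's subnet, or None."""
        entry = self._entries.get(self.subnet(addr))
        return entry["rtt"] if entry else None
```

Two hosts such as 192.0.2.10 and 192.0.2.200 share one entry here, so a measurement of either one answers a lookup for the other.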
Yes.
Academic researchers may receive access to the trace files at
no cost. To request access, please follow the instructions
in our README file.
Commercial users are expected to pay for access to the trace files.
This helps support our project, which requires equipment, bandwidth,
and human resources.
Do not use a web browser to download the log files. Use a command
line FTP client instead:
shell> ftp ftp.ircache.net
Connected to ircache.net.
220 ircache.net FTP server (Version 6.00LS) ready.
Name (ftp.ircache.net:wessels): USERNAME
331 Password required for USERNAME.
Password: PASSWORD
230 User USERNAME logged in, access restrictions apply.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd Traces
250 CWD command successful.
ftp> ls
150 Opening ASCII mode data connection for '/bin/ls'.
total 879674
-r--r--r-- 1 1000 0 6584 Nov 1 1999 README
-r--r--r-- 1 1000 0 1931667 Oct 8 12:13 bo1.sanitized-access.20011007.gz
-r--r--r-- 1 1000 0 6719329 Oct 9 12:17 bo1.sanitized-access.20011008.gz
...
You must use BINARY mode when transferring the files. You can
use gzip to uncompress them.
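If you would rather script the download than type the FTP commands by hand, a sketch using Python's standard ftplib and gzip modules follows. USERNAME and PASSWORD are placeholders, as in the session above, and the function names are made up for this example. retrbinary always transfers in binary mode, which is what the trace files need.

```python
import gzip
from ftplib import FTP


def fetch_trace(ftp, name, dest):
    """Download one trace file in binary mode over an open FTP session."""
    with open(dest, "wb") as out:
        ftp.retrbinary(f"RETR {name}", out.write)


def count_lines(path):
    """Count the log lines in a gzip-compressed trace file."""
    with gzip.open(path, "rt", errors="replace") as f:
        return sum(1 for _ in f)


def download(user, password, name):
    """Log in, change to the Traces directory, and fetch one file."""
    with FTP("ftp.ircache.net") as ftp:
        ftp.login(user, password)
        ftp.cwd("Traces")
        fetch_trace(ftp, name, name)
    return count_lines(name)
```

A call such as download("USERNAME", "PASSWORD", "bo1.sanitized-access.20011007.gz") would fetch the first file in the listing above and report how many log lines it contains.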
Section 6 of
the Squid FAQ describes the logfiles in detail.
Trace files have names like this:
ny.sanitized-access.20040912.gz
pa.sanitized-access.20040912.gz
pb.sanitized-access.20040912.gz
The first part is the name of the cache that this logfile
came from. They are usually abbreviations for
city names. See the complete list of caches above.
The numeric part of the filename is the date of the
logfile as YYYYMMDD.
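A script that processes many trace files can split the names apart with a short helper like the one below. The regular expression is inferred from the example names above, so treat it as an assumption and adjust it if you come across names that do not fit.

```python
import re
from datetime import datetime

# Pattern inferred from the example filenames; not an official specification.
TRACE_NAME = re.compile(r"^([a-z0-9]+)\.sanitized-access\.(\d{8})\.gz$")


def parse_trace_name(filename):
    """Return (cache_name, date) for a trace filename."""
    m = TRACE_NAME.match(filename)
    if m is None:
        raise ValueError(f"not a trace filename: {filename!r}")
    cache, stamp = m.groups()
    # strptime also rejects impossible dates such as 20041340
    return cache, datetime.strptime(stamp, "%Y%m%d").date()
```

For ny.sanitized-access.20040912.gz this gives the cache name ny and the date 12 September 2004.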
There are plenty of RELEASE entries in the
store log with filenumber FFFFFFFF that correspond to access.log
entries with hits (e.g. TCP_HIT/200). Since they were found in the
cache, they must have been cachable. How is it that these uncachable
responses are served as hits?
The store.log entries that you refer to are most likely
cache validation responses. Squid sends an
If-Modified-Since (or other validation) request
to the origin server. If the response comes back
as 304, then the client receives the original
cached response from Squid.
When analyzing store.log, be sure to consider the
status code field (6th). You may want to ignore
all log entries where the status code is not
200 (OK).
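One way to apply this filter is sketched below in Python. It takes the status code from the 6th whitespace-separated field, as stated above. The position is a parameter because the store.log layout may differ in other Squid versions, and the function name is made up for this example.

```python
def ok_entries(lines, status_field=6):
    """Yield only the store.log lines whose status code field is 200.

    status_field is 1-based; 6 follows the description above and is
    an assumption for any other log layout.
    """
    for line in lines:
        fields = line.split()
        if len(fields) >= status_field and fields[status_field - 1] == "200":
            yield line
```

You could use it as ok_entries(open("store.log")), which drops the 304 validation entries along with every other non-200 line.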
It varies among the different caches, and depends on the day of the week.
For current numbers, see this graph.
First, the real IP addresses are replaced with "random" addresses. The
mapping from real-to-random remains consistent for each log file. For example,
1.1.1.1 would always be shown as 5.5.5.5 in a single log file. Randomizations
are not consistent between log files, or between days.
Second, any URL query terms (following a '?') are replaced with an MD5 hash.
For example:
http://sleuth-hound.com/search/search/Simple.php?[2:zwIFiSCrOJktlLNSVFGV]
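The two steps can be illustrated with the Python sketch below. It is not the tool we use to produce the trace files. The class name is hypothetical, the replacement addresses are simply random dotted quads, and the hash is written as a plain hexadecimal MD5 digest, which does not match the bracketed encoding shown in the example above.

```python
import hashlib
import random


class Sanitizer:
    """Illustrative sanitizer; create one instance per log file."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._mapping = {}  # real address -> replacement, for this file only
        self._used = set()

    def ip(self, real):
        """Map a real address to a random one, consistently within a file."""
        if real not in self._mapping:
            while True:
                fake = ".".join(str(self._rng.randrange(256)) for _ in range(4))
                if fake not in self._used:
                    break
            self._used.add(fake)
            self._mapping[real] = fake
        return self._mapping[real]

    def url(self, url):
        """Replace any query terms after '?' with their MD5 digest."""
        base, sep, query = url.partition("?")
        if not sep:
            return url
        return base + "?" + hashlib.md5(query.encode()).hexdigest()
```

Using a fresh Sanitizer for each log file gives the behavior described above: an address maps to the same replacement throughout one file, but not across files or days.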