OutaSite
A Program to Make Site Outlines
by Nomi Harris and Wim Rijnders
Back in 1999, I wanted to overhaul my Web site, so I did a Web search to
try to find a program that would make a site map (or outline) of my
site. The only program I found that did what I wanted cost $500, so I
decided to write my own.
OutaSite is a Perl program that, given a URL, produces an outline (in
HTML) showing the children, children's children, etc., of that URL.
(The "children" of a page are the URLs to which that page has
hyperlinks.) It uses the HTTP Perl module to grab the contents of URLs.
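To make the idea of "children" concrete, here is a minimal sketch of how the links on one page could be collected. It is written in Python for illustration and is not OutaSite's actual Perl code; the names ChildLinkParser and children_of are made up for this example, and the page is passed in as a string rather than fetched over HTTP.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class ChildLinkParser(HTMLParser):
    """Collect the URLs a page links to (its "children").

    Hypothetical helper for illustration; not part of OutaSite.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.children = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # <a href> is a hyperlink; <frame src> is treated like one.
        if tag == "a":
            target = attrs.get("href")
        elif tag == "frame":
            target = attrs.get("src")
        else:
            target = None
        if not target:
            return
        # Resolve relative links against the page URL and drop any #fragment.
        url, _fragment = urldefrag(urljoin(self.base_url, target))
        # Keep each child once, in the order it first appears on the page.
        if url not in self.children:
            self.children.append(url)


def children_of(base_url, html):
    """Return the list of child URLs found in the HTML text of base_url."""
    parser = ChildLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.children
```

Calling children_of on the text of a page gives its children; calling it again on each child's text gives the children's children, and so on.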
The site outlines produced by OutaSite are not necessarily intended to
be posted on your public Web server as site maps; they are intended
more for use by Webmasters to help them clean up their Web sites by checking
for bad links or URLs that have moved.
Here
is an example of a site outline produced by OutaSite--this one was
made for my home page *before* I fixed some of the broken links (so you can see
how OutaSite shows broken links).
Features of OutaSite include:
- Hierarchical numbering of children
- Children of a given URL are printed only once; future instances of the same URL are hyperlinked to the place where they are printed.
- URLs are shown as hyperlinked titles (if the title can be determined--if not, then the URL itself is used as the title).
- Frames are treated as if they were hyperlinks.
- Command-line arguments let you set the maximum depth for searching the tree (-d) and the maximum number of children to search at any branch (-c)
- Command-line argument (-offsite) sets whether branches terminate when they hit off-site URLs
- Links that are obviously not HTML (e.g. images, .pdf files, etc.) are not followed. (Right now, CGI scripts *are* followed, but you could turn this off easily.)
- Links that are not found or forbidden are indicated in red
- Links that are unreachable are indicated in orange
- Tries to follow redirects, then indicates them in green.
- Section numbers hyperlink back to parents so you can quickly see which page has a particular link.
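The numbering, print-once, and depth/children-limit features in the list above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not OutaSite's own code: build_outline, its parameter names, and its default values are hypothetical, and the link graph is supplied as an in-memory dictionary instead of being fetched from the Web.

```python
def build_outline(root, links, max_depth=3, max_children=None):
    """Return outline entries for root as (number, url, note) tuples.

    links maps each URL to the list of URLs it links to.
    max_depth and max_children play the role of the -d and -c limits;
    the defaults used here are assumptions made for this sketch.
    """
    printed_at = {}  # url -> section number where its children were printed
    lines = []

    def visit(url, number, depth):
        if url in printed_at:
            # Children already printed elsewhere: point at that section.
            lines.append((number, url, "see " + printed_at[url]))
            return
        lines.append((number, url, ""))
        if depth >= max_depth:
            # Too deep to expand, so nothing is recorded for this URL yet.
            return
        printed_at[url] = number
        children = links.get(url, [])
        if max_children is not None:
            children = children[:max_children]
        # Hierarchical numbering: child k of section "1.2" becomes "1.2.k".
        for index, child in enumerate(children, start=1):
            visit(child, f"{number}.{index}", depth + 1)

    visit(root, "1", 0)
    return lines


if __name__ == "__main__":
    site = {"home": ["about", "news"], "about": ["home", "news"], "news": []}
    for number, url, note in build_outline("home", site):
        print(number, url, note)
```

Recording where each URL's children were printed is also what keeps the walk from looping forever when two pages link to each other.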
April 2003: Version 2.0
Wim Rijnders discovered
OutaSite and thought it was great, so he decided to make it even better.
He improved detection of redirects and errors, added handling of FTP links, and
sped up indexing of links that shouldn't be followed. He also did a bit of
code reorganization.
As a service to other Webmasters, OutaSite is available free (for
non-commercial use)--click here
to download. It should run on any system that has Perl 5.002 or later.
Send
compliments or bug reports to me (nlharris at lbl.gov) and/or Wim
(wimrijnders at home.nl).
Last modified: Mon Apr 14 15:45:23 PDT 2003