Oceanography 549--Communicating Ocean Science--Spring Quarter 1996

reminder: need to discuss schedule for rest of quarter (last Monday of quarter is Memorial Day)



HTML Implementations
Anchor Checking

HTML Implementations

Compatibility or control over formatting? Early in the history of the World Wide Web, the structure of HTML represented a community consensus achieved through the mechanism of an Internet Engineering Task Force (IETF) working group. These groups, open to all who want to participate, develop the specifications for traffic on the Internet. Acceptance of standards for HTML and implementation of these standards in browser (client) software proceeded in parallel. All of this changed when Netscape Communications entered the scene, simultaneously proposing extensions to HTML to the appropriate IETF working group while implementing them in a client and distributing this client widely to the public.

A short trip through Alta Vista yielded some sites which use a server that tracks browser type. Some examples:

Which extensions create incompatibility? dbasic.com provides a tabular summary of the key differences. We can see how these differences are manifested in this comparison by a commercial web page developer, who has given in to Netscape:

Anchor Checking

A spider is an automated program that searches the web. Spiders are most often used to collect data for search engines like Webcrawler, Alta Vista, and Lycos, but they are also useful to a web page developer as a mechanism for finding broken links. I make use of a program called momspider.
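To make the idea of link checking concrete, here is a minimal Python sketch of the two steps any such spider performs: pull the anchors out of a page, then ask the server about each target. This is not momspider itself and says nothing about how momspider is implemented; the names AnchorCollector, extract_links, and check_link are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urldefrag, urljoin
from urllib.request import Request, urlopen


class AnchorCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) link targets in page order, without duplicates."""
    collector = AnchorCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")) and absolute not in links:
            links.append(absolute)
    return links


def check_link(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrives."""
    request = Request(url, method="HEAD")  # HEAD asks for headers only
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # e.g. 404 for a broken link
    except (URLError, OSError):
        return None  # DNS failure, refused connection, timeout, ...
```

A caller would fetch a starting page, pass its text and URL to extract_links, and report every link for which check_link returns None or a status of 400 or above. A real spider also needs to respect robots.txt and avoid revisiting pages, which this sketch leaves out.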

momspider is designed to test all links made within a set of web documents. This set of documents is defined by an instruction file, which is generally located in a person's home directory. For example, my home directory is /usra/mcduff and in that directory I have a file /usra/mcduff/.momspider-instruct. When momspider is invoked, it systematically traverses all links within the set of documents, checking the status of each. It produces a web page, whose location is defined in the .momspider-instruct file. For example, to traverse the pages referenced on the research activities page within the School of Oceanography server I used the instruction file:

AvoidFile   /usra/mcduff/.momspider-avoid
SitesFile   /usra/mcduff/.momspider-sites
<Tree 
    Name          Oceanography Research Links
    TopURL        http://www.ocean.washington.edu/research/
    IndexURL      http://www.ocean.washington.edu/people/faculty/mcduff/spider-res.html
    IndexFile     /www/htdocs/people/faculty/mcduff/spider-res.html
    IndexTitle    Ocean Research Links MOMspider Index
    EmailAddress  mcduff@ocean.washington.edu
    EmailBroken
    EmailRedirected
    EmailChanged  7
    EmailExpired  1
    ExpireWindow  1
>

with this result. The program also e-mails you on completion with a brief summary of critical items (these are defined in the lines beginning Email above).
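The layout of the instruction file shown above (a few global directives, then one or more blocks opened by a line such as <Tree and closed by a line containing only >) can be illustrated with a short Python sketch that reads the file into dictionaries. This is an assumption-laden illustration based only on the example above, not momspider's own parser; the function name parse_instructions and the "Type" key are hypothetical.

```python
def parse_instructions(text):
    """Split instruction-file text into global directives and a list of blocks.

    Assumes the layout seen in the example: "Name value" lines, with
    traversal blocks opened by "<Tree" and closed by a lone ">".
    """
    global_directives = {}
    blocks = []
    current = None  # the block being read, or None when outside any block
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("<"):
            # Opening line of a block, e.g. "<Tree".
            current = {"Type": line[1:].strip()}
            continue
        if line == ">" and current is not None:
            blocks.append(current)
            current = None
            continue
        # A directive is a name followed by an optional value.
        parts = line.split(None, 1)
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        target = current if current is not None else global_directives
        target[name] = value
    return global_directives, blocks
```

Applied to the file above, the first dictionary would hold AvoidFile and SitesFile, and the single block would map TopURL to http://www.ocean.washington.edu/research/. Directives with no value, such as EmailBroken, are stored with an empty string.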

To use momspider you need to:

  1. either give the command

    source /usr/local/skel/.momspider-env

    or alternatively add this command to the .login file in your home directory so it will be done automatically every time you log in. You can edit the file .login using pico by giving the command

    pico .login

  2. prepare a .momspider-instruct file for your home directory

  3. run momspider (/usr/local/bin must be in your path)

This should be sufficient information for individual users, but if you are interested in more complex applications of momspider you can link to the usage notes.

Brief group exploration

Questions, comments

small groups




Oceanography 549 Pages
Russ McDuff (mcduff@ocean.washington.edu)
Copyright (©) 1996 University of Washington; Copyright Notice
Last Updated 5/14/96