The Linux Documentation Project

LDP Weekly News
2001-03-06

About the LDP

The Linux Documentation Project is developing free, high quality documentation for the GNU/Linux operating system. This includes the creation of HOWTOs and Guides, and collaboration with other documentation groups.

If you've always wanted to help Linux reach Total World Domination(tm), but you're not a programmer, there's still something you can do. Help the LDP!

The LDP keeps a page of resources for authors at http://www.linuxdoc.org/authors/. Contributions are always welcome.

For more LDP Weekly News, go to http://www.linuxdoc.org/ldpwn/

New Documents

Updated Documents

Stale Mirror Issues

We are continuing to uncover stale LDP HOWTOs all around the net. David S. Lawyer, an LDP author as well as an LDP volunteer, did some research with Google and a single HOWTO, and uncovered some very surprising and very disturbing facts. Here is his report.

Stale HOWTOs (the case of Modem-HOWTO)

by David S. Lawyer, Mar. 7, 2001

Out-of-date (stale) documentation is a major problem for Linux, and the Linux Documentation Project (LDP) is no exception. One well-known reason for stale documents is that authors sometimes don't revise them frequently enough. But even if documents are revised frequently, people searching for information may not find the up-to-date versions.

Here's why. Even though the Linux Documentation Project (LDP) has the most recent versions of its documents on over 200 mirror sites, several hundred other sites also carry LDP documents. Unfortunately, most of these have stale documentation. Why don't people just go to the mirror sites and avoid the other sites? The reason is that many people search for information about Linux using one of the many search engines available on the Internet. More likely than not, such a search engine will find out-of-date Linux documents. While the LDP sites have a search engine for searching the LDP site, it's often advantageous to search the entire Web since there are many other documents available besides just LDP's. But doing so is likely to find stale documentation.

Suppose someone finds an LDP HOWTO by using a search engine. Can't they just look at the date of the document and click on a link to a mirror site that will have the latest version? Unfortunately, this isn't easy to do. What people usually find with a search engine is not the entire document, but only one chapter of it. The HTML documents are usually split up into chapters so that they will download fast.

The individual chapters don't contain version or date information (perhaps they should). While there may be a chapter in the document that contains a link to the latest version, it's not likely to be the chapter that one finds with a search engine. Finding such a link (if it exists) requires first clicking on the "contents" link to get to the table-of-contents page. Then one might browse the contents to try to find a link to another chapter, which itself might contain a link to the most recent version. It's not simple, sure, or fast, so few readers are likely to do this.

I did a quick survey to find out which versions of Modem-HOWTO were on the Internet. Here are the results. (The last column is the number of sites on the web carrying that version, per Google on Mar. 2, 2001.)

Version   Date        Count
v0.14     Feb. 2001       0
v0.13     Feb. 2001       0
v0.12     Dec. 2000      76
v0.11     June 2000     118
v0.10     May 2000       60
v0.09     Mar. 2000      18
v0.08     Jan. 2000      61
v0.07     Nov. 1999       3
v0.06     Nov. 1999       2
v0.05     Oct. 1999      17
v0.04     Aug. 1999      64
v0.03     May 1999       11
v0.02     Mar. 1999      73
v0.01     Jan. 1999      58
v0.00     Dec. 1998      63
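
To get a feel for the scale of these numbers, the counts can be tallied by release year. The following is only a rough sketch: the rows are copied from the table, while the names `SURVEY` and `tally` are made up for this illustration and are not part of the original survey.

```python
# (version, release date, Google hit count) copied from the survey table.
SURVEY = [
    ("v0.14", "Feb. 2001", 0),
    ("v0.13", "Feb. 2001", 0),
    ("v0.12", "Dec. 2000", 76),
    ("v0.11", "June 2000", 118),
    ("v0.10", "May 2000", 60),
    ("v0.09", "Mar. 2000", 18),
    ("v0.08", "Jan. 2000", 61),
    ("v0.07", "Nov. 1999", 3),
    ("v0.06", "Nov. 1999", 2),
    ("v0.05", "Oct. 1999", 17),
    ("v0.04", "Aug. 1999", 64),
    ("v0.03", "May 1999", 11),
    ("v0.02", "Mar. 1999", 73),
    ("v0.01", "Jan. 1999", 58),
    ("v0.00", "Dec. 1998", 63),
]


def tally(rows):
    """Return (total hits, hits grouped by release year)."""
    total = sum(count for _, _, count in rows)
    by_year = {}
    for _, date, count in rows:
        year = date.split()[-1]  # "Feb. 2001" -> "2001"
        by_year[year] = by_year.get(year, 0) + count
    return total, by_year


total, by_year = tally(SURVEY)
print("total listed copies:", total)
for year in sorted(by_year, reverse=True):
    print(year, by_year[year])
```

By this tally Google listed 624 copies in all: 333 of versions released in 2000, 291 of versions released in 1999 or earlier, and none of either February 2001 release.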

The situation is not quite as dire as shown above since in some cases Google doesn't have the latest info: the site has been updated but Google doesn't know about it, or the site may be dead. But a spot check indicated that roughly 80% of them still exist as listed. The sites that were supposed to have v0.12 frequently had the latest version.

For a small minority of cases there's double counting since some sites have HOWTOs in more than one format. Also, a small minority of sites have stale HOWTOs in a directory named "archives", "old", etc. This is OK since they are being correctly classified.

In another respect the situation is even worse than described above, since the Modem-HOWTO was a fork from the Serial-HOWTO. Over 200 old versions of Serial-HOWTO (from before the first version of Modem-HOWTO) are still on the Internet. They all contain quite obsolete information about modems.

Here are some details on how I did the search. I searched google.com with the search terms: Modem-HOWTO "modulation details" v0.xx, where xx = 00, 01, 02, etc. The phrase "modulation details" is from the table of contents, so it always selects the HTML table-of-contents file (for split HTML HOWTOs). This is needed since v0.xx sometimes also appears in chapter 1, where it is used so that readers can click on a link to LDP to see if they have the latest version. If "modulation details" were omitted there would be double counting. Also, "modulation details" removes hits on lists/catalogs of HOWTOs. There are more details on how I did it, but they're not of general interest and are thus omitted.
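
The per-version search terms described above can be generated mechanically. This is a minimal sketch that only builds the query strings; the function name `build_queries` and its parameters are invented for illustration, and the sketch does not contact any search engine or reproduce the omitted details of the survey.

```python
def build_queries(first=0, last=14):
    """Build one search string per Modem-HOWTO version, v0.00 through v0.14.

    The quoted phrase comes from the table of contents, so each query should
    match only the table-of-contents page of a split HTML HOWTO.
    """
    return [
        f'Modem-HOWTO "modulation details" v0.{n:02d}'
        for n in range(first, last + 1)
    ]


queries = build_queries()
for query in queries:
    print(query)
```

Each printed line is one query to enter by hand; the hit count reported for it corresponds to one row of the table.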

Thus there are a lot of out-of-date versions of LDP docs (and other documentation) on the Internet. One way to try to lessen this problem would be to put some requirement into the license so that when a document becomes outdated it must be clearly labeled as such. Such labeling needs to be seen before one clicks on the document. But how can this be assured? What might help would be to add a suffix to the name of the document to indicate that it's outdated.
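
As a rough illustration of the suffix idea, a site carrying old copies could rename them with a small helper like the one below. This is only a sketch under stated assumptions: the `-OUTDATED` suffix and the names `parse_version` and `labeled_name` are hypothetical, not an LDP convention, and the site is assumed to know which version is the latest.

```python
from pathlib import PurePosixPath


def parse_version(tag):
    """Turn a tag such as 'v0.12' into a comparable tuple such as (0, 12)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def labeled_name(filename, version, latest, suffix="-OUTDATED"):
    """Return filename unchanged if it is current, else with suffix added.

    The suffix (hypothetical) goes before the extension so the file still
    opens normally, and the label shows up in links and search results.
    """
    if parse_version(version) >= parse_version(latest):
        return filename
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}{suffix}{path.suffix}"))


print(labeled_name("HOWTO/Modem-HOWTO.html", "v0.12", "v0.14"))
print(labeled_name("HOWTO/Modem-HOWTO.html", "v0.14", "v0.14"))
```

Putting the label in the file name means it appears in the URL, which is visible before anyone clicks on the document.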

As you can see, stale documentation on the public network is a serious problem. We don't have the resources to be the documentation police on the network, but we ask that if you wish to mirror LDP documents, you do so responsibly and keep your mirror up to date.

We also recommend that users use our mirror list, at http://www.linuxdoc.org/mirrors.html. Our "official" mirrors are generally well maintained and up to date with the latest HOWTOs.
