Mailing List Archive: Leveraging What's Been Said

The SIGIA-L mailing list, sponsored by ASIS&T, is a valuable resource to the community. However, ASIS&T doesn't have the ability to effectively support the ongoing maintenance of SIGIA-L. A new archive could be created that would initially address the basic problems we're currently encountering.

A more ambitious approach could then be taken to create new value in two ways:

  1. go beyond the basic functionality provided by such common archive tools as Hypermail and MHonArc by providing improved access to both individual entries and whole threads, thus allowing users to compare current and past threads; and
  2. archive a broader collection of related mailing lists in the same way, thereby enabling the combined browsing and searching of entries and threads from such lists as:
    • CHI-Web
    • UTest
    • AIGA-Advance
    • InfoDesign
    • InfoDesign-Cafe
    • DigLib
    • others?

Creating archives for these lists could draw much interest from those not currently focused on IA but who work in related fields. Of course, a potentially thorny intellectual property issue could arise here: would we have the right to host this content?

A possible related project would be the creation of an FAQ directed toward new members of the SIGIA-L mailing list.


Posted by Louis Rosenfeld at November 03, 2001 04:42 PM
Comments

What are the basic requirements for the mail archive? I would suggest the following:

- archive sigia-l including all past messages

- continuous archiving of new messages; with new postings to the list available on the archives within a few hours

- permanent, canonical URL for each message, and short enough so it doesn't wrap in an email message

- view individual messages

- view message index by date and by thread. Perhaps also by author.

- searchable archives. should be a fielded search capable of restricting to author, date, subject line, or body of the message.
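
A minimal sketch of what the fielded search in the last item could look like, assuming messages have already been parsed into records. The `Message` record and the `fielded_search` function are hypothetical names used for illustration, not part of any archive tool mentioned here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass
class Message:
    # Hypothetical record for one archived posting.
    author: str
    sent: date
    subject: str
    body: str


def fielded_search(
    messages: Iterable[Message],
    author: Optional[str] = None,
    subject: Optional[str] = None,
    body: Optional[str] = None,
    since: Optional[date] = None,
    until: Optional[date] = None,
) -> List[Message]:
    """Return the messages that match every field that was supplied."""

    def contains(haystack: str, needle: Optional[str]) -> bool:
        # A field left as None places no restriction on the result.
        return needle is None or needle.lower() in haystack.lower()

    return [
        m
        for m in messages
        if contains(m.author, author)
        and contains(m.subject, subject)
        and contains(m.body, body)
        and (since is None or m.sent >= since)
        and (until is None or m.sent <= until)
    ]


# Made-up sample data, for illustration only.
archive = [
    Message("Karl Fast", date(2001, 11, 12), "Archive requirements", "What's missing?"),
    Message("Eric Scheid", date(2001, 11, 16), "Annotations", "Ratings, links, categories, and keywords."),
]
hits = fielded_search(archive, author="karl", since=date(2001, 11, 1))
print([m.subject for m in hits])
```

A real archive would more likely hand this job to a search engine with field support; the point of the sketch is only that each field narrows the result independently.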


Those are the basic elements. Other features could and probably should be added, but only after the basic requirements have been met.

What's missing?

Posted by Karl Fast at November 12, 2001 4:19 PM

I recently became a discussion list owner for a group devoted to promoting open source software in schools and government. The software we use is Mailman and it rocks my world.
It is extremely easy to administer and customize, the messages appear instantaneously, and it generates the HTML archives for you so you can use any search tool to conduct the searches. Very nice and, best of all, FREE SOFTWARE.


This is what the default list homepage looks like, but you can customize the look and feel and every other aspect of the functionality considerably.

Posted by drew meeks at November 13, 2001 4:47 AM

Suggestion: keep SIGIA-L, but get it a brand spankin new listserv program that works as well as whatever CHI-WEB uses. Message subject summary at top of daily digests. Partial moderation to filter out overlong quotes, HTML/plaintext double posts, MS-specific CSS/XML formatting, and other annoyances.

Thanks!

Posted by Andrew at November 13, 2001 10:34 AM

There are two separate issues being expressed here.

1. Get new listserv software
2. Get new listserv archiving software

As ASIST runs SIGIA-L, they are responsible for item 1. I think we are only interested in item 2.

Most listserver packages have their own archiving software. In some it's separate, so you can replace the built-in archiver with some other package (like Hypermail or MHonArc). IIRC, Mailman uses Pipermail, which is very similar to Hypermail.

--karl


Posted by Karl Fast at November 14, 2001 3:16 PM

One feature I'd want to see in an email archive is the capability for post-hoc annotations. Too often I've stumbled across some archived message which looks to be the answer to my problem only to then find hidden in the dozen replies in the thread a minor comment that that version has a bug.

The problem with post-hoc annotations is that if you implement them as free-text comments you've now forked your discussion space ... the original email delivered discussion, plus a web-based discussion. There are tools that can keep these two mediums converged though. WebCrossing is one, IIRC.

Another problem with searching email archives is that the one message that answers your question won't necessarily have the exact keywords you are searching on, but meanwhile there are dozens of other messages which do. It can sometimes be a simple matter of the poster using some unusual abbreviation, or possibly even mis-spelling the word. By god that is frustrating!

What would be better would be to have these post-hoc annotations be of a constrained form: ratings, links, categories, and keywords.

Posted by Eric Scheid at November 16, 2001 4:08 PM

To answer Karl's question from Nov. 12, here are some more ideas that go beyond a basic list archive (i.e., save these for versions beyond 1.0):

- posting traffic numbers and maybe charts: on a monthly basis, who is posting, what topics, and when? It'd be interesting to provide this data longitudinally
- treating both postings and threads as objects which could eventually support improved searching, browsing, and asking
- archiving related mailing lists (e.g., CHI-WEB) and connecting postings and threads
- using popularity measures, either implicit (these postings are linked to a lot/get the most impressions) and/or explicit (a la SlashDot)
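
The first idea, monthly traffic numbers, could be sketched roughly as follows. The `postings` list is made-up sample data standing in for whatever the archive would actually parse out of its messages.

```python
from collections import Counter
from datetime import date

# Hypothetical (author, date sent) pairs standing in for parsed archive data.
postings = [
    ("Karl Fast", date(2001, 11, 12)),
    ("Karl Fast", date(2001, 11, 14)),
    ("Eric Scheid", date(2001, 11, 16)),
    ("Lou Rosenfeld", date(2001, 12, 14)),
]

# Count postings per (month, author) pair: who is posting, and when.
per_month_author = Counter(
    (sent.strftime("%Y-%m"), author) for author, sent in postings
)

# Count postings per month, regardless of author.
per_month = Counter(sent.strftime("%Y-%m") for _, sent in postings)

for (month, author), count in sorted(per_month_author.items()):
    print(month, author, count)
```

Keeping the month as a sortable `YYYY-MM` string makes it easy to line the counts up longitudinally or feed them to a charting tool later.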

Other ideas?

Posted by Lou Rosenfeld at November 19, 2001 11:23 AM

...and here is an interesting tool that Matt "Blackbelt" Jones suggested we all check out:

http://www.futurefarmers.com/josh/rca/grouptools/index.html

Posted by Lou Rosenfeld at November 19, 2001 11:26 AM

There are a number of important issues in making email archives searchable: the most important are simple URLs, labelling the pages, and marking non-indexable sections.

Simple URLs are vital: most search engines will ignore anything with a ?, so we should use URL rewriting tools to make sure all elements are easy for robots to get at.
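
As a rough illustration of the kind of mapping a URL rewriting rule would encode, here is a sketch that derives a plain path from a query-string address. The parameter names (`list`, `month`, `msg`) and the path layout are assumptions made up for this example, not the scheme of any particular archive tool.

```python
from urllib.parse import parse_qs, urlsplit


def friendly_path(url: str) -> str:
    """Map a query-string message URL onto a plain path with no '?'.

    The parameter names and path layout are illustrative assumptions.
    """
    query = parse_qs(urlsplit(url).query)
    list_name = query["list"][0]
    month = query["month"][0]
    number = int(query["msg"][0])
    # Zero-pad the message number so paths sort in posting order.
    return f"/{list_name}/{month}/{number:04d}.html"


print(friendly_path("/archive?list=sigia-l&month=0112&msg=42"))
```

In practice the web server would apply the inverse mapping, accepting the plain path and handing the archive software the parameters it expects.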

As for labelling: creating titles with the subject, date (and time?) and author (probably in that order) would solve a lot of problems. These titles can be viewed from any search engine which indexes the site, so it's really important to get them right!

We should also be marking the navigation pages and sections so they won't be indexed.

On the nav pages, we can use the meta robots "noindex,follow" tag to make sure that reasonable search engines don't index these (tons of false drops).

On the message pages themselves, it's a bit trickier. Some search engines have their own way of marking non-index areas, usually something like this <!-- noindex --> and <!-- /noindex --> or <stopindex> and <startindex>. In any case marking the navigation vs. content will be a Good Thing in the long run, show we can implement IA, and support our claims to have a clue about the Semantic Web.
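
A small sketch of the labelling and nav-page ideas above. The function names and the exact title layout are assumptions for illustration; only the subject, date, author ordering and the "noindex,follow" value come from the discussion.

```python
from html import escape


def message_title(subject: str, date: str, author: str) -> str:
    """Build a <title> element in subject, date, author order."""
    text = f"{subject} ({date}) - {author}"
    # Escape the text so characters like & and < cannot break the page markup.
    return f"<title>{escape(text)}</title>"


def nav_page_robots_tag() -> str:
    # Robots meta tag: do not index this page, but do follow its links.
    return '<meta name="robots" content="noindex,follow">'


print(message_title("Archive requirements", "2001-11-12", "Karl Fast"))
print(nav_page_robots_tag())
```

The engine-specific comment markers for message pages are left out here, since they differ from one search engine to the next.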

Posted by Avi Rappoport at November 19, 2001 7:15 PM

On the topic of archiving other lists...well, I know we can't do that with CHI-WEB or the usability-list-that-must-not-be-named. Not sure about others. For lists like CHI-WEB that maintain their own archives, it may be possible to link into those existing archives from the SIGIA-L archive (similar to Scott Berkun's Best of CHI-WEB index).

Posted by Jess at November 20, 2001 12:23 AM

Jess: how come we can't?

Posted by Lou Rosenfeld at November 20, 2001 4:06 AM

Why we can't?

Well, we technically could, but we'd step on toes.

Remember that the usability-list-that-must-not-be-named really is very introverted. From their FAQ:

[the list] is not and should not be archived. In order to protect subscribers' messages from various types of abuse and to encourage discussion, members of the [list] community decided not to archive messages.

And on CHI-WEB, we recently encountered the issue with Scott Berkun's very cool index. But reviewing the conversation I don't see the comment I'm thinking of (which may well mean said comment never happened ;)

I'll drop William a note.

Posted by Jess at November 21, 2001 5:21 AM

Thanks Jess; I could bug Keith, but I think William is more the primary moderator these days.

Maybe we could convince them that this would be beneficial to their community too.

Posted by Lou Rosenfeld at November 21, 2001 11:54 AM

I do more moderation than Keith does of late...he's retired from moderation, sipping margaritas on the beach ;)

Anyways, chatted with William. We'd have to get ACM approval to do it in any kind of official way, whether we built a wrapper around the existing archives or started a mirror (a mirror would be a harder sell than a wrapper)

Once SIGIA-L archives are up and working and we're ready to tackle other lists then I'd be happy to be point person for the CHI-WEB stuff...

Posted by Jess at November 29, 2001 4:55 AM

I'm all for a careful, user-centered design approach in general, but while we design the perfect uber-archive toolset (tm), I'd just like to be able to link to old posts. I'd suggest that perhaps we can just add a dang archive now and then continue with the discussion of polyhierarchies and the social ecology of mailing lists. In this case, I think Worse is better.

Posted by Nadav at December 13, 2001 11:05 PM

Nadav, you're absolutely right. We (the folks working on the list archive) are doing two things: 1) working on getting the basic archive up (we're hopeful this might happen as soon as next week), and using that as a springboard for 2) doing some of the cool things that have been discussed here.

Volunteers for both efforts are always welcome!

Posted by Lou Rosenfeld at December 14, 2001 6:13 PM

Early SIGIA list archives can be obtained through email.

Majordomo@www.asis.org

example Subject
get sigia-l sigia-l.archive.0005

example Text of email
get sigia-l sigia-l.archive.0005

The 0005 requests 5/2000.

My email j-fullerton@tamu.edu

Posted by John Paul Fullerton at December 17, 2001 9:31 PM

More info.

Email address is Majordomo@www.asis.org

Email subject not required

example Text of email
get sigia-l sigia-l.archive.0005

(note that sigia-l is repeated in the request as the list name and file name)

All list archives are listed in the index (available by requesting index sigia-l).

0004-0012 and 0101-0112

The only way that I know to get the files is to request each one (0004, 0005, 0006, 0007, 0008, 0009, 0010, 0011, 0012, 0101, and so on). It is possible to send multiple requests within one email.

So, for example,

get sigia-l sigia-l.archive.0004
get sigia-l sigia-l.archive.0005
get sigia-l sigia-l.archive.0006
get sigia-l sigia-l.archive.0007
get sigia-l sigia-l.archive.0008
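
For anyone who would rather not type all of those lines by hand, here is a small sketch that builds the request text for every month listed in the index (0004-0012 and 0101-0112). It only produces the body of the email; sending it to the Majordomo address is left to your mail program. The function name is made up for this example.

```python
from typing import List


def archive_requests(list_name: str, months: List[str]) -> str:
    """Build an email body holding one 'get' line per archive file."""
    return "\n".join(f"get {list_name} {list_name}.archive.{m}" for m in months)


# File suffixes are YYMM: 0004-0012 for 2000, then 0101-0112 for 2001.
months = [f"00{m:02d}" for m in range(4, 13)] + [f"01{m:02d}" for m in range(1, 13)]

body = archive_requests("sigia-l", months)
print(body)
```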

Posted by John Paul Fullerton at December 17, 2001 9:56 PM

At the risk of embarrassing myself by pointing out that I use NS 4.7 at work, I wanted to just let you know that the archive doesn't work in this old clunker of a browser. Apparently the CSS file is coded in a relative manner that NS handles less graciously than IE...the error message that appears is:

The requested URL /hypermail/sigia-l/0112/sigiastyle.css was not found on this server.

I'm guessing that the CSS file is not in folder 0112 (the URL for the December 2001 link is /hypermail/sigia-l/0112/).
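
To illustrate that guess, here is how a relative stylesheet reference resolves against the December page under standard URL resolution rules. The host `example.org` is a placeholder, and the idea that the stylesheet sits one level up in the list folder is an assumption; only the paths in the error message come from the report above.

```python
from urllib.parse import urljoin

# example.org is a placeholder host; only the paths come from the report above.
page = "http://example.org/hypermail/sigia-l/0112/"

# A bare relative reference resolves inside the month folder,
# which matches the path in the error message.
in_month_folder = urljoin(page, "sigiastyle.css")

# A parent-relative or root-relative reference would reach the list folder instead.
via_parent = urljoin(page, "../sigiastyle.css")
via_root = urljoin(page, "/hypermail/sigia-l/sigiastyle.css")

print(in_month_folder)
print(via_parent)
print(via_root)
```

This does not explain why the two browsers behave differently; it only shows which reference styles would and would not point into folder 0112.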

I mention just in case it turns out to be easy to fix...if not, one more reason to stop using the dinosaur :).

Posted by Beth Mazur at December 18, 2001 9:03 PM

Beth's posting reminds me that we haven't announced it here: an incomplete, though (somewhat) operational archive is now available at:

/hypermail/sigia-l/

:-)

Posted by Lou at December 19, 2001 3:08 AM

The main problem that I have had with other mailing list archives is arriving mid-thread from a search engine; good navigation is generally uncommon.

- It's difficult to tell where the thread starts.
- Does the thread progress up or down?
- Does the previous/next button jump between posts or threads?
- Replies with quotes often confuse the overall narrative and make it difficult to see where the discussion has branched or combined.

This can be solved with clearer labelling and an information design that supports the narrative aspects of the discussion. At the end of each post a summary of other posts should lead readers to the next relevant post, or allow a jump to the beginning of the thread. Perhaps mid-posts should not be indexed, but a summary at the start of the thread would be indexed by the search engines... At the least I think that the posts will need some editing, to filter confusing quoting - there is no substitute for hard work by human hands...
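
A minimal sketch of the "jump to the beginning of the thread" and "next relevant post" links, assuming the archive knows which message each post replies to (mail messages commonly carry this in an In-Reply-To header). The message ids and the `replies_to` table are made up for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical thread: message id -> id of the message it replies to.
# None marks the post that started the thread.
replies_to: Dict[str, Optional[str]] = {
    "m1": None,
    "m2": "m1",
    "m3": "m2",
    "m4": "m1",
}


def thread_start(msg_id: str) -> str:
    """Walk up the reply chain to the post that opened the thread."""
    parent = replies_to[msg_id]
    while parent is not None:
        msg_id = parent
        parent = replies_to[msg_id]
    return msg_id


def direct_replies(msg_id: str) -> List[str]:
    """List the posts that answer this one, in the order they were stored."""
    return [child for child, parent in replies_to.items() if parent == msg_id]


print(thread_start("m3"))
print(direct_replies("m1"))
```

With these two lookups, each message page could show a link back to the thread's first post and a short list of the replies that branch from it.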

Posted by Timo at December 31, 2001 12:27 PM

Since the archive doesn't seem to be up yet - all the links redirect back to the same page, how about following the worse is better philosophy and:

1. Create a Yahoo! group, e.g. sigia-l

2. Subscribe the group's e-mail address to the list

That'll get most basic requirements met.

Posted by David Carter-Tod at January 24, 2002 10:10 PM

The archives were temporarily offline due to a minor coding error. The archives now work:

/lists/sigia-l/

Posted by Eric Scheid at January 28, 2002 3:23 AM

Can someone put a link to the mail archives on the home page of info-arch.org? I hate having to search this page to find the link to the archives whenever I want it.

Posted by Andrew at March 8, 2002 1:22 PM

For the record, just how many messages are there in the archive? There are non-local search options available such as Atomz, although they are only free up to a certain size. It doesn't take very much to set them up at all.

Posted by Eric Scheid at April 1, 2002 12:08 PM


Comments closed
This site exists for archive purposes only. Please check Asilomar Institute for Information Architecture for the latest on the community effort.
 