Indiana University      Research & Creative Activity      September 1999 Volume XXII Number 2


Technology often solves old problems at the expense of creating new ones. Consider the Internet. It gives us e-mail, listservers, chat rooms, Web pages, and access to databases around the world, as well as the opportunity to instantaneously order goods and services online. But the very success of the Internet has caused an explosion in information availability. Open your e-mail and you might find fifty messages waiting; a single listserver may generate 100 postings a day to a subscriber; type in a keyword for a Web search and you may find 100,000 documents to choose from.

This figure shows the initial profile that the user provides to SIFTER. In this case, the user's domain of interest is computer science and, specifically, information about artificial intelligence. An initial setup like this one speeds SIFTER's learning, but it is not required: SIFTER can also learn "on the fly."

"This overload places a burden both on users and on the infrastructure of the Internet," says Mathew Palakal, associate professor of computer and information science at Indiana University–Purdue University Indianapolis. Palakal, who is also chair of the Department of Computer and Information Science at IUPUI, heads a team developing a software system called D-SIFTER. An acronym for Distributed Smart Information Filtering Technology for Electronic Resources, D-SIFTER will learn an Internet user's preferences and interests and then seek out only those items of information the user truly wants to obtain.

"The Internet overload," Palakal explains, "is obvious to individual users and is clearly annoying. But there is also a threat to the electronic infrastructure. The Internet has only so much bandwidth to transmit on." By the mid-1990s, the National Science Foundation had already recognized the problem of overload and encouraged research on intelligent filters that could sift desired information from the mass of unwanted or irrelevant messages. Palakal and his collaborators, including Javed Mostafa, assistant professor of library and information science at IU Bloomington, and Snehasis Mukhopadhyay and Rajeev Raje, both assistant professors of computer and information science at IUPUI, took up the challenge. They first developed a software system, named SIFTER, capable of passively filtering out unwanted messages on e-mail and listservers. SIFTER did this by recording a user's document-seeking history, classifying the documents by type, and applying a series of algorithms to obtain a preference rating for documents of each sort. With SIFTER installed, a user then had all messages listed according to preference.
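The record, classify, and rate steps described above can be illustrated with a minimal sketch. It is not the SIFTER team's code: the class name `PreferenceRanker`, the learning rate, the neutral starting rating of 0.5, and the simple running-average update are all assumptions made for illustration, and the article does not say which algorithms SIFTER actually applied.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: ranks message categories by a learned preference rating. */
class PreferenceRanker {
    // Hypothetical learning rate: how quickly a rating follows new feedback.
    private static final double RATE = 0.2;
    // Hypothetical neutral rating given to categories the user has not yet seen.
    private static final double NEUTRAL = 0.5;

    private final Map<String, Double> rating = new HashMap<>();

    /** Record that the user opened (true) or ignored (false) a message of this category. */
    void recordFeedback(String category, boolean opened) {
        double old = ratingOf(category);
        double target = opened ? 1.0 : 0.0;
        // Move the rating a fraction of the way toward the observed behaviour.
        rating.put(category, old + RATE * (target - old));
    }

    /** Current rating for a category, or the neutral value if it is unknown. */
    double ratingOf(String category) {
        return rating.getOrDefault(category, NEUTRAL);
    }

    /** Return the given categories ordered from most to least preferred. */
    List<String> rank(List<String> categories) {
        List<String> sorted = new ArrayList<>(categories);
        sorted.sort(Comparator.comparingDouble((String c) -> ratingOf(c)).reversed());
        return sorted;
    }
}
```

In this sketch, a user who opens an artificial intelligence posting and ignores a sports posting would see later artificial intelligence messages listed ahead of unrated ones, and sports messages listed last.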

The prototype eased the data load for individual users, but it left problems unsolved. "Our biggest challenge with the SIFTER was to teach it to adapt to changes in user interest," Palakal recalls. The program was slow to come out of its converged state and relearn preferences when a user shifted interests: for example, when a researcher on Italian Renaissance art became interested in modern Italian art, or when a baseball fan became interested in lacrosse.

Palakal and his collaborators came up with an innovation, something that set SIFTER apart from other Internet filters. "We added a shift detection module to SIFTER, dedicated solely to tracking changes in user interest," Palakal explains. A complex algorithm detects whether a change in document requests shows a significant change in interest or is simply inherent randomness. Change can be recognized and the system adjusted without a time loss for reconfiguration.
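One simple way to separate a real change in interest from random variation is to compare recent behaviour with the long-run record and ask whether the gap is larger than chance would explain. The sketch below does this with a basic z-score test on acceptance rates. It is a stand-in chosen for illustration, not the shift detection algorithm the team built, which the article does not describe; the class name `ShiftDetector`, the window size, and the threshold are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative only: flags a shift when recent acceptance differs sharply from history. */
class ShiftDetector {
    private final int window;        // hypothetical: number of recent requests to compare
    private final double threshold;  // hypothetical: z-score above which a change counts as real
    private final Deque<Boolean> recent = new ArrayDeque<>();
    private int historyCount = 0;
    private int historyAccepted = 0;

    ShiftDetector(int window, double threshold) {
        this.window = window;
        this.threshold = threshold;
    }

    /** Feed one observation: did the user accept a document from a currently preferred category? */
    void observe(boolean accepted) {
        recent.addLast(accepted);
        if (recent.size() > window) {
            // The oldest recent observation moves into the long-run history.
            boolean old = recent.removeFirst();
            historyCount++;
            if (old) {
                historyAccepted++;
            }
        }
    }

    /** True when the recent acceptance rate departs from history by more than chance suggests. */
    boolean shiftDetected() {
        if (recent.size() < window || historyCount < window) {
            return false; // not enough evidence yet
        }
        // Smoothed long-run rate, kept strictly between 0 and 1 so the standard error is never zero.
        double p = (historyAccepted + 1.0) / (historyCount + 2.0);
        int acceptedRecent = 0;
        for (boolean b : recent) {
            if (b) {
                acceptedRecent++;
            }
        }
        double pRecent = (double) acceptedRecent / window;
        // Standard error of the recent rate if the user's behaviour had not changed.
        double se = Math.sqrt(p * (1.0 - p) / window);
        return Math.abs(pRecent - p) / se > threshold;
    }
}
```

With a window of 10 and a threshold of 3.0, one ignored document after a long run of accepted ones is treated as noise, while ten ignored in a row is reported as a shift.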

This figure shows how SIFTER would periodically present messages to the user. The top half of the window lists the ten messages that best match the user's interests, drawn from possibly several hundred received. Clicking on any item in the list displays the content of that message in the bottom half of the window.

"The SIFTER system was relatively easy to build," Palakal says. "It is a passive system and handles overload for the individual learner. But it only handles messages at the receiver's end, after they've come through the system." Palakal and his team decided to produce something better than the standard passive filtering system. "Now we've turned our attention to designing a proactive system," he reports. "That's the way to reduce overload on the Internet itself." Such a system, now being developed by the team at IUPUI, will go forth on the Internet and seek out information desired by its user. The system will ensure that any information sent is truly desired, before data is ever transmitted. "Because the system will send out agents that go to information sources and negotiate with other users' agents for information transfers, the system is both distributed and intelligent. We've named it D-SIFTER to emphasize that it is distributed through the Internet. We foresee a time when every user has a distributed intelligence system working as an agent, negotiating with other users' agents and even, when users' interests change, learning from other agents about preferred documents to match new interests."

The program is an ambitious one, and Palakal has received a grant of more than $300,000 from the National Science Foundation to aid its development. The grant was awarded through the competitive and prestigious Digital Library Initiative. Constant change in software applications and the very freedom of the Internet present challenges in developing these distributed agents. "Nowadays, any active filter must deal with both text and multimedia retrieval," Palakal points out, "and it must have the ability to deal with heterogeneous databases and multiple platforms." The research team is writing the software in the Java language because it runs on the many platforms that serve the Internet. The concept of collaboration between an individual's proactive agent and other agents on the Internet also requires some novel approaches in modeling and programming. "We are using models of market activity for the negotiation capabilities that allow agents to exchange information and learn from each other," Palakal explains. That will allow D-SIFTER to obtain the latest information anywhere on the Internet as soon as the information is available, in a process of give and take among electronic agents.


Palakal wrote his doctoral dissertation on voice recognition systems for computers and more recently has worked intensively on computer modeling of bat sonar, trying to recreate computationally the mechanism by which bats locate objects through echo location. While that research has no direct link to his development of D-SIFTER, all Palakal's research projects are related: all involve the creation of artificial intelligence. To achieve Palakal's vision of a D-SIFTER agent going through the Internet on behalf of its user, negotiating document swaps, and sharing information with other D-SIFTER agents, Palakal's experience with artificial intelligence will be sorely needed.--William Rozycki
