Crawler
Architecture Diagram: GIF
The seeker algorithm is relatively straightforward. Both keywords
and URLs are used to seed the search. Keywords are submitted to
online search engines to retrieve web pages, through a module
that learns effective queries. URLs are spidered. Pages are fetched
speculatively, based on the expectation that a site is a project
URL or a metasite, as classified by the WebKB tools. In this way, a
database of project URLs is built up.
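The seeding and speculative-fetching loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the crawler's actual code: the function names `seek`, `search`, `fetch_links` and `classify`, the label strings, and the toy in-memory "web" are all hypothetical stand-ins. In particular, `classify` stands in for the WebKB classification step and `search` for the query-learning module.

```python
from collections import deque

def seek(seed_urls, seed_keywords, search, fetch_links, classify, max_fetches=100):
    """Collect URLs labelled as project pages, starting from URL and keyword seeds.

    search(keyword) -> list of URLs      (hypothetical search-engine wrapper)
    fetch_links(url) -> list of URLs     (hypothetical fetch-and-parse step)
    classify(url) -> "project" | "metasite" | "other"   (hypothetical classifier)
    """
    queue = deque()
    seen = set()

    def enqueue(url):
        # Each URL is queued at most once.
        if url not in seen:
            seen.add(url)
            queue.append(url)

    # Seed the frontier from both kinds of input.
    for url in seed_urls:
        enqueue(url)
    for keyword in seed_keywords:
        for url in search(keyword):
            enqueue(url)

    projects = []
    fetches = 0
    while queue and fetches < max_fetches:
        url = queue.popleft()
        label = classify(url)
        # Speculative fetching: only follow pages expected to be
        # project pages or metasites.
        if label not in ("project", "metasite"):
            continue
        if label == "project":
            projects.append(url)
        fetches += 1
        for link in fetch_links(url):
            enqueue(link)
    return projects


# Toy in-memory "web" so the sketch runs without network access.
WEB = {
    "http://meta.example/": ["http://a.example/proj", "http://b.example/blog"],
    "http://a.example/proj": ["http://c.example/proj"],
    "http://b.example/blog": ["http://d.example/proj"],
    "http://c.example/proj": [],
    "http://s.example/proj": [],
}

def toy_search(keyword):
    return ["http://s.example/proj"] if keyword == "crawler" else []

def toy_classify(url):
    if "meta" in url:
        return "metasite"
    return "project" if url.endswith("/proj") else "other"

found = seek(["http://meta.example/"], ["crawler"],
             toy_search, lambda u: WEB.get(u, []), toy_classify)
```

In this toy run the blog page is never fetched, so the project page linked only from it is not discovered. That is the trade-off of fetching speculatively on the classifier's expectation.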
Next, we use information extraction to populate KBs about software
systems, then use these to initiate searches. Eventually we would
like to extend this to a set of tactics for retrieving all
information related to packaging and systems integration.
- Extract version numbers from HTML, e.g. vbolten-3.0.
- SNA, software.
- Vulnerability assessment: the crawler determines who is running vulnerable software.
- Graph isomorphisms.
- Come up with new name for crawler.
- Sorcerer, a crawler
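For the version-number item in the list above, one possible approach is sketched here using only the Python standard library. The pattern `VERSION_RE` and the helper `extract_versions` are assumptions made for illustration; they target names of the form `name-X.Y` such as vbolten-3.0 and would miss other versioning schemes.

```python
import re
from html.parser import HTMLParser

# Hypothetical pattern: a package-like name, a hyphen, then a dotted version.
VERSION_RE = re.compile(r"\b([A-Za-z][A-Za-z0-9_]*)-(\d+(?:\.\d+)+)")

class _TextAndLinks(HTMLParser):
    """Collect visible text and href values, where version strings often appear."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "href" and value:
                self.chunks.append(value)

def extract_versions(html):
    """Return a sorted list of unique (name, version) pairs found in the HTML."""
    parser = _TextAndLinks()
    parser.feed(html)
    parser.close()
    found = set()
    for chunk in parser.chunks:
        for name, version in VERSION_RE.findall(chunk):
            found.add((name, version))
    return sorted(found)

sample = '<p>Download <a href="dist/vbolten-3.0.tar.gz">vbolten-3.0</a> today.</p>'
versions = extract_versions(sample)
```

For a hyphenated name such as `foo-bar-1.2`, this pattern captures only the last segment (`bar`), so a real extractor would need a more careful rule.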
This page is part of the FWeb package.
It derives from the
Robotics Institute projects page.
Last updated Mon Jan 15 08:34:48 CST 2007.