Sunday, April 22, 2007

Nutch: an excellent web crawler for indexing with Lucene

clipped from lucene.apache.org
About Nutch

Overview

Nutch is open-source web-search software. It builds on Lucene Java, adding web-specific components such as a crawler, a link-graph database, and parsers for HTML and other document formats.

For more information about Nutch, please see the Nutch wiki.
