LuceneExtension - an extension for Helma wrapping the Lucene search engine
==========================================================================

This package makes the Lucene search engine available for easy scripting with Helma. It creates a new global object, search, on which Lucene indexes are mounted. For a basic understanding of how Lucene works, see jakarta.apache.org (especially for the query syntax).

Helma prerequisites

The extension has only been tested with Helma 1.2.4 but it should work with Helma 1.2 and above. Get the latest build from helma.org.

Installation

Unzip the distribution into your Helma home directory. This creates a new subdirectory lucenesearch and extracts lucenesearch-X.Y.jar (the Helma extension) and lucene-X.Y.jar (Lucene itself) to lib/ext.

Usage

creating and mounting indexes

The global object search mounts indexes. A Lucene index is a set of files in its own directory. By default, indexes are created in the directory index/<appname>/<indexname> below the Helma installation directory. A mounted index is available as a property of search: search.<indexname>.

search.createIndex (name, analyzer, location)
Creates a new index. It takes a name, the specification of an Analyzer and an optional location in the filesystem (to put the index somewhere other than the default directory).

search.mountIndex (name, location)
Mounts an existing index. The location in the filesystem is optional.
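
The two calls above can be combined into a small helper that mounts an index and creates it first if mounting fails. This is only a sketch: the helper name openIndex is made up, and it assumes that mountIndex and createIndex return false on error and that a freshly created index is mounted right away.

```javascript
// Sketch only: "search" is the global object provided by the extension
// inside Helma. openIndex is a hypothetical helper, not part of the extension.
function openIndex(name, analyzer) {
   // Assumption: mountIndex returns false if the index cannot be mounted.
   if (search.mountIndex(name) === false) {
      // Assumption: createIndex returns false on error and mounts on success.
      if (search.createIndex(name, analyzer) === false) {
         return null;
      }
   }
   // A mounted index is available as a property of search.
   return search[name];
}
```

Inside an application this could be called as openIndex("stories", "de"), where "stories" is an arbitrary example name.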

creating documents

Documents are the basic elements that can be added to a Lucene index. Construct a new Document with new Document().

addField (key, value, store, index, token)
addDateField (key, datevalue, store, index, token)
getDateField (key)

To add a plain string field, use addField(). To make date values available for comparison, use addDateField() and retrieve them with getDateField() (otherwise you'll just get a weird integer number). The store, index and token parameters are boolean values that determine whether the value is stored, indexed and tokenized; all three default to true.

setBoost(num)
getBoost()

Sets/gets the boost factor for a Document which gives it a higher score if matched.

getScore()
Gets the score of a Document if it is contained in a search result.
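
As a rough illustration of the Document functions above, the following sketch turns an application object into a Document. The story object, its properties and the helper name storyToDocument are hypothetical; only Document and its methods come from the extension.

```javascript
// Sketch only: Document is the constructor provided by the extension inside
// Helma. The story object and its id, title, text and createtime properties
// are hypothetical.
function storyToDocument(story) {
   var doc = new Document();
   // stored, indexed and tokenized (all three flags default to true)
   doc.addField("title", story.title);
   doc.addField("text", story.text);
   // stored and indexed but not tokenized, so the id stays a single term
   doc.addField("id", String(story.id), true, true, false);
   // dates go through addDateField so they can be compared later
   doc.addDateField("createtime", story.createtime);
   // a boost above 1 gives this Document a higher score when matched
   doc.setBoost(1.5);
   return doc;
}
```

Keeping an untokenized id field is a design choice of this sketch; it makes it possible to address the Document later by field and term.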

adding documents to an index

The index object is mounted at search.<indexname>. Documents in an index can be accessed by their number: search.<indexname>[num]. The number of Documents is available as the search.<indexname>.length property.

addDocument(doc)
This adds a new Document to the index.

removeDocument(idx)
removeDocument(field, term)

These functions remove Documents from the index: either the single Document at position idx within the index, or all Documents that match the parameters field and term.

optimize()
Optimizes the index, which reduces the number of files used and makes searching as well as adding and removing Documents faster.

clear()
Deletes all Documents from the index but leaves the index itself intact.

unlock()
If the application crashes while writing an index, a write lock may remain (files write.lock and commit.lock). This function removes such a leftover write lock.

listFieldNames()
Lists all field names used within an index.

getLastError()
None of the functions above throw errors; they return false on error instead. In that case the last error message can be retrieved with this function.
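
The index functions above might be combined as in the following sketch. The helper names replaceDocument and rebuildIndex and the "id" field are hypothetical, and the code assumes that getLastError() is called on the index object itself.

```javascript
// Sketch only: index is a mounted index such as search.<indexname>.
// replaceDocument, rebuildIndex and the "id" field are hypothetical.

// Replaces the Documents whose id field matches with a new Document.
// Returns null on success, otherwise the last error message.
function replaceDocument(index, id, doc) {
   index.removeDocument("id", id);
   if (index.addDocument(doc) === false) {
      // Assumption: getLastError() lives on the index object.
      return index.getLastError();
   }
   return null;
}

// Empties the index, adds all given Documents and optimizes once at the end.
function rebuildIndex(index, docs) {
   if (index.clear() === false) {
      return index.getLastError();
   }
   for (var i = 0; i < docs.length; i++) {
      if (index.addDocument(docs[i]) === false) {
         return index.getLastError();
      }
   }
   index.optimize();
   return null;
}
```

Calling optimize() once after a batch, rather than after every single addDocument(), is a choice made in this sketch and not something the extension requires.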

prepare a search

A search is run with a query object, which specifies the search terms precisely and can combine them in many ways.

One option is to use the internal QueryParser, which parses the query syntax described on jakarta.apache.org and creates one or more query objects.

new Query(analyzer)
Constructs a new Query object using the given Analyzer.

addTerm (field, term)
Parses term and adds it to the top-level query with an AND condition.

addQuery (query)
Alternatively you can add another query to the top-level query.

The alternative is to create the core Query objects yourself:
Query.termQuery(field, term)
Query.multiTermQuery(field, term)
Query.phraseQuery()
Query.booleanQuery()
Query.wildcardQuery(field, term)
Query.prefixQuery(field, term)
Query.fuzzyQuery(field, term)
Query.rangeQuery(field, startvalue, endvalue)

These functions each return a different kind of Query object. For further information on how to use these objects, please refer to the Lucene javadocs. You can either use these queries directly for a search or add them to a top-level query with addQuery().

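   // top-level query; terms added to it are combined with AND
   var q = new Query ();
   q.addTerm ("title", "test*");

   // range from 1 February 2003 (JavaScript months are zero-based) until now
   var q1 = Query.rangeQuery("createtime", new Date(2003,1,1), new Date());
   q.addQuery (q1);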

run a search

index.search(query)
Executes a search with a prepared Query object against a mounted index (search.<indexname>). Returns an array of Documents.
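
A result array could be processed as in the following sketch, which collects the score and a date field of every hit. The helper name summarize and the "createtime" field are hypothetical; the sketch only uses search(), getScore() and getDateField() as described in this document.

```javascript
// Sketch only: index is a mounted index, query a prepared Query object.
// summarize and the "createtime" date field are hypothetical.
function summarize(index, query) {
   var hits = index.search(query);
   var lines = [];
   for (var i = 0; i < hits.length; i++) {
      var doc = hits[i];
      // getScore() is only meaningful for Documents from a search result
      lines.push(doc.getScore() + " " + doc.getDateField("createtime"));
   }
   return lines;
}
```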

using analyzer

An Analyzer is used for indexing and for parsing queries, and skips some frequently used terms (a, to, the in English - or der, die, das in German). Use "de" to get a German analyzer or "si" to get a simple analyzer (which basically does nothing to the content); otherwise an English standard analyzer is used.

last modified: 2003-05-05, stefan dot pollach at orf dot at