Lucene is a high-performance Java search engine library available from the Apache Software Foundation. Hibernate Annotations includes a package of annotations that allows you to mark any domain model object as indexable and have Hibernate maintain a Lucene index of any instances persisted via Hibernate.
Hibernate Lucene is a work in progress, and new features are under development in this area. Expect some compatibility changes in subsequent versions.
First, we must declare a persistent class as indexable. This is done by annotating the class with @Indexed:
    @Entity
    @Indexed(index="indexes/essays")
    public class Essay {
        ...
    }
The index attribute tells Hibernate the name of the Lucene directory (usually a directory on your file system). If you wish to define a base directory for all Lucene indexes, you can use the hibernate.lucene.default.indexDir property in your configuration file.
Lucene indexes contain four kinds of fields: keyword fields, text fields, unstored fields and unindexed fields. Hibernate Annotations provides annotations to mark a property of an entity as one of the first three kinds of indexed fields.
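The four kinds of fields differ in whether the value is stored in the index, whether it is searchable, and whether it is run through the analyzer. The following is a minimal sketch of that distinction, based on the semantics of Lucene's classic field factory methods (Field.Keyword, Field.Text, Field.UnStored, Field.UnIndexed). The FieldKind enum and its method names are hypothetical and are not part of Lucene or Hibernate.

```java
/**
 * Illustrative model of the four Lucene field kinds.
 * FieldKind is a hypothetical type, not a Lucene or Hibernate class.
 */
enum FieldKind {
    //        stored, indexed, tokenized
    KEYWORD  (true,   true,    false), // kept verbatim, searchable as a single term
    TEXT     (true,   true,    true),  // analyzed, searchable, original value kept
    UNSTORED (false,  true,    true),  // analyzed and searchable, value not kept
    UNINDEXED(true,   false,   false); // value kept, but not searchable

    final boolean stored;
    final boolean indexed;
    final boolean tokenized;

    FieldKind(boolean stored, boolean indexed, boolean tokenized) {
        this.stored = stored;
        this.indexed = indexed;
        this.tokenized = tokenized;
    }

    /** True if a query can match on a field of this kind. */
    boolean isSearchable() {
        return indexed;
    }

    /** True if the original value can be read back from a search hit. */
    boolean isRetrievable() {
        return stored;
    }
}

class FieldKindDemo {
    public static void main(String[] args) {
        // Print one line per kind with its three flags.
        for (FieldKind kind : FieldKind.values()) {
            System.out.println(kind + ": stored=" + kind.stored
                    + ", indexed=" + kind.indexed
                    + ", tokenized=" + kind.tokenized);
        }
    }
}
```

Seen this way, an identifier is a natural keyword field (exact match, retrievable), while a large body of text is a natural unstored field (searchable without duplicating the content in the index).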
    @Entity
    @Indexed(index="indexes/essays")
    public class Essay {
        ...

        @Id
        @Keyword(id=true)
        public Long getId() { return id; }

        @Text(name="Abstract")
        public String getSummary() { return summary; }

        @Lob
        @Unstored
        public String getText() { return text; }
    }
These annotations define an index with three fields: id, Abstract and text. Note that by default the field name is decapitalized, following the JavaBean specification.
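The default naming rule can be illustrated with the JDK's own JavaBean helper, java.beans.Introspector.decapitalize. This is a sketch of the rule only, not Hibernate's actual implementation; the FieldNames class and the defaultFieldName method are hypothetical names introduced for the example.

```java
import java.beans.Introspector;

/** Hypothetical helper showing JavaBean-style default field names. */
class FieldNames {

    /** Derives the default field name from a getter method name. */
    static String defaultFieldName(String getterName) {
        String base;
        if (getterName.startsWith("get") && getterName.length() > 3) {
            base = getterName.substring(3);
        } else if (getterName.startsWith("is") && getterName.length() > 2) {
            base = getterName.substring(2);
        } else {
            throw new IllegalArgumentException("Not a getter: " + getterName);
        }
        // Lower-cases the first letter, unless the first two letters
        // are both upper case (for example "URL" stays "URL").
        return Introspector.decapitalize(base);
    }

    public static void main(String[] args) {
        System.out.println(defaultFieldName("getId"));   // id
        System.out.println(defaultFieldName("getText")); // text
        System.out.println(defaultFieldName("getURL"));  // URL
    }
}
```

An explicit name, as in @Text(name="Abstract"), replaces the derived name, which is why the summary property is indexed as Abstract rather than summary.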
Note: you must specify @Keyword(id=true) on the identifier property of your entity class.
Lucene has the notion of a boost factor. It is a way to give more weight to one field or indexed element over another during the indexing process. You can use @Boost at the field or the class level.
The analyzer class used to index the elements is configurable through the hibernate.lucene.analyzer property. If none is defined, org.apache.lucene.analysis.standard.StandardAnalyzer is used as the default.
Lucene has the notion of a Directory, where the index is stored. The Directory implementation can be customized, but Lucene comes bundled with a file system implementation and a full in-memory implementation. Hibernate Lucene has the notion of a DirectoryProvider, which handles the configuration and the initialization of the Lucene Directory.
Table 5.1. List of built-in Directory Providers
Class | Description | Properties |
---|---|---|
org.hibernate.lucene.store.FSDirectoryProvider | File system based directory. The directory used will be <indexBase>/<@Indexed.index> | indexBase: base directory |
org.hibernate.lucene.store.RAMDirectoryProvider | Memory based directory. The directory is uniquely identified by the @Indexed.index element | none |
If the built-in directory providers do not fit your needs, you can write your own directory provider by implementing the org.hibernate.lucene.store.DirectoryProvider interface.
Each indexed entity is associated with a Lucene index (an index can be shared by several entities, but this is not usually the case). You can configure the index through properties prefixed by hibernate.lucene.<indexname>. Default properties inherited by all indexes can be defined using the prefix hibernate.lucene.default.
To define the directory provider of a given index, use the hibernate.lucene.<indexname>.directory_provider property:
    hibernate.lucene.default.directory_provider=org.hibernate.lucene.store.FSDirectoryProvider
    hibernate.lucene.default.indexDir=/usr/lucene/indexes
    hibernate.lucene.Rules.directory_provider=org.hibernate.lucene.store.RAMDirectoryProvider
applied to
    @Indexed(index="Status")
    public class Status { ... }

    @Indexed(index="Rules")
    public class Rule { ... }
will create a file system directory in /usr/lucene/indexes/Status where the Status entities will be indexed, and use an in-memory directory named Rules where Rule entities will be indexed.
You can therefore define common settings, such as the directory provider and the base directory, once as defaults, and override them later on a per-index basis.
If you write your own DirectoryProvider, it can benefit from this configuration mechanism too.
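The default-and-override lookup described above can be sketched in plain Java. This is an illustration of the resolution order only, under the assumption that an index-specific property always wins over the hibernate.lucene.default. value; it is not Hibernate's actual code, and the IndexProperties class and its methods are hypothetical.

```java
import java.util.Properties;

/** Hypothetical sketch of per-index property resolution with defaults. */
class IndexProperties {
    static final String PREFIX = "hibernate.lucene.";
    static final String DEFAULT_PREFIX = PREFIX + "default.";

    /**
     * Returns the value of a property for the given index: the
     * index-specific value if present, otherwise the default value,
     * otherwise null.
     */
    static String resolve(Properties config, String indexName, String key) {
        String specific = config.getProperty(PREFIX + indexName + "." + key);
        return specific != null ? specific : config.getProperty(DEFAULT_PREFIX + key);
    }

    /** Builds the same configuration as the example in the text. */
    static Properties sampleConfig() {
        Properties config = new Properties();
        config.setProperty("hibernate.lucene.default.directory_provider",
                "org.hibernate.lucene.store.FSDirectoryProvider");
        config.setProperty("hibernate.lucene.default.indexDir", "/usr/lucene/indexes");
        config.setProperty("hibernate.lucene.Rules.directory_provider",
                "org.hibernate.lucene.store.RAMDirectoryProvider");
        return config;
    }

    public static void main(String[] args) {
        Properties config = sampleConfig();
        // Status has no specific setting, so it falls back to the default.
        System.out.println(resolve(config, "Status", "directory_provider"));
        // Rules overrides the default provider.
        System.out.println(resolve(config, "Rules", "directory_provider"));
    }
}
```

With the sample configuration, the Status index resolves to the file system provider inherited from the defaults, while the Rules index resolves to the in-memory provider declared specifically for it.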
Finally, we enable the LuceneEventListener for the three Hibernate events that occur after changes are committed to the database.
    <hibernate-configuration>
        ...
        <event type="post-commit-update">
            <listener class="org.hibernate.lucene.event.LuceneEventListener"/>
        </event>
        <event type="post-commit-insert">
            <listener class="org.hibernate.lucene.event.LuceneEventListener"/>
        </event>
        <event type="post-commit-delete">
            <listener class="org.hibernate.lucene.event.LuceneEventListener"/>
        </event>
    </hibernate-configuration>