Getting Started With N Hibernate Search N - Cheat Sheet

Transcription

Brought to you by Getting Started wtih Hibernate Searchwww.dzone.comGet More Refcardz! Visit refcardz.com#32CONTENTS INCLUDE:nGoogle Your Database!nMapping EntitiesnBridgesnInitial Indexing of EntitiesnQuerying IndexesnHot Tips and more.Getting Started withHibernate SearchBy John GriffinGOOGLE YOUR DATABASE!Hibernate Search complements Hibernate Core by enabling fulltext search queries on persistent domain models, and bringsLucene search features to the Hibernate world. Hibernate Searchdepends on Apache Lucene, a powerful full-text search enginelibrary (and a de facto standard solution in Java) hosted at theApache Software Foundation (http://www.apache.org/). Thisrefcard explains installation and configuration, and coversMapping entities, bridges, building indexes, querying them andexamining their contents. Table 1 shows links to rg/java/docs/Hibernate Searchhttp://www.hibernate.org/410.htmlMailing ashboard.jspaHotTipHibernate Search is not compatible with all versions of HibernateCore and Hibernate Annotations. Refer to Table 2 for compatibility requirements. The latest version is available on theHibernate download page at rnate CoreDownload Hibernate Search at http://www.hibernate.org or usethe JBoss Maven repository hibernate-search). It is interesting to downloadthe Apache Lucene distribution as well, available at http://lucene.apache.org/java/. It contains both documentation and a contribution section containing various add-ons not included inHibernate Search. Make sure to use the same Lucene versionHibernate Search is based on. You can find the correct versionin the Hibernate Search distribution in lib/readme.txt.Hibernate Search requires three JARs – all available in theHibernate Search distribution:nhibernate-search.jar: the core API and engine ofHibernate Searchnlucene-core.jar: Apache Lucene enginenhibernate-commons-annotations.jar: some common utilitiesfor the Hibernate projectYou can also add the optional support for modular analyzersby adding: apache-solr-analyzer.jar. This JAR (available in theHibernate Search distribution), is a subset of the SOLR distributionthat contains various analyzers. While optional, it is recommendedto add this JAR to your classpath as it greatly simplifies the use ofanalyzers.DZone, Inc.AnnotationsEntityManagerSearch3.2.6 GA3.2.x, 3.3.x3.2.x, 3.3.x3.0.x3.3.0 SP13.4.x3.4.x3.1.x3.3.x3.0.x3.4.x3.1.x3.3.1 GA3.2.x3.4.0 GA3.3.xHibernateEntityManager3.3.2 GA3.2.x3.3.x3.0.x3.4.0 GA3.3.x3.4.x3.1.x3.0.0 GA3.2.x3.3.x3.3.x3.0.x3.1.0 GA3.3.x3.4.x3.4.x3.1.x3.0.1 GA 3.2.2(better if 3.2.6)3.3.x (betterif 3.3.1 2.x3.2.x and3.3.x3.2.x and3.3.x(3.2.0)HibernateValidatorIn order to use Hibernate Search you should understand the basicsof Hibernate, and be familiar with the object manipulation APIsfrom the Hibernate Session or the Java Persistence EntityManageras well as the query APIs. You should also be familiar with association mappings and the concept of bidirectional relationships.CoreHibernateAnnotationsTable 1 Documentation LinksGETTING STARTEDThe apache-solr-analyzer.jar capabilities areonly available in Hibernate Search 3.1 .HibernateSearchTable 2: Compatibility Matrix. \k jlggfik ]fi ? Y\ieXk\JBoss Enterprise Application Platformincludes Hibernate JBoss Enterprise Application Platform pre-integratesJBoss Application Server, Seam, and Hibernate Includes caching, clustering, messaging, transactions, andintegrated web services stack Support for industry-leading Java and technologies likeJAX-WS, EJB 3.0, JPA 1.0, JSF 1.2, and JTA 1.1 Use JBoss Operations Network to monitor and tuneHibernate queriesDownload today: jboss.com/download 2008 Red Hat Middleware, LLC. All Rights Reserved. Red Hat, Red Hat Enterprise Linux, theShadowman logo and JBoss are registered trademarks of Red Hat, Inc. in the U.S. and othercountries. Linux is a registered trademark of Linus Torvalds. www.dzone.com

2Getting Started with Hibernate Searchtech facts at your fingertipsHotTipTable 3, continuedDependencies needed to build and initially testHibernate Search are included in the HibernateSearch distribution or can be found in theMaven dependency file (POM) which isincluded with the Hibernate Search download.hibernate.search.filter.cachestrategyThe filter caching strategy class (musthave a no-arg constructor andimplement he bitresults.sizeThe hard ref count of our CachingWrapperFilter. Defaults to 5.Table 3: Hibernate Search configuration parameters, continuedConfiguration ParametersMAPPING ENTITIESConfiguration parameters can be provided in three ways:nIn a hibernate.cfg.xml filenIn a /hibernate.properties filenThrough the configuration API and specificallyconfiguration.setProperty(String, String)Listing 1: An example hibernate.cfg.xml file. ?xml version ”1.0” encoding ”UTF-8”? hibernate.cfg.xml file !DOCTYPE hibernate-configuration PUBLIC“-//Hibernate/Hibernate Configuration DTD rnate-configuration-3.0.dtd” !-- hibernate.cfg.xml -- hibernate-configuration session-factory name ”dvdstore-catalog” !-- regular Hibernate Core configuration -- property name ”hibernate.dialect” org.hibernate.dialect.PostgreSQLDialect” /property property name ”hibernate.connection.datasource” jdbc/test /property !-- Hibernate Search configuration -- property name ”hibernate.search.default.indexBase” /users/application/indexes /property Figure 1: Basic entity mapping. !-- mapping classes -- mapping class ”com.manning.dvdstore.model.Item”/ list additional entities /session-factory /hibernate-configuration gisterlistenersEnable listeners auto registration inHibernate Annotations and EntityManager. Default to true.hibernate.search.indexing strategyDefines the indexing strategy, default toevent. Other option is manual.hibernate.search.analyzerThe default Lucene analyzer class.hibernate.search.similarityThe default Lucene similarity class.hibernate.search.worker.batch sizeHas been deprecated in favor of thisexplicit APIhibernate.search.worker.backendOut of the box support for the ApacheLucene backend and the JMS back end.Defaults to lucene. Other option is jms.Bridges fulfill several needs in the Hibernate Search architecture.nConverts an object instance into a Lucene consumablerepresentation (commonly a String) and adds it to a Lucenedocument.nReads information from the Lucene document and buildsback the object representation.Bridges that support both the conversion object to Lucene andLucene to object are called two-way bridges. Table 4 lists all outof-the-box Hibernate Search bridges.Java TypeBuild-in BridgeDescriptionStringStringBridgeno-opSupports synchronous and asynchronousexecution. Defaults to sync. Other optionis async.short / ShortShortBridgeUse toString(), not comparableint / IntegerIntegerBridgeUse toString(), not efines the number of threads in thepool. Useful only for asynchronousexecution. Default to 1.long / LongLongBridgeUse toString(), not comparablefloat / FloatFloatBridgeUse toString(), not efines the maximal number of workqueue if the thread poll is starved. Usefulonly for asynchronous execution. Defaultto infinite. If the limit is reached, the workis done by the main thread.double /DoubleDoubleBridgeUse toString(), not comparableBigDecimalBigDecimalBridgeUse toString(), not comparableBigIntegerBigIntegerBridgeUse toString(), not comparableboolean /BooleanBooleanBridgeString value: “true” / “false”ClassClassBridgeAllows manipulation of any combinationof different fields.EnumEnumBridgeUse enum.name()URLUrlBridgeConverts to the String representationURIUriBridgeConverts to the String ernate.search.worker.jndi.*Defines the JNDI properties to initiatethe InitialContext (if needed). JNDI isonly used by the JMS back end.hibernate.search.worker.jms.connection factoryMandatory for the JMS back end. Definesthe JNDI name to lookup the JMS connection factory from. (java:/ConnectionFactory by default in JBoss AS)hibernate.search.worker.jms.queueMandatory for the JMS back end.Defines the JNDI name to lookup theJMS queue from. The queue will be usedto post work messages.hibernate.search.reader.strategyDefines the reader strategy used. Defaultsto shared. Other option is not-shared.Table 4: List of standard Hibernate Search bridges.Table 3: Hibernate Search configuration parameters.DZone, Inc. www.dzone.com

3Getting Started with Hibernate Searchtech facts at your fingertipsBridges, continuedDateDateBridgeAssociations, continuedListing 4 shows that @ContainedIn is paired to an @IndexedEmbeddedannotation on the other side of the bi-directional relationship.The string representation depends on@DateBridge. Converting Date intostring and back is not guaranteed to beidempotentListing 4: Figure 2 in code.Table 4: List of standard Hibernate Search bridges, continued.Custom bridges allow for converting unexpected data types. The@FieldBridge annotation is placed on a property (field or getter)that needs to be processed by a custom bridge. An example,including parameter passing, is given in Listing 2.@Entity @Indexedpublic class Item {@ManyToMany@IndexedEmbeddedprivate Set Actor actors; // embed actors when indexing@ManyToOne@IndexedEmbeddedprivate Director director; // embed director when indexing.}Listing 2: A custom bridge example with parameters.@Entity@Indexedpublic class Item {@Field// property marked to use a custom bridge@FieldBridge(// declare the custom bridge implementationimpl PaddedRoundedPriceBridge.class,// optionally provide parametersparams {@Parameter(name ”pad”, value ”3”),@Parameter(name ”round”, value ”5”) })private double price;.}@Entity @Indexedpublic class Actor {@Field private String name;@ManyToMany(mappedBy ”actors”)@ContainedIn actor is contained in item index (1)private Set Item items;.}@Entity @Indexedpublic class Director {@Id @GeneratedValue @DocumentId private Integer id;@Field private String name;@OneToMany(mappedBy ”director”)@ContainedIn director is contained in item indexprivate Set Item items;.}Embeddable ObjectsEmbedded objects in Java Persistence (they are called components in Hibernate) are objects whose life cycle entirely dependson the owning entity. When the owning entity is deleted, theembedded object is deleted as well.HotTipListing 3: Embeddable Object example.@Embeddablepublic class Rating {// mark properties for indexing@Field(index Index.UN TOKENIZED) private Integer overall;@Field(index Index.UN TOKENIZED) private Integer scenario;@Field(index Index.UN TOKENIZED) private Integer soundtrack;@Field(index Index.UN TOKENIZED) private Integer picture;.}The @IndexEmbedded depth setting (e.g.@IndexEmbedded(depth 3)) controls themaximum number of embeddings allowedper association.ANALYZERSAnalyzers are responsible for taking text as input, chunking itinto individual words (tokens) and optionally applying someoperations (filters) on the tokens. A filter can alter the streamof tokens as it pleases. It can remove, change, and add words.@Entity@Indexedpublic class Item {// mark the association for indexing@IndexedEmbedded private Rating rating;.}In addition to the SOLR analyzers mentioned previously, Lucene’sorg.apache.lucene.analysis package contains additionalanalyzers and many filters. Listing 5 is an example of definingan analyzer on an entity.Listing 5@IndexedEmbedded marks the association as embedded: theLucene document contains rating.overall, rating, scenario,rating.soundtrack, rating.picture. When Item is deletedthe embedded Rating object is also deleted.AssociationsAssociations between objects are similar to embeddable objectsexcept that an associated object’s life time is not dependent onthe owning entity. Below is an example association mapping.@Entity @Indexed@AnalyzerDef(name ”applicationanalyzer”, // analyzer definition nametokenizer // tokenizer factory@TokenizerDef(factory StandardTokenizerFactory.class ),filters {// list of filters to apply@TokenFilterDef(factory ory StopFilterFactory.class,// parameters passed to the filter factoryparams {@Parameter(name ”words”,value rameter(name ”ignoreCase”, value ”true”)} )} )// Use the pre defined analyzer@Analyzer(definition ”applicationanalyzer”)public class Item {.}Figure 2: An example association.DZone, Inc. www.dzone.com

4Getting Started with Hibernate Searchtech facts at your fingertipsFrom a dataset, continuedINITIAL INDEXING OF ENTITIESListing 8: Wiring listeners when not using annotations. hibernate-configuration session-factory . event type ”post-update” listener class tener”/ /event event type ”post-insert” listener class tener”/ /event event type ”post-delete” listener class tener”/ /event event type ”post-collection-recreate” listener class tener”/ /event event type ”post-collection-remove” listener class tener”/ /event event type ”post-collection-update” listener class tener”/ /event /session-factory /hibernate-configuration ManuallyFirst you need an instance of either a FullTextEntityManageror a FullTextSession depending on whether or not you areusing an EntityManager.Listing 6: Manually indexing data.SessionFactory factory new ssion session factory.openSession();FullTextSession fts sion);fts.getTransaction.begin()for (Item item : items) {fts.index(item); //manually index an item instance}fts.getTransaction().commit(); //index is written at commit timesession.close();orEntityManagerFactory factory Persistence.createEntityManagerFactory(“ ”);EntityManager em r ftem anager(em);ftem.getTransaction().begin();for (Item item : items) {ftem.index(item); //manually index an item instance}//index is written at commit timeftem.getTransaction().commit();For versions of Hibernate Search prior to 3.1.x the configurationis slightly different as shown in Listing 9.Listing 9: Wiring listeners prior to Hibernate Search version 3.1.x.HotTip hibernate-configuration session-factory . event type ”post-update” listener class stener”/ /event event type ”post-insert” listener class stener”/ /event event type ”post-delete” listener class stener”/ /event !-- collection listener is different -- event type ”post-collection-recreate” listener class ionEventListener”/ /event event type ”post-collection-remove” listener class EventListener”/ /event event type ”post-collection-update” listener class ionEventListener”/ /event /session-factory /hibernate-configuration getFullTextSessionand getFullTextEntityManagerwere named createFullTextSession andcreateFullTextEntityManager inHibernate Search 3.0.From a datasetListing 7: Initial indexing of a dataset.// disable flush / disable 2nd level cache ransaction tx session.beginTransaction();// read the data from the database// scrollable results will avoid loading too many objects// in memory// ensure forward only result setScrollableResults results session.createCriteria(Item.class).scroll( ScrollMode.FORWARD ONLY );int index 0;while( results.next() ) {index ;session.index( results.get(0) ); index entities (4)if (index % BATCH SIZE 0) {session.flushToIndexes(); apply changes to the index (5)session.clear(); clear the session releasing memory}}tx.commit(); apply the remaining index changesQUERYING INDEXESTable 5 shows the three ways to obtain results.Updates, additions and deletions to indexes are handled automatically by Hibernate Search via entity listeners. If you are usingHibernate Annotations these listeners are automatically wired foryou. If you are not using the annotations then you have to wirethe listeners manually as shown in Listing 8.DZone, Inc. Method CallDescriptionquery.list()List Item items query.list();All matching objects are loaded eagerly as opposed to lazily.Table 5: Querying indexeswww.dzone.com

5Getting Started with Hibernate Searchtech facts at your fingertipsQuerying indexes, continuedHibernate Search Annotations, continuedIterator Item items query.iterate();while ( items.hasNext() ) {Item item items.next();}All object identifiers are extracted from the Lucene index but objectsare not loaded until iterator.next() is called@AnalyzerDefsReusable analyzer definitions. Allows multiple@AnalyzerDef declarations per element.@BoostApply a boost factor to a field or an entire entity.@ClassBridgeAllows a user to manipulate a Lucene document basedon an entity change in any manner the user wishes.ScrollableResults items query.scroll();// process resultsItems.close();ScrollableResults must be closed when processing is finishedto free resources.@ClassBridgesAllows multiple @ClassBridge declarations perdocument.@ContainedInMarks the owning entity as being part of the associated entity’s index (to be more accurate, being partof the indexed object graph). This is only necessarywhen an entity is used as a @IndexedEmbedded targetclass. @ContainedIn must mark the property pointingback to the @IndexedEmbedded owning Entity. Notnecessary if the class is an embeddable class.SessionFactory factory new ssion session factory.openSession();@DateBridgeDefines the temporal resolution of a given property.Dates are stored as a String in GMT.@DocumentIdDeclare a property as the document id.FullTextSession fts sion);@FactoryMarks a method of a filter factory class as a Filterimplementation provider. A factory method is calledwhenever a new instance of a filter is requested.@FieldMarks a property as indexed. Contains field optionsfor storage, tokenization, whether or not to storeTermVector information, a specific analyzer and aField-Bridge.@FieldBridgeSpecifies a field bridge implementation class. A fieldbridge converts (sometimes back and forth) a propertyvalue into a string representation or a representationstored in the Lucene Document.@FieldsMarks a property as indexed into different fields.Useful if the field is used for sorting and searching or ifdifferent analyzers are used.@FullTextFilterDefDefines a full-text filter that can be optionally appliedto full-text queries. While not related to a specificindexed entity, the annotation must be set on one ofthem.query.iterate()query.scroll()Table 5: Querying indexes, continued.Listing 10: A FullTextQuery example.fts.getTransaction.begin()// create a Term for the description fieldTerm term new Term(“description”, “salesman”);TermQuery query new TermQuery(term);// generate a FullTextQuery and obtain a result listorg.hibernate.search.FullTextQuery hibQuery s.createFullTextQuery(query, Dvd.class);List Dvd results hibQuery.list();Basic Query TypesTable 6 presents the basic query types. Consult the Lucene APIdocumentation at http://lucene.apache.org/java/2 4 0/api/index.html for a complete listing, specifically the org.apache.lucene.search package.@FullTextFilterDefsAllows multiple @FullTextFilterDef per s is the basic building block of queries. It searches fora single term in a single field. Many other query types arereduced to one or more of these.Specifies that an entity is to be indexed. The indexname defaulted to the fully qualified class name canbe overridden using the name attribute.@IndexedEmbeddedWildcardQueryQueries with the help of two wildcard symbols '*' (multiplecharacters) and '?' (single character). These wildcard symbolsallow queries to match any combination of characters.Specifies that an association (@*To*, @Embedded,@CollectionOfEmbedded) is to be indexed in theroot entity index. It allows queries involving associatedobjects restrictions.PrefixQueryA WildcardQuery that starts with characters and ends withthe '*' symbol.@KeyPhraseQueryAlso known as a proximity search, this queries for multiple termsenclosed by quotes.Marks a method of a filter factory class as a Filter keyprovider. A key is an object that uniquely identifiesa filter instance associated with a given set ofparameters.FuzzyQueryQueries using the Levenshtein distance between terms.Requires a minimum similarity float value that expands orcontracts the distance.RangeQueryAllows you to search for results between two values. Values canbe inclusive or exclusive but not mixed.BooleanQueryHolds every possible combination of any of the other query typesincluding other BooleanQuerys. Boolean queries combineindividual queries as SHOULD, MUST or MUST NOT.MatchAllDocsQueryReturns all documents contained in a specified index.The key object must implement equals and hashcodeso that 2 keys are equals if and only if the given targetobject types are the same and the set of parametersare the same. The key object is used in the filter cacheimplementation.@ParameterBasically a key/value descriptor. Used in @ClassBridge,@FieldBridge, TokenFilterDef and @TokenizerDef.@ProvidedIdObjects whose identifier is provided externally, asopposed to being a part of the object state, should bemarked with this annotation. This annotation shouldnot be used in conjunction with @DocumentId. Thisannotation is primarily used in the JBoss CacheSearchable project. http://www.jboss.org/jbosscacheand milaritySpecifies a similarity implementation to use in scoringcalculations.Table 6: Basic query types.hibernate search annotationsTable 7 is a complete listing of all Hibernate Search annotations.Ex. @EntityAnnotationDescription@AnalyzerDefine an Analyzer for a given entity, method, attributeor Field. The order of precedence is: @Field,attribute/ method, entity, default. Able to reference animplementation or an @AnalyzerDef definition.@AnalyzerDefReusable analyzer definition. An analyzer definitiondefines: one tokenizer and, optionally, some filters.Filters are applied in the order they are defined.@Indexed@Similarity(impl BookSpecificSimilarity.public class Book {.}Table 7: Hibernate Search Annotations.@TokenFilterDefSpecifies a TokenFilterFactory and its parametersinside a @AnalyzerDef.@TokenizerDefDefines a TokenizerFactory and its parametersinside a @AnalyzerDefTable 7: Hibernate Search Annotations, continuedDZone, Inc. www.dzone.com

6Getting Started with Hibernate Searchtech facts at your fingertipsLUKEThe most indispensable utility you can have in your arsenalof index troubleshooting tools (in fact it may be the only oneyou really need) is Luke, shown in Figure 3. With Luke you canexamine any facet of an index you can imagine. Some of itscapabilities are:nview individual documentsnexecute a search, and browse the resultsnselectively delete documents from the indexnexamine term frequency, and many more.The Luke author, Andrzej Bialecki, actively maintains Luke to keepup with the latest Lucene version. Luke is available for download,in several different formats, at http://www.getopt.org/luke/. Themost current version of the Java WebStart JNLP direct downloadis the easiest to retrieve.Figure 3: The search window of the Luke utility for Lucene indexes.RECOMMENDED BOOKABOUT THE AUTHORJohn GriffinHibernate Search In ActionJohn Griffin has been in the software and computer industry in one formor another since 1969. He remembers writing his first FORTRAN IV programon his way back from Woodstock. Currently, he is the software engineer/architect for SOS Staffing Services, Inc. He was formerly the lead e-commercearchitect for Iomega Corporation and an independent consultant for theDept. of the Interior, among many other callings. John is a member of theACM. Currently, he resides in Layton, Utah with wife Judy and Australian Shepards Clancyand Molly.guides you through everyPublicationsas Search clustering.nnstep to set up full text searchfunctionality in your Javaapplications, and provides apragmatic, how-to explorationof more advanced topics suchBlogXML and SQL Server 2000, New Riders nate Search in Action, ManningPublications, co-authored with Emmanuel BernardBUY NOWbooks.dzone.com/books/hibernate-searchGet More FREE Refcardz. Visit refcardz.com now!Upcoming Refcardz:Available:Core Mule 2Core SeamCore CSS: Part IGetting Started with Equinox OSGiEssential RubyStruts2SOA PatternsSpring AnnotationsCore .NETGetting Started with EMFCore JavaVery First Steps in FlexUsing XML in JavaCore CSS: Part IIC#Essential JSP Expression LanguagePHPGroovyGetting Started with JPANetBeans IDE 6.1 Java EditorJavaServer FacesRSS and AtomVisit refcardz.com for a complete listing of available Refcardz.DZone, Inc.1251 NW MaynardCary, NC 27513more than 1.7 million software developers, architects and decisionmakers. DZone offers something for everyone, including news,tutorials, cheatsheets, blogs, feature articles, source code and more.“DZone is a developer’s dream,” says PC Magazine.Design Patterns Published June 2008ISBN-13: 978-1-934238-34-9ISBN-10: 1-934238-34-150795888.678.0399919.678.0300Refcardz Feedback Welcomerefcardz@dzone.comSponsorship Opportunitiessales@dzone.com 7.95DZone communities deliver over 4 million pages each month toFREE9 781934 238349Copyright 2008 DZone, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical,photocopying, or otherwise, without prior written permission of the publisher. Reference: Hibernate Search in Action, Emmanuel Bernard and John Griffin, Manning Publications, December 2008.Version 1.0

Hibernate Search distribution), is a subset of the SOLR distribution that contains various analyzers. While optional, it is recommended to add this JAR to your classpath as it greatly simplifies the use of analyzers. Hibernate Search is not compatible with all versions of Hibernate Core and Hibernate Annotations. Refer to Table 2 for compati-