Thursday, July 9, 2009

Searching in Solr

The search syntax in Solr is a bit different from that in Verity. Verity has several search types, such as simple, internet and natural. In ColdFusion 9, Solr has two search types: standard (the default) and dismax. In addition, Solr is configurable, so other search types can be defined.

In Verity, the following code can be used to search for documents containing 'filmed' or 'filming':


<cfsearch name="myveritysearch" collection="myverity" type="explicit" criteria='<WILDCARD>`film*`'>

In Solr, type="explicit" is not valid. The default type for Solr is standard, and the equivalent search code is:

<cfsearch name="mysolrsearch" collection="mysolr" criteria="film*">

? : performs a single-character wildcard search
* : performs a multiple-character wildcard search

Note: * or ? cannot be used as the first character of a search term.
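
As a small illustration of the single-character wildcard, the sketch below reuses the collection name from the example above. The term te?t is only an assumed example; it should match words such as "text" or "test".

```cfml
<!--- Single-character wildcard: ? stands for exactly one character --->
<cfsearch name="mysolrsearch" collection="mysolr" criteria="te?t">
```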

Searching in a particular document (mydoc.html):

<cfsearch name="mysolrsearch" collection="mysolr" criteria="film* AND title:mydoc.html">

Excluding a particular document (mydoc.html) from the search:

<cfsearch name="mysolrsearch" collection="mysolr" criteria="-title:mydoc.html">

Fuzzy Search in Solr:

Solr supports fuzzy searches, which match words with similar spelling. For example, to search for words like roam, foam, etc.:

<cfsearch name="mysolrsearch" collection="mysolr" criteria="roam~">
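
The underlying Lucene query syntax also accepts an optional similarity value between 0 and 1 after the tilde, where values closer to 1 require a closer match. Assuming the Solr version bundled with ColdFusion 9 passes the criteria through unchanged, a stricter fuzzy search might be sketched like this (the value 0.8 is only an example):

```cfml
<!--- Fuzzy search with an assumed similarity threshold of 0.8 --->
<cfsearch name="mysolrsearch" collection="mysolr" criteria="roam~0.8">
```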

Proximity search:
Solr can find two words that occur within a given distance of each other. For example, to search for "solr" and "spreadsheet" within 10 words of each other:


<cfsearch name="mysolrsearch" collection="mysolr" criteria='"solr spreadsheet"~10'>

Searching for a phrase:
By default, Solr searches for individual words. To search for phrases, such as "ColdFusion Green" or "Solr search", enclose each phrase in double quotes:

<cfsearch name="mysolrsearch" collection="mysolr" criteria='"ColdFusion Green" OR "Solr search"'>

Another way is to use the search type="dismax":
<cfsearch name="mysolrsearch" collection="mysolr" type="dismax" criteria="ColdFusion Green">


Boosting a Term:
Solr ranks matching documents by relevance, based on the terms found. To boost a term, use the caret symbol (^) followed by a boost factor.

For example, if you are searching for coldfusion and solr and you want "coldfusion" to be more relevant:
<cfsearch name="mysolrsearch" collection="mysolr" criteria="coldfusion^5 solr">

The above will rank results containing "coldfusion" higher, so they appear first.
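
To see the effect of a boost, it can help to print the relevance score next to each hit. The sketch below assumes the same collection and query name as the examples in this post and uses the title and score columns of the query that cfsearch returns; the output layout is only illustrative.

```cfml
<cfsearch name="mysolrsearch" collection="mysolr" criteria="coldfusion^5 solr">

<!--- Loop over the result query and print each title with its score --->
<cfoutput query="mysolrsearch">
    #mysolrsearch.title# (score: #mysolrsearch.score#)<br>
</cfoutput>
```

Running the search with and without the ^5 boost and comparing the printed scores is a simple way to check how much the boost changes the ordering.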


2 comments:

DZone said...

This is really nice content on Solr. Will you ever return to blogging more about it? I'd like to syndicate these posts on the DZone network if you are interested. Contact me at mitch (at) dzone |dot| com

Anonymous said...

HI,

Do you have an example of how to boost the search result if the search criteria appears in the title as opposed to the body?

Thanks
Mike