Thursday, July 9, 2009

Searching in Solr

Search syntax in Solr is a bit different from that in Verity. Verity has several search types, such as simple, internet and natural. In ColdFusion 9, Solr has two search types: standard (the default) and dismax. In addition, Solr is configurable, and other search types can be defined.

In Verity, the following code can be used to search for documents containing 'filmed' or 'filming':

<cfsearch name="myveritysearch" collection="myverity" type="explicit" criteria='<WILDCARD>`film*`'>

In Solr, type="explicit" is not valid. The default type for Solr is standard, and the equivalent search code is:

<cfsearch name="mysolrsearch" collection="mysolr" criteria="film*">

? : performs a single-character wildcard search
* : performs a multiple-character wildcard search

Note: * or ? can't be used as the first character of a search.
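
As a small illustration of the single-character wildcard, the sketch below reuses the 'mysolr' collection from the example above. The search term te?t is only an assumed example; it would match words such as 'test' or 'text'.

```cfml
<!--- Single-character wildcard: ? stands for exactly one character --->
<cfsearch name="mysolrsearch" collection="mysolr" criteria="te?t">
```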

Searching in a particular document (mydoc.html):

<cfsearch name="mysolrsearch" collection="mysolr" criteria="film* AND title:mydoc.html">

Excluding a particular document (mydoc.html) from the search:

<cfsearch name="mysolrsearch" collection="mysolr" criteria="-title:mydoc.html">
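
The result of any of these searches is a regular query object. A minimal sketch of displaying it, assuming the 'mysolr' collection has already been indexed and that the title, score and summary columns are the ones of interest:

```cfml
<cfsearch name="mysolrsearch" collection="mysolr" criteria="film*">

<!--- cfsearch returns a query; loop over it like any other query --->
<cfoutput query="mysolrsearch">
    #mysolrsearch.currentRow#. #title# (score: #score#)<br>
    #summary#<br>
</cfoutput>
```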

Fuzzy search in Solr:

Solr supports fuzzy searches. For example, to search for words similar to 'roam', such as 'roam' and 'foam':

<cfsearch name="mysolrsearch" collection="mysolr" criteria="roam~">
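
The underlying Lucene query syntax also accepts an optional similarity value between 0 and 1 after the tilde, where values closer to 1 require a closer match. A sketch, assuming cfsearch passes this syntax through to Solr unchanged (the value 0.8 is only illustrative):

```cfml
<!--- Fuzzy search with a stricter similarity threshold (assumed value 0.8) --->
<cfsearch name="mysolrsearch" collection="mysolr" criteria="roam~0.8">
```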

Proximity search:
Solr can search for two words within a given distance of each other. For example, to search for "solr" and "spreadsheet" within 10 words of each other:

<cfsearch name="mysolrsearch" collection="mysolr" criteria='"solr spreadsheet"~10'>

Searching for a phrase:
By default, Solr searches for individual words. To search for phrases such as "ColdFusion Green" or "Solr search":

<cfsearch name="mysolrsearch" collection="mysolr" criteria='"ColdFusion Green" OR "Solr search"'>

Another way is to use the search type="dismax":

<cfsearch name="mysolrsearch" collection="mysolr" type="dismax" criteria="ColdFusion Green">


Boosting a term:
Solr ranks matching documents by relevance, based on the terms found. To boost a term, use the caret symbol "^" followed by a boost factor.

For example, if you are searching for coldfusion and solr and you want "coldfusion" to be more relevant:

<cfsearch name="mysolrsearch" collection="mysolr" criteria="coldfusion^5 solr">

The above will rank results containing coldfusion first.
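
In the Lucene query syntax, a boost can also be applied to a quoted phrase. The sketch below is an assumed example, not taken from the product documentation: it ranks documents containing the phrase "coldfusion solr" above those that only contain the phrase "search engine".

```cfml
<!--- Boost a whole phrase by a factor of 4 (phrases and factor are illustrative) --->
<cfsearch name="mysolrsearch" collection="mysolr" criteria='"coldfusion solr"^4 "search engine"'>
```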


Wednesday, July 8, 2009

Working with Solr: Creating Collections and Indexing them

Using Solr in ColdFusion requires a few basic code changes when creating, indexing and searching a collection.

Creating Collection:
When creating a collection, ColdFusion uses the 'engine' attribute to identify the collection as Solr or Verity. The default value of 'engine' is verity. Defining the 'engine' attribute is a one-time activity; for indexing and searching, ColdFusion automatically identifies the collection type.

<cfcollection action="create"
collection="mysolr"
path="#expandPath('.')#"
engine="solr">

This creates a Solr collection named 'mysolr'. There is no need to define the language and categories when creating the collection; the language can be defined when indexing the collection.
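
To check that the collection was registered, the collections can be listed. This is a minimal sketch: the query name "collections" is a hypothetical choice, and only the name and path columns of the result are shown.

```cfml
<!--- List registered collections into a query named "collections" (hypothetical name) --->
<cfcollection action="list" name="collections">

<cfoutput query="collections">
    #name#: #path#<br>
</cfoutput>
```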


Indexing Collection:

Indexing a Solr collection is similar to indexing a Verity collection.

<cfquery name="getCourses" datasource="cfdocexamples">
SELECT * FROM COURSES
</cfquery>

<cfindex action="Update"
query="getCourses"
collection="mysolr"
type="Custom"
key="Course_ID"
title="Courses"
body="Course_ID,Descript"
custom1="Course_Number">

To index a Solr collection in a language other than English, define the language attribute in the <cfindex> tag.

Solr supports the following languages:

• Danish

• Dutch

• Finnish

• French

• German

• Italian

• Norwegian

• Spanish

• Portuguese

• Russian

• Swedish

• Chinese

• Japanese

• Korean

• Czech

• Greek

• Thai


However, Solr can index documents in any language. If a document is in a language not listed above (for example, Arabic), Solr can still index the content (as English), but stemming will not be available in search.
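
A sketch of indexing content in one of the listed languages follows. The query, datasource, table and column names are hypothetical, and the exact spelling of the language value ("German" here) is an assumption based on the list above.

```cfml
<!--- Hypothetical query and datasource, for illustration only --->
<cfquery name="getGermanDocs" datasource="mydatasource">
SELECT doc_id, doc_title, doc_body FROM german_docs
</cfquery>

<!--- language is set at indexing time, not when the collection is created --->
<cfindex action="update"
query="getGermanDocs"
collection="mysolr"
type="custom"
key="doc_id"
title="doc_title"
body="doc_body"
language="German">
```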

Thursday, July 2, 2009

Solr in ColdFusion 9

ColdFusion 9 now provides support for the Solr search engine as an alternative to Verity. Solr is an enterprise search server based on the Lucene Java search library. Solr provides features like indexing, searching, caching, and a web administration interface. In ColdFusion, Solr runs as a separate service on a Jetty server. Solr can be installed along with ColdFusion or by using a separate installer.

Configuring Solr:

The ColdFusion Administrator provides an interface to configure Solr. Log on to the Administrator page -> Data & Services -> Solr Server. You can configure the Solr Home path and port on this page. By default, Solr runs on port 8983. In your browser, go to

http://localhost:8983/solr

This is the Admin page for Solr. It shows all Solr collections and information about them. Any collection can also be searched from here.

If port 8983 is already in use, Solr can be configured to run on a different port with the following steps:

1. Go to the location where Solr is installed (e.g. c:\Coldfusion\solr). Go to the folder etc and open the file jetty.xml.
2. Search for port 8983 and replace every occurrence with the new port, say 8984.
3. Log on to the Administrator page -> Solr Server -> Show advanced settings, change the Solr Admin port to 8984 and submit.
4. Restart the Solr service. (Go to the location where Solr is installed and run ./cfsolr restart)
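
After the restart, one way to confirm that Solr is answering on the new port is a quick request from ColdFusion itself. This is a minimal sketch, assuming Solr runs on localhost and was moved to port 8984 as in the steps above; the result variable name solrCheck is hypothetical.

```cfml
<!--- Request the Solr admin page on the new port; timeout is in seconds --->
<cfhttp url="http://localhost:8984/solr" method="get" timeout="5" result="solrCheck">

<!--- A status starting with 200 suggests Solr is listening on this port --->
<cfoutput>Solr responded with: #solrCheck.statusCode#</cfoutput>
```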