Sitefinity Search, true to the Sitefinity mantra, provides not only a powerful set of features and a user-friendly UI to manage them, but also a set of extension points that give developers the freedom to make whatever tweaks their requirements call for.
In this blog post we are going to discuss the different extension points that the search feature provides and illustrate them in a sample that uses the different hooks in the Search Utility - index, search widget and search results widget.
To illustrate the ideas discussed in this blog post we are going to walk through a code sample that implements searching by category. It touches all three parts: the index, the search box widget and the search results widget.
But first, let's get down to some of the core concepts of Sitefinity search and how it can be extended.
In terms of usability, Sitefinity search gives administrators and business users the ability to set up very sophisticated search functionality. But it also goes beyond this. It gives developers room to be creative and extend the search functionality based on ideas and requirements that do not necessarily fit inside a box, and those extension points are exactly what we are going to discuss in this post. To go into a bit more detail, let us first talk about the engine that Sitefinity Search is based on.
Core components of search and indexing
Each search engine essentially consists of a crawler (spider), an index and a search box. Crawling is not really the right term for a built-in search engine, because the CMS notifies the index about changes itself and no crawling is needed.
Indexing is the process of extracting plain text from the CMS items (pages, blog posts, news) and saving it in a structure optimized for fast searching of that content.
Sitefinity creates and writes to index files that are stored in the file system – in the App_Data folder of the Sitefinity site.
What actually gets indexed?
By default the indexed fields are Title, Content, Summary, ContentType, OriginalItemId, Link, DateCreated, PipeId, Language and IdentityField. This is enough to resolve each content item. The reason we index with Lucene instead of querying the database is performance. Adding additional fields is supported out of the box.
How the search works
Now that we have some grounding in the engine behind the search indexes, let's look at the search process itself. Each time a user creates content (e.g. news, events, list items, custom content, pages) the publishing system kicks in.
When you publish, you invoke a content pipe that persists data to the publishing point. In simpler terms, Sitefinity not only writes the data to the database but also exposes it and asks other systems: "Hey, do you want to do your thing with this data, without bothering me with the details?" In technical terms, the place where this question is asked is the publishing point. From there the search outbound pipe takes all the data it wants and writes it into the Lucene indexes. In much the same way, the RSS pipe takes this data to expose an RSS feed, and the Twitter pipe... well... tweets.
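The fan-out described above can be pictured as a small observer pattern. The sketch below is only an illustration in plain C#: PublishingPointSketch, IOutboundPipeSketch and the two pipe classes are hypothetical names made up for this post, not Sitefinity types, and the real publishing system does considerably more.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins, not Sitefinity types: they only model the fan-out idea.
public interface IOutboundPipeSketch
{
    void Push(IDictionary<string, string> item);
}

// Collects what a search pipe would hand over to the index writer.
public sealed class SearchPipeSketch : IOutboundPipeSketch
{
    public List<string> IndexedTitles { get; } = new List<string>();

    public void Push(IDictionary<string, string> item)
    {
        IndexedTitles.Add(item["Title"]);
    }
}

// Collects what a feed pipe would expose to subscribers.
public sealed class RssPipeSketch : IOutboundPipeSketch
{
    public List<string> FeedEntries { get; } = new List<string>();

    public void Push(IDictionary<string, string> item)
    {
        FeedEntries.Add(item["Title"] + " (" + item["Link"] + ")");
    }
}

// The publishing point knows nothing about what each pipe does with the data.
public sealed class PublishingPointSketch
{
    private readonly List<IOutboundPipeSketch> outboundPipes = new List<IOutboundPipeSketch>();

    public void Subscribe(IOutboundPipeSketch pipe)
    {
        outboundPipes.Add(pipe);
    }

    // Hand the same published item to every subscribed pipe.
    public void Publish(IDictionary<string, string> item)
    {
        foreach (var pipe in outboundPipes)
        {
            pipe.Push(item);
        }
    }
}
```

Subscribing a SearchPipeSketch and an RssPipeSketch to the same PublishingPointSketch and calling Publish once delivers the item to both, which is the "do your thing with this data" idea in miniature.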
From this point on, the ball is in the court of the Search Pipe, which takes Sitefinity content and translates it into something Lucene can understand. The pipe sends all the needed data to the Search Provider, which then writes to the Lucene files. Technically the search index is a set of segmented binary files, and you can think of it as the physical representation of the content item with all the relevant fields persisted. The files themselves consist of binary data that would probably not tell you much if you opened them in Notepad, but you can think of them as key:value pairs of field:indexed content.
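To make the "field:indexed content" picture a bit more concrete, here is a toy inverted index in plain C#. It is only a mental model under simplifying assumptions (whitespace and punctuation splitting, no analysis, everything kept in memory); TinyIndexSketch is a hypothetical name, and real Lucene segments are far more elaborate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: it is not how Lucene stores its segments.
public sealed class TinyIndexSketch
{
    // "field:term" -> ids of the items whose field contains that term.
    // Keys are compared case-insensitively, so "Title:party" finds "Party".
    private readonly Dictionary<string, HashSet<int>> postings =
        new Dictionary<string, HashSet<int>>(StringComparer.OrdinalIgnoreCase);

    // Split each field's text into terms and record which item they came from.
    public void Add(int itemId, IDictionary<string, string> fields)
    {
        foreach (var field in fields)
        {
            var terms = field.Value.Split(
                new[] { ' ', ',', '.', ';' }, StringSplitOptions.RemoveEmptyEntries);

            foreach (var term in terms)
            {
                var key = field.Key + ":" + term;
                if (!postings.TryGetValue(key, out var ids))
                {
                    ids = new HashSet<int>();
                    postings[key] = ids;
                }
                ids.Add(itemId);
            }
        }
    }

    // Look up a single term in a single field; no database query is involved.
    public IReadOnlyList<int> Find(string field, string term)
    {
        return postings.TryGetValue(field + ":" + term, out var ids)
            ? ids.OrderBy(id => id).ToList()
            : new List<int>();
    }
}
```

A lookup is a single dictionary hit, which is the performance argument for keeping an index next to the database.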
As we keep publishing and editing content in Sitefinity, the search indexes are updated for us automatically. When we click reindex, the same is done as a batch for all content.
Figure: How Sitefinity writes to the search index
What happens when we search
The search box widget is probably the simplest widget you can think of: it simply points the browser to the URL of a search results page.
There, the search results widget is the one that really talks to the Sitefinity APIs. The API here is the Search Service, which passes the query to Lucene. So if we look at the anatomy of search, these are the 3 extensible components we have, along with the different extension points they expose.
What parts can be extended?
Now that we have a better understanding of the entire Search infrastructure, a logical question is where it can be extended.
1) Presentation. The search box and the search results widgets are based on widget templates that can be modified to display practically anything: fields, images, promoted searches. Additionally, code-behind can be added to the results widget. Lucene gives us back enough information about the indexed content items (besides the indexed fields there is also metadata such as the item GUID, URL, default page and type) to let you extend this almost without limit.
2) Fields that are indexed. Sitefinity gives you the power to define custom fields and custom modules. These are not only part of the search infrastructure; you can also easily define where you want Sitefinity to search. For more information, check this blog post.
3) Search Queries. The queries you type in have the full power of the Lucene framework, which means:
- Wildcards (Party in New * returns Party in New York and Party in New Mexico) and phrase queries ("new york")
- Proximity queries ("foo bar"~4 means foo and bar occur within 4 words of each other)
- Keyword matching (title:"foo bar" AND body:"quick fox")
- Boolean conditions (title:Sitefinity AND content:Party)
- Boosts ((title:foo OR title:bar)^1.5 (body:foo OR body:bar) means matches in the title carry heavier weight)
Type in any of those to see how they affect the search results. In the sample we discuss an elegant way to achieve keyword matching.
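If you prefer to compose such queries in code rather than type them, a few string helpers go a long way. The sketch below is plain C# with no Lucene or Sitefinity types; LuceneSyntaxSketch and its method names are hypothetical, and the escaped character set is an assumption based on the classic Lucene query parser syntax.

```csharp
using System;
using System.Globalization;
using System.Text;

// Plain string helpers for Lucene's classic query syntax.
public static class LuceneSyntaxSketch
{
    // Characters the classic query parser treats as operators (assumed set).
    private const string SpecialCharacters = "\\+-!():^[]\"{}~*?|&";

    // Backslash-escape characters that would otherwise be read as operators.
    public static string Escape(string text)
    {
        var escaped = new StringBuilder();
        foreach (var c in text)
        {
            if (SpecialCharacters.IndexOf(c) >= 0)
            {
                escaped.Append('\\');
            }
            escaped.Append(c);
        }
        return escaped.ToString();
    }

    // field:"some phrase"
    public static string Phrase(string field, string phrase)
    {
        return field + ":\"" + Escape(phrase) + "\"";
    }

    // field:"some phrase"~4, the words must occur within `distance` words of each other
    public static string Proximity(string field, string phrase, int distance)
    {
        return Phrase(field, phrase) + "~" + distance.ToString(CultureInfo.InvariantCulture);
    }

    // (clause)^1.5, matches of this clause weigh more in the score
    public static string Boost(string clause, double factor)
    {
        return "(" + clause + ")^" + factor.ToString(CultureInfo.InvariantCulture);
    }

    // left AND right
    public static string And(string left, string right)
    {
        return left + " AND " + right;
    }
}
```

For example, And(Phrase("title", "foo bar"), Phrase("body", "quick fox")) produces the keyword matching query from the list above.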
4) The search input. You can manipulate the query that is passed to the results page and ultimately to Lucene, and by doing this you can easily add any type of functionality and logic behind search. In the code sample we are going to explore this option.
5) Using the Search Service independently. The search service exposes public APIs which you can use to search for items in your own custom development. A great example can be seen in this Autocomplete blog post.
Code Sample Walkthrough
In this code sample we are setting up search by category. Here are the steps involved in the initial preparation:
- Set up search to index the categories field. Reindex afterwards to have it index all content.
- Add the project attached to this blog post to your solution and add reference SitefinityWebApp to it. Fix the assembly references if you are running a version other than Sitefinity 5.2.
- Register both the new search box and search results widget in the toolbox using Sitefinity Thunder
- Add a virtual path for our search box widget
- Just for kicks make the search results widget also highlight the category field using the Advanced Options by having Title,Content,Categories as search fields
- Now kindly ask your users to type in AND Categories:Community every time they search.
Kidding, although this would actually work because of the way you can query Lucene. Instead, we will quickly build a great-looking UI that lets our customers pick a category and submit the search. Here is what happens in the code sample:
In the project we have added the following tweaks to the search box:
- A ComboBox displayed on the template (it might as well be any other control)
- This ComboBox is data bound to the categories on InitializeControls.
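One way to picture what the search box does on submit is a helper that builds the results page URL and carries the chosen category along. This is a minimal sketch under assumptions, not the code from the attached project: SearchBoxUrlSketch is a hypothetical name, the indexCatalogue and searchQuery parameter names are assumed to be what the results page reads, and the category parameter is invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SearchBoxUrlSketch
{
    // Build the URL the search box redirects to when the visitor submits.
    public static string BuildResultsUrl(
        string resultsPageUrl,
        string indexName,
        string searchText,
        string selectedCategory)
    {
        var parameters = new List<string>
        {
            // Parameter names are assumptions; check what your results page expects.
            "indexCatalogue=" + Uri.EscapeDataString(indexName),
            "searchQuery=" + Uri.EscapeDataString(searchText)
        };

        // Only send the (hypothetical) category parameter when one was picked.
        if (!string.IsNullOrWhiteSpace(selectedCategory))
        {
            parameters.Add("category=" + Uri.EscapeDataString(selectedCategory));
        }

        return resultsPageUrl + "?" + string.Join("&", parameters);
    }
}
```

Keeping the category in its own parameter leaves the text the visitor typed untouched, so the results widget can decide how to combine the two.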
We have also added logic for modifying the search results.
This is actually the fun part: we hook directly into the place where the query for Lucene is built. By inheriting from the SearchResults widget you can override the BuildSearchQuery method. Here the code sample provides a nice extension method that generates the search query based on terms, AND/OR conditions and so on, and it is explicitly configured to support categorized search as well. Categorized search is only one of many things you can do with this mechanism to add extra logic to your search query.
// Excerpt only: the calls that add the terms and the category group are not shown here.
var queryBuilder = service.CreateQueryBuilder();
// ...
// close the category group
var compiledQuery = queryBuilder.GetQuery();
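Since the excerpt above shows only the first and last call, here is a rough picture of the kind of query string such a method could end up with. It is a sketch in plain C# that does not use the Sitefinity query builder at all: CategorizedQuerySketch and its parameters are hypothetical, the terms are assumed to be plain words that need no escaping, and Categories is the field name used in the preparation steps.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CategorizedQuerySketch
{
    public static string Build(
        IEnumerable<string> terms,
        IEnumerable<string> searchFields,
        string category,
        bool matchAllTerms)
    {
        var fields = searchFields.ToList();

        // One group per term: the term may match in any of the search fields.
        var termGroups = terms
            .Select(term => "(" + string.Join(" OR ", fields.Select(f => f + ":" + term)) + ")")
            .ToList();

        // AND requires every term to match, OR accepts any of them.
        var termsPart = string.Join(matchAllTerms ? " AND " : " OR ", termGroups);

        if (string.IsNullOrWhiteSpace(category))
        {
            return termsPart;
        }

        // Category group: the item must also carry the chosen category.
        var categoryGroup = "(Categories:\"" + category + "\")";
        return termGroups.Count == 0
            ? categoryGroup
            : "(" + termsPart + ") AND " + categoryGroup;
    }
}
```

Called with the terms party and york, the fields Title and Content, and the category Community, it restricts the matches to items in that category, which is what the manual AND Categories:Community trick did by hand.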
You can download the code sample below:
Search By Category code sample