
Search Indexing Question

7 posts, 0 answered
  1. Dave
    20 posts
    Registered:
    10 Aug 2010
    10 Sep 2010
    We are implementing Sitefinity (3.7) for a client. As part of that we have been asked to investigate the option of using an external search engine like Google or Bing for their site search. I have done some research and believe this is possible, but I am not certain it is necessary or even desirable. Before we can make that call, I need to know whether the built-in Sitefinity search indexes data directly from the database or somehow indexes the rendered pages.

    The issue is that a reasonable amount of the content displayed on the site will come from an external database, so if the built-in indexing only sees what is in the Sitefinity database, it will not be sufficient. If we need to work with an external search provider (most likely Bing), then I will probably have some questions about the best point of integration, but I will wait until I hear back on this before spending any effort on that.

    Thanks,
    Dave Parker

  2. Ivan Dimitrov
    16072 posts
    Registered:
    09 Dec 2016
    10 Sep 2010
    Hi Dave,

    We use Lucene search engine. Out of the box there are two types of search providers - pages and content items. Both types use business objects to populate the search segment files that are later used by the SearchResult control. For content items we pass all metadata (title, publication and expiration dates, etc.) through API methods to the Lucene classes. The same applies to pages, but the content of a page is retrieved through the WebClient and HttpRequest classes. WebClient is used when you index an external page, which could be a physical file in the website or a page located on another host (say, sitefinity.com/default.aspx).
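
    To illustrate the page case, here is a rough sketch in Python (not Sitefinity code, and not the actual implementation): fetch the rendered HTML over HTTP and keep only the visible text, so whatever data source produced the markup ends up in the index. The names TextExtractor, extract_text, and fetch_rendered_page are made up for this example.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects visible text and skips the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the visible text of a rendered page as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fetch_rendered_page(url, timeout=10):
    """Download a page the way a crawler would (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

    The text returned by extract_text is what would be handed to the indexer; it does not matter whether the page was rendered from the CMS database or from some other source.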

    Another option is to create a custom provider that queries your external database and gets the items from there.
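
    As a rough sketch of the custom-provider idea (Python with an in-memory SQLite database standing in for the external database; this is not the Sitefinity API, and ExternalDbProvider, get_documents, and the products table are all hypothetical):

```python
import sqlite3


class ExternalDbProvider:
    """Hypothetical provider: turns rows of an external table into index documents."""

    def __init__(self, connection):
        self.connection = connection

    def get_documents(self):
        # One document per row; the id prefix keeps these items distinct
        # from items that come from other providers.
        rows = self.connection.execute(
            "SELECT id, title, body FROM products ORDER BY id"
        )
        for row_id, title, body in rows:
            yield {"id": f"external:{row_id}", "title": title, "content": body}


def demo_connection():
    """Build a small stand-in for the external database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO products (title, body) VALUES (?, ?)",
        [("Widget", "A small widget"), ("Gadget", "A large gadget")],
    )
    return conn


if __name__ == "__main__":
    for doc in ExternalDbProvider(demo_connection()).get_documents():
        print(doc["id"], doc["title"])
```

    The point is only that the provider, not the page renderer, decides which records become searchable documents.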

    Kind regards,
    Ivan Dimitrov
    the Telerik team
    Do you want to have your say when we set our development plans? Do you want to know when a feature you care about is added or when a bug is fixed? Explore the Telerik Public Issue Tracking system and vote to affect the priority of the items.
  3. Dave
    20 posts
    Registered:
    10 Aug 2010
    10 Sep 2010
    If I understand your answer correctly, you are saying "yes", the rendered pages will get indexed, regardless of what data is used to render them. Do I have that right?

    Thanks,
    Dave Parker

  4. Dave
    20 posts
    Registered:
    10 Aug 2010
    10 Sep 2010
    Hmmm... upon reading your response a few more times, I think maybe there is a different interpretation (please forgive my naiveté, the terminology associated with Sitefinity is new to me and I am not yet fully conversant in the architecture). Let me see if I can properly interpret your response line by line.

    - "We use Lucene search engine."
    Simple enough. The underlying search technology is provided by the open-source Lucene engine, which is part of the Apache project.  I presume you are using Lucene.net.

    - "Out of the box there are two types of search providers - pages and content items."
    When you say "search providers", I believe you are referring to something in the Telerik.Search namespace, like subclasses of the IndexDataProvider or IndexFieldsProvider classes.

    Question: Does the indexing happen in the background somehow?  Does Sitefinity start a separate process for this?  (as I don't think you can rely on it running to completion within the IIS AppPool).

    - "Both types use business objects to populate the search segment files that are later used by the SearchResult control."
    It sounds like these "providers" will enumerate collections of objects, possibly use reflection to get properties from them, and then create "search segment files" somewhere.  When the SearchResult control renders the results, it will pull the data from these (presumably XML) files.

    Before I interpret any more, am I somewhere close to on target?  Also, if there is some existing documentation, forum posts, videos, or whatever that will explain this to me, I am happy to learn this info myself, I just could not find anything to address my specific issue about whether the rendered pages get indexed or not.

    Thanks again,
    Dave Parker



  5. Ivan Dimitrov
    16072 posts
    Registered:
    09 Dec 2016
    12 Sep 2010
    Hello Dave,

    1. We get the data from the rendered page, no matter where the source data comes from. This applies only to the Pages index provider.

    2. Yes, we use Lucene.NET (the link I sent you is the same as the one you posted).

    3. Providers - Please take a look at the Provider Model. The search index works the same way, following the provider model.

    4. Indexing runs in the background only when you create, edit, or delete an item. In that case we update only the segment files, so the index is only partially updated.
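
    To illustrate point 4, here is a toy sketch in Python of a partial update: changing or deleting one item touches only that item's entries instead of rebuilding everything. This is a simplified in-memory inverted index, not how Lucene segment files actually work, and TinyIndex with its methods is invented for the example.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index that supports per-item updates."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> ids of items containing it
        self._doc_terms = {}               # item id -> terms it contributed

    def upsert(self, doc_id, text):
        """Create or edit an item: drop its old entries, then add the new ones."""
        self.delete(doc_id)
        terms = set(text.lower().split())
        self._doc_terms[doc_id] = terms
        for term in terms:
            self._postings[term].add(doc_id)

    def delete(self, doc_id):
        """Remove one item; entries for every other item are left untouched."""
        for term in self._doc_terms.pop(doc_id, set()):
            self._postings[term].discard(doc_id)
            if not self._postings[term]:
                del self._postings[term]

    def search(self, term):
        """Return the ids of items containing the term, in sorted order."""
        return sorted(self._postings.get(term.lower(), set()))
```

    Calling upsert when an item is created or edited and delete when it is removed keeps the index current without a full re-index.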

    Greetings,
    Ivan Dimitrov
    the Telerik team
  6. Dave
    20 posts
    Registered:
    10 Aug 2010
    13 Sep 2010
    Thank you for the clear reply.  I have a follow-on question, then.  I have already read about the provider model, along with other conceptual topics that talk about modules, services, search, etc.  What I don't have, though, is a mapping from those conceptual topics to the reference documentation and code.  Specifically, in order to replace or perhaps just sub-class and extend the search provider, what class, classes or interfaces do I work with?  Is there some place in the documentation that provides this tying of the conceptual topics to the physical code or reference docs?

    Thanks
    Dave Parker

  7. Ivan Dimitrov
    16072 posts
    Registered:
    09 Dec 2016
    13 Sep 2010
    Hi Dave,

    You can take a look at implementing_search_index and the API reference.

    Regards,
    Ivan Dimitrov
    the Telerik team