Lucene search and documents

3 posts, 1 answered
  1. Lucas
    65 posts
    22 Mar 2010
    21 Jul 2010

    I'm pretty sure it's not possible, but I can't find a definitive answer on the website: can the Lucene-based search index the content of documents (pdf, doc, ...) stored in a gallery or linked from the website?

    Or does it only index pages that are on the sitemap?
  2. Ivan Dimitrov
    16072 posts
    16 Jun 2017
    21 Jul 2010
    Hello Lucas,

    The post "In the mean time series (part 4): BONUS TRACK - "Products" module - implementing search index" shows how to create a standard provider for GenericContent-based modules like Images and Documents.
    Your provider can inherit directly from ContentIndexProvider, and you have to override:

    GetItemUrl - get the content item, its URL, and its culture if you use localization
    Initialize - initialize the index settings here


    Name - return the name of the provider
    Description - return a description
    ContentItemKeyImpl - return a search key
    ContentProviderKeyImpl - return a provider key
    GetDefaultContentProvider - get the default provider
    FilterExpression - return "Name='InVariantCulture';";

    If you want to search inside the files themselves, you need to create a custom provider and use a third-party library to extract their content. We have an API that lets you index content, but not one that extracts text from PDF files. You could use an open source library such as Apache PDFBox or iTextSharp.

    If you use PDFBox, you can load the stream with PDDocument.load(stream) and then call getText on a PDFTextStripper instance.

    Sincerely yours,
    Ivan Dimitrov
    the Telerik team
  3. Lucas
    65 posts
    22 Mar 2010
    22 Jul 2010
    Thanks a lot Ivan.

    We'll see whether it's worth the extra work in our case.
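
The extraction step suggested in Ivan's reply (turn a document stream into plain text, then hand that text to the index) could look roughly like the Java sketch below. It is only an illustration under assumptions: `TextExtractor`, `DocumentTextExtraction`, `register` and `extractForIndex` are hypothetical names, not Sitefinity or PDFBox API, and Sitefinity itself is .NET, so the custom provider would still have to be written there. The PDF extractor appears as a comment because it needs the PDFBox 2.x jar on the classpath; it uses the two calls named in the reply.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical abstraction: turns a document stream into indexable text.
interface TextExtractor {
    String extract(InputStream in) throws IOException;
}

// Hypothetical helper: picks an extractor by file extension.
class DocumentTextExtraction {
    private final Map<String, TextExtractor> extractors = new HashMap<>();

    DocumentTextExtraction() {
        // Plain text needs no third-party library.
        register("txt", in -> new String(in.readAllBytes(), StandardCharsets.UTF_8));
        // A PDF entry would wrap PDFBox 2.x (not compiled here, needs the jar):
        //   register("pdf", in -> {
        //       try (PDDocument doc = PDDocument.load(in)) {
        //           return new PDFTextStripper().getText(doc);
        //       }
        //   });
    }

    void register(String extension, TextExtractor extractor) {
        extractors.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    // Returns the text to pass to the indexer, or "" if no extractor is registered.
    String extractForIndex(String fileName, InputStream content) throws IOException {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        TextExtractor extractor = extractors.get(ext);
        return extractor == null ? "" : extractor.extract(content);
    }

    // Convenience overload for content that is already in memory.
    String extractForIndex(String fileName, byte[] content) throws IOException {
        return extractForIndex(fileName, new ByteArrayInputStream(content));
    }
}
```

Files with no registered extractor yield an empty string, so they can still be indexed by name and URL alone, which is what the standard provider described above already covers.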