Sitefinity 5.1 and Document Library Full Text Search

7 posts, 1 answered
  1. Michael
    67 posts
    Registered: 16 Mar 2012
    Posted: 14 Jun 2012
    I know that Sitefinity does not support searching document libraries out of the box, let alone full text search of their contents. So I am writing my own module (hopefully) to do this.

    Looking at some of the examples and documentation, it looks like I would want to use an inbound pipe to tie into the publishing system and generate the index information for each document as it is saved (using various libraries to parse the document and return its text contents). I think I have that mostly accomplished, but I'm missing something.

    I have created the module, and registered it. My initialize code looks like this:

    // Register the custom inbound pipe type under its pipe name.
    PublishingSystemFactory.RegisterPipe(DocumentSearchInboundPipe.PipeName, typeof(DocumentSearchInboundPipe));

    // Reuse the default inbound content mappings for this pipe.
    var mappingsList = PublishingSystemFactory.GetDefaultInboundMappingForContent();
    PublishingSystemFactory.RegisterPipeMappings(DocumentSearchInboundPipe.PipeName, true, mappingsList);

    // Start from the default content inbound settings and point them at the Document type.
    var pipeSettings = (SitefinityContentPipeSettings)PublishingSystemFactory.CreateDefaultContentInboundPipeSettings(DocumentSearchInboundPipe.PipeName);
    pipeSettings.ContentTypeName = typeof(Telerik.Sitefinity.Libraries.Model.Document).FullName;
    pipeSettings.UIName = "DocumentSearchInboundPipe";
    pipeSettings.PipeName = DocumentSearchInboundPipe.PipeName;
    pipeSettings.ResourceClassId = typeof(DocumentLibraryFullTextSearchModuleResources).Name;
    PublishingSystemFactory.RegisterPipeSettings(DocumentSearchInboundPipe.PipeName, pipeSettings);

    // Register the default content pipe definitions for this pipe.
    var definitions = PublishingSystemFactory.CreateDefaultContentPipeDefinitions();
    PublishingSystemFactory.RegisterPipeDefinitions(DocumentSearchInboundPipe.PipeName, definitions);

    // Add the pipe to the search index template so it is offered when creating an index.
    var contentPipeSettings = PublishingSystemFactory.GetPipeSettings(DocumentSearchInboundPipe.PipeName);
    contentPipeSettings.MaxItems = 0;
    PublishingSystemFactory.RegisterTemplatePipe("SearchItemTemplate", contentPipeSettings, ps => ps.PipeName == DocumentSearchInboundPipe.PipeName);

    The examples showed that last line as the one that adds my inbound pipe to the list of options when creating a search index, but I do not get any new options there. Can anyone shed some light on how I get this in there?

    Note: The InboundPipe I wrote essentially just puts the file's contents into each WrapperObject's Content field in the GetConvertedItemsForMapping method.

    If anyone has any comments about my methodology here (This is the first thing I've written for Sitefinity), especially if I have just completely misunderstood how I should be doing this, please let me know! Like, for instance, if this would actually parse out each file in the documents library on EVERY search, that would be kind of bad... ideally I want to cache that in the search index, and I want the index updated on publishing events (Which is what I THINK I'm doing here...).
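
    The "parse once, not on every search" idea in the last paragraph can be sketched roughly as below. This is only an illustration under stated assumptions: ExtractedTextCache and GetOrExtract are hypothetical names, not Sitefinity APIs, and in a real module the search index itself would normally hold the extracted text. The sketch re-runs the expensive extraction only when a document's last-modified timestamp changes.

    ```csharp
    using System;
    using System.Collections.Concurrent;

    // Hypothetical helper (not part of Sitefinity): remembers extracted text per
    // document and re-extracts only when the document's timestamp has changed.
    public sealed class ExtractedTextCache
    {
        private readonly ConcurrentDictionary<Guid, Tuple<DateTime, string>> entries =
            new ConcurrentDictionary<Guid, Tuple<DateTime, string>>();

        // extract is whatever parser turns the file into plain text.
        public string GetOrExtract(Guid documentId, DateTime lastModified, Func<string> extract)
        {
            Tuple<DateTime, string> cached;
            if (entries.TryGetValue(documentId, out cached) && cached.Item1 == lastModified)
            {
                return cached.Item2; // unchanged since last extraction
            }

            var text = extract();
            entries[documentId] = Tuple.Create(lastModified, text);
            return text;
        }
    }
    ```

    Calling GetOrExtract from the code path that runs on publish would keep the parsing cost tied to saves rather than to searches.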
  2. Michael
    67 posts
    Registered: 16 Mar 2012
    Posted: 15 Jun 2012
    Ok, I think I got it now. I wasn't registering the Resources file, and I think it wasn't being pulled in right because of that.

    Curiously, it shows up when Creating an index as "Documents", but shows up when Editing an index as "Text documents"... 

    But otherwise, I now have a working module that handles indexing the full text of documents uploaded.

    The only thing I might want to add for that is to allow for some configuration of what libraries to watch. Can anyone make a suggestion on how one might do that? Ideally through the admin interface instead of config files?
  3. Steve
    3037 posts
    Registered: 03 Dec 2008
    Posted: 15 Jun 2012
    This should be coming out of the box with 5.1 in July though?
    Answered
  4. Michael
    67 posts
    Registered: 16 Mar 2012
    Posted: 15 Jun 2012
    Oh. That's cool. Wish I'd seen that before, but that's OK, I needed this right now anyway.
  5. Steve
    3037 posts
    Registered: 03 Dec 2008
    Posted: 15 Jun 2012
    I know, I'm always dancing that line of...do it now even though it'll be here later, or just wait on it...
  6. Michael
    67 posts
    Registered: 16 Mar 2012
    Posted: 15 Jun 2012
    Hmm, doesn't look like they are doing Word docs though? I'm just scanning the OpenXML ones (docx), but still, I figured they would include that. I also wonder what library they are using for PDF reading. I'm using an old copy of iTextSharp (from before the fork to the new licensing scheme).
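
    For readers wondering what "scanning the OpenXML ones" can look like, here is a minimal sketch. It is not the module's actual code: DocxTextExtractor is a hypothetical name, and it assumes .NET 4.5 or later for System.IO.Compression.ZipArchive. A .docx file is a zip package whose main body lives in word/document.xml, so the visible text can be collected from the w:t elements of each w:p paragraph.

    ```csharp
    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Linq;
    using System.Xml.Linq;

    // Hypothetical helper: pulls plain text out of a .docx stream.
    public static class DocxTextExtractor
    {
        // WordprocessingML main namespace used inside word/document.xml.
        private static readonly XNamespace W =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

        // Returns the document text, one line per paragraph.
        public static string ExtractText(Stream docxStream)
        {
            using (var archive = new ZipArchive(docxStream, ZipArchiveMode.Read, true))
            {
                var entry = archive.GetEntry("word/document.xml");
                if (entry == null)
                {
                    return string.Empty; // not a Word package
                }

                using (var entryStream = entry.Open())
                {
                    var doc = XDocument.Load(entryStream);
                    // Join the text runs inside each paragraph, then join paragraphs.
                    var paragraphs = doc.Descendants(W + "p")
                        .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)));
                    return string.Join(Environment.NewLine, paragraphs);
                }
            }
        }
    }
    ```

    This ignores headers, footers, footnotes, and nested text boxes (text inside a text box may appear twice), so it is a starting point rather than a complete extractor. The older binary .doc format is not a zip package and needs a different parser.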

    Anyway, this has been a good exercise for me. I'll need it still for indexing Word docs probably, and I will also need to write something similar to index Video libraries (after I figure out how to add a custom field for choosing a document for captions).

    I'm still interested in whether there is a way to change the interface on the Search Index screens to allow choosing a library or a collection of library folders, but perhaps that isn't possible...
  7. Inna
    97 posts
    Registered: 08 Jun 2012
    Posted: 05 Aug 2013, in reply to Steve
    Steve, I have version 6.1 and the Documents check box is checked in the index configuration. Yet none of my PDFs are being searched. What do they mean by "documents" then? Not the documents that are stored in the libraries?