Does Search index PDFs and DOC files?

36 posts, 0 answered
  1. David Willis
    1 post
    Registered:
    01 Aug 2004
    26 Oct 2007
    Can Search index PDF and DOC files?
  2. Pepi
    981 posts
    Registered:
    28 Oct 2016
    29 Oct 2007
    Hello David,

    For the time being, Search doesn't support indexing .pdf and .doc files. We plan to implement this functionality as we extend the Search module in future releases.

    Don't hesitate to ask if you come up with other questions.

    Greetings,
    Pepi
    the Telerik team

    Instantly find answers to your questions at the new Telerik Support Center
  3. mexner
    98 posts
    Registered:
    06 Apr 2006
    03 Apr 2008
    With SF 3.2, SP1, I noticed in the web.config a section for "Indexers". Does this provide the ability to search/index .doc and .pdf?

    <indexers>
        <add extensionOrMimeType=".aspx" type="Telerik.Search.Engine.HtmlIndexer"/>
        <add extensionOrMimeType="text/html" type="Telerik.Search.Engine.HtmlIndexer"/>
    </indexers>


    Is there a "type" we can currently provide to search .pdf?
    Thanks!
  4. Georgi
    3583 posts
    Registered:
    28 Oct 2016
    07 Apr 2008
    Hi Chris,

    Our search engine still doesn't support indexing DOC and PDF files. We plan to implement this functionality, and your assumption about the <indexers> section is correct: when the spiders that index DOC and PDF files are ready, they will be added to this section.

    Let us know if there is anything else you would like to know.

    Greetings,
    Georgi
    the Telerik team

  5. Meister
    262 posts
    Registered:
    09 Nov 2007
    30 Apr 2008
    Hi

    Can you let me know when you are planning to support indexing of PDF files please?
  6. SelAromDotNet
    912 posts
    Registered:
    18 Jul 2012
    30 Apr 2008
    do you mean that it will actually index the CONTENTS of these documents? that's crazy awesome :) but what if they're not in the library, but rather just linked and stored physically in a location on the server. would they get indexed or is this only for library files?

    thanks!
  7. Ivan
    478 posts
    Registered:
    16 Jun 2015
    01 May 2008
    Hi SelArom,

    This is among the more important features we want to implement, but it will surely not be available in the upcoming Service Pack.

    The reason for the indexers section in web.config is that, like anything else in Sitefinity, search indexers can be developed and implemented by clients. For example, if you develop a module with complex data and want to control how it is searched, you can implement your own search logic, and this section is where you would register it. Unfortunately, the exact process and the tutorials for this are not yet available, and it's not exactly a trivial thing to do.
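To illustrate the registration idea only (this is not Sitefinity code), here is a minimal Python sketch of a registry keyed by file extension or MIME type, mirroring the extensionOrMimeType attribute. The classes `HtmlIndexer` and `PlainTextIndexer` and the functions `register_indexer` and `resolve_indexer` are hypothetical stand-ins; the real indexer interface is not documented in this thread.

```python
import re
from typing import Dict, Optional


class HtmlIndexer:
    """Hypothetical stand-in for an HTML indexer: strips tags, keeps text."""

    def extract_text(self, raw: str) -> str:
        text = re.sub(r"<[^>]+>", " ", raw)
        return " ".join(text.split())


class PlainTextIndexer:
    """Hypothetical client-written indexer for a custom content type."""

    def extract_text(self, raw: str) -> str:
        return " ".join(raw.split())


# Registry keyed by extension or MIME type, like the <indexers> section.
INDEXERS: Dict[str, object] = {
    ".aspx": HtmlIndexer(),
    "text/html": HtmlIndexer(),
}


def register_indexer(extension_or_mime_type: str, indexer: object) -> None:
    """Add or replace the indexer used for one extension or MIME type."""
    INDEXERS[extension_or_mime_type.lower()] = indexer


def resolve_indexer(extension_or_mime_type: str) -> Optional[object]:
    """Return the registered indexer, or None if the type is unsupported."""
    return INDEXERS.get(extension_or_mime_type.lower())


# A client registers its own indexer for a type that is not built in.
register_indexer(".txt", PlainTextIndexer())
```

In this model, a lookup for an unregistered type such as ".pdf" returns None, which corresponds to the current situation where no PDF indexer exists.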

    To conclude: once implemented, the PDF and DOC search will search the contents of a file. The file will not need to be in the Images & Documents library. Finally, we don't have a fixed date for this release, but we are working on it.

    Sincerely yours,
    Ivan
    the Telerik team

  8. Jeff Mah
    31 posts
    Registered:
    25 Feb 2005
    19 Dec 2008
    What's the current status on indexing PDF and DOC files? Are you able to share a probable release date?
  9. Georgi
    3583 posts
    Registered:
    28 Oct 2016
    19 Dec 2008
    Hello Jeff Mah,

    This is still not implemented. We are trying to include it in the task list for 4.0, which is expected in April or May.

    Best wishes,
    Georgi
    the Telerik team

    Check out Telerik Trainer, the state of the art learning tool for Telerik products.
  10. Tony
    23 posts
    Registered:
    12 Sep 2008
    23 Jan 2009
    But I want the feature now :P

    Suggestion: A sign-up for a digest email on a per-feature basis, so that once a week/month/whatever we get an update on the features we want to sell to clients, would be great! (Then I could be sure I was on top of the latest info to tell my clients.)
  11. Georgi
    3583 posts
    Registered:
    28 Oct 2016
    26 Jan 2009
    Hi Tony,

    We are already working on a similar solution: you will be able to track all issues and feature requests directly on our website. I still cannot commit to a time frame for it, though.

    Best wishes,
    Georgi
    the Telerik team

  12. martin
    43 posts
    Registered:
    10 Feb 2009
    24 Feb 2009
    Hi,

    I appreciate that being able to index the content of uploaded PDFs and files would/will be great... I'm going to ask a question about searching for a file just by its name.

    I'm running a "trial" version of Sitefinity (prior to purchase) and have created a document library within the "images & documents" module. I have uploaded a PDF and a .rtf into that library. I have created a search index and a search page. I can successfully search for general text held within general pages (including news, events and lists) - but it seems to totally fail to search/find these test files I've built into the mock-up download page.

    Am I missing something very obvious? Does Sitefinity currently (out-of-the-box) search these files too, even if only by file name? Because if it doesn't... it's going to become pretty useless for our client's needs and an alternative CMS will have to be used.

    Any help greatly received,
    Martin.
  13. SelAromDotNet
    912 posts
    Registered:
    18 Jul 2012
    24 Feb 2009
    I'm not sure how it works out-of-the-box, but the search index is actually very extensible! you can use it to index any content you wish, all you have to do is add an index provider...

    take a look at this forum post: http://www.sitefinity.com/support/forums/support-forum-thread/b1043S-becdhd.aspx

    there's an example project that you can build on. In my blog, I used it to index content from my events module into my site search results. the details are here:
    http://www.selarom.net/blog/2009-01-23/Sitefinity_Index_and_Search_Events.aspx

    it would probably not be too difficult to add the filenames (and links to the files) to the index using this method...
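To make the filename idea concrete, here is a minimal sketch in Python (not Sitefinity code, and not the index provider from the linked example project) of indexing documents by file name only. The helpers `tokenize_filename`, `build_filename_index`, and `search`, and the sample paths, are all hypothetical.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Set


def tokenize_filename(path: str) -> List[str]:
    """Split a file name (without its extension) into lowercase tokens."""
    stem = PurePosixPath(path).stem
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", stem) if t]


def build_filename_index(paths: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each file-name token to the set of paths that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for path in paths:
        for token in tokenize_filename(path):
            index[token].add(path)
    return index


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return paths whose file names contain every word in the query."""
    words = [w.lower() for w in query.split()]
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


# Hypothetical library contents; only the names are indexed, not the text.
library = ["/docs/Annual_Report_2008.pdf", "/docs/price-list.rtf"]
filename_index = build_filename_index(library)
```

Each hit is the path itself, so a results page could link straight to the file. The contents of the files are never read, which is the limitation discussed throughout this thread.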

    I hope this was helpful!
  14. martin
    43 posts
    Registered:
    10 Feb 2009
    24 Feb 2009
    Hi SelArom,

    Thank you for your reply. I'm sure your information will indeed be helpful. Unfortunately, as it currently stands, I'm more of a web "designer" than an ASP.NET "coder" and thus on a VERY steep learning curve. I shall read through your links and see what I can make from it.

    I'm pleased to see how flexible Sitefinity is, but a lot of that flexibility does seem to involve a lot of 'back-end' coding (as opposed to working within the Admin interface).

    On a similar note (still talking about the "images & documents" module), I've noticed a funny thing happening with my test download files. I can download them OK, but the file in question doesn't end in .pdf; it ends in .sflb.ashx and so isn't recognized by the OS. If I then manually replace .sflb.ashx with .pdf, it opens just fine.

    Both bizarre and annoying, and it will need to be sorted prior to a "real" site being created.

    Regards,
    m.
  15. Georgi
    3583 posts
    Registered:
    28 Oct 2016
    25 Feb 2009
    Hello,

    Josh, thank you for providing this information to Martin. As usual, we appreciate your help in the forums!

    Martin, we are already working on PDF and DOC indexing functionality, and it will be available with the 4.0 version. As for the Images and Documents module, if you are using the 3.6 Hotfix version, you should have no trouble with the download extensions. Although the file names end with strange extensions, the files should be downloaded with the proper ones. In the hotfix version, it is even possible to use the real file extensions in the URLs. More information can be found in the KB article "How to use the real extensions for the items in the Images and Documents module".

    I hope this helps as well.

    Sincerely yours,
    Georgi
    the Telerik team

    Instantly find answers to your questions on the new Telerik Support Portal.
    Check out the tips for optimizing your support resource searches.
  16. Venkat
    105 posts
    Registered:
    20 Nov 2008
    03 Aug 2009
    The link provided in this post is not working.

    Can anyone please check this?

    Thanks
    Venkat.
  17. Georgi
    3583 posts
    Registered:
    28 Oct 2016
    03 Aug 2009
    Hi Venkat,

    Can you please let us know which link you are referring to? There are several links on the page, and I did not find a broken one.

    Kind regards,
    Georgi
    the Telerik team

  18. Venkat
    105 posts
    Registered:
    20 Nov 2008
    03 Aug 2009
    Hi George,

    Thank you for your reply.

    I tried it this morning and the link below did not work at that time, but it's working now.

    http://www.selarom.net/blog/2009-01-23/Sitefinity_Index_and_Search_Events.aspx

    regards
    -Venkat.
  19. Georgi
    3583 posts
    Registered:
    28 Oct 2016
    03 Aug 2009
    Hi Venkat,

    Thank you for the follow up. I am glad that the resource is accessible now.

    Regards,
    Georgi
    the Telerik team

  20. Shanti Boyanapalli
    1 post
    Registered:
    16 Oct 2009
    27 Oct 2009
    Hi
    I would like to know when the functionality for searching through images and documents will be released.

  21. Nikolai
    216 posts
    Registered:
    21 Nov 2016
    27 Oct 2009
    Hello Shanti Boyanapalli,

    We will try our best to add this functionality in the official Sitefinity 4.0 release.

    Greetings,
    Nikolai
    the Telerik team

    Instantly find answers to your questions on the new Telerik Support Portal.
    Watch a video on how to optimize your support resource searches and check out more tips on the blogs.
  22. Jake
    1 post
    Registered:
    28 Feb 2011
    28 Feb 2011
    Can anyone verify that this functionality was included in the 4.0 build? The post on Feb 25 made it sound like it was to be added, but I can't seem to find any confirmation in the 4.0 features documentation. 

    Thanks in advance! 
    Jake
  23. Ivan Dimitrov
    16072 posts
    Registered:
    25 Nov 2016
    01 Mar 2011
    Hello Shanti,

    In Sitefinity 4.0 the content of PDF and DOC files is not added to the index. We will not be able to implement this before Q2.

    All the best,
    Ivan Dimitrov
    the Telerik team
    Registration for Q1 2011 What’s New Webinar Week is now open. Mark your calendar for the week starting March 21st and book your seat for a walk through all the exciting stuff we ship with the new release!
  24. Thomas Brooke
    6 posts
    Registered:
    21 Apr 2010
    23 Aug 2011
    Hi,

    Any update on this? Or any alternative suggestions (e.g. Marketplace)?

    This is a real must for my client and would be a serious blow to the project if we can't do it.

    Using SF 4.2.

    Thanks,
    Thom
  25. Stanislav Velikov
    1113 posts
    Registered:
    30 Nov 2016
    26 Aug 2011
    Hi Thomas,

    Sitefinity still doesn't support searching document contents. Documents are searched by the name of the content item.
    Excuse us for the inconvenience.

    Kind regards,
    Stanislav Velikov
    the Telerik team

    Thank you for being the most amazing .NET community! Your unfailing support is what helps us charge forward! We'd appreciate your vote for Telerik in this year's DevProConnections Awards. We are competing in mind-blowing 20 categories and every vote counts! VOTE for Telerik NOW >>

  26. Ryan
    2 posts
    Registered:
    26 Aug 2011
    26 Aug 2011
    Lack of PDF content indexing is a show stopper for sure.

    Can Sitefinity Search read other indexes?

    What other alternatives are there?

  27. Stanislav Velikov
    1113 posts
    Registered:
    30 Nov 2016
    31 Aug 2011
    Hello Ryan,

    For customized search you have to implement a custom pipe. You can see how the TxtDocumentSearchInboundPipe is implemented in the sample project from the Publishing system brief walkthrough. It creates a pipe that pushes items into the search index when they are published. The only requirement is that the items are published through the fluent API. You have to implement PushData and ToPublishingPoint; those methods actually put the item into the publishing point. Note that you also have to add some settings and mappings, as explained in Registering custom pipes in Sitefinity.
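The flow described above can be sketched conceptually as follows. This is a toy model in Python, not the Sitefinity publishing API: the class names, the snake_case methods `push_data` and `to_publishing_point` (loosely modeled on PushData and ToPublishingPoint), their signatures, and the field names in the mapping are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class PublishingPoint:
    """Toy stand-in for the store that the search index reads from."""

    items: List[Dict[str, str]] = field(default_factory=list)

    def add(self, item: Dict[str, str]) -> None:
        self.items.append(item)


class TxtDocumentInboundPipe:
    """Toy inbound pipe: forwards published documents to a publishing point."""

    def __init__(self, point: PublishingPoint) -> None:
        self.point = point

    def push_data(self, documents: Iterable[Dict[str, str]]) -> None:
        """Called when documents are published; forwards each one."""
        for doc in documents:
            self.to_publishing_point(doc)

    def to_publishing_point(self, doc: Dict[str, str]) -> None:
        """Map a document onto the (assumed) index fields and store it."""
        self.point.add(
            {"Title": doc["name"], "Content": doc["text"], "Link": doc["url"]}
        )


# Publishing one hypothetical text document pushes it toward the index.
point = PublishingPoint()
pipe = TxtDocumentInboundPipe(point)
pipe.push_data([{"name": "notes.txt", "text": "meeting notes", "url": "/docs/notes.txt"}])
```

The mapping step is where extracted file text would go in a real pipe, so a PDF or DOC pipe would additionally need a text extractor for those formats.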

    Regards,
    Stanislav Velikov
    the Telerik team


  28. shae
    30 posts
    Registered:
    26 Jul 2012
    08 Feb 2012
    Has this functionality been added yet?
  29. Stanislav Velikov
    1113 posts
    Registered:
    30 Nov 2016
    08 Feb 2012
    Hello,

     It is not available yet.

    All the best,
    Stanislav Velikov
    the Telerik team
    Do you want to have your say in the Sitefinity development roadmap? Do you want to know when a feature you requested is added or when a bug is fixed? Explore the Telerik Public Issue Tracking system and vote to affect the priority of the items.
  30. shae
    30 posts
    Registered:
    26 Jul 2012
    08 Feb 2012
    It's been over 4 years since the first mention of this. Is it on the 5 year plan?
