In the meantime, the blog post series (part 4): BONUS TRACK - "Products" module - implementing search index
shows how to create a standard provider for GenericContent-based modules such as Images and Documents.
Your provider can inherit directly from ContentIndexProvider, and you have to override the following members:
GetItemUrl - gets the content item, its URL, and its culture if you use localization
Initialize - initializes the index settings
Name - returns the name of the provider
Description - returns a description of the provider
ContentItemKeyImpl - returns a search key
ContentProviderKeyImpl - returns a provider key
GetDefaultContentProvider - gets the default provider
FilterExpression - return "Name='InVariantCulture';";
If you want to search inside the content of the content items, you need to create a custom provider and use a third-party framework to extract the content of the files. We have an API that lets you index the content, but we do not have an API that extracts the content from PDF files. You could use an open source library such as Apache PDFBox.
If you use PDFBox, you can read the stream with PDDocument.load(stream); and then call getText on a PDFTextStripper instance.
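The PDFBox step described above could look roughly like the following Java sketch. It assumes PDFBox 2.x on the classpath, where PDFTextStripper lives in org.apache.pdfbox.text; in 1.x it is in org.apache.pdfbox.util, and 3.x replaces PDDocument.load with Loader.loadPDF. The class and method names (PdfTextExtractor, extractText) are hypothetical and are not part of PDFBox or Sitefinity.

```java
import java.io.IOException;
import java.io.InputStream;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical helper (name chosen for illustration only).
public class PdfTextExtractor {

    // Reads a PDF from the given stream and returns its plain text.
    public static String extractText(InputStream stream) throws IOException {
        // PDDocument is Closeable, so try-with-resources releases it
        // even when text extraction throws.
        try (PDDocument document = PDDocument.load(stream)) {
            PDFTextStripper stripper = new PDFTextStripper();
            return stripper.getText(document);
        }
    }
}
```

The returned string is what a custom index provider would then pass to the indexing API. From a .NET project, PDFBox is usually consumed through a converted assembly, so the same two calls apply but the surrounding syntax differs.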
the Telerik team