Wildcard searches

20 posts, 0 answered
  1. IT
    52 posts
    Registered: 30 Sep 2008
    10 Dec 2008
    I can't seem to find any information on this... can any wildcards be used in searching?  I've tried a couple things to no avail...

    If it helps at all, I've built a custom search control using
    SearchManager.Search(HttpUrlQuery.Current["SearchQuery"],HttpUrlQuery.Current["IndexCatalogue"],0,5,null,out totalitems)

    While I'm on the subject, what's the fifth parameter for?  The documentation I have doesn't show this signature.

    Thanks!
  2. valio b
    33 posts
    Registered: 24 Sep 2012
    11 Dec 2008
    Hello Michael,

    There are some limitations when it comes to wildcard search in Sitefinity (as it is out of the box).
    For example, *term will match startterm and finishterm;
    term* will match termStart and termFinish.
    But I could not get something like t*rm to match term, for example.

    Using this signature
    Search(string searchQuery, string indexName, int startIndex, int max, string mode, out int totalItems) 
    the 5th parameter is mode. When mode is "AnyWord", the default operator for the search query is set to OR, so the query is split with OR between the terms. For example, the query dogs cats is parsed as dogs OR cats.
    If you pass anything else (including null) as mode, the default operator will be AND ("AllWords").
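
    As a rough illustration of the mode behaviour described above, here is a minimal, self-contained sketch. It is not Sitefinity's implementation: the class and method names are hypothetical, and whether the real comparison against "AnyWord" is case-sensitive is an assumption.

```csharp
using System;

public static class QueryModeSketch
{
    // Hypothetical helper: shows how the mode argument is described to affect
    // the operator placed between the terms of a query.
    public static string ApplyDefaultOperator(string searchQuery, string mode)
    {
        // Only "AnyWord" selects OR; anything else (including null) means AND ("AllWords").
        // Assumption: the comparison is exact and case-sensitive.
        string op = mode == "AnyWord" ? "OR" : "AND";

        // Split on spaces and drop empty entries so repeated spaces do not produce empty terms.
        string[] terms = searchQuery.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" " + op + " ", terms);
    }
}
```

    With this sketch, ApplyDefaultOperator("dogs cats", "AnyWord") yields dogs OR cats, while passing null as mode yields dogs AND cats.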

    Hope that makes it a bit clearer!

    Greetings,
    valio b
    the Telerik team

    Check out Telerik Trainer, the state of the art learning tool for Telerik products.
  3. IT
    52 posts
    Registered: 30 Sep 2008
    11 Dec 2008
    Do I have to enable something to allow for wildcards?  It doesn't seem that using a * is working for me.  In fact, I just did a search on your forums for "searches" which pulled up my post, but if I search for "searche*" it doesn't find it. Maybe you're getting these results in an unreleased build?

    I would like to add that the fieldsInfoProvider.xml works GREAT for managing the indexing (excluding sections and adding weight to others). 
  4. valio b
    33 posts
    Registered: 24 Sep 2012
    11 Dec 2008
    Hello Michael,
    Actually, what you have to do is disable the escaping of special characters in your ResultBox control.
    It's under Validation->EscapeSpecialCharacters. This stopped the escaping of characters like '*', and I managed to get t*rm to match term, and more, e.g. te?m.
    The telerik/search node in web.config specifies the patterns that will be blanked from your query when EscapeSpecialCharacters is on:
    <searchInputValidation>
        <add matchPattern="\A[\*\?\~][^\s]*" replacementString="" enabled="true" matchAlert="You can not start your query using wildcards, modify your query and try again.." />
        <add matchPattern="[\!\^\(\)\{\}\[\]]" replacementString="" enabled="true" matchAlert="Strange characters should be enclosed between double quotation, modify your query and try again.." />
    </searchInputValidation>

    You can modify these as you see fit of course.
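
    For readers unsure what t*rm and te?m are expected to match once escaping is off, the sketch below spells out the usual wildcard semantics (* for any run of characters, ? for exactly one). It is an illustration only: the class name is hypothetical, and it translates the pattern to a .NET regular expression rather than calling the search engine.

```csharp
using System.Text.RegularExpressions;

public static class WildcardSketch
{
    // Returns true if the whole word matches the wildcard pattern.
    // '*' stands for zero or more characters, '?' for exactly one character.
    public static bool Matches(string pattern, string word)
    {
        // Escape everything first, then turn the escaped wildcards back into regex operators.
        string regex = "^" + Regex.Escape(pattern).Replace(@"\*", ".*").Replace(@"\?", ".") + "$";
        return Regex.IsMatch(word, regex, RegexOptions.IgnoreCase);
    }
}
```

    Under these semantics, Matches("t*rm", "term") and Matches("te?m", "term") are both true, while Matches("te?m", "tem") is false because ? requires exactly one character.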

    BTW, we are going to include some more information regarding the use of the SearchManager.Search() methods.

    Hope that helps!

    Best wishes,
    valio b
    the Telerik team

  5. IT
    52 posts
    Registered: 30 Sep 2008
    11 Dec 2008
    I'm getting close here.  It works just as you said if I use the built-in result control.  However, in my custom control, it seems to ignore the web.config.  Now I did see that there's another signature that takes a bool for escapeChars.  When I set this to false, it works!  However, when I set that to true, it doesn't.  I expected it to use the web.config settings if I set it to true.  Am I understanding that correctly?

    Thanks!
  6. IT
    52 posts
    Registered: 30 Sep 2008
    11 Dec 2008
    One more thing - is it "by design" that the summaries (ResultItem.Snippet) show just as an ellipsis when you do a wildcard search?  It seems to work this way on the built-in control as well.
  7. valio b
    33 posts
    Registered: 24 Sep 2012
    12 Dec 2008
    Hi Michael,

    Please stay with me on this one. I hope I get it right this time.
    The web.config patterns get used when you switch the escaping of characters off on the ResultBox.

    This causes the input query to be validated against the regular expressions in the web.config. If any of the regexes matches, the input is deemed invalid and you should get the corresponding message (e.g. "You can not start your query using wildcards, modify your query and try again..").
    If the input is valid, a call follows to the SearchManager.Search methods.
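
    The validation step described here can be approximated with the sketch below. The two patterns and alert messages are copied from the searchInputValidation node quoted earlier in the thread; the class, the method, and the "first matching rule wins" order are assumptions, not Sitefinity's actual code.

```csharp
using System.Text.RegularExpressions;

public static class InputValidationSketch
{
    // Each rule is { matchPattern, matchAlert }, taken from the web.config fragment in this thread.
    private static readonly string[][] Rules =
    {
        new[] { @"\A[\*\?\~][^\s]*",
                "You can not start your query using wildcards, modify your query and try again.." },
        new[] { @"[\!\^\(\)\{\}\[\]]",
                "Strange characters should be enclosed between double quotation, modify your query and try again.." },
    };

    // Returns the alert of the first rule the query matches, or null when the query is valid.
    // Assumption: rules are checked in the order they are listed.
    public static string Validate(string query)
    {
        foreach (string[] rule in Rules)
        {
            if (Regex.IsMatch(query, rule[0]))
            {
                return rule[1];
            }
        }
        return null;
    }
}
```

    Under these two patterns, term* and t*rm pass validation, while *term is rejected because the first pattern only looks at the start of the query.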

    Moreover, if this call is made with escapeChars set to TRUE, special chars like "+,*,^, ..." get blanked (in code).

    So for example, if you use this method:
    public static IList<ResultItem> Search(string searchQuery, string indexName, int startIndex, int max, string mode, bool escapeChars, out int totalItems) 
    and escapeChars is TRUE, special chars like "+,-,(,),:,^,~,*,?,!, ..." will be blanked in code.
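
    To see why a wildcard query stops working when escapeChars is TRUE, consider the following sketch. It is only an approximation: the character set contains just the characters listed above (the list ends with "...", so the real set is unknown), "blanked" is assumed to mean "removed", and the class and method names are hypothetical.

```csharp
using System.Text;

public static class EscapeCharsSketch
{
    // Only the characters named in the post; the real list is longer and not known here.
    private const string SpecialChars = "+-():^~*?!";

    // Assumption: "blanked" means the character is removed from the query.
    public static string Blank(string query, bool escapeChars)
    {
        if (!escapeChars)
        {
            return query; // wildcards reach the search engine untouched
        }

        var result = new StringBuilder(query.Length);
        foreach (char c in query)
        {
            if (SpecialChars.IndexOf(c) < 0)
            {
                result.Append(c);
            }
        }
        return result.ToString();
    }
}
```

    With escapeChars set to true, searche* becomes searche and t*rm becomes trm, so the engine never sees the wildcard; with false, the query passes through unchanged.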

    Hope it makes a bit more sense now.

    Greetings,
    valio b
    the Telerik team

  8. IT
    52 posts
    Registered: 30 Sep 2008
    12 Dec 2008
    Okay, I've got it now.  It does seem to be working as you explained.

    How about my follow-up question: is it "by design" that the summaries (ResultItem.Snippet) show just as an ellipsis when you do a wildcard search?  It seems to work this way on the built-in control as well.

    Thanks!
  9. valio b
    33 posts
    Registered: 24 Sep 2012
    15 Dec 2008
    Hello Michael,

    Using ellipses is the design decision made for displaying the summaries.
    (Please note that the answer to your additional question was not intentionally omitted from my last reply.)

    Regards,
    valio b
    the Telerik team

  10. IT
    52 posts
    Registered: 30 Sep 2008
    15 Dec 2008
    I want to make sure that I am clear in my understanding of the wildcard search:

    When doing a wildcard search, there will be no summaries for the matches.  Is that correct?  If so, why would that be the case, and is there any way around it?
  11. valio b
    33 posts
    Registered: 24 Sep 2012
    17 Dec 2008
    Hi Michael,

    In the current version of Sitefinity, wildcard searches will not have summaries. I have just modified this, so I will wait for approval from my superiors and QA, and hopefully it will be in the next release. We are also considering providing a way to custom-format the ResultItem summaries, maybe through Customizable Templates. We are also open to suggestions.

    Best wishes,
    valio b
    the Telerik team

  12. IT
    52 posts
    Registered: 30 Sep 2008
    17 Dec 2008
    I'm really happy to hear that we will (soon) get the summaries with wildcards. It didn't make sense to me that I'd get titles back but no summary. 

    As far as formatting - that would just be gravy.  I'm not really sure what the best way to approach the summary formatting might be. Maybe something along the lines of 'y' groups of text separated by ellipses, where each group contains 'n' characters before and after the matched word.  There would need to be some smarts in there so that sections don't overlap and so forth.  I could see it getting a little complicated. Just an idea - but I would be happy just having the current summaries with wildcards.
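
    The idea above (up to 'y' groups, 'n' characters of context on each side, overlapping groups merged) could look roughly like the sketch below. It is purely illustrative and not part of Sitefinity: all names are hypothetical, and it searches for a literal term, so a wildcard query would first have to be resolved to the words it actually matched.

```csharp
using System;
using System.Collections.Generic;

public static class SnippetSketch
{
    // Builds a summary from up to maxGroups excerpts, each holding contextChars
    // characters before and after an occurrence of term, joined by " ... ".
    public static string Build(string text, string term, int contextChars, int maxGroups)
    {
        var windows = new List<int[]>(); // each entry is { start, end }, end exclusive
        int pos = 0;
        while (term.Length > 0 &&
               (pos = text.IndexOf(term, pos, StringComparison.OrdinalIgnoreCase)) >= 0)
        {
            int start = Math.Max(0, pos - contextChars);
            int end = Math.Min(text.Length, pos + term.Length + contextChars);

            if (windows.Count > 0 && start <= windows[windows.Count - 1][1])
            {
                windows[windows.Count - 1][1] = end; // overlaps the previous excerpt: extend it
            }
            else if (windows.Count < maxGroups)
            {
                windows.Add(new[] { start, end });
            }
            else
            {
                break; // enough excerpts collected
            }
            pos += term.Length;
        }

        var parts = new List<string>();
        foreach (int[] w in windows)
        {
            parts.Add(text.Substring(w[0], w[1] - w[0]));
        }
        return string.Join(" ... ", parts.ToArray());
    }
}
```

    Cutting on character counts can split words at the excerpt edges; snapping the boundaries to whitespace would be one obvious refinement.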

    Thanks Valio
  13. Mike
    208 posts
    Registered: 10 Dec 2007
    12 Jan 2009
    Are you saying that there is no current way in SF 3.5 to customize the search results?  I have a products module that is being indexed and returning results, but I want to customize how they're displayed.  Out of the box, there are things in the template like:

    <%#DataBinder.Eval(Container.DataItem, "Title")%>

    What are the available fields in the search results?  I had thought I could access any field specified in the fieldsInfoProvider.xml file--is this not the case?

    Thanks,
    Mike Sharp
  14. Georgi
    3583 posts
    Registered: 28 Oct 2016
    14 Jan 2009
    Hello Mike,

    Here is what fields the DataItem contains:
    • Title - Title of the result
    • Snippet - The summary goes here
    • Url - Url of the result
    • Score - Search result score, used for ranking
    These are the only items so far. We may add more in future versions. While these fields are part of the DataItem, the fieldsInfoProvider.xml is responsible for the actual HTML content. That said, by tweaking the fieldsInfoProvider.xml, you can access and give a different weight to every field/tag from the spidered HTML.
    Kind regards,
    Georgi
    the Telerik team

  15. Mike
    208 posts
    Registered: 10 Dec 2007
    14 Jan 2009

    Thanks Georgi,

    The thing that is weird is that the content that is being returned in the snippet does not appear anywhere in the page source.  I don't see how this is possible, if indeed the crawler is spidering the page itself as it is rendered.  It must be using the API to crawl the content.  The problem for me is that there may be metafields associated with the module content that I don't want displayed, but if it's not in the HTML, I don't see how I can filter it using fieldsInfoProvider.xml.

    If it is indeed using the module API, it seems like there should be some way I could control what is exposed, but I can't find much documentation on how to implement indexing from an API viewpoint, other than that "in the meantime" blog entry.

    So, what I've done "in the meantime" for myself, although it's terribly inefficient, is to parse the URL of the search results, and then return the content item itself.  Then I assemble the metafields and all the other things I need, and build an ArrayList that I can databind to.  It seems pretty inefficient, with a lot of extra database traffic, even though I'm only pulling 10 products at a time on each page, but at least I can format the search results so they look exactly like the products listing itself, complete with an "add to cart" control.  If I ever get time, I'll probably improve things by making a single call to the database using GetContent, passing in enough filters with an OR condition to get the whole page of products in one pass.

    The alternative is to use an external search engine, such as Search Server Express, where I can query against specific metafields as well as use full text, and return a customized result.  I did this a few years ago on MCMS using SharePoint, and it worked quite well. 

    I think the main difference between how SharePoint indexes and your engine (Lucene?) indexes is that SharePoint stores meta keys that it finds, associating a datatype with them (very useful later, for example, when you're trying to do date calculations).  You can then construct a very specific query and get a completely controllable set of return fields that you can format in any way you like. 

    This is all very useful for doing parametric queries against your content.  All you need to do is insert custom meta tags in your page when the crawler is indexing.  You have a nice method for detecting the crawler; it would be great if it could be extended to detect a specific instance of another crawler.  I did that with SharePoint by looking at the combination of User Agent and IP address. 

    I'm open to suggestions for improvement over my approach, but I think Sitefinity really needs more control over search results.  Once this is live, I'll post the URL so others can see it.  I'd love to blog about it, but time seems in short supply these days.  ;^)

    Regards,
    Mike Sharp

  16. Georgi
    3583 posts
    Registered: 28 Oct 2016
    16 Jan 2009
    Hi Mike,

    Thank you very much for the additional notes, and for sharing the way you got things to work. You are always so precise when you explain what you do :) We would love to see your web site online.

    The only thing I can suggest regarding the meta fields is to exclude from crawling the ones you do not want to show on your pages. I am not sure how feasible this is for your project, but it is something you could consider. Every metafield has a searchable property in the web.config file. You may set searchable to false for the metafields you want to exclude.

    I can completely understand why you need more control over the search results though. This is something we should improve indeed.

    Sincerely yours,
    Georgi
    the Telerik team

  17. Mike
    208 posts
    Registered: 10 Dec 2007
    16 Jan 2009
    Ah, thanks for the tip about the searchable property.  Now that you mention it, I've seen that property in the web.config meta fields, but had completely forgotten about it.

    Now if I can just get the last Commercefinity bugs squashed, I'll have the site live.  ;^)

    Mike
  18. Georgi
    3583 posts
    Registered: 28 Oct 2016
    19 Jan 2009
    Hi,

    I wish you success with it, Mike.
    Please let me know if you have any other questions.

    All the best,
    Georgi
    the Telerik team

  19. IT
    52 posts
    Registered: 30 Sep 2008
    22 Jan 2009
    Hi guys - I was wondering if the fix to show summaries on wildcard searches (as mentioned by Valio on 12/15 in this thread) will be part of the 3.6 release. Thanks.
  20. Georgi
    3583 posts
    Registered: 28 Oct 2016
    22 Jan 2009
    Hi Michael,

    Yes, it will be included in version 3.6.

    Greetings,
    Georgi
    the Telerik team
