Okay, day 3 now, and still moving forward. I've set up a brand new 9.2 instance with a couple of simplified test pages: some contain only words like "ranched" and "ranching" (but not "ranch"), and some contain "tést", "tésting", etc. And behold, with the original code provided, the ASCII Folding Filter now shows me the "tést" page when I search for "test"!
I then got HALF of the PorterStemFilter working as expected. Based on repeated testing, it's definitely stemming the words correctly at index time. It's NOT stemming the actual search terms, though. So searching for "ranch" returns the pages with "ranching" and "ranched", but searching for "ranching" returns 0 results even though that exact term is on a page (which tells me it must have been indexed as "ranch").
So I went back and checked, and the ASCII Folding Filter behaves exactly the same way. If I have "tést" on a page, I can search for "test" and find it, but searching for "tést" directly gives no results. My guess is that at some point since the article I'm following was written, the tokenizer used for search queries and the tokenizer used for indexing were split apart, so the filters now only run on the indexing side...
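To make sure I'm describing the symptom clearly, here's a minimal sketch in plain Python (no Lucene involved) of what I think is going on. Everything in it is hypothetical: `fold_ascii` stands in for the ASCII Folding Filter, `toy_stem` is a crude suffix stripper and NOT the real Porter algorithm, and the index is just a dictionary. The point is only that when the filter chain runs at index time but not on the query, the original on-page terms stop matching.

```python
import unicodedata


def fold_ascii(token):
    # Decompose accented characters and drop the combining marks (é -> e).
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def toy_stem(token):
    # Stand-in for a stemmer: NOT Porter, just strips "ing"/"ed" for this demo.
    for suffix in ("ing", "ed"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text):
    # The "filter chain": lowercase, split on whitespace, fold, then stem.
    return [toy_stem(fold_ascii(t)) for t in text.lower().split()]


def build_index(pages):
    # Map each analyzed term to the set of page ids that contain it.
    index = {}
    for page_id, text in pages.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(page_id)
    return index


def search(index, query, analyze_query):
    # analyze_query=False mimics what I'm seeing: raw query terms hit the index.
    terms = analyze(query) if analyze_query else query.lower().split()
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits


pages = {"ranch-page": "ranched ranching", "test-page": "tést tésting"}
index = build_index(pages)

print(search(index, "ranch", analyze_query=False))     # finds ranch-page
print(search(index, "ranching", analyze_query=False))  # empty: the index only holds "ranch"
print(search(index, "ranching", analyze_query=True))   # finds ranch-page again
```

If that model is right, the fix is to run the same chain over the query terms, which is exactly the hook I can't find.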
At this point I think I could move forward with some hacky search term replacement, using custom regex stemming and fuzzy matching algorithms to append potential misspellings and stemmed copies of each search term to the end of the query manually, but it's going to be ugly. If you guys could point me to where I can apply these Lucene filters to the search query terms programmatically in 9.2, that would be awesome.
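For reference, the ugly fallback I have in mind looks roughly like this. It's a sketch only: `crude_stem` and `expand_query` are hypothetical helpers, the regex is nowhere near a real stemmer, and I'm assuming the search box accepts Lucene-style `(a OR b)` grouping, which I haven't confirmed for 9.2.

```python
import re

# Very rough suffix list; a real stemmer handles far more cases than this.
SUFFIX_RE = re.compile(r"(ing|ed|es|s)$")


def crude_stem(term):
    # Strip one common suffix, but keep the term if the result gets too short.
    stem = SUFFIX_RE.sub("", term)
    return stem if len(stem) >= 3 else term


def expand_query(query):
    # Append the stemmed copy of each term so the indexed (stemmed) form can match.
    expanded = []
    for term in query.lower().split():
        stem = crude_stem(term)
        if stem != term:
            expanded.append("(" + term + " OR " + stem + ")")
        else:
            expanded.append(term)
    return " ".join(expanded)


print(expand_query("ranching cattle"))
```

It would get "ranching" to match the indexed "ranch", but every rule the real filter applies would have to be duplicated by hand, which is why I'd rather use the actual filters.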
The same goes for any samples of using FuzzyQuery, either at index time OR when the search query terms are parsed. I have a feeling only one of those hooks is necessary to get past misspellings, but I'd rather start from code you guys have seen working at least once.
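As far as I understand it, fuzzy matching compares the typed term against the terms already in the index and accepts those within a small edit distance, which would make it a query-time hook. Here's a minimal sketch of that idea in plain Python, assuming a plain Levenshtein distance and a hypothetical `fuzzy_terms` helper. It is not how Lucene implements FuzzyQuery internally, just the concept I'm trying to reproduce.

```python
def edit_distance(a, b):
    # Classic Levenshtein distance, computed one row at a time.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (free if the characters match)
            ))
        previous = current
    return previous[-1]


def fuzzy_terms(query_term, indexed_terms, max_edits=1):
    # Return the indexed terms within max_edits of what the user typed.
    return sorted(t for t in indexed_terms if edit_distance(query_term, t) <= max_edits)


indexed_terms = {"ranch", "test"}
print(fuzzy_terms("rnch", indexed_terms))               # one missing letter
print(fuzzy_terms("tset", indexed_terms, max_edits=2))  # swapped letters cost 2 here
```

If the standard Lucene query parser syntax is passed through, appending `~` to a term (for example `rnch~`) should request the same kind of match, but I don't know whether 9.2 exposes that.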