hitko · 2590 · 210d
Let's say you're trying to autocomplete countries, and the user types in "Samoa". Do you want autocomplete to show "American Samoa" as an option or not?
If yes, use standard analyser. If no, use keyword analyser.
IntrusionCM · 12259 · 210d
I'm a bit confused about @hitko's answer.
When an edgeGram tokenizer is used, the tokenizer creates tokens from the input string, usually bounded by a min and a max number of characters.
The analyzer searches the **resulting** tokens for matches.
So... "American Samoa" is broken down into tokens, between min and max chars as far as I know.
The resulting tokens are analyzed - meaning the analyzer isn't looking at "Samoa"… rather "Sam" "oa" etc (tokens).
At least this is what I would expect.
The question is now what you'll expect regarding search terms - e.g. a single search term vs many search terms.
hitko · 2590 · 209d
@IntrusionCM You've got the order wrong.
When using keyword + edgeGram, the whole string is treated as a single text, and then edgeGram creates tokens like "Am", "Ame", ... , "American Sa", and obviously none of those tokens will match "Samoa".
When using standard + edgeGram, each word of the string is treated as a single text, so the final tokens would be "Am", "Sa", "Ame", "Sam", "Amer", "Samo", ... , and those will match "Samoa".
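To make the difference concrete, here is a minimal Python sketch that imitates the two pipelines described above. It is not Elasticsearch code: the helper names (`edge_ngrams`, `keyword_then_edge`, `standard_then_edge`) and the `max_gram=20` default are assumptions made for illustration, the "standard" step is approximated by splitting on whitespace, and lowercasing is left out so the tokens look like the ones quoted in the thread.

```python
def edge_ngrams(text, min_gram=2, max_gram=20):
    """Return the prefixes of `text` whose length is between min_gram and max_gram."""
    longest = min(len(text), max_gram)
    return [text[:n] for n in range(min_gram, longest + 1)]


def keyword_then_edge(text, min_gram=2, max_gram=20):
    """Keyword-style: the whole string is one token, then edge n-grams are taken."""
    return edge_ngrams(text, min_gram, max_gram)


def standard_then_edge(text, min_gram=2, max_gram=20):
    """Standard-style (approximated by whitespace splitting): edge n-grams per word."""
    tokens = []
    for word in text.split():
        tokens.extend(edge_ngrams(word, min_gram, max_gram))
    return tokens


keyword_tokens = keyword_then_edge("American Samoa")
standard_tokens = standard_then_edge("American Samoa")

print(keyword_tokens[:3])            # ['Am', 'Ame', 'Amer']
print("Samoa" in keyword_tokens)     # False
print("Samoa" in standard_tokens)    # True
```

With the keyword-style pipeline every token starts with "Am", so a search for "Samoa" has nothing to match; with the per-word pipeline "Sa", "Sam", "Samo" and "Samoa" are all indexed.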
hitko · 2590 · 209d
@hitko If the order were switched, the edgeGrams would be "Am", "Ame", ... , "American Sam", "American Samoa" in both cases, but the standard analyser would then split them into words, giving "Am", "Ame", "Amer", ... , "S", "Sa", "Sam", ... Notice that this wouldn't respect edgeGram minLength=2, and edgeGram would need a maxLength large enough to reach the second word for any of this to work.
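The hypothetical switched order can be simulated the same way. This is only a sketch of the thought experiment, not how Elasticsearch behaves: `edge_then_split` is a made-up helper, word splitting is approximated with `str.split()`, and duplicates are dropped for readability.

```python
def edge_then_split(text, min_gram=2, max_gram=20):
    """Take edge n-grams of the whole string first, then split each one into words."""
    longest = min(len(text), max_gram)
    prefixes = [text[:n] for n in range(min_gram, longest + 1)]
    tokens = []
    for prefix in prefixes:
        for word in prefix.split():
            if word not in tokens:  # keep first occurrence only
                tokens.append(word)
    return tokens


swapped = edge_then_split("American Samoa")
short = edge_then_split("American Samoa", max_gram=8)

print("S" in swapped)      # True: a 1-character token, below min_gram=2
print("Samoa" in swapped)  # True, but only because max_gram covers the full string
print("Samoa" in short)    # False: max_gram=8 stops at "American"
```

The single-character "S" token and the dependence on a large `max_gram` are the two problems pointed out in the comment above.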