atlas

Ranter

organic-ai

241

Comments

1

hitko

2995

4y

Let's say you're trying to autocomplete countries, and the user types in "Samoa". Do you want autocomplete to show "American Samoa" as an option or not?

If yes, use standard analyser. If no, use keyword analyser.
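
For reference, a rough sketch of the two index definitions, written as a Python dict builder. The field name "name" and the gram sizes are made up for illustration, and the option names (type, tokenization, minGrams, maxGrams, analyzer) are how I recall the Atlas Search autocomplete mapping, so check them against the docs before relying on this.

```python
def autocomplete_index(field, analyzer, min_grams=2, max_grams=15):
    """Build an Atlas Search index definition for one autocomplete field.

    `field`, `min_grams` and `max_grams` are illustrative values,
    not something taken from the original question.
    """
    return {
        "mappings": {
            "dynamic": False,
            "fields": {
                field: {
                    "type": "autocomplete",
                    "tokenization": "edgeGram",
                    "minGrams": min_grams,
                    "maxGrams": max_grams,
                    "analyzer": analyzer,
                }
            },
        }
    }


# "Samoa" should also suggest "American Samoa": split into words first.
per_word = autocomplete_index("name", "lucene.standard")

# "Samoa" should NOT suggest "American Samoa": keep the whole string as one text.
whole_string = autocomplete_index("name", "lucene.keyword")
```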
0

IntrusionCM

13812

4y

I'm a bit confused about @hitko's answer.

When an edgeGram tokenizer is used, the tokenizer creates tokens from the input string, usually bounded by a min and max number of characters.

The analyzer searches the **resulting** tokens for matches.

So... "American Samoa" is broken down into tokens, between min and max chars as far as I know.

The resulting tokens are analyzed - meaning the analyzer isn't looking at "Samoa"… rather "Sam" "oa" etc (tokens).

At least this is what I would expect.

The question now is what you expect regarding search terms, e.g. a single search term vs. many search terms.
0

hitko

2995

4y

@IntrusionCM You've got the order wrong.

When using keyword + edgeGram, the whole string is treated as a single text, and then edgeGram creates tokens like "Am", "Ame", ... , "American Sa", and obviously none of those tokens will match "Samoa".

When using standard + edgeGram, each word of the string is treated as a single text, so the final tokens would be "Am", "Sa", "Ame", "Sam", "Amer", "Samo", ... , and those will match "Samoa".
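
To make the two cases concrete, here is a small plain-Python simulation. It is not the Lucene implementation: the "standard" step is approximated by splitting on non-word characters, case is left untouched, matching is a simple "is the typed text one of the tokens" check, and the gram limits of 2 and 15 are assumed values.

```python
import re


def edge_grams(text, min_grams=2, max_grams=15):
    """Return every prefix of `text` with a length in [min_grams, max_grams]."""
    upper = min(len(text), max_grams)
    return [text[:n] for n in range(min_grams, upper + 1)]


def keyword_then_edge_gram(value, min_grams=2, max_grams=15):
    """keyword + edgeGram: the whole string is one text, then prefixes are taken."""
    return edge_grams(value, min_grams, max_grams)


def standard_then_edge_gram(value, min_grams=2, max_grams=15):
    """standard + edgeGram: split into words first, then take prefixes of each word."""
    tokens = []
    for word in re.findall(r"\w+", value):
        tokens.extend(edge_grams(word, min_grams, max_grams))
    return tokens


def matches(typed, tokens):
    """Simplified matching: the typed text must equal one of the indexed tokens."""
    return typed in tokens


print(matches("Samoa", keyword_then_edge_gram("American Samoa")))   # False
print(matches("Samoa", standard_then_edge_gram("American Samoa")))  # True
```

With keyword every token starts at "Am...", so nothing beginning with "Sa" ever gets indexed. With standard, "Samoa" is its own word and produces "Sa", "Sam", "Samo", "Samoa".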
0

hitko

2995

4y

@hitko If the order were switched, the edgeGrams would be "Am", "Ame", ... , "American Sam", "American Samoa" in both cases, and the standard analyser would then split them into words, giving "Am", "Ame", "Amer", ... , "S", "Sa", "Sam", ... Notice this wouldn't respect edgeGram minLength=2, and edgeGram would need a significantly larger maxLength for any of this to work.
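
The hypothetical switched order can be simulated the same way. Again this is only a sketch: word splitting is approximated with a regex, and min_grams / max_grams stand in for the minLength / maxLength mentioned above, with assumed values.

```python
import re


def edge_grams(text, min_grams=2, max_grams=15):
    """Return every prefix of `text` with a length in [min_grams, max_grams]."""
    upper = min(len(text), max_grams)
    return [text[:n] for n in range(min_grams, upper + 1)]


def edge_gram_then_split(value, min_grams=2, max_grams=15):
    """Hypothetical reversed order: take prefixes of the whole string, then split each into words."""
    tokens = []
    for gram in edge_grams(value, min_grams, max_grams):
        for word in re.findall(r"\w+", gram):
            if word not in tokens:  # keep first occurrence only
                tokens.append(word)
    return tokens


wide = edge_gram_then_split("American Samoa", min_grams=2, max_grams=15)
narrow = edge_gram_then_split("American Samoa", min_grams=2, max_grams=8)

print("S" in wide)      # True: a 1-character token slips in despite min_grams=2
print("Sam" in narrow)  # False: the prefixes never reach the second word
```

So the reversed order both breaks the minimum length and only reaches the second word when the maximum is large enough to cover the whole first word plus the space.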
0

organic-ai

241

4y

@hitko Thank you! That was what I was trying to do (the "no" way).
0

IntrusionCM

13812

4y

@hitko Interesting.

Didn't have a MongoDB database to test, sadly.

But it makes more sense in your way xD

And yes, exactly the second comment / conclusion is what made me doubt my sanity...

Thanks for the longer explanation.

question

mongo

search

development

database