2. Introduction
● Many moving parts need to align to succeed in ecommerce search.
● Query understanding is perhaps the most critical of those parts.
● Query understanding offers extraordinary potential for improvement.
3. Overview
● Relevance and Desirability
● Guiding Searchers to Better Queries
● Search Queries vs. Search Intents
● Bags of Documents, Bags of Queries
5. Retrieval + ranking focus on relevance + desirability.
● Relevance = how much a result responds to the query.
● Desirability = the query-independent utility of a result.
● This is a simplified model that ignores personalization and other factors.
● But what does all this have to do with query understanding?
6. What matters most to the buyer is query-dependent.
● A relevance-only model would depend only on query-dependent signals.
● A smarter approach filters on relevance but then focuses on desirability.
● Queries expressing more specific intent set a higher bar for relevance.
christmas ornaments vs. gaggia brera water tank
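A minimal sketch of the "filter on relevance, then rank by desirability" idea above. The `Result` fields, the `relevance_bar` function, its linear form, and all scores are illustrative assumptions, not something the slides specify.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    relevance: float     # query-dependent score in [0, 1] (assumed scale)
    desirability: float  # query-independent score in [0, 1] (assumed scale)


def relevance_bar(specificity: float, low: float = 0.5, high: float = 0.9) -> float:
    """Hypothetical rule: more specific queries get a higher relevance threshold."""
    return low + (high - low) * specificity


def rank(results, specificity):
    """Keep results that clear the relevance bar, then order by desirability."""
    bar = relevance_bar(specificity)
    relevant = [r for r in results if r.relevance >= bar]
    return sorted(relevant, key=lambda r: r.desirability, reverse=True)


# Made-up scores for one result set, ranked under two specificity settings.
results = [
    Result("water tank for gaggia brera", relevance=0.95, desirability=0.40),
    Result("gaggia brera espresso machine", relevance=0.70, desirability=0.90),
    Result("generic descaler", relevance=0.55, desirability=0.80),
]
broad = [r.title for r in rank(results, specificity=0.0)]
narrow = [r.title for r in rank(results, specificity=1.0)]
print(broad)
print(narrow)
```

With a low bar, desirability decides the order; with a high bar, only the exact part survives the filter.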
8. Application should guide searchers to better queries.
● No query understanding model or system is ever going to be perfect.
● A search application will understand some queries but not others.
● A failure to understand the query undermines retrieval and ranking.
● Hence, autocomplete, related searches, and all other query suggestions should promote queries that the search application can understand!
9. Suggest unambiguous, high-specificity queries.
● A search application should never suggest queries it cannot understand.
● In particular, that means not suggesting ambiguous queries, e.g., “mixer”.
● All else equal, it should favor more specific over less specific queries.
● Queries with higher specificity tend to have higher conversion rates.
● That is why it is important to model and measure query specificity!
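The suggestion rules above could be sketched as a filter plus a sort. The `Suggestion` fields and the use of popularity as the primary sort key are assumptions for illustration; how a system decides that a query is understood, ambiguous, or specific is outside this sketch.

```python
from typing import NamedTuple


class Suggestion(NamedTuple):
    query: str
    understood: bool    # assumed flag: the application can interpret the query
    ambiguous: bool     # assumed flag: the query maps to several intents
    specificity: float  # assumed score in [0, 1]
    popularity: float   # assumed score in [0, 1]


def pick_suggestions(candidates, limit=3):
    """Drop queries that are not understood or are ambiguous; break ties by specificity."""
    eligible = [c for c in candidates if c.understood and not c.ambiguous]
    # Popularity first; "all else equal", the more specific query wins.
    eligible.sort(key=lambda c: (c.popularity, c.specificity), reverse=True)
    return [c.query for c in eligible[:limit]]


# Made-up candidates for the prefix "mix".
candidates = [
    Suggestion("mixer", True, True, 0.2, 0.9),
    Suggestion("stand mixer", True, False, 0.6, 0.7),
    Suggestion("kitchenaid stand mixer 5 qt", True, False, 0.9, 0.7),
    Suggestion("dj mixer", True, False, 0.6, 0.5),
    Suggestion("mxr qzt", False, False, 0.9, 0.1),
]
suggestions = pick_suggestions(candidates)
print(suggestions)
```

The ambiguous "mixer" is never suggested, even though it is the most popular candidate.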
11. Search Query != Search Intent
● Information retrieval researchers worry about queries with multiple intents.
jaguar: [image] or [image]?
● A more practical concern is multiple queries that map to the same intent.
lightning to 3.5mm
iphone to aux
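15. Bag of documents: query as mean of product vectors.
[Figure: each query maps to its product vectors, which are averaged into a single query vector; the two query vectors are then compared.]
black tshirts for men ► [0.13, 0.81, … ], [0.09, 0.75, … ], … ► [0.11, 0.79, … ]
mens black t-shirt ► [0.13, 0.81, … ], [0.09, 0.77, … ], … ► [0.12, 0.78, … ]
cos > 0.98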
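A minimal sketch of the bag-of-documents comparison, assuming each query already has a list of associated product vectors. The two-dimensional vectors below are toy values, not the slide's actual embeddings, and `engaged_products` and `same_intent` are hypothetical names.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy data: query -> vectors of the products associated with it.
engaged_products = {
    "black tshirts for men": [[0.13, 0.81], [0.09, 0.75]],
    "mens black t-shirt": [[0.13, 0.81], [0.09, 0.77]],
    "christmas ornaments": [[0.90, 0.10], [0.80, 0.20]],
}


def same_intent(query1, query2, threshold=0.98):
    """Treat two queries as equivalent when their mean product vectors nearly align."""
    v1 = mean_vector(engaged_products[query1])
    v2 = mean_vector(engaged_products[query2])
    return cosine(v1, v2) > threshold


print(same_intent("black tshirts for men", "mens black t-shirt"))
print(same_intent("black tshirts for men", "christmas ornaments"))
```

This only works for queries with enough associated products to average, which is why head and torso queries are the natural fit for the offline model.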
16. ML generalizes the bag-of-documents model to tail queries.
● Train using (query1, query2, similarity) triples from offline model.
● Oversample similar query pairs to increase sensitivity where it matters.
● Fine-tune a pre-trained micro-BERT sentence transformer model.
● Concatenate the output of a query classifier to the query keywords.
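The data-preparation steps above could look roughly like the sketch below. It covers only building the training triples; fine-tuning the sentence transformer is not shown. The `augment` text format, the `similar_at` cutoff, the `oversample` factor, and the toy classifier are all assumptions.

```python
import random


def augment(query, classify):
    """Append the classifier's predicted category to the query text (format is assumed)."""
    return f"{query} [{classify(query)}]"


def build_training_set(scored_pairs, classify, similar_at=0.9, oversample=3, seed=0):
    """Turn (query1, query2, similarity) triples into training examples,
    repeating highly similar pairs so the model sees more of them."""
    examples = []
    for q1, q2, sim in scored_pairs:
        triple = (augment(q1, classify), augment(q2, classify), sim)
        copies = oversample if sim >= similar_at else 1
        examples.extend([triple] * copies)
    random.Random(seed).shuffle(examples)
    return examples


# Toy stand-ins for the offline model's scores and the query classifier.
toy_categories = {
    "black tshirts for men": "apparel",
    "mens black t-shirt": "apparel",
    "christmas ornaments": "home decor",
}
scored_pairs = [
    ("black tshirts for men", "mens black t-shirt", 0.99),
    ("black tshirts for men", "christmas ornaments", 0.31),
]
examples = build_training_set(scored_pairs, lambda q: toy_categories.get(q, "unknown"))
for example in examples:
    print(example)
```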
17. Duality: we can model a document as a bag of queries.
● A document can be modeled based on the queries intended to find it.
[product image] = mens t shirt, black tshirts for men, …
● We can use this model to measure retrievability, which is recall in practice.
● Useful as feedback for indexing, both in general and for the particular item.
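One way to read "retrievability" is the fraction of a document's bag of queries for which the search system actually returns that document. The sketch below assumes that reading; the `search` callable, the SKU identifiers, and the toy index are hypothetical.

```python
def retrievability(doc_id, bag_of_queries, search):
    """Fraction of the document's queries whose results include the document."""
    if not bag_of_queries:
        return 0.0
    hits = sum(1 for query in bag_of_queries if doc_id in search(query))
    return hits / len(bag_of_queries)


def missed_queries(doc_id, bag_of_queries, search):
    """Queries that should find the document but do not: feedback for indexing."""
    return [query for query in bag_of_queries if doc_id not in search(query)]


# Toy search backend: query -> retrieved document ids.
toy_index = {
    "mens t shirt": ["sku-1", "sku-2"],
    "black tshirts for men": ["sku-1"],
    "black tee": ["sku-2"],
}


def search(query):
    return toy_index.get(query, [])


bag = ["mens t shirt", "black tshirts for men", "black tee"]
score = retrievability("sku-1", bag, search)
missed = missed_queries("sku-1", bag, search)
print(round(score, 2), missed)
```

The missed queries point at what the item's indexed content fails to cover, such as a synonym like "tee".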
18. Summary
● The tradeoff between relevance and desirability depends on query specificity.
● All query suggestions should be unambiguous, preferably high-specificity.
● Measure query similarity to recognize queries with same or similar intent.
● Model queries as bags of documents, and documents as bags of queries.