Back to Projects

Rec2Vec

Edit-Distance Supervision for Logical Product Retrieval
Posted: February 09, 2026
Tags: Dense Retrieval, Logical Reasoning, Product Search, Embeddings
Rec2Vec

Dense retrievers struggle to handle logical constraints such as negation, conjunction, and disjunction in product search, particularly when relevance depends on a small number of attribute-level differences. In Rec2Vec, we address this limitation by supervising dense embeddings with attribute-level edit distance, defined as the minimum number of feature changes required for a product to satisfy a Boolean query. We build a large-scale training dataset from the ESCI corpus by constructing contrastive triplets consisting of a positive product, a hard negative that violates specific attributes, and an easy negative sampled at random. To enable fine-grained supervision, we use large language models to extract synthetic product features and generate natural-language Boolean queries, allowing the embedding space to reflect logical structure rather than pure semantic similarity.

Contributors

  • Matthew Toles
  • ,
  • Shachaf Rispler
  • ,
  • Eugene Wu