This article is a preprint and has not been peer reviewed.
Results reported in preprints should not be presented in the media as verified information.
Selective Learning-to-Rank for Product Analogs
2026-01-02
Product analog discovery is a critical component of modern e-commerce systems,
enabling recommendations, catalog deduplication, and search diversification. Unlike
classical similarity search, many products in real-world catalogs do not admit valid
substitutes, making forced ranking prone to false positives.
This work extends selective prediction to learning-to-rank for analog discovery under
partial coverage, introducing a simple yet effective confidence-aware reject mechanism
based on score gap and absolute score. Experiments on a large proprietary catalog
comprising 10^5 products across 50 categories and 10^6 labeled pairs show that the
proposed method reduces false positives by 25% compared to a forced-ranking baseline
while maintaining high coverage and product-level recall.
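The reject mechanism described above can be illustrated with a minimal sketch. The abstract states only that rejection depends on the score gap and the absolute score, so everything else here is an assumption made for illustration: the function name `select_analog`, the threshold values, the choice to measure the gap between the top two candidates, and the rule that a small gap triggers abstention.

```python
from typing import Optional, Sequence


def select_analog(
    scores: Sequence[float],
    min_score: float = 0.5,  # hypothetical absolute-score threshold
    min_gap: float = 0.1,    # hypothetical top-1 vs. top-2 gap threshold
) -> Optional[int]:
    """Return the index of the top-ranked candidate, or None to abstain.

    Assumed rule (not taken from the paper): abstain when the best score
    is too low, or when the best and second-best scores are too close.
    """
    if not scores:
        return None  # nothing to rank, so abstain

    # Rank candidate indices by descending ranker score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best = order[0]

    # Absolute-score check: the best candidate must be good enough on its own.
    if scores[best] < min_score:
        return None

    # Score-gap check: the best candidate must clearly beat the runner-up.
    if len(order) > 1 and scores[best] - scores[order[1]] < min_gap:
        return None

    return best
```

A forced-ranking baseline corresponds to always returning `order[0]`; the two checks are what allow the system to return no analog for products that have no valid substitute.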
Empirical evaluation across diverse product categories demonstrates a systematic
recall–coverage trade-off induced by selective rejection. Price-aware features emerge as
the most influential determinants of analog validity, often outweighing fine-grained
specification similarity. Overall, selective ranking with abstention is an effective and
practically implementable strategy for robust analog discovery at scale.
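The recall–coverage trade-off can be made concrete with a small toy calculation. This is a sketch under stated assumptions, not the paper's evaluation protocol: the `Query` record, the `evaluate` helper, the simplified metric definitions, and the four toy data points are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Query(NamedTuple):
    top_score: float    # ranker score of the best candidate
    gap: float          # score difference between the top two candidates
    top_is_valid: bool  # whether the best candidate is a true analog


def evaluate(queries: List[Query], min_score: float, min_gap: float) -> Dict[str, float]:
    """Compute simplified coverage, recall, and false-positive count.

    Assumed definitions (for illustration only):
      coverage        = answered queries / all queries
      recall          = answered queries with a valid top candidate
                        / all queries with a valid top candidate
      false_positives = answered queries whose top candidate is invalid
    """
    answered = [q for q in queries if q.top_score >= min_score and q.gap >= min_gap]
    valid_total = sum(q.top_is_valid for q in queries)
    valid_answered = sum(q.top_is_valid for q in answered)
    return {
        "coverage": len(answered) / len(queries) if queries else 0.0,
        "recall": valid_answered / valid_total if valid_total else 0.0,
        "false_positives": float(len(answered) - valid_answered),
    }


# Toy data, invented for this example.
queries = [
    Query(0.95, 0.40, True),
    Query(0.80, 0.05, True),
    Query(0.70, 0.30, False),
    Query(0.40, 0.20, False),
]

forced = evaluate(queries, min_score=0.0, min_gap=0.0)     # answer every query
selective = evaluate(queries, min_score=0.6, min_gap=0.1)  # abstain when unsure
```

On this toy data the selective setting answers fewer queries and produces fewer false positives, but it also abstains on one query whose top candidate was valid, which is the trade-off in miniature.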
Citation:
Krasnov F. 2026. Selective Learning-to-Rank for Product Analogs. PREPRINTS.RU. https://doi.org/10.24108/preprints-3114204