This paper is available on arxiv under CC 4.0 license.
Authors:
(1) Ghazaleh H. Torbati, Max Planck Institute for Informatics Saarbrucken, Germany & ghazaleh@mpi-inf.mpg.de;
(2) Andrew Yates, University of Amsterdam Amsterdam, Netherlands & a.c.yates@uva.nl;
(3) Anna Tigunova, Max Planck Institute for Informatics Saarbrucken, Germany & tigunova@mpi-inf.mpg.de;
(4) Gerhard Weikum, Max Planck Institute for Informatics Saarbrucken, Germany & weikum@mpi-inf.mpg.de.
Table of Links
Abstract and Introduction
Related Work
Methodology
Experimental Design
Experimental Results
Conclusion
Ethics Statement and References

II. RELATED WORK

Exploiting Item Features: Content-based recommender systems incorporate item tags, item-item similarity, and user-side features. Item-item similarity typically maps the tag clouds of items into a latent space and computes distances between the embedding points. This can be combined with interaction-based methods that employ latent-space techniques including deep learning (e.g., [2][8], [12]–[14]) or graph-based inference (e.g., [15], [16]). These methods perform well, but their experimental results often benefit from a large fraction of favorable test cases. For example, when the model is trained with books by some author, predicting that the user also likes other books by the same author is likely a (near-trivial) hit. In our experiments, we ensure that such cases are excluded.

Exploiting User Reviews: The most important user features are reviews of items, posted with binary likes or numeric ratings. Early works either mine sentiments on item aspects or map all textual information to latent features using topic models like LDA or static word embeddings like word2vec (see, e.g., [17]). Recent works have shifted to deep neural networks with attention mechanisms [9]–[11], [18]–[20], or feed review text into latent-factor models [21]–[23]. Some works augment collaborative filtering (CF) models with user text to mitigate data sparseness (e.g., [24]–[26]). However, pre-dating the advent of large language models, all these methods rely on static word-level encodings such as word2vec, and are inherently limited. As a salient representative, we include DeepCoNN [18] in the baselines of our experiments.
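The exclusion of near-trivial same-author test cases mentioned under Exploiting Item Features can be illustrated with a small filter. This is only a sketch under stated assumptions: the text does not describe how the exclusion is implemented, and the function name `drop_same_author_cases`, the `(user, item)` pair layout, and the `author_of` mapping are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Interaction = Tuple[str, str]  # (user_id, item_id); layout is an assumption


def drop_same_author_cases(
    train: List[Interaction],
    test: List[Interaction],
    author_of: Dict[str, str],
) -> List[Interaction]:
    """Keep only test pairs whose author the user has NOT seen in training."""
    # Collect, per user, the set of authors that occur in the training data.
    seen_authors: Dict[str, Set[str]] = {}
    for user, item in train:
        seen_authors.setdefault(user, set()).add(author_of[item])

    # A test pair is "near-trivial" if its author is already in that set.
    return [
        (user, item)
        for user, item in test
        if author_of[item] not in seen_authors.get(user, set())
    ]


# Toy data, invented purely for illustration.
author_of = {"b1": "A", "b2": "A", "b3": "B"}
train = [("u1", "b1")]
test = [("u1", "b2"), ("u1", "b3"), ("u2", "b2")]

filtered = drop_same_author_cases(train, test, author_of)
print(filtered)
```

Here the pair `("u1", "b2")` is dropped because user `u1` already has a training book by author `A`, while the same book stays in the test set for `u2`, who has no training history with that author.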
Transformer-based inference: Recent works leverage pretrained language models (LMs), mostly BERT, for recommender systems in different ways: i) encoding item-user CF signals with transformer-based embeddings, ii) making inference from rich representations of the input review texts, or iii) implicitly incorporating the “world knowledge” that LMs have in latent form.

An early representative of the first line is BERT4Rec [27], [28], which uses BERT to learn item representations for sequential predictions based on item titles and user-item interaction histories, but does not incorporate any text. The P5 method of [29] employs a suite of prompt templates for the T5 language model, in a multi-task learning framework covering direct as well as sequential recommendations along with generating textual explanations. We include a text-enriched variant of the P5 method in our experiments.

Advances in large language models have inspired approaches that leverage LLM “world knowledge”. Early works use smaller models like BERT to elicit knowledge about movie, music and book genres [30]. Recent studies are based on prompting large autoregressive models, such as GPT or PaLM, to generate item rankings for user-specific recommendations [31], [32] or to predict user ratings [33], in a zero-shot or few-shot fashion, using in-context inference solely based on a user’s item titles and genres.

Closest to our approach are the methods of [34], [35], which use BERT to create representations for user and item text, aggregated by averaging [34] or k-means clustering [35]. The resulting latent vectors are used for predicting item scores. A major limitation is that the text encodings are for individual sentences only, which tends to lose signals from user reviews where cues span multiple sentences. Also, BERT itself is fixed, and the latent vectors for users and items are pre-computed without awareness of the prediction task. Our experiments include the method of [34], called BENEFICT, as a baseline.
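To make the averaging-based aggregation concrete, the following minimal sketch pools per-sentence vectors into one user vector and one item vector and then scores the pair. It is not the implementation of [34]: the toy two-dimensional vectors stand in for frozen per-sentence encodings, and the dot-product scorer and the names `mean_pool` and `dot_score` are assumptions chosen for illustration, since the text only states that the latent vectors are used for predicting item scores.

```python
from typing import List, Sequence

Vector = List[float]


def mean_pool(sentence_vectors: Sequence[Sequence[float]]) -> Vector:
    """Average per-sentence vectors into a single fixed-size vector."""
    if not sentence_vectors:
        raise ValueError("need at least one sentence vector")
    dim = len(sentence_vectors[0])
    if any(len(vec) != dim for vec in sentence_vectors):
        raise ValueError("all sentence vectors must have the same dimension")
    count = len(sentence_vectors)
    return [sum(vec[i] for vec in sentence_vectors) / count for i in range(dim)]


def dot_score(user_vec: Sequence[float], item_vec: Sequence[float]) -> float:
    """Score a user-item pair by dot product (an assumed scoring function)."""
    return sum(u * v for u, v in zip(user_vec, item_vec))


# Toy stand-ins for precomputed sentence encodings (invented values).
user_sentences = [[1.0, 0.0], [0.0, 1.0]]  # two sentences from a user's reviews
item_sentences = [[2.0, 2.0]]              # one sentence of item text

user_vec = mean_pool(user_sentences)
item_vec = mean_pool(item_sentences)
score = dot_score(user_vec, item_vec)
print(user_vec, item_vec, score)
```

Because each sentence is encoded on its own and then averaged, a cue that only emerges across two sentences is not represented in any single input vector, which is the limitation noted above.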

Recommendations by Concise User Profiles from Review Text: Related Work
