Leveraging LLMs for Sequential Recommendation

12.10.23 by Marios Fragkoulis, Jesse Harte


Sequential recommendation problems have received increasing attention in research during the past few years, leading to the inception of a large variety of algorithmic approaches. In this work, we explore how large language models (LLMs) can be used to build or improve sequential recommendation approaches.

This work was a joint research collaboration between Delivery Hero (DH) Research, Delft University of Technology (TU Delft), the University of Klagenfurt (UK), and the Athens University of Economics and Business (AUEB). It was carried out by Jesse Harte (DH Research & TU Delft), Wouter Zorgdrager (DH Research), Prof. Panos Louridas (AUEB), Prof. Asterios Katsifodimos (TU Delft), Prof. Dietmar Jannach (UK), and Dr. Marios Fragkoulis (DH Research). The research paper appeared at this year’s ACM Conference on Recommender Systems (RecSys’23), which is consistently ranked among the top computer science research venues.

We had the pleasure of presenting both the paper and the exciting progress in LLM research for recommender systems in the opening keynote of the Context-Aware Recommender Systems (CARS) workshop, which took place alongside the main RecSys conference.

LLMs are nowadays having disruptive effects on many AI-based applications. In this work, we devise and evaluate three approaches that leverage the power of LLMs for sequential recommendation in different ways. Our experiments on two datasets show that initializing the state-of-the-art sequential recommendation model BERT4Rec with embeddings obtained from an LLM improves NDCG by 15-20% compared to the vanilla BERT4Rec model. Furthermore, we find that a simple approach that uses LLM embeddings to produce recommendations can provide competitive performance by highlighting semantically related items. We publicly share the code and data of our experiments to ensure reproducibility.

The recent developments in LLMs have taken the world by surprise. Models like OpenAI GPT, Google PaLM, and Meta LLaMA, which employ deep transformer architectures, demonstrate how innovations in NLP can reshape mainstream online activities, such as search, shopping, and customer care. Inevitably, research in recommender systems is significantly impacted by the developments in the area of LLMs as well.

LLMs are mainly utilized for recommendation problems in two ways: by providing embeddings that can be used to initialize existing recommendation models, and by producing recommendations directly from the knowledge they encode. An LLM used as a recommendation model can produce recommendations a) given only a task specification (zero-shot), b) given a few examples included inline in the prompt (few-shot), or c) after its weights have been fine-tuned for the task on a set of training examples.

This incremental training process deviates from typical recommendation models, which have to be trained from scratch on domain data. In fact, LLMs show early indications of adaptability to different recommendation domains with modest fine-tuning. Despite the promising developments, research on leveraging the inherent semantic information of LLMs for recommendation tasks is still limited.
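Fine-tuning needs training examples in a form the model can consume. The sketch below shows one way session data might be turned into prompt-completion pairs, where the prompt lists the items seen so far and the completion is the next item. The function names, the newline separator, and the JSONL output are illustrative assumptions, not the format used in the paper.

```python
import json


def to_prompt_completion(item_names, separator="\n"):
    """Turn one session into a prompt (all but the last item) and a completion (the last item).

    The separator and the dictionary keys are assumptions made for illustration.
    """
    if len(item_names) < 2:
        raise ValueError("a session needs at least two items to form a pair")
    return {
        "prompt": separator.join(item_names[:-1]),
        "completion": item_names[-1],
    }


def to_jsonl(sessions):
    """Serialize one pair per line, a layout commonly used for fine-tuning data."""
    return "\n".join(json.dumps(to_prompt_completion(s)) for s in sessions)


# Hypothetical toy sessions of product names.
toy_sessions = [
    ["pizza margherita", "garlic bread", "tiramisu"],
    ["ramen", "gyoza"],
]
pair = to_prompt_completion(toy_sessions[0])
jsonl = to_jsonl(toy_sessions)
```

At test time, the same prompt format would be used without the completion, and the model's output would be read as the recommended next item.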

In this work, we explore the potential of using LLMs for sequential recommendation problems. In short, in sequential recommendation problems, we consider as input a sequence of user interactions $S^u = (S^u_1, S^u_2, \ldots, S^u_n)$ for user $u$, where $n$ is the length of the sequence and each $S^u_i$ is an individual item. The aim is to predict the next interaction of the given sequence.
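To make the task concrete, here is a minimal sketch of how a next-item example can be derived from one interaction sequence: the last item is held out as the target and the preceding items form the input. The helper name and the toy data are hypothetical, and the evaluation protocol used in the paper may differ in its details.

```python
def split_next_item(sequence):
    """Split one interaction sequence into (history, target).

    history: all interactions except the last one
    target:  the last interaction, which a model should predict
    """
    if len(sequence) < 2:
        raise ValueError("need at least two interactions to form an example")
    return list(sequence[:-1]), sequence[-1]


# Hypothetical sequences keyed by user id.
sequences = {
    "u1": ["pizza margherita", "garlic bread", "tiramisu"],
    "u2": ["ramen", "gyoza"],
}
examples = {user: split_next_item(items) for user, items in sequences.items()}
```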

Our paper introduces the following contributions and insights:

  1. We devise three orthogonal methods of leveraging LLMs for sequential recommendation. In our first approach (LLMSeqSim), we retrieve a semantically rich embedding from an existing LLM (from OpenAI) for each item in a session. We then compute an aggregate session embedding to recommend catalogue products with a similar embedding. In the second approach (LLMSeqPrompt), we fine-tune an LLM with dataset-specific information in the form of prompt-completion pairs and ask the model to produce next item recommendations for test prompts. Finally, our third approach (LLM2BERT4Rec) consists of initializing existing sequential models with item embeddings obtained from an LLM.
  2. We find that initializing a sequential model with LLM embeddings is particularly effective: applying it to the state-of-the-art model BERT4Rec improves accuracy in terms of NDCG by 15-20%, making it the best-performing model in our experiments.
  3. Finally, we show that in certain applications simply using LLM embeddings to find suitable items for a given session can lead to state-of-the-art performance.
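The idea behind LLMSeqSim can be sketched in a few lines. The example below assumes that item embeddings have already been retrieved from an LLM and stands in tiny hand-written vectors for them. Averaging the item embeddings and ranking by cosine similarity are assumptions chosen for simplicity, as is excluding items already in the session; the paper's aggregation strategy may differ.

```python
import math


def mean_embedding(vectors):
    """Aggregate item embeddings into one session embedding by averaging (an assumed strategy)."""
    if not vectors:
        raise ValueError("cannot aggregate an empty session")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(session, item_embeddings, k=2):
    """Rank catalogue items not in the session by similarity to the session embedding."""
    session_vec = mean_embedding([item_embeddings[item] for item in session])
    candidates = [item for item in item_embeddings if item not in session]
    candidates.sort(key=lambda item: cosine(session_vec, item_embeddings[item]), reverse=True)
    return candidates[:k]


# Hand-written stand-ins for LLM embeddings; real ones have far more dimensions.
ITEM_EMBEDDINGS = {
    "espresso": [0.9, 0.1, 0.0],
    "cappuccino": [0.8, 0.2, 0.1],
    "latte": [0.7, 0.3, 0.1],
    "burger": [0.1, 0.9, 0.2],
    "fries": [0.0, 0.8, 0.3],
}
top = recommend(["espresso", "cappuccino"], ITEM_EMBEDDINGS, k=2)
```

With these toy vectors, the coffee-like item ranks ahead of the unrelated ones, which illustrates how embedding similarity surfaces semantically related products.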

In our future work, we plan to investigate whether our findings generalize to different domains, using alternative datasets with diverse characteristics. Furthermore, we will explore whether other LLMs, e.g., ones with different architectures and training corpora, lead to similar performance gains, and whether a hybrid of LLM2BERT4Rec and LLMSeqSim can combine their accuracy and beyond-accuracy performance. Finally, it remains an open question whether passing other types of information besides product names, e.g., category information, to an LLM can further improve the performance of the models.

Marios Fragkoulis (left) and Jesse Harte (right) at the poster session in RecSys’23.


In this work, we devised and evaluated three approaches that leverage LLMs for sequential recommendation problems. A systematic empirical evaluation revealed that BERT4Rec initialized with LLM embeddings achieves the best performance for two datasets and that the LLM-based initialization leads to a substantial improvement in accuracy.
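For readers less familiar with the metric, NDCG rewards a model for placing the true next item near the top of its ranked list. The sketch below assumes a single held-out ground-truth item per test sequence, in which case the ideal DCG is 1 and NDCG@k reduces to 1 / log2(rank + 1) for a hit within the top k. The function names and the cutoff are illustrative and not taken from the paper's code.

```python
import math


def ndcg_at_k(ranked_items, target, k=10):
    """NDCG@k for one test case with exactly one relevant item.

    Returns 1 / log2(rank + 1) if the target appears at 1-based position `rank`
    within the top k, and 0.0 otherwise.
    """
    for position, item in enumerate(ranked_items[:k], start=1):
        if item == target:
            return 1.0 / math.log2(position + 1)
    return 0.0


def mean_ndcg(cases, k=10):
    """Average NDCG@k over (ranked_items, target) pairs."""
    return sum(ndcg_at_k(ranked, target, k) for ranked, target in cases) / len(cases)


# Hypothetical rankings: a hit at rank 1 and a hit at rank 3.
cases = [
    (["a", "b", "c"], "a"),
    (["a", "b", "c"], "c"),
]
score = mean_ndcg(cases, k=3)
```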

Marios Fragkoulis is the lead scientist of DH Research, a research team dedicated to boosting innovation across Delivery Hero by solving challenging problems with high impact. Currently, the team is primarily tackling problems in the recommender systems space, while it has also published a number of research papers in other domains, such as data integration, stream processing, and distributed transactions. Marios can be contacted at marios.fragkoulis@deliveryhero.com.

Jesse Harte is a research engineer at DH Research.

