5.04% increase in sales per session thanks to new personalization strategy at Outletcity Metzingen
Outletcity Metzingen puts its personalization strategy to the test and increases revenue per session by 5.04%.
Reinforcement learning is an artificial intelligence method in which a so-called agent learns to interact with its environment as effectively as possible. In recent years, many applications have emerged that have made the method popular. Programs have been developed that can beat humans at games such as chess or Go, or even simple Atari games. They help robots to play soccer successfully or perform daring acrobatic helicopter flights. In this three-part blog series, we show you how reinforcement learning can be used to personalize online store recommendations.
Here's what you can expect to find in this blog article:
An example from behavioral psychology
Reinforcement learning in e-commerce
Reinforcement learning in use for the recommendation engine
Personalization of recommendations
Our conclusion on the use of reinforcement learning in e-commerce
The name reinforcement learning was borrowed from behavioral psychology. Reinforcement learning, a subfield of machine learning, works in a similar way to instrumental conditioning, in which, for example, a dog learns to fetch a ball.
In this case, our dog "Benno" is the agent. The environment is the world in which he finds himself. The trainer and the ball are particularly important here. Benno perceives the environment through his senses. He smells, hears, and sees what is happening around him. His brain creates an internal representation of this environment. He can respond to this representation with various actions. When he sees the ball flying away, he can decide, for example, whether to watch it, run after it, bark, sniff the ground, or lift his leg. If he runs after the ball and brings it back to the trainer, the trainer can then give him a reward.
The behavior is then reinforced when Benno's brain makes the association between bringing the ball back and receiving a reward. Benno will be more motivated to perform this action again next time. However, it is not worthwhile for Benno to chase after a squirrel. He receives no reward for doing so. If he experiences several times that he is rewarded only for bringing back the ball and not for chasing the squirrel, his brain can make the connection between the internal representation "ball thrown," the action of running after the ball and bringing it back, and the reward. This means that he has learned to choose the response that is best for him in the given situation.
Reinforcement learning works in a very similar way and is used, among other things, for personalization of online shops. Unfortunately, the agent here is not so fluffy and does not bark. But like Benno, it must perceive its environment and, based on this perception, be able to decide on an action that in turn influences the environment.

Reinforcement learning for online shops
The environment we are interested in is online shops and the customers who interact with them. A customer's behavior in an online shop can be recorded on the server side. Just as Benno smells the ball, sees it fly away, and hears it hit the ground, the server records when the customer opens a new page (including the exact time), what they searched for, and whether they clicked on a product recommendation. The longer the customer browses the online shop, the longer the log, i.e., the record of their behavior. Like other machine learning methods, however, the agent requires an input vector of constant length every time it is to act, so the variable-length log has to be condensed into such a vector. This vector is the internal representation of the environment.
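To make the idea of a constant-length vector more concrete, here is a minimal sketch of how a session log of any length could be condensed into four numbers. The event fields (`timestamp`, `page_type`, `clicked_recommendation`) and the four chosen features are assumptions made for illustration; they are not the features actually used in production.

```python
from dataclasses import dataclass

@dataclass
class Event:
    """One logged page view (hypothetical structure)."""
    timestamp: float                     # seconds since the session started
    page_type: str                       # e.g. "category", "product", "search"
    clicked_recommendation: bool = False

def session_to_vector(log: list[Event]) -> list[float]:
    """Condense a log of any length into a vector of constant length 4."""
    if not log:
        return [0.0, 0.0, 0.0, 0.0]
    n = len(log)
    category_share = sum(e.page_type == "category" for e in log) / n
    search_share = sum(e.page_type == "search" for e in log) / n
    click_share = sum(e.clicked_recommendation for e in log) / n
    # Average time per page, estimated from the gaps between page views.
    duration = log[-1].timestamp - log[0].timestamp
    avg_dwell = duration / (n - 1) if n > 1 else 0.0
    return [category_share, search_share, click_share, avg_dwell]

short_log = [Event(0, "search"), Event(60, "product")]
long_log = [
    Event(0, "category"),
    Event(10, "category"),
    Event(20, "product"),
    Event(30, "category", clicked_recommendation=True),
]

# Both logs have different lengths, but the vectors have the same length.
print(session_to_vector(short_log))
print(session_to_vector(long_log))
```

Shares and averages are used here because they do not grow with the length of the log, which keeps short and long sessions comparable.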

Internal representation in the form of a vector
Just as Benno can perform various actions such as running or sniffing, the agent can also perform various actions. These actions then have an effect on the environment. Our agent influences the recommendations shown on a newly opened page of an online shop. For example, it can decide that only products of a certain brand should be displayed, or only products that cost a maximum of €20. It can also decide to do both at the same time, just as Benno could decide to fetch and bark at the same time.
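One way to picture such actions is as filters that can be combined. The following is a minimal sketch under that assumption; the `Product` and `Action` classes, the `apply_action` function, and the sample catalog are hypothetical and only serve to illustrate the brand and price examples from the text.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Product:
    name: str
    brand: str
    price_eur: float

@dataclass(frozen=True)
class Action:
    """A combination of filters; None means 'no restriction'."""
    brand: Optional[str] = None
    max_price_eur: Optional[float] = None

def apply_action(action: Action, candidates: list[Product]) -> list[Product]:
    """Keep only the candidate products that pass every active filter."""
    result = []
    for product in candidates:
        if action.brand is not None and product.brand != action.brand:
            continue
        if action.max_price_eur is not None and product.price_eur > action.max_price_eur:
            continue
        result.append(product)
    return result

# Made-up catalog for illustration.
catalog = [
    Product("T-shirt", "Acme", 15.0),
    Product("Jacket", "Acme", 120.0),
    Product("Socks", "Other", 8.0),
]

brand_only = Action(brand="Acme")
cheap_only = Action(max_price_eur=20.0)
both = Action(brand="Acme", max_price_eur=20.0)   # both filters at once

for action in (brand_only, cheap_only, both):
    print(action, [p.name for p in apply_action(action, catalog)])
```

Representing an action as a set of optional filters makes "do both at the same time" simply another action in the same space.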
The agent's decisions influence the product recommendations and personalized elements that the customer sees and can therefore also influence their behavior.
This procedure is repeated for many customers. Each individual online shopper thus becomes the agent's trainer. Over time, the agent learns which product recommendations are best for which customer behavior.
What makes the agent special is that it can respond to the different situations customers find themselves in. Customers with similar behavior generate similar vectors. For example, there are customers who are looking for something specific and know what they want. These customers tend to look at fewer category overview pages, but spend more time on average on each page they visit. For customers who want to browse and be inspired, the opposite tends to be true.
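The statement that similar behavior leads to similar vectors can be illustrated with a small distance calculation. This is only a sketch: the two features (share of category pages, average minutes per page), the example values, and the two group prototypes are invented for illustration.

```python
import math

# Each vector: [share of category overview pages, average minutes per page].
# All numbers are made up for illustration.
targeted_a = [0.10, 1.5]
targeted_b = [0.15, 1.3]
browser = [0.60, 0.3]

def distance(u: list[float], v: list[float]) -> float:
    """Euclidean distance between two behavior vectors."""
    return math.dist(u, v)

# Hypothetical typical vectors for the two customer groups from the text.
PROTOTYPES = {
    "targeted": [0.10, 1.4],
    "browsing": [0.60, 0.4],
}

def nearest_group(vector: list[float]) -> str:
    """Assign a vector to the group whose prototype is closest."""
    return min(PROTOTYPES, key=lambda group: distance(vector, PROTOTYPES[group]))

print(distance(targeted_a, targeted_b))   # small: similar behavior
print(distance(targeted_a, browser))      # larger: different behavior
print(nearest_group(targeted_b), nearest_group(browser))
```

In practice the features would need to be scaled to comparable ranges before distances are meaningful; minutes were chosen here so that both features have a similar magnitude.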
The agent learns to distinguish between these groups and to identify which action is most appropriate for each of them. This can increase sales compared to rigid strategies that perform the same actions for every customer based on preset rules.
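How an agent might learn the best action per customer group can be sketched with an epsilon-greedy strategy, one of the simplest reinforcement learning approaches. This is not the algorithm described in this series; it is a swapped-in textbook technique. The group names, action names, click probabilities, and the choice of "a click counts as reward 1" are all assumptions for illustration, and the shoppers are simulated.

```python
import random
from collections import defaultdict

ACTIONS = ["same_brand", "under_20_eur"]   # hypothetical action names

# Made-up click probabilities per (customer group, action), used only to
# simulate shoppers; a real shop would observe actual behavior instead.
CLICK_PROB = {
    ("targeted", "same_brand"): 0.30,
    ("targeted", "under_20_eur"): 0.05,
    ("browsing", "same_brand"): 0.05,
    ("browsing", "under_20_eur"): 0.25,
}

class EpsilonGreedyAgent:
    def __init__(self, actions, epsilon=0.2, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon                 # share of random exploration
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)         # (group, action) -> times tried
        self.values = defaultdict(float)       # (group, action) -> mean reward

    def best_action(self, group):
        """Action with the highest average reward observed for this group."""
        return max(self.actions, key=lambda a: self.values[(group, a)])

    def choose(self, group):
        """Mostly exploit the best known action, sometimes explore."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action(group)

    def learn(self, group, action, reward):
        """Update the running mean reward for this (group, action) pair."""
        key = (group, action)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]

def simulate(sessions=10_000, seed=1):
    env_rng = random.Random(seed)
    agent = EpsilonGreedyAgent(ACTIONS)
    for _ in range(sessions):
        group = env_rng.choice(["targeted", "browsing"])
        action = agent.choose(group)
        clicked = env_rng.random() < CLICK_PROB[(group, action)]
        agent.learn(group, action, 1.0 if clicked else 0.0)
    return agent

agent = simulate()
for group in ("targeted", "browsing"):
    print(group, "->", agent.best_action(group))
```

A rigid rule would show the same action to both groups; the agent instead keeps separate reward estimates per group and can therefore settle on a different action for each.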
With the right training, not only can dogs learn to fetch, but relevant product recommendations can also be generated in online shops. In reinforcement learning, the agent is trained with the different behaviors of shop users and can thus provide increasingly tailored recommendations for each customer.
In the second part of this blog series, we describe how we use real-time analytics to create input vectors from customer behavior. We also show what needs to be considered when tracking so that the agent can make good predictions.
In the third part, we take a closer look at the self-learning algorithms that our agent uses to determine which actions are best for which input vector.