Big data in e-commerce—what you need to know
As a shop operator, you have the opportunity to collect a lot of data about your users—in other words, to generate big data in e-commerce. However, this large collection of data does not initially provide you with any useful information. You need to perform targeted analyses for specific use cases in order to gather important insights. You can read about how this works in this blog article.
Here's what you can expect to find in this blog article:
What is big data in e-commerce?
What happens to the collected data
What the challenge is in e-commerce
Use cases for big data in e-commerce
Use case 1: Reduction of click and purchase data for the display of recommendations
Use case 2: Automatic product identification from existing information
Use case 3: Extracting color values for locating similar products
What is big data?
The term "big data" has two meanings that also apply to big data in e-commerce. On the one hand, the term describes the huge amount of data that is now available to many companies and users. On the other hand, big data refers to the methods and technologies used to generate value from this wealth of data.
Doug Laney's 3V model defines the following characteristics of big data:
- Volume: describes the size of the data sets. Facebook, for example, has data on its more than 2 billion users.¹
- Velocity: refers to the speed at which new data is added or changed. On YouTube, for example, 400 hours of video material are uploaded every minute.²
- Variety: refers to diverse and often unstructured data, such as in an email. An email has a structured header with the sender and subject line. However, the body is unstructured.
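The email example can be made concrete with a minimal Python sketch. The message text and the address are invented for illustration; the sketch only assumes the email parser from Python's standard library.

```python
from email import message_from_string

# Invented sample message, for illustration only.
RAW_EMAIL = (
    "From: shopper@example.com\n"
    "Subject: Question about my order\n"
    "\n"
    "Hi, the dress I ordered last week is too small. Can I swap it?\n"
)

message = message_from_string(RAW_EMAIL)

# Structured part: header fields can be read directly by name.
sender = message["From"]
subject = message["Subject"]

# Unstructured part: the body is free text that needs further analysis
# before anything useful can be derived from it.
body = message.get_payload()

print(sender, "|", subject)
print(body)
```

The header fields can be queried like columns in a table, whereas the body would first have to be interpreted, for example with text analysis.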
There are various reasons why the term has been used more frequently in recent years. On the one hand, companies today have the opportunity to obtain much more data about their users than in the past. Thanks to smartphones, the increased use of mobile internet, and various analysis tools, the amount of available data has increased many times over.
On the other hand, storage options have improved thanks to cloud services, and processing this data has become easier thanks to ever faster computers.
Since big data also extends into artificial intelligence (AI), recent advances in AI and the associated enthusiasm have further fueled the topic of big data in e-commerce.
What is big data in e-commerce?
Big data describes all data that a company can collect in order to gain insights for future actions within the company. Big data in e-commerce represents all data relating to digital commerce that is collected in the context of online shopping. This can include, for example, data collected from the shop system, the merchandise management system, and click and purchase behavior in the online shop.
What happens to the collected data
If you are the operator of an online shop and now have a huge amount of data at your disposal, you are probably wondering how you can benefit from it. And rightly so, because a large collection of data does not initially provide any information. The trick is to process this data effectively and identify patterns and correlations that can provide useful information for your online shop. This process is called data mining.
To obtain this information, smart data is extracted from big data in e-commerce. Smart data is useful data for specific applications in e-commerce. This enables targeted analysis of large amounts of data.
This also creates a focus on the essentials and prevents random correlations from being misinterpreted as patterns. A quote from Stefan Jähnichen of the FZI Research Center for Information Technology, head of the Smart Data accompanying research, describes smart data very aptly: "Big data is therefore a kind of raw material that needs to be processed so that it can be refined into smart data and develop its full economic potential."⁴
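As a rough illustration of what searching for patterns can look like, the following Python sketch counts which products are bought together. The order data, the function names, and the minimum count are assumptions made up for this example. The threshold is only a crude guard against chance hits, not a statistical test.

```python
from collections import Counter
from itertools import combinations

# Hypothetical orders: each order is the set of product IDs bought together.
ORDERS = [
    {"dress", "belt"},
    {"dress", "belt", "shoes"},
    {"shoes", "socks"},
    {"dress", "shoes"},
]


def count_co_purchases(orders):
    """Count how often each pair of products appears in the same order."""
    pair_counts = Counter()
    for order in orders:
        # Sorting makes ("belt", "dress") and ("dress", "belt") the same key.
        for pair in combinations(sorted(order), 2):
            pair_counts[pair] += 1
    return pair_counts


def frequent_pairs(orders, min_count=2):
    """Keep only pairs seen at least min_count times."""
    counts = count_co_purchases(orders)
    return {pair: n for pair, n in counts.items() if n >= min_count}


pairs = frequent_pairs(ORDERS)
print(pairs)
```

Pairs that occur only once are dropped here, which mirrors the idea of not reading a pattern into a single coincidence.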
What the challenge is in e-commerce
The challenge in e-commerce is therefore to process the "raw material" of big data into smart data for specific use cases so that actions can be carried out in the online shop for the benefit of online shoppers and shop operators. That is why, in addition to data mining, another process is used: machine learning. Machine learning provides the methods for building artificial intelligence in order to further process the available data and carry out actions in the online shop.
In summary, data mining performs targeted analyses and provides a suitable amount of data for machine learning. This data is then further processed and leads to actions in the online shop.
Use cases for big data in e-commerce
To help you better understand smart data, we have compiled three use cases for you. These describe how you can extract and process smart data from big data using data mining and machine learning, and the advantages this offers:
Use Case 1: Reduction of click and purchase data for the display of recommendations
When an online shopper interacts with a website, they generate an enormous amount of data. Every click, every mouse movement, and every scroll can be recorded. This is what we refer to as big data. However, only a subset of this data is needed to calculate recommendations, such as:
- which links were clicked,
- which items were added to the shopping cart, and
- how long the customer stayed on each page.
The first reduction therefore takes place during data recording. The second reduction is then carried out when calculating the model, by asking what information is helpful:
- for the current online shop,
- for the current customer, and
- for the current problem.
This context-related information constitutes smart data.
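The two reduction steps can be sketched in Python as follows. This is a simplified illustration, not the implementation used in production: the `Event` fields, the event kinds, and the session IDs are hypothetical, and the second step is reduced to a plain filter on the current customer's session.

```python
from dataclasses import dataclass


@dataclass
class Event:
    session_id: str       # hypothetical identifier of the customer session
    kind: str             # e.g. "click", "add_to_cart", "scroll", "page_view"
    target: str           # link, product ID, or page the event refers to
    seconds: float = 0.0  # dwell time, only meaningful for "page_view"


# Assumption: these event kinds are the ones needed for recommendations.
RELEVANT_KINDS = {"click", "add_to_cart", "page_view"}


def reduce_at_recording(events):
    """First reduction: keep only the event kinds needed for recommendations."""
    return [e for e in events if e.kind in RELEVANT_KINDS]


def reduce_for_model(events, session_id):
    """Second reduction: keep only the events of the current customer."""
    return [e for e in events if e.session_id == session_id]


raw_events = [
    Event("s1", "mouse_move", "x=10,y=20"),
    Event("s1", "click", "/dresses"),
    Event("s1", "scroll", "40%"),
    Event("s1", "page_view", "/dresses", seconds=35.0),
    Event("s1", "add_to_cart", "dress-123"),
    Event("s2", "click", "/shoes"),
]

recorded = reduce_at_recording(raw_events)
smart = reduce_for_model(recorded, "s1")
print(len(raw_events), "->", len(recorded), "->", len(smart))
```

Mouse movements and scrolls are discarded in the first step, and events from other sessions in the second, so the model only sees the data that matters in the current context.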
Would you like more information about this use case? Then we recommend our blog article: Reinforcement learning process: How to increase the probability of purchase with real-time analytics (Part 2).
Use Case 2: Automatic product identification from existing information
Your product catalog contains information such as title, category, and description, which describes a real product as a sequence of words and product IDs. Up to this point, we are talking about big data. However, when an algorithm extracts the actual product from this data, we are talking about smart data. Only when the algorithm recognizes the product as such can it place it in the correct relationship to other products (semantics). An example:
Big Data
Title: Phase Eight Leto Floral Dress
Category: Women's dresses/Other dresses
Description: This jersey dress features contrasting floral prints for a bright and modern finish. A twist detail at the waist creates a flattering silhouette that ends in a slightly asymmetrical shape below the knee.
Smart Data
Dress -> Jersey dress
This makes it possible to recognize, for example, whether a product is a top or a bottom. This is important for displaying outfit recommendations in the online shop. Through this smart data extraction, we were able to achieve a double-digit increase in sales for one of our customers.
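To give a feel for the idea, here is a deliberately simple keyword-based sketch in Python. It is not the algorithm behind the result described above: the vocabulary, the outfit slots, and the function name are hypothetical, and a real system would more likely rely on a trained model than on a hand-written word list.

```python
import re

# Hypothetical, hand-written vocabulary:
# specific product term -> (general product type, outfit slot).
PRODUCT_TYPES = {
    "jersey dress": ("dress", "one-piece"),
    "dress": ("dress", "one-piece"),
    "t-shirt": ("shirt", "top"),
    "jeans": ("trousers", "bottom"),
}


def identify_product(title, category, description):
    """Return the most specific known product term found in the catalog text."""
    text = " ".join([title, category, description]).lower()
    # Try longer terms first so "jersey dress" wins over plain "dress".
    for term in sorted(PRODUCT_TYPES, key=len, reverse=True):
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            general, slot = PRODUCT_TYPES[term]
            return {"type": general, "subtype": term, "slot": slot}
    return None  # nothing recognized


result = identify_product(
    title="Phase Eight Leto Floral Dress",
    category="Women's dresses/Other dresses",
    description="This jersey dress features contrasting floral prints "
                "for a bright and modern finish.",
)
print(result)
```

For the catalog entry from the example, the sketch arrives at the same reduction as shown above: a dress, more specifically a jersey dress.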
Use Case 3: Extracting color values for locating similar products
If an online shop has 200,000 image files, for example, then this is referred to as big data. If you extract the colors and their color components from the 200,000 image files for a specific purpose, the color values correspond to smart data. The advantage of this is that you can work directly with the color values to find similar products in your online store, instead of just using very rough linguistic color values such as "blue."

Localization of products in the online shop via color entries in the search function (no color information in the product catalog)
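The following Python sketch illustrates the principle with plain RGB values. It rests on several assumptions: the "images" are tiny invented pixel lists instead of real files, each product is reduced to its average color, and similarity is measured as Euclidean distance in RGB space. A real pipeline would read pixels with an image library and would probably use a perceptually more accurate color space. `math.dist` requires Python 3.8 or later.

```python
import math


def average_color(pixels):
    """Return the mean (R, G, B) of a list of RGB pixel tuples."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def color_distance(a, b):
    """Euclidean distance between two RGB colors."""
    return math.dist(a, b)


def most_similar(query_color, catalog_colors, k=2):
    """Return the k product IDs whose color is closest to query_color."""
    ranked = sorted(
        catalog_colors,
        key=lambda pid: color_distance(query_color, catalog_colors[pid]),
    )
    return ranked[:k]


# Hypothetical tiny "images": a few RGB pixels per product.
IMAGES = {
    "navy-dress": [(10, 20, 90), (14, 24, 100)],
    "sky-shirt": [(60, 110, 200), (70, 120, 210)],
    "red-scarf": [(200, 20, 30), (210, 30, 40)],
}

# The extracted color values: one average color per product.
CATALOG_COLORS = {pid: average_color(px) for pid, px in IMAGES.items()}

# Search for products close to a dark blue, given as an RGB value.
result = most_similar((0, 0, 128), CATALOG_COLORS, k=2)
print(result)
```

Because the search works on numeric color values, a query for a particular shade of blue ranks the two blue products ahead of the red one, even though no color word appears in the catalog.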
Conclusion: Big data is good, smart data is better
Big data is important for business success. That much is clear. However, the extraction of smart data is an important success factor, as the use cases for big data in e-commerce have shown. This allows you to make the collected data usable for specific applications. Important methods for this are data mining and machine learning.
Sources: ¹ Rising Media Ltd., AllFacebook.de; ² Statista; ³ Gesellschaft für Informatik e. V. (GI); ⁴ Smart Data Accompanying Research, c/o LoeschHundLiepold Kommunikation GmbH, Newsletter, Issue 1, August 2015
Daniela Ilincic
Head of Marketing