  • Data Management
  • Artificial Intelligence

Big data in e-commerce—what you need to know

  • Published November 15, 2018
  • Daniela Ilincic
  • Reading time: 7 min.

As a shop operator, you have the opportunity to collect a lot of data about your users—in other words, to generate big data in e-commerce. However, this large collection of data does not initially provide you with any useful information. You need to perform targeted analyses for specific use cases in order to gather important insights. You can read about how this works in this blog article.


What is big data?

The term "big data" has two meanings that also apply to big data in e-commerce. On the one hand, the term describes the huge amount of data that is now available to many companies and users. On the other hand, big data refers to the methods and technologies used to generate value from this wealth of data.

Doug Laney's 3V model defines the following characteristics of big data:

  • Volume: describes the size of the data sets. Facebook, for example, holds data on more than 2 billion users.
  • Velocity: refers to the speed at which new data is added or changed. On YouTube, for example, 400 hours of video material are uploaded every minute.
  • Variety: refers to diverse and often unstructured data. An email, for example, has a structured header with the sender and subject line, but its body is unstructured.
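
To make the email example concrete, here is a minimal Python sketch using the standard library's email module. The message itself is made up for illustration; the point is only that the header fields can be read by name, while the body comes back as one block of free text.

```python
from email import message_from_string

# Hypothetical customer email, invented for this illustration.
RAW_EMAIL = (
    "From: shopper@example.com\n"
    "Subject: Question about my order\n"
    "\n"
    "Hi, I ordered a blue dress last week but it has not arrived yet.\n"
)

msg = message_from_string(RAW_EMAIL)

# Structured part: header fields can be addressed by name.
structured = {"sender": msg["From"], "subject": msg["Subject"]}

# Unstructured part: the body is free text that needs further analysis.
unstructured = msg.get_payload()

print(structured)
print(unstructured)
```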

There are various reasons why the term has been used more frequently in recent years. On the one hand, companies today have the opportunity to obtain much more data about their users than in the past. Thanks to smartphones, the increased use of mobile internet, and various analysis tools, the amount of available data has increased many times over.

On the other hand, storage options have improved thanks to cloud services, and processing this data has become easier thanks to ever faster computers.

Since big data also overlaps with artificial intelligence (AI), recent advances in AI and the enthusiasm surrounding them have further fueled interest in big data in e-commerce.




What is big data in e-commerce?

Big data describes all data that a company can collect in order to gain insights for future actions within the company. Big data in e-commerce represents all data relating to digital commerce that is collected in the context of online shopping. This can include, for example, data collected from the shop system, the merchandise management system, and click and purchase behavior in the online shop.

What happens to the collected data

If you are the operator of an online shop and now have a huge amount of data at your disposal, you are probably wondering how you can benefit from it. And rightly so, because a large collection of data does not initially provide any information. The trick is to process this data effectively and identify patterns and correlations that can provide useful information for your online shop. This process is called data mining.
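
As a rough illustration of what identifying patterns can mean in practice, the following Python sketch counts which products are bought together. The order data, the function name frequent_pairs, and the min_support threshold are assumptions made for this example and not a description of any particular data mining product. The threshold is one simple guard against reading too much into one-off coincidences.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order data: each inner list is one shopping cart.
ORDERS = [
    ["dress", "belt", "shoes"],
    ["dress", "belt"],
    ["shoes", "socks"],
    ["dress", "shoes"],
]

def frequent_pairs(orders, min_support=2):
    """Count product pairs per order and keep those seen at least min_support times."""
    counts = Counter()
    for order in orders:
        # sorted(set(...)) makes each pair unique and order-independent.
        for pair in combinations(sorted(set(order)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

pairs = frequent_pairs(ORDERS)
print(pairs)
```

Pairs that occur only once are dropped, so only combinations that repeat across several orders are treated as a pattern.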

To obtain this information, smart data is extracted from big data in e-commerce. Smart data is data that is useful for specific applications in e-commerce. This enables targeted analysis of large amounts of data.

This also creates a focus on the essentials and prevents random correlations from being misinterpreted as patterns. A quote from Stefan Jähnichen of the FZI Research Center for Information Technology, head of the Smart Data Accompanying Research, describes smart data very aptly: "Big data is therefore a kind of raw material that needs to be processed so that it can be refined into smart data and develop its full economic potential."⁴

The challenge in e-commerce

The challenge in e-commerce is therefore to process the "raw material" of big data into smart data for specific use cases so that actions can be carried out in the online shop for the benefit of online shoppers and shop operators. That is why, in addition to data mining, another process is used: machine learning. Machine learning provides the methods for building artificial intelligence in order to further process the available data and carry out actions in the online shop.

 


In summary, data mining performs targeted analyses and provides a suitable amount of data for machine learning. This data is then further processed and leads to actions in the online shop. 

Use cases for big data in e-commerce

To help you better understand smart data, we have compiled three use cases for you. These describe how you can extract and process smart data from big data using data mining and machine learning, and the advantages this offers:

Use Case 1: Reduction of click and purchase data for the display of recommendations

When an online shopper interacts with a website, they generate an enormous amount of data. Every click, every mouse movement, and every scroll can be recorded. This is what we refer to as big data. However, only a subset of this data is needed to calculate recommendations, such as:

  • which links were clicked,
  • which items were added to the shopping cart, and
  • how long the customer stayed on which page.

The first reduction therefore takes place during data recording. The second reduction is then carried out when calculating the model, by asking what information is helpful:

  • for the current online shop,
  • for the current customer, and
  • for the current problem.

This context-related information constitutes smart data.
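
The two reduction steps can be sketched in a few lines of Python. The event log, the field names, and the set RELEVANT_TYPES are hypothetical and only serve to illustrate the idea. A real recommendation model would do considerably more, and the third dimension (the current problem) is left out here to keep the sketch short.

```python
# Hypothetical raw event log: everything a shopper does can be recorded.
RAW_EVENTS = [
    {"shop": "shop-a", "customer": "c1", "type": "mouse_move", "product": None},
    {"shop": "shop-a", "customer": "c1", "type": "click", "product": "dress-1"},
    {"shop": "shop-a", "customer": "c1", "type": "scroll", "product": None},
    {"shop": "shop-a", "customer": "c1", "type": "add_to_cart", "product": "dress-1"},
    {"shop": "shop-a", "customer": "c2", "type": "click", "product": "belt-7"},
    {"shop": "shop-b", "customer": "c1", "type": "click", "product": "lamp-3"},
    {"shop": "shop-a", "customer": "c1", "type": "dwell", "product": "shoes-4"},
]

# Assumption: only these event types matter for recommendations.
RELEVANT_TYPES = {"click", "add_to_cart", "dwell"}

def reduce_at_recording(events):
    """First reduction: keep only the event types needed for recommendations."""
    return [e for e in events if e["type"] in RELEVANT_TYPES]

def reduce_for_context(events, shop, customer):
    """Second reduction: keep only events for the current shop and customer."""
    return [e for e in events if e["shop"] == shop and e["customer"] == customer]

recorded = reduce_at_recording(RAW_EVENTS)
smart_data = reduce_for_context(recorded, shop="shop-a", customer="c1")

print(len(RAW_EVENTS), len(recorded), len(smart_data))
```

In this toy log, seven raw events shrink to five recorded events and then to three context-relevant ones.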

Would you like more information about this use case? Then we recommend our blog article: Reinforcement learning process: How to increase the probability of purchase with real-time analytics (Part 2).

Use Case 2: Automatic product identification from existing information

Your product catalog contains information such as title, category, and description that describes a real product through a sequence of words and product IDs. Up to this point, we are talking about big data. However, when an algorithm extracts the actual product from this data, we are talking about smart data. Only when the algorithm recognizes the product as such can it place the product in the correct relationship to other products (semantics). An example:

 


Big data

  • Title: Phase Eight Leto Floral Dress
  • Category: Women's dresses/Other dresses
  • Description: This jersey dress features contrasting floral prints for a bright and modern finish. A twist detail at the waist creates a flattering silhouette that ends in a slightly asymmetrical shape below the knee.

Smart data

  • Dress -> Jersey dress


 

This makes it possible, for example, to recognize whether a product is a top or a bottom, which is important for displaying outfit recommendations in the online shop. Through this smart data extraction, we were able to achieve a double-digit increase in sales for one of our customers.
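
The dress example can be approximated with a very simple keyword lookup in Python. This is only a sketch under stated assumptions: the TAXONOMY table and the function extract_product_type are invented for illustration, and a plain substring search stands in for whatever algorithm is actually used in production.

```python
# Hypothetical taxonomy: keyword phrase -> (general type, specific type).
TAXONOMY = {
    "jersey dress": ("Dress", "Jersey dress"),
    "dress": ("Dress", None),
    "jeans": ("Trousers", "Jeans"),
    "t-shirt": ("Top", "T-shirt"),
}

def extract_product_type(title, category, description):
    """Return (general, specific) for the first matching phrase, or None."""
    text = " ".join([title, category, description]).lower()
    # Check longer phrases first so that "jersey dress" wins over "dress".
    for phrase in sorted(TAXONOMY, key=len, reverse=True):
        if phrase in text:
            return TAXONOMY[phrase]
    return None

product_type = extract_product_type(
    title="Phase Eight Leto Floral Dress",
    category="Women's dresses/Other dresses",
    description="This jersey dress features contrasting floral prints "
                "for a bright and modern finish.",
)
print(product_type)
```

A substring match like this is easy to fool (for example, "dress" also occurs in "address"), which is one reason why real systems rely on more robust language processing.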




Use Case 3: Extracting color values for locating similar products

If an online shop has 200,000 image files, for example, then this is referred to as big data. If you extract the colors and their color components from these image files for a specific purpose, the resulting color values are smart data. The advantage is that you can work directly with the color values to find similar products in your online shop, instead of relying only on very rough color names such as "blue."

Finding products in the online shop via color terms entered in the search function (no color information in the product catalog)
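
How color values can be used to find similar products is sketched below in Python. Everything here is an assumption for illustration: the pixel data is made up, each image is reduced to a single average RGB color, and similarity is measured as plain Euclidean distance in RGB space. Real systems typically extract several dominant colors and use a perceptual color space.

```python
import math

# Hypothetical pixel samples (R, G, B) per product image.
PRODUCT_PIXELS = {
    "navy-dress": [(10, 20, 120), (14, 24, 130)],
    "sky-shirt": [(120, 180, 240), (130, 190, 250)],
    "red-scarf": [(200, 30, 40), (210, 20, 30)],
}

def average_color(pixels):
    """Reduce an image to one color value: the mean of each RGB channel."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

# Smart data: one compact color value per product instead of the full image.
COLOR_INDEX = {pid: average_color(px) for pid, px in PRODUCT_PIXELS.items()}

def rank_by_color(query_rgb, index):
    """Sort product IDs by Euclidean distance between their color and the query."""
    return sorted(index, key=lambda pid: math.dist(index[pid], query_rgb))

# A search for "blue", expressed as an RGB value rather than a color name.
ranking = rank_by_color((0, 0, 255), COLOR_INDEX)
print(ranking)
```

Because the comparison works on numbers, the same index can also answer questions such as "which products have a similar color to this one", which a color name in the catalog could not.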

Conclusion: Big data is good, smart data is better

Big data is important for business success. That much is clear. However, the extraction of smart data is an important success factor, as the use cases for big data in e-commerce have shown. This allows you to make the collected data usable for specific applications. Important methods for this are data mining and machine learning.

 

Sources: ¹ Rising Media Ltd., AllFacebook.de; ² Statista; ³ Gesellschaft für Informatik e. V. (GI), Smart Data Accompanying Research, c/o LoeschHundLiepold Kommunikation GmbH, Newsletter, Issue 1, August 2015

Daniela Ilincic
Head of Marketing
Daniela Ilincic is Head of Marketing at Epoq. She comes from a background in digital marketing, specializing in SEO and content marketing. She established the digital sales channel at Epoq, which she continues to optimize with her team. In addition to her work, she enjoys sharing market-relevant information on digital topics.