Data from several million users converges in online shops every day. But how can this data be structured, stored and evaluated most efficiently? Our customer, one of the largest online mail order companies, needed a database solution that meets all these requirements and massively reduces operating and hardware costs.
Case Study: Quick fraud detection

The task

The online shop of a large Internet retailer is visited by several million people every day. All of these visitors generate search queries, navigation traces and purchase transactions. Until now, our customer has stored most of this information in relational databases. However, these are not designed to handle such a large volume of unstructured raw e-commerce data, and their licensing costs are very high. Our customer's goal was therefore to design a database solution optimally suited to this application.

The added value

With the switch to the new Hadoop platform, our customer can now store very large amounts of raw data in a fail-safe manner and retrieve it with high performance. In addition, this solution has massively reduced the operating and hardware costs of data storage.

The solution

From the outset, the focus was on a commercial Hadoop distribution, as this system offers the best performance for companies with enormous data volumes. By choosing the right version of the framework, we ensured that support is available for problems in everyday operation, and a suitable query tool made it easy for users to switch to the new system.

Using test data and queries, WidasConcepts compared the distributions and tools with each other and checked them for handling, performance and stability. In addition, we evaluated the Parquet file format as a sensible way to store the files. Finally, in cooperation with the departments, we reviewed all requirements for the new platform and weighed them against the project objectives.
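A comparison of this kind can be sketched in a few lines. The following is only an illustration under stated assumptions, not the harness used in the project: the candidate names, the test data and the two stand-in queries are hypothetical. It runs each candidate query several times against the same test data and records the median latency (performance) and the number of failed runs (stability).

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(candidates: Dict[str, Callable[[List[dict]], object]],
              test_data: List[dict], runs: int = 5) -> Dict[str, dict]:
    """Run each candidate query `runs` times; record median latency and failures."""
    report = {}
    for name, query in candidates.items():
        timings, failures = [], 0
        for _ in range(runs):
            start = time.perf_counter()
            try:
                query(test_data)
            except Exception:
                failures += 1  # a failed run counts against stability
                continue
            timings.append(time.perf_counter() - start)
        report[name] = {
            "median_seconds": statistics.median(timings) if timings else None,
            "failures": failures,
        }
    return report


# Hypothetical test data: one record per purchase transaction.
test_data = [{"user": i % 100, "amount": i * 0.5} for i in range(10_000)]


def total_amount(rows):
    # Stand-in for a query executed by one candidate tool.
    return sum(r["amount"] for r in rows)


def broken_query(rows):
    # Stand-in for a candidate that fails on every run.
    raise RuntimeError("simulated outage")


report = benchmark({"candidate_a": total_amount, "candidate_b": broken_query},
                   test_data)
for name, result in report.items():
    print(name, result)
```

In a real evaluation, the stand-in functions would be replaced by calls that submit the same test queries to each distribution or query tool under comparison.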

The implemented technologies

Cloudera CDH, MapR Hadoop