Data mining fundamentals

A grocery chain used the data mining capabilities of a software package to analyze local buying patterns. They discovered that when men bought diapers on Thursdays and Saturdays, they also tended to buy beer. Further analysis showed that these shoppers typically did their weekly grocery shopping on Saturdays. On Thursdays, however, they bought only a few items. The retailer concluded that they purchased the beer to have it available for the upcoming weekend.

The grocery chain could use this newly discovered information in various ways to increase revenue. For example, they could move the beer display closer to the diaper display. They could also make sure beer and diapers were sold at full price on Thursdays.

That is how data mining helps in everyday business scenarios.
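The kind of co-occurrence finding in the grocery story can be expressed with three standard association-rule measures: support, confidence, and lift. The following is only a minimal sketch. The baskets are invented for illustration, and the function name rule_metrics is hypothetical; it is not the method the retailer actually used.

```python
# Hypothetical shopping baskets, invented purely for illustration.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "milk"},
]


def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(baskets)
    both = sum(1 for b in baskets if antecedent in b and consequent in b)
    ante = sum(1 for b in baskets if antecedent in b)
    cons = sum(1 for b in baskets if consequent in b)

    # Support: share of all baskets that contain both items.
    support = both / n
    # Confidence: share of antecedent baskets that also contain the consequent.
    confidence = both / ante if ante else 0.0
    # Lift: confidence relative to the consequent's overall frequency.
    # A value above 1 suggests the items appear together more often than chance.
    lift = confidence / (cons / n) if cons else 0.0
    return support, confidence, lift


support, confidence, lift = rule_metrics(baskets, "diapers", "beer")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.3f}")
```

A lift above 1 for "diapers -> beer" would be the sort of signal that prompts a closer look at when and why the two items are bought together.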

So, what is the definition of data mining?

A simple definition: data mining is the extraction of hidden predictive information from large databases.

Data mining is the process of analyzing data from different perspectives and summarizing it into useful information, that is, information that can be used to increase revenue, cut costs, or both.

What is the need for data mining?

With the growth of digital systems around the world, we now have huge collections of data about almost everything. What is the use of this data if it just sits in some remote storage? We need to infer something from it through the data mining process. Two factors underline the need for data mining:

  • Too much data and too little information
  • There is a need to extract useful information from the data and to interpret the data

If we know how to reveal the valuable knowledge hidden in raw data, data can be one of our most valuable assets. Data mining is the tool for extracting diamonds of knowledge from historical data and predicting the outcomes of future situations.

Data mining makes it easier to answer questions an organization faces, such as:

  • What goods should be promoted to this customer?
  • What is the probability that a certain customer will respond to a planned promotion?
  • Can one predict the most profitable securities to buy/sell during the next trading session?
  • Will this customer default on a loan or pay back on schedule?
  • What medical diagnosis should be assigned to this patient?
  • How large are the peak loads of a telephone or energy network going to be?
  • Why does the facility suddenly start to produce defective goods?

Is data mining a new technology?

The term 'data mining' is relatively new; it was introduced in the early 1990s. The technology, however, is not recent. Companies have used powerful computers to sift through volumes of supermarket scanner data and analyze market research reports for years. However, continuous innovations in computer processing power, disk storage, and statistical software are dramatically increasing the accuracy of analysis while driving down the cost.

The roots of data mining can be traced back along three family lines: classical statistics, artificial intelligence, and machine learning.

Data mining, in many ways, is fundamentally the adaptation of machine learning techniques to business applications. It is best described as the union of historical and recent developments in statistics, AI, and machine learning. These techniques are used together to study data and find previously hidden trends or patterns within it.

Where is data mining technology used at present?

Today, data mining is used primarily by companies with a strong focus on customers or consumers. Most retail, financial, communication, and marketing organizations fall into this category. These organizations use data mining technologies for several purposes, including the ones below:

  • To determine relationships among "internal" factors such as price, product positioning, or staff skills, and "external" factors such as economic indicators, competition, and customer demographics.
  • To determine the impact on sales, customer satisfaction, and corporate profits
  • To enable them to "drill down" into summary information to view detailed transactional data
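The "drill down" idea in the last item can be illustrated with a short sketch: start from a summary total, then filter back to the underlying transactions behind one figure. The records, field names, and the helper functions summarize and drill_down are all hypothetical, invented only for this example.

```python
from collections import defaultdict

# Hypothetical transaction records, invented purely for illustration.
transactions = [
    {"store": "North", "product": "beer", "amount": 12.0},
    {"store": "North", "product": "diapers", "amount": 20.0},
    {"store": "South", "product": "beer", "amount": 8.0},
    {"store": "South", "product": "milk", "amount": 3.0},
    {"store": "North", "product": "beer", "amount": 6.0},
]


def summarize(rows, key):
    """Total the 'amount' field of the rows, grouped by the given key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


def drill_down(rows, **filters):
    """Return the detail rows that match every field=value filter."""
    return [r for r in rows if all(r[k] == v for k, v in filters.items())]


# Summary level: total sales per store.
by_store = summarize(transactions, "store")

# Drill down: the individual transactions behind the "North" total,
# then a finer summary of those transactions by product.
north_detail = drill_down(transactions, store="North")
north_by_product = summarize(north_detail, "product")

print(by_store)
print(north_by_product)
```

In practice this kind of navigation is usually done inside a data warehouse or reporting tool rather than in hand-written code; the sketch only shows the relationship between a summary figure and the transactions it aggregates.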

Real-world examples of data mining usage

  • Blockbuster Entertainment: The company mines its video rental history database to recommend rentals to individual customers.
  • American Express: The company suggests products to its cardholders based on analysis of their monthly expenditures.
  • WalMart: WalMart is pioneering massive data mining to transform its supplier relationships. WalMart captures point-of-sale transactions across all its stores and continuously transmits this data to its massive data warehouse. The suppliers use the data to identify customer buying patterns at the store display level. They use this information to manage local store inventory and identify new merchandising opportunities.