Kimball and Inmon Approaches to Data Warehousing

Bill Inmon and Ralph Kimball are both stalwarts of the data warehousing and business intelligence industry, and both have contributed immensely to defining its standards and frameworks. Each has defined a method for designing a data warehouse for any company. Both philosophies have their own advantages and differentiating factors, and enterprises continue to use either one.

Bill Inmon's enterprise data warehouse approach is better known as the top-down approach. In this method, a normalized data model is designed first. The dimensional data marts, which contain the data required for specific business processes or specific departments, are then created from the data warehouse.

Ralph Kimball's dimensional design approach is better known as the bottom-up approach. In this method, the data marts that facilitate reports and analysis are created first; these are then combined to create a broad data warehouse.

Top Down Approach in Data Warehousing

The top-down approach is designed using a normalized enterprise data model. "Atomic" data, that is, data at the lowest level of detail, is stored in the data warehouse. Inmon gave a classic definition of a data warehouse's characteristics:

  • Subject-oriented: The data in the data warehouse is organized so that all the data elements relating to the same real-world event or object are linked together
  • Time-variant: The changes to the data in the database are tracked and recorded so that reports can be produced showing changes over time
  • Non-volatile: Data in the data warehouse is never over-written or deleted -- once committed, the data is static, read-only, and retained for future reporting
  • Integrated: The database contains data from most or all of an organization's operational applications, and this data is made consistent
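
The time-variant and non-volatile properties listed above can be illustrated with a small append-only store. This is only a sketch under stated assumptions: the names CustomerVersion, CustomerHistory, record and as_of are hypothetical and do not come from Inmon's work, and a real warehouse would enforce these rules in the database rather than in application code.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)  # frozen: a committed row cannot be modified in place
class CustomerVersion:
    customer_id: int
    city: str
    valid_from: date


class CustomerHistory:
    """Hypothetical append-only store: a change adds a version, nothing is overwritten."""

    def __init__(self) -> None:
        self._rows: List[CustomerVersion] = []

    def record(self, customer_id: int, city: str, valid_from: date) -> None:
        # Non-volatile: only appends are offered, there is no update or delete.
        self._rows.append(CustomerVersion(customer_id, city, valid_from))

    def as_of(self, customer_id: int, when: date) -> Optional[CustomerVersion]:
        # Time-variant: return the version that was in effect on the given date.
        candidates = [
            row for row in self._rows
            if row.customer_id == customer_id and row.valid_from <= when
        ]
        return max(candidates, key=lambda row: row.valid_from, default=None)

    def __len__(self) -> int:
        return len(self._rows)


history = CustomerHistory()
history.record(1, "Boston", date(2020, 1, 1))
history.record(1, "Denver", date(2022, 6, 1))

city_in_2021 = history.as_of(1, date(2021, 1, 1)).city
city_in_2023 = history.as_of(1, date(2023, 1, 1)).city
```

Because the move to Denver is stored as a second row, a report for 2021 still sees Boston while a report for 2023 sees Denver.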

Top-down approach - Corporate Information Factory

With Inmon's approach, dimensional data marts are created only after the complete data warehouse has been created. The data warehouse is therefore at the center of the Corporate Information Factory (CIF), which provides a logical framework for delivering business intelligence. The CIF relies on a data warehouse linked to assorted other components that help a business use its data and make the best use of valuable internal resources.
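
As a rough illustration of that flow, the sketch below builds a tiny normalized warehouse in an in-memory SQLite database and then derives a dimensional sales mart from it. All table names, column names and sample rows are assumptions made for this example, not part of the CIF itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized warehouse tables (hypothetical names and data)
    CREATE TABLE region   (region_id INTEGER PRIMARY KEY, region_name TEXT);
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, region_id INTEGER);
    CREATE TABLE sale     (sale_id INTEGER PRIMARY KEY, customer_id INTEGER,
                           sale_date TEXT, amount REAL);

    INSERT INTO region   VALUES (1, 'East'), (2, 'West');
    INSERT INTO customer VALUES (10, 'Acme', 1), (11, 'Globex', 2);
    INSERT INTO sale     VALUES (100, 10, '2024-01-05', 250.0),
                                (101, 10, '2024-01-09', 100.0),
                                (102, 11, '2024-01-07', 400.0);

    -- Sales data mart derived from the warehouse: the customer dimension
    -- is denormalized by folding the region name into it.
    CREATE TABLE mart_dim_customer AS
        SELECT c.customer_id, c.name, r.region_name
        FROM customer c JOIN region r ON r.region_id = c.region_id;

    CREATE TABLE mart_fact_sales AS
        SELECT sale_id, customer_id, sale_date, amount FROM sale;
""")

# A typical mart query: one join from fact to dimension, then aggregate.
sales_by_region = conn.execute("""
    SELECT d.region_name, SUM(f.amount)
    FROM mart_fact_sales f JOIN mart_dim_customer d USING (customer_id)
    GROUP BY d.region_name
    ORDER BY d.region_name
""").fetchall()
```

The point of the design is that the mart holds nothing that is not already in the warehouse, so every mart built this way inherits the same integrated definitions.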

Pros and Cons of top-down approach

An advantage of the top-down approach to warehouse data implementation is that warehouse managers and top corporate executives analyze the warehouse’s data system needs, compare various products, consult with accounting professionals in their industry, and determine the best approach to follow. Managers then train subordinates in how to use the systems once they are in place. Following this approach, a company maintains full control over the costs, inputs, and outputs of its warehouse data implementation from start to finish. The result is an integrated, flexible architecture that supports downstream analytic data structures.

The top-down approach has some drawbacks, so the method should be chosen based on the needs of the organization. When all decision making comes from the top with no input from subordinates, the true needs of the organization can be miscalculated. In the bottom-up approach to warehouse data implementation, by contrast, front-line warehouse employees have input into the decision about what type of system to implement. People with first-hand knowledge of the daily workings of the warehouse offer suggestions, explain what they see as the benefits of various systems, and outline their concerns about each. Management still makes the final decision about which warehouse data implementation system to use, but does so with the input of the people who will use the system every day.

Kimball Data Warehousing

Kimball’s data warehousing architecture is also known as the Data Warehouse Bus (BUS). Dimensional modeling focuses on ease of access for end users and gives the data warehouse a high level of performance.

The name "bottom-up approach" arose from comparison with Inmon's approach; a better description of the Kimball methodology is "iterative development and deployment techniques" for building a data warehouse. Kimball's data warehousing starts with building data marts. Each data mart contains data at an atomic level as well as at a summarized level, representing all of today’s and the future’s information needs. The different data marts are connected via a dimensional bus system, allowing the user to access the data in all data marts.
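
In Kimball's terminology, the bus works because the data marts share conformed dimensions, that is, dimensions with identical keys and attributes. The sketch below is a minimal illustration of that idea, assuming two hypothetical marts (sales and inventory) that share one product dimension; every name and figure in it is invented for the example.

```python
from collections import defaultdict

# Conformed dimension shared by both marts (hypothetical data).
dim_product = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Books"},
}

# Fact rows in each mart: (product_key, measure).
sales_facts = [(1, 120.0), (2, 80.0), (3, 15.0), (1, 60.0)]   # revenue
inventory_facts = [(1, 40), (2, 10), (3, 200)]                # units on hand


def total_by_category(facts, dimension):
    """Aggregate a mart's measure by an attribute of the shared dimension."""
    totals = defaultdict(float)
    for product_key, measure in facts:
        totals[dimension[product_key]["category"]] += measure
    return dict(totals)


sales = total_by_category(sales_facts, dim_product)
stock = total_by_category(inventory_facts, dim_product)

# "Drill across": results from separate marts line up on the same
# category values because both marts use the same dimension.
drill_across = {
    category: (sales.get(category, 0.0), stock.get(category, 0.0))
    for category in sorted(set(sales) | set(stock))
}
```

If each mart kept its own product list with its own category names, the final step would have no reliable key to combine on.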

Pros and Cons of bottom-up approach

An important positive of the bottom-up approach is that it provides high flexibility and user-friendliness, because it is based on the individual business departments’ information needs. Another critical factor is the short response time that results from its lightweight, iterative model. New data is easily integrated in this model. The incremental development, building data mart after data mart, allows the data warehouse to be put to use quickly and developed cost-efficiently.

On the downside, the bottom-up approach risks a lack of integration, and redundancy may not be avoidable. As pointed out by Inmon, data marts are developed autonomously from each other and thus may contain redundant data. Separate data marts containing different data may obstruct a unified, company-wide view, so that an integrated view is not possible.

A major criticism of the bottom-up approach is that it increases the complexity of constructing an integrated data warehouse and increases the danger of departments building stand-alone solutions.
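
The redundancy risk described above can be made concrete with a small check. The sketch below assumes two independently built marts that each keep their own copy of the customer list; the mart names, the sample data and the compare_dimensions helper are all hypothetical.

```python
# Customer dimensions kept separately by two departments (hypothetical data).
mart_sales_customers = {10: "Acme Corp", 11: "Globex"}
mart_support_customers = {10: "ACME Corporation", 12: "Initech"}


def compare_dimensions(a, b):
    """Report keys whose values disagree and keys present in only one mart."""
    shared = set(a) & set(b)
    conflicts = {key: (a[key], b[key]) for key in shared if a[key] != b[key]}
    only_in_a = sorted(set(a) - set(b))
    only_in_b = sorted(set(b) - set(a))
    return conflicts, only_in_a, only_in_b


conflicts, only_in_sales, only_in_support = compare_dimensions(
    mart_sales_customers, mart_support_customers
)
```

Here customer 10 is stored twice under two different names, and each mart knows a customer the other does not, which is the kind of mismatch that gets in the way of a company-wide view.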

Top-down or Bottom-up: Which one is better?

"You can catch all the minnows in the ocean and stack them together and they still do not make a whale." ~Inmon

"The data warehouse is nothing more than the union of all the data marts" ~Kimball

Little can be concluded from these two quotes alone. There is no right or wrong between the two ideas, as they represent different data warehousing philosophies. Each methodology suits an organization depending on certain criteria.