Data Warehousing: An Overview


Outline
- What is data warehousing? (Definition)
- Why does anyone need it? (Applications)
- How is the data organized? (Star schema)
- Implementation issues

Data Warehouse Definitions
- Dyché: Used for decision making; duplicates existing data; a combination of hardware, specialized software, and data extracted from other corporate systems.
- Inmon: A subject-oriented, integrated, nonvolatile, and time-variant collection of data in support of management decisions.

Why Warehouse?
- Provide a single view of customers across the enterprise
- Improve turnaround time for common reports
- Monitor customer behavior
- Predict future purchases
- Improved responsiveness
- Business issues

Coca-Cola & IBM
- IBM is helping Coca-Cola with its warehouse.
- Dealing with global companies like McDonald's: support for negotiating global contracts.

Financial Services Example: Credit Life Cycle
- Product Planning
- Customer Acquisition
- Collections
- Customer Management

Customer Acquisition
[Diagram: credit life cycle, showing the Product Planning and Customer Acquisition stages]
- Support for marketing
- Market segmentation, plus forecasts with:
  - Response models
  - Risk / bankruptcy models
  - Profitability models

Customer Management
[Diagram: credit life cycle, showing the Customer Acquisition and Customer Management stages]
- Who gets a credit increase?
- Which delinquent customers are likely to default?
- What do you do (call, send a letter, do nothing)?
- Decision support: forecast customer behavior (behavior models)

Collections/Recovery
[Diagram: credit life cycle, showing the Customer Management and Collections stages]
- What is the likelihood of recovering money from an account sent to collections?
- Decision support: collections models

Other Questions
- How can we reduce attrition?
- How can we activate inactive accounts?
- How well are my current strategies performing?
- How do we detect fraud?

Where is the data?
- Transaction systems
- Marketing database
- Credit reports
- Customer service

How is it Organized?
- Separate from transactional data
- Contains historical data
- Generally aggregated to some extent
- Optimized for flexible querying of large volumes of data
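
To make "generally aggregated to some extent" concrete, here is a minimal sketch of rolling individual transactions up into monthly totals per customer, the kind of summary a warehouse might keep alongside its history. The function name, the tuple layout, and the sample values are assumptions for illustration only; they do not come from the slides.

```python
from collections import defaultdict


def aggregate_monthly(transactions):
    """Roll transaction rows up to one total per (customer, month).

    Each transaction is a hypothetical (customer_id, iso_date, amount)
    tuple, where iso_date is a 'YYYY-MM-DD' string.
    """
    totals = defaultdict(float)
    for customer_id, iso_date, amount in transactions:
        month = iso_date[:7]  # keep only 'YYYY-MM'
        totals[(customer_id, month)] += amount
    return dict(totals)


# Hypothetical rows as they might arrive from a transaction system.
sample_transactions = [
    ("c1", "2004-01-03", 20.0),
    ("c1", "2004-01-17", 5.0),
    ("c1", "2004-02-01", 10.0),
    ("c2", "2004-01-09", 7.5),
]

monthly_totals = aggregate_monthly(sample_transactions)
```

The transactional rows stay in the source system; only the summarized rows would be loaded into the warehouse, which is one reason it is kept separate.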

Star Schema
- Fact table plus several dimensional tables
- Un-normalized
- Less flexible than normalized tables
- Faster retrieval than normalized tables for large volumes of data
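
A small star schema can be sketched with Python's built-in sqlite3 module: one fact table whose rows point at several dimension tables. All table names, column names, and sample rows below are assumptions chosen for illustration, not a design taken from the slides.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table and three dimension tables, then load sample rows."""
    conn.executescript("""
        CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, segment TEXT);
        CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_date     (date_id     INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        -- The fact table holds the measure (amount) plus one key per dimension.
        CREATE TABLE fact_sales (
            customer_id INTEGER REFERENCES dim_customer(customer_id),
            product_id  INTEGER REFERENCES dim_product(product_id),
            date_id     INTEGER REFERENCES dim_date(date_id),
            amount      REAL
        );
    """)
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                     [(1, "retail"), (2, "corporate")])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "card"), (2, "loan")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(1, 2003, 1), (2, 2003, 2), (3, 2004, 1)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(1, 1, 1, 100.0), (1, 2, 2, 50.0),
                      (2, 1, 1, 200.0), (2, 2, 3, 75.0)])
    conn.commit()


def sales_by_segment_and_year(conn):
    """Join the fact table to two dimensions and total the measure."""
    return conn.execute("""
        SELECT c.segment, d.year, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.customer_id = f.customer_id
        JOIN dim_date     AS d ON d.date_id     = f.date_id
        GROUP BY c.segment, d.year
        ORDER BY c.segment, d.year
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
report = sales_by_segment_and_year(conn)
```

Every query follows the same shape: start at the fact table and join outward, one hop per dimension. That fixed shape is what trades away some flexibility for faster retrieval on large volumes.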

Implementation
- Start with the business issues
- Project planning / human resources
- Database design / data sources
- Application development

Business Analysis
- What is the problem?
- Who owns the problem?
- Will data help solve it?

When can data be used to predict?
[Chart: market types plotted by Coupling (High to Low) against Randomness (Low to High)]
- Chaotic markets (fashion driven)
- Real-time markets (stock market)
- Linear markets (local authority: number of trash cans)
- Statistical markets (retail)
Source:
Also read the article in Wired Magazine on data mining and terrorism.
