Data Warehousing: An Overview
Outline
- What is data warehousing? (Definition)
- Why does anyone need it? (Applications)
- How is the data organized? (Star schema)
- Implementation issues
Data Warehouse Definitions
- Dyché: Used for decision making; duplicates existing data; a combination of hardware, specialized software, and data extracted from other corporate systems.
- Inmon: A subject-oriented, integrated, nonvolatile, and time-variant collection of data in support of management decisions.
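Two terms in Inmon's definition are easier to see in code: "nonvolatile" (rows are loaded and read, not updated in place) and "time-variant" (every row carries a date, so history can be queried). The following is a minimal sketch only; the `snapshots` list, the function names, and the credit-limit example are hypothetical illustrations, not part of either definition.

```python
from datetime import date

# Hypothetical warehouse table: each row is a dated snapshot of a
# customer's credit limit (illustrative data, not from the slides).
snapshots = []

def load_snapshot(customer_id, credit_limit, as_of):
    """Nonvolatile: append a new dated row; never update or delete old rows."""
    snapshots.append(
        {"customer_id": customer_id, "credit_limit": credit_limit, "as_of": as_of}
    )

def credit_limit_as_of(customer_id, when):
    """Time-variant: return the latest limit recorded on or before `when`."""
    rows = [
        r for r in snapshots
        if r["customer_id"] == customer_id and r["as_of"] <= when
    ]
    if not rows:
        return None
    return max(rows, key=lambda r: r["as_of"])["credit_limit"]

load_snapshot(1, 1000, date(2023, 1, 1))
load_snapshot(1, 1500, date(2023, 6, 1))
```

Because the January row is kept after the June load, a question such as "what was this customer's limit in March?" can still be answered.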
Why Warehouse?
- Provide a single view of customers across the enterprise
- Improve turnaround time for common reports
- Monitor customer behavior
- Predict future purchases
- Improve responsiveness to business issues
Coca-Cola & IBM
- IBM is helping Coca-Cola with its warehouse.
- The warehouse helps Coca-Cola deal with global companies like McDonald's, for example by supporting the negotiation of global contracts.
Financial Services Example: Credit Life Cycle
- Product Planning
- Customer Acquisition
- Collections
- Customer Management
Customer Acquisition / Product Planning
- Support for marketing
- Market segmentation, plus forecasts with:
  - Response models
  - Risk / bankruptcy models
  - Profitability models
Customer Management
- Who gets a credit increase?
- Which delinquent customers are likely to default?
- What do you do (call, send a letter, do nothing)?
- Decision support: forecast customer behavior (behavior models)
Collections/Recovery
- What is the likelihood of recovering money from an account sent to collections?
- Decision support: collections models
Other Questions
- How can we reduce attrition?
- How can we activate inactive accounts?
- How well are my current strategies performing?
- How do we detect fraud?
Where is the data?
- Transaction systems
- Marketing database
- Credit reports
- Customer service
How is it Organized?
- Separate from transactional data
- Contains historical data
- Generally aggregated to some extent
- Optimized for flexible querying of large volumes of data
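The "aggregated to some extent" point can be sketched as follows: rows copied out of a transaction system are rolled up into one total per customer per month before they are queried. The sample rows and the `monthly_totals` helper are hypothetical, chosen only to illustrate the idea; a real warehouse would pick its own grain and do this in its load process.

```python
from collections import defaultdict

# Hypothetical rows copied from a transaction system:
# (customer_id, ISO date string, purchase amount)
transactions = [
    (1, "2023-01-05", 40.0),
    (1, "2023-01-20", 60.0),
    (1, "2023-02-03", 25.0),
    (2, "2023-01-11", 10.0),
]

def monthly_totals(rows):
    """Aggregate individual transactions to one total per customer per month."""
    totals = defaultdict(float)
    for customer_id, day, amount in rows:
        month = day[:7]  # "YYYY-MM" prefix of the ISO date
        totals[(customer_id, month)] += amount
    return dict(totals)

summary = monthly_totals(transactions)
```

The summary is kept separately from the transaction rows, so reporting queries read a smaller historical table and do not touch the operational system.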
Star Schema
- One fact table plus several dimension tables
- Denormalized
- Less flexible than normalized tables
- Faster retrieval than normalized tables for large volumes of data
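A small star schema can be sketched with Python's built-in `sqlite3` module. The table names (`fact_purchase`, `dim_customer`, `dim_date`), the columns, and the sample rows are all assumptions made for illustration; they are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per customer / date.
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY, name TEXT, segment TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER
);
-- Fact table: one row per purchase, pointing at each dimension.
CREATE TABLE fact_purchase (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    amount REAL
);
INSERT INTO dim_customer VALUES (1, 'Ann', 'retail'), (2, 'Bob', 'corporate');
INSERT INTO dim_date VALUES (20230101, 2023, 1), (20230201, 2023, 2);
INSERT INTO fact_purchase VALUES
    (1, 20230101, 40.0), (1, 20230201, 25.0),
    (2, 20230101, 10.0), (2, 20230101, 5.0);
""")

# A typical star query: join the fact table to its dimensions,
# then aggregate by dimension attributes.
rows = conn.execute("""
    SELECT c.segment, d.month, SUM(f.amount)
    FROM fact_purchase AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY c.segment, d.month
    ORDER BY c.segment, d.month
""").fetchall()
```

Every query follows the same shape (fact table joined directly to each dimension), which is what makes retrieval predictable, at the cost of the flexibility a fully normalized design would offer.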
Implementation
- Start with the business issues
- Project planning / human resources
- Database design / data sources
- Application development
Business Analysis
- What is the problem?
- Who owns the problem?
- Will data help solve it?
When can data be used to predict?
[Figure: four market types plotted by coupling (high to low) against randomness (low to high)]
- Chaotic markets (fashion driven)
- Real-time markets (stock market)
- Linear markets (local authority: number of trash cans)
- Statistical markets (retail)
Source: www.butlergroup.com
Also read the article in Wired Magazine on data mining and terrorism.