Using MDM as a Practical Approach to Get Started in Data Governance

Using MDM as a Practical Approach to Get Started in Data Governance Todd Goldman: VP Products and Marketing

Precursor to Data Governance is Data Management

Data Governance Manifesto
Data should be:
1. Understood
2. Secure
3. Consistent
4. Accessible
5. Managed
Data governance layers, from the foundation up: Understanding (Discover, Validate); Security, Consistency, Accessibility; Management (People, Processes, Procedures)

Governance Requires a Foundation of Understanding
Layers, from the foundation up: Understanding (Discover, Validate); Security, Consistency, Accessibility; Management (People, Processes, Procedures)
If you don’t know how data in different systems is related:
  How can you make sure they are consistent (MDM)?
  How can you measure overall data quality?
  How can you measure the quality of business rules?
If you don’t know where the sensitive data is:
  How can you protect it?
If your data is not secure, consistent and accessible:
  What does it mean to manage it?

Even if you understand your data landscape, starting data governance is difficult. It is a new religion.

Current Data Religion: Mystery Cult
Data is shrouded in mystery.
Its meaning is only accessible through data priests.
Data Priests use omens (metadata) and personal experience to divine the meaning of data in return for (financial) sacrifices.
Meaning is often obscure, misleading, incomplete and wrong.
Data Priests are often blockers to data governance programs:
  “More paperwork”
  “Slows us down”
  “Don’t need it”

A Common Myth: “We know our data”
Subject matter experts (SMEs) only know their own systems. (“I’m a professional. Of course I know my data!”)
But they can’t tell you how it changes and is transformed as it moves from system to system. (“But, once it leaves my hands, it is someone else’s problem!”)
Relationships between systems are complex. (“Wow, that transformation is complex. Are you sure that is in my data?”)
SMEs sometimes change jobs! (“I’m going to start my own consulting firm.”)

More of the Myth: “Our Data is Consistent”
Business rules are broken all the time as data crosses business and system boundaries:
  An 83 year old man in system A is a “youthful driver” in system B.
  Bond yield is listed as 5% in system X and 5.3% in system Y.
(“All of my data follows the business rules for this system!”)
Exceptions result in lost revenue, customer dissatisfaction, and regulatory fines.
Business rules change as organizations change:
  Mergers and acquisitions
  New products or services
  Products/services are retired
  Reorganizations
  New IT systems are added
(“I can’t keep up with all the acquisitions and reorganizations. They mess up the way systems work together. It is very inconvenient.”)
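The “youthful driver” example can be turned into a small cross-system check. The sketch below is illustrative only: the record layout, the customer IDs, and the age threshold of 25 are assumptions, not something the slides specify.

```python
from dataclasses import dataclass

# Hypothetical threshold: the deck does not say where "youthful" ends.
YOUTHFUL_MAX_AGE = 25


@dataclass(frozen=True)
class Violation:
    customer_id: str
    age: int
    driver_class: str


def find_violations(system_a, system_b):
    """Compare ages held in system A with driver classes held in system B.

    system_a maps customer id -> age, system_b maps customer id -> class.
    A record is a violation when B says "youthful driver" but A's age
    disagrees, or the other way round.
    """
    violations = []
    for customer_id, age in system_a.items():
        driver_class = system_b.get(customer_id)
        if driver_class is None:
            continue  # customer not present in system B, nothing to compare
        is_youthful = driver_class == "youthful driver"
        if is_youthful != (age < YOUTHFUL_MAX_AGE):
            violations.append(Violation(customer_id, age, driver_class))
    return violations


# Made-up sample records; C001 mirrors the 83 year old from the slide.
system_a = {"C001": 83, "C002": 19, "C003": 45}
system_b = {
    "C001": "youthful driver",
    "C002": "youthful driver",
    "C003": "standard driver",
}

violations = find_violations(system_a, system_b)
for v in violations:
    print(f"{v.customer_id}: age {v.age} but classified as {v.driver_class!r}")
```

Each system passes its own validation here; the exception only appears once the two are compared on a shared key.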

Data Ecosystem
Warriors: have an immediate, acute data problem; focus on feasibility and time.
Tradespeople: have a short term (project) business problem; focus on value.
Data Priests (SMEs): “know” the data; assist with data problems.
Reformers: have a long term business problem; focus on control and scalability.

Data Governance: New Data Religion
“Governed Data”: data is documented, consistent and secure.
Governance is a must for MDM projects.
To succeed, a new religion needs:
  Reformers or prophets: people who believe in it and sponsor it
  Priests: people to educate, explain and promote it
  New miracles: successful projects and spectacular results

Successful Data Governance Program Roll Out
1. Help Warriors win battles.
2. Use victories to convince Reformers.
3. Convert the priests to change religions.
4. Have priests drive adoption to tradespeople.

Key: Winning the First Battle!
Pick your battles: find an appropriate initial project.
Achieve a quick win: early success.
Build a Trojan Horse: govern the data without calling it that.

Picking The First Battle
Appropriate project:
  Immediate business ROI
  Project success directly linked to Data Governance practice and methodology
  Cross-silo
  Must-have for the company
Examples:
  Basel II
  Master Data Management
  Application migration
  Cross-BU reporting and analytics

Avoid False Starts
Projects to avoid:
  Future ROI (“the next project will benefit”)
  Boil the ocean
False start examples:
  Refactoring
  Metadata repository
  Enterprise (fill in the blank)

Quick Win
Iterative approach:
  Agile development
  Immediate results
Automation is critical:
  Data discovery tools
  Repeatability of results
  Validation and consistency tools
  Incident/exception workflow
Visual and intuitive presentation of results:
  Business oriented
  Graphically presented

Barriers to the Quick Win

Data Governance Gap
[Illustration labels: The Peaks of Data Understanding, Data Governance, Data Nightmare]
“Design specifications get lost or outdated, subject matter experts leave companies, databases and business rules get changed without updating documentation, mergers and acquisitions wreak havoc on databases, all leading to a company not knowing exactly what they have. The end result is inconsistent data.” Fern Halper, Hurwitz & Associates
“70% or more of the time and effort involved in completing most data integration projects is consumed by defining and implementing the business rules by which data will be mapped, transformed, integrated, and cleansed.” Ted Friedman, Vice President, Gartner Group

Current Tools Weren’t Developed to Discover a Distributed Data Landscape
ETL, EAI, Cleansing: not discovery solutions. They depend on discovery.
Most widely used business rule discovery tools:
  Metadata matching: doesn’t work in a real environment.
  Profiling: focused on a single data source.
Today’s tools weren’t created to analyze a distributed data landscape.
Data analysts manually examine data values to figure out the business rules in the distributed data landscape.
The most sophisticated tool commonly used today is: [image captioned “20th Century Data Relationship Discovery Tool”]

Case Study: Asset Master
(Reminder for Todd: show Dawn’s slides.)
Source: Vice President at a Charlotte, NC based commercial bank
Project: IT Asset Master, consolidating 8 asset management systems to a single asset master
“We had 9 subject matter experts spend 9 months and we still didn’t know enough to be able to consolidate our data into a master.”

Data Analysis: The Lack of Understanding A Case Study (Note: This is NOT what you want to do!)

Data Analyst Case Study
The story you are about to hear is true. Only the names have been changed to protect the innocent.

This is Denise
Data Analyst: Denise
Experienced data analyst with an extremely successful career working for data software companies
Very personable, very intelligent, impeccable references
Bills at 2000/day
Hired by a dental insurance company for a 3 week data analysis/MDM integration project
Tools used: profiling, TOAD, SQL, highlighter

Manual Data Discovery Timeline (Data Analyst: Denise; expected result: 3 weeks)
Day 1: Get metadata specs and begin to check business rules between one table with six columns against the first of three source systems.
Day 2: Get initial results from unit test with inconsistent data for 1st column. So far, so good.
Day 3: Retest and debug. Still on track.
Days 4-5: Go to data architect to question. Architect pings owner of application (SME). NOTE: Data analyst not allowed to consult with SME directly.
Day 6: Meeting with architect and SME to review. Initial answer received.
Day 7: Rewrite business rules and test. Find second column with inconsistent data. Retest and debug.
Days 8-10: Go to data architect to question. Architect pings owner of application (SME). SME asks upstream application owner.
Days 11-12: Flurry of emails between the 4 players (data analyst, data architect, SME, application owner), as the upstream app owner is in a different time zone. Decision on how to proceed agreed upon.
Day 13: Rewrite business rules in SQL and test. Find more inconsistent data. Retest and debug.
Day 14: Go to data architect to question. Architect pings owner of application (SME).
Day 15: Meeting with architect and SME to review. Decision made to review specs with a larger group.
Day 16: Meeting with larger group. Original specs validated and corrected.
Day 17: At weekly status meeting, project manager asks, “why have 17 days passed when this phase was to be completed in 3 weeks?”
Day 18: Rewrite SQL and test.
Day 19: Pass first source system SQL to ETL developers for coding and QA.
Day 20: Get specs and begin to verify relationships with second of three source systems, an outside feed.
Days 21-23: Go to data architect to question. Architect pings owner of application (SME). SME asks upstream application owner. Feed vendor liaison is consulted.
Days 24-26: Flurry of emails between the 4 players, plus vendor liaison. More people involved consumes even more time. Decision on how to proceed agreed upon.
Days 27-37: Recode SQL and test. Repeat experience of days 7-16, with new inconsistent data.
Day 37: The project is now 18 days overdue, with no clue as to how long it will take to complete the remaining work. Repeat variations of days 21-37 several times.
Day 89: Pass 2nd source system business rules to ETL developer and QA. Project phase is now 70 days overdue, with one entire source system still to code. Red flags being raised. Search for sacrificial lambs.
Day 90: Go on preplanned, and much overdue, vacation.
After the vacation: Get specs and begin to check business rules with third of three source systems. Repeat variation of days 20-89.
Day 167: Pass 3rd source system code to ETL developer and QA. Project is 152 days late (30 weeks, 7 months). Company paid for 30 weeks more consulting time than expected: a 300K overrun.
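The overrun figures on the final timeline slide can be reproduced with a few lines of arithmetic. This is a sketch that assumes a 5-day working week and that the timeline counts working days; the 2000/day rate, the 3 week plan, and the Day 167 finish come from the slides.

```python
# Assumption: timeline days are working days and a week has 5 of them.
WORKDAYS_PER_WEEK = 5

DAILY_RATE = 2000          # "Bills at 2000/day"
PLANNED_WEEKS = 3          # "3 week data analysis/MDM integration project"
ACTUAL_DAYS = 167          # day the third source system was handed over

planned_days = PLANNED_WEEKS * WORKDAYS_PER_WEEK      # 15
days_late = ACTUAL_DAYS - planned_days                # 152
weeks_late = days_late // WORKDAYS_PER_WEEK           # 30 full weeks
overrun = weeks_late * WORKDAYS_PER_WEEK * DAILY_RATE # 300000

print(f"{days_late} days late, about {weeks_late} weeks, overrun {overrun}")
```

Under those assumptions the numbers line up with the slide: 152 days late, 30 weeks of extra consulting, and a 300K overrun for a single analyst.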

What does this mean for your Data Governance Project?

MDM Deployment: 10MM in Services for Every 1MM in Software
[Chart: Services vs. Software (MDM hub: merge, purge, match)]

According to the Experts:
“70% or more of the time and effort involved in completing most data integration projects is consumed by defining and implementing the business rules by which data will be mapped, transformed, integrated, and cleansed.” Ted Friedman, Vice President, Gartner Group

70% of Services Are for Data Analysis; 30% of Services Are for Deploying the “Data Hub”
[Chart: Services split into Data Analysis Services (discover, map, validate) and MDM Deployment Services; Software is the MDM hub]
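Combining the 10:1 services-to-software ratio with the 70/30 split gives a rough cost breakdown. The sketch below works in thousands for a hypothetical project with 1MM of software; only the ratios come from the slides.

```python
# Hypothetical project sized at 1MM of software; amounts are in thousands.
SOFTWARE_K = 1_000
SERVICES_K = 10 * SOFTWARE_K            # "10MM in services for every 1MM in software"

analysis_k = SERVICES_K * 70 // 100     # 70% of services: data analysis
deployment_k = SERVICES_K - analysis_k  # remaining 30%: deploying the data hub
total_k = SOFTWARE_K + SERVICES_K

# Share of the whole project budget that goes to data analysis services.
analysis_share_pct = 100 * analysis_k / total_k

print(f"analysis {analysis_k}K, deployment {deployment_k}K, total {total_k}K")
print(f"data analysis is {analysis_share_pct:.1f}% of total spend")
```

On these figures, data analysis services alone come to 7MM of an 11MM project, which is why the deck treats analysis as the cost to attack first.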

According to the Experts:
“MDM won’t ever provide a positive return on investment to businesses if the cost and risk of the data analysis and mapping component is not reduced by an order of magnitude; you have to automate the process.” Malcolm Chisholm, MDM Industry Expert, AskGet, Inc.

Recap So Far: You must overcome big hurdles to implement Data Governance
Technical: Presumes data understanding. Requires automated data discovery, validation, and remediation for a distributed data landscape.
Financial: Cost of deployment must justify the project.
Cultural: You may be changing religions.

Data Governance Epic
[Illustration labels: The Peaks of Data Understanding, Data Governance, Data Nightmare]

Data Governance Epic
But you know the alternatives are unthinkable, so you and your team of data governance warriors boldly go where no man has gone before.

Data Governance Epic
Scale the cliffs of data relationship discovery.
Pick your way through data inconsistency glaciers.
Battle Data Priests for budget and mindshare.

Data Governance Epic
And eventually, if you are very, very persistent and very, very lucky, you may even get there.

Case Study: Potential for the False Start
Manufacturing firm:
  Corporate mandate to improve data quality (the CEO demanded a new religion)
  Created their initial identity master
Initial identity master:
  Required 5 analysts to map 4 data sources
  Merge/purge/match process is governed
  But quality of data in the master is suspect
  Result: no downstream users
Next project:
  16 more sources to map
  Will require 20 more data analysts
The problem:
  Hiring 20 data analysts is not financially feasible
  Data mapping and analysis is the critical path
  Millions of dollars have been spent on software and services already

There’s got to be an easier way!
Need a Quick Win

What if you could Automate Cross System Data Understanding?

Automating cross system data discovery would change the economics of governance from this:
[Chart: Services (Data Analysis Services, MDM Deployment Services) vs. Software (MDM Hub)]

Automating cross system data discovery would change the economics of governance to this
Provides the foundation of good data management:
  Automates understanding of the current data landscape
  Replaces services with software (10x differential)
  Creates repeatability
Makes data governance projects financially feasible:
  Accelerates deployment
  Reduces project risk
  Turns negative NPV into positive NPV
Provides the “Trojan Horse”
[Chart: Services (Analysis Services, MDM Deployment Services) vs. Software (Analysis, MDM Hub)]
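The “10x differential” can be illustrated with the deck’s own ratios. The sketch below assumes a project with 1MM of MDM software, 7MM of data analysis services, and 3MM of deployment services, and that automation cuts the analysis line by a factor of ten. The price of the discovery software itself is not given in the slides, so it is left out; treat the result as an upper bound on savings.

```python
def project_cost_k(software_k, analysis_services_k, deployment_services_k):
    """Total project cost in thousands for the three budget lines."""
    return software_k + analysis_services_k + deployment_services_k


# Amounts in thousands, derived from the 10:1 and 70/30 ratios in the deck.
SOFTWARE_K = 1_000
ANALYSIS_K = 7_000
DEPLOYMENT_K = 3_000

# Assumption: automation reduces analysis services by an order of magnitude.
AUTOMATION_FACTOR = 10

before_k = project_cost_k(SOFTWARE_K, ANALYSIS_K, DEPLOYMENT_K)
after_k = project_cost_k(SOFTWARE_K, ANALYSIS_K // AUTOMATION_FACTOR, DEPLOYMENT_K)

print(f"before: {before_k}K, after: {after_k}K, saved: {before_k - after_k}K")
```

Whether a saving of this size turns a negative NPV positive depends on the project’s expected benefits, which the slides do not quantify.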

Case Study: The Trojan Horse
Truck manufacturer:
  Migrating from one finance application to another
  Data must be mapped and migrated as part of the process
The Trojan Horse:
  Some data in the finance application is master data
  They are using automated tools to map the data and will leverage the map to create a master
  Did a pilot project where automation took 3 days vs. 6 months for manual mapping
  Planned savings from automation are being rerouted to purchase an MDM system
Critical factors:
  Governance processes will be required to clean up the data as part of the migration
  They are not calling this governance; they are just doing it
  All mapping efforts will be reusable because they are repeatable and verifiable
  Repeatable and verifiable are good words
Future challenges:
  They must execute

Data Discovery Automation Technology: A Primer

Automated Cross System Data Discovery: What is it?
New data analysis methodology and tools:
  Arms the warrior with a new weapon
  Allows you to quickly understand your current data landscape
Establishes data understanding within data sources and between data sources:
  Automates discovery of business rules, lineage, transformations and data inconsistencies across data sources
  Goes well beyond profiling
  Examines: data values, data values, data values
Establishes a methodology for cross system data analysis:
  Each data project becomes a building block, not a “one-off”

Data-Driven Approach: Aligns Rows Across Datasets
Step 1: The Data-Driven Discovery Engine analyzes the data values to automatically discover the key that aligns rows across disparate datasets. Works for hundreds of tables. Works for millions of rows.
Discovered key: Member (Table 1) aligns with ID (Table 25)

Before alignment:

Table 1 (Known Sensitive Data)
Row      Member     SS #         Age  Phone           Sex
1        595846226  123-45-6789  15   (123) 456-7890  M
2        567472596  138-27-1604  8    (138) 271-6037  F
3        540450091  154-86-4196  22   (154) 864-1961  M
4        514714372  173-44-7900  55   (173) 447-8996  F
5        490204164  194-26-1648  4    (194) 261-6476  F
6        466861109  217-57-3046  66   (217) 573-0453  M
...
987,623  444629628  243-68-1812  25   (243) 681-8107  F
987,624  423456789  272-92-3629  87   (272) 923-6280  M

Table 25
ID         Demo1
514714372  3
444629628  3
540450091  2
567472596  1
423456789  2
490204164  1
...
595846226  0
466861109  0

Data-Driven Approach: Aligns Rows Across Datasets
Step 1: The Data-Driven Discovery Engine analyzes the data values to automatically discover the key that aligns rows across disparate datasets. Works for hundreds of tables. Works for millions of rows.

After alignment:

Table 1 (Known Sensitive Data)
Row      Member     SS #         Age  Phone           Sex
1        595846226  123-45-6789  15   (123) 456-7890  M
2        567472596  138-27-1604  8    (138) 271-6037  F
3        540450091  154-86-4196  22   (154) 864-1961  M
4        514714372  173-44-7900  55   (173) 447-8996  F
5        490204164  194-26-1648  4    (194) 261-6476  F
6        466861109  217-57-3046  66   (217) 573-0453  M
...
987,623  444629628  243-68-1812  25   (243) 681-8107  F
987,624  423456789  272-92-3629  87   (272) 923-6280  M

Table 25
ID         Demo1
595846226  0
567472596  1
540450091  2
514714372  3
490204164  1
466861109  0
...
444629628  3
423456789  2
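One simple way to picture Step 1 is value overlap: compare every column in one table with every column in the other and pick the pair whose values coincide most. The sketch below is a toy illustration of that idea, not the vendor’s algorithm. The function names are hypothetical, and the sample uses only the six member IDs from the slide, shuffled in the second table.

```python
def discover_join_key(left, right):
    """Return (left_column, right_column, overlap) for the best-matching pair.

    Tables are dicts mapping column name -> list of values. Overlap is the
    number of shared distinct values divided by the smaller distinct count.
    """
    best = None
    for lname, lvals in left.items():
        lset = set(lvals)
        for rname, rvals in right.items():
            rset = set(rvals)
            if not lset or not rset:
                continue
            overlap = len(lset & rset) / min(len(lset), len(rset))
            if best is None or overlap > best[2]:
                best = (lname, rname, overlap)
    return best


def align_rows(left, right, lkey, rkey):
    """Pair each left row index with the right row index holding the same key."""
    index = {value: i for i, value in enumerate(right[rkey])}
    return [(i, index[v]) for i, v in enumerate(left[lkey]) if v in index]


# First six rows of Table 1 from the slide (IDs kept as strings).
table_1 = {
    "Member": ["595846226", "567472596", "540450091",
               "514714372", "490204164", "466861109"],
    "Age": [15, 8, 22, 55, 4, 66],
    "Sex": ["M", "F", "M", "F", "F", "M"],
}
# The same six IDs in a different row order, as in an unaligned Table 25.
table_25 = {
    "ID": ["514714372", "540450091", "567472596",
           "490204164", "595846226", "466861109"],
    "Demo1": [3, 2, 1, 1, 0, 0],
}

key = discover_join_key(table_1, table_25)
pairs = align_rows(table_1, table_25, key[0], key[1])
aligned_demo1 = [table_25["Demo1"][j] for _, j in pairs]
print(key, aligned_demo1)
```

Comparing every column pair is quadratic in the number of columns, so a production tool working across hundreds of tables would need sampling or hashing; this sketch ignores that entirely.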

Data-Driven Approach: Discovers Business Rules & Sensitive Data
Step 2: With rows now aligned, the Data-Driven Discovery Engine analyzes the data values to automatically discover:
  Forgotten business rules
  Data lineage
  Hidden sensitive data

Discovered rule for Demo1 (Table 25), from Age and Sex (Table 1):
CASE:
  If age < 18 and Sex M then 0
  If age < 18 and Sex F then 1
  If age >= 18 and Sex M then 2
  If age >= 18 and Sex F then 3

Aligned sample (same tables as the previous slide):
Row      Member/ID  Age  Sex  Demo1
1        595846226  15   M    0
2        567472596  8    F    1
3        540450091  22   M    2
4        514714372  55   F    3
5        490204164  4    F    1
6        466861109  66   M    0
...
987,623  444629628  25   F    3
987,624  423456789  87   M    2

Data-Driven Approach: Discovers Business Rules & Sensitive Data
Step 3: With business rules now discovered, the Data-Driven Discovery Engine analyzes the data values to automatically discover:
  Unknown data inconsistencies

Hit rate: 98%
CASE:
  If age < 18 and Sex M then 0
  If age < 18 and Sex F then 1
  If age >= 18 and Sex M then 2
  If age >= 18 and Sex F then 3

Aligned sample (same tables as the previous slide):
Row      Member/ID  Age  Sex  Demo1
1        595846226  15   M    0
2        567472596  8    F    1
3        540450091  22   M    2
4        514714372  55   F    3
5        490204164  4    F    1
6        466861109  66   M    0
...
987,623  444629628  25   F    3
987,624  423456789  87   M    2

In this sample, row 6 (Age 66, Sex M) carries Demo1 0 where the rule yields 2, which is the kind of exception this step surfaces.
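Once a candidate rule exists, checking it is a matter of applying it to every aligned row and counting the misses. The sketch below assumes the CASE rule means “under 18” for the first two branches and “18 or over” for the last two, with outputs 0 to 3. The slide’s comparison operators did not survive extraction, so this reading is inferred from the sample rows. The function name `demo1_rule` is hypothetical.

```python
def demo1_rule(age, sex):
    """Candidate rule: Demo1 code derived from Age and Sex.

    Assumed reading of the slide's CASE statement (operators inferred):
    under 18 -> 0 (M) or 1 (F); 18 or over -> 2 (M) or 3 (F).
    """
    if age < 18:
        return 0 if sex == "M" else 1
    return 2 if sex == "M" else 3


# (row, age, sex, observed Demo1) for the first six aligned rows on the slide.
rows = [
    (1, 15, "M", 0),
    (2, 8, "F", 1),
    (3, 22, "M", 2),
    (4, 55, "F", 3),
    (5, 4, "F", 1),
    (6, 66, "M", 0),
]

exceptions = [r for r in rows if demo1_rule(r[1], r[2]) != r[3]]
hits = len(rows) - len(exceptions)
hit_rate = hits / len(rows)

print(f"hit rate on sample: {hits}/{len(rows)}")
for row, age, sex, observed in exceptions:
    print(f"row {row}: age {age}, sex {sex}, Demo1 {observed}, "
          f"rule expects {demo1_rule(age, sex)}")
```

The slide’s 98% hit rate refers to the full dataset; the six-row sample here only shows the mechanics. A high but imperfect hit rate is what distinguishes a real rule with data exceptions from a coincidence.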

What Complex Business Rules are Discovered from the Data?
Scalar: one to one, substring, concatenation, constants, tokens
Conditional logic
Aggregation: sum, average, minimum, maximum
Column arithmetic: add, subtract, multiply, divide
Case statements: equality/inequality, null conditions, in/not in
Reverse pivot
Conjunctions
Cross-reference
Joins: inner, left outer
Custom data rules

Case Study: Worldwide Financial Institution
Financial services firm, Master Data Management:
  Integration of legacy system with reference master system
  First of 40 to be integrated
Manual results for first dataset:
  Estimated to take 6 months elapsed
Data-driven mapping results:
  2.5 weeks of elapsed time
  Also centralized data analysis expertise
Benefits:
  Significant time to market savings: 5 months
  Significant project risk reduction
  Data inconsistencies found as part of process
[Chart: deployment time in months (0 to 6), Manual Mapping vs. Data Driven Discovery]

What does this all mean?
Makes it much easier and cheaper to map your distributed data landscape:
  This is the foundation upon which the rest is built
The economics of governance will look very different:
  Faster, repeatable victories
  Turns point projects into governance building blocks
  “Undoable” projects become “doable”
Turns data governance projects on their heads

Recap I
Culture: Use victories to build the case for better data governance and quality.
Trojan Horse: Start governing your data without calling it governance.
Financial: Use better data management to deliver positive ROI.
Technical: Automate data discovery and management.

One more point about Culture Change
You can’t just change culture. You have to turn other knobs that affect culture.
[Diagram: Culture, surrounded by Strategy, Organizational Structure, Communications, Training/Skills, Rewards]

Recap II
1. Pick the right battles, arm the Warriors with 21st century cross-system discovery and help them achieve quick victories.
2. Use victories to convince Reformers.
3. Convert the priests to change religions.
4. Have priests drive adoption to tradespeople.

Data Governance Success
[Illustration: Happy CXO Management Team]

Questions and Answers
Thank You for Attending!
For more information, contact Todd Goldman:
  Web: www.exeros.com
  Email: [email protected]
  Phone: 1.408.213.8910
Or stop by the Exeros booth in the exhibit hall.
