

Automation and Agile Processes for (Big) Data Analytics with Machine Learning
Professor Daniel Neagu
AI Research (AIRe) Group Leader
Co-Director, Advanced Automotive Analytics (AAA) Research Institute
Faculty of Engineering and Informatics, University of Bradford
http://computing.brad.ac.uk/staff/dneagu
16/02/24 IET Seminar Series

Big Data Features Value? Vulnerability? Variability? What else? 2 16/02/24 Agility in (Big) Data Mining

Big Data Mining
Big Data refers to challenging (large, diverse, fast-changing, uncertain) data storage and processing.
Data Mining proposes algorithms and processes to find and characterize patterns in data, with the ultimate aim to support knowledge discovery and decision support.
(Some headlines and images are from http://www.shapingtomorrow.com/)
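As a small illustration of "finding and characterizing patterns in data", the sketch below counts which pairs of items occur together in a set of transactions and keeps the pairs seen at least a minimum number of times. The function name `frequent_pairs`, the `min_support` threshold and the basket data are hypothetical, chosen only for this example; the talk itself does not prescribe any particular algorithm.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that co-occur in at least `min_support` transactions."""
    counts = Counter()
    for items in transactions:
        # De-duplicate and sort so each pair is counted once per transaction
        # and always appears in the same order.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical shopping baskets (illustrative data only).
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "milk", "eggs"],
]

patterns = frequent_pairs(baskets, min_support=2)
print(patterns)
```

Counting every pair does not scale to large item sets; it is shown here only because it makes the idea of a "pattern" concrete.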

Big Data Mining Technologies

Big Data Analytics Tools & Processes*
Big Data Mining is still an art and a science! (paraphrasing Yoni Iny, CTO at Upsolver, on Quora)
Automated analytics is more commonly referred to as ‘advanced analytics’. Traditional analytics insights and descriptions (what happened? and why?) are evolving towards predictive and prescriptive applications (what will happen? and how can we make it happen?).
Rafa Garcia Navarro, chief analytics officer UK&I at Experian: More analytics – speed and agility through prediction http://www.information-age.com/2017-store-big-data-123463665/
[Figure: layered technology stack*. Layers: External Platform Integration; Reporting/Insights; Aggregation & Data marts; Scoring; Data Processing & Event Stream Creation; Data Collection; Data Sources. Component labels: DSP/Trading Desk; Reports; Scenario Planner/Optimizer; Insights Platform; BI Mart; Analytic Mart; Aggs; Scored Event Streams; Scoring Platform; Raw Event Streams; Connected Recognition; ETL Stage; FTP; Online Data; CRM; Offline Data; Pixel/Site Data; Vendor.]
*Amanda Gessert: Increasing Your Productivity Through Data Visualization, DataSummit2015
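The shift from descriptive to predictive and prescriptive analytics can be made concrete with a minimal sketch. It assumes a made-up series of weekly sales, a plain least-squares trend line as the "predictive" step, and a simple capacity threshold as the "prescriptive" step. All names (`describe`, `predict`, `prescribe`, `capacity`) and the data are hypothetical; real platforms use far richer models.

```python
from statistics import mean

# Hypothetical weekly sales figures (illustrative data, not from the talk).
weeks = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]


def describe(values):
    """Descriptive analytics: summarise what already happened."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_bar - slope * x_bar


def predict(xs, ys, x_new):
    """Predictive analytics: estimate what will happen at x_new."""
    slope, intercept = fit_line(xs, ys)
    return slope * x_new + intercept


def prescribe(xs, ys, x_new, capacity):
    """Prescriptive analytics: turn the forecast into a recommended action."""
    forecast = predict(xs, ys, x_new)
    return "add capacity" if forecast > capacity else "hold"


print(describe(sales))
print(predict(weeks, sales, 6))
print(prescribe(weeks, sales, 6, capacity=19.0))
```

The three functions answer "what happened?", "what will happen?" and "what should we do about it?" in turn, which is the progression the slide describes.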

Gartner Magic Quadrant for Advanced Data Science Platforms (2017 vs 2016)
Advanced analytics platforms provide an end-to-end environment for developing and deploying models (search for Gartner Magic Quadrant for Data Science Platforms): https://www.gartner.com/doc/reprints?id=1-3TK9NW2&ct=170215&st=sb

Big Data Mining Current Evolution*
[Figure: evolution from Past to Present to Future across four dimensions: Insights, Governance, Standards, Deployment. Labels, in slide order: IT-led; Business-led; Waterfall; Agile; Enterprise Decision; Team/Individual Decision; Business Unit Decision; Strict Data and BI Governance; The Wild West Governance; Fencing the Wild West; Backward Looking KPI Reports; Forward Looking Ad Hoc Analytics; Backward Looking Data Discovery; Flexible Forward Looking Analytics; Integrated Analytic Platform.]
*Amanda Gessert: Increasing Your Productivity Through Data Visualization, DataSummit2015

Knowledge Discovery in Databases

Agile Manifesto: 4 Values
Individuals and Interactions over Processes and Tools
Working Software over Comprehensive Documentation
Customer Collaboration over Contract Negotiation
Responding to Change over Following a Plan
“That is, while there is value in the items on the right, we value the items on the left more.” (www.agilemanifesto.org)
*Jonathan Kessel-Fell, Capgemini: Agile Mindset, December 2016

Agile Mindset*: can we apply it to Big Data?
[Figure labels: Being Agile; Doing Agile; Agile Mindset; 4 Agile Values; 12 Agile Principles; Agile Practices; XP; Scrum; DAD; SAFe; Kanban.]
*Jonathan Kessel-Fell, Capgemini: Agile Mindset, December 2016

Agile Big Data Mining
Motivation: Big Data Mining is a complex, dynamic, much-needed domain with high expectations and difficult constraints.
Agile analytics allows modellers to have a conversation with their data.
Rachel Hawley, SAS, on the Quest for Agile Analytics: https://www.kdnuggets.com/2015/02/interview-rachel-hawley-sas-agile-analytics.html
Judging by tech talks and case studies from Silicon Valley start-ups circa the mid-2010s, one might believe that machine learning as pattern recognition drives business. It does not. In particular, ML does little to act upon insights gained. Let’s consider Uber as an example. ML use cases may help detect patterns: traffic, drivers, commuters, etc. Those can help indicate where value could be “harvested” within the system: opportunities for action. Even so, the dispatcher at the center of Uber works to schedule and optimize rides. It operates as a control system at the heart of the business. Those kinds of control systems may leverage ML to detect patterns, etc., but there’s much more involved: notably, determining which offers to sell, scheduling resources to deliver on those customer promises, handling contingencies, etc. Manipulating the supply chain is where a business earns profit. Patterns only play minor parts, while control is center stage.
Paco Nathan: Beyond the AI Winter https://synecdoche.liber118.com/beyond-the-ai-winter-941c0a66b4f5

Agile Big Data Science Teams
It’s a fact that data science results are probabilistic and unpredictable. At the start of a project, it can often look like there’s an obvious route from A to B. When you get started, it’s never that simple. Agile teams do away with strict planning and go into projects with a creative mindset; they embrace uncertainty instead of shying away from it. This comes in handy when a roadblock pops up: traditionally run data science teams can get stuck deciding on their options, while flexible agile data science teams are more likely to find a new solution. Unpredictability and the need to adapt quickly to problems don’t scare them; they excite them.
To be relevant and useful, the day-to-day activities of data scientists must: prioritize use of technology, so that it produces the best results; design technology and products with the consumer in mind; and collaborate well with partners and customers.
John Akred, Silicon Valley Data Science https://www.kdnuggets.com/2017/02/real-world-results-agile-data-science-teams.html

Agile Big Data Mining: Key Concepts
Sprints. Stories are completed during sprints. A sprint is a set chunk of time to work on tasks, typically two weeks, with the goal of producing new results. Data scientists break the work down into stories for that sprint.
Standups. Each day during the sprint, the team gathers in a standup meeting. Here they report their progress, say what they will do next, and coordinate to remove blockers.
Review meetings. The team presents and evaluates results. Customer stakeholders also attend this meeting. Is it good enough? Should we keep working on it? Will it be useful, or should we abandon it now?
John Akred, Silicon Valley Data Science https://www.kdnuggets.com/2017/02/real-world-results-agile-data-science-teams.html

Conclusions
Agile data science teams work in a way that is adaptable, collaborative, and produces usable results. They subscribe to the idea that data science can be creative and innovative. They embrace the unknown instead of making assumptions, and they don’t waste time beating their head against a wall for things that aren’t working. Agile teams are the future of data science, the creative teammates who work together to make things that are useful, and answer real world problems. The future is fickle, and we must be flexible to succeed.
John Akred, Silicon Valley Data Science https://www.kdnuggets.com/2017/02/real-world-results-agile-data-science-teams.html
With people in every area of business often struggling with data and analytics, the opportunity for analytics companies is huge.
Rafa Garcia Navarro, chief analytics officer UK&I at Experian: More analytics – speed and agility through prediction http://www.information-age.com/2017-store-big-data-123463665/

Agility in (Big) Data Mining
Questions Time!
Professor Daniel Neagu http://computing.brad.ac.uk/staff/dneagu
