Machine Learning for Business Analytics - Peter C. Bruce, Kuber R. Deokar, Nitin R. Patel, Galit Shmueli

Machine Learning for Business Analytics (eBook)

Concepts, Techniques, and Applications with Analytic Solver Data Mining

Peter C. Bruce, Kuber R. Deokar, Nitin R. Patel, Galit Shmueli (Authors)

eBook Download: EPUB

2023 | 4th edition
624 pages
Wiley (publisher)
978-1-119-82986-7 (ISBN)

Machine learning, also known as data mining or predictive analytics, is a fundamental part of data science. It is used by organizations in a wide variety of arenas to turn raw data into actionable information.

Machine Learning for Business Analytics: Concepts, Techniques, and Applications with Analytic Solver Data Mining provides a comprehensive introduction and an overview of this methodology. The fourth edition of this best-selling textbook covers both statistical and machine learning algorithms for prediction, classification, visualization, dimension reduction, rule mining, recommendations, clustering, text mining, experimentation, time series forecasting, and network analytics. Along with hands-on exercises and real-life case studies, it also discusses managerial and ethical issues for responsible use of machine learning techniques.

This fourth edition of Machine Learning for Business Analytics also includes:

An expanded chapter focused on discussion of deep learning techniques
A new chapter on experimental feedback techniques including A/B testing, uplift modeling, and reinforcement learning
A new chapter on responsible data science
Updates and new material based on feedback from instructors teaching MBA, Masters in Business Analytics and related programs, undergraduate, diploma and executive courses, and from their students
A full chapter devoted to relevant case studies with more than a dozen cases demonstrating applications for the machine learning techniques
End-of-chapter exercises that help readers gauge and expand their comprehension and competency of the material presented
A companion website with more than two dozen data sets, and instructor materials including exercise solutions, slides, and case solutions

This textbook is an ideal resource for upper-level undergraduate and graduate level courses in data science, predictive analytics, and business analytics. It is also an excellent reference for analysts, researchers, and data science practitioners working with quantitative data in management, finance, marketing, operations management, information systems, computer science, and information technology.

Galit Shmueli, PhD, is Distinguished Professor and Institute Director at National Tsing Hua University's Institute of Service Science. She has designed and instructed business analytics courses since 2004 at University of Maryland, Statistics.com, The Indian School of Business, and National Tsing Hua University, Taiwan.

Peter C. Bruce is Founder of the Institute for Statistics Education at Statistics.com, and Chief Learning Officer at Elder Research, Inc.

Kuber R. Deokar is the Data Science Team Lead at UpThink Experts, India. He is also a faculty member at Statistics.com.

Nitin R. Patel, PhD, is co-founder and lead researcher at Cytel Inc. He was also a co-founder of Tata Consultancy Services. A Fellow of the American Statistical Association, Dr. Patel has served as a visiting professor at the Massachusetts Institute of Technology and at Harvard University. He is a Fellow of the Computer Society of India and was a professor at the Indian Institute of Management, Ahmedabad, for 15 years.

Foreword

Preface to the Fourth Edition

Acknowledgments

PART I PRELIMINARIES

CHAPTER 1 Introduction

CHAPTER 2 Overview of the Machine Learning Process

PART II DATA EXPLORATION AND DIMENSION REDUCTION

CHAPTER 3 Data Visualization

CHAPTER 4 Dimension Reduction

PART III PERFORMANCE EVALUATION

CHAPTER 5 Evaluating Predictive Performance

PART IV PREDICTION AND CLASSIFICATION METHODS

CHAPTER 6 Multiple Linear Regression

CHAPTER 7 k-Nearest-Neighbors (k-NN)

CHAPTER 8 The Naive Bayes Classifier

CHAPTER 9 Classification and Regression Trees

CHAPTER 10 Logistic Regression

CHAPTER 11 Neural Nets

CHAPTER 12 Discriminant Analysis

CHAPTER 13 Generating, Comparing, and Combining Multiple Models

PART V INTERVENTION AND USER FEEDBACK

CHAPTER 14 Experiments, Uplift Modeling, and Reinforcement Learning

PART VI MINING RELATIONSHIPS AMONG RECORDS

CHAPTER 15 Association Rules and Collaborative Filtering

CHAPTER 16 Cluster Analysis

PART VII FORECASTING TIME SERIES

CHAPTER 17 Handling Time Series

CHAPTER 18 Regression-Based Forecasting

CHAPTER 19 Smoothing Methods

PART VIII DATA ANALYTICS

CHAPTER 20 Social Network Analytics

CHAPTER 21 Text Mining

CHAPTER 22 Responsible Data Science

PART IX CASES

CHAPTER 23 Cases

References

Data Files Used in the Book

Index

CHAPTER 1
Introduction

1.1 WHAT IS BUSINESS ANALYTICS?

Business analytics (BA) is the practice and art of bringing quantitative data to bear on decision making. The term means different things to different organizations.

Consider the role of analytics in helping newspapers survive the transition to a digital world. One tabloid newspaper with a working‐class readership in Britain had launched a web version of the paper, and did tests on its home page to determine which images produced more hits: cats, dogs, or monkeys. This simple application, for this company, was considered analytics. By contrast, the Washington Post has a highly influential audience that is of interest to big defense contractors: it is perhaps the only newspaper where you routinely see advertisements for aircraft carriers. In the digital environment, the Post can track readers by time of day, location, and user subscription information. In this fashion, the display of the aircraft carrier advertisement in the online paper may be focused on a very small group of individuals—say, the members of the House and Senate Armed Services Committees who will be voting on the Pentagon's budget.

Business analytics, or more generically, analytics, includes a range of data analysis methods. Many powerful applications involve little more than counting, rule checking, and basic arithmetic. For some organizations, this is what is meant by analytics.

The next level of business analytics, now termed business intelligence (BI), refers to data visualization and reporting for understanding “what happened and what is happening.” This is done by use of charts, tables, and dashboards to display, examine, and explore data. BI, which earlier consisted mainly of generating static reports, has evolved into more user‐friendly and effective tools and practices, such as creating interactive dashboards that allow the user not only to access real‐time data but also to directly interact with it. Effective dashboards are those that tie directly into company data, and give managers a tool to quickly see what might not readily be apparent in a large complex database. One such tool for industrial operations managers displays customer orders in a single two‐dimensional display, using color and bubble size as added variables, showing customer name, type of product, size of order, and length of time to produce.

Business analytics now typically includes BI as well as sophisticated data analysis methods, such as statistical models and machine learning algorithms used for exploring data, quantifying and explaining relationships between measurements, and predicting new records. Methods like regression models are used to describe and quantify “on average” relationships (e.g., between advertising and sales), to predict new records (e.g., whether a new patient will react positively to a medication), and to forecast future values (e.g., next week's web traffic).
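To make the uses of regression mentioned above more concrete, here is a minimal sketch that fits a one-predictor least-squares line and uses it to predict a new record. The advertising and sales figures are invented for illustration, and `fit_line` is a hypothetical helper written for this example; it is not taken from the book or from Analytic Solver Data Mining.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope). Assumes x and y have equal length
    and that x contains at least two distinct values.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: advertising spend and resulting sales (arbitrary units).
advertising = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [12.0, 14.0, 16.0, 18.0, 20.0]

intercept, slope = fit_line(advertising, sales)

# The slope is the "on average" relationship: the change in sales
# associated with one extra unit of advertising.
print("average change in sales per unit of advertising:", slope)

# Predicting a new record: expected sales at an advertising level of 6.
print("predicted sales at advertising = 6:", intercept + slope * 6.0)
```

The toy data were constructed to lie exactly on a straight line, so the fitted slope simply recovers the relationship that was built in. With real data the slope would describe only an average tendency, with individual records scattered around the line.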

Readers familiar with earlier editions of this book might have noticed that the book title changed from Data Mining for Business Intelligence to Data Mining for Business Analytics, and finally, in this edition, to Machine Learning for Business Analytics. The first change reflects the more recent term BA, which overtook the earlier term BI to denote advanced analytics. Today, BI is used to refer to data visualization and reporting. The change from data mining to machine learning reflects today's common use of machine learning to refer to algorithms that learn from data. This book primarily uses the term machine learning, except for specific references to the software Analytic Solver Data Mining and its menus.

WHO USES PREDICTIVE ANALYTICS?

The widespread adoption of predictive analytics, coupled with the accelerating availability of data, has increased organizations’ capabilities throughout the economy. A few examples:

Credit scoring: One long‐established use of predictive modeling techniques for business prediction is credit scoring. A credit score is not some arbitrary judgment of credit‐worthiness; it is based mainly on a predictive model that uses prior data to predict repayment behavior.
Future purchases: A controversial example is Target's use of predictive modeling to classify sales prospects as “pregnant” or “not‐pregnant.” Those classified as pregnant could then be sent sales promotions at an early stage of pregnancy, giving Target a head start on a significant purchase stream.
Tax evasion: The US Internal Revenue Service found it was 25 times more likely to find tax evasion when enforcement activity was based on predictive models, allowing agents to focus on the most likely tax cheats (Siegel, 2013).

The business analytics toolkit also includes statistical experiments, the most common of which is known to marketers as A/B testing. These are often used for pricing decisions:

Orbitz, the travel site, found that it could price hotel options higher for Mac users than Windows users.
Staples online store found it could charge more for staplers if a customer lived far from a Staples store.
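As a rough illustration of how an A/B test might be evaluated, the sketch below applies a two-proportion z-test to the conversion counts of two page variants. The function name, the visitor counts, and the choice of a z-test are all assumptions made for this example; the text does not describe how Orbitz or Staples actually analyzed their experiments.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    Returns (z, p_value), where p_value is the two-sided p-value for the
    null hypothesis that both variants share the same conversion rate.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis of no difference.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 1,000 visitors see each price display.
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

A small p-value would suggest that the difference in conversion rates is unlikely to be due to chance alone. Whether the difference is large enough to matter commercially is a separate business judgment.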

Beware the organizational setting where analytics is a solution in search of a problem: A manager, knowing that business analytics and machine learning are hot areas, decides that her organization must deploy them too, to capture that hidden value that must be lurking somewhere. Successful use of analytics and machine learning requires both an understanding of the business context where value is to be captured, and an understanding of exactly what the machine learning methods do.

1.2 WHAT IS MACHINE LEARNING?

In this book, machine learning (or data mining) refers to business analytics methods that go beyond counts, descriptive techniques, reporting, and methods based on business rules. While we do introduce data visualization, which is commonly the first step into more advanced analytics, the book focuses mostly on the more advanced data analytics tools. Specifically, it includes statistical and machine‐learning methods that inform decision making, often in automated fashion. Prediction is typically an important component, often at the individual level. Rather than “what is the relationship between advertising and sales?” we might be interested in “what specific advertisement, or recommended product, should be shown to a given online shopper at this moment?” Or we might be interested in clustering customers into different “personas” that receive different marketing treatment, then assigning each new prospect to one of these personas.

The era of big data has accelerated the use of machine learning. Machine learning algorithms, with their power and automaticity, have the ability to cope with huge amounts of data and extract value.

1.3 MACHINE LEARNING, AI, AND RELATED TERMS

The field of analytics is growing rapidly, both in terms of the breadth of applications and in terms of the number of organizations using advanced analytics. As a result, there is considerable overlap and inconsistency of definitions. Terms have also changed over time.

The older term data mining means different things to different people. To the general public, it may have a general, somewhat hazy and pejorative meaning of digging through vast stores of (often personal) data in search of something interesting. Data mining, as it refers to analytic techniques, has largely been superseded by the term machine learning. Other terms that organizations use are predictive analytics, predictive modeling, and most recently machine learning and Artificial Intelligence (AI).

Many practitioners, particularly those from the IT and computer science communities, use the term AI to refer to all the methods discussed in this book. AI originally referred to the general capability of a machine to act like a human, and, in its earlier days, existed mainly in the realm of science fiction and the unrealized ambitions of computer scientists. More recently, it has come to encompass the methods of statistical and machine learning discussed in this book, as the primary enablers of that grand vision, and sometimes the term is used loosely to mean the same thing as machine learning. More broadly, it includes generative capabilities such as the creation of images, audio, and video.

Statistical Modeling vs. Machine Learning

A variety of techniques for exploring data and building models have been around for a long time in the world of statistics: linear regression, logistic regression, discriminant analysis, and principal component analysis, for example. But the core tenets of classical statistics—computing is difficult and data are scarce—do not apply in machine learning applications where both data and computing power are plentiful.

This gives rise to Daryl Pregibon's description of “data mining” (in the sense of machine learning) as “statistics at scale and speed” (Pregibon, 1999). Another major difference between the fields of statistics and machine learning is the focus in statistics on inference from a sample to the population regarding an “average effect”—for example, “a $1 price increase will reduce average demand by 2 boxes.” In contrast, the focus in machine learning...

Publication date (per publisher)	19 April 2023
Language	English
Subject area	Computer Science ► Office Programs ► Outlook
Keywords	Computer Science • Data Mining • Data Mining & Knowledge Discovery • Data Mining Statistics • Data Analysis • Electrical & Electronics Engineering • Neural Networks • Statistics
ISBN-10	1-119-82986-0 / 1119829860
ISBN-13	978-1-119-82986-7 / 9781119829867

EPUB (Adobe DRM)
Size: 42.5 MB

Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook against misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read the eBook only on devices that are also registered to your Adobe ID.

File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly well suited to fiction and non-fiction. The body text adapts dynamically to the display and font size, so EPUB also works well on mobile reading devices.

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the free Adobe Digital Editions software. We advise against using the OverDrive Media Console, as experience shows it frequently causes problems with Adobe DRM.
eReader: This eBook can be read on (almost) all eBook readers. It is not, however, compatible with the Amazon Kindle.
Smartphone/Tablet: You can read this eBook on both Apple and Android devices. You need an Adobe ID and a free app.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.

Other eBook edition

PDF (Adobe DRM)

107,99 €