Statistical Learning of Complex Data

Statistical Learning of Complex Data (eBook)

eBook Download: PDF
2019 | 1st edition
XIII, 200 pages
Springer-Verlag
978-3-030-21140-0 (ISBN)
139.90 incl. VAT
  • Download available immediately

This book of peer-reviewed contributions presents the latest findings in classification, statistical learning, data analysis and related areas, including supervised and unsupervised classification, clustering, statistical analysis of mixed-type data, big data analysis, statistical modeling, graphical models and social networks. It covers both methodological aspects and applications to a wide range of fields such as economics, architecture, medicine, data management, consumer behavior and the gender gap. In addition, it describes the basic features of the software behind the data analysis results, and provides links to the corresponding codes and data sets where necessary.

This book is intended for researchers and practitioners who are interested in the latest developments and applications in the field of data analysis and classification. It gathers selected and peer-reviewed contributions presented at the 11th Scientific Meeting of the Classification and Data Analysis Group of the Italian Statistical Society (CLADAG 2017), held in Milan, Italy, on September 13-15, 2017.




Francesca Greselin is an Associate Professor of Statistics at the University of Milano-Bicocca, Milan, Italy. She teaches Statistics and Insurance Risks for graduate students and Inference for PhD students. Her research interests range from robust statistical methods for model-based classification and clustering, to inferential results for inequality and risk measures. She has published more than 30 scientific papers in peer-reviewed international statistics journals.

Laura Deldossi is an Associate Professor of Statistics at the Università Cattolica del Sacro Cuore in Milan, Italy. Her main research interests are optimal design of experiments, Bayesian model discrimination, discrete choice models, experimental and quasi-experimental designs for causal inference, and statistical process control. She has taught several courses: Statistics, Applied Statistics, Data Analysis and Sample Techniques, and Design of Experiments.

Luca Bagnato is an Assistant Professor of Statistics at the Università Cattolica del Sacro Cuore in Piacenza, Italy. He completed his Ph.D. in Statistics at the University of Milano-Bicocca in 2009 and received two postdoctoral fellowships: at the University of Milano-Bicocca and at the University of Verona. His research interests include time series analysis, distribution theory, mixture models, and spatial statistics. He has published more than 20 scientific papers in peer-reviewed journals.

Maurizio Vichi is a Full Professor of Statistics and Chair of the Department of Statistical Sciences at Sapienza University of Rome, Italy. He is Coordinating Editor of the international journal Advances in Data Analysis and Classification, published by Springer, and acting Chair of the European Statistical Advisory Committee of the EU. He teaches Multivariate Statistics and Advances in Data Analysis and Statistical Modelling. His research interests include statistical models for clustering, classification, dimensionality reduction, composite indicators, PLS, SEM and new methods for official statistics based on smart statistics and big data analysis. He is the author of more than 150 papers, mainly published in peer-reviewed international statistics journals.

Preface
Contents
Contributors
Part I Clustering and Classification
Cluster Weighted Beta Regression: A Simulation Study
1 Introduction
2 The Model
3 ML Parameter Estimation
4 Simulation Study
5 Concluding Remarks
References
Detecting Wine Adulterations Employing Robust Mixture of Factor Analyzers
1 Introduction and Motivation
2 Mixtures of Gaussian Factors Analyzers
3 Wine Recognition Data
4 Simulation Study
References
Simultaneous Supervised and Unsupervised Classification Modeling for Assessing Cluster Analysis and Improving Results Interpretability
1 Introduction
2 Proposal
3 Application on Real Data
4 Concluding Remarks
References
A Parametric Version of Probabilistic Distance Clustering
1 Introduction
2 Probabilistic Distance Clustering
3 Methodology
3.1 Gaussian PD Clustering
3.2 Student-t PD Clustering
4 Application on Simulated Data Sets
5 Conclusion and Future Work
References
An Overview on the URV Model-Based Approach to Cluster Mixed-Type Data
1 Introduction
2 Clustering Ordinal Data
3 Simultaneous Clustering and Reduction
4 Clustering Mixed-Type Data
5 Model Identifiability
6 Computation, Classification and Model Selection
7 Some Related Models
8 Real Data Application
References
Part II Exploratory Data Analysis
Preference Analysis of Architectural Façades by Multidimensional Scaling and Unfolding
1 Introduction
2 Paired-Comparison Task
Step 1
Step 2
3 Ranking Task
4 Conclusions
References
Community Structure in Co-authorship Networks: The Case of Italian Statisticians
1 Introduction
2 Community Detection Methods
3 Community Detection Results for Italian Statisticians
4 Conclusions
References
Analyzing Consumers' Behavior in Brand Switching
1 Introduction
2 Method
3 A Real Dataset: Potato Snack Brands and Their Customers
4 Data Analysis and Obtained Results
5 Discussion of the Obtained Findings
References
Evaluating the Quality of Data Imputation in Cardiovascular Risk Studies Through the Dissimilarity Profile Analysis
1 Introduction
2 The DPA Method in Short
3 DPA for Evaluating QoI in a CVD Risk Case Study
3.1 Application of DPA and PA for Evaluating QoI
3.2 Assessment of QoI in the Athlete Group
4 Conclusions
References
Part III Statistical Modeling
Measuring Economic Vulnerability: A Structural Equation Modeling Approach
1 Introduction
2 The Economic Vulnerability Index (EVI): A Possible Extension
3 The PLS Approach to Structural Equation Model
4 Results
4.1 Comparing Different Models' Specifications
4.2 Comparing Estimation Methods
4.3 Does Vulnerability Help Explain Growth?
5 Final Remarks
References
Bayesian Inference for a Mixture Model on the Simplex
1 Introduction
2 The Flexible Dirichlet Distribution
3 Bayesian Inference via Gibbs Sampling
4 A New Parametrization
5 Simulation Study
References
Stochastic Models for the Size Distribution of Italian Firms: A Proposal
1 Introduction
2 Model
3 Results of Empirical Studies
4 Discussion
References
Modeling Return to Education in Heterogeneous Populations: An Application to Italy
1 Introduction
2 The Model
3 Maximum Likelihood Estimation: The EM Algorithm
4 Real Data Analysis
5 Conclusions
References
Changes in Couples' Bread-Winning Patterns and Wife's Economic Role in Japan from 1985 to 2015
1 Introduction
1.1 Background
1.2 Hypotheses
2 Data and Methods
3 Results
4 Conclusion and Discussion
References
Weighted Optimization with Thresholding for Complete-Case Analysis
1 Introduction
2 Weighted Optimization
3 Examples
4 Conclusions
References
Part IV Graphical Models
Measurement Error Correction by Nonparametric Bayesian Networks: Application and Evaluation
1 Introduction
2 Nonparametric Bayesian Networks
3 NPBNs and Measurement Error: Application and Results
3.1 Results
4 Conclusions
References
Copula Grow-Shrink Algorithm for Structural Learning
1 Introduction
2 Nonparanormal Graphical Models
3 Structural Learning
4 Experiments
4.1 Simulation
4.2 An Application to Real Data: The Italian Energy Market
5 Conclusions
References
Context-Specific Independencies Embedded in Chain Graph Models of Type I
1 Introduction
2 Methodology
2.1 Hierarchical Multinomial Marginal Models for Context-Specific Independencies
2.2 Stratified Chain Graph Models
3 Application
3.1 The Italian Innovation Survey
4 Conclusions
References
Part V Big Data Analysis
Big Data and Network Analysis: A Combined Approach to Model Online News
1 Big Data and Network Analysis
2 The Big Data Audience Model
3 The Network Analysis Application
4 Conclusions
References
Experimental Design Issues in Big Data: The Question of Bias
1 The Challenges of Experimental Design with Big Data
2 Causal Models
3 Bias Models
3.1 A Game Theoretic Approach
4 Conclusion
References

Publication date (per publisher) 6 September 2019
Language English
Subject areas Mathematics / Computer Science > Computer Science > Databases
Mathematics / Computer Science > Mathematics > Statistics
Mathematics / Computer Science > Mathematics > Probability / Combinatorics
Economics
Keywords Big Data • classification • Clustering • Complex Data • Data Analysis • explanatory data analysis • Functional Data • Graphical Models • machine learning methods • Multidimensional Scaling • multiway data • network data • pattern recognition • Statistical Learning • statistical modeling
ISBN-10 3-030-21140-1 / 3030211401
ISBN-13 978-3-030-21140-0 / 9783030211400
PDF (watermark)
Size: 4.6 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized to you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for specialist books with columns, tables and figures. A PDF can be displayed on almost all devices, but is only of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in your web browser.

Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Unfortunately, we cannot fulfill eBook orders from other countries.
