
eBook - ePub
Data Analysis and Applications 3
Computational, Classification, Financial, Statistical and Stochastic Methods
About this book
Data analysis as an area of importance has grown exponentially, especially during the past couple of decades. This can be attributed to a rapidly growing computer industry and the wide applicability of computational techniques, in conjunction with new advances in analytic tools. This being the case, the need for literature that addresses these developments is self-evident. New publications are appearing, covering the need for information from all fields of science and engineering, thanks to the universal relevance of data analysis and statistics packages. This book is a collective work by a number of leading scientists, analysts, engineers, mathematicians and statisticians who have been working at the forefront of data analysis. The chapters included in this volume represent a cross-section of current concerns and research interests in these scientific areas. The material is divided into two parts, Computational Data Analysis and Classification Data Analysis, each with its associated methods. Together they provide the reader with both theoretical and applied information on data analysis methods, models and techniques, along with appropriate applications.
Data Analysis and Applications 3 is by Andreas Makrides, Alex Karagrigoriou and Christos H. Skiadas.
PART 1
Computational Data Analysis and Methods
1
Semi-supervised Learning Based on Distributionally Robust Optimization
We propose a novel method for semi-supervised learning (SSL) based on data-driven distributionally robust optimization (DRO) using optimal transport metrics. Our proposed method improves the generalization error by using the unlabeled data to restrict the support of the worst-case distribution in our DRO formulation. We make the DRO formulation practical by proposing a stochastic gradient descent algorithm for the training procedure. We demonstrate that our semi-supervised DRO method is able to improve the generalization error over natural supervised procedures and state-of-the-art SSL estimators. Finally, we include a discussion on the large sample behavior of the optimal uncertainty region in the DRO formulation. Our discussion exposes important aspects such as the role of dimension reduction in SSL.
1.1. Introduction
We propose a novel method for semi-supervised learning (SSL) based on data-driven distributionally robust optimization (DRO) using an optimal transport metric, also known as the earth mover's distance (see [RUB 00]).
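As a concrete reminder of what an optimal transport metric measures, the following is a minimal sketch of the earth mover's distance in the simplest setting: two empirical distributions on the real line with the same number of equally weighted points. In that special case the optimal plan matches sorted points to sorted points. The function name `wasserstein_1d` is hypothetical and the one-dimensional setting is an assumption made for illustration; the chapter works with more general transport costs.

```python
def wasserstein_1d(xs, ys):
    """Earth mover's (Wasserstein-1) distance between two empirical
    distributions on the real line with equally many, equally weighted points.

    Assumption for this sketch: len(xs) == len(ys). In one dimension the
    optimal transport plan then pairs the i-th smallest x with the
    i-th smallest y.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles samples of equal size")
    if not xs:
        raise ValueError("samples must be non-empty")
    total = sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys)))
    return total / len(xs)


if __name__ == "__main__":
    # Shifting every point by 1 costs exactly 1 unit of "earth moving".
    print(wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))  # 1.0
```

The distance is zero only when the two samples coincide as multisets, which is the property that lets it define a neighborhood of distributions around the empirical one.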
Our approach improves the generalization error by using the unlabeled data to restrict the support of the models that lie in the region of distributional uncertainty. Intuitively, our mechanism for fitting the underlying model is automatically tuned to generalize beyond the training set, but only over potential instances that are relevant. The expectation is that predictive variables often lie in lower dimensional manifolds embedded in the underlying ambient space; thus, the shape of this manifold is informed by the unlabeled data set (see Figure 1.1 for an illustration of this intuition).

Figure 1.1. Idealization of the way in which the unlabeled predictive variables provide a proxy for an underlying lower dimensional manifold. Large red dots represent labeled instances and small blue dots represent unlabeled instances. For a color version of this figure, see www.iste.co.uk/makrides/data3.zip
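The idea of restricting the support of the worst-case distribution can be written schematically as a min-max problem. The display below is a sketch in notation we introduce for illustration only; none of the symbols appear in this excerpt, and the chapter's own formulation may differ in its details.

```latex
\min_{\beta} \;
\sup_{\substack{P \in \mathcal{P}(\mathcal{X}_N) \\ D_c(P, P_n) \le \delta}}
\mathbb{E}_{P}\bigl[\ell(X, Y; \beta)\bigr]
```

Here $P_n$ is assumed to be the empirical distribution of the labeled sample, $D_c$ an optimal transport discrepancy with cost $c$, $\delta$ the size of the region of distributional uncertainty, $\ell$ a loss with parameter $\beta$, and $\mathcal{P}(\mathcal{X}_N)$ the set of distributions supported on a finite set $\mathcal{X}_N$ built from the labeled and unlabeled predictors. Restricting $P$ to $\mathcal{X}_N$ is what confines the adversary to instances near the manifold suggested by the unlabeled data.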
To enable the implementation of the DRO formulation, we propose a stochastic gradient descent (SGD) algorithm, which allows us to implement the training procedure with ease. Our SGD construction includes a procedure of independent interest which, we believe, can be used in more general stochastic optimization problems.
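For readers unfamiliar with SGD, the sketch below shows the basic update on an ordinary logistic loss. It is not the chapter's DRO algorithm, which this excerpt does not describe; it only illustrates the generic step of moving the parameter against the gradient of the loss at one randomly chosen sample. The function name `sgd_logistic`, the step size, the epoch count and the toy data are all assumptions.

```python
import math
import random


def sgd_logistic(data, lr=0.1, epochs=200, seed=0):
    """Plain SGD on the logistic loss log(1 + exp(-y * <beta, x>)).

    data: list of (x, y) pairs with x a list of floats and y in {-1, +1}.
    Illustration only: this is ordinary SGD, not the DRO-specific procedure.
    """
    rng = random.Random(seed)
    beta = [0.0] * len(data[0][0])
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)  # visit the samples in a fresh random order
        for i in order:
            x, y = data[i]
            margin = y * sum(b * xj for b, xj in zip(beta, x))
            # Gradient of the loss w.r.t. beta is scale * x.
            scale = -y / (1.0 + math.exp(margin))
            beta = [b - lr * scale * xj for b, xj in zip(beta, x)]
    return beta


if __name__ == "__main__":
    # Toy separable data; the first coordinate is a constant bias feature.
    toy = [([1.0, -2.0], -1), ([1.0, -1.0], -1),
           ([1.0, 1.0], 1), ([1.0, 2.0], 1)]
    print(sgd_logistic(toy))
```

In a DRO setting the loss inside the update would be replaced by the worst-case expected loss over the uncertainty region, which is where the chapter's additional construction comes in.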
We focus our discussion on semi-supervised classification but the modeling and computational approach t...
Table of contents
- Cover
- Table of Contents
- Preface
- PART 1: Computational Data Analysis and Methods
- PART 2: Classification Data Analysis and Methods
- List of Authors
- Index
- End User License Agreement