Book
xiv, 183 p.
SAL3 (off-campus storage), Special Collections
Book
xvii, 134 leaves, bound.
SAL3 (off-campus storage), Special Collections
Book
xiv, 112 leaves bound.
SAL3 (off-campus storage), Special Collections
Book
xiii, 93 leaves, bound.
SAL3 (off-campus storage), Special Collections
Book
ix, 100 leaves.
SAL3 (off-campus storage), Special Collections
Book
1 online resource.
Dynamic panel data models are widely studied and used in econometrics and biostatistics. We first introduce mixed models, which have many applications in risk analytics in finance and insurance, to analyze dynamic panel data. We next develop a martingale approach to the evaluation of forecasting methods. We apply this approach to develop statistical inference procedures for a widely used measure for the evaluation of probability forecasts, the accuracy ratio. A comprehensive asymptotic theory for the time-adjusted accuracy ratio is established that takes care of the time series aspects of the data. Our martingale approach does not require subjective modeling of the data, such as assuming whether the time series are stationary, which is an obvious advantage in regulatory environments. We carry out an empirical study on default predictions of small and medium-sized enterprises. We propose using the generalized linear mixed model (GLMM) to model the default probabilities, whereas logistic regression is the most widely used method in the literature. We use the accuracy ratio to compare the forecasting methods, and the confidence intervals for the accuracy ratios show that the GLMM outperforms the logistic regression forecasts.
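As a concrete illustration of the evaluation measure discussed above, the accuracy ratio of a set of probability-of-default forecasts can be computed from the area under the ROC curve via the standard identity AR = 2*AUC - 1. The sketch below is hypothetical (simulated defaults and two made-up forecast series) and is not the thesis's inference procedure, which additionally provides martingale-based confidence intervals.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def accuracy_ratio(defaults, pd_forecasts):
    """Accuracy ratio (Gini coefficient) of probability-of-default forecasts,
    using the identity AR = 2 * AUROC - 1."""
    return 2.0 * roc_auc_score(defaults, pd_forecasts) - 1.0

# Hypothetical panel: realized defaults and two competing PD forecasts.
rng = np.random.default_rng(0)
risk = rng.beta(1, 20, size=5000)                              # latent default risk per firm-year
defaults = rng.binomial(1, risk)                               # realized default indicators
pd_a = np.clip(risk + 0.01 * rng.standard_normal(5000), 0, 1)  # sharper forecast
pd_b = np.clip(risk + 0.05 * rng.standard_normal(5000), 0, 1)  # noisier forecast

print("AR, forecast A:", accuracy_ratio(defaults, pd_a))
print("AR, forecast B:", accuracy_ratio(defaults, pd_b))
```

The sharper forecast should attain the higher accuracy ratio; the thesis's confidence intervals are what make such comparisons statistically rigorous for time series data.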
Book
1 online resource.
Markov decision processes (MDPs) provide a mathematical framework for modeling sequential decision making in which system evolution and cost/reward depend on uncertainties and on the control actions of a decision maker. MDP models have been widely adopted in numerous domains such as robotics, control systems, finance, economics, and manufacturing. At the same time, optimization theories of MDPs serve as the theoretical underpinnings of numerous dynamic programming and reinforcement learning algorithms in stochastic control problems. While the study of MDPs is attractive for several reasons, there are two main challenges associated with its practicality: 1) An accurate MDP model is oftentimes not available to the decision maker. Affected by modeling errors, the resultant MDP solution policy is non-robust to system fluctuations. 2) The most widely adopted optimization criterion for MDPs is the risk-neutral expectation of a cumulative cost. This does not take into account the notion of risk, i.e., increased awareness of events of small probability but high consequences. In this thesis we study multiple important aspects of risk-sensitive sequential decision making in which the variability of stochastic costs and robustness to modeling errors are taken into account. First, we address a special type of risk-sensitive decision making problem in which percentile behaviors are considered. Here risk is modeled either by the conditional value-at-risk (CVaR) or by the value-at-risk (VaR). VaR measures risk as the maximum cost that might be incurred with respect to a given confidence level, and is appealing due to its intuitive meaning and its connection to chance constraints. The VaR risk measure has many fundamental engineering applications such as motion planning, where a safety constraint is imposed to upper bound the probability of maneuvering into dangerous regimes. Despite its popularity, VaR suffers from being unstable, and its singularity often introduces mathematical issues into optimization problems. An alternative measure that addresses most of VaR's shortcomings is CVaR. CVaR is a risk measure that is rapidly gaining popularity in various financial applications, due to its favorable computational properties (CVaR is a coherent risk measure) and its superior ability to safeguard a decision maker from the "outcomes that hurt the most". As the conditional expected cost given that the cost is greater than or equal to VaR, CVaR accounts for the total cost of undesirable events (events whose probability is low but whose cost is high) and is therefore preferable in financial applications such as portfolio optimization. Second, we consider optimization problems in which the objective function involves a coherent risk measure of the random cost. Here the term coherent risk denotes a general class of risk measures that satisfy convexity, monotonicity, translation invariance, and positive homogeneity. These properties not only guarantee that the optimization problems are mathematically well posed, but they are also axiomatically justified. Modeling risk aversion with coherent risks has therefore gained widespread acceptance in engineering, finance, and operations research applications, among others. On the other hand, when the optimization problem is sequential, another important property of a risk measure is time consistency.
A time-consistent risk metric satisfies a "dynamic programming"-style property that ensures rational decision making, i.e., the strategy that is risk-optimal at the current stage will also be deemed optimal in subsequent stages. To get the best of both worlds, the recently proposed Markov risk measures satisfy both the coherent risk properties and time consistency. Thus, to ensure rationality in risk modeling and algorithmic tractability, this thesis focuses on risk-sensitive sequential decision making problems modeled by Markov risk measures.
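To make the two risk measures concrete, the following sketch estimates VaR and CVaR from a sample of simulated cumulative costs. It is a generic empirical computation, not an algorithm from the thesis; the cost distribution is hypothetical.

```python
import numpy as np

def var_cvar(costs, alpha=0.95):
    """Empirical value-at-risk and conditional value-at-risk of a cost sample.
    VaR_alpha is the alpha-quantile of the cost; CVaR_alpha is the expected cost
    conditional on the cost being at least VaR_alpha."""
    costs = np.asarray(costs, dtype=float)
    var = np.quantile(costs, alpha)
    cvar = costs[costs >= var].mean()
    return var, cvar

# Hypothetical cumulative costs from rolling out a fixed policy in a simulator.
rng = np.random.default_rng(1)
costs = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)
print("VaR/CVaR at 95%:", var_cvar(costs, alpha=0.95))
```

Because CVaR averages over the whole tail beyond VaR, it is the quantity that penalizes the "outcomes that hurt the most" and underlies the coherent, time-consistent formulations studied in the thesis.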
Book
1 online resource.
A definition of a probabilistic automaton is formulated in which its prime decomposition follows as a direct consequence of the Krohn-Rhodes theorem. We characterize Green's relations on the monoid of stochastic matrices. The reduced holonomy monoid is induced by the holonomy decomposition of Eilenberg, which is a variant of the prime decomposition. We prove that its representation theory is determined by an iterated wreath product of its holonomy groups.
Book
1 online resource.
In Monte Carlo simulations it is often essential that a method is accompanied by an appropriate variance reduction technique. Reducing the variance of a Monte Carlo method is, at least conceptually, an optimization problem, and mathematical optimization has indeed been used as a theoretical and conceptual tool in this pursuit. However, traditional Monte Carlo methods have used numerical optimization only sparingly, and convex optimization even less. Numerical optimization is the study of algorithms for finding a solution to an optimization problem, as opposed to the study of analytical solutions of an optimization problem. Convex optimization is the study of convex optimization problems, a subclass of optimization problems for which efficient algorithms for finding the global optimum exist. In this work we present a framework for using convex optimization in Monte Carlo. More specifically, we present a framework for using stochastic convex optimization for adaptive importance sampling, self-normalized importance sampling, and what-if simulations. The main idea is to perform importance sampling and numerical optimization simultaneously. In particular, the numerical optimization does not rely on black-box optimization solvers, which allows the computational cost of each iteration to remain cheap. Because the optimization is performed on a convex problem, we can establish convergence and optimality.
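The "optimize while you sample" idea can be illustrated with a toy adaptive importance sampling scheme: for a standard Gaussian target and a mean-shifted Gaussian proposal family, the second moment of the weighted estimator is convex in the shift parameter, so a stochastic gradient step can be taken from each sample while the importance sampling estimate accumulates. This is a minimal sketch under those specific assumptions, not the general framework developed in the thesis.

```python
import numpy as np

# Toy adaptive importance sampling by stochastic convex optimization (illustration only).
# Target: E_p[f(X)] with p = N(0, 1); proposals q_theta = N(theta, 1).
# The second moment V(theta) = E_p[f(X)^2 p(X)/q_theta(X)] is convex in theta, with
# unbiased gradient estimate g = -f(x)^2 w(x)^2 (x - theta) for x ~ q_theta, w = p/q_theta.
rng = np.random.default_rng(0)
f = lambda x: float(x > 2.0)                    # rare-event indicator, hard for plain Monte Carlo
log_p = lambda x: -0.5 * x ** 2
log_q = lambda x, th: -0.5 * (x - th) ** 2

theta, n, total = 0.0, 20_000, 0.0
for i in range(1, n + 1):
    x = rng.normal(theta, 1.0)
    w = np.exp(log_p(x) - log_q(x, theta))
    total += f(x) * w                           # each term is conditionally unbiased for E_p[f]
    grad = -(f(x) * w) ** 2 * (x - theta)       # stochastic gradient of the second moment
    theta -= 0.5 / np.sqrt(i) * grad            # Robbins-Monro step; theta drifts toward the tail

print("adaptive IS estimate:", total / n, "final theta:", theta)
```

Because theta is updated using only past samples, each weighted term stays unbiased, so sampling and optimization can legitimately proceed at the same time.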
Book
1 online resource.
Financial markets are well-defined complex systems, which are continuously monitored and recorded, and generate tremendous amounts of data. This thesis is an attempt to analyze financial markets as high-dimensional and dynamic systems. In dealing with high-dimensional data sets, factor models are often useful for dimension reduction. In the first part of this thesis, we present a new approach to estimating high-dimensional factor models, using the empirical spectral density of residuals. The spectrum of covariance matrices from financial data typically exhibits two characteristic aspects: a few spikes and a bulk. The former represent factors that mainly drive the features, and the latter arises from idiosyncratic noise. Motivated by these two aspects, we consider a minimum distance between two spectra: one from a covariance structure model and the other from real residuals of financial data, obtained by subtracting principal components. Our method simultaneously provides an estimator of the number of factors and information about the correlation structure of the residuals. Using free random variable techniques, the proposed algorithm can be implemented and controlled effectively. Monte Carlo simulations confirm that our method is robust to noise and to the presence of weak factors. Furthermore, the application to financial time series shows that our estimators capture essential aspects of market dynamics. In the second part of this thesis, we turn our attention to more practical applications using residual dynamics. In this part, we study the risk of mean-reversion in statistical arbitrage. The basic concept of statistical arbitrage is to exploit short-term deviations in returns from a long-term equilibrium across several assets. This kind of strategy relies heavily on the assumption that idiosyncratic returns are mean-reverting, returning to a long-term mean after a certain amount of time, but literature on assessing the risk of this belief is rare. We propose a scheme that controls the risk of mean-reversion via portfolio selection and screening. Recognizing that each residual has a different mean-reversion time, the residuals that are fast mean-reverting are selected to form portfolios. In addition, further control is imposed by allowing trading activity only when the goodness-of-fit of the estimation for trading signals is sufficiently high. We formulate the dynamic asset allocation with market- and dollar-neutrality conditions as a constrained optimization problem, which is solved numerically. The improved reliability and robustness of our strategy is demonstrated through back-testing with real data. It is found that the performance of this investment framework is robust to market changes. We further provide answers to puzzles regarding the relations among the number of factors, the length of the estimation window, and transaction costs, which are crucial parameters that have direct impacts on the investment strategy. In the third part, we consider Wishart processes, which are good candidates for modeling a stochastic covariance matrix. Specifically, we use perturbation theory to derive the eigenvalues and eigenvectors of Wishart processes. These processes have the interesting feature that the eigenvalues form a multi-dimensional stochastic system of non-colliding particles, which has been widely studied in mathematical physics.
In addition, we implement a numerical scheme to effectively simulate eigenvalues of the square Ornstein-Uhlenbeck processes. In this scheme, the Yosida approximation was exploited to resolve the difficulty that arises from the singular drift terms. The results in this chapter contribute to the dynamic modeling of stochastic covariance matrices and their financial applications.
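Relating to the first part of the abstract, the sketch below illustrates the simplest spike-versus-bulk heuristic: count the sample correlation eigenvalues that exceed the Marchenko-Pastur bulk edge expected under pure noise. The data-generating model and threshold rule are illustrative stand-ins for the spectral-distance estimator developed in the thesis.

```python
import numpy as np

# Count "spiked" eigenvalues of a sample correlation matrix against the
# Marchenko-Pastur bulk edge (1 + sqrt(p/n))^2 expected for pure noise.
rng = np.random.default_rng(0)
n, p, k_true = 1000, 200, 3
factors = rng.standard_normal((n, k_true))
loadings = rng.standard_normal((k_true, p))
returns = factors @ loadings + 3.0 * rng.standard_normal((n, p))  # factor model + idiosyncratic noise

z = (returns - returns.mean(0)) / returns.std(0)   # standardize each series
corr = z.T @ z / n                                  # sample correlation matrix
eigvals = np.linalg.eigvalsh(corr)[::-1]            # eigenvalues, descending
mp_edge = (1 + np.sqrt(p / n)) ** 2
k_hat = int(np.sum(eigvals > mp_edge))
print("estimated number of factors:", k_hat, "(true:", k_true, ")")
```

With loadings this strong, the spiked eigenvalues sit far above the noise bulk, so the count typically recovers the true number of factors; weak factors near the edge are exactly the harder case the thesis addresses.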
Book
1 online resource.
The goal of the research presented in this dissertation is to present a data-driven policy making approach in three different settings. In the first setting, we use data to inform policies for supplying supplementary food aid to undernourished children in Guatemala. We fit a trivariate model of weight-for-age z score (WAZ), height-for-age z score (HAZ), and diarrhea status to data from an observational study of supplementary feeding in 17 Guatemalan communities. We estimate how the effect of supplementary food on WAZ, HAZ, and diarrhea status varies with a child's age. We find that the effect of supplementary food on all 3 metrics decreases linearly with age from 6 to 20 mo and has little effect after 20 mo. We derive 2 food allocation policies that myopically (i.e., looking ahead 2 mo) minimize either the underweight or the stunting severity -- i.e., the sum of squared WAZ or HAZ scores over all children with WAZ or HAZ < 0. A simulation study based on the statistical model predicts that in a low-dose (100 kCal/day) supplementary feeding setting in Guatemala, allocating food primarily to 6-12 mo infants can reduce the severity of underweight and stunting. In the second setting, we analyze how targeting interventions to the most undernourished would perform in reducing the malaria burden among children in sub-Saharan Africa. We construct a malaria model with superinfection and heterogeneous susceptibility, where a portion of this susceptibility is due to undernutrition (as measured by weight-for-age z scores); to isolate the impact of supplementary food on malaria from the influence of confounding factors, we estimate the portion of the total susceptibility that is due to undernutrition from a large randomized trial of supplementary feeding. A simulation study based on the malaria model suggests that targeting insecticide-treated bed nets to undernourished children leads to fewer malaria deaths than random distribution of bed nets in hypoendemic and mesoendemic settings. In the third setting, we quantify tuberculosis (TB) transmission risk within and outside cells using high-resolution social contact data from prisoners in a large prison in the state of Mato Grosso do Sul, Brazil. We then develop and parameterize a model of TB natural history and transmission to evaluate the impact of TB control interventions in prisons on TB incidence among the prisoners. A simulation study using this model suggests that no single intervention (improved case detection, decrowding, or active case detection) would be sufficient to bring TB incidence back to the levels of the early 2000s; a combination of interventions would be necessary. The results of these studies demonstrate that mathematical modeling can be a powerful tool for understanding the interplay of factors behind the policy in question and can help policy makers make informed decisions.
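A toy version of the myopic allocation rule described above is sketched below: rations are assigned greedily to whichever child's squared negative WAZ would drop the most over the next period. The linear-in-age response model, magnitudes, and simulated population are hypothetical stand-ins, not the fitted trivariate model from the study.

```python
import numpy as np

# Hypothetical illustration of a myopic (2-month lookahead) allocation rule that targets
# the underweight severity sum(WAZ^2 over children with WAZ < 0).
rng = np.random.default_rng(0)
n_children, rations = 200, 60                      # daily 100 kCal rations to hand out
age = rng.uniform(6, 36, n_children)               # months
waz = rng.normal(-1.0, 1.0, n_children)            # current weight-for-age z-scores
benefit = np.clip((20 - age) / 14, 0, 1) * 0.05    # assumed WAZ gain per ration, fading by 20 mo

alloc = np.zeros(n_children, dtype=int)
for _ in range(rations):
    new_waz = waz + benefit
    # Reduction in each child's contribution to the underweight severity if given one ration.
    gain = np.where(waz < 0, waz**2, 0.0) - np.where(new_waz < 0, new_waz**2, 0.0)
    j = int(np.argmax(gain))
    alloc[j] += 1
    waz[j] = new_waz[j]

print("share of rations to children aged 6-12 mo:",
      alloc[(age >= 6) & (age <= 12)].sum() / rations)
```

Under the assumed age-fading benefit, the greedy rule concentrates rations on the youngest undernourished children, mirroring the qualitative finding reported in the abstract.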
Book
1 online resource.
In online social networks such as Twitter and Facebook, users are constantly sharing information with people they are connected to, as well as re-sharing information posted by others. Through this process, a single piece of information called a contagion can spread from user to user over the connections until it has reached a large portion of the network. In this thesis, we develop a series of probabilistic methods for modeling the spread of contagions in social networks in order to better understand the factors that affect the process. Our work examines several different phenomena that affect information flows through social networks. One such phenomenon is unobserved sources of information influencing members of the network. We present a model that not only quantifies these hidden information sources but also provides a more accurate view of information spread. We find that as much as 29% of all information spreading through social networks like Twitter originates from sources outside the network. Next, we examine how different contagions spreading through a network can interact with each other. We observe and model competition (when one contagion can decrease the spread of another contagion) and cooperation (when one contagion increases the spread of another). We find that contagion interaction can increase or decrease the probability of contagion spread by more than 70% on average. We also explore the dynamic nature of social network structure, and how these dynamics are affected by the spread of information. As social network users are exposed to new contagions, they are constantly forming new connections with other users as well as deleting connections. We find that the spread of contagions can cause sudden "bursts" in both the creation of new connections and the deletion of old connections. We also find that contagions can change the network structure on a global scale by moving like-minded members closer to each other as well as pushing less similar users farther away. Additionally, we consider the problem of inferring the structure of a hidden network when only patterns of information spread are known. Given only the timing of when each user adopted each piece of information, our method accurately predicts which users are connected to which other users by converting the maximum likelihood estimation of the network into a series of independent convex optimization problems that can be solved efficiently. Taken together, the results presented in this thesis make contributions to understanding how information flows through networks, and they provide insights into the nature of how people exchange information.
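The interplay between network spread and external exposure can be illustrated with a toy cascade simulation on a synthetic graph, in which each adoption is attributed either to an infected neighbor or to an unobserved outside source. The graph model, adoption probabilities, and bookkeeping below are illustrative and unrelated to the fitted models or the 29% estimate reported in the thesis.

```python
import numpy as np
import networkx as nx

# Toy contagion cascade with an unobserved external source of exposure.
rng = np.random.default_rng(0)
G = nx.barabasi_albert_graph(2000, 3, seed=0)
beta, mu, steps = 0.05, 0.001, 50                  # per-exposure and per-step external adoption rates
adopted = {int(rng.integers(len(G))): "network"}   # seed user

for _ in range(steps):
    newly = {}
    for u in G.nodes:
        if u in adopted or u in newly:
            continue
        infected_neighbors = sum(v in adopted for v in G.neighbors(u))
        if rng.random() < 1 - (1 - beta) ** infected_neighbors:
            newly[u] = "network"                   # adoption attributed to a neighbor
        elif rng.random() < mu:
            newly[u] = "external"                  # adoption attributed to an outside source
    adopted.update(newly)

external_share = sum(s == "external" for s in adopted.values()) / len(adopted)
print("fraction of adoptions from external sources:", external_share)
```

Inference runs this logic in reverse: given only adoption times, the thesis's models attribute each adoption to in-network exposure or an external source, and separately recover the hidden edges via per-node convex maximum likelihood problems.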
Book
1 online resource.
With the ever-increasing power of computers and the advent of parallel computing in recent decades, scientists and engineers are relying more and more on numerical simulations in their studies of complex physical systems. Given the central role that simulations play in engineering design and decision making, it is crucial to assign confidence to their outputs. One important factor that leads to uncertainty in the output quantity of interest of a physical system is uncertainty in the inputs, such as material properties, manufacturing details, and initial and boundary conditions. Our goal in uncertainty quantification (UQ) is, by definition, to quantify the effects of these input uncertainties on the output quantity of interest. In other words, we are trying to describe the behavior of this output variable as a function of the input uncertainties. In intrusive UQ strategies, we solve the governing equations of the physical system in terms of both the physical and the uncertain variables. Solving these equations is usually significantly more challenging than solving the original (deterministic) equations, and often requires writing new code that is substantially different from the deterministic solver. In non-intrusive UQ, on the other hand, we run the deterministic code for various values of the input parameters and use the outputs of these simulations to construct an approximation of the behavior of the output quantity of interest as a function of the input uncertainties. Although non-intrusive methods come in many flavors, they are all based on some variation of the following fundamental problem in approximation theory: given the values of a function at a set of points in its domain, how can we efficiently and accurately approximate that function? In this dissertation, we study several variants of this problem. Sometimes we are free to choose the points at which the function is evaluated (i.e., the values of the input parameters for which we run the deterministic simulation). In this scenario, we need to choose the points so that, given a certain number of function evaluations, we get the best quality of approximation. As an instance of the problem in this setting, we look at a non-intrusive polynomial chaos expansion (PCE) technique, in which we use weighted least squares to construct a multivariate polynomial surrogate. We present a novel optimization-based method for finding the best points for this type of approximation. We are not always free to choose the grid points, and sometimes have to find the best approximation we can using a fixed set of points that is given to us. For these problems, in univariate settings, we present an efficient and accurate method based on Floater-Hormann rational interpolation. For multivariate settings, we present a generalization of nearest neighbor interpolation based on L1 minimization. This method has similar convergence properties to those of the moving least squares method, but unlike moving least squares, it does not come with any tunable parameters. We also look at a hybrid setting, where some of the points are fixed and we are free to choose the rest. Assume that we have found the polynomial interpolant of a function at a set of Chebyshev points and decide that we need a higher-order polynomial interpolant.
We present a method for finding the best points to add for constructing the higher-order interpolant, under the constraint that the previous set of points must be reused.
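A minimal non-intrusive surrogate in the spirit described above: evaluate a stand-in "deterministic solver" at sampled input values and fit a Legendre (polynomial chaos) expansion by least squares. The solver, sampling plan, and uniform weights are placeholders; choosing the evaluation points and weights optimally is precisely the contribution of the thesis.

```python
import numpy as np
from numpy.polynomial import legendre

def solver(xi):
    """Stand-in for an expensive deterministic simulation with one uncertain input."""
    return np.sin(np.pi * xi) + 0.3 * xi ** 2

rng = np.random.default_rng(0)
degree, n_samples = 6, 40
xi = rng.uniform(-1, 1, n_samples)           # uniformly distributed input -> Legendre basis
y = solver(xi)

V = legendre.legvander(xi, degree)           # design matrix of Legendre polynomials
w = np.ones(n_samples)                       # uniform weights; optimized weights are the thesis's topic
coeffs, *_ = np.linalg.lstsq(V * w[:, None], y * w, rcond=None)

# Evaluate the surrogate anywhere, and read the output mean off the constant PCE coefficient.
print("surrogate at xi = 0.3:", legendre.legval(0.3, coeffs))
print("estimated output mean:", coeffs[0])
```

Because the Legendre polynomials are orthogonal for a uniform input, the zeroth coefficient directly estimates the mean of the output quantity of interest, which is one of the conveniences of PCE surrogates.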
Book
1 online resource.
Because geophysical data are inevitably sparse and incomplete, stochastic treatments of simulated responses are crucial to explore possible scenarios and assess risks in subsurface problems. In particular, nonlinear two-phase flow in porous media is essential, yet challenging, in reservoir simulation and hydrology. Adding highly heterogeneous and uncertain input, such as the permeability and porosity fields, turns the estimation of the flow response into a difficult stochastic problem for which computationally expensive Monte Carlo simulations remain the preferred option. In this thesis, we first propose an alternative approach to evaluating the probability distribution of the (water) saturation for nonlinear transport in strongly heterogeneous porous systems. We build a physics-based, computationally efficient, and numerically accurate method to estimate the one-point probability density and cumulative distribution functions of the saturation. The distribution method draws inspiration from a Lagrangian approach to the stochastic transport problem and expresses the saturation probability density function and cumulative distribution function essentially in terms of a deterministic nonlinear mapping of scalar random fields. In a large class of applications these random fields are smooth and can be estimated at low computational cost (a few Monte Carlo runs), thus making the distribution method attractive. Once the saturation distribution is determined, any one-point statistics thereof can be obtained, in particular the saturation average and standard deviation. More importantly, the probability of rare events and saturation quantiles (e.g. P10, P50 and P90) can be efficiently derived from the distribution method. These statistics can then be used for risk assessment, as well as for data assimilation and uncertainty reduction in the prior knowledge of geophysical input distributions. We provide various examples and comparisons with existing methods to illustrate the performance and applicability of the newly developed method. In the second part of the thesis, we present a procedure to analytically obtain the multi-point cumulative distribution function of the saturation for the stochastic two-phase Buckley-Leverett model with a random total-velocity field. The multi-point distribution function is determined by first deriving a partial differential equation for the saturation raw cumulative distribution function at each point, and then combining these equations into a single partial differential equation for the multi-point raw cumulative distribution function. This latter stochastic partial differential equation, linear in the space-time variables, can be solved in closed form or semi-analytically for one-dimensional spatial problems, and numerically for higher spatial dimensions. Finally, the ensemble average of its solution gives the saturation multi-point cumulative distribution function. We provide numerical results of distribution function profiles in one spatial dimension and for two points. In addition, we use the two-point distribution method to compute the saturation auto-covariance function, which is essential for data assimilation. We confirm the validity of the method by comparing covariance results obtained with the multi-point distribution method and with Monte Carlo simulations.
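For reference, the brute-force Monte Carlo baseline that the distribution method is designed to outperform looks like the following: sample input fields, run the flow solver, and read off one-point saturation statistics at a monitoring location. The "solver" below is a hypothetical closed-form stand-in for an expensive two-phase flow simulation.

```python
import numpy as np

def flow_solver(perm_field):
    """Stand-in for a two-phase (Buckley-Leverett type) simulation; returns the water
    saturation at a monitoring location as a hypothetical function of the input field."""
    return 1.0 / (1.0 + np.exp(-perm_field.mean()))

rng = np.random.default_rng(0)
n_realizations, n_cells = 2000, 100
saturations = np.array([
    flow_solver(rng.lognormal(0.0, 1.0, n_cells))   # one random permeability field per run
    for _ in range(n_realizations)
])

p10, p50, p90 = np.quantile(saturations, [0.10, 0.50, 0.90])
print("mean / std:", saturations.mean(), saturations.std())
print("P10 / P50 / P90:", p10, p50, p90)
print("exceedance probability P(S > 0.9):", np.mean(saturations > 0.9))
```

The distribution method in the thesis targets exactly these quantities (one-point density, quantiles, tail probabilities) but requires only a handful of runs to estimate the underlying smooth random fields, rather than thousands of full simulations.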
Book
1 online resource.
As the ability to generate datasets with billions of records in relatively automated ways continues to grow, new challenges are posed to modern large-scale data analysis from several points of view, such as scalability, feasibility, and interpretability. Thus, improved algorithms on large-scale data platforms are of interest. Recent years have seen great interest in Randomized Linear Algebra (RLA) algorithms. RLA is an interdisciplinary research area that exploits randomization as a computational resource for the development of improved algorithms for common matrix problems such as least-squares approximation, low-rank matrix approximation, and Laplacian-based linear equation solvers. In this thesis, our focus is on the underlying theory and practical implementation of RLA algorithms. In particular: Chapter 3 describes a novel sampling algorithm for large-scale over-determined quantile regression problems whose running time is roughly proportional to the number of non-zero elements in the matrix plus a term that depends only on the low dimension. Chapter 4 describes a hybrid algorithm named pwSGD --- preconditioned weighted stochastic gradient descent --- that combines RLA and SGD. We prove that pwSGD inherits faster convergence rates that depend only on the lower dimension of the linear system, while maintaining low computational complexity. Chapter 5 describes a class of randomized Newton-type algorithms that exploit non-uniform sub-sampling as well as inexact updates for a class of constrained optimization problems. We show that our methods are more robust to ill-conditioned problems than similar previous approaches. Chapter 6 presents results of implementing RLA algorithms for least-squares, quantile regression, and CX decomposition problems in parallel and distributed environments using Apache Spark, and discusses various tradeoffs in the implementations. In particular, we demonstrate that least-squares problems with up to terabyte-sized data can be solved to low, medium, or high precision on existing distributed systems. Chapter 7 demonstrates that applying CX/CUR decomposition to large-scale Mass Spectrometry Imaging (MSI) datasets returns interpretable results, as it successfully identifies important ions and locations in complex biological samples. All in all, we show that RLA is a powerful tool for deriving scalable algorithms for large-scale regression and optimization problems, and we demonstrate that RLA algorithms are amenable to distributed computing platforms and useful in scientific applications.
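A generic example of the RLA paradigm for overdetermined least squares, using a dense Gaussian sketch (the thesis's algorithms use more refined sampling and projection schemes and also target quantile regression and SGD preconditioning): compress the tall problem with a random matrix, then solve the small sketched problem.

```python
import numpy as np

# Sketch-and-solve for overdetermined least squares min_x ||Ax - b||_2.
rng = np.random.default_rng(0)
n, d, sketch_size = 20_000, 50, 500
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

S = rng.standard_normal((sketch_size, n)) / np.sqrt(sketch_size)   # Gaussian sketching matrix
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)            # solve the small sketched problem
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)                     # full solve, for comparison

ratio = np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b)
print("residual of sketched solution relative to exact:", ratio)
```

The residual ratio is typically close to 1 once the sketch size is a modest multiple of the low dimension d, which is the basic phenomenon that RLA theory quantifies and that the thesis exploits for preconditioning and sampling-based solvers.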
Book
1 online resource.
In this thesis we investigate the utility of spectral sequences for persistent homology, a tool which characterizes the shape of data. We show how to use these spectral sequences to compute persistent homology in parallel.
Book
1 online resource.
This thesis examines estimation of covariance and correlation matrices. More specifically, in the first part we study dynamical properties of the top eigenvalue and eigenvector of sample estimators of covariance and correlation matrices. This is done under the assumption that the top eigenvalue is separated from the others, which is reasonable when the data come from financial returns. We show exactly how these quantities behave when the true covariance or correlation is stationary, and we derive theoretical values of related quantities that can be useful when quantifying the amount of non-stationarity in real data. We also validate the results using Monte Carlo simulations. A major contribution of the analysis is that it shows how, and in which regimes, correlation matrices differ from covariance matrices from a dynamic viewpoint. This effect has been observed in financial data, but never explained. In the second part of the thesis we study modifications of covariance estimators that find the optimal estimator within a certain sub-class. Estimators of this type are generally known as shrinkage estimators, as they modify only the eigenvalues of the original estimator. We do this when the original estimator takes the form A^{1/2} X B X^T A^{1/2}, where A and B are matrices and X is a matrix of i.i.d. variables. The analysis is done in the asymptotic limit where the number of samples and the number of variables approach infinity jointly, so that random matrix theory can be used. Our goal is to find the shrinkage estimator that minimizes the expected Frobenius norm of the difference between the estimator and the true covariance matrix. To do this we first derive a generalization of the Marchenko-Pastur equation for this class of estimators. This theorem allows us to calculate the asymptotic value of the projections of the sample eigenvectors onto the true covariance matrix. We then show how to use these projections to find the optimal covariance estimator. Finally, we show with simulations that these estimators are close to the optimal bound when used on finite data sets.
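The "modify only the eigenvalues" idea can be illustrated with the oracle version of such an estimator: keep the sample eigenvectors u_i and replace each sample eigenvalue by u_i' Sigma u_i, which minimizes the Frobenius distance to the true covariance among all estimators sharing the sample eigenvectors. The thesis's contribution is to show how these oracle quantities can be approximated from data alone in the joint large-sample, large-dimension limit; the sketch below assumes the true covariance is known.

```python
import numpy as np

# Oracle eigenvalue-only ("shrinkage") covariance estimator.
rng = np.random.default_rng(0)
p, n = 100, 300
sigma = np.diag(np.linspace(1.0, 10.0, p))              # hypothetical true covariance
X = rng.standard_normal((n, p)) @ np.linalg.cholesky(sigma).T
S = X.T @ X / n                                         # sample covariance

eigvals, U = np.linalg.eigh(S)                          # columns of U are sample eigenvectors
d_oracle = np.einsum("ip,ij,jp->p", U, sigma, U)        # u_i' Sigma u_i for each eigenvector
S_shrunk = (U * d_oracle) @ U.T                         # keep eigenvectors, swap in oracle eigenvalues

frob = lambda M: np.linalg.norm(M - sigma, "fro")
print("Frobenius error, sample vs oracle shrinkage:", frob(S), frob(S_shrunk))
```

The oracle estimator is markedly closer to the truth than the raw sample covariance; the generalized Marchenko-Pastur result in the thesis is what makes a data-driven approximation of d_oracle possible for the broader A^{1/2} X B X^T A^{1/2} class.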
Book
1 online resource.
Symmetric cone optimization subsumes linear optimization, second-order cone optimization, and semidefinite optimization. It is of interest to extend the algorithmic developments of symmetric cone optimization into the realm of unsymmetric cones. We analyze the theoretical properties of some algorithms for unsymmetric cone problems. We show that they achieve excellent worst-case iteration bounds while not necessarily being practical to implement. Using lessons from this analysis and inspired by the Mehrotra predictor-corrector algorithm, we extend the homogeneous implementation ECOS to handle problems modeled with Cartesian products of the positive orthant, second-order cones, and the exponential cone, and we empirically validate its efficiency.
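A small example of the cone classes mentioned above, posed through CVXPY and handed to ECOS (assuming a CVXPY installation with ECOS and its exponential-cone support available; the problem itself is arbitrary): the objective mixes a second-order cone term with logarithms, and the constraints live in the nonnegative orthant.

```python
import cvxpy as cp
import numpy as np

# A toy problem combining the nonnegative orthant, a second-order cone (the norm term),
# and exponential cones (the log terms), solved with ECOS via CVXPY.
np.random.seed(0)
A = np.random.randn(5, 3)
b = np.random.randn(5)

x = cp.Variable(3)
objective = cp.Minimize(cp.norm(A @ x - b, 2) - cp.sum(cp.log(x)))
constraints = [x >= 1e-3, cp.sum(x) <= 2]
prob = cp.Problem(objective, constraints)
prob.solve(solver="ECOS")

print("status:", prob.status)
print("optimal x:", x.value)
```

Requesting solver="ECOS" explicitly exercises the kind of mixed symmetric/unsymmetric cone formulation (orthant, second-order, exponential) that the extended homogeneous implementation is built to handle.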
Special Collections
Book
1 online resource.
This thesis presents the design and analysis of numerical methods for free- and moving-boundary problems: partial differential equations posed on domains that change with time. Two principal developments are presented. First, a novel framework is introduced for solving free- and moving-boundary problems with a high order of accuracy. This framework has the distinct advantage that it can handle large domain deformations easily (a common difficulty faced by conventional deforming-mesh methods) while representing the geometry of the moving domain exactly (an infeasible task for conventional fixed-mesh methods). This is accomplished using a universal mesh: a background mesh that contains the moving domain and conforms to its geometry at all times by perturbing a small number of nodes in a neighborhood of the moving boundary. The resulting framework admits, in a general fashion, the construction of methods that are of arbitrarily high order of accuracy in space and time when the boundary evolution is prescribed. Numerical examples involving phase-change problems, fluid flow around moving obstacles, and free-surface flows are presented to illustrate the technique. Second, a unified analytical framework is developed for establishing the convergence properties of a wide class of numerical methods for moving-boundary problems. This class includes, as special cases, the technique described above as well as conventional deforming-mesh methods (commonly known as arbitrary Lagrangian-Eulerian, or ALE, schemes). An instrumental tool developed in this analysis is an abstract estimate, which applies to rather general mesh motions, for the error incurred by finite element discretizations of parabolic moving-boundary problems. Specializing the abstract estimate to particular choices of the mesh motion strategy and finite element space leads to error estimates in terms of the mesh spacing for various semidiscrete schemes. We illustrate this by deriving error estimates for ALE schemes under mild assumptions on the nature of the mesh deformation and the regularity of the exact solution and the moving domain, and we do the same for universal meshes.
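A one-dimensional cartoon of the universal-mesh idea (the actual framework operates on simplicial meshes in higher dimensions and also relaxes nearby nodes to avoid degenerate elements): a fixed background grid covers the region of interest, and at each time the node nearest the moving boundary is perturbed onto it, so the mesh conforms to the moving domain without remeshing.

```python
import numpy as np

# Fixed background grid on [0, 1]; the moving domain is [0, s(t)].
background = np.linspace(0.0, 1.0, 21)        # never regenerated

def conforming_mesh(s):
    """Return a mesh for [0, s] by snapping the background node nearest s onto s."""
    nodes = background.copy()
    j = int(np.argmin(np.abs(nodes - s)))     # node nearest the moving boundary
    nodes[j] = s                              # perturb only that node
    return nodes[nodes <= s + 1e-12]          # keep the portion covering the moving domain

for t in np.linspace(0.0, 1.0, 5):
    s = 0.33 + 0.41 * t                       # prescribed boundary motion
    mesh = conforming_mesh(s)
    print(f"t={t:.2f}  boundary={s:.3f}  last two mesh nodes: {mesh[-2:]}")
```

Even in this cartoon, the mesh tracks an arbitrarily large boundary motion with only local node perturbations, which is the property that lets the full method combine exact geometry representation with high-order accuracy.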
Book
1 online resource.
Principal components analysis (PCA) is a well-known technique for approximating a tabular data set by a low rank matrix. This dissertation extends the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal, and other data types. This framework encompasses many well known techniques in data analysis, such as nonnegative matrix factorization, matrix completion, sparse and robust PCA, k-means, k-SVD, and maximum margin matrix factorization. The method handles heterogeneous data sets, and leads to coherent schemes for compressing, denoising, and imputing missing entries across all data types simultaneously. It also admits a number of interesting interpretations of the low rank factors, which allow clustering of examples or of features. We propose several parallel algorithms for fitting generalized low rank models, and describe implementations and numerical results.
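The simplest instance of fitting a generalized low rank model is the quadratic-loss, unregularized case, which reduces to PCA and can be fit by alternating least squares over the two factors; other data types in the framework swap in different per-column losses and regularizers. The sketch below covers only this numeric case.

```python
import numpy as np

# Quadratic-loss low rank model A ~ X @ Y fit by alternating least squares.
rng = np.random.default_rng(0)
m, n, k = 200, 50, 5
A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n)) + 0.1 * rng.standard_normal((m, n))

X = rng.standard_normal((m, k))                  # factor for examples (rows)
Y = rng.standard_normal((k, n))                  # factor for features (columns)
for _ in range(50):
    X = np.linalg.solve(Y @ Y.T, Y @ A.T).T      # minimize ||A - XY||_F^2 over X with Y fixed
    Y = np.linalg.solve(X.T @ X, X.T @ A)        # minimize over Y with X fixed

print("relative reconstruction error:", np.linalg.norm(A - X @ Y) / np.linalg.norm(A))
```

Each alternating step is a small least squares solve, and the same alternating structure is what makes the heterogeneous-loss models in the dissertation amenable to parallel fitting.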
Special Collections