# Learning graphical models: fundamental limits and efficient algorithms [electronic resource]

- Responsibility
- José Bento.
- Imprint
- 2012.
- Physical description
- 1 online resource.

## At the library

### Special Collections

#### On-site access

Request materials as early as possible. Maximum 5 items per day. Contact specialcollections@stanford.edu for information about access.

**University Archives** (request via Aeon)

| Call number | Note | Status |
|---|---|---|
| 3781 2012 B | In-library use | |

## Description

### Creators/Contributors

- Author/Creator
- Bento Ayres Pereira, Jose.
- Contributor
- Montanari, Andrea (primary thesis advisor)
- Candès, Emmanuel J. (Emmanuel Jean) (thesis advisor)
- Johari, Ramesh, 1976- (thesis advisor)
- Stanford University. Department of Electrical Engineering

### Contents/Summary

- Summary
- Graphical models provide a flexible yet powerful language for describing high-dimensional probability distributions. Over the last 20 years, graphical model methods have been successfully applied to a broad range of problems, from computer vision to error-correcting codes to computational biology, to name a few. In a graphical model, the graph structure characterizes the conditional independence properties of the underlying probability distribution. Roughly speaking, it encodes key information about which variables influence each other. It allows us to answer questions of the type: are variables X and Y dependent because they 'interact' directly, or because they are both dependent on a third variable Z? In many applications this information is of the utmost practical importance, and it is therefore crucial to develop efficient algorithms that learn the graph structure from data. This problem is largely unsolved, and for a long time several heuristics have been used without a solid theoretical foundation in place. In the first part of this work, we consider the problem of learning the structure of Ising models (pairwise binary Markov random fields) from i.i.d. samples. While several methods have been proposed to accomplish this task, their relative merits and limitations remain somewhat obscure. By analyzing a number of concrete examples, we show that low-complexity algorithms often fail when the Markov random field develops long-range correlations. More precisely, this phenomenon appears to be related to the Ising model phase transition (although it does not coincide with it). An important algorithm that exhibits this behavior is the L1-regularized logistic regression estimator introduced by Ravikumar et al., who proved a set of sufficient conditions under which this algorithm exactly learns Ising models, the most interesting being a so-called incoherence condition.
In this thesis we show that this incoherence condition is also necessary, and we analytically establish whether it holds for several families of graphs. In particular, denoting by θ the edge strength and by Δ the maximum degree, we prove that regularized logistic regression succeeds on any graph with Δ ≤ 3/(10θ) and fails on most regular graphs with Δ ≥ 2/θ. In the second part of this work, we address the important scenario in which the data is not composed of i.i.d. samples. We focus on the problem of learning the drift coefficient of a p-dimensional stochastic differential equation (SDE) from a sample path of length T. We assume that the drift is parametrized by a high-dimensional vector, and we study the support recovery problem in the case where p is allowed to grow with T. In particular, we describe a general lower bound on the sample complexity T by using a characterization of mutual information as a time integral of conditional variance, due to Kadota, Zakai, and Ziv. For linear SDEs, the drift coefficient is parametrized by a p-by-p matrix that describes which degrees of freedom interact under the dynamics. In this case, we analyze an L1-regularized least-squares estimator and describe an upper bound on T that nearly matches the lower bound on specific classes of sparse matrices. We describe how this same algorithm can be used to learn non-linear SDEs, and we show by means of a numerical experiment why one should expect the sample complexity to be of the same order as that for linear SDEs.
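To make the neighborhood-regression idea in the summary concrete, here is a minimal Python/NumPy sketch of L1-regularized logistic regression for Ising structure learning. Everything below is an illustrative assumption, not the thesis's experiments or the exact estimator analyzed by Ravikumar et al.: a 4-node cycle, edge strength θ = 0.6, a simple ISTA solver, and hand-picked regularization and thresholds.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

# Hypothetical example: a 4-node cycle Ising model with uniform edge
# strength theta (illustrative values, not from the thesis).
p, theta = 4, 0.6
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
J = np.zeros((p, p))
for i, j in edges:
    J[i, j] = J[j, i] = theta

# p is tiny, so sample exactly by enumerating all 2^p spin configurations.
states = np.array(list(product([-1, 1], repeat=p)), dtype=float)
energy = np.einsum('si,ij,sj->s', states, J, states) / 2.0  # sum over edges
probs = np.exp(energy)
probs /= probs.sum()
n = 5000
X = states[rng.choice(len(states), size=n, p=probs)]

def l1_logistic(A, y, lam, iters=500, lr=0.1):
    """L1-regularized logistic regression via proximal gradient (ISTA)."""
    w = np.zeros(A.shape[1])
    for _ in range(iters):
        # gradient of the mean logistic loss for labels y in {-1, +1}
        grad = A.T @ (-y / (1.0 + np.exp(y * (A @ w)))) / len(y)
        w = w - lr * grad
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # soft-threshold
    return w

# Neighborhood selection: regress each spin on all the others and keep
# the variables with large coefficients as estimated neighbors.
G_hat = np.zeros((p, p), dtype=bool)
for i in range(p):
    others = [j for j in range(p) if j != i]
    w = l1_logistic(X[:, others], X[:, i], lam=0.05)
    for k, j in enumerate(others):
        G_hat[i, j] = abs(w[k]) > 0.1

# Symmetrize with the AND rule and list the recovered edges.
G_hat = G_hat & G_hat.T
recovered = sorted((i, j) for i in range(p)
                   for j in range(i + 1, p) if G_hat[i, j])
print(recovered)
```

On this easy instance (Δ = 2, θ = 0.6, so Δ well below 2/θ) the estimator recovers the cycle; the summary's point is precisely that such low-complexity methods stop working once long-range correlations set in.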
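For the linear-SDE setting, a similar hedged sketch of row-by-row L1-regularized least squares: simulate a sample path of dx = Ax dt + db by Euler-Maruyama, regress the observed increments on the current state, and read off the support of the drift matrix. The drift matrix, path length, regularization weight, and detection threshold are all hypothetical choices for illustration; this is not the thesis's code or tuning.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sparse, stable drift matrix A for dx = A x dt + db:
# -I on the diagonal plus a few off-diagonal interactions.
p, dt, n_steps = 6, 0.01, 20000
A = -np.eye(p)
A[0, 1] = A[2, 3] = A[4, 5] = 0.8

# Euler-Maruyama sample path standing in for the observed data.
X = np.zeros((n_steps + 1, p))
for t in range(n_steps):
    X[t + 1] = X[t] + dt * (A @ X[t]) + np.sqrt(dt) * rng.standard_normal(p)

def lasso(F, y, lam, iters=2000):
    """L1-regularized least squares via ISTA (proximal gradient)."""
    w = np.zeros(F.shape[1])
    lr = 1.0 / np.linalg.norm(F, 2) ** 2  # 1 / Lipschitz constant
    for _ in range(iters):
        w -= lr * (F.T @ (F @ w - y))
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam * len(y), 0.0)
    return w

# Row by row: regress the increment rate dx_i/dt on the current state.
F = X[:-1]
dY = (X[1:] - X[:-1]) / dt
A_hat = np.vstack([lasso(F, dY[:, i], lam=0.05) for i in range(p)])

# Estimated support of the drift matrix (threshold chosen by hand).
support = np.abs(A_hat) > 0.3
print(support.astype(int))
```

The support recovery question the summary poses is exactly when this succeeds as p grows with the path length T; here p is fixed and small, so the sketch only shows the mechanics of the estimator.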

### Bibliographic information

- Publication date
- 2012
- Note
- Submitted to the Department of Electrical Engineering.
- Note
- Thesis (Ph.D.)--Stanford University, 2012.