Bittel, Lennart, Mele, Antonio A., Eisert, Jens, and Leone, Lorenzo

Subjects

Quantum Physics

Abstract

Free-fermionic states, also known as matchgates or Gaussian states, are a fundamental class of quantum states due to their efficient classical simulability and their crucial role across various domains of Physics. With the advent of quantum devices, experiments now yield data from quantum states, including estimates of expectation values. We establish that deciding whether a given dataset, formed by a few Majorana correlation functions estimates, can be consistent with a free-fermionic state is an NP-complete problem. Our result also extends to datasets formed by estimates of Pauli expectation values. This is in stark contrast to the case of stabilizer states, where the analogous problem can be efficiently solved. Moreover, our results directly imply that free-fermionic states are computationally hard to properly PAC-learn, where PAC-learning of quantum states is a learning framework introduced by Aaronson. Remarkably, this is the first class of classically simulable quantum states shown to have this property. Comment: 19 pages, 1 figure

We develop a theory of high-arity PAC learning, which is statistical learning in the presence of "structured correlation". In this theory, hypotheses are either graphs, hypergraphs or, more generally, structures in finite relational languages, and i.i.d. sampling is replaced by sampling an induced substructure, producing an exchangeable distribution. We prove a high-arity version of the fundamental theorem of statistical learning by characterizing high-arity (agnostic) PAC learnability in terms of finiteness of a purely combinatorial dimension and in terms of an appropriate version of uniform convergence. Comment: 145 pages, 1 figure

Computer Science - Machine Learning, Computer Science - Cryptography and Security, and Computer Science - Data Structures and Algorithms

Abstract

We continue the study of the computational complexity of differentially private PAC learning and how it is situated within the foundations of machine learning. A recent line of work uncovered a qualitative equivalence between the private PAC model and Littlestone's mistake-bounded model of online learning, in particular, showing that any concept class of Littlestone dimension $d$ can be privately PAC learned using $\mathrm{poly}(d)$ samples. This raises the natural question of whether there might be a generic conversion from online learners to private PAC learners that also preserves computational efficiency. We give a negative answer to this question under reasonable cryptographic assumptions (roughly, those from which it is possible to build indistinguishability obfuscation for all circuits). We exhibit a concept class that admits an online learner running in polynomial time with a polynomial mistake bound, but for which there is no computationally-efficient differentially private PAC learner. Our construction and analysis strengthens and generalizes that of Bun and Zhandry (TCC 2016-A), who established such a separation between private and non-private PAC learner.

A basic question in the PAC model of learning is whether proper learning is harder than improper learning. In the classical case, there are examples of concept classes with VC dimension $d$ that have sample complexity $\Omega\left(\frac d\epsilon\log\frac1\epsilon\right)$ for proper learning with error $\epsilon$, while the complexity for improper learning is O$\!\left(\frac d\epsilon\right)$. One such example arises from the Coupon Collector problem. Motivated by the efficiency of proper versus improper learning with quantum samples, Arunachalam, Belovs, Childs, Kothari, Rosmanis, and de Wolf (TQC 2020) studied an analogue, the Quantum Coupon Collector problem. Curiously, they discovered that for learning size $k$ subsets of $[n]$ the problem has sample complexity $\Theta(k\log\min\{k,n-k+1\})$, in contrast with the complexity of $\Theta(k\log k)$ for Coupon Collector. This effectively negates the possibility of a separation between the two modes of learning via the quantum problem, and Arunachalam et al.\ posed the possibility of such a separation as an open question. In this work, we first present an algorithm for the Quantum Coupon Collector problem with sample complexity that matches the sharper lower bound of $(1-o_k(1))k\ln\min\{k,n-k+1\}$ shown recently by Bab Hadiashar, Nayak, and Sinha (IEEE TIT 2024), for the entire range of the parameter $k$. Next, we devise a variant of the problem, the Quantum Padded Coupon Collector. We prove that its sample complexity matches that of the classical Coupon Collector problem for both modes of learning, thereby exhibiting the same asymptotic separation between proper and improper quantum learning as mentioned above. The techniques we develop in the process can be directly applied to any form of padded quantum data. We hope that padding can more generally lift other forms of classical learning behaviour to the quantum setting. Comment: 23 Pages

Carmosino et al. (2016) demonstrated that natural proofs of circuit lower bounds for $\Lambda$ imply efficient algorithms for learning $\Lambda$-circuits, but only over \textit{the uniform distribution}, with \textit{membership queries}, and provided $\AC^0[p] \subseteq \Lambda$. We consider whether this implication can be generalized to $\Lambda \not\supseteq \AC^0[p]$, and to learning algorithms which use only random examples and learn over arbitrary example distributions (Valiant's PAC-learning model). We first observe that, if, for any circuit class $\Lambda$, there is an implication from natural proofs for $\Lambda$ to PAC-learning for $\Lambda$, then standard assumptions from lattice-based cryptography do not hold. In particular, we observe that depth-2 majority circuits are a (conditional) counter example to the implication, since Nisan (1993) gave a natural proof, but Klivans and Sherstov (2009) showed hardness of PAC-learning under lattice-based assumptions. We thus ask: what learning algorithms can we reasonably expect to follow from Nisan's natural proofs? Our main result is that all natural proofs arising from a type of communication complexity argument, including Nisan's, imply PAC-learning algorithms in a new \textit{distributional} variant (i.e., an ``average-case'' relaxation) of Valiant's PAC model. Our distributional PAC model is stronger than the average-case prediction model of Blum et al. (1993) and the heuristic PAC model of Nanashima (2021), and has several important properties which make it of independent interest, such as being \textit{boosting-friendly}. The main applications of our result are new distributional PAC-learning algorithms for depth-2 majority circuits, polytopes and DNFs over natural target distributions, as well as the nonexistence of encoded-input weak PRFs that can be evaluated by depth-2 majority circuits. Comment: Added discussion

We revisit the problem of characterising the complexity of Quantum PAC learning, as introduced by Bshouty and Jackson [SIAM J. Comput. 1998, 28, 1136-1153]. Several quantum advantages have been demonstrated in this setting, however, none are generic: they apply to particular concept classes and typically only work when the distribution that generates the data is known. In the general case, it was recently shown by Arunachalam and de Wolf [JMLR, 19 (2018) 1-36] that quantum PAC learners can only achieve constant factor advantages over classical PAC learners. We show that with a natural extension of the definition of quantum PAC learning used by Arunachalam and de Wolf, we can achieve a generic advantage in quantum learning. To be precise, for any concept class $\mathcal{C}$ of VC dimension $d$, we show there is an $(\epsilon, \delta)$-quantum PAC learner with sample complexity \[ O\left(\frac{1}{\sqrt{\epsilon}}\left[d+ \log(\frac{1}{\delta})\right]\log^9(1/\epsilon)\right). \] Up to polylogarithmic factors, this is a square root improvement over the classical learning sample complexity. We show the tightness of our result by proving an $\Omega(d/\sqrt{\epsilon})$ lower bound that matches our upper bound up to polylogarithmic factors.

Perez, Mateo, Somenzi, Fabio, and Trivedi, Ashutosh

Subjects

Computer Science - Machine Learning and Computer Science - Logic in Computer Science

Abstract

Linear temporal logic (LTL) and omega-regular objectives -- a superset of LTL -- have seen recent use as a way to express non-Markovian objectives in reinforcement learning. We introduce a model-based probably approximately correct (PAC) learning algorithm for omega-regular objectives in Markov decision processes (MDPs). As part of the development of our algorithm, we introduce the epsilon-recurrence time: a measure of the speed at which a policy converges to the satisfaction of the omega-regular objective in the limit. We prove that our algorithm only requires a polynomial number of samples in the relevant parameters, and perform experiments which confirm our theory.

Brahmbhatt, Anand, Saket, Rishi, and Raghuveer, Aravindan

Subjects

Computer Science - Machine Learning and Statistics - Machine Learning

Abstract

Learning from label proportions (LLP) is a generalization of supervised learning in which the training data is available as sets or bags of feature-vectors (instances) along with the average instance-label of each bag. The goal is to train a good instance classifier. While most previous works on LLP have focused on training models on such training data, computational learnability of LLP was only recently explored by [Saket'21, Saket'22] who showed worst case intractability of properly learning linear threshold functions (LTFs) from label proportions. However, their work did not rule out efficient algorithms for this problem on natural distributions. In this work we show that it is indeed possible to efficiently learn LTFs using LTFs when given access to random bags of some label proportion in which feature-vectors are, conditioned on their labels, independently sampled from a Gaussian distribution $N(\mathbf{\mu}, \mathbf{\Sigma})$. Our work shows that a certain matrix -- formed using covariances of the differences of feature-vectors sampled from the bags with and without replacement -- necessarily has its principal component, after a transformation, in the direction of the normal vector of the LTF. Our algorithm estimates the means and covariance matrices using subgaussian concentration bounds which we show can be applied to efficiently sample bags for approximating the normal direction. Using this in conjunction with novel generalization error bounds in the bag setting, we show that a low error hypothesis LTF can be identified. For some special cases of the $N(\mathbf{0}, \mathbf{I})$ distribution we provide a simpler mean estimation based algorithm. We include an experimental evaluation of our learning algorithms along with a comparison with those of [Saket'21, Saket'22] and random LTFs, demonstrating the effectiveness of our techniques. Comment: Spotlight paper at Neural Information Processing Systems (NeurIPS), 2023

In this work, we aim to characterize the statistical complexity of realizable regression both in the PAC learning setting and the online learning setting. Previous work had established the sufficiency of finiteness of the fat shattering dimension for PAC learnability and the necessity of finiteness of the scaled Natarajan dimension, but little progress had been made towards a more complete characterization since the work of Simon (SICOMP '97). To this end, we first introduce a minimax instance optimal learner for realizable regression and propose a novel dimension that both qualitatively and quantitatively characterizes which classes of real-valued predictors are learnable. We then identify a combinatorial dimension related to the Graph dimension that characterizes ERM learnability in the realizable setting. Finally, we establish a necessary condition for learnability based on a combinatorial dimension related to the DS dimension, and conjecture that it may also be sufficient in this context. Additionally, in the context of online learning we provide a dimension that characterizes the minimax instance optimal cumulative loss up to a constant factor and design an optimal online learner for realizable regression, thus resolving an open question raised by Daskalakis and Golowich in STOC '22.

Computer Science - Data Structures and Algorithms, Computer Science - Machine Learning, and Statistics - Machine Learning

Abstract

The concept class of low-degree polynomial threshold functions (PTFs) plays a fundamental role in machine learning. In this paper, we study PAC learning of $K$-sparse degree-$d$ PTFs on $\mathbb{R}^n$, where any such concept depends only on $K$ out of $n$ attributes of the input. Our main contribution is a new algorithm that runs in time $({nd}/{\epsilon})^{O(d)}$ and under the Gaussian marginal distribution, PAC learns the class up to error rate $\epsilon$ with $O(\frac{K^{4d}}{\epsilon^{2d}} \cdot \log^{5d} n)$ samples even when an $\eta \leq O(\epsilon^d)$ fraction of them are corrupted by the nasty noise of Bshouty et al. (2002), possibly the strongest corruption model. Prior to this work, attribute-efficient robust algorithms are established only for the special case of sparse homogeneous halfspaces. Our key ingredients are: 1) a structural result that translates the attribute sparsity to a sparsity pattern of the Chow vector under the basis of Hermite polynomials, and 2) a novel attribute-efficient robust Chow vector estimation algorithm which uses exclusively a restricted Frobenius norm to either certify a good approximation or to validate a sparsity-induced degree-$2d$ polynomial as a filter to detect corrupted samples. Comment: ICML 2023. V2 fixed typos

Diochnos, Dimitrios I., Mahloujifar, Saeed, and Mahmoody, Mohammad

2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA) ICMLA Machine Learning and Applications (ICMLA), 2020 19th IEEE International Conference on. :717-722 Dec, 2020

Cate, Balder ten, Funk, Maurice, Jung, Jean Christoph, and Lutz, Carsten

Subjects

Computer Science - Artificial Intelligence

Abstract

We propose bounded fitting as a scheme for learning description logic concepts in the presence of ontologies. A main advantage is that the resulting learning algorithms come with theoretical guarantees regarding their generalization to unseen examples in the sense of PAC learning. We prove that, in contrast, several other natural learning algorithms fail to provide such guarantees. As a further contribution, we present the system SPELL which efficiently implements bounded fitting for the description logic $\mathcal{ELH}^r$ based on a SAT solver, and compare its performance to a state-of-the-art learner. Comment: 19 pages, Long version of paper accepted at IJCAI 2023

Probably Approximately Correct (i.e., PAC) learning is a core concept of sample complexity theory, and efficient PAC learnability is often seen as a natural counterpart to the class P in classical computational complexity. But while the nascent theory of parameterized complexity has allowed us to push beyond the P-NP ``dichotomy'' in classical computational complexity and identify the exact boundaries of tractability for numerous problems, there is no analogue in the domain of sample complexity that could push beyond efficient PAC learnability. As our core contribution, we fill this gap by developing a theory of parameterized PAC learning which allows us to shed new light on several recent PAC learning results that incorporated elements of parameterized complexity. Within the theory, we identify not one but two notions of fixed-parameter learnability that both form distinct counterparts to the class FPT -- the core concept at the center of the parameterized complexity paradigm -- and develop the machinery required to exclude fixed-parameter learnability. We then showcase the applications of this theory to identify refined boundaries of tractability for CNF and DNF learning as well as for a range of learning problems on graphs. Comment: Accepted at AAAI 2023

Computer Science - Machine Learning, Computer Science - Computational Complexity, Computer Science - Information Theory, and Statistics - Machine Learning

Abstract

Many conventional learning algorithms rely on loss functions other than the natural 0-1 loss for computational efficiency and theoretical tractability. Among them are approaches based on absolute loss (L1 regression) and square loss (L2 regression). The first is proved to be an \textit{agnostic} PAC learner for various important concept classes such as \textit{juntas}, and \textit{half-spaces}. On the other hand, the second is preferable because of its computational efficiency, which is linear in the sample size. However, PAC learnability is still unknown as guarantees have been proved only under distributional restrictions. The question of whether L2 regression is an agnostic PAC learner for 0-1 loss has been open since 1993 and yet has to be answered. This paper resolves this problem for the junta class on the Boolean cube -- proving agnostic PAC learning of k-juntas using L2 polynomial regression. Moreover, we present a new PAC learning algorithm based on the Boolean Fourier expansion with lower computational complexity. Fourier-based algorithms, such as Linial et al. (1993), have been used under distributional restrictions, such as uniform distribution. We show that with an appropriate change, one can apply those algorithms in agnostic settings without any distributional assumption. We prove our results by connecting the PAC learning with 0-1 loss to the minimum mean square estimation (MMSE) problem. We derive an elegant upper bound on the 0-1 loss in terms of the MMSE error and show that the sign of the MMSE is a PAC learner for any concept class containing it. Comment: AISTATS 2023

This paper contributes to the study of CPAC learnability -- a computable version of PAC learning -- by solving three open questions from recent papers. Firstly, we prove that every improperly CPAC learnable class is contained in a class which is properly CPAC learnable with polynomial sample complexity. This confirms a conjecture by Agarwal et al (COLT 2021). Secondly, we show that there exists a decidable class of hypothesis which is properly CPAC learnable, but only with uncomputably fast growing sample complexity. This solves a question from Sterkenburg (COLT 2022). Finally, we construct a decidable class of finite Littlestone dimension which is not improperly CPAC learnable, strengthening a recent result of Sterkenburg (2022) and answering a question posed by Hasrati and Ben-David (ALT 2023). Together with previous work, our results provide a complete landscape for the learnability problem in the CPAC setting. Comment: 12 pages, 1 figure (corrected version)

We study the problem of binary classification from the point of view of learning convex polyhedra in Hilbert spaces, to which one can reduce any binary classification problem. The problem of learning convex polyhedra in finite-dimensional spaces is sufficiently well studied in the literature. We generalize this problem to that in a Hilbert space and propose an algorithm for learning a polyhedron which correctly classifies at least $1- \varepsilon$ of the distribution, with a probability of at least $1 - \delta,$ where $\varepsilon$ and $\delta$ are given parameters. Also, as a corollary, we improve some previous bounds for polyhedral classification in finite-dimensional spaces. Comment: 16 pages, 2 figures, to appear in the proceedings of the AAAI-2023 conference

In this paper, we study the problem of PAC learning halfspaces in the non-interactive local differential privacy model (NLDP). To breach the barrier of exponential sample complexity, previous results studied a relaxed setting where the server has access to some additional public but unlabeled data. We continue in this direction. Specifically, we consider the problem under the standard setting instead of the large margin setting studied before. Under different mild assumptions on the underlying data distribution, we propose two approaches that are based on the Massart noise model and self-supervised learning and show that it is possible to achieve sample complexities that are only linear in the dimension and polynomial in other terms for both private and public data, which significantly improve the previous results. Our methods could also be used for other private PAC learning problems. Comment: To appear in The 14th Asian Conference on Machine Learning (ACML 2022)

Fioravanti, Simone, Flammini, Michele, Kodric, Bojana, and Varricchio, Giovanna

Subjects

Computer Science - Computer Science and Game Theory and Computer Science - Artificial Intelligence

Abstract

We study PAC learnability and PAC stabilizability of Hedonic Games (HGs), i.e., efficiently inferring preferences or core-stable partitions from samples. We first expand the known learnability/stabilizability landscape for some of the most prominent HGs classes, providing results for Friends and Enemies Games, Bottom Responsive, and Anonymous HGs. Then, having a broader view in mind, we attempt to shed light on the structural properties leading to learnability/stabilizability, or lack thereof, for specific HGs classes. Along this path, we focus on the fully expressive Hedonic Coalition Nets representation of HGs. We identify two sets of conditions that lead to efficient learnability, and which encompass all of the known positive learnability results. On the side of stability, we reveal that, while the freedom of choosing an ad hoc adversarial distribution is the most obvious hurdle to achieving PAC stability, it is not the only one. First, we show a distribution independent necessary condition for PAC stability. Then, we focus on $\W$-games, where players have individual preferences over other players and evaluate coalitions based on the least preferred member. We prove that these games are PAC stabilizable under the class of bounded distributions, which assign positive probability mass to all coalitions. Finally, we discuss why such a result is not easily extendable to other HGs classes even in this promising scenario. Namely, we establish a purely computational property necessary for achieving PAC stability. Comment: Accepted paper at AAAI 2023