Match!
arXiv: Learning
Papers
19689
Papers 10000
1 page of 1,000 pages (10k results)
Newest
We propose a novel regularization-based continual learning method, dubbed as Adaptive Group Sparsity based Continual Learning (AGS-CL), using two group sparsity-based penalties. Our method selectively employs the two penalties when learning each node based its the importance, which is adaptively updated after learning each new task. By utilizing the proximal gradient descent method for learning, the exact sparsity and freezing of the model is guaranteed, and thus, the learner can explicitly cont...
Learning value functions off-policy is at the core of modern Reinforcement Learning (RL). Traditional off-policy actor-critic algorithms, however, only approximate the true policy gradient, since the gradient \nabla_{\theta} Q^{\pi_{\theta}}(s,a)of the action-value function with respect to the policy parameters is often ignored. We introduce a class of value functions called Parameter-based Value Functions (PVFs) whose inputs include the policy parameters. PVFs can evaluate the performance of...
To increase the trustworthiness of deep neural network (DNN) classifiers, an accurate prediction confidence that represents the true likelihood of correctness is crucial. Towards this end, many post-hoc calibration methods have been proposed to leverage a lightweight model to map the target DNN's output layer into a calibrated confidence. Nonetheless, on an out-of-distribution (OOD) dataset in practice, the target DNN can often mis-classify samples with a high confidence, creating significant ch...
#1Xuezhou ZhangH-Index: 3
Last. Xiaojin ZhuH-Index: 44
view all 5 authors...
In this paper, we initiate the study of sample complexity of teaching, termed as "teaching dimension" (TDim) in the literature, for Q-learning. While the teaching dimension of supervised learning has been studied extensively, these results do not extend to reinforcement learning due to the temporal constraints posed by the underlying Markov Decision Process environment. We characterize the TDim of Q-learning under different teachers with varying control over the environment, and present matching...
#2Cyrus CousinsH-Index: 1
Last. Matteo RiondatoH-Index: 12
view all 4 authors...
We present MCRapper, an algorithm for efficient computation of Monte-Carlo Empirical Rademacher Averages (MCERA) for families of functions exhibiting poset (e.g., lattice) structure, such as those that arise in many pattern mining tasks. The MCERA allows us to compute upper bounds to the maximum deviation of sample means from their expectations, thus it can be used to find both statistically-significant functions (i.e., patterns) when the available data is seen as a sample from an unknown distri...
Source
#1Tengyu XuH-Index: 1
#2Zhe WangH-Index: 4
Last. H. Vincent PoorH-Index: 98
view all 4 authors...
Min-max optimization captures many important machine learning problems such as robust adversarial learning and inverse reinforcement learning, and nonconvex-strongly-concave min-max optimization has been an active line of research. Specifically, a novel variance reduction algorithm SREDA was proposed recently by (Luo et al. 2020) to solve such a problem, and was shown to achieve the optimal complexity dependence on the required accuracy level \epsilon Despite the superior theoretical performa...
#1Haleh AkramiH-Index: 1
#2Sergul AydoreH-Index: 3
Last. Anand A. JoshiH-Index: 19
view all 4 authors...
We propose a robust variational autoencoder with \betadivergence for tabular data (RTVAE) with mixed categorical and continuous features. Variational autoencoders (VAE) and their variations are popular frameworks for anomaly detection problems. The primary assumption is that we can learn representations for normal patterns via VAEs and any deviation from that can indicate anomalies. However, the training data itself can contain outliers. The source of outliers in training data include the dat...
#1Ievgen RedkoH-Index: 4
#2Emilie MorvantH-Index: 8
Last. Younès BennaniH-Index: 16
view all 5 authors...
All famous machine learning algorithms that comprise both supervised and semi-supervised learning work well only under a common assumption: the training and test data follow the same distribution. When the distribution changes, most statistical models must be reconstructed from newly collected data, which for some applications can be costly or impossible to obtain. Therefore, it has become necessary to develop approaches that reduce the need and the effort to obtain new labeled samples by exploi...
Humans have a remarkable capacity to reason about abstract relational structures, an ability that may support some of the most impressive, human-unique cognitive feats. Because equality (or identity) is a simple and ubiquitous relational operator, equality reasoning has been a key case study for the broader question of abstract relational reasoning. This paper revisits the question of whether equality can be learned by neural networks that do not encode explicit symbolic structure. Earlier work ...
#2Colin WeiH-Index: 4
Last. Tengyu MaH-Index: 24
view all 4 authors...
The noise in stochastic gradient descent (SGD) provides a crucial implicit regularization effect for training overparameterized models. Prior theoretical work largely focuses on spherical Gaussian noise, whereas empirical studies demonstrate the phenomenon that parameter-dependent noise -- induced by mini-batches or label perturbation -- is far more effective than Gaussian noise. This paper theoretically characterizes this phenomenon on a quadratically-parameterized model introduced by Vaskevici...
12345678910
Top fields of study
Machine learning
Mathematical optimization
Mathematics
Artificial neural network
Reinforcement learning