- Machine Learning and Algorithms
- Privacy-Preserving Technologies in Data
- Complexity and Algorithms in Graphs
- Stochastic Gradient Optimization Techniques
- Algorithms and Data Compression
- Machine Learning and Data Classification
- Cryptography and Data Security
- Imbalanced Data Classification Techniques
- Bayesian Modeling and Causal Inference
- Optimization and Search Problems
- Statistical Methods and Inference
- Sparse and Compressive Sensing Techniques
- Adversarial Robustness in Machine Learning
- Internet Traffic Analysis and Secure E-voting
- Advanced Bandit Algorithms Research
- Domain Adaptation and Few-Shot Learning
- Neural Networks and Applications
- Computability, Logic, AI Algorithms
- Distributed Sensor Networks and Detection Algorithms
- Evolutionary Algorithms and Applications
- Semigroups and Automata Theory
- Advanced Causal Inference Techniques
- Multi-Criteria Decision Making
- Constraint Satisfaction and Optimization
- Error Correcting Code Techniques
Apple (United States)
2021-2024
Fukushima Prefectural Culture Center
2024
Apple (United Kingdom)
2023-2024
Google (United States)
2018-2021
Apple (Israel)
2020-2021
IBM (United States)
2005-2019
Berkeley College
2018
University of California, Berkeley
2018
University of Birmingham
2018
IBM Research - Almaden
2008-2017
Testing hypotheses privately. Large data sets offer a vast scope for testing already-formulated ideas and exploring new ones. Unfortunately, researchers who attempt to do both on the same data set run the risk of making false discoveries, even when testing and exploration are carried out on distinct subsets of the data. Based on ideas drawn from differential privacy, Dwork et al. now provide a theoretical solution. Ideas are tested against aggregate information, whereas individual data components remain confidential. Preserving that privacy...
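A minimal sketch of the underlying idea of answering queries about a dataset only through noise-perturbed aggregates, so that no single record is exposed. This is an illustration of the differential-privacy mechanism family the paper builds on, not the paper's exact procedure; `noisy_mean` and its parameters are illustrative.

```python
import numpy as np

def noisy_mean(values, epsilon, rng=None):
    """Answer an aggregate query with Laplace noise calibrated to the
    sensitivity of the mean (values assumed to lie in [0, 1])."""
    rng = rng or np.random.default_rng()
    sensitivity = 1.0 / len(values)   # changing one record moves the mean by at most this
    noise = rng.laplace(scale=sensitivity / epsilon)
    return float(np.mean(values)) + noise

# A hypothesis is evaluated only against the noisy aggregate, never against raw records.
holdout_scores = np.random.default_rng(0).random(10_000)   # e.g. per-record score of a candidate model
print(noisy_mean(holdout_scores, epsilon=0.5))
```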
A great deal of effort has been devoted to reducing the risk of spurious scientific discoveries, from the use of sophisticated validation techniques to deep statistical methods for controlling the false discovery rate in multiple hypothesis testing. However, there is a fundamental disconnect between the theoretical results and the practice of data analysis: the theory of statistical inference assumes a fixed collection of hypotheses to be tested, or learning algorithms to be applied, selected non-adaptively before the data are gathered, whereas in practice data is shared and reused...
Marching along the DARPA SyNAPSE roadmap, IBM unveils a trilogy of innovations towards the TrueNorth cognitive computing system inspired by the brain's function and efficiency. Judiciously balancing the dual objectives of functional capability and implementation/operational cost, we develop a simple, digital, reconfigurable, versatile spiking neuron model that supports one-to-one equivalence between hardware and simulation and is implementable using only 1272 ASIC gates. Starting with the classic leaky integrate-and-fire...
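As a point of reference for the "classic leaky integrate-and-fire" starting point mentioned above, here is a toy discrete-time LIF update. It is not the TrueNorth neuron's parameterization; `leak`, `weight`, and `threshold` are illustrative values.

```python
def lif_step(v, spike_in, leak=0.9, weight=0.5, threshold=1.0, reset=0.0):
    """One discrete-time leaky integrate-and-fire update: leak the membrane
    potential, integrate the weighted input spike, and emit a spike (with
    reset) when the threshold is crossed."""
    v = leak * v + weight * spike_in
    if v >= threshold:
        return reset, 1   # spike emitted
    return v, 0

v, out_spikes = 0.0, []
for s in [1, 0, 1, 1, 0, 1, 1]:   # toy input spike train
    v, out = lif_step(v, s)
    out_spikes.append(out)
print(out_spikes)
```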
State-of-the-art results on image recognition tasks are achieved using over-parameterized learning algorithms that (nearly) perfectly fit the training set and are known to fit well even random labels. This tendency to memorize seemingly useless training data labels is not explained by existing theoretical analyses. Memorization of the training data also presents significant privacy risks when the data contains sensitive personal information, and thus it is important to understand whether such memorization is necessary for accurate learning.
Sensitive statistics are often collected across sets of users, with repeated collection of reports done over time. For example, trends in users' private preferences or software usage may be monitored via such reports. We study the collection of such statistics in the local differential privacy (LDP) model, and describe an algorithm whose privacy cost is polylogarithmic in the number of changes to a user's value. More fundamentally---by building on anonymity of the reports---we also demonstrate how the privacy cost of our LDP algorithm can actually be much lower when viewed in the central model...
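For readers unfamiliar with the LDP model referenced above, here is the standard one-bit randomized-response local randomizer with a debiased aggregate estimate. This is only a baseline illustration of local randomization, not the paper's polylogarithmic-cost algorithm for repeated collection; all names and parameters are illustrative.

```python
import math, random

def randomized_response(bit, epsilon, rng=random):
    """Report the true bit with probability e^eps / (e^eps + 1), otherwise
    flip it; this satisfies epsilon-local differential privacy."""
    p_true = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_true else 1 - bit

def debiased_mean(reports, epsilon):
    """Unbiased estimate of the population mean from randomized reports."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    raw = sum(reports) / len(reports)
    return (raw - (1 - p)) / (2 * p - 1)

bits = [1] * 700 + [0] * 300
reports = [randomized_response(b, epsilon=1.0) for b in bits]
print(debiased_mean(reports, epsilon=1.0))   # close to 0.7
```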
We prove the following strong hardness result for learning: Given a distribution of labeled examples from the hypercube such that there exists a monomial consistent with a $(1-\epsilon)$ fraction of the examples, it is NP-hard to find a halfspace that is correct on a $(1/2+\epsilon)$ fraction of the examples, for arbitrary constants $\epsilon>0$. In learning theory terms, weak agnostic learning of monomials is hard, even if one is allowed to output a hypothesis from the much bigger concept class of halfspaces. This hardness result subsumes a long line of previous results, including two recent hardness results for the proper learning of monomials and halfspaces. As an...
We introduce a framework for proving lower bounds on computational problems over distributions against algorithms that can be implemented using access to a statistical query oracle. For such algorithms, access to the input distribution is limited to obtaining an estimate of the expectation of any given function on a sample drawn randomly from the input distribution, rather than direct access to samples. Most natural algorithms of interest in theory and in practice, for example, moments-based methods, local search, standard iterative methods for convex...
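A toy simulation of the access model the lower bounds are stated for: a statistical query oracle that returns the expectation of a bounded function up to a tolerance, here emulated by an empirical average over fresh samples. The function and distribution are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def statistical_query(sample_fn, query_fn, tolerance, rng=None):
    """Simulate an SQ oracle: return an estimate of E[query_fn(x)] for x drawn
    from the distribution, accurate to roughly the requested tolerance
    (query_fn is assumed to take values in [0, 1])."""
    rng = rng or np.random.default_rng()
    n = int(4.0 / tolerance ** 2)   # enough samples for a tolerance-accurate average w.h.p.
    xs = [sample_fn(rng) for _ in range(n)]
    return float(np.mean([query_fn(x) for x in xs]))

# Example query: the first moment of the first coordinate of a uniform {0,1}^3 distribution.
sample = lambda rng: rng.integers(0, 2, size=3)
print(statistical_query(sample, lambda x: x[0], tolerance=0.05))   # close to 0.5
```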
Deep learning algorithms are well-known to have a propensity for fitting the training data very well and often fit even outliers and mislabeled data points. Such fitting requires memorization of training data labels, a phenomenon that has attracted significant research interest but has not been given a compelling explanation so far. A recent work of Feldman (2019) proposes a theoretical explanation for this phenomenon based on a combination of two insights. First, natural image and data distributions are (informally) known to be long-tailed, that is, to have a significant fraction of rare and atypical examples....
Recent work of Erlingsson, Feldman, Mironov, Raghunathan, Talwar, and Thakurta [1] demonstrates that random shuffling amplifies differential privacy guarantees of locally randomized data. Such amplification implies substantially stronger privacy guarantees for systems in which data is contributed anonymously [2] and has led to significant interest in the shuffle model of privacy [3], [1]. We give a characterization of the privacy guarantee...
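A minimal sketch of the shuffle-model pipeline the amplification result is about: each user applies a local randomizer, and an anonymizing shuffler permutes the reports before the analyzer sees them. The randomizer below is plain randomized response; it illustrates the data flow only, not the paper's bounds.

```python
import math, random

def randomized_response(bit, eps, rng=random):
    """eps-LDP one-bit local randomizer (same idea as the earlier sketch)."""
    p = math.exp(eps) / (math.exp(eps) + 1.0)
    return bit if rng.random() < p else 1 - bit

def shuffle_model_round(bits, eps0, rng=random):
    """Each user locally randomizes with an eps0-LDP randomizer, then the
    shuffler permutes the reports; the analyzer only sees the anonymized
    multiset, which is what the amplification argument reasons about."""
    reports = [randomized_response(b, eps0, rng) for b in bits]
    rng.shuffle(reports)
    return reports

print(sum(shuffle_model_round([1] * 60 + [0] * 40, eps0=0.5)))
```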
We address well-studied problems concerning the learnability of parities and halfspaces in the presence of classification noise. Learning of parities under the uniform distribution with random classification noise, also called the noisy parity problem, is a famous open problem in computational learning. We reduce a number of basic problems regarding learning under the uniform distribution to learning of noisy parities. We show that under the uniform distribution, learning parities with adversarial noise reduces to learning parities with random noise. Together with the parity learning algorithm of Blum et al. (2003), this gives the first nontrivial results for these problems: learning DNF expressions reduces to learning noisy parities of just a logarithmic number of variables, and learning k-juntas reduces to learning noisy parities of k variables. These reductions...
We introduce a framework for proving lower bounds on computational problems over distributions, based on a class of algorithms called statistical algorithms. For such algorithms, access to the input distribution is limited to obtaining an estimate of the expectation of any given function on a sample drawn randomly from the distribution, rather than direct access to samples. Most natural algorithms of interest in theory and in practice, e.g., moments-based methods, local search, standard iterative methods for convex optimization, MCMC...
Many commonly used learning algorithms work by iteratively updating an intermediate solution using one or a few data points in each iteration. Analysis of differential privacy for such algorithms often involves ensuring privacy of each step and then reasoning about the cumulative privacy cost of the algorithm. This is enabled by composition theorems for differential privacy that allow releasing all the intermediate results. In this work, we demonstrate that for contractive iterations, not releasing the intermediate results strongly amplifies the privacy guarantees. We describe several applications of this new analysis technique to...
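To make the setting concrete, here is a toy noisy projected iteration of the kind the amplification-by-iteration analysis applies to: each step touches one data point, adds Gaussian noise, and projects onto a bounded set (a contraction), and only the final iterate is released. The loss, step size, and noise scale are illustrative assumptions, not the paper's prescription.

```python
import numpy as np

def noisy_projected_sgd(data, lr=0.1, sigma=0.5, radius=1.0, rng=None):
    """Contractive noisy iteration: per-example noisy gradient step followed
    by projection onto a Euclidean ball; intermediate iterates stay hidden."""
    rng = rng or np.random.default_rng()
    w = np.zeros(data.shape[1])
    for x in data:
        grad = w - x   # gradient of the toy convex loss 0.5 * ||w - x||^2
        w = w - lr * grad + rng.normal(scale=lr * sigma, size=w.shape)
        norm = np.linalg.norm(w)
        if norm > radius:          # projection onto the ball is a contraction
            w = w * (radius / norm)
    return w                       # only the last iterate is released

data = np.random.default_rng(1).normal(size=(200, 5))
print(noisy_projected_sgd(data))
```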
We study differentially private (DP) algorithms for stochastic convex optimization: the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions. A recent work of Bassily et al. (2019) has established the optimal bound on the excess population loss achievable given n samples. Unfortunately, their algorithm achieving this bound is relatively inefficient: it requires O(min{n^{3/2}, n^{5/2}/d}) gradient computations, where d is the dimension of the optimization problem.
We study the learnability of several fundamental concept classes in the agnostic learning framework of [D. Haussler, Inform. and Comput., 100 (1992), pp. 78–150] and [M. Kearns, R. Schapire, and L. Sellie, Machine Learning, 17 (1994), pp. 115–141]. We show that under the uniform distribution, agnostically learning parities reduces to learning parities with random classification noise, commonly referred to as the noisy parity problem. Together with the parity learning algorithm of [A. Blum, A. Kalai, and H. Wasserman, J. ACM, 50 (2003), pp. 506–519], this gives the first nontrivial algorithm for...
Overfitting is the bane of data analysts, even when data are plentiful. Formal approaches to understanding this problem focus on statistical inference and the generalization of individual analysis procedures. Yet the practice of data analysis is an inherently interactive and adaptive process: new analyses and hypotheses are proposed after seeing the results of previous ones, parameters are tuned on the basis of obtained results, and datasets are shared and reused. An investigation of this gap has recently been initiated by the authors in (Dwork et al., 2014), where we focused...
Valiant has recently introduced a framework for analyzing the capabilities and the limitations of the evolutionary process of random change guided by selection. In his framework, the process of acquiring complex functionality is viewed as a substantially restricted form of PAC learning of an unknown function from a certain set of functions. Valiant showed that classes of functions evolvable in his model are also learnable in the statistical query (SQ) model of Kearns and asked whether the converse is true.
The problem of identifying a planted assignment given a random k-SAT formula consistent with the assignment exhibits a large algorithmic gap: while the planted solution can always be identified given a formula with O(n log n) clauses, there are distributions over clauses for which the best known efficient algorithms require n^{k/2} clauses. We propose and study a unified model for planted k-SAT, which captures well-known special cases. An instance is described by a planted assignment σ and a distribution on clauses with k literals. We define its distribution complexity as the largest r such that the distribution is not r-wise independent (1 ≤ r ≤ k for any distribution with a planted assignment).
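A small sketch of the simplest member of this family, assuming clauses are drawn uniformly among those satisfied by the hidden assignment; the paper's model covers a much broader class of clause distributions, and the function and parameter names here are illustrative.

```python
import random

def planted_ksat_instance(n, m, k=3, rng=random):
    """Generate a toy planted k-SAT instance: fix a hidden assignment sigma,
    then draw m clauses uniformly among those satisfied by sigma.
    A literal is (var, sign); sign=True denotes the positive literal."""
    sigma = [rng.random() < 0.5 for _ in range(n)]
    clauses = []
    while len(clauses) < m:
        vars_ = rng.sample(range(n), k)
        clause = [(v, rng.random() < 0.5) for v in vars_]
        # keep only clauses consistent with the planted assignment sigma
        if any(sigma[v] == sign for v, sign in clause):
            clauses.append(clause)
    return sigma, clauses

sigma, clauses = planted_ksat_instance(n=20, m=100)
print(len(clauses), clauses[0])
```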
Stochastic convex optimization, where the objective is the expectation of a random convex function, is an important and widely used method with numerous applications in machine learning, statistics, operations research, and other areas. We study the complexity of stochastic convex optimization given only statistical query (SQ) access to the objective function. We show that well-known and popular first-order iterative methods can be implemented using only statistical queries. For many cases of interest we derive nearly matching upper and lower bounds on the estimation...
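A toy illustration of the kind of implementation the abstract alludes to: a gradient-descent step in which the expected gradient is obtained through statistical-query-style estimates (here emulated by empirical averages over fresh samples) rather than individual samples. The objective, sampler, and parameters are illustrative assumptions.

```python
import numpy as np

def sq_gradient_step(w, sample_fn, grad_fn, lr=0.1, tol=0.05, rng=None):
    """One gradient step where the expected gradient is estimated to
    tolerance tol, standing in for answers from an SQ oracle."""
    rng = rng or np.random.default_rng()
    n = int(4.0 / tol ** 2)
    xs = [sample_fn(rng) for _ in range(n)]
    grad_est = np.mean([grad_fn(w, x) for x in xs], axis=0)   # conceptually one SQ per coordinate
    return w - lr * grad_est

# Toy stochastic objective: E_x 0.5 * ||w - x||^2, whose expected gradient is w - E[x].
sample = lambda rng: rng.normal(loc=1.0, size=3)
grad = lambda w, x: w - x
w = np.zeros(3)
for _ in range(50):
    w = sq_gradient_step(w, sample, grad)
print(w)   # approaches the distribution mean, roughly [1, 1, 1]
```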
Algorithmic stability is a classical approach to understanding and analyzing the generalization error of learning algorithms. A notable weakness of most stability-based generalization bounds is that they hold only in expectation. Generalization with high probability has been established in a landmark paper of Bousquet and Elisseeff (2002), albeit at the expense of an additional $\sqrt{n}$ factor in the bound. Specifically, their bound on the estimation error of any $\gamma$-uniformly stable learning algorithm on $n$ samples and range in $[0,1]$ is $O(\gamma\sqrt{n \log(1/\delta)} + ...$