Dennis Wei

ORCID: 0000-0002-6510-1537
Research Areas
  • Explainable Artificial Intelligence (XAI)
  • Adversarial Robustness in Machine Learning
  • Sparse and Compressive Sensing Techniques
  • Machine Learning and Data Classification
  • Ethics and Social Impacts of AI
  • Statistical Methods and Inference
  • Distributed Sensor Networks and Detection Algorithms
  • Machine Learning and Algorithms
  • Advanced Bandit Algorithms Research
  • Bayesian Modeling and Causal Inference
  • Advanced Adaptive Filtering Techniques
  • Privacy-Preserving Technologies in Data
  • Advanced Causal Inference Techniques
  • Imbalanced Data Classification Techniques
  • Topic Modeling
  • Scientific Computing and Data Management
  • Domain Adaptation and Few-Shot Learning
  • Neural Networks and Applications
  • Spectroscopy and Chemometric Analyses
  • Target Tracking and Data Fusion in Sensor Networks
  • Advanced Statistical Methods and Models
  • Face and Expression Recognition
  • Data-Driven Disease Surveillance
  • Advanced Statistical Process Monitoring
  • Semantic Web and Ontologies

IBM (United States)
2014-2024

University of Utah
2021-2023

Scarsdale Historical Society
2023

IBM Research - Thomas J. Watson Research Center
2013-2022

University of Michigan
2012-2015

Massachusetts Institute of Technology
2007-2011

As artificial intelligence and machine learning algorithms make further inroads into society, calls are increasing from multiple stakeholders for these algorithms to explain their outputs. At the same time, these stakeholders, whether they be affected citizens, government regulators, domain experts, or system developers, present different requirements for explanations. Toward addressing these needs, we introduce AI Explainability 360 (http://aix360.mybluemix.net/), an open-source software toolkit featuring eight...

10.48550/arxiv.1909.03012 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Artificial intelligence systems are being increasingly deployed due to their potential to increase the efficiency, scale, consistency, fairness, and accuracy of decisions. However, as many of these systems are opaque in their operation, there is a growing demand for such systems to provide explanations for their decisions. Conventional approaches to this problem attempt to expose or discover the inner workings of a machine learning model with the hope that the resulting explanations will be meaningful to the consumer. In contrast, this paper suggests a new approach to this problem. It introduces a simple,...

10.1145/3306618.3314273 article EN 2019-01-27

Several strands of research have aimed to bridge the gap between artificial intelligence (AI) and human decision-makers in AI-assisted decision-making, where humans are the consumers of AI model predictions and the ultimate decision-makers in high-stakes applications. However, people's perception and understanding are often distorted by their cognitive biases, such as confirmation bias, anchoring bias, and availability bias, to name a few. In this work, we use knowledge from the field of cognitive science to account for cognitive biases in the human-AI collaborative decision-making...

10.1145/3512930 article EN Proceedings of the ACM on Human-Computer Interaction 2022-03-30

The problem of estimation of density functionals like entropy and mutual information has received much attention in the statistics and information theory communities. A large class of estimators of functionals of the probability density suffer from the curse of dimensionality, wherein the mean squared error (MSE) decays increasingly slowly as a function of the sample size T as the dimension d of the samples increases. In particular, the rate is often glacially slow, of order O(T^{-γ/d}), where γ > 0 is a rate parameter. Examples of such estimators include kernel density estimators, k-nearest neighbor (k-NN) density estimators, k-NN...

10.1109/tit.2013.2251456 article EN IEEE Transactions on Information Theory 2013-06-12

Risk assessment is a growing use for machine learning models. When used in high-stakes applications, especially ones regulated by anti-discrimination laws or governed by societal norms for fairness, it is important to ensure that learned models do not propagate and scale any biases that may exist in training data. In this paper, we add on an additional challenge beyond fairness: unsupervised domain adaptation to covariate shift between a source and target distribution. Motivated by the real-world problem of risk assessment in new...

10.1145/3306618.3314236 article EN 2019-01-27

This paper considers the learning of Boolean rules in either disjunctive normal form (DNF, OR-of-ANDs, equivalent to decision rule sets) or conjunctive normal form (CNF, AND-of-ORs) as an interpretable model for classification. An integer program is formulated to optimally trade classification accuracy for rule simplicity. Column generation (CG) is used to efficiently search over an exponential number of candidate clauses (conjunctions or disjunctions) without the need for heuristic rule mining. This approach also bounds the gap between the selected rule set...

10.48550/arxiv.1805.09901 preprint EN other-oa arXiv (Cornell University) 2018-01-01
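
To make the DNF (OR-of-ANDs) rule form from the abstract above concrete, here is a minimal Python sketch of how such a rule set could be evaluated on one sample. The feature names, the `dnf_predict` and `rule_complexity` helpers, and the complexity measure (clauses plus literals) are illustrative assumptions, not the paper's implementation; the integer program and column generation search are not reproduced here.

```python
from typing import Dict, List

# A clause is a conjunction (AND) of literals, stored as feature -> required value.
# A DNF rule is a list of clauses joined by OR. All names are hypothetical.
Clause = Dict[str, bool]


def clause_holds(clause: Clause, sample: Dict[str, bool]) -> bool:
    """Return True when every literal in the conjunction matches the sample."""
    return all(sample[feature] == value for feature, value in clause.items())


def dnf_predict(rule: List[Clause], sample: Dict[str, bool]) -> bool:
    """Predict the positive class when at least one clause is satisfied."""
    return any(clause_holds(clause, sample) for clause in rule)


def rule_complexity(rule: List[Clause]) -> int:
    """Assumed simplicity measure: number of clauses plus total number of literals."""
    return len(rule) + sum(len(clause) for clause in rule)


# Hypothetical rule: (high_income AND NOT has_default) OR has_collateral
example_rule = [
    {"high_income": True, "has_default": False},
    {"has_collateral": True},
]
```

A rule learner in this spirit would trade the error of `dnf_predict` on training data against `rule_complexity`; how that trade-off is weighted is left open in this sketch.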

This tutorial will teach participants to use and contribute to a new open-source Python package named AI Explainability 360 (AIX360) (https://aix360.mybluemix.net), a comprehensive and extensible toolkit that supports interpretability and explainability of data and machine learning models.

10.1145/3351095.3375667 article EN 2020-01-27

The popularity of pretrained language models in natural language processing systems calls for a careful evaluation of such models in down-stream tasks, which have a higher potential for societal impact. The evaluation of such systems usually focuses on accuracy measures. Our findings in this paper call for attention to be paid to fairness measures as well. Through the analysis of more than a dozen pretrained language models of varying sizes on two toxic text classification tasks (English), we demonstrate that focusing on accuracy measures alone can lead to models with wide variation in fairness characteristics. Specifically, we observe that fairness can vary...

10.18653/v1/2022.findings-acl.176 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2022 2022-01-01

The recent Zika virus (ZIKV) epidemic in the Americas ranks among the largest outbreaks in modern times. Like other mosquito-borne flaviviruses, ZIKV circulates in sylvatic cycles among primates that can serve as reservoirs of spillover infection to humans. Identifying sylvatic reservoirs is critical to mitigating spillover risk, but relevant surveillance and biological data remain limited for this and most other zoonoses. We confronted this data sparsity by combining a machine learning method, Bayesian multi-label learning, with a multiple imputation method on...

10.1016/j.epidem.2019.01.005 article EN cc-by-nc-nd Epidemics 2019-03-19

Shanghai has experienced a rapid process of urbanization and urban expansion, which increases travel costs and limits job accessibility for the economically disadvantaged population. This paper investigates the jobs-housing imbalance problem in Shanghai at the subdistrict-level (census-level) and reaches the following conclusions. First, the jobs-housing imbalance shows a ring pattern and is evident mainly in the suburban areas and the periphery of the metropolitan area because job opportunities are highly concentrated while residential areas are sprawling. Second, structural factors...

10.5198/jtlu.2021.1805 article EN cc-by-nc Journal of Transport and Land Use 2021-03-14

Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling discrimination, limiting distortion in individual data samples, and preserving utility. We characterize the impact of limited sample size in accomplishing this objective, and apply two instances of the proposed optimization to datasets, including one on real-world...

10.48550/arxiv.1704.03354 preprint EN other-oa arXiv (Cornell University) 2017-01-01

This paper develops a novel optimization framework for learning accurate and sparse two-level Boolean rules for classification, both in Conjunctive Normal Form (CNF, i.e. AND-of-ORs) and in Disjunctive Normal Form (DNF, i.e. OR-of-ANDs). In contrast to opaque models (e.g. neural networks), sparse two-level Boolean rules gain the crucial benefit of interpretability, which is necessary in a wide range of applications such as law and medicine and is attracting considerable attention in machine learning. This paper introduces two principled objective functions to trade off classification...

10.1109/mlsp.2016.7738856 article EN 2016-09-01

Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling group discrimination, limiting distortion in individual data samples, and preserving utility. Several theoretical properties are established, including conditions for convexity, a characterization of the impact of limited sample size on...

10.1109/jstsp.2018.2865887 article EN IEEE Journal of Selected Topics in Signal Processing 2018-08-17

This paper considers three problems in sparse filter design, the first involving a weighted least-squares constraint on the frequency response, the second a constraint on mean squared error in estimation, and the third a constraint on signal-to-noise ratio in detection. The three problems are unified under a single framework based on sparsity maximization under a quadratic performance constraint. Efficient and exact solutions are developed for specific cases in which the matrix in the quadratic constraint is diagonal, block-diagonal, banded, or has low condition number. For the more difficult general case,...

10.1109/tsp.2012.2229996 article EN IEEE Transactions on Signal Processing 2012-11-27

This paper considers sequential adaptive estimation of sparse signals under a constraint on the total sensing effort. The advantage of adaptivity in this context is the ability to focus more resources on regions of space where signal components exist, thereby improving performance. A dynamic programming formulation is derived for the allocation of sensing effort to minimize the expected estimation loss. Based on the method of open-loop feedback control, allocation policies are then developed for a variety of loss functions. The policies are optimal in the two-stage case, generalizing an...

10.1109/jstsp.2013.2256105 article EN IEEE Journal of Selected Topics in Signal Processing 2013-04-01

As a contribution to interpretable machine learning research, we develop a novel optimization framework for learning accurate and sparse two-level Boolean rules. We consider rules in both conjunctive normal form (AND-of-ORs) and disjunctive normal form (OR-of-ANDs). A principled objective function is proposed to trade classification accuracy and interpretability, where we use Hamming loss to characterize accuracy and sparsity to characterize interpretability. We propose efficient procedures to optimize these objectives based on linear programming (LP) relaxation,...

10.48550/arxiv.1606.05798 preprint EN other-oa arXiv (Cornell University) 2016-01-01

This paper presents an exact algorithm for sparse filter design under a quadratic constraint on filter performance. The algorithm is based on branch-and-bound, a combinatorial optimization procedure that can either guarantee an optimal solution or produce a solution with a bound on its deviation from optimality. To reduce the complexity of branch-and-bound, several methods are developed for bounding the optimal filter cost. Bounds based on infeasibility yield incrementally accumulating improvements with minimal computation, while two convex relaxations, referred to as linear and...

10.1109/tsp.2012.2226450 article EN IEEE Transactions on Signal Processing 2012-10-25

The information technology (IT) services industry is undergoing a rapid change with the growth of market interest in cloud, analytics, mobile, social, and security technologies. For service providers to match this pace, they must rapidly transform their workforce in terms of job roles, and do so without incurring excessive cost while continuing to deliver core services. In this paper, we describe a big data approach to enable such a transformation through internal transfers of suitable employees from legacy areas...

10.1109/bigdatacongress.2015.84 article EN 2015-06-01

This paper re-examines a continuous optimization framework dubbed NOTEARS for learning Bayesian networks. We first generalize existing algebraic characterizations of acyclicity to a class of matrix polynomials. Next, focusing on a one-parameter-per-edge setting, it is shown that the Karush-Kuhn-Tucker (KKT) optimality conditions for the NOTEARS formulation cannot be satisfied except in a trivial case, which explains a behavior of the associated algorithm. We then derive the KKT conditions for an equivalent reformulation, show that they are indeed...

10.48550/arxiv.2010.09133 preprint EN other-oa arXiv (Cornell University) 2020-01-01
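
The algebraic acyclicity characterizations mentioned in the abstract above can be illustrated with a small sketch. For a weight matrix W on d nodes, let A be the elementwise square of W. Since A is nonnegative, tr(A^k) sums the weights of closed walks of length k, so a polynomial in A with positive coefficients on powers 1 through d has zero trace exactly when the graph has no directed cycle. The code below uses a truncated exponential series as one such polynomial; this particular choice and the function names are assumptions for illustration, not the paper's formulation.

```python
from math import factorial
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a: Matrix) -> float:
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def acyclicity_penalty(w: Matrix) -> float:
    """Return sum_{k=1..d} tr(A^k) / k! with A = W squared elementwise.

    The value is zero if and only if the weighted directed graph encoded
    by W has no cycle (self-loops included), and positive otherwise.
    """
    d = len(w)
    a = [[x * x for x in row] for row in w]  # nonnegative edge weights
    power = [row[:] for row in a]            # holds A^k, starting at k = 1
    total = 0.0
    for k in range(1, d + 1):
        total += trace(power) / factorial(k)
        power = matmul(power, a)
    return total
```

In a continuous structure-learning setup, a function of this kind would serve as the equality constraint that forces the learned weight matrix to describe a directed acyclic graph.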

Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output to biased and toxic generations. Due to several limiting factors surrounding LLMs (training cost, API access, data availability, etc.), it may not always be feasible to impose direct safety constraints on a deployed model. Therefore, an efficient and reliable alternative is required. To this end, we present our ongoing efforts to create and deploy a library of detectors: compact and easy-to-build classification models that provide labels...

10.48550/arxiv.2403.06009 preprint EN arXiv (Cornell University) 2024-03-09