NFDI4DS | UHH-SEMS - Publication Details

Daniel Kifer

ORCID: 0000-0002-4611-7066

About

Contact & Profiles

A5005431144

Research Areas

Privacy-Preserving Technologies in Data
Cryptography and Data Security
Stochastic Gradient Optimization Techniques
Privacy, Security, and Data Protection
Handwritten Text Recognition Techniques
Adversarial Robustness in Machine Learning
Advanced Image and Video Retrieval Techniques
Domain Adaptation and Few-Shot Learning
Mobile Crowdsensing and Crowdsourcing
Anomaly Detection Techniques and Applications
Landslides and related hazards
Data-Driven Disease Surveillance
Topic Modeling
Neural Networks and Applications
Seismology and Earthquake Studies
Data Management and Algorithms
Complexity and Algorithms in Graphs
Internet Traffic Analysis and Secure E-voting
Natural Language Processing Techniques
Flood Risk Assessment and Management
Hydrological Forecasting Using AI
Census and Population Estimation
Multimodal Machine Learning Applications
Model Reduction and Neural Networks
Time Series Analysis and Forecasting

Pennsylvania State University
2016-2025

United States Census Bureau
2018-2025

Yahoo (United States)
2009-2018

Adobe Systems (United States)
2017

Park University
2015

University of Waterloo
2013

Cornell University
2002-2011

Yahoo (Spain)
2008

L-diversity

OPENALEX - Publications

Ashwin Machanavajjhala Daniel Kifer Johannes Gehrke Muthuramakrishnan Venkitasubramaniam

Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k − 1 other records with respect to certain identifying attributes. In this article, we show using two simple attacks that a k-anonymized dataset has some subtle but severe privacy problems. First, an attacker can discover the values of sensitive attributes when there is little diversity...

10.1145/1217299.1217302 article EN ACM Transactions on Knowledge Discovery from Data 2007-03-01
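
The k-anonymity condition described in the abstract above, and the "little diversity" weakness it mentions, can be illustrated with a minimal sketch. The record layout, the attribute names (`zip`, `age`, `condition`), and the helper functions are hypothetical and chosen only for illustration. Counting distinct sensitive values per group is the simplest possible notion of diversity and is an assumption here, not necessarily the formal definition used in the article.

```python
from collections import defaultdict


def group_by_quasi_identifiers(records, quasi_identifiers):
    """Group records that share the same values on the identifying attributes."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every record is indistinguishable from at least k - 1 others."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def min_sensitive_diversity(records, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values found in any group.

    Assumption: distinct-value counting stands in for "diversity" here.
    """
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return min(len({r[sensitive] for r in group}) for group in groups.values())


# Hypothetical, already-generalized table used only for illustration.
records = [
    {"zip": "130**", "age": "<30", "condition": "heart disease"},
    {"zip": "130**", "age": "<30", "condition": "heart disease"},
    {"zip": "148**", "age": ">=40", "condition": "cancer"},
    {"zip": "148**", "age": ">=40", "condition": "flu"},
]

print(is_k_anonymous(records, ["zip", "age"], k=2))
print(min_sensitive_diversity(records, ["zip", "age"], "condition"))
```

In this toy table every group has two records, yet one group contains a single sensitive value, so anyone who can place a person in that group learns the value without re-identifying a specific record.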

L-diversity: privacy beyond k-anonymity

OPENALEX - Publications

Ashwin Machanavajjhala Johannes Gehrke Daniel Kifer Muthuramakrishnan Venkitasubramaniam

Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k − 1 other records with respect to certain "identifying" attributes. In this paper we show with two simple attacks that a k-anonymized dataset has some subtle, but severe privacy problems. First, an attacker can discover the values of sensitive attributes when there is little...

10.1109/icde.2006.1 article EN 2006-01-01

No free lunch in data privacy

OPENALEX - Publications

Daniel Kifer Ashwin Machanavajjhala

Differential privacy is a powerful tool for providing privacy-preserving noisy query answers over statistical databases. It guarantees that the distribution of noisy query answers changes very little with the addition or deletion of any tuple. It is frequently accompanied by popularized claims that it provides privacy without any assumptions about the data and that it protects against attackers who know all but one record. In this paper we critically analyze the privacy protections offered by differential privacy.

10.1145/1989323.1989345 article EN 2011-06-12
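
The guarantee paraphrased in the abstract above is usually written as the ε-differential privacy inequality. The notation is not part of this listing and follows the standard textbook form: a randomized mechanism \mathcal{M}, any two databases D and D' that differ by the addition or deletion of one tuple, any set S of possible outputs, and a privacy parameter ε > 0.

```latex
\[
  \Pr\bigl[\mathcal{M}(D) \in S\bigr]
  \;\le\;
  e^{\varepsilon}\,\Pr\bigl[\mathcal{M}(D') \in S\bigr]
  \qquad \text{for all neighboring } D, D' \text{ and all } S .
\]
```

A small ε forces the two output distributions to be close, which is the sense in which the distribution of noisy answers "changes very little".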

Privacy: Theory meets Practice on the Map

OPENALEX - Publications

Ashwin Machanavajjhala Daniel Kifer John M. Abowd Johannes Gehrke Lars Vilhuber

In this paper, we propose the first formal privacy analysis of a data anonymization process known as synthetic data generation, a technique becoming popular in the statistics community. The target application for this work is a mapping program that shows the commuting patterns of the population of the United States. The source data for this application were collected by the U.S. Census Bureau, but due to privacy constraints, they cannot be used directly by the mapping program. Instead, we generate synthetic data that statistically mimic the original data while providing privacy guarantees. We use these synthetic data as a surrogate for the original data....

10.1109/icde.2008.4497436 article EN 2008-04-01

Context-aware citation recommendation

OPENALEX - Publications

Qi He Jian Pei Daniel Kifer Prasenjit Mitra C. Lee Giles

When you write papers, how many times do you want to make some citations at a place but you are not sure which papers to cite? Do you wish to have a recommendation system which can recommend a small number of good candidates for every place that you want to make some citations? In this paper, we present our initiative of building a context-aware citation recommendation system. High quality citation recommendation is challenging: not only should the citations recommended be relevant to the paper under composition, but they should also match the local contexts of the places where citations are made. Moreover, it is far from trivial to model how the topic of the whole paper and the contexts of the citation places should affect...

10.1145/1772690.1772734 article EN 2010-04-26

HESS Opinions: Incubating deep-learning-powered hydrologic science advances as a community

OPENALEX - Publications

Chaopeng Shen Eric Laloy Amin Elshorbagy Adrian Albert Jerad Bales and 9 more

Recently, deep learning (DL) has emerged as a revolutionary and versatile tool transforming industry applications and generating new and improved capabilities for scientific discovery and model building. The adoption of DL in hydrology has so far been gradual, but the field is now ripe for breakthroughs. This paper suggests that DL-based methods can open up a complementary avenue toward knowledge discovery in hydrologic sciences. In the new avenue, machine-learning algorithms present competing hypotheses that are consistent with...

10.5194/hess-22-5639-2018 article EN cc-by Hydrology and Earth System Sciences 2018-11-01

Prolongation of SMAP to Spatiotemporally Seamless Coverage of Continental U.S. Using a Deep Learning Neural Network

OPENALEX - Publications

Kuai Fang Chaopeng Shen Daniel Kifer Xiao Yang

The Soil Moisture Active Passive (SMAP) mission has delivered valuable sensing of surface soil moisture since 2015. However, it has a short time span and an irregular revisit schedule. Utilizing a state-of-the-art time-series deep learning neural network, Long Short-Term Memory (LSTM), we created a system that predicts SMAP level-3 soil moisture data with atmospheric forcing, model-simulated moisture, and static physiographic attributes as inputs. The system removes most of the bias of the model simulations and improves the predicted climatology,...

10.1002/2017gl075619 article EN Geophysical Research Letters 2017-10-16

Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks

OPENALEX - Publications

Xiao Yang Ersin Yumer Paul Asente Mike Kraley Daniel Kifer and 1 more

We present an end-to-end, multimodal, fully convolutional network for extracting semantic structures from document images. We consider document semantic structure extraction as a pixel-wise segmentation task, and propose a unified model that classifies pixels based not only on their visual appearance, as in the traditional page segmentation task, but also on the content of the underlying text. Moreover, we propose an efficient synthetic document generation process that we use to generate pretraining data for our network. Once the network is trained on a large set of synthetic documents, we fine-tune the network on unlabeled...

10.1109/cvpr.2017.462 article EN 2017-07-01

Pufferfish

OPENALEX - Publications

Daniel Kifer Ashwin Machanavajjhala

In this article, we introduce a new and general privacy framework called Pufferfish. The Pufferfish framework can be used to create new privacy definitions that are customized to the needs of a given application. The goal of Pufferfish is to allow experts in an application domain, who frequently do not have expertise in privacy, to develop rigorous privacy definitions for their data sharing needs. In addition to this, the Pufferfish framework can also be used to study existing privacy definitions. We illustrate the benefits with several applications of this privacy framework: we use it to analyze differential privacy and formalize a connection to attackers...

10.1145/2514689 article EN ACM Transactions on Database Systems 2014-01-01

Learning to Read Irregular Text with Attention Mechanisms

OPENALEX - Publications

Xiao Yang Dafang He Zihan Zhou Daniel Kifer C. Lee Giles

We present a robust end-to-end neural-based model to attentively recognize text in natural images. Particularly, we focus on accurately identifying irregular (perspectively distorted or curved) text, which has not been well addressed in the previous literature. Previous research on text reading often works with regular (horizontal and frontal) text and does not adequately generalize to processing text with perspective distortion or curving effects. Our work proposes to overcome this difficulty by introducing two learning components:...

10.24963/ijcai.2017/458 article EN 2017-07-28

Crime Rate Inference with Big Data

OPENALEX - Publications

Hongjian Wang Daniel Kifer Corina Graif Zhenhui Li

Crime is one of the most important social problems in the country, affecting public safety, children's development, and adult socioeconomic status. Understanding what factors cause higher crime is critical for policy makers in their efforts to reduce crime and increase citizens' life quality. We tackle a fundamental problem in our paper: crime rate inference at the neighborhood level. Traditional approaches have used demographics and geographical influences to estimate crime rates in a region. With the fast development of positioning technology...

10.1145/2939672.2939736 article EN 2016-08-08

A Neural Temporal Model for Human Motion Prediction

OPENALEX - Publications

Anand Gopalakrishnan Ankur Mali Daniel Kifer C. Lee Giles Alexander G. Ororbia

We propose novel neural temporal models for predicting and synthesizing human motion, achieving state-of-the-art in modeling long-term motion trajectories while being competitive with prior work in short-term prediction and requiring significantly less computation. Key aspects of our proposed system include: 1) a novel, two-level processing architecture that aids in generating planned trajectories, 2) a simple set of easily computable features that integrate derivative information, and 3) a novel multi-objective loss...

10.1109/cvpr.2019.01239 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

The Data Synergy Effects of Time‐Series Deep Learning Models in Hydrology

OPENALEX - Publications

Kuai Fang Daniel Kifer Kathryn Lawson Dapeng Feng Chaopeng Shen

When fitting statistical models to variables in geoscientific disciplines such as hydrology, it is a customary practice to stratify a large domain into multiple regions (or regimes) and study each region separately. Traditional wisdom suggests that models built for each region separately will have higher performance because of homogeneity within each region. However, each stratified model has access to fewer and less diverse data points. Here, through two hydrologic examples (soil moisture and streamflow), we show that conventional...

10.1029/2021wr029583 article EN publisher-specific-oa Water Resources Research 2022-03-17

Injecting utility into anonymized datasets

OPENALEX - Publications

Daniel Kifer Johannes Gehrke

Limiting disclosure in data publishing requires a careful balance between privacy and utility. Information about individuals must not be revealed, but a dataset should still be useful for studying the characteristics of a population. Privacy requirements such as k-anonymity and l-diversity are designed to thwart attacks that attempt to identify individuals in the data and to discover their sensitive information. On the other hand, the utility of such data has not been well-studied. In this paper we will discuss the shortcomings of current heuristic approaches...

10.1145/1142473.1142499 article EN 2006-06-27

Worst-Case Background Knowledge for Privacy-Preserving Data Publishing

OPENALEX - Publications

David Martín Daniel Kifer Ashwin Machanavajjhala Johannes Gehrke Joseph Y. Halpern

Recent work has shown the necessity of considering an attacker's background knowledge when reasoning about privacy in data publishing. However, in practice, the data publisher does not know what background knowledge the attacker possesses. Thus, it is important to consider the worst case. In this paper, we initiate a formal study of worst-case background knowledge. We propose a language that can express any background knowledge about the data. We provide a polynomial time algorithm to measure the amount of disclosure of sensitive information in the worst case, given that the attacker has at most k pieces of information in this language. We also...

10.1109/icde.2007.367858 article EN 2007-04-01

Attacks on privacy and deFinetti's theorem

OPENALEX - Publications

Daniel Kifer

In this paper we present a method for reasoning about privacy using the concepts of exchangeability and deFinetti's theorem. We illustrate the usefulness of this technique by using it to attack a popular data sanitization scheme known as Anatomy. We stress that Anatomy is not the only sanitization scheme that is vulnerable to this attack. In fact, any scheme that uses the random worlds model, i.i.d. model, or tuple-independent model needs to be re-evaluated.

10.1145/1559845.1559861 article EN 2009-06-29

A rigorous and customizable framework for privacy

OPENALEX - Publications

Daniel Kifer Ashwin Machanavajjhala

In this paper we introduce a new and general privacy framework called Pufferfish. The Pufferfish framework can be used to create new privacy definitions that are customized to the needs of a given application. The goal of Pufferfish is to allow experts in an application domain, who frequently do not have expertise in privacy, to develop rigorous privacy definitions for their data sharing needs. In addition to this, the Pufferfish framework can also be used to study existing privacy definitions.

10.1145/2213556.2213571 article EN 2012-05-21

Concentrated Differentially Private Gradient Descent with Adaptive per-Iteration Privacy Budget

OPENALEX - Publications

Jae Wook Lee Daniel Kifer

Iterative algorithms, like gradient descent, are common tools for solving a variety of problems, such as model fitting. For this reason, there is interest in creating differentially private versions of them. However, their conversion to differentially private algorithms is often naive. For instance, a fixed number of iterations is chosen, the privacy budget is split evenly among them, and at each iteration, parameters are updated with a noisy gradient.

10.1145/3219819.3220076 article EN 2018-07-19
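
The naive conversion that the abstract above describes (fixed iteration count, evenly split budget, noisy gradient at each step) might look like the following sketch. It shows the baseline only, not the adaptive method the title refers to. Everything in it is an assumption made for illustration: the least-squares objective, per-example gradient clipping, the function and parameter names, and budget accounting in zero-concentrated differential privacy (zCDP), where Gaussian noise with standard deviation σ on a quantity of L2 sensitivity Δ costs ρ = Δ²/(2σ²) and costs add up across iterations. The dataset size is treated as public.

```python
import math
import random


def clip(vec, max_norm):
    """Scale a vector down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]


def naive_private_gd(xs, ys, total_rho, iterations,
                     lr=0.1, clip_norm=1.0, seed=0):
    """Least-squares gradient descent with an evenly split zCDP budget.

    Assumptions (illustrative only): zCDP accounting, add/remove-one-record
    neighbors, and a publicly known dataset size n.
    """
    rng = random.Random(seed)
    n, dim = len(xs), len(xs[0])
    w = [0.0] * dim
    rho_step = total_rho / iterations               # even split of the budget
    sigma = clip_norm / math.sqrt(2.0 * rho_step)   # from rho = Delta^2 / (2 sigma^2)
    for _ in range(iterations):
        grad_sum = [0.0] * dim
        for x, y in zip(xs, ys):
            residual = sum(wi * xi for wi, xi in zip(w, x)) - y
            per_example = clip([residual * xi for xi in x], clip_norm)
            grad_sum = [a + b for a, b in zip(grad_sum, per_example)]
        # The clipped sum changes by at most clip_norm when one record is
        # added or removed, so Gaussian noise is calibrated to clip_norm.
        noisy_sum = [g + rng.gauss(0.0, sigma) for g in grad_sum]
        w = [wi - lr * gi / n for wi, gi in zip(w, noisy_sum)]
    return w


# Hypothetical toy data: y = 2 * x with a single feature.
xs = [[1.0], [1.0], [1.0], [1.0]]
ys = [2.0, 2.0, 2.0, 2.0]
print(naive_private_gd(xs, ys, total_rho=1.0, iterations=50))
```

Because the budget is fixed per step, every iteration receives the same noise level whether or not that step is still making progress, which is the rigidity the abstract calls naive.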

Citation recommendation without author supervision

OPENALEX - Publications

Qi He Daniel Kifer Jian Pei Prasenjit Mitra C. Lee Giles

Automatic recommendation of citations for a manuscript is highly valuable for scholarly activities since it can substantially improve the efficiency and quality of literature search. The prior techniques placed a considerable burden on users, who were required to provide a representative bibliography or to mark passages where citations are needed. In this paper we present a system that considerably reduces this burden: a user simply inputs a query manuscript (without a bibliography) and our system automatically finds locations where citations are needed. We show that naïve...

10.1145/1935826.1935926 article EN 2011-02-01

A Simple Baseline for Travel Time Estimation using Large-scale Trip Data

OPENALEX - Publications

Hongjian Wang Xianfeng Tang Yu-Hsuan Kuo Daniel Kifer Zhenhui Li

The increased availability of large-scale trajectory data provides rich information for the study of urban dynamics. For example, the New York City Taxi & Limousine Commission regularly releases source/destination information of taxi trips, where 173 million taxi trips were released for Year 2013 [29]. Such a big dataset provides us potential new perspectives to address the traditional traffic problems. In this article, we study the travel time estimation problem. Instead of following the traditional route-based travel time estimation, we propose to simply use a large amount of taxi trips without...

10.1145/3293317 article EN ACM Transactions on Intelligent Systems and Technology 2019-01-12

Multi-Scale Multi-Task FCN for Semantic Page Segmentation and Table Detection

OPENALEX - Publications

Dafang He Scott Cohen Brian Price Daniel Kifer C. Lee Giles

Page segmentation and table detection play an important role in understanding the structure of documents. We present a page segmentation algorithm that incorporates state-of-the-art deep learning methods for segmenting three types of document elements: text blocks, tables, and figures. We propose a multi-scale, multi-task fully convolutional neural network (FCN) for the tasks of semantic page segmentation and element contour detection. The semantic segmentation branch accurately predicts the probability at each pixel of the three element classes. The contour detection branch accurately predicts instance level "edges" around each element occurrence. We propose a conditional...

10.1109/icdar.2017.50 article EN 2017-11-01
