NFDI4DS | UHH-SEMS - Publication Details

Dafna Shahaf

ORCID: 0000-0003-3261-0818

About

Contact & Profiles

OpenAlex ID: A5009660759

Research Areas

Topic Modeling
Natural Language Processing Techniques
Advanced Text Analysis Techniques
Data Visualization and Analytics
Multimodal Machine Learning Applications
Semantic Web and Ontologies
Web Data Mining and Analysis
Software Engineering Research
Mobile Crowdsensing and Crowdsourcing
Logic, Reasoning, and Knowledge
Humor Studies and Applications
Data Management and Algorithms
Sentiment Analysis and Opinion Mining
Open Source Software Innovations
Machine Learning and Algorithms
Artificial Intelligence in Games
Rough Sets and Fuzzy Logic
Advanced Optical Network Technologies
Explainable Artificial Intelligence (XAI)
Bayesian Modeling and Causal Inference
Language, Metaphor, and Cognition
Information Retrieval and Search Behavior
Video Analysis and Summarization
Innovative Human-Technology Interaction
Multi-Agent Systems and Negotiation

Hebrew University of Jerusalem
2016-2024

Stanford University
2013-2022

Tel Aviv University
2022

Bar-Ilan University
2022

Allen Institute
2022

Allen Institute for Artificial Intelligence
2020

University of California, Berkeley
2020

Microsoft (United States)
2015

Stanford Medicine
2013

Carnegie Mellon University
2009-2012

State of What Art? A Call for Multi-Prompt LLM Evaluation

OPENALEX - Publications

Moran Mizrahi Guy Kaplan Dan Malkin Rotem Dror Dafna Shahaf and 1 more

Recent advances in LLMs have led to an abundance of evaluation benchmarks, which typically rely on a single instruction template per task. We create a large-scale collection of instruction paraphrases and comprehensively analyze the brittleness introduced by single-prompt evaluations across 6.5M instances, involving 20 different LLMs and 39 tasks from 3 benchmarks. We find that different instruction templates lead to very different performance, both absolute and relative. Instead, we propose a set of diverse metrics on multiple instruction paraphrases, specifically...

10.1162/tacl_a_00681 article EN cc-by Transactions of the Association for Computational Linguistics 2024-01-01

Connecting the dots between news articles

Dafna Shahaf Carlos Guestrin

The process of extracting useful knowledge from large datasets has become one of the most pressing problems in today's society. The problem spans entire sectors, from scientists to intelligence analysts and web users, all of whom are constantly struggling to keep up with the larger amounts of content published every day. With this much data, it is often easy to miss the big picture.

10.1145/1835804.1835884 article EN 2010-07-25

Learning to Route

Asaf Valadarsky Michael Schapira Dafna Shahaf Aviv Tamar

Recently, much attention has been devoted to the question of whether/when traditional network protocol design, which relies on the application of algorithmic insights by human experts, can be replaced by a data-driven (i.e., machine learning) approach. We explore this question in the context of arguably the most fundamental networking task: routing. Can ideas and techniques from machine learning (ML) be leveraged to automatically generate "good" routing configurations? We focus on the classical setting of intradomain traffic engineering. We observe...

10.1145/3152434.3152441 article EN 2017-11-27

Turning down the noise in the blogosphere

Khalid El-Arini Gaurav Veda Dafna Shahaf Carlos Guestrin

In recent years, the blogosphere has experienced a substantial increase in the number of posts published daily, forcing users to cope with information overload. The task of guiding users through this flood of information has thus become critical. To address this issue, we present a principled approach for picking a set of posts that best covers the important stories in the blogosphere.

10.1145/1557019.1557056 article EN 2009-06-28

Trains of thought

Dafna Shahaf Carlos Guestrin Eric Horvitz

When information is abundant, it becomes increasingly difficult to fit nuggets of knowledge into a single coherent picture. Complex stories spaghetti into branches, side stories, and intertwining narratives. In order to explore these stories, one needs a map to navigate unfamiliar territory. We propose a methodology for creating structured summaries of information, which we call metro maps. Our proposed algorithm generates a concise set of documents maximizing coverage of salient pieces of information. Most importantly, metro maps...

10.1145/2187836.2187957 article EN 2012-04-16

Information cartography

Dafna Shahaf Jaewon Yang Caroline Suen Jeff Jacobs Heidi Wang and 1 more

In an era of information overload, many people struggle to make sense of complex stories, such as presidential elections or economic reforms. We propose a methodology for creating structured summaries of information, which we call zoomable metro maps. Just as cartographic maps have been relied upon for centuries to help us understand our surroundings, metro maps can help us understand the information landscape.

10.1145/2487575.2487690 article EN 2013-08-11

Metro maps of science

Dafna Shahaf Carlos Guestrin Eric Horvitz

As the number of scientific publications soars, even the most enthusiastic reader can have trouble staying on top of the evolving literature. It is easy to focus on a narrow aspect of one's field and lose track of the big picture. Information overload is indeed a major challenge for scientists today, and is especially daunting for new investigators attempting to master a discipline and for scientists who seek to cross disciplinary borders. In this paper, we propose metrics of influence, coverage, and connectivity for scientific literature. We use these metrics to create structured summaries...

10.1145/2339530.2339706 article EN 2012-08-12

SOLVENT

Joel Chan Joseph Chee Chang Tom Hope Dafna Shahaf Aniket Kittur

Scientific discoveries are often driven by finding analogies in distant domains, but the growing number of papers makes it difficult to find relevant ideas in a single discipline, let alone in other domains. To provide computational support for finding analogies across domains, we introduce SOLVENT, a mixed-initiative system where humans annotate aspects of research papers that denote their background (the high-level problems being addressed), purpose (the specific problems being addressed), mechanism (how they achieved their purpose), and findings (what they learned/achieved),...

10.1145/3274300 article EN Proceedings of the ACM on Human-Computer Interaction 2018-11-01

Scaling up analogical innovation with crowds and AI

Aniket Kittur Lixiu Yu Tom Hope Joel Chan Hila Lifshitz‐Assaf and 4 more

Analogy, the ability to find and apply deep structural patterns across domains, has been fundamental to human innovation in science and technology. Today there is a growing opportunity to accelerate innovation by moving analogy out of a single person’s mind and distributing it across many information processors, both human and machine. Doing so has the potential to overcome cognitive fixation, scale to large idea repositories, and support complex problems with multiple constraints. Here we lay out a perspective on the future of scalable analogical innovation and first steps...

10.1073/pnas.1807185116 article EN Proceedings of the National Academy of Sciences 2019-02-04

Generalized Task Markets for Human and Machine Computation

Dafna Shahaf Eric Horvitz

We discuss challenges and opportunities for developing generalized task markets where human and machine intelligence are enlisted to solve problems, based on a consideration of the competencies, availabilities, and pricing of different problem-solving resources. The approach couples human computation with machine learning and planning, and is aimed at optimizing the flow of subtasks to people and to computational problem solvers. We illustrate key ideas in the context of Lingua Mechanica, a project focused on harnessing human and machine translation skills to perform translation among...

10.1609/aaai.v24i1.7652 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2010-07-04

Inside Jokes

Dafna Shahaf Eric Horvitz Robert Mankoff

Humor is an integral aspect of the human experience. Motivated by the prospect of creating computational models of humor, we study the influence of the language of cartoon captions on the perceived humorousness of the cartoons. Our studies are based on a large corpus of crowdsourced cartoon captions that were submitted to a contest hosted by the New Yorker. Having access to thousands of captions for the same image allows us to analyze the breadth of responses of people to the same visual stimulus.

10.1145/2783258.2783388 article EN 2015-08-07

Accelerating Innovation Through Analogy Mining

Tom Hope Joel Chan Aniket Kittur Dafna Shahaf

The availability of large idea repositories (e.g., the U.S. patent database) could significantly accelerate innovation and discovery by providing people with inspiration from solutions to analogous problems. However, finding useful analogies in these large, messy, real-world repositories remains a persistent challenge for either human or automated methods. Previous approaches include costly hand-created databases that have high relational structure (e.g., predicate calculus representations) but are very sparse....

10.1145/3097983.3098038 article EN 2017-08-04

Augmenting Scientific Creativity with an Analogical Search Engine

Hyeonsu B Kang Xin Qian Tom Hope Dafna Shahaf Joel Chan and 1 more

Analogies have been central to creative problem-solving throughout the history of science and technology. As the number of scientific articles continues to increase exponentially, there is a growing opportunity for finding diverse solutions to existing problems. However, realizing this potential requires the development of a means for searching through a large corpus that goes beyond surface matches and simple keywords. Here we contribute the first end-to-end system for analogical search on scientific articles and evaluate its effectiveness with...

10.1145/3530013 article EN ACM Transactions on Computer-Human Interaction 2022-06-08

Analogy Mining for Specific Design Needs

Karni Gilon Joel Chan Felicia Ng Hila Lifshitz-Assaf Aniket Kittur and 1 more

Finding analogical inspirations in distant domains is a powerful way of solving problems. However, as the number of inspirations that could be matched and the dimensions on which that matching could occur grow, it becomes challenging for designers to find inspirations relevant to their needs. Furthermore, designers are often interested in exploring specific aspects of a product: for example, one designer might be interested in improving the brewing capability of an outdoor coffee maker, while another might wish to optimize for portability. In this paper we introduce a novel system targeting...

10.1145/3173574.3173695 article EN 2018-04-19

The Lean Data Scientist

Chen Shani Jonathan Zarecki Dafna Shahaf

A taxonomy of the methods used to obtain quality datasets enhances existing resources.

10.1145/3551635 article EN Communications of the ACM 2023-01-20

Connecting Two (or Less) Dots

Dafna Shahaf Carlos Guestrin

Finding information is becoming a major part of our daily life. Entire sectors, from Web users to scientists and intelligence analysts, are increasingly struggling to keep up with the larger amounts of content published every day. With this much data, it is often easy to miss the big picture. In this article, we investigate methods for automatically connecting the dots, providing a structured, easy way to navigate within a new topic and discover hidden connections. We focus on the news domain: given two news articles, our system finds a coherent...

10.1145/2086737.2086744 article EN ACM Transactions on Knowledge Discovery from Data 2012-01-31

Information cartography

Dafna Shahaf Carlos Guestrin Eric Horvitz Jure Leskovec

A metro map can tell a story, as well as provide good directions.

10.1145/2735624 article EN Communications of the ACM 2015-10-23

Tractable near-optimal policies for crawling

Yossi Azar Eric Horvitz Eyal Lubetzky Yuval Peres Dafna Shahaf

Significance: We present a tractable algorithm that provides a near-optimal solution to the crawling problem, a fundamental challenge at the heart of web search: Given a large quantity of distributed and dynamic web content, what pages do we choose to update in a local cache with the goal of serving up-to-date pages to client requests? Solving this optimization requires identifying the best set of pages to refresh given popularity rates and change rates, an intractable problem in the general case. To overcome this intractability, we show that the optimal randomized strategy...

10.1073/pnas.1801519115 article EN cc-by-nc-nd Proceedings of the National Academy of Sciences 2018-07-23

Language (Re)modelling: Towards Embodied Language Understanding

Ronen Tamari Chen Shani Tom Hope Miriam R. L. Petruck Omri Abend and 1 more

While natural language understanding (NLU) is advancing rapidly, today’s technology differs from human-like language understanding in fundamental ways, notably in its inferior efficiency, interpretability, and generalization. This work proposes an approach to representation and learning based on the tenets of embodied cognitive linguistics (ECL). According to ECL, natural language is inherently executable (like programming languages), driven by mental simulation and metaphoric mappings over hierarchical compositions of structures and schemata learned...

10.18653/v1/2020.acl-main.559 article EN cc-by 2020-01-01

“Alexa, Do You Want to Build a Snowman?” Characterizing Playful Requests to Conversational Agents

Chen Shani Alexander Libov Sofia Tolmach Liane Lewin-Eytan Yoelle Maarek and 1 more

Conversational Agents (CAs) such as Apple's Siri and Amazon's Alexa are well-suited for task-oriented interactions ("Call Jason"), but other interaction types are often beyond their capabilities. One notable example is playful requests: for example, people ask CAs personal questions ("What's your favorite color?") or joke with them, sometimes at their expense ("Find Nemo"). Failing to recognize playfulness causes user dissatisfaction and abandonment, destroying the precious rapport with the CA.

10.1145/3491101.3519870 article EN CHI Conference on Human Factors in Computing Systems Extended Abstracts 2022-04-27

State of What Art? A Call for Multi-Prompt LLM Evaluation

Moran Mizrahi Guy Kaplan Dan Malkin Rotem Dror Dafna Shahaf and 1 more

Recent advances in large language models (LLMs) have led to the development of various evaluation benchmarks. These benchmarks typically rely on a single instruction template for evaluating all LLMs on a specific task. In this paper, we comprehensively analyze the brittleness of results obtained via single-prompt evaluations across 6.5M instances, involving 20 different LLMs and 39 tasks from 3 benchmarks. To improve robustness of the analysis, we propose to evaluate LLMs with a set of diverse prompts instead. We discuss tailored evaluation metrics for specific use...

10.48550/arxiv.2401.00595 preprint EN other-oa arXiv (Cornell University) 2024-01-01
