Jonathan Bragg

ORCID: 0000-0001-5460-9047
Research Areas
  • Topic Modeling
  • Mobile Crowdsensing and Crowdsourcing
  • Software Engineering Research
  • Natural Language Processing Techniques
  • Advanced Text Analysis Techniques
  • Data Stream Mining Techniques
  • Scientific Computing and Data Management
  • Digital Accessibility for Disabilities
  • Auction Theory and Applications
  • Web Data Mining and Analysis
  • Data Visualization and Analytics
  • Semantic Web and Ontologies
  • Text Readability and Simplification
  • Tactile and Sensory Interactions
  • Recommender Systems and Techniques
  • AI in Service Interactions
  • Multimodal Machine Learning Applications
  • Music Technology and Sound Studies
  • Machine Learning and Data Classification
  • Data Quality and Management
  • Multi-Agent Systems and Negotiation
  • Personal Information Management and User Behavior
  • Open Source Software Innovations
  • Artificial Intelligence in Law
  • Education and Critical Thinking Development

Allen Institute for Artificial Intelligence
2021-2024

Allen Institute
2022-2024

University of Washington
2013-2022

University of Pennsylvania
2022

University of California, Berkeley
2022

Harvard University Press
2011

Dimagi (United States)
2010

Recent work has introduced CASCADE, an algorithm for creating a globally-consistent taxonomy by crowdsourcing microwork from many individuals, each of whom may see only a tiny fraction of the data (Chilton et al. 2013). While CASCADE needs only unskilled labor and produces taxonomies whose quality approaches that of human experts, it uses significantly more labor than experts. This paper presents DELUGE, an improved workflow that produces taxonomies with comparable quality using less crowd labor. Specifically, our method multi-label...

10.1609/hcomp.v1i1.13091 article EN Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 2013-11-03

The volume of scientific output is creating an urgent need for automated tools to help scientists keep up with developments in their field. Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover and understand scientific literature. We combine public and proprietary data sources using state-of-the-art techniques for scholarly PDF content extraction and automatic knowledge graph construction to build the Semantic Scholar Academic Graph, the largest open scientific literature graph to-date, with 200M+ papers, 80M+...

10.48550/arxiv.2301.10140 preprint EN cc-by arXiv (Cornell University) 2023-01-01

When seeking information not covered in patient-friendly documents, healthcare consumers may turn to the research literature. Reading medical papers, however, can be a challenging experience. To improve access to medical papers, we explore four features enabled by natural language processing: definitions of unfamiliar terms, in-situ plain language section summaries, a collection of key questions that guides readers to answering passages, and plain language summaries of those passages. We embody these features into a prototype system, Paper Plain ....

10.1145/3589955 article EN cc-by ACM Transactions on Computer-Human Interaction 2023-04-01

Crowd workers are human and thus sometimes make mistakes. In order to ensure the highest quality output, requesters often issue redundant jobs with gold test questions and sophisticated aggregation mechanisms based on expectation maximization (EM). While these methods yield accurate results in many cases, they fail on extremely difficult problems with local minima, such as situations where the majority of workers get the answer wrong. Indeed, this has caused some researchers to conclude that on some tasks crowdsourcing can never...

10.1609/hcomp.v4i1.13270 article EN Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 2016-09-21

Angli Liu, Stephen Soderland, Jonathan Bragg, Christopher H. Lin, Xiao Ling, Daniel S. Weld. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016.

10.18653/v1/n16-1104 article EN cc-by Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2016-01-01

Few-shot NLP research is highly active, yet conducted in disjoint research threads with evaluation suites that lack challenging-yet-realistic testing setups and fail to employ careful experimental design. Consequently, the community does not know which techniques perform best or even if they outperform simple baselines. In response, we formulate the FLEX Principles, a set of requirements and best practices for unified, rigorous, valid, and cost-sensitive few-shot NLP evaluation. These principles include Sample Size...

10.48550/arxiv.2107.07170 preprint EN other-oa arXiv (Cornell University) 2021-01-01

When reading a scholarly article, inline citations help researchers contextualize the current article and discover relevant prior work. However, it can be challenging to prioritize and make sense of the hundreds of citations encountered during literature reviews. This paper introduces CiteSee, a paper reading tool that leverages a user's publishing, reading, and saving activities to provide personalized visual augmentations and context around citations. First, CiteSee connects the current paper to familiar contexts by surfacing known citations a user had cited or opened....

10.1145/3544548.3580847 preprint EN 2023-04-19

Scholars who want to research a scientific topic must take time to read, extract meaning, and identify connections across many papers. As scientific literature grows, this becomes increasingly challenging. Meanwhile, authors summarize prior research in papers' related work sections, though this is scoped to support a single paper. A formative study found that while reading multiple related work paragraphs helps overview a topic, it is hard to navigate overlapping and diverging references and research foci. In this work, we design a system, Relatedly, that scaffolds...

10.1145/3544548.3580841 preprint EN 2023-04-19

High-quality alt text is crucial for making scientific figures accessible to blind and low-vision readers. Crafting complete, accurate alt text is challenging even for domain experts, as published figures often depict complex visual information and readers have varied informational needs. These challenges, along with high diversity in figure types and domain-specific details, also limit the usefulness of fully automated approaches. Consequently, the prevalence of high-quality alt text is very low in scientific papers today. We investigate whether and how...

10.1145/3640543.3645212 article EN cc-by 2024-03-18

Requesters on crowdsourcing platforms, such as Amazon Mechanical Turk, routinely insert gold questions to verify that a worker is diligent and is providing high-quality answers. However, there is no clear understanding of when and how many gold questions to insert. Typically, requesters mix a flat 10-30% of gold questions into the task stream of every worker. This static policy is arbitrary and wastes valuable budget --- the exact percentage is often chosen with little experimentation, and, more importantly, it does not adapt to individual workers, the current...

10.5555/2936924.2937066 article EN Adaptive Agents and Multi-Agents Systems 2016-05-09

The ever-increasing pace of scientific publication necessitates methods for quickly identifying relevant papers. While neural recommenders trained on user interests can help, they still result in long, monotonous lists of suggested papers. To improve the discovery experience we introduce multiple new methods for augmenting recommendations with textual relevance messages that highlight knowledge-graph connections between recommended papers and a user's interaction history. We explore associations mediated by author...

10.1145/3491102.3517470 article EN CHI Conference on Human Factors in Computing Systems 2022-04-28

When reading a scholarly paper, scientists oftentimes wish to understand how follow-on work has built on or engages with what they are reading. While a paper itself can only discuss prior work, some scientific search engines can provide a list of all subsequent citing papers; unfortunately, they are undifferentiated and disconnected from the contents of the original reference paper. In this work, we introduce a novel paper reading experience that integrates relevant information about follow-on work directly into a paper, allowing readers to learn about newer papers and see...

10.1145/3490099.3511162 article EN 2022-03-21

Scholars need to keep up with an exponentially increasing flood of scientific papers. To aid this challenge, we introduce Scim, a novel intelligent interface that helps experienced researchers skim – or rapidly review – a paper to attain a cursory understanding of its contents. Scim supports the skimming process by highlighting salient paper contents in order to direct a reader's attention. The system's highlights are faceted by content type, evenly distributed across a paper, and have a density configurable by readers at...

10.1145/3581641.3584034 article EN 2023-03-27

Traditional approaches for ensuring high quality crowdwork have failed to achieve high-accuracy on difficult problems. Aggregating redundant answers often fails on the hardest problems when the majority is confused. Argumentation has been shown to be effective in mitigating these drawbacks. However, existing argumentation systems only support limited interactions and show workers general justifications, not context-specific arguments targeted to their reasoning. This paper presents Cicero, a new workflow...

10.1145/3290605.3300761 article EN 2019-04-29

When seeking information not covered in patient-friendly documents, like medical pamphlets, healthcare consumers may turn to the research literature. Reading medical papers, however, can be a challenging experience. To improve access to medical papers, we introduce a novel interactive interface, Paper Plain, with four features powered by natural language processing: definitions of unfamiliar terms, in-situ plain language section summaries, a collection of key questions that guide readers to answering passages, and plain language summaries of the answering passages. We...

10.48550/arxiv.2203.00130 preprint EN cc-by arXiv (Cornell University) 2022-01-01

In order to help scholars understand and follow a research topic, significant research has been devoted to creating systems that help scholars discover relevant papers and authors. Recent approaches have shown the usefulness of highlighting relevant authors while scholars engage in paper discovery. However, these systems do not capture and utilize users' evolving knowledge of authors. We reflect on the design space and introduce ComLittee, a literature discovery system that supports author-centric exploration. In contrast to paper-centric interaction in prior systems, ComLittee's...

10.1145/3544548.3581371 preprint EN 2023-04-19

An ideal crowdsourcing or citizen-science system would route tasks to the most appropriate workers, but the best assignment is unclear because workers have varying skill, tasks have varying difficulty, and assigning several workers to a single task may significantly improve output quality. This paper defines a space of task routing problems, proves that even the simplest is NP-hard, and develops approximation algorithms for parallel routing problems. We show that an intuitive class of requesters' utility functions is submodular, which lets us provide iterative...

10.1609/hcomp.v2i1.13170 article EN Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 2014-09-05

While crowdsourcing enables data collection at scale, ensuring high-quality data remains a challenge. In particular, effective task design underlies nearly every reported crowdsourcing success, yet remains difficult to accomplish. Task design is hard because it involves a costly iterative process: identifying the kind of work output one wants, conveying this information to workers, observing worker performance, understanding what remains ambiguous, revising the instructions, and repeating the process until the resulting output is satisfactory. To facilitate...

10.1145/3242587.3242598 article EN 2018-10-11

Mainstream crowdwork platforms treat microtasks as indivisible units; however, in this article, we propose that there is value in re-examining this assumption. We argue that crowdwork platforms can improve their value proposition for all stakeholders by supporting subcontracting within microtasks. After describing the value proposition of subcontracting, we then define three models for microtask subcontracting: real-time assistance, task management, and task improvement, and reflect on potential use cases and implementation considerations associated with each....

10.1145/3025453.3025687 article EN 2017-05-02

Figures in scientific publications contain important information and results, and alt text is needed for blind and low vision readers to engage with their content. We conduct a study to characterize the semantic content of alt text in HCI publications based on a framework introduced by Lundgard and Satyanarayan. Our study focuses on alt text for graphs, charts, and plots extracted from HCI and accessibility publications; we focus on these communities due to the lack of alt text in papers published outside of these disciplines. We find that the capacity of author-written alt text to fulfill user needs is mixed; for example, only...

10.1145/3517428.3544796 preprint EN 2022-10-22

Research consumption has been traditionally limited to the reading of academic papers, a static, dense, and formally written format. Alternatively, pre-recorded conference presentation videos, which are more dynamic, concise, and colloquial, have recently become more widely available but potentially under-utilized. In this work, we explore the design space and benefits for combining academic papers and talk videos to leverage their complementary nature to provide a rich and fluid research consumption experience. Based on formative and co-design...

10.1145/3586183.3606770 preprint EN cc-by 2023-10-21

With the rapid growth of scholarly archives, researchers subscribe to "paper alert" systems that periodically provide them with recommendations of recently published papers that are similar to previously collected papers. However, researchers sometimes struggle to make sense of nuanced connections between recommended papers and their own research context, as existing systems only present paper titles and abstracts. To help researchers spot these connections, we present PaperWeaver, an enriched paper alerts system that provides contextualized text descriptions of recommended papers based on...

10.1145/3613904.3642196 preprint EN cc-by 2024-05-11

Scholarly publications are key to the transfer of knowledge from scholars to others. However, research papers are information-dense, and as the volume of the scientific literature grows, the need for new technology to support the reading process grows. In contrast to the process of finding papers, which has been transformed by Internet technology, the experience of reading research papers has changed little in decades. The PDF format for sharing research papers is widely used due to its portability, but it has significant downsides including: static content, poor accessibility for low-vision readers,...

10.48550/arxiv.2303.14334 preprint EN cc-by arXiv (Cornell University) 2023-01-01