Catherine Chen

ORCID: 0009-0009-8734-436X
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Explainable Artificial Intelligence (XAI)
  • Online and Blended Learning
  • Online Learning and Analytics
  • Innovative Teaching and Learning Methods
  • Educational Games and Gamification
  • Multimodal Machine Learning Applications
  • Visual and Cognitive Learning Processes
  • Biomedical Text Mining and Ontologies
  • Handwritten Text Recognition Techniques
  • Scientific Computing and Data Management
  • Machine Learning and Algorithms
  • Meta-analysis and systematic reviews
  • Research Data Management Practices
  • Educational Strategies and Epistemologies
  • RNA Research and Splicing
  • Science Education and Perceptions
  • Neurobiology of Language and Bilingualism
  • Semantic Web and Ontologies
  • Advanced Graph Neural Networks
  • Service-Oriented Architecture and Web Services
  • Ethics and Social Impacts of AI
  • ICT Impact and Policies
  • Text Readability and Simplification

Brown University
2023-2024

University of California, Berkeley
2020-2023

University of Tübingen
2023

Massachusetts Institute of Technology
2023

Hebrew University of Jerusalem
2023

Allen Institute
2023

Ball State University
2009-2021

Princeton University
2018

Geisinger Medical Center
1990

Mechanistic interpretability is an emerging diagnostic approach for neural models that has gained traction in broader natural language processing domains. This paradigm aims to provide attribution to components of neural systems where causal relationships between hidden layers and output were previously uninterpretable. As the use of neural models in IR for retrieval and evaluation becomes ubiquitous, we need to ensure that we can interpret why a model produces a given output for both transparency and the betterment of systems. This work comprises a flexible framework...

10.48550/arxiv.2501.10165 preprint EN arXiv (Cornell University) 2025-01-17

Neural Ranking Models (NRMs) have rapidly advanced state-of-the-art performance on information retrieval tasks. In this work, we investigate a Cross-Encoder variant of MiniLM to determine which relevance features it computes and where they are stored. We find that it employs a semantic variant of the traditional BM25 in an interpretable manner, featuring localized components: (1) Transformer attention heads that compute soft term frequency while controlling for term saturation and document length effects, (2) low-rank...

10.48550/arxiv.2502.04645 preprint EN arXiv (Cornell University) 2025-02-06
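
The term frequency saturation and document length effects mentioned in the abstract above are the two normalizations found in traditional BM25. The following is a minimal sketch of the textbook Okapi BM25 per-term score, offered only as background; it is not the cross-encoder analyzed in the paper. The function name, the parameter names, the defaults k1 = 1.2 and b = 0.75, and the smoothed IDF variant are all illustrative assumptions.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, df, n_docs, k1=1.2, b=0.75):
    """Textbook Okapi BM25 score contributed by one query term in one document.

    All names and default values here are illustrative assumptions.
    """
    # Smoothed inverse document frequency: rarer terms get larger weights.
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    # Length normalization: documents longer than average are penalized via b.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # Saturation: the score grows with tf but stays below idf * (k1 + 1).
    return idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)


if __name__ == "__main__":
    # Each extra occurrence of the term adds less than the previous one.
    for tf in (1, 2, 3, 10):
        print(tf, round(bm25_term_score(tf, 100, 100, 10, 1000), 3))
```

In this formulation k1 controls how quickly repeated occurrences stop adding to the score, and b controls how strongly long documents are discounted.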

We present a method for constructing taxonomic trees (e.g., WordNet) using pretrained language models. Our approach is composed of two modules, one that predicts parenthood relations and another that reconciles those predictions into trees. The parenthood prediction module produces likelihood scores for each potential parent-child pair, creating a graph of parent-child relation scores. The tree reconciliation module treats the task as a graph optimization problem and outputs the maximum spanning tree of this graph. We train our model on subtrees sampled from...

10.18653/v1/2021.naacl-main.373 article EN cc-by Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2021-01-01

10.1145/3626772.3657841 article EN Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval 2024-07-10

Interaction of RNA polymerase I transcription factors with a promoter in the nontranscribed spacer of rat ribosomal DNA. S. David Smith (Weis Center for Research, Geisinger Clinic, Danville, PA 17822, USA), Emmanuel Oriahi, Hsin-Fang Yang-Yen, WenQin Xie, Catherine Chen, Lawrence I. Rothblum (to whom correspondence should be addressed). Nucleic Acids Research, Volume 18, Issue 7,...

10.1093/nar/18.7.1677 article EN Nucleic Acids Research 1990-01-01

10.1016/j.jbusres.2009.03.005 article EN Journal of Business Research 2009-03-30

The tremendous amount of data collected from connected devices and social media has created a high demand for new skills to help organizations gain the power of big data. With a shorter completion time, graduate certificates appeared to be a desirable alternative for working professionals to develop these skills. However, it was unclear if the graduate certificates met the job market's needs. Quantitative text analysis was used to analyze data science skills in more than 5,000 job descriptions and the skills taught in 588 required courses of 166 graduate certificates to investigate if the skills needed in the industry were...

10.1080/08874417.2020.1852628 article EN Journal of Computer Information Systems 2021-03-02

Recent work has shown that infusing layout features into language models (LMs) improves processing of visually-rich documents such as scientific papers. Layout-infused LMs are often evaluated on documents with familiar layout features (e.g., papers from the same publisher), but in practice models encounter documents with unfamiliar distributions of layout features, such as new combinations of text sizes and styles, or new spatial configurations of textual elements. In this work we test whether layout-infused LMs are robust to layout distribution shifts. As a case study we use the task...

10.18653/v1/2023.findings-acl.844 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2023 2023-01-01

Turbulent market conditions characterised by very fast and continuous changes are forcing many enterprises to adopt new organisational paradigms of collaboration and cooperation, forming highly dynamic organisations known as Virtual Enterprises (VE) or Virtual Organisations (VO). Implementing Enterprise Resource Planning (ERP) systems in a virtual enterprise setting is an entirely different proposition than implementing an ERP system in a single enterprise environment. Collaborative enterprises have to integrate between their ERP packages...

10.1504/ijmed.2006.009572 article EN International Journal of Management and Enterprise Development 2006-01-01

Instructional theories have shifted from viewing students as reactive learners in the mid-1980s to the current view of students as proactive learners. Emphasis is no longer placed on teachers to adapt instruction to meet the individual student’s mental ability or social-cultural background. In contrast, students are viewed as active participants in their own learning process. Based on the literature and prior research, the purpose of this study is to revise the learning strategy section of the Motivated Strategy for Learning Questionnaire (MSLQ) to reflect the unique strategies...

10.4018/jea.2012040103 article EN International Journal of E-Adoption 2012-04-01

Learning specialized computer skills necessitates using the computer as a problem-solving tool while displaying superior affective skills in an applied learning environment. The Motivational Strategies for Learning Questionnaire (MSLQ) motivation scale assesses students' general motivation to learn, but does not capture motivation and use of effective learning strategies in an applied learning environment. To better understand the difference (or lack of difference) in students' motivation to engage in problem-based learning, a valid instrument is needed. Exploratory factor analyses and confirmatory factor analyses showed that the survey...

10.48009/3_iis_2015_108-118 article EN cc-by-nc-nd Issues in Information Systems 2015-01-01

In Transformer-based language models (LMs) the attention mechanism converts token embeddings into contextual embeddings that incorporate information from neighboring words. The resulting contextual hidden state embeddings have enabled highly accurate models of brain responses, suggesting that the attention mechanism constructs contextual embeddings that carry information reflected in language-related brain representations. However, it is unclear whether the attention weights that are used to integrate information across words are themselves related to language representations in the brain. To address this question we analyzed functional magnetic...

10.1101/2022.12.07.519480 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2022-12-07

10.18848/2327-7963/cgp/v19i02/48899 article EN The International Journal of Pedagogy and Curriculum 2013-01-01

Neural models have demonstrated remarkable performance across diverse ranking tasks. However, the processes and internal mechanisms along which they determine relevance are still largely unknown. Existing approaches for analyzing neural ranker behavior with respect to IR properties rely either on assessing overall model behavior or employing probing methods that may offer an incomplete understanding of causal mechanisms. To provide a more granular understanding of internal model decision-making processes, we propose the use...

10.48550/arxiv.2405.02503 preprint EN arXiv (Cornell University) 2024-05-03

10.1145/3626772.3657796 article EN Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval 2024-07-10

In recent years, large language models (LLMs) have been widely adopted in political science tasks such as election prediction, sentiment analysis, policy impact assessment, and misinformation detection. Meanwhile, the need to systematically understand how LLMs can further revolutionize the field also becomes urgent. In this work, we, a multidisciplinary team of researchers spanning computer science and political science, present the first principled framework termed Political-LLM to advance the comprehensive understanding...

10.48550/arxiv.2412.06864 preprint EN arXiv (Cornell University) 2024-12-09

As information retrieval (IR) systems, such as search engines and conversational agents, become ubiquitous in various domains, the need for transparent and explainable systems grows to ensure accountability, fairness, and unbiased results. Despite many recent advances toward explainable AI and IR techniques, there is no consensus on what it means for a system to be explainable. Although a growing body of literature suggests that explainability is comprised of multiple subfactors [2, 5, 6], virtually all existing approaches treat...

10.1145/3539618.3591792 article EN Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval 2023-07-18

Motivational and self-regulated learning constructs have continued to pose vigorous, multi-dimensional questions for academicians. The challenge of aspiring students to elevate their social-cognitive perspective toward authentic learning, heightening the learners' interpretation of academic interactions, and articulating the role that beliefs, cognitions, affects, and values play in scholastic performance is an evolving paradigm. The Motivational Strategies for Learning Questionnaire (MSLQ) has been used widely to assess students'...

10.48009/3_iis_2017_129-140 article EN cc-by-nc-nd Issues in Information Systems 2017-01-01

Past work has shown that paired vision-language signals substantially improve grammar induction in multimodal datasets such as MSCOCO. We investigate whether advancements in large language models (LLMs) that are only trained with text could provide strong assistance for grammar induction in multimodal settings. We find that our text-only approach, an LLM-based C-PCFG (LC-PCFG), outperforms previous multi-modal methods, and achieves state-of-the-art grammar induction performance for various multimodal datasets. Compared to image-aided grammar induction, LC-PCFG outperforms the prior state-of-the-art by 7.9...

10.48550/arxiv.2212.10564 preprint EN other-oa arXiv (Cornell University) 2022-01-01

The world of entertainment is an ever-changing minefield. With so many factors that influence the industry, it is always shifting and changing. What might be the current situation could be rendered obsolete only a few years down the line. To stay afloat in this fast-paced environment, companies must adapt and overcome changes as soon as they appear. As such, the industry is extremely competitive, with companies large and small all grappling for the attention and money of a diverse consumer base. Companies like Disney, Netflix, Hulu, Tik Tok thrive...

10.2991/assehr.k.211020.268 article EN cc-by-nc Advances in Social Science, Education and Humanities Research 2021-01-01

Information retrieval (IR) systems have become an integral part of our everyday lives. As search engines, recommender systems, and conversational agents are employed across various domains from recreational to clinical decision support, there is an increasing need for transparent and explainable systems to guarantee accountable, fair, and unbiased results. Despite many recent advances towards explainable AI and IR techniques, there is no consensus on what it means for a system to be explainable. Although a growing body of literature suggests that...

10.48550/arxiv.2210.09430 preprint EN cc-by arXiv (Cornell University) 2022-01-01