- Topic Modeling
- Natural Language Processing Techniques
- Intelligent Tutoring Systems and Adaptive Learning
- Biomedical Text Mining and Ontologies
- AI in Service Interactions
- Cancer Genomics and Diagnostics
- Machine Learning and Data Classification
- Machine Learning in Healthcare
- Domain Adaptation and Few-Shot Learning
- Mobile Crowdsensing and Crowdsourcing
- Health Literacy and Information Accessibility
- Human Mobility and Location-Based Analysis
- COVID-19 epidemiological studies
- Artificial Intelligence in Healthcare and Education
- Multimodal Machine Learning Applications
- Mental Health Research Topics
- Educational Games and Gamification
- Speech and dialogue systems
- Data Stream Mining Techniques
- Text Readability and Simplification
- Cultural Competency in Health Care
- Electronic Health Records Systems
- Emotion and Mood Recognition
- Digital Mental Health Interventions
- Humor Studies and Applications
University of California, San Francisco
2019-2024
University of Virginia
2016-2023
Engineering Systems (United States)
2021
City College of San Francisco
2017
Centre National de la Recherche Scientifique
2014
Télécom Paris
2014
Laboratoire Traitement et Communication de l’Information
2014
Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language models' pretraining (Radford et al., 2019). Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping any natural language tasks into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts with diverse wording....
We develop Polaris, the first safety-focused LLM constellation for real-time patient-AI healthcare conversations. Unlike prior LLM works in healthcare focusing on tasks like question answering, our work specifically focuses on long multi-turn voice conversations. Our one-trillion parameter constellation system is composed of several multibillion parameter LLMs as co-operative agents: a stateful primary agent that focuses on driving an engaging conversation and several specialist support agents focused on healthcare tasks performed by nurses to increase safety and reduce hallucinations....
Electronic health records (EHRs) are quickly becoming omnipresent in healthcare, but interoperability issues and technical demands limit their use for biomedical and clinical research. Interactive and flexible software that interfaces directly with EHR data structured around a common data model (CDM) could accelerate more EHR-based research by making the data more accessible to researchers who lack computational expertise and/or domain knowledge.
Significant health disparities exist between Hispanics and the general US population, complicated in part by communication, literacy, and linguistic factors. There are few available Spanish-language interactive, technology-driven education programs that engage patients who have a range of literacy levels. We describe the development of an interactive virtual patient educator for educating and counseling Hispanic women about cervical cancer and human papillomavirus. Specifically, we describe the iterative design methodology...
Early detection of influenza-like symptoms can prevent the widespread transmission of flu viruses and enable timely treatments, particularly in the post-pandemic era. Mobile sensing leverages an increasingly diverse set of embedded sensors to capture fine-grained information about human behaviors and ambient contexts, and can serve as a promising solution for symptom recognition. Traditionally, handcrafted and high-level features of mobile sensing data are extracted by manual feature engineering and convolutional/recurrent neural network...
Electronic health record (EHR) data are increasingly used for biomedical discoveries. The nature of the data, however, requires expertise in both data science and EHR structure. The Observational Medical Outcomes Partnership (OMOP) common data model (CDM) standardizes the language and structure of EHR data to promote interoperability for research. While the OMOP CDM is valuable and more attuned to research purposes, it still requires extensive domain knowledge to utilize effectively, potentially limiting widespread adoption for research and quality improvement.
Training and evaluating language models increasingly requires the construction of meta-datasets: diverse collections of curated data with clear provenance. Natural language prompting has recently led to improved zero-shot generalization by transforming existing, supervised datasets into a diversity of novel pretraining tasks, highlighting the benefits of meta-dataset curation. While successful in general-domain text, translating these data-centric approaches to biomedical language modeling remains challenging, as labeled...
During pandemics, effective interventions require monitoring the problem at different scales and understanding the various tradeoffs between efficacy, privacy, and economic burden. To address these challenges, we propose a framework where we perform Bayesian change-point analysis on aggregate behavior markers extracted from mobile sensing data collected during the COVID-19 pandemic. Results generated by 598 participants for up to four months reveal rich insights: We observe an increase in smartphone usage...
Machine learning techniques applied to the Natural Language Processing (NLP) component of conversational agent development show promising results for improved accuracy and quality of feedback that a conversational agent can provide. The effort required to develop an educational scenario-specific conversational agent is time consuming, as it requires domain experts to label and annotate noisy data sources such as classroom videos. Previous approaches to modeling annotations have relied on labeling thousands of examples and calculating inter-annotator agreement...
Correctly identifying an individual's social context from passively worn sensors holds promise for delivering just-in-time adaptive interventions (JITAIs) to treat anxiety disorder. In this study, we present results using data collected from a within-subject experiment that assessed physiological response across different contexts (i.e., alone vs. with others), phases (i.e., pre- and post-interaction vs. during interaction), interaction sizes (i.e., dyadic vs. group interactions), and levels of threat (i.e., implicit...
Jason Fries, Natasha Seelam, Gabriel Altay, Leon Weber, Myungsun Kang, Debajyoti Datta, Ruisi Su, Samuele Garda, Bo Wang, Simon Ott, Matthias Samwald, Wojciech Kusa. Proceedings of BigScience Episode #5 -- Workshop on Challenges & Perspectives in Creating Large Language Models. 2022.
People with HIV in the United States are aging, with risk for negative health outcomes from social isolation. PositiveLinks is a mobile health (mHealth) intervention that includes an anonymous Community Message Board (CMB) for peer-to-peer conversations. We investigated differences in CMB usage and support between younger (<50 years) and older (≥50) members. We assessed the relationship between age groups and app use using chi-square tests. CMB posts were analyzed qualitatively to categorize forms of support. To have visual...
Animal models remain a cornerstone of research efforts in oncology to model the complexity of cancer progression and discover new therapeutic approaches for disease management. With the advent of genomic manipulation techniques such as CRISPR, and advances in mouse modeling with genetically engineered mice (GEM) and patient-derived xenografts (PDX), we can expect the development of novel and powerful animal models in the near future. These models are expanding our capability for pre-clinical testing of agents or n-of-1 patient-specific...
A growing body of recent evidence has highlighted the limitations of natural language processing (NLP) datasets and classifiers. These include the presence of annotation artifacts in datasets, classifiers relying on shallow features like a single word (e.g., if a movie review has the word "romantic", the review tends to be positive), or unnecessary words (e.g., learning a proper noun to classify a review as positive or negative). The presence of such artifacts has subsequently led to the development of challenging datasets to force the model to generalize better. While a variety of heuristic strategies,...
In most countries around the world, various public policies and guidelines, such as social distancing and stay-at-home orders, have been put in place to slow down the spreading of COVID-19. Relying on traditional surveys to assess policy impacts on community-level behavior changes may lead to biased results, and limit fine-grained understanding of human dynamics over time. We propose to leverage mobile sensing to capture people's footprints amid the COVID-19 pandemic, and understand their collective behavior with respect to existing...