Matthew Matero

ORCID: 0000-0002-8407-4298
Research Areas
  • Mental Health via Writing
  • Topic Modeling
  • Machine Learning in Healthcare
  • Sentiment Analysis and Opinion Mining
  • Authorship Attribution and Profiling
  • Misinformation and Its Impacts
  • Natural Language Processing Techniques
  • Artificial Intelligence in Healthcare
  • Speech Recognition and Synthesis
  • Mental Health Research Topics
  • Substance Abuse Treatment and Outcomes
  • Mental Health Treatment and Access
  • Suicide and Self-Harm Studies
  • Educational Tools and Methods
  • Opioid Use Disorder Treatment

Stony Brook University
2020-2023

Mental health predictive systems typically model language as if from a single context (e.g. Twitter posts, status updates, or forum posts) and often limit analysis to a single level (either the message level or the user level). Here, we bring these pieces together to explore the use of open-vocabulary features (BERT embeddings, topics) and theoretical features (emotional expression lexica, personality) for the task of suicide risk assessment on support forums (the CLPsych-2019 Shared Task). We used dual context-based approaches (modeling...

10.18653/v1/w19-3005 article EN 2019-01-01
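As a rough illustration of combining the two feature families above, one can concatenate an open-vocabulary embedding with lexicon-derived scores before classification. This is a hypothetical sketch, not the authors' actual pipeline; the dimensions and feature names are assumptions.

```python
import numpy as np

# Hypothetical sketch of feature fusion: concatenate a dense
# open-vocabulary representation (e.g., a mean-pooled 768-d BERT-style
# user embedding) with theoretical features (e.g., emotion-lexicon
# scores) into a single vector per user for a downstream classifier.
def fuse_features(embedding, lexicon_scores):
    """Return one flat feature vector: [embedding | lexicon scores]."""
    return np.concatenate([np.asarray(embedding, dtype=float),
                           np.asarray(lexicon_scores, dtype=float)])

user_embedding = np.zeros(768)       # placeholder 768-d embedding
lexicon = [0.12, 0.05, 0.33]         # placeholder lexicon scores
fused = fuse_features(user_embedding, lexicon)
print(fused.shape)                   # (771,)
```

The fused vector can then feed any standard classifier; the point is only that dense and theory-driven features live side by side in one representation.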

In human-level NLP tasks, such as predicting mental health, personality, or demographics, the number of observations is often smaller than the standard 768+ hidden-state sizes of each layer within modern transformer-based language models, limiting the ability to effectively leverage transformers. Here, we provide a systematic study on the role of dimension reduction methods (principal components analysis, factorization techniques, multi-layer auto-encoders) as well as the dimensionality of embedding vectors and sample...

10.18653/v1/2021.naacl-main.357 article EN cc-by Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2021-01-01
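The dimension-reduction setting above can be sketched with plain PCA computed via SVD. This is a minimal illustration of reducing 768-dimensional embeddings when observations are scarce, not the paper's exact method or settings.

```python
import numpy as np

# Minimal PCA sketch (assumed setup, not the paper's pipeline):
# project n x 768 transformer embeddings onto the top-k principal
# components when n is small relative to 768.
def pca_reduce(X, k):
    """Project rows of X (n x d) onto the top-k principal components."""
    Xc = X - X.mean(axis=0, keepdims=True)         # center each feature
    # SVD of the centered matrix; rows of Vt are principal directions
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                            # n x k component scores

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 768))   # e.g., 100 users, 768-d embeddings
Z = pca_reduce(X, 64)
print(Z.shape)                    # (100, 64)
```

Factorization and auto-encoder variants mentioned in the abstract would slot in at the same point, replacing the SVD projection.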

Targeting of location-specific aid for the U.S. opioid epidemic is difficult due to our inability to accurately predict changes in mortality across heterogeneous communities. AI-based language analyses, having recently shown promise in cross-sectional (between-community) well-being assessments, may offer a way to more accurately predict longitudinal community-level overdose mortality. Here, we develop and evaluate TrOP (Transformer for Opioid Prediction), a model for community-specific trend projection that uses social media...

10.1038/s41746-023-00776-0 article EN cc-by npj Digital Medicine 2023-03-08

Much of natural language processing is focused on leveraging large-capacity language models, typically trained over single messages with the task of predicting one or more tokens. However, modeling human language at higher levels of context (i.e., sequences of messages) is under-explored. In stance detection and other social media tasks where the goal is to predict an attribute of a message, we have contextual data that is loosely semantically connected by authorship. Here, we introduce the Message-Level Transformer (MeLT) – a hierarchical...

10.18653/v1/2021.findings-emnlp.253 preprint EN cc-by 2021-01-01

Assessing risk for excessive alcohol use is important for applications ranging from recruitment into research studies to targeted public health messaging. Social media language provides an ecologically embedded source of information for assessing individuals who may be at risk for harmful drinking. Using data collected on 3664 respondents from the general population, we examine how accurately language used on social media classifies individuals as at-risk for alcohol problems based on Alcohol Use Disorder Identification Test-Consumption (AUDIT-C) score benchmarks. We...

10.1111/acer.14807 article EN Alcoholism Clinical and Experimental Research 2022-05-01
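For context, AUDIT-C benchmark classification of the kind referenced above reduces to a simple threshold rule on the screening score. The cutoffs below are the commonly cited at-risk values (4 or more for men, 3 or more for women) and may differ from the exact benchmarks used in the study.

```python
# Illustrative AUDIT-C threshold sketch (commonly cited cutoffs,
# not necessarily the study's exact benchmarks): the AUDIT-C score
# ranges 0-12, and at-risk drinking is flagged at a sex-specific cutoff.
def at_risk(audit_c_score, sex):
    """Return True if the AUDIT-C score meets the at-risk cutoff."""
    cutoff = 4 if sex == "male" else 3
    return audit_c_score >= cutoff

print(at_risk(5, "male"))    # True
print(at_risk(2, "female"))  # False
```

The language-based classifier described in the abstract would be evaluated against labels produced by a rule of this shape.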

Natural language is generated by people, yet traditional language modeling views words or documents as if generated independently. Here, we propose human language modeling (HuLM), a hierarchical extension to the language modeling problem whereby a human level exists to connect sequences of documents (e.g. social media messages) and to capture the notion that human language is moderated by changing human states. We introduce HaRT, a large-scale transformer model for solving HuLM, pre-trained on approximately 100,000 users, and demonstrate its effectiveness in terms of both language modeling (perplexity) and fine-tuning on 4...

10.18653/v1/2022.findings-acl.52 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2022 2022-01-01

Human natural language is expressed at a specific point in time while human emotions change over time. While much work has established a strong link between language use and emotional states, few have attempted to model how those states change over time. Here, we introduce the task of...

10.18653/v1/2020.coling-main.261 article EN cc-by Proceedings of the 28th International Conference on Computational Linguistics 2020-01-01

Swanie Juhng, Matthew Matero, Vasudha Varadarajan, Johannes Eichstaedt, Adithya V Ganesan, H. Andrew Schwartz. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2023.

10.18653/v1/2023.acl-short.128 article EN cc-by 2023-01-01

Social science NLP tasks, such as emotion or humor detection, are required to capture the semantics along with the implicit pragmatics from text, often with limited amounts of training data. Instruction tuning has been shown to improve many capabilities of large language models (LLMs), such as commonsense reasoning, reading comprehension, and computer programming. However, little is known about the effectiveness of instruction tuning on the social domain, where implicit pragmatic cues need to be captured. We explore the use of instruction tuning for social science NLP tasks and introduce...

10.48550/arxiv.2402.01980 preprint EN arXiv (Cornell University) 2024-02-02

Many recent works in natural language processing have demonstrated the ability to assess aspects of mental health from personal discourse. At the same time, pre-trained contextual word embedding models have grown to dominate much of NLP, but little is known empirically about how best to apply them for mental health assessment. Using degree of depression as a case study, we do an empirical analysis of which off-the-shelf language model, individual layers, and combinations of layers seem most promising when applied to human-level tasks. Notably,...

10.18653/v1/2022.wassa-1.9 article EN cc-by 2022-01-01

Recent works have demonstrated the ability to assess aspects of mental health from personal discourse. At the same time, pre-trained contextual word embedding models have grown to dominate much of NLP, but little is known empirically about how best to apply them for mental health assessment. Using degree of depression as a case study, we do an empirical analysis of which off-the-shelf language model, individual layers, and combinations of layers seem most promising when applied to human-level tasks. Notably, we find RoBERTa effective and,...

10.48550/arxiv.2112.13795 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Natural language is generated by people, yet traditional language modeling views words or documents as if generated independently. Here, we propose human language modeling (HuLM), a hierarchical extension to the language modeling problem whereby a human level exists to connect sequences of documents (e.g. social media messages) and to capture the notion that human language is moderated by changing human states. We introduce HaRT, a large-scale transformer model for the HuLM task, pre-trained on approximately 100,000 users, and demonstrate its effectiveness in terms of both language modeling (perplexity) and fine-tuning on 4...

10.48550/arxiv.2205.05128 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Adithya V Ganesan, Vasudha Varadarajan, Juhi Mittal, Shashanka Subrahmanya, Matthew Matero, Nikita Soni, Sharath Chandra Guntuku, Johannes Eichstaedt, H. Andrew Schwartz. Proceedings of the Eighth Workshop on Computational Linguistics and Clinical Psychology. 2022.

10.18653/v1/2022.clpsych-1.25 article EN cc-by 2022-01-01

Much of natural language processing is focused on leveraging large-capacity language models, typically trained over single messages with the task of predicting one or more tokens. However, modeling human language at higher levels of context (i.e., sequences of messages) is under-explored. In stance detection and other social media tasks where the goal is to predict an attribute of a message, we have contextual data that is loosely semantically connected by authorship. Here, we introduce the Message-Level Transformer (MeLT) -- a hierarchical...

10.48550/arxiv.2109.08113 preprint EN other-oa arXiv (Cornell University) 2021-01-01

In human-level NLP tasks, such as predicting mental health, personality, or demographics, the number of observations is often smaller than the standard 768+ hidden-state sizes of each layer within modern transformer-based language models, limiting the ability to effectively leverage transformers. Here, we provide a systematic study on the role of dimension reduction methods (principal components analysis, factorization techniques, multi-layer auto-encoders) as well as the dimensionality of embedding vectors and sample...

10.48550/arxiv.2105.03484 preprint EN other-oa arXiv (Cornell University) 2021-01-01