NFDI4DS | UHH-SEMS - Publication Details

Learning geographical preferences for point-of-interest recommendation

OPENALEX - Publications

Bin Liu Yanjie Fu Zijun Yao Hui Xiong

The problem of point-of-interest (POI) recommendation is to provide personalized recommendations of places of interest, such as restaurants, for mobile users. Due to its complexity and its connection to location-based social networks (LBSNs), the decision process of a user choosing a POI is complex and can be influenced by various factors, such as user preferences, geographical influences, and user mobility behaviors. While there are some studies on POI recommendations, an integrated analysis of the joint effect of multiple factors is lacking. To this end, in...

10.1145/2487575.2487673 article EN 2013-08-11

Dynamic Word Embeddings for Evolving Semantic Discovery

Zijun Yao Yifan Sun Weicong Ding Nikhil Rao Hui Xiong

Word evolution refers to the changing meanings and associations of words throughout time, as a byproduct of human language evolution. By studying word evolution, we can infer social trends and language constructs over different periods of human history. However, traditional techniques such as word representation learning do not adequately capture the evolving language structure and vocabulary. In this paper, we develop a dynamic statistical model to learn time-aware word vector representation. We propose a model that simultaneously learns time-aware embeddings and solves...

10.1145/3159652.3159703 preprint EN 2018-02-02

A General Geographical Probabilistic Factor Model for Point of Interest Recommendation

Bin Liu Hui Xiong Spiros Papadimitriou Yanjie Fu Zijun Yao

The problem of point-of-interest (POI) recommendation is to provide personalized recommendations of places, such as restaurants and movie theaters. The increasing prevalence of mobile devices and of location-based social networks (LBSNs) poses significant new opportunities as well as challenges, which we address. The decision process for a user to choose a POI is complex and can be influenced by numerous factors, such as personal preferences, geographical considerations, and user mobility behaviors. This is further complicated by the connection LBSNs...

10.1109/tkde.2014.2362525 article EN IEEE Transactions on Knowledge and Data Engineering 2014-10-09

POI Recommendation: A Temporal Matching between POI Popularity and User Regularity

Zijun Yao Yanjie Fu Bin Liu Yanchi Liu Hui Xiong

Point of interest (POI) recommendation, which provides personalized recommendation of places to mobile users, is an important task in location-based social networks (LBSNs). However, quite different from traditional interest-oriented merchandise recommendation, POI recommendation is more complex due to the timing effects: we need to examine whether the POI fits a user's availability. While there are some prior studies which included the temporal effect into POI recommendations, they overlooked the compatibility between time-varying popularity of POIs and regular...

10.1109/icdm.2016.0066 article EN 2016-12-01

Representing Urban Functions through Zone Embedding with Human Mobility Patterns

Zijun Yao Yanjie Fu Bin Liu Wangsu Hu Hui Xiong

Urban functions refer to the purposes of land use in cities where each zone plays a distinct role and cooperates with other zones to serve people’s various life needs. Understanding zone functions helps to solve a variety of urban related problems, such as increasing traffic capacity and enhancing location-based service. Therefore, it is beneficial to investigate how to learn the representations of city zones in terms of urban functions, for better supporting urban analytic applications. To this end, in this paper, we propose a framework to learn the vector representation...

10.24963/ijcai.2018/545 article EN 2018-07-01

Exploiting geographic dependencies for real estate appraisal

Yanjie Fu Hui Xiong Yong Ge Zijun Yao Yu Zheng and 1 more

It is traditionally a challenge for home buyers to understand, compare and contrast the investment values of real estates. While a number of estate appraisal methods have been developed to value real property, the performances of these methods have been limited by the traditional data sources for estate appraisal. However, with the development of new ways of collecting estate-related mobile data, there is a potential to leverage geographic dependencies of estates for enhancing estate appraisal. Indeed, the geographic dependencies of the value of an estate can be from the characteristics of its own neighborhood (individual), the values of its nearby estates (peer),...

10.1145/2623330.2623675 article EN 2014-08-22

Sparse Real Estate Ranking with Online User Reviews and Offline Moving Behaviors

Yanjie Fu Yong Ge Yu Zheng Zijun Yao Yanchi Liu and 2 more

Ranking residential real estates based on investment values can provide decision making support for home buyers and thus plays an important role in the estate marketplace. In this paper, we aim to develop methods for ranking estates by mining users' opinions about estates from online user reviews and offline moving behaviors (e.g., taxi traces, smart card transactions, check-ins). While a variety of features could be extracted from these data, these features are intercorrelated and redundant. Thus, selecting good features and integrating the feature...

10.1109/icdm.2014.18 article EN 2014-12-01

User Preference Learning with Multiple Information Fusion for Restaurant Recommendation

Yanjie Fu Bin Liu Yong Ge Zijun Yao Hui Xiong

If properly analyzed, the multi-aspect rating data could be a source of rich intelligence for providing personalized restaurant recommendations. Indeed, while recommender systems have been studied for various applications and many recommendation techniques have been developed for general or specific recommendation tasks, there are few studies on restaurant recommendation that address the unique challenges of multi-aspect restaurant reviews. As we know, traditional collaborative filtering methods are typically developed for single aspect ratings. However, multi-aspect ratings are often collected from the restaurant customers. These...

10.1137/1.9781611973440.54 article EN 2014-04-28

Check Me If You Can: Detecting ChatGPT-Generated Academic Writing using CheckGPT

Zeyan Liu Zijun Yao Fengjun Li Bo Luo

With ChatGPT under the spotlight, utilizing large language models (LLMs) to assist academic writing has drawn a significant amount of debate in the community. In this paper, we aim to present a comprehensive study of the detectability of ChatGPT-generated content within the academic literature, particularly focusing on the abstracts of scientific papers, to offer holistic support for the future development of LLM applications and policies in academia. Specifically, we first present GPABench2, a benchmarking dataset of over 2.8 million comparative samples...

10.48550/arxiv.2306.05524 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Ontology-aware Prescription Recommendation in Treatment Pathways Using Multi-evidence Healthcare Data

Zijun Yao Bin Liu Fei Wang Daby Sow Ying Li

For care of chronic diseases (e.g., depression, diabetes, hypertension), it is critical to identify effective treatment pathways that aim to promptly update the medication following the change of patient state and disease progression. This task is challenging because the optimal treatment pathway for each patient needs to be personalized due to the significant heterogeneity among individuals. Therefore, it is naturally promising to investigate how to use the abundant electronic health records to recommend effective and safe prescriptions. However, prescription...

10.1145/3579994 article EN ACM Transactions on Information Systems 2023-01-12

Recurrent neural networks and attention scores for personalized prediction and interpretation of patient-reported outcomes

Jinxiang Hu Mohsen Nayebi Kerdabadi Xiaohang Mei Joseph Cappelleri Richard J. Barohn and 1 more

We proposed an Interpretable Personalized Artificial Intelligence (AI) model for PRO measures via Recurrent Neural Networks (RNN) and attention scores, with data from an open label randomized clinical trial of pain in 402 participants with cryptogenic sensory polyneuropathy at 40 neurology care clinics. All patients were assigned to four treatment groups: nortriptyline, duloxetine, pregabalin, and mexiletine. Each patient had 4 PRO measures (quality of life SF-12; PROMIS: pain interference, fatigue, sleep disturbance) time...

10.1080/10543406.2025.2469884 article EN Journal of Biopharmaceutical Statistics 2025-03-13

Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament

Yantao Liu Zijun Yao Rui Min Yixin Cao Lei Hou and 1 more

Best-of-N (BoN) sampling, a common strategy for test-time scaling of Large Language Models (LLMs), relies on reward models to select the best candidate solution from multiple generations. However, traditional reward models often assign arbitrary and inconsistent scores, limiting their effectiveness. To address this, we propose a Pairwise Reward Model (Pairwise RM) combined with a knockout tournament for BoN sampling. Instead of assigning absolute scores, given one math problem, Pairwise RM evaluates two candidate solutions' correctness...

10.48550/arxiv.2501.13007 preprint EN arXiv (Cornell University) 2025-01-22

Designing GenAI Tools for Personalized Learning Implementation: Theoretical Analysis and Prototype of a Multi-Agent System

Ling Zhang Zijun Yao Arya Hadizadeh Moghaddam

Educator preparation, personalized learning (PL) implementation, and applications of Generative AI converge as three interrelated systems that, when carefully designed, can help achieve the long-sought goal of providing inclusive education for all learners. However, realizing this potential comes with challenges resulting from theoretical complexities and technological constraints. This article provides a theoretical analysis of the complex interconnectedness among these three systems, guided by Cultural-Historical Activity Theory...

10.1177/00224871251325109 article EN Journal of Teacher Education 2025-03-19

Computing Co-Location Patterns in Spatial Data with Extended Objects: A Scalable Buffer-Based Approach

Yong Ge Zijun Yao Huayu Li

Spatial co-location patterns are subsets of spatial features usually located together in geographic space. Recent literature has provided different approaches to discover co-location patterns over point spatial data. However, most approaches consider the neighborhood relationship among spatial objects as binary and are mainly designed for point spatial features, and are thus not appropriate for extended spatial objects such as line strings and polygons, among which the neighborhood relationship is naturally continuous. This paper adopts a buffer-based model for measuring the spatial relationship of extended objects and mining co-location patterns. While the buffer-based model has several advantages, it involves high...

10.1109/tkde.2019.2930598 article EN publisher-specific-oa IEEE Transactions on Knowledge and Data Engineering 2019-07-23

Modeling of Geographic Dependencies for Real Estate Ranking

Yanjie Fu Hui Xiong Yong Ge Yu Zheng Zijun Yao and 1 more

It is traditionally a challenge for home buyers to understand, compare, and contrast the investment value of real estate. Although a number of appraisal methods have been developed to value real properties, the performances of these methods have been limited by the traditional data sources for estate appraisal. With the development of new ways of collecting estate-related mobile data, there is a potential to leverage geographic dependencies of real estate for enhancing estate appraisal. Indeed, the geographic dependencies of the investment value of an estate can be from the characteristics of its own neighborhood (individual), the values of its nearby estates (peer),...

10.1145/2934692 article EN ACM Transactions on Knowledge Discovery from Data 2016-08-27

On the Detectability of ChatGPT Content: Benchmarking, Methodology, and Evaluation through the Lens of Academic Writing

Z.Q. Liu Zijun Yao Fengjun Li Bo Luo

With ChatGPT under the spotlight, utilizing large language models (LLMs) to assist academic writing has drawn a significant amount of debate in the community. In this paper, we aim to present a comprehensive study of the detectability of ChatGPT-generated content within the academic literature, particularly focusing on the abstracts of scientific papers, to offer holistic support for the future development of LLM applications and policies in academia. Specifically, we first present GPABench2, a benchmarking dataset of over 2.8 million comparative samples...

10.1145/3658644.3670392 article EN cc-by 2024-12-02

eXITs: An Ensemble Approach for Imputing Missing EHR Data

James Codella Hillol Sarker Prithwish Chakraborty Mohamed Ghalwash Zijun Yao and 1 more

Missing data points are prevalent in electronic health records (EHRs) and are an impediment to utilizing machine learning for predictive and classification tasks in healthcare. For this challenge, we developed eXITs - a stacked ensemble learner that employs 6 base models to perform imputation on time series data from 13 different laboratory tests across 8,267 patients from the MIMIC-III database provided by the ICHI 2019 Data Analytics Challenge on Missing data Imputation (DACMI). The results show our model (avg. nRMSE = 0.200)...

10.1109/ichi.2019.8904779 article EN 2019-06-01

Reverse That Number! Decoding Order Matters in Arithmetic Learning

Daniel Zhang-Li Nianyi Lin Jifan Yu Zheyuan Zhang Zijun Yao and 4 more

Recent advancements in pretraining have demonstrated that modern Large Language Models (LLMs) possess the capability to effectively learn arithmetic operations. However, despite acknowledging the significance of digit order in arithmetic computation, current methodologies predominantly rely on sequential, step-by-step approaches for teaching LLMs arithmetic, resulting in a conclusion where obtaining better performance involves fine-grained step-by-step. Diverging from this conventional path, our work introduces...

10.48550/arxiv.2403.05845 preprint EN arXiv (Cornell University) 2024-03-09

Contrastive Learning on Medical Intents for Sequential Prescription Recommendation

Arya Hadizadeh Moghaddam Mohsen Nayebi Kerdabadi Mei Liu Zijun Yao

Recent advancements in sequential modeling applied to Electronic Health Records (EHR) have greatly influenced prescription recommender systems. While the recent literature on drug recommendation has shown promising performance, the study of discovering a diversity of coexisting temporal relationships at the level of medical codes over consecutive visits remains less explored. The goal of this study can be motivated from two perspectives. First, there is a need to develop a sophisticated sequential model capable of disentangling the complex...

10.1145/3627673.3679836 preprint EN 2024-10-20

Contrastive Learning of Temporal Distinctiveness for Survival Analysis in Electronic Health Records

Mohsen Nayebi Kerdabadi Arya Hadizadeh Moghaddam Bin Liu Mei Liu Zijun Yao

Survival analysis plays a crucial role in many healthcare decisions, where the risk prediction for the events of interest can support an informative outlook for a patient's medical journey. Given the existence of data censoring, an effective way of survival analysis is to enforce the pairwise temporal concordance between censored and observed data, aiming to utilize the time interval before censoring as partially observed time-to-event labels for supervised learning. Although existing studies mostly employed ranking methods to pursue an ordering...

10.1145/3583780.3614824 article EN cc-by 2023-10-21

Image inpainting algorithm based on partial differential equation technique

S J Li Zijun Yao

Image inpainting is the process of filling in missing parts of damaged images based on information gleaned from surrounding areas. In this paper, we present two variational models for image inpainting. Combining the two models, we can simultaneously fill in missing, corrupted or undesirable information while removing noise. We explain that the diffusion performance of the proposed models is essentially superior to that of the TV model by analysing the physical characteristics in local coordinates, and investigate the existence of minimising functionals in BV...

10.1179/1743131x11y.0000000055 article EN The Imaging Science Journal 2012-03-08

The Impact of Community Safety on House Ranking

Zijun Yao Yanjie Fu Bin Liu Hui Xiong

Proceedings of the 2016 SIAM International Conference on Data Mining (SDM), pp. 459-467. It is well recognized that community safety, which affects people's right to live without fear of crime, has considerable...

10.1137/1.9781611974348.52 article EN 2016-06-30

Multi-View Multi-Task Campaign Embedding for Cold-Start Conversion Rate Forecasting

Zijun Yao Deguang Kong Miao Lu Xiao Bai Jian Yang and 1 more

In online advertising, it is critical for advertisers to forecast the conversion rate (CVR) of campaigns. Previous work on campaign forecasting concentrates on time-series analysis, which depends on the availability of a length of campaign history. However, these approaches become inadequate for cold-start campaigns, which lack observation from the past. In this work, we attempt to mitigate this challenge by learning an unsupervised and composite campaign embedding to capture multi-view semantic relationships of campaign information, and consequently using nearest neighbor...

10.1109/tbdata.2022.3162150 article EN IEEE Transactions on Big Data 2022-03-24

Invernet: An Inversion Attack Framework to Infer Fine-Tuning Datasets through Word Embeddings

Ishrak Hayet Zijun Yao Bo Luo

Word embedding aims to learn the dense representation of words and has become a regular input preparation in many NLP tasks. Due to the data and computation intensive nature of learning embeddings from scratch, a more affordable way is to borrow the pretrained embedding available in public and fine-tune the embedding through a domain specific downstream dataset. A privacy concern can arise if a malicious owner of the pretrained embedding gets access to the fine-tuned embedding and tries to infer the critical information from the downstream datasets. In this study, we propose a novel embedding inversion framework called Invernet that...

10.18653/v1/2022.findings-emnlp.368 article EN cc-by 2022-01-01

Plasma Cell-Free DNA Is a Potential Biomarker for Diagnosis of Calcific Aortic Valve Disease

Wangge Ma Wei Zhang Huahua Liu Benheng Qian R. Lai and 4 more

Introduction: Calcific aortic valve disease (CAVD) is the third most common cardiovascular disease in aging populations. Despite a growing number of biomarkers having been shown to be associated with CAVD, a marker suitable for routine testing in clinical practice is still needed. Plasma cell-free DNA (cfDNA) has been suggested as a biomarker for diagnosis and prognosis of multiple diseases. In this study, we aimed to test whether cfDNA could be used as a biomarker for CAVD....

10.1159/000534229 article EN cc-by Cardiology 2023-10-27