NFDI4DS | UHH-SEMS - Publication Details

Xiaoyan Lin

ORCID: 0000-0003-2348-7977

About

Contact & Profiles

A5064719403

Research Areas

Statistical Methods and Bayesian Inference
Statistical Methods and Inference
Handwritten Text Recognition Techniques
Bayesian Methods and Mixture Models
Mathematics, Computing, and Information Processing
Statistical Distribution Estimation and Applications
Algorithms and Data Compression
Image Retrieval and Classification Techniques
Spatial and Panel Data Analysis
Fluid Dynamics and Vibration Analysis
Reliability and Agreement in Measurement
Computer Graphics and Visualization Techniques
Image Processing and 3D Reconstruction
Lattice Boltzmann Simulation Studies
Advanced Image and Video Retrieval Techniques
Transport and Economic Policies
Efficiency Analysis Using DEA
Advanced Causal Inference Techniques
Differential Equations and Numerical Methods
Face recognition and analysis
Video Analysis and Summarization
Open Education and E-Learning
Machine Learning and ELM
Advanced Database Systems and Queries
Facial Nerve Paralysis Treatment and Research

University of South Carolina
2011-2024

Peking University
2011-2023

Arkansas Children's Nutrition Center
2023

University of Arkansas for Medical Sciences
2023

Wageningen University & Research
2023

Dalian Polytechnic University
2023

University of Hong Kong
2023

The University of Western Australia
2023

North Carolina State University
2023

University of Nottingham
2023

Mathematical Formula Identification in PDF Documents

OPENALEX - Publications

Xiaoyan Lin, Liangcai Gao, Zhi Tang, Xiaofan Lin, Xuan Hu

Recognizing mathematical expressions in PDF documents is a new and important field in document analysis. It is quite different from extracting them from image-based documents. In this paper, we propose a novel method that combines rule-based and learning-based methods to detect both isolated and embedded formulas. Moreover, various features of formulas, including geometric layout, character, and context content, are used to adapt to a wide range of formula types. Experimental results show satisfactory performance of the proposed method....

10.1109/icdar.2011.285 article EN International Conference on Document Analysis and Recognition 2011-09-01

Bayesian proportional hazards model for current status data with monotone splines

Bo Cai, Xiaoyan Lin, Lianming Wang

10.1016/j.csda.2011.03.013 article EN Computational Statistics & Data Analysis 2011-04-05

A semiparametric probit model for case 2 interval‐censored failure time data

Xiaoyan Lin, Lianming Wang

Interval‐censored data occur naturally in many fields, and the main feature is that the failure time of interest is not observed exactly but is known to fall within some interval. In this paper, we propose a semiparametric probit model for analyzing case 2 interval‐censored data as an alternative to the existing models in the literature. Specifically, we approximate the unknown nonparametric nondecreasing function with a linear combination of monotone splines, leading to only a finite number of parameters to estimate. Both maximum...

10.1002/sim.3832 article EN Statistics in Medicine 2010-01-12

A Bayesian proportional hazards model for general interval-censored data

Xiaoyan Lin, Bo Cai, Lianming Wang, Zhigang Zhang

10.1007/s10985-014-9305-9 article EN Lifetime Data Analysis 2014-08-06

Mathematical formula identification and performance evaluation in PDF documents

Xiaoyan Lin, Liangcai Gao, Zhi Tang, Josef B. Baker, Volker Sorge

10.1007/s10032-013-0216-1 article EN International Journal on Document Analysis and Recognition (IJDAR) 2013-12-20

A mathematics retrieval system for formulae in layout presentations

Xiaoyan Lin, Liangcai Gao, Xuan Hu, Zhi Tang, Yingnan Xiao, and 1 more

The semantics of mathematical formulae depend on their spatial structure, and they usually exist in layout presentations such as PDF, LaTeX, and Presentation MathML, which challenges previous text index and retrieval methods. This paper proposes an innovative mathematics retrieval system, along with novel algorithms, which enables efficient formula retrieval from both webpages and PDF documents. Unlike prior studies, which require users to manually input a markup language query, the new system lets users "copy" queries directly. Furthermore, by using a...

10.1145/2600428.2609611 article EN 2014-07-03

WikiMirs

Xuan Hu, Liangcai Gao, Xiaoyan Lin, Zhi Tang, Xiaofan Lin, and 1 more

Mathematical formulae in structural formats such as MathML and LaTeX are becoming increasingly available. Moreover, repositories and websites, including ArXiv and Wikipedia, and growing numbers of digital libraries use these formats to present mathematical formulae. This presents an important new and challenging area of research, namely Mathematical Information Retrieval (MIR). In this paper, we propose WikiMirs, a tool to facilitate mathematical formula retrieval in Wikipedia. WikiMirs is aimed at searching for similar mathematical formulae based upon both textual...

10.1145/2467696.2467699 article EN 2013-07-22

Pain intensity estimation based on a spatial transformation and attention CNN

Xuwu Xin, Xiaoyan Lin, Sheng-Fu Yang, Xin Zheng

Models designed to detect abnormalities that reflect disease from facial structures are an emerging area of research for automated analysis, which has important potential value in smart healthcare applications. However, most of the proposed models directly analyze the whole face image containing background information, and rarely consider the effects of different regions on the analysis results. Therefore, in view of these effects, we propose an end-to-end attention network with spatial transformation to estimate pain...

10.1371/journal.pone.0232412 article EN cc-by PLoS ONE 2020-08-21

Bayesian semiparametric model for spatially correlated interval-censored survival data

Chun Pan, Bo Cai, Lianming Wang, Xiaoyan Lin

10.1016/j.csda.2013.11.016 article EN Computational Statistics & Data Analysis 2014-01-03

Bayesian Proportional Odds Models for Analyzing Current Status Data: Univariate, Clustered, and Multivariate

Xiaoyan Lin, Lianming Wang

Current status data commonly arise in many fields such as epidemiological studies and cross-sectional tumorigenicity studies. In this article, we propose a semiparametric Bayesian approach for analyzing current status data with the proportional odds model. The use of monotone splines for the baseline function and a novel augmentation with Poisson latent variables enable simple updating of all parameters in the posterior computation. The proposed approach shows good performance and is compared with Wang and Dunson (2010)...

10.1080/03610918.2011.566971 article EN Communications in Statistics - Simulation and Computation 2011-04-19

Identification of embedded mathematical formulas in PDF documents using SVM

Xiaoyan Lin, Liangcai Gao, Zhi Tang, Xuan Hu, Xiaofan Lin

With the tremendous popularity of the PDF format, recognizing mathematical formulas in PDF documents has become a new and important problem in the document analysis field. In this paper, we present a method for embedded formula identification in PDF documents, based on a Support Vector Machine (SVM). The method first segments text lines into words, then classifies each word into two classes, namely formula or ordinary text. Various features of formulas, including geometric layout, character, and context content, are utilized to build a robust and adaptable...

10.1117/12.912445 article EN Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE 2011-11-30

Bayesian hierarchical latent class models for estimating diagnostic accuracy

Chunling Wang, Xiaoyan Lin, Kerrie P. Nelson

The diagnostic accuracy of a test or rater has a crucial impact on clinical decision making. The assessment of diagnostic accuracy for multiple tests or raters also merits much attention. A Bayesian hierarchical conditional independence latent class model for estimating the sensitivities and specificities of a large group of tests or raters is proposed, which is applicable to both with-gold-standard and without-gold-standard situations. Through the hierarchical structure, not only are the individual sensitivities and specificities estimated, but also the performance of the whole group of tests. For a small group of raters, the proposed model further...

10.1177/0962280219852649 article EN Statistical Methods in Medical Research 2019-05-30

Discovery of a Novel Ketohexokinase Inhibitor with Improved Drug Distribution in Target Tissue for the Treatment of Fructose Metabolic Disease

Guodong Zhu, Jiao Li, Xiaoyan Lin, Zhen Zhang, Tao Hu, and 2 more

Excessive fructose absorption and its subsequent metabolism are implicated in nonalcoholic fatty liver disease, obesity, and insulin resistance in humans. Ketohexokinase (KHK) is a primary enzyme involved in fructose metabolism via the conversion of fructose to fructose-1-phosphate. KHK inhibition might be a potential approach for the treatment of metabolic disorders. Herein, a series of novel KHK inhibitors were designed, synthesized, and evaluated. Among them, compound 14 exhibited more potent activity than PF-06835919 based on rat assay...

10.1021/acs.jmedchem.3c00715 article EN Journal of Medicinal Chemistry 2023-09-28

Performance Evaluation of Mathematical Formula Identification

Xiaoyan Lin, Liangcai Gao, Zhi Tang, Xiaofan Lin, Xuan Hu

This paper presents a performance evaluation system for mathematical formula identification. First, a ground-truth dataset is constructed to facilitate the comparison of different formula identification algorithms. Statistical analysis shows that the diversity of the dataset reflects real-world documents. Second, a performance evaluation metric is proposed, including error type definitions and scenario-adjustable scoring. The proposed metric enables in-depth evaluation of formula identification systems in different scenarios. Finally, based on the proposed metric, a tool is developed to automatically evaluate formula identification results. It...

10.1109/das.2012.68 article EN 2012-03-01

Bayesian variable selection in joint modeling of longitudinal data and interval-censored failure time data

Yuchen Mao, Lianming Wang, Xiaoyan Lin, Xuemei Sui

Joint modeling of longitudinal data and survival data has gained great attention in the last two decades. However, most existing studies have focused on right-censored survival data. In this article, we study the joint analysis of longitudinal data and interval-censored survival data and conduct variable selection in a Bayesian framework. A new joint model is proposed with a shared frailty to characterize the dependence between the two types of responses, where the longitudinal response is modeled by a semiparametric linear mixed-effects submodel and the survival time by a normal frailty probit submodel. Several...

10.21203/rs.3.rs-4254893/v1 preprint EN cc-by Research Square (Research Square) 2024-04-18

A Text Line Detection Method for Mathematical Formula Recognition

Xiaoyan Lin, Liangcai Gao, Zhi Tang, Josef B. Baker, Mohamed Alkalai, and 1 more

Text line detection is a prerequisite procedure for mathematical formula recognition; however, many incorrectly segmented text lines are often produced due to the two-dimensional structures of mathematics when using existing segmentation methods such as Projection Profiles Cutting or white space analysis. In consequence, mathematical formula recognition is adversely affected by these incorrectly detected text lines, with errors propagating through further processes. Aimed at mathematical formula recognition, we propose a text line detection method to produce reliable line segmentation. Based on...

10.1109/icdar.2013.75 article EN 2013-08-01

A Bayesian approach for analyzing case 2 interval-censored data under the semiparametric proportional odds model

Lianming Wang, Xiaoyan Lin

10.1016/j.spl.2011.02.034 article EN Statistics & Probability Letters 2011-03-05

Pain expression assessment based on a locality and identity aware network

Xuwu Xin, Xiaowu Li, Sheng-Fu Yang, Xiaoyan Lin, Xin Zheng

In clinical medicine, the feeling of pain is a significant indicator of the medical condition of patients. Of late, automatic pain assessment methods have received more and more interest. Many researchers have proposed corresponding methods and achieved impressive results. However, they always ignore the locality and individual differences of painful expression. Therefore, a locality and identity aware network (LIAN) is presented here. Concretely, for the locality characteristic, a module consisting of a two‐branch structure, with feature and attention branches, is presented....

10.1049/ipr2.12282 article EN IET Image Processing 2021-06-28

A Bayesian approach for semiparametric regression analysis of panel count data

Jianhong Wang, Xiaoyan Lin

10.1007/s10985-019-09471-3 article EN Lifetime Data Analysis 2019-04-15

A Bayesian semiparametric accelerated failure time model for arbitrarily censored data with covariates subject to measurement error

Xiaoyan Lin

A flexible Bayesian semiparametric accelerated failure time (AFT) model is proposed for analyzing arbitrarily censored survival data with covariates subject to measurement error. Specifically, the baseline error distribution in the AFT model is nonparametrically modeled as a Dirichlet process mixture of normals. Classical measurement error models are imposed for the covariates subject to measurement error. An efficient and easy-to-implement Gibbs sampler, based on the stick-breaking formulation combined with the techniques of retrospective and slice sampling, is developed for the posterior...

10.1080/03610918.2014.977919 article EN Communications in Statistics - Simulation and Computation 2015-05-18

A comprehensive image processing suite for book re-mastering

Jingtao Fan, Xiaoyan Lin, Steven J. Simske

Converting paper books into electronic form provides benefits for archiving, distribution, and content reuse. However, directly scanned images are usually undesirable as electronic books, and automated book re-mastering is required. In this paper, we describe a comprehensive image processing suite consisting of three major components: 1) image enhancement with deskew, cropping, color correction, contrast enhancement, and text sharpening, 2) compound document compression, and 3) extraction of the TOC (table of contents) and linking. We built a pipeline that...

10.1109/icdar.2005.5 article EN 2005-01-01

Improving Formula Analysis with Line and Mathematics Identification

Mohamed Alkalai, Josef B. Baker, Volker Sorge, Xiaoyan Lin

The explosive growth of the internet and electronic publishing has led to a huge number of scientific documents being available to users; however, they are usually inaccessible to those with visual impairments and often only partially compatible with software and modern hardware such as tablets and e-readers. In this paper we revisit Maxtract, a tool for analysing and converting documents into accessible formats, and combine it with two advanced segmentation techniques, statistical line identification and machine learning formula identification....

10.1109/icdar.2013.74 article EN 2013-08-01

Active document layout synthesis

Xiaoyan Lin

Document layout analysis has been researched for many years. However, there is little work on the reverse of document layout analysis: document layout synthesis, whose goal is to generate logically correct and aesthetically appealing layouts given text/image contents and a flexible template. This paper introduces a new automatic layout synthesis method, which can actively pursue the optimal text block height-width tradeoff simultaneously with position adjustment. The key idea is to use a cluster of linear functions to approximate the nonlinear curve. Then...

10.1109/icdar.2005.42 article EN 2005-01-01
