Xiaoyan Lin

ORCID: 0000-0003-2348-7977
Research Areas
  • Statistical Methods and Bayesian Inference
  • Statistical Methods and Inference
  • Handwritten Text Recognition Techniques
  • Bayesian Methods and Mixture Models
  • Mathematics, Computing, and Information Processing
  • Statistical Distribution Estimation and Applications
  • Algorithms and Data Compression
  • Image Retrieval and Classification Techniques
  • Spatial and Panel Data Analysis
  • Fluid Dynamics and Vibration Analysis
  • Reliability and Agreement in Measurement
  • Computer Graphics and Visualization Techniques
  • Image Processing and 3D Reconstruction
  • Lattice Boltzmann Simulation Studies
  • Advanced Image and Video Retrieval Techniques
  • Transport and Economic Policies
  • Efficiency Analysis Using DEA
  • Advanced Causal Inference Techniques
  • Differential Equations and Numerical Methods
  • Face recognition and analysis
  • Video Analysis and Summarization
  • Open Education and E-Learning
  • Machine Learning and ELM
  • Advanced Database Systems and Queries
  • Facial Nerve Paralysis Treatment and Research

University of South Carolina
2011-2024

Peking University
2011-2023

Arkansas Children's Nutrition Center
2023

University of Arkansas for Medical Sciences
2023

Wageningen University & Research
2023

Dalian Polytechnic University
2023

University of Hong Kong
2023

The University of Western Australia
2023

North Carolina State University
2023

University of Nottingham
2023

Recognizing mathematical expressions in PDF documents is a new and important field in document analysis. It is quite different from extracting mathematical expressions in image-based documents. In this paper, we propose a novel method by combining rule-based and learning-based methods to detect both isolated and embedded mathematical expressions in PDF documents. Moreover, various features of formulas, including geometric layout, character and context content, are used to adapt to a wide range of formula types. Experimental results show satisfactory performance of the proposed method....

10.1109/icdar.2011.285 article EN International Conference on Document Analysis and Recognition 2011-09-01

10.1016/j.csda.2011.03.013 article EN Computational Statistics & Data Analysis 2011-04-05

Interval‐censored data occur naturally in many fields, and the main feature is that the failure time of interest is not observed exactly, but is known to fall within some interval. In this paper, we propose a semiparametric probit model for analyzing case 2 interval‐censored data as an alternative to the existing models in the literature. Specifically, we approximate the unknown nonparametric nondecreasing function with a linear combination of monotone splines, leading to only a finite number of parameters to estimate. Both maximum...

10.1002/sim.3832 article EN Statistics in Medicine 2010-01-12

10.1007/s10032-013-0216-1 article EN International Journal on Document Analysis and Recognition (IJDAR) 2013-12-20

The semantics of mathematical formulae depend on their spatial structure, and they usually exist in layout presentations such as PDF, LaTeX, and Presentation MathML, which challenges previous text index and retrieval methods. This paper proposes an innovative mathematics retrieval system along with novel algorithms, which enables efficient formula index and retrieval from both webpages and PDF documents. Unlike prior studies, which require users to manually input formula markup language as the query, the new system lets users "copy" formula queries directly. Furthermore, by using a...

10.1145/2600428.2609611 article EN 2014-07-03

Mathematical formulae in structural formats such as MathML and LaTeX are becoming increasingly available. Moreover, repositories and websites, including ArXiv and Wikipedia, and growing numbers of digital libraries use these structural formats to present mathematical formulae. This presents an important new and challenging area of research, namely Mathematical Information Retrieval (MIR). In this paper, we propose WikiMirs, a tool to facilitate mathematical formula retrieval in Wikipedia. WikiMirs is aimed at searching for similar mathematical formulae based upon both textual...

10.1145/2467696.2467699 article EN 2013-07-22

Models designed to detect abnormalities that reflect disease from facial structures are an emerging area of research for automated analysis, which has important potential value in smart healthcare applications. However, most of the proposed models directly analyze the whole face image containing the background information, and rarely consider the effects of different regions on the analysis results. Therefore, in view of these effects, we propose an end-to-end attention network with spatial transformation to estimate pain...

10.1371/journal.pone.0232412 article EN cc-by PLoS ONE 2020-08-21

Current status data commonly arise in many fields such as epidemiological studies and cross-sectional tumorigenicity studies. In this article, we propose a semiparametric Bayesian approach for analyzing current status data with the proportional odds model. The use of monotone splines for the baseline function and a novel augmentation with Poisson latent variables enable simple updating of all parameters in the posterior computation. The proposed approach shows good performance and is compared with that of Wang and Dunson (2010)...

10.1080/03610918.2011.566971 article EN Communications in Statistics - Simulation and Computation 2011-04-19

With the tremendous popularity of the PDF format, recognizing mathematical formulas in PDF documents becomes a new and important problem in the document analysis field. In this paper, we present a method of embedded mathematical formula identification in PDF documents, based on Support Vector Machine (SVM). The method first segments text lines into words, and then classifies each word into two classes, namely formula or ordinary text. Various features of formulas, including geometric layout, character and context content, are utilized to build a robust and adaptable...

10.1117/12.912445 article EN Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE 2011-11-30

The diagnostic accuracy of a test or rater has a crucial impact on clinical decision making. The assessment of diagnostic accuracy for multiple tests or raters also merits much attention. A Bayesian hierarchical conditional independence latent class model for estimating the sensitivities and specificities of a large group of tests or raters is proposed, which is applicable to both with-gold-standard and without-gold-standard situations. Through the hierarchical structure, not only are the sensitivities and specificities of individual tests estimated, but also the diagnostic performance of the whole group of tests. For a small group of tests or raters, the proposed model is further...

10.1177/0962280219852649 article EN Statistical Methods in Medical Research 2019-05-30

Excessive fructose absorption and its subsequent metabolism are implicated in nonalcoholic fatty liver disease, obesity, and insulin resistance in humans. Ketohexokinase (KHK) is a primary enzyme involved in fructose metabolism via the conversion of fructose to fructose-1-phosphate. KHK inhibition might be a potential approach for the treatment of metabolic disorders. Herein, a series of novel KHK inhibitors were designed, synthesized, and evaluated. Among them, compound 14 exhibited more potent activity than PF-06835919 based on rat assay...

10.1021/acs.jmedchem.3c00715 article EN Journal of Medicinal Chemistry 2023-09-28

This paper presents a performance evaluation system for mathematical formula identification. First, a ground-truth dataset is constructed to facilitate the comparison of different mathematical formula identification algorithms. Statistical analysis of the dataset shows that its diversity reflects real-world documents. Second, a performance evaluation metric is proposed, including error type definitions and scenario-adjustable scoring. The proposed metric enables in-depth evaluation of formula identification systems in different scenarios. Finally, based on the proposed metric, a tool is developed to automatically evaluate formula identification results. It...

10.1109/das.2012.68 article EN 2012-03-01

Joint modeling of longitudinal data and survival data has gained great attention in the last two decades. However, most of the existing studies have focused on right-censored survival data. In this article, we study the joint analysis of longitudinal data and interval-censored survival data and conduct variable selection in a Bayesian framework. A new joint model is proposed with a shared frailty to characterize the dependence between the two types of responses, where the longitudinal response is modeled by a semiparametric linear mixed-effects submodel and the survival time is modeled by a normal frailty probit submodel. Several...

10.21203/rs.3.rs-4254893/v1 preprint EN cc-by Research Square (Research Square) 2024-04-18

Text line detection is a prerequisite procedure of mathematical formula recognition; however, many incorrectly segmented text lines are often produced due to the two-dimensional structures of mathematics when using existing segmentation methods such as Projection Profiles Cutting or white space analysis. In consequence, mathematical formula recognition is adversely affected by these incorrectly detected text lines, with errors propagating through further processes. Aimed at this problem, we propose a method to produce reliable line segmentation. Based on...

10.1109/icdar.2013.75 article EN 2013-08-01

In clinical medicine, the pain feeling is a significant indicator of the medical condition of patients. Of late, automatic pain assessment methods have received more and more interest. Many researchers have proposed corresponding methods and achieved impressive results. However, they always ignore the locality and individual differences of painful expression. Therefore, a locality and identity aware network (LIAN) is presented here. Concretely, for the locality characteristic, a module consisting of a two‐branch structure, feature and attention branches, is presented....

10.1049/ipr2.12282 article EN IET Image Processing 2021-06-28

A flexible Bayesian semiparametric accelerated failure time (AFT) model is proposed for analyzing arbitrarily censored survival data with covariates subject to measurement error. Specifically, the baseline error distribution in the AFT model is nonparametrically modeled as a Dirichlet process mixture of normals. Classical measurement error models are imposed on these covariates. An efficient and easy-to-implement Gibbs sampler, based on the stick-breaking formulation of the Dirichlet process combined with the techniques of retrospective and slice sampling, is developed for the posterior...

10.1080/03610918.2014.977919 article EN Communications in Statistics - Simulation and Computation 2015-05-18

Converting paper books into electronic form provides benefits for archiving, distribution and content reuse. However, directly scanned images are usually undesirable for electronic books, and automated re-mastering is required. In this paper, we describe a comprehensive image processing suite consisting of three major components: 1) image enhancement with deskew, cropping, color correction, contrast enhancement and text sharpening, 2) compound document compression, and 3) extraction of the TOC (table of contents) and linking. We built a pipeline that...

10.1109/icdar.2005.5 article EN 2005-01-01

The explosive growth of the internet and electronic publishing has led to a huge number of scientific documents being available to users; however, they are usually inaccessible to those with visual impairments and often only partially compatible with software and modern hardware such as tablets and e-readers. In this paper we revisit Maxtract, a tool for analysing and converting documents into accessible formats, and combine it with two advanced segmentation techniques, statistical line identification and machine learning formula identification....

10.1109/icdar.2013.74 article EN 2013-08-01

Document layout analysis has been researched for many years. However, there is little work on the reverse of document layout analysis: document layout synthesis, whose goal is to generate a logically correct and aesthetically appealing layout given text/image contents and a flexible template. This paper introduces a new automatic layout synthesis method, which can actively pursue the optimal text block height-width tradeoff simultaneously with position adjustment. The key idea is to use a cluster of linear functions to approximate the nonlinear curve. Then...

10.1109/icdar.2005.42 article EN 2005-01-01