Tianyi Bai

ORCID: 0009-0009-5057-7100
Research Areas
  • Natural Language Processing Techniques
  • Advanced Multi-Objective Optimization Algorithms
  • Machine Learning and Data Classification
  • Web Data Mining and Analysis
  • Extraction and Separation Processes
  • Ocular Surface and Contact Lens
  • Orthopaedic implants and arthroplasty
  • Arsenic contamination and mitigation
  • Bone health and treatments
  • Spine and Intervertebral Disc Pathology
  • Corneal Surgery and Treatments
  • Musculoskeletal pain and rehabilitation
  • Adsorption and biosorption for pollutant removal
  • Engineering and Information Technology
  • Salivary Gland Disorders and Functions
  • Bone Metabolism and Diseases
  • Advanced Image and Video Retrieval Techniques
  • Machine Learning and Algorithms
  • Advanced Bandit Algorithms Research
  • Manufacturing Process and Optimization
  • Video Analysis and Summarization
  • Modeling, Simulation, and Optimization
  • Scoliosis diagnosis and treatment
  • Topic Modeling

Stomatology Hospital
2024

Peking University
2024

Peking University Third Hospital
2024

Beijing Institute of Technology
2022

Affiliated Hospital of Chengde Medical College
2022

Nanjing University
2014

A wide spectrum of design and decision problems, including parameter tuning, A/B testing, and drug design, are intrinsically instances of black-box optimization. Bayesian optimization (BO) is a powerful tool that models and optimizes such expensive "black-box" functions. However, at the beginning of optimization, vanilla BO methods often suffer from slow convergence due to inaccurate modeling based on few trials. To address this issue, researchers in the BO community propose to incorporate the spirit of transfer...

10.48550/arxiv.2302.05927 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Removal and separation of As(V) and As(III) can be achieved by bifunctional silica.

10.1039/c4ra06563h article EN RSC Advances 2014-01-01

Osteoporotic individuals who have dental implants usually require a prolonged healing time for osseointegration due to the shortage of bone mass and the lack of initial stability. Although studies have shown that intermittent teriparatide administration can promote osseointegration, there is little data to support the idea that pre-implantation administration is necessary and beneficial. Sixty-four titanium implants were placed in the bilateral proximal tibial metaphysis of 32 female SD rats. Bilateral ovariectomy (OVX) was used to induce osteoporosis....

10.1186/s40729-024-00536-z article EN cc-by International Journal of Implant Dentistry 2024-04-16

Human beings perceive the world through diverse senses such as sight, smell, hearing, and touch. Similarly, multimodal large language models (MLLMs) enhance the capabilities of traditional large language models by integrating and processing data from multiple modalities including text, vision, audio, video, and 3D environments. Data plays a pivotal role in the development and refinement of these models. In this survey, we comprehensively review the literature on MLLMs from a data-centric perspective. Specifically, we explore methods for preparing...

10.48550/arxiv.2405.16640 preprint EN arXiv (Cornell University) 2024-05-26

The tuning of hyperparameters becomes increasingly important as machine learning (ML) models have been extensively applied in data mining applications. Among various approaches, Bayesian optimization (BO) is a successful methodology to tune hyperparameters automatically. While traditional methods optimize each task in isolation, there has been recent interest in speeding up BO by transferring knowledge across previous tasks. In this work, we introduce an automatic method to design the search space with...

10.1145/3534678.3539369 article EN Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2022-08-12

Data selection is of great significance in pre-training large language models, given the variation in quality within large-scale available training corpora. To achieve this, researchers are currently investigating the use of data influence to measure the importance of data instances, i.e., a high influence score indicates that incorporating this instance into the training set is likely to enhance model performance. Consequently, they select the top-$k$ instances with the highest scores. However, this approach has several limitations. (1) Computing all...

10.48550/arxiv.2409.16986 preprint EN arXiv (Cornell University) 2024-09-25
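
The top-$k$ baseline described in the abstract above (score every instance, keep the $k$ highest) can be illustrated with a minimal sketch. This is not the authors' code: the function name `select_top_k` and the example scores are hypothetical, and the sketch assumes the influence scores have already been computed by some other procedure.

```python
import heapq


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring instances, best first.

    `scores` is a sequence of precomputed influence scores (assumed given);
    how those scores are obtained is outside the scope of this sketch.
    """
    if k <= 0:
        return []
    # nlargest ranks the candidate indices by their score, in descending order.
    return heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)


# Hypothetical influence scores for five training instances.
influence = [0.12, 0.87, 0.05, 0.43, 0.66]
selected = select_top_k(influence, 2)
```

Because the selection looks only at individual scores, it says nothing about how the chosen instances relate to one another.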

Efficient data selection is crucial to accelerate the pretraining of large language models (LLMs). While various methods have been proposed to enhance data efficiency, limited research has addressed the inherent conflicts between these approaches to achieve optimal data selection for LLM pretraining. To tackle this problem, we propose a novel multi-agent collaborative data selection mechanism. In this framework, each data selection method serves as an independent agent, and an agent console is designed to dynamically integrate the information from all agents...

10.48550/arxiv.2410.08102 preprint EN arXiv (Cornell University) 2024-10-10

Recently, with the rise of web videos, managing and understanding large-scale video datasets has become increasingly important. Video Large Language Models (VideoLLMs) have emerged in recent years due to their strong video understanding capabilities. However, training and inference processes for VideoLLMs demand vast amounts of data, presenting significant challenges to data management, particularly regarding efficiency, robustness, and effectiveness. In this work, we present KeyVideoLLM, a text-video frame similarity-based...

10.48550/arxiv.2407.03104 preprint EN arXiv (Cornell University) 2024-07-03

As the elderly population continues to grow, the number of patients with low back pain is gradually increasing. Among them, Lumbar Degenerative Diseases (LDD) are one of the major contributors to low back pain. Biomechanical in vivo studies of the lumbar spine are mainly performed with implants or imaging data to record real-time changes in form and stress on the intervertebral disc during motion. However, current developments are slow due to technological and ethical limitations. In vitro experiments include animal and cadaver experiments, which...

10.4236/jbm.2022.103004 article EN Journal of Biosciences and Medicines 2022-01-01