Aritra Bose

ORCID: 0000-0002-8665-056X
Research Areas
  • Genetic Associations and Epidemiology
  • Genetic Mapping and Diversity in Plants and Animals
  • Bioinformatics and Genomic Networks
  • Genetic and phenotypic traits in livestock
  • Gene expression and cancer classification
  • Genetic diversity and population structure
  • Forensic and Genetic Research
  • Computational Drug Discovery Methods
  • Biochemical Analysis and Sensing Techniques
  • Olfactory and Sensory Function Studies
  • COVID-19 and Mental Health
  • Advanced Chemical Sensor Technologies
  • Evolution and Genetic Dynamics
  • Occupational Health and Safety Research
  • Topological and Geometric Data Analysis
  • Machine Learning in Bioinformatics
  • COVID-19 Pandemic Impacts
  • Innovative Microfluidic and Catalytic Techniques Innovation
  • Quantum Computing Algorithms and Architecture
  • Molecular Communication and Nanonetworks
  • Quantum-Dot Cellular Automata
  • Tensor decomposition and applications
  • Racial and Ethnic Identity Research
  • Race, Genetics, and Society
  • Artificial Intelligence in Healthcare

IBM (United States)
2021-2024

IBM Research - Thomas J. Watson Research Center
2021-2024

Topiwala National Medical College & BYL Nair Charitable Hospital
2021-2023

Creative Commons
2023

Purdue University West Lafayette
2017-2020

King Edward Memorial Hospital and Seth G.S. Medical College
2019

Indian Statistical Institute
2012

Principal Component Analysis (PCA) is a key tool in the study of population structure in human genetics. As modern datasets become increasingly larger in size, traditional approaches based on loading the entire dataset into system memory (Random Access Memory) become impractical, and out-of-core implementations are the only viable alternative. We present TeraPCA, a C++ implementation of the Randomized Subspace Iteration method to perform PCA on large-scale datasets. TeraPCA can be applied both in-core and out-of-core and is able to successfully operate even...

10.1093/bioinformatics/btz157 article EN Bioinformatics 2019-04-04
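
The Randomized Subspace Iteration method named in the abstract above can be illustrated with a minimal NumPy sketch. This is not TeraPCA's code: the function name `randomized_subspace_pca` and the parameters `oversample` and `n_iter` are hypothetical choices made for illustration, and the sketch holds the whole matrix in memory, whereas an out-of-core variant would compute the two matrix products block by block.

```python
import numpy as np


def randomized_subspace_pca(X, k, oversample=10, n_iter=5, seed=0):
    """Approximate the top-k principal components of X (samples x features).

    Hypothetical helper for illustration only; not the TeraPCA API.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                    # center every feature
    n, m = Xc.shape
    ell = min(k + oversample, n, m)            # width of the working subspace

    # Start from a random orthonormal basis in feature space.
    Q, _ = np.linalg.qr(rng.standard_normal((m, ell)))

    # Subspace iteration: repeatedly apply Xc^T Xc and re-orthonormalize.
    # The data enter only through the two products below, which is what
    # makes a block-by-block (out-of-core) evaluation possible.
    for _ in range(n_iter):
        Y = Xc @ Q                             # n x ell
        Q, _ = np.linalg.qr(Xc.T @ Y)          # m x ell, orthonormal columns

    # Project onto the subspace and solve the small problem exactly.
    B = Xc @ Q                                 # n x ell
    U, s, Vt = np.linalg.svd(B, full_matrices=False)

    components = (Q @ Vt.T).T[:k]              # k x m, rows are PC directions
    explained_variance = s[:k] ** 2 / (n - 1)
    scores = U[:, :k] * s[:k]                  # samples projected on the PCs
    return components, explained_variance, scores


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic low-rank data plus a little noise (illustrative only).
    X = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 40))
    X += 0.01 * rng.standard_normal((60, 40))
    comps, var, scores = randomized_subspace_pca(X, k=3)
    print(comps.shape, scores.shape)
```

The oversampling columns and the repeated re-orthonormalization are the usual accuracy controls in this family of methods; the values used here are arbitrary defaults, not settings taken from the paper.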

Tensor decomposition has emerged as a powerful framework for feature extraction in multi-modal biomedical data. In this review, we present a comprehensive analysis of tensor decomposition methods such as Tucker, CANDECOMP/PARAFAC, spiked tensor decomposition, etc. and their diverse applications across biomedical domains such as imaging, multi-omics, and spatial transcriptomics. To systematically investigate the literature, we applied a topic modeling-based approach that identifies and groups distinct thematic sub-areas in biomedicine where tensor decomposition has been used,...

10.48550/arxiv.2502.13140 preprint EN arXiv (Cornell University) 2025-02-18

GWAS focuses on significance, losing false positives; machine learning probes sub-significant features, relying on predictivity. Yet, these are far from orthogonal. We sought to explore how these inform each other in sub-genome-wide significant situations to define relevance for predictive features. We introduce the SVM-based RubricOE that selects heavily cross-validated feature sets, and LDpred2 PRS as a strong contrast to SVM. Our Alzheimer's test case notoriously lacks genetic signals except for a few very...

10.1016/j.isci.2024.109209 article EN cc-by iScience 2024-02-12

Peloponnese has been one of the cradles of Classical European civilization and an important contributor to ancient European history. It has also been the subject of a controversy about the ancestry of its population. In a theory hotly debated by scholars for over 170 years, the German historian Jacob Philipp Fallmerayer proposed that the medieval Peloponneseans were totally extinguished by Slavic and Avar invaders and replaced by Slavic settlers during the 6th century CE. Here we use 2.5 million single-nucleotide polymorphisms to investigate the genetic structure...

10.1038/ejhg.2017.18 article EN cc-by-nc-nd European Journal of Human Genetics 2017-03-08

In recent years, there has been tremendous progress in the development of quantum computing hardware, algorithms and services, leading to the expectation that in the near future quantum computers will be capable of performing simulations for natural science applications, operations research, and machine learning at scales mostly inaccessible to classical computers. Whereas the impact of quantum computing has already started to be recognized in fields such as cryptanalysis, simulations, and optimization among others, very little is known about the full potential...

10.48550/arxiv.2307.05734 preprint EN cc-by arXiv (Cornell University) 2023-01-01

India represents an intricate tapestry of population substructure shaped by geography, language, culture, and social stratification. Although geography closely correlates with genetic structure in other parts of the world, the strict endogamy imposed by the Indian caste system and the large number of spoken languages add further levels of complexity to understanding the population structure. To date, no study has attempted to model and evaluate how these factors have interacted to shape the patterns of genetic diversity within India. We merged all publicly...

10.1093/molbev/msaa321 article EN cc-by-nc Molecular Biology and Evolution 2021-01-13

Linear mixed models (LMMs) have been widely used in genome-wide association studies to control for population stratification and cryptic relatedness. However, estimating LMM parameters is computationally expensive, necessitating large-scale matrix operations to build the genetic relationship matrix (GRM). Over the past 25 years, Randomized Linear Algebra has provided alternative approaches to such matrix operations by leveraging matrix sketching, which often results in provably accurate, fast, and efficient approximations. We leverage matrix sketching to develop a...

10.1101/gr.279230.124 article EN Genome Research 2024-09-04

Identifying variants associated with complex traits is a challenging task in genetic association studies due to linkage disequilibrium (LD) between genetic variants and population stratification unrelated to the disease risk. Existing methods of population structure correction use principal component analysis or linear mixed models with a random effect when modeling the associations between the trait of interest and the genetic markers. However, due to stringent significance thresholds and latent interactions between the markers, these methods often fail to detect genuinely associated variants. To overcome...

10.1186/s12859-023-05511-w article EN cc-by BMC Bioinformatics 2023-10-31

A wide variety of chemicals having distinct odors are smelled by humans. Odor perception initiates in the nose, where it is detected by a large family of olfactory receptors (ORs). Based on the divergence of the evolutionary model, a human OR sequence database has been proposed by D. Lancet et al. (2000, 2006). It is quite impossible to infer whether a given sequence of nucleotides is a human OR or not without any biological experimental validation. In our perspective, a proper quantitative understanding of these ORs is required to justify...

10.1038/npre.2012.6967.1 preprint EN Nature Precedings 2012-03-05

Genome-wide association studies (GWAS) have been extensively used to estimate the signed effects of trait-associated alleles. Recent independent studies failed to replicate the strong evidence of selection for height across Europe, implying shortcomings of standard population stratification correction approaches. Here, we present CluStrat, a stratification correction algorithm for complex population structure that leverages the linkage disequilibrium (LD)-induced distances between individuals. CluStrat performs agglomerative hierarchical clustering...

10.1101/2020.01.15.908228 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2020-01-16

Polygenic risk scores (PRS) are increasingly used to estimate the personal risk of a trait based on genetics. However, most genomic cohorts are of European populations, with a strong under-representation of non-European groups. Given that PRS transport poorly across racial groups, this has the potential to exacerbate health disparities if used in clinical care. Hence there is a need to generate PRS that perform comparably across ethnic groups. Borrowing from recent advancements in the domain adaptation field of machine learning, we propose FairPRS - an...

10.1142/9789811270611_0019 article EN cc-by-nc Biocomputing 2022-11-01

Clinical trials are pivotal in the drug discovery process to determine the safety and efficacy of a drug candidate. The high failure rates of these trials are attributed to deficiencies in clinical model development and protocol design. Improvements in clinical trial design could therefore yield significant benefits for all stakeholders involved. This paper examines the current challenges faced in clinical trial design and optimization, reviews established classical computational approaches, and introduces quantum algorithms aimed at enhancing these processes. Specifically,...

10.48550/arxiv.2404.13113 preprint EN arXiv (Cornell University) 2024-04-19

Background: Essential life skills are a vast range of psychological and interpersonal abilities that can help people lead healthy and productive lives, make informed decisions, communicate effectively, and build coping and self-management skills. Adolescence is the building block for these skills, and the current study was planned to estimate life skills among school-going children in a slum area. Methodology: The study was conducted in a slum area of Mumbai, where 158 adolescents (10–19 years) were selected by systematic random sampling....

10.4103/jphpc.jphpc_62_23 article EN cc-by-nc-sa Journal of Public Health and Primary Care 2024-01-01

The emergence of COVID-19 (C19) created incredible worldwide challenges but offers unique opportunities to understand the physiology of its risk factors and their interactions with complex disease conditions, such as metabolic syndrome. To address the challenges of discovering clinically relevant interactions, we employed a unique approach for epidemiological analysis powered by redescription-based topological data analysis (RTDA).

10.1093/bioinformatics/btae235 article EN cc-by Bioinformatics 2024-06-28

Chronic kidney disease (CKD) is a complex condition where the kidneys are damaged and progressively lose their ability to filter blood; 10% of the world population have the condition, which often goes undetected until it is too late for intervention. Using the UK Biobank (UKBB) we constructed a CKD cohort of patients (n=46,986) with genomic, clinical and demographic data available, a subset (n=2,151) of them also having whole body Magnetic Resonance Imaging (MRI) scans. We used this multimodal data to successfully predict, from...

10.1101/2024.10.15.24315251 preprint EN cc-by-nc medRxiv (Cold Spring Harbor Laboratory) 2024-10-16

Due to the intricate etiology of neurological disorders, finding interpretable associations between multi-omics features can be challenging using standard approaches. We propose COMICAL, a contrastive learning approach leveraging multi-omics data to generate associations between genetic markers and brain imaging-derived phenotypes. COMICAL jointly learns omic representations utilizing transformer-based encoders with custom tokenizers. Our modality-agnostic approach uniquely identifies many-to-many associations via self-supervised learning schemes...

10.1101/2024.11.02.24316653 preprint EN cc-by-nd medRxiv (Cold Spring Harbor Laboratory) 2024-11-04

Introduction: Life skills encompass a broad spectrum of psychological and interpersonal abilities essential for leading healthy and productive lives. This holistic approach, addressing knowledge, attitude, and skills, aims to modify behavior and foster balanced development. Life skills are crucial for supporting the mental health and competency of young individuals facing the challenges of adolescence, a transitional phase between childhood and adulthood. Material and Methods: A community-based cross-sectional study was conducted in a slum...

10.4103/ijcfm.ijcfm_14_24 article EN cc-by-nc-sa Indian Journal of Community and Family Medicine 2024-07-01

The role of race in medical decision-making has been a contentious issue. Insights from history and population genetics suggest that considering race as a differentiating marker for medical practices can be influenced by systemic bias, leading to serious errors. This may negatively impact the treatment of complex diseases such as cardiovascular disease (CVD). We seek to identify instrumental variables and independently verifiable epidemiological tests of whether diagnoses and treatments impacting severe conditions are...

10.1101/2023.02.10.23285769 preprint EN cc-by medRxiv (Cold Spring Harbor Laboratory) 2023-02-14

The SARS-CoV-2 virus behind the COVID-19 pandemic is manifesting itself in different ways among infected people. While many are experiencing mild flu-like symptoms or even remaining asymptomatic after infection, the virus has also led to serious complications, overloading ICUs while claiming more than 2.6 million lives worldwide. In this work, we apply AI methods to better understand the factors that drive the severity of the disease. From the UK BioBank dataset we analyzed both clinical and genomic data of patients...

10.1101/2021.03.15.21253549 preprint EN medRxiv (Cold Spring Harbor Laboratory) 2021-03-24

Background: Wood workers are predisposed to many occupational diseases. Studying the work place environment and its association with the morbidities would provide practical insights to promote health and prevent disease in wood workers. The present study intends to study the epidemiological determinants of morbidity in wood workers. Methods: A quantitative research method is used. All one hundred five wood workers in the area were recruited after taking informed consent. A semi-structured, pre-validated questionnaire consisting of questions on...

10.18203/2394-6040.ijcmph20191852 article EN International Journal of Community Medicine and Public Health 2019-04-27

Background: Sickle cell disease (SCD) is a more than century-old disease, but we still do not have an affordable cure for it. Although studies show that the prevalence of the disease is high in tribal areas, awareness and mainstreaming of the management of SCD in Primary Health Cares are not formalized in the system. Our study aims to provide practical insights for imparting quality services to ensure the provisions reach the doors of the patients. Objectives: The objectives were to study the epidemiological profile and clinical patterns among affected...

10.5455/ijmsph.2019.0926921092019 article EN International Journal of Medical Science and Public Health 2019-01-01

A wide variety of chemicals having distinct odors are smelled by humans. Odor perception initiates in the nose, where it is detected by a large family of olfactory receptors (ORs). Based on the divergence of the evolutionary model, a human OR sequence database has been proposed by D. Lancet et al. (2000, 2006). It is quite impossible to infer whether a given sequence of nucleotides is a human OR or not without any biological experimental validation. In our perspective, a proper quantitative understanding of these ORs is required to justify or nullify whether a given sequence is a human OR or not....

10.1038/npre.2012.6967 preprint EN Nature Precedings 2012-03-07