- Biomedical Text Mining and Ontologies
- Semantic Web and Ontologies
- Scientific Computing and Data Management
- Bioinformatics and Genomic Networks
- Genomics and Phylogenetic Studies
- Natural Language Processing Techniques
- Gene expression and cancer classification
- Topic Modeling
- Genetics, Bioinformatics, and Biomedical Research
- Research Data Management Practices
- Geochemistry and Geologic Mapping
- Cancer Genomics and Diagnostics
- Advanced Database Systems and Queries
- Advanced Biosensing Techniques and Applications
- Genomic variations and chromosomal abnormalities
- Advanced Text Analysis Techniques
- Genomics and Chromatin Dynamics
- Mobile Crowdsensing and Crowdsourcing
- Data Quality and Management
- Caching and Content Delivery
- Adipose Tissue and Metabolism
- Computational and Text Analysis Methods
- Microbial Metabolic Engineering and Bioproduction
- Cell Image Analysis Techniques
- Artificial Intelligence in Healthcare and Education
RELX Group (United States)
2018-2020
RELX Group (Netherlands)
2017
Foundation Medicine (United States)
2012-2015
Ollscoil na Gaillimhe – University of Galway
2011-2014
University of Alabama at Birmingham
2013
Instituto de Biologia Experimental e Tecnológica
2011
The University of Texas MD Anderson Cancer Center
2007-2011
Universidade Nova de Lisboa
2006-2011
Members of the W3C Health Care and Life Sciences Interest Group (HCLS IG) have published a variety of genomic and drug-related data sets as Resource Description Framework (RDF) triples. This experience has helped the interest group define a general workflow for mapping health care and life science (HCLS) data to RDF and linking it with other Linked Data sources. This paper presents the workflow along with four case studies that demonstrate the workflow and address many of the challenges that may be faced when creating new Linked Data resources. The first case study describes...
Scientists rarely reuse expert knowledge of phylogeny, in spite of years of effort to assemble a great "Tree of Life" (ToL). A notable exception involves the use of Phylomatic, which provides tools to generate custom phylogenies from a large, pre-computed phylogeny of plant taxa. This suggests potential for a more generalized system that, starting with a query consisting of a list of any known species, would rectify non-standard names, identify phylogenies containing the implicated taxa, prune away unneeded parts, and supply branch...
Bioinformatics research relies heavily on the ability to discover and correlate data from various sources. The specialization of the life sciences over the past decade, coupled with an increasing number of biomedical datasets available through standardized interfaces, has created opportunities towards new methods in discovery. Despite the popularity of semantic web technologies in tackling the integrative bioinformatics challenge, there are many obstacles to their usage by non-technical audiences. In particular, fully...
Motivation: Since 2011, The Cancer Genome Atlas’ (TCGA) files have been accessible through HTTP from a public site, creating entirely new possibilities for cancer informatics by enhancing data discovery and retrieval. Significantly, these enhancements enable the reporting of analysis results that can be fully traced to and reproduced using their source data. However, to realize this possibility, a continually updated road map of the files in TCGA is required. Creation of such a road map represents a significant modeling...
This paper proposes a collaborative methodology for developing semantic data models. The proposed methodology for the model development follows a “meet-in-the-middle” approach. On the one hand, the concepts emerged in a bottom-up fashion from analyzing the domain and interviewing the domain experts regarding their data needs. On the other hand, it followed a top-down approach whereby existing ontologies, vocabularies and data models were analyzed and integrated with the model. The identified elements were then fed to a multiphase abstraction exercise in order to get the concepts of the model. The derived model is also...
The Cancer Genome Atlas (TCGA) is a multidisciplinary, multi-institutional pilot project to create an atlas of genetic mutations responsible for cancer. One of the aims of this project is to develop an infrastructure for making the cancer related data publicly accessible, to enable researchers anywhere around the world to make and validate important discoveries. However, the data in the cancer genome atlas are organized as text archives in a set of directories. Devising bioinformatics applications to analyse such data is still challenging, as it requires downloading very large...
The Cancer Genome Atlas (TCGA) is a multidisciplinary, multi-institutional effort to catalogue genetic mutations responsible for cancer using genome analysis techniques. One of the aims of this project is to create a comprehensive and open repository of cancer related molecular analysis, to be exploited by bioinformaticians towards advancing cancer knowledge. However, devising bioinformatics applications to analyse such a large dataset is still challenging, as it often requires downloading large archives and parsing the relevant text files....
Background: Data, data everywhere. The diversity and magnitude of the data generated in the Life Sciences defies automated articulation among complementary efforts. The additional need in this field for managing property and access permissions compounds the difficulty very significantly. This is particularly the case when the integration involves multiple domains and disciplines, even more so when it includes clinical and high throughput molecular data. Methodology/Principal Findings: The emergence of Semantic Web technologies brings the promise...
Background: The Cancer Genome Atlas project (TCGA) has initiated the analysis of multiple samples of a variety of tumor types, starting with glioblastoma multiforme. The analytical methods encompass genomic and transcriptomic information, as well as demographic and clinical data about the sample donors. The data create the opportunity for a systematic screening of the components of the molecular machinery for features that may be associated with tumor formation. The wealth of existing mechanistic information about cancer cell biology provides a natural reference...
The value and usefulness of data increase when it is explicitly interlinked with related data. This is the core principle of Linked Data. For life sciences researchers, harnessing the power of Linked Data to improve biological discovery is still challenged by a need to keep pace with rapidly evolving domains and requirements for collaboration and control, as well as with the reference semantic web ontologies and standards. Knowledge organization systems (KOSs) can provide an abstraction for publishing discoveries without complicating transactions...
Biomedical research is set to greatly benefit from the use of semantic web technologies in the design of computational infrastructure. However, beyond well defined initiatives, substantial issues of data heterogeneity, source distribution, and privacy currently stand in the way towards the personalization of Medicine. A framework for bioinformatic infrastructure was designed to deal with the heterogeneous data sources and the sensitive mixture of public and private data that characterizes the biomedical domain. This framework consists of a logical model built...
Reverse Phase Protein Arrays (RPPA) are convenient assay platforms to investigate the presence of biomarkers in tissue lysates. As with other high-throughput technologies, substantial amounts of analytical data are generated. Over 1000 samples may be printed on a single nitrocellulose slide. Up to 100 different proteins may be assessed using immunoperoxidase or immunofluorescence techniques in order to determine the relative expression of proteins of interest. In this report an RPPA Information Management System (RIMS) is...
The amount of bio-medical data available on the Web grows exponentially with time. The resulting large volume of data makes manual exploration very tedious. Moreover, the velocity at which this data changes and the variety of formats in which the data is published make it difficult to access them in an integrated form. Finally, the lack of a vocabulary makes querying more difficult. In this paper, we advocate the use of Linked Data to integrate, query and visualize data. Big allows discovering knowledge distributed across manifold sources, making it viable for serendipitous discovery...