Boris Mirkin

ORCID: 0000-0001-5470-8635
Research Areas
  • Advanced Clustering Algorithms Research
  • Complex Network Analysis Techniques
  • Data Mining Algorithms and Applications
  • Data Management and Algorithms
  • Rough Sets and Fuzzy Logic
  • Face and Expression Recognition
  • Advanced Text Analysis Techniques
  • Bioinformatics and Genomic Networks
  • Bayesian Methods and Mixture Models
  • Text and Document Classification Technologies
  • Genomics and Phylogenetic Studies
  • Machine Learning in Bioinformatics
  • Multi-Criteria Decision Making
  • Gene expression and cancer classification
  • Algorithms and Data Compression
  • Advanced Graph Neural Networks
  • RNA and protein synthesis mechanisms
  • Remote-Sensing Image Classification
  • Advanced Scientific Research Methods
  • Data Visualization and Analytics
  • Sensory Analysis and Statistical Methods
  • Advanced Statistical Methods and Models
  • DNA and Biological Computing
  • Spam and Phishing Detection
  • Logic, programming, and type systems

Birkbeck, University of London
2013-2024

National Research University Higher School of Economics
2015-2024

Technion – Israel Institute of Technology
2011-2014

University of London
2006-2011

University of Trento
2011

Lancaster University
2011

Carnegie Mellon University
2011

University of Surrey
2011

Cornell University
2011

University of California, Irvine
2011

Lactic acid-producing bacteria are associated with various plant and animal niches and play a key role in the production of fermented foods and beverages. We report nine genome sequences representing the phylogenetic and functional diversity of these bacteria. The small genomes of lactic acid bacteria encode a broad repertoire of transporters for efficient carbon and nitrogen acquisition from the nutritionally rich environments they inhabit and reflect a limited range of biosynthetic capabilities that indicate both prototrophic and auxotrophic...

10.1073/pnas.0607117103 article EN Proceedings of the National Academy of Sciences 2006-10-10

10.1038/sj.jors.2600836 article EN Journal of the Operational Research Society 1997-01-01

Comparative analysis of sequenced genomes reveals numerous instances of apparent horizontal gene transfer (HGT), at least in prokaryotes, and indicates that lineage-specific gene loss might have been even more common in evolution. This complicates the notion of a species tree, which needs to be re-interpreted as a prevailing evolutionary trend, rather than the full depiction of evolution, and makes reconstruction of ancestral genomes a non-trivial task. We addressed the problem of constructing parsimonious scenarios for individual sets...

10.1186/1471-2148-3-2 article EN cc-by BMC Evolutionary Biology 2003-01-06

Gene duplication is a crucial mechanism of evolutionary innovation. A substantial fraction of eukaryotic genomes consists of paralogous gene families. We assess the extent of ancestral paralogy, which dates back to the last common ancestor of all eukaryotes, and examine the origins of the ancestral paralogs and their potential roles in the emergence of eukaryotic cell complexity. A parsimonious reconstruction of ancestral gene repertoires shows that 4137 orthologous gene sets in the last eukaryotic common ancestor (LECA) map to 2150 orthologous sets in the hypothetical first eukaryotic common ancestor (FECA) [paralogy quotient (PQ) of 1.92]. Analogous reconstructions...

10.1093/nar/gki775 article EN cc-by-nc Nucleic Acids Research 2005-08-02

This paper gives an experimentally supported review and comparison of several indices based on the conventional K-means inertia criterion for determining the number of clusters, K, in datasets, using the popular Silhouette width index as a benchmark. Our experiments involve a novel version of the Elbow index, defined on values two or three steps apart. We also discuss alternative ways of computing and summarizing its...

10.1109/access.2024.3350791 article EN cc-by-nc-nd IEEE Access 2024-01-01

The issue of determining ‘the right number of clusters’ is attracting ever growing interest. The paper reviews published work on the issue with respect to mixture distributions, partition, especially in k‐means clustering, and hierarchical cluster structures. Some perspective directions for further developments are outlined. © 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 1 252–260 DOI: 10.1002/widm.15 This article is categorized under: Algorithmic Development > Structure...

10.1002/widm.15 article EN Wiley Interdisciplinary Reviews Data Mining and Knowledge Discovery 2011-03-08

In the framework of the problem of combining different gene trees into a unique species phylogeny, a model for duplication/speciation/loss events along the evolutionary tree is introduced. The model is employed for embedding a phylogeny into another one via the so-called duplication/speciation principle, requiring that the duplicated gene evolves in such a way that any contemporary species involved bears only one of the gene copies diverged. The number of biologically meaningful elements in the embedding result (duplications, losses, information gaps) is considered an (asymmetric) dissimilarity...

10.1089/cmb.1995.2.493 article EN Journal of Computational Biology 1995-01-01

The multiple prototype fuzzy clustering model (FCMP), introduced by Nascimento, Mirkin and Moura-Pires (1999), proposes a framework for partitional fuzzy clustering which suggests a model of how the data are generated from a cluster structure to be identified. In the model, it is assumed that the membership of each entity in a cluster expresses a part of the cluster prototype reflected in the entity. In this paper we extend the FCMP framework to a number of clustering criteria, and study the FCMP properties on fitting the underlying proposed model from which the data are generated. A comparative study with the fuzzy c-means algorithm is also presented.

10.1109/fuzzy.2000.838676 article EN 2002-11-07

10.1023/a:1010924920739 article EN Machine Learning 2001-01-01

10.1016/0022-2496(72)90030-2 article EN Journal of Mathematical Psychology 1972-05-01

The prediction of a biological activity using a Quantitative Structure–Activity Relationship (QSAR) model is valid only if the compound in question is inside the model's domain of applicability. The existing methods for determining the domain of applicability in descriptor space suffer from problems including poor handling of nonconvex training sets and computational inefficiency. In this paper, we propose a cluster‐based approach to modelling the domain of applicability, which may overcome some of the shortcomings of the existing approaches described. We...

10.1002/qsar.200630086 article EN QSAR & Combinatorial Science 2007-03-27