- Algorithms and Data Compression
- Genomics and Phylogenetic Studies
- Interconnection Networks and Systems
- DNA and Biological Computing
- Data Mining Algorithms and Applications
- RNA and protein synthesis mechanisms
- Gene expression and cancer classification
- Data Management and Algorithms
- Advanced Data Storage Technologies
- Parallel Computing and Optimization Techniques
- Machine Learning in Bioinformatics
- Privacy-Preserving Technologies in Data
- Genomics and Chromatin Dynamics
- Optimization and Search Problems
- Data Quality and Management
- Natural Language Processing Techniques
- Bioinformatics and Genomic Networks
- Advanced biosensing and bioanalysis techniques
- Underwater Vehicles and Communication Systems
- Distributed and Parallel Computing Systems
- Rough Sets and Fuzzy Logic
- Cellular Automata and Applications
- Semigroups and Automata Theory
- Energy Efficient Wireless Sensor Networks
- Genetics, Bioinformatics, and Biomedical Research
University of Connecticut
2015-2024
Samsung (United States)
2024
The University of Texas at Dallas
2021
King Faisal University
2019
Bharathiar University
2019
Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology
2016
Iowa State University
2013
Duke University
2013
University of Minnesota
2013
University of California, Davis
2013
The materials discovery process can be significantly expedited and simplified if we learn effectively from available knowledge and data. In the present contribution, we show that efficient and accurate prediction of a diverse set of properties of material systems is possible by employing machine (or statistical) learning methods trained on quantum mechanical computations in combination with the notions of chemical similarity. Using a family of one-dimensional chain systems, we present a general formalism that allows us to discover...
This paper assumes a parallel RAM (random access machine) model which allows both concurrent reads and concurrent writes of a global memory. The main result is an optimal randomized parallel algorithm for INTEGER_SORT (i.e., for sorting n integers in the range $[1,n]$). This algorithm costs only logarithmic time and is the first known that is optimal: the product of its time and processor bounds is upper bounded by a linear function of the input size. Also given is a deterministic sublogarithmic time algorithm for prefix sum. In addition, this paper presents a sublogarithmic time algorithm for obtaining a random permutation of n elements...
Most heating, ventilation, and air-conditioning (HVAC) systems operate with one or more faults that result in increased energy consumption and that could lead to system failure over time. Today, most building owners are performing reactive maintenance only and may be less concerned with, or less able to assess, the health of the system until catastrophic failure occurs. This is mainly because building owners have not previously had good tools to detect and diagnose these faults, determine their impact, and act on the findings. Commercially available fault detection...
We consider the planted (l, d) motif search problem, which consists of finding a substring of length l that occurs in a set of input sequences {s_1, ..., s_n} with up to d errors, a problem that arises from the need to find transcription factor-binding sites in genomic information. We propose a sequence of practical algorithms, which start based on the ideas considered in PMS1. These algorithms are exact, have little space requirements,...
In this paper, we present a novel algorithm for mining complete frequent itemsets. This algorithm is referred to as the TM (transaction mapping) algorithm from hereon. In this algorithm, the transaction ids of each itemset are mapped and compressed to continuous transaction intervals in a different space, and the counting of itemsets is performed by intersecting these interval lists in a depth-first order along the lexicographic tree. When the compression coefficient becomes smaller than the average number of comparisons for interval intersection at a certain level, the algorithm switches to transaction id...
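The interval-intersection counting step described in this abstract can be illustrated with a minimal sketch. It is not the authors' TM implementation: it assumes each itemset's mapped transaction ids are stored as a sorted list of disjoint closed integer intervals, and that each mapped id stands for exactly one transaction, so support is the total length of the intersection. The names `intersect_intervals` and `support` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # closed interval (lo, hi) of mapped transaction ids


def intersect_intervals(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted lists of disjoint closed intervals (two-pointer merge)."""
    i = j = 0
    out: List[Interval] = []
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            # The two current intervals overlap on [lo, hi].
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def support(intervals: List[Interval]) -> int:
    """Number of mapped ids covered (assumption: one id per transaction)."""
    return sum(hi - lo + 1 for lo, hi in intervals)


# Hypothetical interval lists for two items; their intersection
# gives the interval list of the combined itemset.
item_a = [(1, 5), (8, 10)]
item_b = [(4, 9)]
both = intersect_intervals(item_a, item_b)
print(both, support(both))
```

The merge touches each interval at most once, so its cost grows with the number of intervals rather than the number of transaction ids, which is why compressing ids into intervals can pay off.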
Minimotif Miner (MnM available at http://minimotifminer.org or http://mnm.engr.uconn.edu) is an online database for identifying new minimotifs in protein queries. Minimotifs are short contiguous peptide sequences that have a known function in at least one protein. Here we report the third release of the MnM database, which has now grown 60-fold to approximately 300 000 minimotifs. Since minimotifs are by their nature not very complex, we also summarize a set of false-positive filters and linear regression scoring that vastly enhance...
Background: Motifs are patterns found in biological sequences that are vital for understanding gene function, human disease, drug design, etc. They are helpful in finding transcriptional regulatory elements, transcription factor binding sites, and so on. As a result, the problem of identifying motifs is very crucial in biology. Results: Many facets of the motif search problem have been identified in the literature. One of them is (ℓ, d)-motif search (or Planted Motif Search (PMS)). The PMS problem has been well investigated and shown to be...
Motivation: Next Generation Sequencing (NGS) technologies have revolutionized genomic research by reducing the cost of whole genome sequencing. One of the biggest challenges posed by modern sequencing technology is economic storage of NGS data. Storing raw data is infeasible because of its enormous size and high redundancy. In this article, we address the problem of storage and transmission of large FASTQ files using innovative compression techniques. Results: We introduce a new lossless non-reference based compression algorithm named...
The problem of identifying meaningful patterns (i.e., motifs) from biological data has been studied extensively due to its paramount importance. Three versions of this problem have been identified in the literature. One of these three problems is the planted (l, d)-motif problem. Several instances of this problem have been posed as a challenge. Numerous algorithms have been proposed in the literature that address this challenge. Many of these algorithms fall under the category of heuristic algorithms. In this paper we present algorithms for the planted (l, d)-motif problem that always find the correct answer(s). Our algorithms are very simple and are based on some ideas...
Clustering of data has numerous applications and has been studied extensively. Though most of the algorithms in the literature are sequential, many parallel algorithms have also been designed. In this paper, we present parallel algorithms with better performance than known algorithms. We consider algorithms that work well in the worst case as well as algorithms with good expected performance.
Minimotif Miner (MnM) consists of a minimotif database and a web-based application that enables prediction of motif-based functions in user-supplied protein queries. We have revised MnM by expanding the database more than 10-fold to approximately 5000 motifs and standardized the motif function definitions. The web-application user interface has been redeveloped with new features including improved navigation, screencast-driven help, support for alias names and expanded SNP analysis. A sample analysis of prion shows how...
Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (l,d)-motif search or Planted Motif Search (PMS). In PMS we are given two integers l and d and n biological sequences. We want to find all sequences of length l that appear in each of the input sequences with at most d mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many...
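The PMS problem statement above can be made concrete with a brute-force sketch. This is an illustration of the problem definition only, not one of the published PMS algorithms: it enumerates every candidate string of length l over an assumed DNA alphabet, so it takes time exponential in l and is usable only on toy inputs. The function names and the sample sequences are hypothetical.

```python
from itertools import product
from typing import List


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif: str, seq: str, d: int) -> bool:
    """True if some length-l window of seq is within d mismatches of motif."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def planted_motifs(seqs: List[str], l: int, d: int,
                   alphabet: str = "ACGT") -> List[str]:
    """Exhaustive (l, d)-motif search: test all |alphabet|**l candidates."""
    found = []
    for cand in product(alphabet, repeat=l):
        motif = "".join(cand)
        # A motif must occur, with at most d mismatches, in every sequence.
        if all(occurs_within(motif, s, d) for s in seqs):
            found.append(motif)
    return found


# Toy input (made up for illustration).
seqs = ["ACGTAC", "TTACGA", "GACGTT"]
print(planted_motifs(seqs, l=3, d=0))
```

With d = 0 the search reduces to finding substrings of length l common to all inputs; raising d enlarges the answer set, which is one reason the challenging instances mentioned above are hard.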
Feature selection is a crucial problem in efficient machine learning, and it also greatly contributes to the explainability of machine-driven decisions. Methods like decision trees and the Least Absolute Shrinkage and Selection Operator (LASSO) can select features during training. However, these embedded approaches can only be applied to a small subset of machine learning models. Wrapper based methods can select features independently from machine learning models, but they often suffer from a high computational cost. To enhance their efficiency, many randomized...
The large model size, high computational operations, and vulnerability against membership inference attack (MIA) have impeded the popularity of deep learning or deep neural networks (DNNs), especially on mobile devices. To address the challenge, we envision that the weight pruning technique will help DNNs against MIA while reducing model storage and computational operation. In this work, we propose a pruning algorithm, and we show that the proposed algorithm can find a subnetwork that can prevent privacy leakage from MIA and achieves competitive accuracy with the original DNNs. We also...
Density functional theory (DFT) within the local or semilocal density approximations, i.e., the local density approximation (LDA) or generalized gradient approximation (GGA), has become a workhorse in the electronic structure theory of solids, being extremely fast and reliable for energetics and structural properties, yet remaining highly inaccurate for predicting band gaps of semiconductors and insulators. The accurate prediction of band gaps using first-principles methods is time consuming, requiring hybrid functionals, quasiparticle GW, or quantum Monte Carlo...