- Algorithms and Data Compression
- Cellular Automata and Applications
- DNA and Biological Computing
- semigroups and automata theory
- Stochastic processes and statistical mechanics
- Machine Learning and Algorithms
- Complex Network Analysis Techniques
- Computability, Logic, AI Algorithms
- Advanced Queuing Theory Analysis
- Advanced Combinatorial Mathematics
- Data Management and Algorithms
- Bayesian Methods and Mixture Models
- Mathematical Dynamics and Fractals
- Advanced Data Compression Techniques
- Fractal and DNA sequence analysis
- Error Correcting Code Techniques
- Bioinformatics and Genomic Networks
- Neural Networks and Applications
- Distributed systems and fault tolerance
- Coding theory and cryptography
- Quantum Computing Algorithms and Architecture
- Advanced Data Storage Technologies
- Gene Regulatory Network Analysis
- Markov Chains and Monte Carlo Methods
- Wireless Networks and Protocols
Purdue University West Lafayette
2015-2024
Jagiellonian University
2021
Adam Mickiewicz University in Poznań
2019
Purdue University System
2019
Gdańsk University of Technology
2014-2018
Center for Discrete Mathematics and Theoretical Computer Science
2009
TU Wien
2009
University of California, San Diego
2007
University of Gdańsk
1987-2006
University of Illinois Chicago
2000-2006
With an ever-increasing amount of available data on protein–protein interaction (PPI) networks and research revealing that these evolve at a modular level, discovery conserved patterns in becomes important problem. Although interactions is currently limited, recently developed algorithms have been shown to convey novel biological insights through employment elegant mathematical models. The main challenge aligning PPI define graph theoretical measure similarity between structures captures...
We consider the standard slotted ALOHA system with a finite number of buffered users. Stability analysis such was initiated by Tsybakov and Mikhailov (1979). Since then several bounds on stability region have been established; however, exact is known only for symmetric two-user ALOHA. This paper proves necessary sufficient conditions system. accomplish this means novel technique based three simple observations: applying mathematical induction to smaller copy system, isolating single queue...
With rapidly increasing amount of network and interaction data in molecular biology, the problem effectively analyzing this is an important one. Graph theoretic formalisms, commonly used for these analysis tasks, often lead to computationally hard problems due their relation with subgraph isomorphism.This paper presents innovative new algorithm detecting frequently occurring patterns modules biological networks. Using graph simplification technique, which ideally suited networks, our renders...
Information theory traditionally deals with "conventional data," be it textual data, image, or video data. However, databases of various sorts have come into existence in recent years for storing "unconventional data" including biological social web topographical maps, and medical In compressing such one must consider two types information: the information conveyed by structure itself, data labels implanted structure. this paper, we attempt to address former problem studying graphical...
Suffix trees find several applications in computer science and telecommunications, most notably algorithms on strings, data compressions, codes. Despite this, very little is known about their typical behaviors. In a probabilistic framework, family of suffix trees—further called b-suffix trees—built from the first n suffixes random word considered. this noncompact tree (i.e., such that every edge labeled by single symbol) represented $b = 1$, compact without unary nodes) asymptotically...
A practical suboptimal (variable source coding) algorithm for lossy data compression is presented. This scheme based on approximate string matching, and it naturally extends the lossless Lempel-Ziv (1977) scheme. Among others we consider typical length of an approximately repeated pattern within first n positions a stationary mixing sequence where D percent mismatches allowed. We prove that there exists constant r/sub 0/(D) such converges in probability to 1/r/sub log (pr.) but almost surely...
Recent years have seen a resurgence of interest in redundancy lossless coding. The (regret) universal fixed-to-variable length coding for class sources determines by how much the actual code exceeds optimal (ideal over class) length. In minimax scenario one finds best worst source either case (called also maximal minimax) or on average. We first study stationary ergodic and replace Shtarkov's bound an exact formula. Among others, we prove that generalized Shannon minimizes redundancy, derive...
Molecular interaction data plays an important role in understanding biological processes at a modular level by providing framework for cellular organization, functional hierarchy, and evolutionary conservation. As the quality quantity of network increases rapidly, problem effectively analyzing this becomes significant. Graph theoretic formalisms, commonly used these analysis tasks, often lead to computationally hard problems due their relation subgraph isomorphism. This paper presents...
In this paper, we settle a long-standing open problem concerning the average redundancy r/sub n/ of Lempel-Ziv'78 (LZ78) code. We prove that for memoryless source rate attains asymptotically Er/sub n/=(A+/spl delta/(n))/log n+ O(log log n/log/sup 2/ n), where A is an explicitly given constant depends on characteristics, and /spl delta/(x) fluctuating function with small amplitude. also derive leading term kth moment number phrases. conclude by conjecturing precise formula expected Markovian...
We investigate the basic question of information theory, namely, evaluation Shannon entropy, and a more general Renyi (1961) for some discrete distributions (e.g., binomial, negative etc.). aim at establishing analytic methods (i.e., those in which complex analysis plays pivotal role) such computations often yield estimates unparalleled precision. The main tool used here is that poissonization depoissonization. illustrate our approach on entropy binomial distribution, is, we prove (n, p)...
We investigate the duration of an elimination process for identifying a winner by coin tossing or, equivalently, height random incomplete trie. Applications include election leader in computer network. Using direct probabilistic arguments we obtain exact expressions discrete distribution and moments height. Elementary approximation techniques then yield asymptotics distribution. show that no limiting exists, as asymptotic exhibit periodic fluctuations. In many similar problems associated...
The Patricia trie is a simple modification of regular trie. By eliminating unary branching nodes, the achieves better performance than tries. However, question is: how much on average better? This paper offers thorough answer to this by considering some statistics number nodes examined in successful search and an unsuccessful It shown that for containing n records length S asymptotically becomes 1/ h 1 · ln + O (1), variance either var = c 0 (1) asymmetric or symmetric Patricia, where...
Consider a given pattern and random text generated by Markovian source of any order. We study the frequency occurrences in when overlapping copies are counted separately. More precisely: we consider two strings, string H=h/sub 1/h/sub 2/....h/sub m/ T=t/sub 1/t/sub 2/...t/sub n/ respective lengths equal to m n over an alphabet S size V. The results for model summarized theorems.
A complete characterization of a digital tree, also called trie, is presented from the depth viewpoint in Markovian framework, that is, under assumption symbols key are Markov-dependent. The main findings show asymptotically, as number keys n tends to infinity, average becomes ED/sub n/ approximately (1/h/sub 1/) log N+c', and variance var D/sub alpha n+c", where h/sub 1/ entropy (Markovian-dependent) alphabet, parameter probabilistic model c' c" constants. symmetric independent has =0,...
Comparative analyses of cellular interaction networks enable understanding the cell's modular organization through identification functional modules and complexes. These techniques often rely on topological features such as connectedness density, based premise that functionally related proteins are likely to interact densely these interactions follow similar evolutionary trajectories. Significant recent work has focused efficient algorithms for their conservation. In spite algorithmic...