- Time Series Analysis and Forecasting
- Anomaly Detection Techniques and Applications
- Complex Systems and Time Series Analysis
- Data Management and Algorithms
- Music and Audio Processing
- Advanced Text Analysis Techniques
- Data Mining Algorithms and Applications
- Algorithms and Data Compression
- Advanced Database Systems and Queries
- Topic Modeling
- Natural Language Processing Techniques
- Advanced Malware Detection Techniques
- Data Visualization and Analytics
- Network Security and Intrusion Detection
- Geographic Information Systems Studies
- Data Stream Mining Techniques
- Neural Networks and Applications
- AI in cancer detection
- Advanced Bandit Algorithms Research
- Bayesian Modeling and Causal Inference
- EEG and Brain-Computer Interfaces
- Video Analysis and Summarization
- Sentiment Analysis and Opinion Mining
- User Authentication and Security Systems
- COVID-19 diagnosis using AI
George Mason University
2014-2024
Brown University
2024
Georgetown University Medical Center
2002-2022
Center for Global Health
2022
Georgetown University
2002-2022
University of Tulsa
2015-2018
University of California, Berkeley
2014-2015
Microsoft (Finland)
2012
University of California, Riverside
2003-2009
Office of Infectious Diseases
2006
The parallel explosions of interest in streaming data, and data mining of time series have had surprisingly little intersection. This is in spite of the fact that time series data are typically streaming data. The main reason for this apparent paradox is that the vast majority of work on streaming data explicitly assumes that the data is discrete, whereas the vast majority of time series data is real valued. Many researchers have also considered transforming real valued time series into symbolic representations, noting that such representations would potentially allow researchers to avail of the wealth of data structures and algorithms from the text processing and bioinformatics...
In this work, we introduce the new problem of finding time series discords. Time series discords are subsequences of a longer time series that are maximally different to all the rest of the time series subsequences. They thus capture the sense of the most unusual subsequence within a time series. Time series discords have many uses for data mining, including improving the quality of clustering, data cleaning, summarization, and anomaly detection. Discords are particularly attractive as anomaly detectors because they only require one intuitive parameter (the length of the subsequence) unlike most anomaly detection...
With the advance of sensor technologies, the Multivariate Time Series classification (MTSC) problem, perhaps one of the most essential problems in the time series data mining domain, has continuously received a significant amount of attention in recent decades. Traditional time series classification approaches based on Bag-of-Patterns or Time Series Shapelet have difficulty dealing with the huge amounts of feature candidates generated in high-dimensional multivariate data but have promising performance even when the training set is small. In contrast, deep learning methods...
The problem of efficiently locating previously known patterns in a time series database (i.e., query by content) has received much attention and may now largely be regarded as a solved problem. However, from a knowledge discovery viewpoint, a more interesting problem is the enumeration of previously unknown, frequently occurring patterns. We call such patterns "motifs", because of their close analogy to their discrete counterparts in computation biology. An efficient motif discovery algorithm for time series would be useful as a tool for summarizing and visualizing massive...
Time series data is perhaps the most frequently encountered type of data examined by the data mining community. Clustering is perhaps the most frequently used data mining algorithm, being useful in its own right as an exploratory technique, and also as a subroutine in more complex data mining algorithms such as rule discovery, indexing, summarization, anomaly detection, and classification. Given these two facts, it is hardly surprising that time series clustering has attracted much attention. The data to be clustered can be in one of two formats: many individual time series, or a single time series, from which individual time series are...
Moments before the launch of every space vehicle, engineering discipline specialists must make a critical go/no-go decision. The cost of a false positive, allowing a launch in spite of a fault, or a false negative, stopping a potentially successful launch, can be measured in the tens of millions of dollars, not including the cost in morale and other more intangible detriments. The Aerospace Corporation is responsible for providing engineering assessments critical to the go/no-go decision for every Department of Defense space vehicle. These assessments are made by constantly monitoring streaming telemetry data...
In this work we introduce the new problem of finding time series discords. Time series discords are subsequences of longer time series that are maximally different to all the rest of the time series subsequences. They thus capture the sense of the most unusual subsequence within a time series. While the brute force algorithm to discover time series discords is quadratic in the length of the time series, we show a simple algorithm that is 3 to 4 orders of magnitude faster than brute force, while guaranteed to produce identical results.
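The quadratic brute-force search mentioned in this abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the use of z-normalized Euclidean distance, and the rule that matches overlapping the candidate are ignored are all assumptions made for the example. The faster algorithm the abstract refers to is not shown.

```python
import math


def znorm(window):
    """Z-normalize a window (assumed preprocessing, not stated in the abstract)."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    if std < 1e-12:
        # A constant window has no shape; map it to all zeros.
        return [0.0] * len(window)
    return [(v - mean) / std for v in window]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def brute_force_discord(series, n):
    """Return (location, distance) of the length-n subsequence whose
    nearest non-overlapping neighbor is farthest away.

    Hypothetical helper: compares every pair of subsequences, so the
    running time is quadratic in the length of the series.
    """
    m = len(series) - n + 1  # number of subsequences
    if n < 1 or m < 1:
        raise ValueError("subsequence length must be between 1 and len(series)")
    subs = [znorm(series[i:i + n]) for i in range(m)]

    best_loc, best_dist = -1, -1.0
    for i in range(m):
        nearest = math.inf
        for j in range(m):
            # Skip matches that overlap the candidate (assumed exclusion rule).
            if abs(i - j) >= n:
                nearest = min(nearest, euclidean(subs[i], subs[j]))
        # Candidates with no non-overlapping neighbor are not considered.
        if nearest != math.inf and nearest > best_dist:
            best_loc, best_dist = i, nearest

    if best_loc < 0:
        raise ValueError("series too short for any non-overlapping pair")
    return best_loc, best_dist
```

The only user-facing parameter is the subsequence length `n`, which matches the single intuitive parameter the abstracts describe.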
Data visualization techniques are very important for data analysis, since the human eye has been frequently advocated as the ultimate data-mining tool. However, there has been surprisingly little work on visualizing massive time series data sets. To this end, we developed VizTree, a time series pattern discovery and visualization system based on augmenting suffix trees. VizTree visually summarizes both the global and local structures of time series data at the same time. In addition, it provides novel interactive solutions to many pattern discovery problems, including the discovery of frequently occurring...
The problem of time series motif discovery has received a lot of attention from researchers in the past decade. Most existing work on finding time series motifs requires that the length of the motifs be known in advance. However, such information is not always available. In addition, motifs of different lengths may co-exist in a time series dataset. In this work, we develop a motif visualization system based on grammar induction. We demonstrate that grammar induction in time series can effectively identify repeated patterns without prior knowledge of their lengths. The motifs discovered by the system are...
The problems of recurrent and anomalous pattern discovery in time series, e.g., motifs and discords, respectively, have received a lot of attention from researchers in the past decade. However, since the pattern search space is usually intractable, most existing detection algorithms require that the patterns have discriminative characteristics and that their length be known in advance and provided as input, which is an unreasonable requirement for many real-world problems. In addition, patterns of similar structure, but of different lengths may co-exist...
Malaria transmission from humans to mosquitoes is modulated by human host immune factors. Understanding the mechanisms by which the human host response may impair parasite infectivity for mosquitoes has direct implications for the development of transmission-blocking vaccines. We hypothesized that despite a low intensity of malaria transmission in the Peruvian Amazon region of Iquitos, transmission-blocking immunity against Plasmodium vivax might be common, given an unexpectedly high proportion of asymptomatic parasitemic individuals in this region. To test this hypothesis, the ability...
In this work, we introduce the new problem of finding time series discords. Time series discords are subsequences of longer time series that are maximally different to all the rest of the time series subsequences. They thus capture the sense of the most unusual subsequence within a time series. While discords have many uses for data mining, they are particularly attractive as anomaly detectors because they only require one intuitive parameter (the length of the subsequence), unlike most anomaly detection algorithms that typically require many parameters. While the brute force algorithm to discover time series discords is quadratic in the length of the time series,...