- Topic Modeling
- Advanced Text Analysis Techniques
- Sentiment Analysis and Opinion Mining
- Natural Language Processing Techniques
- Machine Learning and Data Classification
- Imbalanced Data Classification Techniques
- Time Series Analysis and Forecasting
- Data Mining Algorithms and Applications
- Text and Document Classification Technologies
- Biomedical Text Mining and Ontologies
- Traffic Prediction and Management Techniques
- Complex Network Analysis Techniques
- Data Management and Algorithms
- Neural dynamics and brain function
- Anomaly Detection Techniques and Applications
- Evolutionary Algorithms and Applications
- Semantic Web and Ontologies
- Vehicular Ad Hoc Networks (VANETs)
- Parallel Computing and Optimization Techniques
- EEG and Brain-Computer Interfaces
- Data Stream Mining Techniques
- Logic, programming, and type systems
- Algorithms and Data Compression
- Neural Networks and Applications
- Formal Methods in Verification
Technical University of Cluj-Napoca
2015-2024
Laboratoire d'Informatique de Paris-Nord
2017-2024
University of Bologna
2018
Universitatea Națională de Știință și Tehnologie Politehnica București
2013
Vrije Universiteit Amsterdam
2013
California Institute of Technology
2013
University College Cork
2013
Spamming has become a time-consuming and expensive problem for which several new directions have been investigated lately. This paper presents a new approach for a spam detection filter. The solution developed is an offline application that uses the k-Nearest Neighbor (kNN) algorithm and a pre-classified email data set for the learning process.
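As a rough illustration of the kNN idea mentioned in the abstract above, the following sketch classifies a message by majority vote among its most similar pre-classified examples. The bag-of-words representation, the cosine similarity, the function names, and the toy `TRAINING_SET` are all assumptions made for illustration; the abstract does not specify the features or distance the paper uses.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, whitespace-split tokenizer (an assumption, kept deliberately simple)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors stored as Counters."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(message, training_set, k=3):
    """Return the majority label among the k training messages most similar to `message`."""
    query = Counter(tokenize(message))
    scored = sorted(
        ((cosine_similarity(query, Counter(tokenize(text))), label)
         for text, label in training_set),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical pre-classified data set, invented for this example.
TRAINING_SET = [
    ("win a free prize now", "spam"),
    ("cheap pills free offer", "spam"),
    ("claim your free money today", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the attached report", "ham"),
    ("lunch on friday with the team", "ham"),
]

label = knn_classify("free prize offer now", TRAINING_SET)
```

Because the classifier only compares against stored examples, it needs no training step beyond loading the labelled data, which fits an offline application.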
We introduce MorphNLI, a modular step-by-step approach to natural language inference (NLI). When classifying the premise-hypothesis pairs into {entailment, contradiction, neutral}, we use a language model to generate the necessary edits to incrementally transform (i.e., morph) the premise into the hypothesis. Then, using an off-the-shelf NLI model, we track how the entailment progresses with these atomic changes, aggregating these intermediate labels into the final output. We demonstrate the advantages of our proposed method particularly in realistic...
In the era of data-driven technologies, the need for diverse and high-quality datasets for training and testing machine learning models has become increasingly critical. In this article, we present a versatile methodology, the Generic Methodology for Constructing Synthetic Data Generation (GeMSyD), which addresses the challenge of synthetic data creation in the context of smart devices. GeMSyD provides a framework that enables the generation of synthetic datasets, aligning them closely with real-world data. To demonstrate the utility of GeMSyD,...
The high popularity of the modern web is partly due to the increase in the number of content sharing applications. The social tools provided by these applications allow online users to interact, express their opinions and read opinions from other users. However, spammers provide comments which are written intentionally to mislead users by redirecting them to web sites, to increase their rating and to promote products less known on the market. Reading spam comments is a bad experience and a waste of time for most users, but it can also be harmful and cause damage to the reader. Research has been performed in this...
The electroencephalography (EEG) data records vast amounts of human cerebral activity, yet is still reviewed primarily by human readers. Most of the time, the EEG data is contaminated with signals of non-cerebral origin, called artifacts, which can be very difficult to detect visually and, if undiscovered, can damage the neural information analysis. The purpose of our work is to detect the artifacts by identifying the most relevant features, both in the temporal and frequency domains, and to train various supervised learning algorithms: Decision Trees, SVM and KNN,...
High accuracy is essential to any data mining process. A large part of the factors which influence the success of a data mining problem reside in the quality of the data used. Feature selection represents one of the tools which can refine the dataset before presenting it to the learning scheme. This paper analyzes the wrapper approach for feature selection, with the purpose of boosting the classification accuracy. The wrapper is viewed as a 3-tuple consisting of a generation procedure, an evaluation function and a validation procedure. Experimental evaluations have been performed on several...
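The wrapper idea in the abstract above can be sketched as follows, assuming a greedy forward search as the generation procedure and the leave-one-out accuracy of a 1-nearest-neighbour classifier as the evaluation function. Both choices, the function names, and the toy data are assumptions for illustration; the abstract does not say which procedures the paper evaluates, and the validation procedure (scoring the chosen subset on held-out data) is left out to keep the sketch short.

```python
def loo_accuracy(rows, labels, features):
    """Evaluation function: leave-one-out accuracy of 1-NN using only `features`."""
    if not features:
        return 0.0
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue
            dist = sum((row[f] - other[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(rows)


def forward_selection(rows, labels, evaluate=loo_accuracy):
    """Generation procedure: add one feature at a time while the score strictly improves."""
    remaining = set(range(len(rows[0])))
    selected, best_score = [], 0.0
    while remaining:
        candidate, candidate_score = None, best_score
        for f in sorted(remaining):
            score = evaluate(rows, labels, selected + [f])
            if score > candidate_score:
                candidate, candidate_score = f, score
        if candidate is None:  # no remaining feature improves the score
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected, best_score


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
ROWS = [(0.0, 5.0), (0.1, 1.0), (0.2, 9.0), (1.0, 5.1), (1.1, 0.9), (1.2, 9.1)]
LABELS = [0, 0, 0, 1, 1, 1]

selected, score = forward_selection(ROWS, LABELS)
```

The defining trait of a wrapper is that the learner itself scores each candidate subset, so swapping `evaluate` for another classifier's accuracy changes which features are kept.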
The main objective of this paper is the time-frequency analysis of the EEG signal captured in a cognitive task (i.e. object recognition) performed by human subjects. We investigate whether the power spectral density in the gamma frequency range can be used to classify the outcome of the recognition (i.e. seen, unseen, uncertain). The EEG signals were acquired and analyzed from 128 electrodes located on all parts of the brain. Power spectral density features are extracted and used for classification by support vector machine (SVM), K-Nearest Neighbor (KNN) and Artificial...
Community detection in social networks is a hot research topic that has received great interest in recent years due to its wide applicability. This paper proposes a scalable approach for community structure identification using a genetic algorithm. Two existing fitness functions are analyzed and the parameters are tuned on thoroughly studied networks with known community structures. Experiments on a large data set show how the amount of time necessary to determine meaningful communities in a network is significantly reduced by running...
As virtual home assistants are becoming more popular, there is an emerging need for supporting languages other than English. While wide-spread or popular languages such as Spanish, French or Hindi are already integrated into existing home assistants like Google Home or Alexa, the integration of less-known languages such as Romanian is still missing. This paper explores the problem of Natural Language Understanding (NLU) applied to a Romanian home assistant. We propose a customized capsule neural network architecture that performs intent detection and slot filling in...
Vehicular traffic in urban areas faces congestion challenges that negatively impact our lives. The infrastructure associated with intelligent transportation systems provides means for addressing the congestion challenges in urban areas. This study proposes an effective and scalable vehicular traffic congestion avoidance methodology. It introduces a thresholding mechanism to predict and avoid congestion during route computation. Our methodology was evaluated and validated by employing four road network topologies, three traffic density levels and various traffic light...
Malware signatures represent a powerful tool for malware detection and classification, widely used by security researchers and security solution providers. Yara rules describe malware signatures based on string patterns that are evaluated on targeted files. Generally, the security solution provider sends the Yara rules to the client endpoints and the rule evaluation is performed locally, such that the scanned files do not leave the client machines. However, if a zero-day vulnerability is discovered and a Yara rule exposes the corresponding signature, there is a considerable risk to also disclose the unpatched vulnerability. A...
In recent years, machine learning (ML) has become increasingly popular in various fields of activity. Cloud platforms have also grown in popularity, as they offer services that are more secure and accessible worldwide. In this context, cloud-based technologies have emerged to support ML, giving rise to the machine learning as a service (MLaaS) concept. However, clients accessing ML services in order to obtain classification results on private data may be reluctant to upload sensitive information to the cloud. The model owners prefer not...
Smart cities facilitate the comprehensive management and operation of urban data generated within a city, establishing the foundation for smart services and addressing diverse urban challenges. A smart system for public laundry management uses artificial intelligence-based solutions to solve the challenges of inefficient utilization of public laundries, waiting times, overbooking or underutilization of machines, balancing of loads across machines, and the implementation of energy-saving features. We propose SmartLaundry, a real-time system design for smart recommendations to better manage...
Purpose: Improving healthcare services by developing assistive technologies includes both the health aid devices and the analysis of the data collected by them. The acquired data modeled as a knowledge base give more insight into each patient’s health status and needs. Therefore, the ultimate goal of a health-care system is obtaining recommendations provided by an assistive decision support system using such a knowledge base, benefiting the patients, the physicians and the healthcare industry. This paper aims to define the knowledge flow for a medical assistive decision support system by structuring raw medical data and leveraging the knowledge contained in the data proposing...
In the context of current technological progress, big data arises as a compelling research topic. This paper presents non-traditional data analysis strategies like exploiting data semantics (cycle identification) as well as traditional ones (signal interpolation and correlation) for industrial data analysis within the Big Data paradigm. A general approach for preprocessing operations and for exploring and extracting valuable knowledge from a large data set is defined. The identified strategies are tested and validated on real industrial data characterized by a multitude...
Medical diagnosis and prognosis is an emblematic example for classification problems. Machine learning could provide invaluable support for automatically inferring diagnostic rules from descriptions of past cases, making the diagnosis process more objective and reliable. Since the problem involves both test and misclassification costs, we have analyzed ICET, the most prominent approach in the literature for complex cost problems. The hybrid algorithm tries to avoid the pitfalls of traditional greedy induction by performing a heuristic search...
This paper presents a system for identifying communities in networks built based on opinions and social data. We show how we can build graphs from social interactions and identify the community structure of these graphs. We handle both types of data: one-dimensional and multidimensional. As a community detection method, we use the Infomap algorithm. The dimensions considered are one or many attributes. The contradictions can be detected using the identified communities.
This paper focuses on solutions to two NP-Complete problems: k-SAT and the knapsack problem. We propose a new parallel genetic algorithm strategy on the CUDA architecture, and perform experiments to compare it with the sequential versions. We show how these problems can benefit from the GPU solutions, leading to significant improvements in speedup while keeping the quality of the solution. The best performance obtained in terms of speedup is 67 times. The solution presented in this paper suggests a general strategy for finding fast and robust solutions to complex problems.
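For readers unfamiliar with genetic algorithms, the sketch below shows a plain sequential version applied to the 0/1 knapsack problem. It is not the parallel CUDA strategy the abstract describes. The operators (tournament selection, one-point crossover, bit-flip mutation, elitism), the parameter defaults, and the toy `ITEMS` are all assumptions chosen for illustration, and the sketch assumes at least two items with positive values.

```python
import random


def fitness(chromosome, items, capacity):
    """Total value of the selected items, or 0 if the capacity is exceeded."""
    weight = sum(w for bit, (w, _) in zip(chromosome, items) if bit)
    value = sum(v for bit, (_, v) in zip(chromosome, items) if bit)
    return value if weight <= capacity else 0


def genetic_knapsack(items, capacity, pop_size=30, generations=60,
                     mutation_rate=0.05, seed=0):
    """Return (best_chromosome, best_value) found by a simple sequential GA."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    n = len(items)

    def score(chromosome):
        return fitness(chromosome, items, capacity)

    def tournament(population):
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if score(a) >= score(b) else b

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    population[0] = [0] * n  # the empty knapsack guarantees a feasible individual
    best = max(population, key=score)

    for _ in range(generations):
        next_population = [best[:]]  # elitism: carry the best individual over
        while len(next_population) < pop_size:
            parent_a, parent_b = tournament(population), tournament(population)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=score)

    return best, score(best)


# Hypothetical toy instance: (weight, value) pairs and a capacity of 15.
ITEMS = [(12, 4), (2, 2), (1, 1), (1, 2), (4, 10)]
best, value = genetic_knapsack(ITEMS, capacity=15)
```

Each fitness evaluation is independent of the others, which is the kind of work a GPU implementation can spread across threads.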
The interest in time-series classification has increased in the last decade. Most of the proposals introduced new algorithms for time-series classification, clustering and prediction. The proposals deal with two major aspects: dimensionality reduction techniques (e.g. Piecewise Aggregate Approximation, Symbolic Aggregate Approximation, etc.) and similarity measures (e.g. Euclidean distance, Dynamic Time Warping, etc.). In this article, we give an overview of the advantages and disadvantages of these algorithms.
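The techniques named in the abstract above can be illustrated with a minimal sketch of Piecewise Aggregate Approximation (PAA), Euclidean distance, and Dynamic Time Warping (DTW). The function names are illustrative, and two simplifications are assumptions of this sketch: PAA requires the series length to be divisible by the number of segments, and DTW uses the absolute difference as local cost with no warping window.

```python
import math


def paa(series, segments):
    """Piecewise Aggregate Approximation: replace each equal-length segment by its mean."""
    if len(series) % segments != 0:
        raise ValueError("series length must be divisible by the number of segments")
    size = len(series) // segments
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(segments)]


def euclidean(a, b):
    """Euclidean distance; only defined for series of equal length."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Dynamic Time Warping distance computed with the classic O(len(a) * len(b)) table."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


reduced = paa([1, 2, 3, 4, 5, 6], segments=3)
warped = dtw([1, 2, 3], [1, 2, 2, 3])
```

The sketch makes one trade-off visible: Euclidean distance is cheap but needs equal-length, aligned series, while DTW tolerates stretching along the time axis at quadratic cost.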
Natural Language Understanding (NLU) is currently a very high-interest domain to both academia and the commercial environment, due in largest part to the recent increased popularity of conversational systems. In this paper we focus on the home assistant application context and identify a set of language and data-related challenges that can occur in such a scenario, such as: distribution shift, missing information and class imbalance. We systematically generate datasets in the Romanian language that model these data complexities and further investigate...
Meta-learning is currently a hot research topic in machine learning, which has emerged from the need to support data mining automation in issues related to algorithm and parameter selection. Finding the best learning strategy for a new domain/problem can prove to be an expensive and time-consuming process even for experienced analysts. This paper presents a meta-learning system, designed to automatically discover the most reliable learning schemes for a particular dataset, based on the knowledge the system acquired about similar datasets. The...