- Topic Modeling
- Natural Language Processing Techniques
- Speech and Dialogue Systems
- Handwritten Text Recognition Techniques
- Graph Theory and Algorithms
- Complex Network Analysis Techniques
- Software Engineering Research
- Web Data Mining and Analysis
- Software Testing and Debugging Techniques
- AI in Service Interactions
- Caching and Content Delivery
- Data Mining Algorithms and Applications
- Advanced Graph Neural Networks
- Sentiment Analysis and Opinion Mining
- Software Engineering Techniques and Practices
- Machine Learning and Data Classification
- Opportunistic and Delay-Tolerant Networks
- Advanced Database Systems and Queries
- Scientific Computing and Data Management
- Algorithms and Data Compression
- Neural Networks and Applications
- Semantic Web and Ontologies
- Image Processing and 3D Reconstruction
- Advanced Image and Video Retrieval Techniques
University of California, Riverside
2020-2021
King Abdullah University of Science and Technology
2013-2019
Kootenay Association for Science & Technology
2019
University of Jordan
2010
Frequent Subgraph Mining is an essential operation for graph analytics and knowledge extraction. Due to its high computational cost, parallel solutions are necessary. Existing approaches suffer from either load imbalance or communication and synchronization overheads. In this paper we propose ScaleMine, a novel frequent subgraph mining system for a single large graph. ScaleMine introduces a two-phase approach. The first phase is approximate; it quickly identifies subgraphs that are frequent with high probability, while...
Betweenness centrality quantifies the importance of nodes in a graph in many applications, including network analysis, community detection, and identification of influential users. Typically, graphs in such applications evolve over time. Thus, the computation of betweenness centrality should be performed incrementally. This is challenging because updating even a single edge may trigger the computation of all-pairs shortest paths in the entire graph. Existing approaches cannot scale to large graphs: they either require excessive memory (i.e.,...
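To make the cost of recomputation concrete, the static computation can be sketched with Brandes' algorithm for unweighted, undirected graphs. This is the standard textbook algorithm, not the incremental method the abstract proposes; the adjacency-dict representation and the function name `betweenness_centrality` are assumptions made for illustration.

```python
from collections import deque

def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    adj: dict mapping each node to an iterable of its neighbours.
    Returns a dict of unnormalised betweenness scores.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Single-source shortest paths from s by breadth-first search.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}   # number of shortest s-v paths
        sigma[s] = 1
        preds = {v: [] for v in adj}  # predecessors on shortest paths
        order = []                    # nodes in non-decreasing distance
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: score / 2.0 for v, score in bc.items()}
```

Every source node runs its own breadth-first search, so a naive response to a single edge update is to repeat all of them, which is the cost an incremental approach tries to avoid.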
Betweenness centrality quantifies the importance of graph nodes in a variety of applications, including social, biological, and communication networks. Its computation is very costly for large graphs; therefore, many approximate methods have been proposed. Given the lack of a golden standard, the accuracy of most approximate methods is evaluated on tiny graphs and is not guaranteed to be representative of realistic datasets that are orders of magnitude larger. In this paper, we develop BeBeCA, a benchmark for betweenness centrality approximation methods on large graphs....
Slot filling is identifying contiguous spans of words in an utterance that correspond to certain parameters (i.e., slots) of a user request/query. Slot filling is one of the most important challenges in modern task-oriented dialog systems. Supervised approaches have proven effective at tackling this challenge, but they need a significant amount of labeled training data in a given domain. However, new domains (i.e., unseen in training) may emerge after deployment. Thus, it is imperative that these models seamlessly adapt and fill slots from...
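As a small illustration of what "contiguous spans" means here, the sketch below recovers slot values from per-token tags. It assumes the common BIO tagging convention, which the abstract does not mention; the function name `extract_slots` and the slot names in the usage note are hypothetical.

```python
def extract_slots(tokens, tags):
    """Collect contiguous slot spans from per-token BIO tags.

    Tags follow the common BIO convention (an assumption, not taken
    from the abstract): "B-x" begins slot x, "I-x" continues it, and
    "O" marks a token outside any slot.
    Returns a list of (slot_name, text) pairs in utterance order.
    """
    slots = []
    current_name, current_words = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span starts; close any span that is still open.
            if current_name is not None:
                slots.append((current_name, " ".join(current_words)))
            current_name, current_words = tag[2:], [token]
        elif tag.startswith("I-") and current_name == tag[2:]:
            current_words.append(token)
        else:
            # "O" or an inconsistent "I-" tag closes the open span.
            if current_name is not None:
                slots.append((current_name, " ".join(current_words)))
            current_name, current_words = None, []
    if current_name is not None:
        slots.append((current_name, " ".join(current_words)))
    return slots
```

For the utterance "book a flight to new york tomorrow" tagged `O O O O B-destination I-destination B-date`, this returns `[("destination", "new york"), ("date", "tomorrow")]`.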
Identifying user intents from natural language utterances is a crucial step in conversational systems that has been extensively studied as a supervised classification problem. However, in practice, new intents emerge after deploying an intent detection model. Thus, these models should seamlessly adapt and classify utterances with both seen and unseen intents; unseen intents emerge after deployment and do not have training data. The few existing models that target this setting rely heavily on the training data of seen intents and consequently overfit to these intents, resulting in a bias to misclassify...
Existing query engines for RDF graphs follow one of two design paradigms: relational or graph-based. We explore sparse matrix algebra as a third paradigm and propose MAGiQ: a framework for implementing SPARQL query engines that are portable on various hardware architectures, scalable over thousands of compute nodes, and efficient for very large datasets. MAGiQ represents the RDF graph as a sparse matrix and defines a domain-specific language of algebraic operations. SPARQL queries are translated into matrix algebra programs that are oblivious to the underlying computing infrastructure....
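The general idea of answering a graph pattern with sparse matrix algebra can be sketched as follows. This is an illustrative toy under stated assumptions, not MAGiQ's actual representation or domain-specific language: each predicate is stored as its own boolean sparse matrix, and the helper names `predicate_matrices` and `bool_matmul` are hypothetical.

```python
from collections import defaultdict

def predicate_matrices(triples):
    """Build one boolean sparse matrix per predicate.

    Each matrix is stored as {subject: {objects}}, i.e. only the
    non-zero entries of the subject-by-object adjacency matrix.
    """
    matrices = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        matrices[p][s].add(o)
    return matrices

def bool_matmul(a, b):
    """Sparse matrix product over the boolean (OR, AND) semiring.

    result[x] contains z iff some y has both a[x][y] and b[y][z] set.
    """
    result = {}
    for x, ys in a.items():
        zs = set()
        for y in ys:
            zs |= b.get(y, set())
        if zs:
            result[x] = zs
    return result
```

With the triples `("alice", "knows", "bob")`, `("bob", "worksAt", "kaust")`, and `("carol", "knows", "dave")`, the product `bool_matmul(m["knows"], m["worksAt"])` yields `{"alice": {"kaust"}}`, which corresponds to the two-hop pattern `?x :knows ?y . ?y :worksAt ?z`. Because the query is expressed only through matrix operations, the same program could in principle run on any backend that provides them.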
Hundreds of thousands of mobile app users post their reviews online. Responding to user reviews promptly and satisfactorily improves application ratings, which is key to application popularity and success. The proliferation of such reviews makes it virtually impossible for developers to keep up with responding manually. To address this challenge, recent work has shown the possibility of automatic response generation by training a seq2seq model with a large collection of review-response pairs. However, because the review-response pairs are aggregated from many...
To recognize an unlimited set of handwritten Arabic words, an efficient segmentation algorithm is needed to segment these cursive words into a limited set of primal graphemes. We propose a rule-based segmentation algorithm that segments Arabic words into graphemes through collecting special feature points from the word skeleton. The development of this algorithm is motivated by the need to solve the problems and limitations of the state-of-the-art segmentation algorithms available in this area. The preliminary evaluation of the proposed algorithm is promising, with over 96% accuracy on a sample subset of the IFN/ENIT database.
Existing RDF engines follow one of two design paradigms: relational or graph-based. Such engines are typically designed for specific hardware architectures, mainly CPUs, and are not easily portable to new architectures. Porting an existing engine to a different architecture (e.g., many-core architectures) entails almost a redesign from scratch. We explore sparse matrix algebra as a third paradigm for designing a portable, scalable, and efficient RDF engine. We demonstrate MAGiQ, an approach for evaluating complex SPARQL queries over...
Most existing commercial goal-oriented chatbots are diagram-based; i.e., they follow a rigid dialog flow to fill the slot values needed to achieve a user's goal. Diagram-based chatbots are predictable, thus their adoption in commercial settings; however, their lack of flexibility may cause many users to leave the conversation before achieving their goal. On the other hand, state-of-the-art research chatbots use Reinforcement Learning (RL) to generate flexible dialog policies. However, such chatbots can be unpredictable, violate the intended business constraints, and require large...
Artificial neural networks have the ability to learn by example and are capable of solving problems that are hard to solve using ordinary rule-based programming. They have many design parameters that affect their performance, such as the number and sizes of the hidden layers. Large networks are slow and small networks are generally not accurate. Tuning the neural network size is a hard task because the design space is often large and training is often a long process. We use design of experiments techniques to tune the recurrent neural network used in an Arabic handwriting recognition system. We show that the best results are achieved with...
Most existing commercial goal-oriented chatbots are diagram-based; i.e., they follow a rigid dialog flow to fill the slot values needed to achieve a user’s goal. Diagram-based chatbots are predictable, thus their adoption in commercial settings; however, their lack of flexibility may cause many users to leave the conversation before achieving their goal. On the other hand, state-of-the-art research chatbots use Reinforcement Learning (RL) to generate flexible dialog policies. However, such chatbots can be unpredictable, violate the intended business constraints, and require large...
Identifying user intents from natural language utterances is a crucial step in conversational systems that has been extensively studied as a supervised classification problem. However, in practice, new intents emerge after deploying an intent detection model. Thus, these models should seamlessly adapt and classify utterances with both seen and unseen intents; unseen intents emerge after deployment and do not have training data. The few existing models that target this setting rely heavily on the scarcely available training data and overfit to seen intents' data, resulting in a bias to misclassify...
Frequently Asked Questions (FAQ) are a form of semi-structured data that provides users with commonly requested information and enables several natural language processing tasks. Given the plethora of such question-answer pairs on the Web, there is an opportunity to automatically build large FAQ collections for any domain, such as COVID-19 or Plastic Surgery. These collections can be used by information-seeking portals and applications, such as AI chatbots. Automatically identifying and extracting such high-utility question-answer pairs is challenging...
Responding to user reviews promptly and satisfactorily improves application ratings, which is key to application popularity and success. The proliferation of such reviews makes it virtually impossible for developers to keep up with responding manually. To address this challenge, recent work has shown the possibility of automatic response generation. However, because the training review-response pairs are aggregated from many different apps, it remains challenging for such models to generate app-specific responses, which, on the other hand, often...
Slot filling is identifying contiguous spans of words in an utterance that correspond to certain parameters (i.e., slots) of a user request/query. Slot filling is one of the most important challenges in modern task-oriented dialog systems. Supervised learning approaches have proven effective at tackling this challenge, but they need a significant amount of labeled training data in a given domain. However, new domains (i.e., unseen in training) may emerge after deployment. Thus, it is imperative that these models seamlessly adapt and fill slots...