- Topic Modeling
- Natural Language Processing Techniques
- Reinforcement Learning in Robotics
- Text and Document Classification Technologies
- Domain Adaptation and Few-Shot Learning
- Machine Learning and Algorithms
- Artificial Intelligence in Games
- Speech Recognition and Synthesis
- Multimodal Machine Learning Applications
- Speech and Audio Processing
- Music and Audio Processing
- Advanced Bandit Algorithms Research
- Advanced Graph Neural Networks
- Recommender Systems and Techniques
- Advanced Image and Video Retrieval Techniques
- Image Retrieval and Classification Techniques
- Machine Learning and Data Classification
- Imbalanced Data Classification Techniques
- Digital Games and Media
- Auction Theory and Applications
- Advanced Text Analysis Techniques
- Anomaly Detection Techniques and Applications
- Generative Adversarial Networks and Image Synthesis
- Tensor Decomposition and Applications
- Consumer Market Behavior and Pricing
University of Charleston
2023
Lamsade
2021
Meta (Israel)
2015-2021
Menlo School
2016-2020
Meta (United States)
2016-2020
Heuristics and Diagnostics for Complex Systems
2013-2016
Université de Technologie de Compiègne
2013-2015
Centre National de la Recherche Scientifique
2010-2014
Laboratoire de Recherche en Informatique de Paris 6
2006-2014
Sorbonne Université
2005-2014
Image annotation datasets are becoming larger and larger, with tens of millions of images and tens of thousands of possible annotations. We propose a strongly performing method that scales to such datasets by simultaneously learning to optimize precision at the top of the ranked list of annotations for a given image and learning a low-dimensional joint embedding space for both images and annotations. Our method, called WSABIE, outperforms several baseline methods while being faster and consuming less memory.
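The joint-embedding idea can be illustrated with a minimal numpy sketch (the matrices `V` and `W`, and all sizes below, are illustrative stand-ins with random values, not the trained WSABIE parameters): an image's features are mapped into a low-dimensional space where each annotation also has an embedding, and annotations are ranked by dot product with the embedded image.

```python
import numpy as np

rng = np.random.default_rng(3)
img_dim, n_labels, d = 16, 100, 8

# Illustrative random parameters of the joint embedding space.
V = rng.normal(size=(d, img_dim))   # maps image features into the joint space
W = rng.normal(size=(n_labels, d))  # one low-dimensional embedding per annotation

def rank_annotations(x):
    """Score each annotation by its dot product with the embedded image,
    then return annotation indices sorted from most to least relevant."""
    scores = W @ (V @ x)
    return np.argsort(-scores)

x = rng.normal(size=img_dim)        # stand-in for an image feature vector
ranked = rank_annotations(x)        # full ranked list of annotation indices
```

Precision at the top of `ranked` is what the WSABIE training objective targets; here the ranking itself is just the dot-product scoring step.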
Training large-scale question answering systems is complicated because training sources usually cover only a small portion of the range of possible questions. This paper studies the impact of multitask and transfer learning for simple question answering, a setting in which the reasoning required to answer is quite easy, as long as one can retrieve the correct evidence given a question, which can be difficult in large-scale conditions. To this end, we introduce a new dataset of 100k questions that we use in conjunction with existing benchmarks. We conduct our study...
This paper introduces a new encoder-decoder architecture that is trained to reconstruct images by disentangling the salient information of the image and the values of attributes directly in the latent space. As a result, after training, our model can generate different realistic versions of an input image by varying the attribute values. By using continuous attribute values, we can choose how much a specific attribute is perceivable in the generated image. This property could allow for applications where users can modify an image using sliding knobs, like faders on a mixing console,...
The problem of Knowledge Base Completion can be framed as a 3rd-order binary tensor completion problem. In this light, the Canonical Tensor Decomposition (CP) (Hitchcock, 1927) seems like a natural solution; however, current implementations of CP on standard benchmarks are lagging behind their competitors. In this work, we attempt to understand the limits of CP for knowledge base completion. First, we motivate and test a novel regularizer, based on tensor nuclear $p$-norms. Then, we present a reformulation of the problem that makes it invariant...
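As background on this framing, a CP model scores a (subject, relation, object) triple by the trilinear product of three rank-`R` factor matrices. A minimal sketch with random, untrained factors (all names and sizes below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, rank = 5, 3, 4

# CP factors: one embedding per subject, relation, and object slot.
U = rng.normal(size=(n_entities, rank))   # subject embeddings
V = rng.normal(size=(n_relations, rank))  # relation embeddings
W = rng.normal(size=(n_entities, rank))   # object embeddings

def cp_score(s, r, o):
    """Score of triple (s, r, o): the trilinear product <U[s], V[r], W[o]>."""
    return float(np.sum(U[s] * V[r] * W[o]))

# Link prediction: rank all candidate objects for a query (s, r, ?).
scores = np.array([cp_score(0, 1, o) for o in range(n_entities)])
best = int(np.argmax(scores))
```

The binary tensor view is that entry (s, r, o) of the third-order tensor is 1 when the fact holds, and CP approximates that tensor with the low-rank scores above.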
We propose an extension to neural network language models that adapts their prediction to the recent history. Our model is a simplified version of memory augmented networks, which stores past hidden activations as memory and accesses them through a dot product with the current hidden activation. This mechanism is very efficient and scales to very large memory sizes. We also draw a link between the use of external memory in neural networks and cache models used with count based language models. We demonstrate on several datasets that our approach performs significantly better than recurrent memory augmented networks.
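The cache mechanism can be sketched in a few lines of numpy (a simplified illustration under assumed shapes, not the paper's exact parameterization): past hidden states are compared to the current one by dot product, and the resulting softmax weights put probability mass on the words that followed those states.

```python
import numpy as np

def cache_distribution(h_t, past_h, past_words, vocab_size, theta=1.0):
    """Cache probability over the vocabulary: mass exp(theta * h_t . h_i)
    is accumulated on word past_words[i], then normalized."""
    sims = past_h @ h_t                            # dot products with current state
    weights = np.exp(theta * (sims - sims.max()))  # numerically stable exponentials
    p = np.zeros(vocab_size)
    np.add.at(p, past_words, weights)              # accumulate mass per past word
    return p / p.sum()

rng = np.random.default_rng(1)
d, T, V = 8, 6, 10                     # illustrative sizes
past_h = rng.normal(size=(T, d))       # stored hidden activations (the memory)
past_words = rng.integers(0, V, size=T)  # words emitted at those positions
h_t = rng.normal(size=d)               # current hidden activation
p_cache = cache_distribution(h_t, past_h, past_words, V)
# In use, this is interpolated with the model's own distribution,
# e.g. p = (1 - lam) * p_model + lam * p_cache.
```

The link to count-based cache models is visible here: with flat similarities, the distribution reduces to relative word counts in the recent history.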
We release Code Llama, a family of large language models for code based on Llama 2, providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks. We provide multiple flavors to cover a wide range of applications: foundation models (Code Llama), Python specializations (Code Llama - Python), and instruction-following models (Code Llama - Instruct), with 7B, 13B, 34B and 70B parameters each. All models are trained on sequences of 16k tokens and show improvements...
Source separation for music is the task of isolating contributions, or stems, from different instruments recorded individually and arranged together to form a song. Such components include voice, bass, drums and any other accompaniments. Contrarily to many audio synthesis tasks where the best performances are achieved by models that directly generate the waveform, the state-of-the-art in source separation for music is to compute masks on the magnitude spectrum. In this paper, we compare two waveform domain architectures. We first adapt...
This paper proposes a novel approach for relation extraction from free text which is trained to jointly use information from the text and from existing knowledge. Our model is based on two scoring functions that operate by learning low-dimensional embeddings of words and of entities and relationships from a knowledge base. We empirically show, on New York Times articles aligned with Freebase relations, that our approach is able to efficiently use the extra information provided by a large subset of Freebase data (4M entities, 23k relationships) to improve over methods that rely on text features alone.
We introduce Parseval networks, a form of deep neural networks in which the Lipschitz constant of linear, convolutional and aggregation layers is constrained to be smaller than 1. Parseval networks are empirically and theoretically motivated by an analysis of the robustness of the predictions made by deep neural networks when their input is subject to an adversarial perturbation. The most important feature of Parseval networks is to maintain the weight matrices of linear and convolutional layers as (approximately) Parseval tight frames, which are extensions of orthogonal matrices to non-square matrices. We describe how these constraints can be maintained...
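One common way to keep a weight matrix close to a Parseval tight frame is a retraction step interleaved with gradient updates. A sketch of that iteration on a random stand-in matrix (the step count and `beta` below are illustrative choices, not prescribed values):

```python
import numpy as np

def parseval_retraction(W, beta=0.5, steps=50):
    """Push W toward a Parseval tight frame (W @ W.T ~ I) by repeatedly
    applying the retraction W <- (1 + beta) * W - beta * W W^T W."""
    for _ in range(steps):
        W = (1 + beta) * W - beta * (W @ W.T @ W)
    return W

rng = np.random.default_rng(2)
W = 0.1 * rng.normal(size=(3, 5))  # wide matrix: its rows form the frame
W = parseval_retraction(W)
# At convergence W @ W.T is (approximately) the identity, so the spectral
# norm of W is ~1 and the layer's Lipschitz constant is controlled.
```

The iteration acts on each singular value as sigma -> (1 + beta) * sigma - beta * sigma**3, whose stable fixed point is 1, which is why it drives the matrix toward a tight frame.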
In ranking with the pairwise classification approach, the loss associated to a predicted ranked list is the mean of the pairwise classification losses. This loss is inadequate for tasks like information retrieval where we prefer ranked lists with high precision on the top of the list. We propose to optimize a larger class of loss functions for ranking, based on an ordered weighted average (OWA) (Yager, 1988) of the classification losses. Convex OWA aggregation operators range from the max to the mean depending on their weights, and can be used to focus on the top ranked elements as they give more weight to the largest losses. When aggregating hinge losses,...
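The OWA aggregation itself is simple to state: sort the losses in decreasing order, then take a weighted mean with non-increasing weights so the largest losses dominate. A small numpy sketch (the weight vectors are illustrative):

```python
import numpy as np

def owa_loss(losses, weights):
    """Ordered weighted average: losses sorted in decreasing order, then
    averaged with the given non-increasing weights."""
    ordered = np.sort(np.asarray(losses, dtype=float))[::-1]  # largest first
    w = np.asarray(weights, dtype=float)
    return float(np.dot(w, ordered) / w.sum())

losses = np.array([0.2, 1.5, 0.0, 0.7])
mean_loss = owa_loss(losses, [1, 1, 1, 1])  # uniform weights: the plain mean
top1_loss = owa_loss(losses, [1, 0, 0, 0])  # all weight on the largest: the max
```

Intermediate weight profiles between these two extremes interpolate between mean and max, which is how the loss is tuned to emphasize precision at the top of the list.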
In this work we introduce a structured prediction model that endows the Deep Gaussian Conditional Random Field (G-CRF) with a densely connected graph structure. We keep memory and computational complexity under control by expressing the pairwise interactions as inner products of low-dimensional, learnable embeddings. The G-CRF system matrix is therefore low-rank, allowing us to solve the resulting system in a few milliseconds on the GPU by using conjugate gradient. As in G-CRF, inference is exact, and the unary and pairwise terms are jointly...
Most algorithms for representation learning and link prediction in relational data have been designed for static data. However, the data they are applied to usually evolves with time, such as friend graphs in social networks or user interactions with items in recommender systems. This is also the case for knowledge bases, which contain facts such as (US, has president, B. Obama, [2009-2017]) that are valid only at certain points in time. For the problem of link prediction under temporal constraints, i.e., answering queries such as (US, has president, ?, 2012), we propose a...
Hashing produces compact representations for documents, to perform tasks like classification or retrieval based on these short codes. When hashing is supervised, the codes are trained using the labels of the training data. This paper first shows that the evaluation protocols used in the literature for supervised hashing are not satisfactory: we show that a trivial solution that encodes the output of a classifier significantly outperforms existing supervised or semi-supervised methods, while using much shorter codes. We then propose two alternative protocols for supervised hashing: one...
We train a bank of complex filters that operates on the raw waveform and is fed into a convolutional neural network for end-to-end phone recognition. These time-domain filterbanks (TD-filterbanks) are initialized as an approximation of mel-filterbanks, and then fine-tuned jointly with the remaining convolutional architecture. We perform phone recognition experiments on TIMIT and show that, for several architectures, models trained on TD-filterbanks consistently outperform their counterparts trained on comparable mel-filterbanks. We get our best performance by learning...
Performing link prediction in Knowledge Bases (KBs) with embedding-based models, like the model TransE (Bordes et al., 2013), which represents relationships as translations in the embedding space, has shown promising results in recent years. Most of these works are focused on modeling single relationships and hence do not take full advantage of the graph structure of KBs. In this paper, we propose an extension of TransE that learns to explicitly model the composition of relationships via the addition of their corresponding translation vectors. We show empirically that this allows...
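The composition idea can be shown with a tiny numpy sketch (the vectors below are hand-picked toy values, not learned embeddings): TransE scores a triple by how well the relation vector translates the head onto the tail, and a relation path is modeled by simply summing the translation vectors along the path.

```python
import numpy as np

def transe_score(h, t, r):
    """TransE plausibility of (h, r, t): -||h + r - t||; higher is better."""
    return -float(np.linalg.norm(h + r - t))

def path_score(h, t, relations):
    """Composition: a path r1, ..., rk is represented by the sum of its
    translation vectors, so score(h, path, t) = -||h + r1 + ... + rk - t||."""
    return transe_score(h, t, np.sum(relations, axis=0))

d = 4
h = np.zeros(d)                    # head entity embedding (toy value)
r1 = np.array([1.0, 0.0, 0.0, 0.0])  # first relation as a translation
r2 = np.array([0.0, 1.0, 0.0, 0.0])  # second relation as a translation
t = np.array([1.0, 1.0, 0.0, 0.0])   # tail entity embedding
# Here the composed path r1 -> r2 translates h exactly onto t,
# so the path score reaches its maximum of 0.
```

Because composition is plain vector addition, a multi-hop query costs no more to score than a single relation, which is what lets the model exploit the graph structure of the KB.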