- Topic Modeling
- Natural Language Processing Techniques
- Multimodal Machine Learning Applications
- Domain Adaptation and Few-Shot Learning
- Literature, Culture, and Criticism
- History, Culture, and Society
- Text Readability and Simplification
- Brazilian Legal Issues
- Advanced Image and Video Retrieval Techniques
- Brazilian cultural history and politics
- Education and Digital Technologies
- Linguistics and Language Studies
- Medical Malpractice and Liability Issues
- Urban and sociocultural dynamics
- Youth, Drugs, and Violence
- Explainable Artificial Intelligence (XAI)
- Rural Development and Agriculture
- History of Colonial Brazil
- Academic Research in Diverse Fields
- Speech and dialogue systems
- Software Engineering Research
- Text and Document Classification Technologies
- Education Pedagogy and Practices
- Translation Studies and Practices
- Gender, Sexuality, and Education
Faculdade Pernambucana de Saúde, 2021
Faculdade Frassinetti do Recife, 2021
University of Lisbon, 2019-2021
Universidade Federal de Minas Gerais, 2016-2021
Instituto de Telecomunicações, 2019-2021
Instituto Superior Técnico, 2020
Universidade do Estado de Santa Catarina, 2009-2013
Université Sorbonne Nouvelle, 2012
University of Minho, 2012
Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Anuoluwapo Aremu, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna-Adriana Clinciu, Dipanjan Das, Kaustubh Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Chinenye Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Mihir Kale, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa...
Named entity recognition (NER) and entity linking (EL) are two fundamentally related tasks, since in order to perform EL, first the mentions to entities have to be detected. However, most entity linking approaches disregard the mention detection part, assuming that the correct mentions have been previously detected. In this paper, we perform joint learning of NER and EL to leverage their relatedness and obtain a more robust and generalisable system. For that, we introduce a model inspired by the Stack-LSTM approach. We observe that, in fact, doing multi-task learning of NER and EL improves the performance in both tasks...
Natural language generation has witnessed significant advancements due to the training of large language models on vast internet-scale datasets. Despite these advancements, there exists a critical challenge: these models can inadvertently generate content that is toxic, inaccurate, and unhelpful, and existing automatic evaluation metrics often fall short of identifying these shortcomings. As models become more capable, human feedback is an invaluable signal for evaluating and improving models. This survey aims to provide...
We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. Measuring progress in NLG relies on a constantly evolving ecosystem of automated metrics, datasets, and human evaluation standards. Due to this moving target, new models often still evaluate on divergent anglo-centric corpora with well-established, but flawed, metrics. This disconnect makes it challenging to identify the limitations of current models and opportunities for progress. Addressing this limitation, GEM provides...
Current state-of-the-art text generators build on powerful language models such as GPT-2, achieving impressive performance. However, to avoid degenerate text, they require sampling from a modified softmax, via temperature parameters or ad-hoc truncation techniques, as in top-$k$ or nucleus sampling. This creates a mismatch between training and testing conditions. In this paper, we use the recently introduced entmax transformation to train and sample from a natively sparse language model, avoiding this mismatch. The result is...
The quality of open-weight LLMs has seen significant improvement, yet they remain predominantly focused on English. In this paper, we introduce the EuroLLM project, aimed at developing a suite of open-weight multilingual LLMs capable of understanding and generating text in all official European Union languages, as well as several additional relevant languages. We outline the progress made to date, detailing our data collection and filtering process, the development of scaling laws, the creation of our multilingual tokenizer, and the data mix and modeling configurations....
The aim here is to examine, among Cape Verdean immigrants in the Greater Lisbon region of Portugal, some of the aesthetic and identity resources employed as part of their strategy for adapting to the new context. The concept of expressive manifestations, meaning aesthetic manifestations that carry some identity-bearing character, supports the construction and analysis of the empirical object, together with the premises that traditions are invented (Hobsbawm), that communities are imagined (Anderson), and that they only make sense within the social framework in which they emerge...
Using natural language to give instructions to robots is challenging, since natural language understanding is still largely an open problem. In this paper we address this problem by restricting our attention to commands modeled as one action, plus arguments (also known as slots). For action detection (also called intent detection) and slot filling, various architectures of Recurrent Neural Networks and Long Short Term Memory (LSTM) networks were evaluated, with LSTMs achieving superior accuracy. As the action requested may not fall within...
Cassava is one of the most popular food products in Brazil and is important for food security, especially in Maranhão, where the producing areas stand out for their high production capacity, diversity of uses, and flexibility of planting and harvesting, as well as for their sociocultural importance. The aim is therefore to analyze the practices and challenges of cassava produced in Maranhão, in order to support dialogue and opportunities concerning growth and innovation in this production chain. To that end, ...
This study reports the contributions of the Pharmaceutical Assistance Coordination Office to the Covid-19 response in the municipality of Sobral, Ceará. It is an experience report covering the actions carried out from March to October 2020, from the perspective of professionals who lived through the process alongside the network of Covid-19 reference services linked to the Municipal Health Department. In the pandemic setting, pharmaceutical assistance emerged with an important role, ensuring planned actions aimed at comprehensive care and...
The form of the City is made of the constant construction, reuse, and superimposition of a multiplicity of urban elements, creating over a long span of time a heterogeneous and multifaceted entity, a dense cultural landscape defined by a complex sequence of built strata. Assuming that urban construction is a continuous act of producing fabric over pre-existing structures, which leave their mark on the subsequent structures that are imposed on or adapted to them, the elements created are successively reinterpreted in different periods,...
ANTHROPOLOGY AND PIONEERING: FRANCISCO EGON SCHADEN IN THE IMAGINARY OF SÃO BONIFÁCIO (SC)
Visual attention mechanisms are widely used in multimodal tasks, such as visual question answering (VQA). One drawback of softmax-based attention mechanisms is that they assign some probability mass to all image regions, regardless of their adjacency structure and of their relevance to the text. In this paper, to better link the image structure with the text, we replace the traditional softmax attention mechanism with two alternative sparsity-promoting transformations: sparsemax, which is able to select only the relevant regions (assigning zero weight to the rest), and a newly proposed...
Machine translation models struggle when translating out-of-domain text, which makes domain adaptation a topic of critical importance. However, most domain adaptation methods focus on fine-tuning or training the entire or part of the model on every new domain, which can be costly. On the other hand, semi-parametric models have been shown to successfully perform domain adaptation by retrieving examples from an in-domain datastore (Khandelwal et al., 2021). A drawback of these retrieval-augmented models, however, is that they tend to be substantially slower. In this...
This article presents an analysis, from the standpoint of the Anthropology of Law, of the relationship between the right to territory and Malungu's experience in defending the quilombos of Pará during the pandemic, based on specific situations and cases. The research question was: “Which notions of law were brought to light in Malungu's political action during the pandemic?” The authors were directly involved in Malungu's resistance processes. The research drew on a survey of Covid-19 data in the quilombos, publicly shared accounts of experiences, and actions...
Transformers are unable to model long-term memories effectively, since the amount of computation they need to perform grows with the context length. While variations of efficient transformers have been proposed, they all have a finite memory capacity and are forced to drop old information. In this paper, we propose the $\infty$-former, which extends the vanilla transformer with an unbounded long-term memory. By making use of a continuous-space attention mechanism to attend over the long-term memory, the $\infty$-former's attention complexity becomes independent of the context length,...