Daniela Vianna

ORCID: 0000-0003-2943-5211
Research Areas
  • Personal Information Management and User Behavior
  • Topic Modeling
  • Data Quality and Management
  • Context-Aware Activity Recognition Systems
  • Artificial Intelligence in Law
  • Sentiment Analysis and Opinion Mining
  • Natural Language Processing Techniques
  • Hate Speech and Cyberbullying Detection
  • Computational and Text Analysis Methods
  • Text and Document Classification Technologies
  • Climate Change and Health Impacts
  • Imbalanced Data Classification Techniques
  • Cognitive Computing and Networks
  • Business Process Modeling and Analysis
  • Web Data Mining and Analysis
  • Chemical and Physical Properties of Materials
  • Advanced Text Analysis Techniques
  • Information Retrieval and Search Behavior
  • Opportunistic and Delay-Tolerant Networks
  • Language, Metaphor, and Cognition
  • Service-Oriented Architecture and Web Services
  • FinTech, Crowdfunding, Digital Finance

UniBrasil Centro Universitário
2024

Universidade Federal do Amazonas
2022-2023

Rutgers, The State University of New Jersey
2014-2022

Universidade Federal Fluminense
2002

Social media platforms, vital for debate and communication, also grapple with misinformation and hateful comments. This work examines the detection of hate speech in Portuguese, contemplating its unique linguistic and cultural nuances. Leveraging Transformer-based models and different training and activation strategies, eight variations of architecture, size, and pre-training corpora are evaluated. Our findings show that, even though large generative models with enhanced prompts exhibited promising results, tuned small...

10.52591/lxai202406212 article EN 2024-06-21

Digital storage now acts as an archive of the memories of users worldwide, keeping record of data as well as the context in which the data was acquired. The massive amount of data available and the fact that it is fragmented across many services (e.g., Facebook) and devices (e.g., laptop) make it very difficult for users to find specific pieces of information that they remember having stored or accessed. Unifying this data into a single data set that includes contextual information would allow for much better indexing and searching of personal information. Thus, we have developed extraction...

10.1109/icdew.2014.6818307 article EN 2014-03-01

A large number of personal digital traces is constantly generated or available online from a variety of sources, such as social media, calendars, purchase history, etc. These data are fragmented and highly heterogeneous, raising the need for an integrated view of the user's activities. Prior research in Personal Information Management focused mostly on creating a static model of the world (objects and their relationships). We argue that a dynamic model is also helpful in making sense of collections of related documents, and propose...

10.1145/3077331.3077337 article EN 2017-05-14

A significant challenge in the legal domain is to organize and summarize a constantly growing collection of documents, uncovering hidden topics, or themes, that later can support tasks such as case retrieval and judgment prediction. This massive amount of digital documents, combined with the inherent complexity of judiciary systems worldwide, presents a promising scenario for Machine Learning solutions, mainly those taking advantage of all the advancements in the area of Natural Language Processing (NLP). It is in this scenario that Jusbrasil, the largest...

10.1145/3477495.3536329 article EN Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022-07-06

Digital traces of our lives are now constantly produced by various connected devices, internet services and interactions. Our actions result in a multitude of heterogeneous data objects, or traces, kept in various locations in the cloud or on local devices. Users have very few tools to organize, understand, and search the digital traces they produce. We propose a simple but flexible data model to aggregate, organize, and find personal information within a collection of a user's digital traces. Our model uses as basic dimensions the six questions: what, when, where,...

10.1002/pra2.22 article EN Proceedings of the Association for Information Science and Technology 2019-01-01

In Brazil, some cases of hate speech can be qualified as a crime. However, identifying and categorizing offensive comments among the vast number of interactions on social media is complex. Automatic detection of sensitive content is an expanding field, but it faces obstacles due to the subtleties of language and the varied forms of expression. Brazil's rich cultural diversity, shaped by its experiences, culture, traditions, and history of colonization, introduces additional challenges. This linguistic diversity plays a crucial...

10.52591/lxai2024062114 article EN 2024-06-21

Social networks, which play a significant role in modern debate and communication, face the contemporary challenge of a large, disorderly volume of harmful content, such as hate speech and misinformation. This article addresses the detection of hate speech in Portuguese, considering its linguistic particularities and cultural nuances. Using models derived from Transformers, together with several training and activation strategies, nine variations of architecture, size, and corpora...

10.21814/lm.16.2.446 article PT cc-by Linguamática 2024-12-27

This article describes a topic-based approach to the problem of legal case retrieval. The method consists of two phases: filtering and ranking. In the first phase, a topic modeling technique is applied to the entire dataset to select an initial set of candidates for each query. In the second phase, a ranking function is used to produce a ranked list of the cases relevant to the given query. Experimental results obtained using three different ranking functions, with collections in different languages, indicate that...

10.5753/sbbd.2023.232576 article PT 2023-09-25
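
The two-phase idea in the abstract above (filter by topic, then rank) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: topic labels are taken as already assigned, the Jaccard word-overlap scorer is a stand-in rather than any of the ranking functions evaluated in the paper, and all names (`filter_candidates`, `rank_candidates`, `retrieve`) are hypothetical.

```python
def tokenize(text):
    """Lowercase whitespace tokenization into a set of words."""
    return set(text.lower().split())


def filter_candidates(query_topic, doc_topics):
    """Phase 1: keep the ids of documents assigned to the query's topic.

    Assumes topics were precomputed by some topic model (not shown here).
    """
    return [doc_id for doc_id, topic in doc_topics.items() if topic == query_topic]


def rank_candidates(query, candidates, docs):
    """Phase 2: order candidates by Jaccard overlap with the query.

    Jaccard overlap is a stand-in scorer chosen for brevity.
    """
    q = tokenize(query)

    def score(doc_id):
        d = tokenize(docs[doc_id])
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    return sorted(candidates, key=score, reverse=True)


def retrieve(query, query_topic, docs, doc_topics):
    """Run filtering, then ranking, and return ordered document ids."""
    return rank_candidates(query, filter_candidates(query_topic, doc_topics), docs)


# Toy collection with hypothetical case texts and topic labels.
docs = {
    "c1": "contract breach damages claim",
    "c2": "contract termination notice",
    "c3": "criminal appeal sentence",
}
doc_topics = {"c1": "civil", "c2": "civil", "c3": "criminal"}

result = retrieve("damages for contract breach", "civil", docs, doc_topics)
print(result)
```

Filtering first keeps the ranking step cheap, since the scorer only runs on documents that share the query's topic.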

Several evaluation metrics for text generation have been proposed in recent years. However, many questions have arisen about how well they can evaluate the accuracy and quality of the generated text. In this work, we study how some of the most popular metrics behave when dealing with the task of summarization in the legal domain in Portuguese. More specifically, we evaluate five metrics (ROUGE, BERTScore, BARTScore, BLEURT, and MoverScore) using a dataset containing 892 rulings (acórdãos) of the Superior Tribunal de Justiça. Each item is...

10.5753/sbbd.2023.232000 article PT 2023-09-25

Digital traces of our lives are now constantly produced by various connected devices, internet services and interactions. Our actions result in a multitude of heterogeneous data objects, or traces, kept in various locations in the cloud or on local devices. Users have very few tools to organize, understand, and search the digital traces they produce. We propose a simple but flexible data model to aggregate, organize, and find personal information within a collection of a user's digital traces. Our model uses as basic dimensions the six questions: what, when, where, who, why,...

10.48550/arxiv.1904.05374 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Personal digital traces are constantly produced by connected devices, internet services and interactions. These digital traces are typically small, heterogeneous and stored in various locations in the cloud or on local devices, making it a challenge for users to interact with and search their own data. By adopting a multidimensional data model based on the six natural questions (what, when, where, who, why and how) to represent and unify personal digital traces, we can propose a learning-to-rank approach using the state-of-the-art LambdaMART algorithm...

10.48550/arxiv.2012.13114 preprint EN cc-by arXiv (Cornell University) 2020-01-01
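
As a rough illustration of the six-question data model named in the abstract above, the sketch below stores each trace as a record with one field per question and ranks traces by how many queried dimensions they match. This is an assumption-laden toy: the `Trace` class, `match_count`, and `search` are hypothetical names, and the simple match count is a stand-in, not the LambdaMART learning-to-rank approach the paper describes.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    """One personal digital trace, described along six dimensions."""
    what: str = ""
    when: str = ""
    where: str = ""
    who: str = ""
    why: str = ""
    how: str = ""


def match_count(trace, query):
    """Count query dimensions whose value occurs (case-insensitively) in the trace."""
    return sum(
        1
        for dim, value in query.items()
        if value and value.lower() in getattr(trace, dim).lower()
    )


def search(traces, query):
    """Return traces matching at least one dimension, best match first."""
    scored = [(match_count(t, query), t) for t in traces]
    hits = [(s, t) for s, t in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable for ties
    return [t for _, t in hits]


# Hypothetical traces from different sources.
t_email = Trace(what="budget email", who="Alice", where="Office", how="email")
t_photo = Trace(what="lunch photo", who="Alice", where="Cafe Roma", how="phone camera")
t_event = Trace(what="study session", who="Bob", where="Library", how="calendar")

results = search([t_email, t_photo, t_event], {"who": "alice", "where": "cafe"})
print([t.what for t in results])
```

Using the same six fields for every source is what lets a single query run over emails, photos, and calendar entries alike.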

Personal digital traces are constantly produced by connected devices, internet services and interactions. These digital traces are typically small, heterogeneous and stored in various locations in the cloud or on local devices, making it a challenge for users to interact with and search their own data. By adopting a multidimensional data model based on the six natural questions (what, when, where, who, why and how) to represent and unify personal digital traces, we propose a learning-to-rank approach using the state-of-the-art LambdaMART algorithm and frequency-based...

10.24251/hicss.2022.361 article EN Proceedings of the Annual Hawaii International Conference on System Sciences 2022-01-01

The performance of parallel programs is frequently affected by different dynamic load-imbalance factors. A very common factor, present in non-dedicated environments, is the existence of other processes competing with the parallel application for computational resources. The heterogeneity and variation of this external load prevent a balanced distribution of the parallel application's tasks from being made in advance. The use of an adequate load-balancing strategy is fundamental for reducing the effects caused by this...

10.5753/wscad.2002.20755 article PT 2002-10-28

Sentiment analysis in tweets is a research field of great importance, mainly due to the popularity of Twitter. However, collecting and annotating tweets is an expensive and time-consuming task, meaning that some domains have only a limited set of labeled data. A promising strategy to handle this issue is to leverage rich data and select instances to enrich target datasets. This paper proposes different strategies for selecting instances from source datasets in order to improve the performance of classifiers trained with the target dataset. Different approaches...

10.5753/kdmile.2021.17463 article EN 2021-10-04