Ricardo Baeza‐Yates

ORCID: 0000-0003-3208-9778
Research Areas
  • Web Data Mining and Analysis
  • Data Management and Algorithms
  • Algorithms and Data Compression
  • Advanced Database Systems and Queries
  • Complex Network Analysis Techniques
  • Information Retrieval and Search Behavior
  • Semantic Web and Ontologies
  • Caching and Content Delivery
  • Recommender Systems and Techniques
  • Web Visibility and Informetrics
  • Data Mining Algorithms and Applications
  • Natural Language Processing Techniques
  • Semigroups and Automata Theory
  • Network Packet Processing and Optimization
  • Reading and Literacy Development
  • Topic Modeling
  • Ethics and Social Impacts of AI
  • DNA and Biological Computing
  • Optimization and Search Problems
  • Advanced Text Analysis Techniques
  • Misinformation and Its Impacts
  • Text Readability and Simplification
  • Advanced Image and Video Retrieval Techniques
  • Data Visualization and Analytics
  • Data Quality and Management

Universitat Pompeu Fabra
2015-2025

Silicon Valley University
2018-2025

Northeastern University
2018-2025

University of Chile
2006-2024

Universidad del Noreste
2019-2024

Ospedale Policlinico San Martino
2023

University of California, San Francisco
2023

Consorci Institut D'Investigacions Biomediques August Pi I Sunyer
2023

Intel (United States)
2023

Eastern University
2021-2022

Contents Preface Acknowledgements 1 Introduction 2 User Interfaces for Search by Marti Hearst 3 Modeling 4 Retrieval Evaluation 5 Relevance Feedback and Query Expansion 6 Documents: Languages & Properties with Gonzalo Navarro Nivio Ziviani 7 Queries: 8 Text Classification Marcos Gonçalves 9 Indexing Searching 10 Parallel Distributed IR Eric Brown 11 Web Yoelle Maarek 12 Crawling Carlos Castillo 13 Structured Mounia Lalmas 14 Multimedia Information Dulce Ponceleón Malcolm Slaney 15...

10.5860/choice.48-6950 article EN Choice Reviews Online 2011-08-01

A new approach to text searching. Authors: Ricardo Baeza-Yates (Universidad de Chile, Blanco Encalada 2120, Depto. Ciencias la Computacion, Santiago, Chile), Gaston H. Gonnet (Informatik, Swiss Technological Institute in Zurich, Switzerland). Communications of the ACM, Volume 35, Issue 10, Oct. 1992, pp. 74–82, https://doi.org/10.1145/135239.135243 (published 01 October 1992)

10.1145/135239.135243 article EN Communications of the ACM 1992-10-01

10.1006/inco.1993.1054 article EN publisher-specific-oa Information and Computation 1993-10-01

Bias in Web data and use taints the algorithms behind Web-based applications, delivering equally biased results.

10.1145/3209581 article EN Communications of the ACM 2018-05-23

In this paper we study a large query log of more than twenty million queries with the goal extracting semantic relations that are implicitly captured in actions users submitting and clicking answers. Previous analyses were mostly done just not followed after them. We first propose novel way to represent vector space based on graph derived from query-click bipartite graph. then analyze produced by our log, showing it is less sparse previous results suggested, almost all measures these graphs...

10.1145/1281192.1281204 article EN 2007-08-12

In this work, we define and solve the Fair Top-k Ranking problem, in which want to determine a subset of k candidates from large pool n >> candidates, maximizing utility (i.e., select "best" candidates) subject group fairness criteria. Our ranked definition extends using standard notion protected groups is based on ensuring that proportion every prefix top-k ranking remains statistically above or indistinguishable given minimum. Utility operationalized two ways: (i) candidate included...

10.1145/3132847.3132938 preprint EN 2017-11-06

We present a fast compression technique for natural language texts. The novelties are that (1) decompression of arbitrary portions the text can be done very efficiently, (2) exact search words and phrases on compressed directly, using any known sequential pattern-matching algorithm, (3) word-based approximate extended also efficiently without decoding. scheme uses semistatic model Huffman code where coding alphabet is byte-oriented rather than bit-oriented. compress typical English texts to...

10.1145/348751.348754 article EN ACM transactions on office information systems 2000-04-01

Prunus persica has been proposed as a genomic model for deciduous trees and the Rosaceae family. Optimized protocols RNA isolation are necessary to further advance studies in this species such that functional genomics analyses may be performed. Here we present an optimized protocol rapidly efficiently purify high quality total from peach fruits (Prunus persica). Isolating high-quality fruit tissue is often difficult due large quantities of polysaccharides polyphenolic compounds accumulate...

10.4067/s0716-97602005000100010 article EN Biological Research 2005-01-01

In this paper we study the trade-offs in designing efficient caching systems for Web search engines. We explore impact of different approaches, such as static vs. dynamic caching, and query results vs. caching posting lists. Using a log spanning whole year limitations demonstrate that lists can achieve higher hit rates than answers. propose new algorithm lists, which outperforms previous methods. also problem finding optimal way to split cache between answers Finally, measure how changes...

10.1145/1277741.1277775 article EN 2007-07-23

Time is an important dimension of any information space and can be very useful in retrieval. Current retrieval systems applications do not take advantage all the time available content documents to provide better search results user experience. In this paper we show some areas that benefit from exploiting such temporal information.

10.1145/1328964.1328968 article EN ACM SIGIR Forum 2007-12-01

Given the large number of installed apps and limited screen size mobile devices, it is often tedious for users to search app they want use. Although some OSs provide categorization schemes that enhance visibility useful among those installed, emerging category homescreen aims take one step further by automatically organizing in a more intelligent personalized way. In this paper, we study how improve apps' usage experience through prediction mechanism allows show which she going use immediate...

10.1145/2684822.2685302 article EN 2015-01-28

Despite some key problems, big data could fundamentally change scientific research methodology and how businesses develop products provide services.

10.1109/mc.2015.62 article EN Computer 2015-03-01

Around 10% of the people have dyslexia, a neurological disability that impairs person's ability to read and write. There is evidence presentation text has significant effect on text's accessibility for with dyslexia. However, best our knowledge, there are no experiments objectively measure impact font type reading performance. In this paper, we present first experiment uses eye-tracking speed. Using within-subject design, 48 subjects dyslexia 12 texts different fonts. Sans serif, monospaced...

10.1145/2513383.2513447 article EN 2013-10-17

Time is an important dimension of any information space and can be very useful in retrieval particular clustering exploration search results. Search result a feature integrated some today's engines, allowing users to further explore However, only little work has been done on exploiting temporal embedded documents for the presentation, clustering, results along well-defined timelines. In this paper, we present add-on traditional applications which exploit various associated with cluster...

10.1145/1645953.1645968 article EN 2009-11-02