Juan Raposo

ORCID: 0000-0003-4913-618X
Research Areas
  • Web Data Mining and Analysis
  • Advanced Malware Detection Techniques
  • Caching and Content Delivery
  • Web Application Security Vulnerabilities
  • Surface Chemistry and Catalysis
  • Mobile and Web Applications
  • Advanced Database Systems and Queries
  • Algorithms and Data Compression
  • Software Testing and Debugging Techniques
  • Service-Oriented Architecture and Web Services
  • Mobile Agent-Based Network Management
  • Software Engineering Research
  • Peer-to-Peer Network Technologies
  • Web Applications and Data Management
  • Spam and Phishing Detection
  • Semantic Web and Ontologies
  • Data Quality and Management
  • Computability, Logic, AI Algorithms
  • Web visibility and informetrics
  • Scientific Computing and Data Management
  • Business Process Modeling and Analysis
  • Advanced Image and Video Retrieval Techniques
  • Data Management and Algorithms
  • Distributed and Parallel Computing Systems
  • Industrial Automation and Control Systems

Universidade da Coruña
2004-2015

Semi-automatic wrapper generation tools aim to ease the task of building structured views over Web sources. However, the techniques presented to date show several weaknesses when dealing with today's complex commercial sources, especially when constructing advanced navigational sequences for accessing data. We present Wargo, a semi-automatic tool that has been used by non-programmer staff to successfully wrap more than 700 sources in industrial applications.

10.1109/dexa.2002.1045916 article EN Proceedings. 15th International Workshop on Database and Expert Systems Applications, 2004. 2004-04-23

The crawler engines of today cannot reach most of the information contained in the Web. A great amount of valuable information is "hidden" behind the query forms of online databases, and/or is dynamically generated by technologies such as JavaScript. This portion of the web is usually known as the Deep Web or the Hidden Web. We have built DeepBot, a prototype hidden-web focused crawler able to access such content. DeepBot receives a set of domain definitions as an input, each one describing a specific data-collecting task, and automatically identifies and learns to execute...

10.1145/1278380.1278385 article EN 2007-06-12

In recent years, significant attention has been paid to the problem of building wrappers for extracting data from semi-structured web sources. Nevertheless, since the sources are autonomous, they may experience changes that invalidate the wrappers. In this paper, we present new heuristics and algorithms to address automatic wrapper maintenance. Our approach is based on collecting query results during wrapper operation and using them later to generate sets of examples that can be used to induce a new wrapper when the source changes.

10.1145/1066677.1066826 article EN 2005-03-13
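
The maintenance idea summarized in the abstract above (keep some results from queries that worked, then look for them again once the source has changed) can be illustrated with a small sketch. This is not the authors' algorithm: the names ResultStore and relabel_examples, the per-query limit, the min_fields threshold, and the plain substring matching are all assumptions made for illustration. A real system would need far more robust matching and a wrapper induction step, which is omitted here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ResultStore:
    """Hypothetical store of results collected while the wrapper still works."""

    per_query_limit: int = 5  # assumed cap, chosen arbitrarily
    results: dict[str, list[dict[str, str]]] = field(default_factory=dict)

    def record(self, query: str, records: list[dict[str, str]]) -> None:
        # Called during normal operation, after a successful extraction.
        kept = self.results.setdefault(query, [])
        for rec in records:
            if len(kept) < self.per_query_limit and rec not in kept:
                kept.append(rec)


def relabel_examples(store: ResultStore, query: str, new_page_text: str,
                     min_fields: int = 2):
    """Find stored records that still appear in the changed page.

    Returns (record, {field_name: character_offset}) pairs, which could
    serve as labeled examples for inducing a new wrapper.
    """
    examples = []
    for rec in store.results.get(query, []):
        found = {}
        for name, value in rec.items():
            pos = new_page_text.find(value) if value else -1
            if pos >= 0:
                found[name] = pos
        # Keep the record only if enough of its fields were located again.
        if len(found) >= min_fields:
            examples.append((rec, found))
    return examples


# Usage: results stored earlier are re-located in the page's new layout.
store = ResultStore()
store.record("books by Knuth", [{"title": "TAOCP Vol. 1", "price": "59.99"}])
new_html = "<li><b>TAOCP Vol. 1</b> <span>59.99</span></li>"
examples = relabel_examples(store, "books by Knuth", new_html)
```

Substring search is the simplest possible matcher. It misses values whose formatting changed (for instance, a price shown in a different currency), which is one reason heuristics are needed in practice.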

The problem of data extraction from the deep Web can be divided into two tasks: crawling the client-side deep Web and crawling the server-side deep Web. The objective is to define an architecture and a set of related techniques to access the information placed in the client-side deep Web. This involves dealing with aspects such as JavaScript technology, nonstandard session maintenance mechanisms, client redirections, pop-up menus, etc. We use current browser APIs as building blocks and leverage them to implement novel models and algorithms

10.1109/cec-east.2004.30 article EN IEEE International Conference on E-Commerce Technology for Dynamic E-Business 2005-03-21

A substantial subset of Web data follows some kind of underlying structure. Nevertheless, HTML does not contain any schema or semantic information about the data it represents. A program able to provide software applications with a structured view of those semi-structured sources is usually called a wrapper. Wrappers are able to accept a query against the source and return a set of results, thus enabling applications to access Web data in a manner similar to that of data from databases. A significant problem in this approach arises because Web sources may experience changes...

10.1109/ideas.2005.13 article EN 2006-10-11

In order to let software programs gain full benefit from semi-structured Web sources, wrappers must be built to provide a "machine readable" view over them. A significant problem of this approach is that, since sources are autonomous, they may experience changes that invalidate the current wrapper. In this paper, we address this problem by introducing novel heuristics and algorithms for automatically maintaining wrappers. In our approach, the system collects some query results during normal operation and, when the source...

10.1109/wi.2005.40 article EN IEEE/WIC/ACM International Conference on Web Intelligence (WI'04) 2005-10-18

Web automation applications are widely used for different purposes such as B2B integration, web mashups, automated testing of web applications, Internet metasearch, or technology and business watch. One crucial requirement for intensive web automation applications that need real-time responses is to execute the navigation sequences in the shortest possible time. The approach of building the automatic navigation component by using the APIs of conventional browsers, followed by most current systems, is not appropriate in this scenario, because it presents performance...

10.1109/icebe.2015.12 article EN 2015-10-01

10.3217/jucs-014-11-1838 article EN cc-by Zenodo (CERN European Organization for Nuclear Research) 2008-06-01