- Semantic Web and Ontologies
- Digital Humanities and Scholarship
- Research Data Management Practices
- Natural Language Processing Techniques
- Advanced Database Systems and Queries
- Mathematics, Computing, and Information Processing
- Biomedical Text Mining and Ontologies
- Library Science and Information Systems
- Scientific Computing and Data Management
- Data Quality and Management
- Digital and Traditional Archives Management
- Topic Modeling
- Information Retrieval and Search Behavior
- Scientometrics and Bibliometrics Research
- Media, Communication, and Education
- Web and Library Services
- Advanced Data Storage Technologies
- Web Data Mining and Analysis
- Computability, Logic, AI Algorithms
- Library Collection Development and Digital Resources
- Logic, Reasoning, and Knowledge
- Software Engineering Research
- Speech and Dialogue Systems
- Ethics and Social Impacts of AI
- Biographical and Historical Analysis
University of Illinois System
2008-2024
University of Illinois Urbana-Champaign
2011-2024
University College London
2012
Cranfield University
2012
In-Q-Tel
2007
Urbana University
2003
John Brown University
1993-1999
Brown University
1987-1990
Markup practices can affect the move toward systems that support scholars in the process of thinking and writing. Whereas procedural and presentational markup retard that movement, descriptive markup accelerates the pace by simplifying mechanical tasks and allowing authors to focus their attention on content.
The integration of heterogeneous data in varying formats and from diverse communities requires an improved understanding of the concept of a dataset and of key related concepts, such as format, encoding, and version. Ultimately, a normative formal framework of such concepts will be needed to support the effective curation, integration, and use of shared multi-disciplinary scientific data. To prepare for the development of this framework we reviewed definitions of dataset found in technical documentation and the literature. Four basic features can be identified...
Although XML Document Type Definitions provide a mechanism for specifying, in machine-readable form, the syntax of an XML markup language, there is no comparable mechanism for specifying the semantics of an XML vocabulary. That is, there is no way to characterize the meaning of XML markup so that the facts and relationships represented by the occurrence of XML constructs can be explicitly, comprehensively, and mechanically identified. This has serious practical and theoretical consequences. On the positive side, XML constructs can be assigned arbitrary semantics and used in application areas not foreseen by the original...
Textual entailment is a relationship that obtains between fragments of text when one fragment in some sense implies the other fragment. The automation of textual entailment recognition supports a wide variety of text‐based tasks, including information retrieval, information extraction, question answering, text summarization, and machine translation. Much ingenuity has been devoted to developing algorithms for identifying textual entailments, but relatively little to saying what textual entailment actually is. This article is a review of the logical and philosophical issues...
The way in which text is represented on a computer affects the kinds of uses to which it can be put by its creator and by subsequent users. The electronic document model currently in use is impoverished and restrictive. The authors argue that text is best represented as an ordered hierarchy of content objects (OHCO), because that is what text really is. This model conforms with emerging standards such as SGML and contains within it advantages for the writer, publisher, and researcher. The authors then describe how the hierarchical model can allow future use and reuse of the document as a database, hypertext, or network.
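The OHCO idea in the abstract above can be illustrated with a small tree structure. This is only a minimal sketch under stated assumptions: the `ContentObject` class, its field names, and the sample document are hypothetical and are not taken from the authors' work. It shows one ordered hierarchy being reused for an outline and for a database-style query.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple, Union


@dataclass
class ContentObject:
    """Hypothetical OHCO node: a typed object with ordered children."""
    kind: str  # e.g. "article", "section", "title", "para"
    children: List[Union["ContentObject", str]] = field(default_factory=list)

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "ContentObject"]]:
        """Yield (depth, node) pairs in document order."""
        yield depth, self
        for child in self.children:
            if isinstance(child, ContentObject):
                yield from child.walk(depth + 1)

    def text(self) -> str:
        """Join the character data beneath this node, in order."""
        return " ".join(
            c if isinstance(c, str) else c.text() for c in self.children
        )


# Illustrative document (invented for this sketch).
doc = ContentObject("article", [
    ContentObject("title", ["What is text?"]),
    ContentObject("section", [
        ContentObject("title", ["Models"]),
        ContentObject("para", ["Text is a hierarchy."]),
        ContentObject("para", ["Hierarchies can be queried."]),
    ]),
])

# Reuse 1: an outline for navigation, derived from the same structure.
outline = [(d, n.text()) for d, n in doc.walk() if n.kind == "title"]

# Reuse 2: a database-style query over typed content objects.
paragraphs = [n.text() for _, n in doc.walk() if n.kind == "para"]
```

Because the content objects are typed and ordered, both views come from a single traversal. No presentational information is stored in the tree.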
50 years after "As we may think": the Brown/MIT Vannevar Bush symposium. Authors: Rosemary Simpson (Information Programming); Allen Renear, Elli Mylonas, and Andries van Dam (Brown University). Interactions, Volume 3, Issue 2, March 1996, pp. 47–67. https://doi.org/10.1145/227181.227187. Published online 1 March 1996.
The concept of known‐item search has long been central to research and application in library and information science. It is surprising then that this concept has received practically no systematic discussion. We survey the various conceptual and operational characterizations of known‐item search in the LIS literature in order to determine exactly how the concept is being understood by its users. We demonstrate that this apparently simple notion is actually quite complex and varied, and, moreover, that there is hardly a single feature ordinarily associated with it that can confidently be...
Heterogeneous digital data that has been produced by different communities with varying practices and assumptions, and that is organized according to different representation schemes, encodings, and file formats, presents substantial obstacles to efficient integration, analysis, and preservation. This is a particular impediment to data reuse and interdisciplinary science. An underlying problem is that we have no shared formal conceptual model of information representation that is both accurate and sufficiently detailed to accommodate the management and analysis of real...
The traditional distinction between descriptive and procedural markup is flawed; it conflates two different dimensions of markup, mood and domain, which in fact can vary independently. An adequate markup taxonomy must, among other things, incorporate distinctions such as those developed in contemporary "speech-act theory". This will substantially complicate, although in interesting ways, the development of an adequate theory of markup semantics, and the formalization will require modal operators and additional axiomatic relationships. In addition, these re...
We examine the conceptual model of the “bibliographic universe” presented in IFLA's Functional Requirements for Bibliographic Records (FRBR) and argue, applying ontology design recommendations proposed by N. Guarino and C. Welty, that three of the four Group 1 entity types might be more accurately conceptualized as roles. We show how this approach may generalize the solution to a previously identified puzzle regarding the FRBR entity type of XML documents and speculate on the sorts of entities that take on these roles. This view of the bibliographic...
Contemporary retrieval systems, which search across collections, usually ignore collection-level metadata. Alternative approaches, exploiting collection-level information, will require an understanding of the various kinds of relationships that can obtain between collection-level and item-level metadata. This paper outlines the problem and describes a project that is developing a logic-based framework for classifying collection/item metadata relationships. This framework will support (i) metadata specification developers defining metadata elements, (ii) metadata creators describing objects, and (iii)...
The concept of significant properties, properties that must be identified and preserved in any successful digital object preservation, is now common in data curation. Although this notion has clearly demonstrated its usefulness in cultural heritage domains, its application to the preservation of scientific datasets is not as well developed. One obstacle is that the familiar preservation models are not sufficiently explicit to identify the relevant entities, properties, and relationships involved in dataset preservation. We present a logic-based formal framework...
Most definitions of document current in the document processing and digital publishing communities would, if taken literally, imply that documents are extensional entities that cannot undergo changes such as editing or revision. In other domains as well, such as textual criticism and library science, one can also find notions of text or document that are similarly difficult to reconcile with modification. We describe the problem and sketch some possible resolutions. Although the issues are conceptual and foundational, their practical significance is real. Formal...
Last year at Balisage (2009) we considered the claim that documents cannot be modified. Our analysis took the form of identifying and evaluating possible responses to this inconsistent triad: 1) Documents are strings; 2) Strings cannot be modified; 3) Documents can be modified. Late this spring we were surprised to realize that our survey of responses to the document modifiability puzzle had overlooked one response: There are no documents. We turn to that neglected response now.
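The tension in the triad above has a loose analogue in programming languages with immutable strings. The sketch below is an illustration only, not the authors' argument: it assumes, for the sake of the example, that a "document" is identified with a Python `str`, and the variable names and sample text are invented.

```python
# Premise 1 (assumed for illustration): the document just is this string.
original = "Call me Ishmael."
alias = original  # a second reference to the very same string

# Premise 2: strings cannot be modified. In-place change is rejected.
try:
    original[0] = "c"  # type: ignore[index]
    mutated = True
except TypeError:
    mutated = False

# What looks like "editing" yields a different string; the old one
# is untouched, so nothing that existed before has been modified.
edited = original.replace("Ishmael", "Ahab")
```

Under these assumptions, `alias` still refers to the unedited text after the "edit". If documents are strings, editing creates a new document and changes no existing one.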
Collections of artifacts, images, texts, and other cultural objects are not arbitrary aggregations, but are designed to support specific research and scholarly activities. Collection-level metadata directly supports this objective, providing critical contextual information. However, exploiting this information, especially in a semantic web environment of linked data, requires a precise formalization of the rules that characterize collection/item metadata relationships. Toward this end we are developing a logic-based framework...