- Natural Language Processing Techniques
- Semantic Web and Ontologies
- Scientific Computing and Data Management
- Distributed and Parallel Computing Systems
- Topic Modeling
- Biomedical Text Mining and Ontologies
- Genomics and Phylogenetic Studies
- Service-Oriented Architecture and Web Services
- Digital Humanities and Scholarship
- Lexicography and Language Studies
- Computational Physics and Python Applications
- Research Data Management Practices
- Translation Studies and Practices
- Web Data Mining and Analysis
- Advanced Computational Techniques and Applications
- Simulation Techniques and Applications
- Software Engineering Research
- Advanced Data Storage Technologies
- Text Readability and Simplification
- IoT and GPS-based Vehicle Safety Systems
- Marine Biology and Ecology Research
- Advanced Text Analysis Techniques
- Gene Expression and Cancer Classification
- Library Science and Information Systems
- Transport Systems and Technology
Johns Hopkins University
2024
Vassar College
2007-2020
Brandeis University
2017
Technical University of Darmstadt
2017
University of Oslo
2017
University of Manitoba
2000
Galaxy is a mature, browser accessible workbench for scientific computing. It enables scientists to share, analyze and visualize their own data, with minimal technical impediments. A thriving global community continues to use, maintain and contribute to the project, with support from multiple national infrastructure providers that enable freely accessible analysis and training services. The Galaxy Training Network supports free, self-directed, virtual training with >230 integrated tutorials. Project engagement metrics have continued to grow...
In this paper we describe the Graph Annotation Format (GrAF) and show how it is used to represent not only independent linguistic annotations, but also sets of merged annotations as a single graph. To demonstrate this, we have automatically transduced several different annotations of the Wall Street Journal corpus into GrAF and show how the annotations can then be merged, analyzed, and visualized using standard graph algorithms and tools. We discuss how, as a graph representation, GrAF allows for the application of well-established graph traversal and analysis algorithms to produce information...
This paper explores interoperability for data represented using the Graph Annotation Framework (GrAF) (Ide and Suderman, 2007) and the data formats utilized by two general-purpose annotation systems: the General Architecture for Text Engineering (GATE) (Cunningham, 2002) and the Unstructured Information Management Architecture (UIMA). GrAF is intended to serve as a "pivot" to enable interoperability among different formats, and both GATE and UIMA are at least implicitly designed with an eye toward interoperability with other formats and tools. We describe the steps required to perform a round-trip...
The Galaxy application is a popular open-source framework for data intensive sciences, counting thousands of monthly users across more than 100 public servers. To support a growing number of users and a greater variety of use cases, the complexity of a production-grade Galaxy installation has also grown, requiring more administration effort. There is a need for a rapid and reproducible Galaxy deployment method that can be maintained at high-availability with minimal maintenance.
The American National Corpus and its annotations are represented in a stand-off XML format compliant with the specifications of ISO TC37 SC4 WG1's Linguistic Annotation Framework. Because few systems that enable search and access of the corpus currently support stand-off markup, the project has developed a SAX-like parser that generates ANC data with annotations in-line, in a variety of output formats.
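The idea of turning stand-off annotations into in-line markup can be illustrated with a minimal Python sketch. This is not the ANC project's parser (which the abstract describes only as SAX-like); the function name `inline_annotations`, the `(start, end, label)` tuple layout, and the tag names are all hypothetical, and the sketch assumes non-empty spans that are nested or disjoint, with no two spans covering exactly the same range.

```python
from xml.sax.saxutils import escape


def inline_annotations(text, annotations):
    """Render stand-off (start, end, label) spans as in-line XML tags.

    Assumption: spans are non-empty, nested or disjoint (never crossing),
    and no two spans share exactly the same start and end offsets.
    """
    events = []
    for start, end, label in annotations:
        # Opening tags: at the same offset, the longer (outer) span opens first.
        events.append((start, 1, -end, f"<{label}>"))
        # Closing tags sort before opening tags at the same offset;
        # the span that started later (inner) closes first.
        events.append((end, 0, -start, f"</{label}>"))
    events.sort()

    out, pos = [], 0
    for offset, _kind, _tiebreak, tag in events:
        out.append(escape(text[pos:offset]))  # escape &, <, > in the raw text
        out.append(tag)
        pos = offset
    out.append(escape(text[pos:]))
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical sentence and token annotations kept apart from the text.
    standoff = [(0, 10, "s"), (0, 4, "tok"), (5, 9, "tok")]
    print(inline_annotations("Dogs bark.", standoff))
```

Keeping the text and the offsets separate, and merging them only at output time, is what lets one source document be rendered with different annotation layers or in different output formats.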
In this paper we investigate cross-platform interoperability for natural language processing (NLP) and, in particular, annotation of textual resources, with an eye toward identifying the design elements of annotation models and processes that are particularly problematic for, or amenable to, enabling seamless communication across different platforms. The study is conducted in the context of a specific annotation methodology, namely machine-assisted interactive annotation (also known as human-in-the-loop annotation). This methodology...
For decades, most self-respecting linguistic engineering initiatives have designed and implemented custom representations for various layers of, for example, morphological, syntactic, and semantic analysis. Despite occasional efforts at harmonization or even standardization, our field today is blessed with a multitude of ways of encoding and exchanging linguistic annotations of these types, both at the levels of ‘abstract syntax’, naming choices, and of course file formats. To a large degree, it is possible to work within and across design...
In a recent project, the Language Application Grid was augmented to support the mining of scientific publications. The results of that effort have now been repurposed to focus on Covid-19 literature, including modification of the LAPPS "AskMe" query and retrieval engine. We describe the AskMe system and discuss its functionality as compared to other engines available to search covid-related