NFDI4DS | UHH-SEMS - Publication Details

K. Vijay‐Shanker

ORCID: 0000-0003-0958-3073

About

Contact & Profiles

A5027009667

Research Areas

Natural Language Processing Techniques
Biomedical Text Mining and Ontologies
Topic Modeling
Software Engineering Research
Semantic Web and Ontologies
semigroups and automata theory
Bioinformatics and Genomic Networks
Speech and dialogue systems
Software Testing and Debugging Techniques
Genomics and Phylogenetic Studies
Software Reliability and Analysis Research
Algorithms and Data Compression
Syntax, Semantics, Linguistic Variation
Machine Learning in Bioinformatics
Advanced Software Engineering Methodologies
Genetics, Bioinformatics, and Biomedical Research
Logic, programming, and type systems
Machine Learning and Algorithms
Advanced Malware Detection Techniques
Logic, Reasoning, and Knowledge
Software Engineering Techniques and Practices
Advanced Text Analysis Techniques
Text Readability and Simplification
Molecular Biology Techniques and Applications
Computational Drug Discovery Methods

University of Delaware
2015-2025

University Ucinf
2006-2017

Georgetown University
2007-2014

University Hospital Heidelberg
2014

Georgetown University Medical Center
2014

Princeton University
2014

Heidelberg University
2014

European Molecular Biology Laboratory
2014

Mississippi State University
2012

Tilburg University
2001

Towards automatically generating summary comments for Java methods

OPENALEX - Publications

Giriprasad Sridhara Emily Hill Divya Muppaneni Lori Pollock K. Vijay‐Shanker

Studies have shown that good comments can help programmers quickly understand what a method does, aiding program comprehension and software maintenance. Unfortunately, few projects adequately comment the code. One way to overcome the lack of human-written summary comments, and guard against obsolete comments, is to automatically generate them. In this paper, we present a novel technique to automatically generate descriptive summary comments for Java methods. Given the signature and body of a method, our automatic comment generator identifies the content for the summary and generates natural language...

10.1145/1858996.1859006 article EN 2010-09-20

Automatic generation of natural language summaries for Java classes

Laura Moreno Jairo Aponte Giriprasad Sridhara Andrian Marcus Lori Pollock and 1 more

Most software engineering tasks require developers to understand parts of the source code. When faced with unfamiliar code, developers often rely on (internal or external) documentation to gain an overall understanding of the code and determine whether it is relevant for the current task. Unfortunately, the documentation is often absent or outdated. This paper presents a technique to automatically generate human readable summaries for Java classes, assuming no documentation exists. The summaries allow developers to understand the main goal and structure of the class. The focus is on the content and responsibilities of the classes, rather than their...

10.1109/icpc.2013.6613830 article EN 2013-05-01

The equivalence of four extensions of context-free grammars

K. Vijay‐Shanker David Weir

10.1007/bf01191624 article EN Mathematical Systems Theory 1994-11-01

Characterizing structural descriptions produced by various grammatical formalisms

K. Vijay‐Shanker David Weir Aravind K. Joshi

We consider the structural descriptions produced by various grammatical formalisms in terms of the complexity of the paths and the relationship between paths in the sets of structural descriptions that each system can generate. In considering the relationship between formalisms, we show that it is useful to abstract away from the details of the formalism, and examine the nature of their derivation process as reflected by properties of their derivation trees. We find that several of the formalisms considered can be seen as being closely related since they have derivation tree sets with the same structure as those produced by Context-Free Grammars. On the basis of this observation, we describe a...

10.3115/981175.981190 article EN 1987-01-01

Using natural language program analysis to locate and understand action-oriented concerns

David Shepherd Zachary P. Fry Emily Hill Lori Pollock K. Vijay‐Shanker

Most current software systems contain undocumented high-level ideas implemented across multiple files and modules. When developers perform program maintenance tasks, they often waste time and effort locating and understanding these scattered concerns. We have developed a semi-automated concern location and comprehension tool, Find-Concept, designed to reduce the time developers spend on maintenance tasks and to increase their confidence in the results of these tasks. Find-Concept is effective because it searches a unique natural language-based...

10.1145/1218563.1218587 article EN 2007-03-14

Automatically capturing source code context of NL-queries for software maintenance and reuse

Emily Hill Lori Pollock K. Vijay‐Shanker

As software systems continue to grow and evolve, locating code for maintenance and reuse tasks becomes increasingly difficult. Existing static code search techniques using natural language queries provide little support to help developers determine whether search results are relevant, and few recommend alternative words to help developers reformulate poor queries. In this paper, we present a novel approach that automatically extracts natural language phrases from source code identifiers and categorizes the phrases in a hierarchy. Our contextual search approach allows developers to explore the word...

10.1109/icse.2009.5070524 article EN 2009-01-01

Automatically detecting and describing high level actions within methods

Giriprasad Sridhara Lori Pollock K. Vijay‐Shanker

One approach to easing program comprehension is to reduce the amount of code that a developer has to read. Describing the high level abstract algorithmic actions associated with code fragments using succinct natural language phrases potentially enables a newcomer to focus on fewer and more abstract concepts when trying to understand a given method. Unfortunately, such descriptions are typically missing because it is tedious to create them manually.

10.1145/1985793.1985808 article EN 2011-05-21

Mining source code to automatically split identifiers for software analysis

Eric Enslen Emily Hill Lori Pollock K. Vijay‐Shanker

Automated software engineering tools (e.g., program search, concern location, code reuse, quality assessment, etc.) increasingly rely on natural language information from comments and identifiers in code. The first step in analyzing words from identifiers requires splitting identifiers into their constituent words. Unlike natural languages, where space and punctuation are used to delineate words, identifiers cannot contain spaces. One common way to split identifiers is to follow programming language naming conventions. For example, Java programmers often use camel case,...

10.1109/msr.2009.5069482 article EN 2009-05-01
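
The naming-convention baseline mentioned in the abstract above (splitting on camel case) can be illustrated with a minimal sketch. This is not the mining-based technique from the paper; the function name `split_identifier` and the exact splitting rules are assumptions made for illustration only.

```python
import re

# Hypothetical helper: splits an identifier on underscores, dollar signs,
# digit runs, and camel-case transitions. It only follows naming
# conventions and cannot split same-case names such as "notype".
_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def split_identifier(identifier):
    """Return the lower-cased constituent words of an identifier."""
    words = []
    for part in re.split(r"[_$]+", identifier):
        # An all-caps run followed by a capitalized word ("XMLParser")
        # is split so that the last capital starts the next word.
        words.extend(_WORD.findall(part))
    return [w.lower() for w in words]


if __name__ == "__main__":
    print(split_identifier("getXMLParser"))  # ['get', 'xml', 'parser']
    print(split_identifier("max_value2"))    # ['max', 'value', '2']
```

A convention-only splitter like this leaves identifiers without case or separator cues unsplit, which is the gap the abstract points to.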

iPTMnet: an integrated resource for protein post-translational modification network discovery

Hongzhan Huang Cecilia Arighi Karen Ross Jia Ren Gang Li and 5 more

Protein post-translational modifications (PTMs) play a pivotal role in numerous biological processes by modulating regulation of protein function. We have developed iPTMnet (http://proteininformationresource.org/iPTMnet) for PTM knowledge discovery, employing an integrative bioinformatics approach that combines text mining, data mining, and ontological representation to capture rich PTM information, including enzyme-substrate-site relationships, PTM-specific protein-protein interactions (PPIs) and conservation...

10.1093/nar/gkx1104 article EN cc-by Nucleic Acids Research 2017-10-24

Exploring the neighborhood with dora to expedite software maintenance

Emily Hill Lori Pollock K. Vijay‐Shanker

Completing software maintenance and evolution tasks for today's large, complex systems can be difficult, often requiring considerable time to understand the system well enough to make correct changes. Despite evidence that successful programmers use program structure as well as identifier names to explore software, most existing exploration techniques use either structural or lexical information. By using only one type of information, automated tools ignore valuable clues about a developer's intentions -...

10.1145/1321631.1321637 article EN 2007-11-05

Generating Parameter Comments and Integrating with Method Summaries

Giriprasad Sridhara Lori Pollock K. Vijay‐Shanker

An important part of the leading comments for a method is the comments on the method's formal parameters. According to the Java documentation writing guidelines, developers should write a summary of the method's actions followed by comments for each parameter. In this paper, we describe a novel technique to automatically generate descriptive comments for parameters of Java methods. Such generated comments can help alleviate the lack of developer written parameter comments. In addition, they can help a programmer in ensuring that a parameter comment is current with the code. We present heuristics to provide high-level...

10.1109/icpc.2011.28 article EN 2011-06-01

AMAP

Emily Hill Zachary P. Fry Haley Boyd Giriprasad Sridhara Y. S. Novikova and 2 more

When writing software, developers often employ abbreviations in identifier names. In fact, some abbreviations may never occur with the expanded word, or occur more often in the code. However, most existing program comprehension and search tools do little to address the problem of abbreviations, and therefore may miss meaningful pieces of code or relationships between software artifacts. In this paper, we present an automated approach to mining abbreviation expansions from source code to enhance software maintenance tools that utilize natural language information. Our...

10.1145/1370750.1370771 article EN 2008-05-10

Automatically mining software-based, semantically-similar words from comment-code mappings

Matthew Howard Samir Gupta Lori Pollock K. Vijay‐Shanker

Many software development and maintenance tools involve matching between natural language words in different software artifacts (e.g., traceability) or between queries submitted by a user and software artifacts (e.g., code search). Because different people likely created the queries and the various artifacts, the effectiveness of these tools is often improved by expanding queries and adding related words to textual artifact representations. Synonyms are particularly useful to overcome the mismatch in vocabularies, as well as other word relations that indicate semantic similarity. However, experience shows...

10.1109/msr.2013.6624052 article EN 2013-05-01

Transcriptome response to heat stress in a chicken hepatocellular carcinoma cell line

Liang Sun Susan J. Lamont Amanda M. Cooksey Fiona M. McCarthy Catalina O. Tudor and 6 more

Heat stress triggers an evolutionarily conserved set of responses in cells. The transcriptome responds to hyperthermia by altering expression of genes to adapt the cell or organism to survive the heat challenge. RNA-seq technology allows rapid identification of environmentally responsive genes on a large scale. In this study, we have used RNA-seq to identify heat stress responsive genes in the chicken male white leghorn hepatocellular (LMH) cell line. The transcripts of 812 genes were responsive to heat stress (p < 0.01) with 235 genes upregulated and 577 downregulated following 2.5 h of heat stress. Among the genes whose...

10.1007/s12192-015-0621-0 article EN cc-by Cell Stress and Chaperones 2015-08-04

DiMeX: A Text Mining System for Mutation-Disease Association Extraction

A. S. M. Ashique Mahmood Tsung-Jung Wu Raja Mazumder K. Vijay‐Shanker

The number of published articles describing associations between mutations and diseases is increasing at a fast pace. There is a pressing need to gather such mutation-disease associations into public knowledge bases, but manual curation slows down the growth of such databases. We have addressed this problem by developing a text-mining system (DiMeX) to extract mutation to disease associations from publication abstracts. DiMeX consists of a series of natural language processing modules that preprocess input text and apply syntactic and semantic patterns...

10.1371/journal.pone.0152725 article EN cc-by PLoS ONE 2016-04-13

miRTex: A Text Mining System for miRNA-Gene Relation Extraction

Gang Li Karen Ross Cecilia Arighi Yifan Peng Cathy Wu and 1 more

MicroRNAs (miRNAs) regulate a wide range of cellular and developmental processes through gene expression suppression or mRNA degradation. Experimentally validated miRNA targets are often reported in the literature. In this paper, we describe miRTex, a text mining system that extracts miRNA-target relations, as well as miRNA-gene and gene-miRNA regulation relations. The system achieves good precision and recall when evaluated on a literature corpus of 150 abstracts, with F-scores close to 0.90 on the three different types of relations. We...

10.1371/journal.pcbi.1004391 article EN cc-by PLoS Computational Biology 2015-09-25

Investigation of improving the pre-training and fine-tuning of BERT model for biomedical relation extraction

Peng Su K. Vijay‐Shanker

Recently, automatically extracting biomedical relations has been a significant subject in biomedical research due to the rapid growth of biomedical literature. Since their adaptation to the biomedical domain, transformer-based BERT models have produced leading results on many biomedical natural language processing tasks. In this work, we will explore approaches to improve the BERT model for relation extraction tasks in both the pre-training and fine-tuning stages of its applications. In the pre-training stage, we add another level of adaptation on sub-domain data to bridge the gap between domain knowledge...

10.1186/s12859-022-04642-w article EN cc-by BMC Bioinformatics 2022-04-04

A BIOLOGICAL NAMED ENTITY RECOGNIZER

Meenakshi Narayanaswamy K. E. Ravikumar K. Vijay‐Shanker

10.1142/9789812776303_0040 article EN Biocomputing 2002-12-01

Literature mining and database annotation of protein phosphorylation using a rule-based system

Zhengyong Hu Meenakshi Narayanaswamy K. E. Ravikumar K. Vijay‐Shanker Cathy Wu

Motivation: A large volume of experimental data on protein phosphorylation is buried in the fast-growing PubMed literature. While of great value, such information is limited in databases owing to the laborious process of literature-based curation. Computational literature mining holds promise to facilitate database...

10.1093/bioinformatics/bti390 article EN Computer applications in the biosciences 2005-04-06

Feature structures based Tree Adjoining Grammars

K. Vijay‐Shanker Aravind K. Joshi

We have embedded Tree Adjoining Grammars (TAG) in a feature structure based unification system. The resulting system, Feature Structure based Tree Adjoining Grammars (FTAG), captures the principle of factoring dependencies and recursion, fundamental to TAGs. We show that FTAG has an enhanced descriptive capacity compared to the TAG formalism. We consider some restricted versions of this system and some possible linguistic stipulations that can be made. We briefly describe a calculus to represent the structures used by this system, extending on the work of Rounds and Kasper [Rounds et...

10.3115/991719.991783 article EN 1988-01-01

Polynomial time parsing of Combinatory Categorial Grammars

K. Vijay‐Shanker David Weir

In this paper we present a polynomial time parsing algorithm for Combinatory Categorial Grammar. The recognition phase extends the CKY algorithm for CFG. The process of generating a representation of the parse trees has two phases. Initially, a shared forest is built that encodes the set of all derivation trees for the input string. This shared forest is then pruned to remove the spurious ambiguity.

10.3115/981823.981824 article EN 1990-01-01
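
For readers unfamiliar with the CKY algorithm that the abstract above builds on, here is a minimal sketch of plain CKY recognition for a context-free grammar in Chomsky normal form. It is not the CCG algorithm from the paper; the function name `cky_recognize`, the dictionary-based grammar encoding, and the toy grammar are all assumptions made for illustration.

```python
from itertools import product


def cky_recognize(tokens, unary, binary, start="S"):
    """Return True if `tokens` is derivable from `start`.

    unary:  maps a terminal to the set of nonterminals A with A -> terminal
    binary: maps a pair (B, C) to the set of nonterminals A with A -> B C
    """
    n = len(tokens)
    if n == 0:
        return False
    chart = {}
    # Spans of length 1 come from the lexical (unary) rules.
    for i, tok in enumerate(tokens):
        chart[(i, i + 1)] = set(unary.get(tok, ()))
    # Longer spans combine two adjacent, already-filled sub-spans.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            k = i + span
            cell = set()
            for j in range(i + 1, k):
                for b, c in product(chart[(i, j)], chart[(j, k)]):
                    cell |= binary.get((b, c), set())
            chart[(i, k)] = cell
    return start in chart[(0, n)]


if __name__ == "__main__":
    # Toy grammar for a^n b^n (n >= 1): S -> A B | A X, X -> S B
    unary = {"a": {"A"}, "b": {"B"}}
    binary = {("A", "B"): {"S"}, ("A", "X"): {"S"}, ("S", "B"): {"X"}}
    print(cky_recognize(list("aabb"), unary, binary))  # True
    print(cky_recognize(list("abab"), unary, binary))  # False
```

The three nested loops over span length, start position, and split point give the cubic running time in the input length that makes CKY a natural starting point for polynomial time parsing.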

D-tree grammars

Owen Rambow K. Vijay‐Shanker David Weir

DTG are designed to share some of the advantages of TAG while overcoming some of its limitations. DTG involve two composition operations called subsertion and sister-adjunction. The most distinctive feature of DTG is that, unlike TAG, there is complete uniformity in the way that the two DTG operations relate lexical items: subsertion always corresponds to complementation and sister-adjunction to modification. Furthermore, DTG, unlike TAG, can provide a uniform analysis for wh-movement in English and Kashmiri, despite the fact that the wh element in Kashmiri appears in sentence-second position, and not...

10.3115/981658.981679 article EN 1995-01-01
