NFDI4DS | UHH-SEMS - Publication Details

Steven Skiena

ORCID: 0000-0003-0397-7514

About

Contact & Profiles

OpenAlex author ID: A5060741187

Research Areas

Algorithms and Data Compression
Topic Modeling
Computational Geometry and Mesh Generation
Natural Language Processing Techniques
Genomics and Phylogenetic Studies
Advanced Graph Neural Networks
RNA and protein synthesis mechanisms
Complex Network Analysis Techniques
Advanced Graph Theory Research
DNA and Biological Computing
Advanced Text Analysis Techniques
Data Management and Algorithms
Graph Theory and Algorithms
Authorship Attribution and Profiling
Gene expression and cancer classification
Digital Image Processing Techniques
Machine Learning and Algorithms
Web Data Mining and Analysis
Computability, Logic, AI Algorithms
Computer Graphics and Visualization Techniques
Sentiment Analysis and Opinion Mining
Genome Rearrangement Algorithms
Complexity and Algorithms in Graphs
Constraint Satisfaction and Optimization
Advanced biosensing and bioanalysis techniques

Stony Brook University
2015-2024

State University of New York
2004-2023

Cornell University
1991-2020

Technion – Israel Institute of Technology
2020

Tampere University
2020

University of Illinois Urbana-Champaign
1985-2020

Institute for Research in Fundamental Sciences
2020

Georgia Institute of Technology
2020

University of Michigan
2020

University of California, Berkeley
2008

DeepWalk

OPENALEX - Publications

Bryan Perozzi Rami Al‐Rfou Steven Skiena

We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs.

10.1145/2623330.2623732 preprint EN 2014-08-22
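The idea of carrying language modeling over "from sequences of words to graphs" can be illustrated with short random walks that play the role of sentences. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `random_walks`, its parameters, and the toy graph are hypothetical, and the step that would train a word-embedding model on the walks is left out.

```python
import random

def random_walks(adj, num_walks=2, walk_length=5, seed=0):
    """Generate fixed-length random walks; each walk acts like a 'sentence' of vertices.

    adj: dict mapping each vertex to a list of its neighbours (hypothetical toy format).
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)  # visit start vertices in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adj[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks

# Toy undirected graph, purely illustrative.
toy_graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
walks = random_walks(toy_graph)
```

The resulting vertex sequences could then be handed to any skip-gram style embedding trainer in place of tokenized sentences.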

Virus Attenuation by Genome-Scale Changes in Codon Pair Bias

J. Robert Coleman Dimitris Papamichail Steven Skiena Bruce Futcher Eckard Wimmer and 1 more

As a result of the redundancy of the genetic code, adjacent pairs of amino acids can be encoded by as many as 36 different pairs of synonymous codons. A species-specific "codon pair bias" provides that some synonymous codon pairs are used more or less frequently than statistically predicted. We synthesized de novo large DNA molecules using hundreds of over- or underrepresented synonymous codon pairs to encode the poliovirus capsid protein. Underrepresented codon pairs caused decreased rates of protein translation, and polioviruses containing such amino acid-independent changes...

10.1126/science.1155761 article EN Science 2008-06-26
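The figure of 36 in the abstract above follows from the standard genetic code: the most redundant amino acids (leucine, serine, arginine) each have six codons, so a pair of them has 6 × 6 = 36 synonymous encodings. A small sketch that checks this arithmetic, assuming the standard codon table; the helper name `synonymous_pair_count` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid, ignoring stop codons.
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

def synonymous_pair_count(aa1, aa2):
    """Number of distinct codon pairs that encode the adjacent amino acids aa1, aa2."""
    return codons_per_aa[aa1] * codons_per_aa[aa2]

max_pairs = max(
    synonymous_pair_count(a, b) for a in codons_per_aa for b in codons_per_aa
)
```

Here `synonymous_pair_count("L", "R")` gives the 36 encodings of a leucine-arginine pair, while a methionine-tryptophan pair has only one.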

Statistically Significant Detection of Linguistic Change

Vivek Kulkarni Rami Al‐Rfou Bryan Perozzi Steven Skiena

We propose a new computational approach for tracking and detecting statistically significant linguistic shifts in the meaning and usage of words. Such linguistic shifts are especially prevalent on the Internet, where the rapid exchange of ideas can quickly change a word's meaning. Our meta-analysis approach constructs property time series of word usage, and then uses statistically sound change point detection algorithms to identify significant linguistic shifts. We consider and analyze three approaches of increasing complexity to generate such linguistic property time series, the culmination of which uses distributional...

10.1145/2736277.2741627 article EN 2015-05-18

Reduction of the Rate of Poliovirus Protein Synthesis through Large-Scale Codon Deoptimization Causes Attenuation of Viral Virulence by Lowering Specific Infectivity

Steffen Mueller Dimitris Papamichail J. Robert Coleman Steven Skiena Eckard Wimmer

Exploring the utility of de novo gene synthesis with the aim of designing stably attenuated polioviruses (PV), we followed two strategies to construct PV variants containing synthetic replacements of the capsid coding sequences either by deoptimizing synonymous codon usage (PV-AB) or by maximizing synonymous codon position changes of the existing wild-type (wt) poliovirus codons (PV-SD). Despite 934 nucleotide changes in the capsid coding region, PV-SD RNA produced virus with wild-type characteristics. In contrast, no viable virus was recovered from PV-AB carrying 680 silent...

10.1128/jvi.00738-06 article EN Journal of Virology 2006-09-14

HARP: Hierarchical Representation Learning for Networks

Haochen Chen Bryan Perozzi Yifan Hu Steven Skiena

We present HARP, a novel method for learning low dimensional embeddings of a graph's nodes which preserves higher-order structural features. Our proposed method achieves this by compressing the input graph prior to embedding it, effectively avoiding troublesome embedding configurations (i.e. local minima) which can pose problems to non-convex optimization. HARP works by finding a smaller graph which approximates the global structure of its input. This simplified graph is used to learn a set of initial representations, which serve as good initializations...

10.1609/aaai.v32i1.11849 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2018-04-26

Live attenuated influenza virus vaccines by computer-aided rational design

Steffen Mueller J. Robert Coleman Dimitris Papamichail Charles B. Ward Anjaruwee S. Nimnual and 3 more

10.1038/nbt.1636 article EN Nature Biotechnology 2010-06-13

Co-training Embeddings of Knowledge Graphs and Entity Descriptions for Cross-lingual Entity Alignment

Muhao Chen Yingtao Tian Kai-Wei Chang Steven Skiena Carlo Zaniolo

Multilingual knowledge graph (KG) embeddings provide latent semantic representations of entities and structured knowledge with cross-lingual inferences, which benefit various knowledge-driven cross-lingual NLP tasks. However, precisely learning such cross-lingual inferences is usually hindered by the low coverage of entity alignment in many KGs. Since many multilingual KGs also provide literal descriptions of entities, in this paper, we introduce an embedding-based approach which leverages a weakly aligned multilingual KG for semi-supervised cross-lingual learning using entity descriptions. Our...

10.24963/ijcai.2018/556 article EN 2018-07-01

Syntax-Directed Variational Autoencoder for Structured Data

Hanjun Dai Yingtao Tian Bo Dai Steven Skiena Le Song

Deep generative models have been enjoying success in modeling continuous data. However, it remains challenging to capture the representations for discrete structures with formal grammars and semantics, e.g., computer programs and molecular structures. How to generate both syntactically and semantically correct data still remains largely an open problem. Inspired by the theory of compilers, where the syntax and semantics check is done via syntax-directed translation (SDT), we propose a novel syntax-directed variational autoencoder (SD-VAE)...

10.48550/arxiv.1802.08786 preprint EN other-oa arXiv (Cornell University) 2018-01-01

International Sentiment Analysis for News and Blogs

Mikhail Bautin Lohit Vijayarenu Steven Skiena

There is a growing interest in mining opinions using sentiment analysis methods from sources such as news, blogs and product reviews. Most of these methods have been developed for English and are difficult to generalize to other languages. We explore an approach utilizing state-of-the-art machine translation technology and perform sentiment analysis on the English translation of a foreign language text. Our experiments indicate that (a) entity sentiment scores obtained by our method are statistically significantly correlated across nine languages of news sources and five languages of a parallel...

10.1609/icwsm.v2i1.18606 article EN Proceedings of the International AAAI Conference on Web and Social Media 2021-09-25

Elevated atmospheric CO2 affects soil microbial diversity associated with trembling aspen

Céline Lesaulnier Dimitrios Papamichail Sean McCorkle Bernard Ollivier Steven Skiena and 3 more

Summary: The effects of elevated atmospheric CO2 (560 p.p.m.) and subsequent plant responses on the soil microbial community composition associated with trembling aspen were assessed through the classification of 6996 complete ribosomal DNA sequences amplified from the Rhinelander WI free‐air CO2 and O3 enrichment (FACE) experiments microbial community metagenome. This in‐depth comparative analysis provides an unprecedented, detailed and deep branching profile of population changes incurred as a response to this environmental...

10.1111/j.1462-2920.2007.01512.x article EN Environmental Microbiology 2008-01-30

Computational discrete mathematics: combinatorics and graph theory with Mathematica

Sriram V. Pemmaraju Steven Skiena

With examples of all 450 functions in action plus tutorial text on the mathematics, this book is the definitive guide to experimenting with Combinatorica, a widely used software package for teaching and research in discrete mathematics. Three interesting classes of exercises are provided--theorem/proof, programming exercises, and experimental explorations--ensuring great flexibility in teaching and learning the material. The Combinatorica user community ranges from students to engineers, to researchers in mathematics, computer science, physics,...

10.5860/choice.42-0356 article EN Choice Reviews Online 2004-09-01

Trading Strategies to Exploit Blog and News Sentiment

Wenbin Zhang Steven Skiena

We use quantitative media (blogs, and news as a comparison) data generated by a large-scale natural language processing (NLP) text analysis system to perform a comprehensive and comparative study on how company-related variables anticipate or reflect the company's stock trading volumes and financial returns. Building on our findings, we give a sentiment-based market-neutral trading strategy which gives consistently favorable returns with low volatility over a long period. Our results are significant in confirming...

10.1609/icwsm.v4i1.14075 article EN Proceedings of the International AAAI Conference on Web and Social Media 2010-05-16

Lowest common ancestors in trees and directed acyclic graphs

Michael A. Bender Martín Farach-Colton Giridhar Pemmasani Steven Skiena Pavel Sumazin

10.1016/j.jalgor.2005.08.001 article EN Journal of Algorithms 2005-09-17

Name-ethnicity classification from open sources

Anurag Anil Ambekar Charles B. Ward Jahangir Mohammed Swapna Male Steven Skiena

The problem of ethnicity identification from names has a variety of important applications, including biomedical research, demographic studies, and marketing. Here we report on the development of an ethnicity classifier where all training data is extracted from public, non-confidential (and hence somewhat unreliable) sources. Our classifier uses hidden Markov models (HMMs) and decision trees to classify names into 13 cultural/ethnic groups with individual group accuracy comparable to earlier binary (e.g., Spanish/non-Spanish)...

10.1145/1557019.1557032 article EN 2009-06-28

Optimizing triangle strips for fast rendering

Francine Evans Steven Skiena Amitabh Varshney

Almost all scientific visualization involving surfaces is currently done via triangles. The speed at which such triangulated surfaces can be displayed is crucial to interactive visualization and is bounded by the rate at which triangulated data can be sent to the graphics subsystem for rendering. Partitioning polygonal models into triangle strips can significantly reduce rendering times over transmitting each triangle individually. We present new and efficient algorithms for constructing triangle strips from partially triangulated models, and experimental results showing these are on average 15% better than...

10.5555/244979.245626 article EN IEEE Visualization 1996-10-27

Don't Walk, Skip!

Bryan Perozzi Vivek Kulkarni Haochen Chen Steven Skiena

We present WALKLETS, a novel approach for learning multiscale representations of vertices in a network. In contrast to previous works, these representations explicitly encode multi-scale vertex relationships in a way that is analytically derivable.

10.1145/3110025.3110086 article EN 2017-07-31

Building Sentiment Lexicons for All Major Languages

Yanqing Chen Steven Skiena

Sentiment analysis in a multilingual world remains a challenging problem, because developing language-specific sentiment lexicons is an extremely resource-intensive process. Such lexicons remain a scarce resource for most languages. In this paper, we address this lexicon gap by building high-quality sentiment lexicons for 136 major languages. We integrate a variety of linguistic resources to produce an immense knowledge graph. By appropriately propagating from seed words, we construct sentiment lexicons for each component language of our graph. Our lexicons have a polarity agreement of 95.7% with...

10.3115/v1/p14-2063 article EN 2014-01-01

HARP: Hierarchical Representation Learning for Networks

Haochen Chen Bryan Perozzi Yifan Hu Steven Skiena

We present HARP, a novel method for learning low dimensional embeddings of a graph's nodes which preserves higher-order structural features. Our proposed method achieves this by compressing the input graph prior to embedding it, effectively avoiding troublesome embedding configurations (i.e. local minima) which can pose problems to non-convex optimization. HARP works by finding a smaller graph which approximates the global structure of its input. This simplified graph is used to learn a set of initial representations, which serve as good initializations...

10.48550/arxiv.1706.07845 preprint EN other-oa arXiv (Cornell University) 2017-01-01
