NFDI4DS | UHH-SEMS - Publication Details

Stephen F. Altschul

ORCID: 0000-0003-2120-9631

About

Contact & Profiles

A5015061000

Research Areas

Genomics and Phylogenetic Studies
RNA and protein synthesis mechanisms
Machine Learning in Bioinformatics
Protein Structure and Dynamics
Algorithms and Data Compression
Advanced Proteomics Techniques and Applications
Genomics and Chromatin Dynamics
Bioinformatics and Genomic Networks
Gene expression and cancer classification
Genetics, Bioinformatics, and Biomedical Research
Molecular Biology Techniques and Applications
Bayesian Methods and Mixture Models
Glycosylation and Glycoproteins Research
Enzyme Structure and Function
Bacterial Genetics and Biotechnology
Biomedical Text Mining and Ontologies
Microbial Metabolic Engineering and Bioproduction
Genomics and Rare Diseases
RNA modifications and cancer
RNA Research and Splicing
Evolution and Paleontology Studies
DNA and Biological Computing
Fractal and DNA sequence analysis
Bacteriophages and microbial interactions
Computational Drug Discovery Methods

National Center for Biotechnology Information
2010-2021

National Institutes of Health
2010-2021

Vanderbilt University
2014

Center for Human Genetics
2014

Center for Information Technology
2011

Rockefeller University
1986-2008

Florida Atlantic University
2003

United States National Library of Medicine
1990-2002

Duke University Hospital
2000

Duke Medical Center
2000

Basic local alignment search tool

OPENALEX - Publications

Stephen F. Altschul Warren Gish Webb Miller Eugene W. Myers David J. Lipman

10.1016/s0022-2836(05)80360-2 article EN Journal of Molecular Biology 1990-10-01

Basic Local Alignment Search Tool

Stephen F. Altschul

10.1006/jmbi.1990.9999 article EN Journal of Molecular Biology 1990-10-05

Identification of FAP Locus Genes from Chromosome 5q21

Kenneth W. Kinzler Mef Nilbert Li-Kuo Su Bert Vogelstein Tracy M. Bryan and 16 more

Recent studies suggest that one or more genes on chromosome 5q21 are important for the development of colorectal cancers, particularly those associated with familial adenomatous polyposis (FAP). To facilitate the identification of genes from this locus, a portion of the region that is tightly linked to FAP was cloned. Six contiguous stretches of sequence (contigs) containing approximately 5.5 Mb of DNA were isolated. Subclones from these contigs were used to identify and position six genes, all of which were expressed in normal colonic...

10.1126/science.1651562 article EN Science 1991-08-09

Detecting Subtle Sequence Signals: a Gibbs Sampling Strategy for Multiple Alignment

Charles E. Lawrence Stephen F. Altschul Mark S. Boguski Jun S. Liu Andrew F. Neuwald and 1 more

A wealth of protein and DNA sequence data is being generated by genome projects and other sequencing efforts. A crucial barrier to deciphering these sequences and understanding the relations among them is the difficulty of detecting subtle local residue patterns common to multiple sequences. Such patterns frequently reflect similar molecular structures and biological properties. A mathematical definition of this "local alignment" problem suitable for full computer automation has been used to develop a new and sensitive algorithm, based on...

10.1126/science.8211139 article EN Science 1993-10-08

Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences

Robert L. Strausberg Elise A. Feingold Lynette Grouse Jeffery G. Derge Richard D. Klausner and 77 more

The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All sequences...

10.1073/pnas.242603899 article EN Proceedings of the National Academy of Sciences 2002-12-11

Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes.

Samuel Karlin Stephen F. Altschul

An unusual pattern in a nucleic acid or protein sequence or a region of strong similarity shared by two or more sequences may have biological significance. It is therefore desirable to know whether such a pattern can have arisen simply by chance. To identify interesting sequence patterns, appropriate scoring values can be assigned to the individual residues of a single sequence or to sets of residues when several sequences are compared. For single sequences, such scores can reflect biophysical properties such as charge, volume, hydrophobicity, or secondary structure potential; for multiple sequences, they...

10.1073/pnas.87.6.2264 article EN Proceedings of the National Academy of Sciences 1990-03-01

A workbench for multiple alignment construction and analysis

Gregory D. Schuler Stephen F. Altschul David J. Lipman

Multiple sequence alignment can be a useful technique for studying molecular evolution, as well as for analyzing relationships between structure or function and primary sequence. We have developed for this purpose an interactive program, MACAW (Multiple Alignment Construction and Analysis Workbench), that allows the user to construct multiple alignments by locating, analyzing, editing, and combining “blocks” of aligned sequence segments. MACAW incorporates several novel features. (1) Regions of local similarity are...

10.1002/prot.340090304 article EN Proteins Structure Function and Bioinformatics 1991-03-01

Domain enhanced lookup time accelerated BLAST

Grzegorz M. Boratyn Alejandro A. Schäffer Richa Agarwala Stephen F. Altschul David J. Lipman and 1 more

BLAST is a commonly-used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific score matrix (PSSM) for searching the database in round i + 1. Biegert and Söding developed Context-sensitive BLAST (CS-BLAST), which combines information from searching the sequence database with information derived from a library of short protein profiles to achieve better homology detection than PSI-BLAST, which builds its...

10.1186/1745-6150-7-12 article EN cc-by Biology Direct 2012-01-01
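
The abstract above mentions building a position-specific score matrix (PSSM) from the matches of one search round. As a rough illustration of what a PSSM is, the sketch below builds log-odds column scores from a set of gapless, equal-length aligned segments and finds the best-scoring window in a sequence. It is a simplified teaching example, not PSI-BLAST's actual procedure: the uniform background, the fixed `pseudocount`, and the function names `build_pssm` and `best_window` are all assumptions made for this example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(segments, alphabet=AMINO_ACIDS, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column.

    Assumes gapless segments of equal length and a uniform background
    frequency (both simplifications chosen for illustration).
    """
    length = len(segments[0])
    if any(len(seg) != length for seg in segments):
        raise ValueError("all segments must have the same length")
    background = 1.0 / len(alphabet)
    total = len(segments) + pseudocount * len(alphabet)
    pssm = []
    for col in range(length):
        counts = Counter(seg[col] for seg in segments)
        pssm.append({
            res: math.log2(((counts[res] + pseudocount) / total) / background)
            for res in alphabet
        })
    return pssm

def best_window(pssm, sequence):
    """Return (score, start) of the highest-scoring ungapped window.

    Residues outside the alphabet raise KeyError in this sketch.
    """
    width = len(pssm)
    scored = [
        (sum(col[res] for col, res in zip(pssm, sequence[i:i + width])), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)

pssm = build_pssm(["ACD", "ACD", "ACE"])
score, start = best_window(pssm, "WWACDWW")
```

In an iterative search, the segments found above some score cutoff in one round would feed the matrix for the next round; that loop is omitted here.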

A superfamily of conserved domains in DNA damage-responsive cell cycle checkpoint proteins

Peer Bork Kay Hofmann Philipp Bücher Andrew F. Neuwald Stephen F. Altschul and 1 more

Computer analysis of a conserved domain, BRCT, first described at the carboxyl terminus of the breast cancer protein BRCA1, a p53 binding protein (53BP1), and the yeast cell cycle checkpoint protein RAD9 revealed a large superfamily of domains that occur predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage. The BRCT domain consists of ~95 amino acid residues and occurs as a tandem repeat at the carboxyl terminus of numerous proteins, but has been observed also as a tandem repeat at the amino terminus or as a single copy. The BRCT superfamily presently includes ~40 nonorthologous proteins, namely,...

10.1096/fasebj.11.1.9034168 article EN The FASEB Journal 1997-01-01

[27] Local alignment statistics

Stephen F. Altschul Warren Gish

10.1016/s0076-6879(96)66029-7 article EN Methods in enzymology on CD-ROM/Methods in enzymology 1996-01-01

Amino acid substitution matrices from an information theoretic perspective

Stephen F. Altschul

10.1016/0022-2836(91)90193-a article EN Journal of Molecular Biology 1991-06-01

Composition-based statistics and translated nucleotide searches: Improving the TBLASTN module of BLAST

E. Michael Gertz Yi‐Kuo Yu Richa Agarwala Alejandro A. Schäffer Stephen F. Altschul

TBLASTN is a mode of operation for BLAST that aligns protein sequences to a nucleotide database translated in all six frames. We present the first description of the modern implementation of TBLASTN, focusing on new techniques that were used to implement composition-based statistics for translated nucleotide searches. Composition-based statistics use the composition of the sequences being aligned to generate more accurate E-values, which allows for a more accurate distinction between true and false matches. Until recently, composition-based statistics were available only for protein-protein searches. They are now available as a command line option...

10.1186/1741-7007-4-41 article EN cc-by BMC Biology 2006-12-01
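
The abstract above refers to a nucleotide database "translated in all six frames". A minimal sketch of that idea follows: three reading-frame offsets on the forward strand plus three on the reverse complement, translated with the standard genetic code. The function names and the frame labels (`+1` to `-3`) are assumptions for this example, and the sketch is not taken from the TBLASTN implementation.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frame_translations(dna):
    """Map frame labels '+1'..'+3' and '-1'..'-3' to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

frames = six_frame_translations("ATGGCCTAA")
# frames["+1"] is "MA*" (ATG GCC TAA); frames["-1"] is "LGH" (TTA GGC CAT)
```

Stop codons are kept as `*` rather than splitting the translation, which keeps the sketch short; a search tool would typically handle them explicitly.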

A tool for multiple sequence alignment.

David J. Lipman Stephen F. Altschul John Kececioglu

Multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence-structure relationships. Until recently, it has been impractical to apply dynamic programming, the most widely accepted method for producing pairwise alignments, to comparisons of more than three sequences. We describe the design and application of a tool for multiple alignment of amino acid sequences that implements a new algorithm that greatly reduces the computational demands of dynamic programming. This tool is able to align in reasonable time...

10.1073/pnas.86.12.4412 article EN Proceedings of the National Academy of Sciences 1989-06-01

Protein database searches for multiple alignments.

Stephen F. Altschul David J. Lipman

Protein database searches frequently can reveal biologically significant sequence relationships useful in understanding structure and function. Weak but meaningful sequence patterns can be obscured, however, by other similarities due only to chance. By searching a database for multiple as opposed to pairwise alignments, distant relationships are much more easily distinguished from background noise. Recent statistical results permit the power of this approach to be analyzed. Given a typical query sequence, an algorithm described here...

10.1073/pnas.87.14.5509 article EN Proceedings of the National Academy of Sciences 1990-07-01

SAGEmap: A Public Gene Expression Resource

Alex E. Lash Carolyn M. Tolstoshev Lukas Wagner Gregory D. Schuler Robert L. Strausberg and 2 more

We have constructed a public gene expression data repository and online data access and analysis, WWW and FTP sites for serial analysis of gene expression (SAGE) data. The WWW and FTP components of this resource, SAGEmap, are located at http://www.ncbi.nlm.nih.gov/sage and ftp://ncbi.nlm.nih.gov/pub/sage, respectively. We herein describe SAGE data submission procedures, the construction and characteristics of SAGE tags to gene assignments, the derivation and use of a novel statistical test designed specifically for differential-type analyses of SAGE data, and the organization and use of this resource.

10.1101/gr.10.7.1051 article EN cc-by-nc Genome Research 2000-07-01

Applications and statistics for multiple high-scoring segments in molecular sequences.

Samuel Karlin Stephen F. Altschul

Score-based measures of molecular-sequence features provide versatile aids for the study of proteins and DNA. They are used by many sequence data base search programs, as well as for identifying distinctive properties of single sequences. For any such measure, it is important to know what can be expected to occur purely by chance. The statistical distribution of high-scoring segments has been described elsewhere. However, molecular sequences will frequently yield several high-scoring segments for which some combined assessment is in order....

10.1073/pnas.90.12.5873 article EN Proceedings of the National Academy of Sciences 1993-06-15

IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices

A. A. Schaffer Yuri I. Wolf Chris P. Ponting Eugene V. Koonin L. Aravind and 1 more

Motivation: Many studies have shown that database searches using position-specific score matrices (PSSMs) or profiles as queries are more effective at identifying distant protein relationships than are searches that use simple sequences as queries. One popular program for constructing a PSSM and comparing it with a database of sequences is Position-Specific Iterated BLAST (PSI-BLAST). Results: This paper describes a new software package, IMPALA, designed for the complementary procedure of comparing a single query sequence with a database of PSI-BLAST-generated...

10.1093/bioinformatics/15.12.1000 article EN Bioinformatics 1999-12-01

Detection of conserved segments in proteins: iterative scanning of sequence databases with alignment blocks.

Roman L. Tatusov Stephen F. Altschul Eugene V. Koonin

We describe an approach to analyzing protein sequence databases that, starting from a single uncharacterized sequence or group of related sequences, generates blocks of conserved segments. The procedure involves iterative database scans with an evolving position-dependent weight matrix constructed from a coevolving set of aligned conserved segments. For each iteration, the expected distribution of matrix scores under a random model is used to set a cutoff score for the inclusion of a segment in the next iteration. This cutoff may be calculated to allow the chance inclusion of either a fixed number...

10.1073/pnas.91.25.12091 article EN Proceedings of the National Academy of Sciences 1994-12-06

Optimal sequence alignment using affine gap costs

Stephen F. Altschul Bruce W. Erickson

10.1007/bf02462326 article EN Bulletin of Mathematical Biology 1986-09-01
