NFDI4DS | UHH-SEMS - Publication Details

Gilad Mishne

ORCID: 0009-0009-6858-0228

About

Contact & Profiles

A5042098508

Research Areas

Topic Modeling
Natural Language Processing Techniques
Web Data Mining and Analysis
Information Retrieval and Search Behavior
Advanced Text Analysis Techniques
Complex Network Analysis Techniques
Semantic Web and Ontologies
Genomics and Rare Diseases
Spam and Phishing Detection
Speech and dialogue systems
Expert finding and Q&A systems
Recommender Systems and Techniques
Genetic Associations and Epidemiology
Sentiment Analysis and Opinion Mining
Biomedical Text Mining and Ontologies
Cancer Genomics and Diagnostics
Genomic variations and chromosomal abnormalities
Data Management and Algorithms
Advanced Database Systems and Queries
Digital Marketing and Social Media
Algorithms and Data Compression
Genetics, Bioinformatics, and Biomedical Research
Digital Humanities and Scholarship
Open Education and E-Learning
Statistics Education and Methodologies

Twitter (United States)
2012-2021

Color (United States)
2016-2020

Yahoo (United Kingdom)
2008-2012

Yahoo (United States)
2009-2010

University of Amsterdam
2003-2007

University of Maryland, College Park
2003

Finding high-quality content in social media

OPENALEX - Publications

Eugene Agichtein Carlos Castillo Debora Donato Aristides Gionis Gilad Mishne

The quality of user-generated content varies drastically from excellent to abuse and spam. As the availability of such content increases, the task of identifying high-quality content in sites based on user contributions -- social media sites -- becomes increasingly important. Social media in general exhibit a rich variety of information sources: in addition to the content itself, there is a wide array of non-content information available, such as links between items and explicit quality ratings from members of the community. In this paper we investigate methods for exploiting such community feedback...

10.1145/1341531.1341557 article EN 2008-01-01

AutoTag

OPENALEX - Publications

Gilad Mishne

This paper describes AutoTag, a tool which suggests tags for weblog posts using collaborative filtering methods. An evaluation of AutoTag on a large collection of posts shows good accuracy; coupled with the blogger's final quality control, AutoTag assists both in simplifying the tagging process and improving its quality.

10.1145/1135777.1135961 article EN 2006-05-23

Towards recency ranking in web search

OPENALEX - Publications

Anlei Dong Yi Chang Zhaohui Zheng Gilad Mishne Jing Bai and 4 more

In web search, recency ranking refers to ranking documents by relevance which takes freshness into account. In this paper, we propose a retrieval system which automatically detects and responds to recency sensitive queries. The system detects recency sensitive queries using a high precision classifier. The system uses a machine learned ranking model trained for such queries. We use multiple recency features to provide temporal evidence which effectively represents document recency. Furthermore, we propose several training methodologies important for training recency sensitive rankers. Finally, we develop new evaluation metrics for recency sensitive queries. Our experiments...

10.1145/1718487.1718490 article EN 2010-02-04

Low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores

OPENALEX - Publications

Julian R. Homburger Cynthia L. Neben Gilad Mishne Alicia Y. Zhou Sekar Kathiresan and 1 more

Inherited susceptibility to common, complex diseases may be caused by rare, pathogenic variants ("monogenic") or by the cumulative effect of numerous common variants ("polygenic"). Comprehensive genome interpretation should enable assessment for both monogenic and polygenic components of inherited risk. The traditional approach requires two distinct genetic testing technologies: high coverage sequencing of known genes to detect monogenic variants, and a genome-wide genotyping array followed by imputation to calculate genome-wide polygenic scores (GPSs). We...

10.1186/s13073-019-0682-2 article EN cc-by Genome Medicine 2019-11-26

Learning domain ontologies for Web service descriptions

OPENALEX - Publications

Marta Sabou Chris Wroe Carole Goble Gilad Mishne

The reasoning tasks that can be performed with semantic web service descriptions depend on the quality of the domain ontologies used to create these descriptions. However, building such domain ontologies is a time consuming and difficult task. We describe an automatic extraction method that learns domain ontologies for web service descriptions from textual documentations attached to web services. We conducted our experiments in the field of bioinformatics by learning an ontology from the documentation of the web services used in myGrid, a project that supports biology experiments on the Grid. Based on the evaluation of the extracted ontology in the context...

10.1145/1060745.1060776 article EN 2005-01-01

Why are they excited?

OPENALEX - Publications

Krisztian Balog Gilad Mishne Maarten de Rijke

We describe a method for discovering irregularities in temporal mood patterns appearing in a large corpus of blog posts, and labeling them with a natural language explanation. Simple techniques based on comparing corpus frequencies, coupled with large quantities of data, are shown to be effective for identifying the events underlying changes in global moods.

10.3115/1608974.1609010 article EN 2006-01-01

Fast data in the era of big data

OPENALEX - Publications

Gilad Mishne Jeff Dalton Zhenghua Li Aneesh Sharma Jimmy Lin

We present the architecture behind Twitter's real-time related query suggestion and spelling correction service. Although these tasks have received much attention in the web search literature, the Twitter context introduces a real-time "twist": after significant breaking news events, we aim to provide relevant results within minutes. This paper provides a case study illustrating the challenges of real-time data processing in the era of "big data". We tell the story of how our system was built twice: our first implementation was built on a typical Hadoop-based...

10.1145/2463676.2465290 article EN 2013-06-22

A Study of "Churn" in Tweets and Real-Time Search Queries

OPENALEX - Publications

Jimmy Lin Gilad Mishne

The real-time nature of Twitter means that term distributions in tweets and in search queries change rapidly: the most frequent terms in one hour may look very different from those in the next. Informally, we call this phenomenon "churn". Our interest in analyzing churn stems from the perspective of real-time search. How do we "correctly" compute term statistics, considering that the underlying distributions change rapidly? In this paper, we present an analysis of tweet and query churn on Twitter, as a first step to answering this question. Analyses reveal interesting insights on the temporal...

10.1609/icwsm.v6i1.14297 article EN Proceedings of the International AAAI Conference on Web and Social Media 2021-08-03

Automatic analysis of call-center conversations

OPENALEX - Publications

Gilad Mishne David Carmel Ron Hoory Alexey Roytman Aya Soffer

We describe a system for automating call-center analysis and monitoring. Our system integrates transcription of incoming calls with analysis of their content; for the analysis, we introduce a novel method of estimating the domain-specific importance of conversation fragments, based on divergence of corpus statistics. Combining this method with Information Retrieval approaches, we provide knowledge-mining tools both for the call-center agents and for administrators of the center.

10.1145/1099554.1099684 article EN 2005-10-31

ClickRank

OPENALEX - Publications

Guangyu Zhu Gilad Mishne

User browsing information, particularly non-search-related activity, reveals important contextual information on the preferences and intents of Web users. In this article, we demonstrate the importance of mining general Web user behavior data to improve ranking and other Web-search experience, with an emphasis on analyzing individual user sessions for creating aggregate models. In this context, we introduce ClickRank, an efficient, scalable algorithm for estimating Webpage and Website importance from general Web user-behavior data. We lay out the theoretical...

10.1145/2109205.2109206 article EN ACM Transactions on the Web 2012-03-01

LEAP: Using machine learning to support variant classification in a clinical setting

OPENALEX - Publications

Carmen Lai Anjali D. Zimmer Robert O’Connor Serra Kim Raymond C. Chan and 4 more

Advances in genome sequencing have led to a tremendous increase in the discovery of novel missense variants, but evidence for determining clinical significance can be limited or conflicting. Here, we present Learning from Evidence to Assess Pathogenicity (LEAP), a machine learning model that utilizes a variety of feature categories to classify variants, and achieves high performance in multiple genes and different health conditions. Feature categories include functional predictions, splice predictions, population frequencies, conservation scores,...

10.1002/humu.24011 article EN Human Mutation 2020-03-16

A machine learning model to determine the accuracy of variant calls in capture-based next generation sequencing

OPENALEX - Publications

Jeroen van den Akker Gilad Mishne Anjali D. Zimmer Alicia Y. Zhou

Next generation sequencing (NGS) has become a common technology for clinical genetic tests. The quality of NGS calls varies widely and is influenced by features like reference sequence characteristics, read depth, and mapping accuracy. With recent advances in NGS technology and software tools, the majority of variants called using NGS alone are in fact accurate and reliable. However, a small subset of difficult-to-call variants that still do require orthogonal confirmation exist. For this reason, many clinical laboratories confirm NGS results...

10.1186/s12864-018-4659-0 article EN cc-by BMC Genomics 2018-04-17
