Gilad Mishne

ORCID: 0009-0009-6858-0228
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Web Data Mining and Analysis
  • Information Retrieval and Search Behavior
  • Advanced Text Analysis Techniques
  • Complex Network Analysis Techniques
  • Semantic Web and Ontologies
  • Genomics and Rare Diseases
  • Spam and Phishing Detection
  • Speech and dialogue systems
  • Expert finding and Q&A systems
  • Recommender Systems and Techniques
  • Genetic Associations and Epidemiology
  • Sentiment Analysis and Opinion Mining
  • Biomedical Text Mining and Ontologies
  • Cancer Genomics and Diagnostics
  • Genomic variations and chromosomal abnormalities
  • Data Management and Algorithms
  • Advanced Database Systems and Queries
  • Digital Marketing and Social Media
  • Algorithms and Data Compression
  • Genetics, Bioinformatics, and Biomedical Research
  • Digital Humanities and Scholarship
  • Open Education and E-Learning
  • Statistics Education and Methodologies

Twitter (United States)
2012-2021

Color (United States)
2016-2020

Yahoo (United Kingdom)
2008-2012

Yahoo (United States)
2009-2010

University of Amsterdam
2003-2007

University of Maryland, College Park
2003

The quality of user-generated content varies drastically from excellent to abuse and spam. As the availability of such content increases, the task of identifying high-quality content in sites based on user contributions -- social media sites -- becomes increasingly important. Social media in general exhibit a rich variety of information sources: in addition to the content itself, there is a wide array of non-content information available, such as links between items and explicit quality ratings from members of the community. In this paper we investigate methods for exploiting such community feedback...

10.1145/1341531.1341557 article EN 2008-01-01

This paper describes AutoTag, a tool which suggests tags for weblog posts using collaborative filtering methods. An evaluation of AutoTag on a large collection of posts shows good accuracy; coupled with the blogger's final quality control, AutoTag assists both in simplifying the tagging process and in improving its quality.

10.1145/1135777.1135961 article EN 2006-05-23

In web search, recency ranking refers to ranking documents by relevance which takes freshness into account. In this paper, we propose a retrieval system which automatically detects and responds to recency sensitive queries. The system detects recency sensitive queries using a high precision classifier. The system responds to recency sensitive queries by using a machine learned ranking model trained for such queries. We use multiple recency features to provide temporal evidence which effectively represents document recency. Furthermore, we propose several training methodologies important for training recency sensitive rankers. Finally, we develop new evaluation metrics for recency sensitive queries. Our experiments...

10.1145/1718487.1718490 article EN 2010-02-04

Inherited susceptibility to common, complex diseases may be caused by rare, pathogenic variants ("monogenic") or by the cumulative effect of numerous common variants ("polygenic"). Comprehensive genome interpretation should enable assessment for both monogenic and polygenic components of inherited risk. The traditional approach requires two distinct genetic testing technologies: high coverage sequencing of known genes to detect monogenic variants, and a genome-wide genotyping array followed by imputation to calculate genome-wide polygenic scores (GPSs). We...

10.1186/s13073-019-0682-2 article EN cc-by Genome Medicine 2019-11-26

The reasoning tasks that can be performed with semantic web service descriptions depend on the quality of the domain ontologies used to create these descriptions. However, building such domain ontologies is a time consuming and difficult task. We describe an automatic extraction method that learns domain ontologies for web service descriptions from textual documentations attached to web services. We conducted our experiments in the field of bioinformatics by learning an ontology from the documentation of the web services used in myGrid, a project that supports biology experiments on the Grid. Based on the evaluation of the extracted ontology in the context...

10.1145/1060745.1060776 article EN 2005-01-01

We describe a method for discovering irregularities in temporal mood patterns appearing in a large corpus of blog posts, and labeling them with a natural language explanation. Simple techniques based on comparing corpus frequencies, coupled with large quantities of data, are shown to be effective for identifying the events underlying changes in global moods.

10.3115/1608974.1609010 article EN 2006-01-01

We present the architecture behind Twitter's real-time related query suggestion and spelling correction service. Although these tasks have received much attention in the web search literature, the Twitter context introduces a real-time "twist": after significant breaking news events, we aim to provide relevant results within minutes. This paper provides a case study illustrating the challenges of real-time data processing in the era of "big data". We tell the story of how our system was built twice: our first implementation was built on a typical Hadoop-based...

10.1145/2463676.2465290 article EN 2013-06-22

The real-time nature of Twitter means that term distributions in tweets and in search queries change rapidly: the most frequent terms in one hour may look very different from those in the next. Informally, we call this phenomenon "churn". Our interest in analyzing churn stems from the perspective of real-time search. How do we "correctly" compute term statistics, considering that the underlying distributions change rapidly? In this paper, we present an analysis of tweet and query churn on Twitter, as a first step to answering this question. Analyses reveal interesting insights on the temporal...

10.1609/icwsm.v6i1.14297 article EN Proceedings of the International AAAI Conference on Web and Social Media 2021-08-03

We describe a system for automating call-center analysis and monitoring. Our system integrates transcription of incoming calls with analysis of their content; for the analysis, we introduce a novel method of estimating the domain-specific importance of conversation fragments, based on divergence of corpus statistics. Combining this method with Information Retrieval approaches, we provide knowledge-mining tools both for the call-center agents and for administrators of the center.

10.1145/1099554.1099684 article EN 2005-10-31

User browsing information, particularly non-search-related activity, reveals important contextual information on the preferences and intents of Web users. In this article, we demonstrate the importance of mining general Web user behavior data to improve ranking and other Web-search experience, with an emphasis on analyzing individual user sessions for creating aggregate models. In this context, we introduce ClickRank, an efficient, scalable algorithm for estimating Webpage and Website importance from general Web user-behavior data. We lay out the theoretical...

10.1145/2109205.2109206 article EN ACM Transactions on the Web 2012-03-01

Advances in genome sequencing have led to a tremendous increase in the discovery of novel missense variants, but evidence for determining clinical significance can be limited or conflicting. Here, we present Learning from Evidence to Assess Pathogenicity (LEAP), a machine learning model that utilizes a variety of feature categories to classify variants, and achieves high performance in multiple genes and different health conditions. Feature categories include functional predictions, splice predictions, population frequencies, conservation scores,...

10.1002/humu.24011 article EN Human Mutation 2020-03-16

Next generation sequencing (NGS) has become a common technology for clinical genetic tests. The quality of NGS calls varies widely and is influenced by features like reference sequence characteristics, read depth, and mapping accuracy. With recent advances in NGS technology and software tools, the majority of variants called using NGS alone are in fact accurate and reliable. However, a small subset of difficult-to-call variants that still do require orthogonal confirmation exist. For this reason, many clinical laboratories confirm NGS results...

10.1186/s12864-018-4659-0 article EN cc-by BMC Genomics 2018-04-17