The First International Workshop on
Learning from Limited or Noisy Data
For Information Retrieval
July 12th, 2018, Ann Arbor, Michigan, USA. Co-located with SIGIR 2018.

About the Workshop

In recent years, machine learning approaches, and in particular deep neural networks, have yielded significant improvements on several natural language processing and computer vision tasks; however, such breakthroughs have not yet been observed in the area of information retrieval. Besides the complexity of IR tasks, such as understanding the user's information needs, a main reason is the lack of high-quality and/or large-scale training data for many IR tasks. This necessitates studying how to design and train machine learning algorithms when no large-scale or high-quality data is at hand. Given the rapid progress in the development of machine learning models, this is an ideal time for a workshop that focuses specifically on learning in this important and challenging setting for IR tasks.

The goal of this workshop is to bring together researchers from industry, where data is plentiful but noisy, with researchers from academia, where data is sparse but clean, to discuss solutions to these related problems.

Call for Papers

We invite two kinds of contributions: research papers (up to 6 pages) and position papers (up to 2 pages). Submissions must be in English, in PDF format, and should not exceed the appropriate page limit in the current ACM two-column conference format (including references and figures). Suitable LaTeX and Word templates are available from the ACM Website. The papers can represent reports of original research, preliminary research results, or proposals for new work. The review process is single-blind. Papers will be evaluated according to their significance, originality, technical content, style, clarity, relevance to the workshop, and likelihood of generating discussion. Authors should note that changes to the author list after the submission deadline are not allowed without permission from the PC Chairs. At least one author of each accepted paper is required to register for, attend, and present the work at the workshop. All short papers are to be submitted via EasyChair at

Papers presented at the workshop must be uploaded to arXiv but will be considered non-archival and may be submitted elsewhere (modified or not); the workshop site will maintain a link to the arXiv versions. This makes the workshop a forum for the presentation and discussion of current work, without preventing the work from being published elsewhere.

Relevant topics include, but are not limited to:
  • Learning from noisy data for IR
    • Learning from automatically constructed data
    • Learning from implicit feedback data, e.g., click data
    • Meta-learning for noisy data
  • Distant or weak supervision and learning from IR heuristics
  • Unsupervised and semi-supervised learning for IR
  • Transfer learning for IR
  • Incorporating expert/domain knowledge to improve learning-based IR models
    • Learning from labeled features
    • Incorporating IR axioms to improve machine learning models

Important Dates:

  • Submission deadline: May 4, 2018
  • Paper notifications: May 25, 2018
  • Camera-ready deadline: June 8, 2018
  • Workshop Day: July 12, 2018


Hamed Zamani

University of Massachusetts Amherst

Mostafa Dehghani

University of Amsterdam

Hang Li


Nick Craswell