The First International Workshop on
Learning from Limited or Noisy Data
For Information Retrieval
July 12th, 2018, Ann Arbor, Michigan, USA. Co-located with SIGIR 2018.

About the Workshop

In recent years, machine learning approaches, and in particular deep neural networks, have yielded significant improvements on several natural language processing and computer vision tasks; however, such breakthroughs have not yet been observed in the area of information retrieval. Besides the complexity of IR tasks, such as understanding the user's information needs, a main reason is the lack of high-quality and/or large-scale training data for many IR tasks. This necessitates studying how to design and train machine learning algorithms when no large-scale or high-quality data is at hand. Therefore, considering the rapid progress in the development of machine learning models, this is an ideal time for a workshop that focuses specifically on learning in such an important and challenging setting for IR tasks.

The goal of this workshop is to bring together researchers from industry, where data is plentiful but noisy, with researchers from academia, where data is sparse but clean, to discuss solutions to these related problems.

Call for Papers

We invite two kinds of contributions: research papers (up to 6 pages) and position papers (up to 2 pages). Submissions must be in English, in PDF format, and should not exceed the appropriate page limit in the current ACM two-column conference format (including references and figures). Suitable LaTeX and Word templates are available from the ACM Website. The papers can represent reports of original research, preliminary research results, or proposals for new work. The review process is single-blind. Papers will be evaluated according to their significance, originality, technical content, style, clarity, relevance to the workshop, and likelihood of generating discussion. Authors should note that changes to the author list after the submission deadline are not allowed without permission from the PC Chairs. At least one author of each accepted paper is required to register for, attend, and present the work at the workshop. All papers are to be submitted via EasyChair at https://easychair.org/conferences/?conf=lnd4ir.

Papers presented at the workshop will be required to be uploaded to arXiv.org but will be considered non-archival, and may be submitted elsewhere (modified or not), although the workshop site will maintain a link to the arXiv versions. This makes the workshop a forum for the presentation and discussion of current work, without preventing the work from being published elsewhere.

Relevant topics include, but are not limited to:
  • Learning from noisy data for IR
    • Learning from automatically constructed data
    • Learning from implicit feedback data, e.g., click data
  • Distant or weak supervision and learning from IR heuristics
  • Unsupervised and semi-supervised learning for IR
  • Transfer learning for IR
  • Incorporating expert/domain knowledge to improve learning-based IR models
    • Learning from labeled features
    • Incorporating IR axioms to improve machine learning models

Important Dates:

  • Submission deadline: May 4, 2018
  • Paper notifications: May 25, 2018
  • Camera-ready deadline: June 8, 2018
  • Workshop Day: July 12, 2018

Organizers

Hamed Zamani

University of Massachusetts Amherst

Mostafa Dehghani

University of Amsterdam

Hang Li

Toutiao

Nick Craswell

Microsoft

Program Committee:

  • Michael Bendersky, Google, USA
  • Daniel Cohen, UMass Amherst, USA
  • W. Bruce Croft, UMass Amherst, USA
  • J. Shane Culpepper, RMIT Univ., Australia
  • Maarten de Rijke, Univ. of Amsterdam, The Netherlands
  • Jiafeng Guo, Chinese Academy of Sciences, China
  • Claudia Hauff, TU Delft, The Netherlands
  • Jaap Kamps, Univ. of Amsterdam, The Netherlands
  • Craig Macdonald, Univ. of Glasgow, UK
  • Bhaskar Mitra, Microsoft and UCL, UK
  • Amirmohammad Rooshenas, UMass Amherst, USA
  • Min Zhang, Tsinghua University, China
  • Yongfeng Zhang, Rutgers University, USA

Accepted Papers

  • "Distributed Evaluations: Ending Neural Point Metrics", Daniel Cohen, Scott M. Jordan, and W. Bruce Croft.

  • "Explainable Agreement through Simulation for Tasks with Subjective Labels", John Foley.

  • "Information Retrieval in African Languages", Hussein Suleman.

  • "Highly Relevant Routing Recommendation Systems for Handling Few Data Using MDL Principle", Diyah Puspitaningrum, I.S.W.B. Prasetya, and P.A. Wicaksono.

  • "Learning to Rank from Samples of Variable Quality", Mostafa Dehghani and Jaap Kamps.

  • "Multilingual Sentiment Analysis: An RNN-Based Framework for Limited Data", Ethem Can, Aysu Ezen-Can, and Fazli Can.

  • "Named Entity Recognition with Extremely Limited Data", John Foley, Sheikh Muhammad Sarwar, and James Allan.

  • "Towards Theoretical Understanding of Weak Supervision for Information Retrieval", Hamed Zamani and W. Bruce Croft.