Workshop
Fri Dec 07 05:00 AM -- 03:30 PM (PST) @ Room 511 ABDE
Critiquing and Correcting Trends in Machine Learning
Thomas Rainforth · Matt Kusner · Benjamin Bloem-Reddy · Brooks Paige · Rich Caruana · Yee Whye Teh

Workshop Webpage: https://ml-critique-correct.github.io/

Recently there have been calls to make machine learning more reproducible, less hand-tailored, fair, and generally more thoughtful about how research is conducted and put into practice. These are hallmarks of a mature scientific field and will be crucial for machine learning to have the wide-ranging, positive impact it is expected to have. Without careful consideration, we as a field risk inflating expectations beyond what is possible. To address this, this workshop aims to better understand and to improve all stages of the research process in machine learning.

A number of recent papers have carefully considered trends in machine learning as well as the needs of the field when used in real-world scenarios [1-18]. Each of these works introspectively analyzes what we often take for granted as a field. Further, many propose solutions for moving forward. The goal of this workshop is to bring together researchers from all subfields of machine learning to highlight open problems and widespread dubious practices in the field, and crucially, to propose solutions. We hope to highlight issues and propose solutions in areas such as:
- Common practices [1, 8]
- Implicit technical and empirical assumptions that go unquestioned [2, 3, 5, 7, 11, 12, 13, 17, 18]
- Shortfalls in publication and reviewing setups [15, 16]
- Disconnects between research focus and application requirements [9, 10, 14]
- Surprising observations that make us rethink our research priorities [4, 6]

The workshop program is a collection of invited talks, alongside contributed posters and talks. For some of these talks, we plan a unique open format of 10 minutes of talk + 10 minutes of follow-up discussion. Additionally, a separate panel discussion will bring together researchers with a diverse set of viewpoints on the current challenges and potential solutions. During the panel, we will also open the conversation to the audience. The discussion will further be open to an online Q&A, with questions solicited prior to the workshop.

A key expected outcome of the workshop is a collection of important open problems at all levels of machine learning research, along with a record of various bad practices that we should no longer consider to be acceptable. Further, we hope that the workshop will make inroads in how to address these problems, highlighting promising new frontiers for making machine learning practical, robust, reproducible, and fair when applied to real-world problems.

Call for Papers:

Deadline: October 30th, 2018, 11:59 UTC

The one-day NIPS 2018 Workshop: Critiquing and Correcting Trends in Machine Learning calls for papers that critically examine current common practices and/or trends in methodology, datasets, empirical standards, publication models, or any other aspect of machine learning research. Though we are happy to receive papers that bring attention to problems for which there is no clear immediate remedy, we particularly encourage papers which propose a solution or indicate a way forward. Papers should motivate their arguments by describing gaps in the field. Crucially, this is not a venue for settling scores or character attacks, but for moving machine learning forward as a scientific discipline.

To help guide submissions, we have split up the call for papers into the following tracks. Please indicate the intended track when making your submission. Papers are welcome from all subfields of machine learning. If you have a paper which you feel falls within the remit of the workshop but does not clearly fit one of these tracks, please contact the organizers at: ml.critique.correct@gmail.com.

Bad Practices (1-4 pages)
Papers that highlight common bad practices or unjustified assumptions at any stage of the research process. These can either be technical shortfalls in a particular machine learning subfield, or more procedural bad practices of the ilk of those discussed in [17].

Flawed Intuitions or Unjustified Assumptions (3-4 pages)
Papers that call into question commonly held intuitions or provide clear evidence either for or against assumptions that are regularly taken for granted without proper justification. For example, we would like to see papers which provide empirical assessments to test out metrics, verify intuitions, or compare popular current approaches with historic baselines that may have unfairly fallen out of favour (see e.g. [2]). We would also like to see work providing results that make us rethink our intuitions or the assumptions we typically make.

Negative Results (3-4 pages)
Papers which show failure modes of existing algorithms or suggest new approaches which one might expect to perform well but which do not. The aim of the latter of these is to provide a venue for work which might otherwise go unpublished but which is still of interest to the community, for example by dissuading other researchers from similar ultimately unsuccessful approaches. Though it is inevitably preferable that papers are able to explain why the approach performs poorly, this is not essential if the paper is able to demonstrate why the negative result is of interest to the community in its own right.

Research Process (1-4 pages)
Papers which provide carefully thought-through critiques of, provide discussion on, or suggest new approaches to areas such as the conference model, the reviewing process, the role of industry in research, open sourcing of code and data, institutional biases and discrimination in the field, research ethics, reproducibility standards, and allocation of conference tickets.

Debates (1-2 pages)
Short proposition papers which discuss issues either affecting all of machine learning or significantly sized subfields (e.g. reinforcement learning or Bayesian methods). Selected papers will be used as the basis for instigating online forum debates before the workshop, leading up to live discussions on the day itself.

Open Problems (1-4 papers/short talks)
Papers that describe either (a) unresolved questions in existing fields that need to be addressed, (b) desirable operating characteristics for ML in particular application areas that have yet to be achieved, or (c) new frontiers of machine learning research that require rethinking current practices (e.g., error diagnosis for when many ML components are interoperating within a system, automating dataset collection/creation).

Submission Instructions
Papers should be submitted as PDFs using the NIPS LaTeX style file. Author names should be anonymized.

All accepted papers will be made available through the workshop website and presented as a poster. Selected papers will also be given contributed talks. We have a small number of complimentary workshop registrations to hand out to students. If you would like to apply for one of these, please email a one-paragraph supporting statement. We also have a limited number of reserved ticket slots to assign to authors of accepted papers. If any authors are unable to attend the workshop due to ticketing, visa, or funding issues, they will be allowed to provide a video presentation for their work that will be made available through the workshop website in lieu of a poster presentation.

Please submit papers here: https://easychair.org/conferences/?conf=cract2018

Deadline: October 30th, 2018, 11:59 UTC

References

[1] Mania, H., Guy, A., & Recht, B. (2018). Simple random search provides a competitive approach to reinforcement learning. arXiv preprint arXiv:1803.07055.
[2] Rainforth, T., Kosiorek, A. R., Le, T. A., Maddison, C. J., Igl, M., Wood, F., & Teh, Y. W. (2018). Tighter variational bounds are not necessarily better. ICML.
[3] Torralba, A., & Efros, A. A. (2011). Unbiased look at dataset bias. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on (pp. 1521-1528). IEEE.
[4] Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. (2013). Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.
[5] Mescheder, L., Geiger, A., & Nowozin, S. (2018). Which training methods for GANs do actually converge? ICML.
[6] Daumé III, H. (2009). Frustratingly easy domain adaptation. arXiv preprint arXiv:0907.1815.
[7] Urban, G., Geras, K. J., Kahou, S. E., Wang, O. A. S., Caruana, R., Mohamed, A., ... & Richardson, M. (2016). Do deep convolutional nets really need to be deep (or even convolutional)?
[8] Henderson, P., Islam, R., Bachman, P., Pineau, J., Precup, D., & Meger, D. (2017). Deep reinforcement learning that matters. arXiv preprint arXiv:1709.06560.
[9] Narayanan, M., Chen, E., He, J., Kim, B., Gershman, S., & Doshi-Velez, F. (2018). How do Humans Understand Explanations from Machine Learning Systems? An Evaluation of the Human-Interpretability of Explanation. arXiv preprint arXiv:1802.00682.
[10] Schulam, P., & Saria, S. (2017). Reliable decision support using counterfactual models. NIPS.
[11] Rahimi, A. (2017). Let's take machine learning from alchemy to electricity. Test-of-time award presentation, NIPS.
[12] Lucic, M., Kurach, K., Michalski, M., Gelly, S., & Bousquet, O. (2018). Are GANs created equal? A large-scale study. arXiv preprint arXiv:1711.10337.
[13] Le, T. A., Kosiorek, A. R., Siddharth, N., Teh, Y. W., & Wood, F. (2018). Revisiting reweighted wake-sleep. arXiv preprint arXiv:1805.10469.
[14] Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016). Concrete problems in AI safety. arXiv preprint arXiv:1606.06565.
[15] Sutton, C. (2018). Making unblinding manageable: Towards reconciling prepublication and double-blind review. http://www.theexclusive.org/2017/09/arxiv-double-blind.html
[16] Langford, J. (2018). ICML Board and Reviewer profiles. http://hunch.net/?p=8962378

Opening Remarks (Talk)
Zachary Lipton (Invited Talk)
Kim Hazelwood (Invited Talk)
Expanding search in the space of empirical ML (Contributed Talk)
Opportunities for machine learning research to support fairness in industry practice (Contributed Talk)
Spotlights - Papers 2, 23, 24, 36, 40, 44 (Contributed Talk)
Poster Session 1 (note there are numerous missing names here, all papers appear in all poster sessions) (Poster Session)
Finale Doshi-Velez (Invited Talk)
Suchi Saria (Invited Talk)
Lunch (Break)
Sebastian Nowozin (Invited Talk)
Using Cumulative Distribution Based Performance Analysis to Benchmark Models (Contributed Talk)
Charles Sutton (Invited Talk)
On Avoiding Tragedy of the Commons in the Peer Review Process (Contributed Talk)
Spotlights - Papers 10, 20, 35, 42 (Contributed Talk)
Coffee Break and Posters (Break)
Panel on research process (Panel)
Poster Session 2 (Poster Session)