Timezone: »

Critiquing and Correcting Trends in Machine Learning
Thomas Rainforth · Matt Kusner · Benjamin Bloem-Reddy · Brooks Paige · Rich Caruana · Yee Whye Teh

Fri Dec 07 05:00 AM -- 03:30 PM (PST) @ Room 511 ABDE

Workshop Webpage: https://ml-critique-correct.github.io/

Recently there have been calls to make machine learning more reproducible, less hand-tailored, fair, and generally more thoughtful about how research is conducted and put into practice. These are hallmarks of a mature scientific field and will be crucial for machine learning to have the wide-ranging, positive impact it is expected to have. Without careful consideration, we as a field risk inflating expectations beyond what is possible. To address this, this workshop aims to better understand and to improve all stages of the research process in machine learning.

A number of recent papers have carefully considered trends in machine learning as well as the needs of the field when used in real-world scenarios [1-18]. Each of these works introspectively analyzes what we often take for granted as a field. Further, many propose solutions for moving forward. The goal of this workshop is to bring together researchers from all subfields of machine learning to highlight open problems and widespread dubious practices in the field, and crucially, to propose solutions. We hope to highlight issues and propose solutions in areas such as:
- Common practices [1, 8]
- Implicit technical and empirical assumptions that go unquestioned [2, 3, 5, 7, 11, 12, 13, 17, 18]
- Shortfalls in publication and reviewing setups [15, 16]
- Disconnects between research focus and application requirements [9, 10, 14]
- Surprising observations that make us rethink our research priorities [4, 6]

The workshop program is a collection of invited talks, alongside contributed posters and talks. For some of these talks, we plan a unique open format of 10 minutes of talk + 10 minutes of follow up discussion. Additionally, a separate panel discussion will collect researchers with a diverse set of viewpoints on the current challenges and potential solutions. During the panel, we will also open the conversation to the audience. The discussion will further be open to an online Q&A which will be solicited prior to the workshop.

A key expected outcome of the workshop is a collection of important open problems at all levels of machine learning research, along with a record of various bad practices that we should no longer consider to be acceptable. Further, we hope that the workshop will make inroads in how to address these problems, highlighting promising new frontiers for making machine learning practical, robust, reproducible, and fair when applied to real-world problems.

Call for Papers:

Deadline: October 30rd, 2018, 11:59 UTC

The one day NIPS 2018 Workshop: Critiquing and Correcting Trends in Machine Learning calls for papers that critically examine current common practices and/or trends in methodology, datasets, empirical standards, publication models, or any other aspect of machine learning research. Though we are happy to receive papers that bring attention to problems for which there is no clear immediate remedy, we particularly encourage papers which propose a solution or indicate a way forward. Papers should motivate their arguments by describing gaps in the field. Crucially, this is not a venue for settling scores or character attacks, but for moving machine learning forward as a scientific discipline.

To help guide submissions, we have split up the call for papers into the follows tracks. Please indicate the intended track when making your submission. Papers are welcome from all subfields of machine learning. If you have a paper which you feel falls within the remit of the workshop but does not clearly fit one of these tracks, please contact the organizers at: ml.critique.correct@gmail.com.

Bad Practices (1-4 pages)
Papers that highlight common bad practices or unjustified assumptions at any stage of the research process. These can either be technical shortfalls in a particular machine learning subfield, or more procedural bad practices of the ilk of those discussed in [17].

Flawed Intuitions or Unjustified Assumptions (3-4 pages)
Papers that call into question commonly held intuitions or provide clear evidence either for or against assumptions that are regularly taken for granted without proper justification. For example, we would like to see papers which provide empirical assessments to test out metrics, verify intuitions, or compare popular current approaches with historic baselines that may have unfairly fallen out of favour (see e.g. [2]). We would also like to see work which provides results which makes us rethink our intuitions or the assumptions we typically make.

Negative Results (3-4 pages)
Papers which show failure modes of existing algorithms or suggest new approaches which one might expect to perform well but which do not. The aim of the latter of these is to provide a venue for work which might otherwise go unpublished but which is still of interest to the community, for example by dissuading other researchers from similar ultimately unsuccessful approaches. Though it is inevitably preferable that papers are able to explain why the approach performs poorly, this is not essential if the paper is able to demonstrate why the negative result is of interest to the community in its own right.

Research Process (1-4 pages)
Papers which provide carefully thought through critiques, provide discussion on, or suggest new approaches to areas such as the conference model, the reviewing process, the role of industry in research, open sourcing of code and data, institutional biases and discrimination in the field, research ethics, reproducibility standards, and allocation of conference tickets.

Debates (1-2 pages)
Short proposition papers which discuss issues either affecting all of machine learning or significantly sized subfields (e.g. reinforcement learning, Bayesian methods, etc). Selected papers will be used as the basis for instigating online forum debates before the workshop, leading up to live discussions on the day itself.

Open Problems (1-4 papers/short talks)
Papers that describe either (a) unresolved questions in existing fields that need to be addressed, (b) desirable operating characteristics for ML in particular application areas that have yet to be achieved, or (c) new frontiers of machine learning research that require rethinking current practices (e.g., error diagnosis for when many ML components are interoperating within a system, automating dataset collection/creation).

Submission Instructions Papers should be submitted as pdfs using the NIPS LaTeX style file. Author names should be anonymized.

All accepted papers will be made available through the workshop website and presented as a poster. Selected papers will also be given contributed talks. We have a small number of complimentary workshop registrations to hand out to students. If you would like to apply for one of these, please email a one paragraph supporting statement. We also have a limited number of reserved tickets slots to assign to authors of accepted papers. If any authors are unable to attend the workshop due to ticketing, visa, or funding issues, they will be allowed to provide a video presentation for their work that will be made available through the workshop website in lieu of a poster presentation.

Please submit papers here: https://easychair.org/conferences/?conf=cract2018

Deadline: October 30rd, 2018, 11:59 UTC


[1] Mania, H., Guy, A., & Recht, B. (2018). Simple random search provides a competitive approach to reinforcement learning. arXiv preprint arXiv:1803.07055.
[2] Rainforth, T., Kosiorek, A. R., Le, T. A., Maddison, C. J., Igl, M., Wood, F., & Teh, Y. W. (2018). Tighter variational bounds are not necessarily better. ICML.
[3] Torralba, A., & Efros, A. A. (2011). Unbiased look at dataset bias. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on (pp. 1521-1528). IEEE.
[4] Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. (2013). Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.
[5] Mescheder, L., Geiger, A., Nowozin S. (2018) Which Training Methods for GANs do actually Converge? ICML
[6] Daumé III, H. (2009). Frustratingly easy domain adaptation. arXiv preprint arXiv:0907.1815
[7] Urban, G., Geras, K. J., Kahou, S. E., Wang, O. A. S., Caruana, R., Mohamed, A., ... & Richardson, M. (2016). Do deep convolutional nets really need to be deep (or even convolutional)?.
[8] Henderson, P., Islam, R., Bachman, P., Pineau, J., Precup, D., & Meger, D. (2017). Deep reinforcement learning that matters. arXiv preprint arXiv:1709.06560.
[9] Narayanan, M., Chen, E., He, J., Kim, B., Gershman, S., & Doshi-Velez, F. (2018). How do Humans Understand Explanations from Machine Learning Systems? An Evaluation of the Human-Interpretability of Explanation. arXiv preprint arXiv:1802.00682.
[10] Schulam, S., Saria S. (2017). Reliable Decision Support using Counterfactual Models. NIPS.
[11] Rahimi, A. (2017). Let's take machine learning from alchemy to electricity. Test-of-time award presentation, NIPS.
[12] Lucic, M., Kurach, K., Michalski, M., Gelly, S., Bousquet, O. (2018). Are GANs Created Equal? A Large-Scale Study. arXiv preprint arXiv:1711.10337.
[13] Le, T.A., Kosiorek, A.R., Siddharth, N., Teh, Y.W. and Wood, F., (2018). Revisiting Reweighted Wake-Sleep. arXiv preprint arXiv:1805.10469.
[14] Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J. and Mané, D., (2016). Concrete problems in AI safety. arXiv preprint arXiv:1606.06565.
[15] Sutton, C. (2018) Making unblinding manageable: Towards reconciling prepublication and double-blind review. http://www.theexclusive.org/2017/09/arxiv-double-blind.html
[16] Langford, J. (2018) ICML Board and Reviewer profiles. http://hunch.net/?p=8962378

Fri 5:30 a.m. - 5:40 a.m. [iCal]
Opening Remarks (Talk)
Fri 5:40 a.m. - 6:05 a.m. [iCal]
Zachary Lipton (Invited Talk)
Zachary Lipton
Fri 6:05 a.m. - 6:30 a.m. [iCal]
Kim Hazelwood (Invited Talk)
Kim Hazelwood
Fri 6:30 a.m. - 6:40 a.m. [iCal]
Expanding search in the space of empirical ML (Contributed Talk)
Bronwyn Woods
Fri 6:40 a.m. - 6:50 a.m. [iCal]
Opportunities for machine learning research to support fairness in industry practice (Contributed Talk)
Kenneth Holstein
Fri 6:50 a.m. - 7:20 a.m. [iCal]
Spotlights - Papers 2, 23, 24, 36, 40, 44 (Contributed Talk)
Fri 7:20 a.m. - 8:10 a.m. [iCal]
Poster Session 1 (note there are numerous missing names here, all papers appear in all poster sessions) (Poster Session)
Akhilesh Gotmare, Kenneth Holstein, Jan Brabec, Michal Uricar, Kaleigh Clary, Cynthia Rudin, Sam Witty, Andrew Ross, Shayne O'Brien, Babak Esmaeili, Jessica Forde, Massimo Caccia, Ali Emami, Scott Jordan, Bronwyn Woods, D. Sculley, Rebekah Overdorf, Nicolas Le Roux, Peter Henderson, Brandon Yang, Joyce Liu, David Jensen, Nic Dalmasso, Weitang Liu, Paul TRICHELAIR, Jun Lee, Akanksha Atrey, Matt Groh, Yotam Hechtlinger, Emma Tosch
Fri 8:10 a.m. - 8:35 a.m. [iCal]
Finale Doshi-Velez (Invited Talk)
Finale Doshi-Velez
Fri 8:35 a.m. - 9:00 a.m. [iCal]
Suchi Saria (Invited Talk)
Suchi Saria
Fri 9:00 a.m. - 10:30 a.m. [iCal]
Lunch (Break)
Fri 10:30 a.m. - 10:55 a.m. [iCal]
Sebastian Nowozin (Invited Talk)
Sebastian Nowozin
Fri 10:55 a.m. - 11:05 a.m. [iCal]
Using Cumulative Distribution Based Performance Analysis to Benchmark Models (Contributed Talk)
Scott Jordan
Fri 11:05 a.m. - 11:30 a.m. [iCal]
Charles Sutton (Invited Talk)
Charles Sutton
Fri 11:30 a.m. - 11:40 a.m. [iCal]
On Avoiding Tragedy of the Commons in the Peer Review Process (Contributed Talk)
D. Sculley
Fri 11:40 a.m. - 12:00 p.m. [iCal]
Spotlights - Papers 10, 20, 35, 42 (Contributed Talk)
Fri 12:00 p.m. - 12:30 p.m. [iCal]
Coffee Break and Posters (Break)
Fri 12:30 p.m. - 1:30 p.m. [iCal]
Panel on research process (Panel)
Zachary Lipton, Charles Sutton, Finale Doshi-Velez, Hanna Wallach, Suchi Saria, Rich Caruana, Tom Rainforth
Fri 1:30 p.m. - 3:00 p.m. [iCal]
Poster Session 2 (Poster Session)

Author Information

Tom Rainforth (University of Oxford)
Matt Kusner (University of Oxford)
Ben Bloem-Reddy (University of Oxford)
Brooks Paige (Alan Turing Institute / University of Cambridge)
Rich Caruana (Microsoft)
Yee Whye Teh (University of Oxford, DeepMind)

I am a Professor of Statistical Machine Learning at the Department of Statistics, University of Oxford and a Research Scientist at DeepMind. I am also an Alan Turing Institute Fellow and a European Research Council Consolidator Fellow. I obtained my Ph.D. at the University of Toronto (working with Geoffrey Hinton), and did postdoctoral work at the University of California at Berkeley (with Michael Jordan) and National University of Singapore (as Lee Kuan Yew Postdoctoral Fellow). I was a Lecturer then a Reader at the Gatsby Computational Neuroscience Unit, UCL, and a tutorial fellow at University College Oxford, prior to my current appointment. I am interested in the statistical and computational foundations of intelligence, and works on scalable machine learning, probabilistic models, Bayesian nonparametrics and deep learning. I was programme co-chair of ICML 2017 and AISTATS 2010.

More from the Same Authors