NeurIPS Poster Reproducibility study of 'Proto2Proto: Can you recognise the car, the way I do?'

Gerson de Kleuver · David Bikker · Wenhua Hu · Bram Veenman

Great Hall & Hall B1+B2 (level 1) #2013

Abstract:

Scope of Reproducibility: This paper analyses the reproducibility of the study 'Proto2Proto: Can you recognize the car, the way I do?' The main contributions and claims of the study are: 1) Using Proto2Proto, a shallower student model is more faithful to the teacher in terms of interpretability than a baseline student model, while also showing the same or better accuracy; 2) Global Explanation loss forces student prototypes to be close to teacher prototypes; 3) Patch-Prototype Correspondence loss enforces the local representations of the student to be similar to those of the teacher; 4) The proposed evaluation metrics determine the faithfulness of the student to the teacher in terms of interpretability.

Methodology: A public code repository was available for the paper, which provided a working but incomplete and minimally documented codebase. With some modifications we were able to carry out the experiments that were best supported by the codebase. We spent a total of 60 GPU hours on reproduction.

Results: The results we were able to produce support claim 1, albeit weakly. Further results are in line with the paper, but we found them to go against claim 3. In addition, we carried out a theoretical analysis which provides support for claim 4. Finally, we were unable to carry out our intended experiment to verify claim 2.

What was easy: The original paper was clearly structured and understandable. The experiments for which configurations were provided were simple to conduct.

What was difficult: The public codebase contained minimal documentation. Moreover, variable names in the code did not correspond to those in the paper. Furthermore, the codebase lacked elements vital to reproducing some experiments. Another significant constraint was the computational cost of reproducing the original experiments. Finally, the code required to reproduce one of the visualizations was not provided.

Communication with original authors: We contacted the authors to ask for trained model weights and missing hyperparameters for several experiments. We did not receive a response.