A Spectral View of Adversarially Robust Features
- Author
- Zhang, Brian Hu
- Degree granting institution
- Stanford University, Department of Computer Science
- Primary advisor
- Gregory Valiant
While machine learning has made great progress in image classification, often achieving near-human or even superhuman accuracy, recent studies have found that image classification models are vulnerable to adversarial attacks: images specially crafted to fool a model into mislabeling them. In this work, we investigate the problem of creating an adversarially robust feature: a feature f whose value at any point x cannot be changed much by perturbing x slightly. We establish strong connections between adversarially robust features and a natural spectral property of the geometry of the dataset and metric of interest. This connection can be leveraged both to provide robust features and to provide a lower bound on the robustness of any function that has significant variance across the dataset. Finally, we provide empirical evidence that the adversarially robust features yielded via this spectral approach can be fruitfully leveraged to learn a robust (and accurate) model.
- Preferred Citation
- Zhang, Brian Hu. (2018). A Spectral View of Adversarially Robust Features. Stanford Digital Repository. Available at: https://purl.stanford.edu/zk082nr1912
- Use and reproduction
- User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.