Topology reduction in deep convolutional feature extraction networks
Authors
Thomas Wiatowski
Reference
Wavelets and Sparsity XVII, San Diego, USA, Aug. 2017 (invited talk).
Abstract
Many practical machine learning tasks employ very deep convolutional neural networks (DCNNs). Large depths pose formidable computational challenges in training and operating the network. It is therefore important to understand how many layers are needed for most of the input signal's features to be contained in the feature vector generated by the network. This question can be formalized by asking how quickly the energy contained in the feature maps decays across layers. In addition, it is desirable that none of the input signal's features be "lost" in the feature extraction network or, more formally, we want energy conservation in the sense of the energy contained in the feature vector being proportional to that of the corresponding input signal. In this talk, we characterize the energy decay rate and establish conditions for energy conservation for a wide class of DCNNs. Specifically, we consider general scattering networks and find that under a mild analyticity and high-pass condition on the filters (which encompasses, inter alia, various constructions of Weyl-Heisenberg filters, wavelets, ridgelets, curvelets, and shearlets), and under mild smoothness assumptions on the input signals, the feature map energy decays at least polynomially fast. For wavelets and Weyl-Heisenberg filters, the guaranteed decay rate is shown to be exponential. Our results yield handy estimates of the number of layers needed for most of the input signal energy to be contained in the feature vector. This talk represents joint work with P. Grohs and H. Bölcskei.
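The abstract gives no numerics, but the decay phenomenon it describes can be illustrated with a toy experiment. The sketch below is an illustrative assumption, not the filter constructions analyzed in the talk: it builds a 1-D scattering-style cascade from a Haar low-pass/high-pass pair (normalized so the pair forms a Parseval filter bank), propagates the modulus of the high-pass output to the next layer, and prints the fraction of the input energy still circulating in the feature maps after each layer.

```python
# Toy 1-D scattering-style cascade (illustrative sketch only):
# track how much of the input energy is still propagating after each layer.
import numpy as np

rng = np.random.default_rng(0)

# Haar filter pair, normalized so |phi_hat|^2 + |psi_hat|^2 = 1 (Parseval).
phi = np.array([0.5, 0.5])    # low-pass: its output is emitted as a feature
psi = np.array([0.5, -0.5])   # high-pass: its modulus is propagated deeper

def energy(signals):
    return sum(np.sum(s ** 2) for s in signals)

x = rng.standard_normal(1024)           # smooth assumptions not modeled here
input_energy = np.sum(x ** 2)

propagated = [x]
for layer in range(1, 7):
    next_maps = []
    for u in propagated:
        # Feature emitted at this layer (low-pass filtered map); not stored,
        # only the propagated part matters for the decay question.
        _ = np.convolve(u, phi, mode="same")
        # Map handed to the next layer: modulus of the high-pass output.
        next_maps.append(np.abs(np.convolve(u, psi, mode="same")))
    propagated = next_maps
    frac = energy(propagated) / input_energy
    print(f"layer {layer}: propagated energy fraction = {frac:.4f}")
```

In this toy setting the printed fraction shrinks with depth because the modulus nonlinearity pushes energy toward low frequencies, where the high-pass filter removes little of it at the output but leaves less and less to propagate; the quantitative rates (polynomial in general, exponential for wavelets and Weyl-Heisenberg filters) are the subject of the talk and are not reproduced by this sketch.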