Viterbi Faculty of Electrical Engineering, Technion
On the Modularity and Optimization Dynamics of Hypernetworks
Hypernetworks are architectures that produce the weights of a task-specific implicit network. A notable application of hypernetworks in the recent literature involves implicit neural representations. In these scenarios, the hypernetwork learns a representation corresponding to the weights of a shallow MLP, which typically encodes visual information.
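The setup described above can be illustrated with a minimal sketch (my own illustration, not the speaker's implementation): a linear hypernetwork maps a task embedding to the weights of a small implicit MLP, which in turn maps 2-D pixel coordinates to an intensity value. All dimensions and names here are assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

d_z = 16                         # task-embedding dimension (assumed)
d_h = 32                         # hidden width of the implicit MLP (assumed)
n_w = 2 * d_h + d_h + d_h + 1    # weight count of a 2 -> d_h -> 1 MLP (W1, b1, W2, b2)

# Hypernetwork parameters: a single linear layer for brevity
H = rng.normal(scale=0.1, size=(n_w, d_z))

def implicit_mlp(coords, w):
    """Evaluate the generated MLP f(coords; w) at 2-D coordinates."""
    W1 = w[:2 * d_h].reshape(d_h, 2)
    b1 = w[2 * d_h:3 * d_h]
    W2 = w[3 * d_h:4 * d_h].reshape(1, d_h)
    b2 = w[4 * d_h]
    h = np.tanh(coords @ W1.T + b1)  # hidden layer of the implicit network
    return h @ W2.T + b2             # scalar intensity per coordinate

z = rng.normal(size=d_z)             # task embedding (e.g., an image code)
w = H @ z                            # hypernetwork output: implicit-network weights
xy = np.stack(np.meshgrid(np.linspace(0, 1, 4),
                          np.linspace(0, 1, 4)), -1).reshape(-1, 2)
values = implicit_mlp(xy, w)         # predicted intensities at 16 pixel locations
print(values.shape)                  # (16, 1)
```

The key point is that the trainable parameters live in the hypernetwork `H`; the implicit network's weights `w` are produced per task rather than learned directly.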
In this talk, we study wide over-parameterized hypernetworks. Two aspects of hypernetworks are discussed: (i) their modularity, i.e., the ability to effectively learn a distinct function for each task separately, and (ii) their optimization dynamics. We show that, unlike typical architectures, infinitely wide hypernetworks are not guaranteed to converge to a global minimum under gradient descent. We further show that convexity can be achieved by increasing the dimensionality of the hypernetwork's output, so that it implicitly represents wide MLPs. In this dually infinite-width regime, we identify the functional priors of these architectures by deriving their corresponding GP and NTK kernels.
If time allows, we will also compare the optimization behavior of ResNets and DenseNets with that of standard neural networks.
* Tomer Galanti is a Ph.D. student at Tel Aviv University under the supervision of Professor Lior Wolf.
Zoom link: https://technion.zoom.us/j/94420766487
Sun 01 Nov 2020
Start Time: 11:30
End Time: 12:30
ZOOM Meeting | Electrical Eng. Building