Smaller generalization error derived for a deep residual neural network compared with shallow networks

Aku Kammonen; Jonas Kiessling; Petr Plechac; Mattias Sandberg; Anders Szepessy; Raul Tempone

doi:10.1093/imanum/drac049

Journal article

Open access

Peer reviewed

IMA Journal of Numerical Analysis

12/09/2022

Subjects: Mathematics; Mathematics, Applied; Physical Sciences; Science & Technology

Abstract

Estimates of the generalization error are proved for a residual neural network with $L$ random Fourier features layers $\bar z_{\ell+1} = \bar z_\ell + \mathrm{Re}\sum_{k=1}^{K} \bar b_{\ell k}\, e^{\mathrm{i}\omega_{\ell k}\bar z_\ell} + \mathrm{Re}\sum_{k=1}^{K} \bar c_{\ell k}\, e^{\mathrm{i}\omega'_{\ell k}\cdot x}$. An optimal distribution for the frequencies $(\omega_{\ell k}, \omega'_{\ell k})$ of the random Fourier features $e^{\mathrm{i}\omega_{\ell k}\bar z_\ell}$ and $e^{\mathrm{i}\omega'_{\ell k}\cdot x}$ is derived. This derivation is based on the corresponding generalization error for the approximation of the function values $f(x)$. The generalization error turns out to be smaller than the estimate $\|\hat f\|^2_{L^1(\mathbb{R}^d)}/(KL)$ of the generalization error for random Fourier features with one hidden layer and the same total number of nodes $KL$, in the case where the $L^\infty$-norm of $f$ is much less than the $L^1$-norm of its Fourier transform $\hat f$. This understanding of an optimal distribution for random features is used to construct a new training method for a deep residual network. Promising performance of the proposed new algorithm is demonstrated in computational experiments.
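The layer recursion in the abstract can be illustrated with a short forward-pass sketch. This is not the authors' code or their training method: it only evaluates $\bar z_L$ for given parameters. It assumes a scalar state $\bar z_\ell$, an initial state $\bar z_0 = 0$, and, in the usage lines, standard-normal placeholder frequencies and small random amplitudes rather than the optimal distribution derived in the paper. The function name `residual_rff_forward` and all array names are hypothetical.

```python
import numpy as np


def residual_rff_forward(x, omega, omega_prime, b, c):
    """Evaluate z_L of the residual random Fourier features recursion.

    x:           (n, d) real inputs
    omega:       (L, K) real frequencies acting on the scalar state z_l
    omega_prime: (L, K, d) real frequencies acting on the input x
    b, c:        (L, K) complex amplitudes
    Returns a real array of shape (n,).
    """
    z = np.zeros(x.shape[0])  # assumption: the recursion starts at z_0 = 0
    for l in range(omega.shape[0]):
        # Features e^{i * omega_lk * z_l}, shape (n, K)
        state_feats = np.exp(1j * np.outer(z, omega[l]))
        # Features e^{i * omega'_lk . x}, shape (n, K)
        input_feats = np.exp(1j * (x @ omega_prime[l].T))
        # z_{l+1} = z_l + Re sum_k b_lk e^{...} + Re sum_k c_lk e^{...}
        z = z + np.real(state_feats @ b[l]) + np.real(input_feats @ c[l])
    return z


# Usage with placeholder parameters (assumed values, not from the paper).
rng = np.random.default_rng(0)
L, K, d, n = 4, 8, 2, 5
x = rng.normal(size=(n, d))
omega = rng.normal(size=(L, K))
omega_prime = rng.normal(size=(L, K, d))
b = 0.1 * (rng.normal(size=(L, K)) + 1j * rng.normal(size=(L, K)))
c = 0.1 * (rng.normal(size=(L, K)) + 1j * rng.normal(size=(L, K)))
z_L = residual_rff_forward(x, omega, omega_prime, b, c)
```

With $L$ layers of $K$ features each, the sketch uses $KL$ nodes in each of the two sums, which is the node count the abstract compares against a one-hidden-layer random Fourier features network.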

Published (Version of record), open access: https://doi.org/10.1093/imanum/drac049

Details

Title: Smaller generalization error derived for a deep residual neural network compared with shallow networks
Creators:
Aku Kammonen - King Abdullah University of Science and Technology
Jonas Kiessling - KTH Royal Institute of Technology
Petr Plechac - University of Delaware
Mattias Sandberg - KTH Royal Institute of Technology
Anders Szepessy - KTH Royal Institute of Technology
Raul Tempone - King Abdullah University of Science and Technology
Publication Details: IMA Journal of Numerical Analysis
Publisher: Oxford University Press
Number of pages: 48
Grant note: W911NF-19-1-0243 / ARO; Grant 2019-03725 / Swedish Research Council; Alexander von Humboldt Foundation; URF/1/2281-01-01, URF/1/2584-01-01, OSR-2019-CRG8-4033.2 / King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR)
Identifiers: 9944040808331
Academic Unit: King Abdullah University of Science & Technology
Language: English
Resource Type: Journal article