Date of Award

9-1998

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Department of Electrical and Computer Engineering

First Advisor

Steven K. Rogers, PhD

Abstract

When training an artificial neural network (ANN) for classification using backpropagation of error, the weights are usually updated by minimizing the sum-squared error on the training set. As training proceeds, overtraining may be observed as the network begins to memorize the training data. This occurs because, as the magnitude of the weight vector, ‖w‖, grows, the decision boundaries become overly complex, in much the same way that a polynomial of too high an order can overfit a data set in a regression problem. Since ‖w‖ grows during standard backpropagation, the weights should be initialized with the resulting weight vector magnitude taken into account. To that end, the expected value of the magnitude of the initial weight vector is derived here for two separate cases: each weight drawn from a normal distribution, and each weight drawn from a uniform distribution. This derivation is broadly useful because the magnitude of the weight vector plays such an important role in the formation of the classification boundaries. When the network overtrains on the training data, it will not exhibit consistently low error on subsequent test data. One way to overcome this overtraining problem is to stop the training early, which keeps the magnitude of the weight vector below the value it would reach if training were allowed to continue until a near-global training error minimum was found. The question then is when to stop the training. Here, the relationship between training data set size and the weight vector magnitude that provides good generalization is established empirically using cross-validation on small subsets of the training data. These results are then used to estimate the weight vector magnitude at which training should be stopped when using the full data set.
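As a rough illustration of the quantity the abstract refers to, the sketch below computes the expected magnitude of an initial weight vector with independent, zero-mean weights. It is not taken from the dissertation: the function names, the zero-mean assumption, the parameters `sigma` and `a`, and the example sizes are all hypothetical choices made for this illustration. For normal weights it uses the standard mean of the chi distribution; for weights uniform on [-a, a] it uses the root-mean-square magnitude a·sqrt(n/3), which is only an upper bound on the expected magnitude (and a close approximation when n is large). A small Monte Carlo routine is included as a sanity check.

```python
import math
import random


def expected_norm_normal(n, sigma):
    """Exact E[||w||] for n i.i.d. N(0, sigma^2) weights (mean of a scaled chi distribution)."""
    # lgamma keeps the gamma-function ratio stable for large n.
    log_ratio = math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
    return sigma * math.sqrt(2.0) * math.exp(log_ratio)


def rms_norm_uniform(n, a):
    """sqrt(E[||w||^2]) for n i.i.d. U(-a, a) weights.

    By Jensen's inequality this is an upper bound on E[||w||];
    it is not the exact expectation.
    """
    return a * math.sqrt(n / 3.0)


def monte_carlo_norm(n, draw, trials=2000, seed=0):
    """Estimate E[||w||] by sampling; `draw(rng)` returns one weight."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += math.sqrt(sum(draw(rng) ** 2 for _ in range(n)))
    return total / trials


if __name__ == "__main__":
    n, sigma, a = 100, 0.5, 0.5  # hypothetical network size and spread
    print("normal, closed form:", expected_norm_normal(n, sigma))
    print("normal, sampled:    ", monte_carlo_norm(n, lambda r: r.gauss(0.0, sigma)))
    print("uniform, RMS bound: ", rms_norm_uniform(n, a))
    print("uniform, sampled:   ", monte_carlo_norm(n, lambda r: r.uniform(-a, a)))
```

In both cases the magnitude scales roughly with the square root of the number of weights, which is one reason the initial spread of the weights matters more as the network grows.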

AFIT Designator

AFIT-DS-ENG-98-14

DTIC Accession Number

ADA353812

Recommended Citation

Myers, Lemuel R. Jr., "Radial Complexity Estimation for Improved Generalization in Artificial Neural Networks" (1998). Theses and Dissertations. 5512.
https://scholar.afit.edu/etd/5512

Included in

Computer Sciences Commons

Theses and Dissertations

Radial Complexity Estimation for Improved Generalization in Artificial Neural Networks
