Algorithms and Architectures - Cornelius T. Leondes

Algorithms and Architectures (eBook)

eBook Download: PDF | EPUB
1998 | 1st edition
460 pages
Elsevier Science (publisher)
978-0-08-049898-0 (ISBN)
91.95 incl. VAT
  • Immediately available for download
This volume is the first diverse and comprehensive treatment of algorithms and architectures for the realization of neural network systems. It presents techniques and methods from numerous areas of this broad subject. The book covers the major neural network structures for achieving effective systems and illustrates them with examples.
This volume includes Radial Basis Function networks, the Expand-and-Truncate Learning algorithm for the synthesis of Three-Layer Threshold Networks, weight initialization, fast and efficient variants of Hamming and Hopfield neural networks, discrete time synchronous multilevel neural systems with reduced VLSI demands, probabilistic design techniques, time-based techniques, techniques for reducing physical realization requirements, and applications to finite constraint problems.
A unique and comprehensive reference for a broad array of algorithms and architectures, this book will be of use to practitioners, researchers, and students in industrial, manufacturing, electrical, and mechanical engineering, as well as in computer science and engineering.

Key Features
* Radial Basis Function networks
* The Expand-and-Truncate Learning algorithm for the synthesis of Three-Layer Threshold Networks
* Weight initialization
* Fast and efficient variants of Hamming and Hopfield neural networks
* Discrete time synchronous multilevel neural systems with reduced VLSI demands
* Probabilistic design techniques
* Time-based techniques
* Techniques for reducing physical realization requirements
* Applications to finite constraint problems
* Practical realization methods for Hebbian type associative memory systems
* Parallel self-organizing hierarchical neural network systems
* Dynamics of networks of biological neurons for utilization in computational neuroscience
Practitioners, researchers, and students in industrial, manufacturing, electrical, and mechanical engineering, as well as in computer science and engineering, will find this volume a unique and comprehensive reference to a broad array of algorithms and architectures.

Cover 1
Contents 6
Contributors 16
Preface 20
Chapter 1. Statistical Theories of Learning in Radial Basis Function Networks 26
I. Introduction 26
II. Learning in Radial Basis Function Networks 29
III. Theoretical Evaluations of Network Performance 46
IV. Fully Adaptive Training: An Exact Analysis 65
V. Summary 79
Appendix 80
References 82
Chapter 2. Synthesis of Three-Layer Threshold Networks 86
I. Introduction 87
II. Preliminaries 88
III. Finding the Hidden Layer 89
IV. Learning an Output Layer 98
V. Examples 102
VI. Discussion 109
VII. Conclusion 110
References 111
Chapter 3. Weight Initialization Techniques 112
I. Introduction 112
II. Feedforward Neural Network Models 114
III. Stepwise Regression for Weight Initialization 115
IV. Initialization of Multilayer Perceptron Networks 117
V. Initial Training for Radial Basis Function Networks 123
VI. Weight Initialization in Speech Recognition Application 128
VII. Conclusion 141
Appendix I: Chessboard 4 X 4 141
Appendix II: Two Spirals 142
Appendix III: GaAs MESFET 142
Appendix IV: Credit Card 142
References 143
Chapter 4. Fast Computation in Hamming and Hopfield Networks 148
I. General Introduction 148
II. Threshold Hamming Networks 149
III. Two-Iteration Optimal Signaling in Hopfield Networks 160
IV. Concluding Remarks 177
References 178
Chapter 5. Multilevel Neurons 180
I. Introduction 180
II. Neural System Analysis 182
III. Neural System Synthesis for Associative Memories 192
IV. Simulations 196
V. Conclusions and Discussions 198
Appendix 198
References 203
Chapter 6. Probabilistic Design 206
I. Introduction 206
II. Unified Framework of Neural Networks 207
III. Probabilistic Design of Layered Neural Networks 214
IV. Probability Competition Neural Networks 222
V. Statistical Techniques for Neural Network Design 243
VI. Conclusion 253
References 253
Chapter 7. Short Time Memory Problems 256
I. Introduction 256
II. Background 257
III. Measuring Neural Responses 258
IV. Hysteresis Model 259
V. Perfect Memory 262
VI. Temporal Precedence Differentiation 264
VII. Study in Spatiotemporal Pattern Recognition 266
VIII. Conclusion 270
Appendix 271
References 285
Chapter 8. Reliability Issue and Quantization Effects in Optical and Electronic Network Implementations of Hebbian-Type Associative Memories 286
I. Introduction 286
II. Hebbian-Type Associative Memories 289
III. Network Analysis Using a Signal-to-Noise Ratio Concept 291
IV. Reliability Effects in Network Implementations 293
V. Comparison of Linear and Quadratic Networks 303
VI. Quantization of Synaptic Interconnections 306
VII. Conclusions 313
References 314
Chapter 9. Finite Constraint Satisfaction 318
I. Constrained Heuristic Search and Neural Networks for Finite Constraint Satisfaction Problems 318
II. Linear Programming and Neural Networks 348
III. Neural Networks and Genetic Algorithms 356
IV. Related Work, Limitations, Further Work, and Conclusions 366
Appendix I. Formal Description of the Shared Resource Allocation Algorithm 367
Appendix II. Formal Description of the Conjunctive Normal Form Satisfiability Algorithm 371
Appendix III. A 3-CNF-SAT Example 373
Appendix IV. Outline of Proof for the Linear Programming Algorithm 375
References 384
Chapter 10. Parallel, Self-Organizing, Hierarchical Neural Network Systems 388
I. Introduction 389
II. Nonlinear Transformations of Input Vectors 391
III. Training, Testing, and Error-Detection Bounds 392
IV. Interpretation of the Error-Detection Bounds 396
V. Comparison between the Parallel, Self-Organizing, Hierarchical Neural Network, the Backpropagation Network, and the Maximum Likelihood Method 398
VI. PNS Modules 404
VII. Parallel Consensual Neural Networks 406
VIII. Parallel, Self-Organizing, Hierarchical Neural Networks with Competitive Learning and Safe Rejection Schemes 410
IX. Parallel, Self-Organizing, Hierarchical Neural Networks with Continuous Inputs and Outputs 417
X. Recent Applications 420
XI. Conclusions 424
References 424
Chapter 11. Dynamics of Networks of Biological Neurons: Simulation and Experimental Tools 426
I. Introduction 427
II. Modeling Tools 428
III. Arrays of Planar Microtransducers for Electrical Activity Recording of Cultured Neuronal Populations 443
IV. Concluding Remarks 446
References 447
Chapter 12. Estimating the Dimensions of Manifolds Using Delaunay Diagrams 450
I. Delaunay Diagrams of Manifolds 450
II. Estimating the Dimensions of Manifolds 460
III. Conclusions 480
References 481
Index 482

Statistical Theories of Learning in Radial Basis Function Networks


Jason A.S. Freeman    Centre for Cognitive Science, University of Edinburgh, Edinburgh EH8 9LW, United Kingdom

Mark J.L. Orr    Centre for Cognitive Science, University of Edinburgh, Edinburgh EH8 9LW, United Kingdom

David Saad    Department of Computer Science and Applied Mathematics, University of Aston, Birmingham B4 7ET, United Kingdom

I. INTRODUCTION


There are many heuristic techniques described in the neural network literature to perform various tasks within the supervised learning paradigm, such as optimizing training, selecting an appropriately sized network, and predicting how much data will be required to achieve a particular generalization performance. The aim of this chapter is to explore these issues in a theoretically based, well founded manner for the radial basis function (RBF) network. We will be concerned with issues such as using cross-validation to select network size, growing networks, regularization, and calculating the average and worst-case generalization performance. Two RBF training paradigms will be considered: one in which the hidden units are fixed on the basis of statistical properties of the data, and one with hidden units which adapt continuously throughout the training period. We also probe the evolution of the learning process over time to examine, for instance, the specialization of the hidden units.

A. RADIAL BASIS FUNCTION NETWORK


RBF networks have been successfully employed in many real world tasks in which they have proved to be a valuable alternative to multilayer perceptrons (MLPs). These tasks include chaotic time-series prediction [1], speech recognition [2], and data classification [3]. Furthermore, the RBF network is a universal approximator for continuous functions given a sufficient number of hidden units [4]. The RBF architecture consists of a two-layer fully connected network (see Fig. 1), with an input layer which performs no computation. For simplicity, we use a single output node throughout the chapter that computes a linear combination of the outputs of the hidden units, parametrized by the weights w between hidden and output layers. The defining feature of an RBF as opposed to other neural networks is that the basis functions (the transfer functions of the hidden units) are radially symmetric.

Figure 1 The radial basis function network. Each of the N components of the input vector $\xi$ feeds forward to K basis functions whose outputs are linearly combined with weights $\{w_b\}_{b=1}^{K}$ into the network output $f(\xi)$.

The function computed by a general RBF network is therefore of the form

$$f(\xi, \mathbf{w}) = \sum_{b=1}^{K} w_b \, s_b(\xi), \qquad (1)$$

where $\xi$ is the vector applied to the input units and $s_b$ denotes basis function $b$.

The most common choice for the basis functions is the Gaussian, in which case the function computed becomes

$$f(\xi, \mathbf{w}) = \sum_{b=1}^{K} w_b \exp\!\left( -\frac{\|\xi - \mathbf{m}_b\|^2}{2\sigma_b^2} \right), \qquad (2)$$

where each hidden node is parametrized by two quantities: a center $\mathbf{m}_b$ in input space, corresponding to the vector defined by the weights between the node and the input nodes, and a width $\sigma_b$.

Other possibilities include using Cauchy functions and multiquadrics. Functions that decrease in value as one moves toward the periphery are most frequently utilized; this issue is discussed in Section II.
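To make the Gaussian case concrete, the following minimal sketch implements Eqs. (1) and (2) directly; the array names and toy dimensions are illustrative assumptions rather than anything prescribed by the chapter.

```python
import numpy as np

def rbf_output(xi, centers, widths, weights):
    """Gaussian RBF network output, following Eqs. (1)-(2).

    xi      : (N,) input vector
    centers : (K, N) basis-function centers m_b
    widths  : (K,) basis-function widths sigma_b
    weights : (K,) hidden-to-output weights w_b
    """
    # Squared distances ||xi - m_b||^2 to every center
    sq_dist = np.sum((centers - xi) ** 2, axis=1)
    # Radially symmetric Gaussian responses s_b(xi)
    s = np.exp(-sq_dist / (2.0 * widths ** 2))
    # Linear combination of the basis-function outputs, Eq. (1)
    return weights @ s

# Toy usage: N = 2 inputs, K = 3 hidden units
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 2))
widths = np.full(3, 0.5)
weights = rng.normal(size=3)
print(rbf_output(np.array([0.1, -0.2]), centers, widths, weights))
```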

There are two commonly employed methods for training RBFs. One approach involves fixing the parameters of the hidden layer (both the basis function centers and widths) using an unsupervised technique such as clustering, setting a center on each data point of the training set, or even picking random values (for a review, see [5]). Only the hidden-to-output weights are adaptable, which makes the problem linear in those weights. Although fast to train, this approach often results in suboptimal networks because the basis function centers are set to fixed values. This method is explored in Section II, in which methods of selecting and training optimally sized networks using techniques such as cross-validation and ridge regression are discussed. Forward selection, an advanced method of selecting the centers from a large fixed pool, is also explored. The performance that can be expected from fixed-hidden-layer networks is calculated in Section III, using both Bayesian and probably approximately correct (PAC) frameworks.
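As a rough illustration of this first approach, the sketch below fixes the hidden layer by taking centers from the training inputs with one shared width, and then solves the remaining linear problem for the output weights; the target function, noise level, width, and network size are arbitrary assumptions made only for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy training set (an assumed 1-D target, purely for illustration)
P = 50
x = rng.uniform(-3, 3, size=(P, 1))
y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=P)

# Fixed hidden layer: centers placed on data points, one shared width
K = 10
centers = x[rng.choice(P, size=K, replace=False)]      # (K, 1)
width = 0.7

# Design matrix H[p, b] = s_b(x^p); only the output weights remain adaptable
sq_dist = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=2)
H = np.exp(-sq_dist / (2.0 * width ** 2))               # (P, K)

# The problem is now linear: ordinary least squares gives the weights
w, *_ = np.linalg.lstsq(H, y, rcond=None)

print("sum-squared training error:", np.sum((H @ w - y) ** 2))
```

Placing a center on every data point (K = P) or choosing centers by clustering follows the same pattern; only the construction of `centers` changes.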

The alternative is to adapt the hidden-layer parameters, either just the center positions or both center positions and widths. This renders the problem nonlinear in the adaptable parameters, and hence requires an optimization technique, such as gradient descent, to estimate these parameters. The second approach is computationally more expensive, but usually leads to greater accuracy of approximation. The generalization error that can be expected from this approach can be calculated from a worst-case perspective, under the assumption that the algorithm finds the best solution given the available data (see Section III). It is perhaps more useful to know the average performance, rather than the worst-case result, and this is explored in Section IV. This average-case approach provides a complete description of the learning process, formulated in terms of the overlaps between vectors in the system, and so can be used to study the phenomenology of the learning process, such as the specialization of the hidden units.
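A bare-bones sketch of this second approach is given below: gradient descent on the weights, centers, and widths of the same Gaussian model. The learning rate, iteration count, and data are illustrative assumptions, and a practical implementation would add safeguards (for example, bounds on the widths) or use a library optimizer.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy 1-D training data
P, K = 50, 10
x = rng.uniform(-3, 3, size=(P, 1))
y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=P)

# All parameters are adaptable: weights, centers, and widths
w = 0.1 * rng.normal(size=K)
m = rng.uniform(-3, 3, size=(K, 1))
sigma = np.full(K, 0.7)

eta = 0.01
for step in range(2000):
    diff = x[:, None, :] - m[None, :, :]        # (P, K, 1)
    sq = np.sum(diff ** 2, axis=2)              # (P, K) squared distances
    s = np.exp(-sq / (2.0 * sigma ** 2))        # basis-function outputs
    f = s @ w                                   # network outputs, Eq. (1)
    err = f - y                                 # residuals

    # Gradients of (1/2) * sum-squared error for each parameter group
    grad_w = s.T @ err
    common = (err[:, None] * s) * w[None, :]    # err_p * w_b * s_b(x^p)
    grad_m = np.einsum('pk,pkn->kn', common / sigma ** 2, diff)
    grad_sigma = np.sum(common * sq / sigma ** 3, axis=0)

    w -= eta * grad_w
    m -= eta * grad_m
    sigma -= eta * grad_sigma

print("final sum-squared error:", np.sum(err ** 2))
```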

II. LEARNING IN RADIAL BASIS FUNCTION NETWORKS


A. SUPERVISED LEARNING


In supervised learning problems we try to fit a model of the unknown target function to a training set D consisting of noisy sampled input–output pairs:

$$D = \left\{ \left( \xi^p, \hat{y}^p \right) \right\}_{p=1}^{P}. \qquad (3)$$

The caret (hat) in $\hat{y}^p$ indicates that this value is a sample of a stochastic variable, $y^p$, which has a mean, $\bar{y}^p$, and a variance, $\sigma_p^2$. If we generated a new training set with the same input points, $\{\xi^p\}_{p=1}^{P}$, we would get a new set of output values, $\{\hat{y}^p\}_{p=1}^{P}$, because of the random sampling. The outputs are not completely random, and in fact it is their deterministic part, as a function of the input, which we seek to estimate in supervised learning.
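A minimal sketch of this sampling assumption (with an arbitrary choice of deterministic part and noise level, used purely to illustrate the notation):

```python
import numpy as np

rng = np.random.default_rng(0)

def y_bar(xi):
    # Deterministic part of the target; the sine is an arbitrary stand-in
    # for the unknown function that supervised learning tries to recover.
    return np.sin(xi)

P, noise_sd = 20, 0.1
xi = rng.uniform(-3, 3, size=P)                       # fixed input points xi^p
y_hat_a = y_bar(xi) + noise_sd * rng.normal(size=P)   # one sampled training set
y_hat_b = y_bar(xi) + noise_sd * rng.normal(size=P)   # resampled outputs, same inputs
print("outputs differ between draws:", np.max(np.abs(y_hat_a - y_hat_b)) > 0)
```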

If the weights, $\{w_b\}_{b=1}^{K}$, which appear in the model provided by an RBF network [defined by Eq. (1)] were the only part of the network to adapt during training, then this model would be linear. That would imply a unique minimum of the usual sum-squared-error cost function,

$$E(\mathbf{w}, D) = \sum_{p=1}^{P} \left( f(\xi^p, \mathbf{w}) - \hat{y}^p \right)^2, \qquad (4)$$

which can be found by a straightforward computation (the bulk of which is the inversion of a square matrix of size K). There would be no confusion caused by local minima and no need for computationally expensive gradient descent algorithms. Of course, the difficulty is in determining the right set of basis functions, $\{s_b\}_{b=1}^{K}$, to use in the model (1). More likely than not, if the training set is ignored when choosing the basis functions we will end up having too many or too few of them, putting them in the wrong places, or giving them the wrong sizes. For this reason we have to allow other model parameters (as well as the weights) to adapt in learning, and this inevitably leads to some kind of nonlinear algorithm involving something more complicated than just a matrix inverse.
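To spell out that computation under the stated linearity assumption (a standard least-squares sketch, not the chapter's own derivation): collecting the basis-function responses into a $P \times K$ design matrix $H$ with entries $H_{pb} = s_b(\xi^p)$, the cost in Eq. (4) becomes $\|H\mathbf{w} - \hat{\mathbf{y}}\|^2$, and setting its gradient to zero gives the normal equations

$$H^{\top} H \, \mathbf{w} = H^{\top} \hat{\mathbf{y}}, \qquad \mathbf{w} = \left( H^{\top} H \right)^{-1} H^{\top} \hat{\mathbf{y}},$$

where $H^{\top} H$ is precisely the $K \times K$ matrix whose inversion constitutes the bulk of the work.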

However, as we shall see, even though we cannot get away from nonlinearity in the learning problem, we are not thereby restricted to algorithms which construct a vector space of dimension equal to the number of adaptable parameters and search it for a good local minimum of the cost function—the usual approach with neural networks. This section investigates alternative approaches in which the linear character of the underlying model remains to the fore in both the analysis (using linear algebra) and the implementation (using matrix computations).

The section is divided as follows. It begins with some review material before describing the main learning algorithms. First, Section II.B reminds us why, if the model were linear, the cost function would have a single minimum and how it could be found with a single matrix inversion. Section II.C describes bias and variance, the two main sources of error in supervised learning, and the trade-off which occurs between them. Section II.D describes some cost functions, such as generalized cross-validation (GCV), which are better than sum-squared-error for effective generalization. This completes the review material, and the next two subsections describe two learning algorithms, both modern refinements of techniques from linear regression theory. The first is ridge regression (Section II.E), a crude type of regularization, which balances bias and variance by varying the amount of smoothing until GCV is minimized. The second is forward selection (Section II.F), which balances bias and variance by...
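As a hedged reminder of the standard forms these two methods build on (stated in the design-matrix notation used above; the chapter's Sections II.E and II.F give the authoritative treatment): ridge regression adds a penalty $\lambda \|\mathbf{w}\|^2$ to the cost in Eq. (4), so the weight estimate becomes

$$\mathbf{w}_{\lambda} = \left( H^{\top} H + \lambda I \right)^{-1} H^{\top} \hat{\mathbf{y}},$$

and the generalized cross-validation score minimized over the smoothing parameter $\lambda$ is commonly written

$$\mathrm{GCV}(\lambda) = \frac{P \, \|\hat{\mathbf{y}} - H \mathbf{w}_{\lambda}\|^2}{\left[ \operatorname{tr}\left( I - H (H^{\top} H + \lambda I)^{-1} H^{\top} \right) \right]^2}.$$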

PDF (Adobe DRM)
Size: 48.3 MB


EPUB (Adobe DRM)
Size: 11.1 MB

