Summary of Optimizers

This topic provides details on the various optimizers available for performing nominal optimization. For details on setting up a nominal optimization, refer to Performing Nominal Optimization.

The optimizers are differentiated by their error function formulations and search methods .

For more information on the individual optimizers, refer to Available Optimizers.

Obtaining Global Optimal Results

Most of the optimizers provided are local optimizers. Genetic Optimizer and Simulated Annealing Optimizer are global optimizers and Random Optimizer has the capability to find the global optimal solutions. If you really want a global solution, try Simulated Annealing optimizer first, then Genetic optimizer. You can also try to use Random optimizers (without fixed Seed) multiple times to select the global solution.

Available Optimizers

When you select an optimizer, the system automatically sets both the error-function (EF) formulation and the search method to be used. The available optimizers and their associated error-function formulations and search methods are shown below.

While there are a number of optimizers available, the Random, Gradient, and Simulated Annealing optimizers tend to be the most frequently used because they work for most cases.

Available Optimizers and Associated EF & Search Method
Optimizer Error-Function (EF) Formulation Search Method
Random Optimizer Least-Squares EF Random Search
Gradient Optimizer Least-Squares EF Gradient Search
Random Minimax Optimizer Minimax L1 EF Random Search
Gradient Minimax Optimizer Minimax L1 EF Gradient Search
Quasi-Newton Optimizer Least-Squares EF Quasi-Newton Search
Least Pth Optimizer Least Pth EF Quasi-Newton Search
Minimax Optimizer Minimax EF Gauss-Newton/Quasi-Newton (minimax) Search
Random Max Optimizer Negated Least-Squares EF Random Search
Hybrid Optimizer Least-Squares EF Random Search and Quasi-Newton Search
Discrete Optimizer Least-Squares EF Exhaustive Search
Genetic Optimizer Least-Squares EF Genetic Algorithm Search
Simulated Annealing Optimizer Least-Squares EF Simulated Annealing Algorithm Search
Sensitivity Analysis    

† Note that Sensitivity Analysis appears in the list of available optimizers. Sensitivity Analysis is not actually an optimization process. It is a fundamental element of gradient optimization.

Available Types for Optimization-Variables

Some optimizers may work with discrete valued parts, while others must work with continuous valued parts. It is important to take care when selecting an optimizer because different optimization problems require different optimizers for better performance and accuracy.

Optimizers and Optimization-Variable Types
Optimizer Operates with Discrete Variables Operates with Continuous Variables
Random Yes Yes
Gradient No Yes
Random Minimax Yes Yes
Gradient Minimax No Yes
Quasi-Newton No Yes
Least Pth No Yes
Minimax No Yes
Random Max Yes Yes
Hybrid No Yes
Discrete Yes No
Genetic Yes Yes
Simulated Annealing No Yes

The difference between continuous and discrete variables is relatively simple.

For more information, refer to Value Types for Nominal Optimization.

Random Optimizer

The Random optimizer uses the Random search method to arrive at new parameter values by using a random-number generator, that is, by picking a number at random within a range. Starting from an initial set of parameter values for which the error function is known, a new set of values is obtained by perturbing each of the initial values, and the error function is re-evaluated. Then the error function is re-evaluated by reversing the algebraic sign of each parameter value perturbation. These two values, corresponding to positive and negative perturbations, are compared to the value at the initial point. If either value is less than the initial value, then the set of parameter values for which the error function has its least value becomes the initial point for the next iteration. If neither value is less than the initial value, then the initial point remains the same for the next iteration. Since random search uses a pseudo-random generator, the results can be different for two optimization procedures.

The Random optimizer uses the Least-Squares error function to minimize the average weighted violation for the desired responses. So the value for the error function represents the average weighted violation for the desired responses and the value of zero indicates that all of the intended performance goals have been reached. The Random optimizer guarantees to find at least one local minimum result. It also has the probability to find the global minimum result. The Random optimizer is probably the best optimizer for the following cases when considering the average violation for the performance goals:

See Also

Gradient Optimizer

The Gradient optimizer uses the Gradient search method to arrive at new parameter values using the gradient information of the network's error function. The gradient of the error function indicates the direction to move a set of parameter values in order to reduce the error function. For each iteration, the error function and its gradient is evaluated at the initial point. Then the set of parameter values is moved in that direction until the error function is minimized. A single iteration usually includes many function evaluations; therefore, an iteration in the gradient search method takes much longer than the random search method.

The Gradient optimizer uses the Least-Squares error function to minimize the average weighted violation for the desired responses. So the value for the error function represents the average weighted violation for the desired responses and the value of zero indicates that all of the intended performance goals have been reached. The Gradient optimizer guarantees to find a local minimum result. A design that is optimized by the gradient optimizer has the least sensitivity (more stable) to slight variations in its parameter values.

The Gradient optimizer is the best optimizer to use for simple circuits with straightforward requirements; that is, the larger number of function evaluations will not slow the optimization appreciably, but the optimizer will converge on a solution quickly. The Gradient optimizer is also quite good at following contours.

See Also

Random Minimax Optimizer

The Random Minimax optimizer uses the Random search method to arrive at new parameter values. The procedure is the same as that described for the Random optimizer.

The Random Minimax optimizer uses the Minimax L1 error function to minimize the point that constitutes the greatest violation for the desired responses. So the value for the error function reveals the greatest violation for the weighted desired responses and the value of zero indicates that all of the intended performance goals are satisfied. Just like the Random Optimizer, the Random Minimax optimizer guarantees to find at least one local minimum result and has the probability to find the global minimum result.

The Random Minimax optimizer is probably the best optimizer for the following cases when considering the greatest violation for the performance goals:

See Also

Gradient Minimax Optimizer

The Gradient Minimax optimizer uses the Gradient search method to arrive at new parameter values. The procedure is the same as that described for the Gradient optimizer.

The Gradient Minimax optimizer uses the Minimax L1 error function to minimize the point that constitutes the greatest violation for the desired responses. So the value for the error function reveals the greatest violation for the weighted desired responses and the value of zero indicates that all of the intended performance goals are satisfied. Just like the Gradient Optimizer, the Gradient Minimax optimizer guarantees to find a local minimum result. A design that is optimized by a gradient minimax optimizer has the least sensitivity (more stable) to slight variations in its parameter values.

The Gradient Minimax optimizer is the best optimizer to use for simple circuits with straightforward requirements; that is, the larger number of function evaluations will not slow the optimization appreciably, but the optimizer will converge on a solution quickly. The Gradient Minimax optimizer is also quite good at following contours.

See Also

Quasi-Newton Optimizer

The Quasi-Newton optimizer uses the Quasi-Newton search method to arrive at new parameter values. The Quasi-Newton search method uses the second-order derivatives of the error function and the gradient to find a descending direction. The second-order derivatives are estimated by the Davidson-Fletcher-Powell (DFP) formula or its complement. Appropriately combined with the gradient, this information is used to find a direction and an inexact line-search is conducted. Much like the optimizers using the gradient search method, an iteration in the optimizers using the Quasi-Newton search method consists of many function evaluations. Therefore, a single iteration using the Quasi-Newton search method takes longer than an iteration in the optimizers using the Random search method.

The Quasi-Newton optimizer uses the Least-Squares error function to minimize the average weighted violation for the desired responses. So the value for the error function represents the average weighted violation for the desired responses and the value of zero indicates that all of the intended performance goals have been reached. The Quasi-Newton optimizer guarantees to find a local minimum result.

See Also

Least Pth Optimizer

The Least P th optimizer uses the Quasi-Newton search method to arrive at new parameter values.

The Least P th optimizer uses the Least Pth error function formulation, in which the error function is equal to the P th power of the difference between the simulated responses, where p = 2, 4, 8, or 16. The optimizer automatically increases p in this sequence. This emphasizes the errors that have high values much more strongly than those that have small values. As p increases, the Least P th error function approaches the minimax error function.

The Least P th optimizer allows the error function to become negative when you specify a performance window and the response moves inside of that window. For example, there may be a minimum and maximum gain specification on an amplifier and the Least P th optimizer can go beyond the specification and place the gain halfway between the two limits. The Least P th optimization routine is the exponential sum of the error function, where the exponent p is not necessarily equal to 2. It can be a positive number, usually an integer. The Least P th formulation is used as an indirect method to achieve a minimax design. Minimax error function can contain edges or discontinuities in their derivatives. These occur at points where the error contributions due to different goals intersect in the parameter space. The Least P th error functions avoid this problem.

For a large value of p, the errors having the maximum value are more strongly emphasized over the other errors, that is, they are given higher priority in optimization. As p increases to infinity, the Least P th formulation leads to a minimax error function. The problem is solved though a sequence of Least P th optimizations with p being gradually increased. The sequential Least P th optimization used in the program uses p = 2 4 8 16. This strategy often provides a smooth path towards a minimax solution.

See Also

Minimax Optimizer

The minimax optimizer consists of two stages. In the first stage of the algorithm, the Gauss-Newton search method solves a minimax problem using a linear programming technique. In doing so, the status and potential of each individual error function component are analyzed. Its contribution to the minimax problem is mathematically assessed and taken into account during optimization. In the second stage, the optimizer works with a Quasi-Newton search method using approximate second-order derivatives. Such extra effort becomes necessary for an accurate and efficient solution to certain ill-conditioned problems (i.e., singular problems).

Using the minimax error function formulation , the Minimax optimizer calculates the difference between the desired response and the actual response over the entire measurement parameter range of optimization. Then the optimizer tries to minimize the point that constitutes the greatest difference between actual response and desired response. The minimax optimizer terminates when responses become optimally equal-ripple or the relative change in the variables is less than 0.05 percent. It also stops when the number of iterations is reached. Note that the bounds imposed on the variables are formulated and treated directly as linear constraints without having to resort to variable transformation; therefore, a source of nonlinearity is eliminated.

See Also

Random Max Optimizer

The Random Max optimizer uses the Random search method . The Random Max optimizer uses the same general process as the Random optimizer.

The Random Max optimizer uses the Negated Least-Squares error function which provides for a worst-case analysis. As the name implies, the optimizer internally negates the least-squares error function so that the effect is a maximization of the error function.

See Also

Hybrid Optimizer

The Hybrid optimizer combines the Random and Quasi-Newton search methods . It offers a compromise between the ability to find a minimum quickly, using the fewest possible circuit analyses (this is the strength of Quasi-Newton optimization), and the possibility to find the global cost minimum in the presence of many local minima (this is the strength of Random optimization).

When hybrid optimization is chosen, the system starts out using the Quasi-Newton search method , and quickly finds the nearest local minimum. When the gradient approaches zero, it is near a minimum and can do little to decrease the cost function. The system then uses the Random search method to generate a new initial guess. The Quasi-Newton search method is then performed, starting at this initial guess. This process continues until the optimizer can no longer improve the performance of the circuit, or has reached the maximum number of iterations that has been specified. For this reason, hybrid optimization is an excellent choice if time is available for very long analyses.

Hybrid optimizer has two useful properties:

The Hybrid optimizer also uses the Least-Squares error function .

See Also

Discrete Optimizer

The Discrete optimizer uses the exhaustive search method exclusively. This optimizer only affects parameter values specified as discrete-valued optimization variables. The exhaustive search method involves a comprehensive search for the combination of discrete values that results in the best design performance. Starting from an initial set of parameter values for which the error function is known, an update in the parameter values occurs upon an improvement in the error function. Because the parameter values that may change are not continuous variables, this search method is more a series of trials than iterations. Moreover, the number of trials required to attempt all combinations may often be prohibitive. To reduce optimization time, keep the number of discrete variables to a minimum and reduce the number of values the discrete variables may take on.

The Discrete optimizer also uses the Least-Squares error function.

See Also

Genetic Optimizer

Genetic algorithms (GA's) provide another direct search optimization method. The basis of the procedure is a set of trial parameter sets, sometimes called chromosomes, which are allowed to evolve towards a set that gives progressively better performance. The key to the genetic optimization is the strategy of change, sometimes likened to survival of the fittest. The idea is that with each change in the parameter population, that is, each generation of parameters, the performance given by the parameter population improves. This whole process is achieved using a five-step process called (1) Representation, (2) Evaluation, (3) Reproduction, (4) Breeding and Crossover, and (5) Mutation as described below.

  1. Representation : Genetic algorithms require the input parameter set to be represented as a string of digits. It is straightforward to map each parameter onto the interval 0 to 1, for instance, and then have each of the n parameters occupy a position in the string of n bounded numbers. The algorithm then manipulates and optimizes this string of numbers as a whole. An individual string of parameters is called an element within the population of parameter strings.
  2. Evaluation : Each generation of parameters begins with a performance evaluation of each string in the population. Usually this involves determining the performance G(P) for each representation of P in the population. Each element is then graded as to how well it performed, often using an error function, known as the fitness function .
  3. Reproduction : Some of the members of the population for this generation are copied, that is, reproduced, and added to the next generation population. The number of copies depends on the performance evaluation. The elements that perform well are copied several times, and those with poor performance are not copied at all. The copies, or offspring, then make up the next generation. Elements that are not copied are not represented in the next generation. Note that the number of elements in each generation is constant. There are several suggested methods of ranking and reproduction, including ratioing, where the number of copies is directly related to the element's performance, and ranking where the performances are ranked, with the top performers being copied more times than the lower ranked performers.
  4. Breeding and crossover : The previous step, Reproduction, produced a population of strings where each evaluated well. Breeding then combines parts of two strings to form two different and new strings. In this way good representations are mixed with poorer representations, with the result eventually being evaluated in the next generation of the algorithm. There are many methods for breeding; the most common is crossover. Crossover typically takes two elements, splits them at a random location in the string, and swaps the two parts to create two new strings (see the following figure). This provides a controlled statistical exploration of the performance space.

    Breeding and Crossover in the Simple Genetic Algorithm
  5. Mutation : The last step in creating a new generation of elements is the random changing of parameters in some of the surviving strings. This comprises a completely random search of the performance space, and can be viewed as the injection of information into the surviving population. The following figure presents a completed flow diagram of the genetic algorithm. The application of these techniques requires many tuning parameters that are not available with yield optimization. Genetic optimization techniques will prove useful for many complex optimization problems, including discrete value and tolerance optimization.

    Random-Search Optimization Using a Genetic Algorithm

See Also

Simulated Annealing Optimizer

Simulated Annealing is a technique that has attracted significant attention as a suitable choice for optimization problems of large scale, especially ones where a desired global extremum is hidden among many, poorer, local extrema. At the heart of the method of simulated annealing is an analogy with thermodynamics, specifically with the way that liquids freeze and crystallize, or metals cool and anneal. The essence of the process is slow cooling, allowing ample time for redistribution of atoms as they lose mobility to ensure that a low energy state will be achieved.

The idea behind the Simulated Annealing method is to sample the Boltzmann distribution, which describes the distribution of energy of a mechanical system in a heat bath. The so-called Boltzmann probability distribution, Prob( E) ~ exp( -E/kT), expresses the idea that a system in thermal equilibrium at temperature T has its energy probabilistically distributed among all different energy states E. As a result, the system sometimes goes uphill as well as downhill. But the lower the temperature, the less likely is any significant uphill excursion.

The Simulated Annealing Optimizer in ADS is combined with a modification of the downhill simplex method. It can locate a good approximation to the global optimum of a given problem. The objective function of the optimization problem is regarded as the energy function of such a system. This amounts to replacing the single point x as a description of the system state by a simplex of N+1 points. The moves are the same as the refections, expansions, and contractions of the simplex method. The procedure for the random changes in the configuration is: add a positive, logarithmically distributed random variable, proportional to the temperature T, to the stored function value associated with every vertex of the simplex, and subtract a similar random variable from the function value of every new point that is tried as a replacement point. This procedure always accepts a true downhill step, but sometimes accepts an uphill one. In the limit T→0, this algorithm reduces exactly to the downhill simplex method and converges to a local minimum. At a finite value of T, the simplex expands to a scale that approximates the size of the region that can be reached at this temperature, and then executes a stochastic, tumbling Brownian motion within that region, sampling new, approximately random points as it does so. The efficiency with which a region is explored is independent of its narrowness and orientation. If the temperature is reduced sufficiently slowly, it becomes highly likely that the simplex will shrink into that region containing the lowest relative minimum encountered.

Success or failure is quite often determined by the choice of annealing schedule to reduce the temperature sufficiently slowly. The choice of annealing schedule depends on the problem being solved. In the algorithm used here, the annealing schedule is chosen to decrease T to β times after m shoots for one iteration, and the process continues for n iterations. In other words, the customer can control the annealing schedule by changing the following control parameters:

In summary, simulated annealing can deal with highly nonlinear models, chaotic and noise data, and many constraints. It is a robust and general technique. Its main advantages over other local search methods are its flexibility and its ability to approach global optimality. However, since simulated annealing is a metaheuristic, a lot of choices are required to tune it for an actual problem. There is a clear trade-off between the quality of the solutions and the time required to compute them. The tailoring work required to account for different classes of constraints and to fine-tune the algorithm's parameters can be rather delicate.

Here are some suggestions on how to use the optimizer's control parameters:

See Also

Error-Function (EF) Formulation

The error function (EF) is the method used to evaluate how far away from the target the current iteration is. In its most general definition, the error function calculates the difference between the simulation and the specifications defined by the optimization goals.

One possible formulation of the error function is shown below in general terms.

Here the error function ( EF ) first determines the difference between the simulation ( simulation i ) and the goals ( goal i ) for all of the goals ( allGoals ) that have been defined. This difference is usually called a residual . Each residual is then raised to a power, P , and the result is then multiplied by a weighting factor, W . The error function value is determined as the sum of all these terms. The weighting factors may have different values from one goal to another, and they are used to emphasize some optimization goals versus others by making their contribution to the error function more significant.

The example above is just one possible formulation of the error function. Numerous other possibilities exist and are explained in other sections of this document.

Each optimizer uses a different method for error function formulation. The types of error function formulations are shown in the table below.

Use of Error Function Formulation
Optimizers Error Function Formulation
Random Optimizer,
Gradient Optimizer,
Quasi-Newton Optimizer,
Hybrid Optimizer,
Discrete Optimizer,
Genetic Optimizer,
Simulated Annealing Optimizer
Least-Squares EF
Minimax Optimizer Minimax EF
Random Minimax Optimizer,
Gradient Minimax Optimizer
Minimax L1 EF
Least Pth Optimizer Least Pth EF
Random Max Optimizer Negated Least-Squares EF

Least-Squares EF

The least-squares error-function is very popular. The residuals are squared, hence the name of this error function formulation. The generalized form is as follows:

The least-squares error function is calculated by evaluating the error for each specified goal at each frequency/power point individually. The magnitudes of these errors are then squared, and the squared magnitudes are then averaged over frequency and/or power.

To help you understand the error function calculation in more generality for a measurement as a function of frequency, consider the following variable definitions.

di the "ith" goal normalizing factor. Its inverse will be the "ith" goal internal weighting factor.
F j the set of frequency values specified by the "jth" frequency range
R ij (f) the "ith" frequency dependent response that is being optimized over the "jth" frequency range
g ij the "ith" goal value within the "jth" frequency range that is the optimization criterion corresponding to the R ij response
W ij the "ith" goal weighting factor, within the "jth" frequency range, associated with the R ij response and g ij goal
e ij (f) the frequency dependent error contribution due to differences between R ij and g ij , evaluated at frequency "f."

Where

d i = abs(Min) if only Min value is specified in the "ith" goal component.
    abs(Max) if only Max value is specified in the "ith" goal component.
    (abs(Max) + abs(Min))/2.0 if both Min and Max are specified in the "ith" goal component.
d i = 1.0 if the value calculated for d i from above equation is zero.

The error contribution e ij (f) are dependent on the optimization goal specification, which involves a specified relational operator (RelationOp), and the error contribution formation. The following table summarizes these conditions.

Relational operator Goal satisfaction condition
Equal to (=) R ij (f) = g ij
Less than (<) R ij (f) ≤ g ij
Greater than (>) R ij (f) ≥ g ij

The error contribution e ij (f) has different formulation for different error function formation. For the least-squared method,
e ij (f) = (| R ij (f) − g ij | /d i ) 2
The contribution to the total error function from response R ij (f) over the set of frequencies in the j th range is then given by:

The next step is to sum the contributions from all responses within the frequency range and divide by the number of frequencies. One way to express this mathematically is with the equation:

in which N j is the number of frequencies in frequency range F j .
The final step in the error function calculation is to sum over all frequency ranges.

The weighting factors in different frequency ranges can be used to emphasize one of the E j 's with respect to others, just as weights within a frequency range can be used to attach greater or lesser importance to a given response relative to other responses within that frequency range.
In summary, the error function calculation can be represented by the following triple summation:

Where

For optimization with an additional 2nd swept variable (VAR) parameter, there is an additional summation over the parameter range. This error function formulation is represented in the following equation:

Where the additional summation index p is over the swept variable levels in the swept variable range for the i th response, P i represents the total number of swept variable levels in the i th responses power range. Thus, each response can have a unique swept variable range associated with it.

The above procedure for error function formulation is based upon the error contribution from all of the responses for jth frequency range, E j . The total error function with respect to the error contributions from each response, E i , can also be constructed, which includes all of the frequency range. So,

and

This second approach for error function formulation is applied in this documentation. Remember that a response is any individual measurement on any network. And, that the weights W ij have the value 1 unless some other value is given with the g ij goal specification.

For more information on weight, refer to Weighting Factors.

Minimax EF

The Minimax optimizer calculates the difference between the desired response and the actual response over the entire measurement parameter range of optimization. Then the optimizer tries to minimize the point that constitutes the greatest difference between actual response and desired response.

Basically, minimax means minimizing the maximum (of a set of functions generally denoted as errors) . The error function is defined as the maximum among all error contributions, regardless of their signs or, expressed mathematically:

E = max{e ij (f)*W ij }, among all of the f, j and i

where e ij (f) is equal to (R ij (f)-g ij )/d i for less than goal relationship, and is equal to (g ij -R ij (f))/d i for greater than relationship.

The minimax objective function always represents the worst case, where the specifications are either most severely violated (in which case E > 0) or, are satisfied with the worst error (in which case E < 0). The minimax optimizer will spend all its effort trying to minimize these. A minimax solution is one such that the goal specifications are met in an optimal, typically equal-ripple manner.

Minimax L1 EF

Random Minimax and Gradient Minimax optimizers use the Minimax L1 error function:

E = max{0, e ij (f)*W ij }, among all of the f, j and i

These optimizers calculate the difference between the desired response and the actual response over the entire measurement parameter range of optimization. Then the optimizers try to minimize the point that constitutes the greatest violation for the desired response. Compared with minimax error function, minimax L1 error cannot be less than zero. So that it only accounts for the most severely violated case.

Least Pth EF

The Least P th optimizer uses an error function formulation similar in makeup to the least squares method found in the random, gradient, and the quasi-Newton optimizers. But, instead of squaring the magnitudes of the individual errors at each frequency, it raises them to the P th power, where p = 2, 4, 8, or 16. The optimizer automatically increases p in that sequence. This emphasizes the errors that have high values much more strongly than those that have small values. As p increases, the Least P th error function approaches the minimax error function.

Least P th allows the error function to become negative. That happens when you specify a performance window and the response moves inside that window.

For example, there may be a minimum and maximum gain specification on an amplifier and the Least P th optimizer can go beyond the specification and place the gain halfway between the two limits.

The Least P th optimization routine is the exponential sum of the error function, where the exponent p is not necessarily equal to 2. It can be a positive number, usually an integer.

First of all, the maximum error is found as:

E max = max i (E i )

According to the sign of E max , the Least P th error function is defined as follows:

if E max > 0

if E max = 0

if E max < 0

The cases E max > 0 and E max < 0 are clearly distinguished. The first case, where E max > 0, only positive errors enter the final error function. For the second case, where E max < 0, all errors become part of the final error function. In this case, all errors are negative, i.e., all the specifications are satisfied.

The Least P th final error function is well-defined whether or not there are positive errors. It has the same sign as the maximum error E max and becomes negative when all the specifications are met. Therefore, the optimizer continues to improve the performance of the design, even after there are no more violations of the specifications.

The Least P th formulation is used as an indirect method to achieve a minimax design. Minimax error function can contain edges or discontinuities in their derivatives. These occur at points where the error contributions due to different goals intersect in the parameter space. The Least P th error functions avoid this problem.

For a large value of p, the errors having the maximum value (E i = E max ) are more strongly emphasized over the other errors, i.e., they are given higher priority in optimization. As p increases to infinity, the Least P th formulation leads to a minimax error function. The problem is solved though a sequence of Least P th optimizations with p being gradually increased. The sequential Least P th optimization used in the program uses p = 2 4 8 16. This strategy often provides a smooth path towards a minimax solution.

Negated Least-Squares EF

The optimizer that provides worst-case analysis is called Random Max . The random maximizer internally negates (changes the sign) the least-squares error function so that the effect is a maximization of the error function.

Note
The effect of this error function is to drive the values to the extreme of their ranges. To prevent destroying the originally desired response, you may want to save a copy of your optimized design, then change your optimization constraints to be equal to the tolerances on these variables.

For more information on the least-squares error function, refer back to Least-Squares EF, keeping in mind the effect of the negation.

Weighting Factors

All of the error-function (EF) formulations include weighting factors, W i . Weighting factors are a measure of the importance of a given goal relative to other goals. The default value for weighting factors is 1.0.

Setting the Appropriate Weighting Factors Manually

Assuming the internal weighting factors are 1.0 (default value), there can be two cases for setting up appropriate weighting factors related to different requirements:

Unbiased Error Function

An unbiased error function means the partial errors for each of the goals have almost the same contributions to the error function. When the partial errors have values of the same order, there is no problem with the default weighting factors. However, when the partial errors have values of different orders, you must adjust the weighting factors to make the weighted partial errors in the same order. For example, assume there are three requirements for one amplifier design:

|S22 | <= 0.30
Gain ripple <= 1 dB
|NFACT - NFMin | <= 0.1 dB

If the least-squares error function is applied to the optimization procedure, the orders for the partial errors for the three goals are 0.09, 1, and 0.01 respectively, and the order for the total error is:
EF = W 1 *0.09 + W 2 *1 + W 3 *0.01
If W 1 , W 2 , and W 3 use the default value of 1.0, it is clear that the error function will be biased toward the second goal, while the third goal requirement will be almost ignored. In order to use the three goal requirements equally, the appropriate weighting factors can be used in the following order:
W 1 = 1.0/0.09
W 2 = 1.0/1
W 3 = 1.0/0.01
In other words, the appropriate weighting factor depends on your specific performance requirement and you must set them correctly to obtain valid results.

Intended Emphasis

Sometimes, it is desirable to allow the error function emphasis on some of the goals. In this case, larger weighting factors for these goals means a higher emphasis.

Setting the appropriate weighting factors based on the goal requirements as described above can become tedious. One method is provided with the effect of having the appropriate internal weighting factors defined for each of the performances automatically as shown in the following table.

Internal Weighting Factors Based on Performance Specifications
IW i = abs(Min) if only Min value is specified for ith goal (in the ith goal component)
  abs(Max) if only Max value is specified for the ith goal (in the ith goal component)
  (abs(Max) + abs(Min))/2.0 if both Min and Max are specified for the ith goal (in the ith goal component)
IW i = 1.0 if the value calculated for IW i from the above equation is zero

The procedure used to set up the appropriate weighting factors using internal weighting factors from the automatic scaling mode is:

Search Methods

These search methods are used to arrive at new parameter values:

Optimization with Random Search is typically used initially. Optimization with Gradient search is generally used in later stages of optimization. Discrete optimization only affects discrete-valued variables. The genetic algorithm search is well suited to the discrete and mixed (continuous and discrete) problems.

Random Search

The optimizers using random search method (Random, Random Minimax, and Random Max optimizers) arrive at new parameter values by using a random-number generator, that is, by picking a number at random within a range, which is sometimes a slower process compared to the optimizers using gradient search methods.

Optimization with random search method is a trial and error process. Starting from an initial set of parameter values for which the error function is known, a new set of values is obtained by perturbing each of the initial values, and the error function is re-evaluated.

For optimization with random search method, a trial consists of two error function evaluations. A trial performed by optimization with random search method is completed by reversing the algebraic sign of each parameter value perturbation and re-evaluating the error function. These two values, corresponding to positive and negative perturbations, are compared to the value at the initial point.

If either value is less than the initial value, then the set of parameter values for which the error function has its least value becomes the initial point for the next trial. If neither value is less than the initial value, then the initial point remains the same for the next trial.

Gradient Search

The optimizers using gradient search method (Gradient and Gradient Minimax optimizers) find the gradient of the network's error function. These optimizers usually progress more quickly to a point where the error function is minimized, though it is possible for them to terminate in a local minimum.

The optimizers find the gradient of the error function (i.e., the direction to move a set of parameter values in order to reduce the error function). Once the direction is determined, the set of parameter values is moved in that direction until the error function is minimized. Then the gradient is re-evaluated. This cycle is equal to one iteration of the gradient optimizers.

A design that is optimized by a gradient optimizer has the least sensitivity (more stable) to slight variations in its parameter values.

A single iteration usually includes many function evaluations ; therefore, an iteration in optimization using gradient search method takes much longer than a trial in optimization using random search method.

Quasi-Newton Search

The optimizers using Quasi-Newton search method (Quasi-Newton and Least Pth optimizers) use second-order derivatives of the error function and the gradient to find a descending direction.

The optimization routine using Quasi-Newton search method estimates the second-order derivatives using the Davidson-Fletcher-Powell (DFP) formula or its complement. Appropriately combined with the gradient, this information is used to find a direction and an inexact line-search is conducted.

The optimization terminates when the gradient vanishes or the change ration in the variables is small (less than 1.0e-5). It also stops when the number of iterations, that you have specified, is reached. The bounds imposed on the optimization variables are handled using a transformation of variables.

Like the optimizers using gradient search method, an iteration in the optimizers using Quasi-Newton search methods consists of many function evaluations, and takes longer than a trial in the optimizers using random search method.

Gauss-Newton/Quasi-Newton (minimax) Search

The minimax optimizer consists of two stages. In the first stage of the algorithm, the optimizer solves a minimax problem using a linear programming technique. In doing so, the status and potential of each individual error function component are analyzed. Its contribution to the minimax problem is mathematically assessed and taken into account during optimization.

In the second stage, the optimizer works with a Quasi-Newton method using approximate second-order derivatives. Such extra effort becomes necessary for an accurate and efficient solution to certain ill-conditioned problems (i.e., singular problems).

The minimax optimizer terminates when responses become optimally equal-ripple or the relative change in the variables is less than 0.05 percent. It also stops when the number of iterations is reached.

Note that the bounds imposed on the variables are formulated and treated directly as linear constraints without having to resort to variable transformation; therefore, a source of nonlinearity is eliminated.

Exhaustive Search

The exhaustive search method is used exclusively by the discrete optimizer. The discrete optimizer only affects parameter values specified as discrete-valued optimization variables.

The exhaustive search method involves a comprehensive search for the combination of discrete values that results in the best design performance. Starting from an initial set of parameter values for which the error function is known, an update in the parameter values occurs upon an improvement in the error function.

Because the parameter values that may change are not continuous variables, this search method is more a series of trials than iterations. Moreover, the number of trials required to attempt all combinations may often be prohibitive.

To reduce optimization time, keep the number of discrete variables to a minimum and reduce the number of values the discrete variables may take on.

Genetic Algorithm Search

The Genetic optimizer is the only optimizer that uses the Genetic Algorithm search method.

For more information, refer to the Genetic Optimizer.

Simulated Annealing Algorithm Search

The Simulated Annealing optimizer is the only optimizer that uses the Simulated Annealing Search method. Basically, it is a modified downhill simplex search mechanism. For more information, see Simulated Annealing Optimizer.

Sensitivity Analysis

Sensitivity analysis is listed as a type of optimization. This feature comprises a single-point or infinitesimal sensitivity analysis of a design variable. Sensitivity analysis for circuit design involves taking partial derivatives of the response with respect to a design variable of interest. It is thought that these numbers can help pinpoint variables that contribute disproportionately to performance variance.

The method used to compute sensitivities is based on finite difference approximation requiring N+1 full circuit simulations, where N is the number of design variables. Sensitivity analysis results are also unconditionally sent to the dataset, and this data can be examined in the Data Display window. Sensitivities are approximated as follows:

where R(P 0 ) is the response evaluated at the nominal point and R(P i + ) is the response due to a perturbation in the i th parameter. The perturbation ratio is about 1.0e-06 (e.g., P i + = (1+1.0e-06)*P i 0 ). The response R is actually the expression found in the goal(s) given in the Optimization window and performing the sensitivity analysis.

The sensitivity analysis also outputs normalized sensitivities. Normalized sensitivities use the approximate gradient (single-point sensitivity) to predict the percentage change in the response due to a 1% change in the design variable. Normalized sensitivity is defined as:

The sensitivity information is saved in the dataset using the Goal name. Normalized sensitivities are save with the name: norm_ <goal_name> .

Cascaded Optimization Strategies

There is no unique winning strategy for optimization due to the complexity and variation of circuit design. However, the starting point is very important in finding a good solution to your design problems. It is recommended that you start from a manual/empirical estimation, based on a simplified design technique. Starting with some empirical estimation or educated guess is always better than simply picking a random starting point. Once you start using the optimization process, you may develop your own strategies based on different applications. Some possible strategies are,

For example, you can use the random optimizer first, then fine-tune the solution using the gradient optimizer. Using this strategy, the min-max ranges for a discrete optimization can be significantly narrowed. The discrete optimizer can then be used if necessary.

In addition to reducing the number of possible discrete values for each discrete variable, you should also consider minimizing the number of discrete variables to be used. The intent here is to minimize the total number of permutations which, in turn, will minimize the optimization time and the computer resources required for this process.

For more information on discrete variables, refer to Value Types for Nominal Optimization.

 

Privacy Statement  | Terms of Use  | Legal | Contact Us  | © Agilent 2000-2008 

Contents
Additional Resources