Section 6: Analysis of Impact on Applications
As discussed in the previous section, given a certain appreciable frequency of occurrence of the reduced precision divide, the impact on the end user depends on how the results of these instructions (along with any inaccuracies) propagate into further computation in the application, and on how the end user interprets the final results of the application.
To understand the importance of the flaw, an elaborate characterization effort was undertaken. The effort had a twofold thrust: first, to estimate the frequency of occurrence of the reduced precision divide, and second, to estimate how the reduction in precision propagates to the end result and to determine how that result is used.
The methodology used for this purpose involved data sources both internal and external to Intel. Internally, characterization was performed in a verification laboratory on key applications that had been ported to the laboratory environment. Additionally, test suites provided by the application vendor for verification of the functionality of the test suite on that platform were procured, and were used for a pilot measurement. Externally, opinions were taken from eminent application and algorithm experts in the industry as well as from power users of the key applications.
6.1 Taxonomy of Applications
The application base was categorized into the following groups:
1. Commercial PC applications on desktop/mobile platform running on MS-DOS, Microsoft* Windows*, or OS/2*. This class includes basic spreadsheet users for personal finance or basic accounting.
2. Technical applications. This includes a broad range of applications including engineering and scientific, advanced multimedia, educational, and financial applications. Thus, this class includes power users of spreadsheets such as financial analysts and financial engineers.
Applications in this category could be purely integer-based, or could involve floating point instructions for either numerical computation or for visualization. This class spans the wide range of applications running on MS-DOS, Windows, OS/2 or UNIX* operating systems.
3. Server and transaction processing applications.
6.2 Impact on Commercial PC applications
A large majority of PC applications do not invoke the floating point unit. This includes applications such as word processing, text editing and email. In the commercial PC domain, the majority of applications that do use floating point do not invoke an appreciable number of divides and hence do not introduce significant failures that will pose a data-integrity problem during the useful life of the part. Table 6-1 illustrates the outcome of the analyses and characterization on a few key applications. Of specific concern were the spreadsheet applications, where numerical calculation is often supported via use of the floating-point unit. To address this concern, a more elaborate study focused on spreadsheets; it is described in the next subsection.
6.2.1 Spreadsheets
The study on spreadsheets included a survey of acknowledged numerics experts in the industry. The results of the survey were partially confirmed by statistical characterization in the internal verification laboratory at Intel. The results from the survey are now summarized.
Table 6-1 COMMERCIAL PC APPLICATIONS ON DOS/WINDOWS/OS/2
The most common use of a spreadsheet is as a computational database that collects information of some kind, e.g. information on expense reports, budgets or miscellaneous data on a process, an experiment or personnel in a firm. Only a small fraction of all spreadsheet users are actually "heavy" users, users who intensely invoke the computational engine to generate numerical information. Most other users either use spreadsheets to display this kind of information and make minor modifications and edits, or perform a few calculations.
Once entered into the spreadsheet with a certain number of significant digits, most data is converted to some internal representation, and most numeric computation is floating point-based. Spreadsheets like Excel* and Quattro Pro* compute in double precision floating point, while Lotus 1-2-3* computes in extended precision. While intermediate values are stored with the full precision, results are displayed as dictated by the user. About 40% of the results in general are displayed with only two decimal digits after the point (e.g. for currency display), another 40% are displayed as integers (after rounding), and only the remaining 20% of the numbers are displayed in scientific format or in floating point format with more than two digits after the decimal point.
About 95% of the numeric formulae invoked contain one or two operators, typically an add or a multiply or, rarely, a divide. Occasionally, the Mod function (which takes the remainder modulo one to get the fractional portion of the number) is used. The remaining 4% of the formulae used include functions such as IRR (Internal Rate of Return, which solves an Nth order polynomial equation), Power, Interest Rates, Standard Deviation and Square Root. Transcendental functions are invoked very rarely. Equation solvers are also used rarely, and could invoke the divide function to implement Newton's formula.
For most accounting applications of the spreadsheet, typical input data may have up to about 7-8 decimal digits to the left of the decimal point, and about 2-3 digits to the right of the point, so that the information is known to about 11 significant digits. The most common use of divides is for computation of ratios. Often these ratios are applied once or a couple of times to data, and often towards the end of the computation, so that results from the divide have reduced opportunity to propagate. Since ratios are often used for calculating percentages, the ratio requires about 4 decimal digits (2 to the right and 2 to the left of the decimal point).
For the rest of the basic spreadsheet users, most data that is input to spreadsheets has fewer than three significant digits to the right of the decimal point. A lot of the numbers have only a few significant digits to the left of the point and are thus only known to four or five digits. Also frequent are power-of-two fractions.
In terms of numbers of operations, fewer than 10% of the instructions executed in a typical spreadsheet run are floating point instructions. Most of the numerical operations are geared toward the display engine. Displaying a spreadsheet of 1 page with 600 cells and 2 floating point operations (one of which may be a divide) per cell would require 1,200 FP operations. On the computational side, a typical recalculation could contain 5,000 adds and subtracts, a few multiplies and a very few divides. Divides are used for date calculations, to divide by 365. It is very unlikely that a basic spreadsheet user would invoke any more than 500-1,000 independent divides per day. It is worth noting that scrolling through several pages repeatedly would result in recalculation with the same values and would not introduce any additional independent divide operations and therefore no additional errors.
Given that even by conservative estimates, an average PC user invoking 1,000 divides per day would see a FIT rate of once in 27,000 years due to this failure mechanism, and given the information on the way the data is interpreted, displayed and used, we conclude that the rate of a significant failure would be much smaller than once every 27,000 years. By the analysis from the previous sections, the common user will not see this effect during the practical lifetime of the part.
For individual users who invoke more than 1,000 independent divides per day, the rate of encountering a reduced precision result increases proportionately.
The treatment of the advanced use of spreadsheets for financial engineering is handled in the section on technical applications.
6.3 Impact on Technical Applications
In the following two subsections, we examine first engineering and scientific applications, followed by applications in the financial world.
6.3.1 Impact on Engineering and Scientific Applications
A broad array of applications are run by scientists and engineers on modern workstations. Table 6-2 shows one taxonomy of technical applications based on the discipline. This table gives the algorithm employed in the particular application, an example of such an application, an indication of its reliance on divides, the normal condition of the problem (an indication of the likelihood that an error will propagate through the calculation [see below]) and the frequency with which Pentium processors are likely to be used in the application.
The straightforward calculation of frequency of occurrence of a divide inaccuracy based on the number of divides/day on a Pentium processor based platform indicates that users will experience inaccuracy due to the flaw from time to time in the course of floating point intensive work. Based on this result it is necessary to investigate the likely impact of a divide returning a reduced precision result. Figure 6-1 shows a simple framework for evaluating the frequency of outcomes for a Pentium processor based platform used for divide-intensive work. The symbols used in Figure 6-1 are explained in Table 6-3.
The number of divides performed in any given period of time is of course dependent on the size and frequencies of the analyses performed on the Pentium processor based platform. It is difficult to select a representative example because the percentage of divides can vary dramatically. For example, in Gaussian Elimination on dense matrices the operation count varies as N^3, where N is the matrix order, while the number of divides is proportional to N. Thus smaller matrices have a much higher proportion of divides and will encounter more divides per unit time, even though the precision of the divides in the larger matrix calculations is more critical. The sparsity pattern also plays a large role, as sparse matrix computations encounter divides as a larger percentage of the total operations than do dense matrices. For the purposes of estimation we assume a divide rate of K = 120 million/day. This corresponds to Gaussian Elimination on a 2,000 by 2,000 matrix with a bandwidth of 250 at a flop rate of 30 Mflops. (This example is illustrative only and is not intended to quote performance on a specific problem.) A cross check of the data from extensive testing with engineering codes indicates rates approaching, but not exceeding, this value. The probability P1 is known from the studies cited earlier in this report. It works out to 1 in 9 billion, or 1.11E-10.
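The scaling argument above can be illustrated with a small operation count. The following is a minimal sketch, not the report's measurement method. It assumes a textbook dense elimination without pivoting in which each pivot is inverted once (one divide per column) and the multipliers are then formed by multiplication. The function name `elimination_op_counts` is hypothetical.

```python
def elimination_op_counts(n):
    """Count operations for dense Gaussian elimination on an n x n matrix.

    Assumption (not from the report): one divide per pivot column, used to
    form a reciprocal; multipliers are then obtained by multiplication.
    Returns (divides, other_ops), where other_ops counts multiplies and adds.
    """
    divides = 0
    other_ops = 0
    for k in range(n - 1):
        rows_below = n - k - 1
        divides += 1                      # reciprocal of the pivot
        other_ops += rows_below           # multipliers: one multiply per row below
        other_ops += 2 * rows_below ** 2  # trailing update: one multiply and one add per entry
    return divides, other_ops


def divide_fraction(n):
    """Fraction of all counted operations that are divides."""
    divides, other_ops = elimination_op_counts(n)
    return divides / (divides + other_ops)


for n in (10, 100, 1000):
    divides, other_ops = elimination_op_counts(n)
    print(f"N={n}: divides={divides}, other ops={other_ops}, "
          f"divide fraction={divide_fraction(n):.2e}")
```

Under these assumptions the divide count grows linearly with N while the remaining work grows roughly as N^3, so the divide fraction falls off quickly as the matrix order increases. An implementation that divides once per multiplier would show a larger divide share.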
The final stage, governed by P3, gives the number of problems expected per year for the system. By "problem" we mean the use of an answer with less than expected precision that has a significantly negative impact on the user. Examples would be failure of designed parts, financial decisions leading to loss of value or erroneous navigation information. The probability P3 is very difficult to estimate, or even to bound. Many errors that could result from a reduced precision divide would cause a calculation to either fail entirely or produce an answer so obviously wrong that it would never be used in practice.
Rather than wrestle with P3 we attempt to bound P2, the probability that the flaw leads to a meaningfully inaccurate result. For this purpose we define meaningfully inaccurate as having an accuracy of fewer than three significant digits. Since the inaccuracy in the divide result appears in bit positions between the 12th to the right of the binary point in the mantissa and the last bit, corresponding to inaccuracies no larger than the 4th significant digit, an amplification of the inaccuracy must occur for a meaningful inaccuracy to appear in the final result. It is easy to construct examples in which a single divide inaccuracy results in a final answer possessing anywhere from full accuracy to no significant digits. (The latter outcome is most easily produced by subtracting the result of a slightly inaccurate divide from a number of close magnitude, so that the correct result would contain only digits beyond those lost to the inaccuracy.) In practice, however, most reduced precision divides are found to be benign.
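The amplification by subtraction mentioned above can be demonstrated with a small simulation. This is a sketch under stated assumptions: it does not reproduce the actual flaw, it simply truncates a correct quotient to 12 bits to the right of the binary point of the significand, which is the worst case the text describes. The helper names `reduced_precision_divide` and `significant_digits` are hypothetical.

```python
import math


def reduced_precision_divide(a, b, good_bits=12):
    """Simulate a worst-case divide: keep only `good_bits` bits to the right
    of the binary point of the significand (an assumption for illustration,
    not a model of the real hardware behavior)."""
    q = a / b
    if q == 0.0:
        return q
    m, e = math.frexp(q)            # q = m * 2**e with 0.5 <= |m| < 1
    # In 1.xxx form the significand is 2*m, so keeping good_bits fraction
    # bits of 1.xxx means keeping good_bits + 1 bits of m.
    scale = 2.0 ** (good_bits + 1)
    return math.ldexp(math.trunc(m * scale) / scale, e)


def significant_digits(approx, exact):
    """Number of correct decimal digits, from the relative error."""
    if approx == exact:
        return float("inf")
    return max(0.0, -math.log10(abs(approx - exact) / abs(exact)))


exact_q = 1.0 / 3.0
bad_q = reduced_precision_divide(1.0, 3.0)

# The quotient alone is still good to about four digits.
print(significant_digits(bad_q, exact_q))

# Subtracting a number of close magnitude amplifies the relative error.
print(significant_digits(bad_q - 0.333, exact_q - 0.333))
```

In this constructed case the quotient keeps about four correct digits, but the difference retains fewer than three, which is the report's threshold for a meaningful inaccuracy.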
If P2 were 1.0, indicating that every divide inaccuracy produced a meaningful inaccuracy in the result, the frequency of meaningful inaccuracy would be 1 in 75 days based on the values of K and P1 above. In order for this frequency to fall to a level comparable to the frequency of divide inaccuracies in spreadsheet applications, P2 must be of the order of 10^-4. The remainder of this section deals with the estimation of P2.
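The arithmetic behind the 75-day figure follows directly from K and P1. The sketch below assumes, as the text does, that the expected time between events is the reciprocal of the daily event rate K * P1 * P2. The function name `days_between_events` is hypothetical.

```python
K = 120e6        # divides per day, the report's estimate for divide-intensive work
P1 = 1 / 9e9     # probability that a single divide returns reduced precision


def days_between_events(divides_per_day, p1, p2=1.0):
    """Expected days between meaningful inaccuracies: 1 / (K * P1 * P2)."""
    return 1.0 / (divides_per_day * p1 * p2)


# With P2 = 1.0 every reduced precision divide counts as meaningful.
print(days_between_events(K, P1))

# With P2 of the order of 1e-4 the interval stretches to many centuries.
print(days_between_events(K, P1, p2=1e-4) / 365.25, "years")
```

Each factor of ten removed from P2 lengthens the expected interval by the same factor, which is why the rest of the section concentrates on bounding P2.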
6.3.1.1 Estimating P2
The property of a problem (the algorithm along with its data) that relates errors in the output to errors in the input (or errors introduced by numerical computation) is its condition. While the condition can be expressed as a single number for many calculations and can be used in error bounds, for the purposes of this report the condition can be thought of as expressing the sensitivity of the result to the accuracy of the divide operation. It should be noted that the error in the final answer may actually be less than the error introduced in a particular operation in cases where that calculation ultimately turns out to be a minor contributor to the final answer or in cases where the algorithm is self-correcting (e.g. certain iterative schemes or neural net computing).
An experimental approach is used to estimate P2, the probability that a divide inaccuracy will result in a meaningful inaccuracy in the final result. This approach is preferred over an analytical one since a problem's sensitivity to error is highly dependent on the particular data and the location at which the error is introduced. It can be seen in Table 6-2 that those applications characterized by a large number of divides and poor conditioning are largely those that deal in dense or sparse matrix algebra, and in particular those involving exotic modelling techniques (such as the use of shell elements in finite element analysis) or eigenvalue extraction (as used in the calculation of vibration modes in structural analysis). Since the use of Pentium processors in the solution of large dense matrix equations is thought to be rare, we concentrate on sparse matrix problems. In order to capture the most demanding work loads we ran extensive tests on the QA test suites of MSC/NASTRAN™ and ANSYS™. It should be noted that these engineering codes were provided by their vendors for the ongoing purpose of functional and performance testing on Intel-based systems and their use here in no way constitutes a recommendation or endorsement by the MacNeal-Schwendler or Swanson companies.
NASTRAN and ANSYS represent the upper end of engineering analysis packages and both are frequently run on supercomputers in the calculation of stress, vibration modes, fluid flow, magnetic fields, and other engineering calculations characterized by finite element models. While NASTRAN has few licenses on Intel Architecture systems and ANSYS has only a moderate number, the workloads run on these codes represent a worst case scenario for a Pentium processor-based system in engineering use. Those engineering codes in widespread use on Intel Architecture systems (e.g. AutoCAD*) will not place more stress on the floating point performance than these codes. Thus our intent in setting up this test program is to identify any possible problems in sparse matrix computing. If problems are found we will then look to the more plentiful applications on PCs to see if the types of analysis found to be susceptible to problems are performed with those codes.
Given the infrequency of divide inaccuracies, and the likelihood that a single inaccuracy will go unnoticed, it is impractical to run problems on a Pentium processor with the divide flaw and wait for an inaccuracy to show up in the output. In fact, at no point in the testing described here was an actual effect from the divide flaw seen. Since the object of the experiment is to determine the effect on the output of the engineering analysis when inaccuracies occur, we introduce inaccuracies artificially and observe the result. Even this plan has the problem that single inaccuracies introduced at random, and with random bit locations for the inaccuracy, will take orders of magnitude too long to produce statistically significant results.
To get a rough estimate of the size of P2 we introduce multiple inaccuracies into single runs of the codes and extrapolate the results to a single inaccuracy on problems of comparable complexity. The procedure is as follows:
1. Run all tests with 100% of divides at minimum precision (12 good bits to the right of the binary point of the significand)
2. For tests exhibiting meaningful inaccuracies:
a. Determine minimum number of divides (D) and precision (precis) to generate inaccuracy
b. For each test:
p2 > (1/D) * Prob(precis)
3. Overall estimate of P2 > max(p2)
In step 2b above the Prob(precis) is the probability that the divide inaccuracy will be as bad as that precision level. For example if a precision loss in the 13th binary bit (12th bit to the right of the binary point of the significand) in a double precision operation is required to see a meaningful inaccuracy in the result, Prob(13) would be about 1/40.
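Steps 2 and 3 of the procedure reduce to a short calculation once D and Prob(precis) are known for each test. The sketch below uses invented placeholder values: the test names and the (D, Prob) pairs are hypothetical and are not results from the report. Only the value Prob(13) of about 1/40 is taken from the text.

```python
def p2_lower_bound(min_divides, prob_precis):
    """Per-test bound from step 2b: p2 > (1/D) * Prob(precis)."""
    return (1.0 / min_divides) * prob_precis


# Hypothetical screening results, for illustration only:
# test name -> (minimum number of modified divides D, Prob(precis)).
hypothetical_tests = {
    "test_a": (1_000, 1 / 40),   # needs worst-case precision (13th bit)
    "test_b": (200_000, 1.0),    # any reduced precision divide suffices
}

per_test = {
    name: p2_lower_bound(d, prob)
    for name, (d, prob) in hypothetical_tests.items()
}

# Step 3: the overall estimate is the maximum over all tests.
P2_estimate = max(per_test.values())

for name, value in per_test.items():
    print(f"{name}: p2 > {value:.2e}")
print(f"P2 estimate: {P2_estimate:.2e}")
```

Taking the maximum rather than an average keeps the estimate conservative: a single sensitive test sets the overall value.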
Figures 6-2 and 6-3 show typical results for experiments run on problems which exhibit meaningful inaccuracies when run with all divides at minimum precision. In these figures the number of significant digits in the final result is plotted as a function of the portion of divides artificially modified. The three different curves for each figure show results for reducing the precision of the divide results to different levels. For each precision level the transition from meaningful inaccuracy (number of significant digits fewer than three) to meaningless inaccuracy (more than three digits of accuracy) is observed. Each such transition yields an estimate for p2, the probability of meaningful inaccuracy on this problem due to a single random divide precision reduction. We take the largest such estimate to be the value of p2 for this problem, then estimate P2 as the maximum over all tests.
6.3.1.2 Experimental Results: Estimation of P2
Table 6-4 shows a representative sample of results for tests where initial screening with all divides at minimum precision indicated the potential for meaningful inaccuracy in the final result. For those tests the procedure outlined above in section 6.3.1.1 was followed to determine the transition from meaningful to meaningless inaccuracy. Each test yields an extrapolation for p2 and the maximum over all tests is our estimate of P2.
As indicated at the bottom of Table 6-4, the maximum estimate of P2 calculated over all tests run to date is 2.2e-4. A value in this range indicates that a meaningful inaccuracy due to the flaw is expected only one time in about one thousand years on a Pentium processor based system. This puts the probability of such an inaccuracy below that of other errors that could affect a system over its lifetime (see Table 5-1). In this table the MTBF refers to the time between meaningful inaccuracies in the final result.
6.3.2 Impact on Financial applications
Financial engineering applications which use floating point division are implemented (by both users and software distributors) in spreadsheets, in high level languages and through use of statistical software packages. To consider the potential impacts of the flaw for the financial engineer, we will divide the workspace into four categories. The first set is the collection of users performing corporate or marketing analysis oriented calculations. The next set contains the most frequent financial analytics such as present values, annuities, depreciations and basic financial quantities. The last two sets comprise the most intensive computation and mathematical models.
This summary classification of the financial applications and the impact of the flaw is given in Table 6-5.
Table 6-5 Classification of financial applications
Notice that the number of users of each category is inversely proportional to the severity of the impact. The vast majority of users are included in the first category. The last category represents applications which have only recently been transferred to the desktop.
In the following sections we apply the characterization methodology (P1, P2) used in the section on engineering applications to the 4 sets.
6.3.2.1 Values of P2
While we do not provide a detailed analysis of the P2 probability (the probability that the flaw leads to a meaningful inaccuracy) for financial applications here, we will make the following comments. The P2 value for these applications is often either close to 1.0 or 0.0. The former leaves the risk at P1, the latter reduces the risk to zero.
P2 is close to 0.0 when dealing with random number generators, where any random number is as good as another, provided the basic distribution is not changed. Since the inaccuracy happens in only 1 of nine billion divisions, there will be no change to the estimate of the distribution.
Again, P2 is close to 0.0 on simulations which use a large number of paths and perform expected value analysis. An error on one path will not have a significant impact on the final answer. The number of paths is always far less than nine billion, so that more than one error among the paths is very unlikely; and for more than two paths to have errors is prohibitively unlikely. Finally, P2 is 0.0 when the number of significant decimal digits the user needs is less than four.
P2 moves from near 0.0 to near 1.0 as the need for significant decimal digits reaches 15. This is because the inaccuracy seems to be equally likely to occur at each significant digit beyond four. For instance, if the user needs six significant digits, and an error occurs, then (assuming double precision arithmetic), the probability that the inaccuracy was in fourth through sixth significant decimal digits is 3/15 = 0.2.
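The worked example above (six digits needed, 3/15 = 0.2) suggests a simple rule of thumb. The sketch below extrapolates that single example and is an assumption, not a formula given in the report: it treats the inaccuracy as equally likely to start at any of 15 decimal digits and counts the digits from the fourth up to the number the user needs. The function name `p2_from_digits` is hypothetical.

```python
def p2_from_digits(needed_digits, total_digits=15, first_affected=4):
    """Rough P2 for a user who needs `needed_digits` significant decimal digits.

    Assumptions (extrapolated from the text's single worked example):
    the inaccuracy never appears before the 4th significant digit, and the
    probability is the count of affected digits the user cares about
    divided by the 15 digits of double precision.
    """
    if needed_digits < first_affected:
        return 0.0
    affected = min(needed_digits, total_digits) - first_affected + 1
    return affected / total_digits


for digits in (2, 4, 6, 10, 15):
    print(digits, round(p2_from_digits(digits), 3))
```

For six digits this reproduces the 0.2 quoted in the text, and it returns 0.0 below four digits, matching the statement that such users are unaffected.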
Most other times the P2 value will be near 1.0.
6.3.2.2 MTBF estimation
CATEGORY 1
For applications from the 1st set, such as corporate financial analysis and forecasting, marketing analysis, planning and so forth, the likelihood of encountering reduced precision divides is low. This is because typical calculations here are dominated by comparisons and additions. The input-output operations and the time for human conception of the results consume more time than the processor spends performing arithmetic operations. This effect limits the number of divisions that are computed per day to well below what is necessary to have any appreciable probability of experiencing a meaningful inaccuracy. As an example, consider a large budget calculation implemented as a 700x700 cell spreadsheet, which is run an average of a few times a day. This will produce fewer than 10,000 divisions a day (on average); so few divisions that no error is likely to be seen for thousands of years.
CATEGORY 2
In the second set of usages, one of the most frequent calculations is discounting a value to the present, which typically involves an expression such as (c/(1+r)^t). This discount process is generally connected with some method of generating an associated cash flow. The number of divisions is about one-fifth of the total operations (or less) and about equal to the number of exponentiations. In the most extreme case, where the calculation is a simple present value, the 60 MHz Pentium processor running 24 hours per day could produce at most 500 million divisions and exponentiations, resulting in an MTBF of 18 days. In more realistic applications, the number of PV calculations is of order of 1000 or less within the spreadsheet and the spreadsheet is recalculated no more than 100 times a day. This produces at most 500,000 divides a day for a worst case MTBF of more than 50 years. This possibility is considerably less than the chances of a system memory error, which could be equally inaccurate.
CATEGORY 3
The third set is represented by the Black-Scholes and simple binomial models. Black-Scholes solutions generally require approximations to be made for standard normal distributions in order to run them on any desktop computer. These approximations will increase the ratio of divide time to compute time. Divisions, exponentiation, and natural logarithms take about one-fifth of the actual computation time. Models pricing a few thousand options are run at most a few times per hour, representing approximately a million divisions and transcendental computations per day. The MTBF would then be roughly 30 years. In the extreme case where a user does not look at all the results, and continually recalculates the models, the upper bound of calculations (running 24 hours per day at full rate) is about 1 billion divisions per day yielding an MTBF of 9 days. Again, if the accuracy required is less than four digits, then even such maximal use will not produce a meaningful inaccuracy.
Simple Binomial models are usually implemented with a discount computation at each node of the model and two simple integer divisions. The number of divisions and the number of exponentiations are of the same order of magnitude, each being about 1/5th of the total number of operations. For an analysis of a few thousand options a day, MTBF would exceed 30 years.
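The MTBF figures quoted for the first three categories can be cross-checked from the divide rates given in the text. This is a minimal sketch that assumes P2 = 1.0 (every reduced precision divide matters) and P1 = 1 in 9 billion. The function name `mtbf_days` and the labels are hypothetical; the rates are the ones stated above.

```python
P1 = 1 / 9e9  # probability that a single divide returns reduced precision


def mtbf_days(divides_per_day, p1=P1):
    """Mean days between reduced precision divides, assuming P2 = 1.0."""
    return 1.0 / (divides_per_day * p1)


# Divide rates per day as quoted in the text for each usage.
quoted_rates = {
    "Category 1, large budget spreadsheet": 10_000,
    "Category 2, continuous present value": 500e6,
    "Category 2, realistic spreadsheet": 500_000,
    "Category 3, typical option pricing": 1e6,
    "Category 3, continuous recalculation": 1e9,
}

for label, rate in quoted_rates.items():
    days = mtbf_days(rate)
    print(f"{label}: {days:,.0f} days ({days / 365.25:,.1f} years)")
```

The two continuous-use cases come out at 18 and 9 days, as stated in the text. The lower-rate cases come out in decades or millennia, the same order as the rounded figures quoted, which may rest on slightly different inputs.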
CATEGORY 4
The last category of applications focuses on the valuation of more complicated derivatives and the use of simulation. Representative applications for this set include non-simple binomial models, as well as trinomial and finite difference methods. Simulation analysis usually employs Monte Carlo techniques to arrive at valuations for complex securities with large numbers of embedded options such as CMOs (Collateralized Mortgage-backed Obligations).
More complicated binomial models, such as those with non-stationary dividends, and other valuation techniques such as non-recombining trees and finite difference methods can severely increase the number of computational steps performed in a valuation. In the case of non-simple binomial models, for example, realistic problems might have an MTBF of three years or less. However, while finite difference problems also use significant numbers of divides, real applications of these techniques involve extensive non-division operations in order to implement useful algorithms. This can greatly reduce the time spent doing divisions, resulting in very low divisions per day. These methods can also be iterative, so that an inaccuracy on one iteration will disappear in following iterations.
When simulation analysis is used for valuation, the number of cash flows valued must be relatively large. This is significant since the extremely large number of discount operations greatly increases the rate of divisions per day. For those circumstances where continuous use of a desktop platform is being made to solve these computationally intensive applications, the MTBF may well be less than a week. However, this may be ameliorated by the P2 factor as discussed above.
In conclusion, the large majority of financial users will not experience any problems from the flaw. The problem may manifest itself significantly in those programs for valuing the most complicated financial instruments. Even in this case, if the valuation is statistically based, single division inaccuracies may be harmless. The user should consider the number of divisions performed per day and the context in which the resulting quotients are used.
6.4 Impact on Server Applications
Server applications do not use the relevant floating point instructions. The flaw has no impact on them.
This applies to:
