The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

II. STATISTICAL MODELS AND ANALYSES IN AUDITING

1. The Beginnings

The field of accounting encompasses a number of subdisciplines. Among these, two important ones are financial accounting and auditing. Financial accounting is concerned with the collection of data about the economic activities of a given firm and the summarizing and reporting of them in the form of financial statements. Auditing, on the other hand, refers to the independent verification of the fairness of these financial statements. The auditor collects data that is useful for verification from several sources and by different means. It is very evident that the acquisition of reliable audit information at low cost is essential to economical and efficient auditing. There are two main types of audit tests for which the acquisition of information can profitably make use of statistical sampling. Firstly, an auditor may require evidence to verify that the accounting treatments of numerous individual transactions comply with prescribed procedures for internal control. Secondly, audit evidence may be required to verify that reported monetary balances of large numbers of individual items are not materially misstated. The first audit test, collecting data to determine the rate of procedural errors of a population of transactions, is called a compliance test. The second, collecting data for evaluating the aggregate monetary error in the stated balance, is called a substantive test of details. The auditor considers an error to be material if its magnitude "is such that it is probable that the judgement of a reasonable person relying upon the report would have been changed or influenced by the inclusion or correction of the item" (Financial Accounting Standards Board, 1980). Current auditing standards set by the American Institute of Certified Public Accountants (AICPA) do not mandate the use of statistical sampling when conducting audit tests (AICPA, 1981 & 1983).
However, the merits of random sampling as the means to obtain, at relatively low cost, reliable approximations to the characteristics of a large group of entries were known to accountants as early as 1933 (Carman, 1933). The early applications were apparently limited to compliance tests (Neter, 1986). The statistical problems that arise when analyzing the type of nonstandard mixture of distributions that is the focus of this report did not surface in auditing until the late 1950s. At about that time, Kenneth Stringer began to investigate the practicality of incorporating statistical sampling into the audit practices of his firm, Deloitte, Haskins & Sells. It was not until 1963 that some results of his studies were communicated to the statistical profession. The occasion was a meeting of the American

Statistical Association (Stringer, 1963 & 1979). Before summarizing Stringer's main conclusions, we describe the context as follows. An item in an audit sample produces two pieces of information, namely, the book (recorded) amount and the audited (correct) amount. The difference between the two is called the error amount. The percentage of items in error may be small in an accounting population. In an audit sample, it is not uncommon to observe only a few items with errors. An audit sample may not yield any non-zero error amounts. For analyses of such data, in which most observations are zero, the classical interval estimation of the total error amount based on the asymptotic normality of the sampling distribution is not reliable. Also, when the sample contains no items in error, the estimated standard deviation of the estimator of the total error amount becomes zero. Alternatively, one could use the sample mean of the audited amount to estimate the mean audited amount for the population. The estimate of the mean is then multiplied by the known number of items in the population to estimate the population total. In the audit profession, this method is referred to as mean-per-unit estimation (AICPA, 1983). Since observations are audited amounts, the standard deviation of this estimator can be estimated even when all items in the sample are error-free. However, because of the large variance of the audited amount that may arise in simple random sampling, mean-per-unit estimation is imprecise. More fundamentally, however, when the sample does not contain any item in error, the difference between the estimate of the total audited amount and the book balance must be interpreted as sampling error. The auditor thus evaluates that the book amount does not contain any material error. This is an important point for the auditor.
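The mean-per-unit estimator described above is simple to compute; the sketch below uses hypothetical amounts, and the function name is our own:

```python
import statistics

def mean_per_unit(audited_sample, N):
    """Mean-per-unit estimate of the total audited amount: the sample
    mean audited amount times the known number of items N in the
    population.  The standard error is estimable even when no sampled
    item is in error, since it depends only on the audited amounts."""
    mean = statistics.mean(audited_sample)
    se = N * statistics.stdev(audited_sample) / len(audited_sample) ** 0.5
    return N * mean, se

# Hypothetical audited amounts for 4 sampled items from a population
# of N = 1000 line items.
total, se = mean_per_unit([120.0, 80.0, 100.0, 100.0], N=1000)
print(total)  # 100000.0
```

Note how the estimated standard error is positive even though none of the sampled amounts need be in error, which is the property contrasted with the difference estimator in the text.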
To quote from Stringer (1963) concerning statistical estimates ("evaluations") of total error: Assuming a population with no error in it, each of the possible distinct samples of a given size that could be selected from it would result in a different estimate and precision limit under this approach; however, from the viewpoint of the auditor, all samples which include no errors should result in identical evaluations. Stringer then reported in the same presentation that he, in collaboration with Frederick F. Stephan of Princeton University, had developed a new statistical procedure for his firm's use in auditing that did not depend on the normal approximation of the sampling distribution and that could still provide a reasonable inference for the population error amount when all items in the sample are error-free. This sampling plan is apparently the original implementation of the now widely practised dollar (or monetary) unit sampling and is one of the first workable solutions

proposed for the nonstandard mixtures problem in accounting. However, as is studied later in this report, the method assumes that errors are overstatements with the maximum size of an error of an item equal to its book amount. Another solution, using a similar procedure, was devised by van Heerden (1961). His work, however, was slow to become known within the American accounting profession. In the public sector, statistical sampling has also become an integral part of audit tools in the Internal Revenue Service (IRS) since the issuance of the 1972 memo by their Chief Counsel (IRS, 1972 & 1975). In a tax examination the audit agent uses statistical sampling of individual items to estimate the adjustment, if necessary, for an aggregate expense reported in the tax return. Statistical auditing may also be utilized by other governmental agencies. For example, the Office of the Inspector General of the Department of Health and Human Services investigates compliance of the cost report of a state to the Medicaid policy by using statistical sampling of items. In these cases, a large proportion of items in an audit sample requires no adjustment, i.e., most sample items are allowable deductions. Since an individual item adjustment is seldom negative, the audit data for estimation of the total adjustment is a mixture of a large percentage of zeros and a small percentage of positive numbers. Thus, the mixture model and related statistical problems that are important to accounting firms in auditing also arise in other auditing contexts such as those associated with IRS tax examinations. Significant differences also exist in these applications, however, and these will be stressed later. For concise accounts of the problems of statistical auditing one is referred to Knight (1979), Smith (1979) and Neter (1986); the last reference also includes recent developments.
Leslie, Teitlebaum and Anderson (1980) also provide an annotated bibliography that portrays the historical development of the subject through 1979. In the sections which follow, however, we provide a comprehensive survey of the research efforts that have contributed to the identification and better understanding of problems in statistical auditing. We include brief descriptions of many of the solutions that have been proposed for these problems along with their limitations. It will be noticed that the solutions thus far proposed are mainly directed toward the special need for good upper bounds on errors when errors are overstatements. This is an important and common audit problem for accounting firms, but in the case of tax examinations, though the mixture distribution is similar, the interest is in the study of lower bounds. Thus in statistical auditing, whether in the private or public sector, the investigator's interest is usually concerned with one-sided problems, i.e., of an upper or a lower bound, rather than two-sided problems as currently stressed in many texts.

The next section provides the definitions and notations that are used. Then in Sections 3 through 7 we present various methodologies that have been provided in the literature. A numerical example is given in the last section to illustrate some of the alternative procedures.

2. Definitions and Notations

An account, such as accounts receivable or inventory, is a population of individual accounts. To distinguish the use of the word 'account' in the former sense from the latter, we define the constituent individual accounts, when used as audit units, as line items. Let Y_i and X_i, the latter not usually known for all values of i, denote the book (recorded) amount and the audited (correct) amount, respectively, for the i-th line item of an account of N line items. The book and audited balances of the account are, respectively,

Y = Σ_{i=1}^{N} Y_i ,     (2.1)

called the population book amount, and

X = Σ_{i=1}^{N} X_i ,     (2.2)

called the population audited amount. The error amount of the i-th item is defined to be

D_i = Y_i - X_i .     (2.3)

When D_i > 0, we call it an overstatement and, when D_i < 0, an understatement. The total error amount of the population is

D = Σ_{i=1}^{N} D_i = Y - X ,     (2.4)

and the tainting of the i-th item is

T_i = D_i / Y_i ,     (2.5)

so that

D = Σ_{i=1}^{N} T_i Y_i .     (2.6)

As emphasized in Section 1, a large proportion of items in an audit population will likely be error-free, so that D_i = 0 for many values of i. Similar populations are common in many disciplines as discussed in Chapter I. Aitchison (1955) is the first to consider an inference problem for such a population. Following his approach, the error d of an item randomly chosen from an accounting population may be modeled as
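A toy illustration of these definitions (the amounts are hypothetical):

```python
# Hypothetical account of N = 4 line items: book amounts Y_i and
# audited amounts X_i.
Y = [100.0, 250.0, 50.0, 400.0]
X = [100.0, 200.0, 50.0, 400.0]

Y_total = sum(Y)                          # population book amount (2.1)
X_total = sum(X)                          # population audited amount (2.2)
D = [y - x for y, x in zip(Y, X)]         # error amounts D_i (2.3)
D_total = sum(D)                          # total error amount (2.4)
T = [d / y for d, y in zip(D, Y)]         # taintings T_i = D_i / Y_i (2.5)

print(D_total)                            # 50.0
print(sum(t * y for t, y in zip(T, Y)))   # equals D_total, per (2.6)
```

Here only one of the four items is in error, mirroring the preponderance of zero error amounts emphasized in the text.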

follows:

d = z with probability p,
    0 with probability (1 - p),     (2.7)

where p is the proportion of items with errors in the population and z ≠ 0 is a random variable representing the error amount. z may depend on the book amount. The nonstandard mixture problem that is the focus of this report is the problem of obtaining confidence bounds for the population total error D when sampling from the model (2.7). A useful sampling design for statistical auditing is to select items without replacement with probability proportional to book values. This sampling design can be modeled in terms of use of individual dollars of the total book amount as sampling units and is commonly referred to as Dollar Unit Sampling (DUS) or Monetary Unit Sampling (MUS) (Anderson and Teitlebaum, 1973; Roberts, 1978; Leslie, Teitlebaum and Anderson, 1980). The book amounts of the N items are successively cumulated to a total of Y dollars. One may then choose systematically n dollar units at fixed intervals of I (= Y/n) dollars. The items with book amounts exceeding I dollars, and hence items that are certain to be sampled, are separately examined. Items with a zero book amount should also be examined separately as they will not be selected. If a selected dollar unit falls in the i-th item, the tainting T_i (= D_i / Y_i) of the item is recorded. Namely, every dollar unit observation is the tainting of the item that the unit falls in. The model (2.7) may then be applied for DUS by considering d as an independent observation of the tainting of a dollar unit. p is, then, the probability that a dollar unit is in error. Thus, (2.7) can be used for sampling individual items or individual dollars. In the former, d stands for the error amount of an item, and in the latter for the tainting of a dollar unit. In the next section we present some results from several empirical studies to illustrate values of p and the distribution of z, for both line item sampling and DUS designs.
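The systematic dollar-unit selection just described can be sketched as follows. The function name and the fixed fractional starting point are our own illustrative choices; in practice the starting dollar is drawn at random within the first interval:

```python
def dollar_unit_sample(book_amounts, n, start=0.5):
    """Systematic DUS/MUS selection: cumulate the book amounts to a
    total of Y dollars and pick n dollar units at fixed intervals of
    I = Y/n dollars, returning the index of the line item each
    selected dollar falls in.  Items with book amounts exceeding I are
    certain to be selected and would be examined separately, as would
    zero-amount items, which can never be selected."""
    total = sum(book_amounts)
    interval = total / n
    picks = [(start + k) * interval for k in range(n)]
    selected, cum, i = [], 0.0, 0
    for p in picks:
        # advance to the item containing dollar unit p
        while cum + book_amounts[i] <= p:
            cum += book_amounts[i]
            i += 1
        selected.append(i)
    return selected

print(dollar_unit_sample([100.0, 300.0, 50.0, 550.0], 4))  # [1, 1, 3, 3]
```

In the example, items 1 and 3 carry most of the book value and are hit twice each, which is exactly the probability-proportional-to-size behavior the design is meant to produce.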

3. Error Distributions of Audit Populations - Empirical Evidence

Why do errors occur? Hylas and Ashton (1982) conducted a survey of the audit practices of a large accounting firm in order to investigate the kinds of accounts that are likely to show errors, to obtain alternative audit leads for detection of these errors, and to attempt to identify their apparent causes. Not surprisingly, their study shows that unintentional human error is the most likely cause of recording errors. The remainder of this section reports the results of several empirical studies about actual values of the error rate p and actual distributions of the non-zero error z in the model (2.7). The sample audit populations are from a few large accounting firms and each contains a relatively large number of errors. Therefore, the conclusions may not represent typical audit situations.

A) Line item errors. Data sets supplied by a large accounting firm were studied by Ramage, Krieger and Spero (1979) and again by Johnson, Leitch and Neter (1981). The orientations of the two studies differ in some important respects. The latter provides more comprehensive information about the error amount distributions of the given data sets. It should be noted that the data sets are not chosen randomly. Instead, they have been selected because each data set contains a number of errors large enough to yield a reasonably smooth picture of the error distribution. According to the study by Johnson et al. (1981), the median error rate of 55 accounts receivable data sets is .024 (the quartiles are Q1 = .004 and Q3 = .089). On the other hand, the median error rate of 26 inventory audits is .154 (Q1 = .073 and Q3 = .399). Thus the error amount distribution of a typical accounts receivable in their study has a mass .98 at zero. A random sample of 100 items from such a distribution will then contain, on the average, only two non-zero observations.
On the other hand, the error amount distribution of a typical inventory in their study has a mass .85 at the origin, and sampling of 100 items from such a distribution will contain, on the average, 15 non-zero observations. The items with larger book amounts are more likely to be in error than those with smaller book amounts. The average error amount, however, does not appear to be related to the book amount. On the other hand, the standard deviation of the error amount tends to increase with book amount. Ham, Losell and Smieliauskas (1985) conducted a similar study using data sets provided by another accounting firm. Besides accounts receivable and inventory, this study also included accounts payable, purchases and sales. Four error rates are defined and reported for each category of accounts. It should be noted that their study defines error broadly, since they include errors that do not accompany changes in recorded amounts.

The distribution of non-zero error amounts again differs substantially between receivables and inventory. The error amounts for receivables are likely to be overstated and their distribution positively skewed. On the other hand, errors for inventory include both overstatements and understatements with about equal frequency. However, for both account categories, the distributions contain outliers. Graphs in Figure 1 are taken from Johnson et al. (1981) and illustrate forms of the error amount distributions of typical receivables and inventory audit data. The figures show the non-normality of error distributions. Similar conclusions are also reached by Ham et al. (1985). Their study also reports the distribution of error amounts for accounts payable and purchases. The error amounts tend to be understatements for these categories. Again, the shapes of the distributions are not normal.

B) Dollar unit taintings. When items are chosen with probability proportional to book amounts, the relevant error amount distribution is the distribution of taintings weighted by the book amount. Equivalently, it is the distribution of dollar unit taintings. Table 1 tabulates an example. Neter, Johnson and Leitch (1985) report the dollar unit tainting distributions of the same audit data that they analyzed previously. The median error rate of receivables is .040 for dollar units and is higher than that of line items (.024). Similarly, the median dollar unit error rate for inventory is .186 (.154 for line items). The reason is, they conclude, that the line item error rate tends to be higher for items with larger book amounts for both categories. Since the average line item error amount is not related to the book amount, the dollar unit tainting tends to be smaller for items with larger book amounts. Consequently, the distribution of dollar unit tainting tends to be concentrated around the origin. Some accounts receivable have, however, a J-shaped dollar unit taint distribution with negative skewness.
One significant characteristic of the dollar unit tainting distribution that is common for many accounts receivable is the existence of a mass at 1.00, indicating that a significant proportion of these items has a 100% overstatement error. Such an error could arise when, for example, an account has been paid in full but the transaction has not been recorded. A standard parametric distribution such as normal, exponential, gamma, beta and so on, alone may not be satisfactory for modeling such distributions. Figure 2 gives the graphs of the dollar unit tainting distributions for the same audit data used in Figure 1. Note that the distribution of taintings can be skewed when that of error amounts is not. Note also the existence of an appreciable mass at 1 in the accounts receivable example. The situation here may be viewed as a nonstandard mixture in which the discrete part has masses at two points.
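A nonstandard mixture of this kind, with atoms at both 0 and 1, is easy to simulate; all parameter values below are hypothetical and the continuous component is an arbitrary beta density chosen only for illustration:

```python
import random

def simulate_dus_taints(n, p_error=0.04, p_full=0.3, seed=1):
    """Simulate dollar-unit taints from a nonstandard mixture:
    mass (1 - p_error) at 0; conditional on an error, mass p_full at a
    100% overstatement (taint exactly 1.0), with the remaining errors
    drawn from a continuous distribution on (0, 1)."""
    rng = random.Random(seed)
    taints = []
    for _ in range(n):
        if rng.random() >= p_error:
            taints.append(0.0)                        # error-free dollar unit
        elif rng.random() < p_full:
            taints.append(1.0)                        # fully overstated item
        else:
            taints.append(rng.betavariate(0.7, 2.0))  # partial taint
    return taints

taints = simulate_dus_taints(10_000)
# most observations are zero, and a visible atom sits at exactly 1.0
```

No single standard parametric family reproduces both atoms and the continuous part, which is why the mixture formulation is needed.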

Figure 1. Examples of Distribution of Error Amounts. (A) Accounts Receivable (1060 observations). [Histogram of the distribution of error amounts in accounts receivable; horizontal axis: error amounts, -250 to 250.]

Figure 1 (continued). (B) Inventory (1139 observations). [Histogram of the distribution of error amounts; horizontal axis: error amounts, -250 to 250.] Source: Figures 1 and 2 of Johnson, Leitch and Neter (1981).

Figure 2. Examples of Distribution of Dollar Unit Tainting. (A) Accounts Receivable. [Histogram; horizontal axis: dollar unit taint, -1.5 to 1.0.]

and

E(p_i) = p̄_i ,     (7.12a)

Var(p_i) = p̄_i (1 - p̄_i) / (K + 1) .     (7.12b)

Let w = (w_0, ..., w_100) with Σ w_i = n be the sample data of n items. w is distributed as a multinomial distribution (n, p) when sampling is with replacement (if sampling is without replacement, approximately). Since the Dirichlet prior distribution is conjugate with the multinomial sampling model, the posterior distribution of p is again a Dirichlet distribution with the parameter (K p̄ + w). We may define

K' = K + n ,     (7.13a)

p̂_i = w_i / n ,     (7.13b)

p'_i = (K p̄_i + n p̂_i) / K' ,  i = 0, 1, ..., 100.     (7.13c)

Then the posterior distribution of p is Dirichlet (K' p'), where p' = (p'_0, ..., p'_100). By the definition of μ_D:

μ_D = Σ_{i=1}^{100} (i/100) p_i ,     (7.14)

the posterior distribution of μ_D is derived as a linear combination of the p'_i. It can be shown that

E(μ_D) = Σ_{i=1}^{100} (i/100) p'_i , and     (7.15a)

Var(μ_D) = (1/(K'+1)) { Σ_{i=1}^{100} (i/100)^2 p'_i - ( Σ_{i=1}^{100} (i/100) p'_i )^2 } .     (7.15b)

The exact distribution of μ_D is complicated and therefore is approximated by a Beta distribution having the same mean and variance. Using simulation, Tsui et al. suggest that K = 5, p̄_0 = .8, p̄_100 = .101 and the remaining 99 p̄_i's being .001 be used as the prior setting for their upper bound to perform well under repeated sampling for a wide variety of tainting distributions.
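The posterior updating in (7.13)-(7.15) is a few lines of code; the sketch below (function name ours) reproduces the posterior mean reported later in the numerical example for an error-free sample under the suggested prior:

```python
def dirichlet_posterior_moments(K, p_prior, w):
    """Posterior mean and variance of the mean taint mu_D under the
    Dirichlet-multinomial model of (7.12)-(7.15).  p_prior and w are
    indexed by taint category i = 0, 1, ..., 100 (taint value i/100)."""
    n = sum(w)
    Kp = K + n                                             # K' of (7.13a)
    p_post = [(K * p_prior[i] + w[i]) / Kp for i in range(101)]
    mean = sum((i / 100) * p_post[i] for i in range(1, 101))
    second = sum((i / 100) ** 2 * p_post[i] for i in range(1, 101))
    var = (second - mean ** 2) / (Kp + 1)
    return mean, var

# Prior suggested by Tsui et al.: K = 5, mass .8 at taint 0, .101 at
# taint 1.00, .001 at each of the other 99 categories; the sample is
# n = 100 error-free dollar units.
p_prior = [0.8] + [0.001] * 99 + [0.101]
w = [100] + [0] * 100
mean, var = dirichlet_posterior_moments(5.0, p_prior, w)
print(round(mean, 6))  # 0.007167
```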

McCray (1984) suggests another non-parametric Bayesian approach using the multinomial distribution as the data generating model. In his model, μ_D has been discretized, involving a number of categories, say μ_Dj, j = 1, ..., N_1. The auditor is to provide his assessment of the prior distribution by assigning probabilities q_j to the values μ_Dj. Then the posterior distribution of μ_D is determined to be

Prob ( μ_D = μ_Dj | w ) = q_j L(w | μ_Dj) / Σ_k q_k L(w | μ_Dk) ,     (7.16)

where

L(w | μ_Dj) = max Π_i p_i^{w_i} ,     (7.17)

in which the maximum is taken over all probabilities { p_i } satisfying

Σ_i t_i p_i = μ_Dj ,     (7.18)

with t_i the taint value of the i-th category. It should be noted that the two nonparametric models introduced above can incorporate negative taintings; that is, the auditor defines any finite lower and upper limits for tainting and divides the sample space into a finite number of categories. Simulation studies have been performed to compare performances of these Bayesian bounds with the procedures described in earlier sections. Dworin and Grimlund (1986) compare the performance of their moment bound with that of McCray's procedure. Several Bayesian and non-Bayesian procedures are also compared in Smieliauskas (1986). Grimlund and Felix (1987) provide results of an extensive simulation study that compares the long run performances of the following bounds: the Bayesian bounds with normal error distribution as discussed in A) above, the Cox and Snell bound as discussed in C), the bound of Tsui et al. as discussed in D), and the moment bound discussed in Section 6. Recently, Tamura (1988) has proposed a nonparametric Bayesian model using Ferguson's Dirichlet process to incorporate the auditor's prior prediction of the conditional distribution of the error. It is hypothesized that the auditor cannot predict the exact form of the error distribution, but is able to describe the expected form. Let F_0(z) be the expected distribution function of z representing the auditor's best prior prediction. The auditor may use any standard parametric model for F_0.
Alternatively, F_0 may be based directly on past data. The auditor assigns a finite weight α_0 to indicate his uncertainty about the prediction. Then the auditor's

prior prediction is defined by the Dirichlet process with the parameter

α(z) = α_0 F_0(z) .     (7.19)

This means that Prob(z ≤ z') is distributed according to the beta distribution Beta( α(z'), α(∞) - α(z') ). The posterior prediction given m observations on z, say z = (z_1, ..., z_m), is then defined by the Dirichlet process with the parameter

α(z | z) = (α_0 + m) { W_m F_0 + (1 - W_m) F̂_m }(z) ,     (7.20)

where W_m = α_0 / (α_0 + m) and F̂_m(z) is the empirical distribution function of z. The distribution function of the mean μ of z is given by

G(t) = Prob(μ < t) = Prob( T(t) < 0 ) ,     (7.21)

where the characteristic function of T(t) is

φ_t(u) = exp[ - ∫ log{ 1 - iu(t - v) } dα(v) ] .     (7.22)

The distribution of μ is obtained by numerical inversion of (7.22). The distribution function of the mean tainting μ_D is, then, given by

H(d) = Prob(μ_D < d) = Prob(p μ < d) = E( G(d/p) | p ) ,     (7.23)

where the expectation is over p. This integration can be done numerically. In this work, a beta distribution is proposed to model p.
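The posterior predictive distribution in (7.20) is simply a weighted mixture of the prior guess and the empirical distribution; the sketch below illustrates this (the uniform F_0, the value of α_0, and the three observed taints are all hypothetical):

```python
def dp_posterior_predictive(z_obs, F0, alpha0):
    """Posterior predictive CDF under a Dirichlet process prior, per
    (7.20): weight W_m = alpha0/(alpha0 + m) on the prior guess F0 and
    1 - W_m on the empirical distribution of the m observed taints."""
    m = len(z_obs)
    w = alpha0 / (alpha0 + m)

    def F(z):
        emp = sum(1 for zi in z_obs if zi <= z) / m
        return w * F0(z) + (1 - w) * emp
    return F

# Hypothetical prior guess: taints uniform on (0, 1]; prior weight
# alpha0 = 2; m = 3 observed taints.
F = dp_posterior_predictive([0.25, 0.40, 1.0],
                            F0=lambda z: min(max(z, 0.0), 1.0),
                            alpha0=2.0)
print(round(F(0.5), 6))  # 0.6, i.e. 0.4*0.5 + 0.6*(2/3)
```

As m grows, W_m shrinks and the prediction is dominated by the data, which is the sense in which α_0 encodes the auditor's confidence in F_0.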

8. Numerical Examples

In Sections 5 through 7 various methods for setting a confidence bound for the accounting population error were described. They differ from the classical methods of Section 4 in the sense that these methods do not assume that the sampling distributions of their estimators are normal. Among these new developments, we illustrate in this section the computation of the following upper bounds for the total population error: the Stringer bound, the multinomial bound, parametric bounds using the power function, and the moment bound. In addition, computation of two Bayesian models developed by Cox and Snell and Tsui et al. will also be illustrated. Software for computing all but one of these bounds can be developed easily. The exception is the multinomial bound, which requires extensive programming unless the number of errors in the sample is either 0 or 1. These methods are designed primarily for setting an upper bound of an accounting population error contaminated by overstatements in individual items. The maximum size of the error amount of an item is assumed not to exceed its book amount. These methods also assume DUS. Under this sampling design the total population error amount is equal to the known book amount Y times the mean tainting per dollar unit μ_D, i.e., D = Y μ_D. We will, therefore, demonstrate the computation of a .95 upper bound for μ_D using each method. The data used for these illustrations are hypothetical. Our main objectives are to provide some comparisons of bounds using the same audit data and also to provide numerical checks for anyone who wishes to develop software for some of the bounds illustrated in this section.

A) No errors in the sample. When there are no errors in a sample of n dollar units, the Stringer, multinomial, and power function bounds are identical and are given by the .95 upper bound for the population error rate p. The bound is therefore directly computed by

p_u(0; .95) = 1 - .05^{1/n}     (8.1)

using the Binomial distribution.
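Equation (8.1) can be checked directly (the function name is ours):

```python
def zero_error_upper_bound(n, conf=0.95):
    """Upper confidence bound (8.1) for the error rate p when a sample
    of n dollar units contains no errors: the largest p for which
    observing zero errors still has probability (1 - p)^n >= 1 - conf."""
    return 1.0 - (1.0 - conf) ** (1.0 / n)

print(round(zero_error_upper_bound(100), 4))  # 0.0295
print(3.0 / 100)  # 0.03, the Poisson approximation often used in practice
```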
For n = 100, p_u(0; .95) = .0295. In practice, the Poisson approximation of 3.0/n is often used. The computation of the moment bound is more involved but gives a very similar result. For Bayesian bounds, the value of a .95 confidence bound depends on the choice of the prior about the error distribution. Using extensive simulation, Neter and Godfrey (1985) discovered that for certain priors the Cox and Snell bound demonstrates a desirable relative frequency behavior under repeated sampling. One such setting is to use the following values for the mean and the standard deviation for the gamma prior of p

and μ_z, respectively: p_0 = .10, σ_p = .10, μ_0 = .40, and σ_μ = .20. These can be related to the parameters a and b in (7.11) as follows:

a = (p_0 / σ_p)^2 ,     (8.2a)

b = (μ_0 / σ_μ)^2 + 2.0 .     (8.2b)

Thus for no errors in the sample, i.e., m = 0, using the above prior values, we compute a = (.10/.10)^2 = 1 and b = (.40/.20)^2 + 2.0 = 6. The degrees of freedom for the F distribution are 2(m+a) and 2(m+b), so for m = 0 they are 2 and 12, respectively. Since the 95 percentile of F_{2,12} is 3.89, and the coefficient, when n = 100, is

( m z̄ + (b-1) μ_0 ) / ( n + a/p_0 ) × (m+a)/(m+b) = ( 0 + 5(.40) ) / ( 100 + 1.0/.10 ) × (1/6) = .00303 ,

the 95% Cox and Snell upper bound is .00303 × 3.89 = .01177. For another Bayesian bound proposed by Tsui et al. we use the prior given in Section 7, namely, the Dirichlet prior with parameters K = 5.0, p̄_0 = .8, p̄_100 = .101, and p̄_i = .001 for i = 1, ..., 99. Given no error in a sample of 100 dollar unit observations, the posterior values for these parameters are K' = K + n = 105, and p'_0 = (K p̄_0 + w_0)/K' = (5(.8) + 100)/105 = .99048. Similarly, p'_100 = 5(.101)/105 = .00481, and p'_i = 5(.001)/105 = .00004762 for i = 1, ..., 99. The expected value for the posterior μ_D is then

E(μ_D) = ( (1/100) + (2/100) + ... + (99/100) )(.00004762) + 1.00(.00481) = .007167.

To obtain Var(μ_D), we compute Σ_{i=1}^{100} (i/100)^2 p'_i = .0063731, so that Var(μ_D) = { .0063731 - E(μ_D)^2 } / (K'+1) = .00005964. The posterior distribution is, then, approximated by the Beta distribution having the expected value and the variance computed above. The two parameters α and β of the approximating Beta distribution B(α, β) are

α = E(μ_D) [ E(μ_D){1 - E(μ_D)} / Var(μ_D) - 1 ] = .848

and

β = {1 - E(μ_D)} [ E(μ_D){1 - E(μ_D)} / Var(μ_D) - 1 ] = 117.46.

The upper bound is then given by the 95 percentile of the Beta distribution with parameters .848 and 117.46, which is .0227.

B) One error in the sample. When the DUS audit data contain one error, each method produces a different result. First of all, for computation of the Stringer bound, we determine a .95 upper bound for p, p_u(m, .95), for m = 0 and 1. Software is available for computing these values (e.g., BELBIN in the International Mathematical and Statistical Libraries (IMSL)). We compute p_u(0, .95) = .0295 and p_u(1, .95) = .0466. Suppose that the observed tainting is z = .25. Then a .95 Stringer bound is

p_u(0, .95) + z { p_u(1, .95) - p_u(0, .95) } = .0295 + .25(.0466 - .0295) = .0338.

Second, the multinomial bound has an explicit solution for one error. It is convenient to express the observed tainting in cents, so set t = 100z = 25. Denote also a .95 lower bound for p as p_l(m, .95) when a sample of n observations contains m errors. Then a .95 multinomial bound for m = 1 is given by ( t p̂_t + 100 p̂_100 ) / 100, where p̂_t and p̂_100 are determined as follows. Let

p̂_0 = max { [ .05 (100 - t)(n - 1) / ( t + 100(n - 1) ) ]^{1/n} , p_l(n-1, .95) } .     (8.3)

Then

p̂_t = (1/n) { .05 / p̂_0^{n-1} - p̂_0 }     (8.4)

and

p̂_100 = 1 - p̂_0 - p̂_t .     (8.5)

To illustrate the above computation, using t = 25 and n = 100, we compute that p_l(99, .95) = .9534 and

[ .05 (100 - 25)(100 - 1) / ( 25 + 100(100 - 1) ) ]^{1/100} = .96767,

so that p̂_0 = .96767. Then by (8.4),

p̂_t = (1/100) { .05 / .96767^{99} - .96767 } = .00326.

Hence, p̂_100 = 1.0 - .96767 - .00326 = .0291. A .95 multinomial upper bound, when m = 1, is then .25(.00326) + .0291 = .02988. Third, we discuss computation of the parametric bound using the power function for modeling the distribution of tainting. The density of z is f(z) = θ z^{θ-1} for 0 < z ≤ 1.
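The one-error multinomial computation of (8.3)-(8.5) can be sketched as follows; the function name is ours, and scipy's beta quantile supplies the Clopper-Pearson lower bound p_l:

```python
from scipy.stats import beta

def multinomial_bound_one_error(t, n, conf=0.95):
    """Upper confidence bound for the mean taint per dollar unit when a
    DUS sample of n dollar units contains one error of taint t cents,
    following (8.3)-(8.5)."""
    a = 1.0 - conf
    # (8.3): p0-hat is the larger of the interior stationary point and
    # the Clopper-Pearson lower bound p_l(n-1, conf)
    interior = (a * (100 - t) * (n - 1) / (t + 100 * (n - 1))) ** (1.0 / n)
    p_lower = beta.ppf(a, n - 1, 2)
    p0 = max(interior, p_lower)
    pt = (a / p0 ** (n - 1) - p0) / n          # (8.4)
    p100 = 1.0 - p0 - pt                       # (8.5)
    return (t * pt + 100 * p100) / 100

print(round(multinomial_bound_one_error(25, 100), 4))  # 0.0299
```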
A random sample of size n from the distribution (8.9) and (8.10) is called the bootstrap sample. Denote μ*_D as the value of μ̂_D computed from a single bootstrap sample. The distribution of μ*_D under sampling from (8.9) and (8.10) is the bootstrap distribution of μ*_D. The 95 percentile of the bootstrap distribution is used to set a bound for μ_D. To approximate the bootstrap sampling distribution, we may use simulation. Let B be the number of independent bootstrap samples. Then an estimate of a .95 upper bound is U_B such that #{ μ*_D ≤ U_B } / B = .95. The moment bound, in turn, approximates the distribution of the sample mean tainting by a three-parameter gamma distribution with shape A, scale B and location G, whose density is positive for A > 0, B > 0 and x > G. The method of moments is used to estimate these parameters. Let m_i, i = 1, 2 and 3, be the sample mean and the second and third central moments. Then the moment estimators are:

A = 4 m_2^3 / m_3^2 ,     (8.12)

B = (1/2) m_3 / m_2 ,     (8.13)

G = m_1 - 2 m_2^2 / m_3 .     (8.14)

For computation of the m_i of the sample mean taintings a number of heuristic arguments are introduced. First of all, we compute the average tainting z̄ = .325 of the two observations. Suppose that the population audited is a population of accounts receivables. Then we compute, without any statistical explanation being given, the third data point t*:

t* = .81 [ 1 - .667 tanh(10 z̄) ] [ 1 + .667 tanh(m/10) ] = .3071.

The term in the second pair of brackets will not be used when the population is inventory. t* is so constructed that when there is no error in a sample, the upper bound is very close to the Stringer bound. Using, thus, the data points (two observed and one constructed), the first three noncentral moments are computed for z, i.e., the tainting of items in error. They are:

v_{z,1} = (.25 + .40 + .3071)/3 = .31903,

v_{z,2} = (.25^2 + .40^2 + .3071^2)/3 = .1056, and

v_{z,3} = (.25^3 + .40^3 + .3071^3)/3 = .03619.

The noncentral moments of d are simply p times the noncentral moments of z. Using well known properties of moments, the population central, second and third moments can then be derived from the noncentral moments. These population central moments are used to determine the three noncentral moments of the sample mean. Throughout these steps the error rate p is treated as a nuisance parameter but at this stage is integrated out using the normalized likelihood function of p. Then the noncentral moments of the sample mean are shown to be as follows:

v_{d,1} = ( (m+1)/(n+2) ) v_{z,1} ,     (8.15)

v_{d,2} = (1/n) ( (m+1)/(n+2) ) v_{z,2} + ( (n-1)/n ) ( (m+1)(m+2) / ((n+2)(n+3)) ) v_{z,1}^2 ,     (8.16)

v_{d,3} = (1/n^2) ( (m+1)/(n+2) ) v_{z,3} + ( 3(n-1)/n^2 ) ( (m+1)(m+2) / ((n+2)(n+3)) ) v_{z,1} v_{z,2} + ( (n-1)(n-2)/n^2 ) ( (m+1)(m+2)(m+3) / ((n+2)(n+3)(n+4)) ) v_{z,1}^3 .     (8.17)

Using (8.15) through (8.17), we compute v_{d,1} = .93831 x 10^-2, v_{d,2} = .14615 x 10^-3, and v_{d,3} = .29792 x 10^-5. Then,

m_1 = v_{d,1} = .93831 x 10^-2 ,     (8.18)

m_2 = v_{d,2} - v_{d,1}^2 = .5811 x 10^-4 ,     (8.19)

m_3 = v_{d,3} - 3 v_{d,1} v_{d,2} + 2 v_{d,1}^3 = .51748 x 10^-6 .     (8.20)

Using these values, we compute A = 2.93, B = 0.00445 and G = -.00366. These parameter estimates are used to determine the 95 percentile of the gamma distribution to set a .95 moment bound. The bound is .0238. For comparison, for the same audit data, the Stringer bound = .0401, the parametric bound = .0238, and, using the prior settings previously selected, the Cox and Snell bound = .0248 and the Tsui et al. bound = .0304. Table 4 tabulates the results. Note that when there is no error in the sample (m = 0), the two Bayesian bounds, under the headings C&S and Tsui, are considerably smaller than the four other bounds. The reason is that the four other bounds assume that all taints are 100% when there is no error in the sample. When the sample does contain some errors, the bounds are closer, as shown for m = 1 and 2.
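The whole moment-bound computation can be sketched as below. The tanh arguments 10 z̄ and m/10 in the constructed point t* are inferred from the numbers in the text, and scipy's gamma quantile stands in for whatever routine the authors used:

```python
import math
from scipy.stats import gamma

def moment_bound(taints, n, conf=0.95, receivables=True):
    """Moment bound for the mean taint of a DUS sample: augment the m
    observed taints with the constructed point t*, form the moments of
    the sample mean via (8.15)-(8.17), fit a three-parameter gamma by
    (8.12)-(8.14) and (8.18)-(8.20), and return its .95 percentile."""
    m = len(taints)
    zbar = sum(taints) / m
    tstar = 0.81 * (1 - 0.667 * math.tanh(10 * zbar))
    if receivables:
        tstar *= 1 + 0.667 * math.tanh(m / 10)
    z = taints + [tstar]
    v1 = sum(t for t in z) / len(z)            # noncentral moments of z
    v2 = sum(t ** 2 for t in z) / len(z)
    v3 = sum(t ** 3 for t in z) / len(z)
    c1 = (m + 1) / (n + 2)                     # coefficients in (8.15)-(8.17)
    c2 = (m + 1) * (m + 2) / ((n + 2) * (n + 3))
    c3 = (m + 1) * (m + 2) * (m + 3) / ((n + 2) * (n + 3) * (n + 4))
    vd1 = c1 * v1
    vd2 = c1 * v2 / n + (n - 1) / n * c2 * v1 ** 2
    vd3 = (c1 * v3 / n ** 2
           + 3 * (n - 1) / n ** 2 * c2 * v1 * v2
           + (n - 1) * (n - 2) / n ** 2 * c3 * v1 ** 3)
    m1 = vd1                                   # central moments, (8.18)-(8.20)
    m2 = vd2 - vd1 ** 2
    m3 = vd3 - 3 * vd1 * vd2 + 2 * vd1 ** 3
    A = 4 * m2 ** 3 / m3 ** 2                  # gamma shape (8.12)
    B = m3 / (2 * m2)                          # gamma scale (8.13)
    G = m1 - 2 * m2 ** 2 / m3                  # gamma location (8.14)
    return G + gamma.ppf(conf, A, scale=B)

print(moment_bound([0.25, 0.40], 100))  # ~= 0.0238, matching the text
```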

Table 4. Comparison of Six .95 Upper Confidence Bounds for μ_D: the Stringer bound, the Multinomial bound, the Moment bound, the Parametric bound, the Cox and Snell bound, and the Tsui et al. bound. (Sample size is n = 100.)

No. of Errors           Str.    Mult.   Moment  Para.   C & S   Tsui
m = 0                   .0295   .0295   .0295   .0295   .0118   .0227
m = 1 (t = .25)         .0338   .0299   .0156   .0152   .0182   .0255
m = 2 (t1 = .40,        .0401   .0315*  .0239   .0238   .0248   .0304
       t2 = .25)

Note: * This value was computed by the software made available by R. Plante.
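The Stringer column of Table 4 can be reproduced from Clopper-Pearson binomial bounds; in this sketch scipy's beta quantile stands in for the BELBIN routine mentioned above:

```python
from scipy.stats import beta

def p_upper(m, n, conf=0.95):
    """Clopper-Pearson upper bound p_u(m; conf) for the error rate
    given m errors in a sample of n dollar units."""
    return beta.ppf(conf, m + 1, n - m)

def stringer_bound(taints, n, conf=0.95):
    """Stringer bound: p_u(0) plus taint-weighted increments of the
    successive upper bounds, taints sorted in decreasing order."""
    bound = p_upper(0, n, conf)
    for j, t in enumerate(sorted(taints, reverse=True), start=1):
        bound += t * (p_upper(j, n, conf) - p_upper(j - 1, n, conf))
    return bound

print(round(stringer_bound([], 100), 4))            # 0.0295
print(round(stringer_bound([0.25], 100), 4))        # 0.0338
print(round(stringer_bound([0.40, 0.25], 100), 4))  # 0.0401
```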