Problem: An article in Communications of the ACM (Vol. 30, No. 5, 1987) studied different algorithms for estimating software development costs. Six algorithms were applied to several different software development projects, and the percent error in estimating the development cost was observed. Some of the data from this experiment are shown in the table below. We want to determine whether the algorithms differ in their mean cost estimation accuracy.
| Algorithm | Project 1 | Project 2 | Project 3 | Project 4 | Project 5 | Project 6 |
| 1 | 1244 | 21 | 82 | 2221 | 905 | 839 |
| 2 | 281 | 129 | 396 | 1306 | 336 | 910 |
| 3 | 220 | 84 | 458 | 543 | 300 | 794 |
| 4 | 225 | 83 | 425 | 552 | 291 | 826 |
| 5 | 19 | 11 | -34 | 121 | 15 | 103 |
| 6 | -20 | 35 | -53 | 170 | 104 | 199 |
a) H0: mean1 = mean2 = mean3 = mean4 = mean5 = mean6 (all algorithms have the same mean cost estimation accuracy)
Halternate: at least one algorithm has a different mean cost estimation accuracy
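b) Treatment - the estimation algorithm (Algorithm 1, 2, 3, 4, 5, 6)
level - the six algorithms, each being one level of the treatment factor
block - Project 1, 2, 3, 4, 5, 6
response - percent error in the estimated development cost (the values in the table)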
c) In a randomized block design, there is one factor of primary interest, and there are also one or more nuisance factors.
Nuisance factors are those that may affect the measured result but are not of primary interest. For example, in applying a treatment, nuisance factors might be the specific operator who prepared the treatment, the time of day the experiment was run, and the room temperature. All experiments have nuisance factors. The experimenter typically needs to decide which nuisance factors are important enough to track or control during the experiment. Here the project is the nuisance factor, so each project is treated as a block and every algorithm is applied to every project.
Randomized block design test
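For reference, the randomized block design is usually written as the effects model below. The symbols are standard textbook notation and are an assumption here, since the original solution does not state a model: \tau_i is the effect of algorithm i, \beta_j is the effect of project j, and \varepsilon_{ij} is the random error.

```latex
y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij},
\qquad i = 1, \dots, 6 \ (\text{algorithms}),
\quad j = 1, \dots, 6 \ (\text{projects})

H_0 : \tau_1 = \tau_2 = \dots = \tau_6 = 0
\qquad
H_1 : \tau_i \neq 0 \ \text{for at least one } i

F_0 = \frac{MS_{\text{Algorithms}}}{MS_{E}},
\qquad \text{with } a - 1 = 5 \ \text{and} \ (a-1)(b-1) = 25 \ \text{degrees of freedom}
```

With a = 6 algorithms and b = 6 projects, these degrees of freedom agree with the 5 and 25 reported in the ANOVA table.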
d) Sources of variability: sampling variation in the response, measurement error, random error, technical variation, etc.
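e) The F-value for the algorithms (Rows) is 5.377, which exceeds the critical value of 2.603, so the algorithm effect is significant. There is only a 0.17% chance (P-value = 0.00172) that an F-value this large could occur due to noise, so H0 is rejected.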
| Anova: Two-Factor Without Replication |
| SUMMARY | Count | Sum | Average | Variance |
| Algorithm 1 | 6 | 5312 | 885.3333 | 661519.5 |
| Algorithm 2 | 6 | 3358 | 559.6667 | 203937.9 |
| Algorithm 3 | 6 | 2399 | 399.8333 | 64260.97 |
| Algorithm 4 | 6 | 2402 | 400.3333 | 69639.87 |
| Algorithm 5 | 6 | 235 | 39.16667 | 3581.767 |
| Algorithm 6 | 6 | 435 | 72.5 | 10442.7 |
| Project 1 | 6 | 1969 | 328.1667 | 216024.6 |
| Project 2 | 6 | 363 | 60.5 | 2082.3 |
| Project 3 | 6 | 1274 | 212.3333 | 57476.27 |
| Project 4 | 6 | 4913 | 818.8333 | 651728.6 |
| Project 5 | 6 | 1951 | 325.1667 | 96648.57 |
| Project 6 | 6 | 3671 | 611.8333 | 129780.6 |
| ANOVA |
| Source of Variation | SS | df | MS | F | P-value | F crit |
| Rows (algorithms) | 2989130 | 5 | 597826.1 | 5.376958 | 0.00172 | 2.602987 |
| Columns (projects) | 2287339 | 5 | 457467.9 | 4.114551 | 0.007295 | 2.602987 |
| Error | 2779574 | 25 | 111182.9 | | | |
| Total | 8056044 | 35 | | | | |
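The sums of squares and F-values in the ANOVA table can be reproduced by hand from the data. The following is a minimal sketch in plain Python, not the spreadsheet procedure that produced the table; the function name `rcbd_anova` is hypothetical, and the critical value is copied from the table rather than computed.

```python
# Percent error data: rows are algorithms 1-6, columns are projects 1-6.
data = [
    [1244, 21, 82, 2221, 905, 839],
    [281, 129, 396, 1306, 336, 910],
    [220, 84, 458, 543, 300, 794],
    [225, 83, 425, 552, 291, 826],
    [19, 11, -34, 121, 15, 103],
    [-20, 35, -53, 170, 104, 199],
]

# Critical value taken from the "F crit" column of the ANOVA table (assumption:
# it corresponds to alpha = 0.05 with 5 and 25 degrees of freedom).
F_CRIT = 2.602987


def rcbd_anova(table):
    """Two-factor ANOVA without replication (rows = treatments, columns = blocks)."""
    a = len(table)        # number of treatments (algorithms)
    b = len(table[0])     # number of blocks (projects)
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (a * b)

    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(b)]

    ss_total = sum(y * y for row in table for y in row) - correction
    ss_rows = sum(t * t for t in row_totals) / b - correction
    ss_cols = sum(t * t for t in col_totals) / a - correction
    ss_error = ss_total - ss_rows - ss_cols

    df_rows, df_cols = a - 1, b - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error

    return {
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "df_error": df_error,
        "f_rows": (ss_rows / df_rows) / ms_error,
        "f_cols": (ss_cols / df_cols) / ms_error,
    }


result = rcbd_anova(data)
print(f"F (algorithms) = {result['f_rows']:.4f}")
print(f"F (projects)   = {result['f_cols']:.4f}")
print("Reject H0 for algorithms:", result["f_rows"] > F_CRIT)
```

The P-values are not computed here because the F distribution is not available in the Python standard library; a statistics package would be needed for that step.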
f)
h)


i) The FUNCTIONAL POINTS algorithm has the lowest cost estimation error (algorithm 5 in the table, with the smallest mean percent error, 39.17).
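The comparison behind this answer can be checked by averaging the percent error for each algorithm. This is a small sketch that uses the table's algorithm numbering as dictionary keys; choosing the mean closest to zero as the criterion is an assumption about how "lowest error" is judged.

```python
# Percent error by algorithm; keys follow the algorithm numbering in the table.
errors = {
    1: [1244, 21, 82, 2221, 905, 839],
    2: [281, 129, 396, 1306, 336, 910],
    3: [220, 84, 458, 543, 300, 794],
    4: [225, 83, 425, 552, 291, 826],
    5: [19, 11, -34, 121, 15, 103],
    6: [-20, 35, -53, 170, 104, 199],
}

# Mean percent error per algorithm.
mean_error = {alg: sum(vals) / len(vals) for alg, vals in errors.items()}

# Algorithm whose mean percent error is closest to zero.
best = min(mean_error, key=lambda alg: abs(mean_error[alg]))

for alg in sorted(mean_error):
    print(f"Algorithm {alg}: mean percent error = {mean_error[alg]:.2f}")
print("Lowest mean error: algorithm", best)
```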