Improving the repeatability and reproducibility of the Helmke Drum test method

05/01/2002

Measuring particle release

After 1991 lab tests found ambiguities, IEST Working Group 003 set out to revise test methods through subtle changes in equipment and process

Editor's Note: A technical session on the results of this paper was presented at CleanRooms West 2001 in San Jose, CA.

by Jenni M. Elion, Research Triangle Institute
With research by David S. Ensor, Research Triangle Institute
Chuck Berndt, C. W. Berndt Associates, Ltd.
Mike Bovino, UniClean Cleanroom Garment Services
Roger Diener, Analog Devices, Inc.
Gordon Ely, Nelson Laboratories
Jan Eudy, Cintas Cleanroom Resources
Robert Giroux, Certifab SIC-Cleanroom
Mike Rataj, Aramark Cleanroom Services
Jean Witt, Hi-Tec Garment

The Helmke Drum test method, as described in the Institute of Environmental Sciences and Technology's recommended practice IEST-RP-CC003.2, is used to measure the release of particles larger than 0.5 micrometer (µm) from cleanroom garments. During the current effort to revise IEST-RP-CC003.2, it became clear that the test method itself, as currently written, needed to be revised.

Under "Description of Test Limitations," the current method mentions a 1991 interlaboratory comparison (ILC). However, results from that ILC were too variable to allow meaningful interpretation.

A cursory look at the results from this 1991 ILC suggested that a re-evaluation of the data using the statistical analysis outlined in ASTM E691 might yield new insights. The subcommittee of IEST Working Group 003 proposed several changes to the method and a second ILC was conducted in 2001, in which six participating laboratories tested garments per the revised method.

Figure 1. The h values by test laboratory for the 1991 interlaboratory comparison (ILC) show a possible problem with the reproducibility of Lab 4-91's data.

A statistical analysis of these results, compared with the same analysis of the 1991 ILC, shows that the revised method yields more repeatable and more reproducible results. The proposed revisions to the method are expected to be adopted by the IEST Working Group and incorporated into the rewrite of the recommended practice.

Repeatability, reproducibility
To make data interpretation and comparison meaningful, one must account for the inherent variability contributed by the test method. That variability is described in terms of accuracy, the closeness of the results to the "true" value or generally accepted reference value, and precision, the degree of scatter in a set of measurements.

Precision is expressed in terms of two measurement concepts: repeatability and reproducibility. Repeatability is the ability of one laboratory to produce the same result again and again. Factors such as operator, environment, equipment and calibration are kept generally constant within a laboratory and contribute minimally to variability.

Reproducibility is the ability of multiple laboratories to achieve the same results. The factors listed above may vary substantially from one laboratory to another and may contribute significantly to variability.

Figure 2. The k values by test laboratory for the 1991 ILC. The data suggest a possible measurement problem at Lab 3-91 and possible imprecision at Lab 4-91.

The usefulness of the Helmke Drum test method, described in IEST-RP-CC003.2, has been criticized because of the difficulty in reproducing results between laboratories and repeating results within laboratories. The Helmke Drum subcommittee of IEST Working Group 003 questioned whether these difficulties were the result of inherent variability of the samples themselves or the fault of the method. A faulty method may leave key elements, such as sample handling, open to multiple interpretations, resulting in significant variability. This can be corrected by examining the method for sources of ambiguity and clarifying them.

The 1991 ILC
Approach: IEST Working Group 003 organized an ILC in 1991 with four participants capable of both laundering and testing garments. A manufacturer provided 96 medium-size, herringbone weave, 100-percent polyester garments for the testing.

Each participant received 24 garments and laundered them twice. The garments were returned to IEST and redistributed such that each participant received a combination of garments, including six garments they had laundered themselves plus six laundered by each of the other three participants.
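The redistribution described above can be sketched in a few lines of Python. This is only an illustration of the balanced design (six garments from every laundering facility to every testing laboratory); the function name, the garment identifiers and the random seed are hypothetical, and the article does not say how IEST actually drew the garments.

```python
import random


def redistribute_garments(facilities, laboratories, per_cell, seed=0):
    """Split each facility's garments evenly among the testing laboratories.

    Every laboratory receives `per_cell` garments from every facility.
    Garment identifiers here are bookkeeping labels only; in a blinded
    study the labels shown to participants would hide the facility.
    """
    rng = random.Random(seed)  # fixed seed so the allocation is repeatable
    shipments = {lab: [] for lab in laboratories}
    for facility in facilities:
        count = per_cell * len(laboratories)
        garments = [f"{facility}-{i:02d}" for i in range(1, count + 1)]
        rng.shuffle(garments)
        for j, lab in enumerate(laboratories):
            shipments[lab].extend(garments[j * per_cell:(j + 1) * per_cell])
    return shipments


# Four facilities, four laboratories, six garments per facility/laboratory cell.
plan = redistribute_garments(["A", "B", "C", "D"],
                             ["lab1", "lab2", "lab3", "lab4"], per_cell=6)
for lab, garments in plan.items():
    print(lab, len(garments), "garments")
```

With four facilities and four laboratories, this accounts for all 96 garments and gives each laboratory 24 to test.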

The garments were labeled so that the participants did not know which garments were from which facility. The participants tested the garments in accordance with the Helmke Drum procedure as specified in IEST-RP-CC003.2.¹ For the purpose of this article, "laboratories" performed the testing and "facilities" laundered the garments. The identity of the participants, kept confidential during the 1991 ILC, was not disclosed to Research Triangle Institute (RTI).

Results: Per the recommended practice, results are reported as cumulative particle counts per minute for particle diameters greater than or equal to 0.5 µm. The laboratories reported the results (see Table 1) to the IEST.

For garments laundered at Facility B, Lab 3-91's results are noticeably lower than those of the other three laboratories. Also, Lab 4-91's results had high variability for garments laundered at all four facilities.

As part of ongoing efforts to improve the Helmke Drum test method, IEST Working Group 003 decided to re-evaluate these results using the standard practice for determining repeatability and reproducibility of a test method as described in ASTM E691.²

Generally, this standard practice is followed for large interlaboratory studies; when fewer than six laboratories are involved, precision statistics are not to be reported as final. Knowing these limitations, the practice was applied to see if new insights could be gained as to the repeatability and reproducibility of the method. With no supporting documentation to indicate the cause of the outliers in Lab 4-91's data set or why Lab 3-91's results were consistently lower, all data were included in the statistical analysis.

Statistical analysis of 1991 ILC
The six data points corresponding to the particle counts measured by one laboratory for garments from one laundry facility make up a "cell." The first step in the analysis is to calculate the cell average (x̄) and cell standard deviation (s) for each of the 16 cells in Table 1.

Figure 3. The h values by test laboratory for the 2001 ILC suggest good reproducibility between laboratories.

The next step is to calculate the average of the cell averages (x̿) for all four laboratories for each laundry facility and the cell deviation from that average (d). The third step is to calculate the standard deviation of cell averages (S_x), repeatability standard deviation (S_r) and reproducibility standard deviation (S_R). The results of these calculations are presented in Table 2.
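The three steps can be sketched in Python using the standard ASTM E691 formulas. The function and variable names are hypothetical, and the counts in the example are invented for illustration; they are not the 1991 ILC data.

```python
import math
import statistics


def precision_statistics(cells):
    """Return ASTM E691-style precision statistics for one material.

    `cells` holds one list of replicate results per laboratory; every
    laboratory must report the same number of replicates.
    """
    p = len(cells)      # number of laboratories
    n = len(cells[0])   # replicates per cell
    if any(len(cell) != n for cell in cells):
        raise ValueError("all cells need the same number of replicates")

    # Step 1: cell averages and cell standard deviations (n - 1 divisor).
    cell_means = [statistics.mean(cell) for cell in cells]
    cell_sds = [statistics.stdev(cell) for cell in cells]

    # Step 2: average of the cell averages and each cell's deviation from it.
    grand_mean = statistics.mean(cell_means)
    deviations = [m - grand_mean for m in cell_means]

    # Step 3: the three standard deviations.
    s_xbar = math.sqrt(sum(d * d for d in deviations) / (p - 1))
    s_r = math.sqrt(sum(s * s for s in cell_sds) / p)
    # Reproducibility is never reported as smaller than repeatability.
    s_R = max(s_r, math.sqrt(s_xbar ** 2 + s_r ** 2 * (n - 1) / n))

    return {
        "cell_means": cell_means,
        "cell_sds": cell_sds,
        "grand_mean": grand_mean,
        "deviations": deviations,
        "s_xbar": s_xbar,
        "s_r": s_r,
        "s_R": s_R,
    }


# Invented counts (particles/min): four laboratories, six garments each.
example = [
    [1200, 1350, 1100, 1280, 1420, 1190],
    [1310, 1250, 1400, 1180, 1290, 1360],
    [900, 950, 880, 920, 940, 910],
    [1500, 2100, 1250, 1900, 1700, 2300],
]
result = precision_statistics(example)
print(round(result["s_xbar"], 1), round(result["s_r"], 1), round(result["s_R"], 1))
```

One call covers one material, which in the 1991 study means the garments from one laundry facility; the calculation is repeated for each facility.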

The between-laboratory "reproducibility" statistic (h) is defined as the cell deviation (d) for a given laboratory divided by the standard deviation of the cell averages (S_x) for all laboratories. Ideally, the cell deviation would approach zero, indicating that the average for an individual laboratory (x̄) would be close to the average for all laboratories (x̿). In this case, h would approach zero as well.
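As a minimal sketch of this definition, the h values for one material can be computed directly from the cell averages. The function name and the example averages are hypothetical.

```python
import math


def h_statistics(cell_means):
    """Between-laboratory consistency statistic h for one material.

    h = d / S_x, where d is a laboratory's deviation from the average of
    all cell averages and S_x is the standard deviation of those averages.
    """
    p = len(cell_means)
    grand_mean = sum(cell_means) / p
    deviations = [m - grand_mean for m in cell_means]
    s_xbar = math.sqrt(sum(d * d for d in deviations) / (p - 1))
    if s_xbar == 0:
        # All laboratories agree exactly, so every h is zero.
        return [0.0] * p
    return [d / s_xbar for d in deviations]


# Invented cell averages for four laboratories testing the same material.
print([round(h, 2) for h in h_statistics([1250.0, 1300.0, 920.0, 1790.0])])
```

Because the deviations for a material always sum to zero, the h values do too, so a large positive h at one laboratory is necessarily offset by negative values elsewhere.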

Figure 4. The k values by test laboratory for the 2001 ILC suggest good repeatability within each laboratory.

The within-laboratory "repeatability" statistic (k) compares the standard deviation of one laboratory's measurements with the standard deviation pooled across all participating laboratories. It is defined as the cell standard deviation (s) for a given laboratory divided by the repeatability standard deviation (S_r), which represents the within-laboratory scatter of the whole sample population. Ideally, the deviation within one laboratory should reflect the deviation of the sample population, and k would approach unity.
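A comparable sketch for k, again with a hypothetical function name and invented cell standard deviations, pools the cell standard deviations into S_r and divides each one by it.

```python
import math


def k_statistics(cell_sds):
    """Within-laboratory consistency statistic k for one material.

    k = s / S_r, where s is a laboratory's cell standard deviation and
    S_r is the root mean square of all cell standard deviations.
    """
    p = len(cell_sds)
    s_r = math.sqrt(sum(s * s for s in cell_sds) / p)
    if s_r == 0:
        raise ValueError("all cell standard deviations are zero; k is undefined")
    return [s / s_r for s in cell_sds]


# Invented cell standard deviations for four laboratories.
print([round(k, 2) for k in k_statistics([115.0, 80.0, 25.0, 390.0])])
```

A laboratory whose scatter matches the pooled value gets k = 1; values well below or well above 1 mark a laboratory that is unusually tight or unusually scattered relative to the others.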

Critical values for h and k are obtained from a table in the ASTM practice, based on the number of laboratories and the number of sample replicates at a significance level of 0.5 percent. In this case, critical h and k values were obtained for an ILC with four laboratories and six replicates. The values are used as boundaries or control limits for the reproducibility and repeatability of the results. The calculated h and k values are then compared to the critical values, as shown in Figures 1 and 2.
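The comparison against the control limits can be sketched as a simple check. The critical values used below (1.49 for h and 1.60 for k) are the ones this article cites for four laboratories and six replicates; the function name and the h and k values are invented for illustration.

```python
def flag_cells(h_values, k_values, h_crit, k_crit):
    """List which consistency statistics exceed their critical values.

    h is two-sided, so its magnitude is compared with the limit;
    k is always positive, so only the upper limit applies.
    """
    flags = {}
    for lab in h_values:
        issues = []
        if abs(h_values[lab]) > h_crit:
            issues.append("h")
        if k_values[lab] > k_crit:
            issues.append("k")
        flags[lab] = issues
    return flags


# Invented statistics for one material tested by four laboratories.
h = {"lab_a": -0.50, "lab_b": -0.45, "lab_c": -0.55, "lab_d": 1.50}
k = {"lab_a": 0.30, "lab_b": 0.70, "lab_c": 0.80, "lab_d": 1.67}
print(flag_cells(h, k, h_crit=1.49, k_crit=1.60))
```

An exceedance is a prompt to investigate a laboratory's data, not by itself a reason to discard it.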

1991 patterns
Patterns that indicate good reproducibility when examining h values are balanced positive and negative values for each lab, or all positive h values for some laboratories balanced by about the same number of laboratories with all negative h values.

Both these patterns are normal for "in control" laboratories. A third pattern, indicative of a problem within a given laboratory, is h values that are all positive (or negative) for that laboratory while all the other laboratories have h values that are all negative (or positive).
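The third pattern lends itself to a simple sign check. The sketch below is one possible reading of the rule, with hypothetical function names and invented h values; it looks only at signs and does not replace inspecting the plot.

```python
def sign_of_row(values):
    """Classify one laboratory's h values as all positive, all negative or mixed."""
    if all(v > 0 for v in values):
        return "positive"
    if all(v < 0 for v in values):
        return "negative"
    return "mixed"


def suspect_laboratories(h_by_lab):
    """Return laboratories whose h values all have one sign while every
    other laboratory's h values all have the opposite sign."""
    signs = {lab: sign_of_row(vals) for lab, vals in h_by_lab.items()}
    if len(signs) < 3:
        return []  # with two laboratories the pattern cannot single one out
    suspects = []
    for lab, sign in signs.items():
        if sign == "mixed":
            continue
        opposite = "negative" if sign == "positive" else "positive"
        others = [s for other, s in signs.items() if other != lab]
        if all(s == opposite for s in others):
            suspects.append(lab)
    return suspects


# Invented h values: one entry per material for each laboratory.
h_by_lab = {
    "lab_a": [-0.50, -0.40, -0.60],
    "lab_b": [-0.45, -0.55, -0.35],
    "lab_c": [-0.55, -0.50, -0.50],
    "lab_d": [1.50, 1.45, 1.45],
}
print(suspect_laboratories(h_by_lab))
```

Laboratories with mixed signs, or a roughly even split of all-positive and all-negative laboratories, return an empty list, matching the two "in control" patterns.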

The pattern suggests a possible problem associated with Lab 4-91. The h values for the first three laboratories all fall well within the limits of the critical h value. However, the h values for Lab 4-91 are very close to the critical h value of 1.49. Also, the h values for Lab 4-91 are all positive while the h values for the three other laboratories are all negative.

Very small k values, where the deviation within a laboratory is very low compared with the general population, suggest measurement insensitivity, such as taking measurements near the instrument's detection limit. Very high k values, where the deviation within a laboratory is very high compared with the general population, suggest imprecision. This could occur, for example, where there is variability in sample handling or operator error.

A plot of the within-laboratory consistency statistic (see Figure 2) shows two potential problems. The k values for Lab 3-91 are very low and the k values for Lab 4-91 exceed the critical k-value of 1.60.

The results of the statistical analysis suggested measurement problems with one participating laboratory and imprecision problems with another. Unfortunately, no follow-up investigation of these problems was performed at the time of the ILC. The Helmke Drum subcommittee concluded that ambiguities in the current test method might have contributed to the problems.

The 2001 ILC
Approach: The Helmke Drum subcommittee of IEST Working Group 003 proposed a new ILC to test the repeatability and reproducibility of a revised draft method. To clarify the method, the subcommittee incorporated several changes.

The upper limit of the particle size measured by the test method was set at 5 µm, based on particle transport models for sample lines. The particle counter intake tube was redesigned to eliminate one of the bends, a source of particle loss. The position of the intake tube was specified as "11 o'clock," as viewed from the side on which the drum's rotation appears clockwise. The drum was modified to include four internal cleats to facilitate tumbling during rotation. The particle counter's sample flow rate was specified as 1 actual cubic foot per minute (acfm). The procedure for folding the garment was altered both to remove ambiguity and to make it easier to perform. The procedure for placing the garment into the drum was described more completely to remove confusion.

Finally, it was specified that the results were to be reported as cumulative values at both 0.3 µm and 0.5 µm to take advantage of the improvements in particle counter sensitivity over the past two decades.
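Cumulative reporting can be illustrated with a short sketch. It assumes the particle counter reports differential counts per size channel, keyed by the channel's lower size bound, and that the result is the total count divided by the sampling time in minutes; the function name, channel layout and counts are hypothetical and are not taken from the recommended practice.

```python
def cumulative_counts_per_minute(channel_counts, minutes):
    """Convert differential channel counts into cumulative counts per minute.

    `channel_counts` maps each channel's lower size bound (µm) to the number
    of particles counted in that channel over the whole sampling period.
    The cumulative value at a given size includes that channel and every
    larger one.
    """
    if minutes <= 0:
        raise ValueError("sampling time must be positive")
    running = 0
    cumulative = {}
    for size in sorted(channel_counts, reverse=True):  # largest channel first
        running += channel_counts[size]
        cumulative[size] = running / minutes
    return cumulative


# Invented ten-minute totals for three channels.
totals = {0.3: 18500, 0.5: 9200, 1.0: 2300}
rates = cumulative_counts_per_minute(totals, minutes=10)
print(rates[0.3], rates[0.5])
```

The cumulative value at 0.3 µm always equals or exceeds the value at 0.5 µm, since it adds the smallest channel to everything counted above it.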

With the goal of applying ASTM E691 and calculating precision statistics, six laboratories were recruited to participate. The participants were balanced between independent contract laboratories and industry laboratories. Sixty new medium-size C-4 coveralls with zip front and ESD knit cuffs were loaned for testing. The garments were laundered twice at a single facility following standard cleanroom laundering practice. Each laboratory received ten garments and tested them in accordance with the revised draft Helmke Drum method.

Results: The results, presented in Table 3, were delivered to RTI for analysis.

Statistical analysis of 2001 ILC
The ten data points produced by one laboratory for one size range make up a "cell." Table 4 includes the cell averages (x̄), cell standard deviations (s), averages of the cell averages (x̿), cell deviations (d), standard deviations of cell averages (S_x), repeatability standard deviations (S_r) and reproducibility standard deviations (S_R). The consistency statistics, h and k, were calculated and compared against the critical values of h and k for 6 laboratories and 10 replicates at the 0.5 percent significance level.

2001 patterns
As in 1991, the patterns to look for when plotting h values are balanced positive and negative values for each lab, or all positive h values for some laboratories balanced by about the same number of laboratories with all negative h values. A plot of the between-laboratory consistency statistic (see Figure 3) shows that the h values for all laboratories are within the critical limits, laboratories with all positive h values are balanced by laboratories with all negative h values, and the remaining laboratories are internally balanced between positive and negative.

A plot of the within-laboratory consistency statistic (Figure 4) shows that the k values for all laboratories are below the critical k value of 1.52. The k values range from 0.73 to 1.16, suggesting that neither measurement insensitivity nor internal lab imprecision is present.

Conclusions
The data indicate that consistency both within and between test laboratories is within acceptable limits as established by the ASTM standard practice. These results suggest that the proposed revised Helmke Drum test method is robust, with the remaining variability stemming from the garments themselves and not the method.

Acknowledgments
The authors gratefully acknowledge Analog Devices, Inc. for providing garments and the following laboratories (and personnel) for conducting the 2001 ILC testing:

Aramark Cleanroom Services (Rose Graber)
Certifab SIC-Cleanroom (Robert Giroux)
Cintas Cleanroom Resources (William Jablesnik)
Nelson Laboratories (Karl Perkes)
Research Triangle Institute (K. David Carter)
UniClean Cleanroom Garment Services

References

1. IEST-RP-CC003.2, Garment Considerations for Cleanrooms and Other Controlled Environments, IEST, Mount Prospect, IL.
2. ASTM E691-92, Standard Practice for Conducting an Interlaboratory Study to Determine the Precision of a Test Method, ASTM, Philadelphia, PA, 1999.
3. ASTM E177-90a (reapproved 1996), Standard Practice for Use of the Terms Precision and Bias in ASTM Methods, ASTM, Philadelphia, PA, 1999.

Jenni M. Elion is a research environmental engineer in the Center for Environmental and Engineering Technology at Research Triangle Institute (Research Triangle Park, NC). Her responsibilities include engineering research and development pertaining to precision cleaning of surfaces and cleanroom products testing. She also is actively involved in projects that measure aerosol penetration of air filters and chemical protective garments.

Editor's Note: This article appeared in the Journal of the IEST, Fall 2001, Vol. 44, No. 4.