Temporal Reliability of Selected
VALPAR COMPONENT WORK SAMPLES:
Learning Curve Studies
©1999 Valpar International Corporation; All Rights Reserved.
In the early 1990s, Valpar conducted a series of learning curve studies on most of the work samples in its Component Work Sample Series (VCWS). The studies were undertaken to allow the use of Methods-Time Measurement (MTM) as an aid to work sample users in interpreting work rate scores. Small samples of industrially employed workers were tested at least ten times on each work sample exercise. The amount and rate of improvement in their scores over the ten administrations were used to calculate a unique learning curve for each work sample exercise, and those learning curves have been incorporated into the current criterion-referenced scoring system of most of the VCWS. (For information on Valpar's use of the learning curves in the VCWS, please contact Valpar.)
Although well suited to learning curve purposes, the data gathered in those studies is, for several reasons, somewhat problematic for establishing the temporal reliability of the work samples. Nevertheless, if certain facts are taken into account, the statistics presented in the following tables should be useful to work sample users and, in general, support the temporal reliability of most of the work sample exercises.
Before reviewing the tables, readers should be aware of a few points. First, the samples of people in the studies were small: usually six, although in a few cases there were a few more. That fact alone limits the usefulness of the data for reliability purposes. Moreover, the study participants were not selected with an eye to increasing variance, and it is well known that scores must vary substantially across a group to produce Pearson correlation coefficients of the size desired in test reliability statistics. Instead, study participants were selected because they were all satisfactory industrial workers accustomed to using their eyes and upper bodies in work tasks. That selection criterion actually served to reduce variability within the groups.
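For reference, the Pearson coefficient and a standard classical test theory identity can be written as follows; together they illustrate why a homogeneous group tends to depress reliability statistics. The notation is introduced here for illustration and is not taken from the work sample manuals: $x_i$ is a participant's first-trial score, $y_i$ is the comparison score, $n$ is the number of participants, and $\sigma_T^2$ and $\sigma_E^2$ are the true-score and error variances.

```latex
\[
r_{xy} \;=\;
\frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
     {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;
      \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
\qquad\qquad
\rho_{XX'} \;=\; \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}
\]
```

Under that identity, if measurement error stays the same while the participants differ little in true ability (small $\sigma_T^2$), the reliability coefficient falls even though the instrument itself is unchanged.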
It also seems that, for at least some of the work samples, ten administrations are simply not enough for work performance to stabilize, at least for the participants in these studies. Partly for this reason, we recommend that work sample scores be interpreted as reflecting people's minimal work potential. A successful performance would be very unlikely from a person without the ability to learn to perform similar work at an acceptable level, but the reverse is not the case: a failure to pass a work sample exercise on a first (or second, or third) try does not necessarily indicate that the individual could not learn to do similar work at an acceptable level. Such a person may simply need a little more training before he or she can demonstrate his or her maximum abilities. So, especially in the following cases in which the correlation coefficient is modest, users should refrain from interpreting scores as reflecting an examinee's maximum work-related abilities. In those cases, it would be especially wise to cross-validate a failed work sample effort before concluding that the individual lacks the potential to succeed in work similar to that of the work sample.
Usually, in the following tables, the first trial of the work sample has been correlated with the sum of all subsequent administrations through the tenth. In a few cases, the first trial was correlated with the tenth trial alone. In some cases, the first administration by itself constitutes what we term a "trial." There are, however, several work samples that have multiple, short-duration exercises. In those cases, we recommend that the exercises be administered two or three times, and that the scores from those administrations be combined to form a "trial." The reason is that we found an exercise duration of about ten minutes to be the minimum sample of behavior required for adequate reliability. In the following tables, therefore, readers will see either "trial 1" or "sum of administrations...", depending upon how many administrations are recommended for a "trial."
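The correlations described above can be sketched in a few lines of Python. This is a minimal illustration, not Valpar's actual procedure or software: the function names, the `admins_per_trial` parameter, and the layout of `scores` (one list per participant, ordered by administration) are all assumptions made for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sxy / sqrt(sxx * syy)


def first_trial_vs_rest(scores, admins_per_trial=1):
    """Correlate the first "trial" with the sum of all later administrations.

    scores: one list per participant, ordered by administration (assumed layout).
    admins_per_trial: hypothetical parameter; how many administrations are
    combined into the first trial (e.g. 2 or 3 for short-duration exercises).
    """
    k = admins_per_trial
    if any(len(person) <= k for person in scores):
        raise ValueError("each participant needs scores beyond the first trial")
    first = [sum(person[:k]) for person in scores]
    rest = [sum(person[k:]) for person in scores]
    return pearson_r(first, rest)


def first_vs_last(scores):
    """Correlate the first administration with the last one alone."""
    first = [person[0] for person in scores]
    last = [person[-1] for person in scores]
    return pearson_r(first, last)
```

With six participants and ten administrations each, `first_trial_vs_rest(scores)` corresponds to the "trial 1" case, and `first_trial_vs_rest(scores, admins_per_trial=2)` to a case in which two administrations are combined into the first trial. Summing the later administrations, rather than correlating each one separately, yields a single coefficient per exercise.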
Several of the work samples measure some aspect of work quality (errors or points) in addition to work rate, and use both to help users determine the pass/fail result. Often, however, the learning curve studies simply did not yield enough data to correlate scores on work quality as well as work rate.
Some readers will note the absence of VCWS 16. That work sample is not suited to our learning curve process, and so it was not included in the study. Work samples 14, 17, and 18 have also been omitted. The manuals for those work samples have not been revised, and they were not included in the learning curve studies.
Usually, data for each study participant on each work sample was gathered over a period of several days, sometimes a week or more.
A high correlation on a work sample exercise shows that the subjects' first trial was indicative of their performance on the work sample over time, in terms of the position of their scores relative to the others in the study group. The lower correlations in the following tables call for a more cautious interpretation of exercise scores. Especially when scores on those exercises do not pass, they should not be viewed as typical of client performance.