Inventory Development
Content Validity


Five samples of experts, one surveyed in each of five consecutive summers, provided data to establish the content validity of the TSI (Fimian, 1987a). An "expert" was defined as one who was knowledgeable about teacher stress and burnout. Each had (a) authored one or more stress articles, monographs, or books; (b) conducted quantitative, qualitative, and/or combination stress research; and/or (c) conducted stress management workshops for practitioners. The samples' data were pooled; thus, the aggregate sample of 226 experts, representing 62% of the surveys distributed between 1980 and 1984, was used in this content validation. Return rates, though moderate and varying from year to year, are typical of voluntary self-report stress studies conducted prior to 1982 (Fimian, 1983). The majority of the respondents were male (64%); were below the age of 40 (62%); had taught children at one time for more than 5 years (72%); had received or were completing a doctoral degree (96%); had presented stress management workshops to practitioners (63%); had authored one or more stress-related works (62%; place and type of publication undetermined); and had conducted some type of stress research (56%) of a quantitative (27%), qualitative (11%), or combination (19%) nature.

A modified version of the TSI was used to collect the expert appraisal data. As discussed earlier in this chapter, an item pool of 135 stems was reduced to 42 through the procedures of content, factorial, and construct validation outlined in greater detail elsewhere (Fimian, 1984b, 1985). To these, 16 conceptually similar items, including the Time Management items, were added to the modified TSI for future development. Of the resulting 58 questions (Q1 to Q58), 7 were designated as "Other" and allowed respondents to add and rate their own stress sources or manifestations; the remaining 51 items were closed-ended and were rated by the experts. Two of these items were later omitted from the analyses because they had been shown earlier to have little discriminant worth. Thus the item pool was reduced from 51 to 49 items, 41 of which were drawn from the original TSI resulting from the pilot studies in Connecticut and Vermont. The other 8 of the 49 items were conceptually related to Time Management problems and were appended to the end of the Inventory during its third year of use, based on earlier recommendations from a number of the experts. In the content validity study, each of the 49 items was associated with a 4-point Likert-type scale (1 = not relevant; 2 = somewhat relevant; 3 = quite relevant; 4 = very relevant), which allowed each expert to indicate the degree to which each item was related to his or her individual concept of teacher stress. Finally, an additional eight "Personal and Professional Information" items were included on the cover page of the form.
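
The item counts reported above can be checked with a short tally. This is only an illustrative sketch; the variable names are hypothetical, and every figure is taken directly from the paragraph above.

```python
# Illustrative tally of the item counts reported in the text.
# All variable names are hypothetical; the numbers come from the passage.

refined_pool = 42        # stems left after the earlier validation work
added_items = 16         # conceptually similar items added for future development
total_questions = refined_pool + added_items     # Q1 to Q58

other_items = 7          # open-ended "Other" items
closed_ended = total_questions - other_items     # items rated by the experts

omitted = 2              # items with little discriminant worth
rated_pool = closed_ended - omitted              # items in the content validity study

from_original_tsi = 41   # drawn from the original TSI
time_management = 8      # Time Management items appended in the third year

print(total_questions, closed_ended, rated_pool)
print(from_original_tsi + time_management == rated_pool)
```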

Design and Procedures

The design used with the content validity samples was a "cross-sectional" survey design (Huck, Cormier, & Bounds, 1974); each expert was surveyed once during the summers of 1980 to 1984. During each academic year an address list of "stress experts" was developed from that year's literature. These individuals were then surveyed the following summer using paper-and-pencil procedures; each potential respondent received an introductory letter, the modified Inventory, and a prestamped, return-addressed envelope.


Relevance means for the TSI items varied somewhat around the 3.0 or quite relevant level, ranging from a low of 2.5 (Q24, Teachers feel frustrated because their students would probably do better if they only tried harder) to a high of 3.4 (Q29 and Q47, Feeling unable to cope; Experiencing physical exhaustion). Of the 49 rated items, 28 met or exceeded the 3.0 relevance level, whereas 21 fell slightly below this. Items meeting or exceeding the 2.5 level (i.e., relevant) were to be retained in the item pool; because every item fell in the relevant to quite relevant range, all 49 were retained.
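
The retention rule described above can be sketched as a simple threshold check. The snippet below is illustrative only: the function and variable names are hypothetical, and it uses only the three item means that the text reports explicitly (the full set of means is in Table 12 and is not reproduced here).

```python
# Illustrative sketch of the retention rule; names are hypothetical.
RETAIN_CUTOFF = 2.5          # "relevant" level: items at or above this are kept
QUITE_RELEVANT = 3.0         # "quite relevant" level on the 4-point scale

# Only the item means stated in the text; other items are omitted.
reported_means = {"Q24": 2.5, "Q29": 3.4, "Q47": 3.4}

def classify(mean_rating):
    """Return the retention decision for one item's mean relevance rating."""
    if mean_rating >= QUITE_RELEVANT:
        return "retain (quite relevant or higher)"
    if mean_rating >= RETAIN_CUTOFF:
        return "retain (relevant)"
    return "drop"

decisions = {item: classify(m) for item, m in reported_means.items()}
for item, decision in decisions.items():
    print(item, decision)
```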

To assess the degree of congruence among the experts' ratings, Finn's (1970) r formula was used to calculate an interrater reliability correlation, first for each item, then for each TSI subscale and scale, as outlined by Tinsley and Weiss (1975). These correlations can range from 0.0 (a total lack of correspondence among raters) to 1.0 (perfect agreement among raters). As noted in Table 12, item-level correlations ranged from a low of .18 (Q34, Using prescription drugs) to a high of .90 (Q50, Rushing in one's speech). All item-level correlations were significant, at the .05 (2 items), .01 (6 items), or .001 (41 items) probability level. The data were then inspected at the subscale and scale levels using the newly defined factors of the final form of the TSI; these reliabilities, as noted in Table 12, ranged from a low of .42 (Behavioral Manifestations) to a high of .72 (Time Management). Interrater reliability for the Total TSI was .82. The reliability estimates were generally larger for the stress sources than they were for the stress manifestations, indicating slightly more agreement among the experts about what causes teacher stress than about how that stress is manifested. All subscale and scale interrater reliabilities were significant at the .001 probability level for the given sample sizes.
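
As a rough illustration of the item-level computation, Finn's r is commonly written as 1 minus the ratio of the observed variance among raters to the variance expected if ratings were purely random, which for a k-point scale with equally likely categories is (k² − 1)/12. The sketch below assumes that form and the 4-point relevance scale used here; the function name and the sample ratings are hypothetical, and this is not a reproduction of the original analysis.

```python
from statistics import variance

def finn_r(ratings, scale_points=4):
    """Finn's r for one item, given one rating per rater.

    Assumes the common form r = 1 - (observed variance / chance variance),
    where chance variance is that of a uniform distribution over the
    scale categories: (k**2 - 1) / 12.
    """
    chance_variance = (scale_points ** 2 - 1) / 12   # 1.25 for a 4-point scale
    observed_variance = variance(ratings)            # sample variance across raters
    return 1.0 - observed_variance / chance_variance

# Hypothetical ratings from four raters, for illustration only.
perfect_agreement = finn_r([3, 3, 3, 3])
partial_agreement = finn_r([3, 3, 4, 4])
print(perfect_agreement, round(partial_agreement, 3))
```

Note that this arithmetic form is not clamped: if raters disagree more than chance would predict, the result falls below 0.0, which is read in practice as no agreement.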

Also reported in Table 12 are the expert appraisal means and standard deviations based on the summed and averaged item-level ratings. Subscale means ranged from a low of 2.9 (Professional Investment; Gastronomic Manifestations) to a high of 3.3 (Emotional Manifestations). A scale mean of 3.1 indicated that the experts viewed the pool of stress items as being quite relevant to teacher stress. Standard deviations were moderate at both subscale (0.6 to 0.9) and scale (0.5) levels.