Evaluation Summary of Article
on Two way ANOVA (Fixed Effects - Model I) Design
Submitted by Pitiyage Bilesha Perera

 

1. Background Information:

Authors: Hewes, R.L., and Janikowski, T.P.
Title of the Article: Parental alcoholism and codependency: A comparison of female
children of alcoholics and non-alcoholics in two college-age groups.
Sources: The College Student Journal, Vol. 32; Number 1, 140-147
Year: 1998

2. Abstract:

Codependency behaviors of the children of alcoholic parents have gained much attention among public health professionals as an important public health issue. The present study aimed to compare codependency behaviors between children of alcoholics (COAs) and children of non-alcoholics (non-COAs) at two class levels (underclass and upperclass) using a sample of female college students. The sample consisted of 76 female college students, who were surveyed and screened using a demographic questionnaire, the Spann-Fischer Codependency Scale and the Children of Alcoholics Screening Test. A significant interaction effect was found between class level and parental alcoholism (F = 5.05; p = 0.028). Simple main effects analysis indicated that underclass (first-year and sophomore) COAs had significantly lower codependency scores than their non-COA peers, while no significant difference in codependency was found between COAs and non-COAs in the upperclass (juniors and seniors). The codependency score of COAs at the upperclass level was significantly higher than that of COAs at the underclass level, suggesting that COAs may increase their codependency behaviors with an increasing number of years at college. These findings could be used to improve and adapt counseling techniques for female college students.


3. Null hypothesis, Alpha (or p) level, and Sample size per group:

Null hypotheses were not clearly stated in the article. I constructed the null hypotheses based on information given in the article.

(1) Ho : The population mean score of the codependency behavior of female college students of alcoholic families is equal to the population mean score of the codependency behavior of female college students of non-alcoholic families.

H1 : The population mean score of the codependency behavior of female college students of alcoholic families is not equal to the population mean score of the codependency behavior of female college students of non-alcoholic families.

(2) Ho: The population mean score of the codependency behavior of female college students in the under-class is equal to the population mean score of the codependency behavior of female college students in the upper-class.

H1: The population mean score of the codependency behavior of female college students in the under-class is not equal to the population mean score of the codependency behavior of female college students in the upper-class.

(3) Ho: There is no influence on female college students' codependency behavior associated with the joint effects of parental alcoholism and class level.

H1: There is an influence on female college students' codependency behavior associated with the joint effects of parental alcoholism and class level.


The alpha level was not stated in advance. However, the authors appear to have used 0.05 as the probability cut-off point for the analysis of variance. The p value for the interaction test was reported (F = 5.05; p = 0.028).
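
As a rough check on the reported interaction test, the p value can be recomputed from F = 5.05 with 1 and 72 degrees of freedom (72 is the within-groups df for 76 subjects in four cells). The sketch below is not the authors' procedure. It assumes the identity F(1, v) = t(v)^2 and integrates the t density numerically with Simpson's rule; the function names are illustrative.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def f_test_p_value_1df(f_stat, df_error, steps=2000):
    """Upper-tail p value for an F statistic with 1 numerator df.

    Uses F(1, df_error) = t(df_error)^2, so p = P(|T| > sqrt(F)).
    steps must be even for Simpson's rule.
    """
    t = math.sqrt(f_stat)
    h = t / steps
    total = t_pdf(0.0, df_error) + t_pdf(t, df_error)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df_error)
    central_half = total * h / 3  # P(0 < T < t)
    return 1 - 2 * central_half


p_interaction = f_test_p_value_1df(5.05, 72)
print(round(p_interaction, 3))
```

Under these assumptions the result falls close to the reported p = 0.028 and below 0.05.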

In addition to the above null hypotheses, the following null hypotheses for simple main effects were tested using t-tests. I constructed the null hypotheses for these simple main effects.

1. Ho: There is no significant difference of the population mean scores of codependency behavior of underclass female COAs and underclass female non-COAs.
H1: There is a significant difference of the population mean scores of codependency behavior of underclass female COAs and underclass female non-COAs.

(Here a significant result was found, p=0.043)

2. Ho: There is no significant difference of the population mean scores of codependency behavior of upperclass female COAs and upperclass female non-COAs.
H1: There is a significant difference of the population mean scores of codependency behavior of upperclass female COAs and upperclass female non-COAs.

(Here no significant result was found, p= 0.255)


3. Ho: There is no significant difference of the population mean scores of
codependency behavior of underclass female COAs and upperclass female COAs.
H1: There is a significant difference of the population mean scores of
codependency behavior of underclass female COAs and upperclass female COAs.

(Here significant difference was found, p=0.015)


4. Ho: There is no significant difference of the population mean scores of codependency behavior of underclass female non-COAs and upperclass female non-COAs.
H1: There is a significant difference of the population mean scores of codependency behavior of underclass female non-COAs and upperclass female non-COAs.

(Here no significant difference was found, p = 0.706)

The authors did not report a preset alpha level, but they gave the p value for each t-test used to investigate the simple main effects (as listed above).

The sample size, mean score and standard deviation for each group (underclass COAs (n=11), underclass non-COAs (n=36), upperclass COAs (n=10) and upperclass non-COAs (n=19)) were given in Table 1 of the article. The four groups have different sample sizes, so this is an unbalanced 2 x 2 ANOVA design. Further, the two factors (independent variables), class level and parental alcoholism (COA or non-COA), are assigned factors, so none of the female students was randomly assigned to the levels of either factor.


4. Independent variables:

Two independent variables were considered in this study. Both were fixed variables, so the model is a fixed effects model.

1. Parental alcoholism. There are only two levels here: female students who come from an alcoholic family and female students who come from a non-alcoholic family. This is a fixed effect variable because all the levels about which inferences are to be drawn are included in the experiment.


2. Student's class level. There are also two levels here: female students at the underclass level and female students at the upperclass level. The underclass level was defined as first-years and sophomores, and the upperclass level as juniors and seniors. This is also a fixed variable because the population consists of only these two levels.

Dependent variable:

In this study the dependent variable was defined as the codependency behavior score of the sample subjects. This variable was measured using the Spann-Fischer Codependency Scale.

5. Instruments used

1. The Children of Alcoholics Screening Test (CAST) was used to identify whether the student had come from an alcoholic or a non-alcoholic family. This test comprises 30 questions, and scores may range from 0 to 30. A score of 6 or more indicates that the test taker has a parental figure who is an alcoholic.
The split-half reliability of the CAST was 0.98, which indicates high reliability. As the reliability coefficient increases, the standard error of measurement decreases, so a coefficient of 0.98 indicates small measurement errors in the CAST. However, this reliability coefficient was obtained from a sample of 133 latency-age and adolescent individuals. The present study sample does not appear to come from the population on which the reliability of the CAST was tested, and reliability may vary if a sample does not come from the population on which the instrument was originally tested.
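
The screening rule described above can be sketched as follows. The source gives only the number of items (30), the score range (0 to 30) and the cut-off (6 or more); scoring one point per "yes" answer is an assumption made here, and the function names are illustrative.

```python
CAST_ITEMS = 30   # number of questions, as stated in the article
CAST_CUTOFF = 6   # a total of 6 or more classifies the respondent as a COA


def cast_total(answers):
    """Total CAST score, assuming one point per 'yes' (True) answer."""
    if len(answers) != CAST_ITEMS:
        raise ValueError(f"expected {CAST_ITEMS} answers, got {len(answers)}")
    return sum(1 for answer in answers if answer)


def is_coa(total, cutoff=CAST_CUTOFF):
    """Classify a respondent as COA when the total reaches the cut-off."""
    return total >= cutoff


# Hypothetical respondent with six 'yes' answers out of thirty.
example = [True] * 6 + [False] * 24
print(cast_total(example), is_coa(cast_total(example)))
```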

It would have been better if internal consistency measures of the CAST had been discussed in the paper, as the CAST covers different kinds of items such as attitudes, feelings, perceptions and experiences. In addition, the length of the test was given, and the reliability of a test instrument generally increases with the length of the test. The CAST has 30 items, so it would be expected to have high reliability.
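
The two relationships used in this discussion of the CAST can be written out explicitly. The sketch assumes the classical test theory formula SEM = SD * sqrt(1 - r) and the Spearman-Brown prophecy formula for a lengthened test. Neither formula is quoted in the article, and the score standard deviation of 10 is a hypothetical value chosen only for illustration.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return score_sd * math.sqrt(1 - reliability)


def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor."""
    return (length_factor * reliability
            / (1 + (length_factor - 1) * reliability))


# Hypothetical score SD of 10: the SEM shrinks as reliability rises.
for r in (0.73, 0.80, 0.98):
    print(r, round(standard_error_of_measurement(10.0, r), 2))

# Doubling a test whose reliability is 0.73 (assumed parallel items).
print(round(spearman_brown(0.73, 2), 3))
```

The first loop shows why a coefficient of 0.98 implies small measurement error, and the last line shows why a longer test is expected to be more reliable.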

The construct validity of the instrument is very important here for classifying students into COA and non-COA groups. Analysis of variance and chi-square analysis (Jones, 1983) were used to show the construct validity of the CAST.
The article states that the CAST was normed on both child and adult groups (ability to differentiate COAs and non-COAs), so the CAST has demonstrated high validity.

2. The Spann-Fischer Codependency Scale (SF-CDS) was used to measure the codependency score of each sample subject (i.e. to measure the dependent variable).

The SF-CDS has 16 items measured on a 6-point Likert scale, and total scores may vary from 16 to 96. The higher the score on this scale, the higher the level of codependency. The scale has been tested for reliability and validity using student groups, a recovery group and a codependent group. It is again important to know whether the student groups that were used to test the SF-CDS and the female student group in the present study can be considered samples from the same population, but such information was not given in the article.

The SF-CDS has shown good internal reliability (reliability coefficients from 0.73 to 0.80). However, the scale has only 16 items, and such a short length can adversely influence its reliability. It would be better if there were more reliability evidence, such as split-half, alternate-form and test-retest coefficients, as these different reliability methods give an indication of different sources of error.
The content validity of the SF-CDS was established through the use of a group of experts. The authors argued that the construct validity of the SF-CDS was established using factor analysis. Convergent and discriminant validity of the SF-CDS were also demonstrated (Fischer, Spann & Crawford, 1991). According to the information given, the SF-CDS is a valid measurement.

3. A questionnaire was used to identify basic demographic characteristics of the sample subjects. It is mentioned in the article that variables such as age and class standing were identified using this questionnaire. It is very unlikely that there were any errors in this survey questionnaire.


6. Experimental Procedure:

Subjects:

The subjects were 76 female college students from introductory rehabilitation services classes in a small, coeducational, residential college in the Eastern United States.

Sampling Methods:

The authors did not describe how they recruited the sample subjects. They mentioned that 120 females were initially recruited and voluntarily agreed to participate in the study, but the selection method (out of how many students these 120 were recruited, and how) is not clearly given. The researchers predetermined an age range of 18-22, and those who fell outside this age range were excluded. In addition, those who were 20 years of age and spanned class levels were also excluded in order to have homogeneous age groups in the two class levels. The final sample therefore consisted of 76 participants: 47 (61.8%) underclass female students and 29 (38.2%) upperclass female students. The mean age of the underclass group was 19.4 (range 18.1-19.9) and that of the upperclass group was 21.5 (range 21-22.3).
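
The sample bookkeeping can be verified with a few lines of arithmetic. This is only a consistency check on numbers quoted in this summary: the exclusion counts (4 and 40) are the ones given in the weaknesses section below, the cell sizes are those listed in section 3, and the variable names are illustrative.

```python
recruited = 120
excluded_outside_age_range = 4   # outside the 18-22 age range
excluded_age_20_spanning = 40    # aged 20 and spanning class levels

# Cell sizes as listed in section 3 of this summary.
cell_sizes = {
    ("underclass", "COA"): 11,
    ("underclass", "non-COA"): 36,
    ("upperclass", "COA"): 10,
    ("upperclass", "non-COA"): 19,
}

final_n = recruited - excluded_outside_age_range - excluded_age_20_spanning
class_totals = {}
for (class_level, _), n in cell_sizes.items():
    class_totals[class_level] = class_totals.get(class_level, 0) + n

class_percent = {level: round(100 * n / final_n, 1)
                 for level, n in class_totals.items()}
print(final_n, class_totals, class_percent)
```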

Data Collection Procedure:

The paper mentioned only that the participants were asked to complete, in order, a demographic questionnaire, the SF-CDS and the CAST. Details of the actual data collection procedure were not given (who was involved in data collection, whether all the students participated at the same time, etc.), although these details matter for internal and external validity.


7. Statistical Analysis:

An ANOVA table was not given in the paper; only the F value and p value for the interaction were given. I have therefore constructed the ANOVA table using the data available in the article. This is an unbalanced design (i.e. unequal sample sizes in each group). However, it was possible to construct the ANOVA table using the data in Table 1 of the article, as this was a 2 x 2 ANOVA (Armitage & Berry, 1987).
The sum of squares due to the difference between female COAs and female non-COAs is calculated first. The sum of squares due to class level, adjusted for any COA and non-COA differences, is then calculated. Alternatively, the effect of class level can be calculated first, followed by the additional sum of squares due to COA or non-COA status, adjusted for class level. With an unbalanced design it is often sensible to carry out the analysis using both orders.
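
The two orders of calculation can be sketched from cell summary statistics alone. This is a minimal illustration of one standard approach for a 2 x 2 layout, not necessarily the exact steps in Armitage & Berry: the first factor gets its unadjusted sum of squares, the 1-df interaction comes from the contrast of the four cell means, and the second factor takes the remainder of the between-cells sum of squares. The cell sizes are those reported in this summary, but the means and standard deviations are HYPOTHETICAL placeholders, because the article's Table 1 values are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    n: int        # cell sample size
    mean: float   # cell mean of the dependent variable
    sd: float     # cell standard deviation


def sequential_anova_2x2(cells, first):
    """Sequential sums of squares for an unbalanced 2 x 2 design.

    cells maps (a_level, b_level), each 0 or 1, to a Cell.
    first is 0 to enter factor A first, or 1 to enter factor B first.
    """
    total_n = sum(c.n for c in cells.values())
    grand_mean = sum(c.n * c.mean for c in cells.values()) / total_n

    # Between-cells and within-cells (error) sums of squares.
    ss_cells = sum(c.n * (c.mean - grand_mean) ** 2 for c in cells.values())
    ss_error = sum((c.n - 1) * c.sd ** 2 for c in cells.values())
    df_error = total_n - len(cells)

    # Unadjusted SS for the factor entered first (from its marginal means).
    ss_first = 0.0
    for level in (0, 1):
        group = [c for key, c in cells.items() if key[first] == level]
        n = sum(c.n for c in group)
        mean = sum(c.n * c.mean for c in group) / n
        ss_first += n * (mean - grand_mean) ** 2

    # Interaction SS (1 df) from the 2 x 2 contrast of cell means.
    contrast = (cells[0, 0].mean - cells[0, 1].mean
                - cells[1, 0].mean + cells[1, 1].mean)
    ss_inter = contrast ** 2 / sum(1 / c.n for c in cells.values())

    # The rest of the between-cells SS is the second factor, adjusted.
    ss_second_adj = ss_cells - ss_first - ss_inter
    ms_error = ss_error / df_error
    return {
        "ss_first": ss_first,
        "ss_second_adjusted": ss_second_adj,
        "ss_interaction": ss_inter,
        "ss_error": ss_error,
        "df_error": df_error,
        "f_interaction": ss_inter / ms_error,
    }


# Cell sizes are from this summary; means and SDs are HYPOTHETICAL.
# Key: (0 = COA, 1 = non-COA), (0 = underclass, 1 = upperclass).
cells = {
    (0, 0): Cell(11, 40.0, 9.0),
    (0, 1): Cell(10, 52.0, 11.0),
    (1, 0): Cell(36, 47.0, 10.0),
    (1, 1): Cell(19, 48.0, 10.5),
}
af_first = sequential_anova_2x2(cells, first=0)
cl_first = sequential_anova_2x2(cells, first=1)
for name, table in (("AF first", af_first), ("CL first", cl_first)):
    print(name, {k: round(v, 2) for k, v in table.items()})
```

In this sketch the interaction and error lines are identical in both orders; only the split between the two main effects changes, which is why it is worth reporting both tables for an unbalanced design.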

Summary of Two way ANOVA for codependency behavior of Female COAs and Non-COAs in two college age groups

Table I

Source                              SS         df    MS        F

Alcoholic Family (AF)               23.79       1    23.79     0.229
  (COA or non-COA)

Class Level (CL)                    101.27      1    101.27    0.978
  (under or upper;
  adjusted for Alcoholic Family)

AF x CL                             521.24      1    521.24    5.035 *

Error (within groups)               7452.87    72    103.51

Total                               8099.17    75

* indicates that the F statistic is significant at the 0.05 level, p = 0.028

Table II

Source                              SS         df    MS        F

Class Level (CL)                    88.66       1    88.66     0.856

Alcoholic Family (AF)               36.68       1    36.68     0.354
  (adjusted for Class Level)

CL x AF                             521.83      1    521.83    5.041 *

Error (within groups)               7452.87    72    103.51

Total                               8100.04    75

* indicates that the F statistic is significant at the 0.05 level, p = 0.028

The ANOVA results indicate that neither main effect was significant (the main effect of class level on the codependency behavior of female college students, and the main effect of parental alcoholism, i.e. whether the student came from an alcoholic family or not, on the codependency behavior of female college students). The interaction between class level and parental alcoholism was significant (F(1, 72) = 5.05; p = 0.028).
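
The mean squares and F ratios in the two constructed tables can be recomputed from the sums of squares and degrees of freedom. The sketch below uses only the values shown in the tables; the dictionary layout and function name are illustrative. Small differences in the third decimal place are expected because the tabled sums of squares are rounded.

```python
def f_ratios(effects, ss_error, df_error):
    """F = (SS_effect / df_effect) / (SS_error / df_error) for each effect."""
    ms_error = ss_error / df_error
    return {name: (ss / df) / ms_error for name, (ss, df) in effects.items()}


SS_ERROR, DF_ERROR = 7452.87, 72

# (SS, df) pairs copied from the two constructed tables.
table_af_first = {
    "AF": (23.79, 1),
    "CL adjusted for AF": (101.27, 1),
    "AF x CL": (521.24, 1),
}
table_cl_first = {
    "CL": (88.66, 1),
    "AF adjusted for CL": (36.68, 1),
    "CL x AF": (521.83, 1),
}

f_af_first = f_ratios(table_af_first, SS_ERROR, DF_ERROR)
f_cl_first = f_ratios(table_cl_first, SS_ERROR, DF_ERROR)
print(round(SS_ERROR / DF_ERROR, 2))
print({k: round(v, 3) for k, v in f_af_first.items()})
print({k: round(v, 3) for k, v in f_cl_first.items()})
```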

Simple main effects analysis indicates that the difference in codependency behavior between underclass COAs and non-COAs was significant (p = 0.043) and that the difference in codependency behavior between underclass and upperclass COAs was also significant (p = 0.015).

(The appropriateness of simple main effects analysis in this experiment is discussed in the comment section.)

No post-hoc procedure was conducted because there is no need for multiple comparisons (i.e. each independent variable has only two levels).


8. If you were a researcher, how would you improve the study? Be specific.

Merits of the article:

(1) The title of the article is suitable. The authors have included all important sections in the article (i.e. literature review, objectives of the study, methods used, instruments used (with the reliability and validity of those instruments), statistical analysis, results and discussion (including limitations, recommendations and suggestions for further research)). The authors recommended a longitudinal study to observe changes in the codependency behavior of college students over time, which is a very good suggestion.

(2) The literature review clearly established the importance of the study. The authors discussed the limitations of previous studies and the new knowledge that this study aims to provide.

(3) The independent and dependent variables of the study were clearly stated. The two independent variables used in the study are assigned factors (i.e. characteristics that the subjects bring with them to the investigation, so that randomization is not possible when assigning subjects to groups), and each factor has only two levels, so no post-hoc comparisons are needed.

(4) The authors were careful about confounding factors. Sex is a potential confounding factor here; because the authors used only females in this study, no confounding effects due to sex occurred.

(5) The authors have discussed in detail the reliability and the validity of the instruments that they had used in the experiment.

(6) The authors have used an appropriate experimental design.

(7) For each group in the study, the authors have given (in Table 1 of the article) the sample size, the mean value of the dependent variable and its standard deviation, so that readers can construct the ANOVA table if they wish.

(8) Participants were asked to complete, in order, a demographic questionnaire, the SF-CDS and the CAST. This order may be important to minimize the number of dropouts from the study. For example, if the students had been asked to complete the CAST first, some of them might have lost interest in completing the other scales. (However, the authors did not mention ethical approval of the experiment. If the students were blind to the objectives of the experiment, it would not be ethical.)


Weaknesses of the study:

(1) It would have been helpful to readers if the authors had clearly stated the null hypotheses.

(2) It would have been better if the authors had discussed the assumptions underlying this experimental design (and any violations of them). Since there are unequal numbers of subjects in each cell (group), the homogeneity of error variance should be discussed. Further, if the codependency scores are not normally distributed, a larger number of subjects per group should be used. Because the cell sizes are unequal, the experimental design is non-orthogonal, and the consequences of non-homogeneous variances for the probability of a type I error can be substantial.

(3) The authors conducted t-tests to examine simple main effects, but this is not an appropriate analysis here. Since there is an interaction effect in the 2 x 2 ANOVA, the mean score in each group (cell) also contains an interaction effect component. It is therefore wrong to test for significant differences between two group mean scores at each level of the independent variables using only the group means.

(4) Alpha levels were not stated for the hypotheses before the analysis.

(5) The hypothesis for the interaction was not stated (although the interaction was discussed in the results section).

(6) Recruitment of participants into the study was not clearly described. The article says that 120 females were initially recruited and voluntarily agreed to participate in the study, but no information was given about whether this was a sample from the entire college, whether random selection was made, etc.

(7) The ANOVA table was not given in the article. (Since this is an unbalanced design, it is somewhat difficult to construct the ANOVA table using the data available in the article.)

(8) The authors did not comment on the appropriateness of the SF-CDS and CAST measurement scales for this study sample. In particular, the reliability coefficients that they cited can be applied to this female student sample only if it is a representative sample of the population on which the scales were originally tested.

(9) The timing of the experiment is also important: whether the experiment was conducted for all student participants at the same time of the year, and whether it was conducted in fall, spring or summer. For example, results may vary if the researchers tested first-year students in the spring semester rather than in the fall, just after they entered the college. By the spring, first-year students may be at a different developmental stage and much more familiar with the college environment, which may influence their codependency behavior. The authors should have discussed this in detail in the methodology.

(10) The authors were in fact interested in comparing two age groups, so they were careful about the homogeneity of the students' ages in each class level. They used an age range of 18-22 years. They initially excluded 4 students outside this age range and later excluded another 40 female students who were 20 years of age and spanned class levels. However, this adversely affects the external validity (generalizability) of the research findings.

If I were a researcher, how I would improve the study:

(1) I would clearly state all the null hypotheses that are going to be tested.

(2) I would specify the alpha levels for all hypotheses before the analysis of the results, in addition to reporting p values.

(3) I would include two ANOVA tables (which I constructed) in the paper. It would be more informative to the reader to have an ANOVA table presented in the paper.

(4) I would discuss the assumptions made in constructing such unbalanced ANOVA tables and possible deviations from these assumptions.

(5) Effect sizes and power of the test would be included in the article.


(6) I would briefly explain the recruitment procedure (the selection procedure of the subjects) and possible shortcomings in selecting subjects. In the methodology section I would include details of when and how the experiment was conducted (i.e. whether all sample subjects were tested at the same time or at different times).


(7) I would include additional reliability evidence for the CAST and the SF-CDS and discuss the suitability of these scales for this study sample of female college students.


(8) I would conduct the experiment using a larger sample and using equal numbers of subjects in each cell (so that the experimental design will be orthogonal and possible consequences of non-homogeneous variances on the probability of type I error can be minimized).
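
As an illustration of the effect size suggested in item (5), partial eta squared for the interaction can be computed from the constructed ANOVA table. The sketch assumes the usual definitions, partial eta squared = SS_effect / (SS_effect + SS_error) and Cohen's f = sqrt(eta / (1 - eta)); the article itself reports neither, and the function names are illustrative. A power calculation would additionally need a noncentral F distribution and is not attempted here.

```python
import math


def partial_eta_squared(ss_effect, ss_error):
    """Share of effect-plus-error variation attributed to the effect."""
    return ss_effect / (ss_effect + ss_error)


def cohens_f(eta_sq):
    """Cohen's f corresponding to a (partial) eta squared."""
    return math.sqrt(eta_sq / (1 - eta_sq))


# Interaction and error sums of squares from the first constructed table.
eta_interaction = partial_eta_squared(521.24, 7452.87)
f_interaction = cohens_f(eta_interaction)
print(round(eta_interaction, 3), round(f_interaction, 3))
```

With these table values the interaction accounts for about 6.5% of the effect-plus-error variation.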

References:

Armitage, P. & Berry, G. Statistical methods in medical research (3rd ed.). Cambridge: Blackwell Science, 1987.