ANCOVA stands for Analysis of Covariance. It blends ANOVA and regression within the general linear model. ANCOVA evaluates whether the means of a dependent variable (DV) are equal across the levels of a categorical independent variable (IV), often called the treatment, while statistically controlling for the effects of other continuous variables that are not of primary interest. These variables are known as covariates (CVs) or nuisance variables. Mathematically, ANCOVA decomposes the variance in the DV into variance explained by the CVs, variance explained by the categorical IV, and residual variance. In effect, ANCOVA adjusts the DV for the group means of the CVs. The model assumes a linear relationship between the covariate and the DV, which gives the adjusted mean of each group as:

$$\bar{Y}_{i,\mathrm{adj}} = \bar{Y}_i - b\,(\bar{X}_i - \bar{X})$$

In this relation, Y is the DV, X is the covariate, b is the pooled within-group regression coefficient of Y on X, and i indexes one of the k groups. $\bar{Y}_i$ and $\bar{X}_i$ are the means of group i, and $\bar{X}$ is the grand mean of the covariate. ANCOVA is used to analyze differences in the mean values of the dependent variable that are related to the effect of the controlled independent variable, while taking into account the influence of the uncontrolled variables (the covariates).
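As a minimal sketch of the adjustment formula above, the following Python computes the pooled within-group slope b and the adjusted group means. The function name `adjusted_means`, the group labels and the data are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def adjusted_means(groups):
    """Return (b, adjusted) for groups given as {label: (covariate, dv)}.

    b is the pooled within-group slope of the DV on the covariate.
    """
    all_x = [x for xs, _ in groups.values() for x in xs]
    grand_x = sum(all_x) / len(all_x)

    sxy = sxx = 0.0
    means = {}
    for label, (xs, ys) in groups.items():
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        means[label] = (mean_x, mean_y)
        # Accumulate within-group cross-products and squared deviations.
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        sxx += sum((x - mean_x) ** 2 for x in xs)

    b = sxy / sxx
    adjusted = {
        label: mean_y - b * (mean_x - grand_x)
        for label, (mean_x, mean_y) in means.items()
    }
    return b, adjusted


# Hypothetical data: (covariate values, DV values) per group.
data = {
    "control": ([1, 2, 3], [2, 4, 6]),
    "treatment": ([3, 4, 5], [7, 9, 11]),
}
b, adjusted = adjusted_means(data)
print(b, adjusted)
```

With these made-up numbers the raw group means differ by 5 (4 versus 9), but the adjusted means differ by only 1 (6 versus 7), because the treatment group started with higher covariate values.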

Some Common Terms Related to ANCOVA

The following terms should be understood before working with ANCOVA:

1. Covariate

A covariate is a continuous independent variable measured at the interval level. ANCOVA is applied when covariates are present; otherwise, ANOVA is used. Covariates are most commonly used as control variables. For example, a baseline pre-test score can serve as a covariate to control for initial group differences in an ANCOVA study. ANCOVA examines the effects of the categorical independents on the interval dependent variable after the effects of the interval covariates have been controlled.

2. F-test

The F-test of significance is used to test each main effect and each interaction effect, for the case of a single interval dependent variable and multiple groups formed by a categorical independent. F is the variance between the groups divided by the variance within the groups. If the p-value of the computed F is below the chosen significance level (commonly 0.05), the effect is considered statistically significant.
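The ratio described above can be sketched in a few lines of Python. This is an illustrative one-way calculation without a covariate; `one_way_f` and the sample values are hypothetical. Converting F to a p-value would additionally require the F distribution, which is not shown here.

```python
def one_way_f(groups):
    """F ratio: between-group mean square divided by within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    ms_between = ss_between / (k - 1)   # df = k - 1
    ms_within = ss_within / (n - k)     # df = n - k
    return ms_between / ms_within


# Hypothetical scores for three groups.
samples = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
print(one_way_f(samples))
```

For these samples the between-group mean square is 21 and the within-group mean square is 1, so F is 21.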

Adjusted means are usually part of the ANCOVA output and are examined when the F-test shows that a significant relationship exists. Comparing the original group means with the adjusted group means shows the role of the covariates. The adjustment shows how the means of the k groups change once the covariates are controlled.

3. T-test

A t-test tests the significance of the difference in the means of a single interval dependent variable between two groups formed by a categorical independent.

Is it Possible to Model ANCOVA via Regression?

Yes, ANCOVA can be modeled using multiple regression, with dummy variables standing in for the categorical independents. When creating the dummy variables, use one fewer dummy than the number of categories of each independent. For a full ANCOVA, interaction cross-product terms are added for each pair of independents included in the equation, including the dummies. The multiple regression is then computed. The resulting F-tests are the same as in the classical ANCOVA. The F ratio can also be computed through the extra sum of squares, by comparing a full model with a reduced model.
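The regression route can be sketched in plain Python. The sketch dummy-codes one categorical independent, fits a full model (intercept, dummies, covariate) and a reduced model (intercept, covariate) by least squares, and forms the extra-sum-of-squares F ratio for the group effect. All function names and the data are hypothetical, and the small normal-equations solver is an assumption made to keep the example free of external libraries; real analyses would normally use a statistics package.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_sse(rows, y):
    """Residual sum of squares of the least-squares fit of y on the design rows."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(bj * v for bj, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def dummy_code(labels):
    """k categories become k - 1 dummies; the first sorted level is the reference."""
    levels = sorted(set(labels))
    return [[1.0 if lab == lev else 0.0 for lev in levels[1:]] for lab in labels]


def extra_ss_f(full_rows, reduced_rows, y):
    """F ratio from the extra sum of squares of the full over the reduced model."""
    sse_full = ols_sse(full_rows, y)
    sse_reduced = ols_sse(reduced_rows, y)
    df_num = len(full_rows[0]) - len(reduced_rows[0])
    df_den = len(y) - len(full_rows[0])
    return ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)


# Hypothetical data: group label, covariate x, dependent variable y.
group = ["a", "a", "a", "b", "b", "b"]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 4.0, 5.0, 6.0]

dummies = dummy_code(group)
full = [[1.0] + d + [xi] for d, xi in zip(dummies, x)]   # intercept, dummy, covariate
reduced = [[1.0, xi] for xi in x]                        # intercept, covariate
print(extra_ss_f(full, reduced, y))
```

The reduced model here drops only the group dummies, so the resulting F tests the group effect after the covariate has been accounted for. Interaction terms are left out of this sketch to keep it short.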

Uses of ANCOVA

Because ANCOVA is closely related to factorial ANOVA, it can be used in the following ways:

ANCOVA is widely applied in business research.
It is used to estimate the variation in consumers' intention to purchase a particular brand with respect to price and the consumers' attitude towards that brand.
ANCOVA also determines how a change in the price of a particular product affects customers' consumption of that commodity.
ANCOVA is useful in cases where the dependent variable is linearly related to the covariates but not directly related to the factors.
It is employed to balance out the effects of powerful, non-interacting continuous independent variables, which reduces uncertainty about the effects of the independent variables of interest.
ANCOVA controls for factors that cannot be randomized in an experimental design but can be measured on an interval scale.
It also fits regression models that contain both categorical and interval independents. More recently, this use has been displaced by logistic regression.
ANCOVA serves as an extension of multiple regression for comparing several regression lines.
The procedure can also be applied to pretest/posttest analyses, where regression to the mean affects the posttest measurements.
ANCOVA can also be employed in quasi-experiments and in non-experimental research such as surveys.

Other Uses of ANCOVA

Apart from the uses listed above, ANCOVA has some other important applications:

1. Adjusts Pre-existing Differences

ANCOVA adjusts for pre-existing differences in non-equivalent (intact) groups, correcting for initial group differences on the DV that exist before the groups are formed. In this situation, participants cannot be made equal through random assignment, so CVs are used to adjust the scores and make the participants more similar than they would be without the CV. Even with covariates, however, no statistical technique can make unequal groups equal. In addition, the CV is sometimes so closely intertwined with the IV that removing the variance in the DV associated with the CV would also remove a considerable part of the variance in the DV, rendering the result meaningless.

2. Increasing the Power of the Experimental Designs

ANCOVA can increase statistical power, the probability of detecting a difference between groups when one exists, by reducing the within-group error variance. To see why, consider the F-test used to evaluate differences between groups in an experiment where participants are allocated to treatment and control groups: it divides the variance explained by group membership by the unexplained variance within the groups. A covariate that is reliably related to the DV removes part of that unwanted, unexplained variance from the error term. With a smaller error term, the F-test becomes more sensitive to main effects and to interaction effects.
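A small numerical sketch can make the shrinking error term concrete. It compares the error mean square of a plain one-way ANOVA with the error mean square after a single covariate has been taken into account. The function name `error_mean_squares` and the data are hypothetical.

```python
def error_mean_squares(groups):
    """Return (ms_error_anova, ms_error_ancova) for {label: (covariate, dv)}."""
    n = sum(len(ys) for _, ys in groups.values())
    k = len(groups)
    sxx = sxy = syy = 0.0
    for xs, ys in groups.values():
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        sxx += sum((x - mean_x) ** 2 for x in xs)
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        syy += sum((y - mean_y) ** 2 for y in ys)

    ms_anova = syy / (n - k)
    # The covariate removes sxy**2 / sxx from the error sum of squares
    # and costs one degree of freedom.
    ms_ancova = (syy - sxy ** 2 / sxx) / (n - k - 1)
    return ms_anova, ms_ancova


# Hypothetical data: (covariate values, DV values) per group.
data = {
    "control": ([1, 2, 3], [1, 3, 2]),
    "treatment": ([1, 2, 3], [4, 5, 6]),
}
print(error_mean_squares(data))
```

For this toy data the error mean square drops from 1.0 to about 0.58 once the covariate is included, so the same between-group difference yields a larger F. The gain depends on how strongly the covariate is related to the DV; a weak covariate can cost a degree of freedom without reducing the error term much.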

3. Adjusting the Means of Numerous Dependent Variables

ANCOVA also extends to multivariate ANCOVA (MANCOVA), which handles several dependent variables. This is useful when a researcher wants to assess the contribution of individual dependent variables by removing the effects of the other dependent variables from the analysis, treating them as covariates. This method is known as a step-down analysis.

Assumptions of ANCOVA

Before analyzing data with ANCOVA, make sure that the data can actually be analyzed this way. The data should satisfy the following nine assumptions for the results to be valid. In practice, data often fail one or two assumptions; in that case the analysis can usually still proceed once a remedy for the violated assumption has been found. The nine assumptions are as follows.

Assumption 1

The dependent variable and the covariate should be measured on a continuous scale, at the interval or ratio level. Examples include revision time measured in hours, intelligence measured by IQ score, exam performance scored from 0 to 100, and weight measured in kg. An analysis with a categorical covariate, such as gender with the two categories male and female, is not generally considered an ANCOVA.

Assumption 2

The independent variable should consist of two or more categorical, independent groups. Examples include gender with two groups (male and female), ethnicity with three groups (Indian, American and African), and profession with four groups (doctor, nurse, dentist and therapist).

Assumption 3

Observations should be independent: there should be no relationship between the observations within each group or between the groups. Each group should contain different participants, and no participant should belong to more than one group. If the data fail this assumption, a different statistical test is needed.

Assumption 4

There should be no significant outliers. Outliers are data points that do not follow the usual pattern of the data. They can have a negative impact on a one-way ANCOVA and reduce the validity of the results. Outliers can be detected when running the one-way ANCOVA on the data in SPSS Statistics. For example, suppose the mean score of 100 students on an IQ test is 108 and one student scores 156, which is unusual and may even place her in the top 1% of IQ scores globally. This student's score is an outlier.
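One simple way to screen for such points outside SPSS is a z-score rule, sketched below. The rule, the cut-off of 3 standard deviations and the scores are assumptions made for illustration, not the procedure SPSS Statistics uses; in an ANCOVA the same idea is usually applied to standardized residuals rather than to raw scores.

```python
from statistics import mean, stdev


def flag_outliers(values, threshold=3.0):
    """Return (index, value) pairs lying more than `threshold` SDs from the mean."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation
    return [(i, v) for i, v in enumerate(values) if abs(v - m) / s > threshold]


# Hypothetical IQ scores: twenty ordinary values and one extreme score.
scores = [105, 110] * 10 + [156]
print(flag_outliers(scores))
```

With these made-up scores only the value 156 is flagged. Note that a z-score rule needs a reasonably large sample, because a single extreme value also inflates the standard deviation it is compared against.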

Assumption 5

The residuals should be approximately normally distributed for each category of the independent variable. Because only approximate normality is required, the assumption can be violated to some degree and the test will still give valid results. Normality can be checked with two Shapiro-Wilk tests: one on the residuals within each group and one on the residuals of the overall model. Both tests can be run in SPSS Statistics.

Assumption 6

The variances should be homogeneous. This assumption can be tested in SPSS Statistics with Levene's test for homogeneity of variances. If the data fail to meet this assumption, the output has to be interpreted with that in mind, and alternative ways of continuing the analysis should be considered.
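To show what Levene's test computes, here is a minimal sketch of its statistic W: a one-way F ratio calculated on the absolute deviations of each value from its group mean. The function name `levene_w` and the data are hypothetical, and the sketch uses the mean-centered form of the test; statistical packages may center on the median instead, and they also report a p-value, which is omitted here.

```python
def levene_w(groups):
    """Levene's W: one-way F ratio on absolute deviations from the group means."""
    deviations = []
    for g in groups:
        center = sum(g) / len(g)
        deviations.append([abs(v - center) for v in g])

    k = len(deviations)
    n = sum(len(d) for d in deviations)
    grand = sum(sum(d) for d in deviations) / n
    means = [sum(d) / len(d) for d in deviations]

    ss_between = sum(len(d) * (m - grand) ** 2 for d, m in zip(deviations, means))
    ss_within = sum((z - m) ** 2 for d, m in zip(deviations, means) for z in d)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical values (for ANCOVA these would typically be residuals per group).
tight = [1, 2, 3]
spread = [0, 5, 10]
print(levene_w([tight, spread]))
```

A large W relative to the F distribution with k - 1 and n - k degrees of freedom suggests that the group variances differ.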

Assumption 7

At each level of the independent variable, the covariate should be linearly related to the dependent variable. This can be checked by plotting a grouped scatterplot of the covariate (for example, pre-test scores) against the dependent variable (for example, post-test scores), with the points grouped by the independent variable, and then interpreting the plot.

Assumption 8

There should be homoscedasticity. It can be checked by plotting the standardized residuals against the predicted values in a scatterplot and interpreting the result.

Assumption 9

There should be homogeneity of regression slopes, which means that there is no interaction between the covariate and the independent variable. In SPSS Statistics this is tested separately from the main ANCOVA, by fitting the covariate-by-IV interaction in the GLM procedure and interpreting the result. Any violation of this assumption should be taken into account when interpreting the entire analysis.
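The idea behind this check can be sketched without SPSS: fit a separate slope in each group, fit one common (pooled) slope, and compare the two error sums of squares with an F ratio. The function name `slope_homogeneity_f` and the data are hypothetical, and the sketch covers a single covariate only.

```python
def slope_homogeneity_f(groups):
    """Return (slopes, F) for groups given as {label: (covariate, dv)}.

    F compares a model with one common slope against a model with a
    separate slope per group (the covariate-by-group interaction).
    """
    k = len(groups)
    n = sum(len(ys) for _, ys in groups.values())
    slopes = {}
    sxx_w = sxy_w = syy_w = 0.0
    sse_separate = 0.0

    for label, (xs, ys) in groups.items():
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        syy = sum((y - mean_y) ** 2 for y in ys)
        slopes[label] = sxy / sxx
        sse_separate += syy - sxy ** 2 / sxx   # error with this group's own slope
        sxx_w += sxx
        sxy_w += sxy
        syy_w += syy

    sse_common = syy_w - sxy_w ** 2 / sxx_w    # error with one pooled slope
    f = ((sse_common - sse_separate) / (k - 1)) / (sse_separate / (n - 2 * k))
    return slopes, f


# Hypothetical data: (covariate values, DV values) per group.
data = {
    "control": ([1, 2, 3], [1, 3, 2]),
    "treatment": ([1, 2, 3], [4, 5, 6]),
}
print(slope_homogeneity_f(data))
```

A small F, judged against the F distribution with k - 1 and n - 2k degrees of freedom, is consistent with parallel regression lines; a large F points to a covariate-by-group interaction.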

How to Conduct an ANCOVA

The ANCOVA test can be conducted by employing the following procedures.

Test of Multicollinearity: If a CV is highly related to another CV (at a correlation of 0.5 or more), it will not adjust the DV over and above the other CV. One of the two should be removed because they are statistically redundant.
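This screening step can be sketched as a pairwise correlation check. The covariate names, the data and the helper names `pearson_r` and `redundant_pairs` are hypothetical; the 0.5 cut-off is the one stated above.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def redundant_pairs(covariates, threshold=0.5):
    """Return the pairs of covariate names whose |r| reaches the threshold."""
    return [
        (a, b)
        for a, b in combinations(covariates, 2)
        if abs(pearson_r(covariates[a], covariates[b])) >= threshold
    ]


# Hypothetical covariates measured on the same five participants.
covariates = {
    "pretest": [1, 2, 3, 4, 5],
    "age": [2, 4, 5, 4, 5],
    "hours": [3, 1, 4, 1, 5],
}
print(redundant_pairs(covariates))
```

In this toy data only the pair pretest and age is flagged (r is about 0.77), so one of those two covariates would be a candidate for removal.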

Test of the Homogeneity of Variance Assumption: Homogeneity of variance is tested with Levene's test of equality of error variances. It matters most after the adjustment for the covariate has been made, but if homogeneity holds before the adjustment, it is likely to hold afterwards as well.

Test of the Homogeneity of Regression Slopes Assumption: Run an ANCOVA model that includes both the IV and the CV x IV interaction term to check whether the CV interacts with the IV. If the CV x IV interaction is significant, ANCOVA should not be performed. Instead, Green and Salkind suggest assessing group differences on the DV at particular levels of the CV. Another option is a moderated regression analysis that treats the CV and its interaction as another IV. Mediation analysis can also be used to determine whether the CV accounts for the effect of the IV on the DV.

Run the ANCOVA Analysis: If the CV x IV interaction is not significant, rerun the ANCOVA without the CV x IV interaction term. This analysis uses the adjusted means and the adjusted MSerror. The adjusted means, also called least-squares means, are the group means after controlling for the influence of the CV on the DV.

Follow-up Analysis: A significant main effect means that there is a significant difference between the levels of one IV, ignoring all other factors. To find out which levels differ from one another, the same follow-up tests as for ANOVA can be used. If two or more IVs are present, there may also be a significant interaction, which means that the effect of one IV on the DV changes depending on the level of another factor. The simple main effects can then be investigated with the same methods as in a factorial ANOVA.

Analysis of covariance (ANCOVA) assumes that the residuals, that is, the differences between the observations and the modeled values, follow a normal distribution. This can be checked with graphical or formal tests. ANCOVA is implemented as a general linear model (GLM): without covariates the method reduces to one-way or two-way ANOVA, and without categorical factors it reduces to multiple regression.
