Nonparametric tests in R
1. What is a nonparametric test?
Nonparametric tests do not require the data to follow a particular distribution (such as the normal distribution), so they are also known as distribution-free tests. They are recommended when the data do not fulfill the normality assumption.
These tests should be used only if the assumptions of the parametric tests are not fulfilled. If the sample size is sufficiently large, we can often still use parametric tests, because the sampling distribution of the mean approaches normality as the sample size grows.
2. When to use nonparametric tests
a. Skewed data: We can use parametric tests only if their assumptions of normal distribution and homogeneity of variance are satisfied. If the data are skewed, the mean is no longer the best measure of central tendency, as it is affected by extreme values. In such cases, the data are better represented by the median.
b. When the sample size is too small.
c. When the analyzed data are either nominal or ordinal.
d. When there are definite outliers.
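To see why the median is preferred for skewed data or data with outliers, here is a small sketch. The yield values are hypothetical (they are not from the article's data file); one extreme value is included on purpose.

```r
# Hypothetical yields with one extreme value (not the article's data)
yield <- c(1.78, 1.80, 1.81, 1.83, 1.86, 4.50)

mean_yield   <- mean(yield)    # pulled upwards by the outlier 4.50
median_yield <- median(yield)  # stays near the bulk of the data

cat("mean:", mean_yield, " median:", median_yield, "\n")
```

The mean moves well above most of the observations, while the median stays close to them.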
3. Types of nonparametric tests
a. Mann-Whitney U test: It is the nonparametric alternative to the independent sample t test.
b. Wilcoxon signed-rank test: It is the nonparametric counterpart of the paired sample t test.
c. Kruskal-Wallis test: It is the nonparametric alternative to one-way ANOVA.
Note: Please see the conditions for using the independent sample t test, paired sample t test and one-way ANOVA in my previous articles.
4. Tests in R
a. Binomial test
Example: In a trial of a COVID vaccine, the vaccine was effective in 65 out of 100 cases. The claimed effectiveness was 80%.
> binom.test(65, 100, p = 0.8)
Exact binomial test
data: 65 and 100
number of successes = 65, number of trials = 100, p-value = 0.0004141
alternative hypothesis: true probability of success is not equal to 0.8 (Significant difference noted)
95 percent confidence interval:
0.5481506 0.7427062
sample estimates:
probability of success
0.65
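If you want to reuse the numbers instead of reading them off the printout, the result can be stored in an object. This is a minimal sketch using the same figures as the example; the object names res and res_less are my own, and the one-sided version is an extra illustration, not part of the example above.

```r
# 65 successes in 100 trials, tested against a claimed proportion of 0.8
res <- binom.test(x = 65, n = 100, p = 0.8)

res$p.value    # two-sided p value
res$conf.int   # 95% confidence interval for the true proportion
res$estimate   # observed proportion of successes

# One-sided version: is the true proportion less than the claimed 0.8?
res_less <- binom.test(x = 65, n = 100, p = 0.8, alternative = "less")
res_less$p.value
```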
b. Wilcoxon signed-rank test
File used: nonpara.xlsx https://drive.google.com/file/d/1dZWDJ9Sb309pgrUe2aLZUyGQsU4SFlxS/view?usp=sharing
Commands to follow:
A. Import and attach file
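One possible way to import and attach the file is sketched below. It assumes that nonpara.xlsx has been downloaded into the working directory and that the readxl package is installed; both are my assumptions, so adjust the path to your own setup. The checks only make the snippet safe to run when the file or package is missing.

```r
# Assumptions: nonpara.xlsx is in the working directory and the readxl
# package is installed (install.packages("readxl") if it is not).
path <- "nonpara.xlsx"

if (requireNamespace("readxl", quietly = TRUE) && file.exists(path)) {
  nonpara <- as.data.frame(readxl::read_excel(path))
  attach(nonpara)   # lets later commands refer to columns by name
  str(nonpara)      # check the column names and types
} else {
  message("Install readxl and place ", path, " in the working directory first.")
}
```

Attaching is convenient for short sessions; passing data = nonpara to each test works as well and avoids name clashes.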
B. Define factors
> nonpara$Trt<-as.factor(nonpara$Trt)
> nonpara$mulcing<-as.factor(nonpara$mulcing)
C. Observe summary
> summary(nonpara)
Trt mulcing yield_fefore Yield_after
1 : 4 No :18 Min. :1.780 Min. :2.330
2 : 4 yes: 9 1st Qu.:1.780 1st Qu.:2.360
3 : 4 Yes: 9 Median :1.814 Median :2.405
4 : 4 Mean :1.851 Mean :2.440
5 : 4 3rd Qu.:1.864 3rd Qu.:2.473
6 : 4 Max. :2.450 Max. :2.910
(Other):12
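Notice in the summary that mulcing has both "yes" and "Yes", which R treats as two different levels. A hedged sketch of one way to clean this before analysis is given below; the vector is hypothetical and only mimics the mixed-case entries, and mulcing_clean is a name I made up.

```r
# Hypothetical column mimicking the mixed-case entries seen in the summary
mulcing <- c("No", "yes", "Yes", "No", "yes", "Yes")

# Standardise case and stray spaces before converting to a factor
mulcing_clean <- factor(tolower(trimws(mulcing)), levels = c("no", "yes"))

table(mulcing_clean)
```

With the real data the same idea would be applied to nonpara$mulcing before calling as.factor().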
D. Check normality
1st way:
> shapiro.test(Yield_after)
Shapiro-Wilk normality test
data: Yield_after
W = 0.78766, p-value = 9.35e-06
Interpretation: The p value is less than 0.05, so we reject the null hypothesis of normality; Yield_after is not normally distributed.
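For a paired comparison, the normality assumption of the paired t test applies to the differences between the two measurements, so it can be useful to test those as well. The sketch below uses hypothetical paired values (not the article's data) whose differences are deliberately right-skewed.

```r
# Hypothetical paired measurements; the gains are deliberately right-skewed
before <- seq(1.78, 2.27, length.out = 50)
after  <- before + 0.5 + qexp(ppoints(50))

d  <- after - before        # paired differences
sw <- shapiro.test(d)

sw$statistic   # W statistic
sw$p.value     # a small p value means normality of the differences is rejected
```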
2nd way: Q-Q plot of the ANOVA residuals
> res.aov <- aov(Yield_after ~ Trt , data = nonpara)
> summary(res.aov)
Df Sum Sq Mean Sq F value Pr(>F)
Trt 8 0.4212 0.05265 16.21 1.8e-08 ***
Residuals 27 0.0877 0.00325
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> model.tables(res.aov, type="means", se = TRUE)
Tables of means
Grand mean
2.440389
> plot(res.aov,2)
See normal curve
> g <- yield_fefore
> hist(g)
> m <- mean(g)
> std <- sd(g)
> hist(g, density=20, breaks=12, prob=TRUE,
xlab="yield_fefore", ylim=c(0, 9.5),
main="normal curve over histogram")
> curve(dnorm(x, mean=m, sd=std),
col="red", lwd=2, add=TRUE, yaxt="n")
Note: The data are positively skewed.
Check homogeneity of variance
> bartlett.test(yield_fefore~Trt)
Bartlett test of homogeneity of variances
data: yield_fefore by Trt
Bartlett's K-squared = Inf, df = 8, p-value < 2.2e-16
Note: As the p value is less than 0.05, the variances are not homogeneous.
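A Bartlett statistic of Inf is unusual. One likely reason (an assumption on my part, since the raw values are not printed here) is that at least one treatment has identical values in all its replications, so its variance is exactly zero. The sketch below uses hypothetical data to show how to check the per-group variances.

```r
# Hypothetical data: group "B" has identical values, so its variance is zero
dat <- data.frame(
  yield = c(1.78, 1.80, 1.84,  1.90, 1.90, 1.90,  2.10, 2.30, 2.45),
  trt   = factor(rep(c("A", "B", "C"), each = 3))
)

# Per-group variances; a zero here makes the Bartlett statistic infinite
group_var <- tapply(dat$yield, dat$trt, var)
print(group_var)

bt <- bartlett.test(yield ~ trt, data = dat)
bt$statistic   # infinite because log(0) enters the formula
```

With the real data, the same check would be tapply(yield_fefore, Trt, var).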
We can also use Levene's test to check homogeneity of variance (leveneTest() is in the car package).
> leveneTest(yield_fefore~Trt, data = nonpara)
Levene's Test for Homogeneity of Variance (center = median)
Df F value Pr(>F)
group 8 1.1062 0.3898
27
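Note: Levene's test is more robust to departures from normality than Bartlett's test, which is sensitive to non-normal data. Here Levene's test (p = 0.3898) does not reject homogeneity of variance, whereas Bartlett's test does.
As the normality checks indicate that the data are not normally distributed, we can move on to nonparametric tests.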
E. Wilcoxon signed-rank test (alternative to paired sample t test)
> wilcox.test(yield_fefore,Yield_after, paired = TRUE)
Wilcoxon signed rank test with continuity correction
data: yield_fefore and Yield_after
V = 0, p-value = 1.712e-07
alternative hypothesis: true location shift is not equal to 0
Note: As the p value is less than 0.05, a significant difference was observed between yield before and yield after.
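A value of V = 0 means that no pair had a positive (before minus after) difference, that is, yield increased in every pair. The sketch below reproduces this situation with a few hypothetical pairs (not the article's data).

```r
# Hypothetical paired yields (not the article's data); every plot improved
before <- c(1.78, 1.80, 1.82, 1.85, 1.90, 2.00)
after  <- c(2.33, 2.37, 2.41, 2.45, 2.52, 2.91)

res <- wilcox.test(before, after, paired = TRUE)

res$statistic          # V = sum of ranks of the positive (before - after) differences
res$p.value            # two-sided p value
median(after - before) # size of the typical change
```

Reporting the median difference along with the p value tells the reader the direction and size of the change.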
F. Mann-Whitney-Wilcoxon test
File used: t.xlsx https://drive.google.com/file/d/1DEFqFbemqia3FlEPyIPxeGQSIApbs8c_/view?usp=sharing
> wilcox.test(yield_before~INM)
Wilcoxon rank sum test with continuity correction
data: yield_before by INM
W = 0, p-value = 0.001723
Note: As p < 0.01, a significant difference in yield was noted between practising integrated nutrient management (INM) and not practising it.
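The same test can be tried on a small made-up data set. In this sketch the data frame dat and its values are hypothetical; only the formula form of wilcox.test() is the same as in the command used here.

```r
# Hypothetical yields for two independent groups (INM: "no" vs "yes")
dat <- data.frame(
  yield = c(1.78, 1.80, 1.81, 1.83, 1.86,   2.33, 2.36, 2.40, 2.47, 2.91),
  INM   = factor(rep(c("no", "yes"), each = 5))
)

res <- wilcox.test(yield ~ INM, data = dat)

res$statistic   # W counts the pairs in which a "no" value exceeds a "yes" value
res$p.value

# Group medians describe the direction of the difference
tapply(dat$yield, dat$INM, median)
```

W = 0 appears when every value of the first group is below every value of the second group.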
G. Kruskal-Wallis test
File used: nonpara.xlsx
> kruskal.test(yield_fefore~Trt, data=nonpara)
Kruskal-Wallis rank sum test
data: yield_fefore by Trt
Kruskal-Wallis chi-squared = 28.844, df = 8, p-value = 0.0003377
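Note: As the p value is less than 0.05, a significant difference in yield was observed among treatments. (The Kruskal-Wallis test compares rank distributions, not means.)
Now, for pairwise comparison of the treatments (post hoc test), use the following commands.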
> library(PMCMR)
> library(PMCMRplus)
> posthoc.kruskal.dunn.test(x=yield_fefore, g=Trt, p.adjust.method = "bonferroni")
Pairwise comparisons using Dunn's-test for multiple
comparisons of independent samples
data: yield_fefore and Trt
1 2 3 4 5 6 7 8
2 1.0000 - - - - - - -
3 1.0000 0.6591 - - - - - -
4 1.0000 1.0000 1.0000 - - - - -
5 1.0000 1.0000 1.0000 1.0000 - - - -
6 1.0000 1.0000 0.6591 1.0000 1.0000 - - -
7 0.3524 1.0000 0.1040 1.0000 1.0000 1.0000 - -
8 0.3524 1.0000 0.1040 1.0000 1.0000 1.0000 1.0000 -
9 1.0000 0.0878 1.0000 1.0000 1.0000 0.0878 0.0094 0.0094
Note: Significant differences were seen between treatments 7 and 9 and between treatments 8 and 9 (adjusted p = 0.0094).
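If the PMCMR package is not available on your system, a rough base-R substitute is sketched below. Please note that it swaps in pairwise Wilcoxon rank sum tests with Bonferroni adjustment, which is not Dunn's test, so the p values will differ from those shown here. The data are hypothetical. As far as I know, recent versions of PMCMRplus offer Dunn's test as kwAllPairsDunnTest(), but check the package manual before relying on that name.

```r
# Hypothetical data: three treatments, five plots each (not the article's data)
dat <- data.frame(
  yield = c(1.78, 1.80, 1.81, 1.83, 1.86,
            2.01, 2.05, 2.08, 2.12, 2.15,
            2.33, 2.36, 2.40, 2.47, 2.91),
  trt   = factor(rep(c("T1", "T2", "T3"), each = 5))
)

# Omnibus Kruskal-Wallis test
kw <- kruskal.test(yield ~ trt, data = dat)
kw$p.value

# Swapped-in post hoc method: pairwise Wilcoxon rank sum tests (not Dunn's test)
pw <- pairwise.wilcox.test(dat$yield, dat$trt, p.adjust.method = "bonferroni")
pw$p.value   # lower-triangular matrix of adjusted p values
```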
H. Van der Waerden test
> vanWaerden.test(x=yield_fefore,g=Trt)
Van der Waerden normal scores test
data: yield_fefore and Trt
Van der Waerden chi-squared = 28.264, df = 8, p-value = 0.0004266
alternative hypothesis: true location shift is not equal to 0
Note: As the p value is less than 0.05, a significant difference among treatments was observed in this case as well.
> posthoc.vanWaerden.test(x=yield_fefore,g=Trt,p.adjust.method="none")
Pairwise comparisons using van der Waerden normal scores test for
multiple comparisons of independent samples
data: yield_fefore and Trt
1 2 3 4 5 6 7 8
2 0.00138 - - - - - - -
3 0.23618 5.6e-05 - - - - - -
4 0.09282 0.07945 0.00643 - - - - -
5 0.40986 0.01107 0.05033 0.37334 - - - -
6 0.00138 1.00000 5.6e-05 0.07945 0.01107 - - -
7 6.0e-05 0.24680 2.3e-06 0.00566 0.00056 0.24680 - -
8 6.0e-05 0.24680 2.3e-06 0.00566 0.00056 0.24680 1.00000 -
9 0.01343 1.2e-06 0.16302 0.00016 0.00171 1.2e-06 5.9e-08 5.9e-08
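Note: Treatment pairs with p values below 0.05 differ significantly. Keep in mind that no p value adjustment was applied here (p.adjust.method="none"), so more pairs appear significant than with a Bonferroni adjustment.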
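To show what the Van der Waerden test does internally, here is a minimal base-R sketch of the statistic: ranks are converted to normal scores, and the group means of those scores are compared with a chi-squared distribution. The function name van_waerden and the data are my own, for illustration only; it is not the PMCMR implementation and may differ from it in details such as the handling of ties.

```r
# Minimal base-R sketch of the Van der Waerden normal scores statistic
van_waerden <- function(x, g) {
  g      <- factor(g)
  n      <- length(x)
  scores <- qnorm(rank(x) / (n + 1))     # normal scores of the ranks
  s2     <- sum(scores^2) / (n - 1)      # variance of the scores
  n_j    <- tapply(scores, g, length)    # group sizes
  mean_j <- tapply(scores, g, mean)      # group means of the scores
  stat   <- sum(n_j * mean_j^2) / s2
  df     <- nlevels(g) - 1
  list(statistic = stat, df = df,
       p.value = pchisq(stat, df, lower.tail = FALSE))
}

# Hypothetical data: three treatments, five plots each
yield <- c(1.78, 1.80, 1.81, 1.83, 1.86,
           2.01, 2.05, 2.08, 2.12, 2.15,
           2.33, 2.36, 2.40, 2.47, 2.91)
trt   <- rep(c("T1", "T2", "T3"), each = 5)

res <- van_waerden(yield, trt)
res$statistic
res$p.value
```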