Two independent samples


Non-parametric tests are used when the assumptions of parametric tests cannot be met, as non-parametric tests rely on fewer assumptions. One of the key assumptions of parametric testing is that the sample data are normally distributed.

Importantly, parametric tests are preferred where possible, as they have higher statistical power (i.e. they require smaller samples than their non-parametric counterparts) and are less likely to make type II errors. Therefore, before using a non-parametric test, it is worth checking that the sample data do not meet the parametric assumptions and cannot be transformed to meet them, for example by taking logarithms of heavily skewed data.
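As a rough illustration of that check, the sketch below applies the Shapiro-Wilk normality test to a skewed sample before and after a logarithmic transformation. The data are simulated and purely hypothetical, and Shapiro-Wilk is only one of several ways to assess normality; it is an assumption of this sketch rather than a method prescribed here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical right-skewed sample (log-normal), for illustration only
skewed = rng.lognormal(mean=1.0, sigma=0.8, size=200)

# Shapiro-Wilk tests the null hypothesis that the data are normally distributed.
# A small p-value suggests the normality assumption is not met.
_, p_raw = stats.shapiro(skewed)
_, p_log = stats.shapiro(np.log(skewed))

print(f"raw data: p = {p_raw:.4g}")
print(f"log data: p = {p_log:.4g}")
```

If the transformed data look compatible with normality, a parametric test on the transformed values may be the better choice.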

For most common parametric tests, there exists a non-parametric counterpart.

What is it?

The Mann-Whitney U test (also known as the Wilcoxon rank-sum test) is used to compare two independent groups. Its parametric counterpart is the two-sample t-test. More precisely, the Mann-Whitney U test combines the values of both groups, assigns a rank to each value, and compares the mean rank of each group. It therefore does not compare means directly but the means of the ranks, which makes it possible to determine whether one group is statistically different from the other. The hypotheses in this case are as follows:

Null: H0 = The distributions of the two groups are equal

Alternative: HA = The distributions of the two groups are not equal
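The ranking procedure described above can be sketched in a few lines of Python. The two groups below are made-up values chosen only to keep the arithmetic easy to follow, and the variable names are illustrative. The U statistic is computed from the rank sum of one group and cross-checked against SciPy's `mannwhitneyu`.

```python
import numpy as np
from scipy import stats

# Hypothetical scores for two independent groups (illustrative values only)
group_a = np.array([3, 5, 6, 8, 9])
group_b = np.array([1, 2, 4, 4, 7])

# 1. Pool both groups and rank all values together (ties share the average rank)
pooled = np.concatenate([group_a, group_b])
ranks = stats.rankdata(pooled)
ranks_a = ranks[:len(group_a)]
ranks_b = ranks[len(group_a):]

# 2. Compare the mean rank of each group
mean_rank_a = ranks_a.mean()
mean_rank_b = ranks_b.mean()
print(mean_rank_a, mean_rank_b)  # 7.0 4.0

# 3. U for a group = its rank sum minus the smallest rank sum it could have
n_a, n_b = len(group_a), len(group_b)
u_a = ranks_a.sum() - n_a * (n_a + 1) / 2
u_b = n_a * n_b - u_a  # the two U values always add up to n_a * n_b
print(u_a, u_b)  # 20.0 5.0

# Cross-check: SciPy reports one of the two U values plus a p-value
result = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(result.statistic, result.pvalue)
```

Which of the two U values is reported as the test statistic varies between software packages, so it is worth checking the documentation of the tool in use.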


Of note, it is important to understand how each group is distributed. If both groups have distributions of the same shape and differ only by a shift in location, then any difference detected by the Mann-Whitney U test reflects a difference in medians. Therefore, in this case, the alternative hypothesis can be expressed as follows:


HA = The medians of the two groups are not equal



Assumptions

  • All observations in both groups are independent of each other.
  • The responses are ordinal, interval or ratio data (which means the values can be ranked in order).
  • There is one independent variable that consists of two categorical, independent groups (for more than two groups, you will need to use the Kruskal-Wallis test).


Worked Example

In this example, we will look at patients' pain scores 24 hours after hip replacement to see whether they differ by gender. For this example, please download the Excel file below.

↬ Analyse

1. Click Analyse above

2. Upload the example file and choose file type .xlsx

3. Select variables 'Pain_score' and 'gender'

4. Define continuous (pain score) and categorical (gender) variables. 

5. Select the tab 'Mann-Whitney, 2-group one binary'

6. Download as PDF, HTML or Word document if desired.
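For readers who prefer to script the analysis, the steps above correspond roughly to the following Python sketch. It is not what the Analyse tool runs internally. The column names 'Pain_score' and 'gender' are taken from step 3, but the values and the "M"/"F" labels are made up for illustration and are not the contents of the example file.

```python
import pandas as pd
from scipy import stats

# Made-up data for illustration; these are NOT the values in the example file.
df = pd.DataFrame({
    "Pain_score": [2, 3, 3, 4, 5, 5, 6, 6, 7, 8, 4, 7],
    "gender": ["M", "M", "M", "M", "M", "M", "F", "F", "F", "F", "F", "F"],
})

# Split the continuous variable by the two-level categorical variable
male = df.loc[df["gender"] == "M", "Pain_score"]
female = df.loc[df["gender"] == "F", "Pain_score"]

# Two-sided test of H0: both groups come from the same distribution
u_stat, p_value = stats.mannwhitneyu(male, female, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.4f}")
```

With real data, the DataFrame would be loaded from the downloaded file instead, for example with `pandas.read_excel`.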


The p-value of the Wilcoxon rank-sum test (also known as the Mann-Whitney U test) indicates that the two samples are unlikely to derive from the same population.

The Wilcoxon “W” test statistic is derived from the smaller of the two rank totals. The smaller this value is, the less likely it is to have occurred by chance. Tables of critical values, which vary with sample size, are available online.



Limitations

You can only compare two groups at a time; for more than two groups, you will need to use the Kruskal-Wallis test.

The power of the test decreases when the variance in the sample data is large.

In general, non-parametric tests are less powerful than their parametric counterparts (and may therefore require a larger sample). They are also more prone to type II errors.
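One way to get a feel for this difference in power is a small simulation. The sketch below repeatedly draws two normally distributed samples with a known shift and counts how often each test rejects the null hypothesis. The function name `estimate_power` and all parameter values are hypothetical choices for illustration, and the result depends on the underlying distribution: with heavily skewed or heavy-tailed data the comparison can look quite different.

```python
import numpy as np
from scipy import stats

def estimate_power(sample_size, shift, n_sims=1000, alpha=0.05, seed=0):
    """Estimate by simulation how often each test detects a true shift."""
    rng = np.random.default_rng(seed)
    t_hits = 0
    u_hits = 0
    for _ in range(n_sims):
        # Two normal samples whose means differ by `shift`
        a = rng.normal(0.0, 1.0, sample_size)
        b = rng.normal(shift, 1.0, sample_size)
        if stats.ttest_ind(a, b).pvalue < alpha:
            t_hits += 1
        if stats.mannwhitneyu(a, b, alternative="two-sided").pvalue < alpha:
            u_hits += 1
    # Proportion of simulations in which H0 was rejected
    return t_hits / n_sims, u_hits / n_sims

t_power, u_power = estimate_power(sample_size=20, shift=0.8)
print(f"t-test power:       {t_power:.3f}")
print(f"Mann-Whitney power: {u_power:.3f}")
```

The estimated type II error rate of each test is one minus its estimated power.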



Written by Lorenzo Lenti