Science
05 - Statistical Analysis
There are a few elements to statistical analysis you need to know to fully understand scientific study language.
Some things you should be aware of are:
-Confidence intervals
-Standard deviation
-P values
-Interquartile Range
-Types of scales
Confidence Intervals
This is very common within studies. You might see something like this:
0.71, 95% confidence interval 0.67 to 0.74
This means that the estimated value is 0.71, and the researcher or statistician is 95% confident that the true value lies within the stated range. The second number (0.67) and third number (0.74) are the lower and upper limits of the confidence interval.
So we can say with 95% confidence that the true value is between 0.67 and 0.74, with 0.71 as the best estimate.
Statistics isn't about finding absolute values so much as finding probabilities or approximations. This is expressed through the range shown above.
Standard deviation
This describes the typical deviation, or offset, of the values from the mean value.
The mean value is the overall average, which is calculated by adding up all values and dividing by the number of values.
For example, the mean of 1,4,7,3,8,4,7 would be calculated by (1 + 4 + 7 + 3 + 8 + 4 + 7) / 7 = 34 / 7 = 4.86.
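As a quick check, the same arithmetic can be sketched in Python. This is only an illustration; the variable names are made up for the example.

```python
from statistics import mean

# The example data set from the text.
values = [1, 4, 7, 3, 8, 4, 7]

# Add up all values and divide by the number of values.
manual_mean = sum(values) / len(values)

# The standard library gives the same answer.
library_mean = mean(values)

print(round(manual_mean, 2), round(library_mean, 2))
```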
To calculate the standard deviation, which is roughly the typical offset from the mean, you square each value's offset from the mean, add the squares together, divide by the number of values, then calculate the square root.
My 4.86 mean is rounded to 2 decimal places but let's take it as accurate for the sake of this example.
The squared offsets from the mean would be:
(-3.86)² + (-0.86)² + 2.14² + (-1.86)² + 3.14² + (-0.86)² + 2.14²
14.8996 + 0.7396 + 4.5796 + 3.4596 + 9.8596 + 0.7396 + 4.5796 = 38.8572
If we were using population data, we would divide this by the number of values.
As we are using sample data, we divide by the number of values - 1.
38.8572 / (7 - 1) = 6.4762. The square root of this is 2.54 rounded to 2 decimal places.
So from this random data set of 1,4,7,3,8,4,7:
The mean is 4.86
The standard deviation is 2.54
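The steps above can be sketched in Python for the same data set. This is a minimal sketch, not the only way to do it. It uses the unrounded mean, so the last digit can differ slightly from a hand calculation that starts from 4.86. It computes both the population version (divide by the number of values) and the sample version (divide by the number of values minus 1), then compares them with the standard library's pstdev and stdev.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

values = [1, 4, 7, 3, 8, 4, 7]
m = mean(values)

# Square each value's offset from the mean and add the squares together.
# Squares are never negative, so this total cannot be negative either.
sum_of_squares = sum((v - m) ** 2 for v in values)

# Population data: divide by the number of values.
population_sd = sqrt(sum_of_squares / len(values))

# Sample data: divide by the number of values minus 1.
sample_sd = sqrt(sum_of_squares / (len(values) - 1))

# The standard library has both versions built in.
print(round(population_sd, 2), round(pstdev(values), 2))
print(round(sample_sd, 2), round(stdev(values), 2))
```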
Putting the mean (4.86), the standard deviation (2.54) and the sample size (7) into a confidence interval calculator, we get the following:
95% Confidence Interval: 4.86 ± 1.88
(2.98 to 6.74)
"With 95% confidence the population mean is between 2.98 and 6.74, based on only 7 samples."
Short Styles:
4.86 (95% CI 2.98 to 6.74)
4.86, 95% CI [2.98, 6.74]
Margin of Error: 1.88
This means we can be 95% confident that the true mean of the whole population lies between 2.98 and 6.74. It does not mean that a value picked at random would fall in that range.
The cool thing is that when your sample size increases, the confidence interval range shrinks. Basically, the larger your sample size, the more precise your estimate.
For example, if in the calculator I change the sample size from 7 to 700 and keep the same mean and standard deviation, we get this:
95% Confidence Interval: 4.86 ± 0.188
(4.67 to 5.05)
"With 95% confidence the population mean is between 4.67 and 5.05, based on 700 samples."
Short Styles:
4.86 (95% CI 4.67 to 5.05)
4.86, 95% CI [4.67, 5.05]
Margin of Error: 0.188
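For anyone curious what such a calculator might be doing, here is a rough sketch. It assumes the common normal approximation, where the margin of error is 1.96 × standard deviation / √(sample size). The 1.96 multiplier and the function name margin_of_error are my assumptions, not something taken from a particular calculator. It uses the unrounded mean and standard deviation of the example data, so it may not match calculator figures digit for digit.

```python
from math import sqrt
from statistics import mean, stdev

Z_95 = 1.96  # assumed multiplier for a 95% interval (normal approximation)

def margin_of_error(sd, n, z=Z_95):
    """Half-width of the confidence interval for a mean."""
    return z * sd / sqrt(n)

values = [1, 4, 7, 3, 8, 4, 7]
m = mean(values)
sd = stdev(values)

# Same mean and standard deviation, two different (hypothetical) sample sizes.
for n in (7, 700):
    moe = margin_of_error(sd, n)
    print(f"n={n}: {m:.2f} ± {moe:.3f} ({m - moe:.2f} to {m + moe:.2f})")
```

Because the sample size sits under a square root, 100 times more samples makes the interval 10 times narrower, not 100 times.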
I'm not well versed enough in statistics to go in depth beyond this, but if you're interested there are plenty of websites out there.
If you want to learn more about standard deviation, StudentsForBestEvidence has a good page to start with.
Why this is important
Most studies use confidence intervals. In fact, the example I first used is from a study we will look at as an example on the next page.
0.71, 95% confidence interval 0.67 to 0.74
We now know that this means that with 95% confidence we can say the true value being discussed is between 0.67 and 0.74.
P values
You will very likely also see in studies something like this:
P<0.001 for trend
The P value is a nuanced and complicated element of statistics. Before we continue, here are some pages:
Simply Psychology
From the Towards Data Science article:
"Hypothesis testing is used to test the validity of a claim (null hypothesis) that is made about a population using sample data. The alternative hypothesis is the one you would believe if the null hypothesis is concluded to be untrue.
In other words, we’ll make a claim (null hypothesis) and use a sample data to check if the claim is valid. If the claim isn’t valid, then we’ll choose our alternative hypothesis instead. Simple as that.
To know if a claim is valid or not, we’ll use a p-value to weigh the strength of the evidence to see if it’s statistically significant. If the evidence supports the alternative hypothesis, then we’ll reject the null hypothesis and accept the alternative hypothesis."
From the Simply Psychology article:
"The level of statistical significance is often expressed as a p-value between 0 and 1. The smaller the p-value, the stronger the evidence that you should reject the null hypothesis."
"A p-value less than 0.05 (typically ≤ 0.05) is statistically significant. It indicates strong evidence against the null hypothesis, as there is less than a 5% probability the null is correct (and the results are random). Therefore, we reject the null hypothesis, and accept the alternative hypothesis.
However, this does not mean that there is a 95% probability that the research hypothesis is true. The p-value is conditional upon the null hypothesis being true is unrelated to the truth or falsity of the research hypothesis."
"A p-value higher than 0.05 (> 0.05) is not statistically significant and indicates strong evidence for the null hypothesis. This means we retain the null hypothesis and reject the alternative hypothesis. You should note that you cannot accept the null hypothesis, we can only reject the null or fail to reject it.
A statistically significant result cannot prove that a research hypothesis is correct (as this implies 100% certainty).
Instead, we may state our results “provide support for” or “give evidence for” our research hypothesis (as there is still a slight probability that the results occurred by chance and the null hypothesis was correct – e.g. less than 5%)."
The best site for explaining this I found is StatsDirect:
"The P value, or calculated probability, is the probability of finding the observed, or more extreme, results when the null hypothesis (H0) of a study question is true"
"Most authors refer to statistically significant as P < 0.05 and statistically highly significant as P < 0.001 (less than one in a thousand chance of being wrong)."
Summarising all of these website definitions of what the P value is:
The P value is in essence the probability of seeing results at least as extreme as yours if no legitimate trend exists (that is, if the null hypothesis is true). P values are decimals but can be thought of as percentages, where 0.001 is 0.1%.
It is not strictly the probability that your conclusion is wrong, although it is often loosely described that way. I'm probably not doing the nuance justice but that's how I see it.
Or super simply - lower P values mean stronger evidence that the trend isn't just down to chance.
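To make the StatsDirect definition a bit more concrete, here is a minimal sketch using a made-up coin example. The 100 flips and 60 heads are invented for illustration and are not from any study. The null hypothesis is that the coin is fair, and the P value is the probability of getting 60 or more heads if that is true. The function name is hypothetical.

```python
from math import comb

def p_value_at_least(heads, flips):
    """One-sided P value: chance of `heads` or more heads from a fair coin."""
    total_outcomes = 2 ** flips
    # Count every outcome that is as extreme as, or more extreme than, the one observed.
    extreme_outcomes = sum(comb(flips, k) for k in range(heads, flips + 1))
    return extreme_outcomes / total_outcomes

# Invented observation: 60 heads out of 100 flips.
p = p_value_at_least(60, 100)
print(f"P = {p:.4f}")
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```

Note that this only counts results in one direction (too many heads). Studies often use two-sided tests, which also count results that are just as extreme in the opposite direction.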
Interquartile Range
You might see this pop up in studies, too. StatisticsHowTo has a good page explaining this.
The interquartile range is usually shown on a box plot (also called a box and whisker plot), which is a way of displaying data visually.
The median is the middle value of your data set when it is sorted
The lower quartile is the middle value of the lower half of the data
The upper quartile is the middle value of the upper half of the data
The interquartile range is the distance between the lower and upper quartiles, shown as the box
The minimum and maximum values make the whisker lines
This is useful because it shows you the variation in the data. The larger the box is, the more spread out the middle half of the data is.
If your data is mostly in the same place with some outliers at the top and bottom, you would have a smaller box with longer whiskers. Or another way to put it, the more uniform and concentrated your data, the smaller your interquartile range will be.
I don't think this means your data is any more or less accurate; it just shows uniformity in the data vs the high and low extremes. I'm not very well versed in statistics so I can't comment on the utility or application of various tools and methods, but I can at least explain what they basically mean.
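As a small sketch, here is one way to find the numbers behind a box plot in Python, using the "median of each half" method. Other methods exist and can give slightly different quartiles, so treat this as one reasonable choice. The data set is reused from the standard deviation section and the function name is made up for the example.

```python
from statistics import median

def quartiles(values):
    """Return (lower quartile, median, upper quartile).

    Uses the 'median of each half' method and needs at least 2 values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    # With an odd number of values, the middle value is left out of both halves.
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return median(lower_half), median(ordered), median(upper_half)

values = [1, 4, 7, 3, 8, 4, 7]
q1, q2, q3 = quartiles(values)
iqr = q3 - q1  # the length of the box

print(f"min={min(values)} Q1={q1} median={q2} Q3={q3} max={max(values)} IQR={iqr}")
```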
Types of scales
While no doubt many scales exist to show data, I will only show two.
The first is a linear scale. This is the simplest and easiest to recognise scale. Every step in the scale represents an even jump. For example, a ruler is a linear scale. The numbers usually run from 1 to 30 in even jumps of 1: 1, 2, 3, 4, 5, 6, 7, and so on.
The second scale type I want to mention is the logarithmic scale. This scale increments in orders of magnitude. Instead of increasing in even steps of say 1 like a ruler, it may increase by a factor of 10 at each step. A log 10 scale would go 10, 100, 1000, 10000, 100000, and so on. These numbers might be expressed as powers: 10¹, 10², 10³ etc.
You might see logarithmic scales being used, for example, in graphs of Coronavirus trends.
A covid-19 graph of this kind uses a log 10 scale: 1, 10, 100, 1000, 10000, and so on.
When looking at logarithmic graphs it's important to understand that each step higher up the graph represents a much greater jump in value than the same step would on a linear graph.
The benefit of a logarithmic graph is that it allows you to show trends for huge and small values on the same graph. A further benefit is that trends at the top of the graph are comparable to trends at the bottom despite the values being massively different.
As the axis is in orders of magnitude, equal distances represent equal multiples, so a small spike at the top can be compared to a spike of the same size at the bottom: both represent the same proportional change.
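A short Python sketch shows why this works. The values are just the powers of 10 mentioned earlier. On a log 10 axis each value is placed at the height of its logarithm, so every tenfold jump covers the same distance.

```python
from math import log10

values = [1, 10, 100, 1000, 10000, 100000]

# On a log 10 axis, each value is plotted at the height of its logarithm.
positions = [log10(v) for v in values]
print(positions)

# A jump from 10 to 100 and a jump from 1000 to 10000 are both tenfold,
# so they cover the same distance on the axis.
small_jump = log10(100) - log10(10)
large_jump = log10(10000) - log10(1000)
print(small_jump, large_jump)
```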
Logarithmic scales that are widely used are:
The Richter scale is a log 10 scale for measuring earthquakes. Each whole number step represents a tenfold increase in the measured shaking, and roughly a 32-fold increase in the energy released, which is often expressed as an equivalent weight of TNT.
The decibel scale is a logarithmic scale for measuring sound power or loudness, where 3 dB represents roughly a doubling or halving of the sound power.
Octaves in music, and in sound in general, where each octave describes a doubling or a halving of the sound frequency.
The pH scale for measuring acid and alkaline strength is a log 10 scale. 7 is a neutral state, where higher numbers up to 14 represent a stronger alkali, and lower numbers down to 0 represent a stronger acid.
The f-stop scale in camera lenses is a type of logarithmic scale where each full stop doubles or halves the amount of light entering the camera.
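A few of these can be checked with their standard textbook formulas, sketched below. The formulas are not from the sources quoted on this page, and the function names and the 440 Hz example are my own choices for illustration.

```python
from math import log10, log2

def decibels(power_ratio):
    """Decibel change for a ratio of two sound powers."""
    return 10 * log10(power_ratio)

def octaves(frequency_ratio):
    """Number of octaves between two frequencies, given their ratio."""
    return log2(frequency_ratio)

def ph(hydrogen_ion_concentration):
    """pH from hydrogen ion concentration in moles per litre."""
    return -log10(hydrogen_ion_concentration)

print(round(decibels(2), 2))  # doubling the sound power
print(octaves(880 / 440))     # going from 440 Hz up to 880 Hz
print(ph(1e-7))               # hydrogen ion concentration of neutral water
```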
Summary
Confidence intervals are the range within which the true value is expected to lie, at a specific confidence level (usually 95%)
Standard deviation describes how far values typically sit from the mean
The P value is the probability of seeing results at least as extreme as yours if no real effect exists, with lower P values being stronger evidence
The interquartile range is the spread of the middle half of the data, usually displayed visually as a box plot
Linear scales go up in set increments
Logarithmic scales go up in orders of magnitude and allow for large and small values to trend on the same graph
We will put this into practice on the next page, where we will actually read a scientific study and I will break down the elements that make up a study, drawing on this explanation of statistics and the previous explanation of science.
Next page: Science - Reading Studies