Biased Sampling
A fact is not data. It may not be representative. And so what I mean by this is the facts could be cherry-picked. They could be the exception that does not prove the rule.
Let’s say you want to test the hypothesis that trading cryptocurrency is the road to riches. And so you might think, well, let’s gather some facts to support that hypothesis. I know a few of my friends who traded cryptocurrency and they made money. And I could check this. I could ask to see their bank statements. I could ask at what price they bought Bitcoin and at what price they sold it. And I could also see on social media that there are lots of other people who claim to be successful through trading cryptocurrency.
But even if every single person was being factually accurate, this could still be misleading because this is a selected sample. Even if the examples that you’re given are one hundred percent factually accurate, they could be just isolated exceptions. Maybe there are thousands of other people who’ve also traded Bitcoin, including some of your friends, but they lost money. They don’t tell their stories, because they would have to admit that they made a bad decision, that they got overconfident. So the stories that we hear are only going to be the success stories of people who traded cryptocurrency and made money. Just like if I wanted to say that smoking is not bad for your health, I’m sure I could come up with one or maybe ten smokers who lived to a hundred, but this is not evidence that smoking is harmless. I’ve just handpicked those particular people.
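To make the selected-sample problem concrete, here is a minimal simulation sketch. Everything in it is hypothetical: the function names, the number of traders, and the assumption that each trader’s profit is a random draw centered on zero (that is, trading gives no real edge). It is an illustration, not a model of any actual market.

```python
import random

def simulate_traders(n_traders=10_000, seed=0):
    """Simulate profit/loss for hypothetical traders who have no real edge."""
    rng = random.Random(seed)
    # Assumption: each outcome is a draw from a distribution centered on zero.
    return [rng.gauss(0, 1) for _ in range(n_traders)]

def share_of_winners(outcomes):
    """Fraction of outcomes that are strictly positive."""
    return sum(1 for x in outcomes if x > 0) / len(outcomes)

outcomes = simulate_traders()

# Selected sample: only the traders who made money tell their story.
stories_we_hear = [x for x in outcomes if x > 0]

print(f"Winners in the full population:    {share_of_winners(outcomes):.0%}")
print(f"Winners among the stories we hear: {share_of_winners(stories_we_hear):.0%}")
```

In the full population, roughly half of the simulated traders make money, yet every story that reaches us is a success story. Each of those stories is factually accurate, and the sample is still misleading.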
Random Sampling
The best way to test a hypothesis is similar to a randomized controlled trial in medicine. So there you take a group of people, and you give some of them, chosen at random, a drug. You see how many get better and how many get worse. And then you give the other people a placebo and see how many of them get better and how many of them get worse. And then you compare the two rates of recovery. But importantly, the sample contains people who took the drug and still got worse. So that is like people who traded cryptocurrency and lost their shirt. Also, the sample contains people who were given the placebo and still got better. Those are people who just invested in cash and equities and got rich.
So whenever you read a book or see a study, to understand whether you have facts or data, ask whether you are being shown the full picture. If you don’t have those counterexamples, then you are being shown a selected sample, and you should be skeptical.
Statistical Significance
Let’s say you have a coin and you want to test the hypothesis that the coin is biased towards heads. So if you toss the coin twice and it lands heads both times, you might think, well, isn’t this evidence that the coin is biased towards heads? It’s only ever landed heads. But this is not strong evidence. Why? Because even if the coin is unbiased and fair, the chance of it landing heads twice in a row is twenty-five percent. So it is quite likely for you to see two heads even with a fair coin.
So what statistical significance is about is how unlikely a sequence needs to be under a fair coin before I can call the coin unfair. Two heads in a row is not unlikely, but the probability of five heads in a row under a fair coin is only about three percent. And, typically, we apply a five percent threshold: if the probability of the data that we see is less than five percent under a fair coin, i.e., no relationship, then we reject the idea that the coin is fair and accept our hypothesis that it is biased.
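The two probabilities quoted here can be checked with a short calculation. This is a minimal sketch; the function name and the `THRESHOLD` constant are illustrative choices, and the only assumption is that tosses of a fair coin are independent, each landing heads with probability one half.

```python
def prob_all_heads(n_tosses, p_heads=0.5):
    """Probability that n independent tosses all land heads."""
    return p_heads ** n_tosses

THRESHOLD = 0.05  # the conventional five percent significance threshold

for n in (2, 5):
    p = prob_all_heads(n)
    verdict = "below" if p < THRESHOLD else "not below"
    print(f"{n} heads in a row: p = {p:.4f} ({verdict} the 5% threshold)")
```

Two heads in a row has probability 0.25, well above the threshold, while five heads in a row has probability 0.03125, which falls below it.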
So let’s translate this into the cryptocurrency setting. The hypothesis we want to test is: does trading crypto lead to riches? Now if five of my friends traded crypto, three of them made money, and two of them lost money, well, that’s something which could easily be the case under randomness. Even if crypto did not lead to success, it could be that just because of luck, more friends made money than lost money. But if instead, thirty friends made money with crypto and two lost, that is statistically significant. It’s similar to why five heads in a row is evidence that the coin is unfair, whereas two heads in a row is not.
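The same coin-toss logic can be applied to the friends example. The sketch below rests on a simplifying assumption that is not stated in the lesson: if crypto trading gives no edge, each friend independently makes or loses money like a fair coin toss. Under that assumption, it computes how likely it is to see at least the observed number of winners by luck alone. The function name is illustrative.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of k or more successes in n independent trials (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

THRESHOLD = 0.05

# (winners, total friends) for the two scenarios in the text
for winners, total in ((3, 5), (30, 32)):
    p_value = prob_at_least(winners, total)
    verdict = "significant" if p_value < THRESHOLD else "not significant"
    print(f"{winners} of {total} made money: p = {p_value:.2e} ({verdict})")
```

Three winners out of five gives a probability of 0.5, so luck alone explains it comfortably. Thirty winners out of thirty-two gives a probability of roughly 1.2 in ten million, far below the five percent threshold.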
But we often forget about statistical significance. We make simple statements such as more people made money than lost money, and we think that is enough. It is not enough. We need the magnitude of outperformance to be so large that we rule out the effect of luck.