We will look at the mathematical foundations of probability theory and at how deeply our team has to dive into this theory to manage its optimizing solutions.
Random number generators
A random number generator is a basic building block of many optimization and computational algorithms, be it Monte Carlo methods, genetic and other evolution-based algorithms, or the initialization of weights in deep neural networks. There are still many unknowns about how to use them, and general human intuition falls short.
An ideal random number generator produces uniformly distributed values with absolutely no relationship between them.
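The two properties of an ideal generator can be checked roughly in code. The following is only a minimal sketch, assuming Python's standard `random` module as the generator under test; the helper name `uniformity_and_lag1`, the seed and the bin count are illustrative choices, not part of any established test suite.

```python
import random


def uniformity_and_lag1(samples, bins=10):
    """Return (histogram counts, lag-1 autocorrelation) for samples in [0, 1)."""
    # Uniformity: count how many samples fall into each equally wide bin.
    counts = [0] * bins
    for x in samples:
        counts[min(int(x * bins), bins - 1)] += 1

    # Independence (very rough): correlation between each value and the next one.
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    cov = sum(
        (samples[i] - mean) * (samples[i + 1] - mean) for i in range(n - 1)
    ) / (n - 1)
    return counts, cov / var


rng = random.Random(12345)  # fixed seed, chosen arbitrarily for the example
samples = [rng.random() for _ in range(100_000)]
counts, lag1 = uniformity_and_lag1(samples)
print(counts)  # every bin should hold roughly 10,000 samples
print(lag1)    # should be close to 0
```

A flat histogram and a near-zero lag-1 correlation are necessary but far from sufficient: a generator can pass both checks and still contain structure that only more thorough statistical tests reveal.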
Truly random generators
Truly random events are all around us at the low level; we call them noise.
Atmospheric noise is radio noise caused by natural atmospheric processes; anybody can experience it by tuning a radio to a frequency on which no station transmits. Other sources are thermal, electromagnetic and quantum events: cosmic background radiation, radioactive decay, or something as simple as the noise generated by various events on a semiconductor’s PN junction.
Pseudo random generators
In computing we try to eliminate noise as much as possible. Computers are exact and precise, which is the exact opposite of what we need for randomness.
Getting something random out of them requires either collecting random events from their surroundings or dedicated hardware with a truly random generator.
Examples of events produced around computers are time (of very limited use), delays between key presses on the keyboard, precise mouse movements, delays between packets on the network interfaces, etc. The larger the collection of these events, the better the entropy for the pseudo random generator's initialization “vector” or “seed”.
In other words, the longer the computer has been running, the more entropy is available for a pseudo random generator.
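In practice the operating system usually does this event collection for us and exposes the result as a stream of random bytes. The sketch below assumes Python's standard library (`os.urandom` and `secrets`); the helper name `seed_from_os_entropy` and the 16-byte seed size are illustrative assumptions.

```python
import os
import random
import secrets


def seed_from_os_entropy(n_bytes=16):
    """Build an integer seed from bytes supplied by the operating system."""
    # os.urandom reads from the OS randomness source, which is fed by
    # events collected by the system.
    raw = os.urandom(n_bytes)
    return raw, int.from_bytes(raw, "big")


seed_bytes, seed = seed_from_os_entropy()
rng = random.Random(seed)       # pseudo random generator initialized from the seed
token = secrets.token_hex(8)    # the secrets module draws from the same OS source

print(seed)
print(rng.random())
print(token)
```

The output differs on every run, which is exactly the point of seeding from collected entropy.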
There are many ways to initialize a random number generator with its first seed/vector. But when developing a program it is essential to be able to compare program runs with each other. It is useful to set the random number generator to the same seed so that each run of the program produces the same results.
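A fixed seed makes runs comparable. The following minimal sketch assumes Python's `random.Random`; the function `simulate` and the seed values 42 and 43 are hypothetical stand-ins for a real program run.

```python
import random


def simulate(seed, n=5):
    """Stand-in for a program run: draw n Gaussian values from a seeded generator."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


run_a = simulate(42)
run_b = simulate(42)  # same seed: identical sequence
run_c = simulate(43)  # different seed: different sequence

print(run_a == run_b)
print(run_a == run_c)
```

Using a private `random.Random` instance rather than the module-level functions is one way to keep the seed from being disturbed by other code that also draws random numbers.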
The human world is driven mostly by Normal distribution processes.
The Normal distribution assigns a very low probability to extreme events. In finance, a move of two standard deviations or more is often considered an extreme event. Other literature strongly advises against using the Normal distribution and its associated standard deviation: as has been documented, financial markets might be driven by Poisson or other processes with much heavier tails.
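To put numbers on "very low probability", the sketch below computes the two-sided tail probability of the Normal distribution and compares it with a heavier-tailed alternative. The Laplace distribution is used here purely as an illustrative assumption, because its tail has a simple closed form; it is not a claim about which process actually drives markets. Both distributions are scaled to unit variance.

```python
import math


def normal_two_sided_tail(k):
    """P(|Z| > k) for a standard Normal variable Z."""
    return math.erfc(k / math.sqrt(2.0))


def laplace_two_sided_tail(k):
    """P(|X| > k) for a Laplace variable X with zero mean and unit variance.

    Unit variance means scale b = 1/sqrt(2), and P(|X| > k) = exp(-k / b).
    """
    return math.exp(-k * math.sqrt(2.0))


for k in (2, 3, 4):
    print(k, normal_two_sided_tail(k), laplace_two_sided_tail(k))
```

Under the Normal assumption a move beyond two standard deviations has a probability of roughly 4.6%, and the probability collapses quickly after that; the heavier-tailed distribution keeps assigning noticeably more probability to moves of three or four standard deviations.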
Human intuition expects a Normal distribution, and definitely nothing like the flat distribution of white noise.
Randomness with evolution based algorithms
In finance and trading, simple evidence of this is the ability of a genetic algorithm to find bugs in the backtesting process. If there is a bug that allows looking into the future, it usually takes just tens of generations to find and exploit it.
Human intuition would expect the algorithm not to be able to find such an edge case. Our experience says it finds it each and every time.
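The effect can be reproduced with a toy example. This is only a sketch under heavy assumptions: `buggy_backtest` is a hypothetical backtest with a deliberately planted look-ahead bug, the returns are synthetic Gaussian noise, and `evolve` is a tiny mutation-only evolutionary search rather than a full genetic algorithm with crossover. All names and parameter values are illustrative.

```python
import random


def buggy_backtest(returns, lag):
    """Toy backtest: trade in the direction of the return seen `lag` bars ago.

    Planted bug: lag == 0 is allowed, so the position at bar t is chosen
    using the return of bar t itself, i.e. the strategy sees the future.
    """
    pnl = 0.0
    for t in range(lag, len(returns)):
        signal = 1.0 if returns[t - lag] > 0 else -1.0
        pnl += signal * returns[t]
    return pnl


def evolve(returns, max_lag=31, pop_size=16, generations=50, seed=7):
    """Mutation-only evolutionary search over the integer `lag` parameter."""
    rng = random.Random(seed)
    population = [rng.randint(0, max_lag) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(
            population, key=lambda lag: buggy_backtest(returns, lag), reverse=True
        )
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: always keep the current best
        while len(children) < pop_size:
            parent = rng.choice(parents)
            if rng.random() < 0.5:
                child = rng.randint(0, max_lag)  # random restart
            else:
                child = min(max_lag, max(0, parent + rng.choice((-1, 1))))
            children.append(child)
        population = children
    return max(population, key=lambda lag: buggy_backtest(returns, lag))


data_rng = random.Random(1)
returns = [data_rng.gauss(0.0, 0.01) for _ in range(500)]  # pure noise, no edge
best_lag = evolve(returns)
print(best_lag, buggy_backtest(returns, best_lag))
```

Because the data is pure noise, no honest lag has any real edge, so the only strongly profitable setting is the one that exploits the bug, and the search converges on it.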
At the same time, a human-optimized approach to data generation and caching falls short. By definition, a genetic algorithm explores all types of available data, searching for a solution probabilistically across the whole multidimensional space, without any thinning of the tails at the borders of that space…
Missing entropy is the main cause of the low robustness of financial models and trading systems. Giving entropy the respect it deserves will increase the robustness and validation of our optimizing solutions.