The article is intended for both data scientists and ordinary software developers.
In general, data scientists work with statistical methods of processing data, use the Python programming language, and run tasks on cloud CPUs together with GPUs/TPUs (Graphics Processing Units / Tensor Processing Units).
In contrast, I am a programmer: I focus on creating non-statistics-based algorithms, use the C programming language, and run tasks on cloud CPUs only.
The article is not theoretical but puts all of its propositions into practice - they are implemented in the ForEx volatility charts product.
Foreign Exchange (ForEx) is a global financial market composed of 4 exchanges located worldwide: New York, London, Sydney, and Tokyo. The geographical spread of these exchanges allows almost 24-hour trading during the week.
Trading starts Monday morning in Sydney, and ends Friday afternoon in New York.
The market has several distinctive categories - currencies, commodities, metals, and indexes.
There are in total around 125 different trading items in the ForEx market.
The state of the art in machine learning today uses statistical methods and processes gigabytes of data - the more data, the better the models are supposed to work.
The ML programs use multiple cloud CPUs together with GPUs/TPUs, and the development of models requires multiple iterations and runs which take anywhere from minutes to hours to complete.
The ML programs favor calling into a big specialized library such as TensorFlow or MXNet. The resulting Python programs often contain only a few, or even a single, important line of code.
The main beneficiaries and drivers of the current state of the art in data science and machine learning are the cloud providers, who sell GPU/TPU hours and/or bill per 'container' and are not interested in changing the existing status quo.
Instead of the data accumulation promoted by the current state of ML, I am performing data reduction.
In the context of ForEx, I am proposing and then implementing the following ways of data reduction:
A single share in the category of currencies, e.g. USD_CAD, is around $1.30. A single share in the category of commodities, e.g. OIL_USD, is around $65. A single share in the category of metals, e.g. XAU_USD (gold in USD), is around $1,500. And a single share in the category of indexes, e.g. the US Nasdaq 100, is around $9,000.
Normalization allows items whose values differ by several orders of magnitude to be expressed in a uniform way - as percentages (%).
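As an illustration, here is a minimal C sketch of such a normalization; the helper name normalize_pct and the choice of a base (reference) price are assumptions made for the example, not taken from the product.

```c
#include <stdio.h>

/* Express a price move as a percentage of a base (e.g., opening) price,
 * so that items of very different magnitude become directly comparable. */
double normalize_pct(double price, double base_price)
{
    return (price - base_price) / base_price * 100.0;
}

int main(void)
{
    /* Illustrative prices only, not live market data. */
    printf("USD_CAD: %+.2f %%\n", normalize_pct(1.3050, 1.3000));  /* ~ +0.38 % */
    printf("XAU_USD: %+.2f %%\n", normalize_pct(1512.0, 1500.0));  /* ~ +0.80 % */
    printf("NAS100 : %+.2f %%\n", normalize_pct(9045.0, 9000.0));  /* ~ +0.50 % */
    return 0;
}
```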
A data collection system can sample the pricing of the ForEx items at intervals anywhere from 1 second up. But is such precision necessary? Looking at the image above, the rise of NATGAS_USD shown with the blue line from point 1 to point 2 spans 38 minutes. After considering this and other such examples, a sample time of 1 minute provides sufficient detail while reducing the data processing burden.
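If the raw feed arrives at 1-second resolution, thinning it down to 1-minute samples can be as simple as keeping the last price of each minute. The sketch below assumes exactly that policy; the product may well aggregate differently (e.g., averaging).

```c
#include <stddef.h>

/* Keep the last price of each minute from a 1-second series.
 * Writes into out[] and returns the number of 1-minute samples produced. */
size_t downsample_to_minutes(const double *prices, size_t n_seconds,
                             double *out, size_t out_cap)
{
    size_t n_out = 0;
    for (size_t i = 59; i < n_seconds && n_out < out_cap; i += 60)
        out[n_out++] = prices[i];
    return n_out;
}
```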
Looking at the vertical axis, we see that the range of normalized volatility varies from 0 to around 3.5%. With such a span it is acceptable to choose a precision of 2 digits after the decimal point, which results in a relative precision error of 0.01/3.5 ≈ 0.003, i.e. about 0.3%.
The pricing of items always comes as a number with multiple digits after the decimal point. In a program it is normally stored in a floating-point variable.
A test program in C with two loops performing multiplication and division - one loop using floating-point variables and the other using whole-number (integer) variables - showed the integer loop to be around 2 times faster than the floating-point loop.
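The exact test program is not reproduced here; the following is a minimal sketch of that kind of micro-benchmark, timed with clock() from the standard library. The iteration count is an assumption, and the measured ratio will vary with compiler flags and CPU.

```c
#include <stdio.h>
#include <time.h>

#define N 100000000UL

int main(void)
{
    volatile double fd = 1.2345;   /* volatile keeps the loops from   */
    volatile long   id = 12345;    /* being optimized away            */
    clock_t t;

    t = clock();
    for (unsigned long i = 0; i < N; i++) { fd = fd * 3.0; fd = fd / 3.0; }
    printf("floating point: %.2f s\n", (double)(clock() - t) / CLOCKS_PER_SEC);

    t = clock();
    for (unsigned long i = 0; i < N; i++) { id = id * 3; id = id / 3; }
    printf("integer       : %.2f s\n", (double)(clock() - t) / CLOCKS_PER_SEC);
    return 0;
}
```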
After choosing a precision of 2 digits after the decimal point and multiplying the value by 100, we get a whole (integer) number which can then be processed much faster.
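For example, a normalized value of 3.47% becomes the integer 347. A minimal sketch of the conversion, with illustrative helper names:

```c
#include <math.h>

/* Store a percentage with 2-digit precision as an integer number of
 * hundredths: 3.47 % -> 347. lround() rounds to the nearest hundredth. */
long to_hundredths(double pct)
{
    return lround(pct * 100.0);
}

/* Convert back to % for display. */
double from_hundredths(long h)
{
    return (double)h / 100.0;
}
```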
Probably the most important issue when trading on financial markets is choosing which item is worth placing a buy or sell order on. Traders usually have favorite items, for example USD_GBP in currencies, GOLD_USD in metals, or OIL_USD in commodities. However, the favorite items may not offer the best opportunity for gains.
Volatile items, which move rapidly up and down, offer the highest opportunity for gains with the lowest leverage when used at the right time.
Selecting the top 3, 5, or 8 items in a category and discarding everything else significantly reduces the amount of data to be analyzed.
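One way to do this in C is to sort the items of a category by a volatility measure and keep only the first N. The struct layout and the volatility field below are assumptions made for the sketch, not the product's actual data model.

```c
#include <stdlib.h>

struct item {
    char   name[16];
    double volatility_pct;   /* e.g., high-low range over the time slice, in % */
};

static int by_volatility_desc(const void *a, const void *b)
{
    double va = ((const struct item *)a)->volatility_pct;
    double vb = ((const struct item *)b)->volatility_pct;
    return (va < vb) - (va > vb);   /* descending order */
}

/* Sort a category by volatility and report how many items to keep;
 * the caller then reads items[0 .. returned-1] and discards the rest. */
size_t select_top(struct item *items, size_t n, size_t top_n)
{
    qsort(items, n, sizeof items[0], by_volatility_desc);
    return n < top_n ? n : top_n;
}
```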
The last, but still important, way of data reduction is the time period of data to be considered. While behavior from the past week or month may define an overall up- or down-trend, the immediate upcoming price movement is influenced mostly by developments (media, news, reports) in the present moment or within the previous 6 to 48 hours.
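With one sample per minute, the most recent 6 to 48 hours is simply the last hours * 60 entries of the series. A minimal sketch:

```c
#include <stddef.h>

/* Return a pointer to the most recent `hours` of a 1-minute series,
 * along with its length via *slice_len. */
const double *recent_slice(const double *samples, size_t n,
                           unsigned hours, size_t *slice_len)
{
    size_t want = (size_t)hours * 60;
    if (want > n)
        want = n;
    *slice_len = want;
    return samples + (n - want);
}
```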
While mainstream data science and machine learning use statistical methods, deal with gigabytes of data, and use multiple cloud CPUs together with GPUs/TPUs, the proposed approach uses data reduction, deals with kilobytes of data, and uses CPUs only.
The ways of the proposed data reduction are: normalization, choosing the sample time, choosing the precision, selecting the data variable type, focusing on items relevant for trading, and choosing a relevant time slice period.
The data prepared in this way is used in non-statistics-based, lightweight machine learning.