<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Georgios' Blog]]></title><description><![CDATA[Georgios' Blog]]></description><link>http://www.tsoukas.de/</link><image><url>http://www.tsoukas.de/favicon.png</url><title>Georgios&apos; Blog</title><link>http://www.tsoukas.de/</link></image><generator>Ghost 5.42</generator><lastBuildDate>Sat, 09 May 2026 12:36:56 GMT</lastBuildDate><atom:link href="http://www.tsoukas.de/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Technical Analysis does not work for predicting price movements of financial assets]]></title><description><![CDATA[<p>In countless examples, people claim to be successful using technical analysis of financial instruments. Technical analysis refers to investment decisions that rely on analyzing the historic price movement of assets, e.g. via moving averages and &quot;support&quot; and &quot;resistance&quot;, in order to predict future price movements.</p>]]></description><link>http://www.tsoukas.de/technical-analysis-does-not-work/</link><guid isPermaLink="false">65c142443a4d8400014d39f0</guid><dc:creator><![CDATA[Georgios Tsoukas]]></dc:creator><pubDate>Tue, 06 Feb 2024 17:07:06 GMT</pubDate><content:encoded><![CDATA[<p>In countless examples, people claim to be successful using technical analysis of financial instruments. Technical analysis refers to investment decisions that rely on analyzing the historic price movement of assets, e.g. 
via moving averages and &quot;support&quot; and &quot;resistance&quot;, in order to predict future price movements.</p><h2 id="the-efficient-market-hypothesis">The efficient-market hypothesis</h2><p>Despite the widespread use of technical analysis, there is also the <a href="https://en.wikipedia.org/wiki/Efficient-market_hypothesis?ref=tsoukas.de">efficient-market hypothesis</a>, which states that technical analysis is of no use for predicting future price movements. Unfortunately, it is only a hypothesis and has not been proven. Some of the most famous investors claim the efficient-market hypothesis does not hold.</p><p>I am convinced that a buy-and-hold strategy is the best way for most investors to maximize profits &#x2013; given that the investor chooses the right assets. So, how can we find out the truth, i.e. can we prove whether technical analysis works in the era of machine learning (ML) and artificial intelligence? The strength of ML is finding patterns in large amounts of data, and it becomes even more successful at doing so as the amount of data grows. Why should a human be better at detecting patterns in technical analysis than a modern ML algorithm?</p><h2 id="using-machine-learning">Using Machine Learning</h2><p>Assuming that ML can yield superhuman performance in this task, we need a very large dataset so that ML can really show its strengths. Stocks are typically only traded during working hours, but the foreign exchange market (FOREX) is open around the clock. FOREX therefore yields more data than stocks.</p><p>The following experiment is about predicting the direction of a FOREX price over the next minute. This means we only want to predict whether the price will go up or down in the next minute. If the ML algorithm came up with useful predictions in this basic scenario, it would count as strong evidence that technical analysis does work.</p><h2 id="the-data">The data</h2><p>Somewhat randomly, we use the CHF.USD price pair. 
Through a proprietary source we obtain a dataset of approximately seven million data points.</p><h2 id="the-code">The code</h2><p>You can find the complete code of the experiment in the notebook below.</p><!--kg-card-begin: html--><script src="https://gist.github.com/gtsoukas/95d626f751bb1912fdf84a7a86ab04af.js"></script><!--kg-card-end: html--><h2 id="results">Results</h2><p>In the experiment, the direction of the price movement in the next minute could not be predicted. Looking at ROC AUC alone, one might conclude that the model can distinguish between upward and downward price movements in the next minute. However, looking at the precision values in the recall-precision curve, no significant portion of the curve lies above 50% precision. As an additional evaluation, the scatter plot of scores vs. gain/loss does not indicate any correlation between the score value and the gain/loss classification. Interestingly, the scatter plot shows that the strength of the price movement can be predicted, but that is not what we are interested in.</p>]]></content:encoded></item><item><title><![CDATA[Fast filtered vector search (without a database engine)]]></title><description><![CDATA[<p>Today many AI/ML systems like recommenders, image search, or text search rely on <a href="https://en.wikipedia.org/wiki/Nearest_neighbor_search?ref=tsoukas.de">finding the most similar vectors for a given vector</a>. 
As exact brute-force search is linear in the number of vectors to search in, it is too slow for systems with millions of vectors</p>]]></description><link>http://www.tsoukas.de/fast-filtered-vector-search/</link><guid isPermaLink="false">6432755b898bde00013a8f78</guid><dc:creator><![CDATA[Georgios Tsoukas]]></dc:creator><pubDate>Mon, 10 Apr 2023 05:09:08 GMT</pubDate><content:encoded><![CDATA[<p>Today many AI/ML systems like recommenders, image search, or text search rely on <a href="https://en.wikipedia.org/wiki/Nearest_neighbor_search?ref=tsoukas.de">finding the most similar vectors for a given vector</a>. As exact brute-force search is linear in the number of vectors to search in, it is too slow for systems with millions of vectors when single-digit-millisecond retrieval times on cheap hardware are desired. Many open-source libraries exist for approximate solutions (Approximate Nearest Neighbours &#x2013; <em>ANN</em>). A good overview of existing ANN library implementations and their performance-accuracy tradeoffs is <a href="http://ann-benchmarks.com/?ref=tsoukas.de">ann-benchmarks.com</a>.</p><p>The top-performing ANN libraries often rely on <a href="https://arxiv.org/abs/1603.09320?ref=tsoukas.de">Hierarchical Navigable Small World graphs</a> (HNSW), and it is easy to achieve good retrieval time at acceptable recall as long as the search spans the <em>entire</em> vector set. However, real-world applications often require searching in only a fraction of the vectors, e.g. a fashion store might want to recommend the best matches from the summer collection to a website visitor. Once you try to search in a fraction of the vector set, ANN retrieval time <em>increases</em>. This observation is counter-intuitive and may come as a surprise. The plot below shows an example on the <a href="http://ocelma.net/MusicRecommendationDataset/lastfm-360K.html?ref=tsoukas.de">last.fm dataset</a> with approximately 300k vectors. 
Brute-force search on the full vector set takes around 100 milliseconds, whereas ANN yields comparable recall in just a few milliseconds. At the other extreme, when we search in only a small subset of the full vector set (e.g. a small music genre), brute force is fast and HNSW-based ANN becomes very slow (&gt;300ms).</p><figure class="kg-card kg-image-card"><img src="http://www.tsoukas.de/content/images/2023/04/HnswlibLastfmGenres_latency.svg" class="kg-image" alt loading="lazy" width="614" height="460"></figure><figure class="kg-card kg-image-card"><img src="http://www.tsoukas.de/content/images/2023/04/HnswlibLastfmGenres_recall.svg" class="kg-image" alt loading="lazy" width="614" height="460"></figure><p>Thus, below some cutoff where the subset becomes very small, one must switch from ANN to brute-force search. But what if your search subspace is larger, e.g. 50% of the vectors? In that case brute force becomes slow and ANN has an edge.</p><p>If the ANN library&apos;s API only offers searching in <em>all</em> vectors, one could try to search in the full set and then filter down the search results. This so-called <em>post-filtering</em> will not yield good latency when the vectors of the subset to search in are far away from the search vector, because a potentially very large number of vectors will be processed before the desired subspace is reached.</p><p>At this point, one might consider moving from an embedded ANN library to one of the search engines supporting filtered ANN search, of which many are available today. 
For large-scale production applications this can be the best option, but it has drawbacks:</p><ul><li>Using a database engine capable of vector search increases system complexity and thus development cost;</li><li>Increased operating cost for the engine;</li><li>Increased latency (by a few milliseconds) for communication with the engine as opposed to an in-memory library embedded in the application.</li></ul><p>Until recently, there were no options for using an ANN library and doing fast filtered vector search. Since its recent 0.7 release, <a href="https://github.com/nmslib/hnswlib?ref=tsoukas.de">hnswlib</a> offers this capability. This opens up the possibility of building extremely fast, low-cost systems for filtered ANN search. The above graphs were produced with hnswlib and you can find the code <a href="https://github.com/gtsoukas/filtered-ann-benchmarks?ref=tsoukas.de">here</a>.</p><p>Let us conclude this article with a Python code example of how to do filtered vector search with hnswlib. In this example we use random data for simplicity, but please note that the behaviour of HNSW indices and filtering on random data is not representative of real-world data. Also, a production-grade fallback to brute-force search is beyond the scope of this example but is highly recommended for real-world applications. While the code might look similar, for real-world applications the HNSW parameters (space, M, efConstruction, efSearch) and the cutoff for switching to brute-force search must be tuned.</p><pre><code class="language-python">import numpy as np

import hnswlib

dim = 16
num_elements = 10000

# Generating sample data
data = np.float32(np.random.random((num_elements, dim)))

# Declaring index
hnsw_index = hnswlib.Index(space=&apos;l2&apos;, dim=dim)

# Initiating index
hnsw_index.init_index(max_elements=num_elements, ef_construction=100, M=16)

# Controlling the recall by setting ef:
# higher ef leads to better accuracy, but slower search
hnsw_index.set_ef(10)

hnsw_index.set_num_threads(4)  # by default using all available cores

# Add data to index
hnsw_index.add_items(data)

# Query the even elements:
filter_function = lambda id: id%2 == 0
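# As recommended above, real-world applications should fall back to exact
# brute-force search when the filtered subset is small. A minimal sketch of
# such an exact search over the allowed ids (added for illustration;
# brute_force_knn is a hypothetical helper, not part of the hnswlib API):
import numpy as np  # already imported above; repeated so the sketch is self-contained

def brute_force_knn(vectors, allowed_ids, query, k=1):
    # Exact k-NN by squared L2 distance, restricted to the allowed ids
    allowed = np.asarray(allowed_ids)
    d2 = ((vectors[allowed] - query) ** 2).sum(axis=1)
    return allowed[np.argsort(d2)[:k]]

# e.g. brute_force_knn(data, [i for i in range(num_elements) if filter_function(i)], data[0])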
# Warning: searching with a filter is slow in Python in multithreaded mode, therefore we set num_threads=1
labels, distances = hnsw_index.knn_query(data, k=1, num_threads=1, filter=filter_function)</code></pre>]]></content:encoded></item><item><title><![CDATA[When hash-maps are an order of magnitude too slow]]></title><description><![CDATA[<p>Recently I was writing some performance-critical code. It was about an algorithm which had to run in linear time. Out of habit, I started to use hash-maps in order to make linear time possible. The execution speed was not as expected, so I profiled the code. Hash-map lookups were</p>]]></description><link>http://www.tsoukas.de/when-hash-maps-are-too-slow/</link><guid isPermaLink="false">60df4af9543e1000010df09b</guid><dc:creator><![CDATA[Georgios Tsoukas]]></dc:creator><pubDate>Fri, 02 Jul 2021 17:58:09 GMT</pubDate><content:encoded><![CDATA[<p>Recently I was writing some performance-critical code. It was about an algorithm which had to run in linear time. Out of habit, I started to use hash-maps in order to make linear time possible. The execution speed was not as expected, so I profiled the code. Hash-map lookups were consuming the largest fraction of run-time, so there was no point in tuning the other parts of the code.</p><p>Seeking inspiration in the <a href="https://docs.scala-lang.org/overviews/collections-2.13/performance-characteristics.html?ref=tsoukas.de">performance characteristics of the data structures</a>, it turned out that hash-maps offer only time <em>eC</em> &#x2013; &quot;effectively constant&quot; &#x2013; for lookups, but there are other data structures, namely <em>BitSets</em>, which can do lookups in time <em>C</em> &#x2013; &quot;(fast) constant time&quot;. 
That sparked the idea of experimenting with BitSets and arrays for the aforementioned performance-critical lookups.</p><p>Since there was no apparent performance difference between BitSets and arrays, the following Scala code focuses on arrays due to their broader applicability.</p><pre><code class="language-Scala">import scala.collection.mutable.HashSet
 
// Suppose we have sparse indices between zero and one million
val n:Int = 1e6.toInt
 
// Let&apos;s generate some random data to simulate the problem
val random = new java.security.SecureRandom
 
val b:Array[Boolean] = Array.fill(n) { random.nextBoolean }
 
// Via a hash set it can be efficiently checked if a particular
// index is present, right? So we store our indices in a HashSet
val h:HashSet[Int] = new HashSet[Int]()

val r = 0 until n
for (i &lt;- r){
  if(b(i)) h += i
}
 
// And here is our naive timing function
def time[A](a: =&gt; A): (A,Long) = {
  val now = System.nanoTime
  val result = a
  val micros = (System.nanoTime - now) / 1000
  (result, micros)
}
 
// Now we want to query a number of arbitrary indices.
// The hash set goes first
def queryHashMap():Long = {
  var numHits:Long = 0
  for(i &lt;- r){
    if(h.contains(i)) numHits += 1
  }
  numHits
}
 
time(queryHashMap())
 
 
 
// But what if we just used an array instead of the hash map
def queryArray():Long = {
  var numHits:Long = 0
  for(i &lt;- r){
    if(b(i)) numHits += 1
  }
  numHits
}
 
time(queryArray())
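// As mentioned above, BitSets performed on par with the Boolean array.
// For completeness, a sketch of a BitSet-backed variant of the counting
// loop (added for illustration, not part of the original benchmark).
import scala.collection.mutable.BitSet

def buildBitSet(flags: Array[Boolean]): BitSet = {
  val bs = new BitSet(flags.length)
  var i = 0
  while (i != flags.length) {
    if (flags(i)) bs += i
    i += 1
  }
  bs
}

def countBitSetHits(bs: BitSet, upTo: Int): Long = {
  var numHits: Long = 0
  var i = 0
  while (i != upTo) {
    if (bs(i)) numHits += 1
    i += 1
  }
  numHits
}

// e.g. time(countBitSetHits(buildBitSet(b), n))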
 
 
// After some warm-up calls to make the JVM hotspot compiler
// really come up with efficient machine code, let&apos;s run both
// 100 times
(1 to 100)
	.map(_=&gt; (time(queryHashMap())._2.toDouble / time(queryArray())._2.toDouble))
	.reduce(_ + _)/100
// -&gt; array is approx. 7 times faster</code></pre><p>It turned out that the <strong>array based lookups where 7 to 9</strong> times faster. After thinking about it, this result is completely plausible since obviously less CPU cycles are involved in getting a bit at a random memory address than having to compute the hash function first.</p><h2 id="limitations">Limitations</h2><p>This solution does not fit all cases. Keys were transformed to space efficient indices without gaps beforehand. Otherwise an array can not replace a hash-map. Also, memory was traded for faster execution speed.</p>]]></content:encoded></item></channel></rss>