I identified two opportunities for optimisation:
- The matrix is completely regenerated on every clustering iteration. Instead, when clustering a pair of items, it should only remove those two elements from the list and append the new cluster, so that only the distances involving the new cluster have to be calculated. This would save an awful lot of operations.
- The distance from A to B is the same as the distance from B to A, so the matrix is symmetric. Therefore, we only need to generate and examine half of the matrix. For n items that means n(n-1)/2 distance calculations instead of n², which roughly halves the work.
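
To make both ideas concrete, here is a minimal sketch in Python. It is not the existing implementation: the names `pair_key`, `build_half_matrix`, `merge_closest` and `cluster` are made up for illustration, and single linkage (the distance to a merged cluster is the smaller of the distances to its two parts) is assumed only so that the update step needs no new calls to the distance function.

```python
from itertools import combinations


def pair_key(i, j):
    """Order-independent key, so d(i, j) and d(j, i) share one entry."""
    return (i, j) if i < j else (j, i)


def build_half_matrix(items, distance):
    """Compute each unordered pair once: n*(n-1)/2 calls instead of n*n."""
    return {
        pair_key(i, j): distance(items[i], items[j])
        for i, j in combinations(range(len(items)), 2)
    }


def merge_closest(clusters, dist, new_id):
    """Merge the closest pair and update only the affected entries."""
    a, b = min(dist, key=dist.get)
    merged = clusters.pop(a) + clusters.pop(b)
    del dist[pair_key(a, b)]
    for k in clusters:
        # Assumption: single linkage. Drop the two old entries for k and
        # replace them with one entry for the new cluster.
        d_a = dist.pop(pair_key(a, k))
        d_b = dist.pop(pair_key(b, k))
        dist[pair_key(new_id, k)] = min(d_a, d_b)
    clusters[new_id] = merged


def cluster(items, distance, target_count):
    """Merge until only target_count clusters remain."""
    clusters = {i: [item] for i, item in enumerate(items)}
    dist = build_half_matrix(items, distance)  # built once, never regenerated
    next_id = len(items)
    while len(clusters) > max(target_count, 1):
        merge_closest(clusters, dist, next_id)
        next_id += 1
    return list(clusters.values())


if __name__ == "__main__":
    print(cluster([1, 2, 9, 10, 25], lambda x, y: abs(x - y), 3))
```

Other linkage methods would need a different update rule in `merge_closest`, but the structure stays the same: the half matrix is built once, and each merge touches only the entries that involve the two merged clusters. Finding the closest pair still scans every remaining entry in this sketch.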