Optimising algorithms in Go for machine learning - Part 2: Sparse matrix formats
This is the second in a series of blog posts sharing my experiences working with algorithms and data structures for machine learning. These experiences were gained whilst building out the nlp project for LSA (Latent Semantic Analysis) of text documents. In Part 1 of this series, I explored alternative approaches for representing and applying TF-IDF transforms for weighting term frequencies across document corpora. We tested the approaches using Go’s inbuilt benchmark functionality and found that our optimisations materially improved not just memory consumption but also performance (reducing memory consumption and processing time from 7 GB and 41 seconds to 250 KB and 0. »