To train the model, I had to go one step beyond the tutorial on fairseq’s webpage and specify an optimizer. I went with SGD:
Stochastic gradient descent (SGD) can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient (calculated from the entire data set) with an estimate of it (calculated from a randomly selected subset of the data). Especially in high-dimensional optimization problems, this reduces the computational burden, achieving faster iterations in exchange for a lower convergence rate.
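To make the "estimate from a random subset" idea concrete for myself, here is a minimal sketch of mini-batch SGD on a toy line-fitting problem. This is not what fairseq does internally; the function name `minibatch_sgd`, the toy data, and the hyperparameters (`lr`, `batch_size`, `epochs`) are all my own assumptions for illustration.

```python
import random


def minibatch_sgd(xs, ys, lr=0.1, batch_size=8, epochs=300, seed=0):
    """Fit y ~ w*x + b by minimizing mean squared error with mini-batch SGD.

    Hypothetical helper for illustration only; hyperparameters are assumptions.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        # Visit the data in a new random order each epoch.
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient estimate computed from this batch only,
            # not from the entire data set.
            grad_w = 0.0
            grad_b = 0.0
            for i in batch:
                err = (w * xs[i] + b) - ys[i]
                grad_w += 2 * err * xs[i]
                grad_b += 2 * err
            w -= lr * grad_w / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b


# Toy, noise-free data generated from y = 3x + 1 (an assumed example).
xs = [i / 50 for i in range(50)]
ys = [3 * x + 1 for x in xs]
w, b = minibatch_sgd(xs, ys)
print(w, b)
```

Each update only touches `batch_size` points, which is where the cheaper iterations come from; the price is that each step follows a noisy direction rather than the true gradient.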
The fall semester started today. Hopefully I can keep up with learning how to work with the grid!
I struggled quite a bit with the fairseq commands (and job scheduling) today. I realized that I could not run fairseq-interactive with qlogin, since qlogin was not granting me a session and fairseq-interactive was too heavy a job. So, in the end, I decided to train a model non-interactively instead.
Fairseq is a Python-based sequence modeling toolkit developed by Facebook (I was also recommended Marian, which is C++-based and developed mainly by the Microsoft Translator team, but fairseq was easier to install, and Python is all the rage these days).
For the record, the CLSP grid defaults to Python 2 rather than Python 3, so I inevitably ran into a problem at the last step:
Just reminding myself to use pip3 instead of pip next time.
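A small sketch of one way to avoid the pip/pip3 mix-up: ask the running interpreter for its own path and call pip through it, so the package lands under the same Python that will import it. The helper name `pip_install_command` is hypothetical, and the `--user` flag is my assumption for a shared machine without root access.

```python
import sys


def pip_install_command(package):
    """Build a pip command tied to the interpreter running this script.

    Hypothetical helper; "--user" is an assumption for a shared machine.
    """
    return [sys.executable, "-m", "pip", "install", "--user", package]


# Show which Python is actually running and the matching install command.
print("Running under Python %d.%d" % sys.version_info[:2])
print(" ".join(pip_install_command("fairseq")))
```

Running this with `python3` prints a command that goes through that same interpreter, which sidesteps whichever `pip` happens to be first on the path.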
I ran into a problem on my Mac when composing the config file under ~/.ssh/: even though the User was specified for a particular Host, ssh still defaulted to my local system username when I logged in via the terminal. To resolve the issue, I added the following to the config file:
A solution, but a temporary one. I am not sure what will happen if I try specifying a different username for a different host. Would I still run into the same problem with the current default?