Synthesis Experiments | Benchmark Pt.11

11 April 2021

BPE

The previous outputs had a problem: the translations contained a lot of unknown tokens (see, for example, the last line of synthesis/train_model/pass-1/fs_generate*.out, where BLEU = 1.87). It was then recommended that the whole corpus be tokenized before training. The results did improve significantly: in the third pass (synthesis/train_model/pass-3), I got BLEU = 10.92. Still not great, but a big step up.
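To make the idea behind BPE concrete, here is a minimal sketch of how byte-pair encoding learns merges and then segments a word it has never seen into known subword pieces, which is why it cuts down on unknown tokens. This is not the tooling used for the passes above; the function names (`learn_bpe`, `merge_pair`, `segment`), the `</w>` end-of-word marker, and the toy word list are all assumptions for illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    # Start from characters, plus a hypothetical end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the new merge to every word in the vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a word into subword units by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    toy_corpus = ["low", "low", "lower", "newest", "newest", "widest"]
    merges = learn_bpe(toy_corpus, num_merges=10)
    # "lowest" never appears in the toy corpus, but it still splits into known pieces.
    print(segment("lowest", merges))
```

The point of the sketch is that any word made of characters seen during training can be segmented, so rare or unseen words get broken into smaller pieces rather than collapsing into a single unknown token.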

Lisa Z.
