Below is the list of VizSeq built-in metrics, all of which are accelerated with multi-processing/multi-threading. To use the VizSeq scorers, check out our example and the APIs; a minimal usage sketch follows.
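As a quick orientation, the snippet below sketches calling a built-in scorer directly. The module path, the `BLEUScorer` class, and the keyword arguments are assumptions modeled on VizSeq's scorer naming pattern, not a verbatim copy of the API; confirm the exact interface in the API docs.

```python
# Minimal sketch of scoring with a built-in VizSeq scorer.
# NOTE: module path, class name, and keyword arguments are assumptions
# based on the scorer naming pattern; verify against the API docs.
from vizseq.scorers.bleu import BLEUScorer

hypothesis = ['The cat sat on the mat .']
# Shape assumed: one inner list per reference set, parallel to hypothesis.
references = [['The cat is sitting on the mat .']]

scorer = BLEUScorer(
    corpus_level=True,  # compute a single corpus-level score
    sent_level=True,    # also compute per-sentence scores
    n_workers=2,        # multi-processing/multi-threading acceleration
)
result = scorer.score(hypothesis, references)
print(result.corpus_score, result.sent_scores)
```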
### N-gram-based
- BLEU (Papineni et al., 2002): sacreBLEU implementation
- NIST (Doddington, 2002): NLTK implementation
- METEOR (Banerjee and Lavie, 2005): NLTK implementation
- TER (Snover et al., 2006): VizSeq implementation
- RIBES (Isozaki et al., 2010): NLTK implementation
- chrF (Popović, 2015): sacreBLEU implementation
- GLEU (Wu et al., 2016): NLTK implementation
- ROUGE (Lin, 2004): py-rouge implementation
- CIDEr (Vedantam et al., 2015): pycocoevalcap implementation
- WER (Word Error Rate): VizSeq implementation
### Embedding-based
- LASER (Artetxe and Schwenk, 2018): official LASER implementation
- BERTScore (Zhang et al., 2019): official BERTScore implementation
### User-defined Metrics
VizSeq exposes an API for user-defined metrics. Refer to the adding new metrics section for more details; a minimal sketch is shown below.
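The sketch below illustrates registering a custom scorer alongside the built-in ones. The `register_scorer` decorator, the `VizSeqScorer` base class, the `score` signature, and `VizSeqScore.make` are assumptions modeled on the built-in scorers; treat the adding new metrics section as the authoritative interface.

```python
# Hedged sketch of a user-defined metric. The register_scorer decorator,
# VizSeqScorer base class, score() signature, and VizSeqScore.make are
# assumptions modeled on the built-in scorers; see the "adding new
# metrics" docs for the authoritative interface.
from typing import List, Optional

from vizseq.scorers import register_scorer, VizSeqScorer, VizSeqScore


@register_scorer('len_ratio', 'Length Ratio')  # (id, display name) assumed
class LengthRatioScorer(VizSeqScorer):
    def score(
        self,
        hypothesis: List[str],
        references: List[List[str]],
        tags: Optional[List[List[str]]] = None,
    ) -> VizSeqScore:
        # Per-sentence ratio of hypothesis length to first-reference length.
        sent_scores = [
            len(h.split()) / max(len(r.split()), 1)
            for h, r in zip(hypothesis, references[0])
        ]
        corpus_score = sum(sent_scores) / max(len(sent_scores), 1)
        return VizSeqScore.make(
            corpus_score=corpus_score, sent_scores=sent_scores
        )
```

Once registered, a custom scorer would be picked up by its id (`'len_ratio'` here, a hypothetical name) the same way the built-in metric ids are.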