Below is the list of VizSeq built-in metrics, all of which are accelerated with multi-processing/multi-threading. To use the VizSeq scorers, check out our example and the APIs; a minimal usage sketch follows.
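As a quick orientation, the snippet below sketches calling a built-in scorer directly. The module path, the `BLEUScorer` class, and the keyword arguments are assumptions modeled on VizSeq's scorer naming pattern, not a verbatim copy of the API; confirm the exact interface in the API docs.

```python
# Minimal sketch of scoring with a built-in VizSeq scorer.
# NOTE: module path, class name, and keyword arguments are assumptions
# based on the scorer naming pattern; verify against the API docs.
from vizseq.scorers.bleu import BLEUScorer

hypothesis = ['The cat sat on the mat .']
# Shape assumed: one inner list per reference set, parallel to hypothesis.
references = [['The cat is sitting on the mat .']]

scorer = BLEUScorer(
    corpus_level=True,  # compute a single corpus-level score
    sent_level=True,    # also compute per-sentence scores
    n_workers=2,        # multi-processing/multi-threading acceleration
)
result = scorer.score(hypothesis, references)
print(result.corpus_score, result.sent_scores)
```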
### N-gram-based
- BLEU (Papineni et al., 2002): sacreBLEU implementation
- NIST (Doddington, 2002): NLTK implementation
- METEOR (Banerjee and Lavie, 2005): NLTK implementation
- TER (Snover et al., 2006): VizSeq implementation
- RIBES (Isozaki et al., 2010): NLTK implementation
- chrF (Popović, 2015): sacreBLEU implementation
- GLEU (Wu et al., 2016): NLTK implementation
- ROUGE (Lin, 2004): py-rouge implementation
- CIDEr (Vedantam et al., 2015): pycocoevalcap implementation
- WER (Word Error Rate): VizSeq implementation
### Embedding-based
- LASER (Artetxe and Schwenk, 2018): official LASER implementation
- BERTScore (Zhang et al., 2019): official BERTScore implementation
### User-defined Metrics
VizSeq exposes an API for user-defined metrics. Refer to the adding new metrics section for more details; a minimal sketch is shown below.
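The sketch below illustrates registering a custom scorer alongside the built-in ones. The `register_scorer` decorator, the `VizSeqScorer` base class, the `score` signature, and `VizSeqScore.make` are assumptions modeled on the built-in scorers; treat the adding new metrics section as the authoritative interface.

```python
# Hedged sketch of a user-defined metric. The register_scorer decorator,
# VizSeqScorer base class, score() signature, and VizSeqScore.make are
# assumptions modeled on the built-in scorers; see the "adding new
# metrics" docs for the authoritative interface.
from typing import List, Optional

from vizseq.scorers import register_scorer, VizSeqScorer, VizSeqScore


@register_scorer('len_ratio', 'Length Ratio')  # (id, display name) assumed
class LengthRatioScorer(VizSeqScorer):
    def score(
        self,
        hypothesis: List[str],
        references: List[List[str]],
        tags: Optional[List[List[str]]] = None,
    ) -> VizSeqScore:
        # Per-sentence ratio of hypothesis length to first-reference length.
        sent_scores = [
            len(h.split()) / max(len(r.split()), 1)
            for h, r in zip(hypothesis, references[0])
        ]
        corpus_score = sum(sent_scores) / max(len(sent_scores), 1)
        return VizSeqScore.make(
            corpus_score=corpus_score, sent_scores=sent_scores
        )
```

Once registered, a custom scorer would be picked up by its id (`'len_ratio'` here, a hypothetical name) the same way the built-in metric ids are.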