SentencePieceEncoder
-
final class fairseq2.data.text.SentencePieceEncoder(model, prefix_tokens=None, suffix_tokens=None, reverse=False, enable_sampling=False, nbest_size=-1, alpha=0.1, device=None, pin_memory=False)
Bases: TextTokenEncoder
-
__call__(text)
- Parameters:
text (str | CString) – The text to encode.
- Return type:
Tensor
-
encode_as_tokens(text)
- Parameters:
text (str | CString) – The text to encode.
- Return type:
List[str | CString]
-
property prefix_indices: Tensor | None
Get the indices of the prefix tokens. Shape: \((S)\), where
\(S\) is the number of indices.
-
property suffix_indices: Tensor | None
Get the indices of the suffix tokens. Shape: \((S)\), where
\(S\) is the number of indices.
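
The relationship between the prefix and suffix indices and an encoded sequence can be illustrated with a small pure-Python sketch. It does not call fairseq2. It assumes that the encoder prepends the prefix indices and appends the suffix indices to the encoded text, which the reference above does not state explicitly. The vocabulary and the `wrap_indices` helper are hypothetical and exist only for illustration.

```python
from typing import List, Optional, Sequence


def wrap_indices(
    body: Sequence[int],
    prefix_indices: Optional[Sequence[int]] = None,
    suffix_indices: Optional[Sequence[int]] = None,
) -> List[int]:
    """Hypothetical helper: surround encoded indices with prefix and suffix indices.

    A value of None is treated as "no tokens on that side", mirroring the
    `Tensor | None` type of the two properties.
    """
    return list(prefix_indices or []) + list(body) + list(suffix_indices or [])


# Hypothetical vocabulary; a real SentencePiece model defines its own.
vocab = {"<s>": 0, "</s>": 1, "hello": 2, "world": 3}

prefix = [vocab["<s>"]]   # plays the role of prefix_indices, shape (S) with S = 1
suffix = [vocab["</s>"]]  # plays the role of suffix_indices, shape (S) with S = 1
body = [vocab["hello"], vocab["world"]]

wrapped = wrap_indices(body, prefix, suffix)
unwrapped = wrap_indices(body)
```

Here `wrapped` carries one extra index on each side of `body`, while `unwrapped` is identical to `body` because both sides were left as `None`.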