ailia_tokenizer
1.4.0.0
ailia Tokenizer is a tokenizer for NLP that can be used from Unity or C++. The Tokenizer can convert text into tokens that can be handled by AI, or convert tokens back into text.
Whisper, CLIP, XLMRoberta, Marian, BERT Japanese WordPiece, BERT Japanese Character, T5, Roberta, BERT, GPT2, and LLAMA are supported.
Japanese tokenization using MeCab is supported.
Input text is automatically normalized to Unicode NFKC form.
In addition to the C API, a C# API and a Unity Plugin are provided, allowing you to easily integrate tokenization into apps built with Unity.
Translation to English can be performed at the same time as Tokenizer transcription, which makes it possible to implement real-time translation from Japanese or Chinese into English.
ailia Tokenizer is available for Windows, macOS, Linux, iOS, and Android.
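To make the normalization and the text-to-token round trip described above more concrete, here is a minimal sketch in Python. It does not use the ailia Tokenizer API. It assumes the normalization in question is Unicode NFKC (as implemented by Python's standard `unicodedata` module) and uses a toy character-level vocabulary; the helper names `normalize`, `build_vocab`, `encode`, and `decode` are hypothetical and exist only for this illustration.

```python
import unicodedata


def normalize(text: str) -> str:
    # Assumption: the library's normalization corresponds to Unicode NFKC,
    # which folds full-width and half-width variants into canonical forms.
    return unicodedata.normalize("NFKC", text)


def build_vocab(texts):
    # Toy character-level vocabulary: one id per distinct normalized character.
    chars = sorted({ch for t in texts for ch in normalize(t)})
    return {ch: i for i, ch in enumerate(chars)}


def encode(text, vocab):
    # Text -> token ids (normalization is applied before lookup).
    return [vocab[ch] for ch in normalize(text)]


def decode(ids, vocab):
    # Token ids -> text, using the inverted vocabulary.
    inverse = {i: ch for ch, i in vocab.items()}
    return "".join(inverse[i] for i in ids)


vocab = build_vocab(["Token 123"])
ids = encode("Ｔｏｋｅｎ　１２３", vocab)  # full-width input
text = decode(ids, vocab)                  # normalized ASCII text comes back
```

Because normalization runs before the vocabulary lookup, full-width and ASCII spellings of the same text map to the same token ids. Real tokenizers such as WordPiece or SentencePiece operate on subword units rather than single characters, so this sketch only illustrates the overall flow.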