ailia_tokenizer  1.3.0.0
About ailia Tokenizer

Overview of ailia Tokenizer

ailia Tokenizer is a tokenizer for NLP that can be used from Unity or C++. The Tokenizer can convert text into tokens that can be handled by AI, or convert tokens back into text.
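The round trip between text and tokens can be pictured with a toy example. The sketch below is not the ailia Tokenizer API: the vocabulary and the `encode` / `decode` helpers are hypothetical and exist only to illustrate the idea. Real tokenizers such as those used by BERT or GPT2 rely on learned subword vocabularies rather than whitespace splitting.

```python
from typing import List

# Toy illustration only: this is NOT the ailia Tokenizer API.
# The vocabulary and the function names below are hypothetical.
VOCAB = {"<unk>": 0, "hello": 1, "world": 2, "tokenizer": 3}
ID_TO_TOKEN = {token_id: token for token, token_id in VOCAB.items()}


def encode(text: str) -> List[int]:
    """Map each whitespace-separated word to its ID; unknown words map to <unk>."""
    return [VOCAB.get(word, VOCAB["<unk>"]) for word in text.lower().split()]


def decode(ids: List[int]) -> str:
    """Map token IDs back to words and join them with single spaces."""
    return " ".join(ID_TO_TOKEN[token_id] for token_id in ids)


ids = encode("Hello tokenizer world")
print(ids)          # token IDs that a model would consume
print(decode(ids))  # text reconstructed from the IDs
```

Note that decoding does not necessarily reproduce the input exactly: in this toy version the original capitalization is lost, and unknown words come back as `<unk>`.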

Features of ailia Tokenizer

Support for various forms of tokenization

Whisper, CLIP, XLMRoberta, Marian, BERT Japanese WordPiece, BERT Japanese Character, T5, Roberta, BERT, GPT2, and LLAMA are supported.

Support for Japanese tokenization

Japanese tokenization using Mecab is supported.

Support for Unicode normalization

Input text is automatically normalized to Unicode NFKC form.
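To show what this kind of normalization does to input text, here is a minimal sketch using Python's standard `unicodedata` module. It assumes the normalization form meant here is Unicode NFKC, and it does not call ailia Tokenizer itself; the `normalize_nfkc` helper is a hypothetical name used only for illustration.

```python
import unicodedata


def normalize_nfkc(text: str) -> str:
    """Apply Unicode NFKC normalization (compatibility decomposition, then composition)."""
    return unicodedata.normalize("NFKC", text)


# Full-width Latin letters and digits, half-width katakana with a voiced mark,
# and a Latin ligature: all are folded into their canonical counterparts.
samples = ["ＡＢＣ１２３", "ｶﾞ", "ﬁ"]
for sample in samples:
    print(repr(sample), "->", repr(normalize_nfkc(sample)))
```

Normalizing before tokenization means visually equivalent inputs, such as full-width and half-width characters, map to the same tokens.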

Unity support

In addition to the C API, a C# API and a Unity Plugin are provided, making it easy to integrate tokenization into your Unity apps.

Convenient for translating to English

Translation to English can be performed at the same time as tokenization. This makes it possible to implement real-time translation from Japanese or Chinese into English.

Supported platforms

ailia Tokenizer is available for Windows, macOS, Linux, iOS, and Android.