zorba::Tokenizer::Callback
#include <zorba/tokenizer.h>
A Callback is called once per token.
This class is used only internally by Zorba. You do not need to derive from this class.

Public Functions

void | item(Item const &item, bool entering)
    This member-function is called whenever an item that is being tokenized is entered or exited.
void | token(char const *utf8_s, size_type utf8_len, locale::iso639_1::type lang, size_type token_no, size_type sent_no, size_type para_no, Item const *item=nullptr)=0
    This member-function is called once per token.
     | ~Callback()
Public Functions

item

void item(Item const &item, bool entering)

This member-function is called whenever an item that is being tokenized is entered or exited.
The default implementation does nothing.

Parameters
item | The item being entered or exited.
entering | If true, the item is being entered; if false, the item is being exited.
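
The entering/exiting pairing can be illustrated with a small sketch. It does not include Zorba's headers: `Item` and `CallbackSketch` are hypothetical stand-ins that mirror only the `item()` signature documented here, and `DepthTracker` and `simulate_nested_items` are names invented for illustration. It also assumes that items may nest and that every entry is followed by a matching exit, which this page does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for zorba::Item; the real class lives in Zorba's headers.
struct Item {};

// Hypothetical stand-in that mirrors only the item() member documented above.
struct CallbackSketch {
  virtual ~CallbackSketch() {}
  // As documented, the default implementation does nothing.
  virtual void item(Item const& /*item*/, bool /*entering*/) {}
};

// Records how deeply nested the item currently being tokenized is.
struct DepthTracker : CallbackSketch {
  std::size_t depth = 0;      // number of items entered but not yet exited
  std::size_t max_depth = 0;  // deepest nesting seen so far

  void item(Item const&, bool entering) override {
    if (entering) {
      ++depth;
      if (depth > max_depth) max_depth = depth;
    } else {
      assert(depth > 0);  // assumption: an exit always matches an earlier entry
      --depth;
    }
  }
};

// Simulates the calls a tokenizer might make for one item nested in another.
inline std::size_t simulate_nested_items() {
  DepthTracker cb;
  Item outer, inner;
  cb.item(outer, true);   // enter outer
  cb.item(inner, true);   // enter inner
  cb.item(inner, false);  // exit inner
  cb.item(outer, false);  // exit outer
  assert(cb.depth == 0);  // all entries were matched by exits
  return cb.max_depth;
}
```

A callback that does not care about item boundaries can simply leave `item()` alone, since the default does nothing.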
token

void token(char const *utf8_s, size_type utf8_len, locale::iso639_1::type lang, size_type token_no, size_type sent_no, size_type para_no, Item const *item=nullptr)=0

This member-function is called once per token.

Parameters
utf8_s | The UTF-8 token string. It is not null-terminated.
utf8_len | The number of bytes in the token string.
lang | The language of the token.
token_no | The token number. Token numbers start at 0.
sent_no | The sentence number. Sentence numbers start at 1.
para_no | The paragraph number. Paragraph numbers start at 1.
item | The Item this token is from, if any.
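
Because `utf8_s` is not null-terminated, a receiver has to use `utf8_len` when copying the token. The sketch below shows one way to do that. It is not built against Zorba: `Item`, `size_type`, `lang_type` (a placeholder for `locale::iso639_1::type`) and `CallbackSketch` are hypothetical stand-ins that only mirror the signature documented here, and `TokenCollector` and `collect_demo` are illustrative names. The check that token numbers arrive consecutively is an assumption, not something this page guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Item {};                 // hypothetical stand-in for zorba::Item
typedef std::size_t size_type;  // assumption: an unsigned size type
typedef int lang_type;          // placeholder for locale::iso639_1::type

// Hypothetical stand-in that mirrors only the token() member documented above.
struct CallbackSketch {
  virtual ~CallbackSketch() {}
  virtual void token(char const* utf8_s, size_type utf8_len, lang_type lang,
                     size_type token_no, size_type sent_no, size_type para_no,
                     Item const* item = nullptr) = 0;
};

// Copies every token it receives into a vector of strings.
struct TokenCollector : CallbackSketch {
  std::vector<std::string> tokens;

  void token(char const* utf8_s, size_type utf8_len, lang_type /*lang*/,
             size_type token_no, size_type /*sent_no*/, size_type /*para_no*/,
             Item const* /*item*/ = nullptr) override {
    assert(utf8_s != nullptr);
    // Assumption: token numbers start at 0 and increase by one per call.
    assert(token_no == tokens.size());
    // Use the explicit byte length; the buffer is not null-terminated.
    tokens.push_back(std::string(utf8_s, utf8_len));
  }
};

// Simulates a tokenizer reporting two tokens from one shared buffer.
inline std::vector<std::string> collect_demo() {
  char const text[] = "hello world";
  TokenCollector cb;
  cb.token(text, 5, 0, 0, 1, 1);      // "hello": token 0, sentence 1, paragraph 1
  cb.token(text + 6, 5, 0, 1, 1, 1);  // "world": token 1, sentence 1, paragraph 1
  return cb.tokens;
}
```

Passing `text + 6` with a length of 5 shows why the length matters: both tokens point into the same buffer, and only the byte count marks where each one ends.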