zorba::Tokenizer::Callback

#include <zorba/tokenizer.h>

A Callback is called once per token. This class is used only internally by Zorba. You do not need to derive from it.

Public Functions

void item(Item const &item, bool entering)

This member-function is called whenever an item that is being tokenized is entered or exited.

void token(char const *utf8_s, size_type utf8_len, locale::iso639_1::type lang, size_type token_no, size_type sent_no, size_type para_no, Item const *item=nullptr)=0

This member-function is called once per token.

~Callback()

Public Types

size_type

Public Functions

item

void item(Item const &item, bool entering)

This member-function is called whenever an item that is being tokenized is entered or exited.

The default implementation does nothing.

Parameters

item The item being entered or exited.
entering If true, the item is being entered; if false, the item is being exited.
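The enter/exit pairing described above can be illustrated with a small sketch. It does not include the Zorba headers: the `Item` and `Callback` types below are simplified stand-ins written for this example (the stand-in `Callback` omits `token()` entirely), and `DepthTracker` and `demo_nesting` are hypothetical names. The sketch also assumes that items may nest and that every enter call is matched by an exit call, which this page does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch_item {

// Stand-in for zorba::Item (assumption: the real class is far richer).
struct Item { int id; };

// Simplified stand-in for zorba::Tokenizer::Callback; only item() is modeled.
class Callback {
public:
  virtual ~Callback() {}
  // Mirrors the documented default: do nothing.
  virtual void item(Item const &, bool /*entering*/) {}
};

// Hypothetical subclass that tracks how deeply items are nested.
class DepthTracker : public Callback {
public:
  std::size_t depth = 0;      // items currently entered but not yet exited
  std::size_t max_depth = 0;  // deepest nesting seen so far

  void item(Item const &, bool entering) override {
    if (entering) {
      ++depth;
      if (depth > max_depth) max_depth = depth;
    } else {
      assert(depth > 0);  // an exit without a matching enter would be a bug
      --depth;
    }
  }
};

// Simulates a tokenizer entering an outer item, then an inner one, then leaving both.
inline DepthTracker demo_nesting() {
  DepthTracker tracker;
  Item outer{1}, inner{2};
  tracker.item(outer, true);
  tracker.item(inner, true);
  tracker.item(inner, false);
  tracker.item(outer, false);
  return tracker;
}

} // namespace sketch_item
```

After `demo_nesting()` returns, `depth` is back to 0 and `max_depth` is 2, because the two simulated items were entered one inside the other and then both exited.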

token

void token(char const *utf8_s, size_type utf8_len, locale::iso639_1::type lang, size_type token_no, size_type sent_no, size_type para_no, Item const *item=nullptr)=0

This member-function is called once per token.

Parameters

utf8_s The UTF-8 token string. It is not null-terminated.
utf8_len The number of bytes in the token string.
lang The language of the token.
token_no The token number. Token numbers start at 0.
sent_no The sentence number. Sentence numbers start at 1.
para_no The paragraph number. Paragraph numbers start at 1.
item The Item this token is from, if any.
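Because `utf8_s` is not null-terminated, an implementation has to use `utf8_len` to bound every read. The following is a minimal sketch of that, not Zorba's own code: `Item`, `Lang`, and `Callback` are stand-ins declared locally so the example compiles without the Zorba headers (`Lang` is a hypothetical substitute for `locale::iso639_1::type`), and `TokenRecord`, `CollectingCallback`, and `demo` are names invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch_token {

// Stand-ins for the real Zorba types (assumptions, not the library's definitions).
struct Item {};
enum class Lang { unknown, en };

class Callback {
public:
  typedef std::size_t size_type;
  virtual ~Callback() {}
  virtual void token(char const *utf8_s, size_type utf8_len, Lang lang,
                     size_type token_no, size_type sent_no, size_type para_no,
                     Item const *item = nullptr) = 0;
};

// One collected token plus its position counters.
struct TokenRecord {
  std::string text;
  std::size_t token_no;
  std::size_t sent_no;
  std::size_t para_no;
};

// Hypothetical callback that copies each token into an owned std::string.
class CollectingCallback : public Callback {
public:
  std::vector<TokenRecord> tokens;

  void token(char const *utf8_s, size_type utf8_len, Lang /*lang*/,
             size_type token_no, size_type sent_no, size_type para_no,
             Item const * /*item*/ = nullptr) override {
    assert(utf8_s != nullptr || utf8_len == 0);
    TokenRecord rec;
    // Copy exactly utf8_len bytes; never rely on a terminating NUL.
    if (utf8_len > 0) rec.text.assign(utf8_s, utf8_len);
    rec.token_no = token_no;
    rec.sent_no = sent_no;
    rec.para_no = para_no;
    tokens.push_back(rec);
  }
};

// Simulates a tokenizer reporting two tokens that point into one shared buffer.
inline std::vector<TokenRecord> demo() {
  char const text[] = "hello world";
  CollectingCallback cb;
  cb.token(text, 5, Lang::en, 0, 1, 1);      // "hello" is followed by a space, not NUL
  cb.token(text + 6, 5, Lang::en, 1, 1, 1);  // "world"
  return cb.tokens;
}

} // namespace sketch_token
```

In the simulated calls, token numbers start at 0 while sentence and paragraph numbers start at 1, matching the parameter descriptions above. Copying into a `std::string` with an explicit length keeps the token valid after the call returns, which matters if the caller reuses its buffer (this page does not say whether it does).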

~Callback

~Callback()