encoding::utf8
utf8: support for UTF-8 encoded data
encoding::utf8 provides helper functions for working with runes and UTF-8 encoded slices.
Index
Types
type more = void;
type decoder = struct {
offs: size,
src: []u8,
};
Errors
type invalid = !void;
Functions
fn decode(src: []u8) decoder;
fn encoderune(r: rune) []u8;
fn next(d: *decoder) (rune | done | more | invalid);
fn position(d: *decoder) size;
fn prev(d: *decoder) (rune | done | more | invalid);
fn remaining(d: *decoder) []u8;
fn runesz(r: rune) size;
fn slice(begin: *decoder, end: *decoder) []u8;
fn strerror(err: invalid) str;
fn utf8sz(c: u8) (size | invalid);
fn validate(src: []u8) (void | invalid);
Types
type more
type more = void;
Returned when more data is needed, i.e. when an incomplete UTF-8 sequence is encountered.
type decoder
type decoder = struct {
offs: size,
src: []u8,
};
Errors
type invalid
type invalid = !void;
Returned when an invalid UTF-8 sequence was found.
Functions
fn decode
fn decode(src: []u8) decoder;
Initializes a new UTF-8 decoder. You may copy the decoder to save its state.
fn encoderune
fn encoderune(r: rune) []u8;
Encodes a rune as UTF-8 and returns the result as a slice. The return value is statically allocated and will be overwritten by subsequent calls to encoderune; copy the bytes if you need to keep them.
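As an illustration of the static-buffer caveat, the following sketch encodes a rune and copies the bytes into a caller-owned array before encoding another one. The four-byte array and the variable names are choices made for this example only, not part of the module.

```hare
use encoding::utf8;

export fn main() void = {
	// U+00E9 (e with acute accent) encodes to two bytes: 0xC3 0xA9.
	const enc = utf8::encoderune('\u00e9');
	assert(len(enc) == 2 && enc[0] == 0xC3 && enc[1] == 0xA9);

	// Copy the bytes out before the next call reuses the buffer.
	// Four bytes is enough for any single UTF-8 encoded rune.
	let saved: [4]u8 = [0...];
	const n = len(enc);
	saved[..n] = enc;

	// A second call may overwrite the memory that enc refers to,
	// so only the copy in saved should be relied upon from here on.
	const second = utf8::encoderune('a');
	assert(len(second) == 1 && second[0] == 0x61);
	assert(saved[0] == 0xC3 && saved[1] == 0xA9);
};
```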
fn next
fn next(d: *decoder) (rune | done | more | invalid);
Returns the next rune from a decoder. done is returned when there are no remaining codepoints.
If an invalid UTF-8 sequence is encountered, the position of the decoder is set to immediately after the first invalid byte.
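A minimal sketch of a decoding loop, assuming the input comes from a string literal converted with strings::toutf8 (strings and fmt are other standard library modules, not part of encoding::utf8). Each of the four possible results of next is handled explicitly; how to react to more and invalid is up to the caller, and aborting is only one option.

```hare
use encoding::utf8;
use fmt;
use strings;

export fn main() void = {
	// strings::toutf8 borrows the string's bytes without copying.
	let d = utf8::decode(strings::toutf8("h\u00e9llo"));
	for (true) {
		match (utf8::next(&d)) {
		case let r: rune =>
			// One line of output per decoded rune.
			fmt::printfln("{}", r)!;
		case done =>
			// No codepoints remain.
			break;
		case utf8::more =>
			// The input ends in the middle of a sequence.
			fmt::fatal("truncated UTF-8 sequence");
		case utf8::invalid =>
			fmt::fatal("invalid UTF-8 sequence");
		};
	};
};
```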
fn position
fn position(d: *decoder) size;
Returns the position of the decoder. When possible, it's generally considered more idiomatic to use other functions in this module, such as remaining and slice.
fn prev
fn prev(d: *decoder) (rune | done | more | invalid);
Returns the previous rune from a decoder. done is returned when there are no previous codepoints.
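The sketch below shows one way prev might be used: the decoder is first advanced to the end of the input with next, then walked backwards. The input string is an assumption made for the example.

```hare
use encoding::utf8;
use strings;

export fn main() void = {
	// "a" followed by U+00F1 (n with tilde): three bytes in total.
	let d = utf8::decode(strings::toutf8("a\u00f1"));

	// Advance to the end of the input.
	for (true) {
		match (utf8::next(&d)) {
		case rune =>
			continue;
		case =>
			break;
		};
	};

	// Walk backwards: the runes come out in reverse order.
	assert((utf8::prev(&d) as rune) == '\u00f1');
	assert((utf8::prev(&d) as rune) == 'a');

	// At the start of the input there is nothing before the decoder.
	assert(utf8::prev(&d) is done);
};
```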
fn remaining
fn remaining(d: *decoder) []u8;
Returns a subslice from the next byte in the decoder to the end of the slice.
fn runesz
fn runesz(r: rune) size;
Returns the size of a rune, in octets, when encoded as UTF-8.
fn slice
fn slice(begin: *decoder, end: *decoder) []u8;
Returns a subslice from the position of the first decoder to the position of the second decoder. The decoders must originate from the same slice, and the position of the second decoder must not be before the position of the first one.
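To illustrate how a copied decoder can mark a starting point for slice, here is a small sketch. It copies the decoder before advancing, then uses slice, position and remaining to split the input; the sample string and the use of strings::fromutf8 for printing are assumptions of this example.

```hare
use encoding::utf8;
use fmt;
use strings;

export fn main() void = {
	// "h", U+00E9 (two bytes), "l", "l", "o": six bytes in total.
	let d = utf8::decode(strings::toutf8("h\u00e9llo"));

	// Copying the struct saves the decoder's current state.
	let begin = d;

	// Advance past the first two runes.
	for (let i = 0z; i < 2; i += 1) {
		utf8::next(&d) as rune;
	};

	// Bytes between the saved decoder and the current one.
	const head = utf8::slice(&begin, &d);
	assert(len(head) == 3);
	assert(utf8::position(&d) == 3);

	// Bytes that have not been decoded yet ("llo").
	const tail = utf8::remaining(&d);
	assert(len(tail) == 3);

	fmt::println(strings::fromutf8(head)!)!;
	fmt::println(strings::fromutf8(tail)!)!;
};
```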
fn strerror
fn strerror(err: invalid) str;
Returns a human-friendly string for invalid.
fn utf8sz
fn utf8sz(c: u8) (size | invalid);
Returns the expected length, in bytes, of a UTF-8 sequence given its first byte, or invalid if the given byte doesn't begin a valid UTF-8 sequence.
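As a rough illustration of how runesz and utf8sz relate, the sketch below checks a few sample values. runesz works from a decoded rune, while utf8sz works from the leading byte of an encoded sequence; the sample codepoints are chosen for this example.

```hare
use encoding::utf8;

export fn main() void = {
	// Encoded size of a rune, one sample per length class.
	assert(utf8::runesz('a') == 1);          // U+0061
	assert(utf8::runesz('\u00e9') == 2);     // U+00E9
	assert(utf8::runesz('\u20ac') == 3);     // U+20AC, euro sign
	assert(utf8::runesz('\U0001f600') == 4); // U+1F600

	// Expected sequence length, judged from the first byte alone.
	assert((utf8::utf8sz(0x61) as size) == 1);
	assert((utf8::utf8sz(0xC3) as size) == 2); // first byte of U+00E9
	assert((utf8::utf8sz(0xE2) as size) == 3); // first byte of U+20AC

	// 0x80 is a continuation byte, so it cannot begin a sequence.
	assert(utf8::utf8sz(0x80) is utf8::invalid);
};
```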
fn validate
fn validate(src: []u8) (void | invalid);
Returns void if a given byte slice contains only valid UTF-8 sequences, otherwise returns invalid.
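A short sketch of validate on one well-formed and one malformed input. The byte values are assumptions picked for the example; 0xFF never appears in valid UTF-8.

```hare
use encoding::utf8;
use strings;

export fn main() void = {
	// The bytes of a string literal are valid UTF-8.
	assert(utf8::validate(strings::toutf8("h\u00e9llo")) is void);

	// "h", then a stray 0xFF byte, then "i".
	let bad: [3]u8 = [0x68, 0xFF, 0x69];
	assert(utf8::validate(bad[..]) is utf8::invalid);
};
```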