encoding::utf8 — Hare documentation

encoding::utf8 +x86_64 +linux

utf8: support for UTF-8 encoded data

encoding::utf8 provides helper functions for working with runes and UTF-8 encoded slices.

Index

Types

type more = void;

// Undocumented types:
type decoder = struct {
	offs: size,
	src: []u8,
};

Errors

type invalid = !void;

Functions

fn decode(src: []u8) decoder;
fn encoderune(r: rune) []u8;
fn next(d: *decoder) (rune | done | more | invalid);
fn position(d: *decoder) size;
fn prev(d: *decoder) (rune | done | more | invalid);
fn remaining(d: *decoder) []u8;
fn runesz(r: rune) size;
fn slice(begin: *decoder, end: *decoder) []u8;
fn strerror(err: invalid) str;
fn utf8sz(c: u8) (size | invalid);
fn validate(src: []u8) (void | invalid);

Types

type more

type more = void;

Returned when more data is needed, i.e. when an incomplete UTF-8 sequence is encountered.
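As an illustration, the sketch below checks whether a chunk of input stops in the middle of a multi-byte sequence, which is the situation more describes. The helper name truncated is hypothetical and not part of this module.

```hare
use encoding::utf8;
use fmt;

// truncated is a hypothetical helper: it reports whether chunk stops in the
// middle of a multi-byte sequence, i.e. whether next eventually returns more.
fn truncated(chunk: []u8) bool = {
	let d = utf8::decode(chunk);
	for (true) {
		match (utf8::next(&d)) {
		case rune =>
			continue;
		case utf8::more =>
			return true;
		case =>
			// done or invalid: the chunk is not merely incomplete
			break;
		};
	};
	return false;
};

export fn main() void = {
	// 0xC3 0xA9 encodes U+00E9. The second input is cut off after the
	// lead byte 0xC3.
	fmt::println(truncated([0x68, 0xC3, 0xA9]))!;
	fmt::println(truncated([0x68, 0xC3]))!;
};
```

A caller reading from a stream might use a check like this to decide whether to wait for more bytes before decoding the tail of a buffer.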

type decoder

type decoder = struct {
	offs: size,
	src: []u8,
};

Errors

type invalid

type invalid = !void;

Returned when an invalid UTF-8 sequence was found.

Functions

fn decode

fn decode(src: []u8) decoder;

Initializes a new UTF-8 decoder. You may copy the decoder to save its state.
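One use of copying a decoder is looking ahead without consuming input. The sketch below assumes a plain struct copy is enough to save the state, as the description suggests. The helper name peek is hypothetical.

```hare
use encoding::utf8;
use fmt;
use strings;

// peek is a hypothetical helper: it decodes from a copy of the decoder, so
// the caller's position is left untouched.
fn peek(d: *utf8::decoder) (rune | done | utf8::more | utf8::invalid) = {
	let saved = *d;
	return utf8::next(&saved);
};

export fn main() void = {
	let d = utf8::decode(strings::toutf8("ab"));
	assert(peek(&d) as rune == 'a'); // does not advance d
	assert(utf8::next(&d) as rune == 'a'); // advances d
	assert(peek(&d) as rune == 'b');
	fmt::println("ok")!;
};
```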

fn encoderune

fn encoderune(r: rune) []u8;

Encodes a rune as UTF-8 and returns the result as a slice. The returned slice is statically allocated, so its contents are not preserved across subsequent calls to encoderune. Copy the bytes if you need to keep them.
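Because the result lives in static storage, code that encodes more than one rune will usually want to copy each result out before the next call. A minimal sketch, using a hypothetical helper named encodeinto and caller-provided buffers:

```hare
use encoding::utf8;
use fmt;

// encodeinto is a hypothetical helper: it copies the encoding of r into the
// start of buf and returns the number of bytes written. UTF-8 needs at most
// 4 bytes per rune, so a 4-byte buffer is always large enough.
fn encodeinto(buf: []u8, r: rune) size = {
	const enc = utf8::encoderune(r);
	assert(len(buf) >= len(enc), "buffer too small");
	// Copy immediately: enc points at storage that encoderune reuses.
	buf[..len(enc)] = enc;
	return len(enc);
};

export fn main() void = {
	let a: [4]u8 = [0...];
	let b: [4]u8 = [0...];
	const na = encodeinto(a[..], 'é'); // U+00E9
	const nb = encodeinto(b[..], 'z');
	// a and b now hold independent copies of the two encodings.
	fmt::println(na, nb)!;
};
```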

fn next

fn next(d: *decoder) (rune | done | more | invalid);

Returns the next rune from a decoder. done is returned when there are no remaining codepoints.

If an invalid UTF-8 sequence is encountered, the position of the decoder is set to immediately after the first invalid byte.
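The sketch below shows one way to drive next in a loop and keep going after an error. It relies on the behaviour described above, namely that the decoder has already moved past the first invalid byte when invalid is returned. The helper name scan is hypothetical.

```hare
use encoding::utf8;
use fmt;

// scan is a hypothetical helper: it decodes all of src and returns a tuple of
// (runes decoded, errors encountered).
fn scan(src: []u8) (size, size) = {
	let d = utf8::decode(src);
	let good = 0z, bad = 0z;
	for (true) {
		match (utf8::next(&d)) {
		case rune =>
			good += 1;
		case done =>
			break;
		case utf8::more =>
			// Incomplete sequence at the end of src. No further
			// input is coming here, so count it and stop.
			bad += 1;
			break;
		case utf8::invalid =>
			// The decoder is already positioned after the first
			// invalid byte, so the loop can simply continue.
			bad += 1;
		};
	};
	return (good, bad);
};

export fn main() void = {
	// 'a', a stray 0xFF byte (never valid in UTF-8), then 'b'
	const r = scan([0x61, 0xFF, 0x62]);
	fmt::println(r.0, r.1)!;
};
```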

fn position

fn position(d: *decoder) size;

Returns the position of the decoder. When possible, it's generally considered more idiomatic to use other functions in this module, such as remaining and slice.

fn prev

fn prev(d: *decoder) (rune | done | more | invalid);

Returns the previous rune from a decoder. done is returned when there are no previous codepoints.
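A decoder returned by decode starts at the beginning of its input, so prev is only useful once the position has moved. The sketch below starts from the end by writing the offs member directly. That member is undocumented, so treating it as a settable byte offset is an assumption. The helper name lastrune is hypothetical.

```hare
use encoding::utf8;
use fmt;
use strings;

// lastrune is a hypothetical helper: it returns the final rune of src.
fn lastrune(src: []u8) (rune | done | utf8::more | utf8::invalid) = {
	let d = utf8::decode(src);
	// Assumption: offs is the decoder's byte position and may be set
	// directly to start decoding from the end of the input.
	d.offs = len(src);
	return utf8::prev(&d);
};

export fn main() void = {
	match (lastrune(strings::toutf8("café"))) {
	case let r: rune =>
		fmt::println(r)!;
	case =>
		fmt::println("no rune available")!;
	};
};
```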

fn remaining

fn remaining(d: *decoder) []u8;

Returns a subslice from the next byte in the decoder to the end of the slice.
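For example, remaining can hand the undecoded tail of the input to other code after a few runes have been consumed. A minimal sketch with a hypothetical helper named skipfirst:

```hare
use encoding::utf8;
use fmt;
use strings;

// skipfirst is a hypothetical helper: it consumes one rune and returns the
// bytes that follow it. If no rune can be decoded, src is returned unchanged.
fn skipfirst(src: []u8) []u8 = {
	let d = utf8::decode(src);
	if (!(utf8::next(&d) is rune)) {
		return src;
	};
	return utf8::remaining(&d);
};

export fn main() void = {
	// 'é' occupies two bytes, so only the single byte of 'a' remains.
	const rest = skipfirst(strings::toutf8("éa"));
	fmt::println(len(rest))!;
};
```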

fn runesz

fn runesz(r: rune) size;

Returns the size of a rune, in octets, when encoded as UTF-8.
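A typical use is sizing a buffer before encoding. The sketch below sums runesz over a slice of runes. The helper name encodedlen is hypothetical.

```hare
use encoding::utf8;
use fmt;

// encodedlen is a hypothetical helper: it returns the number of bytes needed
// to encode every rune in rs as UTF-8.
fn encodedlen(rs: []rune) size = {
	let n = 0z;
	for (let i = 0z; i < len(rs); i += 1) {
		n += utf8::runesz(rs[i]);
	};
	return n;
};

export fn main() void = {
	// 'a' takes 1 byte, 'é' (U+00E9) takes 2, '€' (U+20AC) takes 3.
	fmt::println(encodedlen(['a', 'é', '€']))!;
};
```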

fn slice

fn slice(begin: *decoder, end: *decoder) []u8;

Returns a subslice from the position of the first decoder to the position of the second decoder. The decoders must originate from the same slice, and the position of the second decoder must not be before the position of the first one.
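One way to use slice is to mark a start position, advance a copy of the decoder, and then extract the bytes in between. The sketch below returns everything before a delimiter rune. The helper name upto is hypothetical.

```hare
use encoding::utf8;
use fmt;
use strings;

// upto is a hypothetical helper: it returns the bytes of src that precede the
// first occurrence of delim. Scanning also stops at the end of the input or
// at the first sequence that cannot be decoded.
fn upto(src: []u8, delim: rune) []u8 = {
	let begin = utf8::decode(src);
	let end = begin; // both decoders originate from the same slice
	for (true) {
		let probe = end; // decode from a copy so end can stay put
		match (utf8::next(&probe)) {
		case let r: rune =>
			if (r == delim) {
				break;
			};
			end = probe;
		case =>
			break;
		};
	};
	return utf8::slice(&begin, &end);
};

export fn main() void = {
	const word = upto(strings::toutf8("héllo wörld"), ' ');
	fmt::println(strings::fromutf8(word)!)!;
};
```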

fn strerror

fn strerror(err: invalid) str;

Returns a human-friendly string for invalid.

fn utf8sz

fn utf8sz(c: u8) (size | invalid);

Returns the expected length, in bytes, of a UTF-8 sequence given its first byte, or invalid if the given byte cannot begin a valid UTF-8 sequence.
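As a rough illustration, the sketch below classifies a few bytes by what utf8sz reports for them. The helper name islead is hypothetical. The byte 0x80 is a continuation byte, so it is expected to be rejected as a first byte.

```hare
use encoding::utf8;
use fmt;

// islead is a hypothetical helper: it reports whether c can begin a UTF-8
// sequence.
fn islead(c: u8) bool = utf8::utf8sz(c) is size;

export fn main() void = {
	let bytes: [_]u8 = [0x61, 0xC3, 0xE2, 0xF0, 0x80];
	for (let i = 0z; i < len(bytes); i += 1) {
		match (utf8::utf8sz(bytes[i])) {
		case let n: size =>
			fmt::printfln("0x{:x}: starts a {}-byte sequence",
				bytes[i], n)!;
		case utf8::invalid =>
			fmt::printfln("0x{:x}: not a valid first byte",
				bytes[i])!;
		};
	};
};
```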

fn validate

fn validate(src: []u8) (void | invalid);

Returns void if a given byte slice contains only valid UTF-8 sequences, otherwise returns invalid.
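A short sketch of checking untrusted bytes before use, combining validate with strerror to produce a message. The helper name describe and the "valid UTF-8" message are hypothetical.

```hare
use encoding::utf8;
use fmt;
use strings;

// describe is a hypothetical helper: it returns a short status message for
// src, using strerror for the failure case.
fn describe(src: []u8) str = {
	return match (utf8::validate(src)) {
	case void =>
		yield "valid UTF-8";
	case let err: utf8::invalid =>
		yield utf8::strerror(err);
	};
};

export fn main() void = {
	fmt::println(describe(strings::toutf8("héllo")))!;
	// 0xFF can never appear in UTF-8 encoded data.
	fmt::println(describe([0x61, 0xFF]))!;
};
```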