Libparserutils
utf16.h File Reference

UTF-16 manipulation functions (interface).

#include <inttypes.h>
#include <parserutils/errors.h>

Functions

parserutils_error parserutils_charset_utf16_to_ucs4 (const uint8_t *s, size_t len, uint32_t *ucs4, size_t *clen)
 Convert a UTF-16 sequence into a single UCS-4 character.
 
parserutils_error parserutils_charset_utf16_from_ucs4 (uint32_t ucs4, uint8_t *s, size_t *len)
 Convert a single UCS-4 character into a UTF-16 sequence.
 
parserutils_error parserutils_charset_utf16_length (const uint8_t *s, size_t max, size_t *len)
 Calculate the length (in characters) of a bounded UTF-16 string.
 
parserutils_error parserutils_charset_utf16_char_byte_length (const uint8_t *s, size_t *len)
 Calculate the length (in bytes) of a UTF-16 character.
 
parserutils_error parserutils_charset_utf16_prev (const uint8_t *s, uint32_t off, uint32_t *prevoff)
 Find previous legal UTF-16 char in string.
 
parserutils_error parserutils_charset_utf16_next (const uint8_t *s, uint32_t len, uint32_t off, uint32_t *nextoff)
 Find next legal UTF-16 char in string.
 
parserutils_error parserutils_charset_utf16_next_paranoid (const uint8_t *s, uint32_t len, uint32_t off, uint32_t *nextoff)
 Find next legal UTF-16 char in string, without assuming the string is valid.
 

Detailed Description

UTF-16 manipulation functions (interface).

Definition in file utf16.h.

Function Documentation

◆ parserutils_charset_utf16_char_byte_length()

parserutils_error parserutils_charset_utf16_char_byte_length (const uint8_t *s, size_t *len)

Calculate the length (in bytes) of a UTF-16 character.

Parameters
    s    Pointer to start of character
    len  Pointer to location to receive length
Returns
PARSERUTILS_OK on success, appropriate error otherwise

Definition at line 133 of file utf16.c.

References len, PARSERUTILS_BADPARM, and PARSERUTILS_OK.
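
The length rule can be illustrated with a minimal standalone sketch. This is not the code in utf16.c: the `sketch_` names and the error enum are hypothetical stand-ins, and the sketch assumes the buffer holds host-endian 16-bit code units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the parserutils_error values. */
typedef enum { SKETCH_OK, SKETCH_BADPARM } sketch_error;

/* Report the byte length of the UTF-16 character starting at s.
 * Assumption: s holds host-endian 16-bit code units. */
static sketch_error sketch_utf16_char_byte_length(const uint8_t *s, size_t *len)
{
    uint16_t unit;

    if (s == NULL || len == NULL)
        return SKETCH_BADPARM;

    memcpy(&unit, s, sizeof unit); /* alignment-safe read of one code unit */

    /* A high surrogate (0xD800..0xDBFF) opens a two-unit pair. */
    *len = (unit >= 0xD800 && unit <= 0xDBFF) ? 4 : 2;
    return SKETCH_OK;
}

/* Usage: 'A' takes 2 bytes, U+1F600 (a surrogate pair) takes 4. */
static void sketch_demo(void)
{
    const uint16_t bmp[] = { 0x0041 };
    const uint16_t pair[] = { 0xD83D, 0xDE00 };
    size_t n = 0;

    assert(sketch_utf16_char_byte_length((const uint8_t *) bmp, &n) == SKETCH_OK && n == 2);
    assert(sketch_utf16_char_byte_length((const uint8_t *) pair, &n) == SKETCH_OK && n == 4);
}
```

The sketch only inspects the leading code unit, so it trusts the caller to pass a pointer to the start of a character.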

◆ parserutils_charset_utf16_from_ucs4()

parserutils_error parserutils_charset_utf16_from_ucs4 (uint32_t ucs4, uint8_t *s, size_t *len)

Convert a single UCS-4 character into a UTF-16 sequence.

Parameters
    ucs4  The character to process (0 <= ucs4 <= 0x7FFFFFFF) (host endian)
    s     Pointer to 4 byte long output buffer
    len   Pointer to location to receive length of multibyte sequence
Returns
PARSERUTILS_OK on success, appropriate error otherwise

Definition at line 70 of file utf16.c.

References len, PARSERUTILS_BADPARM, PARSERUTILS_INVALID, and PARSERUTILS_OK.

Referenced by charset_utf16_codec_encode().
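
As an illustration of the encoding step, here is a minimal standalone sketch of the standard UTF-16 encoding rule. It is not the library's implementation: the `sketch_` names and the error enum are hypothetical, and the output is assumed to be host-endian code units. Like the documented function, it needs a 4 byte output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the parserutils_error values. */
typedef enum { SKETCH_OK, SKETCH_BADPARM, SKETCH_INVALID } sketch_error;

/* Encode one UCS-4 character as UTF-16 into the 4 byte buffer s.
 * Assumption: output code units are written in host byte order. */
static sketch_error sketch_utf16_from_ucs4(uint32_t ucs4, uint8_t *s, size_t *len)
{
    uint16_t units[2] = { 0, 0 };
    size_t n;

    if (s == NULL || len == NULL)
        return SKETCH_BADPARM;

    if (ucs4 < 0x10000) {
        /* Basic Multilingual Plane: a single code unit. */
        units[0] = (uint16_t) ucs4;
        n = 2;
    } else if (ucs4 < 0x110000) {
        /* Supplementary planes: split the 20-bit offset into a pair. */
        uint32_t v = ucs4 - 0x10000;
        units[0] = (uint16_t) (0xD800 | (v >> 10));
        units[1] = (uint16_t) (0xDC00 | (v & 0x3FF));
        n = 4;
    } else {
        /* Beyond U+10FFFF: not representable in UTF-16. */
        return SKETCH_INVALID;
    }

    memcpy(s, units, n);
    *len = n;
    return SKETCH_OK;
}

/* Usage: U+1F600 encodes as the pair 0xD83D 0xDE00. */
static void sketch_demo(void)
{
    uint8_t buf[4];
    uint16_t out[2];
    size_t n = 0;

    assert(sketch_utf16_from_ucs4(0x1F600, buf, &n) == SKETCH_OK && n == 4);
    memcpy(out, buf, sizeof out);
    assert(out[0] == 0xD83D && out[1] == 0xDE00);
}
```

The sketch does not reject code points in the surrogate range (0xD800 to 0xDFFF); a stricter encoder might.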

◆ parserutils_charset_utf16_length()

parserutils_error parserutils_charset_utf16_length (const uint8_t *s, size_t max, size_t *len)

Calculate the length (in characters) of a bounded UTF-16 string.

Parameters
    s    The string
    max  Maximum length
    len  Pointer to location to receive length of string
Returns
PARSERUTILS_OK on success, appropriate error otherwise

Definition at line 102 of file utf16.c.

References len, max, PARSERUTILS_BADPARM, and PARSERUTILS_OK.
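
Counting characters rather than code units can be sketched as below. This is a standalone illustration, not the code in utf16.c. It assumes that `max` is a byte count, that counting stops at a zero code unit or at the bound (whichever comes first), and that the input is valid host-endian UTF-16; the `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the parserutils_error values. */
typedef enum { SKETCH_OK, SKETCH_BADPARM } sketch_error;

/* Count the characters in at most max bytes of UTF-16.
 * Assumptions: max is in bytes, a zero code unit ends the string,
 * and the input is valid host-endian UTF-16. */
static sketch_error sketch_utf16_length(const uint8_t *s, size_t max, size_t *len)
{
    size_t off = 0, count = 0;

    if (s == NULL || len == NULL)
        return SKETCH_BADPARM;

    while (max - off >= 2) {
        uint16_t unit;

        memcpy(&unit, s + off, sizeof unit);
        if (unit == 0)
            break;

        /* A high surrogate and its partner count as one character. */
        off += (unit >= 0xD800 && unit <= 0xDBFF) ? 4 : 2;
        count++;

        if (off > max) /* pair truncated by the bound */
            break;
    }

    *len = count;
    return SKETCH_OK;
}

/* Usage: "A", U+1F600, "B" is four code units but three characters. */
static void sketch_demo(void)
{
    const uint16_t str[] = { 0x0041, 0xD83D, 0xDE00, 0x0042 };
    size_t n = 0;

    assert(sketch_utf16_length((const uint8_t *) str, sizeof str, &n) == SKETCH_OK);
    assert(n == 3);
}
```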

◆ parserutils_charset_utf16_next()

parserutils_error parserutils_charset_utf16_next (const uint8_t *s, uint32_t len, uint32_t off, uint32_t *nextoff)

Find next legal UTF-16 char in string.

Parameters
    s        The string (assumed valid)
    len      Maximum offset in string
    off      Offset in the string to start at
    nextoff  Pointer to location to receive offset of first byte of next legal character
Returns
PARSERUTILS_OK on success, appropriate error otherwise

Definition at line 186 of file utf16.c.

References len, PARSERUTILS_BADPARM, and PARSERUTILS_OK.
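
Stepping forward through a valid string can be sketched as follows. This is a standalone illustration and may differ from the library's behaviour at the edges. It assumes that `off` and `len` are byte offsets, that `off` points at the start of a character, and that the data is host-endian; the `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the parserutils_error values. */
typedef enum { SKETCH_OK, SKETCH_BADPARM } sketch_error;

/* Find the byte offset of the character after the one at off.
 * Assumptions: offsets are in bytes, the string is valid UTF-16
 * in host byte order, and off is at a character boundary. */
static sketch_error sketch_utf16_next(const uint8_t *s, uint32_t len,
        uint32_t off, uint32_t *nextoff)
{
    uint16_t unit;
    uint32_t step;

    if (s == NULL || nextoff == NULL || off >= len || len - off < 2)
        return SKETCH_BADPARM;

    memcpy(&unit, s + off, sizeof unit);

    /* Skip both halves of a surrogate pair in one step. */
    step = (unit >= 0xD800 && unit <= 0xDBFF) ? 4 : 2;

    /* Clamp to the end of the string rather than running past it. */
    *nextoff = (len - off < step) ? len : off + step;
    return SKETCH_OK;
}

/* Usage: walk "A", U+1F600, "B" one character at a time. */
static void sketch_demo(void)
{
    const uint16_t str[] = { 0x0041, 0xD83D, 0xDE00, 0x0042 };
    const uint8_t *s = (const uint8_t *) str;
    uint32_t next = 0;

    assert(sketch_utf16_next(s, 8, 0, &next) == SKETCH_OK && next == 2);
    assert(sketch_utf16_next(s, 8, 2, &next) == SKETCH_OK && next == 6);
    assert(sketch_utf16_next(s, 8, 6, &next) == SKETCH_OK && next == 8);
}
```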

◆ parserutils_charset_utf16_next_paranoid()

parserutils_error parserutils_charset_utf16_next_paranoid (const uint8_t *s, uint32_t len, uint32_t off, uint32_t *nextoff)

Find next legal UTF-16 char in string, without assuming the string is valid.

Parameters
    s        The string (assumed to be of dubious validity)
    len      Maximum offset in string
    off      Offset in the string to start at
    nextoff  Pointer to location to receive offset of first byte of next legal character
Returns
PARSERUTILS_OK on success, appropriate error otherwise

Definition at line 214 of file utf16.c.

References len, PARSERUTILS_BADPARM, PARSERUTILS_NEEDDATA, and PARSERUTILS_OK.

Referenced by charset_utf16_codec_read_char().
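
One way to scan untrusted input for the next legal character is sketched below. This is a standalone illustration of the general idea, not the library's algorithm, and its edge-case behaviour is an assumption: it skips lone surrogates and reports that more data is needed whenever the buffer ends before a decision can be made. Offsets are assumed to be in bytes, data is assumed host-endian, and the `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the parserutils_error values. */
typedef enum { SKETCH_OK, SKETCH_BADPARM, SKETCH_NEEDDATA } sketch_error;

static uint16_t sketch_read_unit(const uint8_t *s, uint32_t off)
{
    uint16_t unit;

    memcpy(&unit, s + off, sizeof unit); /* alignment-safe read */
    return unit;
}

/* Find the next offset after off at which a legal character begins,
 * skipping lone surrogates. Assumption: offsets are in bytes. */
static sketch_error sketch_utf16_next_paranoid(const uint8_t *s, uint32_t len,
        uint32_t off, uint32_t *nextoff)
{
    uint32_t p;

    if (s == NULL || nextoff == NULL || off >= len)
        return SKETCH_BADPARM;
    if (len - off < 2)
        return SKETCH_NEEDDATA;

    for (p = off + 2; len - p >= 2; p += 2) {
        uint16_t unit = sketch_read_unit(s, p);

        if (unit < 0xD800 || unit > 0xDFFF) {
            *nextoff = p; /* ordinary BMP character */
            return SKETCH_OK;
        }
        if (unit <= 0xDBFF) {
            /* High surrogate: legal only if a low surrogate follows. */
            uint16_t low;

            if (len - p < 4)
                return SKETCH_NEEDDATA;
            low = sketch_read_unit(s, p + 2);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                *nextoff = p;
                return SKETCH_OK;
            }
        }
        /* Lone surrogate: keep scanning. */
    }

    return SKETCH_NEEDDATA;
}

/* Usage: the lone low surrogate after 'A' is skipped. */
static void sketch_demo(void)
{
    const uint16_t str[] = { 0x0041, 0xDC00, 0x0042 };
    uint32_t next = 0;

    assert(sketch_utf16_next_paranoid((const uint8_t *) str, 6, 0, &next) == SKETCH_OK);
    assert(next == 4);
}
```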

◆ parserutils_charset_utf16_prev()

parserutils_error parserutils_charset_utf16_prev (const uint8_t *s, uint32_t off, uint32_t *prevoff)

Find previous legal UTF-16 char in string.

Parameters
    s        The string
    off      Offset in the string to start at
    prevoff  Pointer to location to receive offset of first byte of previous legal character
Returns
PARSERUTILS_OK on success, appropriate error otherwise

Definition at line 158 of file utf16.c.

References PARSERUTILS_BADPARM and PARSERUTILS_OK.
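
Stepping backwards can be sketched in the same spirit. This standalone illustration is not the code in utf16.c. It assumes that `off` is a byte offset at a character boundary, that the string is valid host-endian UTF-16, and that the start of the string is offset 0; the `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the parserutils_error values. */
typedef enum { SKETCH_OK, SKETCH_BADPARM } sketch_error;

/* Find the byte offset of the character that ends just before off.
 * Assumptions: offsets are in bytes and the string is valid UTF-16
 * in host byte order. */
static sketch_error sketch_utf16_prev(const uint8_t *s, uint32_t off,
        uint32_t *prevoff)
{
    uint16_t unit;

    if (s == NULL || prevoff == NULL)
        return SKETCH_BADPARM;

    if (off < 2) {
        *prevoff = 0; /* already at the start of the string */
        return SKETCH_OK;
    }

    memcpy(&unit, s + off - 2, sizeof unit);

    /* A low surrogate is the second half of a pair: step back twice. */
    if (unit >= 0xDC00 && unit <= 0xDFFF && off >= 4)
        *prevoff = off - 4;
    else
        *prevoff = off - 2;

    return SKETCH_OK;
}

/* Usage: walk "A", U+1F600, "B" from the end back to the start. */
static void sketch_demo(void)
{
    const uint16_t str[] = { 0x0041, 0xD83D, 0xDE00, 0x0042 };
    const uint8_t *s = (const uint8_t *) str;
    uint32_t prev = 99;

    assert(sketch_utf16_prev(s, 8, &prev) == SKETCH_OK && prev == 6);
    assert(sketch_utf16_prev(s, 6, &prev) == SKETCH_OK && prev == 2);
    assert(sketch_utf16_prev(s, 2, &prev) == SKETCH_OK && prev == 0);
}
```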

◆ parserutils_charset_utf16_to_ucs4()

parserutils_error parserutils_charset_utf16_to_ucs4 (const uint8_t *s, size_t len, uint32_t *ucs4, size_t *clen)

Convert a UTF-16 sequence into a single UCS-4 character.

Parameters
    s     The sequence to process
    len   Length of sequence in bytes
    ucs4  Pointer to location to receive UCS-4 character (host endian)
    clen  Pointer to location to receive byte length of UTF-16 sequence
Returns
PARSERUTILS_OK on success, appropriate error otherwise

Definition at line 27 of file utf16.c.

References len, PARSERUTILS_BADPARM, PARSERUTILS_INVALID, PARSERUTILS_NEEDDATA, and PARSERUTILS_OK.

Referenced by charset_utf16_codec_read_char().
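
The decoding step can be illustrated with a minimal standalone sketch of the standard UTF-16 decoding rule. It is not the library's implementation: the `sketch_` names and the error enum are hypothetical, the input is assumed to be host-endian code units, and the mapping of failure cases to error codes is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the parserutils_error values. */
typedef enum {
    SKETCH_OK, SKETCH_BADPARM, SKETCH_INVALID, SKETCH_NEEDDATA
} sketch_error;

/* Decode the UTF-16 character at s into a UCS-4 value.
 * Assumption: s holds host-endian 16-bit code units. */
static sketch_error sketch_utf16_to_ucs4(const uint8_t *s, size_t len,
        uint32_t *ucs4, size_t *clen)
{
    uint16_t hi, lo;

    if (s == NULL || ucs4 == NULL || clen == NULL)
        return SKETCH_BADPARM;
    if (len < 2)
        return SKETCH_NEEDDATA;

    memcpy(&hi, s, sizeof hi);

    if (hi < 0xD800 || hi > 0xDFFF) {
        /* Not a surrogate: the code unit is the character. */
        *ucs4 = hi;
        *clen = 2;
        return SKETCH_OK;
    }
    if (hi > 0xDBFF)
        return SKETCH_INVALID; /* low surrogate without a leading high */
    if (len < 4)
        return SKETCH_NEEDDATA; /* second half of the pair is missing */

    memcpy(&lo, s + 2, sizeof lo);
    if (lo < 0xDC00 || lo > 0xDFFF)
        return SKETCH_INVALID;

    /* Recombine the two 10-bit halves and restore the 0x10000 offset. */
    *ucs4 = 0x10000 + (((uint32_t) (hi & 0x3FF) << 10) | (uint32_t) (lo & 0x3FF));
    *clen = 4;
    return SKETCH_OK;
}

/* Usage: the pair 0xD83D 0xDE00 decodes to U+1F600. */
static void sketch_demo(void)
{
    const uint16_t pair[] = { 0xD83D, 0xDE00 };
    uint32_t c = 0;
    size_t n = 0;

    assert(sketch_utf16_to_ucs4((const uint8_t *) pair, sizeof pair, &c, &n) == SKETCH_OK);
    assert(c == 0x1F600 && n == 4);
}
```

Distinguishing "need more data" from "invalid" lets a streaming caller wait for the rest of a pair instead of rejecting input that was merely split across buffers.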