Package com.ibm.icu.impl.coll
Class CollationFastLatin
java.lang.Object
    com.ibm.icu.impl.coll.CollationFastLatin
public final class CollationFastLatin extends java.lang.Object
-
-
Field Summary
Fields (columns: Modifier and Type, Field, Description)
(package private) static int
BAIL_OUT
static int
BAIL_OUT_RESULT
Comparison return value when the regular comparison must be used.
(package private) static int
CASE_AND_TERTIARY_MASK
(package private) static int
CASE_MASK
(package private) static int
COMMON_SEC
(package private) static int
COMMON_SEC_PLUS_OFFSET
(package private) static int
COMMON_TER
(package private) static int
COMMON_TER_PLUS_OFFSET
(package private) static int
CONTR_CHAR_MASK
Contraction result first word bits 8..0 contain the second contraction character, as a char index 0..NUM_FAST_CHARS-1.
(package private) static int
CONTR_LENGTH_SHIFT
Contraction result first word bits 10..9 contain the result length: 1=bail out, 2=one mini CE, 3=two mini CEs.
(package private) static int
CONTRACTION
Contraction with one fast Latin character.
(package private) static int
EOS
(package private) static int
EXPANSION
An expansion encodes two CEs.
(package private) static int
INDEX_MASK
static int
LATIN_LIMIT
static int
LATIN_MAX
(package private) static int
LATIN_MAX_UTF8_LEAD
(package private) static int
LONG_INC
(package private) static int
LONG_PRIMARY_MASK
(package private) static int
LOWER_CASE
(package private) static int
MAX_LONG
(package private) static int
MAX_SEC_AFTER
(package private) static int
MAX_SEC_BEFORE
(package private) static int
MAX_SEC_HIGH
(package private) static int
MAX_SHORT
The highest primary weight is reserved for U+FFFF.
(package private) static int
MAX_TER_AFTER
(package private) static int
MERGE_WEIGHT
(package private) static int
MIN_LONG
Encodes one CE with a long/low mini primary (there are 128).
(package private) static int
MIN_SEC_AFTER
(package private) static int
MIN_SEC_BEFORE
(package private) static int
MIN_SEC_HIGH
(package private) static int
MIN_SHORT
Encodes one CE with a short/high primary (there are 60), plus a secondary CE if the secondary weight is high.
(package private) static int
NUM_FAST_CHARS
(package private) static int
PUNCT_LIMIT
(package private) static int
PUNCT_START
(package private) static int
SEC_INC
(package private) static int
SEC_OFFSET
Lookup: Add this offset to secondary weights, except for completely ignorable CEs.
(package private) static int
SECONDARY_MASK
(package private) static int
SHORT_INC
(package private) static int
SHORT_PRIMARY_MASK
(package private) static int
TER_OFFSET
Lookup: Add this offset to tertiary weights, except for completely ignorable CEs.
(package private) static int
TERTIARY_MASK
(package private) static int
TWO_CASES_MASK
(package private) static int
TWO_COMMON_SEC_PLUS_OFFSET
(package private) static int
TWO_COMMON_TER_PLUS_OFFSET
(package private) static int
TWO_LONG_PRIMARIES_MASK
(package private) static int
TWO_LOWER_CASES
(package private) static int
TWO_SEC_OFFSETS
(package private) static int
TWO_SECONDARIES_MASK
(package private) static int
TWO_SHORT_PRIMARIES_MASK
(package private) static int
TWO_TER_OFFSETS
(package private) static int
TWO_TERTIARIES_MASK
static int
VERSION
Fast Latin format version (one byte 1..FF).
-
Constructor Summary
Constructors (columns: Modifier, Constructor, Description)
private
CollationFastLatin()
-
Method Summary
Methods (columns: Modifier and Type, Method, Description)
static int
compareUTF16(char[] table, char[] primaries, int options, java.lang.CharSequence left, java.lang.CharSequence right, int startIndex)
private static int
getCases(int variableTop, boolean strengthIsPrimary, int pair)
(package private) static int
getCharIndex(char c)
static int
getOptions(CollationData data, CollationSettings settings, char[] primaries)
Computes the options value for the compare functions and writes the precomputed primary weights.
private static int
getPrimaries(int variableTop, int pair)
private static int
getQuaternaries(int variableTop, int pair)
private static int
getSecondaries(int variableTop, int pair)
private static int
getSecondariesFromOneShortCE(int ce)
private static int
getTertiaries(int variableTop, boolean withCaseBits, int pair)
private static int
lookup(char[] table, int c)
private static long
nextPair(char[] table, int c, int ce, java.lang.CharSequence s16, int sIndex)
Java returns a negative result (use the '~' operator) if sIndex is to be incremented.
-
-
-
Field Detail
-
VERSION
public static final int VERSION
Fast Latin format version (one byte 1..FF). Must be incremented for any runtime-incompatible changes, in particular, for changes to any of the following constants. When the major version number of the main data format changes, we can reset this fast Latin version to 1.
- See Also:
- Constant Field Values
-
LATIN_MAX
public static final int LATIN_MAX
- See Also:
- Constant Field Values
-
LATIN_LIMIT
public static final int LATIN_LIMIT
- See Also:
- Constant Field Values
-
LATIN_MAX_UTF8_LEAD
static final int LATIN_MAX_UTF8_LEAD
- See Also:
- Constant Field Values
-
PUNCT_START
static final int PUNCT_START
- See Also:
- Constant Field Values
-
PUNCT_LIMIT
static final int PUNCT_LIMIT
- See Also:
- Constant Field Values
-
NUM_FAST_CHARS
static final int NUM_FAST_CHARS
- See Also:
- Constant Field Values
-
SHORT_PRIMARY_MASK
static final int SHORT_PRIMARY_MASK
- See Also:
- Constant Field Values
-
INDEX_MASK
static final int INDEX_MASK
- See Also:
- Constant Field Values
-
SECONDARY_MASK
static final int SECONDARY_MASK
- See Also:
- Constant Field Values
-
CASE_MASK
static final int CASE_MASK
- See Also:
- Constant Field Values
-
LONG_PRIMARY_MASK
static final int LONG_PRIMARY_MASK
- See Also:
- Constant Field Values
-
TERTIARY_MASK
static final int TERTIARY_MASK
- See Also:
- Constant Field Values
-
CASE_AND_TERTIARY_MASK
static final int CASE_AND_TERTIARY_MASK
- See Also:
- Constant Field Values
-
TWO_SHORT_PRIMARIES_MASK
static final int TWO_SHORT_PRIMARIES_MASK
- See Also:
- Constant Field Values
-
TWO_LONG_PRIMARIES_MASK
static final int TWO_LONG_PRIMARIES_MASK
- See Also:
- Constant Field Values
-
TWO_SECONDARIES_MASK
static final int TWO_SECONDARIES_MASK
- See Also:
- Constant Field Values
-
TWO_CASES_MASK
static final int TWO_CASES_MASK
- See Also:
- Constant Field Values
-
TWO_TERTIARIES_MASK
static final int TWO_TERTIARIES_MASK
- See Also:
- Constant Field Values
-
CONTRACTION
static final int CONTRACTION
Contraction with one fast Latin character. Use INDEX_MASK to find the start of the contraction list after the fixed table. The first entry contains the default mapping. Otherwise use CONTR_CHAR_MASK for the contraction character index (in ascending order). Use CONTR_LENGTH_SHIFT for the length of the entry (1=BAIL_OUT, 2=one CE, 3=two CEs). Also, U+0000 maps to a contraction entry, so that the fast path need not check for NUL termination. It usually maps to a contraction list with only the completely ignorable default value.
- See Also:
- Constant Field Values
-
EXPANSION
static final int EXPANSION
An expansion encodes two CEs. Use INDEX_MASK to find the pair of CEs after the fixed table. The higher a mini CE value, the easier it is to process. For expansions and higher, no context needs to be considered.
- See Also:
- Constant Field Values
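The remark that higher mini CE values are easier to process suggests that a mini CE can be classified with a cascade of threshold comparisons. The following is a minimal sketch of that idea, not the class's own code. The class name, the method name and the "special or ignorable" label are hypothetical, and the numeric values are assumptions for illustration that should be checked against the Constant Field Values page.

```java
// Hypothetical sketch: classify a 16-bit mini CE by comparing it against
// the range thresholds, highest first.
final class MiniCEKindSketch {
    // Assumed values, for illustration only.
    static final int CONTRACTION = 0x400;
    static final int EXPANSION = 0x800;
    static final int MIN_LONG = 0xc00;
    static final int MIN_SHORT = 0x1000;

    static String kind(int miniCE) {
        if (miniCE >= MIN_SHORT) return "short primary";
        if (miniCE >= MIN_LONG) return "long primary";
        if (miniCE >= EXPANSION) return "expansion";
        if (miniCE >= CONTRACTION) return "contraction";
        // Everything below CONTRACTION is treated here as one catch-all group.
        return "special or ignorable";
    }

    public static void main(String[] args) {
        for (int ce : new int[] {0x1400, 0xc08, 0x805, 0x400, 0}) {
            System.out.println(Integer.toHexString(ce) + " -> " + kind(ce));
        }
    }
}
```

Testing the highest range first means the most common case (letters with short primaries, per the MIN_SHORT description) is decided with a single comparison.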
-
MIN_LONG
static final int MIN_LONG
Encodes one CE with a long/low mini primary (there are 128). All potentially-variable primaries must be in this range, to make the short-primary path as fast as possible.
- See Also:
- Constant Field Values
-
LONG_INC
static final int LONG_INC
- See Also:
- Constant Field Values
-
MAX_LONG
static final int MAX_LONG
- See Also:
- Constant Field Values
-
MIN_SHORT
static final int MIN_SHORT
Encodes one CE with a short/high primary (there are 60), plus a secondary CE if the secondary weight is high. Fast handling: At least all letter primaries should be in this range.
- See Also:
- Constant Field Values
-
SHORT_INC
static final int SHORT_INC
- See Also:
- Constant Field Values
-
MAX_SHORT
static final int MAX_SHORT
The highest primary weight is reserved for U+FFFF.
- See Also:
- Constant Field Values
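The counts quoted for the two primary ranges (128 long primaries, 60 short primaries) can be reproduced from the MIN/INC/MAX triples. The sketch below does that arithmetic with assumed constant values; they are not taken from this page and should be confirmed against the Constant Field Values page. The class and method names are hypothetical.

```java
// Hypothetical sketch: count the distinct primaries in an inclusive
// range [min, max] that is stepped through in increments of inc.
final class MiniPrimaryCountSketch {
    // Assumed values, for illustration only.
    static final int MIN_LONG = 0xc00, LONG_INC = 8, MAX_LONG = 0xff8;
    static final int MIN_SHORT = 0x1000, SHORT_INC = 0x400, MAX_SHORT = 0xfc00;

    static int count(int min, int max, int inc) {
        return (max - min) / inc + 1;
    }

    public static void main(String[] args) {
        System.out.println("long primaries:  " + count(MIN_LONG, MAX_LONG, LONG_INC));
        System.out.println("short primaries: " + count(MIN_SHORT, MAX_SHORT, SHORT_INC));
    }
}
```

With these assumed values the two calls yield 128 and 60, which agrees with the counts given in the MIN_LONG and MIN_SHORT descriptions.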
-
MIN_SEC_BEFORE
static final int MIN_SEC_BEFORE
- See Also:
- Constant Field Values
-
SEC_INC
static final int SEC_INC
- See Also:
- Constant Field Values
-
MAX_SEC_BEFORE
static final int MAX_SEC_BEFORE
- See Also:
- Constant Field Values
-
COMMON_SEC
static final int COMMON_SEC
- See Also:
- Constant Field Values
-
MIN_SEC_AFTER
static final int MIN_SEC_AFTER
- See Also:
- Constant Field Values
-
MAX_SEC_AFTER
static final int MAX_SEC_AFTER
- See Also:
- Constant Field Values
-
MIN_SEC_HIGH
static final int MIN_SEC_HIGH
- See Also:
- Constant Field Values
-
MAX_SEC_HIGH
static final int MAX_SEC_HIGH
- See Also:
- Constant Field Values
-
SEC_OFFSET
static final int SEC_OFFSET
Lookup: Add this offset to secondary weights, except for completely ignorable CEs. Must be greater than any special value, e.g., MERGE_WEIGHT. The exact value is not relevant for the format version.
- See Also:
- Constant Field Values
-
COMMON_SEC_PLUS_OFFSET
static final int COMMON_SEC_PLUS_OFFSET
- See Also:
- Constant Field Values
-
TWO_SEC_OFFSETS
static final int TWO_SEC_OFFSETS
- See Also:
- Constant Field Values
-
TWO_COMMON_SEC_PLUS_OFFSET
static final int TWO_COMMON_SEC_PLUS_OFFSET
- See Also:
- Constant Field Values
-
LOWER_CASE
static final int LOWER_CASE
- See Also:
- Constant Field Values
-
TWO_LOWER_CASES
static final int TWO_LOWER_CASES
- See Also:
- Constant Field Values
-
COMMON_TER
static final int COMMON_TER
- See Also:
- Constant Field Values
-
MAX_TER_AFTER
static final int MAX_TER_AFTER
- See Also:
- Constant Field Values
-
TER_OFFSET
static final int TER_OFFSET
Lookup: Add this offset to tertiary weights, except for completely ignorable CEs. Must be greater than any special value, e.g., MERGE_WEIGHT. Must be greater than case bits as well, so that with combined case+tertiary weights plus the offset the tertiary bits do not spill over into the case bits. The exact value is not relevant for the format version.
- See Also:
- Constant Field Values
-
COMMON_TER_PLUS_OFFSET
static final int COMMON_TER_PLUS_OFFSET
- See Also:
- Constant Field Values
-
TWO_TER_OFFSETS
static final int TWO_TER_OFFSETS
- See Also:
- Constant Field Values
-
TWO_COMMON_TER_PLUS_OFFSET
static final int TWO_COMMON_TER_PLUS_OFFSET
- See Also:
- Constant Field Values
-
MERGE_WEIGHT
static final int MERGE_WEIGHT
- See Also:
- Constant Field Values
-
EOS
static final int EOS
- See Also:
- Constant Field Values
-
BAIL_OUT
static final int BAIL_OUT
- See Also:
- Constant Field Values
-
CONTR_CHAR_MASK
static final int CONTR_CHAR_MASK
Contraction result first word bits 8..0 contain the second contraction character, as a char index 0..NUM_FAST_CHARS-1. Each contraction list is terminated with a word containing CONTR_CHAR_MASK.
- See Also:
- Constant Field Values
-
CONTR_LENGTH_SHIFT
static final int CONTR_LENGTH_SHIFT
Contraction result first word bits 10..9 contain the result length: 1=bail out, 2=one mini CE, 3=two mini CEs.
- See Also:
- Constant Field Values
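The bit layout described for CONTR_CHAR_MASK and CONTR_LENGTH_SHIFT can be illustrated by walking a small contraction list. This is a sketch under stated assumptions rather than the class's own lookup code: the mask and shift values are derived from the "bits 8..0" and "bits 10..9" wording, the entry length is assumed to count the head word plus its mini CEs, and the class name, method names and sample table are hypothetical.

```java
// Hypothetical sketch: search a contraction list whose entries are sorted
// by ascending character index and closed by a terminator word.
final class ContractionListSketch {
    static final int CONTR_CHAR_MASK = 0x1ff;   // bits 8..0
    static final int CONTR_LENGTH_SHIFT = 9;    // length sits in bits 10..9

    static int charIndex(int head) { return head & CONTR_CHAR_MASK; }

    static int length(int head) { return (head >> CONTR_LENGTH_SHIFT) & 3; }

    // Returns the index of the entry for x, or start (the default entry)
    // when the list has no entry for x.
    static int findEntry(char[] table, int start, int x) {
        int i = start;
        int head = table[i];
        do {
            i += length(head);      // skip the current entry
            head = table[i];
        } while (x > charIndex(head));
        return x == charIndex(head) ? i : start;
    }

    // Sample list: default entry, entry for index 0x68 with two mini CEs,
    // entry for index 0x6c that bails out, then the terminator.
    static final char[] SAMPLE = {
        0x0400, 0x1234,             // default: length 2
        0x0668, 0x2000, 0x2400,     // index 0x68: length 3
        0x026c,                     // index 0x6c: length 1
        0x01ff                      // terminator
    };

    public static void main(String[] args) {
        int entry = findEntry(SAMPLE, 0, 0x6c);
        System.out.println("entry at " + entry + ", length " + length(SAMPLE[entry]));
    }
}
```

Because the terminator carries the largest possible character index, the loop needs no separate bounds check: every valid index stops at or before it.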
-
BAIL_OUT_RESULT
public static final int BAIL_OUT_RESULT
Comparison return value when the regular comparison must be used. The exact value is not relevant for the format version.
- See Also:
- Constant Field Values
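A caller of the fast path is expected to check for this sentinel and fall back to the regular comparison. The sketch below shows that pattern in isolation, without ICU on the classpath. The sentinel value, the toy ASCII-only fast path and all names are assumptions made for illustration; only the idea of a distinct "bail out" return value comes from the description above.

```java
import java.util.Comparator;
import java.util.function.ToIntBiFunction;

// Hypothetical sketch: try a fast comparison first and use the regular
// comparison only when the fast path reports that it cannot decide.
final class FastPathFallbackSketch {
    // Assumed sentinel; any value other than -1, 0 and 1 would serve.
    static final int BAIL_OUT_RESULT = -2;

    static int compare(CharSequence left, CharSequence right,
                       ToIntBiFunction<CharSequence, CharSequence> fastPath,
                       Comparator<CharSequence> regular) {
        int result = fastPath.applyAsInt(left, right);
        return result != BAIL_OUT_RESULT ? result : regular.compare(left, right);
    }

    // Toy fast path: handles ASCII-only input and bails out otherwise.
    static int asciiOnly(CharSequence a, CharSequence b) {
        if (!isAscii(a) || !isAscii(b)) return BAIL_OUT_RESULT;
        return Integer.signum(a.toString().compareTo(b.toString()));
    }

    private static boolean isAscii(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7f) return false;
        }
        return true;
    }

    // Toy stand-in for the regular comparison.
    static final Comparator<CharSequence> REGULAR =
        (a, b) -> Integer.signum(a.toString().compareToIgnoreCase(b.toString()));

    public static void main(String[] args) {
        System.out.println(compare("a", "b", FastPathFallbackSketch::asciiOnly, REGULAR));
        System.out.println(compare("\u00e9", "e", FastPathFallbackSketch::asciiOnly, REGULAR));
    }
}
```

Keeping the sentinel outside the normal result range lets the fast path signal "undecided" through the same int return value, so no extra result object is needed.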
-
-
Method Detail
-
getCharIndex
static int getCharIndex(char c)
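This page gives no description for getCharIndex. Judging from the constant names and from the "char index 0..NUM_FAST_CHARS-1" wording elsewhere on the page, one plausible mapping places the Latin range first and the punctuation range directly after it. The sketch below is an illustration of that reading, not a confirmed copy of the method; all numeric values are assumptions, as are the class name and the -1 result for other characters.

```java
// Hypothetical sketch: map a char to a dense index over two ranges,
// [0, LATIN_MAX] followed by [PUNCT_START, PUNCT_LIMIT).
final class CharIndexSketch {
    // Assumed values, for illustration only.
    static final int LATIN_MAX = 0x17f;
    static final int LATIN_LIMIT = LATIN_MAX + 1;
    static final int PUNCT_START = 0x2000;
    static final int PUNCT_LIMIT = 0x2040;
    static final int NUM_FAST_CHARS = LATIN_LIMIT + (PUNCT_LIMIT - PUNCT_START);

    static int charIndex(char c) {
        if (c <= LATIN_MAX) {
            return c;                               // Latin: index equals code point
        }
        if (PUNCT_START <= c && c < PUNCT_LIMIT) {
            return c - PUNCT_START + LATIN_LIMIT;   // punctuation follows Latin
        }
        return -1;                                  // not covered by the fast path
    }

    public static void main(String[] args) {
        System.out.println(charIndex('a'));
        System.out.println(charIndex('\u2010'));
        System.out.println(charIndex('\u4e00'));
    }
}
```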
-
getOptions
public static int getOptions(CollationData data, CollationSettings settings, char[] primaries)
Computes the options value for the compare functions and writes the precomputed primary weights into primaries. Returns -1 if the Latin fastpath is not supported for the data and settings. The capacity of the primaries array must be LATIN_LIMIT.
-
compareUTF16
public static int compareUTF16(char[] table, char[] primaries, int options, java.lang.CharSequence left, java.lang.CharSequence right, int startIndex)
-
lookup
private static int lookup(char[] table, int c)
-
nextPair
private static long nextPair(char[] table, int c, int ce, java.lang.CharSequence s16, int sIndex)
The Java version returns a negative result if sIndex is to be incremented; the caller recovers the actual value by applying the '~' operator. The C++ version modifies sIndex directly.
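The return convention can be sketched as follows. Java has no reference parameters for primitives, so a method that needs to report both a value and "advance the index" can fold the flag into the sign of a long result. The helper names below are hypothetical, and the sketch assumes the value itself fits in 32 bits so that the complemented form is always negative.

```java
// Hypothetical sketch: signal "increment the index" by returning the
// bitwise complement of the value, which makes the long negative.
final class PairAndIncSketch {
    static long encode(int pair, boolean incrementIndex) {
        long value = pair & 0xffffffffL;    // zero-extend, so value >= 0
        return incrementIndex ? ~value : value;
    }

    // Returns {pair, updated index}.
    static int[] decode(long pairAndInc, int sIndex) {
        if (pairAndInc < 0) {
            ++sIndex;                       // the callee consumed one more char
            pairAndInc = ~pairAndInc;       // undo the complement
        }
        return new int[] {(int) pairAndInc, sIndex};
    }

    public static void main(String[] args) {
        int[] result = decode(encode(0x14001800, true), 5);
        System.out.println(Integer.toHexString(result[0]) + " at index " + result[1]);
    }
}
```

Using '~' rather than unary minus keeps the value 0 representable in both forms, since ~0 is -1 while -0 is still 0.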
-
getPrimaries
private static int getPrimaries(int variableTop, int pair)
-
getSecondariesFromOneShortCE
private static int getSecondariesFromOneShortCE(int ce)
-
getSecondaries
private static int getSecondaries(int variableTop, int pair)
-
getCases
private static int getCases(int variableTop, boolean strengthIsPrimary, int pair)
-
getTertiaries
private static int getTertiaries(int variableTop, boolean withCaseBits, int pair)
-
getQuaternaries
private static int getQuaternaries(int variableTop, int pair)
-
-