Package com.ibm.icu.impl.coll
Class CollationIterator
java.lang.Object
  com.ibm.icu.impl.coll.CollationIterator
- Direct Known Subclasses:
- CollationDataBuilder.DataBuilderCollationIterator, IterCollationIterator, UTF16CollationIterator
public abstract class CollationIterator extends java.lang.Object
Collation element iterator and abstract character iterator. Any code point value returned by a method must be in the range 0..10FFFF, except that a negative value may be returned as a sentinel.
-
-
Nested Class Summary
private static class CollationIterator.CEBuffer
private static class CollationIterator.SkippedState
-
Field Summary
private CollationIterator.CEBuffer ceBuffer
private int cesIndex
protected CollationData data
private boolean isNumeric
protected static long NO_CP_AND_CE32
private int numCpFwd
private CollationIterator.SkippedState skipped
protected Trie2_32 trie
-
Constructor Summary
CollationIterator(CollationData d)
Partially constructs the iterator.
CollationIterator(CollationData d, boolean numeric)
-
Method Summary
protected void appendCEsFromCE32(CollationData d, int c, int ce32, boolean forward)
private void appendNumericCEs(int ce32, boolean forward)
Turns a string of digits (bytes 0..9) into a sequence of CEs that will sort in numeric order.
private void appendNumericSegmentCEs(java.lang.CharSequence digits)
Turns 1..254 digits into a sequence of CEs.
protected abstract void backwardNumCodePoints(int num)
private void backwardNumSkipped(int n)
(package private) void clearCEs()
void clearCEsIfNoneRemaining()
boolean equals(java.lang.Object other)
int fetchCEs()
Fetches all CEs.
protected boolean forbidSurrogateCodePoints()
protected abstract void forwardNumCodePoints(int num)
long getCE(int i)
protected int getCE32FromBuilderData(int ce32)
private int getCE32FromPrefix(CollationData d, int ce32)
long[] getCEs()
int getCEsLength()
protected int getDataCE32(int c)
Returns the CE32 from the data trie.
abstract int getOffset()
protected char handleGetTrailSurrogate()
Called when handleNextCE32() returns a LEAD_SURROGATE_TAG for a lead surrogate code unit.
protected long handleNextCE32()
Returns the next code point and its local CE32 value.
int hashCode()
protected static boolean isLeadSurrogate(int c)
private static boolean isSurrogate(int c)
protected static boolean isTrailSurrogate(int c)
protected long makeCodePointAndCE32Pair(int c, int ce32)
long nextCE()
Returns the next collation element.
private int nextCE32FromContraction(CollationData d, int contractionCE32, java.lang.CharSequence trieChars, int trieOffset, int ce32, int c)
private int nextCE32FromDiscontiguousContraction(CollationData d, CharsTrie suffixes, int ce32, int lookAhead, int c)
private long nextCEFromCE32(CollationData d, int c, int ce32)
abstract int nextCodePoint()
Returns the next code point (with post-increment).
private int nextSkippedCodePoint()
long previousCE(UVector32 offsets)
Returns the previous collation element.
private long previousCEUnsafe(int c, UVector32 offsets)
Returns the previous CE when data.isUnsafeBackward(c, isNumeric).
abstract int previousCodePoint()
Returns the previous code point (with pre-decrement).
protected void reset()
protected void reset(boolean numeric)
Resets the state as well as the numeric setting, and completes the initialization.
abstract void resetToOffset(int newOffset)
Resets the iterator state and sets the position to the specified offset.
(package private) void setCurrentCE(long ce)
Overwrites the current CE (the last one returned by nextCE()).
-
-
-
Field Detail
-
NO_CP_AND_CE32
protected static final long NO_CP_AND_CE32
- See Also:
- Constant Field Values
-
trie
protected final Trie2_32 trie
-
data
protected final CollationData data
-
ceBuffer
private CollationIterator.CEBuffer ceBuffer
-
cesIndex
private int cesIndex
-
skipped
private CollationIterator.SkippedState skipped
-
numCpFwd
private int numCpFwd
-
isNumeric
private boolean isNumeric
-
-
Constructor Detail
-
CollationIterator
public CollationIterator(CollationData d)
Partially constructs the iterator. In Java, we cache partially constructed iterators and finish their setup when starting to work on text (via reset(boolean) and the setText(numeric, ...) methods of subclasses). This avoids memory allocations for iterators that remain unused.
In C++, there is only one constructor, and iterators are stack-allocated as needed.
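The two-phase setup described here can be illustrated with a minimal sketch. Everything in it is hypothetical: the class name, the String standing in for the collation data, and the buffer capacity are assumptions made for illustration and are not taken from ICU.

```java
// Hypothetical illustration of two-phase setup; none of these names are ICU APIs.
final class CachedIteratorSketch {
    private final String data;   // stands in for the shared collation data
    private long[] buffer;       // working storage, not allocated by the constructor
    private boolean numeric;

    /** Partial construction: remembers the data but allocates nothing else. */
    CachedIteratorSketch(String data) {
        this.data = data;
    }

    /** Completes the setup on first use and clears the state on every later use. */
    void reset(boolean numeric) {
        if (buffer == null) {
            buffer = new long[40];   // arbitrary capacity chosen for the sketch
        } else {
            java.util.Arrays.fill(buffer, 0L);
        }
        this.numeric = numeric;
    }

    boolean isFullyInitialized() { return buffer != null; }

    boolean isNumeric() { return numeric; }

    String data() { return data; }
}
```

An instance that is cached but never used pays only for the object itself; the working storage is allocated on the first reset(boolean) call.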
-
CollationIterator
public CollationIterator(CollationData d, boolean numeric)
-
-
Method Detail
-
equals
public boolean equals(java.lang.Object other)
- Overrides:
- equals in class java.lang.Object
-
hashCode
public int hashCode()
- Overrides:
- hashCode in class java.lang.Object
-
resetToOffset
public abstract void resetToOffset(int newOffset)
Resets the iterator state and sets the position to the specified offset. Subclasses must implement this method, and the implementation must call either the parent class method or CollationIterator.reset().
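The subclass contract can be sketched with a toy class hierarchy. The classes below are hypothetical stand-ins, not the ICU classes: the base class keeps a counter in place of the real buffered state, and the subclass shows the expected order of operations (reset the shared state, then move the position).

```java
// Hypothetical stand-in for the base class; the counter replaces the real buffered state.
abstract class SketchIterator {
    private int bufferedCount;

    /** Clears the shared state; subclasses call this from resetToOffset(). */
    protected final void reset() { bufferedCount = 0; }

    final void buffer() { ++bufferedCount; }

    final int bufferedCount() { return bufferedCount; }

    abstract void resetToOffset(int newOffset);

    abstract int getOffset();
}

// Hypothetical subclass that iterates over a String.
final class StringSketchIterator extends SketchIterator {
    private final String text;
    private int pos;

    StringSketchIterator(String text) { this.text = text; }

    @Override
    void resetToOffset(int newOffset) {
        if (newOffset < 0 || newOffset > text.length()) {
            throw new IndexOutOfBoundsException("offset out of range: " + newOffset);
        }
        reset();           // clear the shared state first
        pos = newOffset;   // then move to the requested position
    }

    @Override
    int getOffset() { return pos; }
}
```

Skipping the reset() call would leave stale buffered state behind after the position changes, which is the situation the contract is meant to rule out.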
-
getOffset
public abstract int getOffset()
-
nextCE
public final long nextCE()
Returns the next collation element.
-
fetchCEs
public final int fetchCEs()
Fetches all CEs.
- Returns:
- getCEsLength()
-
setCurrentCE
final void setCurrentCE(long ce)
Overwrites the current CE (the last one returned by nextCE()).
-
previousCE
public final long previousCE(UVector32 offsets)
Returns the previous collation element.
-
getCEsLength
public final int getCEsLength()
-
getCE
public final long getCE(int i)
-
getCEs
public final long[] getCEs()
-
clearCEs
final void clearCEs()
-
clearCEsIfNoneRemaining
public final void clearCEsIfNoneRemaining()
-
nextCodePoint
public abstract int nextCodePoint()
Returns the next code point (with post-increment). Public for identical-level comparison and for testing.
-
previousCodePoint
public abstract int previousCodePoint()
Returns the previous code point (with pre-decrement). Public for identical-level comparison and for testing.
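As an illustration of the post-increment and pre-decrement semantics, here is a minimal sketch over a java.lang.String. The class name is hypothetical, and returning -1 at either end of the text is an assumption; the class description only says that a negative value serves as a sentinel.

```java
// Hypothetical cursor over a String; a sketch of the semantics, not an ICU subclass.
final class CodePointCursor {
    private final String text;
    private int pos;

    CodePointCursor(String text) { this.text = text; }

    int getOffset() { return pos; }

    /** Reads the code point at the current position, then advances past it. */
    int nextCodePoint() {
        if (pos == text.length()) {
            return -1;   // assumed sentinel for the end of the text
        }
        int c = text.codePointAt(pos);
        pos += Character.charCount(c);
        return c;
    }

    /** Moves back by one code point, then returns it. */
    int previousCodePoint() {
        if (pos == 0) {
            return -1;   // assumed sentinel for the start of the text
        }
        int c = text.codePointBefore(pos);
        pos -= Character.charCount(c);
        return c;
    }
}
```

Calling nextCodePoint() and then previousCodePoint() returns the same code point twice and leaves the offset where it started.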
-
reset
protected final void reset()
-
reset
protected final void reset(boolean numeric)
Resets the state as well as the numeric setting, and completes the initialization. This method exists only in Java, where we reset cached CollationIterator instances rather than stack-allocating temporary ones. (See also the constructor comments.)
-
handleNextCE32
protected long handleNextCE32()
Returns the next code point and its local CE32 value. Returns Collation.FALLBACK_CE32 at the end of the text (c<0) or when c's CE32 value is to be looked up in the base data (fallback). The code point is used for fallbacks, context and implicit weights. It is ignored when the returned CE32 is not special (e.g., FFFD_CE32). Returns the code point in bits 63..32 (signed) and the CE32 in bits 31..0.
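The bit layout described above (code point in bits 63..32, signed; CE32 in bits 31..0) can be sketched as follows. The helper class and its method names are hypothetical and are not claimed to match the ICU implementation.

```java
// Hypothetical helper showing the documented bit layout; not part of ICU.
final class CodePointCe32Pair {
    private CodePointCe32Pair() {}

    /** Packs c into bits 63..32 and ce32 into bits 31..0. */
    static long pack(int c, int ce32) {
        // Mask ce32 so that a negative ce32 does not sign-extend into the upper half.
        return ((long) c << 32) | (ce32 & 0xffffffffL);
    }

    /** The arithmetic shift keeps the sign, so a negative sentinel survives. */
    static int codePoint(long pair) {
        return (int) (pair >> 32);
    }

    /** The narrowing cast keeps only bits 31..0. */
    static int ce32(long pair) {
        return (int) pair;
    }
}
```

Because the code point occupies the upper half as a signed value, an end-of-text sentinel such as -1 round-trips unchanged.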
-
makeCodePointAndCE32Pair
protected long makeCodePointAndCE32Pair(int c, int ce32)
-
handleGetTrailSurrogate
protected char handleGetTrailSurrogate()
Called when handleNextCE32() returns a LEAD_SURROGATE_TAG for a lead surrogate code unit. Returns the trail surrogate in that case and advances past it, if a trail surrogate follows the lead surrogate. Otherwise returns any other code unit and does not advance.
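A minimal sketch of this contract for UTF-16 text follows. The class is hypothetical, and returning 0 at the end of the text is an assumption: the description only requires some code unit that is not a trail surrogate.

```java
// Hypothetical UTF-16 reader; a sketch of the documented contract, not ICU's code.
final class TrailSurrogateSketch {
    private final CharSequence text;
    private int pos;   // index of the code unit that follows the lead surrogate

    TrailSurrogateSketch(CharSequence text, int pos) {
        this.text = text;
        this.pos = pos;
    }

    int position() { return pos; }

    char handleGetTrailSurrogate() {
        if (pos == text.length()) {
            return 0;   // assumption: at the end of the text, return a non-surrogate unit
        }
        char unit = text.charAt(pos);
        if (Character.isLowSurrogate(unit)) {
            ++pos;      // consume the trail surrogate
        }
        return unit;    // any other code unit is returned without advancing
    }
}
```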
-
forbidSurrogateCodePoints
protected boolean forbidSurrogateCodePoints()
- Returns:
- false if surrogate code points U+D800..U+DFFF map to their own implicit primary weights (for UTF-16), or true if they map to CE(U+FFFD) (for UTF-8)
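For reference, the surrogate range U+D800..U+DFFF splits into lead surrogates (U+D800..U+DBFF) and trail surrogates (U+DC00..U+DFFF). The checks below are one common way to test these ranges with bit masks; the class is a hypothetical sketch, and the ICU helper methods of the same names may be implemented differently.

```java
// Hypothetical range checks on int code points; not claimed to match ICU's helpers.
final class SurrogateChecks {
    private SurrogateChecks() {}

    /** True for U+D800..U+DFFF: the upper 21 bits must equal those of 0xD800. */
    static boolean isSurrogate(int c) {
        return (c & 0xfffff800) == 0xd800;
    }

    /** True for U+D800..U+DBFF. */
    static boolean isLeadSurrogate(int c) {
        return (c & 0xfffffc00) == 0xd800;
    }

    /** True for U+DC00..U+DFFF. */
    static boolean isTrailSurrogate(int c) {
        return (c & 0xfffffc00) == 0xdc00;
    }
}
```

Masking all upper bits means negative sentinel values and supplementary code points are rejected without a separate range test.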
-
forwardNumCodePoints
protected abstract void forwardNumCodePoints(int num)
-
backwardNumCodePoints
protected abstract void backwardNumCodePoints(int num)
-
getDataCE32
protected int getDataCE32(int c)
Returns the CE32 from the data trie. Normally the same as data.getCE32(), but overridden in the builder. Call this only when the faster data.getCE32() cannot be used.
-
getCE32FromBuilderData
protected int getCE32FromBuilderData(int ce32)
-
appendCEsFromCE32
protected final void appendCEsFromCE32(CollationData d, int c, int ce32, boolean forward)
-
isSurrogate
private static final boolean isSurrogate(int c)
-
isLeadSurrogate
protected static final boolean isLeadSurrogate(int c)
-
isTrailSurrogate
protected static final boolean isTrailSurrogate(int c)
-
nextCEFromCE32
private final long nextCEFromCE32(CollationData d, int c, int ce32)
-
getCE32FromPrefix
private final int getCE32FromPrefix(CollationData d, int ce32)
-
nextSkippedCodePoint
private final int nextSkippedCodePoint()
-
backwardNumSkipped
private final void backwardNumSkipped(int n)
-
nextCE32FromContraction
private final int nextCE32FromContraction(CollationData d, int contractionCE32, java.lang.CharSequence trieChars, int trieOffset, int ce32, int c)
-
nextCE32FromDiscontiguousContraction
private final int nextCE32FromDiscontiguousContraction(CollationData d, CharsTrie suffixes, int ce32, int lookAhead, int c)
-
previousCEUnsafe
private final long previousCEUnsafe(int c, UVector32 offsets)
Returns the previous CE when data.isUnsafeBackward(c, isNumeric).
-
appendNumericCEs
private final void appendNumericCEs(int ce32, boolean forward)
Turns a string of digits (bytes 0..9) into a sequence of CEs that will sort in numeric order. Starts from this ce32's digit value and consumes the following/preceding digits. The digits string must not be empty and must not have leading zeros.
-
appendNumericSegmentCEs
private final void appendNumericSegmentCEs(java.lang.CharSequence digits)
Turns 1..254 digits into a sequence of CEs. Called by appendNumericCEs() for each segment of at most 254 digits.
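The segmentation step can be sketched as a simple split into consecutive chunks of at most 254 digits. This is an assumption about the general shape only: the class and method names are hypothetical, the digits are treated as opaque elements, and the real appendNumericCEs() may handle details that this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical splitter; shows only the "segments of at most 254 digits" idea.
final class DigitSegments {
    static final int MAX_SEGMENT_LENGTH = 254;

    private DigitSegments() {}

    /** Splits a non-empty digit sequence into consecutive segments of 1..254 digits. */
    static List<CharSequence> split(CharSequence digits) {
        if (digits.length() == 0) {
            throw new IllegalArgumentException("digits must not be empty");
        }
        List<CharSequence> segments = new ArrayList<>();
        for (int pos = 0; pos < digits.length(); pos += MAX_SEGMENT_LENGTH) {
            int end = Math.min(pos + MAX_SEGMENT_LENGTH, digits.length());
            segments.add(digits.subSequence(pos, end));
        }
        return segments;
    }
}
```

Each segment returned by split() would then be passed to a per-segment routine in the role of appendNumericSegmentCEs().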
-
-