Package com.ibm.icu.impl.number
Class AffixUtils
- java.lang.Object
-
- com.ibm.icu.impl.number.AffixUtils
-
public class AffixUtils extends java.lang.Object
Performs manipulations on affix patterns: the prefix and suffix strings associated with a decimal format pattern. For example:

Affix Pattern    Example Unescaped (Formatted) String
abc              abc
ab-              ab−
ab'-'            ab-
ab''             ab'

To iterate manually over the tokens in an affix pattern string, use the following pattern:

    long tag = 0L;
    while (AffixUtils.hasNext(tag, patternString)) {
        tag = AffixUtils.nextToken(tag, patternString);
        int typeOrCp = AffixUtils.getTypeOrCp(tag);
        switch (typeOrCp) {
            case AffixUtils.TYPE_MINUS_SIGN:
                // Current token is a minus sign.
                break;
            case AffixUtils.TYPE_PLUS_SIGN:
                // Current token is a plus sign.
                break;
            case AffixUtils.TYPE_PERCENT:
                // Current token is a percent sign.
                break;
            // ... other types ...
            default:
                // Current token is an arbitrary code point.
                // The variable typeOrCp is the code point.
                break;
        }
    }
-
-
Nested Class Summary
Nested Classes
Modifier and Type    Class
static interface     AffixUtils.SymbolProvider
static interface     AffixUtils.TokenConsumer
-
Field Summary
Fields
Modifier and Type     Field                      Description
private static int    STATE_AFTER_QUOTE
private static int    STATE_BASE
private static int    STATE_FIFTH_CURR
private static int    STATE_FIRST_CURR
private static int    STATE_FIRST_QUOTE
private static int    STATE_FOURTH_CURR
private static int    STATE_INSIDE_QUOTE
private static int    STATE_OVERFLOW_CURR
private static int    STATE_SECOND_CURR
private static int    STATE_THIRD_CURR
static int            TYPE_APPROXIMATELY_SIGN
private static int    TYPE_CODEPOINT             Represents a literal character; the value is stored in the code point field.
static int            TYPE_CURRENCY_DOUBLE       Represents a double currency symbol '¤¤'.
static int            TYPE_CURRENCY_OVERFLOW     Represents a sequence of six or more currency symbols.
static int            TYPE_CURRENCY_QUAD         Represents a quadruple currency symbol '¤¤¤¤'.
static int            TYPE_CURRENCY_QUINT        Represents a quintuple currency symbol '¤¤¤¤¤'.
static int            TYPE_CURRENCY_SINGLE       Represents a single currency symbol '¤'.
static int            TYPE_CURRENCY_TRIPLE       Represents a triple currency symbol '¤¤¤'.
static int            TYPE_MINUS_SIGN            Represents a minus sign symbol '-'.
static int            TYPE_PERCENT               Represents a percent sign symbol '%'.
static int            TYPE_PERMILLE              Represents a permille sign symbol '‰'.
static int            TYPE_PLUS_SIGN             Represents a plus sign symbol '+'.
-
Constructor Summary
Constructors
Constructor     Description
AffixUtils()
-
Method Summary
All Methods    Static Methods    Concrete Methods
Modifier and Type, Method, Description

static boolean containsOnlySymbolsAndIgnorables(java.lang.CharSequence affixPattern, UnicodeSet ignorables)
    Returns whether the given affix pattern contains only symbols and ignorables as defined by the given ignorables set.

static boolean containsType(java.lang.CharSequence affixPattern, int type)
    Checks whether the given affix pattern contains at least one token of the given type, which is one of the constants "TYPE_" in AffixUtils.

static java.lang.String escape(java.lang.CharSequence input)
    Version of escape(java.lang.CharSequence, java.lang.StringBuilder) that returns a String, or null if input is null.

static int escape(java.lang.CharSequence input, java.lang.StringBuilder output)
    Takes a string and escapes (quotes) characters that have special meaning in the affix pattern syntax.

static int estimateLength(java.lang.CharSequence patternString)
    Estimates the number of code points present in an unescaped version of the affix pattern string (one that would be returned by unescape(java.lang.CharSequence, com.ibm.icu.impl.FormattedStringBuilder, int, com.ibm.icu.impl.number.AffixUtils.SymbolProvider, com.ibm.icu.text.NumberFormat.Field)), assuming that all interpolated symbols consume one code point and that currencies consume as many code points as their symbol width.

private static int getCodePoint(long tag)

static NumberFormat.Field getFieldForType(int type)

private static int getOffset(long tag)

private static int getState(long tag)

private static int getType(long tag)

private static int getTypeOrCp(long tag)
    This function helps determine the identity of the token consumed by nextToken(long, java.lang.CharSequence).

static boolean hasCurrencySymbols(java.lang.CharSequence affixPattern)
    Checks whether the specified affix pattern has any unquoted currency symbols ("¤").

private static boolean hasNext(long tag, java.lang.CharSequence string)
    Returns whether the affix pattern string has any more tokens to be retrieved from a call to nextToken(long, java.lang.CharSequence).

static void iterateWithConsumer(java.lang.CharSequence affixPattern, AffixUtils.TokenConsumer consumer)
    Iterates over the affix pattern, calling the TokenConsumer for each token.

private static long makeTag(int offset, int type, int state, int cp)
    Encodes the given values into a 64-bit tag.

private static long nextToken(long tag, java.lang.CharSequence patternString)
    Returns the next token from the affix pattern.

static java.lang.String replaceType(java.lang.CharSequence affixPattern, int type, char replacementChar)
    Replaces all occurrences of tokens with the given type with the given replacement char.

static int unescape(java.lang.CharSequence affixPattern, FormattedStringBuilder output, int position, AffixUtils.SymbolProvider provider, NumberFormat.Field field)
    Executes the unescape state machine.

static int unescapedCount(java.lang.CharSequence affixPattern, boolean lengthOrCount, AffixUtils.SymbolProvider provider)
    Same as unescape(java.lang.CharSequence, com.ibm.icu.impl.FormattedStringBuilder, int, com.ibm.icu.impl.number.AffixUtils.SymbolProvider, com.ibm.icu.text.NumberFormat.Field), but only calculates the length or code point count.
-
-
-
Field Detail
-
STATE_BASE
private static final int STATE_BASE
- See Also:
- Constant Field Values
-
STATE_FIRST_QUOTE
private static final int STATE_FIRST_QUOTE
- See Also:
- Constant Field Values
-
STATE_INSIDE_QUOTE
private static final int STATE_INSIDE_QUOTE
- See Also:
- Constant Field Values
-
STATE_AFTER_QUOTE
private static final int STATE_AFTER_QUOTE
- See Also:
- Constant Field Values
-
STATE_FIRST_CURR
private static final int STATE_FIRST_CURR
- See Also:
- Constant Field Values
-
STATE_SECOND_CURR
private static final int STATE_SECOND_CURR
- See Also:
- Constant Field Values
-
STATE_THIRD_CURR
private static final int STATE_THIRD_CURR
- See Also:
- Constant Field Values
-
STATE_FOURTH_CURR
private static final int STATE_FOURTH_CURR
- See Also:
- Constant Field Values
-
STATE_FIFTH_CURR
private static final int STATE_FIFTH_CURR
- See Also:
- Constant Field Values
-
STATE_OVERFLOW_CURR
private static final int STATE_OVERFLOW_CURR
- See Also:
- Constant Field Values
-
TYPE_CODEPOINT
private static final int TYPE_CODEPOINT
Represents a literal character; the value is stored in the code point field.
- See Also:
- Constant Field Values
-
TYPE_MINUS_SIGN
public static final int TYPE_MINUS_SIGN
Represents a minus sign symbol '-'.
- See Also:
- Constant Field Values
-
TYPE_PLUS_SIGN
public static final int TYPE_PLUS_SIGN
Represents a plus sign symbol '+'.
- See Also:
- Constant Field Values
-
TYPE_APPROXIMATELY_SIGN
public static final int TYPE_APPROXIMATELY_SIGN
- See Also:
- Constant Field Values
-
TYPE_PERCENT
public static final int TYPE_PERCENT
Represents a percent sign symbol '%'.
- See Also:
- Constant Field Values
-
TYPE_PERMILLE
public static final int TYPE_PERMILLE
Represents a permille sign symbol '‰'.
- See Also:
- Constant Field Values
-
TYPE_CURRENCY_SINGLE
public static final int TYPE_CURRENCY_SINGLE
Represents a single currency symbol '¤'.
- See Also:
- Constant Field Values
-
TYPE_CURRENCY_DOUBLE
public static final int TYPE_CURRENCY_DOUBLE
Represents a double currency symbol '¤¤'.
- See Also:
- Constant Field Values
-
TYPE_CURRENCY_TRIPLE
public static final int TYPE_CURRENCY_TRIPLE
Represents a triple currency symbol '¤¤¤'.
- See Also:
- Constant Field Values
-
TYPE_CURRENCY_QUAD
public static final int TYPE_CURRENCY_QUAD
Represents a quadruple currency symbol '¤¤¤¤'.
- See Also:
- Constant Field Values
-
TYPE_CURRENCY_QUINT
public static final int TYPE_CURRENCY_QUINT
Represents a quintuple currency symbol '¤¤¤¤¤'.
- See Also:
- Constant Field Values
-
TYPE_CURRENCY_OVERFLOW
public static final int TYPE_CURRENCY_OVERFLOW
Represents a sequence of six or more currency symbols.
- See Also:
- Constant Field Values
-
-
Method Detail
-
estimateLength
public static int estimateLength(java.lang.CharSequence patternString)
Estimates the number of code points present in an unescaped version of the affix pattern string (one that would be returned by unescape(java.lang.CharSequence, com.ibm.icu.impl.FormattedStringBuilder, int, com.ibm.icu.impl.number.AffixUtils.SymbolProvider, com.ibm.icu.text.NumberFormat.Field)), assuming that all interpolated symbols consume one code point and that currencies consume as many code points as their symbol width. Used for computing padding width.
- Parameters:
patternString - The original string whose width will be estimated.
- Returns:
- The estimated length of the unescaped string, in code points.
-
escape
public static int escape(java.lang.CharSequence input, java.lang.StringBuilder output)
Takes a string and escapes (quotes) characters that have special meaning in the affix pattern syntax. This function does not reverse-lookup symbols.
Example input: "-$x"; example output: "'-'$x"
- Parameters:
input - The string to be escaped.
output - The string builder to which to append the escaped string.
- Returns:
- The number of chars (UTF-16 code units) appended to the output.
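The quoting behavior described above can be illustrated with a small standalone sketch. This is not the ICU implementation: the class name AffixEscapeSketch, the helper isSpecial, and the assumption that the special characters are exactly "-", "+", "%", "‰", and "¤" are all illustrative choices. Consecutive special characters are assumed to share one pair of quotes, and an apostrophe is assumed to be written as two apostrophes.

```java
final class AffixEscapeSketch {
    /** Characters assumed (for this sketch) to be special in affix patterns. */
    private static boolean isSpecial(char c) {
        // \u2030 is the permille sign, \u00A4 is the currency sign.
        return c == '-' || c == '+' || c == '%' || c == '\u2030' || c == '\u00A4';
    }

    /** Appends the escaped form of input to output; returns the number of chars appended. */
    static int escape(CharSequence input, StringBuilder output) {
        if (input == null) {
            return 0;
        }
        int start = output.length();
        boolean inQuote = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\'') {
                // A literal apostrophe is doubled, inside or outside quotes.
                output.append("''");
            } else if (isSpecial(c)) {
                // Open a quoted run before the first special character.
                if (!inQuote) {
                    output.append('\'');
                    inQuote = true;
                }
                output.append(c);
            } else {
                // Close the quoted run before an ordinary character.
                if (inQuote) {
                    output.append('\'');
                    inQuote = false;
                }
                output.append(c);
            }
        }
        if (inQuote) {
            output.append('\'');
        }
        return output.length() - start;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        int appended = escape("-$x", sb);
        System.out.println(sb + " (" + appended + " chars appended)");
    }
}
```

With the documented example input "-$x", this sketch produces the documented output "'-'$x".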
-
escape
public static java.lang.String escape(java.lang.CharSequence input)
Version of escape(java.lang.CharSequence, java.lang.StringBuilder) that returns a String, or null if input is null.
-
getFieldForType
public static final NumberFormat.Field getFieldForType(int type)
-
unescape
public static int unescape(java.lang.CharSequence affixPattern, FormattedStringBuilder output, int position, AffixUtils.SymbolProvider provider, NumberFormat.Field field)
Executes the unescape state machine. Replaces the unquoted characters "-", "+", "%", "‰", and "¤" with the corresponding symbols provided by the AffixUtils.SymbolProvider, and inserts the result into the FormattedStringBuilder at the requested location.
Example input: "'-'¤x"; example output: "-$x"
- Parameters:
affixPattern - The original string to be unescaped.
output - The FormattedStringBuilder to mutate with the result.
position - The index into the FormattedStringBuilder at which to insert the string.
provider - An object to generate locale symbols.
- Returns:
- The length of the string added to output.
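As a rough illustration of the unescape rules, the following standalone sketch works on plain strings instead of a FormattedStringBuilder. The class AffixUnescapeSketch and the interface SymbolLookup are hypothetical stand-ins (SymbolLookup is not AffixUtils.SymbolProvider), and the sketch treats each "¤" separately, ignoring the double through quintuple currency widths that the real state machine distinguishes.

```java
final class AffixUnescapeSketch {
    /** Hypothetical stand-in for a symbol provider, keyed by the pattern character. */
    interface SymbolLookup {
        String symbolFor(char patternChar);
    }

    /** Pattern characters that are replaced when they appear outside quotes. */
    private static boolean isSymbol(char c) {
        // \u2030 is the permille sign, \u00A4 is the currency sign.
        return c == '-' || c == '+' || c == '%' || c == '\u2030' || c == '\u00A4';
    }

    /** Returns the unescaped (formatted) form of the affix pattern. */
    static String unescape(CharSequence affixPattern, SymbolLookup provider) {
        StringBuilder out = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < affixPattern.length(); i++) {
            char c = affixPattern.charAt(i);
            if (c == '\'') {
                boolean doubled = i + 1 < affixPattern.length()
                        && affixPattern.charAt(i + 1) == '\'';
                if (doubled) {
                    // Two apostrophes in a row stand for one literal apostrophe.
                    out.append('\'');
                    i++;
                } else {
                    // A single apostrophe opens or closes a quoted run.
                    inQuote = !inQuote;
                }
            } else if (!inQuote && isSymbol(c)) {
                out.append(provider.symbolFor(c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative symbols: a dollar sign for the currency, U+2212 for minus.
        SymbolLookup symbols = c -> {
            switch (c) {
                case '\u00A4': return "$";
                case '-': return "\u2212";
                default: return String.valueOf(c);
            }
        };
        System.out.println(unescape("'-'\u00A4x", symbols));
    }
}
```

With a lookup that maps "¤" to "$", the documented example input "'-'¤x" yields "-$x": the quoted hyphen is kept as a literal, while the unquoted currency sign is replaced.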
-
unescapedCount
public static int unescapedCount(java.lang.CharSequence affixPattern, boolean lengthOrCount, AffixUtils.SymbolProvider provider)
Same as unescape(java.lang.CharSequence, com.ibm.icu.impl.FormattedStringBuilder, int, com.ibm.icu.impl.number.AffixUtils.SymbolProvider, com.ibm.icu.text.NumberFormat.Field), but only calculates the length or code point count. More efficient than unescape(java.lang.CharSequence, com.ibm.icu.impl.FormattedStringBuilder, int, com.ibm.icu.impl.number.AffixUtils.SymbolProvider, com.ibm.icu.text.NumberFormat.Field) if you only need the length but not the string itself.
- Parameters:
affixPattern - The original string to be unescaped.
lengthOrCount - true to count length (UTF-16 code units); false to count code points.
provider - An object to generate locale symbols.
- Returns:
- The length of the unescaped string in UTF-16 code units if lengthOrCount is true, or its number of code points if lengthOrCount is false.
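The two measurements selected by lengthOrCount differ whenever the text contains supplementary characters, which occupy two UTF-16 code units but count as one code point. The sketch below shows the distinction using only standard JDK calls; the class LengthVsCountSketch and its measure method are illustrative names, not part of AffixUtils.

```java
final class LengthVsCountSketch {
    /**
     * Measures text the way the lengthOrCount flag is documented:
     * true counts UTF-16 code units, false counts code points.
     */
    static int measure(CharSequence text, boolean lengthOrCount) {
        return lengthOrCount
                ? text.length()
                : Character.codePointCount(text, 0, text.length());
    }

    public static void main(String[] args) {
        // "x" followed by U+1F600, which is encoded as a surrogate pair.
        String text = "x\uD83D\uDE00";
        System.out.println("code units:  " + measure(text, true));
        System.out.println("code points: " + measure(text, false));
    }
}
```

For text made up only of Basic Multilingual Plane characters, both measurements agree.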
-
containsType
public static boolean containsType(java.lang.CharSequence affixPattern, int type)
Checks whether the given affix pattern contains at least one token of the given type, which is one of the constants "TYPE_" in AffixUtils.
- Parameters:
affixPattern - The affix pattern to check.
type - The token type.
- Returns:
- true if the affix pattern contains the given token type; false otherwise.
-
hasCurrencySymbols
public static boolean hasCurrencySymbols(java.lang.CharSequence affixPattern)
Checks whether the specified affix pattern has any unquoted currency symbols ("¤").
- Parameters:
affixPattern - The string to check for currency symbols.
- Returns:
- true if the affix pattern has at least one unquoted currency symbol; false otherwise.
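The difference between a quoted and an unquoted currency symbol can be sketched as a single pass that tracks whether the scan is currently inside quotes. The class CurrencyScanSketch is a hypothetical illustration and does not use the token iterator that AffixUtils itself is built around.

```java
final class CurrencyScanSketch {
    /** Returns true if the pattern contains a currency sign (U+00A4) outside quotes. */
    static boolean hasUnquotedCurrency(CharSequence affixPattern) {
        if (affixPattern == null) {
            return false;
        }
        boolean inQuote = false;
        for (int i = 0; i < affixPattern.length(); i++) {
            char c = affixPattern.charAt(i);
            if (c == '\'') {
                // A doubled apostrophe toggles twice, leaving the state unchanged.
                inQuote = !inQuote;
            } else if (c == '\u00A4' && !inQuote) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasUnquotedCurrency("\u00A4x"));    // unquoted currency sign
        System.out.println(hasUnquotedCurrency("'\u00A4'x"));  // quoted currency sign
    }
}
```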
-
replaceType
public static java.lang.String replaceType(java.lang.CharSequence affixPattern, int type, char replacementChar)
Replaces all occurrences of tokens with the given type with the given replacement char.
- Parameters:
affixPattern - The source affix pattern (does not get modified).
type - The token type.
replacementChar - The char to substitute in place of chars of the given token type.
- Returns:
- A string containing the new affix pattern.
-
containsOnlySymbolsAndIgnorables
public static boolean containsOnlySymbolsAndIgnorables(java.lang.CharSequence affixPattern, UnicodeSet ignorables)
Returns whether the given affix pattern contains only symbols and ignorables as defined by the given ignorables set.
-
iterateWithConsumer
public static void iterateWithConsumer(java.lang.CharSequence affixPattern, AffixUtils.TokenConsumer consumer)
Iterates over the affix pattern, calling the TokenConsumer for each token.
-
nextToken
private static long nextToken(long tag, java.lang.CharSequence patternString)
Returns the next token from the affix pattern.
- Parameters:
tag - A bitmask used for keeping track of state from token to token. The initial value should be 0L.
patternString - The affix pattern.
- Returns:
- The bitmask tag to pass to the next call of this method to retrieve the following token (never negative), or -1 if there were no more tokens in the affix pattern.
- See Also:
hasNext(long, java.lang.CharSequence)
-
hasNext
private static boolean hasNext(long tag, java.lang.CharSequence string)
Returns whether the affix pattern string has any more tokens to be retrieved from a call to nextToken(long, java.lang.CharSequence).
- Parameters:
tag - The bitmask tag of the previous token, as returned by nextToken(long, java.lang.CharSequence).
string - The affix pattern.
- Returns:
- true if there are more tokens to consume; false otherwise.
-
getTypeOrCp
private static int getTypeOrCp(long tag)
This function helps determine the identity of the token consumed by nextToken(long, java.lang.CharSequence). Converts from a bitmask tag, based on a call to nextToken(long, java.lang.CharSequence), to its corresponding symbol type or code point.
- Parameters:
tag - The bitmask tag of the current token, as returned by nextToken(long, java.lang.CharSequence).
- Returns:
- If less than zero, a symbol type corresponding to one of the TYPE_ constants, such as TYPE_MINUS_SIGN. If greater than or equal to zero, a literal code point.
-
makeTag
private static long makeTag(int offset, int type, int state, int cp)
Encodes the given values into a 64-bit tag.
- Bits 0-31 => offset (int32)
- Bits 32-35 => type (uint4)
- Bits 36-39 => state (uint4)
- Bits 40-60 => code point (uint21)
- Bits 61-63 => unused
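The bit layout above can be sketched with shifts and masks as follows. This is an illustration of the documented layout only, not the ICU source: the class name TagPackingSketch is hypothetical, and all four fields are assumed to be passed as non-negative values that fit their bit widths. Since getTypeOrCp reports symbol types as negative numbers, the real class may convert the type before storing it; that detail is not modeled here.

```java
final class TagPackingSketch {
    /** Packs the four fields into one long, following the documented bit layout. */
    static long makeTag(int offset, int type, int state, int cp) {
        long tag = 0L;
        tag |= offset & 0xFFFFFFFFL;              // bits 0-31: offset
        tag |= ((long) type & 0xF) << 32;         // bits 32-35: type
        tag |= ((long) state & 0xF) << 36;        // bits 36-39: state
        tag |= ((long) cp & 0x1FFFFF) << 40;      // bits 40-60: code point
        return tag;                               // bits 61-63 stay zero
    }

    static int getOffset(long tag) {
        return (int) (tag & 0xFFFFFFFFL);
    }

    static int getType(long tag) {
        return (int) ((tag >>> 32) & 0xF);
    }

    static int getState(long tag) {
        return (int) ((tag >>> 36) & 0xF);
    }

    static int getCodePoint(long tag) {
        return (int) ((tag >>> 40) & 0x1FFFFF);
    }

    public static void main(String[] args) {
        long tag = makeTag(7, 3, 2, 0x1F600);
        System.out.println(getOffset(tag) + " " + getType(tag) + " "
                + getState(tag) + " " + Integer.toHexString(getCodePoint(tag)));
    }
}
```

Because the top three bits are never set, a tag built this way is never negative, which is consistent with nextToken reserving -1 to signal that no tokens remain.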
-
getOffset
private static int getOffset(long tag)
-
getType
private static int getType(long tag)
-
getState
private static int getState(long tag)
-
getCodePoint
private static int getCodePoint(long tag)
-
-