Class AlphabeticIndex<V>
- java.lang.Object
  - com.ibm.icu.text.AlphabeticIndex<V>
- All Implemented Interfaces: java.lang.Iterable<AlphabeticIndex.Bucket<V>>
public final class AlphabeticIndex<V> extends java.lang.Object implements java.lang.Iterable<AlphabeticIndex.Bucket<V>>
AlphabeticIndex supports the creation of a UI index appropriate for a given language. It can support either direct use, or use with a client that doesn't support localized collation. The following is an example of what an index might look like in a UI:
    ... A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ...
    A
      Addison
      Albertson
      Azensky
    B
      Baecker
    ...
The class can generate a list of labels for use as a UI "index", that is, a list of clickable characters (or character sequences) that allow the user to see a segment (bucket) of a larger "target" list. Each label corresponds to a bucket in the target list, where everything in the bucket is greater than or equal to the character (according to the locale's collation). Strings can be added to the index; they will be in sorted order in the right bucket.
The class also supports having buckets for strings before the first label (underflow), after the last label (overflow), and between scripts (inflow). For example, if the index is constructed with labels for Russian and English, Greek characters would fall into an inflow bucket between the other two scripts.
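The bucket rule described above (each name goes to the last label that is less than or equal to it) can be sketched with the JDK alone. This is a minimal illustration, not the ICU implementation: it uses java.text.Collator as a stand-in for the ICU collator, the class name, method names, and underflow label are hypothetical, and overflow and inflow buckets are left out.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class BucketAssignmentSketch {
    // Hypothetical underflow label for names that sort before the first label.
    static final String UNDERFLOW = "...";

    /**
     * Returns the label of the bucket a name falls into: the last label that is
     * less than or equal to the name under the given collator, or UNDERFLOW if
     * the name sorts before every label. Labels must already be in collation order.
     */
    static String bucketFor(String name, List<String> labels, Collator collator) {
        String result = UNDERFLOW;
        for (String label : labels) {
            if (collator.compare(name, label) < 0) {
                break; // the name sorts before this label, so the previous one wins
            }
            result = label;
        }
        return result;
    }

    /** Stand-in collator (assumption): English rules at primary strength. */
    static Collator primaryEnglish() {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(Collator.PRIMARY);
        return collator;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("A", "B", "C");
        Collator collator = primaryEnglish();
        for (String name : Arrays.asList("Addison", "Azensky", "baecker", "42nd Street")) {
            System.out.println(bucketFor(name, labels, collator) + " <- " + name);
        }
    }
}
```

Primary strength is used so that case differences do not affect which bucket a name lands in.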
Note: If you expect to have a lot of ASCII or Latin characters as well as characters from the user's language, then it is a good idea to call addLabels(ULocale.ENGLISH).
Direct Use
The following shows an example of building an index directly. The "show..." methods below are just to illustrate usage.
    // Create a simple index where the values for the strings are Integers, and add the strings
    AlphabeticIndex<Integer> index = new AlphabeticIndex<Integer>(desiredLocale).addLabels(additionalLocale);
    int counter = 0;
    for (String item : test) {
        index.addRecord(item, counter++);
    }
    ...
    // Show index at top. We could skip or gray out empty buckets
    for (AlphabeticIndex.Bucket<Integer> bucket : index) {
        if (showAll || bucket.size() != 0) {
            showLabelAtTop(UI, bucket.getLabel());
        }
    }
    ...
    // Show the buckets with their contents, skipping empty buckets
    for (AlphabeticIndex.Bucket<Integer> bucket : index) {
        if (bucket.size() != 0) {
            showLabelInList(UI, bucket.getLabel());
            for (AlphabeticIndex.Record<Integer> item : bucket) {
                showIndexedItem(UI, item.getName(), item.getData());
            }
        }
    }
The caller can build different UIs using this class. For example, an index character could be omitted or grayed-out if its bucket is empty. Small buckets could also be combined based on size, such as:
    ... A-F G-N O-Z ...
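Combining small buckets is left to the caller. One possible approach, sketched here under assumptions, is to merge adjacent buckets until each group reaches a minimum size. The class name, the minSize threshold, and the "first-last" label format are all hypothetical; the input is just the ordered labels and their record counts, however the caller obtained them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CombinedBucketsSketch {
    /**
     * Merges adjacent buckets, in order, until each merged group holds at least
     * minSize items, and returns one label per group. A group made of a single
     * bucket keeps its own label; a trailing group is emitted even if it stays
     * below minSize.
     */
    static List<String> combine(List<String> labels, int[] sizes, int minSize) {
        List<String> combined = new ArrayList<>();
        String first = null; // first label of the group being built
        String last = null;  // most recent label added to that group
        int total = 0;       // number of items in the group so far
        for (int i = 0; i < labels.size(); i++) {
            if (first == null) {
                first = labels.get(i);
            }
            last = labels.get(i);
            total += sizes[i];
            if (total >= minSize) {
                combined.add(rangeLabel(first, last));
                first = null;
                total = 0;
            }
        }
        if (first != null) {
            combined.add(rangeLabel(first, last));
        }
        return combined;
    }

    private static String rangeLabel(String first, String last) {
        return first.equals(last) ? first : first + "-" + last;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("A", "B", "C", "D", "E");
        int[] sizes = {1, 1, 3, 0, 1};
        System.out.println(combine(labels, sizes, 2));
    }
}
```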
Client Support
Callers can also use the AlphabeticIndex.ImmutableIndex, or the AlphabeticIndex itself, to support sorting on a client that doesn't support AlphabeticIndex functionality.
The ImmutableIndex is both immutable and thread-safe. The corresponding AlphabeticIndex methods are not thread-safe because they "lazily" build the index buckets.
- ImmutableIndex.getBucket(index) provides random access to all buckets and their labels and label types.
- AlphabeticIndex.getBucketLabels() or the bucket iterator on either class can be used to get a list of the labels, such as "...", "A", "B",..., and send that list to the client.
- When the client has a new name, it sends that name to the server. The server needs to call the following methods, and communicate the bucketIndex and collationKey back to the client.
    int bucketIndex = index.getBucketIndex(name);
    String label = immutableIndex.getBucket(bucketIndex).getLabel(); // optional
    RawCollationKey collationKey = collator.getRawCollationKey(name, null);
- The client would put the name (and associated information) into its bucket for bucketIndex. The collationKey is a sequence of bytes that can be compared with a binary compare, and produce the right localized result.
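The claim that collation key bytes can be ordered with a plain binary compare can be illustrated with the JDK's own collator. This is a sketch under assumptions: java.text.CollationKey.toByteArray() stands in for ICU's RawCollationKey, and the class and method names are hypothetical.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class BinaryKeyCompareSketch {
    /** Server side: turn a name into sort-key bytes using the server's collator. */
    static byte[] sortKey(Collator collator, String name) {
        return collator.getCollationKey(name).toByteArray();
    }

    /** Client side: unsigned, byte-by-byte comparison; no collator is needed. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Sorts names using only their key bytes, as a collator-less client would. */
    static List<String> sortByKeyBytes(Collator collator, List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort((x, y) -> compareBytes(sortKey(collator, x), sortKey(collator, y)));
        return sorted;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        List<String> names = Arrays.asList("baecker", "Azensky", "Addison", "Albertson");
        System.out.println(sortByKeyBytes(collator, names));
    }
}
```

In a real client-server setup the keys would be computed once on the server and shipped with each name, rather than recomputed inside the comparator as this sketch does for brevity.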
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | AlphabeticIndex.Bucket<V> | An index "bucket" with a label string and type.
private static class | AlphabeticIndex.BucketList<V> |
static class | AlphabeticIndex.ImmutableIndex<V> | Immutable, thread-safe version of AlphabeticIndex.
static class | AlphabeticIndex.Record<V> | A (name, data) pair, to be sorted by name into one of the index buckets.
-
Field Summary
Fields
Modifier and Type | Field | Description
private static java.lang.String | BASE | Prefix string for Chinese index buckets.
private static java.util.Comparator<java.lang.String> | binaryCmp |
private AlphabeticIndex.BucketList<V> | buckets |
private static char | CGJ |
private RuleBasedCollator | collatorExternal |
private RuleBasedCollator | collatorOriginal |
private RuleBasedCollator | collatorPrimaryOnly |
private java.util.List<java.lang.String> | firstCharsInScripts |
private static int | GC_CN_MASK |
private static int | GC_L_MASK |
private static int | GC_LL_MASK |
private static int | GC_LM_MASK |
private static int | GC_LO_MASK |
private static int | GC_LT_MASK |
private static int | GC_LU_MASK |
private java.lang.String | inflowLabel |
private UnicodeSet | initialLabels |
private java.util.List<AlphabeticIndex.Record<V>> | inputList |
private int | maxLabelCount |
private java.lang.String | overflowLabel |
private java.util.Comparator<AlphabeticIndex.Record<V>> | recordComparator |
private java.lang.String | underflowLabel |
-
Constructor Summary
Constructors
Modifier | Constructor | Description
public | AlphabeticIndex(RuleBasedCollator collator) | Create an AlphabeticIndex that uses a specific collator.
public | AlphabeticIndex(ULocale locale) | Create the index object.
private | AlphabeticIndex(ULocale locale, RuleBasedCollator collator) | Internal constructor containing implementation used by public constructors.
public | AlphabeticIndex(java.util.Locale locale) | Create the index object.
-
Method Summary
Methods
Modifier and Type | Method | Description
private boolean | addChineseIndexCharacters() | Add Chinese index characters from the tailoring.
private void | addIndexExemplars(ULocale locale) | This method is called to get the index exemplars.
AlphabeticIndex<V> | addLabels(UnicodeSet additions) | Add more index characters (aside from what are in the locale).
AlphabeticIndex<V> | addLabels(ULocale... additions) | Add more index characters (aside from what are in the locale).
AlphabeticIndex<V> | addLabels(java.util.Locale... additions) | Add more index characters (aside from what are in the locale).
AlphabeticIndex<V> | addRecord(java.lang.CharSequence name, V data) | Add a record (name and data) to the index.
AlphabeticIndex.ImmutableIndex<V> | buildImmutableIndex() | Builds an immutable, thread-safe version of this instance, without data records.
AlphabeticIndex<V> | clearRecords() | Clear the index.
private AlphabeticIndex.BucketList<V> | createBucketList() |
private static java.lang.String | fixLabel(java.lang.String current) |
int | getBucketCount() | Return the number of buckets in the index.
int | getBucketIndex(java.lang.CharSequence name) | Get the bucket number for the given name.
java.util.List<java.lang.String> | getBucketLabels() | Get the labels.
RuleBasedCollator | getCollator() | Get a clone of the collator used internally.
java.util.List<java.lang.String> | getFirstCharactersInScripts() | Deprecated. This API is ICU internal, only for testing.
java.lang.String | getInflowLabel() | Get the default label used for abbreviated buckets between other labels.
int | getMaxLabelCount() | Get the limit on the number of labels in the index.
java.lang.String | getOverflowLabel() | Get the default label used in the IndexCharacters' locale for overflow, e.g. the last item in: X Y Z ...
int | getRecordCount() | Return the number of records in the index: that is, the total number of distinct <name,data> pairs added with addRecord(...), over all the buckets.
java.lang.String | getUnderflowLabel() | Get the default label used in the IndexCharacters' locale for underflow, e.g. the first item in: ... A B C
private static boolean | hasMultiplePrimaryWeights(RuleBasedCollator coll, long variableTop, java.lang.String s) |
private void | initBuckets() | Creates an index, and buckets and sorts the list of records into the index.
private java.util.List<java.lang.String> | initLabels() | Determine the best labels to use.
private static boolean | isOneLabelBetterThanOther(Normalizer2 nfkdNormalizer, java.lang.String one, java.lang.String other) | Returns true if one index character string is "better" than the other.
java.util.Iterator<AlphabeticIndex.Bucket<V>> | iterator() | Return an iterator over the buckets.
private java.lang.String | separated(java.lang.String item) | Return the string with interspersed CGJs.
AlphabeticIndex<V> | setInflowLabel(java.lang.String inflowLabel) | Set the inflow label.
AlphabeticIndex<V> | setMaxLabelCount(int maxLabelCount) | Set a limit on the number of labels in the index.
AlphabeticIndex<V> | setOverflowLabel(java.lang.String overflowLabel) | Set the overflow label.
AlphabeticIndex<V> | setUnderflowLabel(java.lang.String underflowLabel) | Set the underflow label.
-
Field Detail
-
BASE
private static final java.lang.String BASE
Prefix string for Chinese index buckets. See http://unicode.org/repos/cldr/trunk/specs/ldml/tr35-collation.html#Collation_Indexes
- See Also:
- Constant Field Values
-
CGJ
private static final char CGJ
- See Also:
- Constant Field Values
-
binaryCmp
private static final java.util.Comparator<java.lang.String> binaryCmp
-
collatorOriginal
private final RuleBasedCollator collatorOriginal
-
collatorPrimaryOnly
private final RuleBasedCollator collatorPrimaryOnly
-
collatorExternal
private RuleBasedCollator collatorExternal
-
recordComparator
private final java.util.Comparator<AlphabeticIndex.Record<V>> recordComparator
-
firstCharsInScripts
private final java.util.List<java.lang.String> firstCharsInScripts
-
initialLabels
private final UnicodeSet initialLabels
-
inputList
private java.util.List<AlphabeticIndex.Record<V>> inputList
-
buckets
private AlphabeticIndex.BucketList<V> buckets
-
overflowLabel
private java.lang.String overflowLabel
-
underflowLabel
private java.lang.String underflowLabel
-
inflowLabel
private java.lang.String inflowLabel
-
maxLabelCount
private int maxLabelCount
-
GC_LU_MASK
private static final int GC_LU_MASK
- See Also:
- Constant Field Values
-
GC_LL_MASK
private static final int GC_LL_MASK
- See Also:
- Constant Field Values
-
GC_LT_MASK
private static final int GC_LT_MASK
- See Also:
- Constant Field Values
-
GC_LM_MASK
private static final int GC_LM_MASK
- See Also:
- Constant Field Values
-
GC_LO_MASK
private static final int GC_LO_MASK
- See Also:
- Constant Field Values
-
GC_L_MASK
private static final int GC_L_MASK
- See Also:
- Constant Field Values
-
GC_CN_MASK
private static final int GC_CN_MASK
- See Also:
- Constant Field Values
-
Constructor Detail
-
AlphabeticIndex
public AlphabeticIndex(ULocale locale)
Create the index object.
- Parameters:
  locale - The locale for the index.
-
AlphabeticIndex
public AlphabeticIndex(java.util.Locale locale)
Create the index object.
- Parameters:
  locale - The locale for the index.
-
AlphabeticIndex
public AlphabeticIndex(RuleBasedCollator collator)
Create an AlphabeticIndex that uses a specific collator.
The index will be created with no labels; the addLabels() function must be called after creation to add the desired labels to the index.
The index will work directly with the supplied collator. If the caller needs to continue working with the collator, it should be cloned first, so that the collator provided to the AlphabeticIndex remains unchanged after creation of the index.
- Parameters:
  collator - The collator to use to order the contents of this index.
-
AlphabeticIndex
private AlphabeticIndex(ULocale locale, RuleBasedCollator collator)
Internal constructor containing implementation used by public constructors.
-
Method Detail
-
addLabels
public AlphabeticIndex<V> addLabels(UnicodeSet additions)
Add more index characters (aside from what are in the locale).
- Parameters:
  additions - additional characters to add to the index, such as A-Z.
- Returns:
  this, for chaining
-
addLabels
public AlphabeticIndex<V> addLabels(ULocale... additions)
Add more index characters (aside from what are in the locale).
- Parameters:
  additions - additional characters to add to the index, such as those in Swedish.
- Returns:
  this, for chaining
-
addLabels
public AlphabeticIndex<V> addLabels(java.util.Locale... additions)
Add more index characters (aside from what are in the locale).
- Parameters:
  additions - additional characters to add to the index, such as those in Swedish.
- Returns:
  this, for chaining
-
setOverflowLabel
public AlphabeticIndex<V> setOverflowLabel(java.lang.String overflowLabel)
Set the overflow label.
- Parameters:
  overflowLabel - see class description
- Returns:
  this, for chaining
-
getUnderflowLabel
public java.lang.String getUnderflowLabel()
Get the default label used in the IndexCharacters' locale for underflow, that is, for strings that sort before the first label, e.g. the first item in: ... A B C
- Returns:
  underflow label
-
setUnderflowLabel
public AlphabeticIndex<V> setUnderflowLabel(java.lang.String underflowLabel)
Set the underflow label.
- Parameters:
  underflowLabel - see class description
- Returns:
  this, for chaining
-
getOverflowLabel
public java.lang.String getOverflowLabel()
Get the default label used in the IndexCharacters' locale for overflow, that is, for strings that sort after the last label, e.g. the last item in: X Y Z ...
- Returns:
  overflow label
-
setInflowLabel
public AlphabeticIndex<V> setInflowLabel(java.lang.String inflowLabel)
Set the inflow label.
- Parameters:
  inflowLabel - see class description
- Returns:
  this, for chaining
-
getInflowLabel
public java.lang.String getInflowLabel()
Get the default label used for abbreviated buckets between other labels. For example, when labels for Latin and Greek are used, the inflow label is the middle item in: X Y Z ... Α Β Γ.
- Returns:
  inflow label
-
getMaxLabelCount
public int getMaxLabelCount()
Get the limit on the number of labels in the index. The number of buckets can be slightly larger: see getBucketCount().
- Returns:
  maxLabelCount, the maximum number of labels.
-
setMaxLabelCount
public AlphabeticIndex<V> setMaxLabelCount(int maxLabelCount)
Set a limit on the number of labels in the index. The number of buckets can be slightly larger: see getBucketCount().
- Parameters:
  maxLabelCount - Set the maximum number of labels. Currently, if the number is exceeded, then every nth item is removed to bring the count down. A more sophisticated mechanism may be available in the future.
- Returns:
  this, for chaining
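As a rough illustration of bringing an over-long label list down to a limit by dropping items at regular intervals, the following sketch keeps evenly spaced labels. It is one possible reading of "every nth item is removed", not necessarily the exact rule the library applies; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class LabelThinningSketch {
    /** Returns the labels "A" through "Z" as sample input. */
    static List<String> latinLabels() {
        List<String> labels = new ArrayList<>();
        for (char c = 'A'; c <= 'Z'; c++) {
            labels.add(String.valueOf(c));
        }
        return labels;
    }

    /**
     * Keeps at most maxLabelCount labels, evenly spread over the input.
     * Each label index i is mapped to a slot (i * maxLabelCount / n); only the
     * first label that lands in each slot is kept, the rest are dropped.
     */
    static List<String> thin(List<String> labels, int maxLabelCount) {
        if (maxLabelCount < 1) {
            throw new IllegalArgumentException("maxLabelCount must be positive");
        }
        int n = labels.size();
        List<String> kept = new ArrayList<>();
        if (n <= maxLabelCount) {
            kept.addAll(labels); // already within the limit
            return kept;
        }
        int previousSlot = -1;
        for (int i = 0; i < n; i++) {
            int slot = (int) ((long) i * maxLabelCount / n);
            if (slot != previousSlot) {
                kept.add(labels.get(i));
                previousSlot = slot;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(thin(latinLabels(), 10));
    }
}
```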
-
initLabels
private java.util.List<java.lang.String> initLabels()
Determine the best labels to use. This is based on the exemplars, but they are also processed to make sure that they are unique and sort differently, and that the overall list is small enough.
-
fixLabel
private static java.lang.String fixLabel(java.lang.String current)
-
addIndexExemplars
private void addIndexExemplars(ULocale locale)
This method is called to get the index exemplars. Normally these come from the locale directly, but if they aren't available, we have to synthesize them.
-
addChineseIndexCharacters
private boolean addChineseIndexCharacters()
Add Chinese index characters from the tailoring.
-
separated
private java.lang.String separated(java.lang.String item)
Return the string with interspersed CGJs. Input must have more than 2 codepoints.
This is used to test whether contractions sort differently from their components.
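A minimal sketch of interspersing CGJs, assuming CGJ is U+034F COMBINING GRAPHEME JOINER and that one joiner is placed between each pair of adjacent code points. The class name is hypothetical and this is not the private ICU method itself.

```java
final class CgjSeparatorSketch {
    // Assumption: CGJ is U+034F COMBINING GRAPHEME JOINER.
    static final char CGJ = '\u034F';

    /** Returns the input with a CGJ inserted between adjacent code points. */
    static String separated(String item) {
        StringBuilder result = new StringBuilder();
        int i = 0;
        while (i < item.length()) {
            int codePoint = item.codePointAt(i);
            if (i > 0) {
                result.append(CGJ); // never splits a surrogate pair
            }
            result.appendCodePoint(codePoint);
            i += Character.charCount(codePoint);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String out = separated("abc");
        System.out.println(out.length() + " UTF-16 units after separation");
    }
}
```

If a collator compares the original string and the separated string as different, the original is sorted as a contraction rather than as its individual components.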
-
buildImmutableIndex
public AlphabeticIndex.ImmutableIndex<V> buildImmutableIndex()
Builds an immutable, thread-safe version of this instance, without data records.
- Returns:
  an immutable index instance
-
getBucketLabels
public java.util.List<java.lang.String> getBucketLabels()
Get the labels.
- Returns:
  The list of bucket labels, after processing.
-
getCollator
public RuleBasedCollator getCollator()
Get a clone of the collator used internally. Note that for performance reasons, the clone is only done once, and then stored. The next time it is accessed, the same instance is returned.
Don't use this method across threads if you are changing the settings on the collator, at least not without synchronizing.
- Returns:
  a clone of the collator used internally
-
addRecord
public AlphabeticIndex<V> addRecord(java.lang.CharSequence name, V data)
Add a record (name and data) to the index. The name will be used to sort the items into buckets, and to sort within the bucket. Two records may have the same name. When they do, the sort order is according to the order added: the first added comes first.
- Parameters:
  name - The name, used to sort the record into a bucket and within it
  data - Data, such as an address or link
- Returns:
  this, for chaining
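The tie-breaking rule for equal names (first added comes first) is what a stable sort by name gives for free. A small sketch, assuming java.text.Collator as a stand-in collator and a hypothetical Entry type in place of AlphabeticIndex.Record:

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

final class StableRecordOrderSketch {
    /** Hypothetical (name, data) pair standing in for a record. */
    static final class Entry {
        final String name;
        final Object data;

        Entry(String name, Object data) {
            this.name = name;
            this.data = data;
        }

        @Override
        public String toString() {
            return name + "=" + data;
        }
    }

    /**
     * Sorts entries by name only. Collections.sort is guaranteed to be stable,
     * so entries with equal names keep the order in which they were added.
     */
    static List<Entry> sortByName(List<Entry> inAddedOrder, Collator collator) {
        List<Entry> sorted = new ArrayList<>(inAddedOrder);
        Collections.sort(sorted, (a, b) -> collator.compare(a.name, b.name));
        return sorted;
    }

    public static void main(String[] args) {
        List<Entry> added = Arrays.asList(
                new Entry("Smith", 1), new Entry("Adams", 2), new Entry("Smith", 3));
        System.out.println(sortByName(added, Collator.getInstance(Locale.ENGLISH)));
    }
}
```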
-
getBucketIndex
public int getBucketIndex(java.lang.CharSequence name)
Get the bucket number for the given name. This routine permits callers to implement their own bucket handling mechanisms, including client-server handling. For example, when a new name is created on the client, it can ask the server for the bucket for that name, and the sortkey (using getCollator). Once the client has that information, it can put the name into the right bucket, and sort it within that bucket, without having access to the index or collator.
Note that the bucket number (and sort key) are only valid for the settings of the current AlphabeticIndex; if those are changed, then the bucket number and sort key must be regenerated.
- Parameters:
  name - The name to find the bucket for
- Returns:
  the bucket index for the name
-
clearRecords
public AlphabeticIndex<V> clearRecords()
Clear the index.
- Returns:
  this, for chaining
-
getBucketCount
public int getBucketCount()
Return the number of buckets in the index. This will be the same as the number of labels, plus buckets for the underflow, overflow, and inflow(s).
- Returns:
  number of buckets
-
getRecordCount
public int getRecordCount()
Return the number of records in the index: that is, the total number of distinct <name,data> pairs added with addRecord(...), over all the buckets.
- Returns:
  total number of records in buckets
-
iterator
public java.util.Iterator<AlphabeticIndex.Bucket<V>> iterator()
Return an iterator over the buckets.
- Specified by:
  iterator in interface java.lang.Iterable<AlphabeticIndex.Bucket<V>>
- Returns:
  iterator over buckets.
-
initBuckets
private void initBuckets()
Creates an index, and buckets and sorts the list of records into the index.
-
isOneLabelBetterThanOther
private static boolean isOneLabelBetterThanOther(Normalizer2 nfkdNormalizer, java.lang.String one, java.lang.String other)
Returns true if one index character string is "better" than the other. Shorter NFKD is better, and otherwise NFKD-binary-less-than is better, and otherwise binary-less-than is better.
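The three-step preference above can be sketched with the JDK's java.text.Normalizer in place of ICU's Normalizer2. Two points are assumptions: "shorter" is measured in code points, and "binary-less-than" is approximated with String.compareTo (UTF-16 code unit order). The class and method names are hypothetical.

```java
import java.text.Normalizer;

final class LabelPreferenceSketch {
    /**
     * Returns true if label one is preferred over label other:
     * 1. the shorter NFKD form wins (length counted in code points);
     * 2. otherwise the binary-smaller NFKD form wins;
     * 3. otherwise the binary-smaller original string wins.
     */
    static boolean isBetter(String one, String other) {
        String n1 = Normalizer.normalize(one, Normalizer.Form.NFKD);
        String n2 = Normalizer.normalize(other, Normalizer.Form.NFKD);
        int diff = n1.codePointCount(0, n1.length()) - n2.codePointCount(0, n2.length());
        if (diff != 0) {
            return diff < 0;
        }
        diff = n1.compareTo(n2);
        if (diff != 0) {
            return diff < 0;
        }
        return one.compareTo(other) < 0;
    }

    public static void main(String[] args) {
        // U+00C5 decomposes to "A" plus a combining ring, so plain "A" is shorter.
        System.out.println(isBetter("A", "\u00C5"));
        // U+24B6 (circled A) has the same NFKD form as "A"; the tie falls through to step 3.
        System.out.println(isBetter("A", "\u24B6"));
    }
}
```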
-
createBucketList
private AlphabeticIndex.BucketList<V> createBucketList()
-
hasMultiplePrimaryWeights
private static boolean hasMultiplePrimaryWeights(RuleBasedCollator coll, long variableTop, java.lang.String s)
-
getFirstCharactersInScripts
@Deprecated public java.util.List<java.lang.String> getFirstCharactersInScripts()
Deprecated. This API is ICU internal, only for testing.
Return a list of the first character in each script. Only exposed for testing.
- Returns:
  list of first characters in each script
-