Package com.ibm.icu.text
Class UnicodeDecompressor
java.lang.Object
    com.ibm.icu.text.UnicodeDecompressor
All Implemented Interfaces:
SCSU
public final class UnicodeDecompressor extends java.lang.Object implements SCSU
A decompression engine implementing the Standard Compression Scheme for Unicode (SCSU) as outlined in Unicode Technical Report #6.
USAGE
The static methods on UnicodeDecompressor may be used in a straightforward manner to decompress simple strings:
    byte[] compressed = ... ; // get compressed bytes from somewhere
    String result = UnicodeDecompressor.decompress(compressed);
The static methods have a fairly large memory footprint. For finer-grained control over memory usage, UnicodeDecompressor offers more powerful APIs allowing iterative decompression:
    // Decompress an array "bytes" of length "len" using a buffer of 512 chars
    // to the Writer "out"
    UnicodeDecompressor myDecompressor = new UnicodeDecompressor();
    final int BUFSIZE = 512;
    char[] charBuffer = new char[BUFSIZE];
    int charsWritten = 0;
    int[] bytesRead = new int[1];
    int totalBytesDecompressed = 0;
    int totalCharsWritten = 0;

    do {
        // do the decompression
        charsWritten = myDecompressor.decompress(bytes, totalBytesDecompressed, len,
                                                 bytesRead,
                                                 charBuffer, 0, BUFSIZE);

        // do something with the current set of chars
        out.write(charBuffer, 0, charsWritten);

        // update the no. of bytes decompressed
        totalBytesDecompressed += bytesRead[0];

        // update the no. of chars written
        totalCharsWritten += charsWritten;
    } while (totalBytesDecompressed < len);

    myDecompressor.reset(); // reuse decompressor
Decompression is performed according to the standard set forth in Unicode Technical Report #6.
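To make the notions of "compression mode" and "dynamic window" more concrete, here is a minimal sketch of a decoder for a small subset of SCSU single-byte mode. It is not the ICU implementation: the class name ScsuSubsetSketch and the method decodeSubset are hypothetical, and only pass-through bytes, the SC0..SC7 window-select tags, and the SQU quote tag are handled. The tag values and initial window offsets are taken from Unicode Technical Report #6; every other tag is rejected.

```java
class ScsuSubsetSketch {
    // Initial dynamic window offsets as listed in UTR #6.
    static final int[] DEFAULT_OFFSETS = {
        0x0080, 0x00C0, 0x0400, 0x0600, 0x0900, 0x3040, 0x30A0, 0xFF00
    };

    // Single-byte-mode tags handled by this sketch (values from UTR #6).
    static final int SQU = 0x0E; // quote one UTF-16 code unit (two bytes follow)
    static final int SC0 = 0x10; // SC0..SC7 select dynamic window 0..7
    static final int SC7 = 0x17;

    /** Hypothetical helper: decodes only the subset described above. */
    static String decodeSubset(byte[] in) {
        StringBuilder out = new StringBuilder();
        int window = 0; // the decoder starts with dynamic window 0 active
        int i = 0;
        while (i < in.length) {
            int b = in[i++] & 0xFF;
            if (b >= 0x80) {
                // Upper half: an offset into the active dynamic window.
                out.append((char) (DEFAULT_OFFSETS[window] + (b - 0x80)));
            } else if (b >= 0x20 || b == 0x00 || b == 0x09 || b == 0x0A || b == 0x0D) {
                // Printable ASCII and NUL, TAB, LF, CR pass through unchanged.
                out.append((char) b);
            } else if (b >= SC0 && b <= SC7) {
                window = b - SC0;
            } else if (b == SQU) {
                if (i + 1 >= in.length) {
                    throw new IllegalArgumentException("truncated SQU sequence");
                }
                int hi = in[i++] & 0xFF;
                int lo = in[i++] & 0xFF;
                out.append((char) ((hi << 8) | lo));
            } else {
                throw new UnsupportedOperationException(
                    "tag not handled in this sketch: 0x" + Integer.toHexString(b));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // SC2 selects the window at U+0400; each following byte is a Cyrillic letter.
        byte[] moscow = {0x12, (byte) 0x9C, (byte) 0xBE, (byte) 0xC1,
                         (byte) 0xBA, (byte) 0xB2, (byte) 0xB0};
        System.out.println(decodeSubset(moscow));
    }
}
```

The window index and the offsets table in this sketch play roughly the role that the fCurrentWindow and fOffsets fields are described as playing below; the real class also tracks a mode, which this subset omits because it never leaves single-byte mode.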
See Also:
UnicodeCompressor
Field Summary
Fields
Modifier and Type     Field             Description
private static int    BUFSIZE           Size of our internal buffer
private byte[]        fBuffer           Internal buffer for saving state
private int           fBufferLength     Number of characters in our internal buffer
private int           fCurrentWindow    Alias to current dynamic window
private int           fMode             Current compression mode
private int[]         fOffsets          Dynamic compression window offsets
Fields inherited from interface com.ibm.icu.text.SCSU
ARMENIANINDEX, COMPRESSIONOFFSET, GREEKINDEX, HALFWIDTHKATAKANAINDEX, HIRAGANAINDEX, INVALIDCHAR, INVALIDWINDOW, IPAEXTENSIONINDEX, KATAKANAINDEX, LATININDEX, MAXINDEX, NUMSTATICWINDOWS, NUMWINDOWS, RESERVEDINDEX, SCHANGE0, SCHANGE1, SCHANGE2, SCHANGE3, SCHANGE4, SCHANGE5, SCHANGE6, SCHANGE7, SCHANGEU, SDEFINE0, SDEFINE1, SDEFINE2, SDEFINE3, SDEFINE4, SDEFINE5, SDEFINE6, SDEFINE7, SDEFINEX, SINGLEBYTEMODE, sOffsets, sOffsetTable, SQUOTE0, SQUOTE1, SQUOTE2, SQUOTE3, SQUOTE4, SQUOTE5, SQUOTE6, SQUOTE7, SQUOTEU, SRESERVED, UCHANGE0, UCHANGE1, UCHANGE2, UCHANGE3, UCHANGE4, UCHANGE5, UCHANGE6, UCHANGE7, UDEFINE0, UDEFINE1, UDEFINE2, UDEFINE3, UDEFINE4, UDEFINE5, UDEFINE6, UDEFINE7, UDEFINEX, UNICODEMODE, UQUOTEU, URESERVED
Constructor Summary
Constructors
Constructor              Description
UnicodeDecompressor()    Create a UnicodeDecompressor.
-
Method Summary
Modifier and Type          Method and Description
static java.lang.String    decompress(byte[] buffer)
                           Decompress a byte array into a String.
static char[]              decompress(byte[] buffer, int start, int limit)
                           Decompress a byte array into a Unicode character array.
int                        decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit, int[] bytesRead, char[] charBuffer, int charBufferStart, int charBufferLimit)
                           Decompress a byte array into a Unicode character array.
void                       reset()
                           Reset the decompressor to its initial state.
Field Detail
-
fCurrentWindow
private int fCurrentWindow
Alias to current dynamic window
-
fOffsets
private int[] fOffsets
Dynamic compression window offsets
-
fMode
private int fMode
Current compression mode
-
BUFSIZE
private static final int BUFSIZE
Size of our internal buffer
See Also:
Constant Field Values
-
fBuffer
private byte[] fBuffer
Internal buffer for saving state
-
fBufferLength
private int fBufferLength
Number of characters in our internal buffer
Constructor Detail
-
UnicodeDecompressor
public UnicodeDecompressor()
Create a UnicodeDecompressor. Sets all windows to their default values.
See Also:
reset()
Method Detail
-
decompress
public static java.lang.String decompress(byte[] buffer)
Decompress a byte array into a String.
Parameters:
buffer - The byte array to decompress.
Returns:
A String containing the decompressed characters.
See Also:
decompress(byte[], int, int)
-
decompress
public static char[] decompress(byte[] buffer, int start, int limit)
Decompress a byte array into a Unicode character array.
Parameters:
buffer - The byte array to decompress.
start - The start of the byte run to decompress.
limit - The limit of the byte run to decompress.
Returns:
A character array containing the decompressed characters.
See Also:
decompress(byte[])
-
decompress
public int decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit, int[] bytesRead, char[] charBuffer, int charBufferStart, int charBufferLimit)
Decompress a byte array into a Unicode character array. This method will either completely fill the output buffer or consume the entire input.
Parameters:
byteBuffer - The byte buffer to decompress.
byteBufferStart - The start of the byte run to decompress.
byteBufferLimit - The limit of the byte run to decompress.
bytesRead - A one-element array. If not null, on return its single element holds the number of bytes read from byteBuffer.
charBuffer - A buffer to receive the decompressed data. This buffer must be at least two characters in size.
charBufferStart - The starting offset to which to write decompressed data.
charBufferLimit - The limiting offset for writing decompressed data.
Returns:
The number of Unicode characters written to charBuffer.
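Because each call stops when either the output buffer is full or the input is exhausted, callers typically loop, advancing the input position by bytesRead[0] after every call. The sketch below illustrates that calling pattern only. To stay self-contained it does not use UnicodeDecompressor: ChunkDecoder is a hypothetical interface that mirrors the seven-argument signature, BYTE_TO_CHAR is a toy stand-in that maps one byte to one char, and drain is a hypothetical helper that appends to a StringBuilder rather than a Writer.

```java
class ChunkedDecodeSketch {
    /** Hypothetical stand-in mirroring the seven-argument decompress signature. */
    interface ChunkDecoder {
        int decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit,
                       int[] bytesRead,
                       char[] charBuffer, int charBufferStart, int charBufferLimit);
    }

    /** Toy decoder (not SCSU): copies one byte to one char until input or output runs out. */
    static final ChunkDecoder BYTE_TO_CHAR =
        (src, start, limit, bytesRead, dst, dstStart, dstLimit) -> {
            int n = Math.min(limit - start, dstLimit - dstStart);
            for (int k = 0; k < n; k++) {
                dst[dstStart + k] = (char) (src[start + k] & 0xFF);
            }
            if (bytesRead != null) {
                bytesRead[0] = n; // report how much input was consumed
            }
            return n; // number of chars written
        };

    /** Feeds bytes[0..len) through the decoder using a small reusable char buffer. */
    static void drain(ChunkDecoder decoder, byte[] bytes, int len,
                      StringBuilder out, int bufSize) {
        char[] charBuffer = new char[bufSize];
        int[] bytesRead = new int[1];
        int consumed = 0;
        while (consumed < len) {
            int written = decoder.decompress(bytes, consumed, len, bytesRead,
                                             charBuffer, 0, charBuffer.length);
            out.append(charBuffer, 0, written);
            if (bytesRead[0] == 0 && written == 0) {
                // Defensive guard for this sketch: avoid spinning if nothing progresses.
                throw new IllegalStateException("decoder made no progress");
            }
            consumed += bytesRead[0];
        }
    }

    public static void main(String[] args) {
        byte[] data = {0x68, 0x65, 0x6C, 0x6C, 0x6F};
        StringBuilder sb = new StringBuilder();
        drain(BYTE_TO_CHAR, data, data.length, sb, 2); // three calls with a 2-char buffer
        System.out.println(sb);
    }
}
```

Passing a non-null bytesRead array is what makes the loop possible, since the return value only reports characters written, not input consumed.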
-
reset
public void reset()
Reset the decompressor to its initial state.