Package com.ibm.icu.impl.breakiter
Class LSTMBreakEngine
- java.lang.Object
- com.ibm.icu.impl.breakiter.DictionaryBreakEngine
- com.ibm.icu.impl.breakiter.LSTMBreakEngine
- All Implemented Interfaces: LanguageBreakEngine
public class LSTMBreakEngine extends DictionaryBreakEngine
-
-
Nested Class Summary
Nested Classes
Modifier and Type          Class
(package private) class    LSTMBreakEngine.CodePointsVectorizer
static class               LSTMBreakEngine.EmbeddingType
(package private) class    LSTMBreakEngine.GraphemeClusterVectorizer
static class               LSTMBreakEngine.LSTMClass
static class               LSTMBreakEngine.LSTMData
(package private) class    LSTMBreakEngine.Vectorizer
-
Nested classes/interfaces inherited from class com.ibm.icu.impl.breakiter.DictionaryBreakEngine
DictionaryBreakEngine.DequeI, DictionaryBreakEngine.PossibleWord
-
-
Field Summary
Fields
Modifier and Type                     Field
private LSTMBreakEngine.LSTMData      fData
private int                           fScript
private LSTMBreakEngine.Vectorizer    fVectorizer
private static byte                   MIN_WORD
private static byte                   MIN_WORD_SPAN
-
Fields inherited from class com.ibm.icu.impl.breakiter.DictionaryBreakEngine
fSet
-
-
Constructor Summary
Constructors
Constructor
LSTMBreakEngine(int script, UnicodeSet set, LSTMBreakEngine.LSTMData data)
-
Method Summary
Methods
Modifier and Type                         Method
private static void                       addDotProductTo(float[] a, float[][] b, float[] result)
private static void                       addHadamardProductTo(float[] a, float[] b, float[] result)
private static void                       addTo(float[] a, float[] result)
private float[]                           compute(float[][] W, float[][] U, float[] B, float[] x, float[] h, float[] c)
static LSTMBreakEngine                    create(int script, LSTMBreakEngine.LSTMData data)
static LSTMBreakEngine.LSTMData           createData(int script)
static LSTMBreakEngine.LSTMData           createData(UResourceBundle bundle)
private static java.lang.String           defaultLSTM(int script)
int                                       divideUpDictionaryRange(java.text.CharacterIterator fIter, int rangeStart, int rangeEnd, DictionaryBreakEngine.DequeI foundBreaks, boolean isPhraseBreaking)
                                          Divide up a range of known dictionary characters handled by this break engine.
private static void                       hadamardProductTo(float[] a, float[] result)
boolean                                   handles(int c)
int                                       hashCode()
private static float[]                    make1DArray(int[] data, int start, int d1)
private static float[][]                  make2DArray(int[] data, int start, int d1, int d2)
private LSTMBreakEngine.Vectorizer        makeVectorizer(LSTMBreakEngine.LSTMData data)
private static int                        maxIndex(float[] data)
private static void                       sigmoid(float[] result, int start, int length)
private static void                       tanh(float[] result, int start, int length)
-
Methods inherited from class com.ibm.icu.impl.breakiter.DictionaryBreakEngine
findBreaks, setCharacters
-
-
-
-
Field Detail
-
MIN_WORD
private static final byte MIN_WORD
- See Also:
- Constant Field Values
-
MIN_WORD_SPAN
private static final byte MIN_WORD_SPAN
- See Also:
- Constant Field Values
-
fData
private final LSTMBreakEngine.LSTMData fData
-
fScript
private int fScript
-
fVectorizer
private final LSTMBreakEngine.Vectorizer fVectorizer
-
-
Constructor Detail
-
LSTMBreakEngine
public LSTMBreakEngine(int script, UnicodeSet set, LSTMBreakEngine.LSTMData data)
-
-
Method Detail
-
make2DArray
private static float[][] make2DArray(int[] data, int start, int d1, int d2)
-
make1DArray
private static float[] make1DArray(int[] data, int start, int d1)
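The signatures of make2DArray and make1DArray suggest helpers that unpack a flat int[] into float arrays. The following is a minimal sketch only, assuming (this page does not confirm it) that each int carries the raw IEEE 754 bit pattern of one float and that two-dimensional data is stored row by row. The class name ArrayUnpackSketch is hypothetical.

```java
final class ArrayUnpackSketch {
    // Assumption: each int holds the bit pattern of one float.
    static float[] make1DArray(int[] data, int start, int d1) {
        float[] result = new float[d1];
        for (int i = 0; i < d1; i++) {
            result[i] = Float.intBitsToFloat(data[start + i]);
        }
        return result;
    }

    // Assumption: row-major layout, d1 rows of d2 columns each.
    static float[][] make2DArray(int[] data, int start, int d1, int d2) {
        float[][] result = new float[d1][d2];
        for (int i = 0; i < d1; i++) {
            for (int j = 0; j < d2; j++) {
                result[i][j] = Float.intBitsToFloat(data[start + i * d2 + j]);
            }
        }
        return result;
    }
}
```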
-
makeVectorizer
private LSTMBreakEngine.Vectorizer makeVectorizer(LSTMBreakEngine.LSTMData data)
-
hashCode
public int hashCode()
- Overrides: hashCode in class java.lang.Object
-
handles
public boolean handles(int c)
- Specified by: handles in interface LanguageBreakEngine
- Overrides: handles in class DictionaryBreakEngine
- Parameters:
c - A Unicode codepoint value
- Returns:
true if the engine can handle this character, false otherwise
-
addDotProductTo
private static void addDotProductTo(float[] a, float[][] b, float[] result)
-
addTo
private static void addTo(float[] a, float[] result)
-
hadamardProductTo
private static void hadamardProductTo(float[] a, float[] result)
-
addHadamardProductTo
private static void addHadamardProductTo(float[] a, float[] b, float[] result)
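The four vector helpers above (addDotProductTo, addTo, hadamardProductTo, addHadamardProductTo) carry no description on this page. Judging only by their names and signatures, each appears to accumulate into its last argument. The sketch below shows one plausible reading. The matrix orientation (b has a.length rows and result.length columns) is an assumption, and the class name VectorOpsSketch is hypothetical.

```java
final class VectorOpsSketch {
    // result[j] += sum over i of a[i] * b[i][j]
    // Assumption: b has a.length rows and result.length columns.
    static void addDotProductTo(float[] a, float[][] b, float[] result) {
        for (int j = 0; j < result.length; j++) {
            for (int i = 0; i < a.length; i++) {
                result[j] += a[i] * b[i][j];
            }
        }
    }

    // result[i] += a[i]
    static void addTo(float[] a, float[] result) {
        for (int i = 0; i < result.length; i++) {
            result[i] += a[i];
        }
    }

    // result[i] *= a[i]  (element-wise, or Hadamard, product in place)
    static void hadamardProductTo(float[] a, float[] result) {
        for (int i = 0; i < result.length; i++) {
            result[i] *= a[i];
        }
    }

    // result[i] += a[i] * b[i]
    static void addHadamardProductTo(float[] a, float[] b, float[] result) {
        for (int i = 0; i < result.length; i++) {
            result[i] += a[i] * b[i];
        }
    }
}
```

Writing into a caller-supplied result array avoids allocating a new array for every operation, which may be why the signatures return void.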
-
sigmoid
private static void sigmoid(float[] result, int start, int length)
-
tanh
private static void tanh(float[] result, int start, int length)
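Both sigmoid and tanh take an array plus a start offset and a length, which suggests that they apply the activation in place to one slice of a larger buffer. A minimal sketch under that assumption follows. The class name ActivationSketch is hypothetical, and the actual implementation may differ.

```java
final class ActivationSketch {
    // Replaces result[start .. start+length-1] with the logistic sigmoid of each value.
    static void sigmoid(float[] result, int start, int length) {
        for (int i = start; i < start + length; i++) {
            result[i] = (float) (1.0 / (1.0 + Math.exp(-result[i])));
        }
    }

    // Replaces result[start .. start+length-1] with the hyperbolic tangent of each value.
    static void tanh(float[] result, int start, int length) {
        for (int i = start; i < start + length; i++) {
            result[i] = (float) Math.tanh(result[i]);
        }
    }
}
```

Elements outside the given slice are left untouched, so a single buffer can hold several gate vectors that receive different activations.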
-
maxIndex
private static int maxIndex(float[] data)
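The name maxIndex suggests an argmax: the index of the largest element, presumably used to pick the most likely output class. A minimal sketch, assuming a non-empty array and that ties resolve to the first occurrence (neither is stated on this page). The class name MaxIndexSketch is hypothetical.

```java
final class MaxIndexSketch {
    // Returns the index of the largest value; on ties, the lowest index wins.
    // Assumption: data is non-empty.
    static int maxIndex(float[] data) {
        int index = 0;
        for (int i = 1; i < data.length; i++) {
            if (data[i] > data[index]) {
                index = i;
            }
        }
        return index;
    }
}
```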
-
compute
private float[] compute(float[][] W, float[][] U, float[] B, float[] x, float[] h, float[] c)
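The parameter names of compute (W, U, B, x, h, c) match the usual notation for one LSTM cell step: input weights, recurrent weights, bias, input vector, hidden state, and cell state. The sketch below is a textbook LSTM step written against that signature, not the engine's confirmed implementation. The gate order (input, forget, candidate, output), the matrix orientation, the in-place update of c, and the class name LstmCellSketch are all assumptions.

```java
import java.util.Arrays;

final class LstmCellSketch {
    // One LSTM step. Assumptions: B has 4*n entries laid out as
    // [input | forget | candidate | output]; W is x.length by 4*n; U is h.length by 4*n.
    // Updates c in place and returns the new hidden state.
    static float[] compute(float[][] W, float[][] U, float[] B,
                           float[] x, float[] h, float[] c) {
        int n = B.length / 4;
        // gates = B + x * W + h * U
        float[] gates = Arrays.copyOf(B, B.length);
        for (int j = 0; j < gates.length; j++) {
            for (int i = 0; i < x.length; i++) {
                gates[j] += x[i] * W[i][j];
            }
            for (int i = 0; i < h.length; i++) {
                gates[j] += h[i] * U[i][j];
            }
        }
        float[] hNext = new float[n];
        for (int k = 0; k < n; k++) {
            float in = sigmoid(gates[k]);
            float forget = sigmoid(gates[n + k]);
            float candidate = (float) Math.tanh(gates[2 * n + k]);
            float out = sigmoid(gates[3 * n + k]);
            c[k] = forget * c[k] + in * candidate;      // new cell state
            hNext[k] = out * (float) Math.tanh(c[k]);   // new hidden state
        }
        return hNext;
    }

    private static float sigmoid(float v) {
        return (float) (1.0 / (1.0 + Math.exp(-v)));
    }
}
```

With all weights and biases at zero, every sigmoid gate evaluates to 0.5 and the candidate to 0, so the cell state is halved at each step. This makes a convenient sanity check for the wiring.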
-
divideUpDictionaryRange
public int divideUpDictionaryRange(java.text.CharacterIterator fIter, int rangeStart, int rangeEnd, DictionaryBreakEngine.DequeI foundBreaks, boolean isPhraseBreaking)
Description copied from class: DictionaryBreakEngine
Divide up a range of known dictionary characters handled by this break engine.
- Specified by: divideUpDictionaryRange in class DictionaryBreakEngine
- Parameters:
fIter - A CharacterIterator representing the text
rangeStart - The start of the range of dictionary characters
rangeEnd - The end of the range of dictionary characters
foundBreaks - Output of break positions. Positions are pushed. Pre-existing contents of the output stack are unaltered.
- Returns:
The number of breaks found
-
createData
public static LSTMBreakEngine.LSTMData createData(UResourceBundle bundle)
-
defaultLSTM
private static java.lang.String defaultLSTM(int script)
-
createData
public static LSTMBreakEngine.LSTMData createData(int script)
-
create
public static LSTMBreakEngine create(int script, LSTMBreakEngine.LSTMData data)
-
-