class RMMSeg::Config
Configuration settings for RMMSeg.
Attributes
An array of dictionary files. Each element should be of the form: [file, whether_dic_include_frequency_info]. Set this before the dictionaries are loaded (they are loaded lazily, on first use); otherwise, call Dictionary.instance.reload manually to reload them.
The maximum length of a CJK word. The default value is 4. Making this value too large might slow down segmentation.
Public Class Methods
Get the name of the algorithm currently in use.
# File lib/rmmseg/config.rb, line 19
def algorithm
  @algorithm
end
Set the name of the algorithm used for segmentation. Valid values are :complex and :simple. The former is the default.
# File lib/rmmseg/config.rb, line 24
def algorithm=(algor)
  unless [:complex, :simple].include? algor
    raise ArgumentError, "Unknown algorithm #{algor}"
  end
  @algorithm = algor
end
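The validating setter shown above can be tried in isolation. The following is a minimal sketch, assuming a stand-in module named DemoConfig (hypothetical, not part of RMMSeg) that copies the same check, so it runs without the library installed.

```ruby
# DemoConfig is a hypothetical stand-in that mirrors the setter documented
# above; it is not the real RMMSeg::Config.
module DemoConfig
  @algorithm = :complex # the documented default

  class << self
    attr_reader :algorithm

    # Accept only the two documented names; reject anything else.
    def algorithm=(algor)
      unless [:complex, :simple].include?(algor)
        raise ArgumentError, "Unknown algorithm #{algor}"
      end
      @algorithm = algor
    end
  end
end

# A valid name is stored.
DemoConfig.algorithm = :simple

# An unknown name raises ArgumentError and leaves the setting unchanged.
rejected = begin
  DemoConfig.algorithm = :fancy
  false
rescue ArgumentError
  true
end
```

Because the method name ends in =, it has to be called with an explicit receiver (DemoConfig.algorithm = ...); a bare algorithm = :simple would only create a local variable.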
Get an instance of the algorithm object corresponding to the configured algorithm name. tok is the class of the token object to be returned. For example, to use RMMSeg with Ferret, provide ::Ferret::Analysis::Token.
# File lib/rmmseg/config.rb, line 34
def algorithm_instance(text, tok=Token)
  RMMSeg.const_get("#{@algorithm}".capitalize+"Algorithm").new(text, tok)
end
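The lookup above builds a class name from the configured symbol (:complex becomes "ComplexAlgorithm") and resolves it with const_get. Here is a minimal sketch of that dispatch, assuming a hypothetical DemoSeg module with placeholder classes; the real algorithm and token classes live in RMMSeg and are not reproduced here.

```ruby
# DemoSeg and everything inside it are hypothetical stand-ins used only to
# illustrate name-based class lookup; they do no real segmentation.
module DemoSeg
  # Placeholder token class; a caller could pass a different class instead.
  Token = Struct.new(:text, :start_offset, :end_offset)

  class SimpleAlgorithm
    attr_reader :text, :token_class

    def initialize(text, tok)
      @text = text
      @token_class = tok
    end
  end

  class ComplexAlgorithm < SimpleAlgorithm; end

  # Turn :complex into "ComplexAlgorithm", look the constant up in this
  # module, and instantiate it with the text and the token class.
  def self.algorithm_instance(name, text, tok = Token)
    const_get("#{name}".capitalize + "Algorithm").new(text, tok)
  end
end

instance = DemoSeg.algorithm_instance(:complex, "some text")
simple   = DemoSeg.algorithm_instance(:simple, "some text")
```

Passing the token class as an argument is what lets the caller decide which kind of token objects the algorithm produces.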
Get the configured behavior for when an unresolved ambiguity occurs.
# File lib/rmmseg/config.rb, line 39
def on_ambiguity
  @on_ambiguity
end
Set the behavior on an unresolved ambiguity. Valid values are :raise_exception and :select_first. The latter is the default.
# File lib/rmmseg/config.rb, line 45
def on_ambiguity=(behavior)
  unless [:raise_exception, :select_first].include? behavior
    raise ArgumentError, "Unknown behavior on ambiguity: #{behavior}"
  end
  @on_ambiguity = behavior
end
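To make the two values concrete, the sketch below shows one way a caller could act on them when several candidates remain. The resolve_ambiguity helper is hypothetical and written only for illustration; how RMMSeg itself consults this setting is not shown in this section.

```ruby
# Hypothetical helper, not part of RMMSeg: illustrates what the two
# documented settings imply when more than one candidate is left.
def resolve_ambiguity(candidates, behavior)
  # Nothing to resolve when there is at most one candidate.
  return candidates.first if candidates.size <= 1

  case behavior
  when :select_first
    candidates.first
  when :raise_exception
    raise "Unresolved ambiguity among #{candidates.size} candidates"
  else
    raise ArgumentError, "Unknown behavior on ambiguity: #{behavior}"
  end
end

picked = resolve_ambiguity(%w[a b], :select_first)

raised = begin
  resolve_ambiguity(%w[a b], :raise_exception)
  false
rescue RuntimeError
  true
end
```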