Determines the monospace display width of a string in Ruby. Implementation based on EastAsianWidth.txt and other data, 100% in Ruby. You can also use wcswidth-ruby for the same purpose, but it is less often updated by OS vendors, so results may differ.
Guessing the correct space a character will consume on terminals is not easy. There is no single standard. Most implementations combine data from East Asian Width, some General Categories, and hand-picked adjustments.
The following rules apply as of version 1.0.0. Rules listed further up take precedence over those below. Please expect changes to this algorithm with every MINOR version update (the X in 1.X.0)!
Width | Characters | Comment
-------|------------------------------|--------------------------------------------------
X | (user defined) | Overwrites any other values
-1 | "\b" | Backspace (total width never below 0)
0 | "\0", "\x05", "\a", "\n", "\v", "\f", "\r", "\x0E", "\x0F" | C0 control codes that do not change horizontal width
1 | "\u{00AD}" | SOFT HYPHEN
2 | "\u{2E3A}" | TWO-EM DASH
3 | "\u{2E3B}" | THREE-EM DASH
0 | General Categories: Mn, Me, Cf (non-arabic) | Excludes ARABIC format characters
0 | "\u{1160}".."\u{11FF}" | HANGUL JUNGSEONG
2 | East Asian Width: F, W | Full-width characters
1 or 2 | East Asian Width: A | Ambiguous characters, user defined, default: 1
1 | All other codepoints | -
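To illustrate how the precedence order in the table plays out, here is a minimal sketch in plain Ruby. It is not the gem's implementation: the names `sketch_char_width`, `sketch_display_width`, `WIDE_RANGES` and `ZERO_WIDTH_CONTROLS` are made up for this example, `WIDE_RANGES` is a small hand-picked stand-in for the real East Asian Width F/W data, and both the Arabic exclusion for Cf characters and the ambiguous (A) class are left out for brevity.

```ruby
# Illustrative stand-in for East Asian Width F/W data (NOT the full Unicode table).
WIDE_RANGES = [
  0x1100..0x115F, # Hangul Jamo initial consonants
  0x3000..0x303E, # CJK symbols and punctuation
  0x4E00..0x9FFF, # CJK unified ideographs
  0xAC00..0xD7A3, # Hangul syllables
  0xFF01..0xFF60  # Fullwidth forms
].freeze

# C0 control codes that do not change the horizontal position.
ZERO_WIDTH_CONTROLS = [0x00, 0x05, 0x07, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F].freeze

# Returns the width of a single character. Rules are checked top to bottom,
# so an earlier rule wins over a later one.
def sketch_char_width(char, overwrite = {})
  cp = char.ord
  return overwrite[cp] if overwrite.key?(cp)       # user defined
  return -1 if cp == 0x08                          # backspace
  return 0  if ZERO_WIDTH_CONTROLS.include?(cp)    # C0 controls
  return 1  if cp == 0x00AD                        # SOFT HYPHEN
  return 2  if cp == 0x2E3A                        # TWO-EM DASH
  return 3  if cp == 0x2E3B                        # THREE-EM DASH
  return 0  if char.match?(/[\p{Mn}\p{Me}\p{Cf}]/) # simplification: Arabic Cf not excluded
  return 0  if (0x1160..0x11FF).cover?(cp)         # HANGUL JUNGSEONG
  return 2  if WIDE_RANGES.any? { |range| range.cover?(cp) }
  1                                                # all other codepoints
end

# Sums the per-character widths; the total is never reported below 0.
def sketch_display_width(string, overwrite = {})
  total = string.each_char.sum { |char| sketch_char_width(char, overwrite) }
  [total, 0].max
end
```

Because SOFT HYPHEN is checked before the general category rule, it keeps width 1 even though it belongs to category Cf, which is the effect of "further up means higher precedence".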
Install the gem with:
gem install unicode-display_width
Or add to your Gemfile:
gem 'unicode-display_width'
require 'unicode/display_width'
Unicode::DisplayWidth.of("⚀") # => 1
Unicode::DisplayWidth.of("一") # => 2
The second parameter sets the width returned for characters defined as ambiguous:
Unicode::DisplayWidth.of("·", 1) # => 1
Unicode::DisplayWidth.of("·", 2) # => 2
You can overwrite how specific code points are handled by passing a hash (or even a proc) as the third parameter:
Unicode::DisplayWidth.of("a\tb", 1, 0x09 => 10) # => 12
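A typical reason to measure display width is aligning text in terminal columns. The following is a small sketch of that idea, not part of the gem: `pad_to_columns` is a hypothetical helper, and it takes the measuring function as an argument so that the snippet runs on its own.

```ruby
# Pads string with spaces on the right until it occupies the given number
# of terminal columns. measure is any callable that returns the display
# width of a string. Strings that are already wide enough are returned as-is.
def pad_to_columns(string, columns, measure)
  missing = columns - measure.call(string)
  missing > 0 ? string + " " * missing : string
end

# Using String#length as a naive measurer (only correct for narrow characters):
pad_to_columns("ab", 6, ->(s) { s.length }) # => "ab    "
```

With the gem loaded, `Unicode::DisplayWidth.method(:of)` can be passed as `measure`, so that wide characters such as "一" are counted as two columns when padding.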
The string extension is activated by default. It will be deactivated in version 2.0:
require 'unicode/display_width/string_ext'
"⚀".display_width #=> 1
'一'.display_width #=> 2
You can actively opt out of the string extension with:
require 'unicode/display_width/no_string_ext'
If you are not a Ruby developer, but you still want to use this software to print out display widths for strings:
$ gem install unicode-display_width
$ ruby -r unicode/display_width -e 'puts Unicode::DisplayWidth.of $*[0]' -- "一"
Replace “一” with the actual string to measure.
Python: github.com/jquast/wcwidth
JavaScript: github.com/mycoboco/wcwidth.js
C for Julia: github.com/JuliaLang/utf8proc/issues/2
Copyright © 2011, 2015-2016 Jan Lelis, janlelis.com, released under the MIT license
Early versions based on runpaint's unicode-data interface: Copyright © 2009 Run Paint Run Run