Project

runicode

0.0
No commit activity in last 3 years
No release in over 3 years
Helper method for stripping unicode 'control', 'other', and 'blank' characters from strings
 Dependencies

Development

>= 0
 Project Readme

runicode

A simple gem that provides a method for stripping non-visible Unicode characters from strings.

This is achieved using regular expressions that match specific Unicode character categories.
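
As a rough illustration of the category-based approach, the sketch below removes every character in the Unicode general category "Other" (`\p{C}`: control, format, surrogate, private-use, and unassigned code points) with a single regular expression. The method name `strip_invisible` is hypothetical and the pattern is an assumption; neither is taken from the gem's source, which may match a different set of categories.

```ruby
# Hypothetical helper, not part of the runicode API.
# Assumption: "non-visible" is approximated by the Unicode general
# category "Other" (\p{C}); the gem itself may use other categories.
def strip_invisible(str)
  # gsub returns a new string with every matching character removed
  str.gsub(/\p{C}/, '')
end

# U+200E (LEFT-TO-RIGHT MARK) is a format character, so it is removed
strip_invisible("\u200Efoo") #=> "foo"
```

Note that `\p{C}` also covers ordinary control characters such as tab and newline, so a pattern like this one would remove those as well.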

Install the gem

    gem install runicode

Usage

    require 'runicode'
    
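    # UTF-8 bytes for U+200E (LEFT-TO-RIGHT MARK, invisible) followed by "foo"
    bytes = [0xE2, 0x80, 0x8E, 102, 111, 111]
    str = bytes.pack('C*').force_encoding('UTF-8')
    
    str #=> "\u200Efoo" (displays as "foo" because U+200E is invisible)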
    str.length #=> 4
    str.bytes.length #=> 6
    
    stripped_str = Runicode.strip(str)
    stripped_str #=> "foo"
    stripped_str.length #=> 3
    stripped_str.bytes.length #=> 3