Match and then change groups, prefix, postfix or even the text between the groups

Lab42Match

Beyond Match Data: Modify Your Matches

... and avoid matching again

Here is your API

Wrap a Regex, Get a Match (object)

    Match = Lab42::Match
    let(:rgx) { %r{(\d+)\.(\d*)} }
    subject { Match.new(rgx) }

Query it

Example: ... to get your wrapped Regex back

    expect( subject.rgx ).to eq(rgx)

... or discover that it is not matched yet

    expect( subject ).not_to be_matched

... be aware of accessing data that is not there yet!

    expect{ subject[0] }.to raise_error(Match::NotMatchedYet)
    expect{ subject.capts }.to raise_error(Match::NotMatchedYet)

Example: And even the wrapped MatchData object must not be accessed

    expect{ subject.match }.to raise_error(Match::NotMatchedYet)

Context Attempt an Unsuccessful Match

However, after a first matching attempt the state changes: even if there was no match, your queries now return nil instead of raising

    subject.match("")
    expect( subject ).to be_matched
    expect( subject.capts ).to be_nil
    expect( subject[0] ).to be_nil
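
The three states described above (not attempted, attempted without success, attempted with success) can be sketched in plain Ruby. `SketchMatch` below is a hypothetical, heavily simplified stand-in written for illustration only. It is not the implementation of `Lab42::Match`, and it assumes that "matched" simply means "a match has been attempted".

```ruby
# Hypothetical sketch, not the Lab42::Match implementation.
class SketchMatch
  NotMatchedYet = Class.new(StandardError)

  def initialize(rgx)
    @rgx       = rgx
    @attempted = false
    @data      = nil
  end

  # Records that a match was attempted, whether or not it succeeded.
  def match(string)
    @attempted = true
    @data      = @rgx.match(string) # nil when the regex does not match
    self
  end

  def matched?
    @attempted
  end

  # Raises before the first attempt, returns nil after a failed one.
  def [](index)
    raise NotMatchedYet unless @attempted
    @data && @data[index]
  end

  def capts
    raise NotMatchedYet unless @attempted
    @data && @data.captures
  end
end
```

Separating "never tried" from "tried and failed" is what lets the queries raise in the first case and return nil in the second.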

Context A Successful Match

Now things get more interesting

    let(:string){ "> 42.43 <" }
    let(:match_data){ subject.rgx.match(string) }
    subject.match(string)

Example: Again we are matched now

    expect( subject ).to be_matched

First, let us prove that all the information of a Regexp#match result is accessible

Example: All MatchData data is available

    expect( subject.match ).to eq(match_data)
    (0..2).each do |i|
      expect( subject[i] ).to eq(match_data[i])
    end
    expect( subject.capts ).to eq(match_data.captures)
    expect( subject.subject ).to eq(string)

So What?

Well, the fun starts now: we can change all parts of our matches! Did you just say all?

Indeed I did, but let us start with the obvious ones, the captures:

Example: Increase Integer Part

    replacement = subject.replace(1, "43")
    expect( replacement.string ).to eq("> 43.43 <")
    # With a block, demonstrating also that the original object has not been altered
    expect( replacement.replace(1){ |old| old.to_i.succ }.string ).to eq("> 44.43 <")
    expect( subject.replace(1){ |old| old.to_i.succ }.string ).to eq("> 43.43 <")
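
For comparison, here is a minimal sketch of a non-mutating capture replacement using only Ruby's standard `MatchData#begin` and `MatchData#end`. The helper name `replace_capture` is an assumption made for this illustration and is not part of the gem. Unlike the wrapped match object, it has to match again on every call.

```ruby
# Hypothetical helper, not part of Lab42::Match.
# Returns a new string with capture `index` replaced; the input is untouched.
def replace_capture(rgx, string, index, replacement = nil)
  md = rgx.match(string)
  return string unless md && md[index]

  new_text = block_given? ? yield(md[index]).to_s : replacement.to_s
  # Stitch together: text before the capture, new text, text after it.
  string[0...md.begin(index)] + new_text + string[md.end(index)..-1]
end

rgx      = %r{(\d+)\.(\d*)}
original = "> 42.43 <"
replaced = replace_capture(rgx, original, 1, "43")
bumped   = replace_capture(rgx, replaced, 1) { |old| old.to_i.succ }
```

Each call builds a fresh string, so `original` keeps its value throughout.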

Context All The Parts

To demonstrate that, we need a slightly more complex string and regex

    let(:rgx){ %r{\w+\s+(\d+)\s+(\d+)\s+\w+} }
    let(:string){ "> Hello 42 43 World <" }
    let(:matched){ Match.new(rgx, string) }

Here is the layout of our string, and where the parts are after a successful match

    |> |Hello |42| |43| World| <|
     ^  ^      ^  ^ ^  ^      ^
     |  |      |  | |  |      |
     |  |      |  | |  |      +---------- part[6] corresponds to MatchData#post_match
     |  |      |  | |  |                  symbolic: :last, :post or :suffix
     |  |      |  | |  +----------------- part[5] corresponds to the matched part after the last capture
     |  |      |  | |                     symbolic: :last_match
     |  |      |  | +-------------------- part[4] corresponds to the last capture
     |  |      |  |                       symbolic: :last_capture
     |  |      |  +---------------------- part[3] corresponds to the matched part between the two captures
     |  |      |
     |  |      +------------------------- part[2] corresponds to the first capture
     |  |                                 symbolic: :first_capture
     |  +-------------------------------- part[1] corresponds to the matched part before the first capture
     |                                    symbolic: :first_match
     +----------------------------------- part[0] corresponds to MatchData#pre_match
                                          symbolic: :first, :pre or :prefix

Example: Demonstrate parts

    expect( matched.parts ).to eq([
        "> ", "Hello ", "42", " ", "43", " World", " <"
      ])
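
To make the layout concrete, the following sketch derives the same list of parts from a plain `MatchData` object. `split_parts` is a hypothetical helper written for this illustration, not the gem's own code, and it assumes that the capture groups are not nested and that every group takes part in the match.

```ruby
# Hypothetical helper, not part of Lab42::Match.
# Assumes non-nested capture groups that all participate in the match.
def split_parts(rgx, string)
  md = rgx.match(string)
  return nil unless md

  parts  = [md.pre_match]
  cursor = md.begin(0)                    # start of the whole match
  (1...md.size).each do |i|
    parts << string[cursor...md.begin(i)] # matched text before capture i
    parts << md[i]                        # capture i itself
    cursor = md.end(i)
  end
  parts << string[cursor...md.end(0)]     # matched text after the last capture
  parts << md.post_match
end

parts = split_parts(%r{\w+\s+(\d+)\s+(\d+)\s+\w+}, "> Hello 42 43 World <")
```

With n capture groups this yields 2n + 3 parts, which gives seven for the two-capture regex used here.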

The 1-based indices used for #replace and for indexing the captures can be converted to the indices of their corresponding parts by simply doubling them.
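
The index arithmetic can be written down as a small sketch. Both helper names are assumptions introduced for illustration and are not part of the gem. The odd index for the text between two captures follows from the layout shown above, where part[3] sits between captures 1 and 2.

```ruby
# Hypothetical helpers, named here only for illustration.
def capture_part_index(capture_index)
  2 * capture_index           # capture 1 -> part 2, capture 2 -> part 4
end

def between_part_index(capture_index)
  2 * capture_index + 1       # text between capture i and capture i + 1
end

parts = ["> ", "Hello ", "42", " ", "43", " World", " <"]
md    = %r{\w+\s+(\d+)\s+(\d+)\s+\w+}.match("> Hello 42 43 World <")

first_capture  = parts[capture_part_index(1)] # same text as md[1]
second_capture = parts[capture_part_index(2)] # same text as md[2]
separator      = parts[between_part_index(1)] # text between the two captures
```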

For the parts outside the captures, convenient symbolic shortcuts are provided; only for the parts between captures do you need to do some calculation to access them.

Then again, you will usually create a capture group for any text you want to change.

Let us change some parts now to see what that does

Example: Change parts by numeric index

      incremented = matched.replace_part(2, 43).replace_part(-1){|s| s.reverse}
      expected = [
        "> ", "Hello ", "43", " ", "43", " World", "< "
      ]
      expect( incremented.parts ).to eq(expected)
      expect( incremented.string ).to eq(expected.join)

The same can be achieved using symbolic indices, which are

    first: 0
    first_capture: 2
    first_match: 1
    last: -1
    last_capture: -3
    last_match: -2
    post: -1
    pre: 0
    prefix: 0
    suffix: -1

Therefore the following will hold

Example: Change parts by symbolic name

      modified = matched
        .replace_part(:first, ">>>")
        .replace_part(:first_match){ |x| x[2] }
        .replace_part(:first_capture, "43")
        .replace_part(:last_capture, "42")
        .replace_part(:last_match){ |x| x[-1] }
        .replace_part(:suffix, "<<<")
      expected_parts = [
        ">>>", "l", "43", " ", "42", "d", "<<<"
      ]
      expect( modified.parts ).to eq(expected_parts)
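
One way to picture how symbolic names could resolve to positions is a lookup table applied to a plain array of parts. This is a sketch under assumptions: `SYMBOLIC_INDICES`, `resolve_part_index` and `replace_in_parts` are hypothetical names, and the gem may implement this differently. The table values are the ones listed above.

```ruby
# Hypothetical sketch, not the Lab42::Match implementation.
SYMBOLIC_INDICES = {
  first: 0, pre: 0, prefix: 0,
  first_match: 1, first_capture: 2,
  last_capture: -3, last_match: -2,
  last: -1, post: -1, suffix: -1
}.freeze

# Integers pass through unchanged; unknown symbols raise KeyError.
def resolve_part_index(index)
  index.is_a?(Integer) ? index : SYMBOLIC_INDICES.fetch(index)
end

# Returns a new array of parts; the original array is left untouched.
def replace_in_parts(parts, index, replacement = nil)
  i         = resolve_part_index(index)
  new_parts = parts.dup
  new_parts[i] = block_given? ? yield(parts[i]).to_s : replacement.to_s
  new_parts
end

parts    = ["> ", "Hello ", "42", " ", "43", " World", " <"]
step_one = replace_in_parts(parts, :first, ">>>")
step_two = replace_in_parts(step_one, :last_match) { |x| x[-1] }
```

The negative values rely on Ruby's own negative array indexing, so `:last_capture` always addresses the third element from the end, however many captures the regex has.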
    

From this it follows directly

Author

Copyright © 2020 Robert Dober (robert.dober@gmail.com)

LICENSE

Same as Elixir -- 😉 --, which is Apache License v2.0. Please refer to LICENSE for details.

SPDX-License-Identifier: Apache-2.0