27

Oh dear, one scaling problem I solved came down to replacing some regex matching with simpler string functions. While I'm a huge fan of regex, it's unreal how much performance it can suck out of some high-n loops...
I got about a 120x speedup out of some critical code, which made a CPU upgrade unnecessary.
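
A minimal sketch of that kind of swap, in JavaScript. The post does not say which patterns were replaced, so the prefix check and the key=value split below, along with all function names, are hypothetical stand-ins:

```javascript
// Hypothetical hot-loop checks; the original patterns are not given in the post.

// Regex version: does the line start with "ERROR:"?
function isErrorRegex(line) {
  return /^ERROR:/.test(line);
}

// Plain string version of the same check.
function isErrorPlain(line) {
  return line.startsWith("ERROR:");
}

// Regex version: take everything after the first "=".
function valueRegex(pair) {
  const m = pair.match(/^[^=]*=([\s\S]*)$/);
  return m ? m[1] : null;
}

// Plain string version: find the first "=" and slice after it.
function valuePlain(pair) {
  const i = pair.indexOf("=");
  return i === -1 ? null : pair.slice(i + 1);
}
```

Whether this buys anything depends on the engine and the data, so it is worth timing both versions on real input before committing to the swap.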

Comments
  • 10
    the key to using regex:

    * as much as needed.

    * as little as possible.

    * and then a bit less.

    and document _precisely_ what a regex (of any non-trivial complexity) does and how it does it. even near-future-you will thank you.
  • 4
    That must have been either a really bad regexp, a really bad algorithm leading to repeated use of the regexp on the same data, or just a heeeeeeeeap of data.
  • 1
    Unless this was log dump processing, with regexes run on each of the log lines...
  • 4
    I like this kind of rant because it makes me discover things I didn't know.
  • 2
    @Oktokolo If I remember correctly, it was just the usual JavaScript String.match(...). V8 should be able to optimize this pretty well, but I think there will still be some overhead left.

    @Paps Yeah, we'll never stop learning new things, regardless of seniority.
  • 2
    Wow, some regexp hater seems to do multi-account downvoting here. I seem to get enough upvotes to keep my answer at zero though...
  • 1
    This reminds me of the time I tried to validate IPv4 CIDR ranges with regex. The pattern kept getting stupidly complicated until I decided to forgo regex entirely. I ended up splitting the string, parsing the octets into numbers, and validating them with simple math.

    Regex has its place but sometimes it just isn't the best option.
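
    A rough sketch of that split-and-parse approach in JavaScript. The function names and the strictness rules (no leading zeros, prefix length 0 to 32) are assumptions for illustration, not necessarily what the original code did:

    ```javascript
    // Parse a short decimal string into a number in [0, max], or return null.
    // Assumed rules: digits only, at most 3 characters, no leading zero except "0".
    function parseDecimal(text, max) {
      if (text.length === 0 || text.length > 3) return null;
      for (const ch of text) {
        if (ch < "0" || ch > "9") return null;
      }
      if (text.length > 1 && text[0] === "0") return null;
      const n = Number(text);
      return n <= max ? n : null;
    }

    // Validate strings shaped like "192.168.0.0/24".
    function isValidIPv4Cidr(input) {
      const parts = input.split("/");
      if (parts.length !== 2) return false;

      const octets = parts[0].split(".");
      if (octets.length !== 4) return false;
      if (octets.some((o) => parseDecimal(o, 255) === null)) return false;

      return parseDecimal(parts[1], 32) !== null;
    }
    ```

    This only checks the shape and the numeric ranges. It does not check that the host bits are zero for the given prefix length, which some definitions of a valid CIDR range also require.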
  • 1
    Some things are not suited for regex. It's often used as a hammer when it could be a scalpel.
    Also, implementations vary. If you run a regex in a loop like that, at least use an implementation that lets you compile the regex once and reuse it.
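
    In JavaScript terms, a minimal sketch of "compile once and reuse" could look like this. The date-prefix pattern and the function names are hypothetical, and how much the engine already caches for you varies:

    ```javascript
    // Hypothetical pattern: lines that start with a date like "2024-01-02 ".
    const source = "^\\d{4}-\\d{2}-\\d{2} ";

    // Builds a new RegExp object on every iteration.
    function countDatedNaive(lines) {
      let count = 0;
      for (const line of lines) {
        if (new RegExp(source).test(line)) count++;
      }
      return count;
    }

    // Builds the RegExp once and reuses it for every line.
    // No "g" or "y" flag, so test() keeps no state between calls.
    function countDatedReuse(lines) {
      const dated = new RegExp(source);
      let count = 0;
      for (const line of lines) {
        if (dated.test(line)) count++;
      }
      return count;
    }
    ```

    Both functions return the same count. The second just avoids constructing the same object over and over inside the loop.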