2
kiki
177d

A genuine question: are regexps inherently better than split().replace().trim() and other such chains?

Comments
  • 3
  • 1
    Maybe for mass execution, or if you're writing a parser, but real men write lexers
  • 5
    I'm no expert but I prefer chaining functions. Easier to read, debug, test and maintain.
  • 4
    Yes
    Regex is declarative and short, much easier to maintain or change, and you can get immediate feedback and run multiple test cases in tools such as https://regex101.com/

    Anyone who writes >5 chains or loops for string manipulation (unless for performance) probably has a skill issue
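To make the comparison concrete, here is a minimal sketch of one task solved both ways. The task (turning a messy comma-separated tag list into a dash-separated slug) and the function names are hypothetical, picked only for illustration; nobody in the thread posted this code.

```javascript
// Hypothetical task: "  Foo, bar ,BAZ " -> "foo-bar-baz"

// Variant 1: chained string/array methods only.
function slugWithChain(input) {
  return input
    .trim()
    .toLowerCase()
    .split(",")           // split into raw pieces
    .map((p) => p.trim()) // drop whitespace around each piece
    .join("-");
}

// Variant 2: one regex handles "comma with optional whitespace around it".
function slugWithRegex(input) {
  return input.trim().toLowerCase().replace(/\s*,\s*/g, "-");
}

// A few test cases, in the spirit of pasting samples into a live regex tool.
for (const sample of ["  Foo, bar ,BAZ ", "a,b", "single", ""]) {
  console.log(JSON.stringify(slugWithChain(sample)), JSON.stringify(slugWithRegex(sample)));
}
```

For these inputs both variants agree; which one reads better is exactly what the thread is arguing about.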
  • 5
    Somewhere I saw a benchmark regarding this, and apparently regexes take 50% less time.
  • 3
    Regex is more powerful and cleaner in a code base. There are cases when it's more performant and cases when it's worse.

    Because it's a little esoteric, I usually add a comment above it with an example string it's trying to match. Bonus points if you wrap it in a well named function.
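A rough sketch of that habit, assuming a made-up requirement (parsing dates written as year-month-day). The pattern, the constant name and the function name are all illustrative, not taken from anyone's code base.

```javascript
// Matches strings like "2024-01-31" (four digits, two digits, two digits).
// It only checks the shape: "2024-99-99" also matches.
const ISO_DATE_SHAPE = /^(\d{4})-(\d{2})-(\d{2})$/;

// The well-named wrapper: callers never need to read the pattern itself.
function parseIsoDateShape(text) {
  const match = ISO_DATE_SHAPE.exec(text);
  if (match === null) return null;
  return {
    year: Number(match[1]),
    month: Number(match[2]),
    day: Number(match[3]),
  };
}

console.log(parseIsoDateShape("2024-01-31"));
console.log(parseIsoDateShape("31/01/2024"));
```

The comment carries the example string, and the function name says what the regex is for, so the pattern itself only needs to be read when it has to change.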
  • 0
    Just out of curiosity, assuming that's JavaScript (and valid JavaScript at that), can you call replace on an array?
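For what it's worth, a quick sketch of what happens if the chain from the question is read literally as JavaScript. The sample string is made up; the point is only that `split()` returns an array, and arrays have no `replace` method, so the call has to go through `map()`.

```javascript
const parts = " a-b , c-d ".split(","); // split() returns an array of strings

// Arrays do not have a replace method, so parts.replace(...) would throw a TypeError.
console.log(typeof parts.replace);

// Applying replace() and trim() per element works:
const cleaned = parts.map((p) => p.replace("-", "_").trim());
console.log(cleaned);
```

So the chain in the question is shorthand rather than runnable code, unless `replace()` is called before `split()`.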
  • 2
    rule 1 about regexes: only use them _when. you. need. to._

    rule 2 about regexes: even then, double-check if you _really_ need them.

    regular functions are both _very much_ easier to maintain AND more performant.
  • 5
    @devRancid "regexes are easier to maintain".... i have to disagree zealously.

    unless your regexes are simple enough to qualify as "trivial"
  • 4
    Anyone who says regex is hard to maintain needs to apply a little fucking brain power. You can literally write a custom function that applies a regex query, and guess what? You can name that function WHATEVER YOU WANT dumbass. DO PEOPLE NOT KNOW HOW TO CODE ANYMORE JESUS CHRIST. In terms of performance, it really depends on what operation you are trying to do. Would you use a power tool or a screwdriver to build a house vs screw in a battery cover?
  • 2
    @tosensei have to agree with @devRancid .

    Regex is not hard to maintain. The syntax is esoteric and increases cognitive load to grok, but that's why God invented comments and function names
  • 2
    @tosensei if you do anything non-trivial without regex you essentially have to build your own state machine with temporary variables and branches which sucks infinitely more
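A small sketch of the "build your own state machine" point, using a hypothetical task (pulling optionally negative integers out of free text). Both functions and their names are illustrative assumptions; neither is presented as the best way to do this.

```javascript
// Hand-rolled version: one temporary variable plus branches.
function extractIntsByHand(text) {
  const result = [];
  let current = ""; // sign and digits collected so far
  const flush = () => {
    // A lone "-" is not a number, so only keep tokens that contain digits.
    if (current !== "" && current !== "-") result.push(current);
  };
  for (const ch of text) {
    const isDigit = ch >= "0" && ch <= "9";
    if (isDigit) {
      current += ch;
    } else {
      flush();
      current = ch === "-" ? "-" : ""; // a minus may start the next number
    }
  }
  flush();
  return result;
}

// Regex version: the same rule stated declaratively.
function extractIntsByRegex(text) {
  return text.match(/-?\d+/g) || [];
}

console.log(extractIntsByHand("a-12b3--4 5-"));
console.log(extractIntsByRegex("a-12b3--4 5-"));
```

Even for this small rule the manual version needs a buffer, a flush step and an end-of-input case, which is the overhead the comment above is complaining about.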
  • 1
    Up to 2 chars - I use replace(). Anything more - a regex
  • 4
    @shovethisrant and now please maintain a regex that is 200 characters long.

    for bonus points: one that someone else wrote. like 6-months-ago-you.

    also: don't get me wrong. i love regexes as puzzles. but for almost any real use case there's a better alternative.

    they're a swiss army knife - you can solve everything with it, but nothing as well as with a dedicated tool
  • 1
    @tosensei that's... Not that hard...

    Put all cases with the regex into a live match tool (regexr), find the changes, and fix the offending portion of the match string.

    Also, if you have a 200 character match string, you're likely doing something wrong.
  • 3
    @devRancid if regex is so easy to maintain, why are you literally in the next breath posting a link to help you with said regex then?

    No thanks, i'm sticking with stuff i can actually read.
  • 4
    RegEx: Dependent on size of input and number of repetitions to fulfill the pattern.

    Usually the worst and slowest choice - the longer the string, the worse it gets.

    Reason why most libraries don't use RegEx and rather iterate over the whole string char by char.

    The variance of RegEx performance is also a security risk. ReDoS says hello.

    Most people claim that RegExes are fast and simple...

    But truth be told: They're not.

    One example I can give is NGINX.

    http://nginx.org/en/docs/...

    Many people love to pump NGINX full with regexes, as NGINX even supports Perl named group patterns.

    Then they wonder why their request rate is abysmal and the performance sucks.

    Yeah.... Because that's not what RegExes were made for.

    TLDR: RegExes should never be used if string operations are available. Especially not for strings with a dynamic length.

    The cost and security risk isn't worth it.
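To illustrate the ReDoS concern, a minimal sketch using the textbook nested-quantifier pattern. The pattern, the input and the names are assumptions chosen for the demo; the blow-up applies to backtracking regex engines (the default in most JavaScript runtimes) and to badly shaped patterns, not to every regex.

```javascript
// Nested quantifiers: on an input that almost matches, a backtracking engine
// tries exponentially many ways to split the run of "a"s before giving up.
const riskyPattern = /^(a+)+$/;

// Accepts the same strings without nesting, so there is nothing to backtrack over.
const flatPattern = /^a+$/;

// The char-by-char alternative the comment above alludes to.
function isAllAsByLoop(text) {
  if (text.length === 0) return false;
  for (const ch of text) {
    if (ch !== "a") return false;
  }
  return true;
}

// Kept short on purpose: each extra "a" roughly doubles the work for riskyPattern.
const nearMiss = "a".repeat(18) + "!";

console.log(riskyPattern.test(nearMiss), flatPattern.test(nearMiss), isAllAsByLoop(nearMiss));
```

All three checks give the same answer; the difference is how the cost of the first one grows when an attacker controls the input length.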
  • 1
    All the haters summarised:
  • 4
    Regex is obfuscated code written in another language that is hard to read, write, debug and maintain. It has multiple dialects and you don't know which is supported by your language, so it can't even be shared reliably.
  • 2
    @jestdotty because it's yet another tool, that pulls me out of my flow, even though i have one right in front of my face: The IDE. That's one thing.

    The second is, i usually don't need a manual to read _and understand_ code.

    Otherwise i'd program in Brainfuck for all intents and purposes. It can also accomplish everything you need, and is even easier than regex, because it has only 8 things to remember
    /s

    So yeah, you guys keep using regex, i keep writing readable and usually better performing code in the meantime ;)
  • 3
    @lungdart "Also, if you have a 200 character match string, you're likely doing something wrong."

    yes.

    that's exactly my point.

    at any level above "trivial", it's highly likely that a regex is _not_ the best solution for the problem by far.
  • 1
    @tosensei I catch what you're throwing now.

    I agree.
  • 1
    @tosensei so, if I'm getting it right, I don't need regexes for small tasks because string manipulation functions are more readable, and I don't need regexes for large tasks because I have to write a parser/use a better tool to make it at least tolerably quick?
  • 1
    @kiki it's just one tool like any other.

    it just happens to be versatile, yet not as good as specialised tools.