19

*maniacal laughter*

/(?<digit>\\d)|(?<non_digit>\\D)|(?<alphanumeric>\\w)|(?<non_alphanumeric>\\W)|(?<whitespace>\\s)|(?<non_whitespace>\\S)|(?<horizontal_tab>\\t)|(?<carriage_return>\\r)|(?<linefeed>\\n)|(?<vertical_tab>\\v)|(?<form_feed>\\f)|(?<backspace>\[\\b.*?\])|(?<NUL>\\0)|(?<control_character>\\c[A-Z])/g;

... I need to sleep

Comments
  • 5
    *confused screeching*
  • 5
    What ... is ... that ... ?

    Looks like it's matching characters and assigning them to named captures.

    But ... why ...
  • 6
  • 5
    @magicMirror
    > The post looks exactly as it is supposed to look - there are no problems with its content.
  • 0
    Is that....a regex?
  • 1
    @Ranchu Yes, yes it is.
  • 1
    @magicMirror Not parsing [X]HTML
  • 2
    @Voxera That’s exactly what it’s doing. (Because I’m working on writing a program to take a regular expression and output a more human-readable string description of what the regular expression is matching as a code comment)
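
A minimal sketch of the idea described above, assuming the tool works by running a named-capture regex over another regex's source text and reporting which group matched. The helper name `describeEscapes` and the output format are hypothetical, and only the first six alternatives from the post are used to keep it short:

```javascript
// Subset of the regex from the post: each alternative matches a literal
// backslash followed by a letter, i.e. an escape as it appears in regex source.
const ESCAPES = /(?<digit>\\d)|(?<non_digit>\\D)|(?<alphanumeric>\\w)|(?<non_alphanumeric>\\W)|(?<whitespace>\\s)|(?<non_whitespace>\\S)/g;

// Hypothetical helper: lists each escape found in a regex with a readable name.
function describeEscapes(regex) {
  const found = [];
  for (const match of regex.source.matchAll(ESCAPES)) {
    // Only the alternative that matched has a defined named group.
    const [name] = Object.entries(match.groups).find(([, value]) => value !== undefined);
    found.push(`${match[0]} -> ${name.replace(/_/g, ' ')}`);
  }
  return found;
}

console.log(describeEscapes(/\d+\s\w/));
// [ '\\d -> digit', '\\s -> whitespace', '\\w -> alphanumeric' ]
```

This sketch does not handle an escaped backslash followed by a letter (a literal `\` then `d`), which a real tool would need to tokenize properly.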
  • 0
    @AmyShackles ....is that what I'm gonna be forced to do as a job in the future?
  • 1
    @Ranchu Doubtful. I’m currently unemployed and doing this for fun. 😅
  • 0
    Have you seen the news? Regex101
  • 1
    @010001111 I was going to write a very sarcastic response, but I realize that this is a stressful time for a lot of folks and, for all I know, you’re trying to be well-meaning.

    Yes, I know things like that exist. It would be hard to not know that, considering the amount of research that goes into writing a program like this. I’m working on a program that adds these descriptions as code comments, to make it easier for future developers working on a project to understand what’s going on, since regular expressions are magic for a lot of developers.
  • 3
    @AmyShackles as an avid regex enthusiast I have to mention that although it sounds like a good project and I don’t want to stop you from it, I’m afraid that longer regexes simply shouldn’t exist, at least not in hand-written code.

    You may want to merge multiple regexes programmatically, but the longer a regex is, the harder it is to debug, no matter what comments or documentation there are.

    E.g. when checking an email address, you can simply explode the string into a user part and a domain part and test both of them separately, instead of making a huge regex that matches everything and can’t be easily extended if the spec starts allowing things like, oh say... TLDs using emojis
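
The split-then-test approach suggested above could look roughly like this. The function names and the two deliberately loose per-part patterns are assumptions made for illustration, not an RFC-compliant validator:

```javascript
// Split at the last "@" so each half can be checked on its own.
function splitEmail(address) {
  const at = address.lastIndexOf('@');
  if (at < 1 || at === address.length - 1) return null; // missing user or domain
  return { user: address.slice(0, at), domain: address.slice(at + 1) };
}

// Two small, separately testable checks instead of one huge regex.
// Assumption: "no whitespace, no @" for the user, dot-separated labels for the domain.
const userOk = (user) => /^[^\s@]+$/.test(user);
const domainOk = (domain) => /^[^\s@.]+(\.[^\s@.]+)+$/u.test(domain);

function looksLikeEmail(address) {
  const parts = splitEmail(address);
  return parts !== null && userOk(parts.user) && domainOk(parts.domain);
}

console.log(looksLikeEmail('amy@example.com')); // true
console.log(looksLikeEmail('no-at-sign'));      // false
```

If the rules for one half change, only that half's check needs to be touched.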
  • 0
    🙈@💩👾.🤓
  • 0
    @010001111 Absolutely agree with you re: long regular expressions. :)
  • 0
    I raise you a transpiler regex: /val|var|[1-9]{1,32}|\+|\-|\*|s\/s|>>|=|;|(["'])(?:(?=(\\?))\2.)*?\1|print\(|log\(|sqrt\(|input\(|strToArray\(|httpGet\(|if\(|else|{|}|s==s|s>=s|s<=s|s>s|s<s|s&&s|\|\||!|;|\(|\)|\[|\]| |\w+/gi
  • 0
    @PrivateGER So the capture group containing the backslash and the '?' quantifier inside of the positive lookahead inside of a non-capturing group using a back reference to aforementioned capture group -- that's to ensure you either have 0 backslashes or two? Why use a back reference at that point at all, why not just group two backslashes together with a '?' quantifier?

    (Unless I'm misparsing this bit /(?:(?=(\\?))\2.)/ )
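
One way to probe the question above is to run the quoted-string fragment next to a simpler variant. In the regex literal, `\\?` is a single optional backslash, so each loop iteration consumes either one character or a backslash plus the character after it. Because a lookahead is not re-entered on backtracking, the captured backslash cannot later be given up. The sample strings and variable names below are made up for the comparison:

```javascript
// Fragment from the transpiler regex: lookahead + backreference.
const withLookahead = /(["'])(?:(?=(\\?))\2.)*?\1/g;
// Simpler variant for comparison: a plain optional backslash.
const plainOptional = /(["'])(?:\\?.)*?\1/g;

const sample = String.raw`print("a \"quoted\" word") + 'it\'s'`;
console.log(sample.match(withLookahead));
// [ '"a \\"quoted\\" word"', "'it\\'s'" ]

// A string whose only closing quote is escaped, so it never really terminates.
const unterminated = String.raw`"abc\"`;
console.log(unterminated.match(withLookahead)); // null
console.log(unterminated.match(plainOptional)); // [ '"abc\\"' ]
```

The plain variant backtracks, lets `\\?` match nothing, and treats the escaped quote as the closing one. The lookahead version refuses to do that.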
  • 1
    @010001111 and you should never use a big regex to check an email :)

    Any checking beyond that it has an @ and maybe something more is ultimately going to reject valid email addresses while possibly allowing invalid ones.

    With IDN and the over 20 different RFCs governing email address format, there are just too many alternatives.

    You could probably split the email up into parts and then test each part against multiple regexes to find the right one.

    But I would recommend either buying a component or ditching validation and just requiring that a test email be sent with a link. This not only validates the format but also that it’s actually a working email belonging to the user, something no regex “hopefully” will be able to do.
  • 0
    It's like machine learning without machine learning XD