regex

Ranter

AmyShackles

7099

Comments

2

Ranchonyx

10406

5y

...you're having fun doing this aren't you?
2

AmyShackles

7099

5y

@Ranchu I'm a sick, sick person, clearly. I keep saying I'll go back to being a normal person when I have a job again, but I'm honestly not sure if that's true.
1

AmyShackles

7099

5y

@Demolishun Okay, so the idea is a user passes in a regular expression, then I convert that regular expression to a string, parse it with regex to categorize the different parts of it and add them to an object with the index of each match as the key on the object. After building the object, I'd then go through the object's values in order of key to generate a string describing what the regular expression is matching.

.... I really need to find a better way to explain it. :D
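
For anyone who finds code easier to follow than prose, here is a minimal sketch of the approach described above. It is written in Python purely for illustration and is not the actual project code: the token categories, the `TOKEN_SPEC` table, and the `categorize` and `describe` helpers are all hypothetical names, and it only covers a handful of regex features.

```python
import re

# Hypothetical token categories; names and patterns are illustrative only.
TOKEN_SPEC = [
    ("anchor_start", r"\^"),
    ("anchor_end", r"\$"),
    ("digit", r"\\d"),
    ("word_char", r"\\w"),
    ("whitespace", r"\\s"),
    ("escaped", r"\\."),
    ("char_class", r"\[[^\]]*\]"),
    ("quantifier", r"[*+?]"),
    ("any_char", r"\."),
    ("literal", r"[^\\^$.*+?\[\]]"),
]
TOKENIZER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

FIXED_PHRASES = {
    "anchor_start": "start of input",
    "anchor_end": "end of input",
    "digit": "a digit",
    "word_char": "a word character",
    "whitespace": "a whitespace character",
    "any_char": "any character",
}
QUANTIFIER_PHRASES = {"*": "zero or more times", "+": "one or more times", "?": "optionally"}


def categorize(pattern):
    """Map the index of each match in the regex source to (category, text)."""
    source = pattern.pattern  # the regular expression converted to a string
    return {m.start(): (m.lastgroup, m.group()) for m in TOKENIZER.finditer(source)}


def describe(pattern):
    """Walk the categorized parts in key order and build a description."""
    parts = categorize(pattern)
    phrases = []
    for index in sorted(parts):
        kind, text = parts[index]
        if kind in FIXED_PHRASES:
            phrases.append(FIXED_PHRASES[kind])
        elif kind == "quantifier":
            phrases.append(QUANTIFIER_PHRASES[text])
        elif kind == "char_class":
            phrases.append(f"one of {text}")
        elif kind == "escaped":
            phrases.append(f"the escape {text}")
        else:
            phrases.append(f"the character '{text}'")
    return ", ".join(phrases)


print(describe(re.compile(r"^\d+$")))
```

Keying the object by match index means the parts can be collected in any order and still be read back in the order they appear in the pattern. Groups, alternation, and counted quantifiers like `{2,3}` are left out of this sketch.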
2

RememberMe

13709

5y

@Demolishun it's fairly straightforward - regular expressions are "linear", they have no "memory". They cannot remember past information and act on it.

XML has recursive structures, if you consider <tag> bleh </tag>, by the time you get to the closing tag, you have to "remember" that you opened a tag and match that as now closed. You can parse the above with a regex, but if you have <tag> <tag> bleh </tag> </tag>, the parser now has to "remember" that you opened two tags and have to close two tags for a perfect match. Regexes can't do this. You can hardcode it to work with 2 tags, but then it wouldn't work for 1 or 3 or anything else that's not 2.
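
To make the "hardcode it for 2 tags" point concrete, here is a small sketch using Python's `re` module (my choice for illustration). The `TWO_LEVELS` pattern and the `matches_two_levels` helper are made-up names, and the pattern is deliberately naive.

```python
import re

# Hard-coded for exactly two nested <tag> levels; illustrative only.
TWO_LEVELS = re.compile(r"<tag>\s*<tag>[^<]*</tag>\s*</tag>")


def matches_two_levels(text):
    """Return True only if the whole text is two nested <tag> elements."""
    return TWO_LEVELS.fullmatch(text) is not None


for sample in (
    "<tag> bleh </tag>",
    "<tag> <tag> bleh </tag> </tag>",
    "<tag> <tag> <tag> bleh </tag> </tag> </tag>",
):
    print(sample, "->", matches_two_levels(sample))
```

Only the two-level input matches. Every other depth would need its own pattern, which is exactly the problem.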

Even worse, take <tag> <innertag> bleh </innertag> </tag>. The parser now has to "remember" that it opened a tag first, then an innertag, so it should close an innertag first, then a tag, for a perfect match. Regexes just can't remember this much. They're "stupid" that way.

You can think of a regex matcher as a machine with a single cell as the memory (the "present"). The technical term for this is a finite automaton. Something that matches recursive structures can be modelled by a similar machine, but with a stack ("the past, but in order") for memory that can grow as needed. The technical term for that is a pushdown automaton. On a massively simplified level, anything recursive needs a matcher with more "powerful" memory that can remember more extensive patterns. For this case a stack was enough, but if you generalize and give the machine an infinitely large RAM stick ("the past"), you essentially get the next level up, a Turing machine, which can match even more complicated languages.
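
Here is a rough sketch of the stack idea in Python, assuming simplified tags with no attributes or self-closing forms. The `TAG` pattern and the `is_well_nested` function are hypothetical names for illustration. The regex is only used to find individual tags. The "remembering" is done by the stack.

```python
import re

# Finds one opening or closing tag at a time; group 1 is "/" for closing tags.
TAG = re.compile(r"<(/?)(\w+)>")


def is_well_nested(text):
    """Check that every opened tag is closed in the reverse order of opening."""
    stack = []  # the "past, but in order"
    for closing, name in TAG.findall(text):
        if not closing:
            stack.append(name)  # remember the tag we just opened
        elif not stack or stack.pop() != name:
            return False  # closed something that was not the most recent open tag
    return not stack  # anything left on the stack was never closed


print(is_well_nested("<tag> <innertag> bleh </innertag> </tag>"))
print(is_well_nested("<tag> <innertag> bleh </tag> </innertag>"))
```

Because the stack grows as needed, the same function handles one level, two levels, or any depth without being rewritten.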
1

RememberMe

13709

5y

@RememberMe I should point out that I didn't explicitly show recursion in the examples. Consider what would happen if the "bleh" inside the tags was another set of XML tags. Even worse for the regex since it has to remember more now. Regexes can't match corresponding open-close, let alone ones done recursively. (Two tags is an example of recursion too, since it's an XML structure inside an XML structure).
2

Ranchonyx

10406

5y

Is it possible to learn this power?
3

AmyShackles

7099

5y

@Ranchu I wasn't born with an instinctual knowledge of regular expressions, so yes. :P
0

gitreflog

2357

5y

I can only recommend Retina (https://github.com/m-ender/retina) for larger regex-based projects. It has some amazing features and deals with some of the problems described by @rememberme.
