Phrodo_00@lemmy.world to Programming@programming.dev • 'Don't parse markup languages with Regex' is an annoying trollpost and it should die... right?
4 · 7 months ago
HTML is not even a tree (XHTML is; XML is a type 2, i.e. context-free, grammar). SGML languages like HTML are more similar to tree-adjoining grammars.
For example, <b>This<i>is perfectly</b>valid</i> html.
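To make the overlap concrete, here is a rough sketch using Python's standard-library html.parser (my choice of tool, nothing canonical). The NestingChecker class and its open_tags / misnested attributes are made-up names for illustration. The parser just emits tag events in order and does not enforce nesting, so a simple stack is enough to show that </b> arrives while <i> is still the innermost open tag:

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Hypothetical helper: tracks open tags on a stack and records
    every close tag that does not match the innermost open tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of currently open tag names
        self.misnested = []   # close tags that arrived out of order

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            # Properly nested: closes the innermost open tag.
            self.open_tags.pop()
        else:
            # Overlapping tags: note it, then drop the tag if it was open.
            self.misnested.append(tag)
            if tag in self.open_tags:
                self.open_tags.remove(tag)


checker = NestingChecker()
checker.feed("<b>This<i>is perfectly</b>valid</i> html")
checker.close()
print(checker.misnested)  # ['b']
```

A strict tree builder would have to reject or repair that input; the stack here only reports where the tree shape breaks.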
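The original grep grew out of ed (not sed, which came later): it was basically ed's regex search pulled out into a standalone tool. That's why it's called grep: g/re/p is the ed command that does the same thing, i.e. globally match a regular expression and print the matching lines.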
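The g/re/p idea (go over every line, match a regular expression, print the hits) fits in a few lines. This is only a sketch in Python, not how grep is actually implemented; the grep function and the sample buffer are invented for illustration, and Python's re syntax is not the same as the classic editor's regex dialect:

```python
import re


def grep(pattern, lines):
    """Return the lines containing a match for pattern, in the spirit of g/re/p."""
    regex = re.compile(pattern)
    # "g": visit every line; "re": test the regex; "p": keep (print) the hits.
    return [line for line in lines if regex.search(line)]


buffer = ["foo bar", "baz", "barn owl"]  # stand-in for the editor's text buffer
matches = grep(r"bar", buffer)
print(matches)  # ['foo bar', 'barn owl']
```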