C# - Match nested HTML tags
In a C# app, I want to match every HTML "font" tag that has a "color" attribute.
I have the following text:
1<font color="red">2<font color="blue">3</font>4</font>56
and I want a MatchCollection containing the following items:
[0] <font color="red">234</font>
[1] <font color="blue">3</font>
But when I use this code:
Regex.Matches(result, "<font color=\"(.*)\">(.*)</font>");
the MatchCollection I get is the following one:
[0] <font color="red">2<font color="blue">3</font>4</font>
How can I get the MatchCollection I want using C#?
Thanks.
Regex on "HTML" is an antipattern. Just don't do it.
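To see why, here is a minimal sketch that runs the pattern from the question against the sample text. The class and method names (GreedyDemo, Run) are made up for illustration, and the lazy variant of the pattern is my own addition, not something from the question. The greedy `.*` swallows everything up to the last `">` and the last `</font>`, so the nested tag never becomes a match of its own. Switching to lazy quantifiers does not help either, because a regex like this has no notion of which closing tag belongs to which opening tag.

```csharp
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // Sample text and pattern taken from the question.
    public const string Input =
        "1<font color=\"red\">2<font color=\"blue\">3</font>4</font>56";
    public const string GreedyPattern = "<font color=\"(.*)\">(.*)</font>";

    // Hypothetical lazy variant, shown only to illustrate that it fails too.
    public const string LazyPattern = "<font color=\"(.*?)\">(.*?)</font>";

    // Returns all matches of the given pattern in the sample text.
    public static MatchCollection Run(string pattern)
    {
        return Regex.Matches(Input, pattern);
    }
}
```

GreedyDemo.Run(GreedyDemo.GreedyPattern) gives a single match covering both tags, and its first group is `red">2<font color="blue` rather than `red`. GreedyDemo.Run(GreedyDemo.LazyPattern) also gives a single match, `<font color="red">2<font color="blue">3</font>`, which pairs the outer opening tag with the inner closing tag.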
To steer you onto the right path, look at what you can do with the HTML Agility Pack:
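// Requires: using HtmlAgilityPack; using System.Linq;
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(@"1<font color=""red"">2<font color=""blue"">3</font>4</font>56");
// Every <font> element, nested ones included, in document order.
var fontElements = doc.DocumentNode.Descendants("font");
// Clone each element and replace its content with its plain text,
// which flattens any nested tags.
var newNodes = fontElements.Select(fe =>
{
    var newNode = fe.Clone();
    newNode.InnerHtml = fe.InnerText;
    return newNode;
});
var collection = newNodes.Select(n => n.OuterHtml);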
Now, in collection we have the following strings:
<font color="red">234</font>
<font color="blue">3</font>
Mmm... lovely.
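If adding the HTML Agility Pack is not an option, the same idea can be sketched with System.Xml.Linq from the base class library. This is a swapped-in technique, not the approach above, and it only works under a big assumption: the input must be well-formed XML, which real-world HTML often is not. The names FontFlattener and Flatten are hypothetical, as is the synthetic root wrapper element.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FontFlattener
{
    // Returns each <font> element with its nested markup replaced by plain text.
    // Assumes the fragment is well-formed XML; malformed input makes Parse throw.
    public static List<string> Flatten(string fragment)
    {
        // The fragment has text outside any element, so wrap it in a synthetic root.
        XElement root = XElement.Parse("<root>" + fragment + "</root>");

        return root.Descendants("font")
            // Copy the attributes and use the concatenated text as the only content.
            .Select(fe => new XElement("font", fe.Attributes(), fe.Value).ToString())
            .ToList();
    }
}
```

Calling FontFlattener.Flatten on the sample text from the question yields the same two strings as the HTML Agility Pack version. For anything that is not guaranteed to be well-formed, a real HTML parser remains the safer choice.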