asp.net - How to write rich text to a Word document generated from an .htm file in C#


I am trying to generate a Word document from a saved HTML file using the Open XML library. If the HTML file does not contain an image, I can use the code below to write the text content to the Word document.

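    using HtmlAgilityPack; // provides HtmlDocument and HtmlNode

    HtmlDocument doc = new HtmlDocument();
    doc.Load(filename); // filename is the .htm file
    string detail = string.Empty;
    string webdata = string.Empty;
    // Select the <body> element and take only its text content
    HtmlNode hcollection = doc.DocumentNode.SelectSingleNode("//body");
    detail = hcollection.InnerText;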

But if the HTML file contains an embedded image, I am struggling to include the image in the Word document.

Using hcollection.InnerText writes only the text part and excludes the image.

When I use

    HtmlNode hcollection = doc.DocumentNode.SelectSingleNode("//body");
    detail = hcollection.InnerHtml;

all the HTML tags are written to the Word document, along with the path of the image in the <img> tag:

    <table border='0' width='100%' cellpadding='0' cellspacing='0' align='center'>
      <tr><td valign='top' align="left">
        <div style='width:100%'>
          <div id="div_img">
            <div>
              <img src="http://www.myweb.com/web/img/2013/07/18/img_1.jpg">
              <span>sample text</span>
            </div>
          </div>
          <br><br>sample text content here<br><br>
        </div>
      </td></tr>
    </table>

How can I remove the HTML tags so that, instead of the path being shown for

    <img src="http://www.myweb.com/web/img/2013/07/18/img_1.jpg">

the corresponding picture gets loaded?

Please help.

You'll need to look at the HTML and translate it into OpenXML somehow.

I've used the HtmlToOpenXml open-source library (license), and it works well enough. It should handle images (inline, local, or remote) and correctly insert them into the OpenXML document. I submitted a patch that was accepted, so the project is still active.
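As a rough illustration of that approach, here is a minimal sketch of feeding the whole HTML file to HtmlToOpenXml instead of extracting InnerText or InnerHtml by hand. It assumes the DocumentFormat.OpenXml and HtmlToOpenXml packages are referenced and that the library version in use exposes HtmlConverter.ParseHtml (the 1.x and 2.x releases do; newer releases may prefer a different entry point). The class name HtmlToDocx, the method name Convert, and both file paths are hypothetical.

```csharp
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using HtmlToOpenXml;

// Hypothetical helper, not part of either library.
public static class HtmlToDocx
{
    public static void Convert(string htmlPath, string docxPath)
    {
        // Read the saved .htm file as a single string.
        string html = File.ReadAllText(htmlPath);

        using (WordprocessingDocument package =
            WordprocessingDocument.Create(docxPath, WordprocessingDocumentType.Document))
        {
            // A new package needs a main part with an empty body
            // before the converter can append anything to it.
            MainDocumentPart mainPart = package.AddMainDocumentPart();
            mainPart.Document = new Document(new Body());

            // The converter walks the HTML and appends the resulting
            // OpenXML elements to the body of mainPart.
            HtmlConverter converter = new HtmlConverter(mainPart);
            converter.ParseHtml(html);

            mainPart.Document.Save();
        }
    }
}
```

A call such as HtmlToDocx.Convert("page.htm", "page.docx") would then produce the document. Remote images like the one in the question can only be embedded if the machine running the code can reach the image URL.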

There are limitations to the library, though:

JavaScript (<script>), CSS (<style>), <meta>, and other unsupported tags do not generate an error but are ignored.

It will handle inline style information, but it entirely ignores other CSS, which I needed. I ended up integrating simple parsing of a single <style> element from another open-source project (JsonFx, using the MIT license).

Note: handling multiple <style> elements, downloading CSS files, and sorting out which style rules have precedence are problems I did not address.
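To give a feel for what "simple parsing of a single <style> element" can involve, here is a naive sketch that splits the text of one style block into selectors and their declarations. It is not the JsonFx-based code mentioned above; the class name NaiveStyleParser and its Parse method are hypothetical, and the regular expression deliberately ignores comments, @media blocks, and values that contain braces or semicolons.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: a very small subset of CSS, for illustration only.
public static class NaiveStyleParser
{
    // Matches "selector { declarations }" with no nested braces.
    private static readonly Regex RulePattern =
        new Regex(@"([^{}]+)\{([^{}]*)\}", RegexOptions.Compiled);

    public static Dictionary<string, Dictionary<string, string>> Parse(string css)
    {
        var rules = new Dictionary<string, Dictionary<string, string>>(
            StringComparer.OrdinalIgnoreCase);

        foreach (Match rule in RulePattern.Matches(css))
        {
            string selector = rule.Groups[1].Value.Trim();
            if (!rules.TryGetValue(selector, out var declarations))
            {
                declarations = new Dictionary<string, string>(
                    StringComparer.OrdinalIgnoreCase);
                rules[selector] = declarations;
            }

            // Each declaration looks like "property: value".
            foreach (string declaration in rule.Groups[2].Value.Split(';'))
            {
                int colon = declaration.IndexOf(':');
                if (colon <= 0) continue; // skip empty or malformed pieces

                string property = declaration.Substring(0, colon).Trim();
                string value = declaration.Substring(colon + 1).Trim();
                if (property.Length > 0)
                {
                    // A later declaration of the same property overwrites
                    // the earlier one; no specificity rules are applied.
                    declarations[property] = value;
                }
            }
        }
        return rules;
    }
}
```

The resulting dictionary could be used to copy matching declarations into each element's inline style attribute before conversion, since inline styles are the only CSS the library reads. Precedence between selectors is left unresolved here, just as in the note above.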

