Comments by Cansid (Top 3 by date)
Cansid
23-Sep-15 5:05am
You can use the alternative format chunk ("altChunk") feature of DOCX files to achieve this.
A DOCX document can contain placeholders, called altChunks, that reference an HTML file stored inside the DOCX file itself.
You can read more about this, and about how to do it with the Open XML SDK, at the following link:
How to Use altChunk for Document Assembly
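As a rough illustration of the altChunk idea with the Open XML SDK, here is a minimal sketch. It assumes the DocumentFormat.OpenXml NuGet package is referenced and that the target DOCX already has a main document part with a body. The class name `AltChunkSketch`, the method `AppendHtml`, and the relationship id `AltChunkId1` are made up for this example and do not come from the linked article.

```csharp
using System.IO;
using System.Linq;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class AltChunkSketch
{
    // Stores an HTML string inside an existing DOCX and references it with an altChunk.
    // chunkId is an arbitrary relationship id (hypothetical default); it must be unique in the part.
    public static void AppendHtml(string docxPath, string html, string chunkId = "AltChunkId1")
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, true))
        {
            MainDocumentPart mainPart = doc.MainDocumentPart;

            // Add the HTML to the package as an alternative format import part.
            AlternativeFormatImportPart chunkPart =
                mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.Html, chunkId);
            using (var htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(html)))
                chunkPart.FeedData(htmlStream);

            // Put the altChunk placeholder before the trailing section properties, if there are any.
            Body body = mainPart.Document.Body;
            var altChunk = new AltChunk { Id = chunkId };
            SectionProperties sectPr = body.Elements<SectionProperties>().LastOrDefault();
            if (sectPr != null)
                body.InsertBefore(altChunk, sectPr);
            else
                body.Append(altChunk);

            mainPart.Document.Save();
        }
    }
}
```

A call such as `AltChunkSketch.AppendHtml("report.docx", "<html><body><p>Hello</p></body></html>");` (the file name is only a placeholder) would add the chunk at the end of the document body.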
You can also find other approaches that import altChunks without using the Open XML SDK:
Appending HTML and RTF content to the DOCX with MadMilkman.Docx
HTML as a Source for a DOCX File
Note that this approach has a drawback: it does not convert the HTML itself but relies on MS Word to do the conversion when the document is opened. Until a document that contains altChunk elements has been opened in MS Word, it will not have "normal" (WordprocessingML markup) content.
If you need a "real" conversion, then you can try the approach from this article:
Convert HTML to / from Word document in C# and VB.NET
This article uses a .NET library for Word processing.
Cansid
23-Sep-15 4:39am
Sharma, I know this is probably too late to help you, but I want to add to Sergey's answer regarding DOCX to TXT.
First, in case you or anyone else is planning to use Office Interop on the server side, I would encourage you to reconsider; it will definitely result in a lot of headaches...
Second, extracting the DOCX content into a TXT file is not that complicated (unlike with PDF files), and you can find solutions that do not require any third-party libraries (like the Open XML SDK, which is not a lightweight library at all...).
For example, see this CodeProject article:
Find Text in Word Documents
It uses only the System.IO.Packaging and System.Xml namespaces, and all you need in order to use it is the following code (plus the two accompanying classes, DocxReader and DocxToStringConverter):
using (var stream = File.Open(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    string docxText = new DocxToStringConverter(stream).Convert();
    // Do something with DOCX text ...
}
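To give a feel for why no third-party library is needed, here is a minimal sketch of the same idea. It is not the article's DocxReader or DocxToStringConverter: it swaps in System.IO.Compression and System.Xml.Linq instead of System.IO.Packaging, and it assumes the main part lives at the conventional path `word/document.xml`. The class name `DocxTextSketch` is made up for this example.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Xml.Linq;

public static class DocxTextSketch
{
    // WordprocessingML main namespace.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the text of the main document part, one line per paragraph.
    public static string ExtractText(Stream docxStream)
    {
        // A DOCX file is a ZIP package; the body text is stored as XML inside it.
        using (var zip = new ZipArchive(docxStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            // Assumption: the main part is at the conventional location.
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found.");

            XDocument xml;
            using (Stream part = entry.Open())
                xml = XDocument.Load(part);

            var text = new StringBuilder();
            foreach (XElement paragraph in xml.Descendants(W + "p"))
            {
                // Concatenate the w:t text runs; tabs and line breaks are ignored in this sketch.
                foreach (XElement run in paragraph.Descendants(W + "t"))
                    text.Append(run.Value);
                text.AppendLine();
            }
            return text.ToString();
        }
    }
}
```

It can be called with the same kind of stream as above, for example `string docxText = DocxTextSketch.ExtractText(stream);`. Headers, footers, and footnotes live in other parts and are not read, and paragraphs nested inside text boxes would be emitted more than once, so treat it as a starting point only.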
Cansid
22-Sep-15 6:07am
Abdul, here is something that I'm using:
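// Replace with the HTML you want to convert; leaving this null would make GetBytes throw.
string htmlText = "<html><body><p>Hello</p></body></html>";
var inputOptions = LoadOptions.HtmlDefault;
var outputOptions = SaveOptions.PdfDefault;
// Load the HTML from an in-memory stream and save it as PDF to the HTTP response.
using (var htmlStream = new MemoryStream(inputOptions.Encoding.GetBytes(htmlText)))
    DocumentModel.Load(htmlStream, inputOptions)
        .Save(this.Response, outputOptions);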
The code I used is from this article on converting HTML into a PDF with C#. You may notice that both the input HTML and the output PDF can be a physical file or a stream. Also notice that I passed Response to the Save method; this directly exports the PDF to the ASP.NET client.