Question
Friday, November 20, 2009 7:05 AM
Hi Everyone,
What I want to do is insert some HTML content into a .xlsx file. How can I implement this with the OpenXML SDK?
Thanks for any advice.
All replies (11)
Monday, November 23, 2009 9:52 AM
Hi, awar,
Thanks for your question.
However, could you please provide more information about your requirement? Do you want to insert some HTML-formatted rich text into a .xlsx cell? Or do you want to extract some data from an HTML page and load it into a .xlsx file? Or something else? Some examples would help us understand your idea :)
Thanks,
Raymond
Monday, November 23, 2009 8:48 PM
Sounds like you want to convert html to openxml. This is not a feature of the OpenXML specification.
Under "Open XML format SDK Limitations"
"Does not provide functionality to convert Open XML Formats to and from other formats, like HTML or XPS. "
Though, for what it's worth there is a codeplex project going from OpenXML > HTML. Having used the translator myself and knowing the limitations of CSS I don't see this being a professionally viable solution anytime in the next 5 years.
Good coding involves knowing one's logical limits and expanding them as necessary.
Monday, November 23, 2009 10:29 PM
How much fidelity are you looking to preserve when converting HTML to SpreadsheetML? In other words, what type of content are you trying to convert? If you do not have strict requirements it may be pretty easy to write your own converter (not based on XSLT, but rather on the Open XML SDK). Check out Eric White's blog here.
Zeyad Rajabi (MS)
Tuesday, November 24, 2009 9:44 AM
Hi Raymond,
Thanks a lot for your answer:)
My request is the first of your options: I've managed to extract HTML content from a .html page and want to save it into a .xlsx file, so that the content is displayed in the .xlsx file the same way as in IE.
Thank you very much for your help.
Regards,
awar
Tuesday, November 24, 2009 10:15 AM
Hi Brian,
Thanks for your answer.
Yes, my difficulty is that I cannot find a good way to translate HTML content with CSS into OpenXML so that it can be added as SpreadsheetML objects, nor an alternative way to achieve this.
Regards,
awar
Tuesday, November 24, 2009 10:21 AM
Hi Zeyad,
My HTML has no strict requirements, but the content should be displayed in the .xlsx file the same way it appears in Internet Explorer. That's the real goal in this case.
Thanks & regards,
awar
Wednesday, November 25, 2009 5:12 AM
Hi, awar
As far as I know, the Open XML SDK cannot do this job on its own.
The SDK can put the HTML content (at least the text-based content) into a cell; however, it cannot make it display the same way as in IE.
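To illustrate the "text only" case: a .xlsx file is a ZIP package of XML parts, so the idea can be sketched without the SDK at all. The Python below is only a minimal sketch under assumptions, not the SDK's API. The helper names `html_to_text` and `write_xlsx` are hypothetical, the package holds a bare-bones set of parts that should be enough for a single-sheet workbook, and every tag is discarded, so none of the IE rendering survives.

```python
import io
import zipfile
from html.parser import HTMLParser
from xml.sax.saxutils import escape


class TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment; all markup is dropped."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Hypothetical helper: strip tags and collapse runs of whitespace."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


PKG = "http://schemas.openxmlformats.org/package/2006"
DOC = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
CT = "application/vnd.openxmlformats-officedocument.spreadsheetml"

# Fixed parts of a one-sheet workbook.
PARTS = {
    "[Content_Types].xml":
        f'<Types xmlns="{PKG}/content-types">'
        '<Default Extension="rels" ContentType='
        '"application/vnd.openxmlformats-package.relationships+xml"/>'
        '<Default Extension="xml" ContentType="application/xml"/>'
        f'<Override PartName="/xl/workbook.xml" ContentType="{CT}.sheet.main+xml"/>'
        f'<Override PartName="/xl/worksheets/sheet1.xml" ContentType="{CT}.worksheet+xml"/>'
        '</Types>',
    "_rels/.rels":
        f'<Relationships xmlns="{PKG}/relationships">'
        f'<Relationship Id="rId1" Type="{DOC}/officeDocument" Target="xl/workbook.xml"/>'
        '</Relationships>',
    "xl/workbook.xml":
        f'<workbook xmlns="{MAIN}" xmlns:r="{DOC}">'
        '<sheets><sheet name="Sheet1" sheetId="1" r:id="rId1"/></sheets>'
        '</workbook>',
    "xl/_rels/workbook.xml.rels":
        f'<Relationships xmlns="{PKG}/relationships">'
        f'<Relationship Id="rId1" Type="{DOC}/worksheet" Target="worksheets/sheet1.xml"/>'
        '</Relationships>',
}


def write_xlsx(target, text):
    """Hypothetical helper: write `text` into cell A1 as an inline string.

    `target` may be a file path or a writable binary file object.
    """
    sheet = (
        f'<worksheet xmlns="{MAIN}"><sheetData><row r="1">'
        f'<c r="A1" t="inlineStr"><is><t>{escape(text)}</t></is></c>'
        '</row></sheetData></worksheet>'
    )
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as pkg:
        for name, xml in PARTS.items():
            pkg.writestr(name, xml)
        pkg.writestr("xl/worksheets/sheet1.xml", sheet)


if __name__ == "__main__":
    write_xlsx("out.xlsx", html_to_text("<p>Total: <b>42</b> &amp; rising</p>"))
```

The bold markup around "42" is simply lost here, which is the limitation described above: getting the plain text into a cell is easy, reproducing the browser's formatting is not.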
To make the content display as expected, some other tool may help, for example a VBA script.
Thanks,
Raymond
Wednesday, November 25, 2009 6:21 AM
Hi Raymond,
Thanks for your reply again.
So I'm no longer expecting to implement this with the OpenXML SDK alone. Thanks for your advice.
Regards,
awar
Wednesday, November 25, 2009 6:25 AM
Hi, awar
You are welcome.
Thanks a lot for your feedback!
I wish you well with your tasks.
Thanks,
Raymond
Wednesday, November 25, 2009 2:05 PM
awar, Perhaps I wasn't clear enough in my first post. I highly recommend you take some time to review the OpenXML sdk. OpenXML is the standard by which Excel 2007 stores data. There are three clear ways you can approach this problem.
1. Write a translator for OpenXML to HTML.
2. Use DocumentReflector (part of OpenXML sdk download) to generate the C# for a SpreadSheet and edit the generated contents programmatically.
3. Use another method to read and write to an Excel Template such as oledb queries.
#1 is going to require a team of experts and a great deal of time. If that is not available to you then #2 or #3 are the only options. I recommend you try #2 first as it only requires the editing of an existing (generated) codebase. You have to create an Excel Template by hand first, then generate the C# with the DocumentReflector. Then, you can go through the generated code and look for parts that you want to programmatically access. With the C# at hand it should be easy for a developer to extract the pieces of data that one requires. Then generate the HTML and stick the data in there or write a simple content management system to display it in a browser.
Good coding involves knowing one's logical limits and expanding them as necessary.
Thursday, November 26, 2009 2:07 AM
Hi Brian,
Many thanks for your suggestions. The solution of coding based on DocumentReflector generated code looks like a good choice, I'll try it.
By the way, I've learned from a wrapper package that formats can be defined and applied in code, much like CSS for an HTML page. (ExcelPackage Formatted File Creation)
I'm not sure if it's feasible. If formats can be defined that way, my idea is to parse my HTML content into tables (it mainly consists of tables) and CSS text, then insert the content into .xlsx cells with the corresponding CSS classes.
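The parsing half of that idea might look roughly like the Python sketch below. It is only an illustration under assumptions: `TableGrid`, `parse_inline_style` and `html_table_to_grid` are hypothetical names, only inline `style` attributes are read (CSS classes and stylesheets are ignored), and nested tables, `rowspan` and `colspan` are not handled. Mapping the collected properties onto SpreadsheetML cell styles would still have to be written separately.

```python
from html.parser import HTMLParser


def parse_inline_style(style):
    """Turn 'font-weight: bold; color: #f00' into a dict of CSS properties."""
    props = {}
    for declaration in style.split(";"):
        name, sep, value = declaration.partition(":")
        if sep and name.strip():
            props[name.strip().lower()] = value.strip()
    return props


class TableGrid(HTMLParser):
    """Collects table cells as rows of (text, style_dict) tuples."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # (text_parts, style) while inside a <td>/<th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            style = parse_inline_style(dict(attrs).get("style") or "")
            if tag == "th":
                # Assumption: header cells render bold, as browsers do by default.
                style.setdefault("font-weight", "bold")
            self._cell = ([], style)

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, style = self._cell
            text = " ".join("".join(parts).split())
            self.rows[-1].append((text, style))
            self._cell = None


def html_table_to_grid(html):
    """Hypothetical helper: return the table as a list of rows of cells."""
    parser = TableGrid()
    parser.feed(html)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    sample = (
        "<table><tr><th>Item</th><th>Qty</th></tr>"
        '<tr><td style="color: #ff0000">Apples</td><td>3</td></tr></table>'
    )
    for row in html_table_to_grid(sample):
        print(row)
```

Each distinct style dictionary could then be assigned one entry in the workbook's styles part and referenced from the cells, which is roughly the role CSS classes play in the HTML page.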
Please take a look at this approach if possible. Thanks very much.
Regards,
awar