itextsharp render html source and display in a paragraph

Question

Saturday, April 18, 2009 8:58 AM

Hi experts,

I'm trying to export my website as a PDF, split into chapters based on the site navigation.

I'm currently using iTextSharp.

The problem is this:

1           Dim Row As Data.DataRow = Nothing
2            Dim section_search As Integer = Nothing
3    
4            Dim page_name As String = Nothing
5            Dim file_name As String = Nothing
6            Dim FK_section_mapping As Integer = Nothing
7            Dim hide_navigation As Boolean = Nothing
8            Dim publishing_control As Integer = Nothing
9            Dim page_content As String = Nothing
10   
11           Dim sectionFont As Font = FontFactory.GetFont(FontFactory.HELVETICA, 16, Font.NORMAL, New Color(0, 0, 255))
12           Dim subsectionFont As Font = FontFactory.GetFont(FontFactory.HELVETICA, 12, Font.BOLD, New Color(0, 64, 64))
13   
14           For Each Row In pagesDS.Tables(0).Rows
15   
16               section_search = Row.Item("FK_section_mapping")
17   
18               If KEY_section_mapping = section_search Then
19   
20                   page_name = Row.Item("page_name")
21                   file_name = Row.Item("file_name")
22                   hide_navigation = Row.Item("hide_navigation")
23                   publishing_control = Row.Item("publishing_control")
24                   page_content = Row.Item("page_content")
25   
26                   Dim sub_chapter As New Paragraph(page_name, sectionFont)
27   
28                   Dim sub_chapter_section As Section = current_chapter.AddSection(sub_chapter, 2)
29   
30                   ''this is the content for this section
31                   Dim sub_chapter_content As New Paragraph(Server.HtmlDecode(page_content), subsectionFont)
32                   sub_chapter_section.Add(sub_chapter_content)
33   
34               End If
35           Next

On line 31, page_content holds HTML source code. I want it rendered as HTML and displayed in the PDF the way a web browser would render it.

eg...

<p>Latest ssss</p>
<p><img height="133" alt="" width="200" src="/content_files/100_4292.jpg" /></p>
<p><span style="font-family: SimHei">x + y update</span></p> 

The aim is that when users export to PDF, the formatting is retained.

Currently, the PDF content for my section is just being output as raw HTML source.

Does anyone know how to fix this?

All replies (9)

Saturday, July 11, 2009 8:41 AM ✅ Answered

I have found my solution with duo pdf:

http://www.duodimension.com/html_pdf_asp.net/component_html_pdf.aspx


Saturday, April 18, 2009 2:55 PM

 Which version of Visual Studio are you using? 2005 or 2008?


Saturday, April 18, 2009 3:45 PM

 

 Which version of Visual Studio are you using? 2005 or 2008?

 VS 2008


Saturday, April 18, 2009 4:14 PM

 You will need to post the whole of the class. (Preferably without the line numbers)


Sunday, April 19, 2009 12:33 AM

 

 You will need to post the whole of the class. (Preferably without the line numbers)

 here is the entire code behind 

Imports MySql.Data.MySqlClient
Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports iTextSharp.text.html

Partial Class pdf_icecore
    Inherits System.Web.UI.Page

    Dim sectionDS As New Data.DataSet
    Dim pagesDS As New Data.DataSet

    Protected Sub btn_pdf_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles btn_pdf.Click

        Session("current_site") = 8
        Session("key1") = "server=localhost...etc.."
        build_pdf_site()

    End Sub

    Sub build_pdf_site()

        get_section_data()
        get_page_data()

        Dim PK_section_mapping As Integer = Nothing
        Dim site_display_name As String = Nothing
        Dim hyperlink_text_value As String = Nothing

        Dim Row As Data.DataRow = Nothing
        Dim template_search As Integer = Nothing

        '' create our response so it displays the pdf
        Response.Clear()

        '' step 1: creation of a document-object and set margins
        Dim document As New Document(PageSize.A4, 50, 50, 50, 50)
        '' meta data for the pdf
        document.AddTitle(site_display_name)
        document.AddSubject("This example explains step 6 in Chapter 1")
        document.AddKeywords("Metadata, iText, step 6, tutorial")
        document.AddCreator("My program using iText#")
        document.AddAuthor("Bruno Lowagie")
        document.AddHeader("Expires", "0")

        '' step 2: we create a writer that listens to the document
        Dim writer As PdfWriter = PdfWriter.GetInstance(document, Response.OutputStream)
        writer.SetPdfVersion(PdfWriter.PDF_VERSION_1_7)
        
        '' step 3: we open the document
        document.Open()

        Dim chapterFont As Font = FontFactory.GetFont(FontFactory.HELVETICA, 24, Font.NORMAL, New Color(255, 0, 0))

        Dim i As Integer = 1
        For Each Row In sectionDS.Tables(0).Rows
            PK_section_mapping = Row.Item("PK_section_mapping")
            site_display_name = Row.Item("site_display_name")
            hyperlink_text_value = Row.Item("hyperlink_text_value")

            '' step 4: we add content to the document
            '''''''''' create the main chapter
            Dim title_chapter As New Paragraph(hyperlink_text_value, chapterFont)

            Dim chapter As New Chapter(title_chapter, i)
            chapter.NumberDepth = 0
           
            Dim title_chapter_sub_text As New Paragraph("text under chapter heading")
            chapter.Add(title_chapter_sub_text)

            generate_sub_chapter(PK_section_mapping, chapter)

            ''''''''''add all the generated sub chapter(s) to the document as a new chapter
            document.Add(chapter)
            i = i + 1
        Next

        '' step 5: we close the document
        document.Close()

        '''''''''' output the file for users to download
        Response.ContentType = "application/pdf"
        Response.AddHeader("Content-Disposition", "attachment; filename=""" & site_display_name & ".pdf""")
        Response.End()

    End Sub

    Sub generate_sub_chapter(ByVal KEY_section_mapping As Integer, ByVal current_chapter As Chapter)

        Dim Row As Data.DataRow = Nothing
        Dim section_search As Integer = Nothing

        Dim page_name As String = Nothing
        Dim FK_section_mapping As Integer = Nothing
        Dim page_content As String = Nothing

        Dim sectionFont As Font = FontFactory.GetFont(FontFactory.HELVETICA, 16, Font.NORMAL, New Color(0, 0, 255))
        Dim subsectionFont As Font = FontFactory.GetFont(FontFactory.HELVETICA, 12, Font.BOLD, New Color(0, 64, 64))

        For Each Row In pagesDS.Tables(0).Rows

            section_search = Row.Item("FK_section_mapping")

            If KEY_section_mapping = section_search Then

                page_name = Row.Item("page_name")
                page_content = Row.Item("page_content") '<this is all html markup

                ''''''''''sub chapters name
                Dim sub_chapter As New Paragraph(page_name, sectionFont)
                Dim sub_chapter_section As Section = current_chapter.AddSection(sub_chapter, 2)
 
                ''''''''''this is the content for this section'''''' want to render the html here and show in it the section
                Dim sub_chapter_content As New Paragraph(Server.HtmlDecode(page_content), subsectionFont)
                sub_chapter_section.Add(sub_chapter_content)

            End If
        Next

    End Sub

    Sub get_section_data()

        Dim connectionString As String = Session("key1")
        Dim connectionObject As New MySql.Data.MySqlClient.MySqlConnection(connectionString)
        Dim adapter As New MySqlDataAdapter()
        Dim commandText As String = "prc_pdf_sections"
        Dim command As New MySqlCommand(commandText, connectionObject)
        Dim parameter As MySqlParameter = Nothing

        command.CommandType = System.Data.CommandType.StoredProcedure
        parameter = New MySqlParameter
        parameter.Direction = System.Data.ParameterDirection.Input
        command.Parameters.AddWithValue("?PK_sites_COL", Session("current_site"))

        adapter.SelectCommand = command
        adapter.Fill(sectionDS, "itemTable")
        adapter.SelectCommand.Connection.Close()

    End Sub

    Sub get_page_data()

        Dim connectionString As String = Session("key1")
        Dim connectionObject As New MySql.Data.MySqlClient.MySqlConnection(connectionString)
        Dim adapter As New MySqlDataAdapter()
        Dim commandText As String = "prc_pdf_pages"
        Dim command As New MySqlCommand(commandText, connectionObject)
        Dim parameter As MySqlParameter = Nothing

        command.CommandType = System.Data.CommandType.StoredProcedure
        parameter = New MySqlParameter
        parameter.Direction = System.Data.ParameterDirection.Input
        command.Parameters.AddWithValue("?FK_sites_COL", Session("current_site"))

        adapter.SelectCommand = command
        adapter.Fill(pagesDS, "itemTable")
        adapter.SelectCommand.Connection.Close()

    End Sub

End Class

 


Sunday, April 19, 2009 11:50 AM

 You will need to post the whole of the class. (Preferably without the line numbers)

 

any ideas?


Thursday, April 23, 2009 4:03 AM

this is what the pdf looks like once generated,

http://www.speedyshare.com/578698156.html

http://www.flyupload.com/get?fid=143363661

http://www.sendspace.com/file/n4df9s


Friday, May 1, 2009 3:09 PM

I don't use iTextSharp. However, using iText (the Java library), I have been unable to parse HTML with HTMLWorker and add the result to a Paragraph while keeping the intended HTML formatting. Adding the parsed elements directly to the Document does work.

NOTE -- you'll want to make certain that your HTML is well-formed. If the quotes (") around a tag's attribute values are missing, the tag won't be rendered (ex: <h4 height=10> vs. <h4 height="10"/>)
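
One rough way to catch this kind of problem before handing the markup to HTMLWorker is to run each fragment through a strict XML parser first. The sketch below is not part of iText; it uses only the JDK's built-in parser, and the class name HtmlFragmentCheck and the wrapping <root> element are my own assumptions for illustration. Note that a strict XML check is harsher than HTML: named entities such as &nbsp; will also be rejected, so treat a failure as a hint to inspect the markup, not as proof that iText will drop it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper (not an iText class): checks whether an HTML fragment
// is well-formed in the XML sense, e.g. all attribute values are quoted
// and all tags are closed.
final class HtmlFragmentCheck {

    static boolean isWellFormed(String htmlFragment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow parse errors instead of printing them to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            // Wrap in a single root element so a fragment with several
            // top-level tags (<p>..</p><p>..</p>) can still be parsed.
            String wrapped = "<root>" + htmlFragment + "</root>";
            builder.parse(new InputSource(new StringReader(wrapped)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the markup: not well-formed.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML parser unavailable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<h4 height=\"10\"/>")); // true
        System.out.println(isWellFormed("<h4 height=10>"));      // false
    }
}
```

You could call isWellFormed on each page's content before parsing it and log the pages that fail, so you know which ones to clean up by hand.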

:: Putting to a paragraph won't work
PdfPTable table = new TableData(2,100f).getTable();
float[] widths = {50f,50f};
table.setWidths(widths);
table.addCell("list of values");
Paragraph para = new Paragraph();
StringReader reader = new StringReader("<ul><li>one</li><li>two</li></ul>");
ArchivePageUtil.parseHTML(para, reader);
table.addCell(para);
doc.add(table);

:: Putting directly on the document will work ::
StringReader reader = new StringReader("<ul><li>one</li><li>two</li></ul>");
ArchivePageUtil.parseHTML(doc, reader);

Here's the helper methods ArchivePageUtil
public StyleSheet defaultStyleSheet = new StyleSheet();

public static void parseHTML(Paragraph para, Reader reader) {
    try {
        ArrayList arrayList = HTMLWorker.parseToList(reader, ArchiveConstants.defaultStyleSheet);
        //add the elements to the paragraph
        for (int k = 0; k < arrayList.size(); ++k) {
            para.add((Element) arrayList.get(k));
        }
    } catch (IOException ioe) {
        //TODO: log the exception and give back some ApplicationException
        Logger.exception(ArchivePageUtil.class, ioe, "parseHTML");
    }
}

public static void parseHTML(Document doc, Reader reader) {
    try {
        ArrayList arrayList = HTMLWorker.parseToList(reader, ArchiveConstants.defaultStyleSheet);
        //add the elements to the document
        for (int k = 0; k < arrayList.size(); ++k) {
            doc.add((Element) arrayList.get(k));
        }
    } catch (IOException ioe) {
        Logger.exception(ArchivePageUtil.class, ioe, "parseHTML");
    } catch (DocumentException de) {
        Logger.exception(ArchivePageUtil.class, de, "parseHTML");
    }
}

:: There's also an example in the iText book where you can put the HTML in a column object instead of a PdfPTable

I haven't done this (adding to the doc was good enough for me), but you could write your own HTMLWorker class and use that. HTMLWorker doesn't appear to be "supported" anymore, but you have the source code, so you can change the way it currently works. If you are able to fix it, give it back to the community.

Scott


Friday, May 1, 2009 5:19 PM

Thanks Scott,

apologies, as I'm not very good with C#.

Would you be able to modify my code to use your method?