Question
Monday, September 8, 2008 7:10 AM
Hi,
I need to read and parse a PDF file that has 50,000 pages. The "Save As" option in Acrobat Reader is not of much use. The PDF file contains records in columns, and I need to be able to move those records into a database format.
Therefore, what I am looking for is a set of APIs that can open a PDF file and read its data.
Any help would be appreciated.
Regards,
Vibhu Bansal
http://www.ITSYSSolutions.com
All replies (2)
Monday, September 8, 2008 7:45 AM ✅ Answered
http://www.accusoft.com/products/imagegear/igpro/pdf.asp?WT.srch=1
http://www.adobe.com/devnet/acrobat/
hope this helps
Shimmy
Friday, September 12, 2008 5:22 AM
Vibhu Bansal said:
Therefore, what I am looking for is a set of APIs that can open a PDF file and read its data.
Hi Vibhu,
Here is one document for you to check:
http://www.codeproject.com/KB/string/pdf2text.aspx
It covers parsing PDF files in .NET using PDFBox and IKVM.NET (managed code).
The PDFBox library allows creation of new PDF documents, manipulation of existing documents, and extraction of content from documents. PDFBox also includes several command-line utilities.
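Once the text has been extracted (with PDFBox or any other tool), the columnar records still have to be split into fields before they can go into a database. The following is only a minimal sketch of that step in Python, not something taken from the linked article. It assumes the extracted text keeps one record per line with columns separated by two or more spaces, which may not hold for your file. The function names, the `records` table, and the `c0`, `c1`, ... column names are hypothetical.

```python
import re
import sqlite3


def parse_columns(text, min_gap=2):
    """Split each non-empty line into fields wherever min_gap or more
    whitespace characters occur in a row (assumed column separator)."""
    splitter = re.compile(r"\s{%d,}" % min_gap)
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(splitter.split(line))
    return rows


def load_rows(conn, rows, n_cols):
    """Insert rows that have exactly n_cols fields into a hypothetical
    'records' table; return the rows that were skipped for manual review."""
    cols = ", ".join(f"c{i} TEXT" for i in range(n_cols))
    conn.execute(f"CREATE TABLE IF NOT EXISTS records ({cols})")
    good = [r for r in rows if len(r) == n_cols]
    bad = [r for r in rows if len(r) != n_cols]
    placeholders = ", ".join("?" * n_cols)
    conn.executemany(f"INSERT INTO records VALUES ({placeholders})", good)
    conn.commit()
    return bad


# Made-up sample standing in for text extracted from one PDF page.
sample = """\
1001   John Smith     New York
1002   Jane Doe       Boston

Page 1 of 50000
"""

conn = sqlite3.connect(":memory:")
skipped = load_rows(conn, parse_columns(sample), n_cols=3)
```

Lines that do not yield the expected number of fields (page footers, wrapped cells) are returned rather than inserted, so they can be inspected separately. For a 50,000-page file it is probably safer to extract and load one page or one batch of pages at a time than to hold the whole text in memory.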
This response contains a reference to a third party World Wide Web site. Microsoft is providing this information as a convenience to you. Microsoft does not control these sites and has not tested any software or information found on these sites; therefore, Microsoft cannot make any representations regarding the quality, safety, or suitability of any software or information found there. There are inherent dangers in the use of any software found on the Internet, and Microsoft cautions you to make sure that you completely understand the risk before retrieving any software from the Internet.
Best regards,
Martin Xie