Apache PDFBox

Software Details:
Version: 2.0.0
Upload Date: 9 Apr 16
Developer: Ben Litchfield
Distribution Type: Freeware

Apache PDFBox is a Java library that gives Java applications access to all of the components of a PDF document.

FontBox and JempBox are also available for download.

What is new in this release:

  • OverlayPDF logic should be moved into a library class
  • Load document error for two RegisSTAR documents
  • TestFilters is non-deterministic
  • PDFCloneUtility does not handle COSStreamArray
  • RubberStampWithImage should support more image types
  • Make TestImageIOUtils optional in 1.8 for Fedora packaging
  • Support for multipage TIFFs in CCITTFactory, makes PDFBox capable of doing tiff2pdf
  • Added PDFBox version to the title
  • COSDocument and PDDocument declare throws IOException when they don't
  • Added unit test for RandomAccessFileOutputStream

What is new in version 1.8.1:

  Bug Fixes:
  • PDGraphicsState class receives null page argument leading to NPE
  • Content of annotation not visible in image (converted from PDF)
  • TextPosition.getX() and getY() do not work properly with CropBox
  • TTFSubFont generates bug-prone TTF sub fonts screwing some printers
  • Merging PDFs with interactive forms results in a corrupt PDF
  • Saving a document containing an XFA form creates an invalid PDF
  • NonSequentialPDFParser incorrectly parsing document info
  • Unused PDSignature class should be removed
  • Error when using monospaced Fonts

What is new in version 1.7.1:

  • Change the wrapped exception to extend Exception and pass the wrapped exception for more standard/better printout of wrapped exceptions
  • Only parsing object streams if they are referenced by the xref table/stream
  • Stream parsing of BaseParser should fall back to scanning if length value is wrong
  • Reduce the memory consumption of a RandomAccessBuffer
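
The stream-parsing fallback listed above can be pictured roughly as follows. This is a minimal sketch, not PDFBox's BaseParser code: the class StreamLengthFallback and its methods are hypothetical names, and it assumes the whole file is already held in a byte array. It trusts the declared /Length only if the endstream keyword follows the declared end of the data (after optional end-of-line bytes); otherwise it scans forward for the keyword.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of PDFBox. */
final class StreamLengthFallback {
    private static final byte[] ENDSTREAM =
            "endstream".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if keyword occurs in data starting exactly at pos. */
    static boolean matchesAt(byte[] data, int pos, byte[] keyword) {
        if (pos < 0 || pos + keyword.length > data.length) {
            return false;
        }
        for (int i = 0; i < keyword.length; i++) {
            if (data[pos + i] != keyword[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Finds the offset of the "endstream" keyword that closes a stream.
     *
     * @param data           the file contents
     * @param start          offset of the first byte of stream data
     * @param declaredLength the value read from the stream's /Length entry
     * @return offset of "endstream", or -1 if the keyword is not found
     */
    static int findEndstream(byte[] data, int start, int declaredLength) {
        // First try: trust /Length if it stays inside the file.
        if (declaredLength >= 0 && declaredLength <= data.length - start) {
            int pos = start + declaredLength;
            // Skip the optional end-of-line bytes before the keyword.
            while (pos < data.length && (data[pos] == '\r' || data[pos] == '\n')) {
                pos++;
            }
            if (matchesAt(data, pos, ENDSTREAM)) {
                return pos;
            }
        }
        // Fallback: /Length was wrong, so scan forward for the keyword.
        for (int pos = start; pos + ENDSTREAM.length <= data.length; pos++) {
            if (matchesAt(data, pos, ENDSTREAM)) {
                return pos;
            }
        }
        return -1;
    }
}
```

A plain scan like this can stop early if the binary stream data happens to contain the bytes "endstream", so a real parser would likely need further checks; the sketch only shows the order of the two strategies.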

What is new in version 1.7.0:

  • CJK decoding
  • Integration of a PDF/A validator in PDFBox
  • Implement type 4 functions (PDFunctionType4)
  • Color conversion for PDJpegs using a DeviceN colorspace
  • Added "Save as image" to PDFReader
  • Allow subclassing of PDFParser
  • Added support to set a start and/or end page when splitting a PDF
  • Support CIDToGIDMap of CID-Type2 fonts
  • Split PDFont#encode
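
Type 4 functions in PDF are small PostScript calculator programs: the inputs are pushed onto a stack, the operators in the braces run in order, and whatever remains on the stack is the output. The sketch below illustrates that idea only. MiniType4 is a hypothetical toy class, not PDFBox's PDFunctionType4; it assumes whitespace-separated tokens, supports just seven operators, and ignores nested procedures (if/ifelse) as well as domain and range clipping.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical toy evaluator for illustration; not PDFBox's PDFunctionType4. */
final class MiniType4 {

    /** Runs a program such as "{ 2 mul 1 add }" on the given inputs. */
    static double[] eval(String program, double... inputs) {
        String text = program.trim();
        if (!text.startsWith("{") || !text.endsWith("}")) {
            throw new IllegalArgumentException("program must be wrapped in { }");
        }
        String body = text.substring(1, text.length() - 1).trim();

        // Inputs go onto the stack in order, so the last input is on top.
        Deque<Double> stack = new ArrayDeque<>();
        for (double input : inputs) {
            stack.push(input);
        }
        if (!body.isEmpty()) {
            for (String token : body.split("\\s+")) {
                apply(token, stack);
            }
        }
        // The remaining stack is the result, bottom element first.
        double[] out = new double[stack.size()];
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = stack.pop();
        }
        return out;
    }

    private static void apply(String token, Deque<Double> stack) {
        switch (token) {
            case "add": { double b = stack.pop(); double a = stack.pop(); stack.push(a + b); break; }
            case "sub": { double b = stack.pop(); double a = stack.pop(); stack.push(a - b); break; }
            case "mul": { double b = stack.pop(); double a = stack.pop(); stack.push(a * b); break; }
            case "div": { double b = stack.pop(); double a = stack.pop(); stack.push(a / b); break; }
            case "dup": { double a = stack.pop(); stack.push(a); stack.push(a); break; }
            case "exch": { double b = stack.pop(); double a = stack.pop(); stack.push(b); stack.push(a); break; }
            case "pop": { stack.pop(); break; }
            default: stack.push(Double.parseDouble(token)); // numeric literal
        }
    }
}
```

For example, MiniType4.eval("{ 2 mul 1 add }", 3.0) doubles the input and adds one. An unknown operator name fails with a NumberFormatException because it is treated as a malformed number, which is acceptable for a sketch but not for a real interpreter.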

What is new in version 1.6.0:

  Improvements:
  • PDF signing interface and improvements
  • Can't extract b/w images from PDF
  • Create Type1C font metrics only when necessary
  • Skip PS XObjects instead of throwing an exception
  • Add optional debug output to ExtractText
  • Unnecessary filling new array with zeros in RandomAccessBuffer::write(byte[], int, int)
  • Unnecessary using intermediate ByteArrayInputStream to copy from given byte array to OutputStream in FlateFilter::decode
  • Remove imageIO dependency (was: PDPage convertToImage bug creates white images from black and white pdf files.)
  • Signing improvement (settable signature size)
  • PDF Version not read in the document catalog
  • Unit tests for PDFBox features
  Bug Fixes:
  • Rotated images aren't placed and rendered correctly while converting PDF pages to images
  • CLONE -convertToImage seems to invert colors
  • Convert to image makes blank image
  • PDF 2 Tiff conversion is not happening properly
  • RandomAccessBuffer returns wrong values for single byte reads, patch attached
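
The last fix above concerns single-byte reads. The listing does not say what the defect was, so the following is only an illustration of one common way such reads go wrong in Java: returning a signed byte directly, so that values above 127 come back negative and 0xFF becomes indistinguishable from the -1 end-of-data marker. TinyBuffer is a hypothetical class, not PDFBox's RandomAccessBuffer.

```java
/** Hypothetical in-memory buffer for illustration; not PDFBox's RandomAccessBuffer. */
final class TinyBuffer {
    private final byte[] data;
    private int position;

    TinyBuffer(byte[] data) {
        this.data = data.clone();
    }

    /**
     * Returns the next byte as a value in 0..255, or -1 at end of data
     * (the same convention as java.io.InputStream.read()).
     */
    int read() {
        if (position >= data.length) {
            return -1;
        }
        // Mask off the sign extension: without "& 0xFF", (byte) 0xFF would be returned as -1.
        return data[position++] & 0xFF;
    }
}
```

With the mask in place, a caller can loop until read() returns -1 without stopping early on a 0xFF data byte.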

What is new in version 1.5.0:

  Improvements:
  • PDFDocument.save is really slow
  • Read non-conforming PDFs (attached) without throwing ...
  • Added NPE protection which occurred when reading corrupt PDFs
  • Avoid using temporary files in PDJpeg
  • Don't use temporary files by default for all PDF sizes
  Bug Fixes:
  • Error on text extraction: java.lang.IndexOutOfBoundsException
  • PDFTextStripper not handling some Japanese
  • NPE NullPointerException in PDPageNode.getCount
  • CFFParser.readCharset java.lang.IllegalArgumentException
  • Failed to create Type1C font. Falling back to Type1 font
  • PDFont fails to close Font File.
  • NPE in PDPageNode
  • PDFStreamEngine.processEncodedText fails on UTF-16 text
  • ExtractText china pdf, but pdfbox distinguish Korea, The ...
  • Text not extracted with PDFBox 1.4
  • Wrong extracted text using PDFBox 1.4
  • Lost whitespaces when extracting Arabic text
  • Extracting Japanese characters gives garbage
  • Image quality improvements
  • PDFBox must not depend on platform encoding
  • RandomAccessBuffer should be created empty
  • ExtractText returns junk
  • getParent method of class PDField doesn't consider both ...
  • Text extraction slow and /tmp fills up with AWT font files
  • Null Pointer Exception when Annotation is missing the Subtype

What is new in version 1.3.1:

  New Features:
  • CID to Unicode mapping
  • Find encodings in FontFile3 - CompactFont Format
  • Add utility class to easily extract a range of pages from a PDF
  • PDFToImage : add the ability to select the area to export ...
  • Add WriteDecodedDoc to standalone app
  Improvements:
  • Additional CMap files from Adobe
  • Handle JPEG2000 images via JPXDecode filter
  • Please accommodate '-' where a number is expected
  • Implementation of additional CMAP Formats for TrueType fonts
  • Access to metadata keys in the PD model
  • Update/adjust used junit version
  • Update/reactivate ant build
  • Objects from streams overwrite objects already read with ..
  • Better handle out of spec PDFs
  • Add ability to ignore errors with AcroForms
  • PDPixelMap is too verbose
  • Better handle corrupt/missing %%EOF flags at the end of a file
  • Improved handling of erroneous data between endstream and ...
  • Remove dependency on PageDrawer from text only operators
  • Support TIFF predictor 2 with FlateDecode, patch included
  • Increase performance of ColorSpaceCMYK.toRGB, patch attached
  Bug Fixes:
  • Problems with text extraction from Polish documents.
  • Indexed color images have wrong colors after encryption
  • Exception in text extraction
  • PDFMergerUtility may create non-unique AcroForm field names
  • Sometimes, TextPosition has incorrect value ..
  • Text Extraction strips 1 char when extracting a twin pair
  • Incorrect text for Exolab.pdf in Regression Test
  • Improper text produced depending on font for ...
  • PDFBox can't parse PDF documents from jstor.org
  • testextract failure on Linux and Mac OS X
  • Last characters in a line overlap when a PDF is printed
  • Invalid text rendering while printing a PDF
  • Re-setting filled properties of PDDocumentInformation do ...
  • EXCEPTION_ACCESS_VIOLATION in fontmanager.so/fontmanager.dll
  • PDChoiceField's implementation of SetValue does not work ...
  • CMap parser doesn't work for double byte mappings with ...
  • PrintPDF does not take the windows default printer ...
  • Error by text extraction
  • Text extraction from PDF generated from MS Word fails
  • scratchfile ignored in PDDocument load( File file, ...
  • Extracted ASCII text in CJK document is malformed
  • PDTrueTypeFont.loadTTF() freezes (at TTFDataStream.java:195)
  • Problem in extracting roman page numbers [PDPageLabels.java]
  • ClassCastException: COSInteger cannot be cast to COSDictionary
  • PDFont.getEncodingManager is not thread safe; FIX included
  • Wrong handling of PNG predictors with FlateDecode, patch ...
  • Wrong opacity for images with indexed color space
  • Spaces disappear and text is shifted left
  • IIOException: Error 2 when displaying PDF containing CCITT ...
  • Write2File Fails for PDCalRGB
  • Use COSName constant instead of COSString
  • Umlauts font size calculation problem
  • [pdfbox-app] maven-bundle-configuration problem
  • Documentation: prominent example has out-of-date class name
  • AFM-files aren't loaded
  • TextExtraction mixes case of text
  • PageDrawer does not take the full CropBox into account
  • Define a standard encoding for the standard 14 fonts
  • Indexed images are sometimes corrupted when encrypting the PDF
  • OutOfMemoryError in text extraction tests
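
One of the improvements listed for 1.3.1 is support for TIFF predictor 2 with FlateDecode. Predictor 2 is horizontal differencing: within each row, every sample is stored as the difference from the same colour component of the pixel to its left, and decoding adds the values back. The sketch below shows the decoding step for the 8-bits-per-component case only. TiffPredictor2 is a hypothetical class, not PDFBox's implementation, and it assumes the data has already been inflated and that rows carry no padding.

```java
/** Hypothetical helper for illustration; not PDFBox's predictor code. */
final class TiffPredictor2 {

    /**
     * Undoes TIFF predictor 2 for 8-bit components.
     *
     * @param data    inflated, still-differenced sample data
     * @param columns pixels per row
     * @param colors  components per pixel (1 for gray, 3 for RGB, ...)
     * @return a new array holding the reconstructed samples
     */
    static byte[] decode(byte[] data, int columns, int colors) {
        int rowLength = columns * colors;
        if (rowLength <= 0 || data.length % rowLength != 0) {
            throw new IllegalArgumentException("data is not a whole number of rows");
        }
        byte[] out = data.clone();
        for (int rowStart = 0; rowStart < out.length; rowStart += rowLength) {
            // The first pixel of each row is stored as-is; every later sample
            // is a delta relative to the sample `colors` bytes earlier.
            for (int i = colors; i < rowLength; i++) {
                int index = rowStart + i;
                // The cast to byte gives the required modulo-256 wrap-around.
                out[index] = (byte) (out[index] + out[index - colors]);
            }
        }
        return out;
    }
}
```

For a grayscale row stored as 10, 2, 3, 0 this reconstructs 10, 12, 15, 15. Other bit depths need the samples unpacked first, which the sketch leaves out.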
