PDF to byte array and vice versa


Java 7 introduced Files.readAllBytes(), which can read a PDF into a byte[] like so:

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    Path pdfPath = Paths.get("/path/to/file.pdf");
    byte[] pdf = Files.readAllBytes(pdfPath);

EDIT:

Thanks to Farooque for pointing out that this works for reading any kind of file, not just PDFs. All files are ultimately just a sequence of bytes, and as such can be read into a byte[].
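
For the "vice versa" half of the question, the same class has a matching Files.write() method. Here is a minimal sketch of the round trip; the class name PdfRoundTrip, the method names, and both paths are placeholders made up for illustration, not anything from the question.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class; the name is only for illustration.
final class PdfRoundTrip {

    // Reads the whole file into memory as raw bytes.
    static byte[] toBytes(Path source) throws IOException {
        return Files.readAllBytes(source);
    }

    // Writes the bytes back out unchanged, creating or overwriting the target file.
    static void toFile(byte[] data, Path target) throws IOException {
        Files.write(target, data);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder paths; substitute real ones.
        byte[] pdf = toBytes(Paths.get("/path/to/file.pdf"));
        toFile(pdf, Paths.get("/path/to/copy.pdf"));
    }
}
```

Since both calls hold the entire file in memory, this is probably only sensible for files that comfortably fit in the heap.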


You basically need a helper method to read a stream into memory. This works pretty well:

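    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;

    public static byte[] readFully(InputStream stream) throws IOException
    {
        byte[] buffer = new byte[8192];
        ByteArrayOutputStream baos = new ByteArrayOutputStream();

        int bytesRead;
        // read() returns the number of bytes placed in buffer, or -1 at end of stream
        while ((bytesRead = stream.read(buffer)) != -1)
        {
            baos.write(buffer, 0, bytesRead);
        }
        return baos.toByteArray();
    }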

Then you'd call it with:

    import java.io.FileInputStream;

    public static byte[] loadFile(String sourcePath) throws IOException
    {
        InputStream inputStream = null;
        try
        {
            inputStream = new FileInputStream(sourcePath);
            return readFully(inputStream);
        }
        finally
        {
            // always release the file handle, even if reading fails
            if (inputStream != null)
            {
                inputStream.close();
            }
        }
    }

Don't mix up text and binary data - it only leads to tears.
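
To make that warning concrete, here is a small sketch of what can happen when binary data is pushed through a String. The class name TextVsBinary and the sample bytes are invented for the demo; UTF-8 is just one example charset, and other charsets may mangle the data in different ways.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the name and sample bytes are for illustration only.
final class TextVsBinary {

    // Pushes raw bytes through a String and back, the way text-oriented code would.
    static byte[] throughString(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "%PDF" followed by two bytes that are not valid UTF-8.
        byte[] raw = {0x25, 0x50, 0x44, 0x46, (byte) 0xFF, (byte) 0xFE};
        byte[] mangled = throughString(raw);
        System.out.println(Arrays.equals(raw, mangled)); // prints false
    }
}
```

The invalid bytes are silently replaced during decoding, so the array that comes back is no longer the one that went in.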


The problem is that you are calling toString() on the InputStream object itself. This returns a String representation of the InputStream object, not the contents of the PDF document.
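
A quick way to see this for yourself is the sketch below. It uses an in-memory ByteArrayInputStream as a stand-in for the file stream so that it runs without a real PDF; the class name StreamToStringDemo is made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

// Hypothetical demo class; the name is for illustration only.
final class StreamToStringDemo {

    static String describe(InputStream stream) {
        // InputStream does not override Object.toString(), so this yields
        // "ClassName@hash" and never reads anything from the stream.
        return stream.toString();
    }

    public static void main(String[] args) {
        // Stand-in data: the four bytes of "%PDF".
        InputStream stream = new ByteArrayInputStream(new byte[] {0x25, 0x50, 0x44, 0x46});
        // Prints something like java.io.ByteArrayInputStream@1b6d3586 (the hash varies).
        System.out.println(describe(stream));
    }
}
```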

PDF is a binary format, so you want to read it strictly as bytes. You can then write that same byte array back out and it will still be a valid PDF, because nothing has modified it.

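For example, to read a file as bytes:

    File file = new File(sourcePath);
    // file.length() is a long, so it must be cast (this limits the approach to files under 2 GB)
    byte[] bytes = new byte[(int) file.length()];
    try (DataInputStream inputStream = new DataInputStream(new FileInputStream(file))) {
        // unlike a single read(), readFully() keeps reading until the array is full
        inputStream.readFully(bytes);
    }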