AWS SDK file upload to S3 via Node/Express using stream PassThrough - file is always corrupt


Multer is the way to go.

It provides a few different modes, but as far as I could tell, you have to write a custom storage handler in order to access the underlying Stream, otherwise it's going to buffer all the data in memory and only callback once it's done.
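For reference, a Multer storage engine is just an object exposing `_handleFile` and `_removeFile`. A minimal sketch (the `NullStorage` name is mine, and it deliberately uploads nothing) shows the contract that the S3 version below fills in:

```javascript
// Minimal sketch of the Multer storage-engine interface.
// A real engine would pipe file.stream somewhere useful (e.g. S3);
// this one just counts bytes to illustrate the contract.
class NullStorage {
    _handleFile(req, file, cb) {
        let size = 0;
        file.stream.on('data', chunk => { size += chunk.length; });
        // Whatever object you pass to cb() gets merged into req.file
        file.stream.on('end', () => cb(null, { size }));
        file.stream.on('error', cb);
    }

    _removeFile(req, file, cb) {
        // Nothing was persisted, so there is nothing to clean up
        cb(null);
    }
}
```

The key point is that `file.stream` is handed to you before the body has been read, so you decide whether it gets buffered or piped straight through.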

If you check req.file in your route handler, Multer would normally expose the full file contents as a Buffer under the buffer field. With this custom storage handler it's absent, since nothing is passed along in the callback, so I'm reasonably confident the data is streaming rather than being buffered in memory.

Below is a working solution.

Note: parse.single('image') is passed into the route handler; 'image' is the multipart field name I used.

const aws = require('aws-sdk');
const stream = require('stream');
const express = require('express');
const router = express.Router();
const multer = require('multer');

const AWS_ACCESS_KEY_ID = "XXXXXXXXXXXXXXXXXXXX";
const AWS_SECRET_ACCESS_KEY = "superSecretAccessKey";
const BUCKET_NAME = "my-bucket";
const BUCKET_REGION = "us-east-1";

const s3 = new aws.S3({
    region: BUCKET_REGION,
    accessKeyId: AWS_ACCESS_KEY_ID,
    secretAccessKey: AWS_SECRET_ACCESS_KEY
});

const uploadStream = key => {
    let streamPass = new stream.PassThrough();
    let params = {
        Bucket: BUCKET_NAME,
        Key: key,
        Body: streamPass
    };
    let streamPromise = s3.upload(params, (err, data) => {
        if (err) {
            console.error('ERROR: uploadStream:', err);
        } else {
            console.log('INFO: uploadStream:', data);
        }
    }).promise();
    return {
        streamPass: streamPass,
        streamPromise: streamPromise
    };
};

class CustomStorage {
    _handleFile(req, file, cb) {
        let key = req.query.file_name;
        let { streamPass, streamPromise } = uploadStream(key);
        file.stream.pipe(streamPass);
        streamPromise
            .then(() => cb(null, {}))
            .catch(err => cb(err)); // surface upload failures to Multer
    }

    _removeFile(req, file, cb) {
        // Nothing is written locally, so there is nothing to clean up
        cb(null);
    }
}

const storage = new CustomStorage();
const parse = multer({ storage });

router.post('/upload', parse.single('image'), async (req, res) => {
    try {
        res.status(200).send({ result: 'Success!' });
    } catch (e) {
        console.log(e);
        res.status(500).send({ result: 'Fail!' });
    }
});

module.exports = router;

Update: A Better Solution

The Multer based solution I provided above is a bit hacky. So I took a look under the hood to see how it worked. This solution just uses Busboy to parse and stream the file. Multer is really just a wrapper for this with some disk I/O convenience functions.

const aws = require('aws-sdk');
const express = require('express');
const Busboy = require('busboy');
const router = express.Router();

const AWS_ACCESS_KEY_ID = "XXXXXXXXXXXXXXXXXXXX";
const AWS_SECRET_ACCESS_KEY = "superSecretAccessKey";
const BUCKET_NAME = "my-bucket";
const BUCKET_REGION = "us-east-1";

const s3 = new aws.S3({
    region: BUCKET_REGION,
    accessKeyId: AWS_ACCESS_KEY_ID,
    secretAccessKey: AWS_SECRET_ACCESS_KEY
});

function multipart(request) {
    return new Promise((resolve, reject) => {
        const headers = request.headers;
        const busboy = new Busboy({ headers });
        // you may need to add cleanup logic using 'busboy.on' events
        busboy.on('error', err => reject(err));
        busboy.on('file', function (fieldName, fileStream, fileName, encoding, mimeType) {
            const params = {
                Bucket: BUCKET_NAME,
                Key: fileName,
                Body: fileStream
            };
            s3.upload(params).promise()
                .then(() => resolve())
                .catch(err => reject(err)); // reject so the route returns a 500
        });
        request.pipe(busboy);
    });
}

router.post('/upload', async (req, res) => {
    try {
        await multipart(req);
        res.status(200).send({ result: 'Success!' });
    } catch (e) {
        console.log(e);
        res.status(500).send({ result: 'Fail!' });
    }
});

module.exports = router;


As far as I can tell, Postman is behaving as it should: the "text-injection" is actually part of a web standard (multipart/form-data boundaries), used to identify and demarcate files on upload. Please see this MDN Web Doc as well as this one for why.

It's actually injecting that part regardless of the file type:

let streamPass = new stream.PassThrough();

// adding this
const chunks = [];
streamPass.on('data', (chunk) => chunks.push(chunk));
streamPass.on('end', () => {
    const body = Buffer.concat(chunks).toString();
    console.log(chunks, chunks.length);
    console.log('finished', body);  // <-- see it here
});
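For context on what that extra text is: a multipart/form-data body (RFC 7578) always wraps each field in boundary lines plus a Content-Disposition header. Building one by hand (the boundary value and field/file names here are arbitrary examples of mine) makes the "injected" framing obvious:

```javascript
// Hand-built multipart/form-data body (boundary chosen arbitrarily)
// to show the framing a client such as Postman adds around a file.
const boundary = '----ExampleBoundary1234';
const fileContents = 'hello world';

const body = [
    `--${boundary}`,
    'Content-Disposition: form-data; name="image"; filename="hello.txt"',
    'Content-Type: text/plain',
    '',
    fileContents,
    `--${boundary}--`,
    ''
].join('\r\n');

console.log(body);
```

Everything outside `fileContents` is the framing; a multipart parser (Busboy, Multer) is what strips it back off, which is why piping the raw request into S3 corrupts the file.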

I tried several ways to control/change this, with no luck on a simple method. From the Postman end, I don't think this is a setting that can be changed, and from the Node.js end it's possible, but the solution would most likely be clunky/complicated, which I suspect you don't want. (I could be wrong, though.)

Given the above, I'll join @relief.melone in recommending multer as a simple solution.

If you'd like to use multer with streams, try this (I've indicated where I made changes to your code):

// const uploadStream = (key) => {
const uploadStream = (key, mime_type) => {      // <- adding the mimetype
    let streamPass = new stream.PassThrough();
    let params = {
        Bucket: BUCKET_NAME,
        Key: key,
        Body: streamPass,
        ACL: 'public-read', // <- you can remove this
        ContentType: mime_type  // <- adding the mimetype
    };
    let streamPromise = s3.upload(params, (err, data) => {
        if (err) {
            console.error("ERROR: uploadStream:", err);
        } else {
            console.log("INFO: uploadStream:", data);
        }
    }).promise();
    return {
        streamPass: streamPass,
        streamPromise: streamPromise
    };
};

// router.post("/upload", async (req, res) => {
router.post("/upload", multer().single('file'), async (req, res) => {      // <- we're adding multer
    try {
        let key = req.query.file_name;

        // === change starts here
        // console.log(req.file); // <- uncomment this if you want to see the file
        let { streamPass, streamPromise } = uploadStream(key, req.file.mimetype);   // adding the mimetype
        var bufferStream = new stream.PassThrough();
        bufferStream.end(req.file.buffer);
        bufferStream.pipe(streamPass); // no longer req.pipe(streamPass);
        // === change ends here

        await streamPromise;
        res.status(200).send({ result: "Success!" });
    } catch (e) {
        console.log(e);
        res.status(500).send({ result: "Fail!" });
    }
});