
Ingest email attachments on ElasticSearch


In the end I defined a completely different pipeline. I read emails using a Ruby application with the mail library (you can find it on GitHub), which makes it quite easy to extract attachments. Then I send the base64 encoding of those attachments directly to Elasticsearch, using the Ingest Attachment Processor.
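As a rough illustration of the Elasticsearch side, the sketch below builds an ingest pipeline with the attachment processor and sends it over HTTP. The field name `data`, the pipeline name `email_attachments`, the index name `emails`, and the localhost URL are all assumptions made for the example, not values from my setup.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Pipeline body: the attachment processor decodes the base64 string stored
# in `field` (assumed here to be "data") and extracts text and metadata.
def attachment_pipeline_body(field = 'data')
  {
    description: 'Extract text from base64-encoded email attachments',
    processors: [{ attachment: { field: field } }]
  }
end

# Sends `body` as a JSON PUT request to es_url + path and returns the response.
def put_json(es_url, path, body)
  uri = URI.join(es_url, path)
  request = Net::HTTP::Put.new(uri, 'Content-Type' => 'application/json')
  request.body = JSON.generate(body)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
end

# Example calls (not executed here; URL, pipeline and index names are placeholders):
#   put_json('http://localhost:9200', '/_ingest/pipeline/email_attachments',
#            attachment_pipeline_body)
#   put_json('http://localhost:9200', '/emails/_doc/1?pipeline=email_attachments',
#            { data: encoded })
```

The second call indexes a document through the pipeline, so the extracted text should end up in the `attachment` field of the stored document.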

I filter on content_type to make sure only "real" attachments are loaded, since multipart emails treat any multimedia content in the body (e.g. images) as an attachment.
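A minimal sketch of such a filter, assuming an allow-list approach. The list of MIME types and the helper name `real_attachments` are hypothetical; the helper only needs objects that respond to `mime_type`, which the parts returned by `message.attachments` in the mail library do.

```ruby
# Hypothetical allow-list; adjust to the document types you want indexed.
ALLOWED_MIME_TYPES = %w[
  application/pdf
  application/msword
  text/plain
].freeze

# Returns only the attachments whose MIME type is in the allow-list.
# Inline images and other body media are dropped because their types
# (image/png, image/jpeg, ...) are not listed.
def real_attachments(attachments, allowed = ALLOWED_MIME_TYPES)
  attachments.select { |a| allowed.include?(a.mime_type.to_s.downcase) }
end
```

With the mail library this would be called as `real_attachments(message.attachments)`.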

P.S.

Using the mail library, you should do something like:

Mail.defaults do
  retriever_method :imap, { :address             => address,
                            :port                => port,
                            :user_name           => user_name,
                            :password            => password,
                            :enable_ssl          => enable_ssl,
                            :openssl_verify_mode => openssl_verify_mode }
end

and then new_messages = Mail.find(keys: ['NOT', 'SEEN']) to retrieve unseen messages.

Then iterate over new_messages and over the attachments of each message. You can encode an attachment simply with encoded = Base64.strict_encode64(attachment.body.to_s). Please inspect new_messages to check the exact field names to use.
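The encoding step could look roughly like the following. The helper name `attachment_document` and the `filename` and `data` keys are assumptions for the example; `data` has to match whatever field the attachment processor in your pipeline reads.

```ruby
require 'base64'
require 'json'

# Builds the JSON body for indexing one attachment.
# strict_encode64 emits no line breaks, so the result is a single
# base64 string that can be stored in one JSON field.
def attachment_document(filename, raw_bytes)
  {
    filename: filename,
    data: Base64.strict_encode64(raw_bytes)
  }.to_json
end

# Usage with the mail library (not executed here):
#   new_messages.each do |message|
#     message.attachments.each do |attachment|
#       body = attachment_document(attachment.filename, attachment.body.to_s)
#       # send `body` to Elasticsearch through the ingest pipeline
#     end
#   end
```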


Your problem might come from strip_attachments => true in the Logstash imap input plugin.