Elasticsearch - Extracting PDF content and encoding with base64
This did not work:
curl -X PUT "http://localhost:9200/docs/attachment/_mapping" -d '{
  "attachment": {
    "properties": {
      "content": {
        "type": "attachment",
        "fields": {
          "title": { "store": "yes" },
          "content": { "type": "string", "term_vector": "with_positions_offsets", "store": "yes" }
        }
      }
    }
  }
}'
This worked, and I can see the PDF content in Kibana:
curl -X PUT "http://localhost:9200/docs" -d '{
  "mappings": {
    "attachment": {
      "properties": {
        "content": {
          "type": "attachment",
          "fields": {
            "content": { "store": "yes" },
            "author": { "store": "yes" },
            "title": { "store": "yes" },
            "date": { "store": "yes" },
            "keywords": { "store": "yes", "analyzer": "keyword" },
            "name": { "store": "yes" },
            "content_length": { "store": "yes" },
            "content_type": { "store": "yes" }
          }
        }
      }
    }
  }
}'
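For the base64 step mentioned in the title, here is a rough sketch of how a PDF could be encoded and sent to the docs index under the attachment type. It assumes a node on localhost:9200 as in the commands above. The helper names build_payload and index_pdf, the file name sample.pdf, and the document ID 1 are all made up for illustration and are not part of Elasticsearch.

```shell
#!/bin/sh
# Sketch only: build_payload and index_pdf are hypothetical helper names.

# Print a JSON document whose "content" field holds the base64-encoded file.
# Base64 output contains no characters that need JSON escaping.
build_payload() {
  # tr removes the line wrapping that some base64 implementations add.
  encoded=$(base64 < "$1" | tr -d '\n')
  printf '{"content":"%s"}' "$encoded"
}

# Send the payload to the docs index under the attachment type.
# Assumes Elasticsearch listens on localhost:9200; the caller picks the ID.
index_pdf() {
  build_payload "$1" | curl -X PUT "http://localhost:9200/docs/attachment/$2" -d @-
}

# Example call (hypothetical file name and document ID):
# index_pdf sample.pdf 1
```

Piping the payload to curl with -d @- keeps a large encoded PDF off the command line, where it could exceed the argument length limit.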