Download an excel file and read content with azure functions Download an excel file and read content with azure functions azure azure

Download an excel file and read content with azure functions


Open XML SDK works fine in Azure Function. I tested it on my side. Here is the full code.

#r "DocumentFormat.OpenXml.dll"#r "WindowsBase.dll"using System.Net;using System.IO;using DocumentFormat.OpenXml.Packaging;using DocumentFormat.OpenXml.Spreadsheet;public static HttpResponseMessage Run(HttpRequestMessage req, TraceWriter log){    log.Info($"C# HTTP trigger function processed a request. RequestUri={req.RequestUri}");    WebClient client = new WebClient();    byte[] buffer = client.DownloadData("http://amor-webapp-test.azurewebsites.net/Content/hello.xlsx");    MemoryStream stream = new MemoryStream();    stream.Write(buffer, 0, buffer.Length);    stream.Position = 0;    using (SpreadsheetDocument doc = SpreadsheetDocument.Open(stream, false))    {        WorkbookPart workbookPart = doc.WorkbookPart;        SharedStringTablePart sstpart = workbookPart.GetPartsOfType<SharedStringTablePart>().First();        SharedStringTable sst = sstpart.SharedStringTable;        WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();        Worksheet sheet = worksheetPart.Worksheet;        var cells = sheet.Descendants<Cell>();        var rows = sheet.Descendants<Row>();        log.Info(string.Format("Row count = {0}", rows.LongCount()));        log.Info(string.Format("Cell count = {0}", cells.LongCount()));        // One way: go through each cell in the sheet        foreach (Cell cell in cells)        {            if ((cell.DataType != null) && (cell.DataType == CellValues.SharedString))            {                int ssid = int.Parse(cell.CellValue.Text);                string str = sst.ChildElements[ssid].InnerText;                log.Info(string.Format("Shared string {0}: {1}", ssid, str));            }            else if (cell.CellValue != null)            {                log.Info(string.Format("Cell contents: {0}", cell.CellValue.Text));            }        }    }    return req.CreateResponse(HttpStatusCode.OK, "Hello ");}

enter image description here

To use Open XML, please make sure you have created a bin folder under your function folder and uploaded DocumentFormat.OpenXml.dll and WindowsBase.dll to it.

"File contains corrupted data".

Have you tried another excel file to check whether the issue is related to specific excel file. I suggest you create a new simple excel to test your code again.

"It didn't work on my file with the same "File contains corrupted data" message. "

I download your excel file and found that it is a older version(.xls) of excel file.

To fixed the exception, you could convert the excel to latest version(.xlsx) or choose another excel parse library. ExcelDataReader could work for any versions of excel file. You could install this library using NuGet by searching 'ExcelDataReader'. Following is the sample code of how to parse .xls format excel file. I tested it on Azure Function, it did worked fine.

#r "Excel.dll"#r "System.Data"using System.Net;using System.IO;using Excel;using System.Data;public static HttpResponseMessage Run(HttpRequestMessage req, TraceWriter log){    log.Info($"C# HTTP trigger function processed a request. RequestUri={req.RequestUri}");    WebClient client = new WebClient();    byte[] buffer = client.DownloadData("http://amor-webapp-test.azurewebsites.net/Content/abcdefg.xls");    MemoryStream stream = new MemoryStream();    stream.Write(buffer, 0, buffer.Length);    stream.Position = 0;    IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);    DataSet result = excelReader.AsDataSet();    for (int i = 0; i < result.Tables.Count; i++)    {        log.Info(result.Tables[i].TableName +" has " + result.Tables[i].Rows.Count + " rows.");    }    return req.CreateResponse(HttpStatusCode.OK, "Hello ");}

Please add "Excel.dll" file to the bin folder of your function before executing upper code.


If you do need to save a temporary file, Azure Functions has a %TEMP% environment variable with a path to a temporary folder. This is a folder that is local to the vm that runs your function and will not be persisted.

However, saving the file locally / in Azure Files is unnecessary. You should be able to get the stream from the response to your get request and pass it straight to OpenXML.

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(originalExcelUrl);using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())using (Stream stream = response.GetResponseStream()) {    var doc = SpreadsheetDocument.Open(stream, true);    // etc}