How can I save a dataset from an IPython notebook in Azure ML Studio?

python azure cortana-intelligence azure-machine-learning-studio

I read the source code of the azureml Python package and found that it uploads a dataset with a single plain POST request, whose content length is limited to 4194304 bytes.

I tried modifying the code in "http.py" inside the azureml package so that the request was posted with chunked data, and got the following error:

Traceback (most recent call last):
  File ".\azuremltest.py", line 10, in <module>
    ws.datasets.add_from_dataframe(frame, 'GenericCSV', 'output2.csv', 'Uotput results')
  File "C:\Python34\lib\site-packages\azureml\__init__.py", line 507, in add_from_dataframe
    return self._upload(raw_data, data_type_id, name, description)
  File "C:\Python34\lib\site-packages\azureml\__init__.py", line 550, in _upload
    raw_data, None)
  File "C:\Python34\lib\site-packages\azureml\http.py", line 135, in upload_dataset
    upload_result = self._send_post_req(api_path, raw_data)
  File "C:\Python34\lib\site-packages\azureml\http.py", line 197, in _send_post_req
    raise AzureMLHttpError(response.text, response.status_code)
azureml.errors.AzureMLHttpError: Chunked transfer encoding is not permitted. Upload size must be indicated in the Content-Length header.
Request ID: 7b692d82-845c-4106-b8ec-896a91ecdf2d 2016-03-14 04:32:55Z

So the REST API used by the azureml package does not support chunked transfer encoding. I then looked at how Azure ML Studio itself uploads large files, and found that it does the following:

1. It posts a request with content-length=0 to https://studioapi.azureml.net/api/resourceuploads/workspaces/<workspace_id>/?userStorage=true&dataTypeId=GenericCSV, which returns an id in the response body.
2. It breaks the .csv file into chunks of less than 4194304 bytes each, and posts them to https://studioapi.azureml.net/api/blobuploads/workspaces/<workspace_id>/?numberOfBlocks=<the number of chunks>&blockId=<index of chunk>&uploadId=<the id you get from previous request>&dataTypeId=GenericCSV
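The two steps above can be sketched in Python roughly as follows. This is only an illustration of the chunking and URL construction, not the package's or the Studio's actual code. The helper names (split_into_chunks, resource_upload_url, block_upload_url, plan_upload) are hypothetical, and the sketch assumes that blockId is a zero-based index, which the description above does not confirm. The actual POST calls, the authentication headers, and the parsing of the id from the first response are deliberately left out because their exact format is not known here.

```python
from urllib.parse import urlencode

# Limit quoted above; each chunk must be smaller than this.
MAX_CHUNK_BYTES = 4194304
BASE = "https://studioapi.azureml.net/api"


def split_into_chunks(data, chunk_size=MAX_CHUNK_BYTES - 1):
    """Split raw bytes into consecutive pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def resource_upload_url(workspace_id, data_type_id="GenericCSV"):
    """URL for step 1: the empty POST whose response contains the upload id."""
    query = urlencode({"userStorage": "true", "dataTypeId": data_type_id})
    return f"{BASE}/resourceuploads/workspaces/{workspace_id}/?{query}"


def block_upload_url(workspace_id, upload_id, block_index, number_of_blocks,
                     data_type_id="GenericCSV"):
    """URL for step 2: posting one chunk of the file."""
    query = urlencode({
        "numberOfBlocks": number_of_blocks,
        "blockId": block_index,  # assumption: zero-based index
        "uploadId": upload_id,
        "dataTypeId": data_type_id,
    })
    return f"{BASE}/blobuploads/workspaces/{workspace_id}/?{query}"


def plan_upload(data, workspace_id, upload_id):
    """Return a list of (url, chunk) pairs, one per block to be posted."""
    chunks = split_into_chunks(data)
    return [
        (block_upload_url(workspace_id, upload_id, i, len(chunks)), chunk)
        for i, chunk in enumerate(chunks)
    ]
```

Each (url, chunk) pair from plan_upload would then be sent as an ordinary POST with a Content-Length header, since the error above shows that chunked transfer encoding is rejected.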

If you really need this functionality, you can implement it in Python on top of the REST API described above.

If that seems too complicated, report the issue to the package maintainers. The azureml Python package is still under development, so your suggestion would be very helpful to them.


According to the AzureML project on GitHub, the Workspace object ws works via HTTP requests to Azure Resource Manager, so your code is effectively issuing a Resource Manager API request. The AzureMLHttpError was raised because the request exceeded the size limit for Azure Resource Manager API requests, which is 4194304 bytes.

You can find this limit in the section "Subscription limits - Azure Resource Manager" of the document "Azure subscription and service limits, quotas, and constraints".


Alexey, you might write to Azure Blob Storage instead, but I find the methods for doing so very sparsely documented.

https://azure.microsoft.com/en-us/documentation/articles/storage-python-how-to-use-blob-storage/
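As a rough sketch of the Blob Storage route: the snippet below serialises rows to CSV bytes and uploads them with the current azure-storage-blob package (version 12), which is an assumption on my part and is not the older SDK that the linked article describes. The function names rows_to_csv_bytes and upload_csv_to_blob, and the connection string, container and blob name arguments, are placeholders for illustration.

```python
import csv
import io


def rows_to_csv_bytes(header, rows):
    """Serialise a header and a list of rows to UTF-8 encoded CSV bytes."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")


def upload_csv_to_blob(connection_string, container, blob_name, data):
    """Upload bytes to a blob, overwriting any existing blob of that name.

    Assumes the azure-storage-blob v12 package is installed; it is imported
    here so that the CSV helper above works without it.
    """
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    blob = service.get_blob_client(container=container, blob=blob_name)
    blob.upload_blob(data, overwrite=True)
```

From a notebook with a pandas DataFrame, frame.to_csv(index=False).encode("utf-8") would produce the bytes to pass as data in place of rows_to_csv_bytes.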
