How to connect Azure Data lake storage to Azure ML?

I recommend the following:

  • Get a tenant ID, client ID, and client secret for your ADLS account by registering a service principal (an Azure Active Directory application) that has access to the store, using the tutorial here.
  • Install the azure-datalake-store Python package in AML Studio by zipping it and attaching the zip as a Script Bundle to an Execute Python Script module.
  • In the Execute Python Script module, import the package (module azure.datalake.store) and connect to ADLS with your tenant ID, client ID, and client secret.
  • Download the data you need from ADLS, convert it into a pandas dataframe inside the Execute Python Script module, and return that dataframe so the data is available to the rest of the AML Studio experiment.
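
The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the data is a CSV file, that the azure-datalake-store package from the Script Bundle is importable inside the module, and that the placeholder values (TENANT_ID, CLIENT_ID, CLIENT_SECRET, STORE_NAME, REMOTE_PATH) are replaced with your own. The helper names connect_adls and read_csv_from_adls are hypothetical; azureml_main is the entry point that the Execute Python Script module calls.

```python
import io

import pandas as pd

# Placeholders (assumptions): replace with the values for your own account.
TENANT_ID = "<tenant-id>"
CLIENT_ID = "<client-id>"
CLIENT_SECRET = "<client-secret>"
STORE_NAME = "<adls-account-name>"
REMOTE_PATH = "/path/to/data.csv"  # hypothetical path inside the store


def connect_adls(tenant_id, client_id, client_secret, store_name):
    """Authenticate as a service principal and return an ADLS file system client."""
    # Imported here so the script still loads where the package is not installed.
    from azure.datalake.store import core, lib

    token = lib.auth(
        tenant_id=tenant_id,
        client_id=client_id,
        client_secret=client_secret,
    )
    return core.AzureDLFileSystem(token, store_name=store_name)


def read_csv_from_adls(fs, remote_path):
    """Read a CSV file from the store into a pandas DataFrame."""
    # fs only needs an open(path, mode) method that returns a file-like object.
    with fs.open(remote_path, "rb") as remote_file:
        return pd.read_csv(io.BytesIO(remote_file.read()))


def azureml_main(dataframe1=None, dataframe2=None):
    """Entry point called by the Execute Python Script module."""
    fs = connect_adls(TENANT_ID, CLIENT_ID, CLIENT_SECRET, STORE_NAME)
    df = read_csv_from_adls(fs, REMOTE_PATH)
    # The module expects a sequence of DataFrames as the return value.
    return df,
```

Reading the whole file into memory keeps the sketch short; for large files you may prefer to download in chunks or to sample the data first. Avoid saving the client secret in a shared experiment where others can read it.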


You can also check this Microsoft Azure documentation walkthrough, which covers:

  1. Creating a data science environment for building scalable end-to-end solutions in Azure Data Lake.

  2. Using that environment to analyze a large public dataset, taking it through the canonical steps of the Data Science Process, from data acquisition through model training to deployment of the model as a web service.

  3. Using U-SQL to process, explore, and sample the data.

  4. Using Python and Hive with Azure Machine Learning Studio to build and deploy predictive models.

Link: https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/data-lake-walkthrough