Azure Data Lake - HDInsight vs Data Warehouse Azure Data Lake - HDInsight vs Data Warehouse azure azure

Azure Data Lake - HDInsight vs Data Warehouse


If your main purpose is to query data stored in the Azure Data Warehouse (ADW) then there is not real benefit to using Azure Data Lake Analytics (ADLA). But as soon as you have other (un)structured data stored in ADLS, like json documents or csv files for example, the benefit of ADLA becomes clear as U-Sql allows you to join your relational data stored in ADW with the (un)structured / nosql data stored in ADLS.

Also, it enables you to use U-Sql to prepare this other data for direct import in ADW, so Azure Data Factory is not longer required to get the data into you data warehouse. See this blogpost for more information:

A common use case for ADLS and SQL DW is the following. Raw data is ingested into ADLS from a variety of sources. Then ADL Analytics is used to clean and process the data into a loading ready format. From there, the high value data can be imported into Azure SQL DW via PolyBase.

..

You can import data stored in ORC, RC, Parquet, or Delimited Text file formats directly into SQL DW using the Create Table As Select (CTAS) statement over an external table.


Please note that the SQL statement in SQL Data Warehouse is currently NOT generating U-SQL behind the scenes. Also, the use cases between ADLA/U-SQL and SDW are different.

ADLA is giving you an processing engine to do batch data preparation/cooking to generate your data to build a data mart/warehouse that you then can read interactively with SQL DW. In your example above, you seem to be mainly doing the second part. Adding "Views" on top on these EXTERNAL tables to do transformations in SQL DW will quickly run into scalability limits if you operating on big data (and not just a few 100k rows).