Latency between Azure Web Role and SQL Azure and Application performance


I posted this on another thread but it's old and has been closed. I think that it highlights some of your issues.

I've been attempting to move our business app to the cloud. Considering that our onsite servers are 8+ years old, there should be a notable improvement with Azure services. However, when we tested our app and benchmarked the cloud against onsite, we saw almost 3X more latency in the cloud than onsite (8+ year old servers) overall, and 20X more latency when comparing against modern equipment. Our app is an ASP.NET app and the DB is about 11 GB in size.

SQL Azure and Performance

  1. Never use a pooled connection in the cloud. If you do, you will have your queries dropping left, right, and center. Instead, open one connection and keep it open until you are done.
  2. Use caching. You don't have a choice if you have a hope of making this work. I have one site successfully in the cloud but I had to use caching to get any reasonable performance out of it.
  3. Realize that it's not your fault! The Azure team needs to fix their end more than you do yours. We have a lean app, we've been upgrading, tweaking and optimizing for 10 years and if we can't make it work then you won't either.
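The caching advice in point 2 can be sketched roughly as follows. This is a minimal in-memory example, not the caching layer used on the site mentioned above; the `TtlCache` class, its `get_or_load` method, and the `clock` parameter are all names made up for illustration.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TtlCache:
    """Tiny in-memory cache (hypothetical): values expire ttl_seconds after being stored."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # fresh entry: no database round trip
        value = loader()   # miss or expired: pay the round trip once
        self._entries[key] = (now + self._ttl, value)
        return value
```

The idea is that `loader` is the function that actually runs the query, so repeated requests for the same key within the time-to-live never reach the database.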

I like Azure as a concept. I like the options. I like the expandability, but I don't like the performance. I hope Microsoft pays some more attention to this and makes some changes because no one should move their business there until this has been fixed.

Tests

Each test runs a series of queries that access the exact same data. Timing is done in code by a custom wrapper object that measures the response time from its creation to its disposal (disposal forces the write to the DB). This object wraps the code that's being tested.

No caching was enabled for the test. However, I let the code and DB execute each query once beforehand and took the best results, so that the DB server had the opportunity to optimize the query and the web server could load the assemblies properly.
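The measurement approach described above can be sketched like this. It is not the actual test harness (which is .NET code and is not shown in this post); `ResponseTimer` and `best_of` are hypothetical names, and the sketch assumes the workload is a plain callable.

```python
import time
from typing import Callable, List


class ResponseTimer:
    """Context manager: measures wall-clock time between entering and leaving the block."""

    def __enter__(self) -> "ResponseTimer":
        self.elapsed_ms = 0.0
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.elapsed_ms = (time.perf_counter() - self._start) * 1000.0
        return False  # never swallow exceptions raised by the measured code


def best_of(workload: Callable[[], None], runs: int = 5, warmups: int = 1) -> float:
    """Run the workload untimed `warmups` times, then return the best of `runs` timed runs (ms)."""
    for _ in range(warmups):
        workload()  # lets query plans and assemblies warm up before timing
    samples: List[float] = []
    for _ in range(runs):
        with ResponseTimer() as timer:
            workload()
        samples.append(timer.elapsed_ms)
    return min(samples)
```

Taking the minimum mirrors the "best results" rule above; it hides occasional slow runs, so it is worth logging the full sample list as well when variance matters.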

Test 1 - Web and DB on same good machine

  • Quad core 2.5 GHz, 8 GB RAM @ 800 MHz with 1300 FSB and SQL 2005
  • yields 290 ms response time.

Test 2 - Web and DB on same machine

  • SQL and Web on 2 Proc (dual core 3.0 GHz), 16 GB RAM @ 200 MHz with 200 FSB and SQL 2005
  • Really old IBM server.
  • Web and SQL both are local to each other
  • yields 656 ms response time.

Test 3 - Web separate from DB

  • SQL on 2 Proc (dual core 3.0 GHz), 16 GB RAM @ 200 MHz with 200 FSB and SQL 2005
  • Web on 1 Proc dual core 3.0 GHz, 8 GB RAM @ 200 MHz with 200 FSB
  • Really old IBM Server.
  • Web on one machine and SQL on the other.
  • yields 796 ms response time.

Test 4 - Azure

  • Medium VM on Azure
  • SQL Azure DB
  • yields 3,174 ms response time.

Conclusions

  • The difference in latency for me when moving from a one-server scenario (Test 2, 656 ms) to a two-server scenario (Test 3, 796 ms) was 140 ms.
  • Moving from the one-server scenario to Azure (3,174 ms) added 2,518 ms. That is 17.98 times the overhead I saw when splitting web and DB across my 8-year-old machines.
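The arithmetic behind these figures can be re-derived with a short script. This is only a sketch that uses the four response times reported in the tests; the dictionary keys are labels made up for illustration.

```python
# Response times in milliseconds, copied from Tests 1-4.
results_ms = {
    "test1_single_modern": 290,
    "test2_single_old": 656,
    "test3_split_old": 796,
    "test4_azure": 3174,
}

# Added latency relative to the old single-server setup (Test 2).
split_overhead_ms = results_ms["test3_split_old"] - results_ms["test2_single_old"]
azure_overhead_ms = results_ms["test4_azure"] - results_ms["test2_single_old"]

# Ratio of the two overheads (the "17.98 times" figure).
overhead_ratio = azure_overhead_ms / split_overhead_ms

# Ratios of total response time, Azure versus each onsite test.
total_ratios = {
    name: results_ms["test4_azure"] / ms
    for name, ms in results_ms.items()
    if name != "test4_azure"
}

print(split_overhead_ms, azure_overhead_ms, round(overhead_ratio, 2))
for name, ratio in total_ratios.items():
    print(name, round(ratio, 1))
```

Note that 17.98 is a ratio of added latency, not of total response time. On total response time, the Azure run comes out at roughly 4.0 times the two-server result, 4.8 times the old single-server result, and 10.9 times the modern single machine.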

Don't do it until they fix this, and take the time to let them know that it's an issue for you as well.


First of all, you need to know that Windows Azure SQL Database is a multi-tenant, high-density RDBMS offered as a service. That means a single server may be shared by hundreds of customers.

I also suggest that you get to know the SLA for the services, Windows Azure SQL Database in particular. No one has ever claimed that there will be 0 ms latency. There is also such a thing as "Transient Conditions" in Windows Azure SQL Database.
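A common way to design for transient conditions is to retry the failed operation with a growing delay. The sketch below illustrates that general pattern only; it is not taken from any Azure library. `TransientDbError` is a hypothetical stand-in for whatever exception your database driver raises on a transient fault, and the attempt count and delays are arbitrary assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientDbError(Exception):
    """Hypothetical stand-in for a driver error that signals a transient fault."""


def run_with_retry(operation: Callable[[], T],
                   attempts: int = 4,
                   base_delay_s: float = 0.5,
                   sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation(); on a transient error, wait and retry with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientDbError:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay_s * 2 ** (attempt - 1)
            # Jitter keeps many clients from retrying at the same instant.
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("attempts must be at least 1")
```

Only errors known to be transient should be retried this way; anything else (bad SQL, permission errors) should fail immediately.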

A good suggested reading is the Windows Azure SQL Database Performance and Elasticity Guide.

As for web application performance, after reading the Performance and Elasticity Guide, I don't think the 200 ms delays that happen once in a while are the core bottleneck.

UPDATE after the first comment

You will always have ups and downs in a shared environment. You should also expect query execution times to fluctuate. This is unavoidable in that kind of environment and something you have to design for and live with. There is no magic wand here, and there is no dedicated server for you (for us) in the case of Windows Azure SQL Database. If you think that your app needs more reliable SQL Server services, you might try Windows Azure Virtual Machines and bring up a SQL Server cluster yourself. I guess (and this is only a guess) that the communication between your cloud service and your VMs, given everything is in the same availability set, will be more predictable.

Update after 2nd and 3rd comments:

Well, yes, you might have licensing issues (I'm the expert on licensing). Is the ticket you have opened for the ups and downs? If so, you may try to escalate it (I don't know exactly how, but you have your ticket ID, and you should also have an assigned engineer and an e-mail about the ticket; reply-to-all to that e-mail). Also, when you created the ticket there must have been a small questionnaire about the business impact of your issue, and a standard response time must have been assigned to the ticket based on it. If support has not come back to you within that time, you can definitely escalate it.

UPDATE

An interesting observation: in all your screenshots only the first packet is delayed, and every subsequent one has 0 latency. This holds for all the samples you have provided. If this is the case 10 out of 10 times, then you are definitely not having any issues with latency. I suggest that you use the "-t" option of the regular ping to send more than 4 packets, break at around 100 packets, and then look at the results. For any performance review, I would not take into account 4-packet samples where only the first packet shows latency.
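One way to look at a longer run is to separate the first packet from the rest and summarize the steady-state samples on their own. This is a rough sketch that assumes you have already collected the round-trip times (in ms) into a list; `summarize_rtts` is a hypothetical helper, and parsing the ping output is left out.

```python
import statistics
from typing import Dict, List


def summarize_rtts(rtts_ms: List[float]) -> Dict[str, float]:
    """Report the first sample separately from the remaining (steady-state) samples."""
    if len(rtts_ms) < 2:
        raise ValueError("need at least two samples")
    first, rest = rtts_ms[0], rtts_ms[1:]
    return {
        "samples": len(rtts_ms),
        "first_ms": first,
        "steady_median_ms": statistics.median(rest),
        "steady_max_ms": max(rest),
    }
```

If the steady-state median and maximum stay near zero over about 100 packets, a slow first packet alone says little about ongoing latency.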