How do I retrieve multiple types of entities using a single query to Azure Table Storage?


Finally there's an official way! :)

Look at the NoSQL sample which does exactly this in this link from the Azure Storage Team Blog:

Windows Azure Storage Client Library 2.0 Tables Deep Dive


There are a few ways to go about this and how you do it depends a bit on your personal preference as well as potentially performance goals.

  • Create an amalgamated class that represents all queried types. If I had a StatusUpdateEntry and a NotificationEntry, I would simply merge the properties of both into a single class. The serializer will automatically fill in the properties that are present and leave the others null (or default). If you also put a 'type' property on the entity (calculated or set in storage), you can easily switch on that type. Since I always recommend mapping from the table entity to your own type in the app, this works fine as well (the class is then used only as a DTO).

Example:

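    using System;
    using System.Data.Services.Common;

    [DataServiceKey("PartitionKey", "RowKey")]
    public class NoticeStatusUpdateEntry
    {
        public string PartitionKey { get; set; }
        public string RowKey { get; set; }

        // Only one of these is populated, depending on the stored entity.
        public string NoticeProperty { get; set; }
        public string StatusUpdateProperty { get; set; }

        // Calculated discriminator: derived from which property was filled in.
        public string Type
        {
            get
            {
                return String.IsNullOrEmpty(this.StatusUpdateProperty) ? "Notice" : "StatusUpdate";
            }
        }
    }
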
  • Override the serialization process. You can do this yourself by hooking the ReadingEntity event. It gives you the raw XML, and you can deserialize it however you want. Jai Haridas and Pablo Castro gave some example code for reading an entity when you don't know the type (included below), and you can adapt that to read specific types that you do know about.
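
For the first approach, the mapping from the merged DTO to your own application types could look roughly like the sketch below. It is only an illustration, not code from the original answer: NoticeStatusUpdateDto mirrors the merged class in the example above, while FeedItem, Notice, StatusUpdate and FeedItemMapper are hypothetical names invented here.

```csharp
using System;

// Merged DTO, shaped like the NoticeStatusUpdateEntry example (storage attributes omitted).
public class NoticeStatusUpdateDto
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
    public string NoticeProperty { get; set; }
    public string StatusUpdateProperty { get; set; }

    // Calculated discriminator, same rule as in the example above.
    public string Type
    {
        get { return string.IsNullOrEmpty(StatusUpdateProperty) ? "Notice" : "StatusUpdate"; }
    }
}

// Hypothetical application-side types (assumptions for this sketch).
public abstract class FeedItem
{
    public string Id { get; set; }
}

public sealed class Notice : FeedItem
{
    public string Text { get; set; }
}

public sealed class StatusUpdate : FeedItem
{
    public string Status { get; set; }
}

public static class FeedItemMapper
{
    // Switches on the discriminator and copies only the relevant properties.
    public static FeedItem ToDomain(NoticeStatusUpdateDto dto)
    {
        if (dto == null) throw new ArgumentNullException("dto");

        switch (dto.Type)
        {
            case "StatusUpdate":
                return new StatusUpdate { Id = dto.RowKey, Status = dto.StatusUpdateProperty };
            case "Notice":
                return new Notice { Id = dto.RowKey, Text = dto.NoticeProperty };
            default:
                throw new NotSupportedException("Unknown entity type " + dto.Type);
        }
    }
}
```

With this in place, the rest of the app works with Notice and StatusUpdate objects and never sees the half-empty merged class.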

The downside to both approaches is that you end up pulling more data than you need in some cases. You need to weigh this against how often you really need to query one type versus another. Keep in mind you can use projection now in Table storage, so that also reduces the wire format size and can really speed things up when you have larger entities or many to return. If you ever need to query only a single type, I would probably use part of the RowKey or PartitionKey to specify the type, which would then allow me to query only a single type at a time (you could use a property, but that is not as efficient for query purposes as PK or RK).
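
To illustrate encoding the type in the key: assuming every RowKey starts with a type prefix such as Notice_ (a hypothetical naming convention, not something from the original answer), a single-type query becomes a range filter on RowKey. The sketch below only builds the filter string. The helper name TypePrefixFilter is made up for illustration, and the approach assumes keys are compared ordinally, character by character.

```csharp
using System;

public static class TypePrefixFilter
{
    // Builds a filter matching every RowKey that starts with the given prefix.
    // The upper bound is the prefix with its last character incremented by one,
    // so the range [prefix, upper) covers exactly the keys beginning with prefix.
    public static string ForRowKeyPrefix(string prefix)
    {
        if (string.IsNullOrEmpty(prefix))
            throw new ArgumentException("A non-empty prefix is required.", "prefix");

        char last = prefix[prefix.Length - 1];
        if (last == char.MaxValue)
            throw new ArgumentException("The last character cannot be incremented.", "prefix");

        string upper = prefix.Substring(0, prefix.Length - 1) + (char)(last + 1);

        return string.Format(
            "(RowKey ge '{0}') and (RowKey lt '{1}')",
            Escape(prefix),
            Escape(upper));
    }

    // Single quotes inside a string literal are doubled in the filter syntax.
    private static string Escape(string value)
    {
        return value.Replace("'", "''");
    }
}
```

For example, TypePrefixFilter.ForRowKeyPrefix("Notice_") yields (RowKey ge 'Notice_') and (RowKey lt 'Notice`'), because ` is the character that follows _ in ordinal order. Combining this with a PartitionKey condition keeps the query within a single partition.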

Edit: As noted by Lucifure, another great option is to design around it. Use multiple tables, query in parallel, etc. You need to trade that off with complexity around timeouts and error handling of course, but it is a viable and often good option as well depending on your needs.

Reading a Generic Entity:

    using System;
    using System.Collections.Generic;
    using System.Data.Services.Client;
    using System.Data.Services.Common;
    using System.Linq;
    using System.Xml;
    using System.Xml.Linq;

    [DataServiceKey("PartitionKey", "RowKey")]
    public class GenericEntity
    {
        public string PartitionKey { get; set; }
        public string RowKey { get; set; }

        // Holds every property that is not declared on this class.
        Dictionary<string, object> properties = new Dictionary<string, object>();

        internal object this[string key]
        {
            get { return this.properties[key]; }
            set { this.properties[key] = value; }
        }

        public override string ToString()
        {
            // TODO: append each property
            return "";
        }
    }

    // The methods below live in whichever class runs the query.
    void TestGenericTable()
    {
        var ctx = CustomerDataContext.GetDataServiceContext();
        ctx.IgnoreMissingProperties = true;
        ctx.ReadingEntity += new EventHandler<ReadingWritingEntityEventArgs>(OnReadingEntity);
        var customers = from o in ctx.CreateQuery<GenericEntity>(CustomerDataContext.CustomersTableName) select o;

        Console.WriteLine("Rows from '{0}'", CustomerDataContext.CustomersTableName);
        foreach (GenericEntity entity in customers)
        {
            Console.WriteLine(entity.ToString());
        }
    }

    // Credit goes to Pablo from ADO.NET Data Service team
    public void OnReadingEntity(object sender, ReadingWritingEntityEventArgs args)
    {
        // TODO: Make these statics
        XNamespace AtomNamespace = "http://www.w3.org/2005/Atom";
        XNamespace AstoriaDataNamespace = "http://schemas.microsoft.com/ado/2007/08/dataservices";
        XNamespace AstoriaMetadataNamespace = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata";

        GenericEntity entity = args.Entity as GenericEntity;
        if (entity == null)
        {
            return;
        }

        // read each property, type and value in the payload
        var properties = args.Entity.GetType().GetProperties();
        var q = from p in args.Data.Element(AtomNamespace + "content")
                                   .Element(AstoriaMetadataNamespace + "properties")
                                   .Elements()
                // skip properties already declared on the class (PartitionKey, RowKey)
                where properties.All(pp => pp.Name != p.Name.LocalName)
                select new
                {
                    Name = p.Name.LocalName,
                    IsNull = string.Equals("true", p.Attribute(AstoriaMetadataNamespace + "null") == null ? null : p.Attribute(AstoriaMetadataNamespace + "null").Value, StringComparison.OrdinalIgnoreCase),
                    TypeName = p.Attribute(AstoriaMetadataNamespace + "type") == null ? null : p.Attribute(AstoriaMetadataNamespace + "type").Value,
                    p.Value
                };

        foreach (var dp in q)
        {
            entity[dp.Name] = GetTypedEdmValue(dp.TypeName, dp.Value, dp.IsNull);
        }
    }

    // Converts the string payload value to the CLR type named by the EDM type attribute.
    private static object GetTypedEdmValue(string type, string value, bool isnull)
    {
        if (isnull) return null;
        if (string.IsNullOrEmpty(type)) return value;

        switch (type)
        {
            case "Edm.String": return value;
            case "Edm.Byte": return Convert.ChangeType(value, typeof(byte));
            case "Edm.SByte": return Convert.ChangeType(value, typeof(sbyte));
            case "Edm.Int16": return Convert.ChangeType(value, typeof(short));
            case "Edm.Int32": return Convert.ChangeType(value, typeof(int));
            case "Edm.Int64": return Convert.ChangeType(value, typeof(long));
            case "Edm.Double": return Convert.ChangeType(value, typeof(double));
            case "Edm.Single": return Convert.ChangeType(value, typeof(float));
            case "Edm.Boolean": return Convert.ChangeType(value, typeof(bool));
            case "Edm.Decimal": return Convert.ChangeType(value, typeof(decimal));
            case "Edm.DateTime": return XmlConvert.ToDateTime(value, XmlDateTimeSerializationMode.RoundtripKind);
            case "Edm.Binary": return Convert.FromBase64String(value);
            case "Edm.Guid": return new Guid(value);
            default: throw new NotSupportedException("Not supported type " + type);
        }
    }


Another option, of course, is to have only a single entity type per table, query the tables in parallel and merge the results sorted by timestamp. In the long run this may prove to be the more prudent choice for scalability and maintainability.
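
A rough sketch of the query-in-parallel-and-merge idea follows. It makes no calls to the storage library: each table query is represented by a delegate that returns a task, standing in for whatever client call you actually use. TimelineItem and ParallelTableReader are hypothetical names, and sorting oldest-first is an arbitrary choice for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical common shape that every per-table result is mapped to.
public class TimelineItem
{
    public string Kind { get; set; }
    public DateTimeOffset Timestamp { get; set; }
}

public static class ParallelTableReader
{
    // Starts all table queries at once, waits for all of them,
    // then merges the results into one list ordered by timestamp.
    public static async Task<List<TimelineItem>> QueryAllAsync(
        IEnumerable<Func<Task<IEnumerable<TimelineItem>>>> tableQueries)
    {
        // Invoke every delegate first so the queries run concurrently.
        Task<IEnumerable<TimelineItem>>[] running =
            tableQueries.Select(query => query()).ToArray();

        // WhenAll faults if any query faults; timeouts and retries are left to the caller.
        IEnumerable<TimelineItem>[] results = await Task.WhenAll(running);

        return results
            .SelectMany(items => items)
            .OrderBy(item => item.Timestamp)
            .ToList();
    }
}
```

Because a single failed query fails the whole merge here, real code would have to decide whether partial results are acceptable, which is the extra complexity this option brings.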

Alternatively you would need to use some flavor of generic entities as outlined by ‘dunnry’, where the non-common data is not explicitly typed and instead persisted via a dictionary.

I have written an alternate Azure Table Storage client, Lucifure Stash, which supports additional abstractions over Azure Table Storage, including persisting to/from a dictionary, and may work in your situation if that is the direction you want to pursue.

Lucifure Stash supports large data columns > 64K, arrays & lists, enumerations, composite keys, out of the box serialization, user defined morphing, public and private properties and fields and more. It is available free for personal use at http://www.lucifure.com or via NuGet.com.

Edit: Now open sourced at CodePlex