Entity Framework Algorithm For Combining Data


I don't think you will get a substantial performance improvement as long as you stay with Entity Framework:

The query

  • Loading each record with a separate query is not good.
  • You can improve the performance by loading multiple records in the same query. For example, you can load a small batch of records by using either || in the condition or Contains (which works like IN in SQL). Contains is supported only in .NET 4.0 and later.
  • Another improvement is to replace the query with a stored procedure that takes a table-valued parameter: pass all GUIDs to SQL Server, join A with X.Guids, and return the results. Table-valued parameters are supported only on SQL Server 2008 and newer.
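The batched-Contains idea can be sketched roughly as follows. This is only an illustration under assumptions: the `Record` entity, the `BatchLoader` helper, and the batch size are hypothetical names, not taken from the question, and an in-memory `IQueryable` stands in for the EF entity set so the snippet runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; the real model's names will differ.
public class Record
{
    public Guid Guid { get; set; }
    public string Name { get; set; }
}

public static class BatchLoader
{
    // Loads the records matching 'guids', issuing one query per batch
    // instead of one query per GUID.
    // 'source' stands in for the EF entity set (assumed: something like context.Records).
    public static List<Record> LoadInBatches(
        IQueryable<Record> source, IList<Guid> guids, int batchSize)
    {
        var result = new List<Record>();
        for (int offset = 0; offset < guids.Count; offset += batchSize)
        {
            // Materialize the batch so the query sees a plain local array.
            Guid[] batch = guids.Skip(offset).Take(batchSize).ToArray();

            // Against EF 4.0+ this Contains call becomes a SQL IN (...) clause.
            result.AddRange(source.Where(r => batch.Contains(r.Guid)).ToList());
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in data: 10 records, of which 7 are requested.
        List<Record> all = Enumerable.Range(0, 10)
            .Select(i => new Record { Guid = Guid.NewGuid(), Name = "r" + i })
            .ToList();
        List<Guid> wanted = all.Take(7).Select(r => r.Guid).ToList();

        // Batch size 3 means three queries (3 + 3 + 1) instead of seven.
        List<Record> loaded = BatchLoader.LoadInBatches(all.AsQueryable(), wanted, 3);
        Console.WriteLine(loaded.Count); // prints 7
    }
}
```

Keep the batch size modest: every GUID becomes a parameter or literal in the generated IN list, so very large batches make the SQL statement itself large.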

The data modification

  • You should not call SaveChanges after each modification. You can call it once after the foreach loop and it will still work. It will pass all modifications in a single transaction, and according to this answer it can give you a significant boost.
  • EF doesn't support command batching, so each update or insert always takes a separate round trip to the database. There is no way around this when using EF to modify data, short of implementing a whole new EF ADO.NET provider (which is like starting a new project).
  • Again, the solution is to reduce round trips by using a stored procedure with a table-valued parameter.
  • If your DB also uses that GUID as the primary key and clustered index, you take another performance hit: the index is reordered after each insert, which means modifying data on disk.
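As a rough sketch of the "save once after the loop" advice, assuming the DbContext API (EF 4.1 and later; the original application may well use ObjectContext instead): the `Record` entity, `PumpContext`, and `RecordUpdater` below are hypothetical names for illustration, and running it requires the EntityFramework package and a reachable database.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Data.Entity; // EntityFramework package, DbContext API (assumed EF 4.1+)

// Hypothetical entity and context; substitute your own model.
public class Record
{
    [Key]
    public Guid Guid { get; set; }
    public string Name { get; set; }
}

public class PumpContext : DbContext
{
    public DbSet<Record> Records { get; set; }
}

public static class RecordUpdater
{
    // Applies new names to records that were loaded through 'context'
    // (so they are change-tracked) and saves once at the end.
    public static int ApplyNames(
        PumpContext context,
        IEnumerable<Record> loadedRecords,
        IDictionary<Guid, string> newNames)
    {
        foreach (Record record in loadedRecords)
        {
            string name;
            if (newNames.TryGetValue(record.Guid, out name))
            {
                record.Name = name; // tracked change; nothing is sent yet
            }
            // No SaveChanges here: calling it per record commits per record.
        }

        // One call: all pending commands run inside a single transaction.
        // Each UPDATE is still sent as its own round trip (no command batching).
        return context.SaveChanges();
    }
}
```

This removes the per-record transaction overhead but not the per-record round trips, which is why the stored procedure with a table-valued parameter remains the option that actually reduces traffic to the server.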

The problem is not in the algorithm but in the way you process the data and the technology used to process it. Entity Framework is not a good choice for data pumps. You should take this information to your boss, because improving performance means a more complicated change to your application. It is not your fault, and it is not the fault of the programmer who wrote the application. This is a characteristic of EF that is not very well known and, as far as I know, is not clearly stated in any MS best practices.