SQL Server custom counter stored procedure creating dupes


A HOLDLOCK hint on the update statement will avoid the race condition. To prevent deadlocks, I suggest a clustered composite primary key (or unique index) on ID and date.
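
For reference, a minimal sketch of that key, assuming the [counter] table from the question has no primary key yet; the constraint name is mine:

--Clustered composite primary key on (ID, date) so concurrent upserts for
--different users/days lock different key ranges.
ALTER TABLE [counter]
    ADD CONSTRAINT [PK_counter] PRIMARY KEY CLUSTERED ([ID], [date]);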

The example below incorporates these changes and uses the SET <variable> = <column> = <expression> form of the SET clause to avoid the need for the subsequent SELECT of the final counter value and thereby improve performance.

ALTER PROCEDURE [dbo].[CheckKey]
    @apikey AS VARCHAR(10)
AS
SET NOCOUNT ON;
--SET XACT_ABORT ON is a best practice for procs with explicit transactions
SET XACT_ABORT ON;

DECLARE
      @userID as int
    , @limit as int
    , @curCount as int
    , @curDate as Date = GETDATE();

BEGIN TRY;

    SELECT
          @userID = id
        , @limit = limit
    FROM [users]
    WHERE apiKey = @apikey;

    IF @userID IS NULL
    BEGIN
        --Key not found
        SELECT 'False' as [Response], 'Invalid API key!' as [Reason];
    END
    ELSE
    BEGIN
        --Key found
        BEGIN TRANSACTION Upsert;

        --HOLDLOCK holds a key-range lock until the transaction ends, so a
        --concurrent caller can't slip an insert in between this update and
        --the insert below
        UPDATE [counter] WITH(HOLDLOCK)
        SET @curCount = [count] = [count] + 1
        WHERE
            [ID] = @userID
            AND [date] = @curDate;

        IF @@ROWCOUNT = 0
        BEGIN
            --No row for this user/date yet, so this is the first hit today
            INSERT INTO [counter] ([ID], [date], [count])
            VALUES (@userID, @curDate, 1);
        END;

        IF @limit IS NOT NULL AND @curCount > @limit
        BEGIN
            SELECT 'False' as [Response], 'Request limit reached!' as [Reason]
        END
        ELSE
        BEGIN
            SELECT 'True' as [Response], NULL as [Reason]
        END;

        COMMIT TRANSACTION Upsert;

    END;

END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;
    THROW;
END CATCH;
GO


Probably not the answer you're looking for, but for a rate-limiting counter I would use a cache like Redis in a middleware layer in front of the API. Performance-wise it's great: Redis would have no problem with the load, and your DB won't be impacted.

And if you want to keep a history of hits per API key per day in SQL, run a daily task to import yesterday's counts from Redis into SQL (a sketch of the SQL side of that import follows below).
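
A minimal sketch of the import step, assuming a hypothetical [counter_staging] table that an external job has already filled from Redis; the staging table and its column names are mine, modelled on the question's [counter] table:

--Move yesterday's per-key counts from the staging table into [counter].
INSERT INTO [counter] ([ID], [date], [count])
SELECT s.[ID], s.[date], s.[count]
FROM [counter_staging] AS s
WHERE s.[date] = CAST(DATEADD(day, -1, GETDATE()) AS date);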

The data set would be small enough that a Redis instance would cost literally nothing (or close to it).


It will be the MERGE statement getting into a race condition with itself, i.e. your API is called twice by the same client at almost the same moment, and both times the MERGE statement finds no row, so each inserts one. MERGE isn't an atomic operation, even though it's reasonable to assume it is. For example, see this bug report for SQL Server 2008 about MERGE causing deadlocks; the SQL Server team said this is by design.
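
If you do keep the MERGE, the usual workaround is a HOLDLOCK (serializable) hint, so the existence check and the insert happen under one key-range lock. A minimal sketch, assuming the [counter] table from the question; the exact MERGE shape and the example values are mine, not your proc:

DECLARE @userID int = 42, @curDate date = GETDATE(); --example values

--WITH (HOLDLOCK) makes the row check and the insert behave atomically
--for this statement, closing the window two concurrent calls can race in.
MERGE [counter] WITH (HOLDLOCK) AS t
USING (SELECT @userID AS [ID], @curDate AS [date]) AS s
    ON t.[ID] = s.[ID] AND t.[date] = s.[date]
WHEN MATCHED THEN
    UPDATE SET t.[count] = t.[count] + 1
WHEN NOT MATCHED THEN
    INSERT ([ID], [date], [count]) VALUES (s.[ID], s.[date], 1);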

From your post I think the immediate issue is that your clients will potentially be getting a small number of free hits on your API. For example, if two requests come in and neither sees a row, you'll end up with two rows with a count of 1 when you actually wanted one row with a count of 2, so the client could get 1 free API hit that day. If three requests crossed over you'd get three rows with a count of 1 and they could get 2 free API hits, and so on.

Edit

So, as your link suggests, you've got two categories of options to explore: first, just trying to get this working in SQL Server; second, other architectural solutions.

For the SQL option I would do away with the MERGE and consider pre-populating your clients' counter rows ahead of time, either nightly or, less often, several days at a time; that leaves you with a single UPDATE instead of the MERGE/UPDATE-and-INSERT (a sketch follows below). Then confirm that both your UPDATE and your SELECT are fully optimised, i.e. they have the necessary indexes and aren't causing scans. Next you could look at tweaking locking so you're only locking at the row level; see this for more info. For the SELECT you could also look at using NOLOCK, which means you could read slightly stale data, but that shouldn't matter in your case, since your WHERE always targets a single row anyway.
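
Here's a minimal sketch of that pre-population step, assuming the [users] and [counter] tables from the question; how you schedule it (e.g. a nightly SQL Agent job) is up to you:

--Create tomorrow's zero-count row for every user, skipping any that
--already exist, so the proc only ever needs a single UPDATE.
INSERT INTO [counter] ([ID], [date], [count])
SELECT u.id, CAST(DATEADD(day, 1, GETDATE()) AS date), 0
FROM [users] AS u
WHERE NOT EXISTS (
    SELECT 1
    FROM [counter] AS c
    WHERE c.[ID] = u.id
        AND c.[date] = CAST(DATEADD(day, 1, GETDATE()) AS date)
);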

For the non-SQL options, as your link says, you could look at queuing things up. The queued operations would be the updates/inserts, so your selects would see slightly old data. Whether that's acceptable depends on how far behind the queue runs, although you could treat it as an "eventually consistent" solution and, if you wanted to be strict, charge extra or deduct API hits the next day. You could also look at caching options to store the counts; this gets more complex if your app is distributed, but there are caching solutions for that too. If you went with caching you could choose not to persist anything, but then you'd potentially give away a load of free hits if your site went down, though you'd probably have bigger issues to worry about at that point anyway!