Azure Stream Analytics Session window abnormal behavior Azure Stream Analytics Session window abnormal behavior azure azure

Azure Stream Analytics Session window abnormal behavior


The Azure Stream Analytics is way to manage session windows:

Session windows group events that arrive at similar times, filtering out periods of time where there is no data. Session window function has three main parameters: timeout, maximum duration, and partitioning key (optional).

The following diagram illustrates a stream with a series of events and how they are mapped into session windows of 5 minutes timeout, and maximum duration of 10 minutes.

A session window begins when the first event occurs. If another event occurs within the specified timeout from the last ingested event, then the window extends to include the new event. Otherwise if no events occur within the timeout, then the window is closed at the timeout.

If events keep occurring within the specified timeout, the session window will keep extending until maximum duration is reached. Please note that the maximum duration checking intervals are set to be the same size as the specified max duration. For example, if the max duration is 10, then the checks on if the window exceed maximum duration will happen at t = 0, 10, 20, 30, etc.

Thus mathematically, our session window ends if the following condition is satisfied:

Stream Analytics session window 5 mins timeout & 10 mins maximum

When a partition key is provided, the events are grouped together by the key and session window is applied to each group independently. This is useful for cases where you need different session windows for different users or devices.

Here is the Syntax:

SESSIONWINDOW(timeunit, timeoutSize, maxDurationSize) [OVER (PARTITION BY partitionKey)]SESSIONWINDOW(Timeout(timeunit , timeoutSize), MaxDuration(timeunit, maxDurationSize)) [OVER (PARTITION BY partitionKey)]

Explanation:

timeoutsize

A big integer that describes the gap size of the session window. Data that occur within the gap size are grouped together in the same window.

maxdurationsize

If the total window size exceeds the specified maxDurationSize at a checking point, then the window is closed and a new window is opened at the same point. Currently, the size of the checking interval is equal to maxDurationSize.

partitionkey

An optional parameter that specifies the key that the session window operates over. If specified, the window will only group together events of the same key.

Examples:JSON:

[  // time: the timestamp when the user clicks on the link  // user_id: the id of the user  // url: the url the user clicked on  {    "time": "2017-01-26T00:00:00.0000000z",    "user_id": 0,    "url": "www.example.com/a.html"  },  {    "time": "2017-01-26T00:00:20.0000000z",    "user_id": 0,    "url": "www.example.com/b.html"  },  {    "time": "2017-01-26T00:00:55.0000000z",    "user_id": 1,    "url": "www.example.com/c.html"  },  // ...]

To measure how long each user sessions are, you can use the following query:

CREATE TABLE localinput(time DATETIME, user_id BIGINT, url NVARCHAR(MAX))SELECT    user_id,    MIN(time) AS window_start,    System.Timestamp AS window_end,    DATEDIFF(s, MIN(time), System.Timestamp) AS duration_in_secondsFROM localinput TIMESTAMP BY timeGROUP BY user_id, SessionWindow(minute, 2, 60) OVER (PARTITION BY user_id)

The preceding query creates a session window with a timeout of 2 minutes, a maximum duration of 60 minutes and partitioning key of user_id. This means independent session windows will be created for each user_id. For each window, this query will generate output that contains the user_id, the start time of the window (window_start), the end of the window (window_end) and the total duration of the user session (duration_in_seconds).

It's very simple, just start from the beginning if you're getting stuck.