Can an HTTP request be sent with the nginx location directive?


As pointed out in the comments, ngx_http_lua_module can do it:

    location / {
        access_by_lua_block {
            os.execute("/usr/bin/curl --data 'v=1&t=pageview&tid=UA-XXXXXXXX-X&cid=123&dp=hit' https://google-analytics.com/collect >/dev/null 2>/dev/null")
        }
    }

Note that os.execute blocks, so the page load halts until curl has finished. To run curl in the background and continue the page load immediately, add a space and an & at the end of the command, so that it ends with:

>/dev/null 2>/dev/null &")


What you're trying to do, executing a new curl instance for Google Analytics on each URL request on your server, is the wrong approach to the problem:

  1. Nginx itself is easily capable of servicing 10k+ concurrent connections at any given time, and that is a lower bound if you do things right; see https://en.wikipedia.org/wiki/C10k_problem.

  2. On the other hand, fork, the underlying system call that creates a new process and that would be needed to run curl for each request, is very slow: on the order of 1k forks per second as an upper limit, i.e. the fastest it will ever go even if you do things right; see "Faster forking of large processes on Linux?".
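To get a feel for the fork figure on your own machine, a rough measurement could look like the sketch below. The function name forks_per_second is made up for this illustration. It only times fork, exit and wait; it does not exec curl, so a real curl-per-request setup would be slower still.

```python
import os
import time


def forks_per_second(n: int = 200) -> float:
    """Fork n short-lived children one after another and return the observed rate."""
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            # Child: exit immediately, skipping Python's cleanup handlers.
            os._exit(0)
        # Parent: reap the child so that no zombies accumulate.
        os.waitpid(pid, 0)
    elapsed = time.perf_counter() - start
    return n / elapsed


if __name__ == "__main__":
    print(f"{forks_per_second():.0f} fork+exit+wait cycles per second")
```

The number varies with the size of the forking process and with the kernel, so treat it as an order-of-magnitude check rather than a benchmark.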


What's the best alternative solution with better architecture?

  • My recommendation would be to do this through batch processing. You're not really gaining anything by doing Google Analytics in real time, and a 5-minute delay in statistics should be more than adequate. You could write a simple script in a programming language of your choice to look through the relevant access_log (http://nginx.org/r/access_log), collect the data for the required time period, and make a single batch request (and/or multiple individual requests from within a single process) to Google Analytics with the requisite information about each visitor in the last 5 minutes. You can run this as a daemon process, or as a script from a cron job; see crontab(5) and crontab(1).

  • Alternatively, if you still want real-time processing for Google Analytics (which I don't recommend, because most of these services themselves are implemented on an eventual consistency basis, meaning, GA itself wouldn't necessarily guarantee accurate real-time statistics for the last XX seconds/minutes/hours/etc), then you might want to implement a daemon of some sort to handle statistics in real time:

    • My suggestion would still be to use access_log in such a daemon, for example through an equivalent of tail -f /var/www/logs/access_log in your favourite programming language, where you'd open the access_log file as a stream and process the data as and when it comes in.

    • Alternatively, you could implement this daemon with an HTTP interface of its own, and duplicate each incoming request to both your actual backend and this extra server. You could multiplex this through nginx with the help of auth_request or add_after_body (neither is built by default) to make a "free" subrequest for each request. This subrequest would go to your server, written, for example, in Go. The server would have at least two goroutines: one would accept incoming requests and put them into a queue (implemented as a buffered string channel), replying to the client immediately so as not to delay nginx; the other would receive the requests from that chan string, processing them as it goes and sending the appropriate requests to Google Analytics.
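The batch-processing option from the first bullet could be sketched roughly as follows. This is only an illustration under a few assumptions: the log uses nginx's default "combined" format, the hits go to the Measurement Protocol v1 /batch endpoint (which accepts up to 20 hits per request), and the client id is derived from the IP address. The function names and that cid scheme are made up for the example.

```python
import re
import uuid
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: nginx's default "combined" log format.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"
BATCH_URL = "https://www.google-analytics.com/batch"
MAX_HITS_PER_BATCH = 20  # batch limit of the Measurement Protocol v1


def hits_since(lines, cutoff, tid):
    """Yield one urlencoded pageview payload per log line newer than cutoff."""
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        when = datetime.strptime(m["time"], TIME_FORMAT)
        if when < cutoff:
            continue
        # Hypothetical client id: a stable UUID derived from the IP address.
        cid = uuid.uuid5(uuid.NAMESPACE_URL, m["ip"])
        yield urlencode(
            {"v": 1, "t": "pageview", "tid": tid, "cid": str(cid), "dp": m["path"]}
        )


def batches(hits, size=MAX_HITS_PER_BATCH):
    """Group payloads into newline-separated request bodies of at most size hits."""
    chunk = []
    for hit in hits:
        chunk.append(hit)
        if len(chunk) == size:
            yield "\n".join(chunk)
            chunk = []
    if chunk:
        yield "\n".join(chunk)


def send(body):
    """POST one batch body to Google Analytics."""
    urlopen(Request(BATCH_URL, data=body.encode("utf-8")), timeout=10).close()


def run_once(log_path, tid, window=timedelta(minutes=5)):
    """Send every hit logged within the last window; meant to be run from cron."""
    cutoff = datetime.now(timezone.utc) - window
    with open(log_path) as log:
        for body in batches(hits_since(log, cutoff, tid)):
            send(body)
```

Calling run_once("/var/www/logs/access_log", "UA-XXXXXXXX-X") from a cron job every five minutes gives one process and a handful of HTTP requests per run, instead of one curl process per page view.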

Ultimately, whichever way you go, you'd probably still want to implement some level of batching and/or throttling, because I'd imagine that at some point Google Analytics itself would throttle you if you keep sending it an excessive number of requests from the same IP address without any sort of batching. As per "What is the rate limit for direct use of the Google Analytics Measurement Protocol API?" as well as https://developers.google.com/analytics/devguides/collection/protocol/v1/limits-quotas, it would appear that most libraries implement internal limits on how many requests per second they send to Google.
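The queue-plus-worker design with batching and throttling could look something like the sketch below. The answer suggests Go; this shows the same shape in Python, with a thread and queue.Queue standing in for the goroutines and the buffered channel. HitForwarder, send_batch and the default limits are all made up for the example, and the HTTP front end is left out.

```python
import queue
import threading
import time


class HitForwarder:
    """Accept hits quickly on one side, forward them in throttled batches on the other."""

    def __init__(self, send_batch, capacity=10000, batch_size=20, min_interval=1.0):
        # The bounded queue plays the role of the buffered channel.
        self._queue = queue.Queue(maxsize=capacity)
        self._send_batch = send_batch  # callable that takes a list of hits
        self._batch_size = batch_size
        self._min_interval = min_interval
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, hit):
        """Called from the request handler; never blocks, drops the hit if the queue is full."""
        try:
            self._queue.put_nowait(hit)
            return True
        except queue.Full:
            return False

    def close(self):
        """Flush what is queued and stop the worker."""
        self._queue.put(None)  # None is the shutdown sentinel
        self._worker.join()

    def _run(self):
        while True:
            first = self._queue.get()  # block until there is something to send
            if first is None:
                return
            batch = [first]
            stop = False
            # Take whatever else is already waiting, up to the batch size.
            while len(batch) < self._batch_size:
                try:
                    item = self._queue.get_nowait()
                except queue.Empty:
                    break
                if item is None:
                    stop = True
                    break
                batch.append(item)
            self._send_batch(batch)
            if stop:
                return
            # Crude throttle: at most one outgoing request per interval.
            time.sleep(self._min_interval)
```

The handler for nginx's subrequest would only call submit() and reply straight away, so nginx is never held up by Google Analytics; send_batch is where the actual request to Google would go.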


If all you need is to submit a hit to Google Analytics, it can be accomplished more easily: Nginx can modify the page HTML on the fly, embedding the GA code before the closing </body> tag:

    sub_filter_once on;
    sub_filter '</body>' "<script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');ga('create', 'UA-XXXXXXXX-X', 'auto');ga('send', 'pageview');</script></body>";

    location / {
    }

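This uses the sub_filter directive from ngx_http_sub_module, which is not built by default and has to be enabled with --with-http_sub_module.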