How does SQL query parameterisation work?


A parameterized query doesn't actually do string replacement. If you use string substitution, then the SQL engine actually sees a query that looks like

SELECT * FROM mytable WHERE user='wayne'

If you use a ? parameter, then the SQL engine sees a query that looks like

SELECT * FROM mytable WHERE user=<some value>

This means that before the engine even sees the string "wayne", it can fully parse the query and understand, in general terms, what the query does. It then sticks "wayne" into its own internal representation of the query, not into the SQL string that describes the query. SQL injection through that value is therefore not possible, since the SQL parsing stage is already over by the time the value arrives.

(The above is generalized, but it more or less conveys the idea.)
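To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for whatever database you use. The table contents and the hostile input are made up for illustration; only the table and column names come from the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (user TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("wayne", "w-secret"), ("alice", "a-secret")],
)

# Hypothetical hostile input supplied by a user.
hostile = "nobody' OR '1'='1"

# String substitution: the value becomes part of the SQL text,
# so the quote inside it changes the structure of the query.
substituted_sql = "SELECT * FROM mytable WHERE user='%s'" % hostile
rows_substituted = conn.execute(substituted_sql).fetchall()

# Parameter: the SQL text is fixed and the value is bound separately,
# so it is only ever compared as a plain string.
rows_bound = conn.execute(
    "SELECT * FROM mytable WHERE user=?", (hostile,)
).fetchall()

print(len(rows_substituted), len(rows_bound))
```

The substituted query matches every row because the injected OR clause is parsed as SQL, while the bound query matches nothing because no user has that literal name.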


When you do text replacement (like your method B), you have to be wary of quotes and such, because the server gets a single piece of text and has to determine where the value ends.

With parameterized statements, on the other hand, the DB server gets the statement as is, without the parameter. The value is sent to the server as a separate piece of data, using a simple binary-safe protocol. Therefore, your program doesn't have to put quotes around the value, and of course it doesn't matter if there were already quotes in the value itself.

An analogy is source code versus compiled code: in your method B, you're building the source code of a procedure, so you have to be sure to strictly follow the language syntax. With method A, you first build and compile a procedure, then (immediately after, in your example) you call that procedure with your value as a parameter. And of course, in-memory values aren't subject to syntax limitations.

Umm... that wasn't really an analogy, it's really what is happening under the hood (roughly).
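A small sketch of the "no quoting needed" point, again using Python's sqlite3 module as an assumed stand-in for your driver. The `notes` table and the sample values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")

# The statement text never changes. It contains no value, so there is
# nothing to quote or escape in it.
insert_sql = "INSERT INTO notes (body) VALUES (?)"

# Values that would need careful escaping if pasted into SQL text.
tricky_values = ["O'Reilly", 'say "hi"', "x'); DROP TABLE notes; --"]
for value in tricky_values:
    conn.execute(insert_sql, (value,))

# Each value comes back exactly as it was passed in.
stored = [row[0] for row in conn.execute("SELECT body FROM notes ORDER BY rowid")]
print(stored == tricky_values)
```

The values travel next to the statement rather than inside it, which is why quotes in them are just data.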


Using parameterized queries is a good way to punt the task of escaping and preventing injections to the DB client library. With some client libraries, the library escapes the value and then replaces the "?" with the escaped string. This is done in the client library, before the query reaches the DB server.

If you have MySQL running, turn on the query log and try a few parameterized queries with a client library that substitutes on the client side. You will see that the MySQL server receives fully substituted queries with no "?" in them, and that the client library has already escaped any quotes in your "parameter" for you.

If you use method B with plain string replacement, quotes are not automatically escaped.
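The client-side substitution described above can be sketched roughly as follows. This is a toy illustration, not the code of any real driver: `quote_literal` and `emulate_parameters` are hypothetical helpers, the escaping follows SQLite's string-literal rule (double each single quote) rather than MySQL's, and only string parameters are handled.

```python
import sqlite3

def quote_literal(value: str) -> str:
    # Toy escaping for a SQLite string literal: double each single quote.
    return "'" + value.replace("'", "''") + "'"

def emulate_parameters(sql: str, params: tuple) -> str:
    # Hypothetical helper: replace each "?" with an escaped literal,
    # left to right. It does not handle "?" inside literals in the SQL text.
    parts = sql.split("?")
    if len(parts) - 1 != len(params):
        raise ValueError("placeholder count does not match parameter count")
    out = parts[0]
    for param, rest in zip(params, parts[1:]):
        out += quote_literal(param) + rest
    return out

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (user TEXT)")
conn.execute("INSERT INTO mytable VALUES ('wayne')")

# The text the server would receive: fully substituted, no "?" left.
final_sql = emulate_parameters(
    "SELECT * FROM mytable WHERE user=?", ("O'Brien' OR '1'='1",)
)
print(final_sql)

rows_hostile = conn.execute(final_sql).fetchall()
rows_wayne = conn.execute(
    emulate_parameters("SELECT * FROM mytable WHERE user=?", ("wayne",))
).fetchall()
```

Because the quotes are escaped before substitution, the hostile value stays inside one string literal and the query structure is unchanged. In real code, rely on the driver's own parameter handling rather than a hand-written helper like this.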

As a bonus, with MySQL you can prepare a parameterized query ahead of time, and then use the prepared statement repeatedly later. When you prepare a query, MySQL parses it and gives you back a prepared statement, that is, some parsed representation MySQL understands. Each time you use the prepared statement, not only are you guarded against injection, but you also avoid the cost of parsing the query again.
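The prepare-once, execute-many pattern looks roughly like this. The sketch uses Python's sqlite3 module instead of MySQL, so it is an analogy rather than the MySQL API: sqlite3 prepares statements implicitly and keeps an internal cache of compiled statements keyed by the SQL text, so reusing the same text with different parameters avoids re-parsing. The `visits` column and `visits_for` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (user TEXT, visits INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("wayne", 3), ("alice", 5), ("bob", 8)],
)

# One fixed statement text, reused for every lookup.
LOOKUP = "SELECT visits FROM mytable WHERE user=?"

def visits_for(user):
    # Only the bound value changes between calls; the SQL text does not.
    row = conn.execute(LOOKUP, (user,)).fetchone()
    return row[0] if row else None

results = {name: visits_for(name) for name in ("wayne", "alice", "nobody")}
print(results)
```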

And, if you really want to be secure, you can modify your DB access/ORM layer so that 1) web server code can only use prepared statements, and 2) statements can only be prepared before your web server starts. Then, even if your web app is hacked into (say via a buffer overrun exploit), the attacker can still only use the prepared statements and nothing more. For this you need to jail your web app and only allow access to the database via your DB access/ORM layer.
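One way such a restricted access layer might look, as a minimal sketch. `StatementRegistry` and its method names are hypothetical, and sqlite3 stands in for the real database. Note that an in-process class like this only illustrates the interface; the actual protection depends on the jailing described above, so that the app has no other path to the database.

```python
import sqlite3

class StatementRegistry:
    """Hypothetical access layer: statements are registered at startup, then frozen."""

    def __init__(self, conn):
        self._conn = conn
        self._statements = {}
        self._frozen = False

    def prepare(self, name, sql):
        # Only allowed before the server starts handling requests.
        if self._frozen:
            raise RuntimeError("no new statements after startup")
        self._statements[name] = sql

    def freeze(self):
        self._frozen = True

    def run(self, name, params=()):
        # Unknown names raise KeyError; callers never pass SQL text here.
        sql = self._statements[name]
        return self._conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (user TEXT)")
conn.execute("INSERT INTO mytable VALUES ('wayne')")

db = StatementRegistry(conn)
db.prepare("user_by_name", "SELECT user FROM mytable WHERE user=?")
db.freeze()  # from here on, request handlers can only call db.run(...)

rows = db.run("user_by_name", ("wayne",))

# Registering anything new after startup is refused.
rejected = False
try:
    db.prepare("evil", "DROP TABLE mytable")
except RuntimeError:
    rejected = True
```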