Curl encoding of SERP


If what you say is really true, then it's a problem with the server, not with curl. But most likely it's not a problem with the server either; it's probably a problem with how you view the result. Here are my theories, from most likely to least likely:

1: You view the result in a web browser without supplying the charset parameter in the Content-Type header, so the browser identifies the content as HTML4, where the default charset is ISO-8859-1, and renders it as ISO-8859-1. That charset doesn't support ś, and the browser turns the unrenderable characters into ?. The fix is to change the Content-Type header to Content-Type: text/html; charset=utf-8

2: Same as above, but your server is actually sending the wrong Content-Type header, e.g. Content-Type: text/html; charset=ISO-8859-1. The fix is the same as above.

3: The server is storing data in an SQL database (like MySQL) with the storage charset set to ISO-8859-1 (or something close to it), and the database replaces invalid characters with ? (I've seen this many times in the past, but not in recent years). In that case the server code must be fixed. Check this answer: https://stackoverflow.com/a/279279/1067003

4: You run PHP in a terminal that doesn't support Unicode characters. The solution is to switch to a better terminal. (Not very likely, but hey, xterm is still around and still has a no-Unicode version; you could be using plain xterm.)

5: The server really is running some version of $response=str_replace('ś','?',$response);echo $response; ... Highly unlikely, but not impossible, and it would also have to be fixed on the server side. Check this answer: https://stackoverflow.com/a/279279/1067003
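To narrow down which theory applies, it can help to look at the raw bytes of the response instead of trusting how they are displayed. Here is a rough sketch: inspect_body() is a made-up helper, and the sample strings only stand in for a real curl response. If the body is valid UTF-8 and contains the bytes for ś (c5 9b), the data arrived intact and the problem is on the display side (theories 1, 2 and 4). If there is a literal ? where ś should be, the damage happened on the server (theories 3 and 5).

```php
<?php
// Hypothetical helper, not part of the original code: summarizes what is
// actually inside a response body, independent of how it gets rendered.
function inspect_body(string $body, string $needle = "\u{015B}"): array
{
    return [
        // preg_match() with the /u flag fails on invalid UTF-8 input
        'valid_utf8'      => preg_match('//u', $body) === 1,
        // true if the UTF-8 bytes of the needle (ś by default) are present
        'contains_needle' => strpos($body, $needle) !== false,
        // number of literal question marks in the body
        'question_marks'  => substr_count($body, '?'),
        // hex dump of the needle, handy for comparing against a hex dump of the body
        'needle_hex'      => bin2hex($needle),
    ];
}

// Sample strings standing in for the value returned by curl_exec().
$intact  = "wiadomo\u{015B}ci";
$damaged = "wiadomo?ci";

var_dump(inspect_body($intact));
var_dump(inspect_body($damaged));
```

In real use you would pass in the string returned by curl_exec() (with CURLOPT_RETURNTRANSFER enabled) instead of the sample strings.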

Lastly, a protip: you're confused about CURLOPT_HTTPHEADER. It sets the headers curl sends to the target URL in the request, so when you set Content-Type with CURLOPT_HTTPHEADER, you set the Content-Type of the request body of the curl request. But because you're using neither CURLOPT_INFILE nor CURLOPT_POSTFIELDS, there is no request body at all, so there shouldn't be any Content-Type header in the request; get rid of it. You were probably looking for the header() function, e.g. header('Content-Type: text/html; charset=utf-8');, which sends that header to the browser.
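Put together, the request/response split could look something like this. It's only a sketch, since I haven't seen your full code: fetch_body() is a made-up name and the URL in the commented usage is a placeholder. The curl request carries no Content-Type header because it has no body, and header() tells the browser how to decode what your script echoes.

```php
<?php
// Hypothetical helper: a plain GET with no request body,
// and therefore no Content-Type request header.
function fetch_body(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects, if any
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('curl failed: ' . curl_error($ch));
    }
    return $body;
}

// Usage (placeholder URL). header() must run before any output is sent:
// header('Content-Type: text/html; charset=utf-8');
// echo fetch_body('https://example.com/');
```

This assumes the target really serves UTF-8; if it serves some other charset, declare that one in header() or convert the body first.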