Get content-type of images in HTML file with Curl get_info


First off, to get the content type of the fetched document, use the constant CURLINFO_CONTENT_TYPE.

$type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
echo $type;
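The value returned for CURLINFO_CONTENT_TYPE is the raw header value, so it can carry parameters such as a charset, and it may not be a string at all when the server sent no usable Content-Type. A minimal sketch of reducing it to the bare MIME type follows; the helper name bare_mime_type is hypothetical and not part of curl.

```php
<?php
// Hypothetical helper (not part of curl): reduce a Content-Type header
// value to its bare, lowercased MIME type.
function bare_mime_type($contentType): ?string
{
    if (!is_string($contentType) || $contentType === '') {
        return null; // no usable Content-Type was reported
    }
    // Drop parameters such as "; charset=UTF-8"
    $mime = explode(';', $contentType, 2)[0];
    return strtolower(trim($mime));
}
```

It could be called as bare_mime_type(curl_getinfo($ch, CURLINFO_CONTENT_TYPE)) once the transfer has finished.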

Second, you only fetched the HTML, not the images. Saying that $a outputs the images is wrong: you output HTML <img src=""> tags. If you do that in a browser you see the images, but only because the browser fetches them on the client side.

The easiest way to get the content type would be to check the file extension.

$ext = pathinfo($a['src'], PATHINFO_EXTENSION);
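An extension on its own is not a content type, and a src with a query string (for example "logo.png?v=2") would confuse pathinfo. As a rough sketch, the extension can be taken from the URL path only and looked up in a small table. The function name guess_image_type and the short map are assumptions made for illustration, not a complete list.

```php
<?php
// Hypothetical helper: guess an image MIME type from the URL's file extension.
function guess_image_type(string $url): ?string
{
    // Assumed, deliberately short extension-to-type map; extend as needed.
    $map = [
        'jpg'  => 'image/jpeg',
        'jpeg' => 'image/jpeg',
        'png'  => 'image/png',
        'gif'  => 'image/gif',
        'webp' => 'image/webp',
        'svg'  => 'image/svg+xml',
    ];
    // Look only at the path so "?v=2" or "#top" cannot end up in the extension.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return $map[$ext] ?? null; // null when the extension is missing or unknown
}
```

This is only a guess based on naming: a URL without an extension, or one served by a script, yields null here.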

If you really need the content type rather than the file extension, you have to make additional curl calls. It is best to use curl_multi to run these in parallel.

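$mh = curl_multi_init();
$handles = [];
foreach ($html->find('img') as $a) {
    $src = $a['src']; // You probably need to rewrite relative URLs to absolute URLs
    $ch = curl_init($src);
    // curl_setopt() returns a bool, so do not assign its result to $ch
    curl_setopt($ch, CURLOPT_NOBODY, true); // send a HEAD request, no body expected
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not print any response
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

$running = null;
do {
    curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh); // wait for activity instead of busy-looping
    }
} while ($running);

$image_types = [];
foreach ($handles as $ch) {
    $src = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    $type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    $image_types[$src] = $type;
    curl_multi_remove_handle($mh, $ch);
}
curl_multi_close($mh);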

Code is untested
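The comment in the loop mentions rewriting relative URLs to absolute ones. A simplified sketch of that step is shown here, assuming you know the URL of the page the HTML came from. The function name resolve_url is hypothetical, and the sketch does not collapse "../" or "./" segments, so treat it as a starting point rather than a full URL resolver.

```php
<?php
// Hypothetical helper: resolve an <img> src against the URL of the page.
// Simplified: "../" and "./" segments are left as they are.
function resolve_url(string $base, string $src): string
{
    // Already absolute (has a scheme such as http: or data:)
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $src)) {
        return $src;
    }

    $parts  = parse_url($base);
    $scheme = $parts['scheme'] ?? 'http';
    $host   = $parts['host'] ?? '';
    $port   = isset($parts['port']) ? ':' . $parts['port'] : '';
    $origin = $scheme . '://' . $host . $port;

    // Protocol-relative: //cdn.example.com/a.png
    if (strncmp($src, '//', 2) === 0) {
        return $scheme . ':' . $src;
    }
    // Root-relative: /img/a.png
    if (strncmp($src, '/', 1) === 0) {
        return $origin . $src;
    }
    // Path-relative: resolve against the directory part of the base path
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $src;
}
```

Inside the loop this would be used as $src = resolve_url($pageUrl, $a['src']);, where $pageUrl is an assumed variable holding the address the HTML was fetched from.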

Note that the received content-type can never be fully trusted and should never be used for security measures. If you want to be sure, download all the images (GET request instead of HEAD request) and use the fileinfo extension.
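As a rough illustration of the fileinfo approach, the sketch below inspects the bytes themselves instead of trusting the header. The helper name sniff_mime_type is hypothetical, and a tiny GIF is embedded as base64 so the example runs without a network request; in practice the bytes would come from the GET response body.

```php
<?php
// Hypothetical helper: detect the MIME type from the downloaded bytes.
// Requires the fileinfo extension.
function sniff_mime_type(string $bytes): string
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $type  = $finfo->buffer($bytes);
    // Fall back to a generic type if detection fails
    return $type === false ? 'application/octet-stream' : $type;
}

// A 1x1 GIF embedded for the demo; real code would pass the response body.
$gif = base64_decode('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7');
echo sniff_mime_type($gif), "\n";
```

Comparing this result with the Content-Type header the server reported is one way to spot mislabeled files.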