A reliable way to scrape title, description and keywords


Generally, get_meta_tags() should get you most of what you need. You just need to set up a series of cascading checks that sample the required field from each metadata system until one is found. For example, something like this:

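function get_title($url) {
    $tags = get_meta_tags($url);    // built-in: reads <meta name="..."> tags
    $props = get_meta_props($url);  // not implemented here: would read <meta property="..."> tags
    // ?? returns the first operand that is set and not null ("||" would only yield a boolean)
    return $tags["title"] ?? $props["og:title"] ?? null; // ...and so on for further fallbacks
}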

The above implementation is obviously not efficient (if all the getters were implemented like this, you'd reload the URL for each one), and I didn't implement get_meta_props(), which is problematic to implement correctly using the PCRE functions (preg_*) and tedious to implement using DOMDocument.
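To give a rough idea of the DOMDocument route, here is a minimal sketch that parses the HTML once and then runs the cascading checks against the result. It assumes the dom extension is available; parse_meta() and first_of() are hypothetical helper names, and the fallback order (including twitter:title) is only an illustration, not an established standard.

```php
<?php
// Hypothetical helper: collects <meta> tags keyed by their "name" or "property".
function parse_meta(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect libxml warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $meta = [];
    foreach ($doc->getElementsByTagName('meta') as $tag) {
        // Standard tags use name="...", Open Graph tags use property="...".
        $key = strtolower($tag->getAttribute('name') ?: $tag->getAttribute('property'));
        $content = trim($tag->getAttribute('content'));
        // Keep the first non-empty value seen for each key.
        if ($key !== '' && $content !== '' && !isset($meta[$key])) {
            $meta[$key] = $content;
        }
    }

    // The <title> element is not a meta tag, so read it separately.
    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0 && trim($titles->item(0)->textContent) !== '') {
        $meta['<title>'] = trim($titles->item(0)->textContent);
    }
    return $meta;
}

// Hypothetical helper: returns the first value found among $keys, or null.
function first_of(array $meta, array $keys): ?string
{
    foreach ($keys as $key) {
        if (isset($meta[$key])) {
            return $meta[$key];
        }
    }
    return null;
}

// Sample document; in practice $html would come from a single fetch of the URL.
$html = '<html><head><title>Page title</title>'
      . '<meta property="og:title" content="OG title">'
      . '<meta name="description" content="A short description">'
      . '</head><body></body></html>';

$meta = parse_meta($html);
$title = first_of($meta, ['title', 'og:title', 'twitter:title', '<title>']);
$description = first_of($meta, ['description', 'og:description']);
$keywords = first_of($meta, ['keywords']);
```

Because the page is parsed once into $meta, every getter is just another call to first_of() with a different list of keys, so the URL does not need to be reloaded. The sketch ignores character sets, redirects and the many other metadata formats found in the wild.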

Still, a correct implementation is conceptually simple but a lot of work, which is a classic scenario for an external library to solve the problem! Fortunately, there is one for just that, called simply "Embed". You can find it on GitHub, or install it with Composer by running:

composer require embed/embed