A reliable way to scrape title, description and keywords
Generally, get_meta_tags() should get you most of what you need. You just need to set up a series of cascading checks that sample the required field from each metadata system in turn until one is found. For example, something like this:
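function get_title($url) {
    $tags = get_meta_tags($url);   // <meta name="..."> tags, keyed by lowercased name
    $props = get_meta_props($url); // <meta property="..."> tags (not implemented here)
    // ?? returns the first value that is set; || would only yield true/false
    return $tags["title"] ?? $props["og:title"] ?? /* ... further fallbacks ... */ null;
}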
The implementation above is obviously not efficient (if we implement all the getters like this, the URL is reloaded for each getter), and I didn't implement get_meta_props(), which is problematic to implement correctly using the PCRE functions (preg_*) and tedious to implement using DOMDocument.
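To illustrate the idea of loading the page once and then cascading over several metadata systems, here is a minimal sketch using DOMDocument. The helper names parse_meta() and first_of(), and the '<title>' key, are made up for this example. It assumes you have already fetched the HTML yourself (for instance with a single file_get_contents($url) call), and it does no charset handling:

```php
<?php
// Hypothetical helper: parse an HTML string once and collect every <meta> tag,
// keyed by its lowercased name="..." or property="..." attribute.
function parse_meta(string $html): array
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so collect libxml warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $meta = [];
    foreach ($doc->getElementsByTagName('meta') as $tag) {
        $key = strtolower($tag->getAttribute('name') ?: $tag->getAttribute('property'));
        $content = trim($tag->getAttribute('content'));
        // Keep the first non-empty value seen for each key.
        if ($key !== '' && $content !== '' && !isset($meta[$key])) {
            $meta[$key] = $content;
        }
    }

    // Store the <title> element under a key no meta tag can use, as a last resort.
    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0 && trim($titles->item(0)->textContent) !== '') {
        $meta['<title>'] = trim($titles->item(0)->textContent);
    }
    return $meta;
}

// Hypothetical helper: return the value of the first key that is present, or null.
function first_of(array $meta, array $keys): ?string
{
    foreach ($keys as $key) {
        if (isset($meta[$key])) {
            return $meta[$key];
        }
    }
    return null;
}

$html = '<html><head><title>Fallback title</title>'
      . '<meta property="og:title" content="OG title">'
      . '<meta name="description" content="A short description">'
      . '</head><body></body></html>';

$meta = parse_meta($html);
echo first_of($meta, ['title', 'og:title', 'twitter:title', '<title>']), "\n"; // prints "OG title"
echo first_of($meta, ['description', 'og:description']), "\n"; // prints "A short description"
```

Each getter is then just a different key list passed to first_of(), so the page is fetched and parsed only once. This still skips plenty of details (character encodings, redirects, oEmbed, JSON-LD and so on).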
Still, a correct implementation is straightforward, just a lot of work, which is the classic case for letting an external library solve the problem! Fortunately, there is one for exactly that, called simply "Embed". You can find it on GitHub, or install it with Composer by running:
composer require embed/embed