Get title of website via link


This answer expands on @AI W's suggestion to use the title of the page. The code below does what he described.

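    <?php
    function get_title($url) {
        // file_get_contents() returns false on failure
        $str = file_get_contents($url);
        if ($str === false || strlen($str) === 0) {
            return null;
        }
        $str = trim(preg_replace('/\s+/', ' ', $str)); // supports line breaks inside <title>
        // ignore case; non-greedy so only the first <title> element is captured
        if (preg_match('/<title[^>]*>(.*?)<\/title>/i', $str, $title)) {
            return trim($title[1]);
        }
        return null; // no <title> found
    }

    // Example:
    echo get_title("http://www.washingtontimes.com/");
    ?>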

Output:

Washington Times - Politics, Breaking News, US and World News

As you can see, this is not exactly the name Google displays, which leads me to believe that they take the URL's hostname and match it against their own list:

http://www.washingtontimes.com/ => The Washington Times
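If hostnames really are matched against a curated list, the lookup could look roughly like the sketch below. This is only a guess at the approach: the `SITE_NAMES` table and the `site_name_for_url()` function are hypothetical names, the single entry is the one from the example above, and stripping a leading `www.` is an assumption.

```php
<?php
// Hypothetical lookup table; the only entry comes from the example above.
const SITE_NAMES = [
    'washingtontimes.com' => 'The Washington Times',
];

// Returns the curated name for a URL's host, or null when the host is not listed.
function site_name_for_url(string $url, array $names = SITE_NAMES): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // malformed URL or no host component
    }
    $host = strtolower($host);
    // Assumption: "www.example.com" and "example.com" are the same site.
    if (strncmp($host, 'www.', 4) === 0) {
        $host = substr($host, 4);
    }
    return $names[$host] ?? null;
}

echo site_name_for_url('http://www.washingtontimes.com/news/2010/dec/3/debt-panel-fails-test-vote/'), "\n";
```

When the host is not in the list, the function returns null, so a caller could fall back to the page's own title.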


    $doc = new DOMDocument();
    @$doc->loadHTMLFile('http://www.washingtontimes.com/news/2010/dec/3/debt-panel-fails-test-vote/');
    $xpath = new DOMXPath($doc);
    echo $xpath->query('//title')->item(0)->nodeValue."\n";

Output:

Debt commission falls short on test vote - Washington Times

Obviously you should also implement basic error handling.
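As a rough idea of what that error handling might involve, here is a minimal sketch of the DOMDocument approach with the failure cases checked. The function names `title_from_html()` and `title_from_url()` are hypothetical, and returning null on every failure is just one possible convention. Character-encoding issues are not handled.

```php
<?php
// Hypothetical helper: returns the <title> text of an HTML string, or null.
function title_from_html(string $html): ?string
{
    if (trim($html) === '') {
        return null; // loadHTML() does not accept empty input
    }

    // Collect libxml parse warnings instead of printing them (replaces the @ operator).
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null; // markup could not be parsed at all
    }

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//title');
    if ($nodes === false || $nodes->length === 0) {
        return null; // document has no <title> element
    }

    $title = trim($nodes->item(0)->nodeValue);
    return $title === '' ? null : $title;
}

// Hypothetical wrapper: fetches the page, then delegates to title_from_html().
function title_from_url(string $url): ?string
{
    $html = @file_get_contents($url); // false on network or HTTP failure
    return $html === false ? null : title_from_html($html);
}
```

Splitting the parsing from the fetching keeps the parsing step usable on HTML obtained some other way, for example via cURL.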


Calling get_meta_tags() on the domain's home page (here, The Washington Times) returns data that might need truncating but could be useful.

    $b = "http://www.washingtontimes.com/news/2010/dec/3/debt-panel-fails-test-vote/";
    $url = parse_url($b);
    $tags = get_meta_tags($url['scheme'].'://'.$url['host']);
    var_dump($tags);

The output includes the description 'The Washington Times delivers breaking news and commentary on the issues that affect the future of our nation.'
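If the description is too long to display, it could be shortened at a word boundary. The sketch below is one possible way to do that; `short_description()` and its 60-byte default limit are hypothetical, and it assumes the array passed in has the shape returned by get_meta_tags() (lowercase keys such as `description`).

```php
<?php
// Hypothetical helper: picks the description out of get_meta_tags() output and
// shortens it at a word boundary. Lengths are counted in bytes, not characters.
function short_description(array $tags, int $max = 60): ?string
{
    $text = trim($tags['description'] ?? '');
    if ($text === '') {
        return null; // no usable description meta tag
    }
    if (strlen($text) <= $max) {
        return $text; // already short enough
    }
    // wordwrap() breaks on spaces; the first line is the longest prefix that fits.
    $firstLine = strtok(wordwrap($text, $max, "\n", true), "\n");
    return rtrim($firstLine, " ,.;:") . '...';
}
```

Note that get_meta_tags() returns false on failure, so the result should be checked before it is passed to a helper like this.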