PHP regex to get the string inside an href attribute
Don't use a regex for this. You can use XPath and PHP's built-in functions to get what you want:
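// simplexml_load_string() expects well-formed XML and returns false otherwise.
$xml = simplexml_load_string($myHtml);
$list = $xml->xpath("//@href");
$preparedUrls = array();
foreach ($list as $item) {
    // Cast the SimpleXMLElement attribute to a string before parsing it.
    $item = parse_url((string) $item);
    $preparedUrls[] = $item['scheme'] . '://' . $item['host'] . '/';
}
print_r($preparedUrls);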
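Since simplexml_load_string() needs well-formed XML, real-world HTML (unclosed tags, entities such as &nbsp;) can make it fail. A minimal sketch of the same XPath idea using DOMDocument::loadHTML(), which tolerates imperfect markup, might look like this. The sample markup in $myHtml and the choice to skip relative links are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical sample input; the unclosed <br> would not be valid XML.
$myHtml = '<p><a href="http://www.mydomain.com/page.html">URL</a><br>'
        . '<a href="https://example.org/a/b?c=d">Other</a></p>';

$doc = new DOMDocument();
// Collect libxml warnings about imperfect markup instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($myHtml);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$preparedUrls = array();
foreach ($xpath->query('//a/@href') as $attr) {
    $parts = parse_url($attr->nodeValue);
    // Assumption: relative links (no scheme or host) are skipped.
    if (isset($parts['scheme'], $parts['host'])) {
        $preparedUrls[] = $parts['scheme'] . '://' . $parts['host'] . '/';
    }
}
print_r($preparedUrls);
```

With the sample input above, $preparedUrls should contain http://www.mydomain.com/ and https://example.org/.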
$html = '<a href="http://www.mydomain.com/page.html">URL</a>';
// [^"]+ stops at the closing quote; a greedy .+ would run on to the last "> in the input.
if (preg_match('/<a href="([^"]+)">/', $html, $match)) {
    $info = parse_url($match[1]);
    echo $info['scheme'] . '://' . $info['host']; // http://www.mydomain.com
}
This expression handles three cases:
- no quotes
- double quotes
- single quotes
'/href=["\']?([^"\'>]+)["\']?/'
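As a rough illustration of how that pattern might be applied, the sketch below runs it through preg_match_all() over one link of each kind. The sample markup and the example hostnames are made up for the demonstration.

```php
<?php
$pattern = '/href=["\']?([^"\'>]+)["\']?/';

// Hypothetical input: one double-quoted, one single-quoted, one unquoted href.
$html = '<a href="http://a.example/one">1</a>'
      . "<a href='http://b.example/two'>2</a>"
      . '<a href=http://c.example/three>3</a>';

// $matches[1] holds the first capture group of every match.
preg_match_all($pattern, $html, $matches);
print_r($matches[1]);
```

One caveat worth keeping in mind: the character class does not exclude whitespace, so an unquoted value followed by another attribute (for example href=foo class=bar) would be captured together with the text up to the closing >.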