
Retrieve all hashtags from a tweet in a PHP function


I created my own solution. It:

  • Finds all hashtags in a string
  • Removes duplicates
  • Sorts hashtags by how often they occur in the text
  • Supports unicode characters

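    function getHashtags($string) {
        $hashtags = FALSE;
        preg_match_all("/(#\w+)/u", $string, $matches);
        // $matches is always set, so test the list of full matches instead.
        if (!empty($matches[0])) {
            // Count occurrences; the keys are the unique hashtags.
            $hashtagsArray = array_count_values($matches[0]);
            // Put the most frequent hashtags first.
            arsort($hashtagsArray);
            $hashtags = array_keys($hashtagsArray);
        }
        return $hashtags;
    }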

Output is like this:

    (
        [0] => #_ƒOllOw_
        [1] => #FF
        [2] => #neslitükendi
        [3] => #F_0_L_L_O_W_
        [4] => #takipedeğerdost
        [5] => #GönüldenTakipleşiyorum
    )
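
If you also want the counts, here is a minimal sketch of a variation on the function above. The name countHashtags is hypothetical (it is not part of the original answer), and the sketch assumes PHP 7 or later for the type declarations. It returns an array of hashtag => count with the most frequent hashtag first, or an empty array when nothing matches.

```php
<?php
// Hypothetical helper, not from the original answer:
// returns [hashtag => count], most frequent first.
function countHashtags(string $text): array
{
    // preg_match_all() returns the number of full matches
    // (0 if there are none, false on error).
    if (!preg_match_all('/#\w+/u', $text, $matches)) {
        return [];
    }
    // Count how often each hashtag occurs; keys are the unique hashtags.
    $counts = array_count_values($matches[0]);
    // Sort by count, descending, while keeping the hashtag keys.
    arsort($counts);
    return $counts;
}

$counts = countHashtags("#php is fun, #regex is fun, #php again");
print_r($counts);
```

Note that the comparison is case-sensitive, so #PHP and #php would be counted separately. Lowercase the matches first (for example with mb_strtolower) if that is not what you want.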


    $tweet = "this has a #hashtag a  #badhash-tag and a #goodhash_tag";
    preg_match_all("/(#\w+)/", $tweet, $matches);
    var_dump($matches);

Note: dashes are not valid hashtag characters, but underscores are, so the pattern captures only #badhash from #badhash-tag and all of #goodhash_tag.


Don't forget hashtags that contain Unicode characters, digits, and underscores:

    $tweet = "Valid hashtags include: #hashtag #NYC2016 #NYC_2016 #gøypålandet!";
    preg_match_all('/#([\p{Pc}\p{N}\p{L}\p{Mn}]+)/u', $tweet, $matches);
    print_r($matches);

\p{Pc} - connector punctuation, which includes the underscore

\p{N} - numeric character in any script

\p{L} - letter from any language

\p{Mn} - any non-spacing mark (combining accents, umlauts, etc.)
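
To see why \p{Mn} matters, here is a small sketch that compares the pattern with and without it. The sample tweet is made up for illustration, and the "\u{0301}" escape assumes PHP 7.0 or later. The "é" is deliberately written as a plain "e" followed by U+0301 (a combining acute accent), which is a non-spacing mark rather than a letter.

```php
<?php
// "é" written as "e" followed by U+0301 (combining acute accent, category Mn).
$tweet = "Coffee at the #cafe\u{0301} today";

// Same character class as above, but without \p{Mn}.
preg_match_all('/#([\p{Pc}\p{N}\p{L}]+)/u', $tweet, $without);
// The full character class, including \p{Mn}.
preg_match_all('/#([\p{Pc}\p{N}\p{L}\p{Mn}]+)/u', $tweet, $with);

var_dump($without[1][0] === "cafe");        // bool(true): the accent is cut off
var_dump($with[1][0] === "cafe\u{0301}");   // bool(true): the accent is kept
```

Text that uses precomposed characters (a single code point for "é") matches via \p{L} alone, so the difference only shows up with decomposed input.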