Extract X number of words surrounding a given search string within a string

php mysql regex search

You might not be able to solve this problem completely with a regex, because too many different characters can appear between the words.

But you can try this regex:

((?:\S+\s*){0,5}\S*inmate\S*(?:\s*\S+){0,5})

You can test it on Rubular.

You might also want to exclude certain characters, since they should not count as words. As written, the regex treats any sequence of non-space characters surrounded by whitespace as a word.

To match only real words:

((?:\w+\s*){0,5}<search word>(?:\s*\w+){0,5})

But here any non-word character (, " . etc.) breaks the match.
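To make that failure concrete, here is a minimal sketch using a shortened version of the sample sentence (the string and variable names are just for illustration). Because the search word is wrapped in quotes, the \w-only pattern cannot pick up any context on either side:

```php
<?php
// Illustrative input: the search word is surrounded by double quotes.
$text = 'if a user enters "inmate" as a search word';

// \w-only version of the pattern, with "inmate" as the search word.
preg_match('/(?:\w+\s*){0,5}inmate(?:\s*\w+){0,5}/', $text, $m);

// The quotes are neither \w nor \s, so no neighbouring words can be attached.
echo $m[0]; // prints: inmate
```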

So you can go on...

((?:[\w"',.-]+\s*){0,5}["',.-]?<search word>["',.-]?(?:\s*[\w"',.-]+){0,5})

This also matches up to 5 words on each side, allowing the characters "',.- inside the surrounding words and one of them directly before or after your search term.
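As a rough check of the punctuation-tolerant pattern, the sketch below runs it against the sample sentence with "inmate" as the search word. The variable names are illustrative, and the single quote inside the character classes has to be escaped because the pattern sits in a single-quoted PHP string:

```php
<?php
$text = 'For example, if a user enters "inmate" as a search word and the MySQL';

// Same pattern as above, with "inmate" as the search word.
// \' is the PHP escape for a literal single quote inside '...'.
$pattern = '/(?:[\w"\',.-]+\s*){0,5}["\',.-]?inmate["\',.-]?(?:\s*[\w"\',.-]+){0,5}/';

preg_match($pattern, $text, $m);

// Five words on each side; the comma and the quotes no longer break the match.
echo $m[0]; // prints: example, if a user enters "inmate" as a search word and
```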

To use it in php:

$sourcestring = "For example, if a user enters \"inmate\" as a search word and the MySQL";
preg_match_all('/(?:\S+\s*){0,5}\S*inmate\S*(?:\s*\S+){0,5}/s', $sourcestring, $matches);
// There may be more than one match; they are all in $matches[0][x].
echo $matches[0][0];
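Since the question asks for X words rather than a fixed 5, the same pattern can be built from parameters. This is only a sketch: the function name wordsAround and its signature are made up for illustration, and the search term is passed through preg_quote so that regex metacharacters in user input are treated literally.

```php
<?php
// Hypothetical helper (not part of PHP): returns every match of $term
// together with up to $n whitespace-separated words on each side.
function wordsAround(string $text, string $term, int $n = 5): array
{
    $n = max(0, $n);                  // guard against negative counts
    $quoted = preg_quote($term, '/'); // escape regex metacharacters and the delimiter
    $pattern = '/(?:\S+\s*){0,' . $n . '}\S*' . $quoted . '\S*(?:\s*\S+){0,' . $n . '}/i';

    preg_match_all($pattern, $text, $matches);
    return $matches[0];               // list of matched snippets, possibly empty
}

$text = 'For example, if a user enters "inmate" as a search word and the MySQL';
echo implode("\n", wordsAround($text, 'inmate', 2));
// prints: user enters "inmate" as a
```

The i modifier makes the search case-insensitive, which is an assumption about what a search box would want; drop it if case matters.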


I would use this regex for PHP, which also takes UTF-8 characters into account:

'~(?:[\p{L}\p{N}\']+[^\p{L}\p{N}\']+){0,5}<search word>(?:[^\p{L}\p{N}\']+[\p{L}\p{N}\']+){0,5}~u'

Here '~' is the delimiter, and the modifier 'u' at the end makes PHP treat the pattern and the subject string as UTF-8.

See the documentation on Unicode regex properties here:
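A short sketch of how this Unicode-aware pattern might be used with a variable word count. The function name contextUtf8 and the German sample sentence are assumptions for illustration, and the source file is assumed to be saved as UTF-8:

```php
<?php
// Hypothetical helper: up to $n words (letters, digits, apostrophes)
// on each side of $term, using Unicode properties and the u modifier.
function contextUtf8(string $text, string $term, int $n = 5): array
{
    $word = '[\p{L}\p{N}\']';   // characters that belong to a word
    $gap  = '[^\p{L}\p{N}\']';  // anything else separates words
    $n    = max(0, $n);

    $pattern = '~(?:' . $word . '+' . $gap . '+){0,' . $n . '}'
             . preg_quote($term, '~')
             . '(?:' . $gap . '+' . $word . '+){0,' . $n . '}~u';

    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

$text = 'Der Wärter führte den Häftling über den Hof zurück';
echo implode("\n", contextUtf8($text, 'Häftling', 2));
// prints: führte den Häftling über den
```

Without the u modifier, multibyte letters such as ä and ü would be read byte by byte and \p{L} would not recognise them as letters.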

http://www.regular-expressions.info/refunicode.html
