Ruby: Extracting Words From String
Use the String#split method:
words = @string1.split(/\W+/)
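This splits the string into an array, using a regular expression as the delimiter. \W matches any "non-word" character (anything other than a letter, digit, or underscore), and the "+" matches one or more of them in a row, so a run of consecutive delimiters counts as a single split point.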
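As a minimal sketch of that behaviour: the sample strings and the `extract_words` helper below are made up for illustration and are not part of the original answer. One thing worth knowing is that split keeps an empty string at the front when the text starts with a delimiter, while trailing empty strings are dropped.

```ruby
# Hypothetical helper, not part of Ruby itself or of the answer above.
def extract_words(text)
  # Split on runs of non-word characters, then drop the empty string
  # that split leaves at the front when the text starts with a delimiter.
  text.split(/\W+/).reject(&:empty?)
end

p "Hello, world! It's 9am.".split(/\W+/)  # => ["Hello", "world", "It", "s", "9am"]
p "...leading dots".split(/\W+/)          # => ["", "leading", "dots"]
p extract_words("...leading dots")        # => ["leading", "dots"]
```

Note that the apostrophe counts as a non-word character, so "It's" comes back as two pieces.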
For me, the best way to split sentences into words is:
line.split(/[^[[:word:]]]+/)
It works perfectly even with multilingual words and punctuation marks:
line = 'English words, Polski Żurek!!! crème fraîche...'
line.split(/[^[[:word:]]]+/)
=> ["English", "words", "Polski", "Żurek", "crème", "fraîche"]
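To see why the POSIX bracket form matters here, the sketch below runs both patterns on the same sample line. It assumes Ruby 1.9 or later with a UTF-8 source file, where \W treats every non-ASCII letter as a delimiter while [[:word:]] is Unicode-aware. The variable names are illustrative only.

```ruby
line = 'English words, Polski Żurek!!! crème fraîche...'

# \W only knows ASCII word characters, so accented letters act as delimiters.
ascii_words = line.split(/\W+/)

# [[:word:]] covers Unicode letters, marks, digits and connector punctuation.
unicode_words = line.split(/[^[[:word:]]]+/)

p ascii_words    # => ["English", "words", "Polski", "urek", "cr", "me", "fra", "che"]
p unicode_words  # => ["English", "words", "Polski", "Żurek", "crème", "fraîche"]
```

The inner pair of brackets is optional: /[^[:word:]]+/ is an equivalent, slightly shorter spelling.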
Well, you could split the string on spaces, if that's your delimiter of interest:
@string1.split(' ')
Or split on non-word characters or word boundaries
\W # Any non-word character
\b # Any word boundary (a zero-width position, not a character)
Or on whitespace
\s # Any whitespace character
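The differences between these delimiters are easiest to see side by side. The following is a rough sketch on a made-up sample string (two spaces after the comma); the variable names are illustrative only.

```ruby
text = "Hello,  big world"  # note the two spaces after the comma

by_space      = text.split(' ')    # awk-style: runs of whitespace collapse
by_whitespace = text.split(/\s/)   # every single whitespace character splits
by_boundary   = text.split(/\b/)   # zero-width: the separators stay as pieces
by_non_word   = text.split(/\W+/)  # runs of non-word characters are dropped

p by_space       # => ["Hello,", "big", "world"]
p by_whitespace  # => ["Hello,", "", "big", "world"]
p by_boundary    # => ["Hello", ",  ", "big", " ", "world"]
p by_non_word    # => ["Hello", "big", "world"]
```

Splitting on ' ' or \s leaves punctuation attached to the words, and \b keeps the text between words as separate elements, so /\W+/ is usually the closest fit when only the words are wanted.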
Hint: try testing each of these on http://rubular.com
Also note that Ruby 1.9 has some differences from 1.8.