PostgreSQL full text search - Japanese, Chinese, Arabic


Just a hint from the manual: A large list of dictionaries is available on the OpenOffice Wiki.


Dictionaries won't help you much with Chinese; you'll need to look into n-gram tokenising.


A similar question on stackoverflow.com is "How do I implement full text search in Chinese on PostgreSQL?".

Beyond that, here is a detailed solution based on my own experience and on solutions found on the Internet. I use SCWS together with zhparser for Chinese full-text search in Postgres.

20160131 Update:
First check that postgresql-server-devel-{version number} is installed, because building the extension relies on the PGXS build infrastructure that it provides.
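One way to confirm the prerequisite is to ask pg_config where the PGXS makefile lives. This is a minimal sketch, not part of the original answer: check_pgxs is a hypothetical helper name, and it assumes pg_config is on your PATH once the development package is installed.

```shell
# Report whether the PGXS makefile that extension builds rely on is present.
# check_pgxs is a hypothetical helper name used only for illustration.
check_pgxs() {
    if ! command -v pg_config >/dev/null 2>&1; then
        echo "pg_config not found: install the PostgreSQL server development package"
        return 1
    fi
    # pg_config --pgxs prints the location of the extension makefile (pgxs.mk).
    pgxs_makefile=$(pg_config --pgxs)
    if [ -f "$pgxs_makefile" ]; then
        echo "PGXS found at $pgxs_makefile"
    else
        echo "PGXS makefile missing: $pgxs_makefile"
        return 1
    fi
}
```

Run check_pgxs before Step 2; if it reports a missing makefile, the zhparser build is likely to fail.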

Step1: Install SCWS.
Note that ./configure must be run with --prefix=/usr/local/scws, not on its own.

wget http://www.xunsearch.com/scws/down/scws-1.2.2.tar.bz2
tar xvjf scws-1.2.2.tar.bz2
cd scws-1.2.2
./configure --prefix=/usr/local/scws
make
make install

To check that it installed successfully, run the following command:

ls -al /usr/local/scws/lib/libscws.la


Step2: Install zhparser

git clone https://github.com/amutu/zhparser.git
cd zhparser
SCWS_HOME=/usr/local/scws/include make && make install

20160131 Update: On Mac OS X Yosemite, the SCWS_HOME value above works as is. On Ubuntu 14.04 LTS, change SCWS_HOME to /usr/local/scws .

Step3: Configure a new extension using zhparser in Postgres
Step3.1: Log in to your Postgres database from the terminal/command line

psql yourdatabasename

Step3.2: Create the extension in Postgres. You can choose any name for the text search configuration (dictionarynameyouwant below).

CREATE EXTENSION zhparser;
CREATE TEXT SEARCH CONFIGURATION dictionarynameyouwant (PARSER = zhparser);
ALTER TEXT SEARCH CONFIGURATION dictionarynameyouwant ADD MAPPING FOR n,v,a,i,e,l WITH simple;


After following the steps above, you can use Postgres full-text search on Chinese/Mandarin text.
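To see whether the new configuration works, you can pass a couple of queries to psql. The following is only a sketch under some assumptions: zh_fts_smoke_test is a hypothetical helper name, the sample sentence is made up for illustration, and whether the match succeeds depends on how SCWS segments that sentence on your installation.

```shell
# Send two checks to psql. Arguments: the database name and the text search
# configuration name chosen in Step 3.2 (neither may contain a single quote).
# zh_fts_smoke_test is a hypothetical helper name used only for illustration.
zh_fts_smoke_test() {
    db=$1
    config=$2
    psql "$db" <<SQL
-- Show how the parser splits the sample sentence into lexemes.
SELECT to_tsvector('$config', '我喜欢使用数据库进行全文搜索');
-- Check whether the sample sentence matches a search for one of its words.
SELECT to_tsvector('$config', '我喜欢使用数据库进行全文搜索') @@ plainto_tsquery('$config', '数据库');
SQL
}
```

For example, zh_fts_smoke_test yourdatabasename dictionarynameyouwant runs both queries against the database from Step 3.1. If the first query returns an empty tsvector, check the ADD MAPPING statement from Step 3.2.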

Extra step (optional) for Rails with the pg_search gem: Step4. Set the configuration name in the :dictionary attribute of :tsearch in app/models/yourmodel.rb

class YourOwnClass < ActiveRecord::Base
  # ...
  include PgSearch
  pg_search_scope :functionnameyoulike,
    :against => [:columnsyoulike1, :columnsyoulike2], # list any columns you like
    :using => {
      :tsearch => {
        # the name you chose when creating the text search configuration in Postgres
        :dictionary => "dictionarynameyouwant"
        # add any other :tsearch options here
      }
    }
end

Reference:
1. SCWS install tutorial
2. Zhparser@github.com
3. Francs' Post - Postgres full-text search in Chinese with zhparser and SCWS
4. Rails365.net's Post - Postgres full-text search in Chinese with pg_search gem with zhparser
5. My Post at xuite.net - Make Postgres support full text search in Mandarin/Chinese