libUnihan - Usage

Installation

From RPM

  1. Download the latest libUnihan and UnihanDb RPMs from here.
  2. Assuming the downloaded packages are in /tmp:
    • If a previous version is installed, upgrade it: cd /tmp; sudo rpm -hUv *.rpm
    • Otherwise, install it: cd /tmp; sudo rpm -hiv *.rpm
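
The choice between the two rpm commands above can be automated. The following is a minimal sketch, assuming the installed package is named libUnihan (the package name is an assumption; check it with rpm -qa on your system). It only prints the command it would run.

```shell
#!/bin/sh
# Choose the rpm flags described above: -hUv to upgrade, -hiv for a fresh install.
# The package name passed in is an assumption; adjust it to the real RPM name.
choose_rpm_flags() {
    if rpm -q "$1" >/dev/null 2>&1; then
        echo "-hUv"   # a previous version is installed
    else
        echo "-hiv"   # nothing installed yet (or rpm itself is unavailable)
    fi
}

# Dry run: print the command instead of executing it.
echo "cd /tmp; sudo rpm $(choose_rpm_flags libUnihan) *.rpm"
```

Review the printed line, then run it by hand once it looks right.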

From Yum

Build from Source

Building requires the following packages: glib2-devel >= 2.4, cmake >= 2.4, sqlite-devel >= 3.0

doxygen is also needed if you want to generate the API documentation.

  1. Download libUnihan-<version number>-Source.tar.gz
  2. Extract the source to a directory of your choice, for example /tmp
  3. cd /tmp/libUnihan-<version number>
  4. cmake -DPREFIX=<directory prefix>
  5. cmake .
  6. make
  7. If you want the API documentation, run make doxygen
  8. Obtain Unihan.db in one of two ways:
    1. To generate it yourself:
      1. Download Unihan.zip from Unicode.org and extract it as Unihan.txt in the current directory.
      2. bin/unihan_converter Unihan.txt Unihan.db.5.1.0-7
    2. Otherwise, extract Unihan.db from the latest libUnihan-data tarball.
  9. make install
  10. make install-db

Tell us if it does not work.
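
The mandatory build steps above can be sketched as a dry-run script that prints each command rather than executing it. VERSION and PREFIX are placeholder values, not taken from the instructions, and passing -DPREFIX together with the source directory in a single cmake call is an assumption. The optional documentation and database steps are left out.

```shell
#!/bin/sh
# Dry-run sketch of the build steps. VERSION and PREFIX are placeholders
# (hypothetical values); set them to match your download and target directory.
VERSION="${VERSION:-x.y.z}"
PREFIX="${PREFIX:-/usr/local}"

build_steps() {
    echo "cd /tmp/libUnihan-$VERSION"
    # Assumption: the prefix option and the source directory go in one cmake call.
    echo "cmake -DPREFIX=$PREFIX ."
    echo "make"
    echo "make install"
    echo "make install-db"
}

build_steps
```

Printing first makes it easy to check the version and prefix before anything is installed.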

Tips - Supporting programs

unihan_convert

Summary: Convert Unihan.txt to libUnihan SQLite db format.

Synopsis: unihan_convert <Unihan.txt> <SQL Unihan>

You only need to run it if you want to rebuild Unihan.db yourself.

unihan_query

Summary: Query the libUnihan database.

Synopsis:

  1. unihan_query [-V] [-L] [-U] <given_field> <given_value> <query_on_field>
  2. unihan_query [-V] -S <SQL clause>

Options:

  • -V: increase the verbosity level; may be given multiple times (currently the maximum is 4)
  • -L: like mode; given_value is treated as a pattern for an SQL LIKE search
  • -U: output Unicode code points as Unicode scalar strings (U+xxxxx) instead of decimal
  • -S: use an SQL query
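
To illustrate what -U changes, here is a small sketch of the conversion from a decimal code point to U+ notation using the shell's printf. It does not call unihan_query. The helper name to_scalar and the four-digit minimum padding are assumptions for illustration; the exact padding unihan_query uses is not specified here.

```shell
#!/bin/sh
# Hypothetical helper: format a decimal code point as a U+ scalar string.
to_scalar() {
    printf 'U+%04X\n' "$1"
}

to_scalar 20013   # decimal code point of 中, printed as U+4E2D
```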

unihan_query has two modes: simple query and SQL query. Simple query requires no SQL knowledge, so use it if you would rather not deal with SQL. SQL query gives advanced users more flexibility.

unihan_field_validation (moved to the test suite, so it is not installed by default)

Summary: Verify the Unihan.db by comparing the database query against the original Unihan tag values.

Synopsis: unihan_field_validation <Unihan.txt> <SQL Unihan>

It might take days to complete, so there is no need to run it unless you need to verify that the output of the pseudo fields is correct.

Examples

Mandarin pinyin lookup:

unihan_query kMandarin <pinyin> utf8

Use as a Chinese -> English dictionary:

unihan_query utf8 <chineseCharacter> kDefinition

Use as a very crude English -> Chinese dictionary:

unihan_query -L kDefinition "%<english word>%" utf8
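
The three examples above can be wrapped in small shell functions. This is only a sketch: the function names (like_pattern, pinyin_lookup, zh2en, en2zh) are made up for illustration, and only the unihan_query invocations themselves come from the examples.

```shell
#!/bin/sh
# Hypothetical convenience wrappers around the three example queries.

like_pattern() {
    # Surround the word with % so LIKE matches it anywhere in the definition.
    printf '%%%s%%' "$1"
}

pinyin_lookup() { unihan_query kMandarin "$1" utf8; }
zh2en()         { unihan_query utf8 "$1" kDefinition; }
en2zh()         { unihan_query -L kDefinition "$(like_pattern "$1")" utf8; }

# Only run a query if unihan_query is actually installed.
if command -v unihan_query >/dev/null 2>&1; then
    en2zh tree
fi
```

Quoting the pattern matters, because an unquoted % pattern containing spaces would be split into several arguments.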


Last Updated: 10/20/2008 07:51:41