Using Input Methods on the Java™ Platform

by Naoto Sato

This is an updated version of an article that was originally published in September, 2002. You can find the original article on The Swing Connection.

Do you know how many characters are defined in Unicode Standard 4.0, the version supported in the Java 2 runtime environment, version 1.5.0? It is 96,382! [1] You may wonder how all these characters can be entered into an application written for the Java platform. Input methods exist for this purpose. An input method lets the user enter text in which a character is not necessarily produced by a single keystroke. The user composes text in advance from a series of keystrokes and inserts the final text into the document when the composition is complete.

In the Java 2 development environment, we provide the Input Method Framework to enable collaboration between text components and input methods. By using this framework, Swing text components can handle input method composition on-the-spot, or inline; in other words, the text being composed is immediately inserted, both visually and logically, into the text backing store. Swing text components accomplish this on-the-spot editing style using the client API in the Input Method Framework.

The engine SPI in the Input Method Framework allows you to plug your favorite input methods into any Java runtime environment. Like other Java technology-based applications, an input method can be deployed on any platform where the Java runtime environment is available. Furthermore, unlike applications, you can enjoy a common user interface across platforms, a feature that platform-native input methods seldom provide.
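As a rough illustration of the on-the-spot style, the sketch below shows how a client of the framework might separate committed text from text that is still being composed. It relies on `java.awt.event.InputMethodListener` and `InputMethodEvent`, which are part of the client API; the class name `CompositionTracker` and its helper methods are hypothetical, and this is not how the Swing text components themselves are implemented.

```java
import java.awt.event.InputMethodEvent;
import java.awt.event.InputMethodListener;
import java.text.AttributedCharacterIterator;
import java.text.CharacterIterator;

// Hypothetical listener that mirrors what an on-the-spot client has to do:
// append committed characters to its backing store and keep the rest as
// composed (still underlined) text.
class CompositionTracker implements InputMethodListener {
    private final StringBuilder backingStore = new StringBuilder();
    private String composedText = "";

    @Override
    public void inputMethodTextChanged(InputMethodEvent event) {
        // getText() may be null, for example when the composition is cancelled.
        apply(event.getText(), event.getCommittedCharacterCount());
        event.consume();
    }

    @Override
    public void caretPositionChanged(InputMethodEvent event) {
        // Caret movement inside the composed text is ignored in this sketch.
    }

    // The first committedCount characters are committed; the remainder is
    // the current composed text, which replaces any earlier composed text.
    void apply(AttributedCharacterIterator text, int committedCount) {
        StringBuilder composed = new StringBuilder();
        if (text != null) {
            int index = 0;
            for (char c = text.first(); c != CharacterIterator.DONE; c = text.next()) {
                if (index < committedCount) {
                    backingStore.append(c);
                } else {
                    composed.append(c);
                }
                index++;
            }
        }
        composedText = composed.toString();
    }

    String committedText() {
        return backingStore.toString();
    }

    String composedText() {
        return composedText;
    }
}
```

A custom component would typically register such a listener with `addInputMethodListener` and also return a non-null value from `getInputMethodRequests()`; the Swing text components already handle both for you.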

In this article, you will learn how to use input methods in your Swing text components. The information in this article is based on the Java 2 runtime environment, version 1.4.0 or higher, and the operating systems listed in the Supported Locales document (1.4.2 version | 1.5.0 version).

Installing Input Methods

Installing an input method is pretty easy. An input method is provided in JAR archive form, and you need only to place it in the extension directory. This is usually lib/ext, but you can also specify the extension directory at runtime by setting the java.ext.dirs system property. We provide a sample input method named City Input Method, available here, to use in this article. Copy the CityIM.jar file to the extension directory and the installation is done!
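To check what is installed, a small program can scan the extension directories for input method JARs. The sketch below assumes that an input method JAR announces itself through a provider-configuration entry named `META-INF/services/java.awt.im.spi.InputMethodDescriptor`, which is how the engine SPI locates descriptors. The class name `InputMethodJarScanner` is hypothetical, and the code uses language features newer than the releases discussed in this article. On a runtime that does not define `java.ext.dirs`, it simply reports nothing.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.regex.Pattern;

// Hypothetical helper that lists JAR files in the extension directories
// which declare an input method descriptor.
class InputMethodJarScanner {
    static final String SERVICE_ENTRY =
            "META-INF/services/java.awt.im.spi.InputMethodDescriptor";

    // True if the JAR contains the provider-configuration entry.
    static boolean declaresInputMethod(File jar) {
        try (JarFile jarFile = new JarFile(jar)) {
            return jarFile.getEntry(SERVICE_ENTRY) != null;
        } catch (IOException e) {
            return false; // missing, unreadable, or not a JAR file
        }
    }

    // extDirs uses the platform path separator, like java.ext.dirs does.
    static List<File> findInputMethodJars(String extDirs) {
        List<File> found = new ArrayList<>();
        if (extDirs == null || extDirs.isEmpty()) {
            return found;
        }
        for (String dir : extDirs.split(Pattern.quote(File.pathSeparator))) {
            File[] jars = new File(dir).listFiles((d, name) -> name.endsWith(".jar"));
            if (jars == null) {
                continue; // directory does not exist or cannot be read
            }
            for (File jar : jars) {
                if (declaresInputMethod(jar)) {
                    found.add(jar);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        for (File jar : findInputMethodJars(System.getProperty("java.ext.dirs"))) {
            System.out.println(jar);
        }
    }
}
```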

Selecting Input Methods

Once an input method is installed in the extension directory, you will notice an extra menu item in the System menu on Solaris™ or Microsoft Windows when you run an application that uses a Swing text component, as shown here:

System menu

After selecting Select Input Method, the following popup appears:

Popup menu

This menu contains a list of all input methods available in this runtime environment. Input methods that are provided by the underlying operating system are listed in the System Input Methods submenu. Input methods listed below the separator line are Java technology-based input methods. In this example, City Input Method demonstrates that it can support multiple languages. The specific languages supported are displayed as submenu items.

If you set the user locale to Japanese, you would see Japanese menu items if translations are provided by the input methods.

Japanese popup menu
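The menu entries described above come from the engine SPI: each input method supplies a `java.awt.im.spi.InputMethodDescriptor` that reports its supported locales and its display names. The following is only a sketch of such a descriptor, not the actual City Input Method source. The class name and the list of locales are assumptions, and the base name is left untranslated, whereas a real input method would probably look up a translation for the display language.

```java
import java.awt.Image;
import java.awt.im.spi.InputMethod;
import java.awt.im.spi.InputMethodDescriptor;
import java.util.Locale;

// Hypothetical descriptor showing where the menu text could come from.
class SketchCityDescriptor implements InputMethodDescriptor {
    // Assumed set of supported languages, for illustration only.
    private static final Locale[] SUPPORTED = {
        Locale.ENGLISH, Locale.GERMAN, Locale.JAPANESE
    };

    @Override
    public Locale[] getAvailableLocales() {
        return SUPPORTED.clone(); // one submenu item per locale
    }

    @Override
    public boolean hasDynamicLocaleList() {
        return false; // the list above never changes at runtime
    }

    @Override
    public String getInputMethodDisplayName(Locale inputLocale, Locale displayLanguage) {
        String name = "City Input Method";
        if (inputLocale == null) {
            return name; // name of the input method itself
        }
        Locale display = (displayLanguage != null) ? displayLanguage : Locale.getDefault();
        // Name of the input method for one specific input locale.
        return name + " - " + inputLocale.getDisplayName(display);
    }

    @Override
    public Image getInputMethodIcon(Locale inputLocale) {
        return null; // no icon in this sketch
    }

    @Override
    public InputMethod createInputMethod() {
        // A real descriptor returns its InputMethod implementation here.
        throw new UnsupportedOperationException("sketch only");
    }
}
```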

Using Input Methods

To use an input method, select it in the popup menu. Let's select City Input Method in the Japanese locale; this menu item is highlighted in the previous picture. A small popup window appears at the bottom-right corner of the screen. This tells you that City Input Method - Japanese is now selected.

Status window

Now type s, f, and o on the keyboard. You will see sfo with a dotted underline, which means that sfo is still being composed. This type of editing is often known as pre-composing. Since City Input Method is an input method for entering city names from airport codes, such as SFO, you can see candidate city names in several languages by pressing the space bar, as shown here:


Once you determine the candidate you prefer, commit that pre-composed string into the text backing store. This is typically done by pressing the Return key. In the example, it looks like this:


Input Method Selection by a Hot Key

For platforms that do not have the Select Input Method menu item in the system menu (e.g. Linux), or for applets that are running inside a browser, we provide an alternative way to select an input method by pressing a user-defined hot key. This way of selecting an input method is also useful for platforms that have the menu item in the system menu, e.g., Solaris CDE Desktop and Microsoft Windows. If you press the hot key, the same popup menu discussed previously is displayed.

To define a hot key combination, download the Input Method Hot Key tool by clicking the button below.

You can then run it as follows:

         java -jar InputMethodHotKey.jar [-system]

This pops up a window like this:

Hot key tool

After you set up your favorite hot key, press that combination on any Swing text component. You will see the same popup menu for selecting an input method. For multi-user platforms, the -system option is provided. If you set a hot key with the -system option, that hot key is active for all users.

Other Sample Input Methods

We've seen how the City Input Method works in the Swing text component. We also provide the following useful, but unsupported, input methods:

Code Point Input Method

The Code Point Input Method is a simple input method that allows Unicode characters to be entered via their hexadecimal code point values. A user enters the hexadecimal code point value using the \uxxxx notation for character literals.

In general, the input method passes characters through unchanged. However, when the user types a \, the input method enters composition mode. In composition mode, the user types the desired code point using the \uxxxx notation, where x is one of the set [0-9a-fA-F]. When a valid sequence is entered it is converted to the corresponding Unicode character and committed. The input method then returns to pass-through mode until another \ character is entered.

While in composition mode, the user can use the left arrow, right arrow, backspace and delete keys to edit the sequence. The \u characters can only be deleted if there are no hex digits present in the composition sequence. Deleting the \u returns the input method to pass-through mode.

Since the \ character triggers composition mode to begin, a user must type two \ characters in order for a single \ to be added to the text. When a single \ has been entered, if the next character is not a u, both the \ and the subsequent character are committed and the input method returns to pass-through mode.
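The behavior described above amounts to a small state machine. The sketch below is a simplified model of it, not the code of the actual Code Point Input Method: the class `CodePointComposer` is hypothetical, it accepts exactly four hexadecimal digits, it silently ignores non-hexadecimal characters typed after the prefix (an assumption), and it leaves out the arrow, backspace, and delete handling.

```java
// Hypothetical, simplified model of the pass-through and composition modes.
class CodePointComposer {
    private enum State { PASS_THROUGH, ESCAPE, HEX }

    private State state = State.PASS_THROUGH;
    private final StringBuilder hexDigits = new StringBuilder();
    private final StringBuilder committed = new StringBuilder();

    // Feeds one typed character into the composer.
    void type(char c) {
        switch (state) {
            case PASS_THROUGH:
                if (c == '\\') {
                    state = State.ESCAPE; // a backslash starts composition
                } else {
                    committed.append(c);  // everything else passes through
                }
                break;
            case ESCAPE:
                if (c == 'u') {
                    hexDigits.setLength(0);
                    state = State.HEX;    // now expect four hex digits
                } else if (c == '\\') {
                    committed.append('\\'); // two backslashes give one
                    state = State.PASS_THROUGH;
                } else {
                    committed.append('\\').append(c); // commit both characters
                    state = State.PASS_THROUGH;
                }
                break;
            case HEX:
                if (Character.digit(c, 16) >= 0) {
                    hexDigits.append(c);
                    if (hexDigits.length() == 4) {
                        // A complete sequence is converted and committed.
                        committed.append((char) Integer.parseInt(hexDigits.toString(), 16));
                        hexDigits.setLength(0);
                        state = State.PASS_THROUGH;
                    }
                }
                break; // assumption: other characters are ignored here
        }
    }

    // Convenience overload that types a whole string, one character at a time.
    void type(String keys) {
        for (int i = 0; i < keys.length(); i++) {
            type(keys.charAt(i));
        }
    }

    String committedText() {
        return committed.toString();
    }

    boolean isComposing() {
        return state != State.PASS_THROUGH;
    }
}
```

For example, typing a, then the backslash sequence for code point 0041, then b commits the three characters aAb, and the composer is back in pass-through mode afterwards.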

The Code Point Input Method can be downloaded by clicking the button below.

A newer version of the Code Point Input Method is included in the Java 2 SDK, version 1.5.0, as a demo program. It lets users enter supplementary characters, that is, characters whose code points lie outside the Basic Multilingual Plane, between U+10000 and U+10FFFF. For more details, refer to the README file in the Java 2 SDK.
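To see why supplementary characters need special care, recall that a Java char holds only 16 bits, so a code point above U+FFFF is stored as a pair of surrogate chars. The sketch below uses the code point methods added to `Character` and `String` in version 1.5.0; the class name `SupplementaryDemo` is hypothetical, and the code is independent of the demo input method.

```java
// Hypothetical demo of how one supplementary code point is represented.
class SupplementaryDemo {
    // Builds a string holding the single character for the given code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        // U+10400 is DESERET CAPITAL LETTER LONG I, outside the BMP.
        String s = fromCodePoint(0x10400);
        System.out.println(Character.isSupplementaryCodePoint(0x10400)); // true
        System.out.println(s.length());                       // 2 chars
        System.out.println(s.codePointCount(0, s.length()));  // 1 code point
        System.out.printf("%X %X%n", (int) s.charAt(0), (int) s.charAt(1)); // D801 DC00
    }
}
```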

Indic Input Method

This input method archive contains input methods for several writing scripts used in India. In addition to Devanagari, which has been a supported writing system since the Java 2 runtime environment, version 1.4.0, it contains input methods for the following writing scripts: Bengali, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, and Telugu. Each input method maps the US 101/104 keyboard layout to the INSCRIPT layout used for its writing script. Here are the diagrams for each keyboard layout, and the mapping table from the Latin alphabet to each writing script.

You can download the Indic input method by clicking the button below.

Thai Input Method

This input method implements input-sequence checking for Thai, as defined in the Thai API Consortium's "WTT" Input/Output Methods document. It also maps the US 101/104 keyboard layout to the Thai TIS 820-2538 layout. Here is the mapping table from the Latin alphabet to the Thai writing script.

The Thai input method is available by clicking the button below.

Further Information

For further information on using the Input Method Framework, see the documentation here. Also, you may find this Java internationalization forum useful. Finally, you can send feedback to the Java Internationalization team.

1: The number is the sum of graphic characters and format characters.