Applications need to be tailored to provide locale-specific information according to the conventions of the end user's language and region. Resource bundles can be used to keep text messages, formatting conventions, and images targeted to a specific locale. Resource Bundles automatically isolate the locale-specific objects so that locale-sensitive information does not need to be hardcoded into an application. Resource bundles:
• Allow storage and retrieval of all locale-specific information.
• Allow support for multiple locales in a single application.
• Allow the addition of locales easily by adding additional resource bundles.
The java.util.ResourceBundle class stores text that is locale-sensitive. This section reviews the ResourceBundle class and its subclasses.
Figure 4. How ResourceBundles communicate with applications.
1. The ResourceBundle Class
The ResourceBundle class is in the java.util package and it has two subclasses: PropertyResourceBundle and ListResourceBundle. Figure 5 illustrates the class hierarchy:
ResourceBundle acts like a container that holds key/value pairs. The key identifies a specific value in a bundle. A subclass of ResourceBundle implements two abstract methods: getKeys and handleGetObject. The handleGetObject(String key) method takes a string key as its argument and returns the matching value from the resource bundle. The getKeys method returns all keys from a specific resource bundle.
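As a minimal sketch of the two abstract methods, the following hypothetical subclass keeps its key/value pairs in a map. The class names InMemoryBundleDemo and InMemoryBundle and the sample keys are assumptions made for illustration; real applications would more commonly rely on PropertyResourceBundle or ListResourceBundle.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ResourceBundle;

public class InMemoryBundleDemo {

    /** Hypothetical bundle that keeps its key/value pairs in a map. */
    public static class InMemoryBundle extends ResourceBundle {
        private final Map<String, Object> values = new LinkedHashMap<>();

        public InMemoryBundle() {
            values.put("label", "Label 1");
            values.put("title", "Title 1");
        }

        // Returns the value for the key, or null if this bundle does not define it.
        @Override
        protected Object handleGetObject(String key) {
            return values.get(key);
        }

        // Returns every key defined in this bundle.
        @Override
        public Enumeration<String> getKeys() {
            return Collections.enumeration(values.keySet());
        }
    }

    public static void main(String[] args) {
        ResourceBundle bundle = new InMemoryBundle();
        System.out.println(bundle.getString("label"));
        System.out.println(bundle.keySet());
    }
}
```

The public getString and keySet methods are inherited from ResourceBundle and call the two overridden methods internally.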
1.1. Names
A ResourceBundle is a set of related subclasses that share the same base name. The base name is followed by characters that specify the language code, country code, and variant of a Locale. For example, if the base name is "newResourceBundle", and there are U.S. English and Canadian French locales, then the names would look like the following:
newResourceBundle_en_US
newResourceBundle_fr_CA
The default bundle would be named "newResourceBundle". All of the files will be located in the same package or directory. All of the property files with different language, country, and variant codes will contain the same keys but have values in different languages.
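The naming scheme can be inspected with the ResourceBundle.Control class, which exposes the candidate locales and the bundle names derived from them. The sketch below is only an illustration: the class name BundleNamesDemo and the helper candidateNames are assumptions, and it lists the names for the requested locale only (the real lookup may also try the default locale before the base bundle).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

public class BundleNamesDemo {

    // Lists the bundle names considered for the requested locale, most specific first.
    public static List<String> candidateNames(String baseName, Locale locale) {
        ResourceBundle.Control control =
                ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_DEFAULT);
        List<String> names = new ArrayList<>();
        for (Locale candidate : control.getCandidateLocales(baseName, locale)) {
            names.add(control.toBundleName(baseName, candidate));
        }
        return names;
    }

    public static void main(String[] args) {
        // Locale.CANADA_FRENCH is the predefined constant for fr_CA.
        System.out.println(candidateNames("newResourceBundle", Locale.CANADA_FRENCH));
        System.out.println(candidateNames("newResourceBundle", Locale.US));
    }
}
```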
1.2. Properties File
A properties file (.properties) stores a collection of text elements. The PropertyResourceBundle class provides the code necessary to retrieve the text elements. The Properties class in the java.util package handles the reading and writing of these files, which consist of a list of key/value pairs. The values in a properties file are specified in the following format:
key = value
Comment lines in a properties file begin with a pound sign (#) or an exclamation mark (!).
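To see the format in action without creating a file, the sketch below feeds inlined text to a PropertyResourceBundle through a StringReader. The class name PropertiesFormatDemo and the sample content are assumptions for illustration; in a real project the text would live in a .properties file on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class PropertiesFormatDemo {

    // Hypothetical file content, inlined so the sketch needs no file on disk.
    static final String CONTENT =
            "# A comment line starting with a pound sign\n"
          + "! A comment line starting with an exclamation mark\n"
          + "label = Label 1\n"
          + "title = Title 1\n";

    // Parses the inlined content exactly as a .properties file would be parsed.
    public static ResourceBundle load() {
        try {
            return new PropertyResourceBundle(new StringReader(CONTENT));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle bundle = load();
        System.out.println(bundle.getString("label")); // comment lines are ignored
        System.out.println(bundle.keySet().size() + " keys loaded");
    }
}
```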
1.3. Creating a ResourceBundle
The first step in creating a ResourceBundle is to create a Locale instance. The Locale instance and the name of the resource bundle to load are then passed to the ResourceBundle.getBundle() method. The getString() and getObject() methods can then be used to access the localized values. An example of obtaining a ResourceBundle instance is as follows:
Locale locale = new Locale("en", "US");
ResourceBundle labels = ResourceBundle.getBundle("i18n.newResourceBundle", locale);
System.out.println(labels.getString("label"));
An instance of ResourceBundle itself is never created; getBundle() returns an instance of a subclass such as ListResourceBundle or PropertyResourceBundle. The name of the Java property file is "newResourceBundle", and the Java package is "i18n". An example of the content of the property file is as follows:
label = Label 1
Once you have obtained a ResourceBundle instance, you can get localized values from it using the getString(String key) or getObject(String key) method.
A set of all keys in a ResourceBundle can be obtained using the keySet() method.
1.4. Dates and Times
The format for displaying date and time varies depending on the locale. For example, 07/03/1977 can be interpreted as July 3, 1977, in the U.S., but would be interpreted as March 7, 1977, in Britain. The order of the fields, the delimiters used, and even the calendar used can vary depending on the region. The DateFormat class in the java.text package provides formatting styles for a specific locale in easy-to-use formats.
1.4.1. Formatting Dates
The DateFormat class can be used to format dates in two steps: create a formatter with the getDateInstance method, and then invoke its format method.
The inputs for the getDateInstance method consist of (1) the date format to use and (2) the locale. The format method returns a string with the formatted date. The date format to use can be specified as "default", "short", "medium", "long", or "full". These five styles are specified for each locale as shown in Table 4:
Table 4. Styles for U.S. Locale.
Locale locale = new Locale("en", "US");
DateFormat dateFormat = DateFormat.getDateInstance(DateFormat.DEFAULT, locale);
String date = dateFormat.format(new Date());
System.out.println(date);
The output from this code on June 24, 2016, would be:
Jun 24, 2016
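To compare the styles side by side, the following sketch formats one fixed date in each style for a few locales. The class name DateStylesDemo and the helper format are assumptions for illustration, and the exact strings printed can differ slightly between Java versions because the locale data is updated over time.

```java
import java.text.DateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

public class DateStylesDemo {

    // Formats a date in the given style (DateFormat.SHORT, MEDIUM, LONG, or FULL).
    public static String format(int style, Locale locale, Date date) {
        return DateFormat.getDateInstance(style, locale).format(date);
    }

    public static void main(String[] args) {
        // June 24, 2016, at midnight in the default time zone.
        Date date = new GregorianCalendar(2016, Calendar.JUNE, 24).getTime();
        int[] styles = {DateFormat.SHORT, DateFormat.MEDIUM, DateFormat.LONG, DateFormat.FULL};
        for (Locale locale : new Locale[] {Locale.US, Locale.UK, Locale.FRANCE}) {
            for (int style : styles) {
                System.out.println(locale + ": " + format(style, locale, date));
            }
        }
    }
}
```

DateFormat.DEFAULT is the same style as DateFormat.MEDIUM, which is why it is not listed separately in the loop.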
1.4.2. Formatting Time
The DateFormat class can also be used to format times in a similar manner using the getTimeInstance method and the format method. The time format can also be specified as "default", "short", "medium", "long", or "full". These five styles are specified for each locale as shown in Table 5:
Table 5. U.S. Styles
An example using the getTimeInstance and the format method is as follows:
Locale locale = new Locale("en", "US");
DateFormat timeFormat = DateFormat.getTimeInstance(DateFormat.DEFAULT, locale);
String time = timeFormat.format(new Date());
System.out.println(time);
If the time is 10:30:39 in the morning, then the output from the code is:
10:30:39 AM
1.4.3. Formatting Date and Time
The getDateTimeInstance method can be used to format both date and time. The inputs for the getDateTimeInstance method consist of (1) the date format to use, (2) the time format to use, and (3) the locale.
Locale locale = new Locale("en", "US");
DateFormat dateFormat = DateFormat.getDateTimeInstance(DateFormat.DEFAULT, DateFormat.DEFAULT, locale);
String date = dateFormat.format(new Date());
System.out.println(date);
Here is an example output from this code:
Jun 24, 2016 9:08:21 AM
1.5. Formatting Currencies
The NumberFormat class can be used for formatting currencies specific to a locale in two steps: create a formatter with the getCurrencyInstance method, and then invoke its format method.
The input for the getCurrencyInstance method consists of the locale. The format method returns a string with the formatted currency. An example is shown below:
Locale locale = new Locale("en", "US");
double currency = 525600.10; // numeric literals cannot contain a thousands separator
Currency currentCurrency = Currency.getInstance(locale);
NumberFormat currencyFormatter = NumberFormat.getCurrencyInstance(locale);
System.out.println(currentCurrency.getDisplayName() + ": " + currencyFormatter.format(currency));
The output of this code is:
US Dollar: $525,600.10
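The same amount can be passed through currency formatters for several locales to see how the symbol and separators change. This is a sketch under assumptions: the class name CurrencyDemo and the helper format are invented for illustration, and no currency conversion takes place (only the presentation changes).

```java
import java.text.NumberFormat;
import java.util.Currency;
import java.util.Locale;

public class CurrencyDemo {

    // Formats an amount using the currency conventions of the given locale.
    public static String format(double amount, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    public static void main(String[] args) {
        double amount = 525600.10;
        for (Locale locale : new Locale[] {Locale.US, Locale.UK, Locale.JAPAN}) {
            Currency currency = Currency.getInstance(locale);
            // The ISO 4217 code (USD, GBP, JPY) identifies the currency of each locale.
            System.out.println(currency.getCurrencyCode() + ": " + format(amount, locale));
        }
    }
}
```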
1.6. Number Formatting
The NumberFormat class can be used for formatting numbers, currencies, and percentages according to a locale. An example of a difference in number format between countries is the use of a "dot" in the U.S. and England to indicate a decimal or fraction, and the use of a comma in Denmark. Formatting a number consists of two steps: create a formatter with the getNumberInstance method, and then invoke its format method.
The input for the getNumberInstance method consists of the locale.
double num = 525949.2;
Locale locale = new Locale("en", "US");
NumberFormat numberFormatter = NumberFormat.getNumberInstance(locale);
String numOut = numberFormatter.format(num);
System.out.println(numOut + " " + locale.toString());
The output of this program would format the number as follows:
525,949.2 en_US
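The difference between the U.S. and Danish conventions mentioned above can be checked directly. In this sketch the class name NumberLocaleDemo and the helper format are assumptions for illustration; the Danish locale is obtained from the language tag "da-DK".

```java
import java.text.NumberFormat;
import java.util.Locale;

public class NumberLocaleDemo {

    // Formats a number using the grouping and decimal separators of the given locale.
    public static String format(double number, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(number);
    }

    public static void main(String[] args) {
        double number = 525949.2;
        Locale danish = Locale.forLanguageTag("da-DK");
        System.out.println(format(number, Locale.US) + " " + Locale.US);
        System.out.println(format(number, danish) + " " + danish);
    }
}
```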
1.7. Percentages
The NumberFormat class can be used to format percentages in two steps: create a formatter with the getPercentInstance method, and then invoke its format method.
The input for the getPercentInstance method consists of the locale. An example is as follows:
double percent = 0.25;
Locale currentLocale = new Locale("en", "US");
NumberFormat percentFormatter = NumberFormat.getPercentInstance(currentLocale);
String percentOut = percentFormatter.format(percent);
System.out.println(percentOut + " " + currentLocale.toString());
The output of this program will be as follows:
25% en_US
1.8. Time Zones
If your application needs to run in different time zones, the code needs to manage dates and times in a manner that is consistent for all users. The standard method of handling this is to convert time to UTC (Coordinated Universal Time) before storing it. The time in each time zone is calculated as an offset from UTC. For example, U.S. Eastern Standard Time is UTC-5, which means that it is UTC minus 5 hours. Figure shows a diagram of the time zones:
The java.util.Calendar class can be used to convert between time zones. An example of using this class is as follows:
Calendar calendar = new GregorianCalendar();
calendar.setTimeZone(TimeZone.getTimeZone("America/New_York"));
System.out.println("NYC: " + calendar.get(Calendar.HOUR_OF_DAY));
System.out.println("NYC: " + calendar.getTimeInMillis());
The output from this example would be:
NYC: 8
NYC: 1363351520548
The Calendar.getTimeInMillis() method always returns the number of milliseconds since January 1, 1970, 00:00:00 UTC, regardless of the time zone set on the Calendar instance. The table below shows a list of time zone IDs that can be used with the TimeZone class:
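As an illustration of one stored instant being viewed in several zones, the sketch below reuses the millisecond value printed in the example output. The class name TimeZoneDemo, the helper hourIn, and the choice of zone IDs are assumptions; the full list of IDs known to the running JVM can be obtained from TimeZone.getAvailableIDs().

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class TimeZoneDemo {

    // Returns the hour of day that an instant (milliseconds since the epoch) has in a zone.
    public static int hourIn(String zoneId, long epochMillis) {
        Calendar calendar = new GregorianCalendar(TimeZone.getTimeZone(zoneId));
        calendar.setTimeInMillis(epochMillis);
        return calendar.get(Calendar.HOUR_OF_DAY);
    }

    public static void main(String[] args) {
        long instant = 1363351520548L; // the millisecond value from the example output
        String[] zoneIds = {"UTC", "America/New_York", "Europe/Paris", "Asia/Tokyo"};
        for (String zoneId : zoneIds) {
            System.out.println(zoneId + ": " + hourIn(zoneId, instant));
        }
        System.out.println(TimeZone.getAvailableIDs().length + " time zone IDs available");
    }
}
```

Note that the instant falls in mid-March, when New York observes daylight saving time, so the offset applied there is four hours rather than five.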
1.9. Messages
Messages help the user to understand the status of a program. Messages keep the user informed, and also display any errors that are occurring. Localized applications need to display messages in the appropriate language to be understood by a user in a specific locale. As described previously, strings are usually moved into a ResourceBundle to be translated appropriately. However, if a message has variable data embedded in it, some extra steps are needed to prepare it for translation.
A compound message contains variable data, such as numbers or text, that is locale-specific.
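One common way to handle compound messages in Java is the java.text.MessageFormat class, which substitutes arguments into numbered placeholders and formats them for a locale. The sketch below is an assumption-laden illustration: the class name CompoundMessageDemo, the helper render, and the message pattern are invented, and in practice the pattern would be stored in a ResourceBundle so that translators can reorder the placeholders.

```java
import java.text.MessageFormat;
import java.util.Locale;

public class CompoundMessageDemo {

    // Fills the placeholders of a pattern, formatting the arguments for the given locale.
    public static String render(String pattern, Locale locale, Object... arguments) {
        return new MessageFormat(pattern, locale).format(arguments);
    }

    public static void main(String[] args) {
        // {0} is replaced by text, {1} by an integer formatted with locale-specific grouping.
        String pattern = "{0} has {1,number,integer} new messages.";
        System.out.println(render(pattern, Locale.US, "Ana", 1200));
        System.out.println(render(pattern, Locale.GERMANY, "Ana", 1200));
    }
}
```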
1.10. Character Methods
The java.lang.Character class has many methods that are useful for comparing characters, which is very useful in internationalization. The methods can tell if a character is a number, letter, space, or if it is upper or lower case, and they are based upon Unicode characters. Some of the most useful character methods are:
The input parameter for each of these methods is a char. For example, if char newChar = 'A', then Character.isDigit(newChar) returns false and Character.isLetter(newChar) returns true.
The Character class also has a getType() method. The getType method returns the type of a specified character. For example, the getType method will return Character.UPPERCASE_LETTER for the character "A". The Character API documentation fully specifies the methods in the Character class.
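Because these methods are based on Unicode properties, they also work for characters outside the English alphabet. The following sketch classifies a few sample characters; the class name CharacterDemo, the helper describe, and the chosen characters are assumptions for illustration.

```java
public class CharacterDemo {

    // Describes a character using the Unicode-aware tests of java.lang.Character.
    public static String describe(char c) {
        if (Character.isDigit(c)) {
            return "digit";
        } else if (Character.isUpperCase(c)) {
            return "uppercase letter";
        } else if (Character.isLowerCase(c)) {
            return "lowercase letter";
        } else if (Character.isLetter(c)) {
            return "letter";
        } else if (Character.isSpaceChar(c)) {
            return "space";
        }
        return "other";
    }

    public static void main(String[] args) {
        // 'A', e with acute accent, Arabic-Indic digit three, a CJK ideograph, and a space.
        char[] samples = {'A', '\u00e9', '\u0663', '\u4e2d', ' '};
        for (char c : samples) {
            System.out.println("U+" + Integer.toHexString(c) + ": " + describe(c));
        }
        System.out.println(Character.getType('A') == Character.UPPERCASE_LETTER);
    }
}
```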
1.11. Sorting Strings
Different languages may have different rules for the sequence and sorting of strings and letters. If your application is for an English-speaking audience, string comparisons can be performed with the String.compareTo method. The String.compareTo method compares the Unicode values of the characters within two strings. In many languages, the Unicode values do not correspond to the relative order of the characters; therefore, the String.compareTo method cannot be used. The java.text.Collator class allows you to perform string comparisons in different languages. To use the Collator class for a specific locale, the following code can be used:
Locale locale = Locale.US;
Collator collator = Collator.getInstance(locale);
The compare() method can be used to compare strings. It returns a negative value, zero, or a positive value (in practice -1, 0, or 1). A "-1" means that the first string occurs earlier than the second string. A "0" means that the strings have the same order, and a "1" means that the first string occurs later in the order. An example is as follows:
Locale locale = Locale.US;
Collator collator = Collator.getInstance(locale);
int result = collator.compare("ab", "yz");
The return value would be "-1". The string "ab" will appear before the string "yz" when sorted according to the US rules.
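Since Collator implements the Comparator interface, it can be handed directly to a sort. The sketch below contrasts the two orderings on a small list; the class name CollatorSortDemo, the helper names, and the sample words are assumptions for illustration.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class CollatorSortDemo {

    // Sorts by String.compareTo, which compares raw Unicode values.
    public static List<String> sortByUnicodeValue(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy);
        return copy;
    }

    // Sorts with the collation rules of the given locale.
    public static List<String> sortWithCollator(List<String> words, Locale locale) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("banana", "Zebra", "apple");
        // Uppercase letters have lower Unicode values, so "Zebra" sorts first here.
        System.out.println(sortByUnicodeValue(words));
        // The collator orders the words the way a U.S. English reader would expect.
        System.out.println(sortWithCollator(words, Locale.US));
    }
}
```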
1.12. Customized Collation Rules
If the pre-defined collation rules in the java.text.Collator class do not meet your needs, then you can define custom rules and assign them to a RuleBasedCollator object. You define the rules in a String and pass that String to the RuleBasedCollator constructor.
Here is an example:
String usRules = "< x < y < z";
// The constructor throws java.text.ParseException if the rules are malformed.
RuleBasedCollator usCollator = new RuleBasedCollator(usRules);
int result = usCollator.compare("x", "z");
System.out.println(result);
The example defines that x comes before y, and y comes before z. The result printed is "-1" because x comes before z.
1.13. Detecting Text Boundaries
Many applications need to find where text boundaries begin and end. In different languages, character, word, and sentence boundaries may abide by different rules. One method of handling this is to search for punctuation such as periods, commas, spaces, or colons. The java.text.BreakIterator class makes it easier to search for boundaries in different languages.
1.13.1. Using the BreakIterator Class
There are four types of boundaries that can be analyzed with the BreakIterator class: character, word, sentence, and line boundaries. The corresponding methods are getCharacterInstance, getWordInstance, getSentenceInstance, and getLineInstance.
These methods take the locale as the input parameter and then create a BreakIterator instance. A new instance is required for each type of boundary. Here is a simple example:
Locale locale = Locale.UK;
BreakIterator breakIterator = BreakIterator.getCharacterInstance(locale);
The BreakIterator class holds an imaginary cursor to a current boundary in a string of text, and the cursor can be moved with the previous and next methods. The first boundary will be "0", and the last boundary will be the length of the string.
Figure 6. The boundaries found using the BreakIterator class.
1.14. Character Boundaries
Finding the location of character boundaries may be important if the end user can highlight one character at a time. Depending upon the language used, a single user character may be made up of more than one Unicode character. The getCharacterInstance method in the BreakIterator class finds character boundaries for user characters, not Unicode characters.
The following example finds character boundaries in US English using the text provided in the setText() method:
Locale locale = Locale.US;
BreakIterator breakIterator = BreakIterator.getCharacterInstance(locale);
breakIterator.setText("This tutorial is really great.");
int boundaryIndex = breakIterator.first();
while (boundaryIndex != BreakIterator.DONE) {
    System.out.println(boundaryIndex);
    boundaryIndex = breakIterator.next();
}
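The difference between user characters and Unicode characters shows up as soon as the text contains a combining mark. The following sketch counts user characters in a string where the last letter is written as "e" followed by a combining acute accent; the class name UserCharacterCountDemo, the helper countUserCharacters, and the sample text are assumptions for illustration.

```java
import java.text.BreakIterator;
import java.util.Locale;

public class UserCharacterCountDemo {

    // Counts user-perceived characters by counting the spans between character boundaries.
    public static int countUserCharacters(String text, Locale locale) {
        BreakIterator iterator = BreakIterator.getCharacterInstance(locale);
        iterator.setText(text);
        int count = 0;
        iterator.first();
        while (iterator.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // "cafe" followed by U+0301 (combining acute accent), displayed as four characters.
        String text = "cafe\u0301";
        System.out.println("char values: " + text.length());
        System.out.println("user characters: " + countUserCharacters(text, Locale.US));
    }
}
```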
1.15. Word, Sentence and Line Boundaries
A BreakIterator instance can be created for word, sentence, or line boundaries for a particular language. The methods for each of these boundaries are getWordInstance, getSentenceInstance, and getLineInstance.
An example of using getWordInstance() to find boundaries in US English text is as follows:
Locale locale = Locale.US;
BreakIterator breakIterator = BreakIterator.getWordInstance(locale);
breakIterator.setText("This tutorial is really great.");
int boundaryIndex = breakIterator.first();
while (boundaryIndex != BreakIterator.DONE) {
    System.out.println(boundaryIndex);
    boundaryIndex = breakIterator.next();
}
As shown previously, the first() and next() methods return the character index of each word boundary found. The boundaries would be marked as follows:
Figure 7. The boundaries found using the first() and next() methods.
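Word boundaries fall on both sides of spaces and punctuation, so an application that wants only the words has to filter the pieces between boundaries. The sketch below does this with a simple letter-or-digit check; the class name WordExtractionDemo and the helper words are assumptions for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WordExtractionDemo {

    // Returns the pieces between word boundaries that start with a letter or digit.
    public static List<String> words(String text, Locale locale) {
        BreakIterator iterator = BreakIterator.getWordInstance(locale);
        iterator.setText(text);
        List<String> result = new ArrayList<>();
        int start = iterator.first();
        int end = iterator.next();
        while (end != BreakIterator.DONE) {
            String piece = text.substring(start, end);
            // Skip the pieces that consist of spaces or punctuation.
            if (Character.isLetterOrDigit(piece.charAt(0))) {
                result.add(piece);
            }
            start = end;
            end = iterator.next();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("This tutorial is really great.", Locale.US));
    }
}
```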
1.16. Converting to and from Unicode
Unicode is a character encoding standard that allows people from all over the world to use computers in their own languages. It enables the internationalization of software, operating systems, and the World Wide Web. The Java language stores all characters in Unicode, using 16-bit char values. Since not all text received from users is in the default file encoding, your application may need to convert it into Unicode. Outgoing text may also need to be converted from Unicode to the format required by the outside file.
Java provides two main ways to convert text to Unicode: the String class and the Reader and Writer classes.
1.16.1. String Class
The String class can be used to convert a byte array to a String instance. First, create a byte array. The byte array and the byte encoding it uses are then passed as parameters to the String constructor. The constructor converts the bytes from the character set of the byte array to Unicode. An example of using the String class to convert a byte array to a String instance is as follows:
byte[] bytesArray = {(byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0x81}; // UTF-8 bytes of a single character
String string = new String(bytesArray, Charset.forName("UTF-8")); // convert the byte array
System.out.println(string); // test result
A string can also be converted to another format using the getBytes() method. This example uses the string "hello":
String str1 = "hello";
byte[] str2 = str1.getBytes(StandardCharsets.UTF_8);
System.out.println(Arrays.toString(str2)); // prints the byte values rather than the array reference
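The choice of encoding matters as soon as the text contains non-ASCII characters. The sketch below encodes the same string in two character sets and decodes it again; the class name EncodingRoundTripDemo, the helper names, and the sample text are assumptions for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingRoundTripDemo {

    // Returns how many bytes the text occupies in the given character set.
    public static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // Encodes the text and decodes it again with the same character set.
    public static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo"; // "hello" with an acute accent on the e
        System.out.println(Arrays.toString(text.getBytes(StandardCharsets.UTF_8)));
        System.out.println(Arrays.toString(text.getBytes(StandardCharsets.ISO_8859_1)));
        // Decoding with the wrong character set does not reproduce the original text.
        String wrong = new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println(wrong.equals(text));
    }
}
```

The accented letter takes two bytes in UTF-8 and one byte in ISO-8859-1, which is why the decoder has to be told which encoding the bytes use.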
1.16.2. Reader and Writer Classes
The Reader and Writer classes in the java.io package enable a Java application to convert between Unicode character streams and byte streams of non-Unicode text. The InputStreamReader class converts from a given character set (UTF-8, for example) to Unicode. The OutputStreamWriter class translates Unicode characters into a given character set. The following example demonstrates how to read a text file in the UTF-8 encoding into Unicode:
FileInputStream inputStream = new FileInputStream("test.txt");
InputStreamReader reader = new InputStreamReader(inputStream, "UTF-8");
Figure 8. Illustration of InputStreamReader and OutputStreamWriter.
This example creates a FileInputStream and puts it in an InputStreamReader. To write a stream of characters back into UTF-8 encoding from Unicode, the following can be used:
OutputStream outputStream = new FileOutputStream("output.txt");
OutputStreamWriter writer = new OutputStreamWriter(outputStream, "UTF-8");
These classes will rely on the default encoding if the encoding identifier is not specified. The getEncoding method can be used with the InputStreamReader or OutputStreamWriter to find out which encoding is in use, as follows:
InputStreamReader defaultReader = new InputStreamReader(inputStream);
String defaultEncoding = defaultReader.getEncoding();
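The two classes can be tried together without touching the file system by writing to and reading from in-memory byte arrays. This is a sketch under assumptions: the class name StreamRoundTripDemo and the helpers write and read are invented for illustration, and the byte array streams stand in for the file streams used above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class StreamRoundTripDemo {

    // Encodes Java characters into UTF-8 bytes through an OutputStreamWriter.
    public static byte[] write(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray(); // the writer has been closed and flushed at this point
    }

    // Decodes UTF-8 bytes back into Java characters through an InputStreamReader.
    public static String read(byte[] data) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            char[] buffer = new char[256];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        byte[] encoded = write("h\u00e9llo");
        System.out.println(encoded.length + " bytes");
        System.out.println(read(encoded).equals("h\u00e9llo"));
    }
}
```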
1. Before Internationalization
Most programs are written for one particular language and locale (in our case English), and have text hard-coded into the program code. As an example, suppose you have written the following program entitled "NotInternationalized":
public class NotInternationalized {
    static public void main(String[] args) {
        System.out.println("Hello.");
    }
}
You would like this program to display the same messages for users in Germany and France. Since you do not know German and French, you will need to hire a translator. The translator will not be used to looking at code, so the text that needs to be translated must be moved into a separate file. Also, you are interested in translating this program into other languages in the future. The best method of efficiently translating these messages is to internationalize the program.
2. After Internationalization
The program after internationalization looks like the following example. The messages are not hardcoded into the program.
import java.util.*;

public class InternationalizedSample {
    static public void main(String[] args) {
        String language;
        String country;
        // Fall back to U.S. English unless both codes are given on the command line.
        if (args.length != 2) {
            language = "en";
            country = "US";
        } else {
            language = args[0];
            country = args[1];
        }
        Locale currentLocale = new Locale(language, country);
        ResourceBundle messages = ResourceBundle.getBundle("MessagesBundle", currentLocale);
        System.out.println(messages.getString("greetings"));
    }
}
As you can see, the internationalized source code has the hardcoded messages removed. The language is specified at run time; therefore, the program can be distributed worldwide. The steps for creating the internationalized program are as follows:
1. Create the Properties Files
A properties file stores the text that needs to be translated. The properties file is in plain-text format, and can be created using any text editor. We will name the properties file MessagesBundle.properties, and it has the following text:
greetings = Hello
We will create a new properties file for every language that we would like the text translated to. For the French text, we will create the MessagesBundle_fr_FR.properties file. The name of this file contains the fr language code and the FR country code, and the file contains this line:
greetings = Bonjour.
The values on the left side of the equal sign are called "keys", and these remain the same in every properties file. These are the references that the globalized program uses to call the values in a particular language.
2. Define the Locale
A Locale object specifies a particular region and language. A Locale for the English language and the United States is as follows:
aLocale = new Locale("en","US");
To create Locale objects for the French language in Canada and France, we have the following:
caLocale = new Locale("fr","CA");
frLocale = new Locale("fr","FR");
Instead of hardcoding them, the internationalized program gets the language and country codes from the command line at run time:
String language = new String(args[0]);
String country = new String(args[1]);
currentLocale = new Locale(language, country);
Specifying a locale only identifies a region and language. For other functions, additional code must be written to format dates, numbers, currencies, time zones, etc. The objects that perform these functions are locale-sensitive because their behavior varies according to Locale. A ResourceBundle is an example of a locale-sensitive object.
3. Create a ResourceBundle
A ResourceBundle contains locale-specific objects like translatable text. The ResourceBundle uses properties files that contain the text to be displayed. The ResourceBundle is created as follows:
messages = ResourceBundle.getBundle("MessagesBundle", currentLocale);
There are two inputs to the getBundle method: the base name of the properties files and the locale. "MessagesBundle" refers to the family of properties files:
MessagesBundle_en_US.properties
MessagesBundle_fr_FR.properties
MessagesBundle_de_DE.properties
The locale specifies which of the MessagesBundle files is chosen, using the language and country codes that follow "MessagesBundle" in the names of the properties files. The next step is to obtain the translated messages from the ResourceBundle.
4. Fetch the Text from the ResourceBundle
To retrieve the message from the ResourceBundle, the getString method is used. The keys are hardcoded into the program, and each key fetches its value, which is the translated message. An example of the getString method is as follows:
String msg1 = messages.getString("greetings");
As you can see, the basic steps of internationalizing a program are simple. It requires some planning and some extra coding, but the process of internationalization can save a lot of time if the program needs to be used in multiple locales. The topics and examples in this tutorial provide a starting point for some of the other internationalization features of the Java programming language.