24 String Tokenizer
Features of JCStringTokenizer
Classes
Methods
Examples
24.1 Features of JCStringTokenizer
JCStringTokenizer provides simple linear tokenization of a String. The set of delimiters, which defaults to common whitespace characters, can be specified either during creation or on a per-token basis. It is similar to java.util.StringTokenizer, but delimiters can be included as literals by preceding them with an escape character (a backslash by default). It also differs in how it treats adjacent delimiters: if one delimiter immediately follows another, a null String is returned as the token instead of the empty token being skipped over.
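For comparison, here is a minimal sketch of the standard-library behavior that the paragraph above contrasts against. It uses only java.util.StringTokenizer; the class name SkippedTokensDemo and the helper standardTokens are made up for this illustration and are not part of JClass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Illustrative only: demonstrates java.util.StringTokenizer, not JCStringTokenizer.
class SkippedTokensDemo {

    // Collects every token that java.util.StringTokenizer reports
    // for the given text and delimiter set.
    static List<String> standardTokens(String text, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delimiters);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The two adjacent commas produce no token at all here.
        System.out.println(standardTokens("a,,b", ",")); // prints [a, b]
    }
}
```

Given the same input and a comma delimiter, the text above indicates that JCStringTokenizer would report a null token between "a" and "b" rather than skipping the empty position.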
JCStringTokenizer has these capabilities:
- Parses a String using a delimiter you specify.
- Parses a String using the specified delimiter and escape character.
- Counts the number of tokens in the String using the specified delimiter.
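The three capabilities listed above can be pictured with a small stand-in written in plain Java. This is a sketch under stated assumptions, not the JClass source: the class EscapeAwareSplitSketch and its methods parse and countTokens are hypothetical names, and details such as dropping the escape character from the result and reporting a trailing empty token as null are guesses about reasonable behavior rather than documented facts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the JClass implementation.
class EscapeAwareSplitSketch {

    // Splits text at each unescaped delimiter. An empty token is reported as null.
    // An escape character makes the following character literal and is itself dropped
    // (an assumption made for this sketch).
    static List<String> parse(String text, char delimiter, char escape) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == escape && i + 1 < text.length()) {
                current.append(text.charAt(++i)); // keep the escaped character literally
            } else if (c == delimiter) {
                tokens.add(current.length() == 0 ? null : current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // Whatever follows the last delimiter is the final token (null if empty).
        tokens.add(current.length() == 0 ? null : current.toString());
        return tokens;
    }

    // Counts tokens for the given delimiter, assuming a backslash escape character.
    static int countTokens(String text, char delimiter) {
        return parse(text, delimiter, '\\').size();
    }

    public static void main(String[] args) {
        String s = "a\\,b,,c"; // the characters a \ , b , , c
        System.out.println(parse(s, ',', '\\')); // prints [a,b, null, c]
        System.out.println(countTokens(s, ','));  // prints 3
    }
}
```

The escaped comma stays inside the first token, and the two adjacent commas yield a null entry, which mirrors the behavior described in section 24.1.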
24.2 Classes
This utility consists of a single class called JCStringTokenizer. Pass the String to be tokenized to the constructor:

    String s = "Hello my friend";
    JCStringTokenizer st = new JCStringTokenizer(s);

Process the tokens in the String tokenizer with methods hasMoreTokens() and nextToken().
24.3 Methods
These are the methods of JCStringTokenizer:
24.4 Examples
In this example, the String that is to be split into tokens contains two side-by-side commas. The delimiter for tokenization is a comma, so a null is returned as the token at that point. Upon encountering it, println() outputs the word "null" as part of the print stream. Note that leading spaces are not stripped from the tokenized word.

    String token, s = "this, is, a,, test";
    JCStringTokenizer st = new JCStringTokenizer(s);
    while (st.hasMoreTokens()) {
        token = st.nextToken(',');
        System.out.println(token);
    }

This prints the following to the console:

    this
     is
     a
    null
     test

You can remove the leading spaces by passing each token in turn to another String tokenizer whose delimiter is a space.
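The two-pass idea for removing leading spaces might look roughly like the sketch below. It is written against the standard library only so that it runs without JClass: String.split with a limit of -1 stands in for the comma pass (empty fields are mapped to null to imitate the behavior described above), and java.util.StringTokenizer with a space delimiter plays the role of the second tokenizer. The class and method names are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Illustrative stand-in using only the standard library, not JCStringTokenizer.
class StripLeadingSpacesSketch {

    // First pass: split on commas, keeping empty fields and reporting them as null.
    static List<String> splitOnCommas(String text) {
        List<String> tokens = new ArrayList<>();
        for (String field : text.split(",", -1)) {
            tokens.add(field.isEmpty() ? null : field);
        }
        return tokens;
    }

    // Second pass: a space-delimited tokenizer discards the leading spaces.
    // Only the first word of the token is returned.
    static String firstWord(String token) {
        if (token == null) {
            return null;
        }
        StringTokenizer st = new StringTokenizer(token, " ");
        return st.hasMoreTokens() ? st.nextToken() : null;
    }

    public static void main(String[] args) {
        for (String token : splitOnCommas("this, is, a,, test")) {
            System.out.println(firstWord(token)); // prints this, is, a, null, test on separate lines
        }
    }
}
```

Because firstWord keeps only the first space-delimited word, it suits single-word tokens like these. For tokens that contain internal spaces, String.trim() would be the simpler choice.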
In the next example, a slightly longer String is parsed based on the delimiter being the space character. As in the previous example, side-by-side spaces are interpreted as having a null token between them.
    import com.klg.jclass.util.JCStringTokenizer;

    public class StringTokenizerExample {
        public static void main(String[] args) {
            // The runs of spaces after "this" and "a" produce the null tokens in the output.
            String token, s = "this   is a  test of the string "
                + "tokenizer called JCStringTokenizer. "
                + "\nThe whitespace between the repeated words is a tab tab. ";
            System.out.println("First, the string: " + s);
            JCStringTokenizer st = new JCStringTokenizer(s);
            while (st.hasMoreTokens()) {
                // Use the space character as the delimiter for each token.
                token = st.nextToken(' ');
                System.out.println(token);
            }
        }
    }

This prints the following to the console:

First, the string: this   is a  test of the string tokenizer called JCStringTokenizer.
The whitespace between the repeated words is a tab tab.
this
null
null
is
a
null
test
of
the
string
tokenizer
called
JCStringTokenizer.
The
whitespace
between
the
repeated
words
is
a
tab
tab.