java.lang.Object
  org.apache.xml.serializer.EncodingInfo
public final class EncodingInfo
Holds information about a given encoding: the Java name for the encoding and the equivalent ISO name.
An object of this type has two useful methods:
isInEncoding(char ch), which can be called if the character is not the high one in a surrogate pair, and
isInEncoding(char high, char low), which can be called if the two characters form a high/low surrogate pair.
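As a rough illustration of when each of the two methods applies, the sketch below walks a string and picks the one-char or two-char check depending on whether a surrogate pair is found. Because EncodingInfo is internal, the sketch does not call it; `java.nio.charset.CharsetEncoder.canEncode` is used as an assumed stand-in, and the class and method names (`SurrogateDispatchSketch`, `countUnencodable`) are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

/** Hypothetical illustration only; not part of the serializer. */
public class SurrogateDispatchSketch {

    /** Counts the characters (code points) of text that the named charset cannot encode. */
    static int countUnencodable(String text, String charsetName) {
        CharsetEncoder enc = Charset.forName(charsetName).newEncoder();
        int missing = 0;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            boolean ok;
            if (Character.isHighSurrogate(ch)
                    && i + 1 < text.length()
                    && Character.isLowSurrogate(text.charAt(i + 1))) {
                // Two-char case: stands in for isInEncoding(high, low).
                char low = text.charAt(++i);
                ok = enc.canEncode(new String(new char[] {ch, low}));
            } else {
                // One-char case: stands in for isInEncoding(ch).
                ok = enc.canEncode(ch);
            }
            if (!ok) {
                missing++;
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // 'A', e-acute, and one supplementary character written as a surrogate pair.
        String text = "A\u00e9\uD83D\uDE00";
        System.out.println(countUnencodable(text, "US-ASCII"));
        System.out.println(countUnencodable(text, "UTF-8"));
    }
}
```

The point of the split is that a high surrogate on its own is not a character, so it is never passed to the one-char check when a low surrogate follows it.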
An EncodingInfo object is a node in a binary search tree. Such a node answers whether a character is in the encoding, and does so for a given range of Unicode values (m_first to m_last). It handles a certain range of values explicitly (m_explFirst to m_explLast).
If the code point is before that explicit range, that is, in the range m_first <= value < m_explFirst, the node delegates to another EncodingInfo object, m_before, which is the root of such a tree. Likewise, values in the range m_explLast < value <= m_last are delegated to m_after.
Determining whether a code point is in the encoding is expensive. The purpose of this tree is to cache such determinations, and to build only as much of the tree as is used during the transformation rather than building the entire tree at the start.
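The lazily built tree described above might look roughly like the following sketch. It is not the serializer's implementation: the class name `LazyRangeNodeSketch`, the explicit-range size of 128, and the use of `CharsetEncoder.canEncode` as the expensive check are all assumptions made for illustration. Only the field roles (a full range, an explicit sub-range, and before/after children) follow the description.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

/** Hypothetical stand-in for a lazily built range node; not the Xalan class. */
public class LazyRangeNodeSketch {
    // Assumed size of the explicitly handled range; the real value is not stated above.
    static final int EXPLICIT_SPAN = 128;

    private final CharsetEncoder encoder;
    private final int first, last;          // whole range this node answers for
    private final int explFirst, explLast;  // sub-range cached directly in this node
    private final boolean[] known;          // has the answer been computed yet?
    private final boolean[] inEncoding;     // cached answers for the explicit range
    private LazyRangeNodeSketch before;     // covers first .. explFirst - 1, built on demand
    private LazyRangeNodeSketch after;      // covers explLast + 1 .. last, built on demand

    /** codePoint must lie in first..last; the explicit range is the 128-block containing it. */
    LazyRangeNodeSketch(CharsetEncoder encoder, int first, int last, int codePoint) {
        this.encoder = encoder;
        this.first = first;
        this.last = last;
        int blockStart = (codePoint / EXPLICIT_SPAN) * EXPLICIT_SPAN;
        this.explFirst = Math.max(first, blockStart);
        this.explLast = Math.min(last, blockStart + EXPLICIT_SPAN - 1);
        this.known = new boolean[explLast - explFirst + 1];
        this.inEncoding = new boolean[known.length];
    }

    boolean isInEncoding(char ch) {
        if (ch < explFirst) {
            // Below the explicit range: delegate, creating the child only when needed.
            if (before == null) {
                before = new LazyRangeNodeSketch(encoder, first, explFirst - 1, ch);
            }
            return before.isInEncoding(ch);
        }
        if (ch > explLast) {
            // Above the explicit range: same idea on the other side.
            if (after == null) {
                after = new LazyRangeNodeSketch(encoder, explLast + 1, last, ch);
            }
            return after.isInEncoding(ch);
        }
        int idx = ch - explFirst;
        if (!known[idx]) {
            // The expensive determination happens at most once per char.
            inEncoding[idx] = encoder.canEncode(ch);
            known[idx] = true;
        }
        return inEncoding[idx];
    }

    /** Number of nodes built so far, to show that the tree grows only with use. */
    int nodeCount() {
        return 1 + (before == null ? 0 : before.nodeCount())
                 + (after == null ? 0 : after.nodeCount());
    }

    public static void main(String[] args) {
        CharsetEncoder enc = Charset.forName("ISO-8859-1").newEncoder();
        LazyRangeNodeSketch root = new LazyRangeNodeSketch(enc, 0, 0xFFFF, 'A');
        System.out.println(root.isInEncoding('A') + " nodes=" + root.nodeCount());
        System.out.println(root.isInEncoding('\u20AC') + " nodes=" + root.nodeCount());
        System.out.println(root.isInEncoding('\u00E9') + " nodes=" + root.nodeCount());
    }
}
```

A document that uses only a handful of character blocks creates only a handful of nodes, which is the trade-off the description is aiming for.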
This class is not a public API, and should only be used internally within the serializer.
Nested Class Summary | |
---|---|
private class |
EncodingInfo.EncodingImpl
This class implements the |
private static interface |
EncodingInfo.InEncoding
A simple interface to isolate the implementation. |
Field Summary | |
---|---|
(package private) java.lang.String |
javaName
The name used by the Java converter. |
private EncodingInfo.InEncoding |
m_encoding
A helper object that we can ask if a single char, or a surrogate UTF-16 pair of chars that form a single character, is in this encoding. |
private char |
m_highCharInContiguousGroup
Not all characters in an encoding are in one contiguous group; however, there is a lowest contiguous group starting at '\u0001' and working up to m_highCharInContiguousGroup. |
(package private) java.lang.String |
name
The ISO encoding name. |
Constructor Summary | |
---|---|
EncodingInfo(java.lang.String name,
java.lang.String javaName,
char highChar)
Create an EncodingInfo object based on the ISO name and Java name. |
Method Summary | |
---|---|
char |
getHighChar()
This method exists for performance reasons. |
private static boolean |
inEncoding(char ch,
byte[] data)
This method is the core of determining if a character is in the encoding. |
private static boolean |
inEncoding(char high,
char low,
java.lang.String encoding)
This is the heart of the code that determines if a given high/low surrogate pair forms a character that is in the given encoding. |
private static boolean |
inEncoding(char ch,
java.lang.String encoding)
This is the heart of the code that determines if a given character is in the given encoding. |
boolean |
isInEncoding(char ch)
This is not a public API. |
boolean |
isInEncoding(char high,
char low)
This is not a public API. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
private final char m_highCharInContiguousGroup
This is the char for which chars at or below this value are definitely in the encoding, although chars above this point might also be in the encoding. This exists for performance, especially for ASCII characters, because for ASCII all chars in the range '\u0001' to '\u007F' are in the encoding.
final java.lang.String name
final java.lang.String javaName
private EncodingInfo.InEncoding m_encoding
Constructor Detail |
---|
public EncodingInfo(java.lang.String name, java.lang.String javaName, char highChar)
name - reference to the ISO name.
javaName - reference to the Java encoding name.
highChar - the char for which characters at or below this value are definitely in the encoding, although characters above this point might also be in the encoding.
Method Detail |
---|
public boolean isInEncoding(char ch)
ch - the char in question.
This method is not a public API.
public boolean isInEncoding(char high, char low)
high - a char that is the high char of a high/low surrogate pair.
low - a char that is the low char of a high/low surrogate pair.
This method is not a public API.
private static boolean inEncoding(char ch, java.lang.String encoding)
This method is not a public API, and should only be used internally within the serializer.
ch - the char in question, which is not a high char of a high/low surrogate pair.
encoding - the Java name of the encoding.
private static boolean inEncoding(char high, char low, java.lang.String encoding)
This method is not a public API, and should only be used internally within the serializer.
high - the high char of a high/low surrogate pair.
low - the low char of a high/low surrogate pair.
encoding - the Java name of the encoding.
private static boolean inEncoding(char ch, byte[] data)
ch - the char that was converted using getBytes, or the first char of a high/low pair that was converted.
data - the bytes written out by the call to s.getBytes(encoding).
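The parameter descriptions suggest that the check works by converting the char with `String.getBytes(encoding)` and then inspecting the resulting bytes. A minimal sketch of that idea follows. The specific test applied to the bytes (empty output, or a `'?'` substitute for a char that is not itself `'?'`) is an assumption, since the page does not say what the method looks for, and the names `GetBytesProbeSketch`, `looksEncoded` and `probe` are hypothetical. The result of `getBytes(String)` for unencodable input is not specified by the JDK documentation, so this is a heuristic at best.

```java
import java.io.UnsupportedEncodingException;

/** Hypothetical heuristic; the actual checks in the serializer may differ. */
public class GetBytesProbeSketch {

    /** Guesses from the converted bytes whether ch survived the conversion. */
    static boolean looksEncoded(char ch, byte[] data) {
        if (data == null || data.length == 0) {
            return false;                 // nothing was written at all
        }
        if (data[0] == '?' && ch != '?') {
            return false;                 // assumed substitution marker
        }
        return true;
    }

    /** Converts a single char with getBytes and applies the heuristic above. */
    static boolean probe(char ch, String encoding) {
        try {
            byte[] data = String.valueOf(ch).getBytes(encoding);
            return looksEncoded(ch, data);
        } catch (UnsupportedEncodingException e) {
            return false;                 // unknown encoding name: treat as not encodable
        }
    }

    public static void main(String[] args) {
        System.out.println(probe('A', "US-ASCII"));
        System.out.println(probe('\u00e9', "US-ASCII"));
        System.out.println(probe('\u00e9', "ISO-8859-1"));
    }
}
```

Each probe allocates a string and a byte array, which fits the earlier remark that the determination is expensive and worth caching.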
public final char getHighChar()
Except for '\u0000', if a char is less than or equal to the value returned by this method, then it is in the encoding.
The characters in an encoding are not contiguous; however, there is a lowest group of chars, starting at '\u0001' up to and including the char returned by this method, that are all in the encoding. So the char returned by this method essentially defines the lowest contiguous group.
Chars above the value returned might be in the encoding, but chars at or below the value returned are definitely in the encoding.
In any case, the isInEncoding(char) method can be used regardless of the value of the char returned by this method.
If the value returned is '\u0000', every character must be tested with an isInEncoding method: isInEncoding(char), or isInEncoding(char, char) for surrogate pairs.
This method is not a public API.
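The performance role of getHighChar() can be sketched as a fast path placed in front of the slower membership check. The code below is an illustration under assumptions: it treats '\u0000' as the excepted value, uses `CharsetEncoder.canEncode` in place of the internal check, and the class name `HighCharFastPathSketch` and the `slowChecks` counter are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

/** Hypothetical caller-side fast path; not the serializer's own code. */
public class HighCharFastPathSketch {
    private final char highChar;          // plays the role of the getHighChar() result
    private final CharsetEncoder encoder; // assumed stand-in for the slow check
    int slowChecks = 0;                   // counts how often the slow path ran

    HighCharFastPathSketch(String charsetName, char highChar) {
        this.encoder = Charset.forName(charsetName).newEncoder();
        this.highChar = highChar;
    }

    boolean isInEncoding(char ch) {
        // Fast path: chars from 1 up to highChar need no lookup at all.
        // With highChar == 0 this condition is never true, so every char is tested.
        if (ch != '\u0000' && ch <= highChar) {
            return true;
        }
        slowChecks++;
        return encoder.canEncode(ch);
    }

    public static void main(String[] args) {
        HighCharFastPathSketch ascii = new HighCharFastPathSketch("US-ASCII", '\u007F');
        System.out.println(ascii.isInEncoding('A') + " slowChecks=" + ascii.slowChecks);
        System.out.println(ascii.isInEncoding('\u00e9') + " slowChecks=" + ascii.slowChecks);
    }
}
```

For mostly-ASCII output, nearly every char takes the comparison-only branch, which is presumably why the value is exposed to callers.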