java.lang.Object
  com.nhncorp.neptune.common.io.NText
public class NText
This class stores text using standard UTF-8 encoding. It provides methods to serialize, deserialize, and compare texts at the byte level. The length is an integer and is serialized using a zero-compressed format.
In addition, it provides methods for string traversal without converting the byte array to a string.
It also includes utilities for serializing/deserializing a string, encoding/decoding a string, checking whether a byte array contains valid UTF-8, and calculating the length of an encoded string.
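The page does not spell out the zero-compressed layout. The following is a minimal sketch of one such layout for non-negative lengths, modeled on the variable-length integers used by Hadoop's WritableUtils; whether NText uses exactly this layout is an assumption. The class and method names are hypothetical and are not part of NText.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of NText: zero-compressed encoding of a non-negative length.
final class ZeroCompressedLengthSketch {

    // Values 0..127 take a single byte. Larger values take a marker byte
    // (-112 minus the number of payload bytes) followed by the payload in
    // big-endian order with leading zero bytes dropped.
    static void writeLength(DataOutput out, int n) throws IOException {
        if (n < 0) {
            throw new IllegalArgumentException("length must be non-negative: " + n);
        }
        if (n <= 127) {
            out.writeByte(n);
            return;
        }
        int count = 0;
        for (int t = n; t != 0; t >>>= 8) {
            count++; // number of significant bytes in n
        }
        out.writeByte(-112 - count);
        for (int i = count - 1; i >= 0; i--) {
            out.writeByte((n >>> (8 * i)) & 0xFF);
        }
    }

    static int readLength(DataInput in) throws IOException {
        byte first = in.readByte();
        if (first >= 0) {
            return first; // single-byte form
        }
        int count = -112 - first;
        if (count < 1 || count > 4) {
            throw new IOException("unexpected marker byte: " + first);
        }
        int n = 0;
        for (int i = 0; i < count; i++) {
            n = (n << 8) | (in.readByte() & 0xFF);
        }
        if (n < 0) {
            throw new IOException("decoded length is negative");
        }
        return n;
    }

    // Convenience wrappers over in-memory streams.
    static byte[] encode(int n) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try {
            writeLength(new DataOutputStream(buf), n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    static int decode(byte[] bytes) {
        try {
            return readLength(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Under this layout a short text pays one byte for its length prefix, where a fixed-width int would always pay four.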
| Nested Class Summary | |
|---|---|
| static class | NText.Comparator: A WritableComparator optimized for Text keys. |
| Constructor Summary | |
|---|---|
| NText() | |
| NText(byte[] utf8) | Construct from a byte array. |
| NText(NText utf8) | Construct from another text. |
| NText(java.lang.String string) | Construct from a string. |
| Method Summary | |
|---|---|
| static int | bytesToCodePoint(java.nio.ByteBuffer bytes): Returns the next code point at the current position in the buffer. |
| int | charAt(int position): Returns the Unicode Scalar Value (32-bit integer value) for the character at position. |
| int | compareTo(java.lang.Object o): Compare two Texts bytewise using standard UTF-8 ordering. |
| static java.lang.String | decode(byte[] utf8): Converts the provided byte array to a String using the UTF-8 encoding. |
| static java.lang.String | decode(byte[] utf8, int start, int length) |
| static java.lang.String | decode(byte[] utf8, int start, int length, boolean replace): Converts the provided byte array to a String using the UTF-8 encoding. |
| static java.nio.ByteBuffer | encode(java.lang.String string): Converts the provided String to bytes using the UTF-8 encoding. |
| static java.nio.ByteBuffer | encode(java.lang.String string, boolean replace): Converts the provided String to bytes using the UTF-8 encoding. |
| boolean | equals(java.lang.Object o): Returns true iff o is a Text with the same contents. |
| int | find(java.lang.String what) |
| int | find(java.lang.String what, int start): Finds any occurrence of what in the backing buffer, starting at position start. |
| int | getAllocatedSize(): Returns the memory actually occupied within the JVM, including all overhead such as reference areas. |
| byte[] | getBytes(): Returns the raw bytes. |
| int | getLength(): Returns the number of bytes in the byte array. |
| int | hashCode(): Hash function. |
| void | readFields(java.io.DataInput in): Deserialize. |
| static java.lang.String | readString(java.io.DataInput in): Read a UTF-8 encoded string from in. |
| void | set(byte[] utf8): Set to a UTF-8 byte array. |
| void | set(byte[] utf8, int start, int len): Set the Text to a range of bytes. |
| void | set(NText other): Copy a text. |
| void | set(java.lang.String string): Set to contain the contents of a string. |
| static void | skip(java.io.DataInput in): Skips over one Text in the input. |
| java.lang.String | toString(): Convert text back to a string. |
| static int | utf8Length(java.lang.String string): For the given string, returns the number of UTF-8 bytes required to encode the string. |
| static void | validateUTF8(byte[] utf8): Check if a byte array contains valid UTF-8. |
| static void | validateUTF8(byte[] utf8, int start, int len): Check to see if a byte array is valid UTF-8. |
| void | write(java.io.DataOutput out): Serialize. Writes this object to out; the length uses zero-compressed encoding. |
| static int | writeString(java.io.DataOutput out, java.lang.String s): Write a UTF-8 encoded string to out. |
| Methods inherited from class java.lang.Object |
|---|
getClass, notify, notifyAll, wait, wait, wait |
| Constructor Detail |
|---|
public NText()
public NText(java.lang.String string)
public NText(NText utf8)
public NText(byte[] utf8)
| Method Detail |
|---|
public byte[] getBytes()
public int getLength()
public int charAt(int position)
Returns the Unicode Scalar Value (32-bit integer value) for the character at position. Note that this method avoids using the converter or doing String instantiation.
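Reading a character straight from the byte array, as charAt is described as doing, can be sketched as follows. This is an illustration only, not the NText implementation: the class name, the use of -1 for an invalid position, and the omission of overlong-form and surrogate checks are all assumptions.

```java
// Hypothetical helper, not part of NText: decode one code point at a byte offset.
final class CodePointAtSketch {

    // Returns the code point whose UTF-8 sequence starts at byte offset pos,
    // or -1 (an assumed convention) if pos does not start a complete sequence.
    static int codePointAt(byte[] utf8, int pos) {
        int lead = utf8[pos] & 0xFF;
        if (lead < 0x80) {
            return lead; // single-byte (ASCII) character
        }
        int extra; // number of continuation bytes that follow the lead byte
        int cp;    // payload bits collected so far
        if ((lead & 0xE0) == 0xC0) {
            extra = 1;
            cp = lead & 0x1F;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2;
            cp = lead & 0x0F;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3;
            cp = lead & 0x07;
        } else {
            return -1; // continuation byte or invalid lead byte
        }
        if (pos + extra >= utf8.length) {
            return -1; // sequence is truncated
        }
        for (int i = 1; i <= extra; i++) {
            int b = utf8[pos + i] & 0xFF;
            if ((b & 0xC0) != 0x80) {
                return -1; // not a continuation byte
            }
            cp = (cp << 6) | (b & 0x3F);
        }
        return cp;
    }
}
```

Because the position is a byte offset, an offset that falls in the middle of a multi-byte character does not yield a character in this sketch.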
public int getAllocatedSize()
public int find(java.lang.String what)
public int find(java.lang.String what,
int start)
Finds any occurrence of what in the backing buffer, starting at position start. The starting position is measured in bytes and the return value is in terms of byte position in the buffer. The backing buffer is not converted to a string for this operation.
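The byte-position semantics described for find can be illustrated with a naive search over the encoded bytes. This is a sketch under assumptions, not the NText implementation: the class name is hypothetical, and returning -1 when nothing is found is assumed because the page does not state it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of NText: search a UTF-8 buffer by byte position.
final class ByteFindSketch {

    // Returns the byte offset of the first occurrence of what at or after
    // byte offset start, or -1 (an assumed convention) if there is none.
    static int find(byte[] buffer, String what, int start) {
        byte[] needle = what.getBytes(StandardCharsets.UTF_8);
        for (int i = Math.max(start, 0); i + needle.length <= buffer.length; i++) {
            int k = 0;
            while (k < needle.length && buffer[i + k] == needle[k]) {
                k++;
            }
            if (k == needle.length) {
                return i; // every byte of the needle matched at offset i
            }
        }
        return -1;
    }
}
```

For the text "héllo wörld", searching for "w" returns 7 rather than the character index 6, because "é" occupies two bytes.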
public void set(java.lang.String string)
public void set(byte[] utf8)
public void set(NText other)
public void set(byte[] utf8,
int start,
int len)
Parameters:
utf8 - the data to copy from
start - the first position of the new string
len - the number of bytes of the new string
public java.lang.String toString()
Overrides:
toString in class java.lang.Object
See Also:
Object.toString()
public void readFields(java.io.DataInput in)
throws java.io.IOException
Specified by:
readFields in interface NWritable
readFields in interface org.apache.hadoop.io.Writable
Throws:
java.io.IOException
public static void skip(java.io.DataInput in)
throws java.io.IOException
Throws:
java.io.IOException
public void write(java.io.DataOutput out)
throws java.io.IOException
Specified by:
write in interface NWritable
write in interface org.apache.hadoop.io.Writable
Throws:
java.io.IOException
See Also:
NWritable.write(DataOutput)
public int compareTo(java.lang.Object o)
Specified by:
compareTo in interface java.lang.Comparable
public boolean equals(java.lang.Object o)
Returns true iff o is a Text with the same contents.
Overrides:
equals in class java.lang.Object
public int hashCode()
Overrides:
hashCode in class java.lang.Object
public static java.lang.String decode(byte[] utf8)
throws java.nio.charset.CharacterCodingException
Throws:
java.nio.charset.CharacterCodingException
public static java.lang.String decode(byte[] utf8,
int start,
int length)
throws java.nio.charset.CharacterCodingException
Throws:
java.nio.charset.CharacterCodingException
public static java.lang.String decode(byte[] utf8,
int start,
int length,
boolean replace)
throws java.nio.charset.CharacterCodingException
Converts the provided byte array to a String using the UTF-8 encoding. If replace is true, then malformed input is replaced with the substitution character, which is U+FFFD. Otherwise the method throws a MalformedInputException.
Throws:
java.nio.charset.CharacterCodingException
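The replace flag described above maps naturally onto the error actions of java.nio.charset.CharsetDecoder. The following sketch shows that mapping; it assumes, without confirmation from this page, that NText is built on the same JDK facility. The class and the decodeOrNull wrapper are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of NText: lenient versus strict UTF-8 decoding.
final class DecodeSketch {

    // With replace == true, malformed bytes become U+FFFD; with replace == false,
    // the decoder reports the error by throwing (MalformedInputException is a
    // subclass of CharacterCodingException).
    static String decode(byte[] utf8, int start, int length, boolean replace)
            throws CharacterCodingException {
        CodingErrorAction action =
                replace ? CodingErrorAction.REPLACE : CodingErrorAction.REPORT;
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        return decoder.decode(ByteBuffer.wrap(utf8, start, length)).toString();
    }

    // Convenience wrapper for callers that prefer null to a checked exception.
    static String decodeOrNull(byte[] utf8, boolean replace) {
        try {
            return decode(utf8, 0, utf8.length, replace);
        } catch (CharacterCodingException e) {
            return null;
        }
    }
}
```

Lenient decoding suits display or logging, while strict decoding suits input that must round-trip without loss.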
public static java.nio.ByteBuffer encode(java.lang.String string)
throws java.nio.charset.CharacterCodingException
Throws:
java.nio.charset.CharacterCodingException
public static java.nio.ByteBuffer encode(java.lang.String string,
boolean replace)
throws java.nio.charset.CharacterCodingException
Converts the provided String to bytes using the UTF-8 encoding. If replace is true, then malformed input is replaced with the substitution character, which is U+FFFD. Otherwise the method throws a MalformedInputException.
Throws:
java.nio.charset.CharacterCodingException
public static java.lang.String readString(java.io.DataInput in)
throws java.io.IOException
Throws:
java.io.IOException
public static int writeString(java.io.DataOutput out,
java.lang.String s)
throws java.io.IOException
Throws:
java.io.IOException
public static void validateUTF8(byte[] utf8)
throws java.nio.charset.MalformedInputException
Parameters:
utf8 - byte array
Throws:
java.nio.charset.MalformedInputException - if the byte array contains invalid UTF-8
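A validity check of this kind can be sketched as a single pass that classifies each lead byte and range-checks the bytes that follow, using the well-formed byte ranges from the Unicode Standard. This is not the NText implementation; the class, the isValid wrapper, and the lengths passed to MalformedInputException are assumptions.

```java
import java.nio.charset.MalformedInputException;

// Hypothetical helper, not part of NText: single-pass UTF-8 validation.
final class Utf8ValidationSketch {

    static void validate(byte[] utf8, int start, int len) throws MalformedInputException {
        int end = start + len;
        int i = start;
        while (i < end) {
            int lead = utf8[i] & 0xFF;
            if (lead < 0x80) {
                i++; // ASCII
                continue;
            }
            int extra;              // continuation bytes expected
            int lo = 0x80;          // allowed range of the second byte
            int hi = 0xBF;
            if (lead >= 0xC2 && lead <= 0xDF) {
                extra = 1;
            } else if (lead >= 0xE0 && lead <= 0xEF) {
                extra = 2;
                if (lead == 0xE0) lo = 0xA0; // rejects overlong 3-byte forms
                if (lead == 0xED) hi = 0x9F; // rejects encoded surrogates
            } else if (lead >= 0xF0 && lead <= 0xF4) {
                extra = 3;
                if (lead == 0xF0) lo = 0x90; // rejects overlong 4-byte forms
                if (lead == 0xF4) hi = 0x8F; // rejects values above U+10FFFF
            } else {
                throw new MalformedInputException(1); // stray continuation or invalid lead
            }
            if (i + extra >= end) {
                throw new MalformedInputException(end - i); // truncated sequence
            }
            int second = utf8[i + 1] & 0xFF;
            if (second < lo || second > hi) {
                throw new MalformedInputException(1);
            }
            for (int k = 2; k <= extra; k++) {
                int b = utf8[i + k] & 0xFF;
                if (b < 0x80 || b > 0xBF) {
                    throw new MalformedInputException(k);
                }
            }
            i += extra + 1;
        }
    }

    // Convenience wrapper that reports validity as a boolean.
    static boolean isValid(byte[] utf8) {
        try {
            validate(utf8, 0, utf8.length);
            return true;
        } catch (MalformedInputException e) {
            return false;
        }
    }
}
```

Validating in place avoids allocating a String just to learn whether the bytes are well formed.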
public static void validateUTF8(byte[] utf8,
int start,
int len)
throws java.nio.charset.MalformedInputException
Parameters:
utf8 - the array of bytes
start - the offset of the first byte in the array
len - the length of the byte sequence
Throws:
java.nio.charset.MalformedInputException - if the byte array contains invalid bytes
public static int bytesToCodePoint(java.nio.ByteBuffer bytes)
public static int utf8Length(java.lang.String string)
Parameters:
string - text to encode
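The encoded length of a string can be computed without encoding it, by summing the UTF-8 width of each code point. The sketch below illustrates that idea; it is not the NText implementation, the class name is hypothetical, and it counts an unpaired surrogate as three bytes, which may differ from what an actual encoder emits.

```java
// Hypothetical helper, not part of NText: count UTF-8 bytes without encoding.
final class Utf8LengthSketch {

    static int utf8Length(String s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp); // advance past a surrogate pair as one unit
            if (cp < 0x80) {
                bytes += 1;
            } else if (cp < 0x800) {
                bytes += 2;
            } else if (cp < 0x10000) {
                bytes += 3;
            } else {
                bytes += 4;
            }
        }
        return bytes;
    }
}
```

Knowing the length up front lets a caller size a buffer or write a length prefix before encoding.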