Input character sequence class for unified access to sources of input.
#include <input.h>
Classes
struct | Const
Common constants.
Public Member Functions
Input (const Input &input)
Copy constructor with intended "move semantics": internal state is shared, so do not rely on using the rhs after copying.
Input (void)
Construct empty input character sequence.
Input (const char *cstring)
Construct input character sequence from a NUL-terminated string.
Input (const std::string &string)
Construct input character sequence from a std::string.
Input (const std::string *string)
Construct input character sequence from a pointer to a std::string.
Input (const wchar_t *wstring)
Construct input character sequence from a NUL-terminated wide character string.
Input (const std::wstring &wstring)
Construct input character sequence from a std::wstring.
Input (const std::wstring *wstring)
Construct input character sequence from a pointer to a std::wstring.
Input (FILE *file)
Construct input character sequence from an open FILE* file descriptor; supports UTF-8 conversion from UTF-16 and UTF-32; uses stdin if file == NULL.
Input (std::istream &istream)
Construct input character sequence from a std::istream.
Input (std::istream *istream)
Construct input character sequence from a pointer to a std::istream; uses stdin if istream == NULL.
operator const char * ()
Cast this Input object to string.
operator const wchar_t * ()
Cast this Input object to wide character string.
operator FILE * ()
Cast this Input object to file descriptor FILE*.
operator std::istream * ()
Cast this Input object to std::istream*.
const char * | cstring (void)
Get the remaining string of this Input object.
const wchar_t * | wstring (void)
Get the remaining wide character string of this Input object.
FILE * | file (void)
Get the FILE* of this Input object.
std::istream * | istream (void)
Get the std::istream of this Input object.
size_t | size (void)
Get the size of the input character sequence in number of ASCII/UTF-8 bytes (zero if size is not determinable from a FILE* or std::istream source).
bool | good (void)
Check if input is available.
bool | eof (void)
Check if input reached EOF.
size_t | get (char *s, size_t n)
Copy subsequent character sequence data into buffer.
void | file_encoding (short enc)
Set encoding for FILE* input to Const::plain, Const::utf16be, Const::utf16le, Const::utf32be, or Const::utf32le. File encodings are automatically detected by the presence of a UTF BOM in the file. This function may be used when a BOM is not present and the file encoding is known, or to override the BOM.
short | file_encoding (void) const
Get encoding of the current FILE* input: Const::plain, Const::utf16be, Const::utf16le, Const::utf32be, or Const::utf32le.
Protected Member Functions
void | init (void)
Initialize the state after (re)setting the input source.
void | file_init (void)
Implements init() on a FILE*.
size_t | file_get (char *s, size_t n)
Implements get() on a FILE*.
void | file_size (void)
Implements size() on a FILE*.
bool | file_good (void)
Implements good() on a FILE*.
bool | file_eof (void)
Implements eof() on a FILE*.
Protected Attributes
const char * | cstring_
NUL-terminated char string input (when non-null)
const wchar_t * | wstring_
NUL-terminated wide string input (when non-null)
FILE * | file_
FILE* input (when non-null)
std::istream * | istream_
stream input (when non-null)
size_t | size_
size of the input in bytes, when known
char | utf8_ [8]
UTF-8 conversion buffer.
unsigned short | uidx_
index in utf8_[] or >= 8 when unused
unsigned short | utfx_
0 = ASCII/UTF-8, 1 = UTF-16 BE, 2 = UTF-16 LE, 3 = UTF-32 BE, 4 = UTF-32 LE
The Input class unifies access to a source of input of a character sequence as follows:
An Input object is constructed from a source of input: a char* string, a wchar_t* wide string, a std::string, a std::wstring, a FILE* descriptor, or a std::istream object.
With a FILE* source as input, the file is checked for the presence of a UTF-8 or a UTF-16 BOM (Byte Order Mark). A UTF-8 BOM is ignored and will not appear on the input character stream (and size is adjusted by 3 bytes). A UTF-16 BOM is interpreted, resulting in the automatic conversion of the file content to a UTF-8 character sequence when reading the file with get(). Also, size() gives the content size in the number of UTF-8 bytes.
size_t Input::get(char *buf, size_t len); reads source input and fills buf with up to len bytes, returning the number of bytes read or zero when a stream or file is bad or when EOF is reached.
size_t Input::size(void); returns the number of ASCII/UTF-8 bytes available to read from the source input, or zero (zero is also returned when the size is not determinable). Use this function only before reading input with get(). Wide character strings and UTF-16 FILE* content are counted as the total number of UTF-8 bytes that will be produced by get(). The size of a std::istream cannot be determined.
bool Input::good(void); returns true if the input is readable and a non-empty sequence of characters is available to get. Returns false on EOF or if an error condition is present.
bool Input::eof(void); returns true if the input reached EOF. Note that good() == !eof() holds for string source input only, since files and streams may have error conditions that prevent reading. That is, for files and streams eof() implies good() == false, but not vice versa. Thus, an error is diagnosed when the condition good() == false && eof() == false holds. Note that get(buf, len) == 0 && len > 0 implies good() == false.
The following example shows how to read a character sequence in blocks from a std::ifstream:
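A minimal sketch of block-wise reading from a stream, written against the standard library only so that it compiles on its own. The helper names stream_get and read_in_blocks are hypothetical and are not part of the Input class; stream_get merely mirrors the documented contract of get() (the number of bytes read, or zero when the stream is bad or at EOF). With the library itself, an Input object constructed from the std::ifstream would presumably take the place of stream_get.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for Input::get() on a std::istream source:
// fills buf with up to len bytes and returns the number of bytes read,
// or zero when the stream is bad or has reached EOF.
inline std::size_t stream_get(std::istream& in, char* buf, std::size_t len)
{
  assert(buf != nullptr);
  if (len == 0 || !in.good())
    return 0;
  in.read(buf, static_cast<std::streamsize>(len));
  return static_cast<std::size_t>(in.gcount());
}

// Read the whole stream in blocks of block_size bytes and return the content.
inline std::string read_in_blocks(std::istream& in, std::size_t block_size)
{
  std::string result;
  std::string block(block_size, '\0');
  std::size_t n;
  // a zero-length read means EOF or an error, so the loop stops there
  while ((n = stream_get(in, &block[0], block.size())) > 0)
    result.append(block, 0, n);
  return result;
}
```

Called with an open std::ifstream and a block size such as 1024, read_in_blocks returns the stream content. The loop ends on the first zero-length read, which corresponds to the documented rule that get(buf, len) == 0 with len > 0 implies good() == false.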
The following example shows how to buffer the entire content of a file:
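A minimal sketch of the size-then-read pattern for a plain (unconverted) file, using only the C standard library. The helpers file_remaining and buffer_file are hypothetical names introduced for illustration; file_remaining plays the role that size() has in the text, returning zero when the size cannot be determined, and it makes no attempt at the UTF-16 accounting that the Input class performs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for Input::size() on a plain FILE*: the number of
// bytes from the current position to the end of the file, or zero when the
// size is not determinable (for example a pipe or a tty).
inline std::size_t file_remaining(std::FILE* file)
{
  assert(file != nullptr);
  long pos = std::ftell(file);
  if (pos < 0 || std::fseek(file, 0, SEEK_END) != 0)
    return 0;
  long end = std::ftell(file);
  std::fseek(file, pos, SEEK_SET);  // restore the read position
  return end > pos ? static_cast<std::size_t>(end - pos) : 0;
}

// Buffer the remaining content of a file: ask for the size, then read it.
inline std::string buffer_file(std::FILE* file)
{
  std::string content(file_remaining(file), '\0');
  std::size_t n = 0;
  if (!content.empty())
    n = std::fread(&content[0], 1, content.size(), file);
  content.resize(n);  // keep only the bytes that were actually read
  return content;
}
```

As with size(), the size query is made before any reading takes place, and a result of zero is ambiguous between an empty file and an undeterminable size.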
Files with UTF-16 content are converted to UTF-8 by get(buf, len), where size() gives the total number of UTF-8 bytes that will be produced by get(buf, len).
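To make the BOM handling concrete, the following sketch shows one way the conversion described above could work on bytes already held in memory. It is an illustration under stated assumptions, not the library's implementation: append_utf8 and bytes_to_utf8 are hypothetical names, UTF-32 BOMs are not handled, and unpaired surrogates are not rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Append the UTF-8 encoding of code point c (assumed <= 0x10FFFF) to out.
inline void append_utf8(std::string& out, unsigned long c)
{
  if (c < 0x80) {
    out += static_cast<char>(c);
  } else if (c < 0x800) {
    out += static_cast<char>(0xC0 | (c >> 6));
    out += static_cast<char>(0x80 | (c & 0x3F));
  } else if (c < 0x10000) {
    out += static_cast<char>(0xE0 | (c >> 12));
    out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (c & 0x3F));
  } else {
    out += static_cast<char>(0xF0 | (c >> 18));
    out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (c & 0x3F));
  }
}

// Convert raw file bytes to UTF-8, using a leading BOM to pick the decoding:
// a UTF-8 BOM (EF BB BF) is dropped, a UTF-16 BOM (FE FF or FF FE) selects
// big or little endian decoding, and anything else is passed through as is.
inline std::string bytes_to_utf8(const std::string& raw)
{
  const unsigned char* p = reinterpret_cast<const unsigned char*>(raw.data());
  std::size_t n = raw.size();
  if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
    return raw.substr(3);
  bool be = n >= 2 && p[0] == 0xFE && p[1] == 0xFF;
  bool le = n >= 2 && p[0] == 0xFF && p[1] == 0xFE;
  if (!be && !le)
    return raw;
  std::string out;
  for (std::size_t i = 2; i + 1 < n; i += 2) {
    unsigned long c = be ? ((p[i] << 8) | p[i + 1]) : ((p[i + 1] << 8) | p[i]);
    // combine a high surrogate with the low surrogate that follows it
    if (c >= 0xD800 && c < 0xDC00 && i + 3 < n) {
      unsigned long d = be ? ((p[i + 2] << 8) | p[i + 3]) : ((p[i + 3] << 8) | p[i + 2]);
      if (d >= 0xDC00 && d < 0xE000) {
        c = 0x10000 + ((c - 0xD800) << 10) + (d - 0xDC00);
        i += 2;
      }
    }
    append_utf8(out, c);
  }
  return out;
}
```

The length of the returned string is the number of UTF-8 bytes produced, which is the quantity that size() is documented to report for UTF-16 content; for a file with a UTF-8 BOM the result is three bytes shorter than the raw input.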
The following example shows how to read a character sequence in blocks from a file:
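A minimal sketch of the same block-wise loop on a FILE*, using fread() directly rather than the Input class so that it compiles on its own. The FileRead struct and read_file_in_blocks function are hypothetical names; the error flag corresponds to the case the documentation describes as good() == false && eof() == false.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Outcome of draining a file: the bytes read, and whether reading stopped
// because of an error rather than because EOF was reached.
struct FileRead {
  std::string data;
  bool error;
};

// Read a FILE* in blocks of block_size bytes until fread() returns zero.
inline FileRead read_file_in_blocks(std::FILE* file, std::size_t block_size)
{
  assert(file != nullptr && block_size > 0);
  FileRead result{std::string(), false};
  std::string block(block_size, '\0');
  std::size_t n;
  while ((n = std::fread(&block[0], 1, block.size(), file)) > 0)
    result.data.append(block, 0, n);
  // a zero-length read without EOF indicates an error condition
  result.error = std::ferror(file) != 0;
  return result;
}
```

A caller would check the error flag after the loop instead of assuming that a zero-length read always means EOF.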
The following example shows how to echo characters one by one from stdin (reading input from a tty):
The following example shows how to read a character sequence in blocks from a wide character string while converting it to UTF-8:
The following example shows how to convert a wide character string to UTF-8:
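A minimal sketch of what converting a wide character string to UTF-8 involves, written without the Input class so that it compiles on its own. The function wide_to_utf8 is a hypothetical helper; it assumes valid input and handles both 16-bit wchar_t (UTF-16 with surrogate pairs) and 32-bit wchar_t (one code point per element).

```cpp
#include <cassert>
#include <string>

// Convert a NUL-terminated wide string to UTF-8. Code points above 0x10FFFF
// and unpaired surrogates are not checked for in this sketch.
inline std::string wide_to_utf8(const wchar_t* ws)
{
  assert(ws != nullptr);
  std::string out;
  for (; *ws != L'\0'; ++ws) {
    unsigned long c = static_cast<unsigned long>(*ws);
    // a high surrogate followed by a low surrogate encodes one code point
    if (c >= 0xD800 && c < 0xDC00) {
      unsigned long d = static_cast<unsigned long>(ws[1]);
      if (d >= 0xDC00 && d < 0xE000) {
        c = 0x10000 + ((c - 0xD800) << 10) + (d - 0xDC00);
        ++ws;
      }
    }
    if (c < 0x80) {
      out += static_cast<char>(c);
    } else if (c < 0x800) {
      out += static_cast<char>(0xC0 | (c >> 6));
      out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
      out += static_cast<char>(0xE0 | (c >> 12));
      out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (c & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (c >> 18));
      out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (c & 0x3F));
    }
  }
  return out;
}
```

The length of the result is the number of UTF-8 bytes that the conversion produces, which is how the text above says wide character strings are counted by size().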
The following example shows how to switch source inputs while reading input byte by byte (use a buffer as shown in other examples to improve efficiency):
Input::Input (const Input &input) | inline
Copy constructor with intended "move semantics": internal state is shared, so do not rely on using the rhs after copying.
input | an Input object to share state with (undefined behavior results from using both objects at the same time)
Input::Input (void) | inline
Construct empty input character sequence.
Input::Input (const char *cstring) | inline
Construct input character sequence from a NUL-terminated string.
cstring | NUL-terminated char* string
Input::Input (const std::string &string) | inline
Construct input character sequence from a std::string.
string | input string
Input::Input (const std::string *string) | inline
Construct input character sequence from a pointer to a std::string.
string | input string
Input::Input (const wchar_t *wstring) | inline
Construct input character sequence from a NUL-terminated wide character string.
wstring | NUL-terminated wchar_t* input string
Input::Input (const std::wstring &wstring) | inline
Construct input character sequence from a std::wstring.
wstring | input wide string
Input::Input (const std::wstring *wstring) | inline
Construct input character sequence from a pointer to a std::wstring.
wstring | input wide string
Input::Input (FILE *file) | inline
Construct input character sequence from an open FILE* file descriptor; supports UTF-8 conversion from UTF-16 and UTF-32; uses stdin if file == NULL.
file | input file
Input::Input (std::istream &istream) | inline
Construct input character sequence from a std::istream.
istream | input stream
Input::Input (std::istream *istream) | inline
Construct input character sequence from a pointer to a std::istream; uses stdin if istream == NULL.
istream | input stream
const char * Input::cstring (void) | inline
Get the remaining string of this Input object.
bool Input::eof (void) | inline
Check if input reached EOF.
FILE * Input::file (void) | inline
Get the FILE* of this Input object.
void Input::file_encoding (short enc) | inline
Set encoding for FILE* input to Const::plain, Const::utf16be, Const::utf16le, Const::utf32be, or Const::utf32le. File encodings are automatically detected by the presence of a UTF BOM in the file. This function may be used when a BOM is not present and the file encoding is known, or to override the BOM.
enc | Const::plain, Const::utf16be, Const::utf16le, Const::utf32be, or Const::utf32le
short Input::file_encoding (void) const | inline
Get encoding of the current FILE* input: Const::plain, Const::utf16be, Const::utf16le, Const::utf32be, or Const::utf32le.
bool Input::file_eof (void) | inline, protected
Implements eof() on a FILE*.
size_t Input::file_get (char *s, size_t n) | protected
Implements get() on a FILE*.
s | points to the string buffer to fill with input
n | size of buffer pointed to by s
bool Input::file_good (void) | inline, protected
Implements good() on a FILE*.
void Input::file_init (void) | protected
Implements init() on a FILE*.
void Input::file_size (void) | protected
Implements size() on a FILE*.
size_t Input::get (char *s, size_t n) | inline
Copy subsequent character sequence data into buffer.
s | points to the string buffer to fill with input
n | size of buffer pointed to by s
bool Input::good (void) | inline
Check if input is available.
void Input::init (void) | inline, protected
Initialize the state after (re)setting the input source.
std::istream * Input::istream (void) | inline
Get the std::istream of this Input object.
Input::operator const char * () | inline
Cast this Input object to string.
Input::operator const wchar_t * () | inline
Cast this Input object to wide character string.
Input::operator FILE * () | inline
Cast this Input object to file descriptor FILE*.
Input::operator std::istream * () | inline
Cast this Input object to std::istream*.
size_t Input::size (void) | inline
Get the size of the input character sequence in number of ASCII/UTF-8 bytes (zero if size is not determinable from a FILE* or std::istream source).
const wchar_t * Input::wstring (void) | inline
Get the remaining wide character string of this Input object.
const char * Input::cstring_ | protected
NUL-terminated char string input (when non-null)
FILE * Input::file_ | protected
FILE* input (when non-null)
std::istream * Input::istream_ | protected
stream input (when non-null)
size_t Input::size_ | protected
size of the input in bytes, when known
unsigned short Input::uidx_ | protected
index in utf8_[] or >= 8 when unused
char Input::utf8_ [8] | protected
UTF-8 conversion buffer.
unsigned short Input::utfx_ | protected
0 = ASCII/UTF-8, 1 = UTF-16 BE, 2 = UTF-16 LE, 3 = UTF-32 BE, 4 = UTF-32 LE
const wchar_t * Input::wstring_ | protected
NUL-terminated wide string input (when non-null)