UTF8Encoding Class
Namespace: System.Text
Assembly: System.Text.Encoding.dll
Overview
The UTF8Encoding class represents a UTF-8 encoding of Unicode characters. It provides methods to convert between Unicode characters and byte sequences, with an optional byte order mark (the UTF-8 identifier, bytes EF BB BF) and optional error detection for invalid byte sequences. UTF-8 is byte-oriented, so it has no big-endian or little-endian variants.
Syntax
public class UTF8Encoding : Encoding
{
    public UTF8Encoding();
    public UTF8Encoding(bool encoderShouldEmitUTF8Identifier);
    public UTF8Encoding(bool encoderShouldEmitUTF8Identifier, bool throwOnInvalidBytes);
}
Constructors
Constructor | Parameters | Remarks |
---|---|---|
UTF8Encoding() | None | Creates a UTF-8 encoding without a BOM and without throwing on invalid bytes. |
UTF8Encoding(bool encoderShouldEmitUTF8Identifier) | encoderShouldEmitUTF8Identifier – true to emit a UTF-8 identifier (BOM). | Enables optional BOM emission. |
UTF8Encoding(bool encoderShouldEmitUTF8Identifier, bool throwOnInvalidBytes) | encoderShouldEmitUTF8Identifier, throwOnInvalidBytes | Allows both BOM emission and exception throwing on invalid byte sequences. |
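The effect of the first constructor argument can be seen through GetPreamble. The following is a minimal sketch, assuming a C# 9 or later project with top-level statements; the variable names are illustrative only.

```csharp
using System;
using System.Text;

// Parameterless constructor: no BOM is provided.
var plain = new UTF8Encoding();
// First argument true: GetPreamble() returns the 3-byte UTF-8 identifier.
var withBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: true);

byte[] plainPreamble = plain.GetPreamble();
byte[] bomPreamble = withBom.GetPreamble();

Console.WriteLine(plainPreamble.Length);               // 0
Console.WriteLine(BitConverter.ToString(bomPreamble)); // EF-BB-BF

// GetBytes does not prepend the preamble, so both instances
// produce the same payload for the same input.
int plainLength = plain.GetBytes("A").Length;
int bomLength = withBom.GetBytes("A").Length;
Console.WriteLine(plainLength == bomLength);           // True
```

The preamble is typically written by stream-oriented types such as StreamWriter, or by the caller, rather than by GetBytes.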
Key Properties
CodePage – Returns 65001.
EncodingName – Returns "Unicode (UTF-8)".
IsSingleByte – Returns false.
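The properties above can be read from any instance. This is a short sketch under the same assumptions (C# 9 or later, top-level statements); the exact EncodingName text may vary by runtime, so no output is shown for it.

```csharp
using System;
using System.Text;

var utf8 = new UTF8Encoding();

Console.WriteLine(utf8.CodePage);     // 65001
Console.WriteLine(utf8.IsSingleByte); // False
Console.WriteLine(utf8.EncodingName); // human-readable display name

// IsSingleByte is false because one character can need several bytes:
// U+00E9 (e with acute accent) encodes to two bytes in UTF-8.
int byteCount = utf8.GetByteCount("\u00E9");
Console.WriteLine(byteCount);         // 2
```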
Examples
Encoding and decoding a string with UTF-8:
using System;
using System.Text;

// Encode a string to UTF-8 bytes.
// Passing true configures the BOM returned by GetPreamble(); GetBytes does not prepend it.
string original = "Hello, 🌍!";
UTF8Encoding utf8 = new UTF8Encoding(true);
byte[] utf8Bytes = utf8.GetBytes(original);
// Decode bytes back to a string
string decoded = utf8.GetString(utf8Bytes);
Console.WriteLine(decoded); // Output: Hello, 🌍!
Remarks
UTF-8 is the most commonly used encoding on the web. When handling external data, consider setting throwOnInvalidBytes to true so that malformed sequences raise an exception early instead of being silently replaced with U+FFFD during decoding.
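The difference between the two settings of throwOnInvalidBytes can be sketched as follows. The input bytes are made up for illustration; 0xFF is used because it never appears in well-formed UTF-8.

```csharp
using System;
using System.Text;

// "Hi" followed by a byte that is invalid in UTF-8.
byte[] malformed = { 0x48, 0x69, 0xFF };

var lenient = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: false);
var strict = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

// Lenient decoding substitutes U+FFFD (the replacement character) for the bad byte.
string replaced = lenient.GetString(malformed);
bool hasReplacementChar = replaced.IndexOf('\uFFFD') >= 0;
Console.WriteLine(hasReplacementChar); // True

// Strict decoding rejects the input instead.
bool rejected = false;
try
{
    strict.GetString(malformed);
}
catch (DecoderFallbackException)
{
    rejected = true;
}
Console.WriteLine(rejected); // True
```

DecoderFallbackException derives from ArgumentException, so existing handlers for ArgumentException also catch it.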