.NET API Reference

UTF8Encoding Class

Namespace: System.Text

Assembly: System.Text.Encoding.Extensions.dll

Overview

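The UTF8Encoding class represents a UTF-8 encoding of Unicode characters. It provides methods to convert between Unicode characters and byte sequences, with an optional byte order mark (BOM) and optional error detection for invalid sequences. UTF-8 is a byte-oriented encoding, so it has no big-endian or little-endian variants; the BOM, when emitted, is always the three bytes EF BB BF.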

Syntax


public class UTF8Encoding : Encoding
{
    public UTF8Encoding();
    public UTF8Encoding(bool encoderShouldEmitUTF8Identifier);
    public UTF8Encoding(bool encoderShouldEmitUTF8Identifier, bool throwOnInvalidBytes);
}
    

Constructors

Constructor | Parameters | Remarks
UTF8Encoding() | None | Creates a UTF-8 encoding without a BOM and without throwing on invalid bytes.
UTF8Encoding(bool encoderShouldEmitUTF8Identifier) | encoderShouldEmitUTF8Identifier: true to provide a UTF-8 identifier (BOM); otherwise, false. | Controls whether GetPreamble returns the BOM. Invalid bytes do not throw.
UTF8Encoding(bool encoderShouldEmitUTF8Identifier, bool throwOnInvalidBytes) | encoderShouldEmitUTF8Identifier: true to provide a UTF-8 identifier (BOM); otherwise, false. throwOnInvalidBytes: true to throw an exception when an invalid encoding is detected; otherwise, false. | Controls both BOM emission and exception throwing on invalid byte sequences.
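
The effect of encoderShouldEmitUTF8Identifier is easiest to see through GetPreamble. The following is a minimal sketch, assuming a C# 9 or later project with top-level statements; the variable names noBom and withBom are illustrative only.

```csharp
using System;
using System.Text;

// The parameterless constructor behaves like new UTF8Encoding(false, false).
var noBom = new UTF8Encoding();
var withBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: true);

// GetPreamble returns the BOM bytes only when the encoding was asked to provide one.
Console.WriteLine(noBom.GetPreamble().Length);                    // 0
Console.WriteLine(BitConverter.ToString(withBom.GetPreamble()));  // EF-BB-BF

// GetBytes never prepends the BOM, whichever constructor was used.
Console.WriteLine(withBom.GetBytes("A").Length);                  // 1
```

Callers that need the BOM in their output have to write the result of GetPreamble themselves, or use a stream writer that does so on their behalf.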

Key Properties

Examples

Encoding and decoding a string with UTF-8:


using System;
using System.Text;

// Encode a string to UTF-8 bytes.
// Passing true only affects GetPreamble(); GetBytes does not prepend the BOM.
string original = "Hello, 🌍!";
UTF8Encoding utf8 = new UTF8Encoding(true);
byte[] utf8Bytes = utf8.GetBytes(original);

// Decode bytes back to a string
string decoded = utf8.GetString(utf8Bytes);
Console.WriteLine(decoded); // Output: Hello, 🌍!

Remarks

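UTF-8 is the most widely used encoding on the web. When handling external data, consider setting throwOnInvalidBytes to true so that malformed sequences are detected early: decoding then throws a DecoderFallbackException and encoding throws an EncoderFallbackException, both of which derive from ArgumentException. With the default of false, invalid sequences are silently replaced with U+FFFD (the replacement character).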
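
The difference between the two error-handling modes can be sketched as follows. This is an illustrative example, assuming top-level statements; the byte array and the names lenient and strict are made up for the demonstration.

```csharp
using System;
using System.Text;

// 0xFF can never appear in well-formed UTF-8, so this input is malformed.
byte[] malformed = { 0x48, 0x69, 0xFF };

var lenient = new UTF8Encoding(false, throwOnInvalidBytes: false);
var strict = new UTF8Encoding(false, throwOnInvalidBytes: true);

// Lenient decoding substitutes U+FFFD for the invalid byte.
string replaced = lenient.GetString(malformed);
Console.WriteLine(replaced == "Hi\uFFFD"); // True

// Strict decoding rejects the input instead of silently repairing it.
bool rejected = false;
try
{
    strict.GetString(malformed);
}
catch (DecoderFallbackException ex)
{
    rejected = true;
    Console.WriteLine("Rejected malformed input: " + ex.Message);
}
```

Strict mode suits validation at trust boundaries such as file uploads or network input, while lenient mode suits display paths where a best-effort rendering is preferable to a failure.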

See Also