System.Text UTF8Encoding Class

Represents a UTF-8 encoding of Unicode characters.

Note: In most cases you do not need to construct this class directly. The static Encoding.UTF8 property returns a shared UTF8Encoding instance that provides a BOM and replaces invalid sequences instead of throwing. Create your own instance only when you need different behavior.

Declaration

public class UTF8Encoding : Encoding

Inheritance

Object
System.Text.Encoding
System.Text.UTF8Encoding

Remarks

The UTF8Encoding class provides methods for converting strings to and from UTF-8 byte arrays. UTF-8 is a variable-width character encoding capable of encoding all possible Unicode code points. It is the dominant character encoding for the World Wide Web, accounting for more than 90% of all web pages.
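
As a rough illustration of "variable-width", the sketch below counts the UTF-8 bytes for one sample character of each sequence length. The class name Utf8WidthSketch and the helper Utf8Length are hypothetical names chosen for this example; only Encoding.UTF8.GetByteCount comes from the library.

```csharp
using System;
using System.Text;

public static class Utf8WidthSketch
{
    // Hypothetical helper: number of UTF-8 bytes needed to encode the text.
    public static int Utf8Length(string text) => Encoding.UTF8.GetByteCount(text);

    public static void Main()
    {
        // One sample per sequence length, written as escapes to avoid
        // depending on the source file's own encoding.
        string[] samples =
        {
            "A",          // U+0041, 1 byte
            "\u00E9",     // U+00E9 (e with acute), 2 bytes
            "\u20AC",     // U+20AC (euro sign), 3 bytes
            "\U0001F44B"  // U+1F44B (waving hand), 4 bytes
        };

        foreach (string sample in samples)
        {
            Console.WriteLine($"{Utf8Length(sample)} byte(s)");
        }
    }
}
```

Note that the last sample occupies two char values (a surrogate pair) in a .NET string but encodes to a single four-byte UTF-8 sequence.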

While the static Encoding.UTF8 property is the recommended way to access a UTF-8 encoder in modern .NET, the UTF8Encoding class itself can still be useful for specific scenarios where you need finer control over the encoding process, such as specifying whether to include the UTF-8 byte order mark (BOM) or how to handle invalid characters.
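
The difference in invalid-input handling can be sketched as follows. This is a minimal example, not a recommended pattern: InvalidBytesSketch, DecodeLenient, and TryDecodeStrict are hypothetical names, and the three-byte input is made up for illustration (0xFF never appears in well-formed UTF-8).

```csharp
using System;
using System.Text;

public static class InvalidBytesSketch
{
    // Illustrative input: 'H', an invalid byte, 'i'.
    public static readonly byte[] Malformed = { 0x48, 0xFF, 0x69 };

    // Lenient decoding: the invalid byte is replaced with U+FFFD.
    public static string DecodeLenient(byte[] bytes)
    {
        var lenient = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
        return lenient.GetString(bytes);
    }

    // Strict decoding: returns false instead of producing replacement characters.
    public static bool TryDecodeStrict(byte[] bytes, out string text)
    {
        var strict = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            text = strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            // Raised because throwOnInvalidBytes is true.
            text = string.Empty;
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(DecodeLenient(Malformed) == "H\uFFFDi");
        Console.WriteLine(TryDecodeStrict(Malformed, out _));
    }
}
```

Strict decoding is usually the safer choice when silently altered data would be worse than a failure, for example when validating input at a trust boundary.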

Constructors

Name Description
UTF8Encoding() Initializes a new instance of the UTF8Encoding class that does not provide a byte order mark (BOM) and does not throw on invalid bytes.
UTF8Encoding(bool encoderShouldEmitUTF8Identifier) Initializes a new instance of the UTF8Encoding class, specifying whether to provide the UTF-8 byte order mark (BOM).
UTF8Encoding(bool encoderShouldEmitUTF8Identifier, bool throwOnInvalidBytes) Initializes a new instance of the UTF8Encoding class, specifying whether to emit the BOM and whether to throw an exception on invalid bytes.
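
The effect of encoderShouldEmitUTF8Identifier can be sketched as below. The BOM is exposed through the GetPreamble method; GetBytes does not prepend it, so callers or stream writers add the preamble themselves. PreambleSketch and PreambleLength are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text;

public static class PreambleSketch
{
    // Hypothetical helper: length of the BOM this instance would provide.
    public static int PreambleLength(UTF8Encoding encoding) =>
        encoding.GetPreamble().Length;

    public static void Main()
    {
        var withoutBom = new UTF8Encoding();   // same as new UTF8Encoding(false)
        var withBom = new UTF8Encoding(true);

        Console.WriteLine(PreambleLength(withoutBom)); // 0
        Console.WriteLine(PreambleLength(withBom));    // 3 (EF BB BF)

        // GetBytes output is identical either way; the BOM is not included.
        Console.WriteLine(withoutBom.GetBytes("A").Length); // 1
        Console.WriteLine(withBom.GetBytes("A").Length);    // 1
    }
}
```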

Methods

Name Description
GetByteCount(char[] chars) Calculates the number of bytes produced by encoding all characters in the specified array of Unicode characters into UTF-8.
GetByteCount(string s) Calculates the number of bytes produced by encoding the specified string into UTF-8.
GetBytes(char[] chars) Encodes all the characters in the specified array of Unicode characters into a sequence of bytes.
GetBytes(string s) Encodes a string into a sequence of bytes.
GetChars(byte[] bytes) Decodes all the bytes in the specified byte array into characters.
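GetCharCount(byte[] bytes) Calculates the number of characters produced by decoding all the UTF-8 bytes in the specified byte array.
GetPreamble() Returns the UTF-8 byte order mark (EF BB BF) if this instance is configured to provide one; otherwise, returns an empty byte array.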
GetString(byte[] bytes) Decodes a sequence of bytes into a string.
Equals(object obj) Determines whether the specified object is equal to the current object.
GetHashCode() Serves as the default hash function.
ToString() Returns a string that represents the current object.
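
The counting methods are typically used to size buffers before encoding or decoding. The sketch below, with the hypothetical names CountSketch and Measure, shows that byte counts and char counts differ as soon as the text leaves the ASCII range.

```csharp
using System;
using System.Text;

public static class CountSketch
{
    // Hypothetical helper: UTF-8 byte count of the text, and the char count
    // obtained by measuring the encoded bytes again.
    public static (int Bytes, int Chars) Measure(string text)
    {
        var utf8 = new UTF8Encoding();
        int byteCount = utf8.GetByteCount(text);   // size needed for GetBytes
        byte[] encoded = utf8.GetBytes(text);
        int charCount = utf8.GetCharCount(encoded); // size needed for GetChars
        return (byteCount, charCount);
    }

    public static void Main()
    {
        // 14 ASCII characters followed by U+1F44B (waving hand).
        var (bytes, chars) = Measure("Hello, World! \U0001F44B");
        Console.WriteLine($"bytes: {bytes}, chars: {chars}"); // bytes: 18, chars: 16
    }
}
```

The emoji contributes four bytes to the byte count but two char values (a surrogate pair) to the char count, which is why neither number equals the count of visible characters.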

Example

This example demonstrates how to use the UTF8Encoding class to encode a string into a UTF-8 byte array and then decode it back into a string.

using System;
using System.Text;

public class Example
{
    public static void Main()
    {
        // Original string
        string originalString = "Hello, World! 👋";

        // Create a UTF8Encoding instance
        UTF8Encoding utf8 = new UTF8Encoding();

        // Encode the string into a UTF-8 byte array
        byte[] utf8Bytes = utf8.GetBytes(originalString);

        Console.WriteLine($"Original String: {originalString}");
        Console.WriteLine($"UTF-8 Byte Array (length: {utf8Bytes.Length}):");
        foreach (byte b in utf8Bytes)
        {
            Console.Write($"{b:X2} "); // Print bytes in hexadecimal format
        }
        Console.WriteLine();

        // Decode the UTF-8 byte array back into a string
        string decodedString = utf8.GetString(utf8Bytes);
        Console.WriteLine($"Decoded String: {decodedString}");

        // The BOM (EF BB BF) is exposed through GetPreamble(); GetBytes() never prepends it
        UTF8Encoding utf8WithBom = new UTF8Encoding(true); // Provide the BOM
        byte[] bom = utf8WithBom.GetPreamble();
        Console.WriteLine($"\nUTF-8 BOM from GetPreamble() (length: {bom.Length}):");
        foreach (byte b in bom)
        {
            Console.Write($"{b:X2} ");
        }
        Console.WriteLine();

        // Using the static property (recommended)
        byte[] utf8BytesStatic = Encoding.UTF8.GetBytes(originalString);
        Console.WriteLine($"\nUTF-8 Byte Array using Encoding.UTF8 (length: {utf8BytesStatic.Length}):");
        foreach (byte b in utf8BytesStatic)
        {
            Console.Write($"{b:X2} ");
        }
        Console.WriteLine();
    }
}