System.Text UTF8Encoding Class
Represents a UTF-8 encoding of Unicode characters.
Note: A shared UTF-8 encoding instance is also available through the static Encoding.UTF8 property in .NET Core and .NET 5+. Consider using the static property for simpler and more efficient UTF-8 encoding.
Declaration
public class UTF8Encoding : Encoding
Inheritance
Object
System.Text.Encoding
System.Text.UTF8Encoding
Remarks
The UTF8Encoding class provides methods for converting strings to and from UTF-8 byte arrays. UTF-8 is a variable-width character encoding capable of encoding all possible Unicode code points. It is the dominant character encoding for the World Wide Web, accounting for more than 90% of all web pages.
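
To make the variable-width behavior concrete, the following minimal sketch counts the UTF-8 bytes needed for characters from different Unicode ranges. The class name `Utf8WidthDemo` and the helper `Utf8Length` are hypothetical names introduced only for this illustration; the only library call is `UTF8Encoding.GetByteCount(string)`.

```csharp
using System;
using System.Text;

public static class Utf8WidthDemo
{
    // Hypothetical helper (not part of the library): returns the number of
    // UTF-8 bytes required to encode the given text.
    public static int Utf8Length(string text)
    {
        return new UTF8Encoding().GetByteCount(text);
    }

    public static void Main()
    {
        Console.WriteLine(Utf8Length("A"));          // U+0041 needs 1 byte
        Console.WriteLine(Utf8Length("\u00E9"));     // U+00E9 needs 2 bytes
        Console.WriteLine(Utf8Length("\u20AC"));     // U+20AC needs 3 bytes
        Console.WriteLine(Utf8Length("\U0001F44B")); // U+1F44B needs 4 bytes
    }
}
```

The last character is stored as a surrogate pair (two `char` values) in a .NET string, yet it still encodes to a single four-byte UTF-8 sequence.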
While the static Encoding.UTF8 property is the recommended way to access a UTF-8 encoder in modern .NET, the UTF8Encoding class itself is still useful when you need finer control over the encoding process, such as specifying whether to provide the UTF-8 byte order mark (BOM) or whether invalid byte sequences should raise an exception.
Constructors
| Name | Description |
|---|---|
| UTF8Encoding() | Initializes a new instance of the UTF8Encoding class. |
| UTF8Encoding(bool encoderShouldEmitUTF8Identifier) | Initializes a new instance of the UTF8Encoding class, specifying whether to emit the UTF-8 byte order mark (BOM). |
| UTF8Encoding(bool encoderShouldEmitUTF8Identifier, bool throwOnInvalidBytes) | Initializes a new instance of the UTF8Encoding class, specifying whether to emit the BOM and whether to throw an exception on invalid bytes. |
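
As a rough illustration of how the constructor arguments change behavior, the sketch below compares the preamble each instance reports and how each one decodes a malformed byte sequence. The class name `Utf8ConstructorDemo` and the helper `ThrowsOnInvalid` are hypothetical and exist only for this example. It assumes the default instance substitutes U+FFFD for invalid bytes, which is the replacement-fallback behavior.

```csharp
using System;
using System.Text;

public static class Utf8ConstructorDemo
{
    // Hypothetical helper (not part of the library): reports whether decoding
    // the given bytes with the given encoding raises an exception.
    public static bool ThrowsOnInvalid(UTF8Encoding encoding, byte[] bytes)
    {
        try
        {
            encoding.GetString(bytes);
            return false;
        }
        catch (ArgumentException) // DecoderFallbackException derives from this
        {
            return true;
        }
    }

    public static void Main()
    {
        var plain = new UTF8Encoding();             // no BOM, no exception on invalid bytes
        var withBom = new UTF8Encoding(true);       // BOM available through GetPreamble()
        var strict = new UTF8Encoding(false, true); // no BOM, throws on invalid bytes

        Console.WriteLine(plain.GetPreamble().Length);   // 0
        Console.WriteLine(withBom.GetPreamble().Length); // 3 (EF BB BF)

        // 0xFF can never appear in well-formed UTF-8.
        byte[] invalid = { 0x41, 0xFF, 0x42 };

        // The lenient instance substitutes the replacement character U+FFFD.
        Console.WriteLine(plain.GetString(invalid).Contains("\uFFFD"));

        // The strict instance raises an exception instead.
        Console.WriteLine(ThrowsOnInvalid(strict, invalid));
    }
}
```

Strict decoding is generally the safer choice when the input comes from an untrusted source, because silent replacement can hide corrupted data.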
Methods
| Name | Description |
|---|---|
| GetByteCount(char[] chars) | Calculates the number of bytes produced by encoding all characters in the specified array of Unicode characters into UTF-8. |
| GetByteCount(string s) | Calculates the number of bytes produced by encoding the specified string into UTF-8. |
| GetBytes(char[] chars) | Encodes all the characters in the specified array of Unicode characters into a sequence of bytes. |
| GetBytes(string s) | Encodes a string into a sequence of bytes. |
| GetChars(byte[] bytes) | Decodes all the bytes in the specified byte array into characters. |
| GetCharCount(byte[] bytes) | Calculates the number of characters produced by decoding all bytes in the specified byte array from UTF-8. |
| GetString(byte[] bytes) | Decodes a sequence of bytes into a string. |
| Equals(object obj) | Determines whether the specified object is equal to the current object. |
| GetHashCode() | Serves as the default hash function. |
| ToString() | Returns a string that represents the current object. |
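
The counting and conversion methods listed above can be combined as in the following minimal sketch. The class name `Utf8MethodsDemo` and the helper `RoundTrip` are hypothetical names used only for illustration; every call on the encoding object is one of the methods in the table.

```csharp
using System;
using System.Text;

public static class Utf8MethodsDemo
{
    // Hypothetical helper (not part of the library): encodes text to UTF-8
    // bytes and decodes those bytes back through a char array.
    public static string RoundTrip(string text)
    {
        var utf8 = new UTF8Encoding();
        byte[] bytes = utf8.GetBytes(text);
        char[] chars = utf8.GetChars(bytes);
        return new string(chars);
    }

    public static void Main()
    {
        var utf8 = new UTF8Encoding();
        string text = "na\u00EFve"; // five characters; U+00EF needs two bytes

        // GetByteCount predicts the size of the array GetBytes returns.
        int byteCount = utf8.GetByteCount(text);
        byte[] bytes = utf8.GetBytes(text);
        Console.WriteLine(byteCount == bytes.Length); // True
        Console.WriteLine(byteCount);                 // 6

        // GetCharCount predicts the size of the array GetChars returns.
        int charCount = utf8.GetCharCount(bytes);
        Console.WriteLine(charCount);                 // 5

        Console.WriteLine(RoundTrip(text) == text);   // True
    }
}
```

Calling the counting methods first is mainly useful when you want to size a buffer before encoding or decoding into it.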
Example
This example demonstrates how to use the UTF8Encoding class to encode a string into a UTF-8 byte array and then decode it back into a string.
using System;
using System.Text;

public class Example
{
    public static void Main()
    {
        // Original string
        string originalString = "Hello, World! 👋";

        // Create a UTF8Encoding instance (no BOM, no exception on invalid bytes)
        UTF8Encoding utf8 = new UTF8Encoding();

        // Encode the string into a UTF-8 byte array
        byte[] utf8Bytes = utf8.GetBytes(originalString);
        Console.WriteLine($"Original String: {originalString}");
        Console.WriteLine($"UTF-8 Byte Array (length: {utf8Bytes.Length}):");
        foreach (byte b in utf8Bytes)
        {
            Console.Write($"{b:X2} "); // Print bytes in hexadecimal format
        }
        Console.WriteLine();

        // Decode the UTF-8 byte array back into a string
        string decodedString = utf8.GetString(utf8Bytes);
        Console.WriteLine($"Decoded String: {decodedString}");

        // Example demonstrating the BOM. GetBytes never writes the BOM itself;
        // the encoding exposes it through GetPreamble(), so it is prepended here.
        UTF8Encoding utf8WithBom = new UTF8Encoding(true); // Provide BOM
        byte[] preamble = utf8WithBom.GetPreamble();       // EF BB BF
        byte[] body = utf8WithBom.GetBytes(originalString);
        byte[] utf8BytesWithBom = new byte[preamble.Length + body.Length];
        preamble.CopyTo(utf8BytesWithBom, 0);
        body.CopyTo(utf8BytesWithBom, preamble.Length);
        Console.WriteLine($"\nUTF-8 Byte Array with BOM (length: {utf8BytesWithBom.Length}):");
        foreach (byte b in utf8BytesWithBom)
        {
            Console.Write($"{b:X2} ");
        }
        Console.WriteLine();

        // Using the static property (recommended)
        byte[] utf8BytesStatic = Encoding.UTF8.GetBytes(originalString);
        Console.WriteLine($"\nUTF-8 Byte Array using Encoding.UTF8 (length: {utf8BytesStatic.Length}):");
        foreach (byte b in utf8BytesStatic)
        {
            Console.Write($"{b:X2} ");
        }
        Console.WriteLine();
    }
}