Namespace: System.Globalization

UnicodeCategory Enumeration

Defines the Unicode category of a character.

Syntax

public enum UnicodeCategory

Remarks

The UnicodeCategory enumeration specifies the Unicode general category of a character. The categories are defined by the Unicode Standard (version 3.0 when this enumeration was documented); later runtimes ship newer Unicode data, so the category reported for a given character can vary between .NET versions. The enumeration covers categories such as uppercase letters, lowercase letters, decimal digits, punctuation, symbols, and control characters. The Char.GetUnicodeCategory and CharUnicodeInfo.GetUnicodeCategory methods return a value of this enumeration for a specified character.

Unicode categories matter for text processing, internationalization, and localization because they let code classify characters from any script by category instead of hard-coding character ranges for each language.
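As an illustration of category-based text processing, the following minimal sketch tallies how many characters of a string fall into each category. The class name CategoryHistogram and the method Count are hypothetical names chosen for this example; only Char.GetUnicodeCategory and the UnicodeCategory enumeration come from the framework. The loop works on UTF-16 code units, so it assumes the input contains no surrogate pairs.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of the .NET API.
public static class CategoryHistogram
{
    // Counts how many UTF-16 code units of the input fall into each category.
    public static Dictionary<UnicodeCategory, int> Count(string text)
    {
        var counts = new Dictionary<UnicodeCategory, int>();
        foreach (char c in text)
        {
            UnicodeCategory category = Char.GetUnicodeCategory(c);
            // TryGetValue leaves 'current' at 0 when the key is not present yet.
            counts.TryGetValue(category, out int current);
            counts[category] = current + 1;
        }
        return counts;
    }

    public static void Main()
    {
        foreach (KeyValuePair<UnicodeCategory, int> entry in Count("Hello, World 42!"))
        {
            Console.WriteLine($"{entry.Key}: {entry.Value}");
        }
    }
}
```

For the sample string, the tally should contain 2 uppercase letters, 8 lowercase letters, 2 decimal digits, 2 space separators, and 2 other-punctuation characters. The order in which the entries are printed is not guaranteed by Dictionary.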

Members

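Member                   Description
UppercaseLetter          An uppercase letter (Unicode designation Lu).
LowercaseLetter          A lowercase letter (Ll).
TitlecaseLetter          A titlecase letter (Lt).
ModifierLetter           A modifier letter (Lm).
OtherLetter              A letter that is not an uppercase, lowercase, titlecase, or modifier letter (Lo).
NonSpacingMark           A nonspacing mark that modifies a base character without taking up space of its own (Mn).
SpacingCombiningMark     A spacing mark that modifies a base character and affects the width of its glyph (Mc).
EnclosingMark            An enclosing mark that surrounds the preceding characters (Me).
DecimalDigitNumber       A decimal digit in the range 0 through 9 (Nd).
LetterNumber             A number represented by a letter, such as a Roman numeral (Nl).
OtherNumber              A number that is neither a decimal digit nor a letter number, such as a fraction (No).
ConnectorPunctuation     A connector punctuation character that joins two characters, such as the underscore (Pc).
DashPunctuation          A dash or hyphen character (Pd).
OpenPunctuation          The opening character of a paired punctuation mark, such as a left parenthesis (Ps).
ClosePunctuation         The closing character of a paired punctuation mark, such as a right parenthesis (Pe).
InitialQuotePunctuation  An opening (initial) quotation mark (Pi).
FinalQuotePunctuation    A closing (final) quotation mark (Pf).
OtherPunctuation         A punctuation character that fits none of the other punctuation categories (Po).
MathSymbol               A mathematical symbol, such as "+" or "=" (Sm).
CurrencySymbol           A currency symbol (Sc).
ModifierSymbol           A modifier symbol that modifies surrounding characters (Sk).
OtherSymbol              A symbol that is not a math, currency, or modifier symbol (So).
LineSeparator            A line separator (Zl).
ParagraphSeparator       A paragraph separator (Zp).
Control                  A control character (Cc).
Format                   A format character that affects layout or processing but is not normally rendered (Cf).
Surrogate                A high or low surrogate code unit (Cs).
PrivateUse               A private-use character (Co).
SpaceSeparator           A space character that has no glyph but is not a control or format character (Zs).
OtherNotAssigned         A code point that is not assigned to any character (Cn).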
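The thirty members fall into seven broader classes (letter, mark, number, punctuation, symbol, separator, other), following the first letter of each Unicode designation. The sketch below shows one way to collapse a UnicodeCategory value into such a class. The class MajorCategory, the method Of, and the returned strings are hypothetical and exist only for this illustration; the framework does not define them. The code uses the member names as declared in the enumeration, where the spacing mark member is SpacingCombiningMark.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the .NET API.
public static class MajorCategory
{
    // Maps a general category to its broader class.
    public static string Of(UnicodeCategory category)
    {
        switch (category)
        {
            case UnicodeCategory.UppercaseLetter:
            case UnicodeCategory.LowercaseLetter:
            case UnicodeCategory.TitlecaseLetter:
            case UnicodeCategory.ModifierLetter:
            case UnicodeCategory.OtherLetter:
                return "Letter";
            case UnicodeCategory.NonSpacingMark:
            case UnicodeCategory.SpacingCombiningMark:
            case UnicodeCategory.EnclosingMark:
                return "Mark";
            case UnicodeCategory.DecimalDigitNumber:
            case UnicodeCategory.LetterNumber:
            case UnicodeCategory.OtherNumber:
                return "Number";
            case UnicodeCategory.ConnectorPunctuation:
            case UnicodeCategory.DashPunctuation:
            case UnicodeCategory.OpenPunctuation:
            case UnicodeCategory.ClosePunctuation:
            case UnicodeCategory.InitialQuotePunctuation:
            case UnicodeCategory.FinalQuotePunctuation:
            case UnicodeCategory.OtherPunctuation:
                return "Punctuation";
            case UnicodeCategory.MathSymbol:
            case UnicodeCategory.CurrencySymbol:
            case UnicodeCategory.ModifierSymbol:
            case UnicodeCategory.OtherSymbol:
                return "Symbol";
            case UnicodeCategory.SpaceSeparator:
            case UnicodeCategory.LineSeparator:
            case UnicodeCategory.ParagraphSeparator:
                return "Separator";
            default:
                // Control, Format, Surrogate, PrivateUse, OtherNotAssigned.
                return "Other";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Of(Char.GetUnicodeCategory('_')));      // Punctuation
        Console.WriteLine(Of(Char.GetUnicodeCategory('+')));      // Symbol
        Console.WriteLine(Of(Char.GetUnicodeCategory('\u0301'))); // Mark
    }
}
```

A switch keeps the mapping explicit and makes it easy to see which members belong to each class; a lookup table indexed by the enumeration value would work equally well.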

Example

C# Code Example

using System;
using System.Globalization;

public class Example
{
    public static void Main(string[] args)
    {
        // Sample characters: two letters, a digit, a punctuation mark, and a symbol.
        char char1 = 'A';
        char char2 = 'b';
        char char3 = '5';
        char char4 = '?';
        char char5 = '\u20AC'; // Euro sign

        // Look up the Unicode general category of each character.
        UnicodeCategory category1 = Char.GetUnicodeCategory(char1);
        UnicodeCategory category2 = Char.GetUnicodeCategory(char2);
        UnicodeCategory category3 = Char.GetUnicodeCategory(char3);
        UnicodeCategory category4 = Char.GetUnicodeCategory(char4);
        UnicodeCategory category5 = Char.GetUnicodeCategory(char5);

        Console.WriteLine($"'{char1}' is a {category1}");
        Console.WriteLine($"'{char2}' is a {category2}");
        Console.WriteLine($"'{char3}' is a {category3}");
        Console.WriteLine($"'{char4}' is a {category4}");
        Console.WriteLine($"'{char5}' is a {category5}");
    }
}

Output

'A' is a UppercaseLetter
'b' is a LowercaseLetter
'5' is a DecimalDigitNumber
'?' is a OtherPunctuation
'€' is a CurrencySymbol
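The example above classifies single char values. Characters outside the Basic Multilingual Plane are stored as two UTF-16 code units, so a per-char lookup reports Surrogate for each half. The following sketch contrasts that with the CharUnicodeInfo.GetUnicodeCategory(String, Int32) overload, which takes the surrogate pair at the given index into account. The class name SupplementaryExample and its two helper methods are hypothetical; the sample character U+10400 (DESERET CAPITAL LETTER LONG I) was chosen because it has been an uppercase letter since early Unicode versions.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the .NET API.
public static class SupplementaryExample
{
    // Category of the single UTF-16 code unit at the given index.
    public static UnicodeCategory PerChar(string text, int index)
    {
        return Char.GetUnicodeCategory(text[index]);
    }

    // Category of the character at the given index, treating a
    // surrogate pair starting there as one code point.
    public static UnicodeCategory PerCodePoint(string text, int index)
    {
        return CharUnicodeInfo.GetUnicodeCategory(text, index);
    }

    public static void Main()
    {
        // U+10400, encoded as the surrogate pair D801 DC00.
        string deseret = "\U00010400";

        Console.WriteLine($"Length: {deseret.Length}");                   // Length: 2
        Console.WriteLine($"Per char: {PerChar(deseret, 0)}");            // Per char: Surrogate
        Console.WriteLine($"Per code point: {PerCodePoint(deseret, 0)}"); // Per code point: UppercaseLetter
    }
}
```

When iterating over text that may contain such characters, advancing by two code units after a high surrogate (for example, by checking Char.IsHighSurrogate) avoids classifying the low surrogate a second time.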

Requirements

Available in: .NET Framework 2.0, .NET Core 1.0, .NET Standard 1.0, .NET 5 and later versions.
Assembly: mscorlib.dll on .NET Framework; in .NET 5 and later versions the type is exposed through System.Runtime.dll.