Description
Background and motivation
StringInfo is useful for getting graphemes from a string. However, some of its methods require creating a StringInfo instance, which internally creates an array of indexes; both are extra Gen0 allocations.
Some benchmarks in the comments of #123077 suggest that allocation-free methods can remove all Gen0 allocations (as well as improving performance by 3-8%). This is important for games and UI apps that may run this code every frame and want to avoid GC spikes.
Suggested methods:
static int StringInfo.GetLengthInTextElements(scoped ReadOnlySpan<char>) - Useful for validating a maximum length (username, message, etc.) or for checking whether text needs to be truncated for display (link preview, chat message, etc.)
static Range? StringInfo.GetRangeByTextElements(scoped ReadOnlySpan<char>, int, int) - Useful for getting the first N text elements
Range? StringInfo.RangeByTextElements(int, int) - Useful for efficiently taking repeated substrings of text elements from the same string
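For illustration, the two static methods can be approximated today on top of the existing StringInfo.GetNextTextElementLength(ReadOnlySpan<char>) overload (.NET 5+). This is only a rough sketch, not the proposed implementation: the TextElementSketch class name is hypothetical, and it assumes "out of range" means that a negative argument was passed or that the span contains fewer text elements than requested.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; it only approximates the proposed StringInfo methods.
public static class TextElementSketch
{
    // Counts text elements by repeatedly measuring the next one; no heap allocations.
    public static int GetLengthInTextElements(ReadOnlySpan<char> str)
    {
        int count = 0;
        while (!str.IsEmpty)
        {
            // Existing API: length in chars of the first text element (>= 1 for non-empty input).
            str = str.Slice(StringInfo.GetNextTextElementLength(str));
            count++;
        }
        return count;
    }

    // Returns the char range covering lengthInTextElements text elements, starting at
    // text element startingTextElement. Assumption: null is returned for negative
    // arguments or when the span ends before the requested elements are covered.
    public static Range? GetRangeByTextElements(ReadOnlySpan<char> str, int startingTextElement, int lengthInTextElements)
    {
        if (startingTextElement < 0 || lengthInTextElements < 0)
        {
            return null;
        }

        int pos = 0;

        // Skip the text elements that precede the requested start.
        for (int i = 0; i < startingTextElement; i++)
        {
            if (pos >= str.Length) return null;
            pos += StringInfo.GetNextTextElementLength(str.Slice(pos));
        }

        int start = pos;

        // Advance over the requested number of text elements.
        for (int i = 0; i < lengthInTextElements; i++)
        {
            if (pos >= str.Length) return null;
            pos += StringInfo.GetNextTextElementLength(str.Slice(pos));
        }

        return start..pos;
    }
}
```

With the input "e\u0301abc" (an "e" followed by a combining acute accent, then "abc"), the accent combines with the "e" into a single text element, so the count is 4 and the first two text elements span chars 0..3.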
API Proposal
namespace System.Globalization;
public class StringInfo
{
/// <summary>
/// Returns the number of text elements in the given span.
/// </summary>
public static int GetLengthInTextElements(scoped ReadOnlySpan<char> str);
/// <summary>
/// Retrieves a substring of text elements from the given span, and returns the range of the substring, or null if out of range.
/// </summary>
public static Range? GetRangeByTextElements(scoped ReadOnlySpan<char> str, int startingTextElement, int lengthInTextElements);
/// <summary>
/// Retrieves a substring of text elements from the string, and returns the range of the substring, or null if out of range.
/// </summary>
public Range? RangeByTextElements(int startingTextElement, int lengthInTextElements);
}
API Usage
StringInfo.GetLengthInTextElements:
string username = usernameInputBox.Text;
if (StringInfo.GetLengthInTextElements(username) > 20) {
// Make text input box red
}
else {
// Make text input box green
}
StringInfo.GetRangeByTextElements:
string description = "This is an example string for demonstrative purposes.";
// Request 5 text elements starting at text element 0.
Range? firstFiveTextElementsRange = StringInfo.GetRangeByTextElements(description, 0, 5);
ReadOnlySpan<char> firstFiveTextElements = description.AsSpan(firstFiveTextElementsRange.Value);
Console.WriteLine($"First 5 text elements of description: {firstFiveTextElements}");
StringInfo.RangeByTextElements:
string description = "This is an example string for demonstrative purposes.";
StringInfo stringInfo = new(description);
// Request 5 text elements starting at text element 0.
Range? firstFiveTextElementsRange = stringInfo.RangeByTextElements(0, 5);
ReadOnlySpan<char> firstFiveTextElements = description.AsSpan(firstFiveTextElementsRange.Value);
Console.WriteLine($"First 5 text elements of description: {firstFiveTextElements}");
Alternative Designs
The range methods could look like this instead:
public class StringInfo
{
/// <summary>
/// Retrieves a substring of text elements from the given span, writes it to the given output, and returns the number of characters written to the output. If the output is too small to contain the substring, -1 is returned.
/// </summary>
public static int GetSubstringByTextElements(scoped ReadOnlySpan<char> str, int startingTextElement, int lengthInTextElements, Span<char> output);
/// <summary>
/// Retrieves a substring of text elements from the string, writes it to the given output, and returns the number of characters written to the output. If the output is too small to contain the substring, -1 is returned.
/// </summary>
public int SubstringByTextElements(int startingTextElement, int lengthInTextElements, Span<char> output);
}
StringInfo.GetSubstringByTextElements:
string description = "This is an example string for demonstrative purposes.";
Span<char> output = stackalloc char[64];
int outputCharsWritten = StringInfo.GetSubstringByTextElements(description, 0, 5, output);
if (outputCharsWritten < 0) {
throw new InvalidOperationException("First 5 text elements more than 64 chars");
}
ReadOnlySpan<char> outputReadOnly = output[..outputCharsWritten];
Console.WriteLine($"First 5 text elements of description: {outputReadOnly}");
StringInfo.SubstringByTextElements:
string description = "This is an example string for demonstrative purposes.";
Span<char> output = stackalloc char[64];
StringInfo stringInfo = new(description);
int outputCharsWritten = stringInfo.SubstringByTextElements(0, 5, output);
if (outputCharsWritten < 0) {
throw new InvalidOperationException("First 5 text elements more than 64 chars");
}
ReadOnlySpan<char> outputReadOnly = output[..outputCharsWritten];
Console.WriteLine($"First 5 text elements of description: {outputReadOnly}");
Risks
There should be no breaking changes or performance regressions with this proposal.
The naming style (Get{...} for static methods, {...} for instance methods) matches the methods already provided by StringInfo.