[API Proposal]: Add no-allocation methods to StringInfo #123092

@Joy-less

Background and motivation

StringInfo is useful for getting graphemes (text elements) from a string. However, some of its methods require constructing a StringInfo instance, which internally creates an array of indexes. Both the instance and the array are extra Gen0 allocations.

Some benchmarks in the comments of #123077 suggest that no-allocation methods can remove all Gen0 allocations (as well as improving performance by 3-8%). This is important for games and UI apps which potentially run code every frame and wish to avoid GC spikes.

Suggested methods:

  • static int StringInfo.GetLengthInTextElements(scoped ReadOnlySpan<char>)
    • Useful for validating a maximum length (username, message, etc.), or for checking whether text needs to be truncated for display (link preview, chat message, etc.)
  • static Range? StringInfo.GetRangeByTextElements(scoped ReadOnlySpan<char>, int, int)
    • Useful for getting the first N text elements
  • Range? StringInfo.RangeByTextElements(int, int)
    • Useful for performantly taking substrings of text elements repeatedly on the same string
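
For context, the counting method above can be approximated today without allocating, using the existing span overload StringInfo.GetNextTextElementLength(ReadOnlySpan<char>). The following is only a minimal sketch under that assumption; TextElementSketch and CountTextElements are hypothetical names for illustration, not part of this proposal or of the BCL.

```csharp
using System;
using System.Globalization;

// "e" followed by U+0301 (combining acute accent) forms a single text element.
Console.WriteLine(TextElementSketch.CountTextElements("abc"));       // prints 3
Console.WriteLine(TextElementSketch.CountTextElements("e\u0301x"));  // prints 2

// Hypothetical helper (not a BCL type): approximates the proposed
// StringInfo.GetLengthInTextElements(ReadOnlySpan<char>) with existing APIs.
static class TextElementSketch
{
    public static int CountTextElements(ReadOnlySpan<char> text)
    {
        int count = 0;
        while (!text.IsEmpty)
        {
            // Existing API: length in chars of the first text element in the span.
            int elementLength = StringInfo.GetNextTextElementLength(text);
            text = text.Slice(elementLength);
            count++;
        }
        return count;
    }
}
```

A built-in method could presumably do the same walk internally, which is where the allocation savings would come from: no StringInfo instance and no index array are needed just to count.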

API Proposal

namespace System.Globalization;

public class StringInfo
{
    /// <summary>
    /// Returns the number of text elements in the given span.
    /// </summary>
    public static int GetLengthInTextElements(scoped ReadOnlySpan<char> str);
    /// <summary>
    /// Retrieves a substring of text elements from the given span, and returns the range of the substring, or null if out of range.
    /// </summary>
    public static Range? GetRangeByTextElements(scoped ReadOnlySpan<char> str, int startingTextElement, int lengthInTextElements);
    /// <summary>
    /// Retrieves a substring of text elements from the string, and returns the range of the substring, or null if out of range.
    /// </summary>
    public Range? RangeByTextElements(int startingTextElement, int lengthInTextElements);
}

API Usage

StringInfo.GetLengthInTextElements:

string username = usernameInputBox.Text;

if (StringInfo.GetLengthInTextElements(username) > 20) {
    // Make text input box red
}
else {
    // Make text input box green
}

StringInfo.GetRangeByTextElements:

string description = "This is an example string for demonstrative purposes.";

// The third argument is a length in text elements, so 5 selects the first five.
Range? firstFiveTextElementsRange = StringInfo.GetRangeByTextElements(description, 0, 5);
ReadOnlySpan<char> firstFiveTextElements = description.AsSpan(firstFiveTextElementsRange.Value);

Console.WriteLine($"First 5 text elements of description: {firstFiveTextElements}");

StringInfo.RangeByTextElements:

string description = "This is an example string for demonstrative purposes.";

StringInfo stringInfo = new(description);
// The second argument is a length in text elements, so 5 selects the first five.
Range? firstFiveTextElementsRange = stringInfo.RangeByTextElements(0, 5);
ReadOnlySpan<char> firstFiveTextElements = description.AsSpan(firstFiveTextElementsRange.Value);

Console.WriteLine($"First 5 text elements of description: {firstFiveTextElements}");
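
To make the intended behaviour of the range methods concrete, here is a rough sketch of how GetRangeByTextElements might be emulated with the existing StringInfo.GetNextTextElementLength(ReadOnlySpan<char>) overload. RangeSketch is a hypothetical name, and the rule for returning null (negative arguments, or a request that runs past the last text element) is an assumption, since the proposal only says "null if out of range".

```csharp
using System;
using System.Globalization;

string description = "This is an example string for demonstrative purposes.";

Range? range = RangeSketch.GetRangeByTextElements(description, 0, 5);
if (range is Range r)
{
    // Slicing the span with a Range does not allocate a new string.
    ReadOnlySpan<char> firstFive = description.AsSpan()[r];
    Console.WriteLine(firstFive.ToString()); // prints "This "
}

// Hypothetical helper (not a BCL type) emulating the proposed static method.
static class RangeSketch
{
    public static Range? GetRangeByTextElements(
        ReadOnlySpan<char> text, int startingTextElement, int lengthInTextElements)
    {
        // Assumption: negative arguments count as "out of range".
        if (startingTextElement < 0 || lengthInTextElements < 0) return null;

        int position = 0;
        // Skip the leading text elements to find the start offset in chars.
        for (int i = 0; i < startingTextElement; i++)
        {
            if (position >= text.Length) return null;
            position += StringInfo.GetNextTextElementLength(text.Slice(position));
        }

        int start = position;
        // Advance over the requested number of text elements.
        for (int i = 0; i < lengthInTextElements; i++)
        {
            if (position >= text.Length) return null;
            position += StringInfo.GetNextTextElementLength(text.Slice(position));
        }

        return start..position;
    }
}
```

Returning a Range rather than a span keeps the result usable with strings, spans, and memory alike, which seems to be the motivation for this shape over the buffer-copying alternative.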

Alternative Designs

The range methods could look like this instead:

public class StringInfo
{
    /// <summary>
    /// Retrieves a substring of text elements from the given span, writes it to the given output, and returns the number of characters written to the output. If the output is too small to contain the substring, -1 is returned.
    /// </summary>
    public static int GetSubstringByTextElements(scoped ReadOnlySpan<char> str, int startingTextElement, int lengthInTextElements, Span<char> output);
    /// <summary>
    /// Retrieves a substring of text elements from the string, writes it to the given output, and returns the number of characters written to the output. If the output is too small to contain the substring, -1 is returned.
    /// </summary>
    public int SubstringByTextElements(int startingTextElement, int lengthInTextElements, Span<char> output);
}

StringInfo.GetSubstringByTextElements:

string description = "This is an example string for demonstrative purposes.";

Span<char> output = stackalloc char[64];
// The third argument is a length in text elements, so 5 selects the first five.
int outputCharsWritten = StringInfo.GetSubstringByTextElements(description, 0, 5, output);
if (outputCharsWritten < 0) {
    throw new InvalidOperationException("First 5 text elements more than 64 chars");
}
ReadOnlySpan<char> outputReadOnly = output[..outputCharsWritten];

Console.WriteLine($"First 5 text elements of description: {outputReadOnly}");

StringInfo.SubstringByTextElements:

string description = "This is an example string for demonstrative purposes.";

Span<char> output = stackalloc char[64];
StringInfo stringInfo = new(description);
// The second argument is a length in text elements, so 5 selects the first five.
int outputCharsWritten = stringInfo.SubstringByTextElements(0, 5, output);
if (outputCharsWritten < 0) {
    throw new InvalidOperationException("First 5 text elements more than 64 chars");
}
ReadOnlySpan<char> outputReadOnly = output[..outputCharsWritten];

Console.WriteLine($"First 5 text elements of description: {outputReadOnly}");

Risks

There should be no breaking changes or performance regressions with this proposal.

The naming style (Get{...} for static methods, {...} for instance methods) matches the methods already provided by StringInfo.

Metadata


    Labels

    api-suggestion (early API idea and discussion, it is NOT ready for implementation), area-System.Globalization, untriaged (new issue has not been triaged by the area owner)
