Optimize generating many bitsets for serialization #365

@richardartoul

One of our heaviest workloads uses this library to index a lot of ingested data in real time. In one part of the workload we basically have dozens of goroutines doing something like the following in a loop:

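postings := roaring.NewBitmap()
for _, thing := range things {
    postings.Clear() // reuse the same bitmap for every thing
    for _, row := range $MATCHING_ROWS { // placeholder: the rows matching this thing
        postings.Add(row)
    }
    postings.WriteTo(w) // w is an io.Writer; error handling omitted
}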

We've already heavily optimized the code that generates things to be very efficient and to avoid allocations and pointers. However, when it's time to actually serialize the index we're building, we have to go through this library, and it allocates like crazy even though we're trying to reuse a single data structure.

I've provided a benchmark that demonstrates this, along with some improvements, in this PR: #364

I realize that adding pooling of internal data structures may be a big change, but I tried to structure the PR so that it's completely "opt in" via the new ClearRetainDatastructures() API.
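To make the idea concrete, here is a minimal, self-contained sketch of the difference between a clear that drops internal containers and one that retains them for reuse. It is a toy model and not the library's implementation or the code in the PR: `toyBitmap`, `ClearRetain`, `fill`, and `allocsPerIteration` are hypothetical names invented for illustration, and the containers are plain sorted `[]uint16` slices.

```go
package main

import (
	"fmt"
	"testing"
)

// toyBitmap is a hypothetical stand-in for a roaring-style bitmap: each value
// is split into a 16-bit key and a 16-bit low part kept in a per-key container.
type toyBitmap struct {
	keys       []uint16
	containers [][]uint16
	free       [][]uint16 // recycled containers, filled only by ClearRetain
}

// search returns the first index i with s[i] >= x, or len(s) if there is none.
func search(s []uint16, x uint16) int {
	lo, hi := 0, len(s)
	for lo < hi {
		mid := (lo + hi) / 2
		if s[mid] < x {
			lo = mid + 1
		} else {
			hi = mid
		}
	}
	return lo
}

// insertAt inserts x at index i, growing s only if its capacity is exhausted.
func insertAt(s []uint16, i int, x uint16) []uint16 {
	s = append(s, 0)
	copy(s[i+1:], s[i:])
	s[i] = x
	return s
}

// Add inserts v, taking a recycled container from the free list when one exists.
func (b *toyBitmap) Add(v uint32) {
	hi, lo := uint16(v>>16), uint16(v)
	i := search(b.keys, hi)
	if i == len(b.keys) || b.keys[i] != hi {
		var c []uint16
		if n := len(b.free); n > 0 {
			c, b.free = b.free[n-1], b.free[:n-1]
		}
		b.keys = insertAt(b.keys, i, hi)
		b.containers = append(b.containers, nil)
		copy(b.containers[i+1:], b.containers[i:])
		b.containers[i] = c
	}
	c := b.containers[i]
	if j := search(c, lo); j == len(c) || c[j] != lo {
		b.containers[i] = insertAt(c, j, lo)
	}
}

// Clear drops all internal storage, so the next fill must allocate again.
func (b *toyBitmap) Clear() {
	b.keys, b.containers = nil, nil
}

// ClearRetain empties the bitmap but keeps every backing array for reuse.
func (b *toyBitmap) ClearRetain() {
	for i, c := range b.containers {
		b.free = append(b.free, c[:0])
		b.containers[i] = nil
	}
	b.keys = b.keys[:0]
	b.containers = b.containers[:0]
}

// fill adds 64 values to each of 8 containers.
func fill(b *toyBitmap) {
	for k := uint32(0); k < 8; k++ {
		for v := uint32(0); v < 64; v++ {
			b.Add(k<<16 | v*3)
		}
	}
}

// allocsPerIteration reports average heap allocations per clear-and-refill cycle.
func allocsPerIteration(retain bool) float64 {
	b := &toyBitmap{}
	return testing.AllocsPerRun(100, func() {
		if retain {
			b.ClearRetain()
		} else {
			b.Clear()
		}
		fill(b)
	})
}

func main() {
	fmt.Println("allocs per iteration, Clear:      ", allocsPerIteration(false))
	fmt.Println("allocs per iteration, ClearRetain:", allocsPerIteration(true))
}
```

The trade-off in this sketch is that the retaining variant holds on to memory between iterations, which is why keeping it behind a separate, opt-in method leaves existing callers of `Clear` unaffected.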

EDIT: I can't provide screenshots of profiling for obvious reasons, but I just went back and checked, and the allocations removed by this PR represent ~56% of all allocations in our workload, so this will have a very substantial performance impact for us (and, I expect, for others who use the new API).
