We have a few instances of "given a task index and/or the total number of tasks, determine which partitions to execute". This logic is copy-pasted in a few places.
Also, implementors of `TaskEstimator` or even `ExecutionPlan` are expected to use `task_idx` and `num_tasks`. The risk is that an implementor may execute the wrong partition if they get the math wrong. Even within the code base, this math is copy-pasted several times. The math also differs depending on the type of network boundary: determining which partitions to run for a Coalesce is different than for a Shuffle.
```rust
// datafusion-distributed/src/execution_plans/network_shuffle.rs (lines 232 and 238 at 030ee2c)
let off = self.properties.partitioning.partition_count() * task_context.task_index;
// ...
off..(off + self.properties.partitioning.partition_count()),
```
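To illustrate why this is error-prone, here is a minimal sketch (the helper names and the coalesce splitting strategy are assumptions, not the actual code) of how the partition math differs between the two boundary types:

```rust
use std::ops::Range;

/// Hypothetical helper: under a shuffle-style boundary, each task owns a
/// contiguous block of `partitions_per_task` partitions, mirroring the
/// snippet above (`off = partition_count * task_index`).
fn shuffle_partitions(partitions_per_task: usize, task_index: usize) -> Range<usize> {
    let off = partitions_per_task * task_index;
    off..(off + partitions_per_task)
}

/// Hypothetical helper: under a coalesce-style boundary, a fixed total
/// partition count must be split across `num_tasks`, so the formula (and
/// its edge cases, e.g. a short final range) is different.
fn coalesce_partitions(total_partitions: usize, num_tasks: usize, task_index: usize) -> Range<usize> {
    let per_task = total_partitions.div_ceil(num_tasks);
    let start = (per_task * task_index).min(total_partitions);
    let end = (start + per_task).min(total_partitions);
    start..end
}

fn main() {
    // Shuffle: task 2 with 4 partitions per task reads partitions 8..12.
    assert_eq!(shuffle_partitions(4, 2), 8..12);
    // Coalesce: 10 partitions over 3 tasks; the last task gets a short range.
    assert_eq!(coalesce_partitions(10, 3, 2), 8..10);
}
```

Every implementor that re-derives one of these formulas is a chance for the two to drift apart.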
Is there something better we can do? E.g., pass an explicit range around instead of `num_tasks` and `task_idx`.
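One possible shape for that, sketched below with made-up names (`TaskPartitions` is not an existing type): the framework computes the boundary-specific range once and implementors only consume it, so the math lives in a single place.

```rust
use std::ops::Range;

/// Hypothetical context handed to implementors: instead of `task_idx` and
/// `num_tasks`, it carries the exact partitions this task must execute.
struct TaskPartitions {
    /// Computed once by the framework, per network-boundary type.
    partition_range: Range<usize>,
}

/// An implementor no longer multiplies indices; it only iterates the range.
fn execute_task(ctx: &TaskPartitions) -> Vec<usize> {
    ctx.partition_range.clone().collect()
}

fn main() {
    let ctx = TaskPartitions { partition_range: 8..12 };
    assert_eq!(execute_task(&ctx), vec![8, 9, 10, 11]);
}
```

A `Range<usize>` also makes the "wrong partition" failure mode harder to hit: an implementor can at worst skip partitions it was given, not compute an offset into someone else's.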