{rack-awareness-docs}[Rack awareness] is a feature in Apache Hadoop that allows users to define a cluster's node topology.
Hadoop uses that topology to distribute block replicas in a way that maximizes fault tolerance.
@@ -35,3 +35,7 @@ This creates an internal topology label by combining the values of the `topology
To gather this information, the Hadoop images include the {hdfs-topology-provider}[hdfs-topology-provider] on the classpath, which can be configured to read labels from Kubernetes objects.
The operator deploys ClusterRoles and ServiceAccounts with the relevant RBAC rules to allow the Hadoop Pod to access the necessary Kubernetes objects.
+Topologies and other metadata such as Node- and Pod-IPs and endpoints are held in separate caches so that they can be refreshed independently of one another.
+The {hdfs-topology-provider}[hdfs-topology-provider] is namespace-scoped: it watches pods in the active namespace and propagates changes to its internal cache, which minimises cache misses.
+
+NOTE: Rack awareness may not work as expected on clusters such as `kind` or `k3s` that configure IP masquerading differently to production-ready distributions.
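
The idea of holding topologies and Pod metadata in separate, independently refreshed caches can be illustrated with a minimal Python sketch. This is not the hdfs-topology-provider's actual implementation: the `ExpiringCache` class, its `get` and `put` methods, and the expiry values in the usage note are all hypothetical names and numbers chosen for illustration.

```python
import time
from typing import Callable, Dict, Generic, Tuple, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class ExpiringCache(Generic[K, V]):
    """Hypothetical cache whose entries go stale after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key -> (time the entry was stored, cached value).
        self._entries: Dict[K, Tuple[float, V]] = {}

    def get(self, key: K, loader: Callable[[K], V]) -> V:
        """Return a fresh cached value, or reload it via loader on a miss."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = loader(key)  # miss or stale entry: fetch again
        self._entries[key] = (now, value)
        return value

    def put(self, key: K, value: V) -> None:
        """Store a value pushed by a watch event so a later get is a hit."""
        self._entries[key] = (self._clock(), value)
```

Under these assumptions, one instance per kind of data, for example `topology_cache = ExpiringCache(300.0)` and `pod_ip_cache = ExpiringCache(30.0)`, lets slowly changing node labels and faster changing Pod IPs expire on their own schedules, while `put` models a watch event updating the cache before a lookup would otherwise miss.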