Skip to content

Conversation

@ld-kerley
Copy link
Contributor

While continuing to investigate differences between GLSL and MSL highlighted in #2697 it became apparent that GLSL and MSL don't support the same behavior for constant wrap mode - so in this PR we implement the GLSL version directly in shader source to ensure portability across languages.

We may take this further in the future and implement the rest of the wrap modes if differences appear.

One alternative solution here could be to implement MSL versions of the image node implementations with this additional code, and continue to rely on the GLSL wrap mode behavior, but this will lead to a higher maintenance burden, and its also possible that other new languages might also behave differently, so a direct source code implementation ensures all future languages follow the same behavior.

glsl_vs_msl_wrap_mode_fix.pdf

ld-kerley and others added 2 commits December 2, 2025 11:48
…0 size if use original 512. (122 to 10 MB). and with 256 width to 3MB.

This makes easier to store comparisons images in PDF.
The idea is to use base64 encoding into JPG and embed it into the html.This also means less disk read / writes for diff image generation.
Uses OpenCV + numpy for fast resize. PIL and OpenImage are much slower for this encoding case.
JSON storage is about 4mb from 160K when embedding encoded images.
Replaced diff with OpenCV + numpy instead of PIL to avoid extra data copies.

Format updates.

Preserve the image size, border and spacing from original version.

- Fix responsive formatting for HTML.
  - Will pack by default (prevous behaviour)
  - -w <=0 will make it responsive.
- Revert -e arg help.

Fix error tolerance test.

Was pruning when no error tolerance set.

Update output for test output builder

- Add in option to filter out rows based on minimum RMS value.
- Add in option JSON, and Markdown options
- Cleaned up HTML output.
Copy link
Contributor

@kwokcb kwokcb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good change. I'm just curious how this is affected by the wrap mode set via the API ? I assume that if it's repeating then this code will do the appropriate logic. However is there a case where if clamping is set that the clamp color will show up along the 0..1 borders ?

Copy link
Contributor

@kwokcb kwokcb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually never mind. I did not look closely at the new results. It all looks good.

@jstone-lucasfilm
Copy link
Member

Although this may end up being a good future strategy for hardware shading languages in MaterialX, I have a lot of questions about the potential performance implications!

Would this approach reduce performance for complex ALU-bound materials in real-time renderers? It doesn't seem obvious to me what the answer would be, and we should be very sure before proceeding.

@jstone-lucasfilm
Copy link
Member

My guess is that this conditional and early return would be implemented as a dynamic branch by the shader compiler, which can have significant performance implications in shaders that require a large number of registers.

I'd love to hear from folks with strong, recent expertise in real-time shading, though, and it could be that the proposed new code is less problematic than my own experience in this domain would suggest.

@kwokcb
Copy link
Contributor

kwokcb commented Dec 3, 2025

Afaik, the uv can be computed along with the sampler lookup, and then a select should be done afterwards. The lookup and select shoudl mask each other, and avoid divergent branching.

Apologies for not being more vigilant in the review.

Copy link
Contributor

@kwokcb kwokcb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for th flipping Lee. Jonathan is right to check perf. Please check if what I've put in is correct, but is what I looked up as being the better option.

@kwokcb
Copy link
Contributor

kwokcb commented Dec 3, 2025

void mx_image_color3($texSamplerSignature,
                     int layer,
                     float3 defaultval,
                     float2 texcoord,
                     int uaddressmode,
                     int vaddressmode,
                     int filtertype,
                     int framerange,
                     int frameoffset,
                     int frameendaction,
                     float2 uv_scale,
                     float2 uv_offset,
                     thread float3& result)
{
    float2 uv = mx_transform_uv(texcoord, uv_scale, uv_offset);

    bool outsideU = (uaddressmode == 0) && (uv.x < 0.0 || uv.x > 1.0);
    bool outsideV = (vaddressmode == 0) && (uv.y < 0.0 || uv.y > 1.0);
    bool useDefault = outsideU || outsideV;

    // Always sample once (avoids divergent control flow)
    float3 sampled = texture($texSamplerSampler2D, uv).rgb;

    // Select default or sample (branchless in Metal).
    // mix() appears to perform better with float3 than select()
    result = mix(sampled, defaultval, float(useDefault))
    // result = select(sampled, defaultval, useDefault);
}

I think you want something like the above.

@ld-kerley
Copy link
Contributor Author

I'm happy to go with either route - and am certainly not the real time performance expert. In my limited local testing I wasn't able to measure any performance difference at all, but maybe someone with access to more complex production assets could share some concrete numbers to inform which direction we should head here?

I'm very happy to update the code per @bernardkwok suggestion here - in fact that could would be easier share in a library function.

Either way someone with better access to data should profile this - I don't really think the assets we have in the test suite are complex enough perhaps?

@ld-kerley
Copy link
Contributor Author

Do we want to handle the other wrap modes in a similar way here as well? They match with todays GLSL/MSL comparison, but they do require the render engine the shader is being passed to to correctly configure the texture.

It would be possible to write them all in shader source and ensure they are consistent

@kwokcb
Copy link
Contributor

kwokcb commented Dec 3, 2025

Hi @ld-kerley,

Thanks for following up.

As we don't put in conditional branches (and especially not texture fetches within a branch) the number of ALU instructions added for the test and hence processing time is masked (& should run in parallel) by the much more expensive texture fetch. Looks around 10-12 ALU ops.

Even if we added in all the logic for full wrap support should still be minimal, but maybe best for a later PR to avoid stalling this fix. There can be a few approaches:

  • a generalized version with some macro replacement for optimal intrinsic calls;
  • specialized versions for each language (GLSL, MSL, ESSL etc), or
  • stick with generalized only code.

I've added this issue

@jstone-lucasfilm, does this sound like a reasonable path forward ?

@jstone-lucasfilm
Copy link
Member

@kwokcb @ld-kerley I'd be interested in a deeper understanding of the differences in support for texture address modes across GLSL, HLSL, MSL, WGSL, and other hardware shading languages.

Is there a broad pattern of divergence across hardware shading languages, or is MSL simply missing one of the modes that GLSL, HLSL, and WGSL all support in a consistent way?

@kwokcb
Copy link
Contributor

kwokcb commented Jan 12, 2026

Based on the spec

The following values are supported by uaddressmode and vaddressmode inputs of Texture nodes:

“constant”: Texture coordinates outside the 0-1 range return the value of the node's default input.
“clamp”: Texture coordinates are clamped to the 0-1 range before sampling the image.
“periodic”: Texture coordinates outside the 0-1 range "wrap around", effectively being processed by a modulo 1 operation before sampling the image.
"mirror": Texture coordinates outside the 0-1 range will be mirrored back into the 0-1 range, e.g. u=-0.01 will return the u=0.01 texture coordinate value, and u=1.01 will return the u=0.99 texture coordinate value.

Here is a basic shader breakdown.

MaterialX Mode GLSL (Desktop) ESSL (OpenGL ES 3.x) HLSL (D3D11/12) MSL (Metal) WGSL (WebGPU) OSL MDL Shader-only equivalent
constant No No Yes No No No No Yes (uv range check)
clamp Yes Yes Yes Yes Yes Yes Yes Yes (uv = clamp(uv,0,1))
periodic Yes Yes Yes Yes Yes Yes Yes Yes (uv = fract(uv))
mirror Yes Yes Yes Yes Yes Yes Yes Yes (uv = abs(fract(uv*0.5)*2-1))

Note that the current hardware only mode setting (clamp-to-border-color) is ** can differ a bit with the implementation's in OSL and MDL which perform color replacement w/o texture lookup.

Thus I'd recommend that constant for Metal can be done first since it's not showing up at all.

The other modes could be turned into shader code for hardware languages to avoid having to make renderers make pipeline changes when a mode changes. ( It is not obvious at all that a shader parameter changes implies a pipeline change. )

@jstone-lucasfilm
Copy link
Member

@kwokcb Can you elaborate on which aspect of the constant texture address mode is wrong in OpenGL/GLSL? Much like DirectX/HLSL, OpenGL/GLSL supports a GL_TEXTURE_BORDER_COLOR control that can be paired with the GL_CLAMP_TO_BORDER address mode:

https://registry.khronos.org/OpenGL-Refpages/gl4/html/glTexParameter.xhtml

That feature was one of the original models for the constant address mode in MaterialX, and I'd like to learn more about any situations in which its behavior doesn't match our specification.

@kwokcb
Copy link
Contributor

kwokcb commented Jan 12, 2026

In hardware there is filtering on the clamp texels so this can have an effect for instance with normal maps, displacement maps and high frequency color, and filtering that pulls in more texels (like anisotropic).

Copy link
Contributor

@kwokcb kwokcb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ld-kerley ,
@jstone-lucasfilm agreed offline that replace the general GLSL implementation is too intrusive. The proposal is to branch off an MSL only version of this and to only address the constant mode. Thanks.

@ld-kerley
Copy link
Contributor Author

Can I perhaps propose an alternative ideas? What if we made the use of shader code based wrap modes a shader generation option? If enabled we could use the shader code, meaning we would have portable shaders with consistent image generation, and if disabled then users could opt in to a possibly more performant render path, to achieve the same thing, assuming their render target is correctly configuring the textures.

I think of the bigger advantages of this approach is that we have the option to generate code that will interpret the texture reads in a consistent manner, across different shader languages, but also across different engines, some of which may not be correctly interrogating the MaterialX texture data and correctly configuring the wrap modes on the texture object directly.

For example, his would allow simpler use of MaterialX shaders in targets like shaderToy - where not all wrap modes are provided.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants