
Conversation

@Beinsezii
Contributor

What does this PR do?

Fixes a small performance regression for Z Image Turbo.

It sets attn_mask to None whenever the mask would otherwise be all ones, which is always the case for Z Image Turbo, since guidance_scale == 1 in typical usage.

On an H100, this improves performance by about 4% with AttentionBackendName._NATIVE_CUDNN.
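A minimal sketch of the idea described above, assuming a generic PyTorch SDPA call path (the function name and shapes are illustrative, not the actual diffusers code): an all-True mask masks nothing, so passing None instead lets the backend dispatch to its faster unmasked kernels.

```python
import torch
import torch.nn.functional as F

def attention(q, k, v, attn_mask=None):
    # Hypothetical helper illustrating the optimization: an all-ones
    # (all-True) mask is a no-op, so drop it and let SDPA take the
    # unmasked fast path (e.g. the cuDNN backend).
    if attn_mask is not None and bool(attn_mask.all()):
        attn_mask = None
    return F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)

# The two calls produce the same result; only the masked one pays
# the cost of applying a mask that masks nothing.
q = k = v = torch.randn(1, 8, 16, 64)
mask = torch.ones(1, 8, 16, 16, dtype=torch.bool)  # all True: masks nothing
out_masked = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
out_fast = attention(q, k, v, attn_mask=mask)
```

In practice the check would be done where the mask is constructed (to avoid a device sync from `.all()` on a GPU tensor), but the effect is the same: the backend never sees a mask when guidance_scale == 1.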

Before submitting

Who can review?

@yiyixuxu or @sayakpaul probably

@sayakpaul
Member

Cc: @JerryWu-code who contributed the model.

@Beinsezii
Contributor Author

This is even more relevant with the new, higher-compute model.

@Beinsezii
Contributor Author

We've been running this change in production for Turbo for a while.

@Beinsezii Beinsezii force-pushed the beinsezii/zimg_optim branch from 02ae19d to b6d107a Compare January 28, 2026 00:25
