@alexdowad

Thanks to the GitHub user @vi3tL0u1s (Viet Hoang Luu) for reporting this issue.

The MacJapanese legacy text encoding has a very unusual property: a string can encode more codepoints than it has bytes. In some corner cases, this caused the implementation code for mb_substr() to allocate a buffer of size -1. As you can probably imagine, that doesn't end well.
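To make the failure mode concrete, here is a minimal toy model in Python. It is not the MacJapanese conversion table and not the C code in php-src: the encoding rule (`TOY_EXPANSION`) and every function name below are hypothetical. It only shows how a size estimate that assumes "codepoints <= bytes" can come out as -1 once a single byte expands to several codepoints, and one possible way to guard against it.

```python
# Toy model only: this is NOT the MacJapanese table or the php-src code.
# Hypothetical rule: byte 0xFF decodes to three codepoints, any other byte to one.
TOY_EXPANSION = {0xFF: [0x41, 0x42, 0x43]}


def toy_decode(data):
    """Decode bytes into a list of codepoints under the hypothetical toy rule."""
    codepoints = []
    for byte in data:
        codepoints.extend(TOY_EXPANSION.get(byte, [byte]))
    return codepoints


def naive_output_size(data, start):
    """Size estimate that wrongly assumes a string never has more codepoints than bytes."""
    return len(data) - start


def guarded_output_size(data, start):
    """Same estimate, clamped so that it can never go negative."""
    return max(0, len(data) - start)


data = b"\xff"                 # 1 byte
codepoints = toy_decode(data)  # 3 codepoints
start = 2                      # a valid codepoint offset, yet larger than the byte length

print(len(data), len(codepoints))        # 1 3
print(naive_output_size(data, start))    # -1
print(guarded_output_size(data, start))  # 0
```

Clamping is only one way to handle this case; the actual change in this pull request may take a different approach.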

Fixes GH-20832.

@youkidearitai @ndossche @cmb69


@youkidearitai left a comment


Indeed. In MacJapanese, a single character sometimes decodes to multiple codepoints.
LGTM

Linked issue (may be closed when this pull request is merged): mbstring: assertion failure in mb_wchar_to_sjismac with MacJapanese encoding
