CWE-135: Incorrect Calculation of Multi-Byte String Length
What is Incorrect Calculation of Multi-Byte String Length?
• Overview: Incorrect Calculation of Multi-Byte String Length occurs when a program miscalculates the length of a string that contains wide or multi-byte characters, typically by treating a byte count as a character count or the reverse. The wrong length can then lead to logic errors or memory-safety vulnerabilities.
• Exploitation Methods:
- An attacker who controls the input can supply wide or multi-byte characters so that the computed length is smaller than the storage the data actually needs, potentially causing a buffer overflow.
- Common attack patterns include inserting malicious payloads into improperly sized buffers and exploiting off-by-one errors.
• Security Impact:
- Direct consequences can include buffer overflows, which may result in arbitrary code execution or crashes.
- Potential cascading effects include denial of service or unauthorized access if memory corruption occurs.
- Business impact may involve data breaches, service downtime, or loss of customer trust.
• Prevention Guidelines:
- Specific code-level fixes include using functions that match the string type, such as wcslen in C for wide strings and strlen for byte counts of narrow strings.
- Security best practices involve keeping byte counts and character counts distinct and sizing buffers in bytes, for example (wcslen(s) + 1) * sizeof(wchar_t) for a copy of a wide string.
- Recommended tools and frameworks include static analysis tools that detect improper calculations and using libraries that provide safe string handling functions.
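The buffer-sizing guideline above can be sketched as follows. This is a minimal illustration, not a standard API: the helper name dup_wide and its overflow guard are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: returns a heap copy of src, or NULL on failure.
 * The caller is responsible for freeing the result. */
wchar_t *dup_wide(const wchar_t *src) {
    assert(src != NULL);

    size_t chars = wcslen(src); /* number of wide characters, NOT bytes */

    /* Guard against overflow in (chars + 1) * sizeof(wchar_t). */
    if (chars >= SIZE_MAX / sizeof(wchar_t)) {
        return NULL;
    }

    /* Wrong: malloc(chars + 1) would reserve only chars + 1 bytes.
     * Right: scale the element count by the element size. */
    wchar_t *copy = malloc((chars + 1) * sizeof(wchar_t));
    if (copy == NULL) {
        return NULL;
    }

    /* wmemcpy counts in wide characters; + 1 copies the terminator. */
    wmemcpy(copy, src, chars + 1);
    return copy;
}
```

Calling dup_wide(L"abc") allocates room for four wide characters (three plus the terminator), whatever sizeof(wchar_t) is on the platform.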
Technical Details
Likelihood of Exploit: Not specified
Affected Languages: C, C++
Affected Technologies: Not specified
Vulnerable Code Example
C Example for CWE-135
#include <stdio.h>
#include <string.h>

// Vulnerable function: treats the byte count as the string's length
void printStringLength(const char *str) {
    // strlen() counts bytes up to the terminator, not characters
    size_t length = strlen(str);
    printf("Length of the string: %zu\n", length);
}

int main(void) {
    // Example multi-byte string (UTF-8)
    const char *multiByteStr = "こんにちは"; // "Hello" in Japanese
    printStringLength(multiByteStr);
    return 0;
}
Explanation
- Vulnerability: The code uses strlen() to calculate the length of a multi-byte string. strlen() returns the number of bytes, not the number of characters, for multi-byte strings. This can lead to incorrect length calculation and potential buffer overflows or logic errors when the string contains multi-byte characters.
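To make the gap between bytes and characters concrete, here is a small locale-independent sketch. The helper names utf8_code_points and bytes_beyond_char_count are hypothetical, and the counting logic assumes the input is valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The Japanese greeting from the example, spelled as explicit UTF-8 bytes
 * so the result does not depend on the source file's encoding. */
static const char GREETING_UTF8[] =
    "\xe3\x81\x93\xe3\x82\x93\xe3\x81\xab\xe3\x81\xa1\xe3\x81\xaf";

/* Hypothetical helper: counts code points in a UTF-8 string by skipping
 * continuation bytes (bit pattern 10xxxxxx). Assumes valid UTF-8 input. */
size_t utf8_code_points(const char *s) {
    assert(s != NULL);
    size_t count = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; ++p) {
        if ((*p & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

/* Hypothetical helper: how many bytes a buffer would be short if it were
 * sized from the character count instead of the byte count. */
size_t bytes_beyond_char_count(const char *s) {
    return strlen(s) - utf8_code_points(s);
}
```

For GREETING_UTF8, strlen() reports 15 bytes while utf8_code_points() reports 5 code points, so a buffer sized from the character count would be 10 bytes too small for the data.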
How to fix Incorrect Calculation of Multi-Byte String Length?
To handle multi-byte strings properly, use functions designed for wide or multi-byte character encodings. In C, mbstowcs() converts a multi-byte string to a wide-character string, and wcslen() then returns the number of wide characters (not bytes). This gives an accurate character count, provided the current locale matches the string's encoding. When allocating memory for the wide string, convert that count back to bytes by multiplying by sizeof(wchar_t) and leaving room for the terminator.
Key Fixes:
- Convert the multi-byte string to a wide-character string using mbstowcs().
- Calculate the length of the wide-character string using wcslen().
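The two steps above can be sketched as a single helper. The name mb_to_wide and its signature are assumptions made for illustration. The sketch relies on mbstowcs() returning the required length when given a NULL destination, which POSIX specifies, and it expects the caller to have already selected a locale that matches the input encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: converts a multi-byte string to a newly allocated
 * wide string using the current LC_CTYPE locale. Returns NULL on an
 * invalid sequence or allocation failure. On success, *out_chars receives
 * the number of wide characters, terminator excluded. Caller frees. */
wchar_t *mb_to_wide(const char *mbs, size_t *out_chars) {
    assert(mbs != NULL && out_chars != NULL);

    /* Ask how many wide characters the conversion would produce. */
    size_t needed = mbstowcs(NULL, mbs, 0);
    if (needed == (size_t)-1 || needed >= SIZE_MAX / sizeof(wchar_t)) {
        return NULL;
    }

    /* Size the buffer in bytes: element count times element size. */
    wchar_t *wide = malloc((needed + 1) * sizeof(wchar_t));
    if (wide == NULL) {
        return NULL;
    }

    /* Step 1: convert, leaving room for the terminator. */
    if (mbstowcs(wide, mbs, needed + 1) == (size_t)-1) {
        free(wide);
        return NULL;
    }

    /* Step 2: count wide characters in the converted string. */
    *out_chars = wcslen(wide);
    return wide;
}
```

Keeping the character count (*out_chars) separate from the byte size passed to malloc() is the point of the design: each quantity is used only where its unit applies.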
Fixed Code Example
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

// Prints the number of characters (not bytes) in a multi-byte string
void printStringLength(const char *str) {
    // With a NULL destination, mbstowcs() only counts the wide characters
    // the conversion would produce (terminator excluded)
    size_t wide_len = mbstowcs(NULL, str, 0);
    if (wide_len == (size_t)-1) {
        perror("Conversion error");
        return;
    }
    printf("Number of characters in the string: %zu\n", wide_len);
}

int main(void) {
    // Select the environment's locale so mbstowcs() knows the encoding;
    // this assumes the environment specifies a UTF-8 locale
    if (setlocale(LC_CTYPE, "") == NULL) {
        fprintf(stderr, "Could not set locale\n");
        return 1;
    }
    // Example multi-byte string (UTF-8)
    const char *multiByteStr = "こんにちは"; // "Hello" in Japanese
    printStringLength(multiByteStr);
    return 0;
}
Explanation
- Fix: mbstowcs() is called with a NULL destination, so it returns the number of wide characters the multi-byte string would convert to without storing the result. This yields the character count rather than the byte count, which is essential for accurate string length determination in internationalized applications.
Additional Notes:
- The fixed code handles conversion errors by checking the return value of mbstowcs(), which is (size_t)-1 when the input contains an invalid multi-byte sequence.
- Set the locale before calling these functions so that they interpret the string's encoding correctly, typically with setlocale(LC_CTYPE, ""). In the default "C" locale, mbstowcs() may reject or misinterpret non-ASCII bytes.
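The effect of the locale can be observed with a sketch like the one below. The helper count_in_locale is hypothetical, and locale names such as "C.UTF-8" or "en_US.UTF-8" are platform-dependent assumptions that may not be installed on every system.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* The Japanese greeting from the example, as explicit UTF-8 bytes. */
static const char GREETING_UTF8[] =
    "\xe3\x81\x93\xe3\x82\x93\xe3\x81\xab\xe3\x81\xa1\xe3\x81\xaf";

/* Hypothetical helper: counts the characters of s as interpreted under
 * the named LC_CTYPE locale, then restores the previous locale.
 * Returns (size_t)-1 if the locale is unavailable or s is invalid in it. */
size_t count_in_locale(const char *locale_name, const char *s) {
    assert(locale_name != NULL && s != NULL);

    /* Copy the current locale name: later setlocale() calls may
     * overwrite the string that setlocale() returned. */
    const char *current = setlocale(LC_CTYPE, NULL);
    char saved[128];
    if (current == NULL || strlen(current) >= sizeof saved) {
        return (size_t)-1;
    }
    strcpy(saved, current);

    if (setlocale(LC_CTYPE, locale_name) == NULL) {
        return (size_t)-1; /* locale not available on this system */
    }

    size_t chars = mbstowcs(NULL, s, 0);

    setlocale(LC_CTYPE, saved); /* restore the caller's locale */
    return chars;
}
```

Where a UTF-8 locale is available, count_in_locale("C.UTF-8", GREETING_UTF8) should report 5 characters. Under the "C" locale the same bytes either fail to convert or are counted one byte per character, depending on the C library.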