CWE-135: Incorrect Calculation of Multi-Byte String Length

What is Incorrect Calculation of Multi-Byte String Length?

• Overview: Incorrect Calculation of Multi-Byte String Length occurs when a program miscounts the length of a string that contains wide or multi-byte characters, typically by confusing the number of bytes with the number of characters. Buffers sized or indexed from the wrong measure can then be overrun.

• Exploitation Methods:

Attackers can supply input whose byte length differs from its character count, so that a buffer sized from the wrong measure overflows.
Common attack patterns include writing oversized payloads into undersized buffers and exploiting off-by-one errors around the string terminator.

• Security Impact:

Direct consequences can include buffer overflows, which may result in arbitrary code execution or crashes.
Potential cascading effects include denial of service or unauthorized access if memory corruption occurs.
Business impact may involve data breaches, service downtime, or loss of customer trust.

• Prevention Guidelines:

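Specific code-level fixes include using the length function that matches the string's representation: wcslen() for wide (wchar_t) strings and strlen() for the byte count of multi-byte strings. When allocating a wide buffer, multiply the character count by sizeof(wchar_t).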
Security best practices involve validating string length calculations and ensuring proper buffer sizing.
Recommended tools and frameworks include static analysis tools that detect improper calculations and using libraries that provide safe string handling functions.
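
The buffer-sizing guideline above can be sketched as follows. This is a minimal illustration, not a definitive implementation; the helper name dup_wide is hypothetical and was chosen for this example only.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: duplicate a wide string into a correctly sized buffer.
 * Returns NULL on allocation failure; the caller frees the result. */
wchar_t *dup_wide(const wchar_t *src) {
    /* wcslen() counts wchar_t elements, not bytes. */
    size_t chars = wcslen(src);

    /* Passing chars + 1 directly to malloc() would reserve that many BYTES,
     * which is too small whenever sizeof(wchar_t) > 1. Scale by element size. */
    wchar_t *copy = malloc((chars + 1) * sizeof(wchar_t));
    if (copy == NULL) {
        return NULL;
    }

    /* Copy chars elements plus the terminating L'\0'. */
    wmemcpy(copy, src, chars + 1);
    return copy;
}
```

A caller would use it as wchar_t *c = dup_wide(L"hello"); and release the result with free(c). Production code would likely also guard the multiplication against overflow for very long inputs.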


Technical Details

Likelihood of Exploit: Not specified

Affected Languages: C, C++

Affected Technologies: Not specified

Vulnerable Code Example

C Example for CWE-135

#include <stdio.h>
#include <string.h>

// Vulnerable function that incorrectly calculates the length of a multi-byte string
void printStringLength(const char *str) {
    // Using strlen() to calculate the length of a multi-byte string
    size_t length = strlen(str);
    printf("Length of the string: %zu\n", length);
}

int main() {
    // Example multi-byte string (UTF-8)
    const char *multiByteStr = "こんにちは"; // "Hello" in Japanese
    printStringLength(multiByteStr);
    return 0;
}

Explanation

Vulnerability: The code uses strlen() to calculate the length of a multi-byte string. strlen() returns the number of bytes, not the number of characters. For the five-character string in this example, it returns 15 when the source is encoded as UTF-8, because each of these characters occupies three bytes. Treating that value as a character count (or a character count as a byte count) can lead to buffer overflows or logic errors.
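
To make the byte/character gap concrete, the sketch below compares strlen() with a simple code-point counter. It assumes the input is valid UTF-8, which is a narrower assumption than the locale-based approach used in the fix; the names utf8_code_points and GREETING are hypothetical. The string is spelled as explicit bytes so the result does not depend on the source file's encoding.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count code points in a string ASSUMED to be valid UTF-8.
 * Continuation bytes look like 10xxxxxx, so every other byte starts a new
 * code point. */
size_t utf8_code_points(const char *s) {
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

/* The example string from above, written as explicit UTF-8 bytes
 * (three bytes per character, five characters). */
static const char GREETING[] =
    "\xE3\x81\x93" "\xE3\x82\x93" "\xE3\x81\xAB" "\xE3\x81\xA1" "\xE3\x81\xAF";
```

Here strlen(GREETING) yields 15 while utf8_code_points(GREETING) yields 5; for pure ASCII input the two values coincide, which is why this class of bug often goes unnoticed in testing.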

How to fix Incorrect Calculation of Multi-Byte String Length?

To properly handle multi-byte strings, use functions that are designed to work with wide or multi-byte character encodings. In C, calling mbstowcs() with a NULL destination returns the number of wide characters the conversion would produce, without writing anything (this behavior is specified by POSIX). If the wide string itself is needed, convert it into a buffer of that many elements plus one for the terminator; wcslen() on the result then reports the same count. Either way, the length reflects characters rather than bytes.

Key Fixes:

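Count characters with mbstowcs(NULL, str, 0) instead of strlen().
Check the result for (size_t)-1, which signals an invalid multi-byte sequence.
If the wide string is needed, convert it with mbstowcs() into a buffer of count + 1 wchar_t elements; wcslen() then returns the same count.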

Fixed Code Example

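#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

// Correct function to calculate the length of a multi-byte string
void printStringLength(const char *str) {
    // With a NULL destination, mbstowcs() only counts the wide characters
    // the conversion would produce (POSIX-specified behavior)
    size_t wide_len = mbstowcs(NULL, str, 0);
    if (wide_len == (size_t)-1) {
        // Invalid multi-byte sequence for the current locale
        perror("Conversion error");
        return;
    }
    printf("Number of characters in the string: %zu\n", wide_len);
}

int main(void) {
    // Adopt the environment's locale (assumed here to use UTF-8) so that
    // mbstowcs() decodes the bytes correctly; the default "C" locale may not
    if (setlocale(LC_CTYPE, "") == NULL) {
        fprintf(stderr, "Could not set locale\n");
        return 1;
    }
    // Example multi-byte string (UTF-8)
    const char *multiByteStr = "こんにちは"; // "Hello" in Japanese
    printStringLength(multiByteStr);
    return 0;
}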

Explanation

Fix: mbstowcs() is called with a NULL destination, so it counts the wide characters that the multi-byte string would convert to without storing them. The result is the character count rather than the byte count, which is what internationalized applications need when sizing or indexing by character.

Additional Notes:

The fixed code handles conversion errors by checking the return value of mbstowcs() for (size_t)-1.
These functions interpret bytes according to the current LC_CTYPE locale, so call setlocale(LC_CTYPE, "") (or name a specific locale) before using them. In the default "C" locale, non-ASCII input may fail to convert or be miscounted.
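
When the wide string itself is needed rather than just its length, the count-then-convert pattern can be sketched as below. This is an illustrative sketch under stated assumptions: the helper name to_wide is hypothetical, the caller is assumed to have set the locale already, and the NULL-destination call relies on POSIX-specified behavior.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a multi-byte string to a newly allocated wide
 * string, interpreting the bytes per the current LC_CTYPE locale.
 * On success, stores the character count in *out_len and returns the buffer
 * (the caller frees it). Returns NULL on an invalid sequence or on
 * allocation failure. */
wchar_t *to_wide(const char *mbs, size_t *out_len) {
    /* First pass: count wide characters without writing anything. */
    size_t chars = mbstowcs(NULL, mbs, 0);
    if (chars == (size_t)-1) {
        return NULL;
    }

    /* Size the buffer in elements, including room for the terminator. */
    wchar_t *wide = malloc((chars + 1) * sizeof(wchar_t));
    if (wide == NULL) {
        return NULL;
    }

    /* Second pass: convert; the limit is given in wchar_t elements. */
    if (mbstowcs(wide, mbs, chars + 1) == (size_t)-1) {
        free(wide);
        return NULL;
    }

    *out_len = chars;
    return wide;
}
```

After a successful call, wcslen() on the returned buffer equals the value stored in *out_len, so later code can use either without recomputing a byte-based length.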
