Question

I asked a related question about this previously, so I know it is undefined behavior.

returning const char* to char* and then changing the data

std::string _str = "SDFDFSD";
char* pStr = (char*)_str.data();   // casts away the const of data()
int iSize = (int)_str.size();
for (int i = 0; i < iSize; i++)
    pStr[i] = ::tolower(pStr[i]);

I discussed this with one of my colleagues. He told me that it will never cause a problem in this scenario unless I change the length of the data. If I change the data but keep the length the same, then it will never create a problem, because std::string has no way to detect that the data has been changed. It won't cause any internal inconsistency in _str either. Is that really the case?

Solution

I am afraid Undefined Behavior has been decried so often that, together with the references to nasal daemons, it has convinced most people that the danger is more myth than reality.

It seems that your colleague has become desensitized to it, so to convince him you need to bring concrete proof of the issue. Luckily, if you have gcc at hand, it can be done:

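#include <cctype>
#include <iostream>
#include <string>

int main() {
    std::string const UPPER = "HELLO, WORLD!";
    std::cout << "UPPER: " << UPPER << "\n";

    // With a Copy On Write string, this copy shares UPPER's buffer.
    std::string lower = UPPER;
    // Casting away const bypasses the check that would un-share the buffer.
    for (char* begin = const_cast<char*>(lower.data()),
         * end = begin + lower.size();
         begin != end;
         ++begin)
    {
        *begin = static_cast<char>(std::tolower(static_cast<unsigned char>(*begin)));
    }
    std::cout << "lower: " << lower << "\n";
    std::cout << "UPPER: " << UPPER << "\n";
    return 0;
}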

With a gcc whose std::string uses Copy On Write (the libstdc++ default before GCC 5), here is what you get:

UPPER: HELLO, WORLD!
lower: hello, world!
UPPER: hello, world!   // What the hell? UPPER was const!

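Why? Because gcc's std::string has historically used Copy On Write: copies share one underlying storage array until one of them is written to through the non-const interface. Since you cheated with the cast, the string did not detect the write, so lower and UPPER still share the array and the change shows up in both.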

Note: yes, a Copy On Write std::string is non-conformant with C++11; I wish I had the chance to work in C++11, though.
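For contrast, here is a minimal sketch of the same lowercasing done through the non-const interface. The names lower_copy and to_lower_char are made up for this illustration and are not from the original post. Because the write goes through the non-const begin()/end(), a Copy On Write implementation gets the chance to un-share the buffer first, so the source string should stay intact on any conforming implementation.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: lowercases one character. The cast to unsigned char
// avoids passing a negative value to std::tolower.
static char to_lower_char(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Hypothetical helper: returns a lowercased copy of `source` without ever
// casting away const.
std::string lower_copy(std::string const& source) {
    std::string result = source;  // may share storage under Copy On Write
    // Non-const begin()/end() let the implementation un-share the buffer
    // before anything is written to it.
    std::transform(result.begin(), result.end(), result.begin(), to_lower_char);
    return result;
}
```

Calling lower_copy(UPPER) in the program above, instead of the const_cast loop, leaves UPPER unchanged.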

OTHER TIPS

As per the data() documentation:

Modifying the character array accessed through data is undefined behavior.

So your colleague is wrong: there is no trick, it is undefined behavior. What actually happens depends on the implementation.

Bad idea! You cannot know for sure how std::string is implemented on any current or future platform. For example, it could share storage with other strings that hold the same contents, or it could place the data in a read-only segment on some platforms.
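If a writable char* is really needed, for instance to hand the buffer to a C-style routine, it can be obtained without a cast. The following is a minimal sketch, assuming the string itself is non-const; lower_in_place and lower_string are hypothetical names used only for illustration. It takes the pointer through the non-const operator[], and relies on the characters being stored contiguously, which C++11 and later guarantee (earlier standards did not formally promise it).

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical C-style routine that needs a writable buffer.
void lower_in_place(char* buffer, std::size_t size) {
    for (std::size_t i = 0; i < size; ++i)
        buffer[i] = static_cast<char>(
            std::tolower(static_cast<unsigned char>(buffer[i])));
}

// Hypothetical wrapper: obtains the pointer through the non-const
// operator[] instead of casting the result of data().
void lower_string(std::string& s) {
    if (s.empty()) return;  // avoid taking &s[0] on an empty string
    lower_in_place(&s[0], s.size());
}
```

Since the access goes through a non-const member, the string knows it may be written to, which is exactly the information the cast on data() hides from it.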

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow