Tag Archives: c++

Frequent error while using std::erase with std::remove_if in C++

To remove an element from container of type vector, string, list or deque we use erase() method with an iterator to unneeded element as a parameter.

After some time we do optimize the code or start to use STL algorithm remove_if() from the very beginning. Next we add some fancy boost::bind stuff for convenience and etc.

And once in a while we have to remove several elements from container instead of one. And it does not matter if it happens in new code, or while extending old. And here many C++ beginners can do the mistake, either but not knowing about it or just lazy enough not to check the erase() method documentation. They copy-paste the old code and change only remove_if() search condition. But they forget, that if only one argument is passed to erase(), it will delete that only element which is that iterator argument pointing to.

If several elements are needed to be removed from the container, there is an overloaded erase() method which accepts two iterators , to the start and to the end of the sequence to be removed. When remove_if() algorithm is used, it rearranges elements that should not be removed to the beginning of the container, leaving some trash in the end. For example, let’s have a vector , containing the following values:
1 2 3 1 2 3 1 2 3

Then if we call
bool IsTwo (int i) { return (i == 2); }

tstart = remove_if(v.begin(), v.end(), isTwo)
//remove elements with value 2 from the container v

we’ll get
1 3 1 3 1 3 ? ? ?
where ? – is some trash. tstart iterator, returned by remove_if(), will point to the first trash element, the third from the end in our case. Trash may contain the same values as before calling remove_if(), but that’s not guaranteed. Container size remains unchanged.

So if we write down the whole code which novice works with, we’ll get the following:
v.erase(remove_if(v.begin(), v.end(), isTwo));

And erase() will remove only the first trash element from the container and will change containers size by one. The contents will be the following:
1 3 1 3 1 3 ? ?
which is obviously wrong and causes unpredictable consequences if working further with this container.

How to avoid that? When developer is not sure in his skills, he is advised to check the documentation, and find out, that if we call erase() as follows:
v.erase(remove_if(v.begin(), v.end(), isTwo), v.end());
all the trash element will be removed from the container.

If the developer (sometimes overly) confident, the tests are always to the rescue. And don’t be lazy, write tests for border cases as well. In our case we should at least check removing not only single element, but several as well, and don’t forget to test none elements removal, and even test with empty container.

Danger of using default values in declarations

Working on a large project using C++, recently faced a dangerous things during
debugging. Let’s see in examples. Once there was a method:

Class1::SetEditBox(CString const& value, bool isUppercase = true);

Somewhere it was used differently:

obj1.SetEditBox(_T("Volume"), false);
 ...
 obj2.SetEditBox(_T("SerialNo"));

After a while this declaration was a bit extended:

Class1::SetEditBox(CString const& value, bool isEditable = false, bool isUppercase = true);

New places were using it properly, but not all that old places were changed, so

obj1.SetEditBox(_T("Volume"), false);

became to have isUppercase = true, however, since volume is usually numeric,
this was not detected during “happy way” testing.
Meanwhile system became more complex. Refactored and rewritten, developers were
also changing. Inheritance and other OOP stuff started to using this method:

Class2::SetListEdit(CString const& value, bool isEditable = false, bool isUppercase = true)
{
    // ...
    obj4.SetEditBox(value, isEditable, isUppercase);
    // ...
}

And again time passed, things changed, and so changed method declaration:

Class1::SetEditBox(CString const& value, bool isNumeric = true, bool isEditable = false, bool isUppercase = true);

Now there were even more things to recheck and update, but guess what?
Somebody, who did this change did not care about looking up every place to change
and now it became the bug, which I called “Shifting Boolean”:

Class3::AddListEdit(CString const& value, bool isUppercase = true)
{
    Class2 obj5;
    obj5.SetListEdit(value, isUppercase);
}

So the call to Class3::AddListEdit() and passing only CString value, made
isUppercase default value bacame isEditable in Class2::SetListEdit() and then
to became isNumeric in Class1::SetEditBox().

If there were no default values, all the differences would come out during
compilation, or even earlier, in IDE. However there are a trade-offs: either
you save typing and introduce default values, or make changes more transparent.

In any case, every change that can be interpreted by implicit type conversion
should be carefully examined and all the places should be changed as required.

It took me about 6 hours to refactor all the project code and get rid of this
bug. In my solution I have replaced booleans with a single enum argument, because
all the booleans were tightly coupled, and in the end switch()-ing between
enumerated values was easier than checking all the boolean combinations.