emplace_back vs push_back

Short summary

May 11, 2020 6 min read Programming

tl;dr emplace_back is often mistaken as a faster push_back, while it is in fact just a different tool. Do not blindly replace push_back by emplace_back, be careful of how you use emplace_back, since it can have unexpected consequences.

I have repeatedly run into the choice of using emplace_back instead of push_back in C++. This short blog post serves as my take on this decision.

Both of the methods in the title, along with insert and emplace, are ways to insert data into standard library containers. emplace_back is for adding a single element to the dynamic array std::vector. There is a somewhat subtle difference between the two:

push_back calls the constructor of the data that you intend to push and then pushes it to the container.
emplace_back “constructs in place”, so one skips an extra move operation, potentially creating faster bytecode. This is done by forwarding the arguments to the container’s template type constructor.

On the surface, emplace_back might look like a faster push_back, but there is a subtle difference contained in the act of forwarding arguments. Searching for the problem online yields a lengthy discussion on the issue (emplace_back vs push_back). In summary, the discussion leans towards choosing emplace_back to insert data into your container, however the reason is not completely clear.

Be careful

After searching a bit more I found this post, which stresses how careful one should be. To further stress the ambiguity of the matter, the google c++ style guide does not provide an explicit preference. The reason they don’t state a preference, is because these are simply slightly different tools, and you should not use emplace_back unless you properly understand it, and there is a proper reason for it.

The following code should make it clear how emplace_back is different from push_back:

#include<vector>
#include<iostream>

int main(){
  // Basic example
  std::vector<int> foo;
  foo.push_back(10);
  foo.emplace_back(20);

  // More tricky example
  std::vector<std::vector<int>> foo_bar;
  //foo_bar.push_back(10); // Throws error!!!!
  foo_bar.emplace_back(20); // Compiles with no issue
  std::cout << "foo_bar size: " << foo_bar.size() << "\n";
  std::cout << "foo_bar[0] size: " << foo_bar[0].size() << "\n";
  return 0;
}

Uncommenting the line foo_bar.push_back(10) yields the following compilation error.

$ g++ test.cpp -o test
test.cpp: In function ‘int main()’:
test.cpp:11:24: error: no matching function for call to ‘std::vector<std::vector<int> >::push_back(int)’
   foo_bar.push_back(10);
                        ^
... Some more verbose diagnostic

So we get an error. An extra diagnostic provides us with the following: no known conversion for argument 1 from ‘int’ to ‘const value_type& {aka const std::vector<int>&}’. So it seems there is no conversion here that makes sense.

The compilation completes without errors if we comment out the trouble line. Running the code yields the following result:

$ ./test
foo_bar size: 1
foo_bar[0] size: 20

What is the difference? For emplace_back we forward the arguments to the constructor, adding a new std::vector<int> new_vector_to_add(20) to foo_bar. This is the critical difference. So if you are using emplace_back you need to be a bit extra careful in double checking types.

Another example with conversion

The example above is very simple and shows the difference of forwarding arguments to the value type constructor or not. The following example I added in an SO question shows a bit more subtle case where emplace_back could make us miss catching a narrowing conversion from double to int.

#include <vector>
class A {
 public:
  explicit A(int /*unused*/) {}
};
int main() {
  double foo = 4.5;
  std::vector<A> a_vec{};
  a_vec.emplace_back(foo); // No warning with Wconversion
  //A a(foo); // Gives compiler warning with Wconversion as expected
}

Why is this a problem?

The problem is that we are unaware of the problem at compile-time. If this was not the intended behavior, we have caused a runtime error, which is generally harder to fix. Let us catch the issue somehow. You might wonder that some warning flags, e.g., -Wall could reveal the issue. However, the program compiles fine with -Wall. -Wall contains narrowing, but it does not contain conversion. Further, adding -Wconversion yields no warnings.

$ g++ -Wall -Wconversion test_2.cpp -o test

The problem is that this conversion is happening in a system header, so we also need -Wsystem-headers to catch the issue.

$ g++ -Wconversion -Wsystem-headers test_2.cpp 
In file included from /usr/include/c++/7/vector:60:0,
                 from test_2.cpp:1:
/usr/include/c++/7/bits/stl_algobase.h: In function ‘constexpr int std::__lg(int)’:
/usr/include/c++/7/bits/stl_algobase.h:1001:44: warning: conversion to ‘int’ from ‘long unsigned int’ may alter its value [-Wconversion]
   { return sizeof(int) * __CHAR_BIT__  - 1 - __builtin_clz(__n); }
                                            ^
/usr/include/c++/7/bits/stl_algobase.h: In function ‘constexpr unsigned int std::__lg(unsigned int)’:
/usr/include/c++/7/bits/stl_algobase.h:1005:44: warning: conversion to ‘unsigned int’ from ‘long unsigned int’ may alter its value [-Wconversion]
   { return sizeof(int) * __CHAR_BIT__  - 1 - __builtin_clz(__n); }
                                            ^
In file included from /usr/include/x86_64-linux-gnu/c++/7/bits/c++allocator.h:33:0,
                 from /usr/include/c++/7/bits/allocator.h:46,
                 from /usr/include/c++/7/vector:61,
                 from test_2.cpp:1:
/usr/include/c++/7/ext/new_allocator.h: In instantiation of ‘void __gnu_cxx::new_allocator<_Tp>::construct(_Up*, _Args&& ...) [with _Up = A; _Args = {double&}; _Tp = A]’:
/usr/include/c++/7/bits/alloc_traits.h:475:4:   required from ‘static void std::allocator_traits<std::allocator<_Tp1> >::construct(std::allocator_traits<std::allocator<_Tp1> >::allocator_type&, _Up*, _Args&& ...) [with _Up = A; _Args = {double&}; _Tp = A; std::allocator_traits<std::allocator<_Tp1> >::allocator_type = std::allocator<A>]’
/usr/include/c++/7/bits/vector.tcc💯30:   required from ‘void std::vector<_Tp, _Alloc>::emplace_back(_Args&& ...) [with _Args = {double&}; _Tp = A; _Alloc = std::allocator<A>]’
test_2.cpp:9:25:   required from here
/usr/include/c++/7/ext/new_allocator.h:136:4: warning: conversion to ‘int’ from ‘double’ may alter its value [-Wfloat-conversion]
  { ::new((void *)__p) _Up(std::forward<_Args>(__args)...); }

And there you have it - look at all the verbose output. The problem is not apparent from this wall of text for the uninitiated.

So when should you use `emplace_back`?

One reason to use emplace_back is when the move operation that we can save, is actually quite expensive. Consider the following example:

class Image {
  Image(size_t w, size_t h);
};

std::vector<Image> images;
images.emplace_back(2000, 1000);

Instead of moving the image, which consists of potentially a lot of data, we construct it in place, and forward the constructor arguments in order to do that. These kind of cases are quite special, we should already be aware that this is large data that we are adding to the container.

Another case for using emplace_back was added in C++17. Then emplace_back returns a reference to the inserted element, this is not possible at all with push_back.

`emplace_back` is a potential premature optimization.

Going from push_back to emplace_back is a small change that can usually wait, and like the image case, it is usually quite apparent when we want to use it. If you want to use emplace_back from the start, then make sure you understand the differences. This is not just a faster push_back. For safety, reliability, and maintainability reasons, it is better to write the code with push_back when in doubt. This choice reduces the chance of pushing an unwanted hard to find implicit conversion into the codebase, but you should weigh that risk against the potential speedups, these speedups should then ideally be evaluated when profiling.

cpp

Guðmundur Einarsson

Research Scientist

I am interested in statistics, genetics and machine learning.