Development Sandbox: String Concatenation

When you review and work with other people’s code, you sometimes pick up tricks for optimizing your own. Most of the time the tricks look impressive in their succinctness and streamlined approach, so you’d assume the performance behind the scenes would be mind-blowing too. So I decided to have a look at string concatenation. I have seen a number of ways to concatenate strings. Usually, code that builds the whole string within a single scope does it properly, and there are appropriate reasons for doing it other ways. But I thought I’d still have a look at the many ways one can concatenate a string.

In order to make things simple I created a couple of constants and methods:
private static final int ITERATIONS = 100;
private static final String[] fruits = {"apple", "banana", "orange", "grape", "peach", "kiwi", "strawberry"};

private static String getFruit(int i) {
    return fruits[i % fruits.length];
}

I developed four different ways of concatenating a string: the obvious a + b version, a StringBuilder version, an ArrayList version and, for the fun of it, a Java 8 version coded with Streams.

The first version of the code below is the standard a + b formula from the days of BASIC. The interesting thing about a simple concatenation is that the Java 8 compiler converts it to use a StringBuilder. But a new StringBuilder object is created on every loop iteration. What’s even more insane is that every statement that does string concatenation gets its own StringBuilder, even when successive lines of code assign to the same variable. After the single concatenation has occurred, a toString() call produces the String that is assigned back to the variable. So there is a lot of overhead for what seems to be very simple code.

But when you just want to concatenate two strings together, it’s clean and simple. Java is smart enough to merge multiple concatenations appearing in the same expression into a single StringBuilder.

private static String buildWithStringConcatenation() {
    String str = "";

    for (int i = 0; i < ITERATIONS; i++) {
        str += getFruit(i);

        if (str.length() > 0 && i < ITERATIONS - 1) {
            str += ",";
        }
    }
    return str;
}
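
To make the overhead concrete, here is a rough sketch of what the loop above amounts to once a Java 8 era compiler has rewritten each += statement. The class name DesugaredConcat and both method names are made up for this illustration, and newer compilers (Java 9 and later) generate different bytecode for concatenation, so treat this as an approximation rather than exact compiler output.

```java
// Illustration only: approximates how a Java 8 era javac rewrites `str += x`.
// DesugaredConcat, withPlusEquals and desugared are hypothetical names.
class DesugaredConcat {
    private static final int ITERATIONS = 100;
    private static final String[] FRUITS =
            {"apple", "banana", "orange", "grape", "peach", "kiwi", "strawberry"};

    static String getFruit(int i) {
        return FRUITS[i % FRUITS.length];
    }

    // What the source code says.
    static String withPlusEquals() {
        String str = "";
        for (int i = 0; i < ITERATIONS; i++) {
            str += getFruit(i);
            if (i < ITERATIONS - 1) {
                str += ",";
            }
        }
        return str;
    }

    // Roughly what it becomes: a fresh StringBuilder per statement, each one
    // copying the whole string built so far before appending to it.
    static String desugared() {
        String str = "";
        for (int i = 0; i < ITERATIONS; i++) {
            str = new StringBuilder().append(str).append(getFruit(i)).toString();
            if (i < ITERATIONS - 1) {
                str = new StringBuilder().append(str).append(",").toString();
            }
        }
        return str;
    }
}
```

Both methods return the same string; the second just spells out the allocations that the first one hides.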

The StringBuilder is Java’s solution to string concatenation. Under the hood StringBuilder uses Arrays.copyOf() and String.getChars().

private static String buildWithStringBuilder() {
    StringBuilder str = new StringBuilder();

    for (int i = 0; i < ITERATIONS; i++) {
        str.append(getFruit(i));

        if (str.length() > 0 && i < ITERATIONS - 1) {
            str.append(",");
        }
    }
    return str.toString();
}
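
For a feel of what "uses Arrays.copyOf() and String.getChars()" means in practice, below is a deliberately simplified growable character buffer. TinyBuilder is a hypothetical class written for this post, not the JDK source; the growth factor is an assumption, and recent JDKs store bytes rather than chars internally.

```java
import java.util.Arrays;

// Simplified illustration of a growable char buffer; NOT the real StringBuilder.
class TinyBuilder {
    private char[] value = new char[16]; // backing array with spare room
    private int count = 0;               // number of chars actually used

    TinyBuilder append(String s) {
        int needed = count + s.length();
        if (needed > value.length) {
            // Out of room: allocate a bigger array and copy the old contents over.
            // The doubling policy here is an assumption made for the sketch.
            int newCapacity = Math.max(needed, value.length * 2 + 2);
            value = Arrays.copyOf(value, newCapacity);
        }
        // Copy the new characters directly into the free tail of the buffer.
        s.getChars(0, s.length(), value, count);
        count = needed;
        return this;
    }

    int capacity() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, 0, count);
    }
}
```

The point of the design is that most appends only copy the new characters; the full copy happens only on the occasional resize.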

I have seen ArrayList used for this in a few different scenarios. In some, string values are collected in several different places and then joined downstream. In others, values are collected within a loop, like below, to generate a comma-separated value. When you look at the code below it’s sexy and looks super optimized.

private static String buildWithListAdd() {
    List<String> str = new ArrayList<>(ITERATIONS);

    for (int i = 0; i < ITERATIONS; i++) {
        str.add(getFruit(i));
    }

    return String.join(",", str);
}
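
If the list is only there to get the delimiter handling for free, java.util.StringJoiner (available since Java 8) offers the same convenience without the intermediate list. This variant was not part of my measurements, so I make no claim about how it performs; the class and method names below are made up for the example.

```java
import java.util.StringJoiner;

// Hypothetical variant, not one of the four benchmarked methods.
class JoinerExample {
    private static final int ITERATIONS = 100;
    private static final String[] FRUITS =
            {"apple", "banana", "orange", "grape", "peach", "kiwi", "strawberry"};

    static String getFruit(int i) {
        return FRUITS[i % FRUITS.length];
    }

    static String buildWithStringJoiner() {
        // The joiner inserts "," between elements, so no manual
        // "is this the last item?" check is needed.
        StringJoiner joiner = new StringJoiner(",");
        for (int i = 0; i < ITERATIONS; i++) {
            joiner.add(getFruit(i));
        }
        return joiner.toString();
    }
}
```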

Java Streams are amazing and look freaking cool. Cool enough to impress your friends. Just look at the code below. Only a genius would understand what it does. I tried to load this up in a bytecode editor and, well, it just blew up. Needless to say, the Stream library generates a lot of bytecode.

private static String buildWithStream() {
    return Stream.iterate(0, n -> n + 1)
            .map(n -> getFruit(n))
            .limit(ITERATIONS)
            .collect(Collectors.joining(",", "", ""));
}
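
For what it’s worth, the same result can be written a bit more readably with IntStream.range, which also avoids boxing every counter value into an Integer. This is an alternative sketch with made-up names, not the version I measured, so the timings in this post do not apply to it.

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical alternative to the Stream.iterate version; not benchmarked.
class StreamExample {
    private static final int ITERATIONS = 100;
    private static final String[] FRUITS =
            {"apple", "banana", "orange", "grape", "peach", "kiwi", "strawberry"};

    static String getFruit(int i) {
        return FRUITS[i % FRUITS.length];
    }

    static String buildWithIntStream() {
        return IntStream.range(0, ITERATIONS)       // 0 .. ITERATIONS - 1, primitive ints
                .mapToObj(StreamExample::getFruit)  // int -> String
                .collect(Collectors.joining(","));  // comma-separated result
    }
}
```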

I ran some pseudo-scientific tests on the above pieces of code, and below are the results at 1000 iterations. All times are in nanoseconds, so the executions are all very fast in absolute terms. I suspected that Streams wouldn’t be that performant, but they are a lot slower than the string concatenation.

[Figure: benchmark results at 1,000 iterations; image not available]

At 100,000 iterations the string concatenation takes a wicked nosedive. That makes sense: every += copies the entire string built so far, so the total work grows roughly with the square of the number of iterations. All the other methods also take longer, but not by as much. Streaming, interestingly, had a much more modest performance degradation compared to the other methods. In the end, StringBuilder is still the killer way to concatenate strings. ArrayList is also a respectable solution.

[Figure: benchmark results at 100,000 iterations; image not available]
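
For anyone who wants to repeat this kind of pseudo-scientific test, a timing helper along the following lines is enough to get ballpark numbers. It is a sketch and not the exact harness I used: the names TimingSketch, timeNanos and sample are invented here, the warm-up count is arbitrary, and a single System.nanoTime() measurement is noisy. A proper benchmark tool such as JMH would give far more trustworthy results.

```java
import java.util.function.Supplier;

// Rough timing helper; hypothetical names, ballpark numbers only.
class TimingSketch {

    // Runs the task a few times to let the JIT warm up, then times one run.
    static long timeNanos(Supplier<String> task, int warmupRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.get();
        }
        long start = System.nanoTime();
        String result = task.get();
        long elapsed = System.nanoTime() - start;
        // Use the result so the work cannot be discarded as dead code.
        if (result.isEmpty()) {
            throw new IllegalStateException("unexpected empty result");
        }
        return elapsed;
    }

    // Stand-in workload so the sketch is self-contained.
    static String sample() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("kiwi");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long nanos = timeNanos(TimingSketch::sample, 10);
        System.out.println("sample: " + nanos + " nanoseconds");
    }
}
```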

ArrayList versus ArrayList

After all this testing I decided to try one more thing. In the above tests for ArrayList I was passing the number of elements to the constructor, thinking that allocating all the memory at the outset would speed things up. The results did not really back that up. At 1000 iterations, the version of the ArrayList created without the number of elements ran 20% faster than the version where I did provide the number of elements in the constructor.

But at 100,000 iterations, the second version of ArrayList (the one without a size in the constructor) is twice as fast as the first, and its performance is comparable to that of StringBuilder. When I brought the number of iterations down to 100, the result was the same as at 100,000 iterations. I suspect a lot of the performance differences are due to memory management, but it’s still interesting nonetheless.

Below are those numbers at 100 iterations:

  • StringBuilder: 115,264 nanoseconds
  • ArrayList instantiated with length of array: 311,449 nanoseconds
  • ArrayList with default array: 125,132 nanoseconds
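
For reference, the two ArrayList variants differ only in the constructor call. The sketch below uses made-up names and takes the iteration count as a parameter, which my original methods did not; it shows the difference being compared, not the exact code I timed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical side-by-side of the two ArrayList variants.
class ListCapacityExample {
    private static final String[] FRUITS =
            {"apple", "banana", "orange", "grape", "peach", "kiwi", "strawberry"};

    static String getFruit(int i) {
        return FRUITS[i % FRUITS.length];
    }

    // Variant 1: backing array allocated up front for all elements.
    static String buildWithPresizedList(int iterations) {
        List<String> parts = new ArrayList<>(iterations);
        for (int i = 0; i < iterations; i++) {
            parts.add(getFruit(i));
        }
        return String.join(",", parts);
    }

    // Variant 2: default constructor; the list grows its array as needed.
    static String buildWithDefaultList(int iterations) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < iterations; i++) {
            parts.add(getFruit(i));
        }
        return String.join(",", parts);
    }
}
```

Both variants produce identical output, so any timing difference comes down to how and when the backing array is allocated.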

So there you go. I always get surprised by what I find.

 

 
