måndag 18 april 2016

Strings behind the scene

What’s really happening behind the scenes with the strings you create?
I will try to show what’s happening and what to think of when generating a string. And to visualize it I will use log messages.
Often I see log messages where the whole message is created as it should be logged. This is a pretty bad idea in performance perspective, since it always have to generate the string no matter what, even if the log message is discarded. So I will show you what the Java compiler actually does with the string messages you create.
But this will not be another performance comparison blog post. I will just try to show what’s happening behind the scenes with some byte code, then you will have to make your own assumptions. Of course I will tell you my point of view :)
Let’s start with a more or less common examples.
public void logWithConcatInLoop(List<String> values) {
    String logStr = "My values :";
    for (String value : values) {
        logStr += value;
    }

    log.debug(logStr);
}
The interesting part of the byte code is what’s happening within the loop:
NEW java/lang/StringBuilder
DUP
INVOKESPECIAL java/lang/StringBuilder.<init> ()V
ALOAD 2
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
ALOAD 4
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
ASTORE 2
It creates a new StringBuilder in each iteration and appends the logStr (ALOAD 2) and then appends the variable valueASTORE 2 simply means that it stores the toString value to logStr. You don’t have to understand the byte code, the important thing is that it creates a _StringBuilder in each iteration and makes an extra append-call, simply unnecessary and might be a performance issue in larger systems.
Let’s take another example with the string + operator.
public void logWithConcat(String t1, String t2, String t3) {
    log.debug("My values : " + t1 + " " + t2 + " " + t3);
}
This is a very common approach of creating log messages. The compiler simply creates a StringBuilder and each plus-operator is basically translated to an append-call.
NEW java/lang/StringBuilder
DUP
INVOKESPECIAL java/lang/StringBuilder.<init> ()V
LDC "My values : "
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
ALOAD 1
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
LDC " "
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
ALOAD 2
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
LDC " "
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
ALOAD 3
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
INVOKEINTERFACE org/slf4j/Logger.debug (Ljava/lang/String;)V
Pretty neat, huh? The Java compiler is not that stupid after all. You might have heard someone say "don’t use + when working with strings", that was true, back in Java 1.4, but now the compiler is a bit smarter :)
So lets see what happens if we use a StringBuilder directly, instead.
public void logWithStringBuilder(String t1, String t2, String t3) {
    StringBuilder sb = new StringBuilder();
    sb.append("My values : ").append(t1).append(" ").append(t2).append(" ").append(t3).append(" ");
    log.debug(sb.toString());
}
And the byte code:
NEW java/lang/StringBuilder
DUP
INVOKESPECIAL java/lang/StringBuilder.<init> ()V
ASTORE 4
L1
LINENUMBER 78 L1
ALOAD 4
LDC "My values : "
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
ALOAD 1
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
LDC " "
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
ALOAD 2
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
LDC " "
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
ALOAD 3
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
LDC " "
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
POP
L2
LINENUMBER 79 L2
GETSTATIC se/hrmsoftware/BehindLogging.log : Lorg/slf4j/Logger;
ALOAD 4
INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
Not much to say. Almost identical to logWithConcat. Just some more line number changes.
So lets look at something else, a log message using parameters instead.
public void logWithFormat(String t1, String t2, String t3) {
    log.debug("My values: {} {} {}", t1, t2, t3);
}
So this is compiled to creation of an array and then just a method invocation. In other words, this is probably the fastest solution in my examples.
GETSTATIC se/hrmsoftware/BehindLogging.log : Lorg/slf4j/Logger;
LDC "My values: {} {} {}"
ICONST_3
ANEWARRAY java/lang/Object
DUP
ICONST_0
ALOAD 1
AASTORE
DUP
ICONST_1
ALOAD 2
AASTORE
DUP
ICONST_2
ALOAD 3
AASTORE
INVOKEINTERFACE org/slf4j/Logger.debug (Ljava/lang/String;[Ljava/lang/Object;)V
So what do I want to tell you with this? As you can see the Java compiler always uses the StringBuilder (stop using StringBuffer!!) when it creates the strings for you.
Of course it’s not totally free to generate the string with a StringBuilder. That’s why all (good) logging frameworks also has the solution with the string format pattern. That will not produce anything else than one method call and possibly an array creation, but that’s pretty fast. Some frameworks also defines methods that actually takes two arguments instead of an array to speed up those logging messages that only takes two arguments, then no array is created. Some might even have more than that.
And then, of course, some logging frameworks has been adding support for Java 8 suppliers and lambdas, that is probably the sweetest solution :)
Well, you get the point. Happy string creation.

Inga kommentarer:

Skicka en kommentar