Notes on String comparison in Java
by Will Tracy

Late last quarter, I saw questions on IRC about string equality in Java. This is a fairly complicated subject, yet one that is fundamental to writing Java code, which means that it gets screwed up *all* the time. :-)

The short explanation is that you use == to test whether two references point to the same object, and equals() to test whether two objects have the same value. The == operator in Java does *not* do a deep comparison of objects.

Now, there are dozens of developers who will swear that == does a deep comparison of Java strings. If you are one of those developers, or have to maintain code written by one of them, then read on.

Consider the following code:

    public class String1 {
        public static void main(String[] args) {
            String s1 = "foo";
            String s2 = "foo";

            // Reference comparison: do s1 and s2 point to the same object?
            if (s1 == s2)
                System.out.println("s1 == s2");
            else
                System.out.println("s1 != s2");

            // Value comparison: do the two strings hold the same characters?
            if (s1.equals(s2))
                System.out.println("s1.equals(s2)");
            else
                System.out.println("!s1.equals(s2)");
        }
    }

Copy it to a file String1.java, and compile it. Then run:

    less String1.class

If less asks you whether you really want to see a binary file, say yes. You are now looking at a text rendering of the compiled class file. Among other things, you can see the names of the classes that the code refers to, and the signatures of the methods it calls.

Now look closer. You will see the string foo exactly once in this file. The Java compiler stores each distinct string constant only once, which saves memory.

How does this work? Strings in Java are immutable. Once they are instantiated, they cannot be modified; you can only create a new string and update the reference to point to that new string (as in: s1 = s1 + "bar"). So there will never be a case where modifying the string pointed to by one reference modifies the string pointed to by another reference, because you cannot actually modify the string in the first place.
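To make the immutability point concrete, here is a minimal sketch. The class name StringImmutability and the helper concatDemo() are made up for illustration. It starts with two references to the same string, "modifies" one of them by concatenation, and shows that the other still sees the original value:

```java
class StringImmutability {
    // Hypothetical helper: returns {s1, s2} after s1 has been reassigned.
    static String[] concatDemo() {
        String s1 = "foo";
        String s2 = s1;      // s2 now refers to the same object as s1

        // This does not change the "foo" object. It builds a new
        // string and points s1 at it; s2 is left alone.
        s1 = s1 + "bar";

        return new String[] { s1, s2 };
    }

    public static void main(String[] args) {
        String[] r = concatDemo();
        System.out.println("s1 = " + r[0]);                 // s1 = foobar
        System.out.println("s2 = " + r[1]);                 // s2 = foo
        System.out.println("s1 == s2? " + (r[0] == r[1]));  // s1 == s2? false
    }
}
```

Since no operation can change the contents of an existing String, it is safe for several references to share one object.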
If you run the String1 example, you will see that s1 == s2 is true in this case. Why?

*** Because s1 and s2 actually reference the same object in memory. ***

This happens because both variables are initialized from the same string literal, and identical string literals (and other compile-time constant strings) share a single String object. The Java Language Specification requires this sharing for literals, but it does *not* apply to strings that are built at run time.

Now, consider this chunk of code:

    public class String2 {
        public static void main(String[] args) {
            String s1 = "foo";
            String s2;

            // Read s2 from the user, so it is created at run time.
            java.util.Scanner in = new java.util.Scanner(System.in);
            System.out.print("Type in foo: ");
            s2 = in.nextLine();
            in.close();

            if (s1 == s2)
                System.out.println("s1 == s2");
            else
                System.out.println("s1 != s2");

            if (s1.equals(s2))
                System.out.println("s1.equals(s2)");
            else
                System.out.println("!s1.equals(s2)");
        }
    }

This time, s2 is read in from the user, and the string is set up at run time, not compile time. Compile and run it, and type in foo at the prompt. You will see that the objects are equivalent (equals) but not the same (==). This is because s1 and s2 point to different objects this time around.

What's the moral here? If you want to test whether two strings share the same value, use equals() and only equals().

Now, that does not mean you should never use ==. Use == if you really want to check *reference* equality. The classic example?

    if (s1 == null) { /* handle null reference */ }

s1.equals(null) will never return true. If s1 refers to a string, the call returns false; if s1 is null, it throws a NullPointerException when you try to call the equals() method on a null reference.

Think that nobody would be dumb enough to do that? Think again:
http://thedailywtf.com/Articles/SkillsEquals(null).aspx
(The code in question is C# instead of Java, but the languages are so similar that the same principle applies.)

Thanks for reading, and happy hacking.

--
William Tracy
afishionado@gmail.com -- wtracy@calpoly.edu
Vice President, Cal Poly Linux Users' Group
http://www.cplug.org