madhadron

Working with Java command line tools

Status: Done
Confidence: Certain

If you learned to write Java in IntelliJ, NetBeans, or Eclipse, and suddenly you’re faced with using the command line Java tools and messing with jars, this is for you.

There are really only three commands you need to know about: javac, java, and jar.

Let’s start with compiling and running hello world.

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, world!");
    }
}

Save this in HelloWorld.java. Remember, in Java a public class must live in a file named the same as the class. To compile it, open a shell in the directory containing the file and run

javac HelloWorld.java

The output of this is another file, HelloWorld.class, in the same directory. Now, if you are familiar with the shell, you might expect to run this with

# THIS DOES NOT WORK!
./HelloWorld.class

The .class file is not a program in a form that any major operating system today understands. It was never meant to be. Instead it is in a format called bytecode that is designed to be run by another program, called the virtual machine. The same bytecode can run anywhere the virtual machine can run. We run the virtual machine with the java command. Now, you might expect to run it with

# THIS ALSO DOES NOT WORK
java HelloWorld.class

In fact, it prints

Error: Could not find or load main class HelloWorld.class
Caused by: java.lang.ClassNotFoundException: HelloWorld.class

At this point you should be thinking, “Wait a minute, what do you mean you can’t find it? It’s right there!” We need to pause and get into the world of the Java virtual machine.

To work on a wide range of machines, the Java virtual machine is a self-contained world. When we run the java command, we pass it the name of a class, not the name of a file. In HelloWorld.java, we defined the HelloWorld class. The virtual machine starts up, searches the bytecode you point it to, and looks for the class in there.

Note: Any class you pass must be one that the Java virtual machine can use as a starting point for the program. That is, it must define a public static void main method. Otherwise the Java virtual machine will complain that there is no main method.
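To see the difference between a class name and a file name from inside the virtual machine, here is a small sketch. ClassNameLookup and canLoad are names made up for this illustration; Class.forName is the standard library call that asks the running virtual machine to find a class by name.

```java
public class ClassNameLookup {
    /** Returns true if the running virtual machine can find a class with exactly this name. */
    public static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // java.lang.String ships with every virtual machine, so this prints true.
        System.out.println(canLoad("java.lang.String"));
        // A file name is not a class name: the virtual machine reads this as
        // "a class called class in a package called HelloWorld", so this prints false.
        System.out.println(canLoad("HelloWorld.class"));
    }
}
```

The second lookup fails for the same reason java HelloWorld.class does: the virtual machine never treats its argument as a path on disk.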

So how do you point it to bytecode? You pass directories containing .class files to the -cp (read: classpath) argument of the java command. Since we’re in the same directory as the .class file, we can just tell it ., a synonym for the current directory.

java -cp . HelloWorld

The Java virtual machine starts, loads .class files from the current directory, and then looks for one named HelloWorld to run. It finds it, finds that it has a public static void main method, and runs that.

Note: This may seem very indirect, but when we start to add more—and different—things to our classpath, this indirection will become necessary.

Now, say that you have a library of class files in /lib/mylib, another in /lib/otherlib, and your own code in /home/me/myproject. How do you tell the virtual machine to load the classes in all of these locations? You list them all in a single -cp argument, separated by : (on Windows the separator is ; instead). So you would run

java -cp /lib/mylib:/lib/otherlib:/home/me/myproject HelloWorld
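The classpath reaches the program as one string, which is then split into entries. Here is a rough sketch of that splitting. ClasspathEntries and split are hypothetical names for this illustration; java.class.path and File.pathSeparator are the real system property and constant.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ClasspathEntries {
    /** Splits a classpath string into its entries using the given separator. */
    public static List<String> split(String classpath, String separator) {
        // Quote the separator so it is treated as literal text, not as a regex.
        return Arrays.asList(classpath.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        // The virtual machine exposes the classpath it was started with as a system property.
        String classpath = System.getProperty("java.class.path");
        // File.pathSeparator is ":" on Unix-like systems and ";" on Windows.
        for (String entry : split(classpath, File.pathSeparator)) {
            System.out.println(entry);
        }
    }
}
```

Run with the -cp argument shown above, it should print the three directories, one per line.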

What happens if there’s a class named HelloWorld in one of those libraries, though? The Java virtual machine will use the first one it finds as it walks the classpath from left to right. Whether it’s the right one is another matter, and classes expecting one may find themselves calling the other. To get around this, Java introduced the notion of packages.

We can change our HelloWorld.java file to put the class in a package.

package com.madhadron;

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, world!");
    }
}

We compile it again and try to run it (javac HelloWorld.java && java -cp . HelloWorld), and…it fails!

Error: Could not find or load main class HelloWorld
Caused by: java.lang.NoClassDefFoundError: com/madhadron/HelloWorld (wrong name: HelloWorld)

Again, HelloWorld.class is right there! What’s wrong?

When we tell the Java virtual machine what class to start with, we have to provide its “fully qualified” name, that is, the package followed by the name of the class. In our case that is com.madhadron.HelloWorld. So, let’s try that. We run javac and java, and

Error: Could not find or load main class com.madhadron.HelloWorld
Caused by: java.lang.ClassNotFoundException: com.madhadron.HelloWorld

Just like Java enforces that a class named HelloWorld must be in a file named HelloWorld.java, a class in a package com.madhadron must be in a subdirectory com/madhadron/. Since we are passing the current directory as our classpath, we need to put HelloWorld.class in com/madhadron/.

So we run

javac HelloWorld.java
mkdir -p com/madhadron
mv HelloWorld.class com/madhadron/

Now our folder has the contents

HelloWorld.java
com/
    madhadron/
        HelloWorld.class

And when we run java -cp . com.madhadron.HelloWorld, it successfully loads the class, finds the one we want to run, and runs it.
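The lookup rule can be sketched in a few lines. This is a simplified model covering only directories on the classpath, not how the real class loader is implemented (which also handles jars and caching). ClassFileLocator, relativePath, and locate are hypothetical names for this illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

public class ClassFileLocator {
    /** Turns a fully qualified class name into the path expected under a classpath entry. */
    public static String relativePath(String fullyQualifiedName) {
        return fullyQualifiedName.replace('.', '/') + ".class";
    }

    /** Walks the classpath directories in order and returns the first matching .class file. */
    public static Optional<Path> locate(List<Path> classpathDirs, String fullyQualifiedName) {
        for (Path dir : classpathDirs) {
            Path candidate = dir.resolve(relativePath(fullyQualifiedName));
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate); // first match wins
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // prints com/madhadron/HelloWorld.class
        System.out.println(relativePath("com.madhadron.HelloWorld"));
        // Looks for that file under the current directory, like -cp . would.
        System.out.println(locate(List.of(Paths.get(".")), "com.madhadron.HelloWorld"));
    }
}
```

This is why HelloWorld.class sitting directly in the current directory was not enough: the name com.madhadron.HelloWorld only ever maps to com/madhadron/HelloWorld.class.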

Note: Package names can prevent class names from colliding, but how do you prevent package names from colliding? The Java community decided to make that someone else’s problem. When you publish a library, you’re supposed to have a domain name, like madhadron.com. You reverse all the components of the domain, add another field to identify your package, and use that as your package name. For example, if I publish this hello world program as a library called helloworld, it would be in the package com.madhadron.helloworld.

Since another system entirely already guarantees that one organization or person owns a domain name, this way we don’t get conflicts. No one enforces this. You could put out a library that puts its code in a package beginning with com.google, but anyone using it would complain mightily and report it as a bug.
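The naming convention is mechanical enough to write down. A minimal sketch, where PackageNames and packageFor are hypothetical names for this illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PackageNames {
    /** Reverses the components of a domain name and appends the library name. */
    public static String packageFor(String domain, String library) {
        List<String> parts = new ArrayList<>(Arrays.asList(domain.split("\\.")));
        Collections.reverse(parts);
        parts.add(library);
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        // prints com.madhadron.helloworld
        System.out.println(packageFor("madhadron.com", "helloworld"));
    }
}
```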

Now, say you want to distribute your library. You would probably make a zip file of all your class files in the right subdirectories for someone to uncompress on their machine. Since everyone is already doing this, Java introduced a convenience called jar files. A jar file is a zip file with an extra bit of data, a manifest stored at META-INF/MANIFEST.MF, describing the code it is carrying. You can add a jar file to your classpath like you would a directory containing class files.
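You can check the "a jar is a zip" claim from Java itself. The sketch below writes a tiny jar in memory with the standard java.util.jar classes, then lists it with a plain zip reader. JarIsZip, buildJar, and listWithZipReader are names invented for this illustration, and the four bytes written are a placeholder, not a real class file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class JarIsZip {
    /** Builds a small jar in memory: a manifest plus one placeholder entry. */
    public static byte[] buildJar() {
        Manifest manifest = new Manifest();
        // A manifest needs a version attribute, otherwise it is written out empty.
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            jar.putNextEntry(new JarEntry("com/madhadron/helloworld/HelloWorld.class"));
            jar.write(new byte[] {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE});
            jar.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Lists entry names using only the zip reader, which knows nothing about jars. */
    public static List<String> listWithZipReader(byte[] jarBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(jarBytes))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [META-INF/MANIFEST.MF, com/madhadron/helloworld/HelloWorld.class]
        System.out.println(listWithZipReader(buildJar()));
    }
}
```

The manifest shows up as an ordinary zip entry alongside the class file.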

So say I create a jar file of my helloworld library, called helloworld.jar, containing the class com.madhadron.helloworld.HelloWorld. You put it in /lib/helloworld.jar, and run it with

java -cp /lib/helloworld.jar com.madhadron.helloworld.HelloWorld

How do you create these jar files? With a tool called jar. At first glance the options to the jar command may seem strange. That is because they are inherited from a much older Unix tool called tar. This may not have been the best choice, but Java was created at Sun Microsystems, a company that specialized in Unix workstations. Their engineers used tar all the time, and so copied what they were used to.

Now, for our helloworld library, after compiling HelloWorld.java, our directory has

HelloWorld.java
com/
    madhadron/
        helloworld/
            HelloWorld.class

We create helloworld.jar from this by running

jar cf helloworld.jar com

If we had a second directory we wanted as well, we could add its name after com, separated by a space. The same for a third, or a fourth. Why would we do that? Imagine that we have a jar file otherlibrary.jar containing a library our program depends on, and we want to make one jar file containing both the library and our program. You might think you could say

# THIS DOES NOT WORK
jar cf helloworld.jar com otherlibrary.jar

This doesn’t do what we want. The command itself runs, but when we start the Java virtual machine, it won’t look inside jars for other jars to load classes from. Instead, we need to unzip the library jar, and then build a new jar with both folders.

We can replace cf with xf to unzip jars. So we can run

jar xf otherlibrary.jar

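and we will get its contents. Say it’s put out by example.org. Then we now have a folder containing the following (jar xf also unpacks a META-INF/ directory holding the library’s manifest, which is left out of this listing)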

com/
    madhadron/
        helloworld/
            HelloWorld.class
HelloWorld.java
org/
    example/
        otherlibrary/
            SomeClass.class
otherlibrary.jar

And we can package up our jar with

jar cf helloworld.jar com org
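The unzip-and-rejar steps amount to copying the entries of several archives into one. Here is a sketch of that idea, done in memory with the standard java.util.zip classes. JarMerger and its methods are hypothetical names, and keeping the first copy of a duplicated entry is an assumption of this sketch; the real jar tool writes a fresh manifest instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class JarMerger {
    /** Builds an in-memory zip with the given entry names and placeholder contents. */
    public static byte[] zipOf(String... entryNames) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (String name : entryNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(name.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Copies every entry of every archive into one archive; the first copy of a name wins. */
    public static byte[] merge(List<byte[]> archives) {
        Set<String> seen = new HashSet<>();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream merged = new ZipOutputStream(out)) {
            for (byte[] archive : archives) {
                try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
                    for (ZipEntry e = in.getNextEntry(); e != null; e = in.getNextEntry()) {
                        if (!seen.add(e.getName())) {
                            continue; // a zip cannot hold two entries with the same name
                        }
                        merged.putNextEntry(new ZipEntry(e.getName()));
                        in.transferTo(merged); // copies the current entry's bytes
                        merged.closeEntry();
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Lists the entry names of an in-memory archive, in stored order. */
    public static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry e = in.getNextEntry(); e != null; e = in.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

Merging an archive holding com/... with one holding org/... yields a single flat archive with both trees side by side, which is the layout the virtual machine can load classes from.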

We can give that jar to someone and they can run our program in it by running

java -cp helloworld.jar com.madhadron.helloworld.HelloWorld

That seems like a lot, though. Do they really need to know the name of our class to run this zip file we have given them? No, they don’t. The Java virtual machine recognizes a special case of a jar file: one whose manifest names the class to run. Adding e to the jar options, followed by the class name, records that entry point in the jar:

jar cfe helloworld.jar com.madhadron.helloworld.HelloWorld com org

Now we can run

java -jar helloworld.jar

and it will directly run our program.
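The entry point is stored as a Main-Class line in the jar’s manifest, which is what java -jar reads. A small sketch that writes such a manifest into an in-memory jar and reads it back with the standard java.util.jar classes. MainClassManifest and its two methods are names invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class MainClassManifest {
    /** Builds an in-memory jar that contains only a manifest naming the given main class. */
    public static byte[] jarWithMainClass(String mainClass) {
        Manifest manifest = new Manifest();
        Attributes attributes = manifest.getMainAttributes();
        attributes.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attributes.put(Attributes.Name.MAIN_CLASS, mainClass);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            // No class files are added; the constructor has already written the manifest.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the Main-Class attribute back out of a jar, or returns null if there is none. */
    public static String readMainClass(byte[] jarBytes) {
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] jar = jarWithMainClass("com.madhadron.helloworld.HelloWorld");
        // prints com.madhadron.helloworld.HelloWorld
        System.out.println(readMainClass(jar));
    }
}
```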

Afterword: This probably all seems like a lot of work. For every class you have to run javac and copy the class that it produces into the right directory? No one does that. You need to know how all this works, but then you immediately go on to use a build system like Maven or Gradle.