
Thread in Java

Java & Threads


Threads are essentially subprocesses. Informally, you can think of them as tasks that belong to a program and that can run "simultaneously". Depending on the number of CPUs available and the number of competing threads, some of those threads actually will run in parallel on different CPUs, whilst in other cases the illusion of simultaneous execution will be achieved by "juggling" threads in and out of the available CPUs. A part of the OS called the thread scheduler takes care of deciding which threads to allocate CPU time to (on which CPUs) and when.
Who needs to know about thread programming?
In the past, people who needed to know about thread programming were generally people writing applications that dealt with multiple "requests" or "jobs" concurrently, such as a web server or maybe a web browser making simultaneous requests to download a web page and associated images. But many other "client" or "desktop" applications didn't involve threads: there was generally only one CPU on a desktop machine, so everything was just done in one thread. Even games, which often needed to create the illusion of, say, different characters simultaneously moving about, would still use one thread and just "loop through" the various characters to move on every frame. (Look at pretty much any tutorial on game programming written until fairly recently, and the first chapter will inevitably be about "the game loop"...)
But increasingly, practically every programmer needs to know about threading and parallel programming. A fundamental characteristic of the "game loop" approach is that it will only occupy one processor. The only way to improve performance with this approach (aside from using better algorithms) is to increase processor speed. This hasn't mattered until recently, because single processor speeds were continually increasing. But increased performance nowadays is being achieved via increased numbers of CPUs rather than increased speed of a single CPU. This means that to get our game or calculator program to perform better, we need to put the multiple CPUs to work in parallel. And by and large, that means multiple threads.

Getting started: what are threads, and how to use them in Java


It's easier to illustrate what a thread is by diving straight in and seeing some code. We're going to write a program that "splits itself" into two simultaneous tasks. One task is to print Hello, world! every second. The other task is to print Goodbye, cruel world! every two seconds. OK, it's a silly example.
For them to run simultaneously, each of these two tasks will run in a separate thread. To define a "task", we create an instance of Runnable. Then we will wrap a Thread object around each of these Runnables.
Runnable
A Runnable object defines an actual task that is to be executed. It doesn't define how it is to be executed (serially, two at a time, three at a time etc), but just what. We can define a Runnable as follows:
Runnable r = new Runnable() {
  public void run() {
    // ... code to be executed ...
  }
};
Runnable is actually an interface, with the single run() method that we must provide. In our case, we want the Runnable.run() methods of our two tasks to print a message periodically. So here is what the code could look like:
Runnable r1 = new Runnable() {
  public void run() {
    try {
      while (true) {
        System.out.println("Hello, world!");
        Thread.sleep(1000L);
      }
    } catch (InterruptedException iex) {}
  }
};
Runnable r2 = new Runnable() {
  public void run() {
    try {
      while (true) {
        System.out.println("Goodbye, " +
               "cruel world!");
        Thread.sleep(2000L);
      }
    } catch (InterruptedException iex) {}
  }
};
For now, we'll gloss over a couple of issues, such as how the task ever stops. As you've probably gathered, the Thread.sleep() method essentially "pauses" for the given number of milliseconds, but could get "interrupted", hence the need to catch InterruptedException. We'll come back to this in more detail in the section on Thread interruption and InterruptedException. The most important point for now is that with the Runnable interface, we're just defining what the two tasks are. We haven't actually set them running yet. And that's where the Thread class comes in...
Thread
A Java Thread object wraps around an actual thread of execution. It effectively defines how the task is to be executed: namely, at the same time as other threads. To run the above two tasks simultaneously, we create a Thread object for each Runnable, then call the start() method on each Thread:
Thread thr1 = new Thread(r1);
Thread thr2 = new Thread(r2);
thr1.start();
thr2.start();
When we call start(), a new thread is spawned, which will begin executing the task that was assigned to the Thread object at some time in the near future. Meanwhile, control returns to the caller of start(), and we can start the second thread. Once that starts, we'll actually have at least three threads now running in parallel: the two we've just started, plus the "main" thread from which we created and started the two others. (In reality, the JVM will tend to have a few extra threads running for "housekeeping" tasks such as garbage collection, although they're essentially outside of our program's control.)

 

Difference between “implements Runnable” and “extends Thread” in java

In Java, there are two ways to create a thread: one by implementing the Runnable interface, the other by extending the Thread class.
public class DemoRunnable implements Runnable {
    public void run() {
        //Code
    }
}
//with a "new Thread(demoRunnable).start()" call

public class DemoThread extends Thread {
    public DemoThread() {
        super("DemoThread");
    }
    public void run() {
        //Code
    }
}
//with a "demoThread.start()" call
There has been a good deal of debate about which way is better. I tried to find out, and here is what I learned:
1) Implementing Runnable is the preferred way to do it. Here, you’re not really specializing or modifying the thread’s behavior. You’re just giving the thread something to run. That means composition is the better way to go.
2) Java only supports single inheritance, so you can only extend one class.
3) Implementing an interface gives a cleaner separation between your code and the implementation of threads.
4) Implementing Runnable makes your class more flexible. If you extend Thread then the action you’re doing is always going to be in a thread. However, if you implement Runnable it doesn’t have to be. You can run it in a thread, or pass it to some kind of executor service, or just pass it around as a task within a single threaded application.
5) By extending Thread, each of your threads has a unique object associated with it, whereas when implementing Runnable, many threads can share the same Runnable instance.
The issue is that at construction time, a Thread is added to a list of references in an internal thread table. It won’t get removed from that list until its start() method has completed. As long as that reference is there, it won’t get garbage collected.
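Points 4 and 5 are easier to see in code. The following is a minimal sketch rather than a definitive implementation: the class and method names (SharedRunnableDemo, CountingTask, runDemo) are made up for illustration. It runs a single Runnable instance on two threads and once more directly in the calling thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the names here are illustrative only.
class SharedRunnableDemo {

    // A task that simply counts how many times it has been run.
    static final class CountingTask implements Runnable {
        private final AtomicInteger runs = new AtomicInteger();

        @Override
        public void run() {
            runs.incrementAndGet();
        }

        int runCount() {
            return runs.get();
        }
    }

    // Runs ONE task instance on two threads and once directly in the caller.
    static int runDemo() {
        CountingTask task = new CountingTask();
        Thread a = new Thread(task, "worker-a");
        Thread b = new Thread(task, "worker-b");
        a.start();
        b.start();
        task.run(); // no new thread: executes in the calling thread
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return task.runCount();
    }

    public static void main(String[] args) {
        System.out.println("Task ran " + runDemo() + " times");
    }
}
```

Because the same object is shared, any state inside it must be safe for concurrent use, which is why the counter in this sketch is an AtomicInteger rather than a plain int.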

 

 

 

 

Constructing Threads and Runnables

In our Java threading introduction, we created a thread in two steps:
  • firstly, we constructed a Runnable object to define the code to be executed by the thread;
  • then, we constructed a Thread object around the Runnable.
There are actually a couple of variations on this pattern of thread construction that we'll look at here.

Pattern 1: create an explicit class that implements Runnable

The Runnable implementations that we created were inline (anonymous) classes. That is, we didn't spell out a full class declaration. But strictly speaking, the way to create a Runnable, or rather a class that implements the Runnable interface, is as follows:
public class MyTask implements Runnable {
  public void run() {
    ...
  }
}
...
Runnable r = new MyTask();
Thread thr = new Thread(r);
If we just write new Runnable(), the compiler under the hood actually creates a "dummy class" of the above form for us. But sometimes it's useful to create our own class. In our simultaneous message printing example, both Runnables essentially had similar code: only the message and time interval differed. So it would be neater to define a class that took the message and interval as parameters to the constructor:
public class MessagePrinter implements Runnable {
  private final String message;
  private final long interval;
  public MessagePrinter(String msg, long interval) {
    this.message = msg;
    this.interval = interval;
  }
  public void run() {
    try {
      while (true) {
        System.out.println(message);
        Thread.sleep(interval);
      }
    } catch (InterruptedException iex) {}
  }
}
Notice that for reasons we'll come to, we declare the variables final. This is basically a means of making sure they are "seen properly" by the two threads involved (the thread that constructs the object, then the thread in which run() will actually be running when we start the thread).
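To see a parameterised task of this kind in use, here is a small sketch. It assumes shorter intervals than the original example so that it finishes quickly, and it stops the printers by interrupting them. The wrapper class MessagePrinterDemo and the method runFor() are hypothetical names; the nested MessagePrinter repeats the class above so that the example is self-contained.

```java
// Hypothetical wrapper so that the example compiles on its own.
class MessagePrinterDemo {

    // Same shape as the MessagePrinter class in the text.
    static final class MessagePrinter implements Runnable {
        private final String message;
        private final long interval;

        MessagePrinter(String msg, long interval) {
            this.message = msg;
            this.interval = interval;
        }

        @Override
        public void run() {
            try {
                while (true) {
                    System.out.println(message);
                    Thread.sleep(interval);
                }
            } catch (InterruptedException iex) {
                // Interrupted: fall through so that run() returns and the thread ends.
            }
        }
    }

    // Lets two printers run for the given time, then interrupts them.
    // Returns true if both threads have finished.
    static boolean runFor(long millis) {
        Thread t1 = new Thread(new MessagePrinter("Hello, world!", 100L));
        Thread t2 = new Thread(new MessagePrinter("Goodbye, cruel world!", 200L));
        t1.start();
        t2.start();
        try {
            Thread.sleep(millis);
            t1.interrupt();
            t2.interrupt();
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return !t1.isAlive() && !t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Both printers stopped: " + runFor(500L));
    }
}
```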

Pattern 2: override Thread.run()

You can actually dispense with the separate Runnable object. The Thread class has a (normally empty) run() method. If you don't pass in a Runnable to the constructor, then the run() method of Thread will be called instead when the thread starts. So we could write something like this:
public class MyThread extends Thread {
  public void run() {
    ...
  }
}
...
Thread thr = new MyThread();
thr.start();
Of course, we can also turn this into an inline class:
Thread thr = new Thread() {
  public void run() {
    ...
  }
};
thr.start();

Which thread construction pattern?

So, which method should you use to construct a thread in Java? In general, constructing a separate Runnable gives you more flexibility. Running in a Thread turns out not to be the only way of running a Runnable, so if you embed everything inside a Thread object from the beginning, you may end up with more code to change later on if you decide to do things differently. In the simplest case, having a separate Runnable allows you to write code such as the following:
public void runTask(Runnable r, boolean separateThread) {
  if (separateThread) {
    (new Thread(r)).start();
  } else {
    r.run();
  }
}
Other instances where a Runnable is used are with the SwingUtilities.invokeLater() method (called from a non-Swing thread to ask Swing to run a particular task in its UI thread), or with various executor utilities introduced in the Java 5 concurrency package.
On the other hand, for threads representing fairly "major" tasks running right through your application, where it's clear from the ground up that you don't need the flexibility of the separate Runnable object, just overriding Thread.run() may make your code a little less cluttered.
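As an illustration of the executor route, here is a minimal sketch that hands the same Runnable to a small thread pool instead of wrapping it in Thread objects ourselves. The class name ExecutorDemo, the method runTasks(), the pool size of 2 and the 5-second timeout are all assumptions made for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names and sizes are illustrative.
class ExecutorDemo {

    // Submits the same task taskCount times and returns how many runs completed.
    static int runTasks(int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        Runnable task = () -> completed.incrementAndGet();

        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < taskCount; i++) {
            pool.execute(task); // the pool decides which of its threads runs it
        }
        pool.shutdown(); // accept no new tasks; already-queued tasks still run
        try {
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed tasks: " + runTasks(5));
    }
}
```

Had the task been written as a Thread subclass, it could not have been reused this way without being rewritten.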

Thread methods in Java


On the previous page, we looked at how to construct a thread in Java, via the Runnable and Thread objects. We mentioned that the Thread class provides control over threads. So on this page, we take a high-level look at the most important methods on this class.
Thread.sleep()
We actually saw a sneak preview of Thread.sleep() in our Java threading introduction. This static method asks the system to put the current thread to sleep for (approximately) the specified amount of time, effectively allowing us to implement a "pause". A thread can be interrupted from its sleep.
For more details, see: Thread.sleep() (separate page).
interrupt()
As mentioned, you can call a Thread object's interrupt() method to interrupt the corresponding thread if it is sleeping or waiting. The corresponding thread will "wake up" with an InterruptedException at some point in the future. See thread interruption for more details.
setPriority() / getPriority()
Sets and queries some platform-specific priority assignment of the given thread. When calculating a priority value, it's good practice to always do so in relation to the constants Thread.MIN_PRIORITY, Thread.NORM_PRIORITY and Thread.MAX_PRIORITY. In practice, values go from 1 to 10, and map on to some machine-specific range of values: nice values in the case of Linux, and local thread priorities in the case of Windows. These are generally the range of values of "normal" user threads, and the OS will actually still run other threads beyond these values (so, for example, you can't preempt the mouse pointer thread by setting a thread to MAX_PRIORITY!).
Three main issues with thread priorities are that:
  • they don't always do what you might intuitively think they do;
  • their behaviour depends on the platform and Java version: e.g. in Linux, priorities don't work at all in Hotspot before Java 6, and the mapping of Java to OS priorities changed under Windows between Java 5 and Java 6;
  • in trying to use them for some purpose, you may actually interfere with more sensible scheduling decisions that the OS would have made anyway to achieve your purpose.
For more information, see the section on thread scheduling and the discussion on thread priorities, where the behaviour on different platforms is compared.
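As a small sketch of calculating a priority in relation to the constants rather than hard-coding a number, consider the following. PriorityDemo and its methods are hypothetical names, and what effect the resulting priority has, if any, depends on the platform as described above.

```java
// Hypothetical demo class; only the Thread constants and methods are real API.
class PriorityDemo {

    // One step above normal, clamped so it never exceeds the legal maximum.
    static int slightlyAboveNormal() {
        return Math.min(Thread.NORM_PRIORITY + 1, Thread.MAX_PRIORITY);
    }

    // Creates a thread (not started), sets its priority and reads it back.
    static int configuredPriority() {
        Thread worker = new Thread(() -> { }, "background-worker");
        worker.setPriority(slightlyAboveNormal());
        return worker.getPriority();
    }

    public static void main(String[] args) {
        System.out.println("Priority range: " + Thread.MIN_PRIORITY
                + " to " + Thread.MAX_PRIORITY);
        System.out.println("Worker priority: " + configuredPriority());
    }
}
```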
join()
The join() method is called on the Thread object representing another thread. It tells the current thread to wait for the other thread to complete. To wait for multiple threads at a time, you can use a CountDownLatch.
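The two waiting styles might be sketched as follows. The class JoinAndLatchDemo and its method names are invented for the example, and the "work" done by each thread is a placeholder.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class contrasting join() with a CountDownLatch.
class JoinAndLatchDemo {

    // Waits for a single worker with join(), then reads its result.
    static int computeWithJoin() {
        final int[] result = new int[1];
        Thread worker = new Thread(() -> result[0] = 6 * 7);
        worker.start();
        try {
            worker.join(); // blocks until the worker's run() has returned
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return result[0]; // writes made by the worker are visible after join()
    }

    // Waits for several workers at once; returns the latch count afterwards.
    static long waitForAll(int workers) {
        CountDownLatch done = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            new Thread(done::countDown).start(); // each worker signals completion
        }
        try {
            done.await(); // blocks until the count reaches zero
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return done.getCount();
    }

    public static void main(String[] args) {
        System.out.println("Result via join: " + computeWithJoin());
        System.out.println("Latch count after await: " + waitForAll(3));
    }
}
```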
Thread.yield()
This method effectively tells the system that the current thread is "willing to relinquish the CPU". What it actually does is quite system-dependent. For more details, see: Thread.yield() (separate page).
setName() / getName()
Threads have a name attached to them. By default, Java will attach a fairly dull name such as Thread-12. But for debugging purposes you might want to attach a more meaningful name such as Animation Thread, WorkerThread-10 etc. (Some of the variants of the Thread constructor actually allow you to pass in a name from the start, but you can always change it later.)
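A brief sketch of both ways of naming a thread follows. ThreadNameDemo and its methods are hypothetical; the thread names are the ones suggested above.

```java
// Hypothetical demo class showing constructor naming and setName().
class ThreadNameDemo {

    // Names the thread in the constructor and reads the name from inside it.
    static String nameSeenInside() {
        final String[] seen = new String[1];
        Thread t = new Thread(
                () -> seen[0] = Thread.currentThread().getName(),
                "WorkerThread-10");
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return seen[0];
    }

    // Renames a thread after construction.
    static String renamed() {
        Thread t = new Thread(() -> { });
        t.setName("Animation Thread");
        return t.getName();
    }

    public static void main(String[] args) {
        System.out.println(nameSeenInside());
        System.out.println(renamed());
    }
}
```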

Thread interruption in Java


In our overview of thread methods, we saw various methods that throw InterruptedException.
Interruption is a mechanism whereby a thread that is waiting (or sleeping) can be made to prematurely stop waiting.
Incidentally, it is important not to confuse thread interruption with either software interrupts (where the CPU automatically interrupts the current instruction flow in order to call a registered piece of code periodically, as in fact happens to drive the thread scheduler) or hardware interrupts (where the CPU automatically performs a similar task in response to some hardware signal).
To illustrate interruption, let's consider again a thread that prints a message periodically. After printing the message, it sleeps for a couple of seconds, then repeats the loop:
Runnable r = new Runnable() {
  public void run() {
    try {
      while (true) {
        Thread.sleep(2000L);
        System.out.println("Hello, world!");
      }
    } catch (InterruptedException iex) {
      System.err.println("Message printer interrupted");
    }
  }
};
Thread thr = new Thread(r);
thr.start();
The InterruptedException is thrown by the Thread.sleep() method, and in fact by a few other core library methods that can "block", principally:
  • Object.wait(), part of the wait/notify mechanism;
  • Thread.join(), that makes the current thread wait for another thread to complete;
  • Process.waitFor(), which lets us wait for an external process (started from our Java application) to terminate;
  • various methods in the Java 5 concurrency libraries, such as the timed tryLock() and lockInterruptibly() methods of the Java 5 ReentrantLock class.
In general, InterruptedException is thrown when another thread interrupts the thread calling the blocking method. The other thread interrupts the blocking/sleeping thread by calling interrupt() on it:
thr.interrupt();
Provided that the thread or task calling sleep() (or whatever) has been implemented properly, the interruption mechanism can therefore be used as a way to cancel tasks.
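Here is a minimal sketch of cancelling a sleeping task in this way. The class InterruptDemo and the method interruptSleeper() are hypothetical names, and the one-minute sleep is just a stand-in for a long wait.

```java
// Hypothetical demo class: one thread sleeps, another interrupts it.
class InterruptDemo {

    // Returns true if the sleeping thread was woken by InterruptedException.
    static boolean interruptSleeper() {
        final boolean[] wasInterrupted = new boolean[1];
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000L); // would sleep for a minute if left alone
            } catch (InterruptedException iex) {
                wasInterrupted[0] = true; // treat interruption as "please stop"
            }
        });
        sleeper.start();
        sleeper.interrupt(); // request cancellation
        try {
            sleeper.join(); // returns promptly because the sleep was cut short
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return wasInterrupted[0];
    }

    public static void main(String[] args) {
        System.out.println("Sleeper was interrupted: " + interruptSleeper());
    }
}
```

If interrupt() happens to be called just before the thread reaches sleep(), the thread's interrupt status is already set, so sleep() throws InterruptedException as soon as it is called.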

Stopping a thread

On the previous pages, we focussed on how to start a thread in Java. We saw that after creating the Thread object, calling start() asynchronously starts the corresponding thread. In our example, the run() method contained an infinite loop. But in real life, we generally want our thread to stop. So how do we make our thread stop?
Firstly, the thing you shouldn't do is call Thread.stop(). This method, now deprecated, was intended to stop a given thread abruptly. But the problem with this is that the caller can't generally determine whether or not the given thread is at a safe point to be stopped. (This isn't just a Java phenomenon: in general, the underlying operating system calls that Thread.stop() makes to abruptly stop the thread are also deprecated for this reason.)
So how can we stop a thread safely? In general:
To make the thread stop, we organise for the run() method to exit.
There are a couple of ways that we would typically do so.

Use a "stop request" variable

A common solution is to use an explicit "stop request" variable, which we check on each pass through the loop. This technique is suitable provided that we can check the variable frequently enough:
private volatile boolean stopRequested = false;
 
public void run() {
  while (!stopRequested) {
    ...
  }
}
 
public void requestStop() {
  stopRequested = true;
}
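Note that we must declare the stopRequested variable as volatile, because it is written by one thread and read by another: without volatile, the thread running the loop is not guaranteed to see the updated value.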
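Put together as a complete class, the pattern might look like the sketch below. StoppableWorker, startAndStop() and the 5-second join timeout are assumptions for illustration, and the loop body is a stand-in for one short unit of real work.

```java
// Hypothetical worker class illustrating the "stop request" pattern.
class StoppableWorker implements Runnable {

    // volatile: written by the requesting thread, read by the worker thread.
    private volatile boolean stopRequested = false;

    // Only ever touched by the worker thread.
    private long passes = 0;

    @Override
    public void run() {
        while (!stopRequested) {
            passes++; // stand-in for one short unit of real work
        }
    }

    public void requestStop() {
        stopRequested = true;
    }

    // Starts a worker, asks it to stop, and reports whether it actually exited.
    static boolean startAndStop() {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker, "stoppable-worker");
        thread.start();
        worker.requestStop();
        try {
            thread.join(5000L); // wait up to 5 seconds for run() to return
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Worker stopped: " + startAndStop());
    }
}
```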

Use Thread.interrupt()

The above pattern is generally suitable if the variable stopRequested can be polled frequently. However, there is an obvious problem if the method blocks for a long time, e.g. by calling Thread.sleep() or waiting on an object. In general, such blocking methods are interruptible. For more information, see the section on thread interruption.
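As a sketch of the interrupt-based alternative, consider a worker that blocks waiting for jobs on a queue. Polling a flag would not help while it is blocked, but interrupting it does. The class QueueWorkerDemo, the method name and the two dummy jobs are invented for the example; BlockingQueue.take() is one of the library methods that throws InterruptedException.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo class: a worker blocked on a queue is stopped by interrupt().
class QueueWorkerDemo {

    // Returns the number of jobs processed before the worker was stopped.
    static int processUntilInterrupted() {
        BlockingQueue<String> jobs = new LinkedBlockingQueue<>();
        final int[] processed = new int[1];

        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    jobs.take();    // blocks while the queue is empty
                    processed[0]++; // stand-in for handling the job
                }
            } catch (InterruptedException iex) {
                // Stop requested: let run() return.
            }
        });
        worker.start();
        jobs.add("job-1");
        jobs.add("job-2");
        try {
            while (!jobs.isEmpty()) {
                Thread.sleep(10L); // give the worker time to drain the queue
            }
            worker.interrupt(); // wakes the worker from its blocking take()
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return processed[0];
    }

    public static void main(String[] args) {
        System.out.println("Jobs processed: " + processUntilInterrupted());
    }
}
```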


How threads work: more details


In our introduction to Java threads, we showed the basics of how to start a thread, and the idea that threads let us run multiple tasks or "mini-programs" in parallel. But to understand certain thread programming issues in more detail, it's helpful to take a more detailed look at what threads actually are and how they work.
http://www.javamex.com/tutorials/threads/ThreadDiagram.png
Figure 1: Typical relationship between threads and processes.
Threads and processes
A thread is essentially a subdivision of a process, or "lightweight process" (LWP) on some systems. A process is generally the most major and separate unit of execution recognised by the OS. The typical relationship between processes, threads and various other elements of the OS is shown in Figure 1. This shows two processes, each split into two threads (a simplistic situation, of course: there will be typically dozens of processes, some with dozens or more threads).
Crucially, each process has its own memory space. When Process 1 accesses some given memory location, say 0x8000, that address will be mapped to some physical memory address [1]. But from Process 2, location 0x8000 will generally refer to a completely different portion of physical memory. A thread is a subdivision that shares the memory space of its parent process. So when either Thread 1 or Thread 2 of Process 1 accesses "memory address 0x8000", they will be referring to the same physical address. Threads belonging to a process usually share a few other key resources as well, such as their working directory, environment variables, file handles etc.
On the other hand, each thread has its own private stack and registers, including program counter. These are essentially the things that threads need in order to be independent. Depending on the OS, threads may have some other private resources too, such as thread-local storage (effectively, a way of referring to "variable number X", where each thread has its own private value of X). The OS will generally attach a bit of "housekeeping" information to each thread, such as its priority and state (running, waiting for I/O etc).
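In Java terms, the shared-versus-private distinction can be sketched with an ordinary static field (shared by every thread in the process) and a ThreadLocal (one private value per thread). The class SharedVsThreadLocalDemo and the particular numbers are assumptions made for the example.

```java
// Hypothetical demo class contrasting shared memory with thread-local storage.
class SharedVsThreadLocalDemo {

    // An ordinary field: one copy, visible to every thread in the process.
    static int shared = 0;

    // Thread-local storage: each thread gets its own value, initially 0.
    static final ThreadLocal<Integer> perThread = ThreadLocal.withInitial(() -> 0);

    // Returns { shared value seen by worker, worker's thread-local value,
    //           main thread's thread-local value after the worker has run }.
    static int[] run() {
        final int[] seen = new int[3];
        shared = 7;         // written before start(), so the worker sees it
        perThread.set(100); // private to the calling thread

        Thread worker = new Thread(() -> {
            seen[0] = shared;          // same memory as the caller
            seen[1] = perThread.get(); // worker's own copy: initial value
            perThread.set(1);          // changes only the worker's copy
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        seen[2] = perThread.get(); // unaffected by the worker's set(1)
        return seen;
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("worker saw shared=" + r[0]
                + ", worker local=" + r[1] + ", main local=" + r[2]);
    }
}
```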
The thread scheduler
There are generally more threads than CPUs. Part of a multithreaded system is therefore a thread scheduler, responsible for sharing out the available CPUs in some way among the competing threads. Note that in practically all modern operating systems, the thread scheduler is part of the OS itself. So the OS actually "sees" our different threads and is responsible for the task of switching between them [2]. The rationale for handling threading "natively" in the OS is that the OS is likely to have the information to make threading efficient (such as knowing which threads are waiting for I/O and for how long), whereas a software library may not have this information available. In the rest of our discussion, we'll generally assume this native threads model.


1. Things are usually a little more complex, in fact. For example, a memory address can actually be mapped to something that isn't memory (such as a portion of a file, or device I/O). For this reason, the term address space is often preferred.
2. An alternative scenario, less common nowadays, is that the OS schedules at some higher level, e.g. scheduling processes, or scheduling some kind of "kernel thread" which is a unit bigger than our application's threads. In this model, sometimes called "green threads", threads are to some extent "artificially" handled by the JVM (or the threading library that it is compiled against). In general, if you are using Java 1.4 onwards on Windows or on Solaris 9 or Linux kernel 2.6 or later, then there will be a 1:1 mapping between Java Threads and "native" operating system threads.

Thread Scheduling


In our introduction to how threads work, we introduced the thread scheduler, part of the OS (usually) that is responsible for sharing the available CPUs out between the various threads. How exactly the scheduler works depends on the individual platform, but various modern operating systems (notably Windows and Linux) use largely similar techniques that we'll describe here. We'll also mention some key variations between the platforms.
Note that we'll continue to talk about a single thread scheduler. On multiprocessor systems, there is generally some kind of scheduler per processor, which then need to be coordinated in some way. (On some systems, switching on different processors is staggered to avoid contention on shared scheduling tables.) Unless otherwise specified, we'll use the term thread scheduler to refer to this overall system of coordinated per-CPU schedulers.
Across platforms, thread scheduling tends to be based on at least the following criteria:
  • a priority, or in fact usually multiple "priority" settings that we'll discuss below;
  • a quantum, or number of allocated timeslices of CPU, which essentially determines the amount of CPU time a thread is allotted before it is forced to yield the CPU to another thread of the same or lower priority (the system will keep track of the remaining quantum at any given time, plus its default quantum, which could depend on thread type and/or system configuration);
  • a state, notably "runnable" vs "waiting";
  • metrics about the behaviour of threads, such as recent CPU usage or the time since it last ran (i.e. had a share of CPU), or the fact that it has "just received an event it was waiting for".
Most systems use what we might dub priority-based round-robin scheduling to some extent. The general principles are:
  • a thread of higher priority (which is a function of base and local priorities) will preempt a thread of lower priority;
  • otherwise, threads of equal priority will essentially take turns at getting an allocated slice or quantum of CPU;
  • there are a few extra "tweaks" to make things work.
States
Depending on the system, there are various states that a thread can be in. Probably the two most interesting are:
  • runnable, which essentially means "ready to consume CPU"; being runnable is generally the minimum requirement for a thread to actually be scheduled on to a CPU;
  • waiting, meaning that the thread currently cannot continue as it is waiting for a resource such as a lock or I/O, for memory to be paged in, for a signal from another thread, or simply for a period of time to elapse (sleep).
Other states include terminated, which means the thread's code has finished running but not all of the thread's resources have been cleared up, and a new state, in which the thread has been created, but not all resources necessary for it to be runnable have been created. Internally, the OS may distinguish between various different types of wait states (for example "waiting for a signal" vs "waiting for the stack to be paged in"), but this level of granularity is generally not available or so important to Java programs. (On the other hand, Java generally exposes to the programmer things the JVM can reasonably know about, for example, if a thread is waiting to acquire the lock on a Java object, which roughly speaking means "entering a synchronized block".)
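The states that Java does expose can be inspected with Thread.getState(). The sketch below uses invented names (ThreadStateDemo, observe()) and observes one thread before it starts, while it is waiting, and after it has finished. Note that these are the JVM's own state names, which do not map one-to-one onto the OS-level states described above.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: samples a thread's state at three points in its life.
class ThreadStateDemo {

    // Returns the states observed before start, while waiting, and after exit.
    static Thread.State[] observe() {
        CountDownLatch release = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                release.await(); // wait until the caller lets us finish
            } catch (InterruptedException iex) {
                Thread.currentThread().interrupt();
            }
        });

        Thread.State before = t.getState(); // created but not yet started
        t.start();

        Thread.State during;
        Thread.State after;
        try {
            // Poll until the thread has actually reached its wait.
            while ((during = t.getState()) != Thread.State.WAITING) {
                Thread.sleep(5L);
            }
            release.countDown(); // let the thread finish
            t.join();
            after = t.getState();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new Thread.State[] { before, during, after };
    }

    public static void main(String[] args) {
        for (Thread.State s : observe()) {
            System.out.println(s);
        }
    }
}
```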
