Thursday, September 1, 2011

JVM - Out of swap space! Really?

Houston, ...

One of our hosted Portal systems suddenly had stability problems, and the production team found this exception in the server logs:
Exception java.lang.OutOfMemoryError: requested 20971520 bytes for GrET* in
/BUILD_AREA/jdk1.5.0_23/hotspot/src/share/vm/utilities/growableArray.cpp.
Out of swap space?
The Portal has been running for years without such problems, and suddenly the JVM was restarting once a day with this unexpected, C++-related (!?) exception.

Pavlovian conditioning

The team started to look for the usual OutOfMemoryError suspects: open sessions, average session sizes, Garbage Collection behaviour etc. But there was nothing unusual - the JVM had sufficient free heap and the GC behaved well. This research doesn't cause any harm, but it won't solve the issue: this is no typical OutOfMemoryError that follows a Java object allocation.

The message Out of swap space? also triggered some reasonable research. But the machine had enough swap space and didn't even use it. This is the expected behaviour for Java server systems - swapping is the Garbage Collector's worst enemy.

What to do if nothing helps? Google is your friend... bug reports, forums and blogs ;) But as often as not, the solutions you find don't match your case, contain stale information or are simply wrong. The hits for this exception include things like (forget about them):
  • Hotspot should be deactivated, JVM bug
  • Garbage Collector type CMS has a bug, switch to another one
  • Java Heap space too fragmented
  • you should run JProbe, Memory Analyzer and activate verbose GC
  • update the Java version (OK... nothing wrong with that)

What's really up?

The error simply means that the JVM process - a (complex) C++ program (forget Java for a moment) - cannot allocate a contiguous 20 MB block of virtual memory. The JVM is only guessing when it claims there may not be enough system swap space: it cannot actually know this from the memory allocation response. malloc simply returns a null pointer, and that's it. The well-intentioned hint 'swap space' is misleading more often than it is true.
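To make this tangible, here is a minimal sketch (my illustration, not the Portal's code) that provokes exactly this kind of native OutOfMemoryError while the Java heap stays nearly empty. sun.misc.Unsafe.allocateMemory() requests memory from the C heap, bypassing -Xmx completely:

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    // Illustration: allocate native (C heap) memory outside the Java heap.
    // When the underlying malloc fails, the JVM throws an OutOfMemoryError
    // although the Java heap is nowhere near exhausted.
    public class NativeOomDemo {
        public static void main(String[] args) throws Exception {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Unsafe unsafe = (Unsafe) f.get(null);
            while (true) {
                // keep requesting contiguous 20 MB native blocks until the
                // process address space is exhausted
                unsafe.allocateMemory(20 * 1024 * 1024);
            }
        }
    }

On a 32 Bit JVM this hits the process size limit long before any swap space is touched.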

growableArray is used in all kinds of places in the JVM - not only for Heap and Perm space, but also for allocating additional working space for code generation (Hotspot), thread stacks, Garbage Collection etc. This error has nothing to do with insufficient Heap or Perm space.

There are many possible reasons, e.g.:
  • Really not enough physical memory or swap space. The requested contiguous memory cannot itself live in swap space (!), but other pages could be swapped out, so indirectly the guess "not enough swap space" could be true. This is very likely not your problem. You should deactivate swap on dedicated JVM server systems anyway, because the Garbage Collector and swapping don't go well together.
  • You have a 32 Bit JVM and simply overflow the 32 Bit process size limit of the underlying operating system. Remember that only Solaris allows nearly 4 GB for a 32 Bit process; the maximum 32 Bit JVM process size on Windows or older Linux is more like 3 GB. A quick way to check the actual process size is shown after this list.
  • An additional problem in the 32 Bit case can be a fragmented process address space. Follow the best practice for server JVMs and set -Xms equal to -Xmx, and then this shouldn't be the reason for the problem.
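To check the second point, compare the actual process size with your memory settings. On Solaris (as in the output further below) prstat reports the virtual process size; on Linux, top shows the same thing as VIRT - the pid is a placeholder:

    prstat -p <pid>    # Solaris: SIZE column = virtual process size
    top -p <pid>       # Linux: VIRT column

If that size is close to the platform's 32 Bit limit, large native allocations will start to fail exactly as described above.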

4 GB ought to be enough for anybody.

The JVM process size is not just the sum of Heap and Perm space. There is also thread stack space (-Xss), Garbage Collector working space, Hotspot compilation space, general process space (e.g. for I/O buffers or array copies) and much more. Many of these have no dedicated JVM maximum setting, and these are non-trivial amounts that can add up to several hundred MB in a busy system with lots of threads and GC activity (and maybe large memory page sizes).
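Part of the problem is that these components are easy to overlook, because the JVM's own management API doesn't report them. A short sketch (the class name is mine) of what the standard API shows - and what it hides:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;

    // The JVM's self-reported memory: heap plus "non-heap" (Perm space,
    // code cache). Thread stacks, GC working space and other C heap
    // allocations don't appear here, which is why the OS-level process
    // size is always noticeably larger than this sum.
    public class MemoryView {
        public static void main(String[] args) {
            MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
            System.out.println("heap:     " + mem.getHeapMemoryUsage());
            System.out.println("non-heap: " + mem.getNonHeapMemoryUsage());
        }
    }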

Indirectly, Hotspot compilation or Garbage Collection can trigger the error, because this functionality needs process memory on top of the Java Heap space. E.g. the CMS garbage collector needs much more additional memory than other collector types (interesting statistics). But the real reason is overly generous settings for Heap and Perm space within the restricted 32 Bit JVM process size. Many developers know the following image with the important -Xmx and -XX:MaxPermSize settings. With them you restrict the largest chunks of the used process memory - and you will be tempted to over-optimize these maximum settings:

[Image: the classic JVM memory layout - Java Heap (-Xmx) and Perm generation (-XX:MaxPermSize)]
The initially mentioned (suddenly unstable) Portal server runs with -Xms2600m -Xmx2600m -Xss128k on Solaris and uses 350 MB of Perm space. But the real process size is on the verge of the 4 GB hazard zone:

    PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
    25732 bea 3799M 3487M sleep 59 0 0:42:37 2.2% java/94
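A rough back-of-envelope calculation - the numbers besides Heap, Perm and stacks are estimates, not measurements - shows how the process ends up so far beyond -Xmx:

      2600 MB Java Heap (-Xms = -Xmx)
    +  350 MB Perm space
    +  ~12 MB thread stacks (94 threads x 128 KB -Xss)
    + several 100 MB code cache, GC working space, C heap, I/O buffers
    ------------------------------------------------------------------
    ~ 3800 MB total process size - against a limit of nearly 4 GB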

Maybe we used more Perm space over time because the Portal functionality grows with each deployment. Or we used more worker threads because of increasing peak traffic, etc. It really doesn't matter - lower the Heap settings by a small margin (the system runs nearly a full day before the error is triggered) and all is good ;)
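For illustration only - the exact margin is a judgment call, not a figure from our production configuration:

    -Xms2400m -Xmx2400m -Xss128k

Going from 2600m to 2400m hands about 200 MB of address space back to the JVM's native allocations - enough headroom that occasional large native requests, like the 20 MB block above, can succeed again.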

Switching to a 64 Bit VM is a possibility, but keep the overhead in mind. For older JVMs this was an important decision; with modern JVMs, CompressedOops may be an interesting option to alleviate the effects. In any case, I cringe when I see a stateless, high-throughput system like an ESB running with huge Heap sizes in 64 Bit mode.
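For example (whether the flag is available depends on your exact JVM version; recent HotSpot 64 Bit VMs support it - the invocation is illustrative):

    java -d64 -XX:+UseCompressedOops -Xmx2600m ...

CompressedOops lets the 64 Bit VM store object references in 32 Bits as long as the heap stays below roughly 32 GB, which takes back much of the 64 Bit memory overhead.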

1 comment:

1. The JVM memory was always a problem and it still is, even in JVM 6. Thanks for this insight that helps us cure out-of-memory problems.
