::Z::Thinking::

::Simple::
文章 - 124,收藏 - , 评论 - 49, trackbacks - 0
Gc 流程
[older generation][survivor 1][survivor 2][eden]
*young generation=eden + survivor
1,当eden满了,触发minor GC;
2,minor GC做2件事,一,去掉一部分没用的object。2,把老的还被引用的object发到survior里面,等下几次GC以后,survivor再放到old里面。
3,当old满了,触发full GC.full GC很消耗内存,把old,young里面大部分垃圾回收掉。这个时候用户线程都会被block。
Summary
1,保存gc信息。
java -Xloggc:<file> (Sun JVM option)/java -verbose:gc (IBM JVM)
2,调优参数。
JVM option
java -XX:NewRatio=2
   2代表older generation:young generation==2:1
java -XX:SurvivorRatio=6
   sets the ratio between each survivor space and eden to be 1:6. In other words, each survivor space will be one eighth of the young generation (not one seventh, because there are two survivor spaces).
        Survivor的space是young generation的1/8,young generation space=eden + survivor.
使用其他垃圾收集机制:
Full GC pause too long:java XX:+UseConcMarkSweepGC
Minor GC pause too long:  java -XX:+UseParallelGC
Those flags differing per architecture/OS/JVM Verison. "Flag and Default" has the default of Sparc/-server, JVM version 1.3
Flag and DefaultDescription
-XX:CompileThreshold=10000number of method invocations/branches before (re-)compiling [10,000 -server, 1,500 -client]
-XX:MaxNewSize=32mMaximum size of new generation (in bytes) [32m sparc, 2.5m intel for 1.3, no limit for 1.4 as NewRatio is now used to determine MaxNewSize]
-XX:NewRatio=2 Ratio of new/old generation sizes [sparc -server: 2, sparc -client: 4 (1.3) 8 (1.3.1+), intel: 12]
-XX:NewSize=2228224Default size of new generation (in bytes) [sparc 2.125M, intel: 640k]
-XX:PreBlockSpin=10Spin count variable for use with -XX:+UseSpinning. Controls the maximum spin iterations allowed before entering operating system thread synchronization code. (as of J2SE 1.4.2, Linux only)
-XX:ReservedCodeCacheSize=32mReserved code cache size (in bytes) - maximum code cache size. [Solaris 64-bit: 1024m]
-XX:SurvivorRatio=64Ratio of eden/survivor space size [Solaris: 64, Solaris: 32 (on 1.3.1 and later), Linux/Windows: 8]
-XX:ThreadStackSize=512 Thread Stack Size (in Kbytes) (0 means use default stack size) [Sparc: 512, Solaris Intel: 256, Sparc 64bit: 1024 all others 0]
-XX:+UseTLE
(-XX:+UseTLAB in J2SE 1.4)
Use thread-local object allocation [Sparc -server: true, all others: false]
-XX:-UseISM See Intimate Shared Memory
-XX:-UseMPSSUse Multiple Page Size Support (Solaris 9 only) w/4mb pages for the heap. Do not use with ISM as this replaces the need for ISM.
-XX:LargePageSizeInBytes=2mSets the large page size used for the Java heap. Set to 2m when running 32-bit JVM on Solaris 10 x86 with AMD64 CPUs. Solaris Only. Defaults (SPARC: 8m, x86: 4m, x86_64: 2m)
-XX:-UseSpinningEnable naive spinning on Java monitor before entering operating system thread synchronizaton code. (as of J2SE 1.4.2, Linux only)
3,young generation比例越大。不一定最好。
〉footprint越多。所用heap越大。
〉如果太大了,会导致minor GC很少发生(young space大了,不容易触发minor GC),最后导致full GC次数增多,Acc time增多。
〉full GC肯能会越少。older generation满了会导致full GC。
〉Acc Time可能会越少。full GC会花很多的时间。当full GC的时候,用户线程都是blocking的。
〉thoughput可能越高。实际用来运行工作的运行时间越多。
4,总的来说。
  对于短生命周期的object比例越大的时候,young size需要小一些。因为每次minor gc都会比较有效。
  当长生命周期的object比例越大的时候,young size需要大一些。因为每次的minor gc回收的垃圾比较少,放到older space里面的object越多,这样容易促发full gc,然而,full gc并不会收集多少垃圾,只会blocking用户进程。
*是不是越烂的程序young size还是小点好?哈哈!
*其实并不是极端最好,young size太小了会有minor gc效率不高。太大了又可能导致大量的full gc。


Java Performance Tuning
Come here for all your Java Performance Training needs
Java(TM) is a trademark or registered trademark of Sun Microsystems, Inc. in the United States and other countries.
JavaPerformanceTuning.com is not connected to Sun Microsystems, Inc. and is not sponsored by Sun Microsystems, Inc.

|home|services|training|newsletter|tuning tips|tool reports|articles|resources|about us|site map|contact us|

Our valued sponsors who help make this site possible
Wily Technology- Manage performance of applications, portals, and integration 

Enerjy Software: Introducing Java development tools you'll actually use. 

Tool Report: GCViewer

*FREE* Java Profiling & Testing Tools
Free Java suite for profiling, code analysis and unit/functional/load testing

Java development tools
Develop clean, reliable Java code and create rules to address your coding issues.

Java Performance Training
Don't leave performance to chance. Train your developers on site to build efficient Java programs


Java Performance Training Courses
COURSES AVAILABLE NOW. We can provide training courses to handle all your Java performance needs

See Your Message Here
You could have your tool advertised here, to be seen by thousands of potential customers

Java Performance Tuning, 2nd ed
Covers Java SDK 1.4 and includes four new J2EE tuning chapters

Java Performance Tuning Newsletter
Your source of Java performance news. Subscribe now! Enter email:

Cognotive
Advice, training and mentoring on the use of TopLink and J2EE

www.jsig.com
We support the London JSIG

www.java.net
We are a java.net member


Wily Technology
Manage performance of applications, portals, and integration

Enerjy Software
Introducing Java development tools you'll actually use


*FREE* Java Profiling & Testing Tools
Free Java suite for profiling, code analysis and unit/functional/load testing

Java development tools
Develop clean, reliable Java code and create rules to address your coding issues.

Java Performance Training
Don't leave performance to chance. Train your developers on site to build efficient Java programs


Published July 2002

The JavaPerformanceTuning.com Tool Reports are designed to help readers make informed choices about the tools they may wish to use. JavaPerformanceTuning.com provides these reports as a service to our readers; JavaPerformanceTuning.com is not responsible for the information provided by the tool author or vendor, nor do we necessarily endorse the products mentioned. JavaPerformanceTuning.com is not responsible for any additional resources provided from the article (such as downloadable files or other accessible material), even where we host such material.

Introduction

GCViewer is a little tool that visualizes verbose garbage collection data generated by Sun and IBM Java Virtual Machines (JVM). The analysis of this data can be useful for tuning applications in order to maximize the performance of the garbage collector and thus the application itself.

Supported verbose:gc formats are:

  • Sun JDK 1.4 with the option -Xloggc:<file> (preferred)
  • Sun JDK 1.3.1/1.4 with the option -verbose:gc
  • IBM JDK 1.3.0 with the option -verbose:gc
  • IBM JDK 1.2.2 with the option -verbose:gc

    Installation

    To install GCViewer simply download the latest version from http://www.tagtraum.com/. Put the archive into the directory where you want to install GCViewer and unpack it. Then change into the newly created directory gcviewer-XXX. On Windows systems, where JAR files are associated with the java launcher, you can simply double click on the file gcviewer.jar to start the application. In all other cases, you can start GCViewer like this:

    java -jar gcviewer.jar

    Overview

    Once you have started GCViewer you will see a Swing window. The figure below depicts most of the available controls with a short description.

    GUI Overview

    By using the View-menu you may turn particular lines on or off. ("Acc Pauses" is the accumulated pause time from garbage collections.)

    Example

    Mostly it's server applications that benefit from tuning garbage collection. Therefore we want to look at a very simple (somewhat artificial) example of a serverside application. In our setup the webserver jo! is pounded by Apache JMeter for a limited time. To keep this example as simple as possible, the same resource is requested again and again.

    Obtaining GC data

    In order to obtain garbage collection data, jo! is started with Sun JDK 1.4.0 Server and the following parameters:

    c:\j2sdk1.4.0\bin\java -server -Xloggc:normal.gc ... (off course this works under Unix/Linux, too)

    The parameter -Xloggc:normal.gc tells the VM that it should write verbose garbage collection information to the file normal.gc. After our test run finished we open the file with GCViewer:

    GC Data with normal garbage collection - 200%

    Probably the two most important lines in the graph are the red and the blue line. The red line indicates the total heap size, the blue line shows how much of the heap is actually used. As we can see the blue line follows a saw like pattern. If we increase the zoom to 1.000% it becomes obvious that this pattern is repeated in each tooth of the saw.

    GC Data with normal garbage collection - 1.000%

    To understand why we see this doubled saw pattern, we have to take a look at the garbage collector being used.

    Sun JDK 1.4.0 Server uses a generational garbage collector with two young generations and an old generation. New objects are always allocated in the young generation part of the heap. Every time the young generation is full a minor garbage collection takes place. In the graph this is indicated by the little grey lines at the bottom. As you can see from the length of these lines (and the green line), minor garbage collections don't take very long. Objects which survived some minor collections are promoted to the older generation. Usually the older generation is full at some point too, and needs to be cleaned. This leads to a major garbage collection that removes garbage from both the old and the younger generations. Full garbage collections are indicated by black vertical lines. As you can see in the graph (green line), full garbage collections take a lot longer than minor garbage collections. This is due to the fact that a larger part of the memory is cleaned and that the older generation uses a slower, but less memory consuming, garbage collection algorithm than the younger generation.

    To summarize this: The saw pattern in the first figure (zoom 200%) shows full garbage collections, the saw pattern in the second figure (zoom 1 000%) shows minor garbage collections.

    Enlarging the new generation

    How can this help us tune the application? Well, after each minor collection the heap memory consumption is slightly higher than after the minor collection before. Some objects have been promoted to the older generation. But every time a full collection occurs the memory consumption (blue line) is reduced to more or less the same value as after the last full collection. This means that even though some objects were promoted to the older generation, most of them didn't become very old either. Which means, that maybe they should not have been promoted to the older generation in the first place.

    In this particular scenario it seems beneficial to increase the size of the younger generation to prevent the early promotion of medium-lived objects. For Sun JDK 1.4.0/1.3.1 this can be achieved with the unsupported parameter -XX:NewRatio=<value>. This parameter defines the ratio between the young and the old generations. A NewRatio of 2 means, that the young generations use one third of the heap. The default value of this parameter differs depending on your VM, whether you use the Server HotSpot compiler or the Client HotSpot compiler and which platform you are operating on.

    To see what effect a different NewRatio has on our application we start it like this:

    c:\j2sdk1.4.0\bin\java -server -XX:NewRatio=2 -Xloggc:newratio2.gc ...

    The figure below shows the result:

    GC Data with NewRatio=2 - 200%

    It is obvious that fewer full garbage collections occur (4 instead of 9) and that the minor collections remove more garbage (1 Mbyte instead of 0.5 MByte). Because fewer full garbage collections took place the accumulated pause time (Acc Pauses) caused by garbage collections decreased from 4.26s to 2.26s. This means that the application now spends only little more than half as much time on garbage collection than before. This sounds great, but to be honest for this application the overall effect of this improvement isn't terribly large as we didn't have excessive pause times in the first place. But still, the throughput (time the application actually runs and isn't busy with garbage collection) increased from 97.62% to 98.84%. Again - if your real world application has a throughput of over 95% you probably shouldn't bother with tuning garbage collection.

    As a side effect of NewRatio=2 we notice that the overall memory consumption (Footprint) slightly increased from 3.364 KByte to 3.604 KByte. This is due to the larger younger generation.

    Tweaking it even more

    To see whether we can improve the throughput even more, we start another test run with NewRatio=1:

    GC Data with NewRatio=1 - 200%

    As you can see from the graph there are still four full collections - but they occur in the initialization and shutdown phases of the test run. Also, they don't seem to take a very long time. The sum of all pause times decreased from 2.26s to 1.74s and as a direct result the throughput increased to fantastic 99.02%. The downside of this improvement is the increased footprint of 4.756 KByte.

    Tuning real applications

    In our example we used a very simple scenario to demonstrate a single tuning effort. In a real application we would have to make sure that we can repeat test runs in order to compare the values. We also need to keep in mind, that verbose garbage collection has a huge impact on the performance of an application. Therefore you might want to think twice about collecting the data in a production environment. For the sake of repeatability and not hindering the everyday business you should set up a test environment that is identical to your production environment and run load tests that simulate real load situations. You should also measure the performance of your application without verbose garbage collection in order to measure the real effect of any garbage collection tuning.

    Conclusion

    GCViewer helps you interpret verbose GC data from your VM. Especially with Sun JVMs, manipulating generation sizes can have a large impact on the behaviour of your application. GCViewer helps you make the correct decisions and run the right tests. Off course this doesn't free you from the task of not producing too much garbage in the first place...

    Further Reading

  • Tuning Garbage Collection: http://java.sun.com/docs/hotspot/gc/index.html
  • Richard Jones, Rafael Lins: Garbage Collection: Algorithms for Automatic Dynamic Memory Management. New York: Wiley & Sons 1996.

    Author

    Hendrik Schreiber is author of the German book Performant Java Programmieren published by Addison-Wesley and works as senior consultant for innoQ Deutschland GmbH (see also iQgen). In his spare time he contributes to open source projects such as tagtraum industries and others




    Diagnosing a Garbage Collection problem

    This appendix contain examples of garbage collection verbosegc output that indicate an problem with garbage collection. The default collector is being used unless otherwise specified. Unless otherwise specified the verbosegc output is generated with

    -verbosegc -XX:+PrintGCDetails

    The format for the diagnostic output presented here is the format used in 1.4.2. The format is not guaranteed to be the same in later releases. In particular the output associated with -XX:+PrintGCDetails is almost guaranteed to change in that it represents the current needs of JVM developers.


    The minor collection output for these options produce output of the form


    [GC [<collector>: <starting occupancy1> -> <ending occupancy1>, <pause time1> secs] <starting occupancy3> -> <ending occupancy3>, <pause time3> secs]


    where

    <collector> is an internal name for the collector used in the minor collection

    <starting occupancy1> is the occupancy of the young generation before the collection

    <ending occupancy1> is the occupancy of the young generation after the collection

    <pause time1> is the pause time in seconds for the minor collection.

    <starting occupancy3> is the occupancy of the entire heap before the collection

    <ending occupancy3> is the occupancy of the entire heap after the collection

    <pause time3> is the pause time for the entire garbage collection. This would include the time for a major collection is one was done.


    In the discussions that follow snippets of the verbosegc output are presented as examples of the information that can be gathered. When statements are made regarding a quantity such as pause time, the numbers being quoted are for averages over all the data gathered, not just from the snippet presented. If a range of values is given, that range is from the snippet presented.

    Young generation size is too small

    The young generation heap size in this first example is about 4 Mbytes with an overall heap size of about 32 Mbytes.

    [GC [DefNew: 4032K->64K(4032K), 0.0429742 secs] 9350K->7748K(32704K), 0.0431096 secs]

    [GC [DefNew: 4032K->64K(4032K), 0.0403446 secs] 11716K->10121K(32704K), 0.0404867 secs]

    [GC [DefNew: 4032K->64K(4032K), 0.0443969 secs] 14089K->12562K(32704K), 0.0445251 secs]

    This output seems reasonable from the point of view of the time spent in garbage collection but note that although the occupancy of the young generation is decreasing (e.g., in the first line from 4032 to 64k by 3968k) the occupancy of the entire heap is decreasing by considerably less (by 1602k from 9350k to 7748k). This indicates that only about 40% objects in the young generation were garbage and the rest survive the collection and are being promoted into the old generation.


    Increasing the young generation size to increase the free space after the collection


    The young generation heap size in this next run is increased to 8 Mbytes.


    [GC [DefNew: 8128K->64K(8128K), 0.0453670 secs] 13000K->7427K(32704K), 0.0454906 secs]

    [GC [DefNew: 8128K->64K(8128K), 0.0388632 secs] 15491K->9663K(32704K), 0.0390013 secs]

    [GC [DefNew: 8128K->64K(8128K), 0.0388610 secs] 17727K->11829K(32704K), 0.0389919 secs]


    With an 8 Mbyte size most of young generation is garbage at the time of the minor collection. In the first line the young generation heap decreases by 8064k from 8128k to 64k and the entire heap decreases by 5573k from 13000k to 7427k meaning about 68% of the young generation was garbage. As would be expected the size of the young generation does not change the amount of live objects that survive the minor collection (about 2.4M bytes in each case) but there are fewer minor collections and the cost of the collections in terms of the minor collection pause times are comparable (fractions of a second listed at the end of each line).


    Application time and stopped time


    To illustrate the cost of the 4 Mbyte young generation versus the 8 Mbyte young generation, the output below was generated with the additional command line options -XX:+PrintGCApplicationConcurrentTime and -XX:+PrintGCApplicationStoppedTime which measure the amount of time the applications runs between collection pauses and the length of the collection pauses, respectively. In the run with a 4 Mbyte young generation the application runs for .53 to .91 second with minor collection pauses of from .045 to .047 second. That's between about 5% and 8% overhead for minor collections.


    Application time: 0.5291524 seconds

    [GC [DefNew: 3968K->64K(4032K), 0.0460948 secs] 7451K->6186K(32704K), 0.0462350 secs]

    Total time for which application threads were stopped: 0.0468229 seconds

    Application time: 0.5279058 seconds

    [GC [DefNew: 4032K->64K(4032K), 0.0447854 secs] 10154K->8648K(32704K), 0.0449156 secs]

    Total time for which application threads were stopped: 0.0453124 seconds

    Application time: 0.9063706 seconds

    [GC [DefNew: 4032K->64K(4032K), 0.0464574 secs] 12616K->11187K(32704K), 0.0465921 secs]

    Total time for which application threads were stopped: 0.0470484 seconds


    For the 8 Mbyte young generations


    Application time: 1.3874623 seconds

    [GC [DefNew: 8064K->63K(8128K), 0.0509215 secs] 11106K->5994K(32704K), 0.0510972 secs]

    Total time for which application threads were stopped: 0.0517092 seconds

    Application time: 1.5225065 seconds

    [GC [DefNew: 8127K->63K(8128K), 0.0432982 secs] 14058K->8273K(32704K), 0.0434172 secs]

    Total time for which application threads were stopped: 0.0440447 seconds

    Application time: 1.4263524 seconds

    [GC [DefNew: 8127K->64K(8128K), 0.0363538 secs] 16337K->10381K(32704K), 0.0364811 secs]

    Total time for which application threads were stopped: 0.0369103 seconds



    The application is running 1.38 to 1.52 seconds between minor collection pauses of from .036 to .051. The overheard for minor collection pauses is between 2.6% and 3.6%.


    The young generation size is too large

    In going from a young generation heap size of 4 Mbytes to 8 Mbytes the overheard due to minor collections was about cut in half. What happens if the young generation size is doubled to 16 Mbytes?

    [GC [DefNew: 16000K->16000K(16192K), 0.0000574 secs][Tenured: 2973K->2704K(16384K), 0.1012650 secs] 18973K->2704K(32576K), 0.1015066 secs]

    [GC [DefNew: 16000K->16000K(16192K), 0.0000518 secs][Tenured: 2704K->2535K(16384K), 0.0931034 secs] 18704K->2535K(32576K), 0.0933519 secs]

    [GC [DefNew: 16000K->16000K(16192K), 0.0000498 secs][Tenured: 2535K->2319K(16384K), 0.0860925 secs] 18535K->2319K(32576K), 0.0863350 secs]

    This is an example of the relative size of the young generation to the tenured generation being too large to allow the young generation promotion to be guaranteed (the young generation is about half the size of the tenured generation). The young generation cannot be collected successfully and only major collections are occurring. Note that in this cause the young generation is collected but only as part of the more expensive major collection.


    Is the tenured generation too large or too small?

    With the young generation sized at 8 Mbytes and the overall heap at 32 Mbytes the major collection pauses are about .13 second.

    [GC [DefNew: 8128K->8128K(8128K), 0.0000558 secs][Tenured: 17746K->2309K(24576K), 0.1247669 secs] 25874K->2309K(32704K), 0.1250098 secs]

    If the heap size is increased to 64 Mbytes, the major collections pauses increase to about .21 second.


    [GC [DefNew: 8128K->8128K(8128K), 0.0000369 secs][Tenured: 50059K->5338K(57344K), 0.2218912 secs]

    58187K->5338K(65472K), 0.2221317 secs]


    but the major collections are occurring at about half the frequency. To see this print out time stamps at the collections by adding the option -XX:+PrintGCTimeStamps. For the 32 Mbyte heap the major collections are occurring about every 10 -11 seconds (only the major collections are shown).


    111.042: [GC 111.042: [DefNew: 8128K->8128K(8128K), 0.0000505 secs]111.042: [Tenured: 18154K->2311K(24576K), 0.1290354 secs] 26282K->2311K(32704K), 0.1293306 secs]

    122.463: [GC 122.463: [DefNew: 8128K->8128K(8128K), 0.0000560 secs]122.463: [Tenured: 18630K->2366K(24576K), 0.1322560 secs] 26758K->2366K(32704K), 0.1325284 secs]

    133.896: [GC 133.897: [DefNew: 8128K->8128K(8128K), 0.0000443 secs]133.897: [Tenured: 18240K->2573K(24576K), 0.1340199 secs] 26368K->2573K(32704K), 0.1343218 secs]

    144.112: [GC 144.112: [DefNew: 8128K->8128K(8128K), 0.0000544 secs]144.112: [Tenured: 16564K->2304K(24576K), 0.1246831 secs] 24692K->2304K(32704K), 0.1249602 secs]


    For the 64 Mbyte heap the major collections are occurring about every 30 seconds.


    90.597: [GC 90.597: [DefNew: 8128K->8128K(8128K), 0.0000542 secs]90.597: [Tenured: 49841K->5141K(57344K), 0.2129882 secs] 57969K->5141K(65472K), 0.2133274 secs]

    120.899: [GC 120.899: [DefNew: 8128K->8128K(8128K), 0.0000550 secs]120.899: [Tenured: 50384K->2430K(57344K), 0.2216590 secs] 58512K->2430K(65472K), 0.2219384 secs]

    153.968: [GC 153.968: [DefNew: 8128K->8128K(8128K), 0.0000511 secs]153.968: [Tenured: 51164K->2309K(57344K), 0.2193906 secs] 59292K->2309K(65472K), 0.2196372 secs]


    Which heap size, 32 Mbytes or 64 Mbytes, is better? For higher throughput the 64 Mbyte heap is better while the 32 Mbyte heap provides the lesser pause times.


    Even larger young generation sizes


    To pursue higher throughput consider even larger heaps. For a 256 Mbyte total heap with a 64 Mbyte young generation size, when compared to the 32 Mbyte heap with an 8 Mbyte young generation run, the minor collection pauses are only about 10% longer reflecting the fact that the number of objects surviving the collection are about the same. The major collection are about 3.8 times longer reflecting the larger size of the heap.


    [GC [DefNew: 64575K->959K(64576K), 0.0457646 secs] 196016K->133633K(261184K), 0.0459067 secs]

    [GC [DefNew: 64575K->64575K(64576K), 0.0000573 secs][Tenured: 132673K->5437K(196608K), 0.4959855 secs] 197249K->5437K(261184K), 0.4962533 secs]

    [GC [DefNew: 63616K->959K(64576K), 0.0360258 secs] 69053K->7600K(261184K), 0.0361663 secs]


    After tuning the minor collection pauses are too long

    Try the parallel young generation collectors


    If the minor collection pauses are too long, try using the parallel young generation collectors. Adding -XX:+UseParallelGC produces the following output. In this example adaptive sizing was turned off with -XX:-UseAdaptiveSizing for simplicity. Also time stamps were added for comparison with other results.


    500.285: [GC 51526K->2678K(253952K), 0.0120328 secs]

    506.734: [GC 51830K->2646K(253952K), 0.0117832 secs]

    513.068: [GC 51798K->2742K(253952K), 0.0124632 secs]

    519.566: [GC 51894K->2550K(253952K), 0.0122923 secs]


    Here the average minor collection pause is about .015 second which is a reduction of 68%. The adaptive sizing was turned off because of an anomalous behavior that is exhibited in this next set of output with adaptive sizing on.


    [GC 62752K->2992K(259328K), 0.0126936 secs]

    [GC 62896K->60192K(259328K), 0.0127579 secs]

    [GC 63008K->3120K(259328K), 0.0123150 secs]

    [GC 63024K->60256K(259328K), 0.0120565 secs]

    [GC 63072K->3024K(259328K), 0.0127215 secs]

    [GC 62928K->60208K(259328K), 0.0113090 secs]

    [GC 63024K->3136K(259328K), 0.0133799 secs]

    [GC 63040K->60256K(259328K), 0.0135459 secs]


    Adaptive sizing adjusts the size of the survivor spaces. In the alternate minor collections shown above the size of the survivor spaces is oscillating between values that alternatively cause a minimal collection of the young generation with a nearly complete collection of the young generation. Although a strange behavior the use of adaptive sizing may still yield a higher throughput than turning it off.


    Alternatively, using -XX:+UseParNewGC yields


    497.905: [GC 497.905: [ParNew: 64576K->960K(64576K), 0.0255372 secs] 155310K->93003K(261184K), 0.0256767 secs]

    506.305: [GC 506.305: [ParNew: 64576K->960K(64576K), 0.0276291 secs] 156619K->94267K(261184K), 0.0277958 secs]

    514.565: [GC 514.565: [ParNew: 64576K->960K(64576K), 0.0261376 secs] 157883K->95711K(261184K), 0.0262927 secs]

    522.838: [GC 522.838: [ParNew: 64576K->960K(64576K), 0.0316625 secs] 159327K->97331K(261184K), 0.0318099 secs]


    which shows about a 44% decrease in the minor collection pauses. The time stamps were added to show the period of the minor collections. With -XX:+UseParallelGC the minor collections occur about every 6.5 seconds and take .015 second (for a minor collection overhead of .33%). For -XX:+UseParNewGC the collections occur about every 8 seconds and take about .026 second (for a minor collection overhead of .23%).


    After tuning the major collections pause are too long

    Try the concurrent low pause collector

    The major collection pauses for the 256 Mbyte total heap with the 64 Mbyte young generation is about .489 second. If this is too long the concurrent low pause collector can be tried with the command line option -XX:+UseConcMarkSweepGC (and also keeping the -XX:+UseParNewGC)


    [GC [ParNew: 64576K->960K(64576K), 0.0377639 secs] 140122K->78078K(261184K), 0.0379598 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0329313 secs] 141694K->79533K(261184K), 0.0331324 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0413880 secs] 143149K->81128K(261184K), 0.0416101 secs]

    [GC [1 CMS-initial-mark: 80168K(196608K)] 81144K(261184K), 0.0059036 secs]

    [CMS-concurrent-mark: 0.129/0.129 secs]

    [CMS-concurrent-preclean: 0.007/0.007 secs]

    [GC[Rescan (non-parallel) [grey object rescan, 0.0020879 secs][root rescan, 0.0144199 secs], 0.016

    6258 secs][weak refs processing, 0.0000411 secs] [1 CMS-remark: 80168K(196608K)] 82493K(261184K),

    0.0168943 secs]

    [CMS-concurrent-sweep: 1.208/1.208 secs]

    [CMS-concurrent-reset: 0.036/0.036 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0311520 secs] 66308K->4171K(261184K), 0.0313513 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0348341 secs] 67787K->5695K(261184K), 0.0350776 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0359806 secs] 69311K->7154K(261184K), 0.0362064 secs]


    With the application used for these examples the time to do a concurrent collection is short relative to the time between minor collections. This is typically not the case. More often there will be minor collections during the concurrent collection. Note that the concurrent phases may be long (e.g., the CMS-concurrent-sweep is 1.2 seconds) but the application is not stopped during the concurrent phases so the application will not see a pause. Conversely, although the application is not paused during the concurrent phases, neither does it have use of all the available processors (one being used by the garbage collection thread). The pauses will come from the CMS-initial-mark and CMS-remark pause. In this snippet of output the larger of those is .017 second). Over the run the average for the CMS-remark pause (always the longer pause) was .019 second. The maximum pause for a major collection (as compared to the default collector) was then reduced from .489 second to .019 second (96% reduction). Note also that the minor collection pause increased to .035 second which is higher than the .026 second minor pause using only -XX:+UseParNewGC. This is indicative of the overhead associated with the concurrency of the major collection.


    Try the incremental low pause collector


    The major collection times can also be reduced by use of the incremental low pause collector. Use the command line option -Xincgc.


    [GC [DefNew: 64575K->959K(64576K), 0.0616204 secs][Train: 5537K->5474K(196608K), 0.0108803 secs] 68664K->6434K(261184K), 0.0730248 secs]

    [GC [DefNew: 64575K->959K(64576K), 0.0559078 secs][Train: 6813K->6749K(196608K), 0.0115077 secs] 70050K->7709K(261184K), 0.0680289 secs]

    [GC [DefNew: 64575K->959K(64576K), 0.0607615 secs][Train: 8178K->8114K(196608K), 0.0112705 secs] 71325K->9074K(261184K), 0.0727902 secs]


    During each minor collection part of the old generation is collected and there is no separate major collection pause. The minor collection pauses here increase to about .078 second. If the incremental collection of the tenured generation cannot be completed before tenured generation is full, a major collection occurs and is indicated in the verbosegc output by the 'MSC' in the output.


    [GC [DefNew: 64575K->64575K(64576K), 0.0005311 secs][Train MSC: 68226K->135K(196608K), 0.4347857 secs] 132802K->2565K(261184K), 0.4397831 secs]

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    <starting occupancy1> is the occupancy of the young generation before the collection

    <ending occupancy1> is the occupancy of the young generation after the collection

    <pause time1> is the pause time in seconds for the minor collection.

    <starting occupancy3> is the occupancy of the entire heap before the collection

    <ending occupancy3> is the occupancy of the entire heap after the collection

    <pause time3> is the pause time for the entire garbage collection. This would include the time for a major collection is one was done.


    In the discussions that follow snippets of the verbosegc output are presented as examples of the information that can be gathered. When statements are made regarding a quantity such as pause time, the numbers being quoted are for averages over all the data gathered, not just from the snippet presented. If a range of values is given, that range is from the snippet presented.

    Young generation size is too small

    The young generation heap size in this first example is about 4 Mbytes with an overall heap size of about 32 Mbytes.

    [GC [DefNew: 4032K->64K(4032K), 0.0429742 secs] 9350K->7748K(32704K), 0.0431096 secs]

    [GC [DefNew: 4032K->64K(4032K), 0.0403446 secs] 11716K->10121K(32704K), 0.0404867 secs]

    [GC [DefNew: 4032K->64K(4032K), 0.0443969 secs] 14089K->12562K(32704K), 0.0445251 secs]

    This output seems reasonable from the point of view of the time spent in garbage collection but note that although the occupancy of the young generation is decreasing (e.g., in the first line from 4032 to 64k by 3968k) the occupancy of the entire heap is decreasing by considerably less (by 1602k from 9350k to 7748k). This indicates that only about 40% objects in the young generation were garbage and the rest survive the collection and are being promoted into the old generation.


    Increasing the young generation size to increase the free space after the collection


    The young generation heap size in this next run is increased to 8 Mbytes.


    [GC [DefNew: 8128K->64K(8128K), 0.0453670 secs] 13000K->7427K(32704K), 0.0454906 secs]

    [GC [DefNew: 8128K->64K(8128K), 0.0388632 secs] 15491K->9663K(32704K), 0.0390013 secs]

    [GC [DefNew: 8128K->64K(8128K), 0.0388610 secs] 17727K->11829K(32704K), 0.0389919 secs]


    With an 8 Mbyte size most of young generation is garbage at the time of the minor collection. In the first line the young generation heap decreases by 8064k from 8128k to 64k and the entire heap decreases by 5573k from 13000k to 7427k meaning about 68% of the young generation was garbage. As would be expected the size of the young generation does not change the amount of live objects that survive the minor collection (about 2.4M bytes in each case) but there are fewer minor collections and the cost of the collections in terms of the minor collection pause times are comparable (fractions of a second listed at the end of each line).


    Application time and stopped time


    To illustrate the cost of the 4 Mbyte young generation versus the 8 Mbyte young generation, the output below was generated with the additional command line options -XX:+PrintGCApplicationConcurrentTime and -XX:+PrintGCApplicationStoppedTime which measure the amount of time the applications runs between collection pauses and the length of the collection pauses, respectively. In the run with a 4 Mbyte young generation the application runs for .53 to .91 second with minor collection pauses of from .045 to .047 second. That's between about 5% and 8% overhead for minor collections.


    Application time: 0.5291524 seconds

    [GC [DefNew: 3968K->64K(4032K), 0.0460948 secs] 7451K->6186K(32704K), 0.0462350 secs]

    Total time for which application threads were stopped: 0.0468229 seconds

    Application time: 0.5279058 seconds

    [GC [DefNew: 4032K->64K(4032K), 0.0447854 secs] 10154K->8648K(32704K), 0.0449156 secs]

    Total time for which application threads were stopped: 0.0453124 seconds

    Application time: 0.9063706 seconds

    [GC [DefNew: 4032K->64K(4032K), 0.0464574 secs] 12616K->11187K(32704K), 0.0465921 secs]

    Total time for which application threads were stopped: 0.0470484 seconds


    For the 8 Mbyte young generations


    Application time: 1.3874623 seconds

    [GC [DefNew: 8064K->63K(8128K), 0.0509215 secs] 11106K->5994K(32704K), 0.0510972 secs]

    Total time for which application threads were stopped: 0.0517092 seconds

    Application time: 1.5225065 seconds

    [GC [DefNew: 8127K->63K(8128K), 0.0432982 secs] 14058K->8273K(32704K), 0.0434172 secs]

    Total time for which application threads were stopped: 0.0440447 seconds

    Application time: 1.4263524 seconds

    [GC [DefNew: 8127K->64K(8128K), 0.0363538 secs] 16337K->10381K(32704K), 0.0364811 secs]

    Total time for which application threads were stopped: 0.0369103 seconds



    The application is running 1.38 to 1.52 seconds between minor collection pauses of from .036 to .051. The overheard for minor collection pauses is between 2.6% and 3.6%.


    The young generation size is too large

    In going from a young generation heap size of 4 Mbytes to 8 Mbytes the overheard due to minor collections was about cut in half. What happens if the young generation size is doubled to 16 Mbytes?

    [GC [DefNew: 16000K->16000K(16192K), 0.0000574 secs][Tenured: 2973K->2704K(16384K), 0.1012650 secs] 18973K->2704K(32576K), 0.1015066 secs]

    [GC [DefNew: 16000K->16000K(16192K), 0.0000518 secs][Tenured: 2704K->2535K(16384K), 0.0931034 secs] 18704K->2535K(32576K), 0.0933519 secs]

    [GC [DefNew: 16000K->16000K(16192K), 0.0000498 secs][Tenured: 2535K->2319K(16384K), 0.0860925 secs] 18535K->2319K(32576K), 0.0863350 secs]

    This is an example of the relative size of the young generation to the tenured generation being too large to allow the young generation promotion to be guaranteed (the young generation is about half the size of the tenured generation). The young generation cannot be collected successfully and only major collections are occurring. Note that in this cause the young generation is collected but only as part of the more expensive major collection.


    Is the tenured generation too large or too small?

    With the young generation sized at 8 Mbytes and the overall heap at 32 Mbytes the major collection pauses are about .13 second.

    [GC [DefNew: 8128K->8128K(8128K), 0.0000558 secs][Tenured: 17746K->2309K(24576K), 0.1247669 secs] 25874K->2309K(32704K), 0.1250098 secs]

    If the heap size is increased to 64 Mbytes, the major collections pauses increase to about .21 second.


    [GC [DefNew: 8128K->8128K(8128K), 0.0000369 secs][Tenured: 50059K->5338K(57344K), 0.2218912 secs]

    58187K->5338K(65472K), 0.2221317 secs]


    but the major collections are occurring at about half the frequency. To see this print out time stamps at the collections by adding the option -XX:+PrintGCTimeStamps. For the 32 Mbyte heap the major collections are occurring about every 10 -11 seconds (only the major collections are shown).


    111.042: [GC 111.042: [DefNew: 8128K->8128K(8128K), 0.0000505 secs]111.042: [Tenured: 18154K->2311K(24576K), 0.1290354 secs] 26282K->2311K(32704K), 0.1293306 secs]

    122.463: [GC 122.463: [DefNew: 8128K->8128K(8128K), 0.0000560 secs]122.463: [Tenured: 18630K->2366K(24576K), 0.1322560 secs] 26758K->2366K(32704K), 0.1325284 secs]

    133.896: [GC 133.897: [DefNew: 8128K->8128K(8128K), 0.0000443 secs]133.897: [Tenured: 18240K->2573K(24576K), 0.1340199 secs] 26368K->2573K(32704K), 0.1343218 secs]

    144.112: [GC 144.112: [DefNew: 8128K->8128K(8128K), 0.0000544 secs]144.112: [Tenured: 16564K->2304K(24576K), 0.1246831 secs] 24692K->2304K(32704K), 0.1249602 secs]


    For the 64 Mbyte heap the major collections are occurring about every 30 seconds.


    90.597: [GC 90.597: [DefNew: 8128K->8128K(8128K), 0.0000542 secs]90.597: [Tenured: 49841K->5141K(57344K), 0.2129882 secs] 57969K->5141K(65472K), 0.2133274 secs]

    120.899: [GC 120.899: [DefNew: 8128K->8128K(8128K), 0.0000550 secs]120.899: [Tenured: 50384K->2430K(57344K), 0.2216590 secs] 58512K->2430K(65472K), 0.2219384 secs]

    153.968: [GC 153.968: [DefNew: 8128K->8128K(8128K), 0.0000511 secs]153.968: [Tenured: 51164K->2309K(57344K), 0.2193906 secs] 59292K->2309K(65472K), 0.2196372 secs]


    Which heap size, 32 Mbytes or 64 Mbytes, is better? For higher throughput the 64 Mbyte heap is better while the 32 Mbyte heap provides the lesser pause times.


    Even larger young generation sizes


    To pursue higher throughput consider even larger heaps. For a 256 Mbyte total heap with a 64 Mbyte young generation size, when compared to the 32 Mbyte heap with an 8 Mbyte young generation run, the minor collection pauses are only about 10% longer reflecting the fact that the number of objects surviving the collection are about the same. The major collection are about 3.8 times longer reflecting the larger size of the heap.


    [GC [DefNew: 64575K->959K(64576K), 0.0457646 secs] 196016K->133633K(261184K), 0.0459067 secs]

    [GC [DefNew: 64575K->64575K(64576K), 0.0000573 secs][Tenured: 132673K->5437K(196608K), 0.4959855 secs] 197249K->5437K(261184K), 0.4962533 secs]

    [GC [DefNew: 63616K->959K(64576K), 0.0360258 secs] 69053K->7600K(261184K), 0.0361663 secs]


    After tuning the minor collection pauses are too long

    Try the parallel young generation collectors


    If the minor collection pauses are too long, try using the parallel young generation collectors. Adding -XX:+UseParallelGC produces the following output. In this example adaptive sizing was turned off with -XX:-UseAdaptiveSizing for simplicity. Also time stamps were added for comparison with other results.


    500.285: [GC 51526K->2678K(253952K), 0.0120328 secs]

    506.734: [GC 51830K->2646K(253952K), 0.0117832 secs]

    513.068: [GC 51798K->2742K(253952K), 0.0124632 secs]

    519.566: [GC 51894K->2550K(253952K), 0.0122923 secs]


    Here the average minor collection pause is about .015 second which is a reduction of 68%. The adaptive sizing was turned off because of an anomalous behavior that is exhibited in this next set of output with adaptive sizing on.


    [GC 62752K->2992K(259328K), 0.0126936 secs]

    [GC 62896K->60192K(259328K), 0.0127579 secs]

    [GC 63008K->3120K(259328K), 0.0123150 secs]

    [GC 63024K->60256K(259328K), 0.0120565 secs]

    [GC 63072K->3024K(259328K), 0.0127215 secs]

    [GC 62928K->60208K(259328K), 0.0113090 secs]

    [GC 63024K->3136K(259328K), 0.0133799 secs]

    [GC 63040K->60256K(259328K), 0.0135459 secs]


    Adaptive sizing adjusts the size of the survivor spaces. In the alternate minor collections shown above the size of the survivor spaces is oscillating between values that alternatively cause a minimal collection of the young generation with a nearly complete collection of the young generation. Although a strange behavior the use of adaptive sizing may still yield a higher throughput than turning it off.


    Alternatively, using -XX:+UseParNewGC yields


    497.905: [GC 497.905: [ParNew: 64576K->960K(64576K), 0.0255372 secs] 155310K->93003K(261184K), 0.0256767 secs]

    506.305: [GC 506.305: [ParNew: 64576K->960K(64576K), 0.0276291 secs] 156619K->94267K(261184K), 0.0277958 secs]

    514.565: [GC 514.565: [ParNew: 64576K->960K(64576K), 0.0261376 secs] 157883K->95711K(261184K), 0.0262927 secs]

    522.838: [GC 522.838: [ParNew: 64576K->960K(64576K), 0.0316625 secs] 159327K->97331K(261184K), 0.0318099 secs]


    which shows about a 44% decrease in the minor collection pauses. The time stamps were added to show the period of the minor collections. With -XX:+UseParallelGC the minor collections occur about every 6.5 seconds and take .015 second (for a minor collection overhead of .33%). For -XX:+UseParNewGC the collections occur about every 8 seconds and take about .026 second (for a minor collection overhead of .23%).


    After tuning the major collections pause are too long

    Try the concurrent low pause collector

    The major collection pauses for the 256 Mbyte total heap with the 64 Mbyte young generation is about .489 second. If this is too long the concurrent low pause collector can be tried with the command line option -XX:+UseConcMarkSweepGC (and also keeping the -XX:+UseParNewGC)


    [GC [ParNew: 64576K->960K(64576K), 0.0377639 secs] 140122K->78078K(261184K), 0.0379598 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0329313 secs] 141694K->79533K(261184K), 0.0331324 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0413880 secs] 143149K->81128K(261184K), 0.0416101 secs]

    [GC [1 CMS-initial-mark: 80168K(196608K)] 81144K(261184K), 0.0059036 secs]

    [CMS-concurrent-mark: 0.129/0.129 secs]

    [CMS-concurrent-preclean: 0.007/0.007 secs]

    [GC[Rescan (non-parallel) [grey object rescan, 0.0020879 secs][root rescan, 0.0144199 secs], 0.016

    6258 secs][weak refs processing, 0.0000411 secs] [1 CMS-remark: 80168K(196608K)] 82493K(261184K),

    0.0168943 secs]

    [CMS-concurrent-sweep: 1.208/1.208 secs]

    [CMS-concurrent-reset: 0.036/0.036 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0311520 secs] 66308K->4171K(261184K), 0.0313513 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0348341 secs] 67787K->5695K(261184K), 0.0350776 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0359806 secs] 69311K->7154K(261184K), 0.0362064 secs]


    With the application used for these examples the time to do a concurrent collection is short relative to the time between minor collections. This is typically not the case. More often there will be minor collections during the concurrent collection. Note that the concurrent phases may be long (e.g., the CMS-concurrent-sweep is 1.2 seconds) but the application is not stopped during the concurrent phases so the application will not see a pause. Conversely, although the application is not paused during the concurrent phases, neither does it have use of all the available processors (one being used by the garbage collection thread). The pauses will come from the CMS-initial-mark and CMS-remark pause. In this snippet of output the larger of those is .017 second). Over the run the average for the CMS-remark pause (always the longer pause) was .019 second. The maximum pause for a major collection (as compared to the default collector) was then reduced from .489 second to .019 second (96% reduction). Note also that the minor collection pause increased to .035 second which is higher than the .026 second minor pause using only -XX:+UseParNewGC. This is indicative of the overhead associated with the concurrency of the major collection.


    Try the incremental low pause collector


    The major collection times can also be reduced by use of the incremental low pause collector. Use the command line option -Xincgc.


    [GC [DefNew: 64575K->959K(64576K), 0.0616204 secs][Train: 5537K->5474K(196608K), 0.0108803 secs] 68664K->6434K(261184K), 0.0730248 secs]

    [GC [DefNew: 64575K->959K(64576K), 0.0559078 secs][Train: 6813K->6749K(196608K), 0.0115077 secs] 70050K->7709K(261184K), 0.0680289 secs]

    [GC [DefNew: 64575K->959K(64576K), 0.0607615 secs][Train: 8178K->8114K(196608K), 0.0112705 secs] 71325K->9074K(261184K), 0.0727902 secs]


    During each minor collection part of the old generation is collected and there is no separate major collection pause. The minor collection pauses here increase to about .078 second. If the incremental collection of the tenured generation cannot be completed before tenured generation is full, a major collection occurs and is indicated in the verbosegc output by the 'MSC' in the output.


    [GC [DefNew: 64575K->64575K(64576K), 0.0005311 secs][Train MSC: 68226K->135K(196608K), 0.4347857 secs] 132802K->2565K(261184K), 0.4397831 secs]

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    <starting occupancy1> is the occupancy of the young generation before the collection

    <ending occupancy1> is the occupancy of the young generation after the collection

    <pause time1> is the pause time in seconds for the minor collection.

    <starting occupancy3> is the occupancy of the entire heap before the collection

    <ending occupancy3> is the occupancy of the entire heap after the collection

    <pause time3> is the pause time for the entire garbage collection. This would include the time for a major collection is one was done.


    In the discussions that follow snippets of the verbosegc output are presented as examples of the information that can be gathered. When statements are made regarding a quantity such as pause time, the numbers being quoted are for averages over all the data gathered, not just from the snippet presented. If a range of values is given, that range is from the snippet presented.

    Young generation size is too small

    The young generation heap size in this first example is about 4 Mbytes with an overall heap size of about 32 Mbytes.

    [GC [DefNew: 4032K->64K(4032K), 0.0429742 secs] 9350K->7748K(32704K), 0.0431096 secs]

    [GC [DefNew: 4032K->64K(4032K), 0.0403446 secs] 11716K->10121K(32704K), 0.0404867 secs]

    [GC [DefNew: 4032K->64K(4032K), 0.0443969 secs] 14089K->12562K(32704K), 0.0445251 secs]

    This output seems reasonable from the point of view of the time spent in garbage collection but note that although the occupancy of the young generation is decreasing (e.g., in the first line from 4032 to 64k by 3968k) the occupancy of the entire heap is decreasing by considerably less (by 1602k from 9350k to 7748k). This indicates that only about 40% objects in the young generation were garbage and the rest survive the collection and are being promoted into the old generation.


    Increasing the young generation size to increase the free space after the collection


    The young generation heap size in this next run is increased to 8 Mbytes.


    [GC [DefNew: 8128K->64K(8128K), 0.0453670 secs] 13000K->7427K(32704K), 0.0454906 secs]

    [GC [DefNew: 8128K->64K(8128K), 0.0388632 secs] 15491K->9663K(32704K), 0.0390013 secs]

    [GC [DefNew: 8128K->64K(8128K), 0.0388610 secs] 17727K->11829K(32704K), 0.0389919 secs]


    With an 8 Mbyte size most of young generation is garbage at the time of the minor collection. In the first line the young generation heap decreases by 8064k from 8128k to 64k and the entire heap decreases by 5573k from 13000k to 7427k meaning about 68% of the young generation was garbage. As would be expected the size of the young generation does not change the amount of live objects that survive the minor collection (about 2.4M bytes in each case) but there are fewer minor collections and the cost of the collections in terms of the minor collection pause times are comparable (fractions of a second listed at the end of each line).


    Application time and stopped time


    To illustrate the cost of the 4 Mbyte young generation versus the 8 Mbyte young generation, the output below was generated with the additional command line options -XX:+PrintGCApplicationConcurrentTime and -XX:+PrintGCApplicationStoppedTime which measure the amount of time the applications runs between collection pauses and the length of the collection pauses, respectively. In the run with a 4 Mbyte young generation the application runs for .53 to .91 second with minor collection pauses of from .045 to .047 second. That's between about 5% and 8% overhead for minor collections.


    Application time: 0.5291524 seconds

    [GC [DefNew: 3968K->64K(4032K), 0.0460948 secs] 7451K->6186K(32704K), 0.0462350 secs]

    Total time for which application threads were stopped: 0.0468229 seconds

    Application time: 0.5279058 seconds

    [GC [DefNew: 4032K->64K(4032K), 0.0447854 secs] 10154K->8648K(32704K), 0.0449156 secs]

    Total time for which application threads were stopped: 0.0453124 seconds

    Application time: 0.9063706 seconds

    [GC [DefNew: 4032K->64K(4032K), 0.0464574 secs] 12616K->11187K(32704K), 0.0465921 secs]

    Total time for which application threads were stopped: 0.0470484 seconds


    For the 8 Mbyte young generations


    Application time: 1.3874623 seconds

    [GC [DefNew: 8064K->63K(8128K), 0.0509215 secs] 11106K->5994K(32704K), 0.0510972 secs]

    Total time for which application threads were stopped: 0.0517092 seconds

    Application time: 1.5225065 seconds

    [GC [DefNew: 8127K->63K(8128K), 0.0432982 secs] 14058K->8273K(32704K), 0.0434172 secs]

    Total time for which application threads were stopped: 0.0440447 seconds

    Application time: 1.4263524 seconds

    [GC [DefNew: 8127K->64K(8128K), 0.0363538 secs] 16337K->10381K(32704K), 0.0364811 secs]

    Total time for which application threads were stopped: 0.0369103 seconds



    The application is running 1.38 to 1.52 seconds between minor collection pauses of from .036 to .051. The overheard for minor collection pauses is between 2.6% and 3.6%.


    The young generation size is too large

    In going from a young generation heap size of 4 Mbytes to 8 Mbytes the overheard due to minor collections was about cut in half. What happens if the young generation size is doubled to 16 Mbytes?

    [GC [DefNew: 16000K->16000K(16192K), 0.0000574 secs][Tenured: 2973K->2704K(16384K), 0.1012650 secs] 18973K->2704K(32576K), 0.1015066 secs]

    [GC [DefNew: 16000K->16000K(16192K), 0.0000518 secs][Tenured: 2704K->2535K(16384K), 0.0931034 secs] 18704K->2535K(32576K), 0.0933519 secs]

    [GC [DefNew: 16000K->16000K(16192K), 0.0000498 secs][Tenured: 2535K->2319K(16384K), 0.0860925 secs] 18535K->2319K(32576K), 0.0863350 secs]

    This is an example of the relative size of the young generation to the tenured generation being too large to allow the young generation promotion to be guaranteed (the young generation is about half the size of the tenured generation). The young generation cannot be collected successfully and only major collections are occurring. Note that in this cause the young generation is collected but only as part of the more expensive major collection.


    Is the tenured generation too large or too small?

    With the young generation sized at 8 Mbytes and the overall heap at 32 Mbytes the major collection pauses are about .13 second.

    [GC [DefNew: 8128K->8128K(8128K), 0.0000558 secs][Tenured: 17746K->2309K(24576K), 0.1247669 secs] 25874K->2309K(32704K), 0.1250098 secs]

    If the heap size is increased to 64 Mbytes, the major collections pauses increase to about .21 second.


    [GC [DefNew: 8128K->8128K(8128K), 0.0000369 secs][Tenured: 50059K->5338K(57344K), 0.2218912 secs]

    58187K->5338K(65472K), 0.2221317 secs]


    but the major collections are occurring at about half the frequency. To see this print out time stamps at the collections by adding the option -XX:+PrintGCTimeStamps. For the 32 Mbyte heap the major collections are occurring about every 10 -11 seconds (only the major collections are shown).


    111.042: [GC 111.042: [DefNew: 8128K->8128K(8128K), 0.0000505 secs]111.042: [Tenured: 18154K->2311K(24576K), 0.1290354 secs] 26282K->2311K(32704K), 0.1293306 secs]

    122.463: [GC 122.463: [DefNew: 8128K->8128K(8128K), 0.0000560 secs]122.463: [Tenured: 18630K->2366K(24576K), 0.1322560 secs] 26758K->2366K(32704K), 0.1325284 secs]

    133.896: [GC 133.897: [DefNew: 8128K->8128K(8128K), 0.0000443 secs]133.897: [Tenured: 18240K->2573K(24576K), 0.1340199 secs] 26368K->2573K(32704K), 0.1343218 secs]

    144.112: [GC 144.112: [DefNew: 8128K->8128K(8128K), 0.0000544 secs]144.112: [Tenured: 16564K->2304K(24576K), 0.1246831 secs] 24692K->2304K(32704K), 0.1249602 secs]


    For the 64 Mbyte heap the major collections are occurring about every 30 seconds.


    90.597: [GC 90.597: [DefNew: 8128K->8128K(8128K), 0.0000542 secs]90.597: [Tenured: 49841K->5141K(57344K), 0.2129882 secs] 57969K->5141K(65472K), 0.2133274 secs]

    120.899: [GC 120.899: [DefNew: 8128K->8128K(8128K), 0.0000550 secs]120.899: [Tenured: 50384K->2430K(57344K), 0.2216590 secs] 58512K->2430K(65472K), 0.2219384 secs]

    153.968: [GC 153.968: [DefNew: 8128K->8128K(8128K), 0.0000511 secs]153.968: [Tenured: 51164K->2309K(57344K), 0.2193906 secs] 59292K->2309K(65472K), 0.2196372 secs]


    Which heap size, 32 Mbytes or 64 Mbytes, is better? For higher throughput the 64 Mbyte heap is better while the 32 Mbyte heap provides the lesser pause times.


    Even larger young generation sizes


    To pursue higher throughput consider even larger heaps. For a 256 Mbyte total heap with a 64 Mbyte young generation size, when compared to the 32 Mbyte heap with an 8 Mbyte young generation run, the minor collection pauses are only about 10% longer reflecting the fact that the number of objects surviving the collection are about the same. The major collection are about 3.8 times longer reflecting the larger size of the heap.


    [GC [DefNew: 64575K->959K(64576K), 0.0457646 secs] 196016K->133633K(261184K), 0.0459067 secs]

    [GC [DefNew: 64575K->64575K(64576K), 0.0000573 secs][Tenured: 132673K->5437K(196608K), 0.4959855 secs] 197249K->5437K(261184K), 0.4962533 secs]

    [GC [DefNew: 63616K->959K(64576K), 0.0360258 secs] 69053K->7600K(261184K), 0.0361663 secs]


    After tuning the minor collection pauses are too long

    Try the parallel young generation collectors


    If the minor collection pauses are too long, try using the parallel young generation collectors. Adding -XX:+UseParallelGC produces the following output. In this example adaptive sizing was turned off with -XX:-UseAdaptiveSizing for simplicity. Also time stamps were added for comparison with other results.


    500.285: [GC 51526K->2678K(253952K), 0.0120328 secs]

    506.734: [GC 51830K->2646K(253952K), 0.0117832 secs]

    513.068: [GC 51798K->2742K(253952K), 0.0124632 secs]

    519.566: [GC 51894K->2550K(253952K), 0.0122923 secs]


    Here the average minor collection pause is about .015 second which is a reduction of 68%. The adaptive sizing was turned off because of an anomalous behavior that is exhibited in this next set of output with adaptive sizing on.


    [GC 62752K->2992K(259328K), 0.0126936 secs]

    [GC 62896K->60192K(259328K), 0.0127579 secs]

    [GC 63008K->3120K(259328K), 0.0123150 secs]

    [GC 63024K->60256K(259328K), 0.0120565 secs]

    [GC 63072K->3024K(259328K), 0.0127215 secs]

    [GC 62928K->60208K(259328K), 0.0113090 secs]

    [GC 63024K->3136K(259328K), 0.0133799 secs]

    [GC 63040K->60256K(259328K), 0.0135459 secs]


    Adaptive sizing adjusts the size of the survivor spaces. In the alternate minor collections shown above the size of the survivor spaces is oscillating between values that alternatively cause a minimal collection of the young generation with a nearly complete collection of the young generation. Although a strange behavior the use of adaptive sizing may still yield a higher throughput than turning it off.


    Alternatively, using -XX:+UseParNewGC yields


    497.905: [GC 497.905: [ParNew: 64576K->960K(64576K), 0.0255372 secs] 155310K->93003K(261184K), 0.0256767 secs]

    506.305: [GC 506.305: [ParNew: 64576K->960K(64576K), 0.0276291 secs] 156619K->94267K(261184K), 0.0277958 secs]

    514.565: [GC 514.565: [ParNew: 64576K->960K(64576K), 0.0261376 secs] 157883K->95711K(261184K), 0.0262927 secs]

    522.838: [GC 522.838: [ParNew: 64576K->960K(64576K), 0.0316625 secs] 159327K->97331K(261184K), 0.0318099 secs]


    which shows about a 44% decrease in the minor collection pauses. The time stamps were added to show the period of the minor collections. With -XX:+UseParallelGC the minor collections occur about every 6.5 seconds and take .015 second (for a minor collection overhead of .33%). For -XX:+UseParNewGC the collections occur about every 8 seconds and take about .026 second (for a minor collection overhead of .23%).


    After tuning the major collections pause are too long

    Try the concurrent low pause collector

    The major collection pauses for the 256 Mbyte total heap with the 64 Mbyte young generation is about .489 second. If this is too long the concurrent low pause collector can be tried with the command line option -XX:+UseConcMarkSweepGC (and also keeping the -XX:+UseParNewGC)


    [GC [ParNew: 64576K->960K(64576K), 0.0377639 secs] 140122K->78078K(261184K), 0.0379598 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0329313 secs] 141694K->79533K(261184K), 0.0331324 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0413880 secs] 143149K->81128K(261184K), 0.0416101 secs]

    [GC [1 CMS-initial-mark: 80168K(196608K)] 81144K(261184K), 0.0059036 secs]

    [CMS-concurrent-mark: 0.129/0.129 secs]

    [CMS-concurrent-preclean: 0.007/0.007 secs]

    [GC[Rescan (non-parallel) [grey object rescan, 0.0020879 secs][root rescan, 0.0144199 secs], 0.016

    6258 secs][weak refs processing, 0.0000411 secs] [1 CMS-remark: 80168K(196608K)] 82493K(261184K),

    0.0168943 secs]

    [CMS-concurrent-sweep: 1.208/1.208 secs]

    [CMS-concurrent-reset: 0.036/0.036 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0311520 secs] 66308K->4171K(261184K), 0.0313513 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0348341 secs] 67787K->5695K(261184K), 0.0350776 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0359806 secs] 69311K->7154K(261184K), 0.0362064 secs]


    With the application used for these examples the time to do a concurrent collection is short relative to the time between minor collections. This is typically not the case. More often there will be minor collections during the concurrent collection. Note that the concurrent phases may be long (e.g., the CMS-concurrent-sweep is 1.2 seconds) but the application is not stopped during the concurrent phases so the application will not see a pause. Conversely, although the application is not paused during the concurrent phases, neither does it have use of all the available processors (one being used by the garbage collection thread). The pauses will come from the CMS-initial-mark and CMS-remark pause. In this snippet of output the larger of those is .017 second). Over the run the average for the CMS-remark pause (always the longer pause) was .019 second. The maximum pause for a major collection (as compared to the default collector) was then reduced from .489 second to .019 second (96% reduction). Note also that the minor collection pause increased to .035 second which is higher than the .026 second minor pause using only -XX:+UseParNewGC. This is indicative of the overhead associated with the concurrency of the major collection.


    Try the incremental low pause collector


    The major collection times can also be reduced by use of the incremental low pause collector. Use the command line option -Xincgc.


    [GC [DefNew: 64575K->959K(64576K), 0.0616204 secs][Train: 5537K->5474K(196608K), 0.0108803 secs] 68664K->6434K(261184K), 0.0730248 secs]

    [GC [DefNew: 64575K->959K(64576K), 0.0559078 secs][Train: 6813K->6749K(196608K), 0.0115077 secs] 70050K->7709K(261184K), 0.0680289 secs]

    [GC [DefNew: 64575K->959K(64576K), 0.0607615 secs][Train: 8178K->8114K(196608K), 0.0112705 secs] 71325K->9074K(261184K), 0.0727902 secs]


    During each minor collection part of the old generation is collected and there is no separate major collection pause. The minor collection pauses here increase to about .078 second. If the incremental collection of the tenured generation cannot be completed before tenured generation is full, a major collection occurs and is indicated in the verbosegc output by the 'MSC' in the output.


    [GC [DefNew: 64575K->64575K(64576K), 0.0005311 secs][Train MSC: 68226K->135K(196608K), 0.4347857 secs] 132802K->2565K(261184K), 0.4397831 secs]

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    <starting occupancy1> is the occupancy of the young generation before the collection

    <ending occupancy1> is the occupancy of the young generation after the collection

    <pause time1> is the pause time in seconds for the minor collection.

    <starting occupancy3> is the occupancy of the entire heap before the collection

    <ending occupancy3> is the occupancy of the entire heap after the collection

    <pause time3> is the pause time for the entire garbage collection. This would include the time for a major collection is one was done.


    In the discussions that follow snippets of the verbosegc output are presented as examples of the information that can be gathered. When statements are made regarding a quantity such as pause time, the numbers being quoted are for averages over all the data gathered, not just from the snippet presented. If a range of values is given, that range is from the snippet presented.

    Young generation size is too small

    The young generation heap size in this first example is about 4 Mbytes with an overall heap size of about 32 Mbytes.

    [GC [DefNew: 4032K->64K(4032K), 0.0429742 secs] 9350K->7748K(32704K), 0.0431096 secs]

    [GC [DefNew: 4032K->64K(4032K), 0.0403446 secs] 11716K->10121K(32704K), 0.0404867 secs]

    [GC [DefNew: 4032K->64K(4032K), 0.0443969 secs] 14089K->12562K(32704K), 0.0445251 secs]

    This output seems reasonable from the point of view of the time spent in garbage collection but note that although the occupancy of the young generation is decreasing (e.g., in the first line from 4032 to 64k by 3968k) the occupancy of the entire heap is decreasing by considerably less (by 1602k from 9350k to 7748k). This indicates that only about 40% objects in the young generation were garbage and the rest survive the collection and are being promoted into the old generation.


    Increasing the young generation size to increase the free space after the collection


    The young generation heap size in this next run is increased to 8 Mbytes.


    [GC [DefNew: 8128K->64K(8128K), 0.0453670 secs] 13000K->7427K(32704K), 0.0454906 secs]

    [GC [DefNew: 8128K->64K(8128K), 0.0388632 secs] 15491K->9663K(32704K), 0.0390013 secs]

    [GC [DefNew: 8128K->64K(8128K), 0.0388610 secs] 17727K->11829K(32704K), 0.0389919 secs]


    With an 8 Mbyte size most of young generation is garbage at the time of the minor collection. In the first line the young generation heap decreases by 8064k from 8128k to 64k and the entire heap decreases by 5573k from 13000k to 7427k meaning about 68% of the young generation was garbage. As would be expected the size of the young generation does not change the amount of live objects that survive the minor collection (about 2.4M bytes in each case) but there are fewer minor collections and the cost of the collections in terms of the minor collection pause times are comparable (fractions of a second listed at the end of each line).


    Application time and stopped time


    To illustrate the cost of the 4 Mbyte young generation versus the 8 Mbyte young generation, the output below was generated with the additional command line options -XX:+PrintGCApplicationConcurrentTime and -XX:+PrintGCApplicationStoppedTime which measure the amount of time the applications runs between collection pauses and the length of the collection pauses, respectively. In the run with a 4 Mbyte young generation the application runs for .53 to .91 second with minor collection pauses of from .045 to .047 second. That's between about 5% and 8% overhead for minor collections.


    Application time: 0.5291524 seconds

    [GC [DefNew: 3968K->64K(4032K), 0.0460948 secs] 7451K->6186K(32704K), 0.0462350 secs]

    Total time for which application threads were stopped: 0.0468229 seconds

    Application time: 0.5279058 seconds

    [GC [DefNew: 4032K->64K(4032K), 0.0447854 secs] 10154K->8648K(32704K), 0.0449156 secs]

    Total time for which application threads were stopped: 0.0453124 seconds

    Application time: 0.9063706 seconds

    [GC [DefNew: 4032K->64K(4032K), 0.0464574 secs] 12616K->11187K(32704K), 0.0465921 secs]

    Total time for which application threads were stopped: 0.0470484 seconds


    For the 8 Mbyte young generations


    Application time: 1.3874623 seconds

    [GC [DefNew: 8064K->63K(8128K), 0.0509215 secs] 11106K->5994K(32704K), 0.0510972 secs]

    Total time for which application threads were stopped: 0.0517092 seconds

    Application time: 1.5225065 seconds

    [GC [DefNew: 8127K->63K(8128K), 0.0432982 secs] 14058K->8273K(32704K), 0.0434172 secs]

    Total time for which application threads were stopped: 0.0440447 seconds

    Application time: 1.4263524 seconds

    [GC [DefNew: 8127K->64K(8128K), 0.0363538 secs] 16337K->10381K(32704K), 0.0364811 secs]

    Total time for which application threads were stopped: 0.0369103 seconds



    The application is running 1.38 to 1.52 seconds between minor collection pauses of from .036 to .051. The overheard for minor collection pauses is between 2.6% and 3.6%.


    The young generation size is too large

    In going from a young generation heap size of 4 Mbytes to 8 Mbytes the overheard due to minor collections was about cut in half. What happens if the young generation size is doubled to 16 Mbytes?

    [GC [DefNew: 16000K->16000K(16192K), 0.0000574 secs][Tenured: 2973K->2704K(16384K), 0.1012650 secs] 18973K->2704K(32576K), 0.1015066 secs]

    [GC [DefNew: 16000K->16000K(16192K), 0.0000518 secs][Tenured: 2704K->2535K(16384K), 0.0931034 secs] 18704K->2535K(32576K), 0.0933519 secs]

    [GC [DefNew: 16000K->16000K(16192K), 0.0000498 secs][Tenured: 2535K->2319K(16384K), 0.0860925 secs] 18535K->2319K(32576K), 0.0863350 secs]

    This is an example of the relative size of the young generation to the tenured generation being too large to allow the young generation promotion to be guaranteed (the young generation is about half the size of the tenured generation). The young generation cannot be collected successfully and only major collections are occurring. Note that in this cause the young generation is collected but only as part of the more expensive major collection.


    Is the tenured generation too large or too small?

    With the young generation sized at 8 Mbytes and the overall heap at 32 Mbytes the major collection pauses are about .13 second.

    [GC [DefNew: 8128K->8128K(8128K), 0.0000558 secs][Tenured: 17746K->2309K(24576K), 0.1247669 secs] 25874K->2309K(32704K), 0.1250098 secs]

    If the heap size is increased to 64 Mbytes, the major collections pauses increase to about .21 second.


    [GC [DefNew: 8128K->8128K(8128K), 0.0000369 secs][Tenured: 50059K->5338K(57344K), 0.2218912 secs]

    58187K->5338K(65472K), 0.2221317 secs]


    but the major collections are occurring at about half the frequency. To see this print out time stamps at the collections by adding the option -XX:+PrintGCTimeStamps. For the 32 Mbyte heap the major collections are occurring about every 10 -11 seconds (only the major collections are shown).


    111.042: [GC 111.042: [DefNew: 8128K->8128K(8128K), 0.0000505 secs]111.042: [Tenured: 18154K->2311K(24576K), 0.1290354 secs] 26282K->2311K(32704K), 0.1293306 secs]

    122.463: [GC 122.463: [DefNew: 8128K->8128K(8128K), 0.0000560 secs]122.463: [Tenured: 18630K->2366K(24576K), 0.1322560 secs] 26758K->2366K(32704K), 0.1325284 secs]

    133.896: [GC 133.897: [DefNew: 8128K->8128K(8128K), 0.0000443 secs]133.897: [Tenured: 18240K->2573K(24576K), 0.1340199 secs] 26368K->2573K(32704K), 0.1343218 secs]

    144.112: [GC 144.112: [DefNew: 8128K->8128K(8128K), 0.0000544 secs]144.112: [Tenured: 16564K->2304K(24576K), 0.1246831 secs] 24692K->2304K(32704K), 0.1249602 secs]


    For the 64 Mbyte heap the major collections are occurring about every 30 seconds.


    90.597: [GC 90.597: [DefNew: 8128K->8128K(8128K), 0.0000542 secs]90.597: [Tenured: 49841K->5141K(57344K), 0.2129882 secs] 57969K->5141K(65472K), 0.2133274 secs]

    120.899: [GC 120.899: [DefNew: 8128K->8128K(8128K), 0.0000550 secs]120.899: [Tenured: 50384K->2430K(57344K), 0.2216590 secs] 58512K->2430K(65472K), 0.2219384 secs]

    153.968: [GC 153.968: [DefNew: 8128K->8128K(8128K), 0.0000511 secs]153.968: [Tenured: 51164K->2309K(57344K), 0.2193906 secs] 59292K->2309K(65472K), 0.2196372 secs]


    Which heap size, 32 Mbytes or 64 Mbytes, is better? For higher throughput the 64 Mbyte heap is better while the 32 Mbyte heap provides the lesser pause times.


    Even larger young generation sizes


    To pursue higher throughput consider even larger heaps. For a 256 Mbyte total heap with a 64 Mbyte young generation size, when compared to the 32 Mbyte heap with an 8 Mbyte young generation run, the minor collection pauses are only about 10% longer reflecting the fact that the number of objects surviving the collection are about the same. The major collection are about 3.8 times longer reflecting the larger size of the heap.


    [GC [DefNew: 64575K->959K(64576K), 0.0457646 secs] 196016K->133633K(261184K), 0.0459067 secs]

    [GC [DefNew: 64575K->64575K(64576K), 0.0000573 secs][Tenured: 132673K->5437K(196608K), 0.4959855 secs] 197249K->5437K(261184K), 0.4962533 secs]

    [GC [DefNew: 63616K->959K(64576K), 0.0360258 secs] 69053K->7600K(261184K), 0.0361663 secs]


    After tuning the minor collection pauses are too long

    Try the parallel young generation collectors


    If the minor collection pauses are too long, try using the parallel young generation collectors. Adding -XX:+UseParallelGC produces the following output. In this example adaptive sizing was turned off with -XX:-UseAdaptiveSizing for simplicity. Also time stamps were added for comparison with other results.


    500.285: [GC 51526K->2678K(253952K), 0.0120328 secs]

    506.734: [GC 51830K->2646K(253952K), 0.0117832 secs]

    513.068: [GC 51798K->2742K(253952K), 0.0124632 secs]

    519.566: [GC 51894K->2550K(253952K), 0.0122923 secs]


    Here the average minor collection pause is about .015 second which is a reduction of 68%. The adaptive sizing was turned off because of an anomalous behavior that is exhibited in this next set of output with adaptive sizing on.


    [GC 62752K->2992K(259328K), 0.0126936 secs]

    [GC 62896K->60192K(259328K), 0.0127579 secs]

    [GC 63008K->3120K(259328K), 0.0123150 secs]

    [GC 63024K->60256K(259328K), 0.0120565 secs]

    [GC 63072K->3024K(259328K), 0.0127215 secs]

    [GC 62928K->60208K(259328K), 0.0113090 secs]

    [GC 63024K->3136K(259328K), 0.0133799 secs]

    [GC 63040K->60256K(259328K), 0.0135459 secs]


    Adaptive sizing adjusts the size of the survivor spaces. In the alternate minor collections shown above the size of the survivor spaces is oscillating between values that alternatively cause a minimal collection of the young generation with a nearly complete collection of the young generation. Although a strange behavior the use of adaptive sizing may still yield a higher throughput than turning it off.


    Alternatively, using -XX:+UseParNewGC yields


    497.905: [GC 497.905: [ParNew: 64576K->960K(64576K), 0.0255372 secs] 155310K->93003K(261184K), 0.0256767 secs]

    506.305: [GC 506.305: [ParNew: 64576K->960K(64576K), 0.0276291 secs] 156619K->94267K(261184K), 0.0277958 secs]

    514.565: [GC 514.565: [ParNew: 64576K->960K(64576K), 0.0261376 secs] 157883K->95711K(261184K), 0.0262927 secs]

    522.838: [GC 522.838: [ParNew: 64576K->960K(64576K), 0.0316625 secs] 159327K->97331K(261184K), 0.0318099 secs]


    which shows about a 44% decrease in the minor collection pauses. The time stamps were added to show the period of the minor collections. With -XX:+UseParallelGC the minor collections occur about every 6.5 seconds and take .015 second (for a minor collection overhead of .33%). For -XX:+UseParNewGC the collections occur about every 8 seconds and take about .026 second (for a minor collection overhead of .23%).


    After tuning the major collections pause are too long

    Try the concurrent low pause collector

    The major collection pauses for the 256 Mbyte total heap with the 64 Mbyte young generation is about .489 second. If this is too long the concurrent low pause collector can be tried with the command line option -XX:+UseConcMarkSweepGC (and also keeping the -XX:+UseParNewGC)


    [GC [ParNew: 64576K->960K(64576K), 0.0377639 secs] 140122K->78078K(261184K), 0.0379598 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0329313 secs] 141694K->79533K(261184K), 0.0331324 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0413880 secs] 143149K->81128K(261184K), 0.0416101 secs]

    [GC [1 CMS-initial-mark: 80168K(196608K)] 81144K(261184K), 0.0059036 secs]

    [CMS-concurrent-mark: 0.129/0.129 secs]

    [CMS-concurrent-preclean: 0.007/0.007 secs]

    [GC[Rescan (non-parallel) [grey object rescan, 0.0020879 secs][root rescan, 0.0144199 secs], 0.016

    6258 secs][weak refs processing, 0.0000411 secs] [1 CMS-remark: 80168K(196608K)] 82493K(261184K),

    0.0168943 secs]

    [CMS-concurrent-sweep: 1.208/1.208 secs]

    [CMS-concurrent-reset: 0.036/0.036 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0311520 secs] 66308K->4171K(261184K), 0.0313513 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0348341 secs] 67787K->5695K(261184K), 0.0350776 secs]

    [GC [ParNew: 64576K->960K(64576K), 0.0359806 secs] 69311K->7154K(261184K), 0.0362064 secs]


    With the application used for these examples the time to do a concurrent collection is short relative to the time between minor collections. This is typically not the case. More often there will be minor collections during the concurrent collection. Note that the concurrent phases may be long (e.g., the CMS-concurrent-sweep is 1.2 seconds) but the application is not stopped during the concurrent phases so the application will not see a pause. Conversely, although the application is not paused during the concurrent phases, neither does it have use of all the available processors (one being used by the garbage collection thread). The pauses will come from the CMS-initial-mark and CMS-remark pause. In this snippet of output the larger of those is .017 second). Over the run the average for the CMS-remark pause (always the longer pause) was .019 second. The maximum pause for a major collection (as compared to the default collector) was then reduced from .489 second to .019 second (96% reduction). Note also that the minor collection pause increased to .035 second which is higher than the .026 second minor pause using only -XX:+UseParNewGC. This is indicative of the overhead associated with the concurrency of the major collection.


    Try the incremental low pause collector


    The major collection times can also be reduced by use of the incremental low pause collector. Use the command line option -Xincgc.


    [GC [DefNew: 64575K->959K(64576K), 0.0616204 secs][Train: 5537K->5474K(196608K), 0.0108803 secs] 68664K->6434K(261184K), 0.0730248 secs]

    [GC [DefNew: 64575K->959K(64576K), 0.0559078 secs][Train: 6813K->6749K(196608K), 0.0115077 secs] 70050K->7709K(261184K), 0.0680289 secs]

    [GC [DefNew: 64575K->959K(64576K), 0.0607615 secs][Train: 8178K->8114K(196608K), 0.0112705 secs] 71325K->9074K(261184K), 0.0727902 secs]


    During each minor collection part of the old generation is collected and there is no separate major collection pause. The minor collection pauses here increase to about .078 second. If the incremental collection of the tenured generation cannot be completed before tenured generation is full, a major collection occurs and is indicated in the verbosegc output by the 'MSC' in the output.


    [GC [DefNew: 64575K->64575K(64576K), 0.0005311 secs][Train MSC: 68226K->135K(196608K), 0.4347857 secs] 132802K->2565K(261184K), 0.4397831 secs]

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.

    Draft version: February 20, 2003

    Copyright © 2003 Sun Microsystems, Inc. All Rights Reserved.



  • Trackback: http://tb.donews.net/TrackBack.aspx?PostId=445862


    [点击此处收藏本文]  发表于2005年06月27日 3:31 PM




    正在读取评论……