
Yixin Jiang

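Occupation
The widest thing in the world is the ocean; wider than the ocean is the sky; wider than the sky is the human heart; and wider than the human heart is...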

N.Space

So simple that there are obviously no errors; so complex that there are no obvious errors
June 25

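Google's Counterattack (repost; original title: "Let the Data Speak: How Google Was Framed")

Reposted from a Sina blog: http://blog.sina.com.cn/s/blog_60676a3f0100e0xk.html

Recently CCTV reported that Google search was surfacing a large number of pornographic entries. One example that drew a strong public reaction was that even a Google search for "son" (儿子) could bring up pornographic entries. So how did this come about?

Here is how Google was framed. As is well known, Google's keyword suggestions are generated automatically from the keywords that have recently been most popular. Some people exploited this by searching Google for pornographic terms in bulk, and thereby framed Google.
Google's search trend charts, Google Insights for Search, and some third-party statistics show the following:

In the 7 days before CCTV's exposé of Google:
1. Someone deliberately searched Google for pornographic terms in bulk, so that the single-day search volume for those terms jumped 5950% year on year, and the monthly total grew several thousand-fold over the previous month.
2. 100% of this search volume came from Beijing.
3. The search volume rose steeply and almost linearly, whereas in theory such transient search volume should follow a normal distribution and be bursty. In other words, someone did this on purpose.

A few more similar charts follow. Note that every peak falls on June 17, the day before the CCTV program aired (June 18).

(Full-year statistics)

(This month's statistics)

For comparison, to show what search-engine statistics normally look like, here is a chart of searches for the keyword "weather forecast" (天气预报). It shows that search volume should be roughly evenly distributed over the year, with a gradual rise as search engines become more widely used, but it can never climb in a straight line within a single month.

Is there another possibility: that people in Beijing, with summer arriving in June, had an excess of hormones and therefore searched heavily for pornographic phrases such as "improper relations between son and mother"? Consider this chart of searches for the keyword "Japanese AV actress" (日本女优):

Searches for "Japanese AV actress" are roughly evenly distributed over the year and have even declined recently, so the theory of a recent nationwide hormone surge can also be ruled out. It is not the case that all pornographic content is in heavy demand. The keywords whose search counts rose sharply are limited to the few terms the media made so much of, and, notably, both the steep rise and the peak came before the media reports. Clearly this is not a natural outcome. So what is the answer? Who made Google so vulgar?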

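March 25

John Maeda, "The Laws of Simplicity"

First, reduce: the simplest way to achieve simplicity is to give things up. Cut useless features and superfluous parts, and things become much simpler.
Second, organize: good organization makes a complex system appear simpler, much like using a desk sensibly.
Third, time: saving time also feels like simplicity (although this momentary simplicity is not necessarily true simplicity).
Fourth, learn: accumulated knowledge and experience help people make some things more straightforward.
Fifth, differences: simplicity and complexity need each other. Without the contrast of complexity, simplicity does not stand out.
Sixth, context: what lies around a simple thing is far from irrelevant. It helps create an atmosphere of simplicity and makes people feel it.
Seventh, emotion: emotional attachment also contributes to simplicity.
Eighth, trust: some simple things deserve the trust they need.
Ninth, failure: accept that some things can never be made simple. Not everything suits simplicity.
Tenth, the one: simplicity means subtracting what is merely formal and meaningless, and adding what is meaningful.

About John Maeda:
John Maeda is a world-renowned graphic designer, visual artist, and computer scientist, and a professor at the MIT Media Lab. His contributions to art are equally notable, and he has won numerous awards, including the Smithsonian's National Design Award, the highest honor in American design (2001), Japan's Asahi Design Award (2002), the Raymond Loewy Foundation Prize in Germany (2005), and the DaimlerChrysler Design Award (2000). He has held many well-received solo exhibitions in Paris, New York, London, San Francisco, Tokyo, Osaka, and elsewhere, and his work is in the collections of the Museum of Modern Art in New York, the San Francisco Museum of Modern Art, and the Smithsonian Institution's National Design Museum.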

February 2

Tuning Java virtual machines (repost)

http://publib.boulder.ibm.com/infocenter/wasinfo/v6r0/index.jsp?topic=/com.ibm.websphere.express.doc/info/exp/ae/tprf_tunejvm.html

 

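Tuning Java virtual machines

An application server is a Java process, and it requires a Java virtual machine (JVM) to run and to support the Java applications running on it. As part of configuring an application server, you can fine-tune settings to improve the way the system uses the JVM.

About this task

The JVM provides the runtime execution environment for Java-based applications. WebSphere Application Server is a combination of a JVM runtime environment and a Java-based server runtime. It can run on JVMs from different JVM providers. To determine the provider of the JVM on which your Application Server is running, issue the java -fullversion command from the app_server_root/java/bin directory of your WebSphere Application Server installation. You can also check the SystemOut.log of one of your servers. When an application server starts, WebSphere Application Server writes information about the JVM, including the JVM provider, to this log file.

From a JVM tuning perspective, there are two main types of JVM:

  • IBM JVM
  • Sun HotSpot-based JVMs, including the Sun HotSpot JVM on Solaris and HP's JVM for HP-UX

Although JVM tuning varies by JVM provider, general tuning concepts apply to all JVMs. These general concepts include:

  • Compiler tuning. All JVMs use just-in-time (JIT) compilers to compile Java bytecode into native instructions while the server is running.
  • Java memory or heap tuning. The JVM memory management function, or garbage collection, provides one of the biggest opportunities for improving JVM performance.
  • Class loading tuning.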
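Procedure
  • Optimize startup performance and runtime performance

    In some environments, it is more important to optimize the startup performance of WebSphere Application Server than the runtime performance. In other environments, runtime performance matters more. By default, IBM JVMs are optimized for runtime performance, while HotSpot-based JVMs are optimized for startup performance.

    The Java JIT compiler largely determines whether startup or runtime performance is optimized. The initial optimization level that the compiler uses affects both the time it takes to compile a class method and the time it takes to start the server. For faster startup, you can lower the initial optimization level that the compiler uses. Because class methods are then compiled at a lower optimization level, the runtime performance of your applications may degrade.

    Because the compiler recompiles class methods during runtime execution at its own discretion to improve performance, it is difficult to state a specific runtime performance impact. Ultimately, how long the application runs is the main factor in how much runtime performance degrades. Short-running applications have a higher probability of having their methods recompiled. Long-running applications are less likely to have their methods recompiled. By default, the IBM JVM performs initial compilation at a high optimization level. To change this behavior, use the following IBM JVM option:

    -Xquickstart

    This setting makes the IBM JVM compile class methods at a lower optimization level, which speeds up server startup at the expense of runtime performance. If this parameter is not specified, the IBM JVM starts with a high initial optimization level by default, which improves runtime performance but slows server startup.

    Default:
    High initial compiler optimization level

    Recommended:
    High initial compiler optimization level

    Usage:
    -Xquickstart provides faster server startup.

    JVMs based on Sun's HotSpot technology initially compile class methods at a low optimization level. Use the following JVM option to change this behavior:

    -server

    These JVMs include a simple compiler and an optimizing JIT compiler. Normally the simple JIT compiler is used. Setting this option selects the optimizing compiler instead. This change significantly improves server performance, but the server takes longer to warm up when the optimizing compiler is used.

    Default:
    Simple compiler

    Recommended:
    Optimizing compiler

    Usage:
    -server enables the optimizing compiler.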

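  • Set the heap size

    The following command-line parameters are useful for setting the heap size.

    • -Xms

      This setting controls the initial size of the Java heap. Tuning this parameter properly reduces garbage collection overhead, which improves server response time and throughput. For some applications, the default setting for this option may be too low, resulting in a large number of minor garbage collections.

      Default:
      256 MB

      Recommended:
      Workload specific, but higher than the default.

      Usage:
      -Xms256m sets the initial heap size to 256 megabytes.

    • -Xmx

      This setting controls the maximum size of the Java heap. Tuning this parameter properly reduces garbage collection overhead, which improves server response time and throughput. For some applications, the default setting for this option may be too low, resulting in a large number of minor garbage collections.

      Default:
      512 MB

      Recommended:
      Workload specific, but higher than the default.

      Usage:
      -Xmx512m sets the maximum heap size to 512 megabytes.

    • -Xlp

      Use this setting with the IBM JVM to allocate the heap using large pages. If you use this setting, the operating system must also be configured to support large pages. Using large pages can reduce the CPU overhead of keeping track of heap memory, and also allows larger heaps to be created.

      See Tuning operating systems for more information about tuning your operating system.

    The heap size you should specify depends on heap usage over time. When the heap size changes frequently, specifying the same value for the Xms and Xmx parameters can improve performance.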
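As a small illustration of the -Xms and -Xmx settings above, the following Python sketch converts JVM-style size strings to bytes and checks that the initial heap does not exceed the maximum. The helper names `parse_size` and `check_heap_flags` are hypothetical and exist only for this example; the sketch assumes the common `k`, `m`, and `g` suffixes and is not part of WebSphere or any JVM.

```python
import re

# Multipliers for the size suffixes assumed in this sketch.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
# Matches only the two heap flags, so look-alike options are ignored.
_HEAP_FLAG = re.compile(r"(-Xms|-Xmx)(\d+[kKmMgG]?)")


def parse_size(text):
    """Convert a JVM-style size such as '256m' or '3670k' to bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", text)
    if match is None:
        raise ValueError(f"not a JVM size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.lower()]


def check_heap_flags(args):
    """Return (initial, maximum) in bytes; raise if -Xms exceeds -Xmx."""
    sizes = {}
    for arg in args:
        match = _HEAP_FLAG.fullmatch(arg)
        if match:
            sizes[match.group(1)] = parse_size(match.group(2))
    initial, maximum = sizes.get("-Xms"), sizes.get("-Xmx")
    if initial is not None and maximum is not None and initial > maximum:
        raise ValueError("-Xms must not exceed -Xmx")
    return initial, maximum


print(check_heap_flags(["-Xms256m", "-Xmx512m"]))
```

Passing the same value for both flags, for example `["-Xms512m", "-Xmx512m"]`, corresponds to the fixed-size heap suggested above for workloads whose heap size changes frequently.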

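  • Tune the IBM JVM's garbage collector.

    Use the java -X option to view a list of memory options.

    • -Xgcpolicy

      Setting gcpolicy to optthruput disables concurrent mark. If you do not have pause time problems (seen as erratic application response times), use this option to get the best throughput. Setting gcpolicy to optavgpause enables concurrent mark with its default values. This setting alleviates erratic application response times caused by normal garbage collection. However, this option might decrease overall throughput.

      Default:
      optthruput

      Recommended:
      optthruput

      Usage:
      -Xgcpolicy:optthruput

    • -Xnoclassgc

      By default, the JVM unloads a class from memory when there are no live instances of that class left, but this can degrade performance. Turning off class garbage collection eliminates the overhead of loading and unloading the same class multiple times.

      If a class is no longer needed, the space it occupies on the heap is normally used for creating new objects. However, if an application handles requests by creating a new instance of a class, and requests for that application arrive at random times, then after the previous requester finishes, normal class garbage collection cleans up the class by freeing its heap space, only for the class to be instantiated again when the next request arrives. In this situation you might want to use this option to disable class garbage collection.

      Default:
      Class garbage collection enabled

      Recommended:
      Class garbage collection disabled

      Usage:
      -Xnoclassgc disables class garbage collection.

    For additional information, see the related developerWorks articles.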

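  • [Solaris] Tune the Sun JVM's garbage collector

    On the Solaris platform, WebSphere Application Server runs on the Sun HotSpot JVM rather than the IBM JVM. It is important to use the correct tuning parameters with the Sun JVM in order to take advantage of its performance optimizations.

    The Sun HotSpot JVM relies on generational garbage collection to achieve optimum performance. The following command-line parameters are useful for tuning garbage collection.

    • -XX:SurvivorRatio

      The Java heap is divided into a section for old (long-lived) objects and a section for young objects. The section for young objects is further subdivided into the section where new objects are allocated, called eden, and the section where new objects that are still in use survive their first few garbage collections before being promoted to old objects, called survivor space. Survivor ratio is the ratio of eden to survivor space in the young object section of the heap. Increasing this setting optimizes the JVM for applications that create many objects but retain few of them. Because WebSphere Application Server generates more medium-lived and long-lived objects than other applications, this setting should be lowered from the default.

      Default:
      32

      Recommended:
      16

      Usage:
      -XX:SurvivorRatio=16

    • -XX:PermSize

      The section of the heap reserved for the permanent generation holds all of the reflective data for the JVM. This size should be increased to optimize the performance of applications that dynamically load and unload many classes. Setting this parameter to 128 MB eliminates the overhead of growing this part of the heap.

      Recommended:
      128 MB

      Usage:
      -XX:PermSize=128m sets the perm size to 128 megabytes.

    • -Xmn

      This setting controls how much space the young generation is allowed to consume on the heap. Tuning this parameter properly reduces garbage collection overhead, which improves server response time and throughput. The default setting is typically too low, resulting in a high number of minor garbage collections. Setting it too high can cause the JVM to perform only major (full) garbage collections. These usually take several seconds and seriously hurt the overall performance of your server. Keep this setting below half of the overall heap size to avoid this situation.

      Default:
      2228224 bytes

      Recommended:
      Approximately 1/4 of the total heap size

      Usage:
      -Xmn256m sets the size to 256 megabytes.

    • -Xnoclassgc

      By default, the JVM unloads a class from memory when there are no live instances of that class left, but this can degrade performance. Turning off class garbage collection eliminates the overhead of loading and unloading the same class multiple times.

      If a class is no longer needed, the space it occupies on the heap is normally used for creating new objects. However, if an application handles requests by creating a new instance of a class, and requests for that application arrive at random times, then after the previous requester finishes, normal class garbage collection cleans up the class by freeing its heap space, only for the class to be instantiated again when the next request arrives. In this situation you might want to use this option to disable class garbage collection.

      Default:
      Class garbage collection enabled

      Recommended:
      Class garbage collection disabled

      Usage:
      -Xnoclassgc disables class garbage collection.

    For additional information on tuning the Sun JVM, see the performance documentation for the Java HotSpot VM.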
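To make the Solaris recommendations above concrete, here is a minimal Python sketch that derives the young generation layout and a flag list from a chosen maximum heap size. It assumes the HotSpot convention that the young generation is one eden plus two survivor spaces and that SurvivorRatio is eden divided by a single survivor space. The function names `young_layout` and `suggested_flags` are hypothetical, and the numbers are the recommended values quoted in this section, not measured results.

```python
def young_layout(heap_mb, survivor_ratio=16, young_fraction=0.25):
    """Return (young, eden, survivor) sizes in MB.

    Assumes young = eden + 2 * survivor and
    survivor_ratio = eden / survivor (one survivor space).
    """
    young = heap_mb * young_fraction
    survivor = young / (survivor_ratio + 2)
    eden = survivor * survivor_ratio
    return young, eden, survivor


def suggested_flags(heap_mb):
    """Assemble the values recommended in this section into JVM arguments."""
    young_mb = heap_mb // 4  # about 1/4 of the heap, below the 1/2 limit
    return [
        f"-Xmx{heap_mb}m",
        f"-Xmn{young_mb}m",
        "-XX:SurvivorRatio=16",
        "-XX:PermSize=128m",
    ]


young, eden, survivor = young_layout(1024)
print(f"young={young:.1f} MB, eden={eden:.1f} MB, each survivor={survivor:.1f} MB")
print(" ".join(suggested_flags(1024)))
```

With a survivor ratio of 16, each survivor space is 1/18 of the young generation, so lowering the ratio from 32 to 16 roughly doubles the room available for medium-lived objects before they are promoted.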

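  • [HP-UX] Tune the HP JVM's garbage collector

    The HP JVM relies on generational garbage collection to achieve optimum performance. The following command-line parameters are useful for tuning garbage collection.

    • -Xoptgc

      This setting optimizes the JVM for applications with many short-lived objects. If this parameter is not specified, the JVM usually performs major (full) garbage collections. Full garbage collections can take several seconds and can significantly degrade server performance.

      Default:
      off

      Recommended:
      on

      Usage:
      -Xoptgc enables optimized garbage collection.

    • -XX:SurvivorRatio

      The Java heap is divided into a section for old (long-lived) objects and a section for young objects. The section for young objects is further subdivided into the section where new objects are allocated, called eden, and the section where new objects that are still in use survive their first few garbage collections before being promoted to old objects, called survivor space. Survivor ratio is the ratio of eden to survivor space in the young object section of the heap. Increasing this setting optimizes the JVM for applications that create many objects but retain few of them. Because WebSphere Application Server generates more medium-lived and long-lived objects than other applications, this setting should be lowered from the default.

      Default:
      32

      Recommended:
      16

      Usage:
      -XX:SurvivorRatio=16

    • -XX:PermSize

      The section of the heap reserved for the permanent generation holds all of the reflective data for the JVM. This size should be increased to optimize the performance of applications that dynamically load and unload many classes. Specifying 128 megabytes for this parameter eliminates the overhead of growing this part of the heap.

      Default:
      0

      Recommended:
      128 megabytes

      Usage:
      -XX:PermSize=128m sets PermSize to 128 megabytes.

    • -XX:+ForceMmapReserved

      By default, the Java heap is allocated in "lazy swap" mode. In this mode, memory pages are allocated as needed, which saves swap space but also forces the use of 4 KB pages. On large heap systems, this allocation scheme can spread the heap across hundreds of thousands of pages. This option disables "lazy swap" and allows the operating system to use larger memory pages, which optimizes access to the memory that makes up the Java heap.

      Default:
      off

      Recommended:
      on

      Usage:
      -XX:+ForceMmapReserved disables "lazy swap".

    • -Xmn

      This setting controls how much space the young generation is allowed to consume on the heap. Tuning this parameter properly reduces garbage collection overhead, which improves server response time and throughput. The default setting is typically too low, resulting in a high number of minor garbage collections.

      Default:
      No default

      Recommended:
      Approximately 3/4 of the total heap size

      Usage:
      -Xmn768m sets the size to 768 megabytes.

    • Virtual page size

      Setting the Java virtual machine instruction and data page sizes to 64 MB can improve performance.

      Default:
      4 MB

      Recommended:
      64 MB

      Usage:
      Use the following command. The command output provides the current operating system characteristics of the process executable:

      chatr +pi64M +pd64M /opt/WebSphere/AppServer/java/bin/PA_RISC2.0/native_threads/java

    • -Xnoclassgc

      By default, the JVM unloads a class from memory when there are no live instances of that class left, but this can degrade performance. Turning off class garbage collection eliminates the overhead of loading and unloading the same class multiple times.

      If a class is no longer needed, the space it occupies on the heap is normally used for creating new objects. However, if an application handles requests by creating a new instance of a class, and requests for that application arrive at random times, then after the previous requester finishes, normal class garbage collection cleans up the class by freeing its heap space, only for the class to be instantiated again when the next request arrives. In this situation you might want to use this option to disable class garbage collection.

      Default:
      Class garbage collection enabled

      Recommended:
      Class garbage collection disabled

      Usage:
      -Xnoclassgc disables class garbage collection.

    For additional information on tuning the HP virtual machine, see Java technology software HP-UX 11i.

  • [HP-UX] Tune HP's JVM for HP-UX

    Set the following options to improve application performance:

    -XX:SchedulerPriorityRange=SCHED_NOAGE
    -Djava.nio.channels.spi.SelectorProvider=sun.nio.ch.DevPollSelectorProvider
    -XX:-ExtraPollBeforeRead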
    

Tuning Garbage Collection with the 1.4.2 Java™ Virtual Machine

http://java.sun.com/docs/hotspot/gc1.4.2/

Table of Contents

1 Introduction

2 Generations

2.1 Performance Considerations

2.2 Measurement

3 Sizing the Generations

3.1 Total Heap

3.2 The Young Generation

3.2.1 Young Generation Guarantee

4 Types of Collectors

4.1 When to Use the Throughput Collector

4.2 The Throughput Collector

4.2.1 Adaptive Sizing

4.2.2 AggressiveHeap

4.2.3 Measurements with the Throughput Collector

4.3 When to Use the Concurrent Low Pause Collector

4.4 The Concurrent Low Pause Collector

4.4.1 Overhead of Concurrency

4.4.2 Young Generation Guarantee

4.4.3 Full Collections

4.4.4 Floating Garbage

4.4.5 Pauses

4.4.6 Concurrent Phases

4.4.7 Measurements with the Concurrent Collector

4.4.8 Parallel Minor Collection Options with the Concurrent Collector

4.5 When to Use the Incremental Low Pause Collector

4.6 The Incremental Low Pause Collector

4.6.1 Measurements with the Incremental Collector

5 Other Considerations

6 Conclusion

7 Other Documentation

7.1 Example of Output

7.2 Frequently Asked Questions

1 Introduction

The Java™ 2 Platform, Standard Edition (J2SE™ platform) is used for a wide variety of applications from small applets on desktops to web services on large servers. In the J2SE platform version 1.4.1 two new garbage collectors were introduced to make a total of four garbage collectors from which to choose. How should that choice be made and what are the consequences of that choice? This document will describe some of the general features shared by all the garbage collectors. It will then discuss tuning options to take the best advantage of those features in the context of the default single-threaded, stop-the-world collector. Finally, it will discuss the specific features of the three other collectors, and discuss the criteria for choosing one of the four collectors.

When does garbage collection performance matter to the user? For many applications it doesn't. That is, the application can perform within its specifications in the presence of garbage collection with pauses of modest frequency and duration. An example where this is not the case (when the default collector is used) would be a large application that scales well to a large number of threads, processors, sockets, and a large amount of memory.

Amdahl observed that most workloads cannot be perfectly parallelized; some portion is always sequential and does not benefit from parallelism. This is also true for the J2SE platform. In particular, virtual machines for the Java™ platform up to and including version 1.3.1 do not have parallel garbage collection, so the impact of garbage collection on a multiprocessor system grows relative to an otherwise parallel application.

The graph below models an ideal system that is perfectly scalable with the exception of garbage collection. The red line is an application spending only 1% of the time in garbage collection on a uniprocessor system. This translates to more than a 20% loss in throughput on 32 processor systems. At 10% of the time in garbage collection (not considered an outrageous amount of time in garbage collection in uniprocessor applications) more than 75% of throughput is lost when scaling up to 32 processors.

[Figure: GC vs. Amdahl's law]
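The figures quoted above can be reproduced with Amdahl's law. The sketch below assumes, as the model in the text does, that garbage collection is the only serial portion of the work and everything else scales perfectly. The function names are illustrative only.

```python
def speedup(gc_fraction, processors):
    """Amdahl's law: speedup when only the non-GC work runs in parallel."""
    return 1.0 / (gc_fraction + (1.0 - gc_fraction) / processors)


def throughput_loss(gc_fraction, processors):
    """Fraction of ideal (linear) throughput lost to serial garbage collection."""
    return 1.0 - speedup(gc_fraction, processors) / processors


for gc_fraction in (0.01, 0.10):
    loss = throughput_loss(gc_fraction, 32)
    print(f"{gc_fraction:.0%} GC time on 32 processors: {loss:.1%} of throughput lost")
```

With 1% of uniprocessor time spent in collection the loss on 32 processors comes out a little under 24%, and with 10% it is just over 75%, matching the "more than 20%" and "more than 75%" statements in the text.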

This shows that negligible speed issues when developing on small systems may become principal bottlenecks when scaling up to large systems. However, small improvements in reducing such a bottleneck can produce large gains in performance. For a sufficiently large system it becomes well worthwhile to tune the garbage collector.

The default collector should be the first choice for garbage collection and will be adequate for the majority of applications. Each of the other collectors has some added overhead and/or complexity, which is the price for specialized behavior. If the application doesn't need the specialized behavior of the alternate collectors, use the default collector. The exception to this rule is large applications that are heavily threaded and run on hardware with a large amount of memory and a large number of processors. For such applications, first try the aggressive heap option (-XX:+AggressiveHeap) described below.

This document was written using the J2SE platform, version 1.4.2, on the Solaris™ Operating Environment (SPARC® Platform Edition) as the base platform, because it provides the most scalable hardware and software for the J2SE platform. However, the descriptive text applies to other supported platforms, including Linux, Microsoft Windows, and the Solaris Operating Environment (x86 Platform Edition), to the extent that scalable hardware is available. Although command line options are consistent across platforms, some platforms may have defaults different from those described here.

2 Generations

One strength of the J2SE platform is that it shields the complexity of memory allocation and garbage collection from the developer. However, once garbage collection is the principal bottleneck, it is worth understanding some aspects of this hidden implementation. Garbage collectors make assumptions about the way applications use objects, and these are reflected in tunable parameters that can be adjusted for improved performance without sacrificing the power of the abstraction.

An object is considered garbage when it can no longer be reached from any pointer in the running program. The most straightforward garbage collection algorithms simply iterate over every reachable object. Any objects left over are then considered garbage. The time this approach takes is proportional to the number of live objects, which is prohibitive for large applications maintaining lots of live data.

Beginning with the J2SE platform, version 1.2, the virtual machine incorporated a number of different garbage collection algorithms that are combined using generational collection. While naive garbage collection examines every live object in the heap, generational collection exploits several empirically observed properties of most applications to avoid extra work.

The most important of these observed properties is infant mortality. The blue area in the diagram below is a typical distribution for the lifetimes of objects. The X axis is object lifetimes measured in bytes allocated. The byte count on the Y axis is the total bytes in objects with the corresponding lifetime. The sharp peak at the left represents objects that can be reclaimed (i.e., have "died") shortly after being allocated. Iterator objects, for example, are often alive for the duration of a single loop.

[Figure: histogram with collections]

Some objects do live longer, and so the distribution stretches out to the right. For instance, there are typically some objects allocated at initialization that live until the process exits. Between these two extremes are objects that live for the duration of some intermediate computation, seen here as the lump to the right of the infant mortality peak. Some applications have very different looking distributions, but a surprisingly large number possess this general shape. Efficient collection is made possible by focusing on the fact that a majority of objects "die young".

To optimize for this scenario, memory is managed in generations, or memory pools holding objects of different ages. Garbage collection occurs in each generation when the generation fills up. Objects are allocated in a generation for younger objects or the young generation, and because of infant mortality most objects die there. When the young generation fills up it causes a minor collection. Minor collections can be optimized assuming a high infant mortality rate. The costs of such collections are, to the first order, proportional to the number of live objects being collected. A young generation full of dead objects is collected very quickly. Some surviving objects are moved to a tenured generation. When the tenured generation needs to be collected there is a major collection that is often much slower because it involves all live objects.

The diagram below shows minor collections occurring at intervals long enough to allow many of the objects to die between collections. It is well-tuned in the sense that the young generation is large enough (and thus the period between minor collections long enough) that the minor collection can take advantage of the high infant mortality rate. This situation can be upset by applications with unusual lifetime distributions, or by poorly sized generations that cause collections to occur before objects have had time to die.

The default garbage collector is meant to be used by applications large and small. Its default parameters were designed to be effective for most small applications. The default parameters aren't optimal for many server applications. This leads to the central tenet of this document:

If the garbage collector has become a bottleneck, you may wish to customize the generation sizes. Check the verbose garbage collector output, and then explore the sensitivity of your individual performance metric to the garbage collector parameters.

The default arrangement of generations looks something like this.

[Figure: space usage by generations]

At initialization, a maximum address space is virtually reserved but not allocated to physical memory unless it is needed. The complete address space reserved for object memory can be divided into the young and tenured generations.

The young generation consists of eden plus two survivor spaces. Objects are initially allocated in eden. One survivor space is empty at any time, and serves as a destination of the next, copying collection of any live objects in eden and the other survivor space. Objects are copied between survivor spaces in this way until they are old enough to be tenured, or copied to the tenured generation.

Other virtual machines, including the production virtual machine for the J2SE platform, version 1.2 for the Solaris Operating Environment, used two equally sized spaces for copying rather than one large eden plus two small spaces. This means the options for sizing the young generation are not directly comparable; see the Performance FAQ for an example.

One portion of the tenured generation called the permanent generation is special because it holds all the reflective data of the virtual machine itself, such as class and method objects.

2.1 Performance Considerations

There are two primary measures of garbage collection performance. Throughput is the percentage of total time not spent in garbage collection, considered over long periods of time. Throughput includes time spent in allocation (but tuning for speed of allocation is generally not needed.) Pauses are the times when an application appears unresponsive because garbage collection is occurring.

Users have different requirements of garbage collection. For example, some consider the right metric for a web server to be throughput, since pauses during garbage collection may be tolerable, or simply obscured by network latencies. However, in an interactive graphics program even short pauses may negatively affect the user experience.

Some users are sensitive to other considerations. Footprint is the working set of a process, measured in pages and cache lines. On systems with limited physical memory or many processes, footprint may dictate scalability. Promptness is the time between when an object becomes dead and when the memory becomes available, an important consideration for distributed systems, including remote method invocation (RMI).

In general, a particular generation sizing chooses a trade-off between these considerations. For example, a very large young generation may maximize throughput, but does so at the expense of footprint, promptness, and pause times. Young generation pauses can be minimized by using a small young generation at the expense of throughput. To a first approximation, the sizing of one generation does not affect the collection frequency and pause times for another generation.

There is no one right way to size generations. The best choice is determined by the way the application uses memory as well as user requirements. For this reason the virtual machine's default garbage collector may not be optimal, and may be overridden by the user in the form of command line options, described below.

2.2 Measurement

Throughput and footprint are best measured using metrics particular to the application. For example, throughput of a web server may be tested using a client load generator, while footprint of the server might be measured on the Solaris Operating Environment using the pmap command. On the other hand, pauses due to garbage collection are easily estimated by inspecting the diagnostic output of the virtual machine itself.

The command line argument -verbose:gc prints information at every collection. Note that the format of the -verbose:gc output is subject to change between releases of the J2SE platform. For example, here is output from a large server application:

  [GC 325407K->83000K(776768K), 0.2300771 secs]
  [GC 325816K->83372K(776768K), 0.2454258 secs]
  [Full GC 267628K->83769K(776768K), 1.8479984 secs]

Here we see two minor collections and one major one. The numbers before and after the arrow

325407K->83000K (in the first line)

indicate the combined size of live objects before and after garbage collection, respectively. After minor collections the count includes objects that aren't necessarily alive but can't be reclaimed, either because they are directly alive, or because they are within or referenced from the tenured generation. The number in parenthesis

(776768K)(in the first line)

is the total available space, not counting the space in the permanent generation, which is the total heap minus one of the survivor spaces. The minor collection took about a quarter of a second.

0.2300771 secs (in the first line)
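The fields described above can be pulled out of the log mechanically. The following Python sketch parses the three sample lines with a regular expression and totals the pause time. It assumes the 1.4.2-era line format shown here, which, as noted, may differ in other releases. The names `GC_LINE` and `parse_gc_line` are made up for this example.

```python
import re

# Pattern for lines like "[GC 325407K->83000K(776768K), 0.2300771 secs]".
GC_LINE = re.compile(
    r"\[(?P<kind>Full GC|GC) (?P<before>\d+)K->(?P<after>\d+)K"
    r"\((?P<total>\d+)K\), (?P<secs>[\d.]+) secs\]"
)


def parse_gc_line(line):
    """Return a dict describing one collection, or None if the line differs."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    return {
        "major": match.group("kind") == "Full GC",
        "before_kb": int(match.group("before")),
        "after_kb": int(match.group("after")),
        "total_kb": int(match.group("total")),
        "pause_secs": float(match.group("secs")),
    }


LOG = """\
  [GC 325407K->83000K(776768K), 0.2300771 secs]
  [GC 325816K->83372K(776768K), 0.2454258 secs]
  [Full GC 267628K->83769K(776768K), 1.8479984 secs]
"""

events = [e for e in map(parse_gc_line, LOG.splitlines()) if e is not None]
total_pause = sum(e["pause_secs"] for e in events)
print(f"{len(events)} collections, {total_pause:.3f} s paused in total")
```

Summing `pause_secs` separately for minor and major collections is a quick way to see which generation dominates pause time before adjusting any sizes.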

The format for the major collection in the third line is similar. The flag -XX:+PrintGCDetails prints additional information about the collections. This additional output is liable to change with each version of the virtual machine, since it follows the needs of Java Virtual Machine development. An example of the output with -XX:+PrintGCDetails for the J2SE platform, version 1.4.2, is shown here.

[GC [DefNew: 64575K->959K(64576K), 0.0457646 secs] 196016K->133633K(261184K), 0.0459067 secs]

indicates that the minor collection recovered about 98% of the young generation,

DefNew: 64575K->959K(64576K)

and took about 46 milliseconds.

0.0457646 secs

The usage of the entire heap was reduced to about 51%

196016K->133633K(261184K)

and that there was some slight additional overhead for the collection (over and above the collection of the young generation) as indicated by the final time:

0.0459067 secs

The flag -XX:+PrintGCTimeStamps will additionally print a time stamp at the start of each collection.

111.042: [GC 111.042: [DefNew: 8128K->8128K(8128K), 0.0000505 secs]111.042: [Tenured: 18154K->2311K(24576K), 0.1290354 secs] 26282K->2311K(32704K), 0.1293306 secs]

The collection starts about 111 seconds into the execution of the application. The minor collection starts at about the same time. Additionally the information is shown for a major collection delineated by Tenured. The tenured generation usage was reduced to about 10%

18154K->2311K(24576K)

and took about 0.13 seconds.

0.1290354 secs
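The percentages quoted for these -XX:+PrintGCDetails samples follow directly from the before, after, and capacity figures. A short Python check, with illustrative helper names, is sketched below.

```python
def reclaimed_pct(before_kb, after_kb):
    """Share of the space in use before the collection that was recovered."""
    return 100.0 * (before_kb - after_kb) / before_kb


def occupancy_pct(after_kb, capacity_kb):
    """Share of the capacity still in use after the collection."""
    return 100.0 * after_kb / capacity_kb


# DefNew: 64575K->959K(64576K)
print(f"young generation recovered: {reclaimed_pct(64575, 959):.1f}%")
# 196016K->133633K(261184K)
print(f"whole heap in use afterwards: {occupancy_pct(133633, 261184):.1f}%")
# Tenured: 18154K->2311K(24576K)
print(f"tenured in use afterwards: {occupancy_pct(2311, 24576):.1f}%")
```

The three results land at roughly 98.5%, 51.2%, and 9.4%, in line with the "about 98%", "about 51%", and "about 10%" figures in the text.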

3 Sizing the Generations

A number of parameters affect generation size. The following diagram illustrates the difference between committed space and virtual space in the heap. At initialization of the virtual machine, the entire space for the heap is reserved. The size of the space reserved can be specified with the -Xmx option. If the value of the -Xms parameter is smaller than the value of the -Xmx parameter, not all of the space that is reserved is immediately committed to the virtual machine. The uncommitted space is labeled "virtual" in this figure. The different parts of the heap (permanent generation, tenured generation, and young generation) can grow to the limit of the virtual space as needed.

Some of the parameters are ratios of one part of the heap to another. For example the parameter NewRatio denotes the relative size of the tenured generation to the young generation. These parameters are discussed below.

[Figure: options affecting sizing]
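The difference between committed and reserved space can be observed from inside a running program. This is a minimal sketch, assuming that Runtime.totalMemory() approximates the committed space and Runtime.maxMemory() approximates the reserved limit; the class name HeapSnapshot is hypothetical.

```java
final class HeapSnapshot {
    /** Returns {used, committed, max} in bytes as reported by java.lang.Runtime. */
    static long[] sample() {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory();        // space currently committed to the heap
        long used = committed - rt.freeMemory();  // committed space that holds objects
        long max = rt.maxMemory();                // upper limit the heap may grow to
        return new long[] { used, committed, max };
    }

    public static void main(String[] args) {
        long[] s = sample();
        System.out.println("used      = " + s[0] / 1024 + "K");
        System.out.println("committed = " + s[1] / 1024 + "K");
        System.out.println("max       = " + s[2] / 1024 + "K");
    }
}
```

Running it once with a small -Xms and once with -Xms equal to -Xmx should show the committed figure starting low in the first case and near the maximum in the second.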

3.1 Total Heap

Since collections occur when generations fill up, the frequency of collections is inversely proportional to the amount of memory available, so throughput generally improves as the heap grows. Total available memory is the most important factor affecting garbage collection performance.

By default, the virtual machine grows or shrinks the heap at each collection to try to keep the proportion of free space to live objects at each collection within a specific range. This target range is set as a percentage by the parameters -XX:MinHeapFreeRatio=<minimum> and -XX:MaxHeapFreeRatio=<maximum>, and the total size is bounded below by -Xms and above by -Xmx. The default parameters for the Solaris Operating Environment (SPARC Platform Edition) are shown in this table:

Parameter                  Default
-XX:MinHeapFreeRatio=      40
-XX:MaxHeapFreeRatio=      70
-Xms                       3670k
-Xmx                       64m

With these parameters if the percent of free space in a generation falls below 40%, the size of the generation will be expanded so as to have 40% of the space free, assuming the size of the generation has not already reached its limit. Similarly, if the percent of free space exceeds 70%, the size of the generation will be shrunk so as to have only 70% of the space free as long as shrinking the generation does not decrease it below the minimum size of the generation.
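The expand and shrink rule can be written out as arithmetic. This is a simplified sketch of the rule as described in the paragraph above, not the virtual machine's actual resizing code; the class name, the method name, and the use of whole kilobytes are assumptions.

```java
final class FreeRatioResize {
    /**
     * Returns the capacity a generation would be resized to after a collection.
     * All sizes are in kilobytes; the ratios are percentages such as 40 and 70.
     */
    static long resize(long used, long capacity,
                       int minFreePct, int maxFreePct,
                       long minCapacity, long maxCapacity) {
        long free = capacity - used;
        long target = capacity;
        if (free * 100 < (long) minFreePct * capacity) {
            // Too little free space: expand until minFreePct of the space is free.
            target = used * 100 / (100 - minFreePct);
        } else if (free * 100 > (long) maxFreePct * capacity) {
            // Too much free space: shrink until only maxFreePct of the space is free.
            target = used * 100 / (100 - maxFreePct);
        }
        // The result never leaves the configured minimum and maximum sizes.
        return Math.max(minCapacity, Math.min(maxCapacity, target));
    }

    public static void main(String[] args) {
        // 900K live in a 1000K generation leaves 10% free, below the 40% minimum.
        System.out.println(resize(900, 1000, 40, 70, 500, 4000));
    }
}
```

With 900K used out of 1000K the generation grows to 1500K, which leaves exactly 40% free; with only 100K used it would shrink, but not below the 500K minimum passed in.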

Large server applications often experience two problems with these defaults. One is slow startup, because the initial heap is small and must be resized over many major collections. A more pressing problem is that the default maximum heap size is unreasonably small for most server applications. The rules of thumb for server applications are:

  • Unless you have problems with pauses, try granting as much memory as possible to the virtual machine. The default size (64MB) is often too small.

  • Setting -Xms and -Xmx to the same value increases predictability by removing the most important sizing decision from the virtual machine. On the other hand, the virtual machine can't compensate if you make a poor choice.

  • Be sure to increase the memory as you increase the number of processors, since allocation can be parallelized.

A description of other virtual machine options can be found at

http://java.sun.com/docs/hotspot/VMOptions.html

3.2 The Young Generation

The second most influential knob is the proportion of the heap dedicated to the young generation. The bigger the young generation, the less often minor collections occur. However, for a bounded heap size a larger young generation implies a smaller tenured generation, which will increase the frequency of major collections. The optimal choice depends on the lifetime distribution of the objects allocated by the application.

By default, the young generation size is controlled by NewRatio. For example, setting -XX:NewRatio=3 means that the ratio between the young and tenured generation is 1:3. In other words, the combined size of the eden and survivor spaces will be one fourth of the total heap size.

The parameters NewSize and MaxNewSize bound the young generation size from below and above. Setting these equal to one another fixes the young generation, just as setting -Xms and -Xmx equal fixes the total heap size. This is useful for tuning the young generation at a finer granularity than the integral multiples allowed by NewRatio.
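The relationship between NewRatio, NewSize and MaxNewSize can be illustrated with a few lines of arithmetic. This is a sketch under the assumption that the young generation is simply heap / (NewRatio + 1), clamped to the two bounds; the real virtual machine also rounds sizes for alignment, and the class and method names are hypothetical.

```java
final class YoungGenSize {
    /** Young generation size when young:tenured is 1:newRatio. */
    static long fromNewRatio(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    /** Applies the NewSize (lower) and MaxNewSize (upper) bounds to the ratio result. */
    static long bounded(long heap, int newRatio, long newSize, long maxNewSize) {
        return Math.max(newSize, Math.min(maxNewSize, fromNewRatio(heap, newRatio)));
    }

    public static void main(String[] args) {
        long heapMb = 256;
        // NewRatio=3 gives one fourth of the heap to the young generation.
        System.out.println(fromNewRatio(heapMb, 3) + "m");
        // Equal bounds fix the young generation size regardless of NewRatio.
        System.out.println(bounded(heapMb, 3, 80, 80) + "m");
    }
}
```

The second call shows why equal NewSize and MaxNewSize values allow sizes such as 80m that no integral NewRatio would produce for a 256m heap.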

3.2.1 Young Generation Guarantee

In an ideal minor collection the live objects are copied from one part of the young generation (the eden space plus the first survivor space) to another part of the young generation (the second survivor space). However, there is no guarantee that all the live objects will fit into the second survivor space. To ensure that the minor collection can complete even if all the objects are live, enough free memory must be reserved in the tenured generation to accommodate all the live objects. In the worst case, this reserved memory is equal to the size of eden plus the objects in non-empty survivor space. When there isn't enough memory available in the tenured generation for this worst case, a major collection will occur instead. This policy is fine for small applications, because the memory reserved in the tenured generation is typically only virtually committed but not actually used. But for applications needing the largest possible heap, an eden bigger than half the virtually committed size of the heap is useless: only major collections would occur.

Note that the young generation guarantee applies to all of the collectors with the exception of the throughput collector. The throughput collector will proceed with a young generation collection, and if the tenured generation cannot accommodate all the promotions from the young generation, both generations are collected.

If desired, the parameter SurvivorRatio can be used to tune the size of the survivor spaces, but this is often not as important for performance. For example, -XX:SurvivorRatio=6 sets the ratio between each survivor space and eden to be 1:6. In other words, each survivor space will be one eighth of the young generation (not one seventh, because there are two survivor spaces).
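The "one eighth, not one seventh" remark follows from counting both survivor spaces. A minimal sketch of that arithmetic, with hypothetical names and ignoring any alignment the virtual machine applies:

```java
final class SurvivorSizing {
    /** Size of each survivor space when survivor:eden is 1:survivorRatio. */
    static long survivorSize(long young, int survivorRatio) {
        // The young generation holds eden (survivorRatio parts) plus two survivor spaces.
        return young / (survivorRatio + 2);
    }

    /** Eden is whatever remains after the two survivor spaces. */
    static long edenSize(long young, int survivorRatio) {
        return young - 2 * survivorSize(young, survivorRatio);
    }

    public static void main(String[] args) {
        long youngMb = 64;
        System.out.println("survivor = " + survivorSize(youngMb, 6) + "m");
        System.out.println("eden     = " + edenSize(youngMb, 6) + "m");
    }
}
```

For a 64m young generation and SurvivorRatio=6 this gives two 8m survivor spaces and a 48m eden, which is the 1:6 ratio.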

If survivor spaces are too small, copying collection overflows directly into the tenured generation. If survivor spaces are too large, they will be uselessly empty. At each garbage collection the virtual machine chooses a threshold number of times an object can be copied before it is tenured. This threshold is chosen to keep the survivors half full. An option, -XX:+PrintTenuringDistribution, can be used to show this threshold and the ages of objects in the new generation. It is also useful for observing the lifetime distribution of an application.

Here are the default values for the Solaris Operating Environment (SPARC Platform Edition):

Parameter        Default
NewRatio         2 (client JVM: 8)
NewSize          2228k
MaxNewSize       unlimited
SurvivorRatio    32

The maximum size of the young generation will be calculated from the maximum size of the total heap and NewRatio. The "unlimited" default value for MaxNewSize means that the calculated value is not limited by MaxNewSize unless a value for MaxNewSize is specified on the command line.

The rules of thumb for server applications are:

  • First decide the total amount of memory you can afford to give the virtual machine. Then graph your own performance metric against young generation sizes to find the best setting.

  • Unless you find problems with excessive major collection or pause times, grant plenty of memory to the young generation.

  • Increasing the young generation becomes counterproductive at half the total heap or less (whenever the young generation guarantee cannot be met).

  • Be sure to increase the young generation as you increase the number of processors, since allocation can be parallelized.

4 Types of Collectors

The discussion to this point has been about the default collector. In the J2SE platform, version 1.4.2 there are three additional collectors. Each is a generational collector which has been implemented to emphasize the throughput of the application or low garbage collection pause times.

  1. The throughput collector: this collector uses a parallel version of the young generation collector. It is used if the -XX:+UseParallelGC option is passed on the command line. The tenured generation collector is the same as the default collector.

  2. The concurrent low pause collector: this collector is used if the -XX:+UseConcMarkSweepGC option is passed on the command line. The concurrent collector is used to collect the tenured generation and does most of the collection concurrently with the execution of the application. The application is paused for short periods during the collection. A parallel version of the young generation copying collector is used with the concurrent collector (i.e. if -XX:+UseConcMarkSweepGC is used on the command line then the flag UseParNewGC is also set to true if it is not otherwise explicitly set on the command line).

  3. The incremental (sometimes called train) low pause collector: this collector is used only if -Xincgc is passed on the command line. By careful bookkeeping, the incremental garbage collector collects just a portion of the tenured generation at each minor collection, trying to spread the large pause of a major collection over many minor collections. However, it is even slower than the default tenured generation collector when considering overall throughput.

Note that -XX:+UseParallelGC should not be used with -XX:+UseConcMarkSweepGC. The argument parsing in the J2SE platform, version 1.4.2 should only allow legal combinations of command line options for garbage collectors, but earlier releases may not detect all illegal combinations and the results for illegal combinations are unpredictable.

Always try the default collector on your application before trying one of the other collectors. Tune the heap size for your application and then consider what requirements of your application are not being met. Based on the latter, consider using one of the other collectors.

4.1 When to Use the Throughput Collector

Use the throughput collector when you want to improve the performance of your application with larger numbers of processors. In the default collector garbage collection is done by one thread, and therefore garbage collection adds to the serial execution time of the application. The throughput collector uses multiple threads to execute a minor collection and so reduces the serial execution time of the application. A typical situation is one in which the application has a large number of threads allocating objects. In such an application it is often the case that a large young generation is needed.

4.2 The Throughput Collector

The throughput collector is a generational collector similar to the default collector but with multiple threads used to do the minor collection. The major collections are essentially the same as with the default collector. By default on a host with N CPUs, the throughput collector uses N garbage collector threads in the collection. The number of garbage collector threads can be controlled with a command line option (see below). On a host with 1 CPU the throughput collector will likely not perform as well as the default collector because of the additional overhead for the parallel execution (e.g., synchronization costs). On a host with 2 CPUs the throughput collector generally performs as well as the default garbage collector and a reduction in the minor garbage collector pause times can be expected on hosts with more than 2 CPUs.

The throughput collector can be enabled by using command line flag -XX:+UseParallelGC. The number of garbage collector threads can be controlled with the ParallelGCThreads command line option (-XX:ParallelGCThreads=<desired number>). The size of the heap needed with the throughput collector to first order is the same as with the default collector. Turning on the throughput collector should just make the minor collection pauses shorter. Because there are multiple garbage collector threads participating in the minor collection there is a small possibility of fragmentation due to promotions from the young generation to the tenured generation during the collection. Each garbage collection thread reserves a part of the tenured generation for promotions and the division of the available space into these "promotion buffers" can cause a fragmentation effect. Reducing the number of garbage collector threads will reduce this fragmentation effect as will increasing the size of the tenured generation.

4.2.1 Adaptive Sizing

A feature available with the throughput collector in the J2SE platform, version 1.4.1 and later releases is the use of adaptive sizing (-XX:+UseAdaptiveSizePolicy), which is on by default. Adaptive sizing keeps statistics about garbage collection times, allocation rates, and the free space in the heap after a collection. These statistics are used to make decisions regarding changes to the sizes of the young generation and tenured generation so as to best fit the behavior of the application. Use the command line option -verbose:gc to see the resulting sizes of the heap.

4.2.2 AggressiveHeap

The -XX:+AggressiveHeap option inspects the machine resources (size of memory and number of processors) and attempts to set various parameters to be optimal for long-running, memory allocation-intensive jobs. It was originally intended for machines with large amounts of memory and a large number of CPUs, but in the J2SE platform, version 1.4.1 and later it has shown itself to be useful even on four processor machines. With this option the throughput collector (-XX:+UseParallelGC) is used along with adaptive sizing (-XX:+UseAdaptiveSizePolicy). The physical memory on the machines must be at least 256MB before AggressiveHeap can be used. The size of the initial heap is calculated based on the size of the physical memory and attempts to make maximal use of the physical memory for the heap (i.e., the algorithms attempt to use heaps nearly as large as the total physical memory).

4.2.3 Measurements with the Throughput Collector

The verbose garbage collector output is the same for the throughput collector as with the default collector.

4.3 When to Use the Concurrent Low Pause Collector

Use the concurrent low pause collector if your application would benefit from shorter garbage collector pauses and can afford to share processor resources with the garbage collector when the application is running. Typically applications which have a relatively large set of long-lived data (a large tenured generation), and run on machines with two or more processors tend to benefit from the use of this collector. However, this collector should be considered for any application with a low pause time requirement. Optimal results have been observed for interactive applications with tenured generations of a modest size on a single processor.

4.4 The Concurrent Low Pause Collector

The concurrent low pause collector is a generational collector similar to the default collector. The tenured generation is collected concurrently with this collector.

This collector attempts to reduce the pause times needed to collect the tenured generation. It uses a separate garbage collector thread to do parts of the major collection concurrently with the application threads. The concurrent collector is enabled with the command line option -XX:+UseConcMarkSweepGC. For each major collection the concurrent collector will pause all the application threads for a brief period at the beginning of the collection and toward the middle of the collection. The second pause tends to be the longer of the two pauses and multiple threads are used to do the collection work during that pause. The remainder of the collection is done with a garbage collector thread that runs concurrently with the application. The minor collections are done in a manner similar to the default collector although multiple garbage collector threads are used to reduce the minor collection times. See "Parallel Minor Collection Options with the Concurrent Collector" below for information on using multiple threads with the concurrent low pause collector.

The techniques used in the concurrent collector (for the collection of the tenured generation) are described at:

http://research.sun.com/techrep/2000/abstract-88.html

4.4.1 Overhead of Concurrency

The concurrent collector trades processor resources (which would otherwise be available to the application) for shorter major collection pause times. The concurrent part of the collection is done by a single garbage collection thread. On an N processor system when the concurrent part of the collection is running, it will be using 1/Nth of the available processor power. On a uniprocessor machine it would be fortuitous if it provided any advantage. It conceivably could break up a single long pause into several shorter pauses (a pause being defined in this case as the absence of any application threads running) but that is not the intent of the concurrent collector. The concurrent collector also has some additional overhead costs that will take away from the throughput of the applications, and some inherent disadvantages (e.g., fragmentation) for some types of applications. On a two processor machine there is a processor available for application threads while the concurrent part of the collection is running, so running the concurrent garbage collector thread does not "pause" the application. There may be reduced pause times as intended for the concurrent collector but again fewer processor resources are available to the application and some slowdown of the application should be expected. As N increases, the reduction in processor resources due to the running of the concurrent garbage collector thread becomes less, and the advantages of the concurrent collector become more.

4.4.2 Young Generation Guarantee

As with the default collector a minor collection may require enough space in the tenured generation to accommodate all the objects in eden and one survivor space. Because fragmentation can occur in a concurrent collection, the requirement for this guarantee is more severe with the concurrent collector. There has to be enough contiguous space available in the tenured generation for all the objects in eden and one survivor space because there is no a priori way (except at a significant performance cost) to know the distribution of the sizes in eden and the one survivor space. A larger heap is almost always needed when the concurrent collector is used as compared to the default collector. As with the default collector the space in the tenured generation must be reserved but does not actually have to be used. As a rough estimate choose the appropriate young generation and tenured generation heap sizes as would be appropriate for the default collector, and then increase the tenured generation size by the equivalent of the young generation size for the concurrent collector. This is a very rough approximation and the correct values are application dependent.

4.4.3 Full Collections

The concurrent collector uses a single garbage collector thread that runs simultaneously with the application threads with the goal of completing the collection of the tenured generation before it becomes full. In normal operation, the concurrent collector is able to do most of its work with the application threads still running, so only brief pauses are seen by the application threads. As a fall back, if the concurrent collector is unable to finish before the tenured generation fills up, the application is paused and the collection is completed with all the application threads stopped. Such collections with the application stopped are referred to as full collections and are a sign that some adjustments need to be made to the concurrent collection parameters.

4.4.4 Floating Garbage

A garbage collector works to find the live objects in the heap. Because application threads and the garbage collector thread run concurrently, objects that are found to be alive by the garbage collector thread may become dead by the time collection finishes. Such objects are referred to as floating garbage. The amount of floating garbage depends on the length of the concurrent collection (more time for the application threads to discard an object) and on the particulars of the application. As a rough rule of thumb try increasing the size of the tenured generation by 20% to account for the floating garbage. Floating garbage is collected at the next garbage collection.
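The two rough sizing rules for the concurrent collector (add the young generation size for the young generation guarantee in 4.4.2, add 20% for floating garbage here) can be expressed as a small calculation. This is only a sketch of those rules of thumb; the names are hypothetical, and applying both rules one after the other is an assumption rather than something the text prescribes.

```java
final class CmsTenuredEstimate {
    /** Rule from 4.4.2: the default-collector tenured size plus the young generation size. */
    static long withYoungGuarantee(long tenuredDefault, long young) {
        return tenuredDefault + young;
    }

    /** Rule from 4.4.4: 20% more tenured space to absorb floating garbage. */
    static long withFloatingGarbage(long tenured) {
        return tenured + tenured / 5;
    }

    public static void main(String[] args) {
        long tenuredMb = 192;  // size that worked with the default collector
        long youngMb = 64;
        long step1 = withYoungGuarantee(tenuredMb, youngMb);
        long step2 = withFloatingGarbage(step1);
        System.out.println("starting point for the tenured generation: " + step2 + "m");
    }
}
```

The result is a starting point for measurement, since the text stresses that the correct values are application dependent.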

4.4.5 Pauses

The concurrent collector pauses an application twice during a concurrent collection cycle. The first pause is to mark as live the objects directly reachable from the roots (e.g., objects on thread stack, static objects and so on) and elsewhere in the heap (e.g., the young generation). This first pause is referred to as the initial mark. The second pause comes at the end of the marking phase and finds objects that were missed during the concurrent marking phase due to the concurrent execution of the application threads. The second pause is referred to as the remark.

4.4.6 Concurrent Phases

The concurrent marking occurs between the initial mark and the remark. During the concurrent marking the concurrent garbage collector thread is executing and using processor resources that would otherwise be available to the application. After the remark there is a concurrent sweeping phase which collects the dead objects. During this phase the concurrent garbage collector thread is again taking processor resources from the application. After the sweeping phase the concurrent collector sleeps until the start of the next major collection.

4.4.7 Measurements with the Concurrent Collector

Below is output for -verbose:gc with -XX:+PrintGCDetails (some details have been removed). Note that the output for the concurrent collector is interspersed with the output from the minor collections. Typically many minor collections will occur during a concurrent collection cycle. The CMS-initial-mark: line indicates the start of the concurrent collection cycle. The CMS-concurrent-mark: line indicates the end of the concurrent marking phase, and the CMS-concurrent-sweep: line marks the end of the concurrent sweeping phase. Not discussed before is the precleaning phase, indicated by CMS-concurrent-preclean:, which represents work that can be done concurrently in preparation for the remark phase, CMS-remark. The final phase is indicated by CMS-concurrent-reset: and is in preparation for the next concurrent collection.

[GC [1 CMS-initial-mark: 13991K(20288K)] 14103K(22400K), 0.0023781 secs]

[GC [DefNew: 2112K->64K(2112K), 0.0837052 secs] 16103K->15476K(22400K), 0.0838519 secs]

...

[GC [DefNew: 2077K->63K(2112K), 0.0126205 secs] 17552K->15855K(22400K), 0.0127482 secs]

[CMS-concurrent-mark: 0.267/0.374 secs]

[GC [DefNew: 2111K->64K(2112K), 0.0190851 secs] 17903K->16154K(22400K), 0.0191903 secs]

[CMS-concurrent-preclean: 0.044/0.064 secs]

[GC[1 CMS-remark: 16090K(20288K)] 17242K(22400K), 0.0210460 secs]

[GC [DefNew: 2112K->63K(2112K), 0.0716116 secs] 18177K->17382K(22400K), 0.0718204 secs]

[GC [DefNew: 2111K->63K(2112K), 0.0830392 secs] 19363K->18757K(22400K), 0.0832943 secs]

...

[GC [DefNew: 2111K->0K(2112K), 0.0035190 secs] 17527K->15479K(22400K), 0.0036052 secs]

[CMS-concurrent-sweep: 0.291/0.662 secs]

[GC [DefNew: 2048K->0K(2112K), 0.0013347 secs] 17527K->15479K(27912K), 0.0014231 secs]

[CMS-concurrent-reset: 0.016/0.016 secs]

[GC [DefNew: 2048K->1K(2112K), 0.0013936 secs] 17527K->15479K(27912K), 0.0014814 secs]

The initial mark pause is typically short relative to the minor collection pause time. The times of the concurrent phases (concurrent mark, concurrent precleaning, and concurrent sweep) may be relatively long (as in the example above) when compared to a minor collection pause but the application is not paused during the concurrent phases. The remark pause is affected by the specifics of the application (e.g., a higher rate of modifying objects can increase this pause) and the time since the last minor collection (i.e., more objects in the young generation may increase this pause).
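Since only the initial mark and the remark stop the application, it can be useful to total just those two pauses for a cycle. The following sketch assumes the line layout of the sample output above, where each pause line ends in "<seconds> secs]"; the class name is hypothetical and other virtual machine versions may print different fields.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CmsPauseTotals {
    // Captures the trailing "<seconds> secs]" field of a log line.
    private static final Pattern SECS = Pattern.compile("([0-9.]+) secs\\]\\s*$");

    /** Sums the initial mark and remark times; concurrent phases and minor collections are ignored. */
    static double totalPauseSeconds(String[] lines) {
        double total = 0.0;
        for (String line : lines) {
            if (line.contains("CMS-initial-mark") || line.contains("CMS-remark")) {
                Matcher m = SECS.matcher(line);
                if (m.find()) {
                    total += Double.parseDouble(m.group(1));
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String[] log = {
            "[GC [1 CMS-initial-mark: 13991K(20288K)] 14103K(22400K), 0.0023781 secs]",
            "[CMS-concurrent-mark: 0.267/0.374 secs]",
            "[GC[1 CMS-remark: 16090K(20288K)] 17242K(22400K), 0.0210460 secs]",
            "[CMS-concurrent-sweep: 0.291/0.662 secs]"
        };
        System.out.printf("CMS pauses in this cycle: %.7f secs%n", totalPauseSeconds(log));
    }
}
```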

4.4.8 Parallel Minor Collection Options with the Concurrent Collector

On a multiple processor platform, the default for the UseParNewGC option is true.

If the UseParNewGC option is in use the remark pauses may be decreased with the CMSParallelRemarkEnabled option.

-XX:+CMSParallelRemarkEnabled

4.5 When to Use the Incremental Low Pause Collector

Use the incremental low pause collector when your application can afford to trade longer and more frequent young generation garbage collection pauses for shorter tenured generation pauses. A typical situation is one in which a larger tenured generation is required (lots of long-lived objects), a smaller young generation will suffice (most objects are short-lived and don't survive the young generation collection), and only a single processor is available.

4.6 The Incremental Low Pause Collector

The incremental low pause collector is a generational collector similar to the default collector. The minor collections are done with the same young generation collector as the default collector. Do not use either -XX:+UseParallelGC or -XX:+UseParNewGC with this collector. The major collections are done incrementally on the tenured generation.

This collector (also known as the train collector) collects portions of the tenured generation at each minor collection. The goal of the incremental collector is to avoid very long major collection pauses by doing portions of the major collection work at each minor collection. The incremental collector will sometimes find that a non-incremental major collection (as is done in the default collector) is required in order to avoid running out of memory.

This collector can cause some fragmentation of the heap, so sometimes a larger tenured generation heap size will be required, as compared to the default mark-sweep-compact collector.

In order to collect a portion of the tenured generation at each minor collection additional information is maintained by the incremental collector. The overhead of maintaining this information increases the overall cost of garbage collection and throughput is typically less than when using the default mark sweep collector.

The incremental collector should be used by first trying the default collector and sizing the heap as discussed for the default collector. If the major pauses cannot be reduced to an acceptable level by adjusting the sizes of the generations in the heap, try the incremental collector with the same generation sizes first. Then vary the generation sizes to fit your application.

  • If full collections are occurring, the incremental collector may not be able to incrementally collect the tenured generation fast enough to finish before the tenured generation runs out of memory. Try decreasing the size of the young generation in order to increase the number of young generation collections.

  • If full collections are occurring because the young generation guarantee cannot be met, then fragmentation may be the cause. The failure to guarantee a young generation collection is indicated by a young generation collection that does not recover any space (see the example below). Increase the size of the tenured generation to offset the fragmentation. The space in the larger generation may not be used but it will be available for the young generation guarantee.

4.6.1 Measurements with the Incremental Collector

For details on the incremental collector in addition to the -verbose:gc command line flag, add the flag -XX:+PrintGCDetails (which first became available in the J2SE platform, version 1.4.1). A typical line of output with the details flag will look like this.

[GC [DefNew: 2074K->25K(2112K), 0.0050065 secs][Train: 1676K->1633K(63424K), 0.0082112 secs] 3750K->1659K(65536K), 0.0138017 secs]

This line says that a young generation collection was done and took about 5 milliseconds. An incremental collection was done (indicated by the Train: part of the line) and took about 8 milliseconds. If a full collection is done instead of an incremental collection, the line will include Train MSC: which indicates a full (mark-sweep-compact) collection was done.

[GC [DefNew: 2049K->2049K(2112K), 0.0003304 secs][Train MSC: 61809K->357K(63424K), 0.3956982 secs] 63859K->394K(65536K), 0.3987650 secs]

Also in this line you can see that the minor collection was not effective. Before the collection 2049KB of space in the young generation was occupied and after the collection the same amount was occupied. This indicates that the contiguous free space in the tenured generation was not enough to satisfy the young generation guarantee.
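A log scan can flag this symptom automatically. This is a minimal sketch that assumes the "DefNew: before->after(capacity)" layout of the sample lines above; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class YoungGuaranteeCheck {
    // Captures the young generation occupancy before and after the collection.
    private static final Pattern DEF_NEW = Pattern.compile("DefNew: (\\d+)K->(\\d+)K\\(");

    /** True when the line reports a young generation collection that freed no space. */
    static boolean youngCollectionRecoveredNothing(String line) {
        Matcher m = DEF_NEW.matcher(line);
        return m.find() && Long.parseLong(m.group(2)) >= Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        String normal = "[GC [DefNew: 2074K->25K(2112K), 0.0050065 secs]"
                + "[Train: 1676K->1633K(63424K), 0.0082112 secs] 3750K->1659K(65536K), 0.0138017 secs]";
        String failed = "[GC [DefNew: 2049K->2049K(2112K), 0.0003304 secs]"
                + "[Train MSC: 61809K->357K(63424K), 0.3956982 secs] 63859K->394K(65536K), 0.3987650 secs]";
        System.out.println(youngCollectionRecoveredNothing(normal));
        System.out.println(youngCollectionRecoveredNothing(failed));
    }
}
```

Lines for which the check is true are candidates for the fragmentation problem described above and suggest trying a larger tenured generation.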

5 Other Considerations

For most applications the permanent generation is not relevant to garbage collector performance. However, some applications dynamically generate and load many classes. For instance, some implementations of JSP™ pages do this. If necessary, the maximum permanent generation size can be increased with MaxPermSize.

Some applications interact with garbage collection by using finalization and weak/soft/phantom references. These features can create performance artifacts at the Java programming language level. An example of this is relying on finalization to close file descriptors, which makes an external resource (descriptors) dependent on garbage collection promptness. Relying on garbage collection to manage resources other than memory is almost always a bad idea.
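The alternative to relying on finalization is to release the resource explicitly at a known point. The sketch below uses try-with-resources, which requires Java 7 or later and therefore postdates the 1.4.2 platform this document describes; on older platforms a try/finally block with an explicit close() serves the same purpose. The class name is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ExplicitClose {
    /** Reads the first line of a file and closes the descriptor before returning. */
    static String firstLine(Path file) throws IOException {
        // The reader is closed when the block exits, whether or not an exception is thrown,
        // so the file descriptor never waits for a garbage collection.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("gc-demo", ".txt");
        Files.write(tmp, "hello\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(firstLine(tmp));
        Files.delete(tmp);
    }
}
```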

Another way applications can interact with garbage collection is by invoking full garbage collections explicitly, such as through the System.gc() call. These calls force major collection, and inhibit scalability on large systems. The performance impact of explicit garbage collections can be measured by disabling explicit garbage collections using the flag -XX:+DisableExplicitGC.
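One way to see whether explicit calls actually trigger collections is to read the collector counters before and after the call. This sketch uses the java.lang.management API, which was introduced in Java 5 and is therefore not available on the 1.4.2 platform described here; the class name is hypothetical, and System.gc() is only a request, so the difference may be zero.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class ExplicitGcProbe {
    /** Total number of collections reported so far by all collectors in this virtual machine. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = bean.getCollectionCount();  // -1 when the collector reports no count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        System.gc();
        long after = totalCollections();
        System.out.println("collections around System.gc(): " + (after - before));
    }
}
```

Running the program with and without -XX:+DisableExplicitGC gives a quick indication of whether the flag is suppressing the explicit request.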

One of the most commonly encountered uses of explicit garbage collection occurs with RMI's distributed garbage collection (DGC). Applications using RMI refer to objects in other virtual machines. Garbage can't be collected in these distributed applications without occasional local collection, so RMI forces periodic full collection. The frequency of these collections can be controlled with properties. For example,

java -Dsun.rmi.dgc.client.gcInterval=3600000
-Dsun.rmi.dgc.server.gcInterval=3600000 ...

specifies explicit collection once per hour instead of the default rate of once per minute. However, this may also cause some objects to take much longer to be reclaimed. These properties can be set as high as Long.MAX_VALUE to make the time between explicit collections effectively infinite, if there is no desire for an upper bound on the timeliness of DGC activity.
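The interval values are given in milliseconds, which is easy to get wrong by a factor of 1000. A small sketch that derives the two properties from a number of hours; the class and method names are hypothetical, and only the property names come from the example above.

```java
import java.util.concurrent.TimeUnit;

final class DgcInterval {
    /** Builds the two -D options for a DGC interval given in hours. */
    static String flags(long hours) {
        long millis = TimeUnit.HOURS.toMillis(hours);
        return "-Dsun.rmi.dgc.client.gcInterval=" + millis
                + " -Dsun.rmi.dgc.server.gcInterval=" + millis;
    }

    public static void main(String[] args) {
        // One hour corresponds to the 3600000 used in the example above.
        System.out.println("java " + flags(1) + " ...");
    }
}
```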

The Solaris 8 Operating Environment supports an alternate version of libthread that binds threads to light-weight processes (LWPs) directly. Some applications can benefit greatly from the use of the alternate libthread. This is a potential benefit for any threaded application. To try this, set the environment variable LD_LIBRARY_PATH to include /usr/lib/lwp before launching the virtual machine. The alternate libthread is the default libthread in the Solaris 9 Operating Environment.

Soft references are cleared less aggressively in the server virtual machine than the client. The rate of clearing can be slowed by increasing a parameter in this way: -XX:SoftRefLRUPolicyMSPerMB=10000. The default value is 1000, or one second per megabyte.

6 Conclusion

Garbage collection can become a bottleneck in different applications depending on the requirements of the applications. By understanding the requirements of the application and the garbage collection options, it is possible to minimize the impact of garbage collection.

7 Other Documentation

7.1 Example of Output

The GC output examples document contains examples for different types of garbage collector behavior. The examples show the diagnostic output from the garbage collector and explain how to recognize various problems. Examples from different collectors are included.

7.2 Frequently Asked Questions

A FAQ is included that contains answers to specific questions. The level of detail in the FAQ is generally greater than in this tuning document.

 
