Memory System Behavior of Java-Based Middleware
Martin Karlsson, Kevin E. Moore, Erik Hagersten and David A. Wood
February 11, 2003
Ninth International Symposium on High Performance Computer Architecture

Java-Based Middleware: An Important New Workload for Multiprocessor Servers
Java-based middleware connects Web pages to databases
Web-based applications are deployed in 3-tier systems
  – Clients
  – Middleware (e.g. application servers)
  – Databases
Rapid growth
Diverse clients will increase the role of middleware
[Figure: browsers/thin clients connect over a LAN/WAN to the middleware (Web server, business logic), which connects to the databases.]
HPCA February 11, 2003 Memory System Performance of Java-Based Middleware

Java Middleware Benchmarks
SPECjbb2000
  – Approximates a 3-tier system in a single application
  – Will run on any JVM without any 3rd-party software
  – Easy to install, tune and run (setup time measured in hours)
ECperf (now SPECjAppServer2001)
  – Runs on a real 3-tier system
  – Easy to isolate the behavior of individual tiers
  – Requires expensive 3rd-party software (application server and database)
  – Difficult to install, tune and run (setup time measured in weeks)

Outline
Background
  – 3-Tiered Systems
  – ECperf and SPECjbb2000
Hardware monitoring experiments
  – System size scaling
  – Benchmark scaling
Simulation experiments
  – Cache performance
Design decisions
  – Shared caches

Application Servers & 3-Tiered Systems
3-tiered systems are common in e-commerce and B2B applications
Application servers provide a framework for middle-tier applications
  – Presentation
  – Business rules
Services include
  – Database connectivity
  – Client connectivity
  – Resource management
Application servers are often implemented in Java
[Figure: Tier 1: users/customers (e-commerce) and other businesses (B2B). Tier 2: application server (presentation logic, business rules). Tier 3: database.]

ECperf
Runs on top of existing commercial applications (database and application server)
  – Adds cost, tuning effort
  – Restricted source code
Consists of 4 networked programs
  – Application server
  – Database
  – Supplier emulator
  – Driver
Runs on multiple machines
  – Easy to isolate tiers
[Figure: the driver (order agents, manufacturing agents) and the supplier emulator (emulator servlet) talk to the application server. The application server contains a servlet host (presentation logic: orders & mfg servlets, supplier servlets) and an EJB container (business rules as Java Beans: Mfg, Supplier, Corp, Orders), backed by the database.]

ECperf (continued)
Measurements on the middle tier only (the application server)

SPECjbb2000
Single JVM
Database emulated by trees of Java objects
Easy to install, tune and run
Available source code
Difficult to measure the behavior of individual tiers
[Figure: a single benchmark process containing client threads, the business logic engine, and the object trees.]

SPECjbb2000 (continued)
Measurements include database and client code

Monitoring Experiments
Hardware
  – Sun E6000 (SPECjbb2000, application server, database)
      16 processors (248 MHz UltraSPARC II)
      2 GB RAM
      1 MB unified L2 cache
  – Sun Netra (emulator, driver)
      1 processor (500 MHz UltraSPARC IIe)
Software
  – HotSpot 1.3.1 JVM
  – Solaris 8

Benchmark Settings and Alterations
SPECjbb2000
  – Increased warm-up and measurement intervals
      60 s warm-up and 6 min measurement
  – Picked 1 value for the number of warehouses
      #warehouses #processors
ECperf
  – Relaxed response time requirements
JVM options
  – Heap size 1424 MB
  – ISM (Intimate Shared Memory)
  – New generation 400 MB

Performance Scaling
[Figure: speedup (0 to 15) vs. number of processors (0 to 16). Series: Linear, SPECjbb, ECperf.]
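
A minimal sketch of how a speedup curve like the one on this slide can be derived from throughput measurements, assuming speedup is defined as throughput on n processors divided by throughput on 1 processor. The function names and the sample throughput values are hypothetical and are not taken from the talk.

```python
def speedup(throughput_by_procs):
    """Map each processor count n to throughput(n) / throughput(1)."""
    base = throughput_by_procs[1]
    return {n: t / base for n, t in sorted(throughput_by_procs.items())}


def parallel_efficiency(throughput_by_procs):
    """Speedup divided by processor count; 1.0 corresponds to the Linear line."""
    return {n: s / n for n, s in speedup(throughput_by_procs).items()}


# Hypothetical throughput values (operations per second), for illustration only.
example = {1: 100.0, 2: 190.0, 4: 360.0, 8: 640.0}

if __name__ == "__main__":
    for n, s in speedup(example).items():
        print(f"{n:2d} processors: speedup {s:.2f}")
```

Plotting `speedup(example)` against the identity line reproduces the layout of the chart: the further a series falls below Linear, the lower its parallel efficiency.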

Data Sharing
[Figure: cache-to-cache transfer ratio (%, 0 to 80) vs. number of processors (0 to 16). Series: ECperf, SPECjbb.]
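
The slide does not spell out how the cache-to-cache transfer ratio is computed. A minimal sketch, assuming it is the percentage of L2 cache misses that are satisfied by another processor's cache rather than by main memory; the function name and counter names are hypothetical.

```python
def c2c_transfer_ratio(c2c_transfers, l2_misses):
    """Percentage of L2 misses served by a cache-to-cache transfer.

    Assumes both arguments are event counts gathered over the same
    measurement interval. Returns 0.0 when no misses were recorded.
    """
    if l2_misses == 0:
        return 0.0
    return 100.0 * c2c_transfers / l2_misses


if __name__ == "__main__":
    # Hypothetical counter values, for illustration only.
    print(c2c_transfer_ratio(c2c_transfers=30, l2_misses=60))
```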

Memory Use vs. Scale Factor (8 processors)
[Figure: memory use (MB, 0 to 600) vs. scale factor (0 to 40). Series: ECperf, SPECjbb.]

Scaling Effects
Scaling system size
  – Increased system size from 1 to 15 processors
  – High idle times for both benchmarks on large systems
  – Contention inside the application or JVM
  – High fraction of sharing misses on large systems
  – Very few misses to main memory despite large heap
  – CPI (ECperf 2.0-2.8, SPECjbb2000 1.8-2.3)
Benchmark scaling
  – Increased transaction input rate and database size
      ECperf: orders input rate
      SPECjbb2000: warehouses
  – Affects SPECjbb2000 more than ECperf

Cache Simulations
Experiments conducted with Virtutech Simics with an extended memory system simulator
  – 4-way set associative caches
  – 64 byte cache lines
Cache miss rates
  – Uniprocessor simulations
  – Split 1-level caches
Sharing analysis
  – 8-processor simulations
  – Unified cache
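
To make the cache parameters above concrete, here is a minimal sketch of a 4-way set-associative cache with 64-byte lines that reports misses per 1000 instructions. It is not the Simics memory system simulator used in the talk: the class name, the LRU replacement policy, and the address trace are all assumptions made for illustration.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy set-associative cache model. LRU replacement is an assumption."""

    def __init__(self, size_bytes, ways=4, line_bytes=64):
        self.ways = ways
        self.line_bytes = line_bytes
        self.num_sets = size_bytes // (ways * line_bytes)
        if self.num_sets <= 0:
            raise ValueError("cache too small for the given geometry")
        # One ordered dict per set; insertion order tracks recency of use.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.accesses = 0
        self.misses = 0

    def access(self, addr):
        """Look up a byte address. Returns True on a hit, False on a miss."""
        line = addr // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        cache_set = self.sets[index]
        self.accesses += 1
        if tag in cache_set:
            cache_set.move_to_end(tag)      # mark as most recently used
            return True
        self.misses += 1
        if len(cache_set) >= self.ways:
            cache_set.popitem(last=False)   # evict the least recently used line
        cache_set[tag] = True
        return False


def misses_per_1000_instructions(misses, instructions):
    """Normalise a miss count to the metric used on the miss-rate charts."""
    return 1000.0 * misses / instructions


if __name__ == "__main__":
    cache = SetAssociativeCache(size_bytes=32 * 1024)
    # Hypothetical trace: one data access per 64-byte line, swept twice.
    trace = [i * 64 for i in range(1024)] * 2
    for addr in trace:
        cache.access(addr)
    # Assume, purely for illustration, 4 instructions per data access.
    print(misses_per_1000_instructions(cache.misses, 4 * len(trace)))
```

Sweeping `size_bytes` over the range shown on the miss-rate charts (32 KB to 16384 KB) with a real trace would produce one point per cache size.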

Data Cache
[Figure: misses per 1000 instructions (0 to 20) vs. cache size (32 KB to 16384 KB). Series: ECperf, SPECjbb-25, SPECjbb-10, SPECjbb-1.]

Instruction Cache
[Figure: misses per 1000 instructions (0 to 20) vs. cache size (32 KB to 16384 KB). Series: ECperf, SPECjbb-25, SPECjbb-10, SPECjbb-1.]

Communication Distribution
[Figure: percent of cache-to-cache transfers (%) vs. percent of all cache lines (%). Series: ECperf, SPECjbb-25. Annotated points: (12.3%, 100%) and (20%, 88.5%).]
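
A point on a communication distribution curve such as (20%, 88.5%) says that the most heavily transferred 20% of cache lines account for 88.5% of all cache-to-cache transfers. A minimal sketch of how such a point could be computed from per-line transfer counts, assuming lines are ranked from most to least transferred; the function name and sample counts are hypothetical.

```python
def cumulative_transfer_share(transfers_per_line, percent_of_lines):
    """Percent of all transfers caused by the top `percent_of_lines` of lines.

    `transfers_per_line` holds one cache-to-cache transfer count per cache
    line touched, including lines that were never transferred (count 0).
    """
    counts = sorted(transfers_per_line, reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    top_n = int(len(counts) * percent_of_lines / 100.0)
    return 100.0 * sum(counts[:top_n]) / total


if __name__ == "__main__":
    # Hypothetical counts for 10 cache lines, for illustration only.
    counts = [50, 30, 10, 10, 0, 0, 0, 0, 0, 0]
    for pct in (20, 40, 100):
        print(pct, cumulative_transfer_share(counts, pct))
```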

Shared Caches
Potentially a good fit for Java-based middleware
  – High cache-to-cache transfer ratio
  – Small working sets
  – Low memory bandwidth
Important design point for CMPs
Experiment: measured data miss rate for a simulated 8-processor system running each benchmark
  – All caches are 1 MB
  – Varied number of caches and degree of sharing
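
The experiment setup above implies a small configuration space. A minimal sketch that enumerates it, assuming the sharing degree doubles from 1 to 8 processors per cache and that every cache stays at 1 MB, so the total cache capacity shrinks as sharing increases. The function name and tuple layout are hypothetical.

```python
def shared_cache_configs(num_processors=8, cache_mb=1):
    """List (processors_per_cache, number_of_caches, total_capacity_mb).

    Assumes the sharing degree doubles at each step and each cache has a
    fixed size, as in the experiment description.
    """
    configs = []
    degree = 1
    while degree <= num_processors:
        num_caches = num_processors // degree
        configs.append((degree, num_caches, num_caches * cache_mb))
        degree *= 2
    return configs


if __name__ == "__main__":
    for degree, caches, total_mb in shared_cache_configs():
        print(f"{degree} processors/cache: {caches} caches, {total_mb} MB total")
```

Under these assumptions, a fully shared configuration has one eighth of the total capacity of the fully private one, which is worth keeping in mind when comparing miss rates across sharing degrees.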

Data Miss Rate vs. Sharing Degree
[Figure: misses per 1000 instructions (0 to 25) vs. processors per cache (1, 2, 4, 8). Series: ECperf, SPECjbb-25.]

Paper Summary
Descriptions of ECperf and SPECjbb2000
Combination of hardware monitoring and full-system simulation
  – Scalability
  – Execution time breakdown
  – I/D cache performance
  – Input rate scaling
Effects of garbage collection
Data sharing analysis and shared-cache performance
Conclusion
  – Benchmark differences can lead to opposite design conclusions

Conclusions
Both SPECjbb2000 and ECperf
  – Have small data sets
  – Have a high rate of sharing misses
SPECjbb2000 approximates ECperf well except for 2 important differences
  – ECperf has a much larger instruction footprint and a higher instruction miss rate
  – The memory footprint of SPECjbb2000 is larger than that of ECperf, especially on large systems