CS5226 Hardware Tuning
(Diagram: DBMS layers and their users. The Application Programmer (e.g., business analyst, data architect) works at the Application level; the Sophisticated Application Programmer (e.g., SAP admin) works at the Query Processor level; the DBA/Tuner works with Indexes, the Storage Subsystem, Concurrency Control, and Recovery; below these sit the Operating System and the Hardware [Processor(s), Disk(s), Memory].)
Outline
Part 1: Tuning the storage subsystem
  RAID storage system
  Choosing a proper RAID level
Part 2: Enhancing the hardware configuration
Magnetic Disks
(Diagram: platter, tracks, spindle, read/write head, actuator, disk arm, controller, disk interface)
1956: IBM RAMAC, the first disk drive: 5 Mb, 0.002 Mb/in2, 9 Kb/sec, 35000/year
1980: Seagate, the first 5.25'' disk drive: 5 Mb, 1.96 Mb/in2, 625 Kb/sec
1999: IBM Microdrive, the first 1'' disk drive: 340 Mb, 6.1 MB/sec
Magnetic Disks: Access Time (2001)
Controller overhead: 0.2 ms
Seek time: 4 to 9 ms
Rotational delay: 2 to 6 ms
Read/write time: 10 to 500 KB/ms
Disk interface: IDE (16 bits, Ultra DMA, 25 MHz); SCSI: width (narrow 8 bits vs. wide 16 bits), frequency (Ultra3, 80 MHz)
http://www.pcguide.com/ref/hdd/
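The components above add up to the cost of one random access. A minimal back-of-the-envelope sketch (the numbers are illustrative midpoints of the 2001-era ranges on this slide, not measurements):

```python
def access_time_ms(controller=0.2, seek=6.5, rotation=4.0,
                   transfer_kb=8, transfer_rate_kb_per_ms=100):
    """Time (ms) for one random disk access:
    controller overhead + seek + rotational delay + transfer time."""
    return controller + seek + rotation + transfer_kb / transfer_rate_kb_per_ms

# An 8 KB random read costs about 10.78 ms here, i.e. roughly
# 90 random IOs/sec per disk: seek + rotation dominate, which is
# why sequential I/O and striping across disks matter so much.
print(round(access_time_ms(), 2))
```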
Storage Metrics
                     DRAM     Disk (15k RPM)   Tape robot
Unit capacity        2 GB     18 GB            14x70 GB
Unit price ($)       1600     467              20900
$/Gb                 800      26               21
Latency (sec)        1.E-8    2.E-3            3.E+1
Bandwidth (Mbps)     1000     40 (up to 160)   40 (up to 100)
Kaps                 1.E+6    470              3.E-2
Maps                 1.E+3    23               3.E-2
Scan time (sec/Tb)   2        450              24500
Hardware Bandwidth: System Bandwidth Yesterday (in megabytes per second, not to scale)
Processor
Memory: 422
PCI: 133
SCSI: 40
Hard disk: 15 per disk
The familiar bandwidth pyramid: the farther from the CPU, the less the bandwidth.
Slide courtesy of J. Gray/L. Chung
Hardware Bandwidth: System Bandwidth Today (in megabytes per second, not to scale)
Processor
Memory: 1,600
PCI: 133
SCSI: 160
Hard disk: 26 per disk
The familiar pyramid is gone! PCI is now the bottleneck: in practice, 3 disks at 26 MB/s each can reach saturation using sequential I/O.
Slide courtesy of J. Gray/L. Chung
RAID Storage System
Redundant Array of Inexpensive Disks
Combine multiple small, inexpensive disk drives into a group to yield performance exceeding that of one large, more expensive drive
Appear to the computer as a single virtual drive
Support fault tolerance by redundantly storing information in various ways
RAID 0: Striping
No redundancy, no fault tolerance
High I/O performance (parallel I/O)
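The striping layout can be sketched as a mapping from a logical block number to a (disk, offset) pair; a hypothetical round-robin illustration, not any real controller's layout logic:

```python
def raid0_locate(block, num_disks, stripe_unit_blocks=1):
    """Map a logical block to (disk index, block offset on that disk)
    under simple round-robin striping."""
    stripe = block // stripe_unit_blocks
    disk = stripe % num_disks
    offset = (stripe // num_disks) * stripe_unit_blocks + block % stripe_unit_blocks
    return disk, offset

# With 4 disks, consecutive stripe units land on different disks,
# so a sequential scan keeps all 4 spindles busy in parallel.
for b in range(8):
    print(b, raid0_locate(b, 4))
```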
RAID 1: Mirroring
Provides good fault tolerance: works even if one disk in a pair is down
One logical write = a physical write on each disk
One logical read = a read from either disk (typically the less busy one)
Could double the read rate
RAID 3: Parallel Array with Parity
Fast read/write
All disk arms are synchronized
Speed is limited by the slowest disk
Parity Check (Classical)
An extra bit added to a byte to detect errors in storage or transmission
Even (odd) parity means that the parity bit is set so that there is an even (odd) number of one bits in the word, including the parity bit
A single parity bit can only detect single-bit errors: if an even number of bits are wrong, the parity bit will not change
It is not possible to tell which bit is wrong
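The even-parity scheme above can be written down directly; a small sketch showing both its detection power and its blind spot:

```python
def even_parity_bit(bits):
    """Return the parity bit that makes the total number of 1s even."""
    return sum(bits) % 2

def has_error(bits_with_parity):
    """An odd count of 1s means an odd number of bits flipped.
    A single flipped bit is detected but cannot be located;
    two flipped bits cancel out and go undetected."""
    return sum(bits_with_parity) % 2 == 1

word = [1, 0, 1, 1, 0, 0, 1]
stored = word + [even_parity_bit(word)]
assert not has_error(stored)
corrupted = stored[:]; corrupted[2] ^= 1
assert has_error(corrupted)        # single-bit error: detected
double = corrupted[:]; double[4] ^= 1
assert not has_error(double)       # double-bit error: slips through
```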
RAID 5: Parity Checking
For error detection, rather than full redundancy
Each stripe has an extra parity stripe unit
Parity stripes are distributed across the disks
RAID 5 Read/Write
Read: parallel stripes read from multiple disks; good performance
Write: 2 reads + 2 writes
  Read the old data stripe and the old parity stripe (2 reads)
  XOR the old data stripe with the new data stripe; XOR the result into the parity stripe
  Write the new data stripe and the new parity stripe (2 writes)
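The 2-reads/2-writes small-write path above can be sketched with XOR over byte strings; the stripe contents below are hypothetical, and this is an illustration of the parity arithmetic, not a controller implementation:

```python
def xor(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def raid5_small_write(old_data, old_parity, new_data):
    """Read old data + old parity (the 2 reads), XOR the old data out of
    the parity and the new data in, then return what gets written back
    (the 2 writes: new data stripe and new parity stripe)."""
    new_parity = xor(xor(old_parity, old_data), new_data)
    return new_data, new_parity

# Parity over a 3-data-disk stripe: p = d0 ^ d1 ^ d2.
d0, d1, d2 = b"\x0f", b"\xf0", b"\x33"
p = xor(xor(d0, d1), d2)
# Update d1 without ever touching d0 or d2:
d1_new, p_new = raid5_small_write(d1, p, b"\xaa")
assert p_new == xor(xor(d0, d1_new), d2)  # parity still covers the stripe
```

The same XOR identity is what lets RAID 5 rebuild any single failed stripe unit from the surviving ones.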
RAID 10: Striped Mirroring
RAID 10 = striping + mirroring: a striped array of RAID 1 arrays
High performance of RAID 0 and high fault tolerance of RAID 1 (at the cost of doubling the disks)
More information about RAID disks at http://www.acnc.com/04 01 05.html
Hardware vs. Software RAID
Software RAID: runs on the server's CPU
  Directly dependent on server CPU performance and load
  Occupies host system memory and CPU cycles, degrading server performance
Hardware RAID: runs on the RAID controller's CPU
  Does not occupy any host system memory; not operating-system dependent
  The host CPU can execute applications while the array adapter's processor simultaneously executes array functions: true hardware multi-tasking
RAID Levels: Data
Settings:
  accounts(number, branchnum, balance);
  create clustered index c on accounts(number);
  100,000 rows; cold buffer
Dual Xeon (550 MHz, 512 Kb), 1 Gb RAM, internal RAID controller from Adaptec (80 Mb), 4x18 Gb drives (10000 RPM), Windows 2000
RAID Levels: Transactions
No concurrent transactions.
Read-intensive: select avg(balance) from accounts;
Write-intensive (e.g., a typical insert): insert into accounts values (690466,6840,2272.76);
Writes are uniformly distributed.
RAID Levels: Results
SQL Server 7 on Windows 2000 (SoftRAID means striping/parity at the host)
(Charts: throughput in tuples/sec for SoftRAID5, RAID5, RAID0, RAID10, RAID1, and a single disk; read-intensive y-axis up to 80,000, write-intensive y-axis up to 160)
Using multiple disks (RAID0, RAID10, RAID5) increases throughput significantly.
Without cache, RAID 5 suffers. With cache, it is ok.
Comparing RAID Levels
RAID 0: read high, write high, no fault tolerance, high disk utilization. Key problem: data lost when any disk fails. Key advantage: high I/O performance.
RAID 1: read 2X, write 1X, fault tolerant, low disk utilization. Key problem: uses double the disk space. Key advantage: very high I/O performance.
RAID 5: read high, write medium, fault tolerant, high disk utilization. Key problem: lower throughput with a disk failure. Key advantage: a good overall balance.
RAID 10: read high, write high, fault tolerant, low disk utilization. Key problem: very expensive, not scalable. Key advantage: high reliability with good performance.
Controller Prefetching? No. Write-back? Yes.
Read-ahead: prefetching at the disk controller level. The controller has no information on access patterns, so it is better to let the database management system do it.
Write-back vs. write-through:
  Write-back: the transfer terminates as soon as the data is written to the cache; batteries guarantee the write-back in case of power failure.
  Write-through: the transfer terminates as soon as the data is written to disk.
SCSI Controller Cache: Data
Settings:
  employees(ssnum, name, lat, long, hundreds1, hundreds2);
  create clustered index c on employees(hundreds2);
Employees table partitioned over two disks; log on a separate disk; same controller (same channel).
200,000 rows per table; database buffer size limited to 400 Mb.
Dual Xeon (550 MHz, 512 Kb), 1 Gb RAM, internal RAID controller from Adaptec (80 Mb), 4x18 Gb drives (10000 RPM), Windows 2000
SCSI (not disk) Controller Cache: Transactions
No concurrent transactions:
  update employees set lat = long, long = lat where hundreds2 = ?;
Cache friendly: update of 20,000 rows (about 90 Mb)
Cache unfriendly: update of 200,000 rows (about 900 Mb)
SCSI Controller Cache: Results
2 disks, cache size 80 Mb. SQL Server 7 on Windows 2000. Adaptec ServerRaid controller: 80 Mb RAM, write-back mode.
(Chart: update throughput in tuples/sec, up to 2000, with and without cache, for the cache-friendly (90 Mb) and cache-unfriendly (900 Mb) updates)
The controller cache increases throughput whether the operation is cache friendly or not: an efficient replacement policy!
Which RAID Level to Use?
Data and index files
  RAID 5 is best suited for read-intensive apps, or if the RAID controller cache is effective enough.
  RAID 10 is best suited for write-intensive apps.
Log file
  RAID 1 is appropriate: fault tolerance with high write throughput. Writes are synchronous and sequential, so there is no benefit in striping.
Temporary files
  RAID 0 is appropriate: no fault tolerance, high throughput.
What RAID Provides
Fault tolerance: it does not prevent disk drive failures, but it enables real-time data recovery
High I/O performance
Mass data capacity
Configuration flexibility
Lower protected-storage costs
Easy maintenance
Enhancing the Hardware Configuration
Add memory
  The cheapest option to get better performance
  Can be used to enlarge the DB buffer pool: better hit ratio
  If used to enlarge the OS buffer (as a disk cache), other apps benefit as well
Add disks
Add processors
Add Disks
A larger disk does not by itself mean better performance. Add disks when the bottleneck is disk bandwidth:
  A dedicated disk for the log
  Switch RAID 5 to RAID 10 for update-intensive apps
  Move secondary indexes to another disk for write-intensive apps
  Partition read-intensive tables across many disks
Consider intelligent disk systems: automatic replication and load balancing
Add Processors
Function parallelism: use different processors for different tasks
  GUI, query optimisation, TT&CC, different types of apps, different users
Operation pipelines, e.g., scan, sort, select, join
  Easy for read-only apps, hard for update apps
Data-partition parallelism: partition the data, and thus the operation on the data
Parallelism
Some tasks are easier to parallelize, e.g., the join phase of GRACE hash join; scan, join, sum, min
Some tasks are not so easy, e.g., sorting, avg, nested queries
Summary
We have covered:
  The storage subsystem and RAID: what is it, and which level to use?
  Memory, disks and processors: when to add what?
Database Tuning
Database tuning is the activity of making a database application run more quickly. "More quickly" usually means higher throughput, though it may mean lower response time for time-critical applications.
Tuning Principles
Think globally, fix locally
Partitioning breaks bottlenecks (temporal and spatial)
Start-up costs are high; running costs are low
Render unto the server what is due unto the server
Be prepared for trade-offs (indexes and inserts)
Tuning Mindset
1. Set reasonable performance tuning goals
2. Measure and document current performance
3. Identify the current system performance bottleneck
4. Identify the current OS bottleneck
5. Tune the required components, e.g., application, DB, I/O, contention, OS, etc.
6. Track and exercise change-control procedures
7. Measure and document current performance
8. Repeat steps 3 through 7 until the goal is met
Goals Met?
Appreciation of DBMS architecture
Study of the effect of various components on system performance
Tuning principles
Troubleshooting techniques for chasing down performance problems
Hands-on experience in tuning