Fuzzy Logic Approach for Capacity Projection
Rahul Vishwakarma

Agenda
– Data Domain Backup Storage
– Capacity Projection Feature
– Why HOFTS
– Applying HOFTS
– Forecast Accuracy and Comparison
– Demo
– Next Steps

Dell – Internal Use – Confidential

Data Domain Backup Storage
– Market leader in PBBA segment
– Data Domain Management Center
– Segments: Small Enterprise / ROBO, Midsize Enterprise, Large Enterprise
– Models: Data Domain Virtual Edition, Data Domain DD3300, Data Domain DD6300, Data Domain DD6800, Data Domain DD9300, Data Domain DD9800
– Feature labels shown across the six models:
    – Supports DD Cloud DR (6 models)
    – Supports DD Cloud Tier (5 models)
    – Supports Instant Access (6 models)
    – Supports HA Configurations (3 models)

Capacity Projection Feature

Why Higher Order Fuzzy Time Series (HOFTS)

Explored Time Series models
– Neural Network: ANN
– Regressive Models: AR, MA, ARIMA, kNN
– Fuzzy Logic Approach: FTS variants and HOFTS

Scale-out backup systems need algorithms with
– Better accuracy
– Lower time complexity
– Lower computational cost
– Scalability and simplicity
– Support for streaming time series data

                Segmented Regression    HOFTS
Accuracy        Low                     High
Data Trend      Linear                  Non-Linear
Data points     15                      5
Complexity      Medium                  Low

Applying HOFTS
– Data Collection: Autosupport from customers
– Time Series parameters
    – Time: 365 days
    – Capacity: post compression
– Performance Analysis: Computational Complexity; FLR (Fuzzy Logical Relationship); Forecasting; Time Complexity

Forecast Accuracy & Comparison

Model         RMSE    MAPE (%)
FTS           0.85    1.4
FTS Diff      0.47    0.86
WFTS          0.95    1.84
WFTS Diff     0.99    1.86
IWFTS         0.88    1.61
IWFTS Diff    0.34    0.74
EWFS          1.5     2.97
EWFS Diff     0.81    1.71
HOFTS         0.28    0.62
HOFTS Diff    1.47    2.62
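
The two error measures in the comparison above can be computed as in the following minimal sketch. The slides do not show how the figures were produced, so the function names and the sample values here are illustrative assumptions, not the original evaluation code.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error, in the same units as the series."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Undefined when an actual value is zero, so that case is rejected.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero actual values")
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical capacity readings and forecasts (not data from the slides).
actual = [100.0, 110.0, 120.0, 130.0]
forecast = [98.0, 112.0, 119.0, 133.0]

print("RMSE:", round(rmse(actual, forecast), 4))
print("MAPE (%):", round(mape(actual, forecast), 4))
```

RMSE penalizes large misses more heavily, while MAPE is scale-free, which is why reporting both makes models easier to compare across systems of different capacity.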

Demo

Work in Progress
– Parameter selection: study correlation between post-compression and deduplication ratio
– n-step ahead forecasting: evaluating Direct-Recursive Hybrid Multi-step Forecast Strategy
– Multivariate model: M-factor fuzzy time series; hybridized models (neuro-fuzzy approach)
– Distributed execution: pyFTS and dispy libraries to validate HOFTS over install base
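
The direct-recursive hybrid strategy named above trains one model per forecast step and feeds each earlier forecast into the next model as an extra input. The sketch below illustrates that idea only. The base learner (a 1-nearest-neighbour lookup, chosen to keep the example dependency-free), the function names, and the sample series are all assumptions; the slides do not say which learner is being evaluated.

```python
def nearest_neighbor_model(windows, targets):
    """Hypothetical base learner: return the target of the closest training window."""
    def predict(window):
        best = min(
            range(len(windows)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(windows[i], window)),
        )
        return targets[best]
    return predict


def fit_dirrec(series, lags, horizon):
    """Fit one model per step h = 1..horizon.

    Model h takes lags + h - 1 inputs: the original lag window plus the
    values of the h - 1 earlier steps (actual values during training).
    """
    if len(series) < lags + horizon:
        raise ValueError("series too short for the requested lags and horizon")
    models = []
    for h in range(1, horizon + 1):
        size = lags + h - 1
        windows = [series[t - size:t] for t in range(size, len(series))]
        targets = [series[t] for t in range(size, len(series))]
        models.append(nearest_neighbor_model(windows, targets))
    return models


def forecast_dirrec(series, models, lags):
    """Forecast len(models) steps ahead, growing the input window each step."""
    window = list(series[-lags:])
    forecasts = []
    for model in models:
        value = model(window)
        forecasts.append(value)
        window.append(value)  # earlier forecasts become inputs to the next model
    return forecasts


# Hypothetical repeating series (not data from the slides).
history = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
models = fit_dirrec(history, lags=2, horizon=3)
print(forecast_dirrec(history, models, lags=2))
```

Compared with a purely recursive strategy, each step has its own model, so an error in step 1 is not simply re-applied by the same model at step 2; the cost is training one model per horizon step.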

References
– Ajoy K. Palit and Dobrivoje Popovic. 2005. Computational Intelligence in Time Series Forecasting: Theory and Engineering Applications (Advances in Industrial Control). Springer-Verlag, Berlin, Heidelberg.
– Mu-Yen Chen. 2014. A high-order fuzzy time series forecasting model for internet stock trading. Future Generation Computer Systems, vol. 37, pp. 461-467.
– Jeng-Ren Hwang, Shyi-Ming Chen, and Chia-Hoang Lee. 1998. Handling forecasting problems using fuzzy time series. Fuzzy Sets and Systems, vol. 100, no. 1, pp. 217-228.
– Mark Chamness. 2011. Capacity forecasting in a backup storage environment. In Proceedings of the 25th International Conference on Large Installation System Administration (LISA'11). USENIX Association, Berkeley, CA, USA, 12-12.
– Pritpal Singh. 2016. Applications of Soft Computing in Time Series Forecasting (Simulation and Modeling Techniques). Springer-Verlag, Berlin, Heidelberg.
– Qiang Song and Brad S. Chissom. 1993. Fuzzy time series and its models. Fuzzy Sets and Systems, vol. 54, no. 3, pp. 269-277.

Backup Slides

HOFTS Example
– Universe of Discourse (U): 90 Days
– Interval Partitioning (len 20)
– Linguistic Terms: A0 (Very Low), A1 (Low), ..., A9 (Medium), ..., A18 (High), A19 (Very High)
– 2nd Order FLRG:
    A10, A10 -> A11, A12
    A10, A11 -> A11, A12, A13, A5, A6, A7, A8
    A10, A12 -> A11, A12, A13, A5, A6, A7, A8
    ...
    A9, A8 -> A10, A11, A8, A9
    A9, A9 -> A10, A11, A12, A8, A9
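
The steps in this example (partition the universe of discourse, map values to linguistic terms, collect 2nd-order rule groups, forecast from the matching group) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation behind the slides: it uses crisp equal-width intervals instead of overlapping membership functions, forecasts with the plain mean of the right-hand-side midpoints, and runs on a synthetic 90-day series. All function names are hypothetical.

```python
import math
from collections import defaultdict


def make_partitions(series, n_sets=20, margin=0.1):
    """Split the padded value range into n_sets equal-width intervals."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series is constant; nothing to partition")
    pad = (hi - lo) * margin
    lo, hi = lo - pad, hi + pad
    width = (hi - lo) / n_sets
    midpoints = [lo + (i + 0.5) * width for i in range(n_sets)]
    return lo, width, midpoints


def fuzzify(value, lo, width, n_sets):
    """Return the index i of the term A_i whose interval contains value."""
    index = int((value - lo) / width)
    return max(0, min(n_sets - 1, index))


def build_flrgs(labels, order=2):
    """Group rules by left-hand side: (A_{t-order}, ..., A_{t-1}) -> {A_t}."""
    flrgs = defaultdict(set)
    for t in range(order, len(labels)):
        flrgs[tuple(labels[t - order:t])].add(labels[t])
    return flrgs


def forecast_next(recent, flrgs, lo, width, midpoints, order=2):
    """Average the midpoints of the matching rule group's right-hand side."""
    n_sets = len(midpoints)
    lhs = tuple(fuzzify(v, lo, width, n_sets) for v in recent[-order:])
    rhs = flrgs.get(lhs)
    if not rhs:
        # No rule seen for this pattern: fall back to the latest term's midpoint.
        return midpoints[lhs[-1]]
    return sum(midpoints[i] for i in rhs) / len(rhs)


# Synthetic 90-day capacity series (an assumption, not data from the slides).
capacity = [50 + 0.3 * day + 5 * math.sin(day / 5) for day in range(90)]

lo, width, midpoints = make_partitions(capacity, n_sets=20)
labels = [fuzzify(v, lo, width, len(midpoints)) for v in capacity]
flrgs = build_flrgs(labels, order=2)

for lhs, rhs in sorted(flrgs.items())[:3]:
    left = ", ".join(f"A{i}" for i in lhs)
    right = ", ".join(f"A{i}" for i in sorted(rhs))
    print(f"{left} -> {right}")
print("next-day forecast:", round(forecast_next(capacity, flrgs, lo, width, midpoints), 2))
```

Keying the rule groups on the last two terms instead of one is what makes the model "higher order": the same current term can lead to different forecasts depending on how the series arrived there.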

Linear regression and KNN

Segmented Regression

Time Complexity Comparison

                     Song-Chissom    Chen's method    Markov method    HOFTS
Time Complexity
Forecasting Error    3.2%            3.22%            2.6%             3.12%