Contact at mumbai.academics@gmail.com or 8097636691/9323040215

Monday, 28 May 2018

Assessing Invariant Mining Techniques for Cloud-based Utility Computing Systems

ABSTRACT:
Likely system invariants model properties that hold in the operating conditions of a computing system. Invariants may be mined offline from training datasets or inferred during execution. Scientific work has shown that invariant mining techniques support several activities, including capacity planning and the detection of failures, anomalies and violations of Service Level Agreements. However, their practical application by operations engineers is still a challenge. We aim to fill this gap through an empirical analysis of three major techniques for mining invariants in cloud-based utility computing systems: clustering, association rules, and decision lists. The experiments use independent datasets from real-world systems: a Google cluster, whose traces are publicly available, and a Software-as-a-Service (SaaS) platform used by various companies worldwide. We assess the techniques in two applications of invariants, namely execution characterization and anomaly detection, using the metrics of coverage, recall and precision. A sensitivity analysis is also performed. The experimental results allow us to infer practical usage implications, showing that relatively few invariants characterize the majority of operating conditions, that precision and recall may drop significantly when trying to achieve a large coverage, and that the techniques exhibit similar precision, though the supervised one achieves a higher recall. Finally, we propose a general heuristic for selecting likely invariants from a dataset.
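To make the notion of a likely invariant concrete, the following is a minimal sketch of offline mining from a training dataset. It deliberately uses a very simple technique (per-metric value ranges), which is not one of the three techniques assessed in the study. The metric names `cpu` and `mem`, the sample values, and the function names are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Execution = Dict[str, float]                # metric name -> observed value
Invariant = Dict[str, Tuple[float, float]]  # metric name -> (low, high) range


def mine_range_invariant(training: List[Execution]) -> Invariant:
    """Offline mining: record the [min, max] range seen for each metric."""
    metrics = training[0].keys()
    return {
        m: (min(e[m] for e in training), max(e[m] for e in training))
        for m in metrics
    }


def holds(invariant: Invariant, execution: Execution) -> bool:
    """The invariant holds if every metric falls inside its mined range."""
    return all(low <= execution[m] <= high
               for m, (low, high) in invariant.items())


# Hypothetical training data (values are made up for the example).
training = [
    {"cpu": 0.20, "mem": 0.30},
    {"cpu": 0.25, "mem": 0.35},
    {"cpu": 0.30, "mem": 0.40},
]
invariant = mine_range_invariant(training)

inside = holds(invariant, {"cpu": 0.22, "mem": 0.33})   # within both ranges
outside = holds(invariant, {"cpu": 0.90, "mem": 0.33})  # cpu out of range
print(invariant, inside, outside)
```

An execution for which `holds` returns False violates the mined invariant and would be a candidate anomaly. The invariant is only "likely" because it reflects the training data rather than a proven property of the system.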
EXISTING SYSTEM:
  • Di et al. used a K-means clustering algorithm to classify applications into an optimized number of sets based on task events and resource usage; they also found a correlation between task events and application types, with about 81.3% of fail events belonging to batch applications.
  • Chen et al. used the Google cluster dataset for the analysis and prediction of job failures; Guan and Fu identified anomalies through Principal Component Analysis of monitored system performance metrics.
  • Rosà et al. analyzed unsuccessful task and job executions and proposed prediction models based on neural networks. While these studies do not specifically address invariants, some of their results on workload characterization and failure identification are in line with the ones we present based on the three mining techniques.
DISADVANTAGES OF EXISTING SYSTEM:
  • Although scientific work has shown that invariant mining techniques support several activities, including capacity planning and the detection of failures, anomalies and violations of Service Level Agreements, their practical application by operations engineers is still a challenge.
PROPOSED SYSTEM:
  • We explore the use of the techniques for two typical applications of invariant-based analysis, namely execution characterization and anomaly detection. We assess them based on the widely used metrics of coverage, precision and recall. A sensitivity analysis is performed to carefully explore the invariants returned by each technique under different settings of the mining algorithms.
  • The key findings of the study are:
  • The considered techniques provide valuable support for characterizing executions and detecting anomalies in an automated way.
  • A relatively small number of invariants hold in a majority of system executions.
  • Invariants are very sensitive to coverage: small variations in coverage significantly impact recall and precision. For instance, the recall of association rules (Apriori algorithm) for the Google cluster drops from 0.54 to 0.33 when coverage increases from 68% to 77%; similarly, for the SaaS platform, precision drops from 0.35 to 0.01 when the coverage of clustering (DBSCAN algorithm) rises from 87% to 92%.
  • There seems to be a sort of threshold phenomenon: recall/precision are strongly bound to the coverage of the correct executions.
  • Precision is surprisingly similar across the techniques.
  • Despite having the best coverage, association rules are not well suited for anomaly detection; conversely, despite their smaller coverage, the invariants mined by the decision list achieve higher recall and precision for anomaly detection.
  • We propose a general heuristic for selecting a set of likely invariants from a dataset.
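The three metrics used in the assessment can be sketched as follows. This is a hedged illustration, not the study's evaluation code. It assumes that coverage is the fraction of correct executions satisfying at least one invariant, and that an execution is flagged as anomalous when it satisfies none of them. The invariants, metric names and labeled executions below are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Execution = Dict[str, float]
Invariant = Callable[[Execution], bool]  # True when the invariant holds


def evaluate(invariants: List[Invariant],
             labeled: List[Tuple[Execution, bool]]) -> Tuple[float, float, float]:
    """Return (coverage, precision, recall).

    `labeled` pairs each execution with its ground truth (True = anomalous).
    Assumption: an execution is flagged when no invariant holds for it.
    """
    def satisfied(e: Execution) -> bool:
        return any(inv(e) for inv in invariants)

    correct = [e for e, anomalous in labeled if not anomalous]
    covered = sum(1 for e in correct if satisfied(e))

    flagged = [anomalous for e, anomalous in labeled if not satisfied(e)]
    true_positives = sum(1 for anomalous in flagged if anomalous)
    actual_anomalies = sum(1 for _, anomalous in labeled if anomalous)

    coverage = covered / len(correct) if correct else 0.0
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / actual_anomalies if actual_anomalies else 0.0
    return coverage, precision, recall


# Hypothetical invariants and labeled executions (made-up values).
invariants = [lambda e: e["cpu"] <= 0.5, lambda e: e["mem"] <= 0.4]
labeled = [
    ({"cpu": 0.2, "mem": 0.3}, False),  # correct, covered
    ({"cpu": 0.3, "mem": 0.9}, False),  # correct, covered by the cpu invariant
    ({"cpu": 0.8, "mem": 0.8}, False),  # correct but flagged: false positive
    ({"cpu": 0.9, "mem": 0.9}, True),   # anomalous and flagged: true positive
    ({"cpu": 0.4, "mem": 0.2}, True),   # anomalous but not flagged: missed
    ({"cpu": 0.1, "mem": 0.1}, False),  # correct, covered
]
coverage, precision, recall = evaluate(invariants, labeled)
print(coverage, precision, recall)
```

Under these assumptions, loosening the invariants so that they cover more correct executions also lets more anomalous executions pass unflagged, which is one way to read the coverage sensitivity reported above.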
ADVANTAGES OF PROPOSED SYSTEM:
  • For the SaaS cloud platform in particular, the mined invariants made it possible to provide a valuable result to the service operation team of the IT company: they spotted true anomalies in a number of transactions across the seven months of operation data, anomalies that had been missed and had gone unnoticed.
  • No small, one-size-fits-all set of invariants can practically be mined to characterize all system executions: the coverage of the correct executions is roughly 80%-90% for both datasets.
  • As for recall, the decision list supervised technique outperforms the unsupervised clustering and association rules.
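For readers unfamiliar with the supervised technique, a decision list is an ordered sequence of if-then rules in which the first matching rule decides the outcome, with a default used when no rule matches. The sketch below only shows how such a list is applied. In the study the rules are learned from labeled data, whereas the rules, field names (`failed_tasks`, `cpu`) and thresholds here are hand-written assumptions.

```python
from typing import Callable, Dict, List, Tuple

Execution = Dict[str, float]
Rule = Tuple[Callable[[Execution], bool], str]  # (condition, label)


def classify(rules: List[Rule], default: str, execution: Execution) -> str:
    """Apply the rules in order; the first condition that matches wins."""
    for condition, label in rules:
        if condition(execution):
            return label
    return default


# Hypothetical, hand-written decision list (not learned from real data).
rules: List[Rule] = [
    (lambda e: e["failed_tasks"] > 0, "anomalous"),
    (lambda e: e["cpu"] <= 0.5, "correct"),
]
default_label = "anomalous"

a = classify(rules, default_label, {"failed_tasks": 0, "cpu": 0.3})
b = classify(rules, default_label, {"failed_tasks": 2, "cpu": 0.3})
c = classify(rules, default_label, {"failed_tasks": 0, "cpu": 0.9})
print(a, b, c)
```

Because rule order matters, the second execution is labeled anomalous even though it also satisfies the CPU condition.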
SYSTEM ARCHITECTURE:
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS: 
  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB
SOFTWARE REQUIREMENTS: 
  • Operating System : Windows 7
  • Coding Language : Java/J2EE
  • Tool : Eclipse Luna
  • Database : MySQL
Thanks and Regards,
Mumbai Academics | Airoli 
8097636691 (Gaurav Sir)[Project Manager]
7506234650 (Hema Yadav)[HR]
Row House No. 7, Opp. Datta Meghe College,
Sector 2, Airoli, Navi Mumbai, MH 400708
