Grid Infrastructure Monitoring System Based on Nagios
E. Imamagic, D. Dobrenic
SRCE
HPDC 2007, Workshop on Grid Monitoring

HPDC 2007 / Grid Infrastructure Monitoring System Based on Nagios

Overview
  - Motivation
  - Nagios framework
  - Nagios-based grid monitoring
  - Architecture
  - Grid extensions
  - Statistics
  - Demo
  - Contributions to WLCG Grid Service Monitoring WG
  - Future work
  - Conclusions

Motivation
  - Provide site admin-centric monitoring
  - Enable better resource availability
      - simplify grid resources operations
      - issue notifications as soon as a problem appears
  - Achieve complex sensor dependencies
      - enables problem isolation
      - only relevant notifications are issued
  - Visualization & management interface
      - grid resources status
  - Report generation
      - availability, problem history

Nagios Framework
  - Open source monitoring framework
  - Host and service problem detection and recovery
  - Provides wide set of basic sensors
      - easy to develop custom sensors
  - Centralized vs. distributed deployment
  - High configurability
      - widely used & actively developed
      - service dependencies, fine-grained notification options
  - Web interface
      - status view, administration

Nagios-based Grid Monitoring
  - Monitoring CRO-GRID Infrastructure (2004-2006)
      - Globus Toolkit Pre-WS & WS, UNICORE, other services
      - active recovery of services
      - still in production within CRO NGI
  - Monitoring EGEE resources in Central Europe (CE)
      - core services since mid 2006
      - all CE sites for 1st line support since September 2006
      - centralized deployment: single server @ SRCE
      - http://nagios.ce-egee.org

Architecture
  [Architecture diagram. Labels: SAM gatherer; Nagios web interface; VOMS proxy certificate; Credential refresh; Sensors descriptions; Gather nodes information. Monitored nodes: SE, CE, Site BDII, LFC, WMS, BDII, MON.]

Grid Extensions
  - Grid sensors
      - Security facilities & services: CA distribution, Certificate lifetime, MyProxy, VOMS, VOMS Admin
      - Monitoring & information services: R-GMA, BDII, MDS, GridICE
      - Job management services: Globus Gatekeeper, RB, WMS, WMProxy, Job matching
      - File management services: GridFTP, SRM, DPNS, LFC

Grid Extensions (continued)
  - Sensor hierarchy
  - Automatic recovery
      - both local and remote services
      - security handled with sudo
  - Certificate based authentication for the web interface
  - NCG, SAM gatherer, Credential mgmt.
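
The "Certificate lifetime" sensor listed under Grid Extensions, like any Nagios sensor, reports its result through the plugin exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus one line of status text. The following is a minimal Python sketch of that decision logic only. The function name `check_cert_lifetime` and the 14-day and 3-day thresholds are hypothetical, chosen for illustration; this is not the sensor used in the deployment described here, and reading the expiry time from an actual certificate is left out.

```python
from datetime import datetime, timedelta, timezone

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_cert_lifetime(not_after, now, warn_days=14, crit_days=3):
    """Return (exit_code, status_line) for a certificate expiring at not_after.

    warn_days and crit_days are illustrative defaults, not values from the talk.
    """
    if not_after is None:
        # The expiry time could not be determined at all.
        return UNKNOWN, "CERT UNKNOWN - expiry time not available"
    remaining = not_after - now
    if remaining <= timedelta(0):
        return CRITICAL, "CERT CRITICAL - certificate expired"
    days = remaining.days
    if remaining < timedelta(days=crit_days):
        return CRITICAL, f"CERT CRITICAL - expires in {days} day(s)"
    if remaining < timedelta(days=warn_days):
        return WARNING, f"CERT WARNING - expires in {days} day(s)"
    return OK, f"CERT OK - expires in {days} day(s)"
```

A wrapper script would print the status line and exit with the returned code. Because the contract is only an exit code and a line of text, the same sensor could be written in any language.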

Statistics
  - EGEE implementation statistics
      - 69 hosts
      - 570 services actively monitored
      - 1029 services results imported from SAM
  - Nagios server statistics (last month)

Demo
  - EGEE implementation web interface

Contributions to WLCG Grid Service Monitoring WG
  - All sensors rewritten to be compliant with Probe specification
  - Developed interface to Nagios data compliant with Data exchange format
  - Nagios-based prototype
      - several grid extensions used (NCG, credential management, SAM gatherer)

Future Work
  - Utilizing our extensions on site level
  - Distributing monitoring deployment
      - hierarchy of Nagios servers
  - Migration of credential management to robot certificates
  - Further sensor development
  - Service check execution optimization
      - active vs. passive checks

Conclusions
  - Nagios
      - highly configurable monitoring framework with notifications, service dependencies, …
      - simple, programming language-agnostic sensor API
  - Grid extensions
      - integration with existing infrastructure (user certificates, VOMS, GOCDB, SAM)
      - sensors for key grid services
  - Nagios @ grid
      - enables sites' better availability
      - admins get only relevant notifications

Thank You! Questions?
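
The active vs. passive distinction raised under Future Work can be made concrete. An active check is a sensor that Nagios itself schedules and runs, while a passive result is produced elsewhere and handed to Nagios, for example through the `PROCESS_SERVICE_CHECK_RESULT` external command. Below is a minimal Python sketch that formats such a command line. The function name is hypothetical, and the talk does not say how the SAM gatherer feeds imported results into Nagios, so treating them as passive results submitted this way is an assumption made for illustration.

```python
import time


def passive_result_line(host, service, return_code, output, timestamp=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN)")
    for field in (host, service):
        # ';' separates the command fields, so it cannot appear in these names.
        if ";" in field or "\n" in field:
            raise ValueError("host and service must not contain ';' or newlines")
    if timestamp is None:
        timestamp = int(time.time())
    # A newline would terminate the command, so flatten the output to one line.
    one_line = " ".join(output.splitlines())
    return f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{return_code};{one_line}\n"
```

Such a line would be appended to the Nagios external command file, whose location is set by the `command_file` directive in the main configuration. The host and service names must match objects already defined in Nagios.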