VSM Introduction & Major Features
Wang, Yaguang
Ferber, Dan
March 2015

Agenda
• Overview
• Architecture & Key Components
• Major Features

Overview

VSM Overview
• VSM (Virtual Storage Manager) is an open source Ceph management tool developed by Intel and announced at the OpenStack Paris summit in November 2014. It is designed to make the day-to-day management of Ceph easier for storage administrators.
• Home page: https://01.org/virtual-storage-manager
• Code repository: https://github.com/01org/virtual-storage-manager
• Issue tracking: https://01.org/jira/browse/VSM
• Mailing list: http://vsm-discuss.33411.n7.nabble.com/

VSM Overview (continued)
• VSM is designed for bundled Ceph storage appliances; it creates Ceph clusters and manages and monitors them. So far it runs on CentOS 6.5 with Ceph Firefly and OpenStack Havana/Icehouse.
• Allows easier Ceph creation and management: creates the Ceph cluster, manages Ceph, monitors Ceph.
• Web-based UI: an administrator-friendly interface for management and monitoring.
• VSM API: interfaces for automation.

Typical VSM-Managed Cluster
• VSM Controller: a dedicated server or server instance. It is optionally connected to the OpenStack Nova controller via SSH, which allows VSM to "tell" OpenStack about Ceph storage pools.
• Server nodes: members of the VSM-managed Ceph cluster. A VSM agent runs on every server in the cluster; the servers host the monitor and OSD daemons, with SSDs available for journals.
• Client nodes: access the cluster through RADOS over the Ceph public subnet.
• Administration subnet (GbE): carries administrative communications between the VSM controller and agents, as well as administrative communications between Ceph daemons.
• Ceph public subnet (10GbE or InfiniBand): carries data traffic between clients and the Ceph cluster servers.
• Ceph cluster subnet (10GbE or InfiniBand): carries data traffic between Ceph storage nodes for replication and rebalancing.
• OpenStack admin network (optional): carries the OpenStack admin network configuration traffic.

Architecture & Key Components

VSM Architecture
• The web UI drives the VSM API through python-vsmclient.
• Behind the API, the scheduler and conductor coordinate the storage components, exchanging messages with the storage agents through RabbitMQ.

Key Components: Overview
• python-vsmclient: a client for the VSM API. It consists of a Python API (the vsmclient module) and a command-line script (vsm); each implements 100% of the VSM API.
• vsm: the major module for Ceph management.
• vsm-dashboard: the web-based management interface for VSM.
• vsm-deploy: the Ceph deployment toolkit provided by VSM.

Key Components: Relationships
• vsm-dashboard and third-party software reach the VSM component over REST, either directly or through vsmclient and its CLI; vsm-deploy handles deployment.
• Keystone authenticates requests to the REST API.
• VSM manages Ceph over SSH and reaches the OpenStack services (Nova, Cinder) over SSH.

Key Components: Interaction Flow
Services on the VSM controller node:
• Keystone: the authentication center, listening on ports 5000 and 35357.
• Apache/Django: serves the VSM dashboard, fronted by the VSM web client proxy.
• MySQL database: listening on port 3306, accessed via SQLAlchemy and the MariaDB client.
• RabbitMQ: the message broker, listening on port 5672.
• VSM API: exposes the API, listening on port 8778.
• VSM scheduler: schedules and dispatches tasks to the agents and the conductor.
• VSM conductor: manages database access.
A VSM agent runs on each Ceph storage/monitor node, issues Ceph command calls, and is reachable from the controller over an SSH channel. The controller node also connects to the OpenStack Nova controller.

Operator flow:
1. The operator's web browser connects to the dashboard over HTTPS with username/password.
2. The dashboard authenticates via REST against Keystone.
3. The dashboard calls vsmclient as a library.
4. vsmclient sends HTTPS REST requests to the VSM API.
5. The VSM API authenticates the request via HTTPS REST against Keystone.
6.–8. and 10. The API, scheduler, conductor, and agents exchange RPC messages over RabbitMQ.
9. The conductor accesses the database over a socket connection.
11. (Step 0 for a new node:) each agent authenticates and registers with the controller via HTTPS REST.

External party flow (a code sketch of this flow follows):
a. The external client authenticates over HTTPS with username/password and obtains a token.
b. Subsequent HTTPS requests carry that token; third-party software can also use customized authentication messages.
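The two-step external flow above can be shown as a short Python sketch. The Keystone v2.0 token request matches the standard OpenStack API of the Havana/Icehouse era, and the ports come from the slide; the host name and the VSM resource path ("/v1/<tenant>/clusters") are hypothetical placeholders for illustration, not documented VSM routes.

    # Minimal sketch of the token flow, assuming Keystone v2.0.
    # The VSM resource path "/v1/<tenant>/clusters" is hypothetical.
    import requests

    KEYSTONE = "https://controller:5000/v2.0/tokens"  # Keystone, port 5000
    VSM_API = "https://controller:8778"               # VSM API, port 8778

    # a. Authenticate with username/password and obtain a token.
    auth = {"auth": {"tenantName": "admin",
                     "passwordCredentials": {"username": "admin",
                                             "password": "secret"}}}
    resp = requests.post(KEYSTONE, json=auth, verify=False)
    resp.raise_for_status()
    access = resp.json()["access"]
    token = access["token"]["id"]
    tenant_id = access["token"]["tenant"]["id"]

    # b. Issue subsequent HTTPS requests carrying the token.
    clusters = requests.get("%s/v1/%s/clusters" % (VSM_API, tenant_id),
                            headers={"X-Auth-Token": token}, verify=False)
    print(clusters.status_code)

This token-per-request pattern is presumably what the vsmclient library wraps for its callers.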
Major Features

Cluster Management
- Managing monitors and servers
- Replacing failed disks/servers
- Managing storage pools (replicated, erasure-coded, or cache tiered)
- REST API & CLI

Cluster Monitoring
- Cluster health and OSD/PG/monitor status on the dashboard
- Capacity utilization for storage groups and pools

Configuration Management
- Creates and controls Ceph cluster configuration through pre-defined manifests
- Operator-selectable configuration options are defined and validated by the OEM or system integrator

Capacity Management
- Grouping storage devices by function or performance
- Monitoring capacity for storage groups and pools

Cooperation
- Pools for OpenStack

Openness
- Apache v2 license
- Written in Python
- Bundled stack (CentOS 6.5, Ceph Firefly 0.80.x, OpenStack Havana/Icehouse)

What it does…
• Web-based UI: an administrator-friendly interface for management and monitoring.
• Configuration management: Storage Groups aggregate similar drives; Zones aggregate drives within failure domains.
• Capacity management: pools segregate allocation by application (e.g., OpenStack); capacity utilization is tracked by storage group and pool.
• Cluster management: manage capacity growth; manage server & disk maintenance.
• Cluster monitoring: capacity and performance; OSD, PG, and monitor state.
• VSM APIs: a software interface that supports automation.
(Diagram: the operator reaches the VSM controller node from the Internet through the data center firewall over HTTP; the controller connects over SSH to the OpenStack Nova controller server(s) and, via a socket, to the VSM agent on each Ceph node hosting the OSD, monitor, and MDS daemons.)
In short: a management framework for consistent configuration, an operator-friendly interface for management & monitoring, and a software interface that supports automation.

What it is…
• VSM Controller software
  - Runs on a dedicated server (or server instance)
  - Connects to the Ceph cluster through the VSM agents
  - Connects to the OpenStack Nova controller (optional) via SSH
  - Never touches clients or client data
• VSM Agent software
  - Runs on every server in the Ceph cluster
  - Relays server configuration & status information to the VSM controller

Managing Servers and Disks
• Servers can host more than one type of drive; drives with similar performance characteristics are identified by a Storage Class.
• Drives with the same Storage Class are grouped together in Storage Groups; each Storage Group is paired with a specific Storage Class, for example:
  - "Capacity" = 7200_RPM_HDD
  - "Performance" = 10K_RPM_HDD
  - "High Performance" = 15K_RPM_HDD
• VSM monitors Storage Group capacity utilization and warns at a threshold.
• Storage Classes and Storage Groups are defined in the cluster manifest file; drives are identified by Storage Class in the server manifest file.
A small sketch of the class-to-group mapping follows.
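To make the mapping concrete, here is an illustrative Python sketch, using hypothetical drive paths, of how drives identified by Storage Class land in Storage Groups. It mirrors the concept described above, not VSM's internal code.

    # Illustrative only: hypothetical drives grouped by Storage Class into
    # Storage Groups, as the cluster and server manifests describe.
    from collections import defaultdict

    # Storage Group -> Storage Class, as a cluster manifest would define.
    STORAGE_GROUPS = {
        "capacity": "7200_rpm_hdd",
        "performance": "10k_rpm_hdd",
        "high_performance": "15k_rpm_hdd",
    }

    # Drives as a server manifest might classify them (hypothetical paths).
    drives = [
        ("/dev/disk/by-path/pci-0000:00:1f.2-ata-1", "7200_rpm_hdd"),
        ("/dev/disk/by-path/pci-0000:00:1f.2-ata-2", "10k_rpm_hdd"),
        ("/dev/disk/by-path/pci-0000:00:1f.2-ata-3", "7200_rpm_hdd"),
    ]

    by_class = defaultdict(list)
    for path, storage_class in drives:
        by_class[storage_class].append(path)

    for group, storage_class in STORAGE_GROUPS.items():
        print(group, "->", by_class.get(storage_class, []))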
Managing Failure Domains
• Servers can be grouped into failure domains; in VSM, failure domains are indicated by zones.
• Zones are placed under each Storage Group, and the drives in each zone are placed in their respective storage group.
• Example: six servers with 7200_RPM_HDD (Capacity) and 10K_RPM_HDD (Performance) drives are placed in three different zones (Zone 1, Zone 2, Zone 3). VSM creates three zones under each storage group and places the drives in their respective storage groups and zones.
• Zones are defined in the cluster manifest file; zone membership is defined in the server manifest file.
• A cluster may also run with one zone and server-level replication.

VSM Controller: Cluster Manifest
• The VSM controller runs on a dedicated server or server instance.
• The controller uses the cluster manifest file to define the cluster's meta information, like storage classes, storage groups, zones, and subnets, as well as cluster settings, like the capacity thresholds for warnings and the erasure-coding and cache tier settings.
• Meta information: the cluster manifest defines the storage classes, storage groups, and zones to be used in the Ceph cluster, and the subnets for management, public, and cluster traffic.
• Cluster settings: the cluster manifest provides settings that tune cluster behavior, such as capacity thresholds and heartbeat intervals; the defaults for erasure-coded pools and cache tiering are also covered.

VSM Controller: Cluster Manifest File
The cluster manifest file resides on the VSM controller server. It tells VSM how to organize storage devices, how the network is configured, and other management details: the storage classes, storage groups, cluster name, data disk file system, network configuration, zones, settings, erasure-coded pool profiles, and cache tier default parameters.

    [storage_class]
    7200_rpm_sata
    10krpm_sas
    ssd
    ssd_cached_7200rpm_sata
    ssd_cached_10krpm_sas

    [storage_group]
    #format: [storage group name] ["user friendly storage group name"] [storage class]
    high_performance "High_Performance_SSD" ssd
    capacity "Economy_Disk" 7200_rpm_sata
    performance "High_Performance_Disk" 10krpm_sas
    value_performance "High_Performance_Disk_with_ssd_cached_Acceleration" ssd_cached_10krpm_sas
    value_capacity "Capacity_Disk_with_ssd_cached_Acceleration" ssd_cached_7200rpm_sata

    [cluster]
    cluster_a

    [file_system]
    xfs

    [management_addr]
    192.168.123.0/24

    [ceph_public_addr]
    192.168.124.0/24

    [ceph_cluster_addr]
    192.168.125.0/24

    [zone]
    zone0
    zone1

    [settings]
    storage_group_near_full_threshold 65
    storage_group_full_threshold 85
    ceph_near_full_threshold 75
    ceph_full_threshold 90
    pg_count_factor 100
    heartbeat_interval 6
    osd_heartbeat_interval 7
    osd_heartbeat_grace 21

    [ec_profiles]
    #format: [profile-name] [path-to-plugin] [plugin-name] [pg_num value] [json format key/value]
    #the key/value strings should not have spaces
    default_profile /usr/lib64/ceph/erasure-code jerasure 3 {"k":2,"m":1,"technique":"reed_sol_van"}
    #profile_name2 /usr/lib64/ceph/erasure-code jerasure 6 {"k":4,"m":2,"technique":"reed_sol_van"}

    [cache_tier_defaults]
    ct_hit_set_count 1
    ct_hit_set_period_s 3600
    ct_target_max_mem_mb 1000000
    ct_target_dirty_ratio 0.4
    ct_target_full_ratio 0.8
    ct_target_max_objects 1000000
    ct_target_min_flush_age_m 10
    ct_target_min_evict_age_m 20

Note the pg_count_factor entry under [settings]; a sizing sketch follows.
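The slides do not show VSM's exact placement-group formula, so the sketch below applies the common Ceph guidance of the Firefly era — total PGs ≈ (OSD count × pg_count_factor) ÷ replica count, rounded up to a power of two. Treat the formula as an assumption about how pg_count_factor is consumed, not as VSM's documented behavior.

    # Sketch of PG sizing using pg_count_factor from the cluster manifest.
    # Assumption: the standard Ceph rule of thumb of the Firefly era;
    # the actual VSM formula may differ.

    def suggested_pg_num(num_osds, replica_count, pg_count_factor=100):
        raw = num_osds * pg_count_factor // replica_count
        pg_num = 1
        while pg_num < raw:          # round up to the next power of two
            pg_num *= 2
        return pg_num

    print(suggested_pg_num(num_osds=40, replica_count=3))  # -> 2048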
VSM Agent: Server Manifest
• The VSM agent runs on every server managed by VSM.
• The agent uses the server manifest file to identify and authenticate itself with the VSM controller and to describe the server's configuration.
• Discovery and authentication: to be added to a cluster, the server manifest file must contain the IP address of the VSM controller and a valid authentication key. The authentication key is generated by the VSM controller in advance and applied to the server manifest before the agent daemon is launched.
• Server configuration: VSM relies on the server manifest to identify and classify data devices and their associated journals; VSM has no knowledge of how the storage devices have been partitioned. Devices and partitions are specified "by path" to ensure that paths remain constant in the event of a device removal or failure.

VSM Agent: Server Manifest File
The server manifest file resides on each server that VSM manages. It defines how storage is configured on the server, identifies the other roles (Ceph daemons) the server should run, and authenticates the server to the VSM controller:

    [vsm_controller_ip]
    #10.239.82.168

    [role]
    storage
    monitor

    [auth_key]
    token-tenant

    [7200_rpm_sata]
    #format [sata_device] [journal_device]
    %osd-by-path-1% %journal-by-path-1%
    %osd-by-path-2% %journal-by-path-2%
    %osd-by-path-3% %journal-by-path-3%
    %osd-by-path-4% %journal-by-path-4%

    [10krpm_sas]
    #format [sas_device] [journal_device]
    %osd-by-path-5% %journal-by-path-5%
    %osd-by-path-6% %journal-by-path-6%
    %osd-by-path-7% %journal-by-path-7%
    %osd-by-path-8% %journal-by-path-8%

    [ssd]
    #format [ssd_device] [journal_device]

    [ssd_cached_7200rpm_sata]
    #format [intel_cache_device] [journal_device]

    [ssd_cached_10krpm_sas]
    #format [intel_cache_device] [journal_device]

Notes:
• [vsm_controller_ip] holds the address of the VSM controller.
• [role] includes "storage" if the server will host OSD daemons and "monitor" if it will host monitor daemons.
• [auth_key] is the authentication key provided by the authentication key tool on the VSM controller node.
• [7200_rpm_sata] and [10krpm_sas] each specify the paths to four drives and their associated journal drives/partitions.
• No drives are associated with the remaining storage classes on this server.
A sketch of how "by path" names resolve to device nodes follows.
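Why "by path" naming stays stable: entries under /dev/disk/by-path are symlinks keyed to the physical bus address, so they survive reboots and device renames and can be resolved to the current kernel device node. The sketch below is a generic Linux illustration, not VSM code, and the example path is hypothetical.

    # Generic Linux illustration (not VSM code): resolve a stable by-path
    # device name to its current kernel device node. The example path is
    # hypothetical; real entries live under /dev/disk/by-path/.
    import os

    by_path = "/dev/disk/by-path/pci-0000:00:1f.2-ata-1"  # hypothetical

    if os.path.exists(by_path):
        # realpath follows the symlink to the current node, e.g. /dev/sda;
        # the by-path name itself stays constant if the node name changes.
        print(by_path, "->", os.path.realpath(by_path))
    else:
        print("No such by-path device on this machine:", by_path)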
VSM Operation Map
• Getting started: log in, create cluster, navigation, dashboard overview.
• Managing capacity: storage group status, manage pools, creating storage pools, RBD status.
• Monitoring cluster health: dashboard overview, OSD status, monitor status, MDS status, PG status.
• Managing servers: manage servers, add & remove servers, add & remove monitors, stop & start servers.
• Managing storage devices: manage devices, restart OSDs, remove OSDs, restore OSDs.
• Working with OpenStack: OpenStack access, managing pools.
• Managing VSM: manage VSM users, manage VSM configuration.

Live Demo

Thank You!