# What is this?

Side project to build a CI testbench for working on various Ansible roles that fit with my employer's default HPC deployment.

# Drivers

* There is no true representation of the stack suitable for CI
* The existing testbench is slow to re-provision and very manual
* There isn't enough hardware to test multiple stacks
* Corporate hypervisors are unsuitable (a Python IPMI listener -> Proxmox/VMware API might be OK, but many ancillary virtual networks would be required, and these may change from stack to stack)
* Updating Ansible in vi on customer systems is tedious

# Goal

The aim is to simulate bare-metal node provisioning using xCAT -> iPXE/IPMI -> VirtualBMC -> QEMU VMs, and to continue developing the Ansible roles that configure the various classes of node.

# Components

Use commodity hardware to act as hypervisors and model the storage and network components.

* tested on 2 and 3 nodes, single NVMe, single NIC

Generate static or dynamic Ansible inventory natively via the xCAT API.

* working model
* requires networks to be pulled from xCAT

Use a dynamic role model triggered by xCAT group membership.

* existing working model
* all Ansible variables imported under a top-level object, ready for keypairDB integration
* various helper roles to deep merge dictionaries and lists for individual site/deployment customisations

Use point-to-point VXLAN between each pair of hypervisors to simulate the various cluster networks.

* working model that will scale to many hypervisors

Use hyperconverged Ceph to provide RBD for VM disk images, and CephFS + Ganesha for NFS mounts hosting scheduler/HPC software.

* latest Ceph is now nearly all YAML spec driven, which allows automation; most existing Ansible is behind
* cluster build automation complete
* OSD + pools complete
* RBD complete
* NFS outstanding

Deploy xCAT container, seed with inventory of to-be-provisioned VMs.

* to complete

Deploy VirtualBMC.

* working model

Deploy QEMU with RBD disk.

* to complete
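
The helper roles that deep merge dictionaries and lists for site/deployment customisations are not shown here, so the following is only a minimal Python sketch of the idea and not the implementation those roles use. The function name `deep_merge`, the variable names, and the merge policy (dictionaries merged recursively, lists appended with duplicates skipped, scalars overridden) are all assumptions made for illustration.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a new structure with `override` layered on top of `base`.

    Assumed policy (hypothetical, for illustration only):
    - dict + dict: merge key by key, recursing into shared keys
    - list + list: keep base items, append override items not already present
    - anything else: the override value wins
    """
    if isinstance(base, dict) and isinstance(override, dict):
        merged = deepcopy(base)
        for key, value in override.items():
            if key in merged:
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = deepcopy(value)
        return merged
    if isinstance(base, list) and isinstance(override, list):
        merged = deepcopy(base)
        for item in override:
            if item not in merged:
                merged.append(deepcopy(item))
        return merged
    return deepcopy(override)


# Hypothetical default variables and a site-specific customisation.
defaults = {
    "ntp": {"servers": ["10.0.0.1"], "enabled": True},
    "packages": ["chrony"],
}
site = {
    "ntp": {"servers": ["10.0.0.2"]},
    "packages": ["chrony", "slurm"],
}

merged = deep_merge(defaults, site)
print(merged)
```

Neither input is modified, so the same defaults can be reused across several sites. Within Ansible itself, the built-in `combine` filter with `recursive=True` and a `list_merge` strategy gives comparable behaviour; whether the roles here rely on it is not stated.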