openstack_rhosp16.2_nvidia_.../2) Undercloud Deployment.md

# Proxmox installation
Proxmox hosts the undercloud node. This enables snapshots to assist in update/DR/rebuild scenarios; primarily it allows a point-in-time capture of working heat templates and containers.
> https://pve.proxmox.com/wiki/Installation
| setting | value |
| --- | --- |
| filesystem | xfs |
| swapsize | 8GB |
| maxroot | 50GB |
| country | United Kingdom |
| time zone | Europe/London |
| keyboard layout | United Kingdom |
| password | Password0 |
| email | user@university.ac.uk (this can be changed in the web console @ datacenter/users/root) |
| management interface | eno1 |
| hostname | pve.local |
| ip address | 10.122.0.5/24 |
| gateway | 10.122.0.1 (placeholder, there is no gateway on this range) |
| dns | 144.173.6.71 |
- Install from a standard version 7.2 ISO using the settings listed above.
- Create a bridge on the 1G management interface; this is VLAN 1 native on the 'ctlplane' network with VLAN 2 tagged for IPMI traffic.
- Ensure the 25G interfaces are set up as an LACP bond, then create a bridge on the bond with the 'tenant', 'storage', 'internal-api' and 'external' VLANs tagged (the 'external' range has the default gateway).
- The Proxmox host has VLAN interfaces into each OpenStack network for introspection/debug, and nmap is installed.
```sh
cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!
auto lo
iface lo inet loopback
iface eno1 inet manual
iface eno2 inet manual
iface eno3 inet manual
iface eno4 inet manual
iface enx3a68dd4a4c5f inet manual
auto ens2f0np0
iface ens2f0np0 inet manual
auto ens2f1np1
iface ens2f1np1 inet manual
auto bond0
iface bond0 inet manual
bond-slaves ens2f0np0 ens2f1np1
bond-miimon 100
bond-mode 802.3ad
auto vmbr0
iface vmbr0 inet static
address 10.122.0.5/24
bridge-ports eno1
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
#vlan 1(native) 2 (tagged) ControlPlane
auto vmbr1
iface vmbr1 inet manual
bridge-ports bond0
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
#vlan 1(native) 11 12 13 1214 (tagged)
auto vlan2
iface vlan2 inet static
address 10.122.1.5/24
vlan-raw-device vmbr0
#IPMI
auto vlan13
iface vlan13 inet static
address 10.122.10.5/24
vlan-raw-device vmbr1
#Storage
auto vlan1214
iface vlan1214 inet static
address 10.121.4.5/24
gateway 10.121.4.1
vlan-raw-device vmbr1
#External
auto vlan12
iface vlan12 inet static
address 10.122.6.5/24
vlan-raw-device vmbr1
#InternalApi
auto vlan11
iface vlan11 inet static
address 10.122.8.5/24
vlan-raw-device vmbr1
#Tenant
```
Set up the no-subscription repository.
```sh
# comment/disable enterprise repo
nano -cw /etc/apt/sources.list.d/pve-enterprise.list
#deb https://enterprise.proxmox.com/debian/pve bullseye pve-enterprise
# insert pve-no-subscription repo
nano -cw /etc/apt/sources.list
deb http://ftp.uk.debian.org/debian bullseye main contrib
deb http://ftp.uk.debian.org/debian bullseye-updates main contrib
# security updates
deb http://security.debian.org bullseye-security main contrib
# pve-no-subscription
deb http://download.proxmox.com/debian/pve bullseye pve-no-subscription
# update
apt-get update
apt-get upgrade -y
reboot
```
Download some LXC container templates.
- LXC is not used in production, but during the build LXC containers with network interfaces in all ranges (last octet suffix .6) were used to debug IP connectivity and switch configuration, and to serve Linux boot images over NFS for XClarity.
```sh
pveam update
pveam available --section system
pveam download local almalinux-8-default_20210928_amd64.tar.xz
pveam download local rockylinux-8-default_20210929_amd64.tar.xz
pveam download local ubuntu-18.04-standard_18.04.1-1_amd64.tar.gz
pveam download local ubuntu-22.04-standard_22.04-1_amd64.tar.zst
pveam list local
NAME SIZE
local:vztmpl/almalinux-8-default_20210928_amd64.tar.xz 109.08MB
local:vztmpl/rockylinux-8-default_20210929_amd64.tar.xz 107.34MB
local:vztmpl/ubuntu-18.04-standard_18.04.1-1_amd64.tar.gz 203.54MB
local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst 123.81MB
```
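The debug containers described above can be created from the downloaded templates with `pct`. A minimal sketch, assuming VMID 200, the AlmaLinux template from the list above and `local-lvm` storage (all adjustable); the command is printed rather than executed so it can be reviewed first:

```shell
# Sketch: build a 'pct create' command for a debug LXC container with an
# interface in each OpenStack network (last octet .6, per the build notes).
# VMID, storage and template name are assumptions - adjust to suit.
build_debug_ct() {
  local vmid=200
  local template="local:vztmpl/almalinux-8-default_20210928_amd64.tar.xz"
  echo pct create "$vmid" "$template" \
    --hostname debug-ct --memory 1024 --storage local-lvm \
    --net0 name=eth0,bridge=vmbr0,ip=10.122.0.6/24 \
    --net1 name=eth1,bridge=vmbr1,tag=11,ip=10.122.8.6/24 \
    --net2 name=eth2,bridge=vmbr1,tag=12,ip=10.122.6.6/24 \
    --net3 name=eth3,bridge=vmbr1,tag=13,ip=10.122.10.6/24 \
    --net4 name=eth4,bridge=vmbr1,tag=1214,ip=10.121.4.6/24,gw=10.121.4.1
}
# Review the printed command, then run it on the Proxmox host.
build_debug_ct
```

Follow up on the Proxmox host with `pct start 200` and `pct enter 200` to get a shell in every range.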
# Undercloud VM instance
## Download RHEL 8.4 full DVD image
Select the RHEL 8.4 image and choose the full image rather than the boot image; this allows installation without registering the system in the installer. You can then attach the system to a subscription via the `subscription-manager` tool after the host is built.
## Install spec
- RHEL8 (RHEL 8.4 specifically)
- 1 socket, 16 core (must use HOST cpu type for nested virtualization)
- 24GB ram
- 100GB disk (/root 89GiB lvm, /boot 1024MiB, swap 10GiB lvm)
- ControlPlane network interface on vmbr0, no/native vlan, 10.122.0.25/24, ens18
- IPMI network interface on vmbr0, vlan2 (vlan assigned in proxmox not OS), 10.122.1.25/24, ens19
- External/Routable network interface on vmbr1, vlan 1214 (vlan assigned in proxmox not OS), 10.121.4.25/24, gateway 10.121.4.1, dns 144.173.6.71, 1.1.1.1, ens20
- ensure no network interface has the firewall enabled in Proxmox or the OS (MAC spoofing will be required and must be allowed in the firewall if one is used)
- root:Password0
- undercloud.local
- minimal install with QEMU guest agents
- will require registering with redhat subscription service
## OCF partner subscription entitlement
Register for a partner product entitlement.
> https://partnercenter.redhat.com/NFRPageLayout
> Product: Red Hat OpenStack Platform, Standard Support (4 Sockets, NFR, Partner Only) - 25.0 Units
Once the customer has purchased the entitlement, it should be present in their own Red Hat portal to consume on the production nodes.
## Register the undercloud node with the required software repositories
> [https://access.redhat.com/documentation/en-us/red\_hat\_openstack\_platform/16.2/html/director\_installation\_and\_usage/assembly_preparing-for-director-installation#enabling-repositories-for-the-undercloud](https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/director_installation_and_usage/assembly_preparing-for-director-installation#enabling-repositories-for-the-undercloud)
Browse to:
> https://access.redhat.com/management/systems/create
Create a new system with the following attributes.
- Virtual System
- Name: university_test
- Architecture: x86_64
- Number of vCPUs: 16
- Red Hat Enterprise Linux Version: 8
- Create
Attach the following initial subscription: 'Red Hat Enterprise Linux, Self-Support (128 Sockets, NFR, Partner Only)'
Note the name and UUID of the system.
Register the system.
```sh
sudo su -
[root@undercloud ~]# subscription-manager register --name=university_test --consumerid=f870ae18-6664-4206-9a89-21f24f312866 --username=tseed@ocf.co.uk
Registering to: subscription.rhsm.redhat.com:443/subscription
Password:
The system has been registered with ID: a1b24b8a-933b-4ce8-8244-1a7e16ff51a3
The registered system name is: university_test
#[root@undercloud ~]# subscription-manager refresh
[root@undercloud ~]# subscription-manager list
+-------------------------------------------+
Installed Product Status
+-------------------------------------------+
Product Name: Red Hat Enterprise Linux for x86_64
Product ID: 479
Version: 8.4
Arch: x86_64
Status: Subscribed
Status Details:
Starts: 06/13/2022
Ends: 06/13/2023
[root@undercloud ~]# subscription-manager identity
system identity: f870ae18-6664-4206-9a89-21f24f312866
name: university_test
org name: 4110881
org ID: 4110881
```
Add an entitlement to the license system.
```sh
# Check the entitlement/purchased-products portal
# you will find the SKU under a contract - this will help to identify the openstack entitlement if you have multiple
# find a suitable entitlement pool ID for Red Hat OpenStack Director Deployment Tools
subscription-manager list --available --all
subscription-manager list --available --all --matches="*OpenStack*"
Subscription Name: Red Hat OpenStack Platform, Standard Support (4 Sockets, NFR, Partner Only)
SKU: SER0505
Contract: 13256907
Pool ID: 8a82c68d812ba3c301815c6f842f5ecf
# attach to the entitlement pool ID
subscription-manager attach --pool=8a82c68d812ba3c301815c6f842f5ecf
Successfully attached a subscription for: Red Hat OpenStack Platform, Standard Support (4 Sockets, NFR, Partner Only)
1 local certificate has been deleted.
# set release version statically
subscription-manager release --set=8.4
```
Enable repositories, set version of container-tools, update packages.
```sh
subscription-manager repos --disable=* ;\
subscription-manager repos \
--enable=rhel-8-for-x86_64-baseos-eus-rpms \
--enable=rhel-8-for-x86_64-appstream-eus-rpms \
--enable=rhel-8-for-x86_64-highavailability-eus-rpms \
--enable=ansible-2.9-for-rhel-8-x86_64-rpms \
--enable=openstack-16.2-for-rhel-8-x86_64-rpms \
--enable=fast-datapath-for-rhel-8-x86_64-rpms ;\
dnf module disable -y container-tools:rhel8 ;\
dnf module enable -y container-tools:3.0 ;\
dnf update -y
reboot
```
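After enabling the repositories it is worth confirming that everything on the required list actually came back enabled. A small sketch (`check_repos` is a hypothetical helper); it compares the required repo IDs against whatever is reported on stdin, so it can be fed from `subscription-manager repos --list-enabled`:

```shell
# Sketch: verify the required repo IDs appear in 'subscription-manager repos
# --list-enabled' output read from stdin. Prints any missing repo IDs.
check_repos() {
  required="rhel-8-for-x86_64-baseos-eus-rpms
            rhel-8-for-x86_64-appstream-eus-rpms
            rhel-8-for-x86_64-highavailability-eus-rpms
            ansible-2.9-for-rhel-8-x86_64-rpms
            openstack-16.2-for-rhel-8-x86_64-rpms
            fast-datapath-for-rhel-8-x86_64-rpms"
  enabled=$(cat)
  missing=0
  for repo in $required; do
    if ! echo "$enabled" | grep -q "$repo"; then
      echo "MISSING: $repo"
      missing=1
    fi
  done
  [ "$missing" -eq 0 ] && echo "all required repos enabled"
}
# On the undercloud:
#   subscription-manager repos --list-enabled | check_repos
```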
## Install Tripleo client
```sh
# install tripleoclient for install of the undercloud
dnf install -y python3-tripleoclient
# these packages are advised for the TLS-everywhere functionality; probably not required for an external TLS endpoint but won't hurt
dnf install -y python3-ipalib python3-ipaclient krb5-devel python3-novajoin
```
Install the Ceph-Ansible packages. Even if you are not initially using Ceph it cannot hurt to have an undercloud capable of deploying it; to use external Ceph (i.e. Ceph not deployed by tripleo) you will need the following package.
There are different packages for different versions of Ceph; this is especially relevant when using external Ceph.
> https://access.redhat.com/solutions/2045583
- Red Hat Ceph 4.1 = Nautilus release
- Red Hat Ceph 5.1 = Pacific release
```sh
subscription-manager repos | grep -i ceph
# Nautilus (default version in use with Tripleo deployed Ceph)
#subscription-manager repos --enable=rhceph-4-tools-for-rhel-8-x86_64-rpms
# Pacific (if you are using external Ceph from the opensource repos you will likely be using this version)
#dnf remove -y ceph-ansible
#subscription-manager repos --disable=rhceph-4-tools-for-rhel-8-x86_64-rpms
subscription-manager repos --enable=rhceph-5-tools-for-rhel-8-x86_64-rpms
# install
dnf info ceph-ansible
dnf install -y ceph-ansible
```
# Configure and deploy the Tripleo undercloud
## Prepare host
Disable firewalld.
```sh
systemctl disable firewalld
systemctl stop firewalld
```
Create the stack user and sudoers entry, and push an SSH key; passwordless sudo is required by the tripleo installer.
```sh
groupadd -r -g 1001 stack && useradd -r -u 1001 -g 1001 -m -s /bin/bash stack
echo "%stack ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/stack
chmod 0440 /etc/sudoers.d/stack
passwd stack # password is Password0
exit
ssh-copy-id -i ~/.ssh/id_rsa.pub stack@university-new-undercloud
```
Local ssh config setup.
```sh
nano -cw ~/.ssh/config
Host undercloud
Hostname 10.121.4.25
User stack
IdentityFile ~/.ssh/id_rsa
```
Set the timezone and hostname (leave SELinux enabled; RHOSP tripleo requires it), configure chrony, and install packages.
```sh
ssh undercloud
sudo su -
timedatectl set-timezone Europe/London
dnf install chrony nano -y
# replace server/pool entries with a PHC (PTP hardware clock device) entry to use the hypervisor's hardware clock (which in turn is synced from the online NTP pool); this should be the most accurate time source for a VM
# the LXC container running ntp (192.168.101.43) does actually use the hypervisor hardware clock; the LXC container and VM should be on the same hypervisor if this is used
nano -cw /etc/chrony.conf
#server 192.168.101.43 iburst
#pool 2.centos.pool.ntp.org iburst
refclock PHC /dev/ptp0 poll 2
systemctl enable chronyd
echo ptp_kvm > /etc/modules-load.d/ptp_kvm.conf
# the undercloud installer should set the hostname based on the 'undercloud_hostname' entry in the undercloud.conf config file
# you can set it before deployment with the following, the Opensource tripleo documentation advises to allow the undercloud installer to set it
hostnamectl set-hostname undercloud.local
hostnamectl set-hostname --transient undercloud.local
# RHOSP hosts entry
nano -cw /etc/hosts
10.121.4.25 undercloud.local undercloud
reboot
hostname -A
hostname -s
# install some useful tools
sudo su -
dnf update -y
dnf install qemu-guest-agent nano tree lvm2 chrony telnet traceroute net-tools bind-utils python3 yum-utils mlocate ipmitool tmux wget -y
# need to shutdown for qemu-guest tools to function, ensure the VM profile on the hypervisor has guest agents enabled
shutdown -h now
```
## Build the undercloud config file
The first interface (enp6s18 on the proxmox VM instance) will be on the ControlPlane range.
- Controller nodes are on all networks but cannot install nmap; find hosts in a range with `for ip in 10.122.6.{1..254}; do ping -c 1 -W 1 $ip > /dev/null && echo "${ip} is up"; done`.
- Proxmox has interfaces in every network and nmap installed (`nmap -sn 10.122.6.0/24`) to assist with debugging.
| Node | IPMI VLAN2 | Ctrl_plane VLAN1 | External VLAN1214 | Internal_api VLAN12 | Storage VLAN13 | Tenant VLAN11 |
| --- | --- | --- | --- | --- | --- | --- |
| Proxmox | 10.122.1.54 (IPMI) (Proxmox interface 10.122.1.5) | 10.122.0.5 | 10.121.4.5 | 10.122.6.5 | 10.122.10.5 | 10.122.8.5 |
| Undercloud | 10.122.1.25 | 10.122.0.25-27 (br-ctlplane) | 10.121.4.25 (Undercloud VM) | NA | NA | NA |
| Temporary Storage Nodes | 10.122.1.55-57 | NA | 10.121.4.7-9 | NA | 10.122.10.7-9 | NA |
| Overcloud Controllers | 10.122.1.10-12 (Instance-HA 10.122.1.80-82) | 10.122.0.30-32 | 10.121.4.30-32 | 10.122.6.30-32 | 10.122.10.30-32 | 10.122.8.30-32 |
| Overcloud Networkers | 10.122.1.20-21 | 10.122.0.40-41 | NA (reserved 10.121.4.23-24) | 10.122.6.40-41 | NA | 10.122.8.40-41 |
| Overcloud Compute | 10.122.1.30-53/54,58-77 | 10.122.0.50-103 | NA | 10.122.6.50-103 | 10.122.10.50-103 | 10.122.8.50-103 |
```sh
sudo su - stack
nano -cw /home/stack/undercloud.conf
[DEFAULT]
certificate_generation_ca = local
clean_nodes = true
cleanup = true
container_cli = podman
container_images_file = containers-prepare-parameter.yaml
discovery_default_driver = ipmi
enable_ironic = true
enable_ironic_inspector = true
enable_nova = true
enabled_hardware_types = ipmi
generate_service_certificate = true
inspection_extras = true
inspection_interface = br-ctlplane
ipxe_enabled = true
ironic_default_network_interface = flat
ironic_enabled_network_interfaces = flat
local_interface = enp6s18
local_ip = 10.122.0.25/24
local_mtu = 1500
local_subnet = ctlplane-subnet
overcloud_domain_name = university.ac.uk
subnets = ctlplane-subnet
undercloud_admin_host = 10.122.0.27
undercloud_debug = true
undercloud_hostname = undercloud.local
undercloud_nameservers = 144.173.6.71,1.1.1.1
undercloud_ntp_servers = ntp.university.ac.uk,0.pool.ntp.org
undercloud_public_host = 10.122.0.26
[ctlplane-subnet]
cidr = 10.122.0.0/24
#dhcp_end = 10.122.0.140
#dhcp_start = 10.122.0.80
dhcp_end = 10.122.0.194
dhcp_start = 10.122.0.140
#dns_nameservers =
gateway = 10.122.0.25
#inspection_iprange = 10.122.0.141,10.122.0.201
inspection_iprange = 10.122.0.195,10.122.0.249
masquerade = true
```
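The DHCP and inspection ranges above must not overlap (neutron DHCP and the ironic-inspector dnsmasq both sit on br-ctlplane), and `undercloud_admin_host`/`undercloud_public_host` (10.122.0.26-27) must fall outside both. A quick sanity-check sketch for the /24 used here (`ranges_overlap` is a hypothetical helper; it only checks range overlap, by comparing last octets):

```shell
# Sketch: check that the ctlplane DHCP range and the inspection range do not
# overlap. Assumes both ranges live in the same /24 and compares last octets.
last_octet() { echo "${1##*.}"; }
ranges_overlap() {
  # args: start1 end1 start2 end2 (dotted quads in the same /24)
  s1=$(last_octet "$1"); e1=$(last_octet "$2")
  s2=$(last_octet "$3"); e2=$(last_octet "$4")
  # two ranges overlap iff each starts at or before the other ends
  [ "$s1" -le "$e2" ] && [ "$s2" -le "$e1" ]
}
if ranges_overlap 10.122.0.140 10.122.0.194 10.122.0.195 10.122.0.249; then
  echo "OVERLAP: adjust dhcp/inspection ranges in undercloud.conf"
else
  echo "ranges OK"
fi
```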
## RHEL Tripleo container preparation
Generate the `/home/stack/containers-prepare-parameter.yaml` config file using the default method for a local registry on the undercloud.
```sh
sudo su - stack
openstack tripleo container image prepare default \
--local-push-destination \
--output-env-file containers-prepare-parameter.yaml
```
Add the API key used to download containers from the Red Hat public registry.
Red Hat requires containers to be pulled from registry.redhat.io using a valid API token (unique to your Red Hat account); containers-prepare-parameter.yaml must be modified to include the API key.
The opensource tripleo documentation explains containers-prepare-parameter.yaml in more detail; for a quick deployment use the following instructions.
> https://access.redhat.com/RegistryAuthentication
Edit `containers-prepare-parameter.yaml` to include the Red Hat registry bearer token.
```sh
nano -cw /home/stack/containers-prepare-parameter.yaml
parameter_defaults:
ContainerImagePrepare:
- push_destination: true
set:
<....settings....>
tag_from_label: '{version}-{release}'
ContainerImageRegistryLogin: true
ContainerImageRegistryCredentials:
registry.redhat.io:
4110881|osp16-undercloud: long-bearer-token-here
```
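Rather than hand-editing the file, the credentials stanza can be appended in one step with the token passed in from outside (keeping it out of shell history). A sketch, assuming the generated file keeps its top-level keys under two-space indentation as shown above; `append_registry_creds` is a hypothetical helper and the service-account name is the one from this document:

```shell
# Sketch: append the registry login stanza to the generated
# containers-prepare-parameter.yaml. Token is passed as an argument rather
# than typed inline, so it stays out of shell history.
append_registry_creds() {
  prep_file="$1"
  token="$2"
  # two-space indent keeps the keys inside the parameter_defaults: mapping
  cat >> "$prep_file" <<EOF
  ContainerImageRegistryLogin: true
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      4110881|osp16-undercloud: ${token}
EOF
}
# On the undercloud, something like:
#   append_registry_creds /home/stack/containers-prepare-parameter.yaml "$RH_REGISTRY_TOKEN"
```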
## Deploy the undercloud
Shut down the Undercloud VM instance and take a snapshot in Proxmox; call it `pre_undercloud_deploy`.
```sh
openstack undercloud install --dry-run
time openstack undercloud install
#time openstack undercloud install --verbose # if there are failing tasks
##########################################################
The Undercloud has been successfully installed.
Useful files:
Password file is at /home/stack/undercloud-passwords.conf
The stackrc file is at ~/stackrc
Use these files to interact with OpenStack services, and
ensure they are secured.
##########################################################
real 31m11.191s
user 13m28.211s
sys 3m15.817s
```
> If you need to change any configuration in undercloud.conf you can rerun the install over the top and the node **should** reconfigure itself (network changes likely necessitate redeployment; changing ipxe/inspection ranges seems to require redeploying the VM).
```sh
# update undercloud configuration, forcing regeneration of passwords 'undercloud-passwords.conf'
openstack undercloud install --force-stack-update
```
## Output
- undercloud-passwords.conf - A list of all passwords for the director services.
- stackrc - A set of initialisation variables to help you access the director command line tools.
Load env vars specific to the undercloud for the openstack cli tool.
```sh
source ~/stackrc
```
Check the openstack undercloud endpoints. After a reboot, always check the endpoints are up before performing actions.
```sh
openstack endpoint list
```