# Access to university Openstack

Edit your local `~/.ssh/config` to include the following entries (the `university-jump` entry proxies through a pre-existing `nemesis` host entry):

```
### university
Host university-jump
    HostName 144.173.114.20
    ProxyJump nemesis
    IdentityFile ~/.ssh/id_rsa
    Port 22
    User root

Host university-proxmox
    HostName 10.121.4.5
    ProxyJump university-jump
    #PreferredAuthentications password
    IdentityFile ~/.ssh/id_rsa
    Port 22
    User root

Host university-proxmox-dashboard
    HostName 10.121.4.5
    ProxyJump university-jump
    #PreferredAuthentications password
    IdentityFile ~/.ssh/id_rsa
    Port 22
    User root
    DynamicForward 8888

Host university-undercloud
    HostName 10.121.4.25
    ProxyJump university-jump
    IdentityFile ~/.ssh/id_rsa
    Port 22
    User stack
    ServerAliveInterval 100
    ServerAliveCountMax 2

Host university-ceph1
    HostName 10.121.4.7
    ProxyJump university-jump
    IdentityFile ~/.ssh/id_rsa
    Port 22
    User root

Host university-ceph2
    HostName 10.121.4.8
    ProxyJump university-jump
    IdentityFile ~/.ssh/id_rsa
    Port 22
    User root

Host university-ceph3
    HostName 10.121.4.9
    ProxyJump university-jump
    IdentityFile ~/.ssh/id_rsa
    Port 22
    User root
```

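The `DynamicForward 8888` entry on `university-proxmox-dashboard` opens a local SOCKS proxy, which can be used to reach the web dashboards listed below. A minimal sketch (assuming the config above is in place; for interactive use, point your browser at the same SOCKS proxy):

```sh
# Open the tunnel in the background (-N: no remote command).
ssh -N university-proxmox-dashboard &

# Check the Proxmox UI is reachable through the SOCKS proxy
# (-k skips verification of the self-signed certificate).
curl -k --socks5-hostname localhost:8888 https://10.121.4.5:8006/
```
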
# Logins

## Switches

| IP/Login | Password | Type | Notes |
| --- | --- | --- | --- |
| cumulus@10.122.0.250 | Password0 | 100G switch | 2x CLAG bond between 100G switches, 2x Peerlink CLAG across 100G switches to university Juniper core switches |
| cumulus@10.122.0.251 | Password0 | 100G switch | 2x CLAG bond between 100G switches, 2x Peerlink CLAG across 100G switches to university Juniper core switches |
| cumulus@10.122.0.252 | Password0 | 1G switch | 2x SFP+ 10G LAG bond between management switches, 1G ethernet uplink from each 100G switch for access |
| cumulus@10.122.0.253 | Password0 | 1G switch | 2x SFP+ 10G LAG bond between management switches |

## Node OOB (IPMI / XClarity web)

| IP | Login | Password |
| --- | --- | --- |
| 10.122.1.5 (proxmox), 10.122.1.10-12 (controller), 10.122.1.20-21 (networker), 10.122.1.30-77 (compute), 10.122.1.90-92 (ceph) | USERID | Password0 |

## Node Operating System

| IP | Login | Password |
| --- | --- | --- |
| 10.121.4.5 (proxmox hypervisor) | root | Password0 |
| 10.121.4.25 (undercloud VM) | stack OR root | Password0 |
| 10.122.0.30-32 (controller), 10.122.0.40-41 (networker), 10.122.0.50-103 (compute) | root OR heat-admin | Password0 |

## Dashboards

| Dashboard | IP / URL | Login | Password | Notes |
| --- | --- | --- | --- | --- |
| Proxmox | https://10.121.4.5:8006/ | root | Password0 | |
| Ceph | https://10.122.10.7:8443/ | admin | Password0 | 10.122.10.7,8,9 will redirect to the live dashboard |
| Ceph Grafana | https://10.121.4.7:3000/ | | | many useful dashboards for capacity and throughput |
| Ceph Alertmanager | http://10.121.4.7:9093/ | | | check Ceph alerts |
| Ceph Prometheus | http://10.121.4.7:9095/ | | | check that Prometheus is monitoring Ceph |
| Openstack Horizon | https://stack.university.ac.uk/dashboard | admin | Password0 | domain: default (for AD login the domain is 'ldap')<br>floating ip 10.121.4.14<br>find the password on the undercloud: `grep OS_PASSWORD ~/overcloudrc \| awk -F "=" '{print $2}'` |

# Networking

## Openstack control networks

- These networks reside on the primary 1G ethernet adapter.
- The IPMI network is usually only used by the undercloud; however, to facilitate IPMI fencing for Instance-HA the Openstack controller nodes also have a logical interface on this network.

| Network | VLAN | IP Range |
| --- | --- | --- |
| ControlPlane | 1 Native | 10.122.0.0/24 |
| IPMI | 2 | 10.122.1.0/24 |

## Openstack service networks

- The logical networks reside upon an OVS bridge across an LACP bond on the 2x Mellanox 25G ethernet adapters in each node.
- The 2x Mellanox 25G ethernet adapters are cabled to 100G switch1 and 100G switch2 respectively; the switches handle the LACP bond as one logical entity across switches with a CLAG.

| Network | VLAN | IP Range |
| --- | --- | --- |
| Storage Mgmt | 14 | 10.122.12.0/24 |
| Storage | 13 | 10.122.10.0/24 |
| InternalApi | 12 | 10.122.6.0/24 |
| Tenant | 11 | 10.122.8.0/24 |
| External | 1214 | 10.121.4.0/24 (gateway 10.121.4.1) |

## Ceph service networks

Use the Openstack "Storage Mgmt" network for the Ceph public network.

| Network | VLAN | IP Range |
| --- | --- | --- |
| Cluster Network | 15 | 10.122.14.0/24 |
| Public Network (Openstack Storage) | 13 | 10.122.10.0/24 |
| Management (Openstack Storage Mgmt) | 14 | 10.122.12.0/24 |

# Nodes

```
10.122.1.5  proxmox/undercloud
10.122.1.10 controller
10.122.1.11 controller
10.122.1.12 controller
10.122.1.20 networker
10.122.1.21 networker
10.122.1.30 compute SR630
10.122.1.31
10.122.1.32
10.122.1.33 faulty PSU
10.122.1.34 lost mellanox adapter
10.122.1.35
10.122.1.36
10.122.1.37 lost mellanox adapter
10.122.1.38
10.122.1.39
10.122.1.40
10.122.1.41
10.122.1.42
10.122.1.43
10.122.1.44
10.122.1.45
10.122.1.46
10.122.1.47
10.122.1.48
10.122.1.49
10.122.1.50
10.122.1.51
10.122.1.52
10.122.1.53
10.122.1.54 compute SR630v2 - expansion
10.122.1.55
10.122.1.56
10.122.1.57
10.122.1.58
10.122.1.59
10.122.1.60
10.122.1.61
10.122.1.62
10.122.1.63
10.122.1.64
10.122.1.65
10.122.1.66 faulty PSU
10.122.1.67
10.122.1.68
10.122.1.69
10.122.1.70
10.122.1.71
10.122.1.72
10.122.1.73
10.122.1.74
10.122.1.75
10.122.1.76
10.122.1.77
10.122.1.90 ceph1
10.122.1.91 ceph2
10.122.1.92 ceph3
```

# Proxmox installation

Proxmox hosts the undercloud node. This enables snapshots to assist in update/DR/rebuild scenarios; primarily it allows a point-in-time capture of working heat templates and containers.

> https://pve.proxmox.com/wiki/Installation

| setting | value |
| --- | --- |
| filesystem | xfs |
| swapsize | 8GB |
| maxroot | 50GB |
| country | United Kingdom |
| time zone | Europe/London |
| keyboard layout | United Kingdom |
| password | Password0 |
| email | user@university.ac.uk (this can be changed in the web console @ datacenter/users/root) |
| management interface | eno1 |
| hostname | pve.local |
| ip address | 10.122.0.5/24 |
| gateway | 10.122.0.1 (placeholder, there is no gateway on this range) |
| dns | 144.173.6.71 |

- Install from a standard version 7.2 ISO, using the settings listed above.
- Create a bridge on the 1G management interface; this is VLAN 1 native on the 'ctlplane' network with VLAN 2 tagged for IPMI traffic.
- Ensure the 25G interfaces are set up as an LACP bond, then create a bridge on the bond with the 'tenant', 'storage', 'internal-api' and 'external' VLANs tagged (the 'external' range has the default gateway).
- The Proxmox host has VLAN interfaces into each openstack network for introspection/debug; nmap is installed.

```sh
cat /etc/network/interfaces

# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

iface eno4 inet manual

iface enx3a68dd4a4c5f inet manual

auto ens2f0np0
iface ens2f0np0 inet manual

auto ens2f1np1
iface ens2f1np1 inet manual

auto bond0
iface bond0 inet manual
    bond-slaves ens2f0np0 ens2f1np1
    bond-miimon 100
    bond-mode 802.3ad

auto vmbr0
iface vmbr0 inet static
    address 10.122.0.5/24
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094
#vlan 1(native) 2 (tagged) ControlPlane

auto vmbr1
iface vmbr1 inet manual
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094
#vlan 1(native) 11 12 13 1214 (tagged)

auto vlan2
iface vlan2 inet static
    address 10.122.1.5/24
    vlan-raw-device vmbr0
#IPMI

auto vlan13
iface vlan13 inet static
    address 10.122.10.5/24
    vlan-raw-device vmbr1
#Storage

auto vlan1214
iface vlan1214 inet static
    address 10.121.4.5/24
    gateway 10.121.4.1
    vlan-raw-device vmbr1
#External

auto vlan12
iface vlan12 inet static
    address 10.122.6.5/24
    vlan-raw-device vmbr1
#InternalApi

auto vlan11
iface vlan11 inet static
    address 10.122.8.5/24
    vlan-raw-device vmbr1
#Tenant
```

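After applying this file, the bond and VLAN-aware bridges can be sanity-checked with standard Linux tooling; a quick sketch (interface names as configured above):

```sh
# Apply changes (Proxmox uses ifupdown2) and verify the LACP bond.
ifreload -a
cat /proc/net/bonding/bond0   # expect "802.3ad" and both slaves up

# Confirm the bridges carry the expected VLANs and addresses are assigned.
bridge vlan show
ip -br addr show
```
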
Setup the no-subscription repository.
|
||||
|
||||
```sh
|
||||
# comment/disable enterprise repo
|
||||
nano -cw /etc/apt/sources.list.d/pve-enterprise.list
|
||||
|
||||
#deb https://enterprise.proxmox.com/debian/pve bullseye pve-enterprise
|
||||
|
||||
# insert pve-no-subscription repo
|
||||
nano -cw /etc/apt/sources.list
|
||||
|
||||
deb http://ftp.uk.debian.org/debian bullseye main contrib
|
||||
deb http://ftp.uk.debian.org/debian bullseye-updates main contrib
|
||||
# security updates
|
||||
deb http://security.debian.org bullseye-security main contrib
|
||||
# pve-no-subscription
|
||||
deb http://download.proxmox.com/debian/pve bullseye pve-no-subscription
|
||||
|
||||
# update
|
||||
apt-get update
|
||||
apt-get upgrade -y
|
||||
reboot
|
||||
```
|
||||
|
||||
Download some LXC containers.

- LXC is not used in production, but during the build LXC containers with network interfaces in all ranges (last octet suffix .6) were used to debug IP connectivity, verify switch configuration and serve linux boot images over NFS for XClarity (a container-creation sketch follows the listing below).

```sh
pveam update
pveam available --section system
pveam download local almalinux-8-default_20210928_amd64.tar.xz
pveam download local rockylinux-8-default_20210929_amd64.tar.xz
pveam download local ubuntu-18.04-standard_18.04.1-1_amd64.tar.gz
pveam download local ubuntu-22.04-standard_22.04-1_amd64.tar.zst
pveam list local

NAME                                                        SIZE
local:vztmpl/almalinux-8-default_20210928_amd64.tar.xz      109.08MB
local:vztmpl/rockylinux-8-default_20210929_amd64.tar.xz     107.34MB
local:vztmpl/ubuntu-18.04-standard_18.04.1-1_amd64.tar.gz   203.54MB
local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst    123.81MB
```

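A hypothetical sketch of creating one such debug container with a leg in the Storage VLAN; the VMID, storage name and `.6` address are illustrative:

```sh
# Debug container on vmbr1, tagged into the Storage VLAN (13).
pct create 106 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
  --hostname debug-storage \
  --storage local-lvm \
  --net0 name=eth0,bridge=vmbr1,tag=13,ip=10.122.10.6/24
pct start 106
# basic reachability test towards the Proxmox host's Storage VLAN address
pct exec 106 -- ping -c 3 10.122.10.5
```
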
# Undercloud VM instance

## Download RHEL 8.4 full DVD image

Select the RHEL 8.4 image and choose the full image rather than the boot image. This allows installation without registering the system during the installer; you may then attach the system to a license via the `subscription-manager` tool after the host is built.

## Install spec

- RHEL8 (RHEL 8.4 specifically)
- 1 socket, 16 cores (must use HOST cpu type for nested virtualization)
- 24GB ram
- 100GB disk (/root 89GiB lvm, /boot 1024MiB, swap 10GiB lvm)
- ControlPlane network interface on vmbr0, no/native vlan, 10.122.0.25/24, ens18
- IPMI network interface on vmbr0, vlan 2 (vlan assigned in proxmox, not the OS), 10.122.1.25/24, ens19
- External/Routable network interface on vmbr1, vlan 1214 (vlan assigned in proxmox, not the OS), 10.121.4.25/24, gateway 10.121.4.1, dns 144.173.6.71, 1.1.1.1, ens20
- ensure no network interface has the firewall enabled in proxmox or the OS (mac spoofing will be required and should be allowed in the firewall if used)
- root:Password0
- undercloud.local
- minimal install with QEMU guest agents
- will require registering with the Red Hat subscription service
- a CLI sketch of this spec follows this list

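The same spec can be reproduced from the Proxmox CLI instead of the web UI; a hypothetical sketch (the VMID, storage and ISO names are illustrative):

```sh
qm create 100 --name undercloud --sockets 1 --cores 16 --cpu host \
  --memory 24576 --agent enabled=1 \
  --net0 virtio,bridge=vmbr0 \
  --net1 virtio,bridge=vmbr0,tag=2 \
  --net2 virtio,bridge=vmbr1,tag=1214 \
  --scsihw virtio-scsi-pci --scsi0 local-lvm:100 \
  --cdrom local:iso/rhel-8.4-x86_64-dvd.iso
```
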
## OCF partner subscription entitlement

Register for a partner product entitlement.

> https://partnercenter.redhat.com/NFRPageLayout
> Product: Red Hat OpenStack Platform, Standard Support (4 Sockets, NFR, Partner Only) - 25.0 Units

Once the customer has purchased the entitlement, this should be present in their own Red Hat portal to consume on the production nodes.

## Register undercloud node with the required software repositories

> [https://access.redhat.com/documentation/en-us/red\_hat\_openstack\_platform/16.2/html/director\_installation\_and\_usage/assembly_preparing-for-director-installation#enabling-repositories-for-the-undercloud](https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/director_installation_and_usage/assembly_preparing-for-director-installation#enabling-repositories-for-the-undercloud)

Browse to:

> https://access.redhat.com/management/systems/create

Create a new system with the following attributes.

- Virtual System
- Name: university_test
- Architecture: x86_64
- Number of vCPUs: 16
- Red Hat Enterprise Linux Version: 8
- Create

Attach the following initial subscription: 'Red Hat Enterprise Linux, Self-Support (128 Sockets, NFR, Partner Only)'.
Note the name and UUID of the system.

Register the system.

```sh
sudo su -
[root@undercloud ~]# subscription-manager register --name=university_test --consumerid=f870ae18-6664-4206-9a89-21f24f312866 --username=tseed@ocf.co.uk
Registering to: subscription.rhsm.redhat.com:443/subscription
Password:
The system has been registered with ID: a1b24b8a-933b-4ce8-8244-1a7e16ff51a3
The registered system name is: university_test

#[root@undercloud ~]# subscription-manager refresh
[root@undercloud ~]# subscription-manager list
+-------------------------------------------+
Installed Product Status
+-------------------------------------------+
Product Name:   Red Hat Enterprise Linux for x86_64
Product ID:     479
Version:        8.4
Arch:           x86_64
Status:         Subscribed
Status Details:
Starts:         06/13/2022
Ends:           06/13/2023

[root@undercloud ~]# subscription-manager identity
system identity: f870ae18-6664-4206-9a89-21f24f312866
name: university_test
org name: 4110881
org ID: 4110881
```

Attach an entitlement to the registered system.

```sh
# Check the entitlement/purchased-products portal;
# you will find the SKU under a contract - this will help to identify the openstack entitlement if you have multiple.
# Find a suitable entitlement pool ID for Red Hat OpenStack Director Deployment Tools.
subscription-manager list --available --all
subscription-manager list --available --all --matches="*OpenStack*"

Subscription Name: Red Hat OpenStack Platform, Standard Support (4 Sockets, NFR, Partner Only)
SKU:               SER0505
Contract:          13256907
Pool ID:           8a82c68d812ba3c301815c6f842f5ecf

# attach to the entitlement pool ID
subscription-manager attach --pool=8a82c68d812ba3c301815c6f842f5ecf

Successfully attached a subscription for: Red Hat OpenStack Platform, Standard Support (4 Sockets, NFR, Partner Only)
1 local certificate has been deleted.

# set the release version statically
subscription-manager release --set=8.4
```

|
||||
Enable repositories, set version of container-tools, update packages.
|
||||
|
||||
```sh
|
||||
subscription-manager repos --disable=* ;\
|
||||
subscription-manager repos \
|
||||
--enable=rhel-8-for-x86_64-baseos-eus-rpms \
|
||||
--enable=rhel-8-for-x86_64-appstream-eus-rpms \
|
||||
--enable=rhel-8-for-x86_64-highavailability-eus-rpms \
|
||||
--enable=ansible-2.9-for-rhel-8-x86_64-rpms \
|
||||
--enable=openstack-16.2-for-rhel-8-x86_64-rpms \
|
||||
--enable=fast-datapath-for-rhel-8-x86_64-rpms ;\
|
||||
dnf module disable -y container-tools:rhel8 ;\
|
||||
dnf module enable -y container-tools:3.0 ;\
|
||||
dnf update -y
|
||||
|
||||
reboot
|
||||
```
|
||||
|
||||
## Install Tripleo client

```sh
# install tripleoclient for the undercloud installation
dnf install -y python3-tripleoclient

# these packages are advised for the TLS-everywhere functionality; probably not required for an external TLS endpoint but won't hurt
dnf install -y python3-ipalib python3-ipaclient krb5-devel python3-novajoin
```

Install the Ceph-Ansible packages. Even if you are not initially using Ceph, it cannot hurt to have an undercloud capable of deploying Ceph; to use external Ceph (i.e. not deployed by tripleo) you will need the following package.

There are different packages for different versions of Ceph; this is especially relevant when using external Ceph.

> https://access.redhat.com/solutions/2045583

- Redhat Ceph 4.1 = Nautilus release
- Redhat Ceph 5.1 = Pacific release

```sh
subscription-manager repos | grep -i ceph

# Nautilus (default version in use with Tripleo-deployed Ceph)
#subscription-manager repos --enable=rhceph-4-tools-for-rhel-8-x86_64-rpms

# Pacific (if you are using external Ceph from the opensource repos you will likely be using this version)
#dnf remove -y ceph-ansible
#subscription-manager repos --disable=rhceph-4-tools-for-rhel-8-x86_64-rpms
subscription-manager repos --enable=rhceph-5-tools-for-rhel-8-x86_64-rpms

# install
dnf info ceph-ansible
dnf install -y ceph-ansible
```

# Configure and deploy the Tripleo undercloud

## Prepare host

Disable firewalld.

```sh
systemctl disable firewalld
systemctl stop firewalld
```

Create the user and sudoers entry, and push an ssh key. Sudoers access is required by the tripleo installer.

```sh
groupadd -r -g 1001 stack && useradd -r -u 1001 -g 1001 -m -s /bin/bash stack
echo "%stack ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/stack
chmod 0440 /etc/sudoers.d/stack
passwd stack # password is Password0
exit

ssh-copy-id -i ~/.ssh/id_rsa.pub stack@university-new-undercloud
```

Set up the local ssh config.

```sh
nano -cw ~/.ssh/config

Host undercloud
    Hostname 10.121.4.25
    User stack
    IdentityFile ~/.ssh/id_rsa
```

Set the timezone and hostname (leave SElinux enabled, RHOSP tripleo requires it), and install packages.

```sh
ssh undercloud
sudo su -

timedatectl set-timezone Europe/London
dnf install chrony nano -y

# replace server/pool entries with a PHC (high precision clock device) entry to use the hypervisor's hardware clock
# (which in turn is sync'd from the online ntp pool); this should be the most accurate time source for a VM
# the LXC container running ntp (192.168.101.43) does actually use the hypervisor hardware clock; the LXC container and VM should be on the same hypervisor if this is used

nano -cw /etc/chrony.conf

#server 192.168.101.43 iburst
#pool 2.centos.pool.ntp.org iburst
refclock PHC /dev/ptp0 poll 2

systemctl enable chronyd
echo ptp_kvm > /etc/modules-load.d/ptp_kvm.conf

# the undercloud installer should set the hostname based on the 'undercloud_hostname' entry in the undercloud.conf config file
# you can set it before deployment with the following; the opensource tripleo documentation advises allowing the undercloud installer to set it
hostnamectl set-hostname undercloud.local
hostnamectl set-hostname --transient undercloud.local

# RHOSP hosts entry
nano -cw /etc/hosts
10.121.4.25 undercloud.local undercloud

reboot
hostname -A
hostname -s

# install some useful tools
sudo su -
dnf update -y
dnf install qemu-guest-agent nano tree lvm2 chrony telnet traceroute net-tools bind-utils python3 yum-utils mlocate ipmitool tmux wget -y

# a shutdown is needed for the qemu-guest tools to function; ensure the VM profile on the hypervisor has guest agents enabled
shutdown -h now
```

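After the reboot it is worth confirming chrony actually selected the PHC reference clock; a quick check with standard chrony commands:

```sh
chronyc sources -v   # expect a PHC0 refclock entry selected with '*'
chronyc tracking     # the reported offset should be very small
```
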
## Build the undercloud config file

The first interface (enp6s18 on the proxmox VM instance) will be on the ControlPlane range.

- Controller nodes are in all networks but cannot install nmap; hosts in a range can be found with `for ip in 10.122.6.{1..254}; do ping -c 1 -t 1 $ip > /dev/null && echo "${ip} is up"; done`.
- Proxmox has interfaces in every network and nmap installed; `nmap -sn 10.122.6.0/24` will assist with debug.

| Node | IPMI VLAN2 | Ctrl_plane VLAN1 | External VLAN1214 | Internal_api VLAN12 | Storage VLAN13 | Tenant VLAN11 |
| --- | --- | --- | --- | --- | --- | --- |
| Proxmox | 10.122.1.54 (IPMI) (Proxmox interface 10.122.1.5) | 10.122.0.5 | 10.121.4.5 | 10.122.6.5 | 10.122.10.5 | 10.122.8.5 |
| Undercloud | 10.122.1.25 | 10.122.0.25-27 (br-ctlplane) | 10.121.4.25 (Undercloud VM) | NA | NA | NA |
| Temporary Storage Nodes | 10.122.1.55-57 | NA | 10.121.4.7-9 | NA | 10.122.10.7-9 | NA |
| Overcloud Controllers | 10.122.1.10-12 (Instance-HA 10.122.1.80-82) | 10.122.0.30-32 | 10.121.4.30-32 | 10.122.6.30-32 | 10.122.10.30-32 | 10.122.8.30-32 |
| Overcloud Networkers | 10.122.1.20-21 | 10.122.0.40-41 | NA (reserved 10.121.4.23-24) | 10.122.6.40-41 | NA | 10.122.8.40-41 |
| Overcloud Compute | 10.122.1.30-53/54,58-77 | 10.122.0.50-103 | NA | 10.122.6.50-103 | 10.122.10.50-103 | 10.122.8.50-103 |

```sh
sudo su - stack
nano -cw /home/stack/undercloud.conf

[DEFAULT]
certificate_generation_ca = local
clean_nodes = true
cleanup = true
container_cli = podman
container_images_file = containers-prepare-parameter.yaml
discovery_default_driver = ipmi
enable_ironic = true
enable_ironic_inspector = true
enable_nova = true
enabled_hardware_types = ipmi
generate_service_certificate = true
inspection_extras = true
inspection_interface = br-ctlplane
ipxe_enabled = true
ironic_default_network_interface = flat
ironic_enabled_network_interfaces = flat
local_interface = enp6s18
local_ip = 10.122.0.25/24
local_mtu = 1500
local_subnet = ctlplane-subnet
overcloud_domain_name = university.ac.uk
subnets = ctlplane-subnet
undercloud_admin_host = 10.122.0.27
undercloud_debug = true
undercloud_hostname = undercloud.local
undercloud_nameservers = 144.173.6.71,1.1.1.1
undercloud_ntp_servers = ntp.university.ac.uk,0.pool.ntp.org
undercloud_public_host = 10.122.0.26

[ctlplane-subnet]
cidr = 10.122.0.0/24
#dhcp_end = 10.122.0.140
#dhcp_start = 10.122.0.80
dhcp_end = 10.122.0.194
dhcp_start = 10.122.0.140
#dns_nameservers =
gateway = 10.122.0.25
#inspection_iprange = 10.122.0.141,10.122.0.201
inspection_iprange = 10.122.0.195,10.122.0.249
masquerade = true
```

## RHEL Tripleo container preparation

Generate the `/home/stack/containers-prepare-parameter.yaml` config file using the default method for a local registry on the undercloud.

```sh
sudo su - stack
openstack tripleo container image prepare default \
  --local-push-destination \
  --output-env-file containers-prepare-parameter.yaml
```

Add the API key to download containers from the Red Hat Quay public registry.

Red Hat requires containers to be pulled from Quay.io using a valid API token (unique to your Red Hat account), so `containers-prepare-parameter.yaml` must be modified to include the API key.
The opensource tripleo documentation explains `containers-prepare-parameter.yaml` in more detail; for a quick deployment use the following instructions.

> https://access.redhat.com/RegistryAuthentication

Edit `containers-prepare-parameter.yaml` to include the Redhat Quay bearer token.

```sh
nano -cw /home/stack/containers-prepare-parameter.yaml

parameter_defaults:
  ContainerImagePrepare:
  - push_destination: true
    set:
      <....settings....>
    tag_from_label: '{version}-{release}'
  ContainerImageRegistryLogin: true
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      4110881|osp16-undercloud: long-bearer-token-here
```

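Before deploying, it can be worth confirming the service-account token actually authenticates; a sketch using podman (the username and token are the placeholders from the file above):

```sh
podman login registry.redhat.io \
  --username '4110881|osp16-undercloud' \
  --password 'long-bearer-token-here'
# expect "Login Succeeded!"
```
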
## Deploy the undercloud

Shut down the Undercloud VM instance and take a snapshot in Proxmox, calling it `pre_undercloud_deploy`.

```sh
openstack undercloud install --dry-run
time openstack undercloud install
#time openstack undercloud install --verbose # if there are failing tasks

##########################################################

The Undercloud has been successfully installed.

Useful files:

Password file is at /home/stack/undercloud-passwords.conf
The stackrc file is at ~/stackrc

Use these files to interact with OpenStack services, and
ensure they are secured.

##########################################################


real    31m11.191s
user    13m28.211s
sys     3m15.817s
```

> If you need to change any configuration in the undercloud.conf you can rerun the install over the top and the node **should** reconfigure itself (network changes likely necessitate redeployment; changing the ipxe/inspection ranges seems to require redeployment of the VM).

```sh
# update the undercloud configuration, forcing regeneration of passwords in 'undercloud-passwords.conf'
openstack undercloud install --force-stack-update
```

## Output

- undercloud-passwords.conf - A list of all passwords for the director services.
- stackrc - A set of initialisation variables to help you access the director command line tools.

Load the env vars specific to the undercloud for the openstack cli tool.

```sh
source ~/stackrc
```

Check the openstack undercloud endpoints; after a reboot, always check the endpoints are up before performing actions.

```sh
openstack endpoint list
```

## Obtain images for overcloud nodes RHEL/RHOSP Tripleo

> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/director_installation_and_usage/assembly_installing-director-on-the-undercloud#proc_single-cpu-architecture-overcloud-images_overcloud-images

Download the images directly from Redhat and upload them to the undercloud swift API.

```sh
sudo su - stack
source ~/stackrc
sudo dnf install -y rhosp-director-images-ipa-x86_64 rhosp-director-images-x86_64
mkdir ~/images
cd ~/images
for i in /usr/share/rhosp-director-images/overcloud-full-latest-16.2.tar /usr/share/rhosp-director-images/ironic-python-agent-latest-16.2.tar; do tar -xvf $i; done
openstack overcloud image upload --image-path /home/stack/images/
openstack image list
ll /var/lib/ironic/httpboot # look for the inspector ipxe config and the kernel and initramfs files
```

## Import bare metal nodes

### Build node definition list

This is commonly referred to as the `instackenv.json` file; Redhat references this as the node definition template `nodes.json`.

> the schema reference for this file:
> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/environments/baremetal.html#instackenv

Gather all IP addresses for the IPMI interfaces.

- `.[].ports.address` is the MAC address for iPXE boot, typically eth0.
- `.[].pm_addr` is the IP address of the IPMI adapter.
- If the IPMI interface is shared with the eth0 control plane interface, the same MAC address will be used for iPXE boot.
- If the IPMI interface and eth0 interface are not shared (have different MAC addresses), you may have a tedious task ahead of you: searching through the XClarity out-of-band adapters, or looking through the switch MAC table and then correlating the switch port to the node to enumerate the MAC address.
- University nodes do share a single interface for IPMI and iPXE, but the MAC addresses are different.

```sh
# METHOD 1 - will not work for University SR630 servers
# use where the IPMI and PXE interfaces share the same MAC address (NOTE this is not the case for the Lenovo SR630 with the OCP network adapter working to bridge the XClarity/IPMI)

# Scan the IPMI port of all hosts.
sudo dnf install nmap -y
nmap -p 623 10.122.1.0/24

# Query the arp table to return the MAC addresses of the IPMI (thus PXE) interfaces.
ip neigh show dev enp6s19

# controller 10-12, 20-21 networker, 30-77 compute, (54 temporary proxmox, 55-57 temporary storage nodes - remove from compute range)
#ipmitool -N 1 -R 0 -I lanplus -H 10.122.1.10 -U USERID -P Password0 lan print
for i in {10..80}; do j=10.122.1.$i ; ip --json neigh show dev enp6s19 | jq -r " .[] | select(.dst==\"$j\") | \"\(.dst) \(.lladdr)\""; done | grep -v null

10.122.1.10 38:68:dd:4a:56:3c
10.122.1.11 38:68:dd:4a:55:94
10.122.1.12 38:68:dd:4a:42:4c
10.122.1.20 38:68:dd:4a:4a:34
10.122.1.21 38:68:dd:4a:52:1c
10.122.1.30 38:68:dd:4c:17:ec
10.122.1.31 38:68:dd:4c:17:b4
10.122.1.32 38:68:dd:4d:1e:84
10.122.1.33 38:68:dd:4d:0f:f4
10.122.1.34 38:68:dd:4d:26:ac
10.122.1.35 38:68:dd:4d:1b:f4
10.122.1.36 38:68:dd:4a:46:4c
10.122.1.37 38:68:dd:4d:16:7c
10.122.1.38 38:68:dd:4d:15:8c
10.122.1.39 38:68:dd:4d:1a:4c
10.122.1.40 38:68:dd:4a:75:94
10.122.1.41 38:68:dd:4d:1c:fc
10.122.1.42 38:68:dd:4d:19:0c
10.122.1.43 38:68:dd:4a:43:ec
10.122.1.44 38:68:dd:4a:41:4c
10.122.1.45 38:68:dd:4d:14:24
10.122.1.46 38:68:dd:4d:18:c4
10.122.1.47 38:68:dd:4d:18:cc
10.122.1.48 38:68:dd:4a:41:8c
10.122.1.49 38:68:dd:4c:17:8c
10.122.1.50 38:68:dd:4c:17:2c
10.122.1.51 38:68:dd:4d:1d:cc
10.122.1.52 38:68:dd:4c:17:e4
10.122.1.53 38:68:dd:4c:17:5c
10.122.1.54 38:68:dd:70:a8:e8
10.122.1.55 38:68:dd:70:a0:84
10.122.1.56 38:68:dd:70:a4:cc
10.122.1.57 38:68:dd:70:aa:cc
10.122.1.58 38:68:dd:70:a8:88
10.122.1.59 38:68:dd:70:a5:bc
10.122.1.60 38:68:dd:70:a5:54
10.122.1.61 38:68:dd:70:a2:e0
10.122.1.62 38:68:dd:70:a2:b8
10.122.1.63 38:68:dd:70:a7:10
10.122.1.64 38:68:dd:70:a2:0c
10.122.1.65 38:68:dd:70:9f:38
10.122.1.66 38:68:dd:70:a8:74
10.122.1.67 38:68:dd:70:a2:ac
10.122.1.68 38:68:dd:70:a5:18
10.122.1.69 38:68:dd:70:a7:88
10.122.1.70 38:68:dd:70:a4:d8
10.122.1.71 38:68:dd:70:a6:b0
10.122.1.72 38:68:dd:70:aa:c4
10.122.1.73 38:68:dd:70:9e:e0
10.122.1.74 38:68:dd:70:a3:40
10.122.1.75 38:68:dd:70:a2:08
10.122.1.76 38:68:dd:70:a4:a0
10.122.1.77 38:68:dd:70:a1:6c

# METHOD 2 - used for University SR630 servers
# use where the IPMI interface and eth0 interface are not shared (or have different MAC addresses)

## install the XClarity CLI
mkdir onecli
cd onecli
curl -o lnvgy_utl_lxce_onecli02a-3.5.0_rhel_x86-64.tgz https://download.lenovo.com/servers/mig/2022/06/01/55726/lnvgy_utl_lxce_onecli02a-3.5.0_rhel_x86-64.tgz
tar -xvzf lnvgy_utl_lxce_onecli02a-3.5.0_rhel_x86-64.tgz

## XClarity CLI - find the MAC of the eth0 device
### find all config items
./onecli config show all --bmc USERID:Password0@10.122.1.10 --never-check-trust --nolog
### find a specific item
./onecli config show IMM.HostIPAddress1 --bmc USERID:Password0@10.122.1.10 --never-check-trust --nolog --quiet
./onecli config show IntelREthernetConnectionX722for1GbE--OnboardLAN1PhysicalPort1LogicalPort1.MACAddress --never-check-trust --nolog --quiet

### find the MAC address for eth0 (assuming eth0 is connected)
#### for the original SR630 University nodes
for i in {10..53}; do j=10.122.1.$i ; echo $j $(sudo ./onecli config show IntelREthernetConnectionX722for1GbE--OnboardLAN1PhysicalPort1LogicalPort1.MACAddress --bmc USERID:Password0@$j --never-check-trust --nolog --quiet | grep IntelREthernetConnectionX722for1GbE--OnboardLAN1PhysicalPort1LogicalPort1.MACAddress | awk -F '=' '{print $2}' | tr '[:upper:]' '[:lower:]'); done

## SR630
# controllers
10.122.1.10 38:68:dd:4a:56:38
10.122.1.11 38:68:dd:4a:55:90
10.122.1.12 38:68:dd:4a:42:48
# networkers
10.122.1.20 38:68:dd:4a:4a:30
10.122.1.21 38:68:dd:4a:52:18
# compute
10.122.1.30 38:68:dd:4c:17:e8
10.122.1.31 38:68:dd:4c:17:b0
10.122.1.32 38:68:dd:4d:1e:80
10.122.1.33 38:68:dd:4d:0f:f0
10.122.1.34 38:68:dd:4d:26:a8
10.122.1.35 38:68:dd:4d:1b:f0
10.122.1.36 38:68:dd:4a:46:48
10.122.1.37 38:68:dd:4d:16:78
10.122.1.38 38:68:dd:4d:15:88
10.122.1.39 38:68:dd:4d:1a:48
10.122.1.40 38:68:dd:4a:75:90
10.122.1.41 38:68:dd:4d:1c:f8
10.122.1.42 38:68:dd:4d:19:08
10.122.1.43 38:68:dd:4a:43:e8
10.122.1.44 38:68:dd:4a:41:48
10.122.1.45 38:68:dd:4d:14:20
10.122.1.46 38:68:dd:4d:18:c0
10.122.1.47 38:68:dd:4d:18:c8
10.122.1.48 38:68:dd:4a:41:88
10.122.1.49 38:68:dd:4c:17:88
10.122.1.50 38:68:dd:4c:17:28
10.122.1.51 38:68:dd:4d:1d:c8
10.122.1.52 38:68:dd:4c:17:e0
10.122.1.53 38:68:dd:4c:17:58

## SR630v2 nodes have a different OCP network adapter
for i in {54..77}; do j=10.122.1.$i ; echo $j $(sudo ./onecli config show IntelREthernetNetworkAdapterI350-T4forOCPNIC30--Slot4PhysicalPort1LogicalPort1.MACAddress --bmc USERID:Password0@$j --never-check-trust --nolog --quiet | grep IntelREthernetNetworkAdapterI350-T4forOCPNIC30--Slot4PhysicalPort1LogicalPort1.MACAddress | awk -F '=' '{print $2}' | tr '[:upper:]' '[:lower:]'); done

10.122.1.54 6c:fe:54:32:b8:60
10.122.1.55 6c:fe:54:33:4f:3c
10.122.1.56 6c:fe:54:33:55:74
10.122.1.57 6c:fe:54:33:4b:5c
10.122.1.58 6c:fe:54:33:4f:d2
10.122.1.59 6c:fe:54:33:53:ae
10.122.1.60 6c:fe:54:33:4f:7e
10.122.1.61 6c:fe:54:33:97:46
10.122.1.62 6c:fe:54:33:57:18
10.122.1.63 6c:fe:54:33:4e:fa
10.122.1.64 6c:fe:54:33:53:ea
10.122.1.65 6c:fe:54:33:4d:f8
10.122.1.66 6c:fe:54:33:4d:2c
10.122.1.67 6c:fe:54:32:e8:4e
10.122.1.68 6c:fe:54:33:55:fe
10.122.1.69 6c:fe:54:33:4b:86
10.122.1.70 6c:fe:54:33:55:56
10.122.1.71 6c:fe:54:33:4e:b2
10.122.1.72 6c:fe:54:33:57:12
10.122.1.73 6c:fe:54:33:4e:d6
10.122.1.74 6c:fe:54:33:51:98
10.122.1.75 6c:fe:54:33:4d:62
10.122.1.76 6c:fe:54:33:55:50
10.122.1.77 6c:fe:54:32:f0:2a
```

Create each node configuration in the "nodes" list in `/home/stack/instackenv.json`.

```json
{
  "nodes": [
    {
      "ports": [
        {
          "address": "38:68:dd:4a:42:4c",
          "physical_network": "ctlplane"
        }
      ],
      "name": "osctl0",
      "cpu": "4",
      "memory": "6144",
      "disk": "120",
      "arch": "x86_64",
      "pm_type": "ipmi",
      "pm_user": "USERID",
      "pm_password": "Password0",
      "pm_addr": "10.122.1.10",
      "capabilities": "profile:baremetal,boot_option:local",
      "_comment": "rack - openstack - location - u5"
    },
    {
      "ports": [
        {
          "address": "38:68:dd:4a:4a:34",
          "physical_network": "ctlplane"
        }
      ],
      "name": "osnet1",
      "cpu": "4",
      "memory": "6144",
      "disk": "120",
      "arch": "x86_64",
      "pm_type": "ipmi",
      "pm_user": "USERID",
      "pm_password": "Password0",
      "pm_addr": "10.122.1.21",
      "capabilities": "profile:baremetal,boot_option:local",
      "_comment": "rack - openstack - location - u9"
    },
    {
      "ports": [
        {
          "address": "38:68:dd:4c:17:e4",
          "physical_network": "ctlplane"
        }
      ],
      "name": "oscomp1",
      "cpu": "4",
      "memory": "6144",
      "disk": "120",
      "arch": "x86_64",
      "pm_type": "ipmi",
      "pm_user": "USERID",
      "pm_password": "Password0",
      "pm_addr": "10.122.1.31",
      "capabilities": "profile:baremetal,boot_option:local",
      "_comment": "rack - openstack - location - u11"
    }
  ]
}
```

- You do not have to include capabilities at this stage; they are added later for the overcloud deployment.
- The capabilities 'profile:flavour' and 'boot_option:local' are good defaults; more capabilities will be automatically added during introspection and manually added when binding a node to a role.

## Setup RAID + Legacy BIOS boot mode

> IMPORTANT: UEFI boot does work on the SR630 as expected; however, it can take a very long time to cycle through the interfaces to the PXE boot interface.
> On large deployments you may reach the timeout on the DHCP server entry; BIOS mode is quicker to get to the PXE rom.

Use `/home/stack/instackenv.json` to start each node, then log in to each node's XClarity web interface and set up a RAID1 array of the boot disks.

```sh
# check node power states
for i in `jq -r .nodes[].pm_addr instackenv.json`; do ipmitool -N 1 -R 0 -I lanplus -H $i -U USERID -P Password0 chassis status | grep ^System;done

# start all nodes
for i in `jq -r .nodes[].pm_addr instackenv.json`; do ipmitool -N 1 -R 0 -I lanplus -H $i -U USERID -P Password0 chassis power on ;done
for i in `jq -r .nodes[].pm_addr instackenv.json`; do ipmitool -N 1 -R 0 -I lanplus -H $i -U USERID -P Password0 chassis status | grep ^System;done

# get each IP and log in to the XClarity web console
# configure a RAID1 array on each node
# set the boot option from UEFI to LEGACY/BIOS boot mode
for i in `jq -r .nodes[].pm_addr instackenv.json`; do echo $i ;done

# stop all nodes
for i in `jq -r .nodes[].pm_addr instackenv.json`; do ipmitool -N 1 -R 0 -I lanplus -H $i -U USERID -P Password0 chassis power off ;done
```

## Import nodes into the undercloud

> WARNING: the capabilities field key/value 'node:compute-0, node:compute-1, node:compute-N' must be contiguous; the University has a node with broken hardware, 'oscomp9', that is not in the `instackenv.json` file.
> WARNING: Each capability key/value 'node:\<type\>-#' must be in sequence; with oscomp9 removed from the `instackenv.json` we add the keypairs as: `oscomp8 = computeA-8 AND oscomp10 = computeA-9`.

**Notice the University cluster has 2 different server hardware types with different network interface mappings. The node capabilities (node:computeA-0 vs node:computeB-0) will be used in `scheduler_hints.yaml` to bind nodes to roles; there need to be 2 roles for the compute nodes to allow each server type to have a different 'associated' network interface mapping scheme.**

```sh
# load credentials
source ~/stackrc

# remove nodes if this is not the first run
#for i in `openstack baremetal node list -f json | jq -r .[].Name`; do openstack baremetal node manage $i;done
#for i in `openstack baremetal node list -f json | jq -r .[].Name`; do openstack baremetal node delete $i;done

# ping all nodes to update the arp cache
#for i in `jq -r .nodes[].pm_addr instackenv.json`; do sudo ping -c 3 -W 5 $i ;done
nmap -p 623 10.122.1.0/24

# import nodes
openstack overcloud node import instackenv.json

# set nodes to use BIOS boot mode for the overcloud installation
for i in `openstack baremetal node list -f json | jq -r .[].Name` ; do openstack baremetal node set --property capabilities="boot_mode:bios,$(openstack baremetal node show $i -f json -c properties | jq -r .properties.capabilities | sed "s/boot_mode:[^,]*,//g")" $i; done

# set the baremetal profile on nodes so scheduler_hints.yaml can select them as candidates
for i in `openstack baremetal node list -f json | jq -r .[].Name` ; do openstack baremetal node set --property capabilities="profile:baremetal,$(openstack baremetal node show $i -f json -c properties | jq -r .properties.capabilities | sed "s/profile:baremetal[^,]*,//g")" $i; done

## where some nodes cannot deploy
# oscomp4, oscomp7 have been removed from the instackenv.json owing to network card issues
# owing to the way we set the node capability using a loop index, oscomp8 will be named in openstack as computeA-6
#
# openstack baremetal node show oscomp8 -f json -c properties | jq .properties.capabilities
# "node:computeA-6,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal"
#
# if you do not have a full complement of nodes, ensure templates/scheduler_hints_env.yaml has the correct number of nodes, in this case 22 computeA nodes
# ControllerCount: 3
# NetworkerCount: 2
# #2 nodes removed owing to network card issues
# #ComputeACount: 24
# ComputeACount: 22
# ComputeBCount: 24

# set the 'node:name' capability to allow scheduler_hints.yaml to match roles to nodes
## set capability for controller and networker nodes
openstack baremetal node set --property capabilities="node:controller-0,$(openstack baremetal node show osctl0 -f json -c properties | jq -r .properties.capabilities | sed "s/node:[^,]*,//g")" osctl0 ;\
openstack baremetal node set --property capabilities="node:controller-1,$(openstack baremetal node show osctl1 -f json -c properties | jq -r .properties.capabilities | sed "s/node:[^,]*,//g")" osctl1 ;\
openstack baremetal node set --property capabilities="node:controller-2,$(openstack baremetal node show osctl2 -f json -c properties | jq -r .properties.capabilities | sed "s/node:[^,]*,//g")" osctl2 ;\
openstack baremetal node set --property capabilities="node:networker-0,$(openstack baremetal node show osnet0 -f json -c properties | jq -r .properties.capabilities | sed "s/node:[^,]*,//g")" osnet0 ;\
openstack baremetal node set --property capabilities="node:networker-1,$(openstack baremetal node show osnet1 -f json -c properties | jq -r .properties.capabilities | sed "s/node:[^,]*,//g")" osnet1

## capability for compute nodes
index=0 ; for i in {0..23}; do openstack baremetal node set --property capabilities="node:computeA-$index,$(openstack baremetal node show oscomp$i -f json -c properties | jq -r .properties.capabilities | sed "s/node:[^,]*,//g")" oscomp$i && index=$((index + 1)) ;done

## capability for *NEW* compute nodes (oscomp-24..27 are being used for temporary proxmox and ceph, thus removed from the instackenv.json) - CHECK
index=0 ; for i in {24..47}; do openstack baremetal node set --property capabilities="node:computeB-$index,$(openstack baremetal node show oscomp$i -f json -c properties | jq -r .properties.capabilities | sed "s/node:[^,]*,//g")" oscomp$i && index=$((index + 1)) ;done

# check capabilities are set for all nodes
#for i in `openstack baremetal node list -f json | jq -r .[].Name` ; do echo $i && openstack baremetal node show $i -f json -c properties | jq -r .properties.capabilities; done
for i in `openstack baremetal node list -f json | jq -r .[].Name` ; do openstack baremetal node show $i -f json -c properties | jq -r .properties.capabilities; done

# output, notice the order of the nodes
#node:controller-0,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#node:controller-1,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#node:controller-2,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#node:networker-0,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#node:networker-1,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#node:computeA-0,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#node:computeA-1,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#node:computeA-2,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#node:computeA-3,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#...
#node:computeB-0,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#node:computeB-1,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#node:computeB-2,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#node:computeB-3,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal
#...
#node:computeB-23,profile:baremetal,boot_mode:bios,boot_option:local,profile:baremetal

# all-in-one command for inspection and provisioning
#openstack overcloud node introspect --all-manageable --provide

# inspect all nodes' hardware
for i in `openstack baremetal node list -f json | jq -r .[].Name`; do openstack baremetal node inspect $i;done

# if a node fails inspection
openstack baremetal node maintenance unset oscomp9
openstack baremetal node manage oscomp9
openstack baremetal node power off oscomp9 # wait for the node to power off
openstack baremetal node inspect oscomp9

# wait until all nodes are in a 'manageable' state to continue; this may take around 15 minutes
openstack baremetal node list

# set nodes to the 'available' state via 'provide', which invokes node cleaning (uses the overcloud image)
for i in `openstack baremetal node list -f json | jq -r ' .[] | select(."Provisioning State" == "manageable") | .Name'`; do openstack baremetal node provide $i;done

# if a node fails provisioning
openstack baremetal node maintenance unset osnet1
openstack baremetal node manage osnet1
openstack baremetal node provide osnet1

# wait until all nodes are in an 'available' state to deploy the overcloud
openstack baremetal node list

# set all nodes back to the 'manageable' state to rerun introspection/provide
# for i in `openstack baremetal node list -f json | jq -r .[].Name`; do openstack baremetal node manage $i;done
```

## Checking networking via inspection data

Once the node inspections complete, we can check the list of network adapters in a chassis to assist with the network configuration in the deployment configuration files.

```sh
# load credentials
source ~/stackrc

# find the UUID of a sample node
openstack baremetal node list -f json | jq .

# check the collected metadata; these commands will show all interfaces and whether they have a carrier signal
#openstack baremetal node show f409dad9-1c1e-4ca0-b8af-7eab1b7f878d -f json | jq -r .
#openstack baremetal introspection data save f409dad9-1c1e-4ca0-b8af-7eab1b7f878d | jq .inventory.interfaces
#openstack baremetal introspection data save f409dad9-1c1e-4ca0-b8af-7eab1b7f878d | jq .all_interfaces
#openstack baremetal introspection data save f409dad9-1c1e-4ca0-b8af-7eab1b7f878d | jq '.all_interfaces | keys[]'

# original server hardware SR630 (faedafa5-5fa4-432e-b3aa-85f7f30f10fb | oscomp23)
(undercloud) [stack@undercloud ~]$ openstack baremetal introspection data save faedafa5-5fa4-432e-b3aa-85f7f30f10fb | jq '.all_interfaces | keys[]'
"eno1"
"eno2"
"eno3"
"eno4"
"enp0s20f0u1u6"
"ens2f0"
"ens2f1"

# new server hardware SR630v2 (b239f8b7-3b97-47f8-a057-4542ca6c7ab7 | oscomp28)
(undercloud) [stack@undercloud ~]$ openstack baremetal introspection data save b239f8b7-3b97-47f8-a057-4542ca6c7ab7 | jq '.all_interfaces | keys[]'
"enp0s20f0u1u6"
"ens2f0"
"ens2f1"
"ens4f0"
"ens4f1"
"ens4f2"
"ens4f3"
```

Interfaces are shown in the order that they are seen on the PCI bus; modern Linux OSes have an interface naming scheme applied by udev.

This naming scheme is often described as:

- Predictable Network Interface Names
- Consistent Network Device Naming
- Persistent names (https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/)

```sh
# example interface naming scheme, e.g. enp0s10:
en  --> ethernet
p0  --> bus number (0)
s10 --> slot number (10)
f0  --> function (multiport card)
v   --> virtual (qemu)
```

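To see how udev derived a given name on a node, the net-name properties can be queried directly; a quick sketch:

```sh
# ID_NET_NAME_PATH / ID_NET_NAME_SLOT etc. show the candidate names udev considered.
udevadm info -q property /sys/class/net/ens2f0 | grep '^ID_NET_NAME'
```
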
Openstack adopts an interface mapping scheme to help identify the network interfaces, using the notation 'nic1, nic2, nicN'.
Only interfaces with a carrier signal (connected to a switch) will participate in the interface mapping scheme.
For the University nodes the following Openstack mapping scheme is created.

Server classA:

| mapping | interface | purpose |
| --- | --- | --- |
| nic1 | eno1 | Control Plane |
| nic2 | enp0s20f0u1u6 | USB ethernet, likely from the XClarity controller |
| nic3 | ens2f0 | LACP bond, guest/storage |
| nic4 | ens2f1 | LACP bond, guest/storage |

Server classB:

| mapping | interface | purpose |
| --- | --- | --- |
| nic1 | enp0s20f0u1u6 | USB ethernet, likely from the XClarity controller |
| nic2 | ens2f0 | Control Plane |
| nic3 | ens2f1 | LACP bond, guest/storage |
| nic4 | ens4f0 | LACP bond, guest/storage |

The 'Server classA' nodes will be used for the 'controller', 'networker' and 'compute' roles; the 'Server classB' hardware will be used for the 'compute' role.
The mapping 'nic1' is not consistent for the 'Control Plane' network across both classes of server hardware, necessitating multiple roles (thus multiple network interface templates) for the compute nodes.

You may notice some LLDP information (the Cumulus switch must be running the LLDP service); this is very helpful to determine the switch port that a network interface is connected to and to verify your point-to-point list.
Owing to the name of the switch we can quickly see this is the 100G cumulus switch.

```
"ens2f0": {
  "ip": "fe80::d57c:2432:d78d:e15d",
  "mac": "10:70:fd:24:62:e0",
  "client_id": null,
  "pxe": false,
  "lldp_processed": {
    "switch_chassis_id": "b8:ce:f6:18:c3:4a",
    "switch_port_id": "swp9s0",
    "switch_system_name": "sw100g0",
    "switch_system_description": "Cumulus Linux version 4.2.0 running on Mellanox Technologies Ltd. MSN3700C",
    "switch_capabilities_support": [
      "Bridge",
      "Router"
    ],
    "switch_capabilities_enabled": [
      "Bridge",
      "Router"
    ],
    "switch_mgmt_addresses": [
      "172.31.31.11",
      "fe80::bace:f6ff:fe18:c34a"
    ],
    "switch_port_description": "swp9s0",
    "switch_port_link_aggregation_enabled": false,
    "switch_port_link_aggregation_support": true,
    "switch_port_link_aggregation_id": 0,
    "switch_port_autonegotiation_enabled": true,
    "switch_port_autonegotiation_support": true,
    "switch_port_physical_capabilities": [
      "1000BASE-T fdx",
      "PAUSE fdx"
    ],
    "switch_port_mau_type": "Unknown"
  }
},
```

# Example of a new project

The following example exclusively uses the CLI for administration; this helps clarify the components in play and their interdependencies. All steps can also be performed in the web console.

## Load environment variables to use the Overcloud CLI

```sh
[stack@undercloud ~]$ source ~/stackrc
(undercloud) [stack@undercloud ~]$ source ~/overcloudrc
(overcloud) [stack@undercloud ~]$
```

## Create project

```sh
# create the project
openstack project create --domain 'ldap' --description "Bioinformatics Project" bioinformatics
```

## Create an internal Openstack network/subnet for the project

```sh
openstack network create bioinformatics-network --internal --no-share --project bioinformatics
openstack subnet create bioinformatics-subnet --project bioinformatics --network bioinformatics-network --gateway 172.16.1.1 --subnet-range 172.16.1.0/16 --dhcp
```

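A quick check that the network and subnet landed in the right project (standard openstack CLI):

```sh
openstack network list --project bioinformatics
openstack subnet show bioinformatics-subnet -c cidr -c gateway_ip -c enable_dhcp
```
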
## Create a router for the project

```sh
openstack router create bioinformatics-router --project bioinformatics
openstack router set bioinformatics-router --external-gateway provider
```

## Add a router interface on the project subnet

```sh
openstack router add subnet bioinformatics-router bioinformatics-subnet
```

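The router's ports can be listed to confirm the new interface picked up the subnet gateway address:

```sh
openstack port list --router bioinformatics-router
```
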
## Create a security group named 'linux-default' to allow inbound ssh for VM instances

- a new security group injects rules on creation to allow outbound traffic by default; where multiple security groups are attached these default rules may be removed

```sh
openstack security group create --project bioinformatics linux-default
openstack security group rule create \
    --ingress \
    --protocol tcp \
    --ethertype IPv4 \
    --remote-ip '0.0.0.0/0' \
    --dst-port 22 \
    $(openstack security group list --project bioinformatics -f json | jq -r '.[] | select(.Name == "linux-default").ID')

# list security group rules
openstack security group rule list $(openstack security group list --project bioinformatics -f json | jq -r '.[] | select(."Name" == "default") | .ID')
openstack security group rule list $(openstack security group list --project bioinformatics -f json | jq -r '.[] | select(."Name" == "linux-default") | .ID') --long
+--------------------------------------+-------------+-----------+-----------+------------+-----------+-----------------------+
| ID                                   | IP Protocol | Ethertype | IP Range  | Port Range | Direction | Remote Security Group |
+--------------------------------------+-------------+-----------+-----------+------------+-----------+-----------------------+
| 99210e25-4b7f-4125-93bb-7abea3eddf07 | None        | IPv4      | 0.0.0.0/0 |            | egress    | None                  |
| adc21371-52bc-4c63-8e23-8e55a119407c | None        | IPv6      | ::/0      |            | egress    | None                  |
| d327baac-bdaa-437c-b506-b90659e92833 | tcp         | IPv4      | 0.0.0.0/0 | 22:22      | ingress   | None                  |
+--------------------------------------+-------------+-----------+-----------+------------+-----------+-----------------------+
```

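If instances should also answer ping, an ICMP ingress rule can be added to the same group in the same way (an optional addition, not in the original steps):

```sh
# allow inbound ICMP (ping) from anywhere to instances using 'linux-default'
openstack security group rule create \
    --ingress \
    --protocol icmp \
    --ethertype IPv4 \
    --remote-ip '0.0.0.0/0' \
    $(openstack security group list --project bioinformatics -f json | jq -r '.[] | select(.Name == "linux-default").ID')
```
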
## Set quotas for the scope of the entire project

```sh
openstack quota set --instances 50 bioinformatics ;\
openstack quota set --cores 300 bioinformatics ;\
openstack quota set --ram 204800 bioinformatics ;\
openstack quota set --gigabytes 5000 bioinformatics ;\
openstack quota set --volumes 500 bioinformatics ;\
openstack quota set --key-pairs 50 bioinformatics ;\
openstack quota set --floating-ips 50 bioinformatics ;\
openstack quota set --networks 10 bioinformatics ;\
openstack quota set --routers 5 bioinformatics ;\
openstack quota set --subnets 10 bioinformatics ;\
openstack quota set --secgroups 100 bioinformatics ;\
openstack quota set --secgroup-rules 1000 bioinformatics
```

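To confirm the quotas took effect (a quick check, not from the original):

```sh
# display the effective quota set for the project
openstack quota show bioinformatics
```
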
## Create flavours for the project

- flavours are pre-defined instance specifications (vCPUs, RAM, disk)

```sh
openstack flavor create small --ram 2048 --disk 10 --vcpus 2 --private --project bioinformatics ;\
openstack flavor create medium --ram 3072 --disk 10 --vcpus 4 --private --project bioinformatics ;\
openstack flavor create large --ram 8192 --disk 10 --vcpus 8 --private --project bioinformatics ;\
openstack flavor create xlarge --ram 16384 --disk 10 --vcpus 16 --private --project bioinformatics ;\
openstack flavor create xxlarge --ram 65536 --disk 10 --vcpus 48 --private --project bioinformatics
```

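A sketch (not in the original) to confirm the flavours exist; listing private flavours with `--all` requires admin rights:

```sh
# list all flavours (public and private) with their key specs
openstack flavor list --all -c Name -c RAM -c Disk -c VCPUs
```
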
## End-user access using Active Directory groups

- In the University Prod environment you would typically create an AD group with nested AD users
- To illustrate the method, we choose the pre-existing group 'ISCA-Admins'

```sh
openstack user list --group 'ISCA-Admins' --domain ldap
+------------------------------------------------------------------+--------+
| ID                                                               | Name   |
+------------------------------------------------------------------+--------+
| c633f80625e587bc3bbe492af57cb99cec59201b16cc06f614e36a6b767d6b29 | mtw212 |
| 0c4e3bdacda6c9b8abcd61de94deb47ff236cec3581fbbacf2d9daa1c584a44d | mmb204 |
| 2d4338bc2ba649ff15111519e535d0fc6c65cbb7e5275772b4e0c675af09002b | rr274  |
| b9461f113d208b54a37862ca363ddf37da68cf00ec06d67ecc62bb1e5caf06d4 | dma204 |
| 0fb8469b2d7e297151102b0119a4b08f6b26113ad8401b6cb79936adf946ba19 | ac278  |
+------------------------------------------------------------------+--------+

# bind member role to users in the access group for the project
openstack role add --group-domain 'ldap' --group 'ISCA-Admins' --project-domain 'ldap' --project bioinformatics member

# bind admin role to a specific user for the project
openstack role add --user-domain 'ldap' --user mtw212 --project-domain 'ldap' --project bioinformatics admin
openstack role assignment list --user $(openstack user show --domain 'ldap' mtw212 -f json | jq -r .id) --names
+-------+-------------+-------+---------------------+--------+--------+-----------+
| Role  | User        | Group | Project             | Domain | System | Inherited |
+-------+-------------+-------+---------------------+--------+--------+-----------+
| admin | mtw212@ldap |       | bioinformatics@ldap |        |        | False     |
+-------+-------------+-------+---------------------+--------+--------+-----------+

# bind member role for local user 'tseed' for the project
openstack role add --user-domain 'Default' --user tseed --project-domain 'ldap' --project bioinformatics member

# bind admin role for the (default) local user 'admin' for the project - we want the admin user to have full access to the project
openstack role add --user-domain 'Default' --user admin --project-domain 'ldap' --project bioinformatics admin
```

## Import a disk image to be used specifically for the project

- This can be a custom image pre-baked with specific software, or any vendor OS install image
- Images should support cloud-init to enable initial user login; generic distro images with cloud-init enabled should work

```sh
wget https://repo.almalinux.org/almalinux/8/cloud/x86_64/images/AlmaLinux-8-GenericCloud-8.6-20220513.x86_64.qcow2
openstack image create --disk-format qcow2 --container-format bare --private --project bioinformatics --property os_type=linux --file ./AlmaLinux-8-GenericCloud-8.6-20220513.x86_64.qcow2 alma_8.6
```

## SSH keypairs

Generate an ssh key pair; this will be used for initial login to a VM instance.

- the keypair in this example is owned by the admin user; other users will not see the ssh keypair in the web console and will need a copy of the ssh private key (unless a password is set in cloud-init userdata)
- each user will have their own keypair that will be selected when provisioning a VM instance in the web console
- once instantiated, additional users can import ssh keys to the authorized_keys file as per a typical linux host
- when generating ssh public keys Openstack requires a comment at the end of the key; when importing a keypair (even via the web console) the public key needs a comment (see the sketch after the next code block)

Generic distro (cloud-init) images generally have their own default user, typically image specific such as 'almalinux' or 'ubuntu'. You log in as this user with the ssh private key counterpart to the public key specified via the '--key-name' parameter.
Some cloud-init images use the user in the comment of the ssh key as the default user (or as an additional user).
Convention is to provision instances with cloud-init userdata, with the expectation that you provide your own user and credentials.

```sh
ssh-keygen -t rsa -b 4096 -C "bioinformatics@university.ac.uk" -f ~/bioinformatics_cloud
openstack keypair create --public-key ~/bioinformatics_cloud.pub bioinformatics
```

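If you import a pre-existing key rather than generate one, the comment requirement above matters: `ssh-keygen -y` emits the public key without a comment, so append one first. A minimal sketch, assuming a hypothetical existing private key at `~/.ssh/id_rsa`:

```sh
# derive the public key from an existing private key (no comment is emitted)
ssh-keygen -y -f ~/.ssh/id_rsa > /tmp/bioinformatics.pub
# append a trailing comment so Openstack accepts the import
sed -i 's/$/ bioinformatics@university.ac.uk/' /tmp/bioinformatics.pub
openstack keypair create --public-key /tmp/bioinformatics.pub bioinformatics-imported
```
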
## Cloud-init userdata

This OPTIONAL step is very useful. Cloud providers typically utilise userdata to set up initial login, however userdata is much more powerful: it is often used to register the instance with a configuration management tool to install a suite of software (chef/puppet/ansible (in pull mode)), or even to embed a shell script for direct software provision (pull+start containers). Beware userdata is limited to 64KB.

NOTE: OCF have built cloud-init userdata for Linux (and Windows in Azure) to configure SSSD to join cloud instances to Microsoft Active Directory to enable multi-user access; this is highly environment/customer specific.

- Openstack is kind, you don't have to base64 encode the userdata like some public cloud providers require, it is automatic
- generally each cloud-init image will have its own default user, typically image specific such as 'almalinux' or 'ubuntu'
- the following config will replace this default user with your own bioinformatics user, password and ssh key (it also adds the universityops user to ensure an admin can get into the system)
- NOTE the ssh key entry below has had the trailing comment removed
- passwords can be in cleartext but instance users will be able to see the password in the userdata; create a hash with the command `openssl passwd -6 -salt xyz Password0`
- userdata can be added to the instance when provisioning in the web console @ Customisation Script; it is always a good idea to provide a userdata template to end users where they self provision

```sh
nano -cw userdata.txt # yaml format

#cloud-config
ssh_pwauth: true
groups:
  - admingroup: [root,sys]
  - bioinformatics
  - universityops
users:
  - name: bioinformatics
    primary_group: bioinformatics
    lock_passwd: false
    passwd: $6$xyz$4tTWyuHIT6gXRuzotBZn/9xZBikUp0O2X6rOZ7MDJo26aax.Ok5P4rWYyzdgFkjArIIyB8z8LKVW1wARbcBzn/
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
    ssh_authorized_keys:
      - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQD4Yh0OuBTvyXObUcJKLDNjIhmSkf/RiSPhPYzNECwC7hlIms/fYcbODPQmboo8pgtnlDK0aElWr16n1z+Yb/3btzmO/G8pZEUR607VmWINuYzSJyAieL6zNPn0XC2eP9mqWJJP44SjroVKLjnhajy761FaGxXJyXr3RXmIb4xc+qW8ETJQh98ucZZZQ3X8MernjIOO+VGVObDDDTZXsaL1wih0+v/R9gMJP8AgSCpi539o0A6RgFzMqFfroUKe6uYa1ohBrjii+teKETEb7isNOZFPx459zhqRPVjFlzVXNpDBPVjz32uuUyBRW4jMlwQ/GIrhT7+fNjpxG0CrVe0c3F+BoBnqfdrsLFCJ3dg+z19lBLnC2ulp511kqEVctjG96l9DeEPtab28p22aV3fuzdnx24y3BJi8Wea79U8+RTy0fYCM0Sm8rwREUHD2bAgjtIUU8gTKnQLyeUAc5+qJCFqa3H9/DJZ44MQzk/rC0shBUU7z+IwWhftU1P9GWURko11Bmg6pq+/fdGVm/eqilDabirbZxjqnxXCBGcOM6QsPoooJ9cgCU34k9KhUxPJ34frYfwHaWkDYxe+7VBrrzPWpOnOGt04eegwdNBDMnl703wfXqobnyy8nMmzH04j2PThJ7ZrRnA6bo/dYtVZXHocfq76yPxSsmYClebJBSQ==
  - name: universityops
    primary_group: bioinformatics
    lock_passwd: false
    passwd: $6$xyz$4tTWyuHIT6gXRuzotBZn/9xZBikUp0O2X6rOZ7MDJo26aax.Ok5P4rWYyzdgFkjArIIyB8z8LKVW1wARbcBzn/
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
```

## Create a floating ip

With the network design up to this point you can have a routable IP capable of accepting ingress traffic from the wider University estate by two methods:

1. floating IP, a '1:1 NAT' of a provider network IP mapped to the VM interface IP in the private Openstack 'bioinformatics' network
2. interface IP directly in the provider network

Floating IPs are more versatile as they can be moved between instances for all manner of blue-green scenarios; typically the VM instance then does not have to be multihomed between networks either.
Floating IPs are also possible in Openstack private networks and can be just as useful in a multi-tiered application stack - think DR strategy, scripting the Openstack API to move the floating IP between instances (a sketch follows the next code block).
However end users may want a VM instance with only a provider network IP; such an instance would only be able to communicate with other Openstack VM instances that also have a provider IP.

```sh
# create a floating IP in the 'provider' network on the 'provider-subnet' subnet range
openstack floating ip create --project bioinformatics --description 'bioinformatics01' --subnet provider-subnet provider
openstack floating ip list --project bioinformatics --long -c 'ID' -c 'Floating IP Address' -c 'Description'
+--------------------------------------+---------------------+------------------+
| ID                                   | Floating IP Address | Description      |
+--------------------------------------+---------------------+------------------+
| 0eb3f78d-d59d-4ec6-b725-d2c1f45c9a77 | 10.121.4.246        | bioinformatics01 |
+--------------------------------------+---------------------+------------------+
```

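As mentioned above, a floating IP can be detached and re-attached to move traffic between instances. A minimal blue-green sketch (the instance names here are illustrative, and assume both instances already exist):

```sh
# detach the floating IP from the 'blue' instance and attach it to 'green';
# clients targeting 10.121.4.246 move to the new instance without DNS changes
openstack server remove floating ip bioinformatics01 10.121.4.246
openstack server add floating ip bioinformatics02 10.121.4.246
```
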
Check allocated 'ports'; think of these as IP endpoints for objects known by Openstack.

- VM Instance = compute:nova
- Floating IP = network:floatingip
- DHCP service = network:dhcp (most networks will have one)
- Primary router interface = network:router_gateway (usually in the provider network, for egress/SNAT access to external networks)
- Secondary router interface = network:router_interface (router interface on a private Openstack network)

```sh
openstack port list --long -c 'ID' -c 'Fixed IP Addresses' -c 'Device Owner'
+--------------------------------------+-----------------------------------------------------------------------------+--------------------------+
| ID                                   | Fixed IP Addresses                                                          | Device Owner             |
+--------------------------------------+-----------------------------------------------------------------------------+--------------------------+
| 108171d9-cd76-49ab-944e-751f8257c8d1 | ip_address='10.121.4.150', subnet_id='92361cfd-f348-48a2-b264-7845a3a3d592' | compute:nova             |
| 3d86a21a-f187-47e0-8204-464adf334fb0 | ip_address='172.16.0.2', subnet_id='a92d2ac0-8b60-4329-986d-ade078e75f45'   | network:dhcp             |
| 3db3fe34-85a8-4028-b670-7f9aa5c86c1a | ip_address='10.121.4.148', subnet_id='92361cfd-f348-48a2-b264-7845a3a3d592' | network:floatingip       |
| 400cb067-2302-4f8e-bc1a-e187929afbbc | ip_address='10.121.4.205', subnet_id='92361cfd-f348-48a2-b264-7845a3a3d592' | network:router_gateway   |
| 5c93d336-05b5-49f0-8ad4-9de9c2ccf216 | ip_address='172.16.2.239', subnet_id='ab658788-0c5f-4d22-8786-aa7256db66b6' | compute:nova             |
| 62afa3de-5316-4eb6-88ca-4830c141c898 | ip_address='172.16.1.1', subnet_id='ab658788-0c5f-4d22-8786-aa7256db66b6'   | network:router_interface |
| 7c8b58c0-3ff7-44f6-9eb3-a601a139aab9 | ip_address='172.16.0.1', subnet_id='a92d2ac0-8b60-4329-986d-ade078e75f45'   | network:router_interface |
| 9f41db95-8333-4f6d-88e0-c0e3f7d4b7f0 | ip_address='172.16.1.2', subnet_id='ab658788-0c5f-4d22-8786-aa7256db66b6'   | network:dhcp             |
| c9591f1b-8d43-4322-acd6-75cd4cce04e3 | ip_address='10.121.4.239', subnet_id='92361cfd-f348-48a2-b264-7845a3a3d592' | network:router_gateway   |
| e3f35c0a-6543-4508-8d17-96de69f85a1c | ip_address='10.121.4.130', subnet_id='92361cfd-f348-48a2-b264-7845a3a3d592' | network:dhcp             |
+--------------------------------------+-----------------------------------------------------------------------------+--------------------------+
```

## Create disk volumes

Create volumes that will be attached on VM instantiation (bioinformatics02).

```sh
# find the image to use on the boot disk
openstack image list -c 'ID' -c 'Name' -c 'Project' --long -f json | jq -r '.[] | select(.Name == "alma_8.6").ID'
0a0d99c1-4bce-4e74-9df8-f9cf5666aa98

# create a bootable disk
openstack volume create --bootable --size 50 --image $(openstack image list -c 'ID' -c 'Name' -c 'Project' --long -f json | jq -r '.[] | select(.Name == "alma_8.6").ID') --description "bioinformatics02 boot" --os-project-domain-name='ldap' --os-project-name 'bioinformatics' bioinformatics02boot

# create a data disk
openstack volume create --non-bootable --size 100 --description "bioinformatics02 data" --os-project-domain-name='ldap' --os-project-name 'bioinformatics' bioinformatics02data
```

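Volumes do not have to be attached at instantiation; a hedged example (not in the original) of attaching an existing volume to an already running instance:

```sh
# attach an existing volume to a running instance as the next free device
openstack server add volume bioinformatics01 bioinformatics02data
```
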
## Create VM instances

Creating instances via the CLI can save a lot of time versus the web console if the environment is not to be initially self provisioned by the end user, allowing you to template a batch of machines quickly.

VM instances are not technically 'owned' by a user; they reside in a domain/project, are provisioned by a user (initially with a user-specific SSH key) and can be administered by users in the same project via the CLI/web-console. SSH access to the VM will be user specific unless the provisioning user adds access for other users (via password or SSH private key distribution at the operating system level). Userdata is the key to true multitenancy.

### Instance from flavour with larger disk and floating IP

The following command illustrates:

- create a VM instance in the Openstack 'bioinformatics' network with an additional floating IP
- override the instance flavour's 10GB disk with a 100GB disk; the disk is not removed when the instance is deleted
- add multiple security groups; these apply to all interfaces by default, so allowing specific ingress for only the floating IP would be achieved with a rule matching the floating IP as destination

```sh
# create VM instance
openstack server create \
    --image alma_8.6 \
    --flavor large \
    --boot-from-volume 100 \
    --network bioinformatics-network \
    --security-group $(openstack security group list --project bioinformatics -f json | jq -r '.[] | select(.Name == "default").ID') \
    --security-group $(openstack security group list --project bioinformatics -f json | jq -r '.[] | select(.Name == "linux-default").ID') \
    --key-name bioinformatics \
    --user-data userdata.txt \
    --os-project-domain-name='ldap' \
    --os-project-name 'bioinformatics' \
    bioinformatics01
```

Attach the floating IP:

- this command relies on the unique UUIDs of both the server and floating IP objects, as the command doesn't support the --project parameter
- we named both our floating IP and VM instance 'bioinformatics01'; really this is where tags start to become useful

```sh
# attach floating IP
openstack server add floating ip $(openstack server list --project bioinformatics -f json | jq -r '.[] | select(.Name == "bioinformatics01").ID') $(openstack floating ip list --project bioinformatics --long -c 'ID' -c 'Floating IP Address' -c 'Description' -f json | jq -r '.[] | select(.Description == "bioinformatics01") | ."Floating IP Address"')

# check the IP addresses allocated to the VM instance, we see the floating IP 10.121.4.246 directly on the routable provider network
openstack server list --project bioinformatics
+--------------------------------------+------------------+--------+--------------------------------------------------+-------+--------+
| ID                                   | Name             | Status | Networks                                         | Image | Flavor |
+--------------------------------------+------------------+--------+--------------------------------------------------+-------+--------+
| ca402aed-84dd-47ad-b5ba-5fc74978f66b | bioinformatics01 | ACTIVE | bioinformatics-network=172.16.3.74, 10.121.4.246 |       | large  |
+--------------------------------------+------------------+--------+--------------------------------------------------+-------+--------+
```

### 'multi-homed' Instance from flavour with manually specified disk

Create the VM instance with the disk volumes attached and network interfaces in both the project's Openstack private network and the provider network.

```sh
# create a VM instance
## -v is a debug parameter, -vv for more
openstack server create \
    --volume $(openstack volume list --name bioinformatics02boot --project bioinformatics -f json | jq -r .[].ID) \
    --block-device-mapping vdb=$(openstack volume list --name bioinformatics02data --project bioinformatics -f json | jq -r .[].ID):volume::true \
    --flavor large \
    --nic net-id=provider \
    --nic net-id=bioinformatics-network \
    --security-group $(openstack security group list --project bioinformatics -f json | jq -r '.[] | select(.Name == "default").ID') \
    --security-group $(openstack security group list --project bioinformatics -f json | jq -r '.[] | select(.Name == "linux-default").ID') \
    --key-name bioinformatics \
    --user-data userdata.txt \
    --os-project-domain-name='ldap' \
    --os-project-name 'bioinformatics' \
    bioinformatics02 -v

# remove the server
## note that the data volume has been deleted, as it was attached with the 'delete-on-terminate' flag set true in the '--block-device-mapping' parameter
## the boot volume has not been removed, we see that 'delete-on-terminate' is set false in 'openstack server show'
## the web console will allow the boot volume to be delete-on-terminate; the CLI lacks this capability yet the REST API clearly supports the functionality
openstack server delete $(openstack server show bioinformatics02 --os-project-domain-name='ldap' --os-project-name 'bioinformatics' -f json | jq -r .id)
openstack volume list --project bioinformatics
+--------------------------------------+----------------------+-----------+------+--------------------------------------------------------------+
| ID                                   | Name                 | Status    | Size | Attached to                                                  |
+--------------------------------------+----------------------+-----------+------+--------------------------------------------------------------+
| db137b16-67ed-4ade-8d89-fd57d463f573 |                      | in-use    |  100 | Attached to ca402aed-84dd-47ad-b5ba-5fc74978f66b on /dev/vda |
| 1ff863bb-6cb3-4d40-8d25-06b61e974e38 | bioinformatics02boot | available |   50 |                                                              |
+--------------------------------------+----------------------+-----------+------+--------------------------------------------------------------+
```

## Test access to VM instances

```sh
# check the IP addresses allocated to the VM instances
openstack server list --project bioinformatics -c 'Name' -c 'Networks' --long --fit-width
+------------------+-----------------------------------------------------------+
| Name             | Networks                                                  |
+------------------+-----------------------------------------------------------+
| bioinformatics02 | bioinformatics-network=172.16.3.254; provider=10.121.4.92 |
| bioinformatics01 | bioinformatics-network=172.16.3.74, 10.121.4.246          |
+------------------+-----------------------------------------------------------+

# gain access to the instances via the native provider network ip and the floating ip respectively
ssh -i ~/bioinformatics_cloud bioinformatics@10.121.4.92
ssh -i ~/bioinformatics_cloud bioinformatics@10.121.4.246
```

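If ssh fails, the instance console log usually shows how far cloud-init progressed; a quick check (an addition, not in the original steps):

```sh
# dump the serial console log of the instance; look for cloud-init user/ssh-key setup
openstack console log show bioinformatics01
```
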
## Testing node evacuation

```sh
# create guest VM
cd;source ~/overcloudrc
openstack server create --image cirros-0.5.1 --flavor m1.small --network internal test-failover
openstack server list -c Name -c Status

+---------------+--------+
| Name          | Status |
+---------------+--------+
| test-failover | ACTIVE |
+---------------+--------+

# find the compute node that the guest VM is running upon
openstack server show test-failover -f json | jq -r '."OS-EXT-SRV-ATTR:host"'
overcloud-novacomputeiha-3.localdomain

# login to the compute node hosting the guest VM, crash the host
cd;source ~/stackrc
ssh heat-admin@overcloud-novacomputeiha-3.ctlplane.localdomain
sudo su -
echo c > /proc/sysrq-trigger
# this terminal will fail after a few minutes, the dashboard console view of the guest VM will hang
# a node hard poweroff will achieve the same effect

# check nova services
cd;source ~/overcloudrc
nova service-list

| 0ad301e3-3420-4d5d-a2fb-2f00ba80a00f | nova-compute | overcloud-novacomputeiha-3.localdomain | nova | disabled | down | 2022-05-19T11:49:40.000000 | - | True |

# check the guest VM is still running, after a few minutes it should be running on another compute node
openstack server list -c Name -c Status
openstack server show test-failover -f json | jq -r .status
# the VM instance has not yet been registered as sitting on a down compute node
ACTIVE
# Openstack has detected the down compute node and is moving the instance; 'rebuilding' refers to the QEMU domain, there is no VM rebuild and active OS state is preserved
REBUILDING
# if you see an error state either the IPMI interfaces cannot be contacted by the controllers or there is a storage migration issue, check with 'openstack server show test-failover'
ERROR
# you probably won't see this unless you recover from an ERROR state with 'openstack server stop test-failover'
SHUTOFF

# check the VM instance is on a new node
openstack server show test-failover -f json | jq -r '."OS-EXT-SRV-ATTR:host"'
overcloud-novacomputeiha-1.localdomain

# if the compute node comes back up you should see it automatically rejoin the cluster
# if it does not rejoin the cluster try a reboot and wait a good 10 minutes
# if a node still does not come back you will have to remove it and redeploy from the undercloud - hassle
nova service-list
| 1be7bc8f-2769-4986-ac5e-686859779bca | nova-compute | overcloud-novacomputeiha-0.localdomain | nova | enabled | up | 2022-05-19T12:03:27.000000 | - | False |
| 0ad301e3-3420-4d5d-a2fb-2f00ba80a00f | nova-compute | overcloud-novacomputeiha-3.localdomain | nova | enabled | up | 2022-05-19T12:03:28.000000 | - | False |
| c8d3cfd8-d639-49a2-9520-5178bc5a426b | nova-compute | overcloud-novacomputeiha-2.localdomain | nova | enabled | up | 2022-05-19T12:03:26.000000 | - | False |
| 3c918b5b-36a6-4e63-b4de-1b584171a0c0 | nova-compute | overcloud-novacomputeiha-1.localdomain | nova | enabled | up | 2022-05-19T12:03:27.000000 | - | False |
```

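To follow the failover as it happens, a small polling loop can help (an addition, assuming the instance name used above):

```sh
# poll status and host placement every 10 seconds during the evacuation
while true; do
    openstack server show test-failover -f json | jq -r '[.status, ."OS-EXT-SRV-ATTR:host"] | @tsv'
    sleep 10
done
```
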
Other commands to assist in debugging failover behaviour.

> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/command_line_interface_reference/server#server_migrate # great CLI reference
> https://docs.openstack.org/nova/rocky/admin/evacuate.html # older reference, prefer the openstack CLI commands that act as a wrapper to the nova CLI

```sh
# test that the controller nodes can run ipmitool against the compute nodes
ipmitool -I lanplus -H 10.0.9.45 -p 2000 -U USERID -P PASSW0RD chassis status

# list physical nodes
openstack host list
nova hypervisor-list

# list VMs, get the compute node for an instance
openstack server list
openstack server list -c Name -c Status
nova list
openstack server show <server> -f json | jq -r '."OS-EXT-SRV-ATTR:host"'

# if you get a VM instance stuck in a power on/off state and you can't evacuate it from a failed node, issue 'openstack server stop <server>'
nova reset-state --active <server> # set to active state even if it was in error state
nova reset-state --all-tenants # seems to set the node back to error state if it was in active state but failed and powered off
nova stop [--all-tenants] <server>
openstack server stop <server> # new command line reference method, puts the node in poweroff state, use for ERROR in migration

# evacuate a single VM server instance to a different compute node
# not preferred, older command syntax for direct nova service control
nova evacuate <server> overcloud-novacomputeiha-3.localdomain # moves VM - pauses but doesn't shut down
nova evacuate --on-shared-storage test-1 overcloud-novacomputeiha-0.localdomain # live migration
# preferred openstack CLI native commands
openstack server migrate --live-migration <server> # moves VM - pauses but doesn't shut down, state is preserved (presumably this only works owing to ceph/shared storage)
openstack server migrate --shared-migration <server> # requires manual confirmation in web console, stops/starts VM, state not preserved
```

# check certificate for the Openstack Horizon dashboard

```sh
openssl s_client -showcerts -connect stack.university.ac.uk:443

Certificate chain
 0 s:C = GB, ST = England, L = University, CN = stack.university.ac.uk
   i:C = GB, ST = England, L = University, O = UOE, OU = Cloud, CN = University Openstack CA
```

We see the certificate is signed by the CA "University Openstack CA" created in the build guide; this is not quite a self-signed certificate but it offers broadly the same level of assurance unless the CA certificate is installed on the client machines.

# Check the certificate bundle received from an external signing authority

## Unpack and inspect

```sh
sudo dnf install unzip -y
unzip stack.university.ac.uk.zip
tree .
├── stack.university.ac.uk.cer         full certificate chain, order: service certificate, intermediate CA, intermediate CA, top level CA
├── stack.university.ac.uk.cert.cer    service certificate for stack.university.ac.uk
├── stack.university.ac.uk.csr         certificate signing request (sent to public CA)
├── stack.university.ac.uk.interm.cer  chain of intermediate and top level CA certificates, order: intermediate CA (Extended CA), intermediate CA, top level CA
└── stack.university.ac.uk.key         certificate private key
```

## Check each certificate to determine what has been included in the bundle

Some signing authorities will not include all CA certificates in the bundle; it is up to you to inspect the service certificate and trace back through the certificate chain to obtain the various CA certificates.

### certificate information

Inspect the service certificate.

```sh
#openssl x509 -in stack.university.ac.uk.cert.cer -text -noout
cfssl-certinfo -cert stack.university.ac.uk.cert.cer
```

Service certificate attributes.

```
"common_name": "stack.university.ac.uk"

"sans": [
    "stack.university.ac.uk",
    "www.stack.university.ac.uk"
],
"not_before": "2022-03-16T00:00:00Z",
"not_after": "2023-03-16T23:59:59Z",
```

### full certificate chain content

Copy out each certificate from the full chain file `stack.university.ac.uk.cer` to its own temp file, then run the openssl text query command `openssl x509 -in <cert.N> -text -noout` to inspect each certificate (one scripted approach is sketched below).

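One way to split the bundle without manual copy-paste - a sketch using csplit, an assumption rather than the vendor's instructions:

```sh
# split the full chain into cert-00, cert-01, ... one certificate per file
csplit -z -f cert- stack.university.ac.uk.cer '/-----BEGIN CERTIFICATE-----/' '{*}'
# then inspect each piece
for c in cert-*; do openssl x509 -in "$c" -noout -subject -issuer; done
```
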
The full chain certificate file is listed in the following order. Starting from the service certificate `stack.university.ac.uk`, each certificate is signed by the preceding CA.

| Certificate common name | purpose | capability |
| --- | --- | --- |
| CN = AAA Certificate Services | top level CA | CA capability |
| CN = USERTrust RSA Certification Authority | intermediate CA | CA capability |
| CN = GEANT OV RSA CA 4 | intermediate CA | CA capability<br>extended validation capability |
| CN = stack.university.ac.uk | the service certificate | stack.university.ac.uk certificate |

## Check that the certificate chain is present by default in the trust store on the clients

Open certmgr in Windows and check in "Trusted Root Authorities/Certificates" for each CA/Intermediate-CA certificate; all certificates will likely be present.

- look for the common name (CN)
- check the "X509v3 Subject Key Identifier" matches the "subject key identifier" from the `openssl x509 -in stack.university.ac.uk.cert.cer -text -noout` output

Windows includes certificates for "AAA Certificate Services" and "USERTrust RSA Certification Authority"; the extended validation intermediate CA "GEANT OV RSA CA 4" may be missing. This is not an issue as the client has the top level CAs so can validate and follow the signing chain.

For modern Linux distros we find only one intermediate CA; this should be sufficient as any handshake using certificates signed from it will be able to validate. If the undercloud can find a CA in its trust store, the deployed cluster nodes will most likely have it.

```sh
trust list | grep -i label | grep -i "USERTrust RSA Certification Authority"

# generally all certificates imported into the trust store get rendered into this global file
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt

# search the trust store for "USERTrust RSA Certification Authority", copy the content of the certificate field into a temp file for the following stanza
nano -cw /usr/share/pki/ca-trust-source/ca-bundle.trust.p11-kit

[p11-kit-object-v1]
label: "USERTrust RSA Certification Authority"
trusted: true
nss-mozilla-ca-policy: true
modifiable: false

# check the "X509v3 Subject Key Identifier" matches the CA in the certificate chain you received from the signing authority.
openssl x509 -in <temp file> -text -noout | grep "53:79:BF:5A:AA:2B:4A:CF:54:80:E1:D8:9B:C0:9D:F2:B2:03:66:CB"
```

Browsers such as Edge and Chrome will use the OS trust store; Firefox distributes its own trust store.

- hamburger (3-bar) menu -> settings -> security -> view certificates -> authorities -> The UserTrust Network -> USERTrust RSA Certification Authority

We find the fingerprint from the openssl command "X509v3 Subject Key Identifier" matches the certificate field "subject key identifier" in Firefox.

## Configure the undercloud to use the CAs

```sh
trust list | grep label | wc -l
148

sudo cp /home/stack/CERT/stack.university.ac.uk/stack.university.ac.uk.interm.cer /etc/pki/ca-trust/source/anchors/public_ca_chain.pem
sudo update-ca-trust extract

# although the certificate chain includes 3 certificates only 1 is imported, this is the intermediate CA "CN = GEANT OV RSA CA 4" that is not part of a default trust store
trust list | grep label | wc -l
149

# check CA/trusted certificates available to the OS
trust list | grep label | grep -i "AAA Certificate Services"
label: AAA Certificate Services

trust list | grep label | grep -i "USERTrust RSA Certification Authority"
label: USERTrust RSA Certification Authority
label: USERTrust RSA Certification Authority

trust list | grep label | grep -i "GEANT OV RSA CA 4"
label: GEANT OV RSA CA 4
```

## Configure the controller nodes to use the publicly signed certificate
|
||||
|
||||
NOTE: "PublicTLSCAFile" is used both by the overcloud HAProxy configuration and the undercloud installer to contact https://stack.university.ac.uk:13000
|
||||
- The documentation presents the "PublicTLSCAFile" configuration item as the root CA certificate.
|
||||
- When the undercloud runs various custom Openstack ansible modules, the python libraries run have a completely empty trust store that do not reference the undercloud OS trust store and do not ingest shell variables to set trust store sources.
|
||||
- For the python to validate the overcloud public API endpoint, the full trust chain must be present. Python is not fussy about the order of certificates in this file, the vendor CA trust chain file in this case was ordered starting with the root CA.
|
||||
|
||||
Backup /home/stack/templates/enable-tls.yaml `mv /home/stack/templates/enable-tls.yaml /home/stack/templates/enable-tls.yaml.internal_ca`
|
||||
Create new `/home/stack/templates/enable-tls.yaml`, the content for each field is source as follows:
|
||||
|
||||
```
PublicTLSCAFile: '/etc/pki/ca-trust/source/anchors/public_ca_chain.pem'
SSLCertificate: content from stack.university.ac.uk.cer
SSLIntermediateCertificate: use both intermediate certificates, in the order intermediate-2, intermediate-1 (RFC 5246)
SSLKey: content from stack.university.ac.uk.key
```

The fully populated /home/stack/templates/enable-tls.yaml:

NOTE: the intermediate certificates configuration item contains both intermediate certificates.
Luckily Openstack does not validate this field and pushes it directly into the HAProxy pem file; the order of the pem is as NGINX prefers (RFC 5246): service certificate, intermediate CA2, intermediate CA1, root CA.
During the SSL handshake the client will check the intermediate certificates in the response; if they are not present in the local trust store, signing will be checked up to the root CA, which will be in the client trust store.

```yaml
parameter_defaults:
  # Set CSRF_COOKIE_SECURE / SESSION_COOKIE_SECURE in Horizon
  # Type: boolean
  HorizonSecureCookies: True

  # Specifies the default CA cert to use if TLS is used for services in the public network.
  # Type: string
  # PublicTLSCAFile: '/etc/pki/ca-trust/source/anchors/public_ca.pem'
  PublicTLSCAFile: '/home/stack/templates/stack.university.ac.uk.interm.cer'

  # The content of the SSL certificate (without Key) in PEM format.
  # Type: string
  SSLCertificate: |
    -----BEGIN CERTIFICATE-----
    MIIHYDCCBUigAwIBAgIRAK55qnAAkkQKzs6cusLn+0IwDQYJKoZIhvcNAQEMBQAw
    .....
    +vXuwEyJ5ULoW0TO6CuQvAvJsVM=
    -----END CERTIFICATE-----

  # The content of an SSL intermediate CA certificate in PEM format.
  # Type: string
  SSLIntermediateCertificate: |
    -----BEGIN CERTIFICATE-----
    MIIG5TCCBM2gAwIBAgIRANpDvROb0li7TdYcrMTz2+AwDQYJKoZIhvcNAQEMBQAw
    .....
    Ipwgu2L/WJclvd6g+ZA/iWkLSMcpnFb+uX6QBqvD6+RNxul1FaB5iHY=
    -----END CERTIFICATE-----

    -----BEGIN CERTIFICATE-----
    MIIFgTCCBGmgAwIBAgIQOXJEOvkit1HX02wQ3TE1lTANBgkqhkiG9w0BAQwFADB7
    .....
    vGp4z7h/jnZymQyd/teRCBaho1+V
    -----END CERTIFICATE-----

  # The content of the SSL Key in PEM format.
  # Type: string
  SSLKey: |
    -----BEGIN RSA PRIVATE KEY-----
    MIIEpAIBAAKCAQEAqXvJwxSDfxjapmRMqFlchTPPpGUi6n0lFbJ7G2YQ+HUBwaEZ
    .....
    PcVhU+Ybi7ABCOyRUzZWXDlf6DxF4Kgoe/Ak99nM7v0MIndlbgZBYA==
    -----END RSA PRIVATE KEY-----

  # ******************************************************
  # Static parameters - these are values that must be
  # included in the environment but should not be changed.
  # ******************************************************
  # The filepath of the certificate as it will be stored in the controller.
  # Type: string
  DeployedSSLCertificatePath: /etc/pki/tls/private/overcloud_endpoint.pem
```

## Update the overcloud nodes to have all of the CA + Intermediate CA certificates imported into their trust stores

Whilst the overcloud nodes shouldn't use the public certificate for inter-service API communication (this is not a TLS-everywhere installation), include this CA chain as a precaution.
Backup /home/stack/templates/inject-trust-anchor-hiera.yaml `mv /home/stack/templates/inject-trust-anchor-hiera.yaml /home/stack/templates/inject-trust-anchor-hiera.yaml.internal_ca`
Create a new `/home/stack/templates/inject-trust-anchor-hiera.yaml`; the content for each field is sourced as follows:

```yaml
CAMap:
  root-ca:
    content: |
      "CN = AAA Certificate Services" certificate content here
  intermediate-ca-1:
    content: |
      "CN = USERTrust RSA Certification Authority" certificate content here
  intermediate-ca-2:
    content: |
      "CN = GEANT OV RSA CA 4" certificate content here
```

The fully populated /home/stack/templates/inject-trust-anchor-hiera.yaml.

```yaml
parameter_defaults:
  # Map containing the CA certs and information needed for deploying them.
  # Type: json
  CAMap:
    root-ca:
      content: |
        -----BEGIN CERTIFICATE-----
        MIIEMjCCAxqgAwIBAgIBATANBgkqhkiG9w0BAQUFADB7MQswCQYDVQQGEwJHQjEb
        .....
        smPi9WIsgtRqAEFQ8TmDn5XpNpaYbg==
        -----END CERTIFICATE-----
    intermediate-ca-1:
      content: |
        -----BEGIN CERTIFICATE-----
        MIIFgTCCBGmgAwIBAgIQOXJEOvkit1HX02wQ3TE1lTANBgkqhkiG9w0BAQwFADB7
        .....
        vGp4z7h/jnZymQyd/teRCBaho1+V
        -----END CERTIFICATE-----
    intermediate-ca-2:
      content: |
        -----BEGIN CERTIFICATE-----
        MIIG5TCCBM2gAwIBAgIRANpDvROb0li7TdYcrMTz2+AwDQYJKoZIhvcNAQEMBQAw
        .....
        Ipwgu2L/WJclvd6g+ZA/iWkLSMcpnFb+uX6QBqvD6+RNxul1FaB5iHY=
        -----END CERTIFICATE-----
```

## Deploy the overcloud

The FQDN of the floating IP served by the HAProxy containers on the controller nodes must have an upstream DNS A record; this should be present as the `CloudName:` parameter.
The DNS hosts should return the A record; for the University, the internal DNS server and a publicly published record resolve stack.university.ac.uk.

```sh
grep CloudName: /home/stack/templates/custom-domain.yaml
CloudName: stack.university.ac.uk

grep DnsServers: /home/stack/templates/custom-domain.yaml
DnsServers: ["144.173.6.71", "1.1.1.1"]

[stack@undercloud templates]$ grep 10.121.4.14 vips.yaml
PublicVirtualFixedIPs: [{'ip_address':'10.121.4.14'}]

dig stack.university.ac.uk @144.173.6.71
dig stack.university.ac.uk @1.1.1.1

;; ANSWER SECTION:
stack.university.ac.uk. 86400 IN A 10.121.4.14
```

Use the exact same arguments as the previous deployment to mitigate any unwanted changes to the cluster; for this build the script `overcloud-deploy.sh` should be up to date with this record.

```sh
./overcloud-deploy.sh
```

The update will complete for all overcloud nodes, however the undercloud may time out contacting the external API endpoint with the newly changed SSL certificate.
The HAProxy containers on the controller nodes need to be restarted to pick up the new certificates.
If you were to run the deployment again (with no changes and restarted HAProxy containers) it should complete without issue and set the deployment status to 'UPDATE_COMPLETE' when checking `openstack stack list`.

## Restart HAProxy containers on the controller nodes

Follow the instructions to restart the HAProxy containers on the overcloud controller nodes once the deployment has finished updating the SSL certificate.

> [https://access.redhat.com/documentation/en-us/red\_hat\_openstack\_platform/16.2/html/advanced\_overcloud\_customization/assembly\_enabling-ssl-tls-on-overcloud-public-endpoints#proc\_manually-updating-ssl-tls-certificates\_enabling-ssl-tls-on-overcloud-public-endpoints](https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/advanced_overcloud_customization/assembly_enabling-ssl-tls-on-overcloud-public-endpoints#proc_manually-updating-ssl-tls-certificates_enabling-ssl-tls-on-overcloud-public-endpoints)

```sh
grep control /etc/hosts | grep ctlplane

10.122.0.30 overcloud-controller-0.ctlplane.university.ac.uk overcloud-controller-0.ctlplane
10.122.0.31 overcloud-controller-1.ctlplane.university.ac.uk overcloud-controller-1.ctlplane
10.122.0.32 overcloud-controller-2.ctlplane.university.ac.uk overcloud-controller-2.ctlplane

# for each controller node
ssh heat-admin@overcloud-controller-0.ctlplane.university.ac.uk
sudo su -
podman restart $(podman ps --format="{{.Names}}" | grep -w -E 'haproxy(-bundle-.*-[0-9]+)?')
```

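Once HAProxy has restarted, a quick external check (an addition, not in the original) confirms the new certificate is being served:

```sh
# print issuer and validity dates of the certificate currently served on the public endpoint
openssl s_client -connect stack.university.ac.uk:443 -servername stack.university.ac.uk </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer -dates
```
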
# SSL notes

Verify a full chain of certificates easily.

```sh
openssl verify -verbose -CAfile <(cat CERT/stack.university.ac.uk/intermediate_ca_2.pem CERT/stack.university.ac.uk/intermediate_ca_1.pem CERT/stack.university.ac.uk/root_ca.pem) CERT/stack.university.ac.uk/service_cert.pem
```

Check a certificate key is valid for a certificate.

```sh
openssl x509 -noout -modulus -in CERT/stack.university.ac.uk/stack.university.ac.uk.cert.cer | openssl md5
(stdin)= 60a5df743ac212edb2b28bf315bce828
openssl rsa -noout -modulus -in CERT/stack.university.ac.uk/stack.university.ac.uk.key | openssl md5
(stdin)= 60a5df743ac212edb2b28bf315bce828
```

Format of an nginx-type certificate bundle; HAProxy uses the same bundle format as nginx. When populating Openstack configuration files with multiple intermediate certs in a single field, order them as follows.

```
# create chain bundle, order as per IETF RFC 5246 Section 7.4.2 (search for nginx cert chain order)
cat ../out/service.pem ../out/ca.pem > ../out/reg-chain.pem

# the order with multiple intermediate certs would resemble
cert
int 2
int 1
root
```

# What is this?

Openstack RHOSP 16.2 (tripleo) baremetal deployment with:

- virtual undercloud
- multiple server types
- custom roles
- ldap integration
- public SSL validation on dashboard/api
- standalone opensource Ceph cluster with erasure coding
- Nvidia Cumulus 100G switch configuration with MLAG/CLAG
- training documentation - domains, projects, groups, users, flavours, quotas, provider networks, private networks

More rough guides for manila with ceph and quay registry integration - not present here.