ansible-expert

Enterprise Ansible automation with AWX, collections, roles, and Optum Epic infrastructure patterns

Status: active
IDE: codex
Version: 1.0.0
Owner: epic-platform-sre
Tags: ansible, automation, awx, infrastructure, epic, optum

Ansible Expert Skill

You are an expert in Ansible automation, AWX Configuration-as-Code, and the Optum Epic on Azure infrastructure. You understand playbook organization, role development, inventory management, and enterprise patterns for large-scale infrastructure automation.

Core Competencies

Ansible Fundamentals

  • Playbook Design: Idempotent plays, task organization, handlers, tags
  • Role Development: Galaxy-compatible roles, role dependencies, defaults vs vars
  • Inventory Management: Static and dynamic inventories, groups, hostvars
  • Variable Precedence: Understanding the 22 levels of variable precedence
  • Jinja2 Templating: Filters, tests, control structures, custom filters
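
As a minimal sketch of how precedence plays out in practice, the playbook below sets the same variable at play level and at task level. The variable name `greeting` is purely illustrative and does not come from the Epic playbooks.

```yaml
# Illustrative only: `greeting` is a hypothetical variable name.
- name: Demonstrate variable precedence
  hosts: localhost
  gather_facts: false
  vars:
    greeting: set in play vars

  tasks:
    - name: Show the play-level value
      ansible.builtin.debug:
        var: greeting

    - name: Task vars override play vars for this task only
      ansible.builtin.debug:
        var: greeting
      vars:
        greeting: set in task vars
```

Running the same playbook with `-e greeting=cli` should make both tasks print the extra-vars value, because extra vars sit at the top of the precedence order.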

AWX Integration

  • Configuration-as-Code: Managing AWX via ansible_role_awx_cac
  • Job Templates: Template creation, extra_vars, surveys, credentials
  • Inventory Sources: Dynamic Azure inventory, sync schedules
  • Credentials: Credential types, secret management, vault integration
  • Notifications: Webhooks, email, Slack integration

Azure Collections

  • azure.azcollection: Virtual machines, networking, storage
  • Resource Management: Resource groups, tags, naming conventions
  • Identity: Managed identities, service principals, RBAC
  • Networking: VNets, subnets, NSGs, load balancers
  • Monitoring: Azure Monitor integration, custom metrics
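
A short sketch of resource management with `azure.azcollection`, assuming Azure credentials are already available to the control node. The resource group name, region, and tag values are hypothetical; the tag keys mirror the `Environment` and `Application` tags used for inventory grouping elsewhere in this document.

```yaml
- name: Ensure a resource group exists with standard tags
  hosts: localhost
  connection: local
  gather_facts: false

  tasks:
    - name: Create or update the resource group
      azure.azcollection.azure_rm_resourcegroup:
        name: rg-epic-example-dev # hypothetical name
        location: eastus # hypothetical region
        tags:
          Environment: dev
          Application: Epic
        state: present
```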

Epic-Specific Patterns

  • ODB (Operational Database): Database management, snapshots, refresh
  • Citrix VDA: Virtual desktop agents, image management
  • Day 2 Operations: Patching, backups, disaster recovery
  • Epic Application Roles: Installation, configuration, updates

Project Structure

Standard Layout

ohemr-ansible-playbooks/
├── playbooks/
│   ├── epic-on-azure/
│   │   ├── pb_odb.yml              # ODB management
│   │   ├── pb_citrix_vda.yml       # Citrix automation
│   │   └── pb_day2_patching.yml    # Maintenance
│   └── awx/
│       └── pb_manage_inventory_sources.yml
├── roles/
│   ├── requirements.yml            # External role dependencies
│   └── internal/
│       └── custom_role/
├── inventory/
│   ├── production/
│   │   ├── hosts.yml
│   │   └── group_vars/
│   └── azure_rm.yml                # Dynamic inventory
├── vars/
│   └── awx/
│       ├── inventory_sources.yml
│       └── job_templates.yml
├── ansible.cfg
└── .ansible-lint

Playbook Best Practices

Idempotent Design

---
- name: Configure Epic application server
  hosts: epic_app_servers
  become: true
  gather_facts: true

  tasks:
    - name: Ensure Epic service is configured
      ansible.builtin.template:
        src: epic.conf.j2
        dest: /etc/epic/epic.conf
        owner: epic
        group: epic
        mode: '0640'
        validate: 'epic-validate %s' # Validate before replacing
      notify: Restart epic service

    - name: Ensure Epic service is running
      ansible.builtin.systemd:
        name: epic
        state: started
        enabled: true

  handlers:
    - name: Restart epic service
      ansible.builtin.systemd:
        name: epic
        state: restarted
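
Modules such as `template` and `systemd` are idempotent on their own, but `command` tasks are not. One possible way to keep them idempotent is the `creates` argument, sketched below. The initialisation command and marker file are hypothetical, not part of any Epic tooling.

```yaml
- name: Initialise the Epic data directory only once
  ansible.builtin.command:
    cmd: /opt/epic/bin/epic-init --data-dir /data/epic # hypothetical command
    creates: /data/epic/.initialised # hypothetical marker; task is skipped once it exists
```

For commands that only read state, `changed_when: false` serves the same purpose by never reporting a change.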

Variable Organization

# group_vars/epic_app_servers/main.yml
---
epic_version: '2023.1'
epic_install_path: '/opt/epic'
epic_data_path: '/data/epic'

# Environment-specific
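# The env lookup returns an empty string when the variable is unset,
# so default() needs its second argument to treat empty values as missing
epic_environment: "{{ lookup('env', 'EPIC_ENV') | default('dev', true) }}"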

# Sensitive data (use Ansible Vault)
epic_db_password: '{{ vault_epic_db_password }}'
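
The `vault_` prefix above suggests a companion file that holds the encrypted value. A minimal sketch, assuming the common convention of a `vault.yml` next to `main.yml` in the same group_vars directory (the file name and the placeholder value are assumptions):

```yaml
# group_vars/epic_app_servers/vault.yml
# Encrypt before committing: ansible-vault encrypt group_vars/epic_app_servers/vault.yml
---
vault_epic_db_password: 'replace-me' # placeholder, never commit a real secret unencrypted
```

Keeping the plain-text reference in `main.yml` and the secret in `vault.yml` lets variable names stay searchable while the values remain encrypted.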

Error Handling

- name: Deploy with rollback capability
  block:
    - name: Stop application
      ansible.builtin.systemd:
        name: epic
        state: stopped

    - name: Backup current version
      ansible.builtin.copy:
        src: /opt/epic/app/ # Trailing slash copies the directory contents
        dest: /opt/epic/app.backup/
        remote_src: true

    - name: Deploy new version
      ansible.builtin.unarchive:
        src: epic-{{ epic_version }}.tar.gz
        dest: /opt/epic/
        remote_src: false

  rescue:
    - name: Rollback on failure
      ansible.builtin.copy:
        src: /opt/epic/app.backup/ # Restore contents into the original path
        dest: /opt/epic/app/
        remote_src: true

    - name: Notify failure
      ansible.builtin.debug:
        msg: 'Deployment failed, rolled back to previous version'

  always:
    - name: Start application
      ansible.builtin.systemd:
        name: epic
        state: started
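
When every task in a `rescue` section succeeds, Ansible treats the original failure as handled and the play continues. If the job should still end as failed after a rollback, one option is to finish the rescue section with an explicit `fail`, as in this sketch (task names and messages are illustrative):

```yaml
- name: Deploy and fail visibly after rollback
  block:
    - name: Deploy new version
      ansible.builtin.unarchive:
        src: epic-{{ epic_version }}.tar.gz
        dest: /opt/epic/

  rescue:
    - name: Report which task failed
      ansible.builtin.debug:
        msg: "Task '{{ ansible_failed_task.name }}' failed: {{ ansible_failed_result.msg | default('no message') }}"

    - name: Mark the host as failed once rollback steps have run
      ansible.builtin.fail:
        msg: 'Deployment failed and was rolled back'
```

An `always` section still runs after a failing rescue, so a final service start is not skipped.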

Role Development

Galaxy-Compatible Structure

ansible_role_example/
├── README.md
├── defaults/
│   └── main.yml        # Default variables (lowest precedence)
├── vars/
│   └── main.yml        # Role variables (higher precedence)
├── tasks/
│   ├── main.yml        # Entry point
│   ├── install.yml     # Installation tasks
│   └── configure.yml   # Configuration tasks
├── handlers/
│   └── main.yml        # Event handlers
├── templates/
│   └── config.j2       # Jinja2 templates
├── files/
│   └── script.sh       # Static files
├── meta/
│   └── main.yml        # Role dependencies and metadata
└── molecule/
    └── default/
        ├── molecule.yml
        └── converge.yml
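
A possible `tasks/main.yml` for this layout, sketched under the assumption that the entry point only delegates to the two task files shown in the tree. The tag names are illustrative.

```yaml
# tasks/main.yml
---
- name: Run installation tasks
  ansible.builtin.import_tasks: install.yml
  tags:
    - install

- name: Run configuration tasks
  ansible.builtin.import_tasks: configure.yml
  tags:
    - configure
```

With static imports the tags are inherited by every task in the imported file, so `--tags configure` reruns configuration without reinstalling.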

Role Meta with Dependencies

# meta/main.yml
---
galaxy_info:
  role_name: epic_base
  author: epic-platform-sre
  description: Base configuration for Epic servers
  company: Optum
  license: proprietary
  min_ansible_version: '2.14'
  platforms:
    - name: Ubuntu
      versions:
        - jammy
  galaxy_tags:
    - epic
    - infrastructure

dependencies:
  - role: geerlingguy.java
    version: '2.2.0'
  - role: internal.common_security

Molecule Testing

# molecule/default/molecule.yml
---
driver:
  name: docker
platforms:
  - name: instance
    image: ubuntu:22.04
    pre_build_image: true
provisioner:
  name: ansible
  config_options:
    defaults:
      callbacks_enabled: ansible.posix.profile_tasks
verifier:
  name: ansible
scenario:
  test_sequence:
    - dependency
    - syntax
    - create
    - prepare
    - converge
    - idempotence
    - verify
    - destroy
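
The `converge` and `verify` steps in the sequence above each expect a playbook in the scenario directory. A minimal sketch of both follows, assuming the role directory is named `ansible_role_example` as in the structure shown earlier; the checked file path is hypothetical.

```yaml
# molecule/default/converge.yml
---
- name: Converge
  hosts: all
  roles:
    - role: ansible_role_example

# molecule/default/verify.yml (a separate file)
---
- name: Verify
  hosts: all
  gather_facts: false
  tasks:
    - name: Check that the configuration file exists
      ansible.builtin.stat:
        path: /etc/example/example.conf # hypothetical path
      register: config_file

    - name: Assert the configuration file is present
      ansible.builtin.assert:
        that:
          - config_file.stat.exists
```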

AWX Configuration-as-Code

Inventory Source Management

# vars/awx/inventory_sources.yml
---
awx_inventory_sources:
  - name: Azure Production VMs
    inventory: Production
    source: azure_rm
    credential: Azure Service Principal
    source_vars:
      plugin: azure.azcollection.azure_rm
      auth_source: auto # AWX injects the Azure credential as environment variables
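      # Resource group names are matched literally ('*' means all groups),
      # so name patterns are applied with a host filter instead
      include_vm_resource_groups:
        - '*'
      exclude_host_filters:
        - resource_group is not match('rg-epic-prod-')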
      keyed_groups:
        - key: tags.Environment
          prefix: env
        - key: tags.Application
          prefix: app
    update_on_launch: true
    overwrite: true
    update_cache_timeout: 3600
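
The internal `ansible_role_awx_cac` role is not shown here, so the following is only a sketch of how a list like this could be applied using the public `awx.awx.inventory_source` module directly. It assumes the playbook lives in `playbooks/awx/` and that controller connection details are supplied through environment variables such as `CONTROLLER_HOST` and `CONTROLLER_OAUTH_TOKEN`.

```yaml
- name: Apply AWX inventory sources from vars
  hosts: localhost
  connection: local
  gather_facts: false
  vars_files:
    - ../../vars/awx/inventory_sources.yml

  tasks:
    - name: Create or update each inventory source
      awx.awx.inventory_source:
        name: '{{ item.name }}'
        inventory: '{{ item.inventory }}'
        source: '{{ item.source }}'
        credential: '{{ item.credential }}'
        source_vars: '{{ item.source_vars }}'
        update_on_launch: '{{ item.update_on_launch | default(false) }}'
        overwrite: '{{ item.overwrite | default(false) }}'
        update_cache_timeout: '{{ item.update_cache_timeout | default(0) }}'
        state: present
      loop: '{{ awx_inventory_sources }}'
      loop_control:
        label: '{{ item.name }}'
```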

Job Template Definition

# vars/awx/job_templates.yml
---
awx_job_templates:
  - name: Epic ODB Snapshot
    project: OHEMR Ansible Playbooks
    playbook: playbooks/epic-on-azure/pb_odb_snapshot.yml
    inventory: Production
    credentials:
      - Azure Service Principal
      - Epic Vault Credentials
    extra_vars:
      snapshot_retention_days: 7
    ask_variables_on_launch: true
    survey_enabled: true
    survey_spec:
      name: ODB Snapshot Survey
      description: Parameters for ODB snapshot
      spec:
        - question_name: Database Instance
          question_description: Which ODB instance to snapshot?
          variable: odb_instance
          type: multiplechoice
          choices:
            - odb-prod-001
            - odb-prod-002
          required: true

Dynamic Inventory

Azure RM Plugin

# inventory/azure_rm.yml
---
plugin: azure.azcollection.azure_rm
auth_source: auto # In AWX, the Azure credential is supplied through environment variables

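# Search all resource groups; names are matched literally, so patterns are applied below
include_vm_resource_groups:
  - '*'

# Exclude powered-off VMs and VMs outside the Epic and Citrix resource groups
exclude_host_filters:
  - powerstate != 'running'
  - resource_group is not match('rg-(epic|citrix)-')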

# Conditional groups based on tags
conditional_groups:
  epic_app_servers: "tags.Application == 'Epic' and tags.Tier == 'App'"
  epic_db_servers: "tags.Application == 'Epic' and tags.Tier == 'Database'"
  citrix_vda: "tags.Role == 'CitrixVDA'"

# Keyed groups for dynamic organization
keyed_groups:
  - key: tags.Environment
    prefix: env
  - key: tags.Application
    prefix: app
  - key: location
    prefix: location

# Compose hostvars from Azure properties
# (resource_group and location are already provided by the plugin)
compose:
  ansible_host: public_ipv4_addresses[0] | default(private_ipv4_addresses[0])
  vm_size: virtual_machine_size
  epic_environment: tags.Environment
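
Groups produced by `conditional_groups` and `keyed_groups` can be combined in a host pattern. The sketch below assumes the `Environment` tag carries the value `production`, which would yield a group named `env_production`; the actual tag values are not stated in this document.

```yaml
- name: Target production Epic app servers from the dynamic inventory
  hosts: 'epic_app_servers:&env_production' # intersection of both groups; tag value is assumed
  gather_facts: false

  tasks:
    - name: Show composed hostvars
      ansible.builtin.debug:
        msg: '{{ inventory_hostname }} is in {{ epic_environment }} at {{ ansible_host }}'
```

`ansible-inventory -i inventory/azure_rm.yml --graph` lists the group names that the plugin actually generates.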

Common Patterns

Epic ODB Management

- name: ODB snapshot and refresh workflow
  hosts: odb_servers
  gather_facts: true
  serial: 1 # Process one at a time

  tasks:
    - name: Validate ODB is healthy
      ansible.builtin.command:
        cmd: /opt/epic/bin/odb-health-check
      register: health_check
      changed_when: false
      failed_when: health_check.rc != 0

    - name: Create ODB snapshot
      azure.azcollection.azure_rm_snapshot:
        resource_group: '{{ resource_group }}'
        name: 'odb-{{ inventory_hostname }}-{{ ansible_date_time.date }}'
        location: '{{ location }}'
        creation_data:
          create_option: Copy
          source_id: '{{ odb_disk_id }}' # Resource ID of the managed disk
        sku:
          name: Standard_LRS
        tags:
          Environment: '{{ epic_environment }}'
          Purpose: backup
          RetentionDays: '{{ snapshot_retention_days | string }}'
      delegate_to: localhost # Azure API calls run from the control node
      register: snapshot_result

    - name: Log snapshot creation
      ansible.builtin.debug:
        msg: 'Snapshot created: {{ snapshot_result.id }}'

Citrix VDA Deployment

- name: Deploy Citrix VDA agents
  hosts: citrix_vda
  become: true

  roles:
    - role: ohemr-ansible-role-citrix-vda
      citrix_controller: "{{ hostvars['citrix-ddc-01']['ansible_host'] }}"
      citrix_vda_version: '2308'
      citrix_optimization: true

Day 2 Patching

- name: Monthly patching workflow
  hosts: all
  become: true
  serial: '25%' # Patch 25% at a time

  pre_tasks:
    - name: Verify no Epic jobs running
      ansible.builtin.command:
        cmd: epic-job-status
      register: job_status
      changed_when: false
      failed_when: "'RUNNING' in job_status.stdout"
      when: "'epic_app_servers' in group_names"

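    - name: Drain load balancer
      # Assumption: azure_rm_lb_pool_member is a placeholder module name. Verify it against
      # the installed azure.azcollection; pool membership is commonly managed through
      # azure_rm_networkinterface (ip_configurations.load_balancer_backend_address_pools).
      azure.azcollection.azure_rm_lb_pool_member: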
        resource_group: '{{ resource_group }}'
        load_balancer: '{{ load_balancer_name }}'
        backend_pool: production-pool
        vm: '{{ inventory_hostname }}'
        state: absent
      delegate_to: localhost

  tasks:
    - name: Update all packages
      ansible.builtin.apt:
        upgrade: safe
        update_cache: true
      register: apt_result

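    - name: Check whether a reboot is required
      ansible.builtin.stat:
        path: /var/run/reboot-required # Created by Ubuntu packages that need a reboot
      register: reboot_required

    - name: Reboot if required by updated packages
      ansible.builtin.reboot:
        reboot_timeout: 600
      when: reboot_required.stat.exists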

  post_tasks:
    - name: Re-add to load balancer
      # Assumption: placeholder module name; verify against the installed azure.azcollection
      azure.azcollection.azure_rm_lb_pool_member:
        resource_group: '{{ resource_group }}'
        load_balancer: '{{ load_balancer_name }}'
        backend_pool: production-pool
        vm: '{{ inventory_hostname }}'
        state: present
      delegate_to: localhost

Ansible Vault

Encrypting Variables

# Create encrypted file
ansible-vault create secrets.yml

# Encrypt existing file
ansible-vault encrypt vars/production_secrets.yml

# Edit encrypted file
ansible-vault edit vars/production_secrets.yml

# Run playbook with vault password
ansible-playbook playbook.yml --ask-vault-pass

# Use vault password file
ansible-playbook playbook.yml --vault-password-file ~/.vault_pass

Inline Vault Variables

# Encrypt a single string (reads the secret from stdin):
#   ansible-vault encrypt_string --name 'epic_db_password'
epic_db_password: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  66386439653238336435626332303762373038386564393865353834623562393063343
  ...
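
Vault protects secrets at rest, but decrypted values can still surface in task output and AWX job logs. A minimal sketch of a task that consumes the password with `no_log` enabled follows; the template name and destination path are hypothetical.

```yaml
- name: Configure database connection
  ansible.builtin.template:
    src: db.conf.j2 # hypothetical template that renders epic_db_password
    dest: /etc/epic/db.conf # hypothetical path
    owner: epic
    group: epic
    mode: '0600'
  no_log: true # keep the decrypted password out of task output and job logs
```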

Collections

Installing Collections

# requirements.yml
---
collections:
  - name: azure.azcollection
    version: '2.0.0'
  - name: community.general
    version: '8.0.0'
  - name: awx.awx
    version: '22.0.0'

roles:
  - name: geerlingguy.java
    version: '2.2.0'

# Install collections and roles
ansible-galaxy collection install -r requirements.yml
ansible-galaxy role install -r requirements.yml

Linting & Quality

Ansible Lint Configuration

# .ansible-lint
---
profile: production

exclude_paths:
  - .cache/
  - molecule/
  - .venv/

skip_list:
  - yaml[line-length] # Allow longer lines for readability

warn_list:
  - experimental

kinds:
  - playbook: 'playbooks/**/*.yml'
  - tasks: '**/tasks/*.yml'
  - vars: '**/vars/*.yml'

Pre-Commit Integration

# Run ansible-lint
ansible-lint playbooks/

# Run syntax check
ansible-playbook playbooks/pb_example.yml --syntax-check

# Dry-run mode
ansible-playbook playbooks/pb_example.yml --check --diff
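
The commands above can also run automatically on each commit. A possible `.pre-commit-config.yaml` is sketched below using the hook published in the ansible-lint repository; the pinned `rev` is an assumed example and should be replaced with the release the team has validated.

```yaml
# .pre-commit-config.yaml
---
repos:
  - repo: https://github.com/ansible/ansible-lint
    rev: v24.2.0 # assumed tag; pin to a validated release
    hooks:
      - id: ansible-lint
```

After `pre-commit install`, the hook picks up the `.ansible-lint` configuration from the repository root.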

Troubleshooting

Debugging Playbooks

- name: Debug variable values
  ansible.builtin.debug:
    var: hostvars[inventory_hostname]
    verbosity: 2 # Only show with -vv

- name: Assert expected state
  ansible.builtin.assert:
    that:
      - epic_version is defined
      - epic_version is version('2023.1', '>=')
    fail_msg: 'Epic version must be 2023.1 or higher'

Common Issues

Issue: Azure dynamic inventory not updating

# Force inventory refresh in AWX
awx inventory_sources update <source-id> --wait

# Verify source vars
ansible-inventory -i inventory/azure_rm.yml --graph

Issue: Role not found

# Check role path
ansible-config dump | grep ROLES_PATH

# Install missing roles
ansible-galaxy install -r roles/requirements.yml --force

Issue: Connection timeout to Azure VMs

# Use bastion host
ansible_ssh_common_args: '-o ProxyCommand="ssh -W %h:%p bastion-host"'

# Increase timeout
ansible_ssh_timeout: 60

When to Apply This Skill

Use this skill for:

  • ✅ Ansible playbook development
  • ✅ Role creation and maintenance
  • ✅ AWX configuration-as-code
  • ✅ Epic infrastructure automation
  • ✅ Azure resource management
  • ✅ Inventory management
  • ✅ Day 2 operations

Do not use for:

  • ❌ Terraform infrastructure provisioning (use Terraform skill)
  • ❌ Application development (use Python/Node.js skills)
  • ❌ Manual operations (automate with playbooks first)

Resources

Related Assets

All assets below are owned by epic-platform-sre.

  • Ansible Development & AWX Operations Assistant (Optum) (experimental)
    Complete Ansible development lifecycle assistant for Epic on Azure: create playbooks and roles locally, manage requirements.yml versions, test workflows, and deploy in AWX with CaC patterns.
    Labels: vscode, awx, ansible, cac, ops, epic, +1 more

  • awx-expert (active)
    AWX/AAP automation platform, Configuration-as-Code, object management, and Epic AWX deployment patterns.
    Labels: codex, awx, aap, ansible, automation, configuration-as-code, +3 more

  • azure-expert (active)
    Azure cloud infrastructure, Epic multi-subscription architecture, resource management, and Optum Azure patterns.
    Labels: codex, azure, cloud, infrastructure, epic, optum, +3 more

  • Ansible Playbook Creation Assistant (experimental)
    Interactive guide for creating new Ansible playbooks that execute in AWX, following Epic on Azure patterns for role integration, vault secrets, and testing workflows.
    Labels: claude, codex, vscode, ansible, playbook, creation, epic, awx, +1 more

  • AWX Job Template Creation Assistant (experimental)
    Guide through creating a new AWX job template using the ansible_role_awx_cac CaC model, including all required fields and best practices.
    Labels: claude, codex, vscode, awx, job-template, cac, epic, ansible

  • AWX Role Feature Branch Testing Assistant (experimental)
    Guide coordinated testing of Ansible role changes using feature branches in both the role repo and playbooks repo, following Epic on Azure patterns.
    Labels: claude, codex, vscode, awx, ansible, role-testing, feature-branch, cac, +1 more