Chapter 14: Best Practices

Learn proven practices for organizing Ansible projects and writing maintainable automation.

Last updated: 2024-01-28

Ansible Best Practices

This chapter collects practical best practices for organizing Ansible projects and writing Ansible code.

Project Structure

Recommended directory layout

project/
├── ansible.cfg                 # Ansible configuration file
├── requirements.yml            # Role and collection dependencies
├── site.yml                    # Main playbook
├── hosts                       # Inventory file
├── host_vars/                  # Host variables
│   ├── host1/
│   │   └── vars.yml
│   └── host2/
├── group_vars/                 # Group variables
│   ├── all/
│   │   ├── vault.yml           # Encrypted sensitive data
│   │   └── common.yml          # Common variables
│   ├── production/
│   │   ├── vault.yml
│   │   └── env.yml
│   └── staging/
│       ├── vault.yml
│       └── env.yml
├── library/                    # Custom modules
│   └── my_module.py
├── module_utils/               # Custom module utilities
├── plugins/                    # Custom plugins
│   ├── callback/
│   ├── filter/
│   └── inventory/
├── roles/                      # Roles
│   ├── common/
│   ├── nginx/
│   ├── mysql/
│   └── app/
├── playbooks/                  # Playbooks
│   ├── webserver.yml
│   ├── dbserver.yml
│   └── base.yml
├── vars/                       # Variable files
└── files/                      # Static files
    ├── scripts/
    └── certs/
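
As a minimal sketch of the requirements.yml file listed in the tree, the following declares one role and one collection. The names geerlingguy.nginx and community.general are illustrative assumptions, not dependencies named by this chapter; replace them with whatever your project actually uses.

```yaml
# requirements.yml (illustrative; the role and collection names are assumptions)
---
roles:
  - name: geerlingguy.nginx      # example Galaxy role
    # version: "x.y.z"           # pin to a release you have tested

collections:
  - name: community.general      # example Galaxy collection
    version: ">=8.0.0"           # example constraint, adjust as needed
```

On recent Ansible releases, running ansible-galaxy install -r requirements.yml should install both sections.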

Inventory Organization

1. Use a directory layout

inventory/
├── hosts.ini                  # Main inventory
├── group_vars/
│   ├── all.yml                # Shared by all hosts
│   ├── production.yml         # Production environment
│   └── staging.yml            # Staging (test) environment
└── host_vars/
    ├── web1.yml
    └── db1.yml

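2. Separate environments

# inventory/production/hosts
[production:children]
webservers
dbservers
appservers

[webservers]
prod-web-[01:10].example.com

[dbservers]
prod-db-[01:05].example.com

# Every child group must be defined, or the INI parser rejects the file.
# The host range below is a placeholder.
[appservers]
prod-app-[01:05].example.com

[production:vars]
# "environment" is a reserved play keyword, so use a different variable name
env=production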
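
The same grouping can also be written as a YAML inventory, which some teams find easier to review. This is a sketch under assumptions: the file name hosts.yml is hypothetical, only the two groups with known host ranges are shown, and the group variable is named env because environment is a reserved play keyword.

```yaml
# inventory/production/hosts.yml (hypothetical file name)
---
all:
  children:
    production:
      vars:
        env: production
      children:
        webservers:
          hosts:
            prod-web-[01:10].example.com:
        dbservers:
          hosts:
            prod-db-[01:05].example.com:
```

Running ansible-inventory -i inventory/production/hosts.yml --graph prints the resulting group tree, which is a quick way to confirm the file parses as intended.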

3. Separate sensitive data

# group_vars/all/vault.yml (encrypted with ansible-vault)
---
vault_db_password: "secure_password"
vault_api_key: "secret_key"

# group_vars/all/common.yml (plain text)
---
env: production
db_host: prod-db.example.com

Writing Playbooks

1. Playbook naming conventions

# ✅ Recommended
- name: Configure webserver
  hosts: webservers
  become: true

# ❌ Avoid
- name: do stuff
  hosts: all

2. Organize with roles

# site.yml
# site.yml imports one playbook per tier; each imported playbook is
# expected to apply the relevant roles.
---
- name: Deploy infrastructure
  import_playbook: playbooks/base.yml

- name: Deploy webservers
  import_playbook: playbooks/webserver.yml

- name: Deploy databases
  import_playbook: playbooks/dbserver.yml
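
To make the link between the imported playbooks and roles concrete, here is a minimal sketch of what one of them might contain. The role list is an assumption based on the roles/ directory in the recommended layout; the chapter does not show the real file.

```yaml
# playbooks/webserver.yml (assumed contents)
---
- name: Configure webservers
  hosts: webservers
  become: true
  roles:
    - common   # baseline configuration shared by all hosts
    - nginx    # web server
    - app      # application deployment
```

Keeping each tier playbook this thin leaves the logic in roles, where it can be reused and tested on its own.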

3. Task organization

# Playbook structure
# Ansible runs pre_tasks, then roles, then tasks, then post_tasks,
# regardless of the order in which these keys appear in the file.
---
- name: Deploy application
  hosts: webservers
  become: true

  # Variables
  vars:
    app_version: "2.0.0"

  # Pre-tasks
  pre_tasks:
    - name: Gather facts
      setup:

  # Roles
  roles:
    - nginx
    - app

  # Tasks
  tasks:
    - name: Final configuration
      template:
        src: final.conf.j2
        dest: /etc/myapp/final.conf

  # Post-tasks
  post_tasks:
    - name: Verify deployment
      uri:
        url: "http://{{ inventory_hostname }}"
        status_code: 200

  # Handlers
  handlers:
    - name: Reload nginx
      service:
        name: nginx
        state: reloaded
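
A handler only runs when a task notifies it and reports a change. The sketch below shows that wiring for the Reload nginx handler; the template name and destination path are hypothetical.

```yaml
tasks:
  - name: Deploy nginx site configuration
    template:
      src: site.conf.j2                  # hypothetical template
      dest: /etc/nginx/conf.d/site.conf  # hypothetical destination
    notify: Reload nginx                 # must match the handler name exactly

handlers:
  - name: Reload nginx
    service:
      name: nginx
      state: reloaded
```

If the rendered file is unchanged, the task reports ok rather than changed and nginx is left alone.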

4. Use tags

tasks:
  - name: Install packages
    apt:
      name: "{{ packages }}"
    tags:
      - install
      - packages

  - name: Configure application
    template:
      src: app.conf.j2
      dest: /etc/myapp/app.conf
    tags:
      - config

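Variable Management

1. Naming conventions

# ✅ Recommended: prefixed, descriptive names
nginx_version: "1.24.0"
mysql_max_connections: 1000
app_database_name: myapp

# ❌ Avoid: generic names
var1: value
config: value

2. Variable scope

# role defaults/  - lowest precedence, meant to be overridden
# role vars/      - internal to the role, not meant to be overridden
# inventory       - environment-specific values (group_vars, host_vars)
# playbook        - settings specific to a play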
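
To illustrate the lowest rung of that ladder, the sketch below defines a value in a role's defaults and overrides it from an inventory group. The variable name nginx_worker_connections and both values are made up for the example.

```yaml
# roles/nginx/defaults/main.yml (lowest precedence)
---
nginx_worker_connections: 1024

# group_vars/production/env.yml (inventory group variable; overrides the role default)
---
nginx_worker_connections: 4096
```

Hosts in the production group would see 4096 and all other hosts 1024. Extra variables passed with -e on the command line override both.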

3. Protect sensitive data with Vault

# group_vars/all/vault.yml
---
vault_db_password: "secret"
vault_api_key: "key"

# Reference the vaulted value from an unencrypted variable
db_password: "{{ vault_db_password }}"

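Developing Roles

1. Keep roles small

# ❌ Avoid: one large role that does everything

# ✅ Recommended: split by function
roles/
├── common/          # Base configuration
├── nginx/           # Web server
├── app/             # Application deployment
├── db/              # Database
└── monitoring/      # Monitoring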

2. Provide defaults

# defaults/main.yml
---
app_port: 8080
app_workers: "{{ ansible_facts['processor_vcpus'] | default(1) }}"
app_log_level: "INFO"

3. Document clearly

# defaults/main.yml
---
# Application settings
app_name: myapp
app_version: "1.0.0"

# Server settings
app_host: "0.0.0.0"
app_port: 8080

# Database settings
app_db_host: localhost
app_db_port: 3306
app_db_name: myapp
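
Comments in defaults/main.yml document variables for people. On ansible-core 2.11 or later, a role can also describe them in meta/argument_specs.yml, which Ansible validates when the role starts. The sketch below assumes the role is called app and covers only three of the variables above.

```yaml
# roles/app/meta/argument_specs.yml (assumed role name: app)
---
argument_specs:
  main:
    short_description: Deploy the application
    options:
      app_name:
        type: str
        default: myapp
        description: Name of the application.
      app_port:
        type: int
        default: 8080
        description: TCP port the application listens on.
      app_db_host:
        type: str
        default: localhost
        description: Database host name.
```

With this in place, passing a value such as app_port: "not-a-number" should fail at role entry instead of partway through a deployment.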

Error Handling

1. Use block and rescue

tasks:
  - name: Deploy application
    block:
      - name: Backup current version
        command: /opt/app/backup.sh

      - name: Deploy new version
        command: /opt/app/deploy.sh

      - name: Verify deployment
        command: /opt/app/verify.sh
    rescue:
      - name: Rollback on failure
        command: /opt/app/rollback.sh
      - name: Notify failure
        debug:
          msg: "Deployment failed, rolled back"
    always:
      - name: Cleanup
        file:
          path: /tmp/deploy_temp
          state: absent
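
One detail worth knowing: when every rescue task succeeds, Ansible treats the failure as handled and the play continues as if the block had succeeded. If a rolled-back deployment should still count as a failure, the rescue section can end with an explicit fail. A shortened sketch, reusing the script paths from the example above:

```yaml
tasks:
  - name: Deploy application
    block:
      - name: Deploy new version
        command: /opt/app/deploy.sh
    rescue:
      - name: Rollback on failure
        command: /opt/app/rollback.sh

      - name: Report the failed task
        debug:
          msg: "Task '{{ ansible_failed_task.name }}' failed; rolled back"

      - name: Mark the host as failed after the rollback
        fail:
          msg: "Deployment failed and was rolled back"
```

ansible_failed_task is a variable Ansible sets inside rescue; it holds the task that triggered the rescue.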

2. Ignore expected errors

tasks:
  - name: Stop service if running
    service:
      name: myapp
      state: stopped
    ignore_errors: true
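
ignore_errors hides every failure of the task, including ones you did not anticipate. Where the expected error is simply "the service is not installed", a narrower option is to check for the unit first. The sketch below assumes a systemd host, where service_facts records units under keys such as myapp.service.

```yaml
tasks:
  - name: Collect service facts
    service_facts:

  - name: Stop myapp only if the unit exists
    service:
      name: myapp
      state: stopped
    # Key format assumes systemd; other init systems may omit the ".service" suffix
    when: "'myapp.service' in ansible_facts.services"
```

Any other failure while stopping the service still stops the play, which is usually what you want.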

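3. Conditional execution

tasks:
  # The condition already limits this task to a single host. Adding
  # run_once here would evaluate "when" only on the first host in the play
  # and skip the task whenever that host is not the first dbserver.
  - name: Run database migration
    command: /opt/app/migrate.sh
    when: inventory_hostname == groups['dbservers'][0]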
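
Another way to run a step exactly once, even from a play that does not target the database hosts, is to combine run_once with delegate_to. This is a sketch, assuming the dbservers group is defined and not empty:

```yaml
tasks:
  - name: Run database migration once on the first database server
    command: /opt/app/migrate.sh
    run_once: true
    delegate_to: "{{ groups['dbservers'][0] }}"
```

Here run_once picks a single execution and delegate_to decides where it happens, so no when condition is needed.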

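Performance Optimization

1. Disable unnecessary fact gathering

---
- name: Fast playbook
  hosts: all
  gather_facts: false

# Or gather only selected subsets. gather_facts itself is a boolean;
# gather_subset chooses what to collect. The minimal subset (which includes
# ansible_distribution) is still gathered; "hardware" adds ansible_memory_mb.
- name: Selective facts
  hosts: all
  gather_facts: true
  gather_subset:
    - "!all"
    - hardware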

2. Optimize SSH connections

# ansible.cfg
[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s

3. Parallel execution

# Raise the number of parallel forks
ansible-playbook site.yml -f 50

# ansible.cfg
[defaults]
forks = 50

4. Batched execution (serial)

# One host at a time (rolling update)
- name: Rolling update
  hosts: webservers
  serial: 1

# Growing batches: 1 host, then 5, then 10 per batch
- name: Batch update
  hosts: webservers
  serial:
    - 1
    - 5
    - 10
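
serial also accepts a percentage, and it pairs naturally with max_fail_percentage so that a bad release stops before it reaches every host. A sketch with arbitrary numbers and a placeholder task:

```yaml
- name: Rolling update with a failure budget
  hosts: webservers
  serial: "25%"              # update a quarter of the group per batch
  max_fail_percentage: 20    # abort the play if more than 20% of a batch fails
  tasks:
    - name: Restart the application
      service:
        name: myapp
        state: restarted
```

The threshold has to be exceeded, not just reached, before Ansible aborts the play.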

5. Asynchronous tasks

tasks:
  - name: Long running task
    command: /opt/app/batch.sh
    async: 3600
    poll: 0
    register: batch_job
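
With poll: 0 the play moves on immediately, so the result has to be collected later. One way to do that, sketched here, is an async_status task that polls the job registered above; the retry count and delay are arbitrary and chosen to add up to the same one-hour limit.

```yaml
tasks:
  - name: Wait for the batch job to finish
    async_status:
      jid: "{{ batch_job.ansible_job_id }}"
    register: batch_result
    until: batch_result.finished
    retries: 120   # 120 checks x 30 seconds = 3600 seconds
    delay: 30
```

Other tasks can be placed between the launch and this check, which is where the time saving comes from.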

Security Best Practices

1. Use Vault

# Encrypt the sensitive file
ansible-vault encrypt group_vars/all/vault.yml

# Supply the vault password from a file
ansible-playbook site.yml --vault-password-file ~/.vault_pass

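2. Principle of least privilege

# ❌ Avoid: escalating to root for an entire play when only some tasks need it
- name: Deploy application
  hosts: webservers
  become: true

# ✅ Recommended: escalate only on the tasks that require root
- name: Install package
  apt:
    name: nginx
    state: present
  become: true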
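
Least privilege also applies to which account a task becomes. When a step does not need root at all, become_user can switch to an unprivileged service account instead. In this sketch both the appuser account and the script path are hypothetical:

```yaml
- name: Run the deploy script as the application user
  command: /opt/app/deploy.sh   # hypothetical script
  become: true
  become_user: appuser          # hypothetical unprivileged account
```

Note that become_user has no effect unless become is also enabled for the task, play, or connection.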

3. Hide sensitive output

tasks:
  - name: Configure secrets
    command: /opt/app/configure.sh
    no_log: true

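4. SSH security

# Use key-based authentication
# ansible.cfg
[defaults]
private_key_file = ~/.ssh/ansible_key
# Keep host key checking enabled; disabling it exposes connections to
# man-in-the-middle attacks
host_key_checking = True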

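Code Review Checklist

Playbook checks

  • Every play has a clear name
  • Target hosts are correct
  • become is used with the least privilege required
  • Every task has a name
  • Tags allow selective execution
  • Handlers are used to react to changes
  • Sensitive data is stored in Vault

Variable checks

  • Variable names are clear and consistent
  • Defaults are sensible
  • Sensitive variables are encrypted
  • Variables are fully documented

Role checks

  • A README is present
  • Default variables are defined
  • Tasks are clearly organized
  • Handlers are defined
  • Test cases exist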

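Testing

Using Molecule

# Install Molecule
pip install molecule

# Initialize a test scenario inside an existing role.
# (Molecule 6 and later removed "molecule init role"; older releases used it instead.)
cd roles/myrole
molecule init scenario

# Run the full test sequence
molecule test

# During development
molecule create
molecule converge
molecule verify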
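
With Molecule's default Ansible verifier, molecule verify runs a verify.yml playbook from the scenario directory. A minimal sketch, assuming the role under test starts a service on port 8080 (the default app_port used earlier in this chapter):

```yaml
# molecule/default/verify.yml
---
- name: Verify
  hosts: all
  gather_facts: false
  tasks:
    - name: Check that the application port is listening
      wait_for:
        port: 8080    # assumed to match app_port
        timeout: 10   # fail after 10 seconds if nothing is listening
```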

Ansible-lint

# Install
pip install ansible-lint

# Run the linter
ansible-lint site.yml
ansible-lint roles/myrole/
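
ansible-lint reads its settings from a .ansible-lint file in the project root. The following is a sketch of a possible starting point, not a recommended policy; the profile and the excluded path are assumptions to adjust.

```yaml
# .ansible-lint
---
profile: production    # strictest built-in profile; "basic" or "moderate" are gentler
exclude_paths:
  - .github/           # workflow files are not Ansible content
warn_list:
  - experimental       # report experimental rules as warnings, not failures
```

Starting with a gentler profile and tightening it over time tends to be easier on an existing code base.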

CI/CD Integration

GitHub Actions

# .github/workflows/ansible.yml
name: Ansible CI

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.x'

      - name: Install Ansible
        run: pip install ansible ansible-lint

      - name: Lint Ansible
        run: ansible-lint .

      - name: Syntax check
        run: ansible-playbook --syntax-check site.yml

      - name: Test playbook
        run: ansible-playbook site.yml --check
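
If the project declares dependencies in requirements.yml, the workflow will likely need them before linting or running playbooks. A step along these lines, placed after the Install Ansible step, is one way to do that (a fragment, indented to sit under steps):

```yaml
      - name: Install role and collection dependencies
        run: ansible-galaxy install -r requirements.yml
```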

Next Steps

You have now covered Ansible best practices. Next, we turn to troubleshooting.

👉 Troubleshooting