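Cluster: **alien-ack-cluster** (ACK managed edition, Kubernetes v1.26.2). The node pool default-nodepool currently has 3 nodes. If their status shows Unknown, as in the screenshot, fix that first; otherwise Pods cannot be scheduled.
Harbor: **39.106.135.88** (ideally in the same region/VPC as the Jenkins build machine, to avoid timeouts when pulling images over the public internet).
When the console shows a node as Unknown, run the following on any machine that can reach the API: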
kubectl get nodes -o wide
kubectl describe node <node-name>
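Common causes:
| Symptom | Action |
|---|---|
| NotReady + timeout | The node security group does not allow 6443, **10250**; the node cannot reach the API Server over the network |
| Kubelet not started | SSH into the ECS instance and run `systemctl status kubelet` |
| Disk/memory pressure | `df -h`, `free -h`; clean up images or scale up |
| Container runtime | Nodes run containerd 1.6.20; do not mix docker and containerd configuration |
After the fix, the nodes should show Ready, and `kubectl get pods -n kube-system` should show CoreDNS and kube-proxy running normally. If the console's "Cluster Inspection" has reported a CoreDNS anomaly, handle that first.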
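The readiness check above can be scripted. The following is a minimal sketch, not part of the repository: the helper name `not_ready_nodes` is hypothetical, and it assumes the default column layout of `kubectl get nodes --no-headers` (NAME, STATUS, ROLES, AGE, VERSION).

```shell
# Sketch: list nodes whose STATUS column is not exactly "Ready".
# Reads the output of `kubectl get nodes --no-headers` on stdin.
# Note: a cordoned node ("Ready,SchedulingDisabled") is also reported,
# which is usually what you want before scheduling new Pods.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 " " $2 }'
}
```

With cluster access, it would be used as `kubectl get nodes --no-headers | not_ready_nodes`; empty output means every node reports Ready.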
- kubeconfig file: `~/.kube/config-ack-alien`
- Jenkins credential ID: `ack-kubeconfig-alien`
- Pipeline usage: `withCredentials([file(credentialsId: 'ack-kubeconfig-alien', variable: 'KUBECONFIG')]) { sh 'kubectl get ns' }`
- Set alien to private and create a **robot account** (push + pull).
- Create the image pull Secret (see `k8s/examples/secret-harbor.example.yaml`).
- Reference that Secret from `spec.template.spec.imagePullSecrets`.
- With `imagePullSecrets` in place, no `docker login` is needed on the nodes.
- Push the base image `my-openjdk8-ffmpeg:v1` to Harbor, for example:
  - `docker tag my-openjdk8-ffmpeg:v1 39.106.135.88/alien_cloud/base/openjdk8-ffmpeg:v1`
  - `docker push 39.106.135.88/alien_cloud/base/openjdk8-ffmpeg:v1`

Namespace: alien-produ
├── Deployment/gateway (stable, replicas: N)
├── Deployment/gateway-canary (canary, replicas: 1~2; updated only under the canary strategy)
├── Service/gateway → stable Pods
├── Service/gateway-canary → canary Pods
├── Ingress/alien-gateway → main route + Nginx canary annotations
└── ConfigMap/Secret → bootstrap, Jasypt (never commit plaintext passwords to Git)
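The pull Secret referenced by `imagePullSecrets` can also be created from the command line rather than from the example YAML. The sketch below is one way to do that under stated assumptions: the function name, the Secret name `harbor-pull`, and the robot account values are placeholders that do not come from the repository. By default it only prints the command; set `KUBECTL=kubectl` to run it for real.

```shell
# Sketch: build the `kubectl create secret docker-registry` command for Harbor.
# KUBECTL defaults to "echo kubectl", so nothing touches the cluster
# until the caller overrides it with KUBECTL=kubectl.
create_pull_secret() {
  # $1 = Secret name (hypothetical), $2 = robot account user, $3 = robot token
  ${KUBECTL:-echo kubectl} create secret docker-registry "$1" \
    --namespace alien-produ \
    --docker-server=39.106.135.88 \
    --docker-username="$2" \
    --docker-password="$3"
}
```

For example, `create_pull_secret harbor-pull robot-ci "$TOKEN"` prints the full command for review. Passing a token as an argument exposes it in the process list, so a Jenkins credential binding or the YAML manifest may be preferable in production.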
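Java microservice ports match the current production compose setup (gateway 8000, store 50014, …); see each `deployment-*.yaml`.
Config center: if the Nacos used by ACK is still the instance on 39.106.135.88, the Pod environment variables or `bootstrap-prod.yml` must point to a reachable Kubernetes Service address or SLB. They cannot use the Docker container name `nacos`, unless Nacos is also migrated into the same cluster.
ACK console → Applications → Ingress: the Nginx Ingress Controller must already be installed (from the app marketplace, "Nginx Ingress Controller" or "ALB Ingress"). The rest of this document uses Nginx Ingress annotations, consistent with the repository examples.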
flowchart LR
  User[Client] --> Ing[Ingress alien-gateway]
  Ing -->|weight 90%| SvcStable[Service gateway]
  Ing -->|weight 10%| SvcCanary[Service gateway-canary]
  SvcStable --> DepStable[Deployment gateway]
  SvcCanary --> DepCanary[Deployment gateway-canary]
`canary` + `canary-weight` split traffic by percentage.

Initial setup:
1. `kubectl apply -f k8s/examples/namespace.yaml`
2. `kubectl apply -f k8s/examples/deployment-gateway.yaml` (and the stable Service)
3. `kubectl apply -f k8s/examples/deployment-gateway-canary.yaml` + `service-gateway-canary.yaml`
4. `kubectl apply -f k8s/examples/ingress-gateway-canary.example.yaml` (edit host and TLS)

Per release:
1. New image, e.g. `39.106.135.88/alien_cloud/gateway:build-42`
2. Update the `gateway-canary` Deployment image
3. Set `canary-weight` to 10
4. Raise it step by step: 20 → 50 → 100

Example annotations (full version in `ingress-gateway-canary.example.yaml`):
metadata:
annotations:
nginx.ingress.kubernetes.io/canary: "true"
nginx.ingress.kubernetes.io/canary-weight: "10"
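Stepping the weight through a release can be done by rewriting this annotation in place. The following is a minimal sketch, assuming the canary Ingress object is named `alien-gateway-canary`; that name and the helper `set_canary_weight` are illustrative and should be checked against `ingress-gateway-canary.example.yaml`. As written it only prints the command; set `KUBECTL=kubectl` to apply it.

```shell
# Sketch: set the Nginx Ingress canary weight, rejecting invalid values.
# KUBECTL defaults to "echo kubectl" so the command is printed, not executed.
set_canary_weight() {
  # $1 = canary Ingress name (assumption), $2 = integer weight 0..100
  case "$2" in
    ''|*[!0-9]*) echo "weight must be an integer 0-100" >&2; return 1 ;;
  esac
  [ "$2" -le 100 ] || { echo "weight must be an integer 0-100" >&2; return 1; }
  ${KUBECTL:-echo kubectl} annotate ingress "$1" \
    --namespace alien-produ --overwrite \
    "nginx.ingress.kubernetes.io/canary-weight=$2"
}
```

A release would then call `set_canary_weight alien-gateway-canary 10`, observe, and repeat with 20, 50, and 100. Setting the weight to 0 sends all traffic back to the stable Service.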
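| Approach | Kubernetes built-in RollingUpdate | Ingress canary (this approach) |
|---|---|---|
| Traffic | Pods are replaced one by one; old-version Pods gradually decrease | Old and new versions serve traffic at the same time, with an adjustable ratio |
| Rollback | `kubectl rollout undo` | Set `canary-weight` to 0, or roll back the canary Deployment |
| Jenkins | `DEPLOY_STRATEGY=rolling` | `DEPLOY_STRATEGY=canary` |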
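The cluster total is currently about 25 cores / 45Gi (console resource monitoring). If each of the 7 Java services runs 2 replicas plus 1 canary replica, the peak Pod count is about 7×3=21, so per-Pod requests must be kept in check (the example manifests use 512Mi/250m) to avoid Pods stuck in Pending.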
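The estimate can be worked through explicitly. This sketch only multiplies the figures quoted above (21 Pods, 512Mi and 250m per Pod); it ignores system Pods, limits, and per-node fragmentation, so treat the result as a lower bound on what the cluster must reserve.

```shell
# Worked arithmetic for the peak request estimate (integer Mi and millicores).
services=7            # from the 7×3 estimate in the text
pods_per_service=3    # 2 stable replicas + 1 canary replica
mem_request_mi=512    # per-Pod memory request in the example manifests
cpu_request_m=250     # per-Pod CPU request in the example manifests

peak_pods=$((services * pods_per_service))
total_mem_mi=$((peak_pods * mem_request_mi))
total_cpu_m=$((peak_pods * cpu_request_m))

echo "peak pods: $peak_pods"
echo "memory requests: ${total_mem_mi}Mi"
echo "cpu requests: ${total_cpu_m}m"
```

That comes to 10752Mi (10.5Gi) of memory and 5250m (5.25 cores) of CPU in requests at peak, which fits within the quoted cluster total as long as requests stay at the example values.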
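Scheduling advice:
- If `metrics.k8s.io` is unavailable, install metrics-server first, then enable HPA.

| Compose (?9.105.153.68 produ) | ACK |
|---|---|
| `gateway-produ` container | Deployment `gateway` |
| jar mounted as a volume | jar built into the image (this pipeline's Dockerfile) |
| `common-network-produ` | ClusterIP Service + Ingress |
| Jasypt environment variables | Secret `alien-jasypt` or an ACK configuration item |
| Payment certificate directory | Secret volume `alien-pay-cert-store`, etc. |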
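Suggested migration order: **gateway → store → the rest**. For each migrated service, switch one Ingress path or subdomain, and keep the compose rollback path in place until the service is stable.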
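| Goal | Path |
|---|---|
| Workloads / Deployment | Cluster → Workloads → Stateless |
| Canary / Ingress | Cluster → Network → Ingress |
| Image pull failure events | Workloads → Pods → Events |
| Logs | Workloads → Pods → Logs; or integrate with SLS |
| Node pool scale-out | Node Management → default-nodepool → Scale out |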
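1. `kubectl apply` the example manifests once.
2. Run the `gateway-k8s` Job (rolling; a weight of 100% is equivalent to a full rollout).
3. Run with canary + `CANARY_WEIGHT=5` to verify traffic splitting.

Troubleshooting: `kubectl describe pod -n alien-produ`, `kubectl logs -n alien-produ deploy/gateway-canary`.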