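TL;DR:
Possibly an SELinux failure during the upgrade from 22.03 SP3/SP4 to 24.03 is what breaks the system on the first reboot after the upgrade?
The fix: chroot into the broken disk from a live CD, reinstall every package and take SELinux out of enforcing mode (set it to permissive), reboot, reinstall the SELinux-related packages, and finally let SELinux relabel the entire filesystem.
Symptoms:
I first upgraded from 22.03 SP3 to SP4 without problems, then changed the repository URLs and upgraded to 24.03.
During that upgrade, however, the first thing it reported was missing files: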
Running scriptlet: selinux-policy-targeted-40.7-2.oe2403.noarch 665/3030
uavc: op=load_policy lsm=selinux seqno=2 res=1Regex version mismatch, expected: 10.39 2021-10-29 actual: 10.42 2022-12-11
Regex version mismatch, expected: 10.39 2021-10-29 actual: 10.42 2022-12-11
Running scriptlet: kmod-kvdo-8.2.1.2-4.oe2403.x86_64 986/3030
+ /usr/sbin/dkms --rpm_safe_upgrade add -m kmod-kvdo -v 8.2.1.2-4
Creating symlink /var/lib/dkms/kmod-kvdo/8.2.1.2-4/source -> /usr/src/kmod-kvdo-8.2.1.2-4
+ /usr/sbin/dkms --rpm_safe_upgrade build -m kmod-kvdo -v 8.2.1.2-4
Sign command: /lib/modules/5.10.0-232.0.0.131.oe2203sp4.x86_64/build/scripts/sign-file
Signing key: /var/lib/dkms/mok.key
Public certificate (MOK): /var/lib/dkms/mok.pub
Certificate or key are missing, generating self signed certificate for MOK...
Building module:
Cleaning build area...
make -j2 KERNELRELEASE=5.10.0-232.0.0.131.oe2203sp4.x86_64 -C /lib/modules/5.10.0-232.0.0.131.oe2203sp4.x86_64/build M=/var/lib/dkms/kmod-kvdo/8.2.1.2-4/build...(bad exit status: 2)
Error! Bad return status for module build on kernel: 5.10.0-232.0.0.131.oe2203sp4.x86_64 (x86_64)
Consult /var/lib/dkms/kmod-kvdo/8.2.1.2-4/build/make.log for more information.
+ /usr/sbin/dkms --rpm_safe_upgrade install -m kmod-kvdo -v 8.2.1.2-4
Sign command: /lib/modules/5.10.0-232.0.0.131.oe2203sp4.x86_64/build/scripts/sign-file
Signing key: /var/lib/dkms/mok.key
Public certificate (MOK): /var/lib/dkms/mok.pub
Building module:
Cleaning build area...
make -j2 KERNELRELEASE=5.10.0-232.0.0.131.oe2203sp4.x86_64 -C /lib/modules/5.10.0-232.0.0.131.oe2203sp4.x86_64/build M=/var/lib/dkms/kmod-kvdo/8.2.1.2-4/build...(bad exit status: 2)
Error! Bad return status for module build on kernel: 5.10.0-232.0.0.131.oe2203sp4.x86_64 (x86_64)
Consult /var/lib/dkms/kmod-kvdo/8.2.1.2-4/build/make.log for more information.
warning: %post(kmod-kvdo-8.2.1.2-4.oe2403.x86_64) scriptlet failed, exit status 10
Error in POSTIN scriptlet in rpm package kmod-kvdo
Running scriptlet: libwbclient-4.17.5-12.oe2203sp4.x86_64 2002/3030
/sbin/ldconfig: /usr/lib64/libproxy.so.1 is not a symbolic link
Cleanup : libwbclient-4.17.5-12.oe2203sp4.x86_64 2002/3030
warning: file /usr/lib64/samba/wbclient/libwbclient.so.0.15: remove failed: No such file or directory
warning: file /usr/lib64/samba/wbclient/libwbclient.so.0: remove failed: No such file or directory
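After rebooting into 24.03, the system either hung during boot or hit a kernel panic, and was unusable.
I have seen several people on the forum report similar problems. One possible difference in my case: I had installed the DDE desktop on 22.03 SP3, although the default target was still multi-user.target.
Troubleshooting:
Once the system failed to boot after the upgrade to 24.03, I worked through it as follows:
1. Boot a live CD of any Linux distribution, mount the broken 24.03 disk (assume it is mounted at /media/user/dddddddddd), and chroot into it: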
mount -o rbind /dev /media/user/dddddddddd/dev
mount -t proc none /media/user/dddddddddd/proc
mount -o bind /sys /media/user/dddddddddd/sys
mount -o bind /tmp /media/user/dddddddddd/tmp
cp /etc/resolv.conf /media/user/dddddddddd/etc/resolv.conf
chroot /media/user/dddddddddd
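For anyone repeating this, here is a minimal sketch that only prints the commands above for a given mount point, so they can be reviewed before being run as root. The helper name print_chroot_cmds is made up for this post, and the mount point is just the placeholder used above.

```shell
#!/bin/sh
# Print (not run) the bind-mount and chroot commands for a given mount point.
# print_chroot_cmds is a hypothetical helper name, not part of any tool.
print_chroot_cmds() {
    root=${1:?usage: print_chroot_cmds MOUNTPOINT}
    root=${root%/}   # drop one trailing slash so the paths join cleanly
    printf 'mount -o rbind /dev %s/dev\n' "$root"
    printf 'mount -t proc none %s/proc\n' "$root"
    printf 'mount -o bind /sys %s/sys\n' "$root"
    printf 'mount -o bind /tmp %s/tmp\n' "$root"
    printf 'cp /etc/resolv.conf %s/etc/resolv.conf\n' "$root"
    printf 'chroot %s\n' "$root"
}

# Example: show the commands for the placeholder mount point from this post.
print_chroot_cmds /media/user/dddddddddd
```

The output is not quoted, so a mount point containing spaces would need manual quoting before the commands are pasted into a root shell.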
2. Reinstall all packages:
dnf reinstall --refresh $(rpm -qa)
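One thing to watch for: rpm -qa also lists gpg-pubkey-* entries (imported signing keys), which do not come from any repository, so dnf cannot reinstall them. A small sketch of filtering the list first, assuming that is the only kind of entry that needs dropping; filter_reinstallable is a hypothetical helper name.

```shell
#!/bin/sh
# Read package names on stdin, drop the gpg-pubkey pseudo-packages and
# remove duplicates. filter_reinstallable is a hypothetical helper name.
filter_reinstallable() {
    grep -v '^gpg-pubkey' | sort -u
}

# Intended use inside the chroot (left commented out so nothing runs by accident):
# dnf reinstall --refresh $(rpm -qa | filter_reinstallable)
```

Packages that are installed but no longer available in the 24.03 repositories may still make the transaction fail; this sketch does not try to detect those.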
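3. After rebooting into 24.03, I could not log in: after entering the username and password, I was returned to the login prompt.
4. I booted the live CD again. /var/log/messages on the broken disk contained the following: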
type=AVC msg=audit(1729844376.924:155): avc: denied { transition } for pid=6432 comm="(systemd)" path="/usr/lib/systemd/systemd" dev="dm-0" ino=1078989 scontext=system_u:system_r:kernel_t:s0 tcontext=unconfined_u:unconfined_r:unconfined_t:s0 tclass=process permissive=0
type=AVC msg=audit(1729844376.942:160): avc: denied { transition } for pid=6439 comm="login" path="/usr/bin/bash" dev="dm-0" ino=1048733 scontext=system_u:system_r:kernel_t:s0 tcontext=unconfined_u:unconfined_r:unconfined_t:s0 tclass=process permissive=0
I then changed SELINUX=enforcing to SELINUX=permissive in /etc/selinux/config on the broken disk. After a reboot, 24.03 came up normally.
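The edit can be done by hand in any editor. As a rough sketch, the following does the same thing from the live CD while keeping a backup copy. set_permissive is a hypothetical helper name, and the path is passed explicitly so that it points at the broken disk's config rather than the live CD's own.

```shell
#!/bin/sh
# Switch SELINUX=... to permissive in the given config file, keeping a backup.
# set_permissive is a hypothetical helper name.
set_permissive() {
    cfg=${1:?usage: set_permissive /path/to/etc/selinux/config}
    cp -p "$cfg" "$cfg.orig" || return 1
    # Rewrite only the uncommented SELINUX= line; SELINUXTYPE= does not match
    # the pattern and is left untouched.
    sed 's/^SELINUX=.*/SELINUX=permissive/' "$cfg.orig" > "$cfg"
}

# Example (not executed here):
# set_permissive /media/user/dddddddddd/etc/selinux/config
```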
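5. At this point SELinux on 24.03 was still in an abnormal state. Running audit2allow showed that a large number of SELinux rules were missing:
Command: ausearch -m AVC | audit2allow
Output (partial):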
#============= kernel_t ==============
allow kernel_t sshd_net_t:process dyntransition;
allow kernel_t unconfined_t:process { dyntransition transition };
Command: audit2allow -w -a
Output (partial):
type=AVC msg=audit(1689734806.566:106): avc: denied { write } for pid=2211 comm="onboard" name="dbus-CrRLvzOqua" dev="tmpfs" ino=28 scontext=system_u:system_r:xdm_t:s0-s0:c0.c1023 tcontext=unconfined_u:object_r:session_dbusd_tmp_t:s0 tclass=sock_file permissive=0
Was caused by:
Unknown - would be allowed by active policy
Possible mismatch between this policy and the one under which the audit message was generated.
Possible mismatch between current in-memory boolean settings vs. permanent ones.
type=AVC msg=audit(1729844951.399:83): avc: denied { transition } for pid=3106 comm="(systemd)" path="/usr/lib/systemd/systemd" dev="dm-0" ino=1078989 scontext=system_u:system_r:kernel_t:s0 tcontext=unconfined_u:unconfined_r:unconfined_t:s0 tclass=process permissive=1
Was caused by:
Missing type enforcement (TE) allow rule.
You can use audit2allow to generate a loadable module to allow this access.
type=AVC msg=audit(1729844980.333:100): avc: denied { dyntransition } for pid=3305 comm="sshd" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:system_r:sshd_net_t:s0 tclass=process permissive=1
Was caused by:
Missing type enforcement (TE) allow rule.
You can use audit2allow to generate a loadable module to allow this access.
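With this many denials it helps to see them grouped. Below is a minimal sketch that counts AVC denials per source type, target type, class and permission, assuming log lines in the format quoted above. summarize_avc is a hypothetical helper name; it reads the files given as arguments, or standard input if there are none.

```shell
#!/bin/sh
# Count AVC denials grouped by "source type -> target type : class { perms }".
# summarize_avc is a hypothetical helper name.
summarize_avc() {
    awk '
    /avc: +denied/ {
        src = tgt = cls = perms = ""
        # The denied permissions sit between "{ " and " }".
        if (match($0, /\{[^}]*\}/))
            perms = substr($0, RSTART + 2, RLENGTH - 4)
        for (i = 1; i <= NF; i++) {
            # A context is user:role:type:level, so the type is field 3.
            if ($i ~ /^scontext=/)      { split($i, a, ":"); src = a[3] }
            else if ($i ~ /^tcontext=/) { split($i, a, ":"); tgt = a[3] }
            else if ($i ~ /^tclass=/)   { cls = substr($i, 8) }
        }
        count[src " -> " tgt " : " cls " { " perms " }"]++
    }
    END { for (k in count) printf "%d\t%s\n", count[k], k }
    ' "$@" | sort -rn
}

# Examples (not executed here):
# ausearch -m AVC | summarize_avc
# summarize_avc /media/user/dddddddddd/var/log/messages
```

Several of the denials quoted above share kernel_t as the source type, and a summary like this makes that kind of pattern easy to spot.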
6. I decided to rebuild the whole SELinux environment. After some searching, I ended up with these commands:
mv /etc/selinux/targeted /etc/selinux/targeted.bak
mv /etc/selinux/config /etc/selinux/config.bak
dnf remove 'selinux-policy*'
dnf install selinux-policy-targeted
dnf install selinux-policy-devel policycoreutils policycoreutils-devel
touch /.autorelabel
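(Note: install selinux-policy-targeted on its own first, then run the remaining commands. They cannot be combined into a single install; I don't know why.)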
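Before rebooting, it may be worth checking that the pieces the relabel depends on are actually in place. A minimal sketch follows, assuming the compiled policy lives at /etc/selinux/targeted/policy/policy.<version>. relabel_ready is a hypothetical helper name; it takes an optional root directory so that it can also be run from a live CD against the mounted disk.

```shell
#!/bin/sh
# Check, relative to a root directory (default /), that the SELinux config,
# the compiled targeted policy and the /.autorelabel flag file are present.
# relabel_ready is a hypothetical helper name.
relabel_ready() {
    root=${1-/}
    root=${root%/}
    ok=0
    [ -f "$root/etc/selinux/config" ] \
        || { echo "missing: /etc/selinux/config"; ok=1; }
    # The compiled policy is expected as policy/policy.<version>.
    ls "$root"/etc/selinux/targeted/policy/policy.* >/dev/null 2>&1 \
        || { echo "missing: targeted binary policy"; ok=1; }
    [ -e "$root/.autorelabel" ] \
        || { echo "missing: /.autorelabel"; ok=1; }
    return "$ok"
}

# Example (not executed here):
# relabel_ready && echo "ready to reboot and relabel"
```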
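7. Reboot. SELinux now relabels the entire filesystem, which can take a long time.
When the relabel finishes, the system reboots again automatically.
After that, the console problem was gone and SSH login worked normally.
8. If the DDE desktop is not installed, the problem is fully resolved at this point; if it is, one more step is needed.
The DDE desktop still would not start at this point. It has to be removed and reinstalled: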
dnf remove 'dde-*' startdde
dnf install dde