
I’ve already replaced three hard drives, including a brand new one, but after a while, the disk shows errors and becomes unreadable and unwritable. What on earth is the problem? I’m extremely confused.

The content provided is AI-generated and for reference only:
| Item | Analysis |
|---|---|
| Root Cause | I/O error (-5) during low-level block device read, preventing ext4 htree from reading directory blocks |
| Not a Disk Issue | Problem persists after replacing three disks (including a new one) |
| Likely Root Causes (in order) | Storage controller or driver (e.g., hisi_sas_v3_hw) |
| Filesystem corruption? | A symptom, not the root cause |

Look for hisi_sas_v3_hw or SATA hardreset failed clues, and provide dmesg logs for collaborative analysis.
[details=“Click here to view detailed analysis and resolution”]
Based on the error logs and description provided, the system returns an Input/output error when accessing /data/, and dmesg shows numerous ext4 warnings. The core error is:
EXT4-fs warning (device sda1): htree_dirblock_to_tree:1083: inode #xxxxxx: bblock 0: error -5 reading directory block
Despite replacing three disks (including a brand-new one), the issue recurs. This indicates that the problem is likely not the disk itself, but rather hardware compatibility, driver issues, controller problems, or system-level factors.
htree_dirblock_to_tree is an ext4 function that reads directory blocks for hash-tree (htree) indexed directories, the structure ext4 uses for efficient directory lookups. error -5 corresponds to EIO (Input/Output Error) in Linux, meaning the underlying device failed to complete a read operation. The failure is reported at the ext4 layer but originates in deeper hardware or firmware layers.
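To see which devices are affected and which error code they report, the warning lines can be reduced to a device/errno pair. The following is a minimal sketch: the function name parse_ext4_warning is hypothetical, and the pattern assumes the warning format quoted above.

```shell
# parse_ext4_warning (hypothetical helper): read kernel log lines on stdin
# and print "<device> <errno>" for each ext4 directory-block read failure.
# Assumes the "EXT4-fs warning (device X): ... error N reading directory block"
# format shown in the log line above.
parse_ext4_warning() {
  sed -nE 's/.*EXT4-fs warning \(device ([^)]+)\).*error (-?[0-9]+) reading directory block.*/\1 \2/p'
}
```

A possible use is `dmesg | parse_ext4_warning | sort | uniq -c`, which counts failures per device. A value of -5 in the second column is EIO.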
Although the current query does not contain identical logs, leveraging contextual knowledge allows for informed inference:
You’ve replaced three disks (including a new one), and the issue persists → disk hardware is not the root cause.
Numerous records in the knowledge base are linked to hisi_sas_v3_hw, a Huawei platform SAS controller driver.
Log example:
"hisi_sas_v3_hw has been associated with system-level issues, including erroneous completions during operation, device handling errors..."
If your server uses Huawei hardware (e.g., Kunpeng) or a Hisilicon SAS controller, this driver may have defects causing abnormal read/write requests or packet loss.
Although your logs don’t explicitly show hisi_sas_v3_hw, the I/O error pattern matches closely. Check whether your system uses such a controller.
An Input/output error that persists despite disk replacement suggests the issue lies in the intermediate layer between the disk and the OS. The ls: reading directory: Input/output error on /data/ suggests that the errors reported in dmesg have propagated to the VFS layer, indicating complete device-level failure.
However, the key observation is that the system works temporarily after a new disk is installed and then the issue reappears, so filesystem corruption is a result, not the root cause.
# Identify controller type
lspci | grep -iE 'sata|raid|scsi|storage'
# Check for Hisilicon controller
dmesg | grep -i hisi_sas
# Check device path
udevadm info --query=all --name=/dev/sda | grep ID_PATH
If output includes hisi_sas_v3_hw, upgrade firmware or contact vendor for patched drivers.
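If dmesg has already rotated, the driver behind a disk can also be read from sysfs. The sketch below walks up from the disk's device node and prints every bound driver it finds. The function name disk_driver_chain and its optional second argument (an alternate sysfs root, useful for testing) are hypothetical, and it assumes a Linux system with `readlink -f` available.

```shell
# disk_driver_chain (hypothetical helper): print the kernel drivers bound to
# a block device and to each of its parent devices, nearest first.
#   $1 = block device name, e.g. sda
#   $2 = sysfs root (assumption: defaults to /sys)
disk_driver_chain() {
  dev=$1
  sys=${2:-/sys}
  # Resolve the disk's device node to its real path in the device tree.
  path=$(readlink -f "$sys/block/$dev/device") || return 1
  # Walk towards the filesystem root, printing each "driver" symlink target.
  while [ -n "$path" ] && [ "$path" != "/" ]; do
    if [ -L "$path/driver" ]; then
      basename "$(readlink -f "$path/driver")"
    fi
    path=$(dirname "$path")
  done
}
```

Running `disk_driver_chain sda` typically prints the disk driver first and the controller driver after it. If hisi_sas_v3_hw appears in that list, the disk sits behind the Hisilicon controller.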
# Install smartctl
yum install smartmontools -y # CentOS/RHEL
apt install smartmontools -y # Ubuntu/Debian
# Check SMART status (even new disks may show anomalies due to controller issues)
smartctl -a /dev/sda
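The full report is long, so it can help to filter it down to the attributes that matter most for this kind of fault. This is a minimal sketch: the function name smart_flags is hypothetical, and it assumes the ATA attribute table layout where the raw value is the tenth column. SAS drives report health differently and will produce no output here.

```shell
# smart_flags (hypothetical helper): read smartctl attribute output on stdin
# and print "<attribute> <raw value> <ok|CHECK>" for three key attributes.
# Assumes the ATA attribute table format, with RAW_VALUE in column 10.
smart_flags() {
  awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|UDMA_CRC_Error_Count)$/ {
         status = ($10 + 0 > 0) ? "CHECK" : "ok"
         print $2, $10, status
       }'
}
```

For example, `smartctl -A /dev/sda | smart_flags` prints one line per attribute found. A nonzero UDMA_CRC_Error_Count on a brand-new disk points at the cable or interface, not at the disk.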
Pay attention to:
- SMART overall-health self-assessment test
- Reallocated_Sector_Ct
- Current_Pending_Sector
- UDMA_CRC_Error_Count (high values suggest cable/interface issues)

Goal: Isolate whether the issue lies in cables, ports, RAID card, or elsewhere.
Boot a Live system, mount /dev/sda1, and attempt to access /data.
If errors appear in the Live system → the problem is in the hardware chain.
If no errors appear → the original OS has software/driver issues.
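In either environment, a read-only test of the raw device takes the filesystem out of the picture. This goes slightly beyond the step above and is only a sketch: the function name raw_read_test is hypothetical, the default of 256 MiB is an arbitrary assumption, and GNU dd is assumed for the `bs=1M` syntax. It only reads, so it does not modify the disk.

```shell
# raw_read_test (hypothetical helper): sequentially read from a device or
# file and report whether every read succeeded.
#   $1 = path to read, e.g. /dev/sda
#   $2 = number of MiB to read (assumption: defaults to 256)
raw_read_test() {
  target=$1
  mb=${2:-256}
  # Discard the data; only dd's exit status matters.
  if dd if="$target" of=/dev/null bs=1M count="$mb" 2>/dev/null; then
    echo "read ok: $target"
  else
    echo "read FAILED: $target"
    return 1
  fi
}
```

If `raw_read_test /dev/sda` fails in the Live system too, the fault sits below the filesystem, which is consistent with a controller, cable, or driver problem.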
Check logs for:
dmesg | grep -i "bad_read_return\|buffer i/o error\|abnormal completion"
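When the log is noisy, a per-device tally shows whether the errors follow one disk or every disk on the controller. A minimal sketch follows: the function name io_errors_by_dev is hypothetical, and it assumes the common kernel message shapes "I/O error, dev X" and "Buffer I/O error on dev X". Other message formats are ignored.

```shell
# io_errors_by_dev (hypothetical helper): read kernel log lines on stdin and
# print "<device> <count>" for block-layer I/O error messages.
# Assumes messages of the form "I/O error, dev sda, ..." or
# "Buffer I/O error on dev sda1, ...".
io_errors_by_dev() {
  sed -nE 's/.*[Ii]\/[Oo] error(,| on) dev ([[:alnum:]]+).*/\2/p' |
    sort | uniq -c | awk '{print $2, $1}'
}
```

With `dmesg | io_errors_by_dev`, errors spread across several devices on the same controller would support the controller or driver explanation over a single bad disk.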
BAD_READ_RETURN has been linked to zfs and raid-lvm configurations, suggesting certain storage stacks may silently corrupt data. If /data uses LVM, ZFS, or RAID, verify configuration correctness.
Special Reminder: If running in a virtualized environment, check the host’s underlying storage for slow disks, I/O congestion, or storage pool corruption (refer to “Huawei Cloud GaussDB Slow Disk Identification Practice”).
Providing complete lspci output and smartctl results will enable more precise root-cause analysis.