So my temporary home office is next to the furnace room. Today I started to hear disk thrashing on my new TNAS, which I found odd as the disks are brand new. I have also noticed random reboots, for no apparent reason, that seem to happen every day, and SSH and the web interface were very sluggish. I finally decided to take a look via SSH, and I also configured syslog to send all logs to my syslog-ng Docker instance. The disks were powering down (standby) and then immediately spinning up again every few seconds.
syslog-ng showed:
Jul 4 09:44:59 TNAS-F2-212 kernel: [ 4868.371742] ata1.00: exception Emask 0x0 SAct 0x4 SErr 0x0 action 0x6
Jul 4 09:44:59 TNAS-F2-212 kernel: [ 4868.373349] ata1.00: waking up from sleep
Jul 4 09:44:59 TNAS-F2-212 kernel: [ 4868.374351] ata1: hard resetting link
Jul 4 09:44:59 TNAS-F2-212 kernel: [ 4868.851737] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 4 09:44:59 TNAS-F2-212 kernel: [ 4868.854405] ata1.00: configured for UDMA/133
Jul 4 09:44:59 TNAS-F2-212 kernel: [ 4868.855491] ata1: EH complete
and it did so for both disks every second or so. Sigh. I got the NAS to avoid the issues of using a USB drive and now this!
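To put a number on "every second or so", here is a rough sketch that counts the "hard resetting link" messages per ATA port in log lines like the ones above. The function names are made up for illustration, and the pattern only assumes the kernel line format shown in this post.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... kernel: [ 4868.374351] ata1: hard resetting link"
RESET_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\] (ata\d+): hard resetting link")


def count_link_resets(lines):
    """Return (resets per port, kernel timestamps per port) for the given log lines."""
    counts = Counter()
    stamps = {}
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            timestamp = float(match.group(1))
            port = match.group(2)
            counts[port] += 1
            stamps.setdefault(port, []).append(timestamp)
    return counts, stamps


def mean_interval(timestamps):
    """Average seconds between consecutive resets, or None with fewer than two."""
    if len(timestamps) < 2:
        return None
    return (timestamps[-1] - timestamps[0]) / (len(timestamps) - 1)
```

Feed it the lines of whatever file your syslog-ng instance writes to (the path depends on your setup), then print the count and mean_interval for each port. A mean interval of a second or two on both ports would match what I was hearing.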
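After a LOT of looking, following Google searches that didn't help, and trying to disable standby with hdparm and smartctl, I ran top and noticed something odd. Whenever I heard the thrashing, /usr/sbin/ter_smartfan was running and it would spawn hdparm -Y on /dev/sda and /dev/sdb. As far as I can tell from the hdparm man page, -Y forces a drive into its lowest-power sleep mode, and a sleeping drive needs a reset before it can be accessed again, which would explain the "waking up from sleep" and "hard resetting link" messages. So I ran /usr/sbin/ter_smartfan manually, and sure enough it ran hdparm -Y every second. I used ps to find the ter_smartfan process and killed it. Now there is no thrashing and no link resetting. SSH and the web interface are snappy, and the logs are not showing any issues.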
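If you want to confirm what is spawning hdparm without staring at top, something like the following sketch could work. It scans /proc for processes with a given command name and reports each one's parent. It assumes a Linux /proc layout and that Python is available on the NAS, which I have not verified; parents_of is just an illustrative name.

```python
import os


def read_text(path):
    """Return the file's contents, or None if it is missing or unreadable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return None


def parents_of(command, proc_root="/proc"):
    """Return sorted (pid, ppid, parent_name) tuples for processes named `command`."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        comm = read_text(os.path.join(proc_root, entry, "comm"))
        if comm is None or comm.strip() != command:
            continue
        # The parent PID is on the "PPid:" line of /proc/<pid>/status.
        status = read_text(os.path.join(proc_root, entry, "status")) or ""
        ppid = None
        for line in status.splitlines():
            if line.startswith("PPid:"):
                ppid = int(line.split()[1])
                break
        parent = None
        if ppid is not None:
            parent = read_text(os.path.join(proc_root, str(ppid), "comm"))
        found.append((int(entry), ppid, parent.strip() if parent else None))
    return sorted(found)
```

Since hdparm exits almost immediately, you would need to call parents_of("hdparm") in a tight loop with a short sleep and print anything it catches. On my box I would expect the parent name to come back as ter_smartfan.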
If I run /usr/sbin/ter_smartfan manually the issues start again. So I plan to NOT do that. I will keep an eye on things and see if I have future problems or if the random reboots stop.
Just thought I'd share for those experiencing the same issue. Perhaps I'm off base with my reasoning or solution. If so, please let me know.
Statistics: Posted by timinator — Yesterday, 22:21