Changeset 3429
Timestamp: 10/07/11 19:32:18 (20 months ago)
Location: trunk/smartmontools
Files: 4 modified
- CHANGELOG (modified) (1 diff)
- NEWS (modified) (1 diff)
- smartd.conf.5.in (modified) (5 diffs)
- smartd.cpp (modified) (8 diffs)
trunk/smartmontools/CHANGELOG
--- trunk/smartmontools/CHANGELOG (r3428)
+++ trunk/smartmontools/CHANGELOG (r3429)
@@ -41,4 +41,6 @@
 <DEVELOPERS: ADDITIONS TO THE CHANGE LOG GO JUST BELOW HERE, PLEASE>
 
+  [CF] smartd: Resend warning emails if problem reappears (ticket #167).
+
   [CF] smartd: Add separate directives '-l offlinests' and '-l selfteststs'
        to enable tracking of status changes. Disable '-l offlinests' by
trunk/smartmontools/NEWS
--- trunk/smartmontools/NEWS (r3428)
+++ trunk/smartmontools/NEWS (r3429)
@@ -9,4 +9,5 @@
 Summary: smartmontools release 5.42
 -----------------------------------------------------------
+- smartd resends warning emails if problem reappears.
 - smartd directives '-l offlinests' and '-l selfteststs'.
 - Platform-specific man pages.
trunk/smartmontools/smartd.conf.5.in
--- trunk/smartmontools/smartd.conf.5.in (r3428)
+++ trunk/smartmontools/smartd.conf.5.in (r3429)
@@ -597,5 +597,7 @@
 
 [ATA only] Failed self-tests outdated by a newer successful extended
-self\-test are ignored.
+self\-test are ignored. The warning email counter is reset if the
+number of failed self tests dropped to 0. This typically happens when
+an extended self\-test is run after all bad sectors have been reallocated.
 
 .I offlinests
@@ -847,4 +849,7 @@
 type of disk problem detected. Each interval is twice as long as the
 previous interval.
+
+If a disk problem is no longer detected, the internal email counter is
+reset. If the problem reappears a new warning email is sent immediately.
 
 In addition, one may add zero or more of the following Directives:
@@ -1131,4 +1136,8 @@
 See also \'\-v 197,increasing\' below.
 
+The warning email counter is reset if the number of pending sectors
+dropped to 0. This typically happens when all pending sectors have
+been reallocated or could be read again.
+
 A pending sector is a disk sector (containing 512 bytes of your data)
 which the device would like to mark as ``bad" and reallocate.
@@ -1159,4 +1168,8 @@
 See also \'\-v 198,increasing\' below.
 
+The warning email counter is reset if the number of offline uncorrectable
+sectors dropped to 0. This typically happens when all offline uncorrectable
+sectors have been reallocated or could be read again.
+
 An offline uncorrectable sector is a disk sector which was not
 readable during an off\-line scan or a self\-test. This is important
@@ -1174,4 +1187,7 @@
 will be send if '-m' is specified. If only the limit \fBINFO\fP is
 reached, a message with loglevel \fB\'LOG_INFO\'\fP will be logged.
+
+The warning email counter is reset if the temperature dropped below
+\fBINFO\fP or \fBCRIT\fP-5 if \fBINFO\fP is not specified.
 
 If this directive is used in conjunction with state persistence
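The temperature rule added in the last hunk of the man page can be sketched as a small standalone function. This is an illustrative sketch, not code from smartd: the names `reset_limit` and `should_reset` are hypothetical, and a value of 0 is assumed to mean "limit not specified". It follows the documented rule (reset below INFO, or below CRIT-5 if INFO is not specified) and assumes, as the `else if (cfg.tempcrit)` branch in smartd.cpp suggests, that nothing is reset when no CRIT limit is configured.

```cpp
#include <cassert>

// Hypothetical helper: temperature (Celsius) below which the warning
// email counter would be reset. 0 means "limit not specified".
inline int reset_limit(int tempinfo, int tempcrit)
{
  assert(tempinfo >= 0 && tempcrit >= 0);
  // Use INFO if it was given, otherwise fall back to CRIT minus 5.
  return tempinfo ? tempinfo : tempcrit - 5;
}

// Hypothetical helper: true if the current temperature has dropped far
// enough for the counter to be reset. Assumes no reset without CRIT.
inline bool should_reset(int currtemp, int tempinfo, int tempcrit)
{
  if (!tempcrit)
    return false;
  return currtemp < reset_limit(tempinfo, tempcrit);
}
```

With CRIT set to 55 and no INFO limit, the sketch resets at 49 Celsius but not at 50. The 5 degree gap keeps a temperature hovering around CRIT from triggering a reset and a fresh email on every check.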
trunk/smartmontools/smartd.cpp
--- trunk/smartmontools/smartd.cpp (r3428)
+++ trunk/smartmontools/smartd.cpp (r3429)
@@ -942,5 +942,5 @@
     return;
   }
-  
+
   // Return if a single warning mail has been sent.
   if ((cfg.emailfreq==1) && mail->logged)
@@ -1243,4 +1243,32 @@
   // increment mail sent counter
   mail->logged++;
+}
+
+static void reset_warning_mail(const dev_config & cfg, dev_state & state, int which, const char *fmt, ...)
+  __attribute__ ((format (printf, 4, 5)));
+
+static void reset_warning_mail(const dev_config & cfg, dev_state & state, int which, const char *fmt, ...)
+{
+  if (!(0 <= which && which < SMARTD_NMAIL))
+    return;
+
+  // Return if no mail sent yet
+  mailinfo & mi = state.maillog[which];
+  if (!mi.logged)
+    return;
+
+  // Format & print message
+  char msg[256];
+  va_list ap;
+  va_start(ap, fmt);
+  vsnprintf(msg, sizeof(msg), fmt, ap);
+  va_end(ap);
+
+  PrintOut(LOG_INFO, "Device: %s, %s, warning condition reset after %d email%s\n", cfg.name.c_str(),
+           msg, mi.logged, (mi.logged==1 ? "" : "s"));
+
+  // Clear mail counter and timestamps
+  mi = mailinfo();
+  state.must_write = true;
 }
 
@@ -2296,5 +2324,7 @@
     // command failed
     MailWarning(cfg, state, 8, "Device: %s, Read SMART Self-Test Log Failed", name);
-  else {
+  else {
+    reset_warning_mail(cfg, state, 8, "Read SMART Self-Test Log worked again");
+
     // old and new error counts
     int oldc=state.selflogcount;
@@ -2329,7 +2359,11 @@
 
     // Print info if error entries have disappeared
-    if (oldc > newc)
+    // or newer successful successful extended self-test exits
+    if (oldc > newc) {
       PrintOut(LOG_INFO, "Device: %s, Self-Test Log error count decreased from %d to %d\n",
                name, oldc, newc);
+      if (newc == 0)
+        reset_warning_mail(cfg, state, 3, "Self-Test Log does no longer report errors");
+    }
 
     // Needed since self-test error count may DECREASE. Hour might
@@ -2673,6 +2707,8 @@
   // No report if no sectors pending.
   uint64_t rawval = ata_get_attr_raw_value(smartval.vendor_attributes[i], cfg.attribute_defs);
-  if (rawval == 0)
+  if (rawval == 0) {
+    reset_warning_mail(cfg, state, mailtype, "No more %s", msg);
     return;
+  }
 
   // If attribute is not reset, report only sector count increases.
@@ -2768,4 +2804,9 @@
       cfg.name.c_str(), currtemp, cfg.tempinfo, fmt_temp(state.tempmin, buf), minchg, state.tempmax, maxchg);
   }
+  else if (cfg.tempcrit) {
+    unsigned char limit = (cfg.tempinfo ? cfg.tempinfo : cfg.tempcrit-5);
+    if (currtemp < limit)
+      reset_warning_mail(cfg, state, 12, "Temperature %u Celsius dropped below %u Celsius", currtemp, limit);
+  }
 }
 
@@ -2885,6 +2926,8 @@
     MailWarning(cfg, state, 9, "Device: %s, unable to open device", name);
     return 1;
-  } else if (debugmode)
+  }
+  if (debugmode)
     PrintOut(LOG_INFO,"Device: %s, opened ATA device\n", name);
+  reset_warning_mail(cfg, state, 9, "open device worked again");
 
   // user may have requested (with the -n Directive) to leave the disk
@@ -2989,4 +3032,6 @@
   }
   else {
+    reset_warning_mail(cfg, state, 6, "read SMART Attribute Data worked again");
+
     // look for current or offline pending sectors
     if (cfg.curr_pending_id)
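The effect of `reset_warning_mail()` on the email schedule can be illustrated with a simplified, self-contained model. This is a sketch under stated assumptions, not smartd's implementation: `MailInfo`, `warn`, and `reset` are hypothetical stand-ins, only the `logged` counter is taken from the diff, the doubling schedule follows the man page wording ("each interval is twice as long as the previous interval"), and the one-day base interval is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for smartd's per-problem mail state.
struct MailInfo {
  int logged = 0;          // number of warning emails sent so far
  long long lastsent = 0;  // time of the last email, in seconds
};

// Returns true if a warning email would be sent at time 'now'.
// Assumed schedule: the first email goes out immediately, then the
// waiting interval doubles after every email that is sent.
inline bool warn(MailInfo & mi, long long now, long long base_interval = 24 * 3600)
{
  assert(mi.logged >= 0);
  if (mi.logged > 0) {
    long long interval = base_interval << (mi.logged - 1);
    if (now - mi.lastsent < interval)
      return false;  // still inside the current waiting interval
  }
  mi.logged++;
  mi.lastsent = now;
  return true;
}

// Clears counter and timestamp once the problem is no longer detected.
// Returns false if no email had been sent, mirroring the early return
// in reset_warning_mail().
inline bool reset(MailInfo & mi)
{
  if (!mi.logged)
    return false;
  mi = MailInfo();
  return true;
}
```

In this model a second `warn()` one hour after the first is suppressed, but if `reset()` runs in between because the problem disappeared, the next `warn()` sends immediately. That is the behavior the changeset describes for a problem that reappears.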