[plug] Memory testing
Craig Foster
fostware at iinet.net.au
Mon Mar 3 03:38:26 WST 2003
It looks like, under load, you have either a bad RAM buffer or a faulty
address line (read: dead stick). Notice how the faulty addresses look
*really* similar :)
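To compare the "Bank 2" values bit by bit rather than by eye, here is a rough Python sketch. It assumes the 64-bit values follow the architectural machine-check status register layout documented by Intel (flag bits 63 down to 57, error code in the low 16 bits); the CPU model isn't stated in the thread, so treat that as an assumption. The names decode_status and error_code are made up for illustration.

```python
# Flag bits of the machine-check status register, following the
# architectural IA32_MCi_STATUS layout (ASSUMED to apply to this CPU).
FLAGS = {
    63: "VAL",    # register contents are valid
    62: "OVER",   # an earlier logged error was overwritten
    61: "UC",     # error was NOT corrected by the hardware
    60: "EN",     # reporting of this error was enabled
    59: "MISCV",  # the MISC register holds extra information
    58: "ADDRV",  # the ADDR register holds a valid address
    57: "PCC",    # processor context corrupt
}


def decode_status(status):
    """Return the names of the flag bits set in a 64-bit status value."""
    return [name for bit, name in FLAGS.items() if (status >> bit) & 1]


def error_code(status):
    """Return the low 16 bits of the status value (the error code field)."""
    return status & 0xFFFF


# Status values copied from the quoted log below.
for value in (0xd40040000000017a, 0x940040000000017a, 0xb60020000000017a):
    print(f"{value:016x}", decode_status(value), hex(error_code(value)))
```

Under that assumed layout, all three values carry the same error code (0x17a), the first two have UC clear (matching "non fatal, correctable"), and the last has both UC and PCC set (matching "CPU context corrupt").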
If you have at least two memory sticks (which you should have if you
have Bank 2), swap the memory sticks and see whether the failure
addresses stay high or whether they drop closer to half of what they
are now. From the looks of it, the error is in the second stick
(Bank 0,1 = 1st stick; Bank 2,3 = 2nd stick, etc.). Programs like aida32
(http://www.aida32.hu) can usually tell you who actually made the stick
too.
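The reasoning behind the swap test can be sketched in Python as below. This is only an illustration: it assumes the sticks are mapped back to back in physical address order with no interleaving, and the two 128 MB stick sizes are hypothetical (the thread doesn't say what is installed). The function name locate is made up.

```python
MIB = 1024 * 1024


def locate(address, stick_sizes_mib):
    """Return (stick_index, offset_within_stick) for a physical address.

    ASSUMES sticks occupy consecutive address ranges in slot order,
    with no interleaving and no memory holes.
    """
    base = 0
    for index, size_mib in enumerate(stick_sizes_mib):
        size = size_mib * MIB
        if address < base + size:
            return index, address - base
        base += size
    raise ValueError("address is beyond the installed memory")


FAULT = 0x0ad31280       # address reported in the quoted log
sticks = [128, 128]      # HYPOTHETICAL sizes in MiB, for illustration only

index, offset = locate(FAULT, sticks)
print(f"fault falls in stick {index + 1} at offset {offset:#010x}")

# If stick 2 really is the bad one, swapping it into the first slot should
# move the failing address down to roughly that offset.
print(f"expected address after swap: {offset:#010x}")
```

With those assumed sizes the reported address lands in the second stick, and after a swap the same cell would show up at a noticeably lower address. If the failing addresses stay where they are after the swap, the stick is less likely to be the culprit.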
Most (decent) memory has a lifetime warranty now, so why not just send
it back to the distributor with this output and get them to replace
the stick?
Other than the big boys (like SIMMS Int'l), most memory disties don't
test memory beyond putting it in another machine and running memtest,
whatmem, or TuffTest Pro. (Trust me, this was a side job I used to do
for a local memory distributor and one of their Eastern States
competitors.)
Regards,
Craig Foster
fostware at iinet.net.au (with SMIME)
> -----Original Message-----
> From: Craig Dyke [mailto:grail at enterprize.net.au]
> Sent: Sunday, March 02, 2003 10:02 PM
> To: plug at plug.linux.org.au
> Cc: bernard at blackham.com.au
> Subject: Re: [plug] Memory testing
>
>
> Error:
>
> MCE: The hardware reports a non fatal, correctable incident occured on CPU 0.
> Bank 2: d40040000000017a
> Feb 22 08:34:11 Coven kernel: MCE: The hardware reports a non fatal, correctable incident occured on CPU 0.
> Feb 22 08:34:11 Coven kernel: Bank 2: d40040000000017a
> MCE: The hardware reports a non fatal, correctable incident occured on CPU 0.
> Bank 2: 940040000000017a
> Feb 22 08:34:26 Coven kernel: MCE: The hardware reports a non fatal, correctable incident occured on CPU 0.
> Feb 22 08:34:26 Coven kernel: Bank 2: 940040000000017a
> MCE: The hardware reports a non fatal, correctable incident occured on CPU 0.
> Bank 2: d40040000000017a
> Feb 22 08:34:41 Coven kernel: MCE: The hardware reports a non fatal, correctable incident occured on CPU 0.
> Feb 22 08:34:41 Coven kernel: Bank 2: d40040000000017a
> CPU 0: Machine Check Exception: 0000000000000004
> Bank 2: b60020000000017a at 000000000ad31280
> Kernel panic: CPU context corrupt
> Feb 22 08:34:45 Coven kernel: CPU 0: Machine Check Exception: 0000000000000004
> Feb 22 08:34:45 Coven kernel: Bank 2: b60020000000017a at 000000000ad31280
> Feb 22 08:34:45 Coven kernel: Kernel panic: CPU context corrupt
>
> The above are the errors I keep getting while trying to
> compile stuff. If I turn MCE off I eventually get the last 4
> lines anyway. It seems to be unpicky about what I compile, I
> just eventually get a runaway executable (eg. ld, gcc even
> rm) which I cannot kill.
>
> Craig