Opened 11 years ago

Closed 11 years ago

#1451 closed defect (duplicate)

gcc-4.1.2 + brcm47xx-2.6 bricks wgt634u (r6543)

Reported by: jhansen@…
Owned by: developers
Priority: high
Milestone: Kamikaze 7.06
Component: kernel
Version:
Keywords: gcc 4.1.2 brcm47xx-2.6 wgt634u trx
Cc:


Recently gcc 4.1.2 was made the default compiler for 2.6 kernels, which is good. However, installing any trx/bin image generated using gcc 4.1.2 will brick your wgt634u, at least if you are using the new brcm47xx-2.6 target (you *have* to on the wgt634u, see bug 1312). Shortly after preinit finishes, dozens of these errors are printed:

SQUASHFS error: lzma returned unexpected result 0x1
SQUASHFS error: Unable to read fragment cache block [3e533]
SQUASHFS error: Unable to read page, block 3e533, size 46e5

I've found that if I compile my kernel with gcc 4.1.1, everything is fine. Everything is also fine if I compile only the contents of fs with gcc 4.1.1 and everything else with 4.1.2. It's not squashfs or mini_fo that's causing the problem (if I compile just those with 4.1.1, it's still broken); it seems to be something directly in fs/ that's getting compiled differently and affecting squashfs reads. Anyway, I'll post something when I (hopefully) fix the problem, unless someone else has some insight.

Attachments (1)

nodemgmt.s.diff (19.7 KB) - added by jhansen@… 11 years ago.
Difference in assembly of fs/jffs2/nodemgmt.s for 4.1.2 vs. 4.1.1


Change History (15)

comment:1 Changed 11 years ago by jhansen@…

The problem also goes away if you disable inotify support.

comment:2 Changed 11 years ago by nbd

  • Milestone changed from Kamikaze to Kamikaze Milestone 1

comment:3 Changed 11 years ago by anonymous

I've isolated the problem to fs/jffs2/nodemgmt.c. When it is compiled with 4.1.1, everything is fine; when it is compiled with 4.1.2, it bricks the wgt634u. The preprocessor output is exactly the same, so the compiler is just generating different assembly for each. I've attached the diff of the assembly.

Changed 11 years ago by jhansen@…

Difference in assembly of fs/jffs2/nodemgmt.s for 4.1.2 vs. 4.1.1

comment:4 Changed 11 years ago by nbd

I don't think it's related to 4.1.2 vs 4.1.1. I've had the same problem with gcc 4.1.1 previously...
I think I will disable inotify support for now...

comment:5 Changed 11 years ago by jhansen@…

It appears to be a race condition or a timing problem. I don't think it directly relates to inotify either. Hopefully I'll be able to figure it out soon.

comment:6 Changed 11 years ago by nbd

  • Resolution set to fixed
  • Status changed from new to closed

brcm47xx-2.6 seems to be fully working as of [6564]

comment:7 Changed 11 years ago by jhansen@…

Thanks for closing out bug 1312. I believe we should still look into this problem, however, since the real problem still persists under certain circumstances (inotify is not the real problem, so I'd actually like to see that turned back on at some point).

Squashfs recently released squashfs 3.2, which supposedly works around a prefetch bug in mips' memcpy (see …). This simply works around whatever "prefetching bugs" are in memcpy, without solving the real problem. The brcm47xx-2.6 target doesn't even use prefetching, since it's configured as !CONFIG_DMA_COHERENT, so I think it's just luck that it works any better.

I think there are either 1) bugs in mips' memcpy.S, or 2) bugs in which registers mips' memcpy.S uses, or 3) bugs in the brcm47xx cache that are exposed by the sequence of loads and stores in mips' memcpy.S. Perhaps this is a bug that should go on the mips-linux mailing list, but I'd bet it's a brcm47xx-specific bug, looking at how few changes have been made to that code lately.

I believe that fixing this bug could possibly fix other completely random crashes that seem to happen once in a while.

comment:8 Changed 11 years ago by jhansen@…

For my specific situation, I can fix everything by adding 10 ssnop instructions to arch/mips/lib/memcpy.S, line 227 (where the .align is). This leads me to believe that this is still a race condition, and not a pipeline issue.

comment:9 Changed 11 years ago by jhansen@…

  • Resolution fixed deleted
  • Status changed from closed to reopened

Please re-open.

comment:11 Changed 11 years ago by jhansen@…

Regarding the e-mail by Greg Nutt, we don't even have prefetching in memcpy.S enabled for brcm47xx-2.6 (CONFIG_DMA_COHERENT is not set, thus PREF(...) = blank), so I don't feel that prefetching is the problem. I'm trying to write a test-case so that we can generate this problem at will. I think that memcpy'ing different portions of the flash from two ready tasks will easily reproduce this problem.
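A rough user-space sketch of the kind of test case described above: two threads repeatedly memcpy different regions of one buffer and check the result against a known pattern. All names here (pattern_byte, copy_job, copy_worker, run_copy_stress) are hypothetical, and the RAM buffer only stands in for the flash. On the device the copies would have to go through the kernel's arch/mips/lib/memcpy.S and read from the real flash mapping, so this shows the shape of the test rather than a reproducer.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Known fill pattern, so a copy can be checked without re-reading the source. */
static uint8_t pattern_byte(size_t offset)
{
    return (uint8_t)(offset * 31u + 7u);
}

/* One copy task (hypothetical layout): which region to copy, and how often. */
struct copy_job {
    const uint8_t *image;   /* stand-in for the flash mapping */
    size_t base;            /* start offset of this task's region */
    size_t len;             /* region length in bytes */
    int iterations;         /* number of copies to perform */
    int mismatches;         /* out: corrupted copies, or -1 on setup failure */
};

static void *copy_worker(void *arg)
{
    struct copy_job *job = arg;
    uint8_t *dst = malloc(job->len);

    if (dst == NULL) {
        job->mismatches = -1;
        return NULL;
    }
    for (int i = 0; i < job->iterations; i++) {
        memset(dst, 0, job->len);
        memcpy(dst, job->image + job->base, job->len);
        /* Count this copy once if any byte differs from the pattern. */
        for (size_t k = 0; k < job->len; k++) {
            if (dst[k] != pattern_byte(job->base + k)) {
                job->mismatches++;
                break;
            }
        }
    }
    free(dst);
    return NULL;
}

/* Returns the number of corrupted copies seen by both tasks, or -1 on error. */
static int run_copy_stress(size_t region_len, int iterations)
{
    uint8_t *image = malloc(2 * region_len);
    struct copy_job jobs[2];
    pthread_t threads[2];
    int started = 0;
    int result = 0;

    if (image == NULL)
        return -1;
    for (size_t i = 0; i < 2 * region_len; i++)
        image[i] = pattern_byte(i);

    /* Task 0 copies the first half, task 1 the second half. */
    for (int t = 0; t < 2; t++) {
        jobs[t] = (struct copy_job){ image, (size_t)t * region_len,
                                     region_len, iterations, 0 };
        if (pthread_create(&threads[t], NULL, copy_worker, &jobs[t]) != 0) {
            result = -1;
            break;
        }
        started++;
    }
    for (int t = 0; t < started; t++) {
        pthread_join(threads[t], NULL);
        if (result >= 0) {
            if (jobs[t].mismatches < 0)
                result = -1;
            else
                result += jobs[t].mismatches;
        }
    }
    free(image);
    return result;
}
```

Calling run_copy_stress(4096, 1000) from a small main() and printing the return value would be enough to run it; any non-zero result means a copy came back corrupted or the setup failed.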

comment:12 Changed 11 years ago by nbd

From my understanding of the code, PREF() will *NOT* be blank. It depends on the CONFIG_CPU_HAS_PREFETCH option, which is enabled...

Check include/asm-mips/asm.h...

comment:13 Changed 11 years ago by nbd

Oh, sorry. I missed the part where memcpy.S overrides that config option ...
The code is a bit convoluted :)
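The override discussed in comments 11 to 13 can be illustrated with a small stand-alone sketch. It assumes the mechanism is as described there: a globally enabled prefetch option that one file undefines when DMA is not coherent, so that its PREF() macro expands to nothing. The DEMO_ names are hypothetical stand-ins, not the real kernel symbols, and the macro produces a string only so that the expansion can be inspected from C.

```c
#include <string.h>

/* Hypothetical stand-in for the globally enabled prefetch option. */
#define DEMO_CPU_HAS_PREFETCH 1
/* DEMO_DMA_COHERENT is deliberately left undefined, as on brcm47xx-2.6. */

/* File-local override: without coherent DMA, drop the prefetch option
 * for this file only, whatever the global configuration says. */
#ifndef DEMO_DMA_COHERENT
#undef DEMO_CPU_HAS_PREFETCH
#endif

/* The macro is defined after the override, so it sees the local state. */
#ifdef DEMO_CPU_HAS_PREFETCH
#define DEMO_PREF(hint, addr) "pref " #hint ", " #addr
#else
#define DEMO_PREF(hint, addr) ""
#endif

/* Returns what DEMO_PREF expands to in this translation unit. */
static const char *demo_pref_expansion(void)
{
    return DEMO_PREF(0, src);
}
```

With DEMO_DMA_COHERENT undefined, demo_pref_expansion() returns an empty string even though the prefetch option was enabled at the top, which matches the "PREF(...) = blank" reading in comment 11.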

comment:14 Changed 11 years ago by mbm

  • Resolution set to duplicate
  • Status changed from reopened to closed

Moving this to #1465 since the symptoms seem to have changed.
