Opened 7 years ago

Last modified 4 years ago

#9143 new defect

Kernel oops in preinit after erasing jffs2 partition on ramips rt305x

Reported by: Layne Edwards <ledwards@…> Owned by: developers
Priority: response-needed Milestone: Barrier Breaker 14.07
Component: base system Version: Trunk
Keywords: jffs rootfs_data ramips rt305x Cc:

Description

When restoring to the default configuration by erasing the jffs2 partition (either via 'mtd -r erase rootfs_data' or via LuCI 'reset router to defaults'), I get a kernel oops during preinit after the reboot (see output below). The problem is not evident during firstboot after flashing.

It looks like preinit thinks the jffs2 is ready and tries to use it ('switching to jffs2'), when it should recognize that it is NOT ready and then build it. When uci tries to write to jffs2, the kernel panics.

I'm running on a Ralink RT3052 board (HW550-3G) with the latest trunk. It was recently updated to kernel 2.6.37.4 (with overlayfs instead of mini_fo), but the problem was also evident with 2.6.36.4.

root@OpenWrt:/# mtd -r erase rootfs_data
Unlocking rootfs_data ...
Erasing rootfs_data ...
Rebooting ...
br-lan: port 1(eth0.1) entering forwarding state
device eth0 left promiscuous mode
device eth0.1 left promiscuous mode
br-lan: port 1(eth0.1) entering disabled state
Restarting system.


U-Boot 1.1.3 (Dec 14 2008 - 16:34:00)

Board: Ralink APSoC DRAM:  32 MB
relocate_code Pointer at: 81fac000
flash_protect ON: from 0xBF000000 to 0xBF02006F
protect on 0
protect on 1
protect on 2
protect on 3
protect on 4
protect on 5
protect on 6
protect on 7
protect on 8
protect on 9
flash_protect ON: from 0xBF030000 to 0xBF03FFFF
protect on 10
============================================
Ralink UBoot Version: 3.2
--------------------------------------------
ASIC 3052_MP2 (Port5<->None)
DRAM COMPONENT: 128Mbits
DRAM BUS: 32BIT
Total memory: 32 MBytes
Date:Dec 14 2008  Time:16:34:00
============================================
icache: sets:256, ways:4, linesz:32 ,total:32768
dcache: sets:128, ways:4, linesz:32 ,total:16384

 ##### The CPU freq = 384 MHZ ####

 SDRAM bus set to 32 bit
 SDRAM size =32 Mbytes

Please choose the operation:
   1: Load system code to SDRAM via TFTP.
   2: Load system code then write to Flash via TFTP.
   3: Boot system code via Flash (default).
   4: Entr boot command line interface.
   9: Load Boot Loader code then write to Flash via TFTP.
 0

3: System Boot system code via Flash.
## Booting image at bf050000 ...
   Image Name:   MIPS OpenWrt Linux-2.6.37.4
   Created:      2011-03-29   2:19:49 UTC

 System Control Status = 0x00440000
   Image Type:   MIPS Linux Kernel Image (lzma compressed)
   Data Size:    845755 Bytes = 825.9 kB
   Load Address: 80000000
   Entry Point:  80000000
   Verifying Checksum ... OK
   Uncompressing Kernel Image ... OK
No initrd
## Transferring control to Linux (at address 80000000) ...
## Giving linux memsize in MB, 32

Starting kernel ...

Linux version 2.6.37.4 (ledwards@OpenSuSE.site) (gcc version 4.5.2 (Linaro GCC 4.5-2011.02-0) ) #1 Mon Mar 28 21:19:27 CDT 2011
bootconsole [early0] enabled
CPU revision is: 0001964c (MIPS 24Kc)
Ralink RT3052   id:1 rev:2 running at 384.00 MHz
Determined physical RAM map:
 memory: 02000000 @ 00000000 (usable)
Initrd not found or empty - disabling initrd
Zone PFN ranges:
  Normal   0x00000000 -> 0x00002000
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0: 0x00000000 -> 0x00002000
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 8128
Kernel command line:  board=HW550-3G mtdparts=physmap-flash.0:192k(u-boot)ro,64k(u-boot-env)ro,64k(factory)ro,896k(kernel),692
PID hash table entries: 128 (order: -3, 512 bytes)
Dentry cache hash table entries: 4096 (order: 2, 16384 bytes)
Inode-cache hash table entries: 2048 (order: 1, 8192 bytes)
Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
Primary data cache 16kB, 4-way, VIPT, no aliases, linesize 32 bytes
Writing ErrCtl register=000038b0
Readback ErrCtl register=000038b0
Memory: 29964k/32768k available (1798k kernel code, 2804k reserved, 432k data, 144k init, 0k highmem)
SLUB: Genslabs=9, HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
NR_IRQS:48
console [ttyS1] enabled, bootconsole disabled
console [ttyS1] enabled, bootconsole disabled
Calibrating delay loop... 255.59 BogoMIPS (lpj=1277952)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
NET: Registered protocol family 16
MIPS: machine is Aztech HW550-3G
bio: create slab <bio-0> at 0
Switching to clocksource MIPS
NET: Registered protocol family 2
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 1024 (order: 1, 8192 bytes)
TCP bind hash table entries: 1024 (order: 0, 4096 bytes)
TCP: Hash tables configured (established 1024 bind 1024)
TCP reno registered
UDP hash table entries: 256 (order: 0, 4096 bytes)
UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
NET: Registered protocol family 1
squashfs: version 4.0 (2009/01/31) Phillip Lougher
JFFS2 version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
msgmni has been set to 58
io scheduler noop registered
io scheduler deadline registered (default)
Serial: 8250/16550 driver, 2 ports, IRQ sharing disabled
serial8250: ttyS0 at MMIO 0x10000500 (irq = 13) is a 16550A
serial8250: ttyS1 at MMIO 0x10000c00 (irq = 20) is a 16550A
physmap platform flash device: 00800000 at bf000000
physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank. Manufacturer ID 0x0000c2 Chip ID 0x0022cb
Amd/Fujitsu Extended Query Table at 0x0040
  Amd/Fujitsu Extended Query version 1.1.
number of CFI chips: 1
6 cmdlinepart partitions found on MTD device physmap-flash.0
Creating 6 MTD partitions on "physmap-flash.0":
0x000000000000-0x000000030000 : "u-boot"
0x000000030000-0x000000040000 : "u-boot-env"
0x000000040000-0x000000050000 : "factory"
0x000000050000-0x000000130000 : "kernel"
0x000000130000-0x000000800000 : "rootfs"
mtd: partition "rootfs" set to be root filesystem
mtd: partition "rootfs_data" created automatically, ofs=580000, len=280000
0x000000580000-0x000000800000 : "rootfs_data"
0x000000050000-0x000000800000 : "firmware"
TCP westwood registered
NET: Registered protocol family 17
802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
All bugs added by David S. Miller <davem@redhat.com>
VFS: Mounted root (squashfs filesystem) readonly on device 31:4.
Freeing unused kernel memory: 144k freed
- preinit -
Press the [f] key and hit [enter] to enter failsafe mode
- regular preinit -
JFFS2 notice: (286) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of .
switching to jffs2
- init -
CPU 0 Unable to handle kernel paging request at virtual address c0880585, epc == 800e8c9c, ra == 800e8c20
Oops[#1]:
Cpu 0
$ 0   : 00000000 80250000 c0880585 00020000
$ 4   : 00000000 00000018 81a1db30 00000000
$ 8   : 0000002c 00080000 00000000 4c000c00
$12   : 00000007 0000000e 801cda18 00000001
$16   : 00000407 00000007 c0882000 00001e82
$20   : c0952000 00000000 00000000 01000000
$24   : 00000001 80049520
$28   : 81a1c000 81a1db18 000096d7 800e8c20
Hi    : 00000002
Lo    : 00000000
epc   : 800e8c9c unlzma+0xbec/0xd30
    Not tainted
ra    : 800e8c20 unlzma+0xb70/0xd30
Status: 10008403    KERNEL EXL IE
Cause : 00800008
BadVA : c0880585
PrId  : 0001964c (MIPS 24Kc)
Modules linked in: leds_gpio
Process uci (pid: 300, threadinfo=81a1c000, task=818f9100, tls=2abad2f0)
Stack : 00000001 00011210 8184bd18 80049730 00000009 00000019 00000018 0200006c
        01957800 00000000 00000000 800e7f10 c004b126 c004b000 c0053325 00008325
        0c71d90c 4ec20000 4ec20000 00000002 00000009 c0952644 00000000 00000000
        00000000 c004b000 00000040 000000c0 00000013 00000003 00000003 818f9100
        81910380 80260000 81952c88 00000000 00000022 81910280 c0053325 00000000
        ...
Call Trace:
[<800e8c9c>] unlzma+0xbec/0xd30
[<800bb97c>] lzma_uncompress+0x11c/0x230
[<800b8084>] squashfs_read_data+0x434/0x600
[<800b8400>] squashfs_cache_get+0x1b0/0x2d8
[<800b9844>] squashfs_readpage+0x57c/0x818
[<8004f724>] __do_page_cache_readahead+0x1c4/0x224
[<8004fa78>] ra_submit+0x28/0x34
[<8004fe58>] page_cache_sync_readahead+0x58/0x70
[<80048880>] generic_file_aio_read+0x30c/0x768
[<80071dd0>] do_sync_read+0xa8/0xf0
[<800726d0>] sys_read+0x58/0x9c
[<80008b44>] stack_done+0x20/0x40


Code: 5080fffe  00431021  02421021 <90560000> 02501021  26100001  12a0000c  a0560000  8ba20020
Disabling lock debugging due to kernel taint

Thanks,
Layne

Attachments (0)

Change History (9)

comment:1 Changed 7 years ago by Layne Edwards <ledwards76@…>

After reviewing the preinit scripts, I think the problem may be in '20_check_jffs2_ready', in the function 'check_for_jffs2'. At this point, 'jffs2_ready' (which checks for the existence of rootfs_data and the absence of the magic hex "deadc0de") should fail, but it passes because "deadc0de" is not there. It looks like "deadc0de" is in the root.squashfs image and is written during flash. It is cleared to "ffffffff" when the jffs2 partition is erased via 'mtd erase rootfs_data'.

I have also noticed that this kernel oops can be reproduced during initial flash by trying to read/write to jffs2 before it is fully built (during the first few seconds of firstboot). This can be done by simply accessing the LuCI interface prematurely. This is not necessarily a bug, as this shouldn't be done until initial flashing/configuration is complete... but it's worth noting.
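The check described above can be sketched roughly as follows. This is an illustration only, not the actual preinit code: the function names (first_magic, overlay_state, naive_ready, strict_ready) are hypothetical, and od is used in place of hexdump so the sketch also runs against ordinary files. Whether rejecting erased flash is the right fix is not established in this ticket.

```shell
# Hypothetical helper: print the first four bytes of a file or device as hex.
first_magic() {
    od -An -tx1 -N4 "$1" | tr -d ' \n'
}

# Hypothetical classifier for the states discussed in this ticket.
overlay_state() {
    case "$(first_magic "$1")" in
        deadc0de) echo "marker" ;;   # end-of-filesystem marker written at flash time
        ffffffff) echo "erased" ;;   # partition was wiped with 'mtd erase'
        *)        echo "other" ;;    # anything else, e.g. an existing filesystem
    esac
}

# A check of the form "magic is not deadc0de" treats erased flash as ready.
naive_ready() {
    [ "$(first_magic "$1")" != "deadc0de" ]
}

# Stricter variant (an assumption, not a tested fix): erased flash is not ready.
strict_ready() {
    [ "$(overlay_state "$1")" = "other" ]
}
```

With a file whose first four bytes are ff ff ff ff, naive_ready succeeds while strict_ready fails, which matches the behaviour reported after 'mtd erase rootfs_data'.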

Layne

comment:2 Changed 7 years ago by Layne Edwards <ledwards@…>

I did some more testing. If I erase the jffs2 partition without an automatic reboot (mtd erase rootfs_data), manually write 0xdeadc0de to /dev/mtd5 (rootfs_data), and then reboot, it works as expected (no oops):

root@OpenWrt:/# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00030000 00010000 "u-boot"
mtd1: 00010000 00010000 "u-boot-env"
mtd2: 00010000 00010000 "factory"
mtd3: 000e0000 00010000 "kernel"
mtd4: 006d0000 00010000 "rootfs"
mtd5: 00280000 00010000 "rootfs_data"
mtd6: 007b0000 00010000 "firmware"
root@OpenWrt:/# mtd erase rootfs_data
Unlocking rootfs_data ...
Erasing rootfs_data ...
root@OpenWrt:/# echo $(hexdump /dev/mtd5 -n 4 -e '4/1 "%02x"')
ffffffff
root@OpenWrt:/# echo -e "\xde\xad\xc0\xde" | mtd write - mtd5
Unlocking mtd5 ...

Writing from <stdin> to mtd5 ...
root@OpenWrt:/# echo $(hexdump /dev/mtd5 -n 4 -e '4/1 "%02x"')
deadc0de
root@OpenWrt:/# reboot
device eth0 left promiscuous mode
device eth0.1 left promiscuous mode
br-lan: port 1(eth0.1) entering disabled state
Restarting system.


U-Boot 1.1.3 (Dec 14 2008 - 16:34:00)

Board: Ralink APSoC DRAM:  32 MB
relocate_code Pointer at: 81fac000
flash_protect ON: from 0xBF000000 to 0xBF02006F
protect on 0
protect on 1
protect on 2
protect on 3
protect on 4
protect on 5
protect on 6
protect on 7
protect on 8
protect on 9
flash_protect ON: from 0xBF030000 to 0xBF03FFFF
protect on 10
============================================
Ralink UBoot Version: 3.2
--------------------------------------------
ASIC 3052_MP2 (Port5<->None)
DRAM COMPONENT: 128Mbits
DRAM BUS: 32BIT
Total memory: 32 MBytes
Date:Dec 14 2008  Time:16:34:00
============================================
icache: sets:256, ways:4, linesz:32 ,total:32768
dcache: sets:128, ways:4, linesz:32 ,total:16384

 ##### The CPU freq = 384 MHZ ####

 SDRAM bus set to 32 bit
 SDRAM size =32 Mbytes

Please choose the operation:
   1: Load system code to SDRAM via TFTP.
   2: Load system code then write to Flash via TFTP.
   3: Boot system code via Flash (default).
   4: Entr boot command line interface.
   9: Load Boot Loader code then write to Flash via TFTP.
 0

3: System Boot system code via Flash.
## Booting image at bf050000 ...
   Image Name:   MIPS OpenWrt Linux-2.6.37.4
   Created:      2011-03-30   8:18:18 UTC

 System Control Status = 0x00440000
   Image Type:   MIPS Linux Kernel Image (lzma compressed)
   Data Size:    5570496 Bytes =  5.3 MB
   Load Address: 80000000
   Entry Point:  80000000
   Verifying Checksum ... Bad Data CRC
OK
   Uncompressing Kernel Image ... OK
No initrd
## Transferring control to Linux (at address 80000000) ...
## Giving linux memsize in MB, 32

Starting kernel ...

Linux version 2.6.37.4 (ledwards@OpenSuSE.site) (gcc version 4.5.2 (Linaro GCC 4.5-2011.02-0) ) #1 Wed Mar 30 01:50:23 CDT 2011
bootconsole [early0] enabled
CPU revision is: 0001964c (MIPS 24Kc)
Ralink RT3052   id:1 rev:2 running at 384.00 MHz
Determined physical RAM map:
 memory: 02000000 @ 00000000 (usable)
Initrd not found or empty - disabling initrd
Zone PFN ranges:
  Normal   0x00000000 -> 0x00002000
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0: 0x00000000 -> 0x00002000
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 8128
Kernel command line:  board=HW550-3G mtdparts=physmap-flash.0:192k(u-boot)ro,64k(u-boot-env)ro,64k(factory)ro,896k(kernel),6976k(rootfs),7872k@0x50000(f2
PID hash table entries: 128 (order: -3, 512 bytes)
Dentry cache hash table entries: 4096 (order: 2, 16384 bytes)
Inode-cache hash table entries: 2048 (order: 1, 8192 bytes)
Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
Primary data cache 16kB, 4-way, VIPT, no aliases, linesize 32 bytes
Writing ErrCtl register=0000b120
Readback ErrCtl register=0000b120
Memory: 29964k/32768k available (1798k kernel code, 2804k reserved, 432k data, 144k init, 0k highmem)
SLUB: Genslabs=9, HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
NR_IRQS:48
console [ttyS1] enabled, bootconsole disabled
console [ttyS1] enabled, bootconsole disabled
Calibrating delay loop... 255.59 BogoMIPS (lpj=1277952)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
NET: Registered protocol family 16
MIPS: machine is Aztech HW550-3G
bio: create slab <bio-0> at 0
Switching to clocksource MIPS
NET: Registered protocol family 2
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 1024 (order: 1, 8192 bytes)
TCP bind hash table entries: 1024 (order: 0, 4096 bytes)
TCP: Hash tables configured (established 1024 bind 1024)
TCP reno registered
UDP hash table entries: 256 (order: 0, 4096 bytes)
UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
NET: Registered protocol family 1
squashfs: version 4.0 (2009/01/31) Phillip Lougher
JFFS2 version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
msgmni has been set to 58
io scheduler noop registered
io scheduler deadline registered (default)
Serial: 8250/16550 driver, 2 ports, IRQ sharing disabled
serial8250: ttyS0 at MMIO 0x10000500 (irq = 13) is a 16550A
serial8250: ttyS1 at MMIO 0x10000c00 (irq = 20) is a 16550A
physmap platform flash device: 00800000 at bf000000
physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank. Manufacturer ID 0x0000c2 Chip ID 0x0022cb
Amd/Fujitsu Extended Query Table at 0x0040
  Amd/Fujitsu Extended Query version 1.1.
number of CFI chips: 1
6 cmdlinepart partitions found on MTD device physmap-flash.0
Creating 6 MTD partitions on "physmap-flash.0":
0x000000000000-0x000000030000 : "u-boot"
0x000000030000-0x000000040000 : "u-boot-env"
0x000000040000-0x000000050000 : "factory"
0x000000050000-0x000000130000 : "kernel"
0x000000130000-0x000000800000 : "rootfs"
mtd: partition "rootfs" set to be root filesystem
mtd: partition "rootfs_data" created automatically, ofs=580000, len=280000
0x000000580000-0x000000800000 : "rootfs_data"
0x000000050000-0x000000800000 : "firmware"
TCP westwood registered
NET: Registered protocol family 17
802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
All bugs added by David S. Miller <davem@redhat.com>
VFS: Mounted root (squashfs filesystem) readonly on device 31:4.
Freeing unused kernel memory: 144k freed
- preinit -
Press the [f] key and hit [enter] to enter failsafe mode
- regular preinit -
jffs2 not ready yet; using ramdisk
- init -

Please press Enter to activate this console. device eth0.1 entered promiscuous mode
device eth0 entered promiscuous mode
br-lan: port 1(eth0.1) entering forwarding state
br-lan: port 1(eth0.1) entering forwarding state
br-lan: received packet on eth0.1 with own address as source address
Compat-wireless backport release: compat-wireless-2011-01-31-26-gf7606f5
Backport based on wireless-testing.git master-2011-03-24
cfg80211: Calling CRDA to update world regulatory domain
cfg80211: World regulatory domain updated:
cfg80211:     (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:     (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:     (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:     (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:     (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:     (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cifs: Unknown symbol crypto_shash_setkey (err 0)
cifs: Unknown symbol crypto_shash_update (err 0)
cifs: Unknown symbol crypto_shash_final (err 0)
cifs: Unknown symbol crypto_alloc_shash (err 0)
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
NTFS driver 2.1.29 [Flags: R/O MODULE].
PPP generic driver version 2.4.2
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
ip_tables: (C) 2000-2006 Netfilter Core Team
NET: Registered protocol family 24
nf_conntrack version 0.5.0 (470 buckets, 1880 max)
xt_time: kernel timezone is -0000
i2c /dev entries driver
input: gpio-buttons as /devices/platform/gpio-buttons/input/input0
jffs2_scan_eraseblock(): End of filesystem marker found at 0x0
jffs2_build_filesystem(): unlocking the mtd device... done.
jffs2_build_filesystem(): erasing all blocks after the end marker...
done.
JFFS2 notice: (1670) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) fo.

So I think I have identified the problem... I just don't have a solution yet.
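One detail of the workaround above: echo appends a trailing newline, so the 'echo -e' pipeline sends five bytes rather than four. The check only reads the first four, so this did not matter here. A minimal sketch that emits exactly the four marker bytes is shown below; emit_marker is a hypothetical name, and the demonstration writes to a scratch file rather than a real MTD device.

```shell
# Hypothetical helper: emit exactly the four marker bytes de ad c0 de.
# Octal escapes are used because they work across printf implementations.
emit_marker() {
    printf '\336\255\300\336'
}

# Demonstration against a scratch file instead of a flash partition.
scratch="${TMPDIR:-/tmp}/marker-demo.$$"
emit_marker > "$scratch"
od -An -tx1 "$scratch"    # hex dump of the bytes written
wc -c < "$scratch"        # number of bytes written
rm -f "$scratch"
```

On a device, the output would be piped to 'mtd write - mtd5' in the same way as in the transcript above.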

Layne

comment:3 Changed 7 years ago by cshore

Does this happen if you reset the flash using the 'firstboot' command? That is the way the reset should be done, not 'mtd erase'.

comment:4 Changed 7 years ago by Layne Edwards <ledwards@…>

Firstboot works great for clearing the jffs2. However, LuCI System -> Backup/Restore "Reset router to defaults" erases the rootfs_data partition, resulting in the oops. Perhaps LuCI should execute 'firstboot' instead? I thought erasing the jffs2 partition used to work fine (on other targets). Maybe I'm mistaken?

Thanks,
Layne

comment:5 Changed 7 years ago by cshore

Hi again.

Well, either should work really. Perhaps you have an unclean build (could you try a fresh checkout?), or there is a problem specific to your platform (about which I unfortunately know nothing).

comment:6 Changed 7 years ago by cshore

  • Priority changed from normal to response-needed

Can you check with a clean build?

comment:7 in reply to: ↑ 6 Changed 7 years ago by Layne Edwards <ledwards@…>

Replying to cshore:

Can you check with a clean build?

I did a distclean, svn up and make... but still have the oops. Maybe it's a problem specific to ramips? I'll try on another ramips target tomorrow (Fonera 2.0n)... and maybe test an ixp4xx and ar71xx.

Thanks,
Layne

comment:8 in reply to: ↑ 7 Changed 6 years ago by josevteg@…

Same here with a Buffalo WZR-HP-G300NH and a custom build of Backfire 10.03.1: I'm getting kernel panics after firstboot, but writing 0xdeadc0de to rootfs_data fixes the oops.

root@OpenWrt:/# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00040000 00020000 "u-boot"
mtd1: 00020000 00020000 "u-boot-env"
mtd2: 00100000 00020000 "kernel"
mtd3: 01e60000 00020000 "rootfs"
mtd4: 01960000 00020000 "rootfs_data"
mtd5: 00020000 00020000 "user_property"
mtd6: 00020000 00020000 "art"
mtd7: 01f60000 00020000 "firmware"
root@OpenWrt:/# firstboot
root@OpenWrt:/# echo $(hexdump /dev/mtd4 -n 4 -e '4/1 "%02x"')
ffffffff
root@OpenWrt:/# echo -e "\xde\xad\xc0\xde" | mtd write - mtd4
root@OpenWrt:/# echo $(hexdump /dev/mtd4 -n 4 -e '4/1 "%02x"')
deadc0de
root@OpenWrt:/# reboot -f
[...]

Router reboots OK.

comment:9 Changed 4 years ago by jow

  • Milestone changed from Attitude Adjustment 12.09 to Barrier Breaker 14.07

Milestone Attitude Adjustment 12.09 deleted
