Opened 2 years ago

Last modified 2 years ago

#20387 new defect

regression in uboot-sunxi causing boot failure.

Reported by: yousong
Owned by: developers
Priority: high
Milestone:
Component: packages
Version: Trunk
Keywords:
Cc: wigyori

Description

Hi

I have a Cubieboard2 board. I first installed OpenWrt on it more than a year ago, as recorded in the dmesg output on the wiki page [1]. It booted and worked fine. The revision id is 39756 and the corresponding git commit id is 7ed37d7ef48e601c8a6d22fab6980437a47dcc4a. I just rebuilt that revision and it still boots and works perfectly. The same board and the same 256MB SD card were used.

Current trunk, however, does not boot. It fails within U-Boot:

U-Boot SPL 2015.07 (Aug 22 2015 - 18:24:02)
DRAM: 1024 MiB
CPU: 912000000Hz, AXI/AHB/APB: 3/2/2


U-Boot 2015.07 (Aug 22 2015 - 18:24:02 +0800) Allwinner Technology

CPU:   Allwinner A20 (SUN7I)
I2C:   ready
DRAM:  1 GiB
MMC:   SUNXI SD/MMC: 0
*** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
SCSI:  SUNXI SCSI INIT
SATA link 0 timeout.
AHCI 0001.0100 32 slots 1 ports 3 Gbps 0x1 impl SATA mode
flags: ncq stag pm led clo only pmp pio slum part ccc apst
Net:   eth0: ethernet@01c50000
starting USB...
USB0:   USB EHCI 1.00
USB1:   USB OHCI 1.0
USB2:   USB EHCI 1.00
USB3:   USB OHCI 1.0
scanning bus 0 for devices... 1 USB Device(s) found
scanning bus 2 for devices... 1 USB Device(s) found
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc0 is current device
Scanning mmc 0:1...
Found U-Boot script /boot.scr
reading /boot.scr
377 bytes read in 18 ms (19.5 KiB/s)
## Executing script at 43100000
reading uImage
Error reading cluster
** Unable to read file uImage **
SCRIPT FAILED: continuing...
** Can't read partition table on 0:0 **
** Invalid partition 1 **
** Can't read partition table on 0:0 **
** Invalid partition 1 **
** Can't read partition table on 0:0 **
** Invalid partition 1 **
scanning bus for devices...
Found 0 device(s).

SCSI device 0:
    Device 0: not available

USB device 0: unknown device
Speed: 100, full duplex
BOOTP broadcast 1

Another SD card, 1GB in size, failed right away at the SPL stage with error -19 (TIMEOUT in the U-Boot code) while initialising the mmc device. It booted fine with the debian jessie netboot images [2]. Unfortunately, though, the sunxi-mmc kernel module within the netboot image cannot correctly initialise the device (it can read and write the 256MB card fine).

U-Boot SPL 2015.07 (Aug 22 2015 - 18:24:02)
DRAM: 1024 MiB
CPU: 912000000Hz, AXI/AHB/APB: 3/2/2
spl: mmc init failed with error: -19
### ERROR ### Please RESET the board ###

[1] http://wiki.openwrt.org/toh/cubietech/cubieboard2
[2] http://mirrors.ustc.edu.cn/debian/dists/jessie/main/installer-armhf/current/images/netboot/SD-card-images/

Change History (5)

comment:1 Changed 2 years ago by yousong

Well, I just finished bisecting u-boot commit history.

The problem with the 256MB card ("Error reading cluster") is caused by a timeout when reading the uImage file: the file is too big for the fixed 2-second timeout. I will post a patch for that.
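
To illustrate why a single fixed timeout breaks for large files, here is a minimal sketch of deriving the timeout from the transfer size instead. It is not the patch mentioned above and not U-Boot code; the function name `read_timeout_ms` and its parameters (an assumed worst-case throughput and a minimum timeout) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of U-Boot): compute a read timeout in
 * milliseconds that grows with the number of bytes to transfer.
 *
 * bytes              size of the transfer
 * min_bytes_per_sec  assumed worst-case throughput of the card
 * floor_ms           minimum timeout, e.g. the old fixed value
 */
static uint32_t read_timeout_ms(uint32_t bytes, uint32_t min_bytes_per_sec,
                                uint32_t floor_ms)
{
    uint64_t ms;

    assert(min_bytes_per_sec > 0);

    /* 64-bit intermediate so bytes * 1000 cannot overflow; round up. */
    ms = ((uint64_t)bytes * 1000 + min_bytes_per_sec - 1) / min_bytes_per_sec;
    if (ms > UINT32_MAX)
        ms = UINT32_MAX;

    /* Never go below the floor, so small reads behave as before. */
    return ms < floor_ms ? floor_ms : (uint32_t)ms;
}
```

With a 2000 ms floor, a small file such as boot.scr keeps the old timeout, while a multi-megabyte uImage on a slow card is given proportionally more time.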

The problem with the 1GB card (spl: mmc init failed with error: -19) was caused by commit fc3a832576ce7bb597b1823935bfb7dcca235c3c [1]. I can provide the git bisect log if anyone is interested in fixing this.

It turned out that the 4.1.6 kernel from OpenWrt also does not work properly with the 1GB card (debian jessie uses a 3.16 kernel). A wild guess is that the same issue as in U-Boot also exists in the kernel sunxi-mmc code. Below is the relevant error message from the kernel log (captured with screen when the board was booted with a working U-Boot).

[    1.112060] mmc0: host does not support reading read-only switch, assuming write-enable
[    1.120203] ehci-platform 1c1c000.usb: irq 31, io mem 0x01c1c000
[    1.126527] sunxi-mmc 1c0f000.mmc: smc 0 err, cmd 6, RD RCE !!
[    1.132388] sunxi-mmc 1c0f000.mmc: data error, sending stop command
[    1.139451] sunxi-mmc 1c0f000.mmc: send stop command failed
[    1.145056] mmc0: error -110 whilst initialising SD card
[    1.150424] ehci-platform 1c1c000.usb: USB 2.0 started, EHCI 1.00
[    1.156533] sunxi-mmc 1c0f000.mmc: smc 0 err, cmd 1, RTO !!

[1] http://git.denx.de/?p=u-boot.git;a=commit;h=fc3a832576ce7bb597b1823935bfb7dcca235c3c

comment:2 Changed 2 years ago by yousong

That -110 error code from the kernel is ETIMEDOUT (include/uapi/asm-generic/errno.h), comparable to the -19 (TIMEOUT) error from U-Boot.

But the root cause was not a timeout. It is caused by SUNXI_MMC_RINT_RESP_CRC_ERROR being raised while sunxi_mmc_send_cmd() waits for SUNXI_MMC_RINT_COMMAND_DONE during the attempt to switch the card to SD_HIGHSPEED mode. If I ignore that failure, U-Boot continues fine, but the kernel cannot mount rootfs from /dev/mmcblk0p2 because it cannot initialise the SD card.

The situation should be almost the same with the kernel code. Anyway, that clock divider change from the controller-internal one to mod0-clk is erroneous.

comment:3 Changed 2 years ago by yousong

Below is the change that ignores a failure to switch to high-speed mode.

From fc331c5bf1cd2d9d08d2cdb88e9dcc03596a1460 Mon Sep 17 00:00:00 2001
From: Yousong Zhou <yszhou4tech@gmail.com>
Date: Sun, 30 Aug 2015 15:21:49 +0800
Subject: [PATCH] mmc: continue on failing switching to highspeed mode.

Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>
---
 drivers/mmc/mmc.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/mmc.c b/drivers/mmc/mmc.c
index 79e6fee..6055ff4 100644
--- a/drivers/mmc/mmc.c
+++ b/drivers/mmc/mmc.c
@@ -944,8 +944,10 @@ retry_scr:
 
 	err = sd_switch(mmc, SD_SWITCH_SWITCH, 0, 1, (u8 *)switch_status);
 
-	if (err)
-		return err;
+	if (err) {
+		printf("unable to switch SD_HIGHSPEED: %d\n", err);
+		return 0;
+	}
 
 	if ((__be32_to_cpu(switch_status[4]) & 0x0f000000) == 0x01000000)
 		mmc->card_caps |= MMC_MODE_HS;
-- 
1.7.10.4

comment:4 Changed 2 years ago by yousong

In case anyone comes here looking for at least a temporary fix: link [1] contains patches for the 2 issues reported here.

I think this ticket can then be closed. Or should that happen only after the patches are merged?

[1] http://thread.gmane.org/gmane.comp.hardware.netbook.arm.sunxi/18370

comment:5 Changed 2 years ago by yousong

Besides

  1. ignoring the mode switch error and continuing, and
  2. scaling up to the higher frequency before the mode switch instead of after it,

there is a third method [1] that makes the 1GB card work for me: retrying the mode switch in case of error.

[1] [RFC] mmc: core: Set clock before switching to highspeed mode. http://thread.gmane.org/gmane.comp.hardware.netbook.arm.sunxi/18442/focus=33814
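
The retry approach can be sketched roughly as follows. This is only an illustration, not the code from the linked thread: `switch_with_retry`, the `switch_fn` callback type and the `flaky_switch` stand-in are hypothetical names, and the real mode switch call is replaced by a callback so the sketch is self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type standing in for the real mode-switch call. */
typedef int (*switch_fn)(void *ctx);

/*
 * Call try_switch up to max_tries times. Returns 0 as soon as one attempt
 * succeeds, otherwise the error code of the last attempt.
 */
static int switch_with_retry(switch_fn try_switch, void *ctx, int max_tries)
{
    int err = -1;

    assert(try_switch != NULL && max_tries > 0);

    for (int i = 0; i < max_tries; i++) {
        err = try_switch(ctx);
        if (err == 0)
            break;
    }
    return err;
}

/* Stand-in for a card that fails a given number of times, then succeeds. */
static int flaky_switch(void *ctx)
{
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        return -19; /* the value U-Boot printed for the failure in this ticket */
    }
    return 0;
}
```

Bounding the number of attempts keeps a genuinely broken card from hanging the boot, while a card that only fails the first switch attempt still reaches high-speed mode.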
