Opened 2 years ago

#20912 new defect

USB mass storage problems on Netgear DGN3500 (lantiq/xway)

Reported by: saf@…
Owned by: developers
Priority: normal
Milestone:
Component: kernel
Version: Trunk
Keywords: usb lantiq dwc2 ifxusb_hcd dgn3500
Cc:

Description

I am having trouble with USB mass storage on my Netgear DGN3500 (lantiq/xway). I have built my own copy of trunk (git commit a11e9f8c8db82f762c5b414207c1baa128c8df73, the login banner says "DESIGNATED DRIVER (15.05, r47461)").

I have tried both the ifxusb_hcd and dwc2 USB drivers; unfortunately, each driver succeeds in some cases and fails in others, so I can't simply use one of the drivers and ignore the other. The full details are given below, but to try to summarise:

I have tested with three USB devices: a (slow) 128MB USB thumb drive, an 8GB USB thumb drive and an externally-powered USB hard drive. I have run both read and write tests, and the results are as follows:

Driver \ Device   128MB thumb             8GB thumb           hard drive
dwc2              read/write OK           read errors         read errors
ifxusb            read OK, write errors   read sometimes OK   read errors

Where I haven't mentioned write in this table, I haven't tested writes with that driver+device combination. The way in which the hard drive read fails is different for each driver.

Tests used

My read test is:

for x in `seq 100`; do echo $x; time dd bs=1024 count=65535 if=/dev/sda1 | md5sum; done

My write test is:

for x in `seq 100`; do echo $x; time dd bs=1024 count=65535 if=/dev/zero of=/dev/sda1; done

For testing, I start two ssh sessions to the router. I wait for the router to go idle (it seems to run a CPU-intensive jffs2_gcd_mtd6 process on boot) and then in one ssh session I do:

while true; do dmesg; sleep 10; done

and use the other to execute the read or write test.
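Since some failures only show up as an occasional wrong md5sum, it may help to have the loop flag mismatches itself rather than relying on comparing the output by eye. The following is a minimal sketch, not the test that was actually run: `check_reads` is a hypothetical helper name, and it assumes the first read is correct and can serve as the reference.

```sh
# Hypothetical helper: read the same region repeatedly and count
# checksum mismatches against the first read.
# usage: check_reads DEVICE PASSES [COUNT_KIB]
check_reads() {
    dev=$1; passes=$2; count=${3:-65535}
    # Assumption: the first read is good and is used as the reference.
    ref=$(dd bs=1024 count="$count" if="$dev" 2>/dev/null | md5sum | cut -d' ' -f1)
    bad=0; i=1
    while [ "$i" -le "$passes" ]; do
        sum=$(dd bs=1024 count="$count" if="$dev" 2>/dev/null | md5sum | cut -d' ' -f1)
        if [ "$sum" != "$ref" ]; then
            bad=$((bad + 1))
            echo "pass $i: MISMATCH $sum"
        fi
        i=$((i + 1))
    done
    echo "mismatches=$bad"
}
```

For example, `check_reads /dev/sda1 100` would repeat the same 65535 KiB read as the read test above. Note that if the device cannot be opened at all, dd produces no data and the checksum is that of empty input, so a vanished /dev/sda1 shows up as a mismatch only if the reference read succeeded.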

Devices used

I suspect - and it's no more than a suspicion - that some of the differences in behaviour between the 128MB thumb drive and the 8GB thumb drive are related to the speed of the device. Both are high-speed USB devices (according to dmesg), but the 128MB thumb drive is inherently quite slow; on my desktop PC running Linux, reading 64MB from the 128MB drive (using the same dd command, with the appropriate device name) takes 8.5 seconds, while reading 64MB from the 8GB drive takes only 2.5 seconds.
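To put these timings in comparable units, the throughput can be worked out from the dd parameters (bs=1024 and count=65535, so 65535 KiB per pass). This is only a convenience sketch; the function name `throughput` is made up for illustration.

```sh
# Hypothetical helper: convert "KiB transferred" and "seconds taken"
# into MiB/s. usage: throughput KIB SECONDS
throughput() {
    awk -v kib="$1" -v s="$2" 'BEGIN { printf "%.1f MiB/s\n", kib / 1024 / s }'
}

throughput 65535 8.5   # 128MB drive on the desktop PC -> 7.5 MiB/s
throughput 65535 2.5   # 8GB drive on the desktop PC   -> 25.6 MiB/s
```

The same helper can be applied to the per-pass times reported by `time` on the router.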

I can successfully execute all the tests I've described here against these devices using my desktop PC, so I don't believe the devices are faulty.

Test results with dwc2 driver

The read test succeeds with no dmesg output using the 128MB drive.

Using the 8GB USB thumb drive, the test will initially work fine, each 64MB read taking about 13-14 seconds and no messages appearing in dmesg. This can last for 60-70 passes round the loop, but sometimes it starts to give errors on the first pass. Once it fails, each 64MB read takes a variable time between 13 and 47 seconds and dmesg shows:

[ 173.744042] usb 1-1: new high-speed USB device number 2 using dwc2
[ 173.999507] usb-storage 1-1:1.0: USB Mass Storage device detected
[ 174.014936] scsi host0: usb-storage 1-1:1.0
[ 175.044054] scsi 0:0:0:0: Direct-Access SanDisk U3 Cruzer Micro 8.01 PQ: 0 ANSI: 0 CCS
[ 175.058599] sd 0:0:0:0: [sda] 15682559 512-byte logical blocks: (8.02 GB/7.47 GiB)
[ 175.065840] sd 0:0:0:0: [sda] Write Protect is off
[ 175.069555] sd 0:0:0:0: [sda] Mode Sense: 45 00 00 08
[ 175.070861] scsi 0:0:0:1: CD-ROM SanDisk U3 Cruzer Micro 8.01 PQ: 0 ANSI: 0
[ 175.080122] sd 0:0:0:0: [sda] No Caching mode page found
[ 175.084076] sd 0:0:0:0: [sda] Assuming drive cache: write through
[ 175.148289] sda: sda1
[ 175.159539] sd 0:0:0:0: [sda] Attached SCSI removable disk
[ 233.097882] dwc2 1e101000.ifxhcd: dwc2_hc_chhltd_intr_dma: Channel 15 - ChHltd set, but reason is unknown
[ 233.105964] dwc2 1e101000.ifxhcd: hcint 0x00000002, intsts 0x04000029
[ 233.112465] dwc2 1e101000.ifxhcd: dwc2_update_urb_state(): trimming xfer length
[ 264.195276] usb 1-1: reset high-speed USB device number 2 using dwc2

The md5sum produced each time *is* correct even after this starts to occur, however.

After a while the dd commands fail with "can't open '/dev/sda1'" and dmesg shows:

[ 308.201280] dwc2 1e101000.ifxhcd: dwc2_update_urb_state(): trimming xfer length
[ 308.420630] usb 1-1: reset high-speed USB device number 2 using dwc2
[ 327.368057] usb 1-1: device descriptor read/64, error -145
[ 344.202393] usb 1-1: device descriptor read/64, error -145
[ 344.490394] usb 1-1: reset high-speed USB device number 2 using dwc2
[ 359.674380] usb 1-1: device descriptor read/64, error -145
[ 374.962393] usb 1-1: device descriptor read/64, error -145
[ 375.250388] usb 1-1: reset high-speed USB device number 2 using dwc2
[ 380.270429] usb 1-1: device descriptor read/8, error -145
[ 385.394517] usb 1-1: device descriptor read/8, error -145
[ 385.682472] usb 1-1: reset high-speed USB device number 2 using dwc2
[ 390.702446] usb 1-1: device descriptor read/8, error -145
[ 395.826430] usb 1-1: device descriptor read/8, error -145
[ 395.934694] usb 1-1: USB disconnect, device number 2
[ 395.942564] sd 0:0:0:0: [sda] UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
[ 395.948836] sd 0:0:0:0: [sda] CDB: opcode=0x28 28 00 00 01 a4 1f 00 00 f0 00
[ 395.955795] blk_update_request: I/O error, dev sda, sector 107551
[ 395.962451] sd 0:0:0:0: [sda] UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
[ 395.969463] sd 0:0:0:0: [sda] CDB: opcode=0x28 28 00 00 01 a5 0f 00 00 10 00
[ 395.976430] blk_update_request: I/O error, dev sda, sector 107791
[ 396.358465] usb 1-1: new high-speed USB device number 3 using dwc2
[ 411.542379] usb 1-1: device descriptor read/64, error -145
[ 426.830383] usb 1-1: device descriptor read/64, error -145
[ 427.118385] usb 1-1: new high-speed USB device number 4 using dwc2
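The CDB lines in the log can be cross-checked against the reported failing sectors. Opcode 0x28 is the SCSI READ(10) command, whose bytes 2-5 hold the starting logical block address and bytes 7-8 the transfer length in blocks. A small sketch of the decoding follows; `decode_read10` is a hypothetical name, and it only handles READ(10).

```sh
# Hypothetical helper: decode a READ(10) CDB given as ten hex bytes.
# usage: decode_read10 28 00 00 01 a4 1f 00 00 f0 00
decode_read10() {
    [ "$#" -eq 10 ] || { echo "expected 10 CDB bytes" >&2; return 1; }
    [ "$1" = "28" ] || { echo "not a READ(10) CDB" >&2; return 1; }
    lba=$(( 0x$3$4$5$6 ))   # CDB bytes 2-5: big-endian start block
    len=$(( 0x$8$9 ))       # CDB bytes 7-8: big-endian block count
    echo "lba=$lba blocks=$len"
}

decode_read10 28 00 00 01 a4 1f 00 00 f0 00   # lba=107551 blocks=240
decode_read10 28 00 00 01 a5 0f 00 00 10 00   # lba=107791 blocks=16
```

The decoded start blocks match the sectors in the two blk_update_request errors, and 107551 + 240 = 107791, so the two failed commands are consecutive reads that were in flight when the device disconnected.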

I can produce a similar failure using the externally-powered USB hard drive, although there the error -145 and disconnect don't seem to occur (at least not as reliably). Instead it grinds very slowly round the dd loop, usually producing the correct md5sum but occasionally producing an incorrect value, with dmesg showing the 'ChHltd' message (and the same hcint/intsts values).

On a possibly related note, *sometimes* after the failures start to occur, I have seen the router spontaneously reboot. I can't reproduce this reliably, but I thought it would be worth mentioning.

The write test with the 128MB drive completes all 100 passes (taking 10-45 seconds for each pass) with no dmesg errors.

Test results with ifxusb_hcd driver

The read test with the 128MB drive passes (as it does with the dwc2 driver), though it takes about 18 seconds to read 64MB compared with 13 seconds for the dwc2 driver. No errors appear in dmesg.

The read test with the 8GB drive has passed sometimes, taking about 16 seconds to read 64MB and with no errors appearing in dmesg. At other times it has failed: after a while, /dev/sda1 ceases to exist and dmesg shows:

[ 305.322361] usb 1-1: new high-speed USB device number 2 using ifxusb_hcd
[ 305.521663] usb-storage 1-1:1.0: USB Mass Storage device detected
[ 305.538806] scsi host0: usb-storage 1-1:1.0
[ 306.543749] scsi 0:0:0:0: Direct-Access SanDisk U3 Cruzer Micro 8.01 PQ: 0 ANSI: 0 CCS
[ 306.560907] sd 0:0:0:0: [sda] 15682559 512-byte logical blocks: (8.02 GB/7.47 GiB)
[ 306.571254] scsi 0:0:0:1: CD-ROM SanDisk U3 Cruzer Micro 8.01 PQ: 0 ANSI: 0
[ 306.583666] sd 0:0:0:0: [sda] Write Protect is off
[ 306.587170] sd 0:0:0:0: [sda] Mode Sense: 45 00 00 08
[ 306.594650] sd 0:0:0:0: [sda] No Caching mode page found
[ 306.598614] sd 0:0:0:0: [sda] Assuming drive cache: write through
[ 306.629895] sda: sda1
[ 306.641031] sd 0:0:0:0: [sda] Attached SCSI removable disk
[ 343.894384] usb 1-1: reset high-speed USB device number 2 using ifxusb_hcd
[ 359.070409] usb 1-1: device descriptor read/64, error -150
[ 374.350378] usb 1-1: device descriptor read/64, error -150
[ 374.630369] usb 1-1: reset high-speed USB device number 2 using ifxusb_hcd
[ 389.806363] usb 1-1: device descriptor read/64, error -150
[ 405.086361] usb 1-1: device descriptor read/64, error -150
[ 405.366383] usb 1-1: reset high-speed USB device number 2 using ifxusb_hcd
[ 420.386410] usb 1-1: device descriptor read/8, error -150
[ 435.510411] usb 1-1: device descriptor read/8, error -150
[ 435.790379] usb 1-1: reset high-speed USB device number 2 using ifxusb_hcd
[ 450.810426] usb 1-1: device descriptor read/8, error -150
[ 465.934429] usb 1-1: device descriptor read/8, error -150
[ 466.042680] usb 1-1: USB disconnect, device number 2
[ 466.050593] sd 0:0:0:0: [sda] UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
[ 466.056876] sd 0:0:0:0: [sda] CDB: opcode=0x28 28 00 00 01 58 1f 00 00 f0 00
[ 466.063831] blk_update_request: I/O error, dev sda, sector 88095
[ 466.070444] sd 0:0:0:0: [sda] UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
[ 466.077414] sd 0:0:0:0: [sda] CDB: opcode=0x28 28 00 00 01 59 0f 00 00 10 00
[ 466.084380] blk_update_request: I/O error, dev sda, sector 88335
[ 466.611592] usb 1-1: new high-speed USB device number 3 using ifxusb_hcd
[ 481.786361] usb 1-1: device descriptor read/64, error -150
[ 497.066363] usb 1-1: device descriptor read/64, error -150
[ 497.346378] usb 1-1: new high-speed USB device number 4 using ifxusb_hcd
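The two drivers report different error numbers in the "device descriptor read" messages (-145 with dwc2, -150 with ifxusb_hcd). Assuming the router's kernel uses the MIPS errno numbering (arch/mips/include/uapi/asm/errno.h), which differs from the generic numbering seen on a desktop PC, these decode as sketched below. `mips_errno_name` is a hypothetical helper and covers only the two values seen in these logs.

```sh
# Hypothetical helper: name the two error codes seen in the logs,
# ASSUMING MIPS errno numbering (not the generic/x86 numbering).
# usage: mips_errno_name -145
mips_errno_name() {
    case "${1#-}" in
        145) echo ETIMEDOUT ;;
        150) echo EINPROGRESS ;;
        *)   echo unknown ;;
    esac
}

mips_errno_name -145   # dwc2 case
mips_errno_name -150   # ifxusb_hcd case
```

If that assumption holds, the dwc2 failures are plain timeouts on the descriptor read, while ifxusb_hcd reports a transfer still marked as in progress, which may simply be how that driver surfaces the same stall.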

Connecting the USB hard drive immediately gives this reset message in dmesg:

[ 275.298390] usb 1-1: new high-speed USB device number 2 using ifxusb_hcd
[ 275.579148] usb-storage 1-1:1.0: USB Mass Storage device detected
[ 275.595153] scsi host0: usb-storage 1-1:1.0
[ 276.634576] scsi 0:0:0:0: Direct-Access ST350083 0A 3.AA PQ: 0 ANSI: 0
[ 276.647846] sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[ 276.826407] usb 1-1: reset high-speed USB device number 2 using ifxusb_hcd

Note that /dev/sda1 is never created, so the read test cannot be run against the hard drive with this driver.

Executing the write test using the 128MB drive, the router spontaneously reboots (typically on the second or third pass through the loop). I never see anything in dmesg about this, but obviously I could just be unlucky with the timing.

I hope this explanation is reasonably clear. I have been trying to investigate this problem myself but I am a bit stumped. Any suggestions as to further tests to run or modifications to try would be appreciated. I am quite happy to apply patches and rebuild the firmware to test them out.
