VF50 rev1.2B board not booting from NAND

I have a VF50 128MB V1.2B board running a custom-compiled kernel from the colibri_vf branch at tag Colibri_VF_LinuxImageV2.3Beta1_20140804.

Currently it does not boot into the kernel. It reads the manufacturer ID and chip ID incorrectly:

[    0.444200] FSL NFC MTD nand Driver 1.0
[    0.448537] NAND device: Manufacturer ID: 0x4f, Chip ID: 0x4e (Unknown NAND 64GiB 3,3V 16-bit)
[    0.458850] kernel BUG at drivers/mtd/nand/nand_bbt.c:1138!
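
For comparing the ID bytes across boards, here is a minimal Python sketch that pulls the two ID values out of a log line like the ones in this post (the helper names `parse_nand_id` and `as_ascii` are made up for illustration). One thing it makes visible: 0x4f 0x4e are the printable ASCII characters "ON", which are also the first two characters of the "ONFI" signature that ONFI-compliant NAND chips return to a signature read. Whether this kernel's driver is actually picking up signature bytes instead of the real ID on the new chip is only an assumption, not something confirmed here.

```python
import re

# Log lines copied from the two boards in this thread.
LOG_V12B = "NAND device: Manufacturer ID: 0x4f, Chip ID: 0x4e (Unknown NAND 64GiB 3,3V 16-bit)"
LOG_V12A = "NAND device: Manufacturer ID: 0xc2, Chip ID: 0xf1 (Unknown NAND 128MiB 3,3V 8-bit)"

ID_PATTERN = re.compile(r"Manufacturer ID: (0x[0-9a-fA-F]+), Chip ID: (0x[0-9a-fA-F]+)")


def parse_nand_id(line):
    """Return (manufacturer_id, chip_id) as integers from a kernel log line."""
    match = ID_PATTERN.search(line)
    if match is None:
        raise ValueError("no NAND ID found in line")
    return int(match.group(1), 16), int(match.group(2), 16)


def as_ascii(byte_values):
    """Return the bytes as ASCII text if all are printable, else None."""
    if all(0x20 <= b < 0x7F for b in byte_values):
        return bytes(byte_values).decode("ascii")
    return None


for board, line in (("V1.2B", LOG_V12B), ("V1.2A", LOG_V12A)):
    ids = parse_nand_id(line)
    print(board, [hex(b) for b in ids], "ascii:", as_ascii(ids))
```

On the V1.2A line the same check yields no printable text, so the "ON" pattern is specific to the failing board's log.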

However, a V1.2A board boots fine with the same image:

[    0.451234] FSL NFC MTD nand Driver 1.0
[    0.455578] NAND device: Manufacturer ID: 0xc2, Chip ID: 0xf1 (Unknown NAND 128MiB 3,3V 8-bit)
[    0.467124] 6 cmdlinepart partitions found on MTD device fsl_nfc
[    0.473301] Creating 6 MTD partitions on "fsl_nfc":
[    0.478250] 0x000000000000-0x000000020000 : "vf-bcb"
[    0.485708] 0x000000020000-0x000000180000 : "u-boot"
[    0.493130] 0x000000180000-0x000000200000 : "u-boot-env"
[    0.500860] 0x000000200000-0x000000a00000 : "kernel-ubi"
[    0.508480] 0x000000a00000-0x000001d00000 : "rootfs-ubi"
[    0.516342] 0x000001d00000-0x000008000000 : "userfs-ubi"
[    0.525156] UBI: attaching mtd4 to ubi0
[    0.529227] UBI: physical eraseblock size:   131072 bytes (128 KiB)
[    0.535557] UBI: logical eraseblock size:    126976 bytes
[    0.541079] UBI: smallest flash I/O unit:    2048
[    0.545836] UBI: VID header offset:          2048 (aligned 2048)
[    0.551956] UBI: data offset:                4096
[    0.733717] UBI: max. sequence number:       98
[    0.754389] UBI: attached mtd4 to ubi0
[    0.758215] UBI: MTD device name:            "rootfs-ubi"
[    0.763780] UBI: MTD device size:            19 MiB
[    0.768712] UBI: number of good PEBs:        151
[    0.773443] UBI: number of bad PEBs:         1
[    0.777928] UBI: number of corrupted PEBs:   0
[    0.782473] UBI: max. allowed volumes:       128
[    0.787131] UBI: wear-leveling threshold:    4096
[    0.791937] UBI: number of internal volumes: 1
[    0.796421] UBI: number of user volumes:     1
[    0.800966] UBI: available PEBs:             0
[    0.805451] UBI: total number of reserved PEBs: 151
[    0.810428] UBI: number of PEBs reserved for bad PEB handling: 2
[    0.816491] UBI: max/mean erase counter: 4/1
[    0.820863] UBI: image sequence number:  0
[    0.825017] UBI: background thread "ubi_bgt0d" started, PID 39
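
As a sanity check on the working layout, the numbers in the V1.2A log can be cross-checked against each other. The sketch below is only an illustration (the variable and function names are mine); all values are copied from the log above. It computes each partition's size from its start and end offsets and compares the rootfs-ubi partition with the PEB counts UBI reports.

```python
# (start, end, name) copied from the V1.2A boot log.
PARTITIONS = [
    (0x000000000000, 0x000000020000, "vf-bcb"),
    (0x000000020000, 0x000000180000, "u-boot"),
    (0x000000180000, 0x000000200000, "u-boot-env"),
    (0x000000200000, 0x000000A00000, "kernel-ubi"),
    (0x000000A00000, 0x000001D00000, "rootfs-ubi"),
    (0x000001D00000, 0x000008000000, "userfs-ubi"),
]

# Values reported by UBI for mtd4 ("rootfs-ubi") in the same log.
PEB_SIZE = 131072      # physical eraseblock size in bytes
LEB_SIZE = 126976      # logical eraseblock size in bytes
DATA_OFFSET = 4096     # data offset within each PEB
GOOD_PEBS = 151
BAD_PEBS = 1


def partition_sizes(partitions):
    """Map partition name to size in bytes (end offset is exclusive)."""
    return {name: end - start for start, end, name in partitions}


sizes = partition_sizes(PARTITIONS)
for name, size in sizes.items():
    print(f"{name:12s} {size // 1024:7d} KiB  {size // PEB_SIZE:4d} PEBs")

# rootfs-ubi spans 19 MiB, i.e. 152 PEBs = 151 good + 1 bad.
rootfs_pebs = sizes["rootfs-ubi"] // PEB_SIZE
print("rootfs-ubi PEBs:", rootfs_pebs, "reported:", GOOD_PEBS + BAD_PEBS)

# The LEB size is the PEB size minus the data offset (space used by headers).
print("PEB - data offset:", PEB_SIZE - DATA_OFFSET, "reported LEB:", LEB_SIZE)
```

The partitions add up to exactly 128 MiB, matching the chip size the kernel detected on the V1.2A board.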

The Product Change Notification sent by Toradex states:

− Linux: The new NAND flash device reports a different product ID and device name. The Toradex Linux BSPs don’t make use of the product ID or device name. No BSP update is required. The new flashes are supported by older Linux BSPs as well. We tested backwards compatibility back to: - Linux: BSP version 2.4 stable.

I prefer not to upgrade to the latest kernel, since that could mean recompiling the tools I am running on the current kernel. Please let me know what could be the problem here. Thanks!

I believe we changed the NAND flash layout a few years back, which is a non-trivial change. Please note, however, that the oldest BSP release we still support is V2.5:

https://developer.toradex.com/software/linux/linux-software/release-details?view=all&version=V2.5

Anything older is no longer supported. You may also consult the following Embedded Linux Support Strategy document:

Thanks for the quick reply. I understand that this kernel is no longer supported. What I am looking for here is a pointer to what the problem could be.

What is puzzling is that, according to the PCN, the only change is the NAND chip, which should not affect the software. Yet here I am with a board that does not boot.

Do you perhaps know more details about the change? Did the NAND chip replacement also change something else? From my point of view, based on what the PCN states, no changes should be needed on my side.

You mentioned that the NAND flash layout changed. Can you elaborate on that? I do not see it mentioned in the PCN.

Your help is much appreciated.

This is not mentioned in the PCN because it has absolutely nothing to do with any of that. Your problem is that, for some unknown reason, you still seem to be running a beta BSP that is more than four years old. It used a preliminary NAND flash layout that has long since been superseded, and getting things working on such obsolete beta software is probably rather hard.

To be frank, the reason is that I have a product line that has been using this stripped-down kernel image for the last few years. Upgrading the image would also mean recompiling and upgrading all the customized tools and applications I run on it, since the kernel has already moved on to 4.4. That is potentially a lot of effort. I am therefore trying to gauge the effort of upgrading against that of fixing the kernel issue, which is why I am asking on the forum to gain some insight.

It was indeed a surprise to me that a small NAND flash replacement would cause an issue like this.

Thanks for your input thus far, really appreciated.

It sounds rather scary to run a product line on such beta software, which was intended solely for use on sample modules, and it is really amazing that you did not run into any issues earlier.

Effort-wise, of course, migrating ancient downstream stuff is rather involved, which is why it would be a much smarter approach to stay reasonably up to date on a more or less continuous basis.

Concerning ancient downstream BSPs, you may find much more information in the former release notes:

https://developer.toradex.com/files/toradex-dev/uploads/media/Colibri/Linux/Images/Colibri_VF_LinuxReleaseNotesV2.x.txt