Not a solution per se, but I can give you some info on how we solve the
reliability issue in our product that uses RAUC.

* We store the env at a raw offset in the eMMC (this should work for SD as
  well) rather than as a file on a FAT partition. You will need to set up
  your partition table to leave room for this and modify the U-Boot config.
* We use redundant U-Boot environments placed in different sectors of the
  eMMC. This is a built-in feature of U-Boot that can be enabled in the
  config. If one copy gets corrupted, U-Boot gracefully falls back on the
  previous one.
* We have custom code in both U-Boot and Linux that checks for corrupt or
  inconsistent RAUC U-Boot environment vars. If they are totally out of
  whack, we boot into our fail-safe recovery mode, where the env vars are
  reset to a sane default and an update can be performed (no RMA needed).

We've had this setup for the past year, and I haven't once seen or heard of
a corrupt U-Boot env on any of our development units. Unfortunately, we
don't have analytics around this event in the field.

I know this isn't exactly an answer to your question, but hopefully some of
this helps you arrive at a robust solution for your setup.

Best,
~Matt

On Sun, Mar 28, 2021 at 6:11 AM Einar Vading wrote:
> > Hi,
> >
> > On Fri, 2021-03-26 at 05:48 +0000, Einar Vading wrote:
> > > > > Hi,
> > > > >
> > > > > On Thu, 2021-03-25 at 15:22 +0000, Einar Vading wrote:
> > > > > > We have a Raspberry Pi 4 system set up using RAUC for updates
> > > > > > and u-boot for booting. For some systems in the field we have
> > > > > > the u-boot environment on the FAT boot partition and we mount
> > > > > > that in fstab so that RAUC can access it with the
> > > > > > fw_print/setenv commands.
> > > > > >
> > > > > > One issue we have seen is that the env-file gets corrupted every
> > > > > > now and then. After corruption we can't RAUC update. The only
> > > > > > solution we have to this problem now is to delete the corrupted
> > > > > > env-file and reboot, then we can perform the upgrade.
> > > > > >
> > > > > > I have no idea how to track down whatever corrupts the file and
> > > > > > I was wondering if anyone has any input.
> > > > >
> > > > > You could try placing the environment on a separate partition to
> > > > > avoid any potential issues in the FAT implementation. Also, I
> > > > > think U-Boot has a way to support redundant environments.
> > >
> > > I have just done this for our newer systems. I moved the GPT
> > > partitions back 4MB and placed two redundant environments between the
> > > GPT and the first GPT partition.
> > >
> > > It is my understanding though that redundant environments are not
> > > supported when storing the env on FAT?
> >
> > That's probably a question for the U-Boot mailing list. :)
> >
> > > > Exactly. This should also be documented in the U-Boot integration
> > > > guideline for eMMC:
> > > >
> > > > https://rauc.readthedocs.io/en/latest/integration.html#example-setting-up-u-boot-environment-on-emmc-sd-card
> > > >
> > > > When writing to the FAT very short before hard rebooting, I could
> > > > imagine this can lead to failures. Do you see the corruption only
> > > > after updates, or also suddenly after n boots?
> > >
> > > Yes, this is something we have been able to test. If we cut the power
> > > precisely when the env is written to FAT we can corrupt the entire
> > > boot partition.
> > > Super scary but this is not the problem we're seeing in the field.
> > > That problem is more subtle.
> >
> > It should be possible to mount fat with the 'sync' option, but I'm not
> > sure if that would help in this case. I'd recommend avoiding mounting
> > FAT filesystems R/W if possible.
>
> Maybe it could help with the problem I'm investigating. Don't think it
> would help with the total corruption on powerloss when writing u-boot env,
> since that is in u-boot and the fs is not "mounted" yet.
>
> > > > How does the system report the corruption?
> > >
> > > fw_printenv and fw_setenv stops working and says that the env is
> > > corrupted. That also means that RAUC update fails, that is usually
> > > when we notice it.
> > >
> > > Is there a way to watch a file and record any process that modifies it?
> >
> > There is blktrace, but you don't see the contents that way. It still may
> > be enough detail to understand what's happening here.
>
> Great, I'll check that out.
>
> > Regards,
> > Jan
>
> Thanks for all the help.
>
> Regards,
> Einar
>
> _______________________________________________
> RAUC mailing list
>

--
Matthew Campbell
Principal Engineer
mcampbell@izotope.com
iZotope, Inc.
www.izotope.com
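
For illustration only, here is a minimal Python sketch of the kind of check the reply describes: validate two redundant environment copies, pick the active one, and sanity-check the RAUC variables. It is not the code used in the product mentioned above. It assumes the usual redundant layout as I understand it (a 4-byte CRC32 stored little-endian, a 1-byte flags write counter, then NUL-separated `key=value` data) and the variable names from RAUC's example U-Boot script (`BOOT_ORDER`, `BOOT_A_LEFT`, `BOOT_B_LEFT`). `ENV_SIZE`, the slot names and all function names are hypothetical.

```python
import struct
import zlib

ENV_SIZE = 0x4000  # hypothetical size of one env copy; must match your U-Boot config
HEADER = 5         # assumed layout: 4-byte CRC32 + 1-byte flags (write counter)


def build_env(variables, flags, size=ENV_SIZE):
    """Serialize variables into one env copy (only used to create sample data)."""
    data = b"".join(f"{k}={v}".encode() + b"\0" for k, v in variables.items()) + b"\0"
    data = data.ljust(size - HEADER, b"\xff")
    return struct.pack("<IB", zlib.crc32(data), flags) + data


def parse_env(blob):
    """Return (flags, variables) if the stored CRC matches the data, else None."""
    if len(blob) <= HEADER:
        return None
    crc, flags = struct.unpack_from("<IB", blob)
    data = blob[HEADER:]
    if zlib.crc32(data) != crc:
        return None
    variables = {}
    for entry in data.split(b"\0"):
        if not entry:
            break  # an empty entry marks the end of the variable list
        key, _, value = entry.partition(b"=")
        variables[key.decode(errors="replace")] = value.decode(errors="replace")
    return flags, variables


def pick_active(copy_a, copy_b):
    """Pick the variables of the valid, most recently written copy.

    Returns None when both copies are corrupt, in which case the caller
    would fall back to defaults or a recovery mode.
    """
    a, b = parse_env(copy_a), parse_env(copy_b)
    if a is None and b is None:
        return None
    if a is None:
        return b[1]
    if b is None:
        return a[1]
    flags_a, flags_b = a[0], b[0]
    # Both valid: the higher counter wins, treating 255 -> 0 as a wraparound.
    if flags_a == 255 and flags_b == 0:
        return b[1]
    if flags_b == 255 and flags_a == 0:
        return a[1]
    return a[1] if flags_a >= flags_b else b[1]


def rauc_vars_sane(variables, slots=("A", "B")):
    """Check that BOOT_ORDER names only known slots and each counter is numeric."""
    order = variables.get("BOOT_ORDER", "").split()
    if not order or any(slot not in slots for slot in order):
        return False
    return all(variables.get(f"BOOT_{slot}_LEFT", "").isdigit() for slot in slots)
```

On a running system the two copies would be read from the raw device at the offsets listed in `/etc/fw_env.config`; a `None` from `pick_active()` or a `False` from `rauc_vars_sane()` is the point at which a setup like the one described would reset to defaults or enter recovery.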