
SDRAM problems with STM32F427/437/429/439


The FMC (flexible memory controller) of these microcontrollers is a bug-ridden beast. The latest errata sheet lists a total of 15 “silicon limitations”. Two of them concern SDRAM and are particularly ugly, as they may result in corrupt reads. As of now (Dec 2017), neither of them is fixed in any silicon revision, so you may want to continue reading if you use any of these parts with SDRAM…

A mysterious bug…

Over the last few days, my client and I were hunting a mysterious bug: when the STM32F439 was really busy (i.e. handling lots of interrupts), on very rare occasions two invalid bytes would arrive at the host over USB.

After putting checks in place and making sure the data was handed to the USB stack correctly, it turned out that when the bug occurred, the invalid data had actually been read from the SDRAM, although the data stored there appeared to be intact.

After we had re-checked that the SDRAM timing and initialization were correct, a look at the latest errata sheet revealed a likely candidate: “2.11.15 SDRAM bank address corruption upon an interruption of CPU read burst access”.

If the CPU takes an interrupt while it is busy with an LDM (load multiple) from SDRAM, the next read from another bank of the SDRAM may return wrong data. Whoa.

Who should be concerned?

Everyone who uses an STM32F427/STM32F437/STM32F429/STM32F439 with SDRAM. In our case, the problem went unnoticed for a couple of years, although these microcontrollers had been used in numerous projects. It surfaced in only one project, under very specific circumstances.

The workarounds

Luckily, there are several workarounds for every taste.

Workaround #1: 32-bit SDRAM

If you’re using 32-bit SDRAM, you can simply disable the SDRAM read FIFO by clearing RBURST in FMC_SDCR1/FMC_SDCR2 when initializing the FMC.

Why only 32-bit SDRAM? Because there’s another bug in the FMC: “2.11.5 Interruption of CPU read burst access to an end of SDRAM row” that only affects 8- and 16-bit SDRAMs. The workaround for this problem would be turning the FIFO on, which is the opposite of the workaround for the problem we’re trying to solve.

In any case, turning off the FIFO is a viable workaround when using 32-bit SDRAMs. It will probably hurt SDRAM performance a little, but maybe that’s not an issue in your case.
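
A minimal sketch of this workaround in C. It assumes that RBURST is bit 12 of the SDRAM control register, so please check that against the reference manual of your part. The helper name and the pointer parameter are made up for illustration; passing the register as a pointer keeps the snippet independent of any vendor header.

```c
#include <stdint.h>

/* Assumption: RBURST is bit 12 of the FMC SDRAM control register. */
#define FMC_SDCR_RBURST (1u << 12)

/* Hypothetical helper: clear RBURST so that reads no longer go through
 * the read FIFO. On the target, pass the address of the FMC_SDCRx
 * register, e.g. (volatile uint32_t *)0xA0000140 for FMC_SDCR1
 * (address is an assumption, verify it in the reference manual). */
static inline void sdram_disable_read_burst(volatile uint32_t *sdcr)
{
    *sdcr &= ~FMC_SDCR_RBURST;
}
```

Only the one bit is touched, so the call can go right after the usual FMC setup without disturbing the other control fields.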

Workaround #2: Set DISMCYCINT

Another workaround I came up with that’s not mentioned in the errata sheet is to set DISMCYCINT in the “Auxiliary control register” (ACTLR). If this bit is set, LDMs will not get interrupted and the problem will go away.
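
In C, setting the bit could look roughly like the sketch below. It assumes that DISMCYCINT is bit 0 of ACTLR and that ACTLR lives at 0xE000E008, as documented by ARM for the Cortex-M4; the helper name is hypothetical.

```c
#include <stdint.h>

/* Assumption: DISMCYCINT is bit 0 of the Auxiliary Control Register. */
#define ACTLR_DISMCYCINT (1u << 0)

/* Hypothetical helper: set DISMCYCINT so that multi-cycle instructions
 * such as LDM run to completion before an interrupt is taken.
 * On the target, pass (volatile uint32_t *)0xE000E008 (assumed address). */
static inline void cpu_disable_multicycle_interrupt(volatile uint32_t *actlr)
{
    *actlr |= ACTLR_DISMCYCINT;
}
```

If you build against CMSIS, the same thing should be expressible as `SCnSCB->ACTLR |= SCnSCB_ACTLR_DISMCYCINT_Msk;`.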

There’s a drawback, though: The worst-case interrupt latency increases.

This FAQ entry tells us that the longest possible burst is a VLDM Rx, {S0-S31}, which transfers 32 words or 128 bytes.

In our configuration (16-bit SDRAM @ 90 MHz), such a burst takes almost 1 µs. This may be too long if you have hard real-time requirements. Other than that, it’s a very simple and effective workaround.
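
To get a feeling for where that figure comes from, here is a rough lower-bound calculation in C. It assumes one bus word per SDRAM clock and ignores row activation, CAS latency and any FMC overhead, so real bursts will take longer. The function name is made up for illustration.

```c
#include <stdint.h>

/* Lower bound, in nanoseconds, for moving `bytes` over a data bus that is
 * `bus_bits` wide and clocked at `clock_hz`, one bus word per clock.
 * Assumption: no latency or protocol overhead is included. */
static inline uint32_t burst_min_ns(uint32_t bytes, uint32_t bus_bits,
                                    uint32_t clock_hz)
{
    /* Number of bus transfers, rounded up. */
    uint32_t cycles = (bytes * 8u + bus_bits - 1u) / bus_bits;
    return (uint32_t)(((uint64_t)cycles * 1000000000u) / clock_hz);
}
```

For 128 bytes over a 16-bit bus at 90 MHz, `burst_min_ns(128, 16, 90000000)` yields 711 ns (64 clock cycles). The SDRAM latencies come on top of that, which fits the "almost 1 µs" above.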

Workaround #3: “Precharge all” before the 2nd read

Another workaround mentioned in the errata sheet is to “[…] send a PRECHARGE ALL command before the second read access […]”.

Since “before the second read access” also means “after the first (interrupted) access”, we can simply send a “precharge all” command as the very first thing in every ISR.

To achieve that, one can wrap all ISRs with a specially crafted vector table.

A few things to consider:

  • VTOR must point to this table.
  • A vector table must be aligned to its size rounded up to the next power of two; on these devices that is 128 words (512 bytes).
  • To be on the safe side, make sure the first two entries (initial SP and initial PC) contain proper values. There may be some code using them for a reset.
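
The points above could be sketched in C as follows. This is only an illustration under a few assumptions: the table is padded to 128 entries, the compiler supports C11 `_Alignas`, and the names `wrapper_table` and `build_wrapper_table` are invented here.

```c
#include <stdint.h>
#include <stddef.h>

#define VECTOR_COUNT 128u  /* assumption: table padded to 128 entries */

/* 512-byte alignment as required for VTOR on these devices. */
static _Alignas(512) uint32_t wrapper_table[VECTOR_COUNT];

/* Hypothetical helper: copy the initial SP and the reset vector from the
 * real vector table and point every other entry at the wrapper routine.
 * Returns the table so the caller can write its address to VTOR. */
static inline uint32_t *build_wrapper_table(const uint32_t *real_table,
                                            uint32_t wrapper_addr)
{
    wrapper_table[0] = real_table[0];  /* initial SP */
    wrapper_table[1] = real_table[1];  /* initial PC (reset handler) */
    for (size_t i = 2; i < VECTOR_COUNT; i++) {
        wrapper_table[i] = wrapper_addr;
    }
    return wrapper_table;
}
```

On the target, `wrapper_addr` would be the address of the wrapper routine (with the Thumb bit set), and the returned pointer would be written to VTOR, e.g. via `SCB->VTOR` when using CMSIS.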

Then, have all interrupt vectors point to a routine along these lines:

isr_wrapper
	; send PRECHARGE ALL command to SDRAM 1 via FMC
	LDR    R0, =0xa0000150 ; FMC_SDCMR register
	MOVS   R1, #0x12       ; Send PALL to SDRAM 1
	STR    R1, [R0]
	
	MRS    R0, IPSR        ; Get current interrupt number ..
	; .. and use it as the index to look up the actual ISR and branch there
	; (in our case, we are using a vector table in RAM)
	LDR    R1, =__VECTOR_RAM
	LDR    R0, [R1, R0, LSL #2]
	BX     R0

If you have two SDRAMs, you’ll have to send PALL to both of them. Also, your actual vector table probably has a different name, or it may just sit at address 0 if you’re using the vector table in Flash ROM.

This is the workaround we finally went with, as it hardly increases interrupt latency and is not too invasive. Long-term testing showed that with it in place, the problem no longer occurs.
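
For the two-SDRAM case, the command word could be composed as in the following sketch. It assumes the FMC_SDCMR layout from the reference manual (MODE in bits 2:0 with 010 meaning PALL, CTB2 in bit 3, CTB1 in bit 4), which matches the 0x12 used for SDRAM 1 in the wrapper. The helper name is hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed FMC_SDCMR fields, please verify against the reference manual. */
#define FMC_SDCMR_MODE_PALL 0x2u       /* MODE = 010: precharge all */
#define FMC_SDCMR_CTB2      (1u << 3)  /* command targets SDRAM bank 2 */
#define FMC_SDCMR_CTB1      (1u << 4)  /* command targets SDRAM bank 1 */

/* Hypothetical helper: build the PALL command word for the selected banks. */
static inline uint32_t sdram_pall_command(bool bank1, bool bank2)
{
    uint32_t cmd = FMC_SDCMR_MODE_PALL;
    if (bank1) {
        cmd |= FMC_SDCMR_CTB1;
    }
    if (bank2) {
        cmd |= FMC_SDCMR_CTB2;
    }
    return cmd;
}
```

With both banks selected this gives 0x1A, so under these assumptions the wrapper would load `#0x1A` instead of `#0x12` and still need only a single store.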