Alliance Memory NAND Flash Stuck in OIP

Thread Starter

haitham dahrooj

Joined Jul 7, 2024
3
Hi all,

I’m working on evaluation boards that integrate the Alliance Memory AS5F38G04SND-08LIN QSPI NAND flash device. During our assembly/inspection flow, the boards are subjected to X-ray imaging.

Here’s what I’ve observed:
Initially, all devices were functional:
• Register communication worked fine
• Page program and data read were successful

After an additional X-ray exposure (~20 min, ~120 rad):
• Register interface is still fully functional (read/write to feature registers works)
• SPI interface signals look clean and correct on the scope
• But we cannot access the NAND flash array anymore

The main symptom: the device stays stuck with the OIP (Operation In Progress) bit set. This prevents normal read/program operations from completing.

My questions:
1. Has anyone encountered this OIP-stuck condition with this or similar NAND devices?

2. Could this be related to a known status handling or protection feature (Block Lock, WP#/HOLD#, etc.)?

3. Could X-ray/TID exposure at levels around ~120 rad be enough to cause this kind of NAND array failure while leaving register access intact?

4. Does anyone have insight into the correct sequence for array read (Page Read → Cache Read, etc.) that might clarify whether this is a usage sequence issue versus a device issue?

I’ve already reached out to Alliance Memory for guidance, but I’d also appreciate the community’s experience.

Thanks in advance for any help!
— Haitham
Reliability Engineer
 

nsaspook

Joined Aug 27, 2009
16,273
What did you think was going to happen when you inject EM energy into nano-scale insulated storage cells designed to use tiny bits of stored charge? Sure, the standard MOS digital features are still working but the flash stack array is likely toast.
https://meridian.allenpress.com/ism...fects-of-x-ray-exposure-on-NOR-and-NAND-flash
Effects of x-ray exposure on NOR and NAND flash memories during high-resolution 2D and 3D x-ray inspection

Don't think I would be using this part for anything new.

https://www.mouser.com/ProductDetail/Alliance-Memory/AS5F38G04SND-08LIN?qs=W/MpXkg%2BdQ6cTzWIadNbnA==&srsltid=AfmBOop2iXsaJFLDeS9WWJa-mvTZV3btORq130LKv0REeIWiUIckWOf_
End of Life: Scheduled for obsolescence and will be discontinued by the manufacturer.
 

Thread Starter

haitham dahrooj

Thanks for the paper link—I know it (x-ray inspection can corrupt floating-gate cells). Two clarifications, though:

Our added exposure was about 120 rad ≈ 1.2 Gy, which is orders of magnitude below the multi-krad(Si) levels where many studies start reporting widespread failure in NAND (often tens of krad). I realize setups vary by energy and bias conditions, but 1–2 Gy is still relatively modest for TID.
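A quick sanity check of the unit conversion and the gap to the krad range, as a minimal sketch. The 10 krad comparison point is an illustrative assumption standing in for "tens of krad", not a figure taken from any specific study.

```python
RAD_PER_GRAY = 100.0  # 1 Gy = 100 rad by definition


def rad_to_gray(dose_rad: float) -> float:
    """Convert an absorbed dose from rad to gray."""
    return dose_rad / RAD_PER_GRAY


exposure_rad = 120.0       # added X-ray exposure quoted in the post
reference_rad = 10_000.0   # assumed 10 krad comparison point (illustrative only)

exposure_gy = rad_to_gray(exposure_rad)
ratio = reference_rad / exposure_rad

print(f"{exposure_rad:.0f} rad = {exposure_gy:.1f} Gy")
print(f"assumed reference is {ratio:.0f}x higher")
```

Under that assumption the exposure sits roughly two orders of magnitude below the comparison point.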

The same “OIP stuck” symptom shows up on some unexposed parts under certain sequences/conditions, which suggests a status/sequence or protection/state-machine corner case is also plausible (besides radiation).

If anyone’s seen OIP never clearing after 13h (Page Read to cache) on SPI-NAND and has a checklist, I’d love the pointers. For reference, the AS5F38G04SND status/OIP and read flow are standard SPI-NAND (13h → poll C0h until OIP=0 → cache read 03h/0Bh/3Bh/6Bh). One thing I’ve already tried is issuing an FFh reset between operations, but it didn’t help, which feels more like an internal state/array path issue than pure host sequencing.
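The read flow described above (13h, poll C0h until OIP=0, then 03h) can be sketched as follows. This is only an illustration under stated assumptions: `FakeNand` is a hypothetical stand-in for the SPI bus and device, the Get Feature opcode is assumed to be 0Fh, OIP is assumed to be bit 0 of the status register, and the 03h framing (2-byte column address plus one dummy byte) follows the common SPI-NAND convention rather than anything confirmed from the AS5F38G04SND datasheet.

```python
import time

CMD_PAGE_READ = 0x13    # load a page from the array into the cache
CMD_GET_FEATURE = 0x0F  # assumed standard SPI-NAND Get Feature opcode
CMD_READ_CACHE = 0x03   # read bytes out of the cache
REG_STATUS = 0xC0       # status register address
OIP = 0x01              # assumed: OIP is bit 0 of the status register


class FakeNand:
    """Hypothetical simulated device, for illustration only."""

    def __init__(self, busy_polls=3, stuck=False):
        self.busy_polls = busy_polls
        self.stuck = stuck
        self.remaining = 0
        self.cache = bytes(range(256)) * 16  # 4096 bytes of dummy data

    def xfer(self, tx: bytes, rx_len: int = 0) -> bytes:
        op = tx[0]
        if op == CMD_PAGE_READ:
            self.remaining = self.busy_polls
        elif op == CMD_GET_FEATURE and tx[1] == REG_STATUS:
            if self.stuck:
                return bytes([OIP])  # models the stuck-OIP symptom
            busy = self.remaining > 0
            self.remaining = max(0, self.remaining - 1)
            return bytes([OIP if busy else 0])
        elif op == CMD_READ_CACHE:
            col = (tx[1] << 8) | tx[2]
            return self.cache[col:col + rx_len]
        return b""


def wait_ready(dev, timeout_s: float) -> int:
    """Poll the status register until OIP clears or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = dev.xfer(bytes([CMD_GET_FEATURE, REG_STATUS]), 1)[0]
        if not status & OIP:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"OIP still set, status=0x{status:02X}")


def read_page(dev, row: int, length: int, timeout_s: float = 0.005) -> bytes:
    """13h page read, poll for OIP=0, then 03h read from cache at column 0."""
    dev.xfer(bytes([CMD_PAGE_READ]) + row.to_bytes(3, "big"))
    wait_ready(dev, timeout_s)
    return dev.xfer(bytes([CMD_READ_CACHE, 0x00, 0x00, 0x00]), length)
```

With `FakeNand(stuck=True)` the same call raises `TimeoutError` instead of hanging, which is the behaviour a host driver would want when the bit never clears.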

(Separate note: I’m aware this device is now EOL, so alternatives are on the table—but I’m trying to close the loop on root cause for our process flow.)

— Haitham
 

nsaspook

Sounds like a marginal device for your application.
Sustained structural stress from a long-duration exposure is very different from the short X-ray bursts typically used in inspection.
 
I am having this exact same problem, also using an AS5F38G04SND-08LIN chip from Alliance Memory.
I've been debugging this issue for a couple of weeks now and this is where I'm at.

In our application, the issue always occurs during a program sequence. We're using command 0x02 for programming.
The programming sequence is as follows:
1. Send write enable command (0x06)
2. Send Program Load command (0x02) along with 2 byte address and 4096 bytes of data
3. Keep polling OIP bit in status register (command 0x10, address 0xC0).

After a random amount of time (sometimes after 5 minutes, sometimes after an hour), OIP never goes low anymore until I power cycle the IC. Even a RESET command won't clear the OIP bit (the firmware attempts to reset the chip if OIP doesn't go low within 5ms). I've increased the timeout to 100ms, but the bit still remains high. 5ms should be plenty, as the worst-case page programming time is 750µs.
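The timeout-then-reset behaviour described above can be sketched like this, as a simulation rather than the actual firmware. Assumptions to note: the sketch uses the usual SPI-NAND opcode map, in which 0x10 is Program Execute and 0x0F is Get Feature (the post lists 0x10 as the polling command, so the real driver may be organised differently); OIP is assumed to be bit 0 of the status register; `FakeNand` and the result strings are hypothetical.

```python
import time

WRITE_ENABLE, PROGRAM_LOAD, PROGRAM_EXECUTE = 0x06, 0x02, 0x10
GET_FEATURE, RESET, STATUS_REG = 0x0F, 0xFF, 0xC0  # assumed standard opcodes
OIP = 0x01  # assumed: OIP is bit 0 of the status register


class FakeNand:
    """Hypothetical simulated device; 'stuck' models OIP never clearing."""

    def __init__(self, stuck=False, reset_clears=True):
        self.stuck = stuck
        self.reset_clears = reset_clears
        self.busy = False

    def xfer(self, tx: bytes, rx_len: int = 0) -> bytes:
        if tx[0] == PROGRAM_EXECUTE:
            self.busy = self.stuck          # a healthy part finishes instantly here
        elif tx[0] == RESET and self.reset_clears:
            self.busy = False
        elif tx[0] == GET_FEATURE:
            return bytes([OIP if self.busy else 0])
        return b""


def oip_clears(dev, timeout_s: float) -> bool:
    """Return True if OIP reads 0 before the timeout, polling at least once."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = dev.xfer(bytes([GET_FEATURE, STATUS_REG]), 1)[0]
        if not status & OIP:
            return True
        if time.monotonic() >= deadline:
            return False


def program_page(dev, row: int, data: bytes, timeout_s: float = 0.005) -> str:
    """Write enable, load, execute, poll; escalate to reset if OIP stays set."""
    dev.xfer(bytes([WRITE_ENABLE]))
    dev.xfer(bytes([PROGRAM_LOAD, 0x00, 0x00]) + data)
    dev.xfer(bytes([PROGRAM_EXECUTE]) + row.to_bytes(3, "big"))
    if oip_clears(dev, timeout_s):
        return "ok"
    dev.xfer(bytes([RESET]))
    if oip_clears(dev, timeout_s):
        return "recovered_by_reset"
    return "needs_power_cycle"  # matches the symptom reported in this thread
```

Returning a distinct result for each escalation stage makes it easy to log how often a reset is enough versus how often only a power cycle helps.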

Whilst debugging, I'm also reading the status register in between each step in the above sequence. What I've noticed is that the OIP bit is low before step 3 (as expected) and goes high after (again, as expected). Sometimes though, the bit never goes back low. Sending a reset command won't bring the bit down. The address being written to is a valid address. The page was empty (all 0xFF) at the time of writing.

All MISO and MOSI data and signal integrity (including VCC integrity) have been confirmed using an oscilloscope with protocol decoding feature, so I'm confident that this is not a firmware bug.

Additional information:
- SPI mode 0
- SPI clock is 8MHz
- WP and HOLD are both tied to VCC directly
- The write rate of the application is constant (40kB/s).
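To put the write rate in perspective, here is a rough back-of-the-envelope sketch. It assumes 1 kB = 1000 bytes and that every program operation writes one full 4096-byte page; the 750µs figure is the worst-case program time quoted earlier in this post.

```python
WRITE_RATE_BYTES_PER_S = 40_000   # 40 kB/s, assuming 1 kB = 1000 bytes
PAGE_BYTES = 4096                 # bytes loaded per program operation
T_PROG_WORST_S = 750e-6           # worst-case page program time from the post

pages_per_s = WRITE_RATE_BYTES_PER_S / PAGE_BYTES
period_ms = 1000.0 / pages_per_s
busy_fraction = T_PROG_WORST_S * pages_per_s

# Approximate number of page programs before a failure at the reported times
programs_in_5_min = pages_per_s * 5 * 60
programs_in_1_h = pages_per_s * 60 * 60

print(f"{pages_per_s:.2f} page programs per second, one every {period_ms:.1f} ms")
print(f"device busy at most {busy_fraction:.2%} of the time")
print(f"about {programs_in_5_min:.0f} programs in 5 min, {programs_in_1_h:.0f} in 1 h")
```

Under these assumptions the part is idle for the vast majority of the time, and a failure shows up only after thousands of successful page programs.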

Adding delays between SPI commands and between transfers makes no difference. A slower or faster clock makes no difference either.

I'm curious to know whether you have any more insights.
 

Thread Starter

haitham dahrooj

Hi, thanks for sharing your experience.
We haven’t reached any clear conclusion yet.

One thing we’re currently investigating is whether this might be related to process compatibility between the flash device and the specific host controller used.

Would you be able to share what controller/MCU you’re using in your setup?
It could help us compare results and see if the problem is linked to certain platforms.

We are using an STM32F4.
 
Hi Haitham,

We're using an nRF5340 running Zephyr OS (Build v3.1.99-ncs1-1) and nRF Connect SDK V2.1.2. We've developed the NAND driver and Zephyr disk driver in-house. The file system library used by Zephyr is ELM FAT.

We've just placed an order for an equivalent NAND flash chip from a different vendor to see if the problem is tied to the chip. I expect to be able to test this tomorrow. Will keep you informed.
 
This morning I received the flash chip and connected it to our application. I've left it running for a few hours and there haven't been any errors so far.

For reference, the alternative flash chip is a Macronix MX35LF4G24AD-Z4I8. This is a 4Gb chip instead of 8Gb, but it is fully pin compatible. We picked this chip for testing because it was the only one we could find that was pin and package compatible.

The firmware was adjusted to accept the new device ID and the block count was lowered from 4096 to 2048 to make the new chip work.
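A change like this can be kept in one place with a small geometry table keyed by device ID. The sketch below is hypothetical: the ID keys are placeholders rather than the real JEDEC IDs, and 64 pages per block is an assumption chosen because it makes the block counts above match the 8Gb and 4Gb capacities with 4096-byte pages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Geometry:
    name: str
    blocks: int
    pages_per_block: int = 64   # assumption, not confirmed from the datasheets
    page_bytes: int = 4096      # main-area page size used in this thread

    @property
    def capacity_bits(self) -> int:
        return self.blocks * self.pages_per_block * self.page_bytes * 8


# Keys are placeholders; a real driver would use the bytes returned by Read ID.
KNOWN_DEVICES = {
    "ALLIANCE_ID_PLACEHOLDER": Geometry("AS5F38G04SND", blocks=4096),
    "MACRONIX_ID_PLACEHOLDER": Geometry("MX35LF4G24AD", blocks=2048),
}


def geometry_for(device_id: str) -> Geometry:
    """Look up the geometry for a device ID, rejecting unknown parts."""
    try:
        return KNOWN_DEVICES[device_id]
    except KeyError:
        raise ValueError(f"unsupported NAND device ID: {device_id!r}") from None


for geo in KNOWN_DEVICES.values():
    print(geo.name, geo.blocks, "blocks,", geo.capacity_bits // 2**30, "Gbit")
```

Rejecting unknown IDs outright avoids silently running a part with the wrong block count.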

So it seems that there is something happening inside the Alliance Memory chip, causing the OIP bit to get stuck.

I hope this will be of use to you.
 