📖 Tutorial

10 Reasons Why Android AICore Storage Spikes (and What It Means for You)

Last updated: 2026-05-04

If you own a recent Android device with built-in AI capabilities, you might have noticed that the Android AICore—the system responsible for running on-device generative AI models like Gemini Nano—occasionally uses more storage than expected. Google recently shed light on this phenomenon, explaining that while large models are inherently space-hungry, the occasional storage surges are tied to several deliberate processes. In this listicle, we break down the top ten reasons behind those storage spikes, so you can understand what's going on under the hood and maybe even appreciate the engineering behind it.

Table of Contents

  1. Model Updates and Versioning
  2. Caching of Intermediate Outputs
  3. On-Device Training and Fine-Tuning
  4. Downloading Additional Language Packs
  5. Runtime Optimization Artifacts
  6. User Data Preprocessing for Personalization
  7. Temporary Files from Model Execution
  8. Fallback Models for Offline Scenarios
  9. Security and Privacy Encryption Overheads
  10. Garbage Collection Delays

1. Model Updates and Versioning

Google frequently pushes updates to Gemini Nano, the core AI model within Android AICore, to improve accuracy, speed, or add new capabilities. Each update may introduce a newer version of the model while keeping the previous one temporarily to ensure a smooth transition. During this overlap, storage consumption can double or even triple. Once the old model is safely deprecated and removed, the storage spike subsides. These updates are typically delivered via Google Play Services, so you may notice a temporary storage increase after a system update.
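Google hasn't published how AICore stages these updates internally, but the keep-both-then-swap pattern described above is a standard technique. The sketch below is purely illustrative (the directory layout and function names are assumptions, not AICore's actual code): the new version is downloaded alongside the old one, a `current` pointer is swapped atomically, and only then is the old version deleted. The window between download and deletion is exactly when storage usage doubles.

```python
import os
import shutil

def install_model_update(models_dir, new_version, fetch_fn):
    """Stage a new model version next to the current one, atomically
    repoint 'current', then delete the old version. While both versions
    exist on disk, storage roughly doubles -- the 'spike'."""
    staging = os.path.join(models_dir, f"v{new_version}")
    os.makedirs(staging, exist_ok=True)
    fetch_fn(staging)  # download/copy the new weights into the staging dir

    current_link = os.path.join(models_dir, "current")
    old_target = os.path.realpath(current_link) if os.path.islink(current_link) else None

    # Atomic swap: build the new symlink under a temp name, then rename over.
    tmp_link = current_link + ".tmp"
    if os.path.islink(tmp_link) or os.path.exists(tmp_link):
        os.remove(tmp_link)
    os.symlink(staging, tmp_link)
    os.replace(tmp_link, current_link)

    # Only after the swap is it safe to reclaim the old version's storage.
    if old_target and old_target != os.path.realpath(staging):
        shutil.rmtree(old_target, ignore_errors=True)
    return current_link
```

The key design choice is that the old version is never removed until the new one is fully in place, so an interrupted update can't leave the device without a working model; the cost is the temporary double footprint.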

(Image source: 9to5google.com)

2. Caching of Intermediate Outputs

When you use generative AI features—like smart reply, photo editing, or text summarization—Android AICore often caches intermediate results to provide faster responses for repeated or similar queries. For example, if you frequently ask for photo caption suggestions, the system stores processed data locally to avoid recalculating every time. This caching layer can accumulate over days or weeks, especially if you use AI features heavily. Google clears these caches periodically, but during peak usage, they can take up noticeable storage space.
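The accumulate-then-clear behavior is what a size-bounded cache looks like in any system. As a conceptual stand-in (this is not AICore's cache, just a minimal sketch of the pattern), here is a byte-budgeted LRU cache: entries pile up until the budget forces eviction of the least-recently-used results, mirroring how AICore's caches grow during heavy use and shrink when cleared.

```python
from collections import OrderedDict

class ResultCache:
    """Size-bounded cache for expensive AI outputs. Entries accumulate
    (the 'storage spike') until the byte budget forces least-recently-used
    eviction -- a stand-in for AICore's periodic cache clearing."""
    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self._entries = OrderedDict()  # key -> bytes payload

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as recently used
            return self._entries[key]
        return None

    def put(self, key, payload):
        if key in self._entries:
            self.used -= len(self._entries.pop(key))
        self._entries[key] = payload
        self.used += len(payload)
        while self.used > self.max_bytes:  # evict oldest until under budget
            _, evicted = self._entries.popitem(last=False)
            self.used -= len(evicted)
```

Note that `used` can legitimately sit right at the budget for long stretches, which is why a cache that is "working as intended" can still look like a storage spike from the outside.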

3. On-Device Training and Fine-Tuning

To personalize your experience, Android AICore may perform lightweight on-device training or fine-tuning of the AI model using your usage patterns. This process requires storing a copy of the base model plus the adjusted weights, which effectively doubles the storage footprint for a short period. Once the fine-tuning completes and the original weights are merged, the extra copy is deleted. This is why you might see storage usage jump after a period of heavy interaction with AI features like predictive text or app suggestions.
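The arithmetic behind that temporary doubling is simple. Assuming (as the description above suggests) that the read-only base weights, the adjusted weights, and a merged copy briefly coexist on disk, peak storage is roughly twice the base model plus the adjustment data:

```python
def fine_tune_peak_bytes(base_model_bytes, adapter_bytes):
    """Peak storage during an on-device fine-tune that keeps the base
    weights read-only, writes adjusted weights separately, then merges:
    base + adapter + a merged copy coexist briefly. The merged model is
    assumed to be the same size as the base."""
    merged_bytes = base_model_bytes
    return base_model_bytes + adapter_bytes + merged_bytes
```

For a hypothetical 2 GB base model with 50 MB of adjusted weights, peak usage would briefly touch about 4.05 GB before dropping back to 2 GB once the extra copy is deleted.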

4. Downloading Additional Language Packs

Gemini Nano supports multiple languages for text generation and understanding. When you switch languages or the system detects multilingual use, it may download additional language-specific model components. Each language pack can be tens to hundreds of megabytes, depending on the model size. If you have several languages enabled, the cumulative storage can spike significantly. Google optimizes by downloading only necessary components, but the initial download or update of a new language pack causes a temporary increase.

5. Runtime Optimization Artifacts

Android AICore relies on hardware acceleration backends (such as the XNNPACK library or the Android Neural Networks API, NNAPI) to optimize AI models for your device's specific hardware (CPU, GPU, or NPU). These backends generate specialized execution graphs and compiled binary artifacts that are stored locally. When your device receives a software update or when the model is first run, these artifacts are created, consuming extra storage. Over time, as new optimizations are generated for different tasks, the stored artifacts may accumulate until a cleanup routine runs.

6. User Data Preprocessing for Personalization

To make AI responses more relevant, Android AICore may preprocess and store anonymized snippets of your data—like recent messages, app usage frequency, or location patterns. This preprocessing helps the model adapt to your context without sending raw data to the cloud. The stored data is encrypted and kept in a dedicated cache folder. Depending on how much you interact with AI features, this cache can grow to several hundred megabytes. Periodic garbage collection and user-triggered cleanup (via Settings) can reclaim this space.


7. Temporary Files from Model Execution

Every time you invoke an on-device AI feature, the runtime creates temporary files to hold intermediate tensors, logs, and debug information. These temp files are usually deleted after the task finishes, but if the execution is interrupted (e.g., app crash, low battery shutdown), they may be left behind. A build-up of such orphaned files can lead to a storage spike over time. Google has improved cleanup routines in recent AICore versions, but some devices may still experience residual temp file accumulation.
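A cleanup pass for orphaned temp files is typically just an age-based sweep. The sketch below is a generic illustration of the technique, not AICore's actual routine: any file in the temp directory older than a threshold is assumed to belong to a run that no longer exists and is deleted, reclaiming the space an interrupted task left behind.

```python
import os
import time

def sweep_orphaned_temp_files(temp_dir, max_age_seconds, now=None):
    """Delete temp files older than max_age_seconds and return the bytes
    freed. Fresh files are left alone on the assumption that a live task
    may still be using them."""
    now = time.time() if now is None else now
    freed = 0
    for name in os.listdir(temp_dir):
        path = os.path.join(temp_dir, name)
        if not os.path.isfile(path):
            continue
        age = now - os.path.getmtime(path)
        if age > max_age_seconds:
            freed += os.path.getsize(path)
            os.remove(path)
    return freed
```

The age threshold is the important knob: too short and the sweep can race a slow but live task; too long and orphaned files linger, which is exactly the residual accumulation some devices still see.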

8. Fallback Models for Offline Scenarios

To ensure AI features work even without an internet connection, Android AICore may keep a smaller, offline-compatible version of Gemini Nano as a fallback. This fallback model is separate from the primary online-optimized model, and both may be stored simultaneously. When connectivity is intermittent, the system may swap between them, keeping both copies loaded for quick switching. This redundancy can double the storage usage at certain times, especially when preparing for anticipated offline usage.

9. Security and Privacy Encryption Overheads

Google encrypts all sensitive data stored by Android AICore (model weights, user cache) to meet privacy standards. Encryption itself introduces metadata and padding that increase the effective file size. Additionally, the system may store encrypted versions alongside unencrypted runtime copies during processing, leading to a temporary storage spike. Once the processing completes and encrypted data is swapped in, the extra copy is removed. This is a necessary overhead for protecting user privacy.
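The padding part of that overhead is easy to quantify for block ciphers. With PKCS#7-style padding (used with AES's 16-byte blocks, for example), every file grows by 1 to 16 bytes so that its length is a whole number of blocks; the exact scheme AICore uses is not public, so treat this as a generic illustration:

```python
def padded_size(plaintext_bytes, block_size=16):
    """Size after PKCS#7-style block padding (e.g. AES's 16-byte blocks).
    Padding always adds between 1 and block_size bytes, so every
    encrypted file is slightly larger than its plaintext -- even one
    that was already block-aligned gains a full padding block."""
    return (plaintext_bytes // block_size + 1) * block_size
```

Per-file, this is tiny (a 1 MiB file grows by only 16 bytes), so the noticeable spikes in this category come from the temporary side-by-side encrypted/unencrypted copies rather than from padding itself.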

10. Garbage Collection Delays

Android AICore relies on the operating system's storage management to clean up deleted files. However, due to the way filesystems handle fragmentation and delayed garbage collection, the freed space may not become available immediately. After a large deletion event (like model update removal), the system may still report the old storage usage until a background garbage collection pass runs. This lag can make it appear as if storage is still high even though the actual data is gone. A reboot or waiting a few hours typically resolves this.

Understanding these reasons can help you troubleshoot storage spikes and make informed decisions about when to clear cache or disable certain AI features. In most cases, the spikes are temporary and part of the normal operation of keeping your on-device AI smart and responsive. If storage becomes critical, you can always manage AICore data via Settings > Apps > Android AICore > Storage & cache to clear cached files manually. With these insights, you can now rest assured that your device's AI brain is working efficiently—even when it seems a little bloated.