Stable Diffusion Low VRAM Memory Errors Fix
You just installed Stable Diffusion. You're excited to generate your first images. You type your prompt, click Generate, and then...
"CUDA out of memory"
"RuntimeError: CUDA out of memory. Tried to allocate 512 MB"
"torch.cuda.OutOfMemoryError: GPU out of memory"
I've been there. After helping dozens of users fix VRAM errors across different hardware setups, I've learned that most memory errors are fixable without buying new hardware.
To fix Stable Diffusion VRAM errors: use command line flags like --lowvram or --medvram, enable xformers memory optimization, reduce batch size to 1, lower resolution to 512x512 for SD 1.5 or 768x768 for SDXL, and use optimized/pruned models.
In this guide, I'll walk you through every solution I've tested, from quick settings changes to alternative WebUIs that handle memory better.
What Causes Stable Diffusion VRAM Errors?
Stable Diffusion VRAM errors occur when the image generation process requires more video memory (VRAM) than your GPU has available. Common errors include "CUDA out of memory" or "out of video memory" messages during generation.
When you generate an image, Stable Diffusion loads the AI model into your GPU's video memory. Every step of the diffusion process uses VRAM for model weights, intermediate calculations, and image data. When these exceed available VRAM, generation fails.
VRAM (Video RAM): Dedicated memory on your graphics card used for rendering and AI computations. Unlike system RAM, VRAM is physically located on the GPU and much faster for GPU operations.
The size of your VRAM determines what you can do. A 4GB GPU can run SD 1.5 with optimizations. An 8GB GPU handles SDXL comfortably. Anything below 4GB requires aggressive optimizations.
Common VRAM Error Messages
You'll typically see one of these error messages when VRAM runs out:
| Error Message | Meaning | Likely Cause |
|---|---|---|
| CUDA out of memory | GPU video memory exhausted | Resolution too high or batch size too large |
| torch.cuda.OutOfMemoryError | PyTorch GPU allocation failed | Model too large for available VRAM |
| out of video memory | Direct VRAM exhaustion | Multiple images in batch |
| RuntimeError: allocate | Specific allocation request failed | High-res fix or upscaling enabled |
Quick Fixes: How to Fix CUDA Out of Memory in Stable Diffusion?
Quick Summary: Start with Solution 1 (command line flags) and Solution 2 (batch size). These fix 80% of VRAM errors immediately. Move to other solutions only if these don't work.
Based on my experience helping users with different GPUs, here are the solutions ranked by effectiveness:
- Add --lowvram or --medvram flag to your webui.bat launch command - This single fix resolves most issues by splitting the model between GPU and system RAM.
- Reduce batch size to 1 in WebUI settings - Generating one image at a time significantly reduces memory usage.
- Enable xformers with --xformers flag - This memory optimization library can reduce VRAM usage by 30-40%.
- Lower resolution to 512x512 for SD 1.5 or 768x768 for SDXL - Resolution has the biggest impact on VRAM usage.
- Use optimized/pruned models instead of full checkpoint files - These models are specifically designed for lower VRAM usage.
- Disable high-res fix when generating - This feature uses significant additional memory.
Pro Tip: Try these solutions in order. Most users find that adding --lowvram and reducing batch size fixes their issues immediately.
VRAM Requirements for Stable Diffusion
Understanding your hardware limitations helps set realistic expectations. Here's what different Stable Diffusion versions require:
| Model Version | Minimum VRAM | Recommended VRAM | Notes |
|---|---|---|---|
| SD 1.5 | 4 GB | 6 GB+ | Requires optimizations at 4GB |
| SDXL 1.0 | 8 GB | 12 GB+ | 1024x1024 generation at minimum |
| SDXL Turbo | 6 GB | 8 GB+ | Faster inference, lower memory |
| SD 2.1 | 4 GB | 8 GB+ | Similar to SD 1.5 requirements |
Key Takeaway: "You can run SD 1.5 on 4GB VRAM with the right optimizations. SDXL needs at least 8GB. If you have less VRAM than the minimum, you'll need to use more aggressive optimizations or consider upgrading to one of the best GPUs for Stable Diffusion."
Command Line Arguments for Memory Optimization
The most effective VRAM fixes happen at launch time. These command line arguments tell Automatic1111 WebUI how to manage memory before generation even starts.
Solution 1: Use --lowvram or --medvram Flags
The --lowvram flag splits the Stable Diffusion model between your GPU and system RAM. It's slower but enables generation on hardware that would otherwise fail completely.
lowvram mode: A launch flag that splits the AI model across GPU and CPU memory. The model layers that don't fit in VRAM are stored in system RAM and moved to GPU as needed. This makes generation slower but enables it on low-VRAM cards.
To add memory optimization flags to Automatic1111:
- Navigate to your Stable Diffusion WebUI folder and find webui-user.bat (Windows) or webui-user.sh (Linux/Mac)
- Right-click and edit with Notepad or your preferred text editor
- Find the line that starts with set COMMANDLINE_ARGS=
- Add the flags so it looks like this:

set COMMANDLINE_ARGS=--lowvram --xformers
For 6-8GB VRAM cards, use --medvram instead:
set COMMANDLINE_ARGS=--medvram --xformers
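For context, here is roughly where that line sits in a stock webui-user.bat as shipped with recent Automatic1111 releases - treat this as a sketch and keep any other lines your file already has:

```shell
@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--medvram --xformers

call webui.bat
```

Only the COMMANDLINE_ARGS line needs editing; the empty PYTHON, GIT, and VENV_DIR lines are the defaults.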
The difference between --lowvram and --medvram:
| Flag | VRAM Saved | Speed Impact | Best For |
|---|---|---|---|
| --lowvram | Most aggressive (60-70%) | Significant slowdown | 4GB or less VRAM |
| --medvram | Moderate (30-40%) | Minor slowdown | 6-8GB VRAM |
| No flag | None | None (fastest) | 12GB+ VRAM |
After making changes, save the file and restart WebUI. Watch the console output during launch - you should see messages indicating low VRAM mode is active.
Pro Tip: Always include --xformers with your memory flags. xformers provides memory-efficient attention operations that work synergistically with lowvram mode.
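The table above condenses into a simple rule of thumb. Here is a hedged sketch - the thresholds follow this guide rather than any official spec, and choose_flags is a made-up helper name:

```python
def choose_flags(vram_gb: float) -> list[str]:
    """Suggest Automatic1111 launch flags for a given VRAM size.

    Thresholds mirror the flag comparison table in this guide;
    adjust for your specific card.
    """
    flags = ["--xformers"]  # memory-efficient attention helps at every tier
    if vram_gb <= 4:
        flags.append("--lowvram")   # aggressive GPU/CPU model splitting
    elif vram_gb <= 8:
        flags.append("--medvram")   # moderate splitting, smaller slowdown
    # 12GB+ cards need no splitting flag at all
    return flags

print(choose_flags(4))   # ['--xformers', '--lowvram']
print(choose_flags(8))   # ['--xformers', '--medvram']
print(choose_flags(12))  # ['--xformers']
```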
Solution 2: Enable xformers Memory Optimization
xformers is a library from Meta that provides optimized attention mechanisms for Transformers. In Stable Diffusion, it reduces VRAM usage by 30-40% while actually improving generation speed.
xformers: A memory optimization library that implements efficient attention mechanisms for Transformer models. It reduces the memory footprint of the attention computation, which is one of the most memory-intensive parts of Stable Diffusion.
To enable xformers in Automatic1111 WebUI:
- Add --xformers to your launch flags in webui-user.bat as shown above
- Restart WebUI - it will automatically install xformers if not present
- Verify installation by checking Settings > Stable Diffusion > Cross attention optimization - it should show "xformers"
If xformers fails to install automatically, activate the WebUI's virtual environment and install it manually (on Linux/Mac, replace the second line with source venv/bin/activate):
cd stable-diffusion-webui
venv\Scripts\activate
pip install xformers
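To confirm the install took, you can check whether the package is importable from the activated environment - this only tests importability, not GPU support:

```python
import importlib.util

# True if the xformers package is importable from this environment
available = importlib.util.find_spec("xformers") is not None
print("xformers available:", available)
```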
Solution 3: Additional Launch Flags
For problematic GPUs or specific scenarios, these additional flags can help:
--precision full --no-half
This flag combination can fix errors on some AMD GPUs or older NVIDIA cards (the GTX 16-series is a common case). It uses full precision (32-bit) instead of half precision (16-bit), which actually uses more memory but can fix compatibility issues such as black or corrupted output.
For more on Automatic1111 setup and basic configuration, check out our Automatic1111 WebUI beginners guide.
Automatic1111 WebUI Settings to Reduce VRAM Usage
After fixing launch arguments, the next most impactful changes happen within the WebUI interface itself. These settings control how each generation uses memory.
Solution 4: Adjust Batch Size and Batch Count
Batch size determines how many images are generated simultaneously. This is the single most impactful WebUI setting for VRAM usage.
Batch size vs batch count: Batch size generates multiple images in parallel (uses more VRAM). Batch count runs generations one after another (same peak VRAM, just more total time). For low VRAM, always use batch size = 1 and increase batch count when you want more images.
Recommended settings for low VRAM:
| Setting | 4GB VRAM | 6GB VRAM | 8GB+ VRAM |
|---|---|---|---|
| Batch size | 1 | 1 | 1-4 |
| Batch count | 1-4 | 1-4 | Any |
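The tradeoff can be stated as a tiny planning helper - a hedged sketch assuming peak VRAM scales roughly linearly with batch size, with plan as a made-up name:

```python
def plan(total_images: int, max_parallel: int) -> tuple[int, int]:
    """Return (batch_size, batch_count) keeping batch_size within VRAM limits."""
    batch_size = min(total_images, max_parallel)
    # ceil division: enough sequential batches to cover all requested images
    batch_count = -(-total_images // batch_size)
    return batch_size, batch_count

# Low-VRAM card: generate 4 images one at a time
print(plan(4, max_parallel=1))  # (1, 4)
# Roomy card: same 4 images in one parallel batch
print(plan(4, max_parallel=4))  # (4, 1)
```

Both calls produce four images; only the first keeps peak VRAM at a single image's worth.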
Solution 5: Lower Resolution Settings
Resolution directly impacts VRAM usage. More pixels mean more to process, and memory use grows with the square of the side length: doubling from 512x512 to 1024x1024 quadruples the pixel count.
Recommended resolution limits by VRAM:
| VRAM Amount | SD 1.5 Max | SDXL Max | Recommended |
|---|---|---|---|
| 4 GB | 512x512 | Not recommended | 512x512 |
| 6 GB | 512x512 | 768x768 (with optimizations) | 512x512 |
| 8 GB | 768x768 | 1024x1024 | 512x512 or 768x768 |
| 12 GB+ | 1024x1024+ | 1024x1024+ | Any resolution |
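The resolution effect is easy to quantify with quick arithmetic - pixel count, and with it activation memory, relative to a 512x512 baseline:

```python
def relative_pixels(width: int, height: int, base: int = 512) -> float:
    """Pixel count relative to a 512x512 baseline."""
    return (width * height) / (base * base)

print(relative_pixels(512, 512))    # 1.0
print(relative_pixels(768, 768))    # 2.25x the pixels of 512x512
print(relative_pixels(1024, 1024))  # 4.0x - why SDXL needs so much more VRAM
```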
Solution 6: Disable High-Res Fix
The High-Res Fix feature generates images at a lower resolution first, then upsamples and adds detail. While useful, it uses significantly more VRAM.
For low VRAM scenarios:
- Keep High-Res Fix disabled in Settings
- Generate at target resolution directly instead of using fix
- Upscale afterward using separate upscaling tools if needed
Solution 7: Adjust CFG Scale
Classifier-Free Guidance (CFG) scale determines how strongly the generation follows your prompt. Its effect on VRAM is minor compared to resolution and batch size, but moderate values cost nothing to try.
Recommended CFG settings for low VRAM:
- Keep CFG between 5-8 instead of 10-15
- Lower CFG still produces good results with proper prompting
Use Optimized Models to Save VRAM
The type of model file you use significantly impacts VRAM usage. Not all checkpoint files are created equal.
Solution 8: Use Pruned Models
Pruned models have training-only data removed - typically the EMA (exponential moving average) copies of the weights - while maintaining the same generation quality. They're smaller in file size and use less memory when loading.
Pruned model: A Stable Diffusion checkpoint file with unnecessary weights removed. These models typically use 30-40% less VRAM than full models while producing identical results. Look for "pruned" or "optimized" in model filenames.
When downloading models from Civitai or Hugging Face:
- Look for "pruned" or "optimized" in the filename
- Choose .safetensors format instead of .ckpt (safer and sometimes more memory-efficient)
- Avoid "inpainting" versions unless you need inpainting - they use more VRAM
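When sorting through a long model list, the checklist above can be automated. A hedged sketch - filename conventions vary by uploader, so treat this as a heuristic, and the function name is mine:

```python
def looks_low_vram_friendly(filename: str) -> bool:
    """Heuristic filter matching the download checklist above."""
    name = filename.lower()
    return (
        name.endswith(".safetensors")            # prefer safetensors over .ckpt
        and ("pruned" in name or "optimized" in name)
        and "inpainting" not in name             # inpainting variants use more VRAM
    )

models = [
    "dreamshaper_8-pruned.safetensors",
    "dreamshaper_8-inpainting.safetensors",
    "realistic-full.ckpt",
]
print([m for m in models if looks_low_vram_friendly(m)])
# ['dreamshaper_8-pruned.safetensors']
```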
Solution 9: Use FP16 Models
Most modern models come in FP16 (half precision) format by default. These use half the memory of full precision models with virtually no quality difference.
| Model Type | File Size | VRAM Usage | Quality |
|---|---|---|---|
| Full precision (FP32) | ~6 GB | Highest | Reference |
| Half precision (FP16) | ~2 GB | ~50% less | Identical |
| Quantized (4-bit/8-bit) | ~1 GB | ~75% less | Slightly reduced |
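The sizes in the table follow directly from bytes-per-weight. A back-of-the-envelope sketch - the parameter count is an assumption for illustration, in the ballpark of SD 1.5 (real FP32 checkpoints can be larger because they also carry EMA copies of the weights):

```python
def model_gib(n_params: int, bytes_per_param: int) -> float:
    """Approximate checkpoint memory in GiB for a given precision."""
    return n_params * bytes_per_param / 2**30

N_PARAMS = 1_000_000_000  # assumed ~1B weights, roughly SD 1.5 scale

print(f"FP32 (4 bytes/weight): {model_gib(N_PARAMS, 4):.1f} GiB")
print(f"FP16 (2 bytes/weight): {model_gib(N_PARAMS, 2):.1f} GiB")
print(f"INT8 (1 byte/weight):  {model_gib(N_PARAMS, 1):.1f} GiB")
```

Halving the bytes per weight halves the load footprint, which is why FP16 is the default trade for consumer cards.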
Solution 10: Try Quantized Models
For extreme low VRAM scenarios, quantized models use even less memory by reducing precision further. These work surprisingly well for most use cases.
SDXL Low VRAM Solutions
SDXL requires significantly more VRAM than SD 1.5. Running it on 8GB or less requires specific optimizations.
SDXL VRAM Requirements
SDXL's native resolution of 1024x1024 means 4x the pixel count of SD 1.5's 512x512. This directly translates to higher VRAM requirements.
Minimum VRAM for SDXL:

- Minimum: 8GB VRAM (barely functional)
- Recommended: 12GB VRAM (comfortable)
- Ideal: 16GB+ VRAM (full features)
6GB VRAM Users
Use SDXL Turbo instead of full SDXL, or switch to ComfyUI which handles SDXL more efficiently.
SDXL Optimizations
For running SDXL on 8GB VRAM:
- Use --medvram flag (not lowvram for 8GB)
- Enable xformers - absolutely required
- Reduce resolution to 768x768 if 1024x1024 fails
- Use SDXL-specific optimized models from Civitai
- Consider SDXL Turbo - comparable quality in far fewer steps
Memory-Efficient Alternative WebUIs
Automatic1111 isn't the only option. Some WebUIs handle memory more efficiently, especially for specific use cases.
ComfyUI: Best for Complex Workflows
ComfyUI uses a node-based workflow system that processes operations sequentially rather than keeping everything in memory at once. This makes it significantly more memory-efficient for complex operations.
Why ComfyUI saves VRAM:
- Sequential processing - nodes execute one at a time, freeing memory between steps
- Lazy loading - models only load when needed
- Better memory management - explicit control over what stays in VRAM
For SDXL on low VRAM, ComfyUI is often the best choice. See our ComfyUI vs Automatic1111 comparison for detailed differences.
Stable Diffusion WebUI Forge
Forge is a fork of Automatic1111 specifically optimized for difficult models including SDXL. It includes better memory management out of the box.
Forge advantages for low VRAM:
- Better SDXL support - optimized for SDXL specifically
- Automatic optimizations - no manual flag configuration needed
- Familiar setup - keeps the Automatic1111-style interface and can coexist with a standard Automatic1111 install
Fooocus: Best for Beginners
Fooocus simplifies the interface while automatically optimizing settings. It's more memory-efficient by default and requires less technical knowledge.
Fooocus advantages:
- Automatic memory optimization - settings pre-configured
- Simpler interface - fewer ways to accidentally increase VRAM usage
- Good defaults - works well out of the box
Key Takeaway: "If Automatic1111 doesn't work with your VRAM, try ComfyUI for complex workflows or Fooocus for simplicity. Both handle memory more efficiently than Automatic1111. See our alternative Stable Diffusion interfaces guide for more options."
Monitor VRAM Usage to Prevent Errors
Preventing VRAM errors starts with understanding your actual usage. Monitoring tools help you identify problems before generation fails.
NVIDIA GPU Monitoring
For NVIDIA GPUs, nvidia-smi is the built-in monitoring tool:
nvidia-smi
This shows current VRAM usage. Run it before generation, then during generation in a separate terminal to see peak usage.
For continuous monitoring on Linux (on Windows, nvidia-smi -l 1 loops similarly):
watch -n 1 nvidia-smi
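To log usage from a script, nvidia-smi also has a machine-readable query mode. A hedged sketch - the helper names are mine, and gpu_memory() only works on a machine with NVIDIA drivers installed, so the demo below runs the parser on a captured sample line instead:

```python
import subprocess

def parse_memory_csv(line: str) -> tuple[int, int]:
    """Parse one 'used, total' row of nvidia-smi CSV output, e.g. '3511, 8192'."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total

def gpu_memory() -> tuple[int, int]:
    """Query current VRAM usage in MiB via nvidia-smi (requires NVIDIA drivers)."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_memory_csv(out.splitlines()[0])

# Demo on a captured sample line (no GPU needed):
print(parse_memory_csv("3511, 8192"))  # (3511, 8192)
```

On a machine with an NVIDIA card, calling gpu_memory() before and during generation shows how close you are to the limit.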
Windows Task Manager
Windows Task Manager shows GPU memory usage:
- Open Task Manager (Ctrl+Shift+Esc)
- Go to Performance tab
- Select GPU from the left
- Watch "Dedicated GPU memory" during generation
Warning Signs to Watch For
You're approaching VRAM limits when:
- Generation takes longer than usual - system RAM swapping may be occurring
- System becomes sluggish - low memory affecting overall performance
- Occasional failures - sometimes works, sometimes doesn't
If you're experiencing general VRAM issues outside of Stable Diffusion, check out our guide on freeing up VRAM for system-wide optimization.
Hardware Considerations: When to Upgrade
Sometimes software optimizations aren't enough. Here's when to consider a hardware upgrade.
Current GPU VRAM Breakdown
| VRAM Amount | SD 1.5 | SDXL | Upgrade Recommended |
|---|---|---|---|
| 4 GB or less | Difficult but possible | No | Yes, for SDXL |
| 6 GB | Good | SDXL Turbo only | Yes, for SDXL |
| 8 GB | Excellent | Workable | Optional, for comfort |
| 12 GB+ | Excellent | Excellent | No |
Upgrade Recommendations
If you're upgrading specifically for Stable Diffusion:
- Minimum target: 8GB VRAM - runs everything with some compromises
- Sweet spot: 12GB VRAM - runs everything comfortably
- Ideal: 16GB+ VRAM - no compromises, batch generation possible
For budget-conscious users, see our budget GPU options guide. Otherwise, refer to our full best GPUs for Stable Diffusion recommendations.
Frequently Asked Questions
How do I fix CUDA out of memory in Stable Diffusion?
Use command line flags like --lowvram or --medvram when launching WebUI. Enable xformers for memory optimization. Reduce batch size to 1 in settings. Lower resolution to 512x512 for SD 1.5 or 768x768 for SDXL. Use pruned or optimized model files instead of full checkpoints.
Can Stable Diffusion run on 4GB VRAM?
Yes, Stable Diffusion 1.5 can run on 4GB VRAM with optimizations. You must use the --lowvram flag, enable xformers, and limit resolution to 512x512. SDXL is not recommended on 4GB - use SDXL Turbo or stick with SD 1.5 models. Pruned models are essential at this VRAM level.
What VRAM do I need for Stable Diffusion?
For SD 1.5: 4GB minimum, 6GB recommended. For SDXL: 8GB minimum, 12GB recommended. For SDXL Turbo: 6GB minimum, 8GB recommended. If you only do txt2img at standard resolutions, the minimum works. For img2img, inpainting, or high-resolution generation, get the recommended amount.
What is the difference between --lowvram and --medvram?
--lowvram splits the model more aggressively between GPU and CPU, saving more VRAM but causing significant slowdown. --medvram keeps more of the model on GPU, saving less VRAM but maintaining better speed. Use --lowvram for 4GB or less, --medvram for 6-8GB. For 12GB+, neither flag is needed.
How do I enable xformers in Automatic1111?
Add --xformers to your webui-user.bat launch command: set COMMANDLINE_ARGS=--xformers. Restart WebUI and it will install xformers automatically. You can also enable it in Settings > Stable Diffusion > Cross attention optimization. Always use xformers with --lowvram or --medvram for best results.
Which WebUI is best for low VRAM?
ComfyUI is most memory-efficient for complex workflows due to its node-based sequential processing. Fooocus is best for beginners with automatic optimizations. Stable Diffusion WebUI Forge is optimized specifically for SDXL on lower VRAM. Automatic1111 works fine with proper flags but requires more manual configuration.
How much VRAM for SDXL?
SDXL requires minimum 8GB VRAM for basic 1024x1024 generation. 12GB is recommended for comfortable use with all features. 16GB+ allows batch generation and high-res fix. If you have 6GB, use SDXL Turbo instead or switch to ComfyUI which handles SDXL more efficiently than Automatic1111.
Why does Stable Diffusion crash during generation?
Crashes during generation usually mean VRAM is exhausted mid-process. This happens when high-res fix, upscaling, or img2img operations require more memory than available. Disable high-res fix, reduce resolution, or use --lowvram flag. Also check that no other applications are using GPU memory.
Final Recommendations
After testing these solutions across multiple GPU configurations, I've found that most VRAM errors can be resolved without hardware upgrades.
The combination that works for 80% of users: --medvram --xformers launch flags, batch size of 1, and 512x512 resolution for SD 1.5.
For 4GB VRAM cards, add --lowvram instead and stick with SD 1.5 models.
For SDXL on 8GB, use ComfyUI or WebUI Forge - they handle memory much better than standard Automatic1111.
If you've tried all these solutions and still encounter errors, it may be time to consider one of the budget GPU options or a full upgrade to one of the best GPUs for Stable Diffusion.
