4KLSDB
DataCV @ CVPR 2026

A Large-Scale Dataset for 4K Image Restoration and Generation

Zihao Zhu¹, Kuan-Ru Huang¹, Zhaoming Xu¹, Renjie Li¹, Bo Wu¹, Ruizheng Bai¹, Mingyang Wu¹, Sayak Paul², Zhengzhong Tu¹,³,†

¹ Texas A&M University    ² Hugging Face    ³ Visko Platform
† Corresponding author

4KLSDB example images

Example images from 4KLSDB spanning nature, urban scenes, people, food, CGI, artwork, and more. 4KLSDB contains 129,484 carefully curated native-4K training images, 2,000 validation images, and 1,984 test images. The dataset is designed to support both native-4K image restoration (super-resolution) and 4K text-to-image generation research.

Abstract

High-resolution datasets are essential for advancing super-resolution (SR) and text-to-image (T2I) diffusion research. However, current publicly available datasets lack both the native 4K resolution and the extensive scale necessary for training state-of-the-art models. To address this gap, we introduce a 4K Large Scale Dataset and Benchmark (4KLSDB), a large-scale, diverse dataset consisting of 129,484 carefully curated 4K resolution images spanning multiple categories such as nature, urban scenes, people, food, artwork, and CGI, alongside distinct validation and test sets containing 2,000 and 1,984 images respectively. Images were sourced from established open datasets including Photo Concept Bucket, LAION-2B, and PD12M. 4KLSDB underwent rigorous multi-stage automated filtering and annotation pipelines involving both human annotators and Large Multimodal Models (LMMs) to ensure high aesthetic quality and dataset consistency. We demonstrate 4KLSDB's effectiveness by training representative super-resolution and diffusion models, observing significant improvements in performance on native 4K benchmarks. Comprehensive experiments illustrate a positive correlation between training on true 4K resolution data and improved fidelity in image restoration, especially at 4K resolution.
129,484 train images · 2,000 validation · 1,984 test · native 4K (3840+ px)

Dataset

4KLSDB is the first publicly released native-4K dataset that scales to over 100k images and supports both image restoration and generation.

Dataset         #Train       #Val    #Test   Max Res.    Native 4K
DIV2K           800          100     100     2K          -
LSDIR           84,991       1,000   1,000   2K          -
DIV8K           1,500        100     100     8K          ✓†
DiffusionDB     14,000,000   -       -       1024×1024   -
HQ-Edit         ~200,000     -       -       900×900     -
4KLSDB (Ours)   129,484      2,000   1,984   4K          ✓

† DIV8K contains some 8K-resolution images, but its total scale remains limited for large-scale training.

Curation Pipeline

A robust multi-stage filtering and quality-control pipeline combining rule-based checks, LMM-based aesthetic scoring, and human vetting.

4KLSDB filtering pipeline

Overview of the 4KLSDB filtering pipeline. An initial raw image pool is progressively refined through automated filters and a final manual inspection stage to obtain a high-quality, aesthetically aligned 4K dataset. Resolution-based pre-filtering enforces a minimum dimension of 3840 px and a 3840×2160 pixel budget. Q-Align provides quality and aesthetic scores, and only the top 80% of images are retained. Laplacian-variance and Sobel-patch flatness-ratio checks then remove overly flat, blurry, or low-texture samples. Finally, two human annotators review every remaining image, yielding the final native-4K dataset.
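
The automated stages described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' released code. The resolution rule is read here as "long side of at least 3840 px and at least 3840×2160 pixels in total", which is one possible interpretation. The top-80% cut is applied to one precomputed score per image (Q-Align itself is not called). The names and values `MIN_LAPLACIAN_VAR`, `MAX_FLAT_RATIO`, `FLAT_PATCH_GRAD`, and `PATCH` are hypothetical placeholders, since the page does not state the thresholds used.

```python
import numpy as np

MIN_LONG_SIDE = 3840        # from the page: minimum dimension of 3840 px
MIN_PIXELS = 3840 * 2160    # from the page: 3840x2160 pixel budget
KEEP_FRACTION = 0.80        # from the page: retain the top 80% by score
# Hypothetical thresholds; the page does not report the actual values.
MIN_LAPLACIAN_VAR = 50.0
MAX_FLAT_RATIO = 0.5
FLAT_PATCH_GRAD = 5.0
PATCH = 64


def passes_resolution(width, height):
    """Assumed reading: long side >= 3840 px and area >= 3840*2160."""
    return max(width, height) >= MIN_LONG_SIDE and width * height >= MIN_PIXELS


def top_fraction_mask(scores, keep=KEEP_FRACTION):
    """Boolean mask keeping roughly the highest-scoring `keep` fraction."""
    scores = np.asarray(scores, dtype=np.float64)
    cutoff = np.quantile(scores, 1.0 - keep)
    return scores >= cutoff


def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian response; low values suggest blur."""
    g = gray.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())


def sobel_magnitude(gray):
    """Sobel gradient magnitude computed on the image interior."""
    g = gray.astype(np.float64)
    gx = ((g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:])
          - (g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2]))
    gy = ((g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:])
          - (g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:]))
    return np.hypot(gx, gy)


def flat_patch_ratio(gray, patch=PATCH, grad_thresh=FLAT_PATCH_GRAD):
    """Fraction of non-overlapping patches whose mean Sobel magnitude is low."""
    mag = sobel_magnitude(gray)
    rows, cols = mag.shape[0] // patch, mag.shape[1] // patch
    if rows == 0 or cols == 0:
        return 1.0  # too small to tile; treat as entirely flat
    tiles = mag[:rows * patch, :cols * patch].reshape(rows, patch, cols, patch)
    patch_means = tiles.mean(axis=(1, 3))
    return float((patch_means < grad_thresh).mean())


def passes_texture_checks(gray):
    """Reject images that look blurry or are dominated by flat patches."""
    return (laplacian_variance(gray) >= MIN_LAPLACIAN_VAR
            and flat_patch_ratio(gray) <= MAX_FLAT_RATIO)
```

In a sketch like this, the cheap resolution check would run first, so that the more expensive scoring and gradient computations only touch images that are already large enough. The human review stage has no code counterpart here.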

Benchmark Results

Native-4K supervision improves classical SR models at every scale and real-world SR models on nearly all metrics.

Classical Super-Resolution on 4KLSDB Test Set

Model                  ×4               ×8               ×16
                       PSNR↑   SSIM↑    PSNR↑   SSIM↑    PSNR↑   SSIM↑
HiT-SR (pretrained)    24.50   0.6839   22.25   0.6394   19.47   0.5741
HiT-SR (4KLSDB)        29.27   0.7896   24.75   0.6928   23.69   0.6414
SwinIR (DF2K)          24.11   0.6738   20.96   0.5915   19.20   0.5684
SwinIR (4KLSDB)        28.79   0.7774   25.89   0.6877   23.69   0.6376
MambaIR (pretrained)   25.92   0.7259   21.51   0.6382   19.47   0.5741
MambaIR (4KLSDB)       30.92   0.8216   23.84   0.7195   23.69   0.6414
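
For readers unfamiliar with the PSNR column, the standard definition can be sketched as below. This is a generic illustration, not the evaluation script used for the table: the page does not say whether PSNR is computed on RGB or on a luminance channel, or whether borders are cropped, so the function name `psnr` and the 8-bit `max_value` default are assumptions.

```python
import numpy as np


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    ref = np.asarray(reference, dtype=np.float64)
    out = np.asarray(restored, dtype=np.float64)
    if ref.shape != out.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((ref - out) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Because PSNR is logarithmic in the mean squared error, a gain of a few dB (as between the pretrained and 4KLSDB rows) corresponds to a multiplicative reduction in per-pixel error.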

Real-World Super-Resolution (4KLSDB Test Set)

Method    Scale   PSNR↑           SSIM↑             LPIPS↓            DISTS↓            FID↓
OSEDiff   ×4      27.36 / 27.50   0.7511 / 0.7568   0.2863 / 0.2546   0.1604 / 0.1431   28.07 / 28.35
OSEDiff   ×8      23.86 / 24.10   0.6021 / 0.6188   0.5463 / 0.4252   0.1833 / 0.1448   19.56 / 17.74
OSEDiff   ×16     22.65 / 22.69   0.6213 / 0.5966   0.6571 / 0.4866   0.2861 / 0.2170   51.76 / 33.97
SeeSR     ×4      27.01 / 28.25   0.6996 / 0.7340   0.5231 / 0.4511   0.1407 / 0.1272   38.95 / 33.88
SeeSR     ×8      24.10 / 24.50   0.6510 / 0.6713   0.5117 / 0.4628   0.1607 / 0.1551   77.46 / 74.46
SeeSR     ×16     24.02 / 24.43   0.6810 / 0.7001   0.5594 / 0.5197   0.1699 / 0.1640   77.41 / 74.40

Each cell shows baseline / 4KLSDB fine-tuned. ↑ means higher is better; ↓ means lower is better.

4K Text-to-Image Generation (Sana fine-tuned on 4KLSDB)

Model             pCLIPScore ↑   pNIQE ↓
Sana (baseline)   28.62          5.21
Sana + 4KLSDB     29.27          4.63

User Study (Sana + 4KLSDB vs. Sana baseline)

Overall ↑   Detail ↑   Realism ↑   Artifact ↑   Alignment ↑
57.34%      60.89%     74.27%      64.40%       52.29%

Double-blind pairwise win rate of 4KLSDB-fine-tuned Sana over Sana baseline.
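
A pairwise win rate of this kind can be computed as in the sketch below. It assumes each vote names exactly one preferred model and that there is no tie option; the page does not describe how ties were handled, if they were allowed at all. The labels `"finetuned"` and `"baseline"` and the function name `win_rate` are hypothetical.

```python
from collections import Counter


def win_rate(votes, candidate="finetuned"):
    """Share of pairwise votes that prefer `candidate` (0.0 to 1.0)."""
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no votes recorded")
    return counts[candidate] / total
```

Under this definition, a value above 50% means raters preferred the fine-tuned model more often than the baseline on that criterion.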

Qualitative Comparisons


Real-SR: SeeSR vs. SeeSR + 4KLSDB

SeeSR comparison

From top to bottom: LR input, baseline SeeSR, and SeeSR fine-tuned on 4KLSDB (ours). Insets show sharper structures and more realistic local details.

T2I: Sana vs. Sana + 4KLSDB

Sana T2I comparison

Identical prompts and inference settings. Fine-tuning on 4KLSDB produces sharper boundaries and more coherent high-frequency textures in zoomed regions.

Downloads

Dataset, code, and pretrained checkpoints are all openly released.

Dataset

129,484 train · 2,000 val · 1,984 test native-4K images with captions.

Classical SR Checkpoints

HiT-SR / SwinIR / MambaIR fine-tuned on 4KLSDB for ×4/×8/×16.

Real-World SR Checkpoints

OSEDiff / SeeSR fine-tuned on 4KLSDB with a blind-degradation pipeline.

4K T2I Checkpoint

Sana fine-tuned on 4KLSDB for 4096×4096 text-to-image generation.

Code

Training, inference, and one-click evaluation scripts for every model.

Paper

4KLSDB: A Large-Scale Dataset for 4K Image Restoration and Generation.

BibTeX

@inproceedings{zhu2026_4klsdb,
  title     = {4KLSDB: A Large-Scale Dataset for 4K Image Restoration and Generation},
  author    = {Zhu, Zihao and Huang, Kuan-Ru and Xu, Zhaoming and Li, Renjie and
               Wu, Bo and Bai, Ruizheng and Wu, Mingyang and Paul, Sayak and Tu, Zhengzhong},
  booktitle = {DataCV @ CVPR 2026},
  year      = {2026},
  url       = {https://openreview.net/forum?id=VW0Fvdfv1k}
}