i-JEPA (Image Joint-Embedding Predictive Architecture) is a self-supervised learning framework that predicts missing high-level representations rather than reconstructing pixels. A context encoder sees only the visible regions of an image, and a predictor uses its output to predict the embeddings of the masked regions. Those target embeddings are produced by a target encoder that is updated slowly as an exponential moving average (EMA) of the context encoder, which focuses learning on semantics instead of texture. The objective needs neither a generative pixel loss nor heavy negative sampling, and the resulting features transfer strongly with linear probes and minimal fine-tuning. The design scales naturally with Vision Transformer (ViT) backbones and flexible masking strategies, and it trains stably at large batch sizes. Because predictions are made in embedding space, no pixel-level decoder is required, which keeps the objective computationally efficient and better aligned with downstream discrimination tasks. The repository provides training recipes, data pipelines, and evaluation code that clarify which masking patterns and architectural choices matter most.
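
The slowly updated target encoder and the embedding-space objective described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the repository's code: the function names `ema_update` and `embedding_loss` are hypothetical, parameters and embeddings are flattened to lists of floats, the momentum value is only illustrative, and mean squared error stands in for whatever distance the actual training recipe uses.

```python
def ema_update(target_params, context_params, momentum=0.996):
    """Move each target parameter a small step toward the context parameter.

    Hypothetical helper: a momentum close to 1.0 means the target
    encoder changes slowly. The default value is illustrative only.
    """
    if len(target_params) != len(context_params):
        raise ValueError("parameter lists must have the same length")
    return [
        momentum * t + (1.0 - momentum) * c
        for t, c in zip(target_params, context_params)
    ]


def embedding_loss(predicted, target):
    """Mean squared error between predicted and target embeddings.

    Assumption: MSE is a stand-in distance. The loss is computed on
    embeddings, so no pixels are reconstructed.
    """
    if len(predicted) != len(target):
        raise ValueError("embeddings must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


# Toy values: the target embedding is treated as a fixed regression target
# (no gradient flows through the target encoder in this scheme).
predicted_embedding = [1.0, 2.0]
target_embedding = [1.0, 4.0]
loss = embedding_loss(predicted_embedding, target_embedding)

# After the context encoder is updated, the target encoder follows it slowly.
new_target_params = ema_update([1.0, 1.0], [0.0, 2.0], momentum=0.9)
```

In a real training loop the same two steps would operate on tensors: the loss drives updates to the context encoder and predictor, and the EMA step is applied to the target encoder after each optimizer step.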

Features

  • Predictive learning in representation space, not pixel space
  • Context and target encoders with EMA updates for stable training
  • Strong transfer with simple linear probes and low-shot fine-tuning
  • Scales cleanly with ViT backbones and diverse masking strategies
  • Efficient objective without negatives or pixel-level decoders
  • Reproducible training and evaluation recipes with checkpoints
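
To make the masking idea in the list above concrete, here is a hedged sketch of block-wise masking on a ViT patch grid. Everything in it is an assumption for illustration: the helper names `sample_block` and `sample_masks`, the 14 x 14 grid (a 224-pixel image cut into 16-pixel patches), the number of target blocks, and the block sizes are not taken from the repository's recipes.

```python
import random


def sample_block(grid_h, grid_w, block_h, block_w, rng):
    """Return the patch indices covered by one randomly placed rectangle."""
    top = rng.randrange(grid_h - block_h + 1)
    left = rng.randrange(grid_w - block_w + 1)
    return {
        row * grid_w + col
        for row in range(top, top + block_h)
        for col in range(left, left + block_w)
    }


def sample_masks(grid_h=14, grid_w=14, num_targets=4,
                 target_size=(4, 4), context_size=(12, 12), seed=0):
    """Sample several target blocks and one context block (illustrative sizes).

    Target patches are removed from the context so the context encoder
    never sees the regions whose embeddings must be predicted.
    """
    rng = random.Random(seed)
    targets = [
        sample_block(grid_h, grid_w, target_size[0], target_size[1], rng)
        for _ in range(num_targets)
    ]
    context = sample_block(grid_h, grid_w, context_size[0], context_size[1], rng)
    hidden = set().union(*targets)
    return context - hidden, targets


context_patches, target_blocks = sample_masks()
```

The context encoder would then receive only the patches in `context_patches`, while each set in `target_blocks` selects the target-encoder embeddings to be predicted. Varying the block counts and sizes is one simple way to compare masking strategies.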

Categories

Libraries

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python Libraries

Registered

2025-10-07