Embeddings Dimensionality Guide

Reference for embedding models: dimensions, size, performance benchmarks. OpenAI ada-002, Cohere, sentence-transformers.

Model Provider Dims Max Tokens MTEB Score Price/1M tokens Notes

Storage Calculator

Tips for Choosing Embeddings

  • Higher dimensions generally capture more semantic nuance but require more storage
  • For most use cases, 768-1536 dimensions provide good quality-to-cost ratio
  • Consider Matryoshka embeddings (text-embedding-3) to reduce dimensions with minimal quality loss
  • Open-source models can be self-hosted to eliminate per-token API costs
  • MTEB (Massive Text Embedding Benchmark) provides standardized comparison scores

More Developer Tools tools at toool.cc