
ByteDance releases a model with a 512K-token context
ByteDance has released an open AI model with a remarkable context window of 512,000 tokens. The model is called Seed-OSS-36B. Link in the description.
While the world debates TikTok and the White House, ByteDance has quietly rolled out technology that can process the equivalent of an entire bookshelf in a single session! There are three versions of the model: one trained with synthetic data, one without it, and an instruction-tuned variant, each tailored to its own tasks.
The architecture is elegantly simple: 36 billion parameters across 64 layers, with a vocabulary of about 155,000 tokens. But the real magic is the thinking budget mechanism: you literally set how many tokens the model may spend reasoning before it answers. Want an instant response? Set it to 0. Need deep analysis? Raise the budget.
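The budget idea can be sketched conceptually. The function and the "reasoning steps" below are invented purely for illustration; they are not Seed-OSS's actual API, just a toy model of capping reasoning before the final answer:

```python
# Toy illustration of a "thinking budget": cap how many reasoning
# tokens/steps are spent before the final answer is emitted.
# Everything here is hypothetical; Seed-OSS's real interface differs.

def generate_with_budget(question: str, thinking_budget: int) -> str:
    """Think for at most `thinking_budget` steps, then answer."""
    # Stand-in for the model's internal chain of thought (invented).
    reasoning_steps = ["parse question", "recall facts",
                       "check edge cases", "draft answer", "verify draft"]
    thoughts = reasoning_steps[:thinking_budget]  # budget 0 skips thinking
    answer = f"answer({question})"
    if not thoughts:
        return answer                              # instant reply
    return f"[thought: {' -> '.join(thoughts)}] {answer}"

print(generate_with_budget("2+2?", 0))  # no reasoning trace at all
print(generate_with_budget("2+2?", 3))  # up to 3 reasoning steps first
```

The point of the design is that the same model serves both latency-sensitive and hard analytical queries, with the caller choosing the trade-off per request.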
The benchmark results are impressive: 91.7% on AIME (mathematics), 67.4% on LiveCodeBench (coding), and 94.6% on RULER (long-context). All of these are record scores among open models.
The key question, of course, is performance on real tasks rather than benchmarks. Still, ByteDance is unexpectedly demonstrating world-class competence in LLMs. One to watch.