DeepSeek packed an LLM engine into 1,200 lines of Python code
The DeepSeek team presented nano-vLLM, a lightweight and compact engine for running large language models that could change perceptions of code efficiency. Amazingly, all of its functionality fits into just 1,200 lines of Python code! This is true technological minimalism in the world of artificial intelligence. Traditional engines of this kind, for all their power, often suffer from an overloaded codebase, which makes modifying them a real trial for developers. Nano-vLLM addresses this problem by offering a simple but powerful tool without unnecessary complexity. The code is open source.
At the same time, functionality is not sacrificed. The engine supports prefix caching, tensor parallelism, compilation with torch.compile and CUDA graphs. Tests on an RTX 4070 laptop graphics card with 8 GB of memory showed impressive results. When running the Qwen3-0.6B model (0.6 billion parameters), the engine processed 133,966 tokens in 93.41 seconds, which is even faster than the original vLLM engine.
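To put the benchmark figure in more familiar terms, the token count and run time can be converted into throughput. The snippet below is only a small arithmetic sketch: the `throughput` helper is a hypothetical name introduced for illustration and is not part of nano-vLLM, and the two input numbers are the ones quoted above.

```python
def throughput(tokens: int, seconds: float) -> float:
    """Return generation throughput in tokens per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


# Figures quoted in the article for nano-vLLM on the RTX 4070 laptop GPU.
nano_vllm_tps = throughput(133_966, 93.41)
print(f"nano-vLLM: {nano_vllm_tps:.0f} tokens/s")
```

This works out to about 1,434 tokens per second on a laptop GPU.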
Author: AIvengo
For 5 years I have been working with machine learning and artificial intelligence. And this field never ceases to amaze, inspire and interest me.