Tongyi DeepResearch Now Available on HPC-AI.COM

We are excited to announce that Tongyi DeepResearch, the first fully open-source Web Agent to achieve performance on par with OpenAI’s DeepResearch, is now available on our platform. This cutting-edge model enables researchers, developers, and AI enthusiasts to access advanced reasoning and information-seeking capabilities without the need for complex setup or expensive infrastructure.

About Tongyi DeepResearch

Tongyi DeepResearch has demonstrated remarkable results across a comprehensive suite of benchmarks. According to published evaluations, it achieved:

32.9 on Humanity’s Last Exam (HLE), a challenging academic reasoning task.
43.4 on BrowseComp and 46.7 on BrowseComp-ZH, benchmarks for complex web-based information seeking.
75 on xbench-DeepSearch, a user-centric benchmark for deep research tasks.

According to these published results, Tongyi DeepResearch matches or outperforms existing proprietary and open-source deep research agents on these benchmarks, making it one of the strongest open-source options available today.

For those interested in learning more about the underlying technology and methodology, check out the DeepResearch technical blog for a detailed breakdown of the model architecture, training pipeline, and benchmark evaluations.

Why It Matters

Deep research agents like Tongyi DeepResearch are designed for tasks that go beyond simple question answering. They can:

Perform multi-step reasoning for academic and professional use cases.
Navigate and synthesize complex information across sources.
Support researchers and developers in building advanced autonomous workflows.

Because it is open-source, Tongyi DeepResearch also provides transparency and flexibility for experimentation, customization, and academic study.

Available on Our Platform

To make it easier for users to explore and leverage this state-of-the-art model, we have integrated Tongyi DeepResearch into our cloud platform. Whether you are a researcher testing benchmarks, a developer building intelligent agents, or simply curious about the future of web agents, our platform provides the fastest way to try Tongyi DeepResearch.

Tutorial: How to Run Tongyi DeepResearch on Our Platform

Development Environment Pre-Configuration

First, launch a GPU cloud instance to run the model. The serving command later in this tutorial uses --tensor-parallel-size 8, so it assumes an instance with 8 GPUs. You can choose a pre-built image that includes CUDA 12.8 for GPU acceleration.

Install the Inference Framework

After launching your cloud instance, install the inference framework that suits your needs. For example, to use vLLM as the inference engine:

pip install uv
uv venv --python 3.12 --seed
source .venv/bin/activate
uv pip install vllm \
  --extra-index-url https://download.pytorch.org/whl/nightly/cu128 \
  --index-strategy unsafe-best-match
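Before moving on, it can help to confirm that the packages are visible inside the activated virtual environment. The following is a minimal sketch using only the Python standard library; the helper names `installed_versions` and `report` are hypothetical, and the package list is an assumption about what this setup installs.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def report(packages=("vllm", "torch")):
    """Print one line per package so a missing install is easy to spot."""
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version if version else 'NOT INSTALLED'}")
```

Calling `report()` from a Python session started inside `.venv` prints the resolved versions, which is a quick way to catch an environment that was not activated.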

Private Model Storage

We use local storage as a cache to support on-demand loading of model files. By combining data disks with high-speed shared storage, you get fast deployment and stable access for large models in isolated environments.

Download the model to high-speed shared storage for efficient read/write operations:

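#!/bin/bash
# Work inside the high-speed shared storage mount (the same path is passed to --model when serving).
cd /root/highspeedstorage/zhy1/

export model="Alibaba-NLP/Tongyi-DeepResearch-30B-A3B"
# Install the JuiceFS client, which provides the `juicefs sync` command.
curl -sSL https://d.juicefs.com/install | sh -
# Copy the model files from the MinIO bucket into the current directory.
juicefs sync "minio://minio:minio123@minio:9000/hf-model/${model}/" "./${model}/"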
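Once the sync finishes, a quick sanity check on the target directory can catch an incomplete copy before the server is started. The sketch below assumes the usual Hugging Face layout (a `config.json` plus one or more `.safetensors` weight files); that layout and the helper name `summarize_model_dir` are assumptions, not something this tutorial specifies.

```python
from pathlib import Path


def summarize_model_dir(model_dir):
    """Return basic facts about a downloaded model directory."""
    root = Path(model_dir)
    files = [p for p in root.rglob("*") if p.is_file()]
    # Assumption: weights are stored as .safetensors shards.
    weights = [p for p in files if p.suffix == ".safetensors"]
    return {
        "has_config": (root / "config.json").is_file(),
        "weight_files": len(weights),
        "total_bytes": sum(p.stat().st_size for p in files),
    }
```

For example, `summarize_model_dir("/root/highspeedstorage/zhy1/Alibaba-NLP/Tongyi-DeepResearch-30B-A3B")` should report `has_config` as true and a non-zero `weight_files` count if the copy completed.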

Public Inference Service

Our platform supports public HTTP forwarding, allowing inference endpoints to be exposed to the internet for convenient remote access.

Launch the inference server:

python -m vllm.entrypoints.openai.api_server \
  --model /root/highspeedstorage/zhy1/Alibaba-NLP/Tongyi-DeepResearch-30B-A3B \
  --tensor-parallel-size 8 \
  --port 8080
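Loading a 30B-parameter model can take a while, so requests sent immediately after launch may fail. One way to handle this is to poll the server's `/health` route until it answers. The following is a minimal sketch; the function names, the default timings, and the injectable `probe` and `sleep` parameters are illustrative choices rather than part of the platform.

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=5.0):
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(base_url, max_wait=600.0, interval=5.0,
                     probe=http_ok, sleep=time.sleep):
    """Poll <base_url>/health until it responds or max_wait seconds elapse."""
    url = base_url.rstrip("/") + "/health"
    waited = 0.0
    while True:
        if probe(url):
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval
```

On the instance itself, `wait_until_ready("http://localhost:8080")` returns `True` once the server accepts requests, or `False` if it is still unavailable after the waiting period.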

Validate service accessibility:

curl https://notebook-38c181af-9501-11f0-a6ea-7ab79f1c53cb-8080.na-usa-1.hpc-ai.com/v1/completions \
-H "Content-Type: application/json" \
-d '{
    "prompt": "Once upon a time",
    "max_tokens": 64
}'
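The same request can be sent from Python, which is convenient when the endpoint is called from a script or an agent loop. This is a sketch using only the standard library; `build_completion_request` and `complete` are hypothetical helper names, and the payload mirrors the curl call above (prompt and max_tokens only).

```python
import json
import urllib.request


def build_completion_request(base_url, prompt, max_tokens):
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, max_tokens=64, timeout=60.0):
    """Send the request and return the decoded JSON response as a dict."""
    req = build_completion_request(base_url, prompt, max_tokens)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Pass your own forwarded endpoint as `base_url`, without the `/v1/completions` suffix, for example `complete(base_url, "Once upon a time")`.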

Example result:

{
    "id": "cmpl-c17...6788cd10c0",
    "object": "text_completion",
    "created": 1758254888,
    "model": "/root/highspeedstorage/zhy1/Alibaba-NLP/Tongyi-DeepResearch-30B-A3B",
    "choices": [
        {
            "index": 0,
            "text": ", four knots: two left-handed trefoil knots and two right-handed trefoil knots...What is the minimum number of operations required to obtain one left",
            "logprobs": null,
            "finish_reason": "length",
            "stop_reason": null,
            "token_ids": null,
            "prompt_logprobs": null,
            "prompt_token_ids": null
        }
    ],
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "prompt_tokens": 4,
        "total_tokens": 68,
        "completion_tokens": 64,
        "prompt_tokens_details": null
    },
    "kv_transfer_params": null
}
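When consuming this response in code, usually only a few fields matter: the generated text, the finish reason, and the token counts. The sketch below extracts them from a response shaped like the example above; `summarize_completion` is a hypothetical helper name. A `finish_reason` of `"length"` means generation stopped because it reached `max_tokens`.

```python
def summarize_completion(response):
    """Pull the generated text and token counts out of a completion response."""
    choice = response["choices"][0]
    usage = response["usage"]
    return {
        "text": choice["text"],
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        # True when the output was cut off at the max_tokens limit.
        "hit_token_limit": choice["finish_reason"] == "length",
    }
```

Applied to the example result, this reports 4 prompt tokens, 64 completion tokens, and `hit_token_limit` set to true, so a larger `max_tokens` would be needed for a complete answer.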

Local Deployment Performance

Measure generation speed to evaluate performance:

time curl https://notebook-38c181af-9501-11f0-a6ea-7ab79f1c53cb-8080.na-usa-1.hpc-ai.com/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Explain the theory of relativity",
    "max_tokens": 100
  }'

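Sample output: in our run, the request for 100 tokens from Tongyi-DeepResearch-30B-A3B completed in approximately 0.535 seconds, or roughly 187 tokens per second if all 100 tokens were generated. Note that time curl measures the full round trip, so the figure includes network latency as well as generation time.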
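To turn a timing like this into a throughput figure, divide the number of generated tokens by the elapsed wall-clock time. The sketch below shows that arithmetic together with a small timing wrapper; both helper names are hypothetical, and the token count should come from `usage.completion_tokens` in the response rather than from the requested `max_tokens`, since generation may stop early.

```python
import time


def tokens_per_second(completion_tokens, elapsed_seconds):
    """Throughput as generated tokens divided by wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completion_tokens / elapsed_seconds


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

With the numbers from the sample run, `tokens_per_second(100, 0.535)` gives about 186.9 tokens per second. A single request is a rough indicator only; averaging over several runs gives a steadier estimate.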

Conclusion

Tongyi DeepResearch represents a milestone in open-source AI agents, and we’re proud to make it accessible through our platform. Try it today to unlock powerful research and reasoning capabilities with just a few clicks.