Pricing

Start free, scale seamlessly.

Choose your preferred storage, inference, and number of instances, and we’ll provide you with an estimated cost.
Easy – just like the rest of Marqo.

Cloud

From $0.38/hour

Pay as you go

Features

Fully managed

End-to-end vector creation and storage

Horizontally scalable

Model customization

CPU and GPU instances

Scale at the click of a button

Access control

High availability

Low latency

Book Demo, Get $500 Credit

Enterprise

Custom Quotes

Tailored to your needs

Features

All the functionality of Marqo Cloud

SSO

Single-tenant deployment

Observability integrations

24/7/365 dedicated support

Migration assistance

Access to ML scientists

VPC deployment (add-on)

Enhanced Enterprise SLA

Sizing assistance

Dedicated Slack channel

Contact Sales

Marqtune

Custom Quotes

Customized for you

Features

Fine-tune embedding models

Generalized Contrastive Learning

Flexible training datasets

Wide range of base models

Train with historical sales data

Model evaluation

Access to ML scientists

Contact Sales

Pricing Calculator

When you use Marqo, you are billed on an hourly basis for the resources you use. Usage is rounded up to 15-minute increments.

Choose your storage


$0.06/hour

marqo.basic

Good for small projects, development, and proof-of-concept applications. Approx. 2M vectors per shard.


$0.87/hour

marqo.balanced

Aimed at production applications with large amounts of data where high availability is a requirement. Approx. 16M vectors per shard.


$2.18/hour

marqo.performance

Designed to support high-velocity concurrent requests against tens or hundreds of millions of vectors. Approx. 16M vectors per shard.

Choose your inference


$0.32/hour

marqo.CPU.large

Suitable for production applications, especially with smaller models.


$0.97/hour

marqo.GPU

The fastest available inference, capable of serving high request volumes with large multimodal models at low latency.

Set your number of Storage Shards, Storage Replicas, and Inference Instances to see your Estimated Total (USD) per hour.
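As a rough illustration of how the hourly estimate comes together from the rates above (the replica formula below, where each replica duplicates every shard, is an assumption, not Marqo's published pricing formula):

```python
# Hypothetical sketch of the calculator's hourly estimate.
# Rates are taken from the tiers listed above.
STORAGE_RATES = {
    "marqo.basic": 0.06,
    "marqo.balanced": 0.87,
    "marqo.performance": 2.18,
}
INFERENCE_RATES = {
    "marqo.CPU.large": 0.32,
    "marqo.GPU": 0.97,
}

def hourly_estimate(storage, shards, replicas, inference, instances):
    # Assumed: each replica duplicates every shard, so the total shard
    # count billed is shards * (1 + replicas).
    storage_cost = STORAGE_RATES[storage] * shards * (1 + replicas)
    inference_cost = INFERENCE_RATES[inference] * instances
    return storage_cost + inference_cost

# Example: 2 marqo.balanced shards, 1 replica, 1 marqo.GPU instance.
print(f"${hourly_estimate('marqo.balanced', 2, 1, 'marqo.GPU', 1):.2f}/hour")
# -> $4.45/hour
```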

We reserve the right to pass on network ingress/egress costs from cloud providers for externally imposed traffic.


Frequently Asked Questions

What is vector search?

Vector search allows you to search documents, images and other data by converting items into a collection of vectors. This collection of vectors summarises the data in semantic form and allows us not only to match documents against queries through analysis of the semantic content, but also to understand where and how the document matched the query. With Marqo, inference to create the vectors is included.
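As a toy illustration of the core idea (not Marqo's API; Marqo runs the embedding model and retrieval for you), documents embedded as vectors can be ranked against a query by cosine similarity. The embeddings below are stand-ins for a real model's output:

```python
import numpy as np

# Stand-in embeddings: in practice a model maps each document
# and query to a high-dimensional vector.
docs = {
    "red running shoes": np.array([0.9, 0.1, 0.3]),
    "blue winter coat":  np.array([0.1, 0.8, 0.2]),
}
query = np.array([0.8, 0.2, 0.3])  # e.g. the embedding of "crimson sneakers"

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank documents by semantic closeness to the query.
for text, vec in sorted(docs.items(), key=lambda kv: -cosine(query, kv[1])):
    print(f"{cosine(query, vec):.3f}  {text}")
```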

What is the best setup for my application?

The number of instances you will need depends on several factors: the number of documents, the size of the documents, and the type of data (image vs. text). When dealing with low search volumes that primarily involve text, or when low latency is not crucial, CPU inference nodes can be a cost-effective solution. GPU inference nodes, on the other hand, provide a significant performance boost when indexing and searching with images, and are recommended for indexing large datasets and processing high-volume, low-latency searches. For multimodal models, marqo.CPU.large is recommended as a minimum.

The estimates for storage capacity provided in our calculator assume you are using a model that produces 768-dimensional vectors.
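As a back-of-the-envelope sizing sketch using the per-shard capacities quoted above (and assuming 768-dimensional vectors):

```python
import math

# Approximate per-shard capacities from the storage tiers above.
VECTORS_PER_SHARD = {
    "marqo.basic": 2_000_000,
    "marqo.balanced": 16_000_000,
    "marqo.performance": 16_000_000,
}

def shards_needed(num_vectors, storage="marqo.balanced"):
    # Round up: a partially filled shard still needs to exist.
    return math.ceil(num_vectors / VECTORS_PER_SHARD[storage])

print(shards_needed(50_000_000))                # 4 marqo.balanced shards
print(shards_needed(3_000_000, "marqo.basic"))  # 2 marqo.basic shards
```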

Do I have to change my code to move from open-source to cloud?

The only changes you need to make are to update your URL and API key when accessing Marqo.
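With the Python client, for example, the change looks roughly like this (the cloud endpoint shown is illustrative; use the URL and API key from your Marqo Cloud console):

```python
import marqo

# Open-source: point the client at your own instance.
mq = marqo.Client(url="http://localhost:8882")

# Marqo Cloud: swap in the cloud URL and your API key.
mq = marqo.Client(url="https://api.marqo.ai", api_key="your-api-key")

# The rest of your code is unchanged, e.g.:
mq.index("my-index").search("crimson sneakers")
```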

How does billing work?

You will be billed at the end of the month for total inference and shard hours used. Usage is rounded up to 15-minute increments.
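For example, 40 minutes of a resource is billed as 45 minutes (0.75 hours). A minimal sketch of the rounding:

```python
import math

def billable_hours(minutes_used):
    # Usage rounds up to the next 15-minute increment.
    return math.ceil(minutes_used / 15) * 0.25

print(billable_hours(40))   # 0.75 hours (billed as 45 minutes)
print(billable_hours(125))  # 2.25 hours
```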

Request a demo

We’d love to speak with you. Send us your questions about Marqo and we’ll set up a time to meet with you.

Book Demo