(TL;DR)
The cost of implementing generative AI in a business can range from a few hundred dollars per month to $190,000 (and more) for a custom generative AI solution based on a fine-tuned open-source model.
This cost variation is influenced by several factors, including the tasks you want to enhance, the model that best fits those tasks, and the selected implementation approach.
To optimize costs, you need to carefully assess project requirements, evaluate on-premise or cloud infrastructure costs, and choose between hiring in-house AI talent or outsourcing the project to a third party.
We’ve already discussed how generative artificial intelligence (Gen AI) compares to traditional AI, as well as its pros and cons. JustSoftLab's generative AI consulting team has also explored Gen AI use cases across various industries, including healthcare, retail, and supply chains.
Additionally, we have evaluated the cost of building AI systems, including infrastructure, and analyzed the expenses associated with preparing training data, fine-tuning models, and deploying machine learning (ML) solutions.
Now it’s time to decipher the cost of implementing generative AI in business.
This analysis can be challenging, as the specifics of your project are unknown to us at this stage.
However, leveraging our expertise in generative AI consulting, we can explore the pricing of Gen AI services and outline the key factors behind Gen AI project costs. This way, we’ll equip you with the knowledge to make informed decisions, potentially saving your business considerable time and resources in this rapidly evolving tech landscape.
Interested? Let’s get started!
Choosing the Right Model and Implementation Approach as Key Factors Influencing Generative AI Costs
When considering integrating generative AI into your company’s tech stack, you need to keep the following in mind:
1. What business tasks do you want to improve with generative AI?
2. What model best suits these tasks?
At the core of generative AI solutions lie foundation models—large models trained on massive amounts of data. Foundation models serve as the base for creating custom Gen AI solutions, simplifying the development process, and reducing generative AI costs. Their capabilities usually include natural language processing (NLP), computer vision (CV), and content generation.
Foundation models’ cognitive capabilities largely depend on the number of parameters they contain. In this context, parameters refer to model elements that are learned from training data, such as weights in a neural network. These parameters help the model make decisions and predictions. The following table illustrates the correlation between the number of parameters—essentially, the volume of these decision-making elements—and the model’s cognitive capabilities.
| Number of training parameters | Model performance characteristics | Potential applications |
| --- | --- | --- |
| 1 billion parameters | Basic knowledge of the world; pattern matching | Customer sentiment analysis in reviews |
| 10 billion parameters | Greater knowledge of the world; following basic instructions | Chatbots facilitating product ordering (HoReCa, eCommerce) |
| 100+ billion parameters | Rich knowledge of the world; complex reasoning | Data analysis, research, and content generation |
The number of parameters, however, is not the only factor that influences the capabilities of foundation models. The quality and diversity of the training data are equally important. Training data is the information fed into the model to learn from, encompassing a wide range of examples that help the model understand and interpret new data. Additionally, the model’s architecture—i.e., the structural design of how the parameters and data interact—and the efficiency of the learning algorithms, which determine how effectively the model learns from data, play critical roles. As a result, in some tasks, a model with fewer parameters but better training data or a more efficient architecture can outperform a larger model.
How could your company select a foundation model that is both effective and meets your expectations regarding the cost of generative AI?
All existing generative AI models can be loosely classified into two types:
Closed-source models are developed by large technology companies, such as Google, Meta, Microsoft, and OpenAI. Their source code, architecture, and application programming interfaces (APIs) can be completely proprietary or made available to third parties (usually for a fee, which is essentially the cost of the generative AI solution). In some cases, you can fine-tune the performance of closed-source models using your data. For the purpose of this article, we’ll be referring to closed-source models as commercially available generative AI solutions. The major advantage of such models is that they come with a cloud infrastructure and are fully maintained by the original developer.
Open-source models have their source code, training techniques, and sometimes even the training data available for public use and modification. Your company could use such models “as is” or retrain them on your own data to achieve better accuracy and performance. However, you’ll have to set up an on-premises or cloud infrastructure for the model to run on. The cost of such generative AI models will thus include computing costs and, if you choose to enhance the Gen AI solution, the expenses associated with model training.
Check out the table below for a quick overview of the closed-source and open-source models’ characteristics.
| Closed-source models | Open-source models |
| --- | --- |
| Source code, architecture, and APIs are proprietary; access is typically sold as a paid service | Source code, training techniques, and sometimes training data are publicly available |
| Ship with cloud infrastructure fully maintained by the original developer | Require your own on-premises or cloud infrastructure to run |
| Customization is limited, although some models can be fine-tuned on your data | Can be used “as is” or retrained on your own data for better accuracy and performance |
| Costs consist mainly of usage fees (per token, character, or image) | Costs consist mainly of computing resources plus, optionally, model training |
Let’s summarize.
If your company is considering implementing generative AI, there are four primary ways to do it:
Using closed-source models without customization. Generative AI pioneers can integrate off-the-shelf products like OpenAI’s ChatGPT, Google Bard, Claude, and Synthesia with their applications using APIs. The integration process is fairly straightforward, and so is the generative AI pricing (more on that later). Commercially available products are updated frequently and provide extensive documentation for AI developers. The downside? Your customization options will be limited, and you will heavily depend on an external company for vital business tasks, like handling customer support queries or producing visual content.
Retraining commercially available solutions on your corporate data. In this scenario, your in-house AI team will select an existing generative AI product developed by a specific vendor, such as OpenAI, and fine-tune it using your own data. Customized Gen AI solutions will better understand user questions and come up with more accurate responses. However, the vendor will still charge a small fee for running your queries, so the final generative AI cost will comprise both operational and customization expenses.
Using open-source foundation models “as is.” In theory, your company could choose RoBERTa, GPT-2, GPT-Neo, or any other open-source model and apply it to business tasks like answering customer emails without further training. However, the cost of generative AI will be determined by the computing resources consumed by the model. Also, your Gen AI solution may underperform when faced with unfamiliar data and tasks.
Retraining open-source models on your data. In this case, you’ll need to obtain and prepare specific data for Gen AI model training, provide on-premise or cloud servers for model training and operations, and continue to fine-tune and update the model as your tasks evolve. While this bespoke approach guarantees superior model performance, it also entails higher generative AI costs.
Now that you know your implementation options, let’s zoom in on the cost of generative AI these options entail.
Insight into generative AI pricing based on the implementation scenario
The cost of commercially available Gen AI tools
Off-the-shelf services that facilitate text processing and generation typically charge enterprises based on the number of characters or tokens—i.e., basic units of text, which can range from punctuation marks to words and other elements of syntax—in input or output text.
Here’s how this works in practice:
Character-based billing. Some solutions, such as Gen AI tools driven by Google’s Vertex AI, bill users based on the number of characters in the input and output text. They count each letter, number, space, and punctuation mark as a character. The generative AI pricing for the PaLM 2 for Text model supported by Vertex, for instance, starts from $0.0005 per 1,000 characters for input and output text (billed separately).
Token-based billing. More advanced Gen AI tools tend to break text down into tokens instead of characters. Depending on a model’s training and processing methods, a token can be a punctuation mark, a word, or part of a word; OpenAI, for example, estimates that one token corresponds to roughly four characters of English text. A simple sentence like “Tom has brought Jill flowers.” would thus come to roughly eight tokens, since words like “brought” and “flowers” exceed the four-character average. The cost of such generative AI solutions largely depends on your chosen language model. OpenAI’s GPT-4 Turbo, one of the most sophisticated tools on the market, charges $0.01 per 1,000 tokens for input text and $0.03 per 1,000 tokens for output text. For GPT-3.5 Turbo, an older model, the prices are significantly lower: $0.001 per 1,000 tokens for input text and $0.002 per 1,000 tokens for output text (see the cost estimator sketch below).
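To see how these per-token rates translate into a monthly bill, here is a minimal back-of-the-envelope estimator in Python. The per-1,000-token rates are the GPT-4 Turbo and GPT-3.5 Turbo figures quoted above; the query volume and average token counts are hypothetical placeholders you would swap for your own usage projections.

```python
# Rough monthly cost estimator for token-based Gen AI billing.
# Rates are the per-1,000-token prices quoted in the article;
# the query volume and token counts below are hypothetical.

PRICING = {
    "gpt-4-turbo": {"input": 0.01, "output": 0.03},      # $ per 1,000 tokens
    "gpt-3.5-turbo": {"input": 0.001, "output": 0.002},  # $ per 1,000 tokens
}

def monthly_cost(model: str, queries_per_month: int,
                 avg_input_tokens: int, avg_output_tokens: int) -> float:
    """Estimate the monthly bill for a given model and query volume."""
    rates = PRICING[model]
    cost_per_query = (avg_input_tokens / 1000) * rates["input"] \
                   + (avg_output_tokens / 1000) * rates["output"]
    return cost_per_query * queries_per_month

if __name__ == "__main__":
    # Example: a support chatbot handling 50,000 queries a month,
    # with ~500 input tokens and ~300 output tokens per query.
    for model in PRICING:
        print(f"{model}: ${monthly_cost(model, 50_000, 500, 300):,.2f} per month")
```

Running the same projection for both models is a quick way to see how much the choice of model, rather than raw query volume, drives the final bill.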
It should be noted that different generative AI providers have different notions of characters and tokens. To select the most cost-effective option, you should study their documentation and plans and consider which product best fits your unique business needs. For example, if your tasks revolve around text generation rather than analysis, a generative AI service with lower output rates will be more suitable.
Gen AI services for visual content creation, meanwhile, tend to charge users per generated image, with fees tied to image size and quality. A single 1024×1024-pixel image produced by DALL·E 3 in standard quality would cost you $0.04. For larger images (1024×1792 pixels), as well as high-definition images, the price would go up to $0.08–$0.12 apiece.
And don’t forget about turn-key Gen AI platforms, such as Synthesia.io, which take a more traditional approach to pricing. If your marketing team is looking to speed up the video creation process, you can try the tool for as little as $804 per year.
The cost of customizing commercially available Gen AI products
As you can see from the previous section, the majority of ready-made Gen AI products leverage the pay-as-you-go monetization strategy.
While their pricing models look fairly straightforward at first glance, it could be challenging to predict how many queries your employees will run, especially if you seek to explore multiple generative AI use cases in various departments.
This creates confusion around Gen AI tools’ pricing and total cost of ownership, much as it did in the early days of cloud computing.
Another disadvantage of using commercial Gen AI solutions is that general-purpose products like ChatGPT lack contextual knowledge, such as familiarity with your company’s structure, products, and services. This makes it difficult to augment operations like customer support and report generation with AI capabilities, even if you master prompt engineering.
According to Eric Lamarre, Senior Partner at McKinsey, to solve this problem, organizations “need to create a data environment that can be consumed by the model.” In other words, you’ll have to retrain commercially available Gen AI tools on your corporate data, as well as information pulled from external sources via APIs.
There are two ways to achieve this:
Using software-as-a-service (SaaS) platforms with generative AI capabilities.
Many prominent SaaS vendors, including SAP, TIBCO Spotfire, and Salesforce, are rolling out generative AI services that can be fine-tuned using customer data. Salesforce, for example, has launched Einstein Copilot, a conversational AI assistant that pulls proprietary data from Salesforce Data Cloud to craft personalized responses to customer questions. The information utilized by the intelligent assistant includes Slack conversations, telemetry, enterprise content, and other structured and unstructured data. Salesforce clients can also create custom AI models, skills, and prompts using Einstein Copilot Studio’s no-code Prompt Builder and Model Builder. As of now, the latter supports OpenAI’s large language models (LLMs), but there are plans to integrate the product with other third-party solutions, including Amazon Bedrock and Vertex AI. As Einstein Copilot is still in its pilot phase (no pun intended), its generative AI pricing has not yet been unveiled. However, the cost of the generative AI Sales GPT assistant, which currently totals $50 per user per month, could give you a general idea of what to expect.
Integrating your corporate software with Gen AI solutions over APIs and retraining models on your data. To reduce the cost of generative AI implementation, you could eliminate the intermediary SaaS tools, merging your apps directly with commercial Gen AI solutions at the API level. For instance, if you’re looking to supercharge your customer support chatbot with Gen AI capabilities, you can sync it with one of OpenAI’s models—e.g., GPT-3.5 or GPT-4—using the OpenAI API. Next, you need to prepare your data for machine learning, upload the data to OpenAI, and manage the fine-tuning process using the OpenAI CLI tool and the OpenAI Python library (a workflow sketch follows below). While fine-tuning the model, you’ll be charged $0.008 per 1,000 tokens (GPT-3.5). Once your model goes into production, the input and output rates will amount to $0.003 per 1,000 tokens and $0.006 per 1,000 tokens, respectively. The overall cost of generative AI will also include storage costs, provided you choose to host your data on OpenAI servers. Data storage expenses could add $0.20 per 1 GB of data per day to the final estimate. And don’t forget the data preparation and model fine-tuning efforts. Unless your IT department possesses the required skills, you’ll have to partner with a reliable AI development services company.
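To make the workflow above more concrete, here is a minimal sketch of how fine-tuning might be kicked off with the OpenAI Python library (v1.x). The training file name is a hypothetical placeholder, the data must already be in the JSONL chat format OpenAI expects, and method names can differ between SDK versions, so treat this as a starting point rather than production code.

```python
# Minimal fine-tuning workflow sketch for the OpenAI Python library (v1.x).
# Assumes OPENAI_API_KEY is set and "support_conversations.jsonl" (a
# hypothetical file) already contains chat-formatted training examples.
from openai import OpenAI

client = OpenAI()

# 1. Upload the training data.
training_file = client.files.create(
    file=open("support_conversations.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Launch a fine-tuning job on top of GPT-3.5 Turbo.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print("Fine-tuning job started:", job.id)

# 3. Once the job completes, the fine-tuned model can be queried like any
#    other chat model (its name becomes available on the finished job):
# response = client.chat.completions.create(
#     model=job.fine_tuned_model,
#     messages=[{"role": "user", "content": "Where is my order?"}],
# )
```

The per-token fine-tuning and usage rates quoted above apply on top of this workflow, so the volume of training data and production traffic still determines the bulk of the cost.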
The cost of using open-source Gen AI models “as is”
Disclaimer: We’re not suggesting that you build a custom foundation model akin to ChatGPT from the ground up. That’s a venture best left to companies with substantial backing, such as OpenAI, which relies on Microsoft’s support to offset its $540 million in losses.
Even more basic foundation models, like GPT-3, can rack up initial training and deployment costs exceeding $4 million. Furthermore, the complexity of these foundation models has skyrocketed at an astonishing rate in recent years.
The amount of computing resources required for training large AI models doubles every 3.5 months. The models’ complexity is growing, too: in 2018, BERT-Large was trained with 340 million parameters, while OpenAI’s GPT-3 was trained with around 175 billion parameters.
The good news is that foundation models are already available, which makes it relatively easy for businesses to start experimenting with them while optimizing generative AI implementation costs.
Essentially, we could treat foundation models as a toolkit for AI software engineers since they provide a starting point for solving complex problems while still leaving room for customization.
We could loosely divide existing foundation models into three categories:
Language models are designed to handle text translation, generation, and question-answering tasks.
Computer vision models excel at image classification, object detection, and facial recognition.
Generative AI models create content that resembles the data they have consumed, such as new images, simulations, or, in some cases, textual information.
Once you’ve selected an open-source model that best suits your needs, you can integrate it with your software using APIs and run it on your own server infrastructure, as illustrated in the sketch below.
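As a rough illustration of how little code it takes to stand up an open-source model “as is,” here is a minimal sketch using the Hugging Face transformers library (assumed to be installed along with PyTorch). GPT-2 serves purely as an example, and output quality, as noted above, is a separate question.

```python
# Minimal sketch: serving an open-source model "as is" with Hugging Face
# transformers (pip install transformers torch). GPT-2 is just an example.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

draft = generator(
    "Dear customer, thank you for reaching out about your delayed order.",
    max_new_tokens=60,
    num_return_sequences=1,
)
print(draft[0]["generated_text"])
```

The code itself is trivial; the real expenses come from the infrastructure the model runs on, as the following breakdown shows.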
This approach involves the following generative AI costs:
Hardware costs. Running AI models, especially large ones, requires significant computational resources. If your company lacks the appropriate hardware, you may need to invest in powerful GPUs or CPUs, which can be expensive. If your model is relatively small, a high-end GPU like an NVIDIA RTX 3080 could suffice; the cost of such a GPU can range from $700 to $1,500. For larger models, such as GPT-2 and up, you’ll need multiple high-end GPUs or even specialized AI accelerators. A single NVIDIA A100 GPU, for example, can cost between $10,000 and $20,000. A setup with multiple GPUs can thus cost between $30,000 and $50,000.
Cloud computing costs. As an alternative to buying hardware, you can rent cloud computing resources from providers like Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure. These services charge based on usage, so costs will depend on how much you use their resources in terms of compute time and storage. For example, GPU instances on AWS (like P3 or P4) can cost anywhere from $3 to $24 per hour, depending on the instance type.
Electricity and maintenance. If you use your own hardware, you’ll incur electricity costs for running the machines and possibly additional cooling systems. Maintenance costs for hardware can also add up.
Integration and deployment. Integrating the AI model into your existing systems and deploying it (especially in a production environment) might require additional software development efforts, which can incur labor costs. The cost of outsourcing AI development to a software development company could range from $50 to $200 per hour, with total expenses ranging from a few thousand to tens of thousands of dollars.
Data storage and management. Storing and managing the data used by the model can be costly, especially when dealing with large datasets or using cloud storage solutions. For on-site installations, the cost of storing generative AI data could range from $1,000 to $10,000, depending on the size of the training dataset and redundancy needs. Charges for cloud-based data storage solutions, like AWS S3, can vary from $0.021 to $0.023 per GB per month, with extra costs for operations and data transfer.
Ultimately, how much could it cost your company to adopt a generative AI foundation model “as is,” deploying it on your own infrastructure?
For a mid-sized enterprise aiming to use a moderately large model like GPT-2 on-premises, the associated generative AI costs could span:
Hardware: $20,000–$50,000 (for a couple of high-end GPUs or a basic multi-GPU setup)
Electricity and maintenance: Around $2,000–$5,000 per year
Integration and deployment: $10,000–$30,000 (assuming moderate integration complexity)
Data storage and management: $5,000–$15,000 (varying with data size)
The total cost of setting up and operating a generative AI solution would include the following:
Initial deployment expenses: Approximately $37,000 to $100,000 (hardware + initial integration and storage setup)
Recurring expenses: $7,000 to $20,000 (including electricity, maintenance, ongoing integration, and data management costs)
These ballpark estimates can vary significantly based on specific requirements, location, and market conditions. It’s always best to consult with a professional for a more personalized and accurate estimate. Additionally, it’s a good idea to check current market rates for hardware and cloud services for the most up-to-date prices.
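If you want to adapt these ballpark figures to your own quotes, a simple spreadsheet-style model is enough. The sketch below reproduces the ranges discussed above in Python; note that folding first-year electricity and maintenance into the initial total (so the sums line up with the $37,000–$100,000 and $7,000–$20,000 ranges) is our assumption, and every line item should be replaced with vendor quotes for your specific setup.

```python
# Rough cost model for running an open-source Gen AI model on your own
# infrastructure. Ranges mirror the ballpark figures above; replace them
# with your own vendor quotes. First-year electricity/maintenance is
# counted as part of the initial outlay (our assumption).

INITIAL = {
    "hardware": (20_000, 50_000),
    "integration_and_deployment": (10_000, 30_000),
    "data_storage_setup": (5_000, 15_000),
    "first_year_electricity_and_maintenance": (2_000, 5_000),
}

RECURRING_PER_YEAR = {
    "electricity_and_maintenance": (2_000, 5_000),
    "ongoing_integration_and_data_management": (5_000, 15_000),
}

def total(items: dict) -> tuple:
    """Sum the low and high ends of every line item."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high

print("Initial deployment: ${:,}-${:,}".format(*total(INITIAL)))
print("Recurring per year: ${:,}-${:,}".format(*total(RECURRING_PER_YEAR)))
```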
The cost of retraining open-source Gen AI solutions using your data
If your company is thinking about adjusting an open-source foundation model, it’s important to consider the factors that can affect the cost of implementing generative AI.
Such factors encompass:
Model size. Larger models, such as GPT-3, require more resources to fine-tune and deploy. As a result, the cost of generative AI increases with the size and complexity of the model. Simpler open-source foundation models like GPT-2, XLNet, and StyleGAN2, meanwhile, cannot generate content with the same level of coherence and relevance.
Computational resources. Retraining a foundation model using your company’s data demands substantial computing power. The cost of a generative AI solution thus depends on whether you’re utilizing your own hardware or cloud services, with the latter’s price varying based on the cloud provider and the scale of your operations. If you opt for a simpler model and deploy it on-premises, expect to spend $10,000–$30,000 on GPUs to fine-tune the generative AI solution. With cloud computing, the expenses could range between $1 and $10 per hour, depending on the instance type. GPT-3-like open-source models require a more advanced GPU setup, costing upwards of $50,000–$100,000, with the associated cloud computing expenses ranging from $10 to $24 per hour for high-end GPU instances (a fine-tuning sketch follows this list).
Data preparation. The process of collecting, cleaning, and preparing your data for fine-tuning foundation models can be resource-intensive. The cost of generative AI implementation will therefore include the expenses associated with data storage, processing, and possibly purchasing training datasets if your company lacks its own data or cannot use it for security and privacy reasons.
Development time and expertise. Artificial intelligence talent doesn’t come cheap. A US-based in-house AI engineer will cost your company $70,000–$200,000 annually, plus the hiring, payroll, social security, and other administrative expenses. You can reduce generative AI costs by partnering with an offshore software engineering company with AI development expertise. Depending on the location, such companies’ hourly rates can range from $62 to $95 for senior development talent in key outsourcing locations, such as Central Europe and Latin America.
Maintenance costs. You’ll be solely responsible for maintaining, updating, and troubleshooting the model, which requires ongoing effort and machine learning engineering and operations (MLOps) expertise.
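To make the retraining step itself less abstract, here is a minimal fine-tuning sketch using the Hugging Face transformers and datasets libraries, with GPT-2 as the base model. The training file name and hyperparameters are hypothetical placeholders; a real project would add evaluation, checkpointing, and, for larger models, multi-GPU or parameter-efficient fine-tuning, which is where much of the cost described above comes from.

```python
# Minimal sketch: fine-tuning GPT-2 on your own text data with Hugging Face
# transformers + datasets. "corporate_texts.txt" is a hypothetical file with
# one training example per line; hyperparameters are illustrative only.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("text", data_files={"train": "corporate_texts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gpt2-finetuned",
        num_train_epochs=3,
        per_device_train_batch_size=4,
    ),
    train_dataset=tokenized,
    # Causal language modeling: labels are the input tokens shifted by one.
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("gpt2-finetuned")
```

The script itself is short; the budget goes into the GPUs it runs on, the data preparation that precedes it, and the MLOps effort of repeating this loop as your data and tasks evolve.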
Considering the factors mentioned above, what is the realistic cost of creating a customized generative AI solution based on a readily available foundation model?
For a mid-sized enterprise looking to fine-tune a moderately large model like GPT-2, the associated generative AI implementation costs could span:
Hardware: $20,000–$30,000 (for a moderate GPU setup)
Development: Assuming 6 months of development time with a mix of in-house and outsourced talent:
In-house: $35,000–$100,000 (half-year salary)
Outsourcing: $20,000–$40,000 (assuming 400 hours at an average rate of $75/hr)
Data preparation: $5,000–$20,000 (varying with data size and complexity)
Maintenance: $5,000–$15,000 per year (ongoing expenses)
The total cost of setting up and operating a generative AI solution would include the following:
Initial deployment expenses: Approximately $80,000 to $190,000 (including hardware, development, and data preparation costs)
Recurring expenses: $5,000 to $15,000 per year (maintenance and ongoing costs)
Actual Gen AI development and implementation costs can vary based on specific project requirements, the availability of training data and in-house AI talent, and the location of your outsourcing partner. For the most accurate and current pricing, it’s advisable to consult with professionals or service providers directly.
While $190,000 for a generative AI system might seem unreasonably expensive, the cost of building a generative AI solution using open-source foundation models might be lower than opting for a commercially available tool.
Before ChatGPT gained attention, Latitude, a pioneering startup responsible for the AI-based adventure game called AI Dungeon, had been utilizing OpenAI’s GPT model for text generation.
As their user base grew, so did OpenAI’s bills and Amazon infrastructure expenses. At some point, the company was paying $200,000 per month in associated costs to handle the increasing number of user queries.
After switching to a new generative AI provider, the company reduced operating costs to $100,000 per month and adjusted its monetization strategy, introducing a monthly subscription for advanced AI-powered features.
To select the right implementation approach while optimizing generative AI pricing, it is thus important to thoroughly analyze your project requirements beforehand. And that’s why we always encourage our clients to kick off their AI development initiatives with a discovery phase.
Things to consider when implementing Gen AI in business
Now that you know what to expect from generative AI cost-wise, it’s time to talk about the technology’s implementation pitfalls and considerations:
Foundation models, especially large language models, might hallucinate, producing seemingly legitimate but utterly wrong answers to user questions. Your company could avoid this scenario by improving training data, experimenting with different model architectures, and introducing effective user feedback loops.
Gen AI solutions are trained using vast amounts of data that quickly become outdated. As a result, you’ll have to retrain your model regularly, which increases the cost of generative AI implementation.
Foundation models trained on specific data, such as electronic health record (EHR) entries, might struggle to produce valid content outside of their immediate expertise. General-purpose models, on the other hand, struggle with domain-specific user queries. Some ways to address this issue include creating hybrid models, tapping into transfer learning techniques, and fine-tuning the models through user feedback.
Gen AI solutions are black-box by nature, meaning it’s seldom clear why they produce certain outcomes and how to evaluate their accuracy. This lack of understanding might prevent developers from tweaking the models. By following explainable AI principles during generative AI model training, such as introducing model interpretability techniques, attention mechanisms, and audit trails, you can gain insight into the model’s decision-making process and optimize its performance.
Also, there are several questions that your company needs to answer before getting started with generative AI implementation:
Is there a solid buy vs. build strategy in place to validate that your company only adopts generative AI in functions where the technology would become a differentiator while preventing vendor lock-in? This strategy should be augmented with a detailed roadmap for change management and Gen AI scaling—and provisions for redesigning entire business processes, should the need arise.
Does your in-house IT department possess adequate MLOps skills to test, fine-tune, and maintain the quality of complex ML models and their training data? If not, have you already selected a reliable AI development company to take care of these tasks?
Do you have a substantial amount of computing resources, both in the cloud and on the edge? Also, it’s important to assess the scalability of your IT infrastructure as well as the possibility of reusing Gen AI models across different tasks, processes, and units.
Does your company or your AI development partner have the skills to test the feasibility of Gen AI through proof of concept (PoC) and scale your experiments outside the controlled sandbox environment?
Last but not least, does your organization have effective privacy and security mechanisms to protect sensitive information and ensure compliance with industry- and region-specific regulations?
Having a well-thought-out implementation plan will not only help you adopt the technology in a risk-free way and reap the benefits faster, but also reduce the cost of generative AI.
At JustSoftLab, we are ready to support you at every stage of your generative AI journey—from strategic planning to creating customized solutions using the latest AI technologies.