Great work so far!

Despite being renowned for its GPT series of chat models, OpenAI hosts a diverse array of models capable of performing many different tasks.

In this course, we'll focus primarily on OpenAI's text-based models, but later on we'll also take a look at the audio transcription and translation capabilities of the Whisper model.

For now, however, let's take a closer look at the text capabilities available through the API.

The Completions endpoint allows users to send a prompt and receive a model-generated response that attempts to complete the prompt in a likely and consistent way.

Completions is used for so-called single-turn tasks, as there is a single prompt and response. However, the models available via this endpoint are extremely flexible, and are capable of answering questions, performing classification tasks, determining text sentiment, explaining complex topics, and much more.

The Completions endpoint is available via the Completion class of the openai package.
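To make the single-turn idea concrete, here is a minimal sketch that builds, but does not send, a request to the Completions REST endpoint using only the standard library. The helper name `build_completion_request`, the model name, the prompt, and the placeholder key are all assumptions for illustration; the Completion class in the openai package wraps a request of roughly this shape.

```python
import json
import os
import urllib.request

COMPLETIONS_URL = "https://api.openai.com/v1/completions"


def build_completion_request(prompt, api_key, model="gpt-3.5-turbo-instruct", max_tokens=50):
    """Build (but do not send) a single-turn Completions request.

    The default model name is an assumption; use whichever completion
    model your account has access to.
    """
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        COMPLETIONS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder key so the sketch runs offline; a real key is needed to send it.
request = build_completion_request(
    "Classify the sentiment of: 'I loved this course!'",
    api_key=os.environ.get("OPENAI_API_KEY", "sk-placeholder"),
)
print(request.get_method(), request.full_url)
print(json.loads(request.data)["prompt"])

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["text"])
```

One prompt goes in and one completion comes back, which is what makes this a single-turn task.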

The Chat endpoint can be used for applications that require multi-turn tasks, including assisting with ideation, customer support questions, personalized tutoring, translating languages, and writing code.

Chat models also perform well on single-turn tasks, so many applications are built on top of chat models for flexibility.

The openai package provides the ChatCompletion class for accessing the Chat endpoint, but we'll cover how to use Chat later in the course.

The Moderation endpoint is used to check whether content violates OpenAI's usage policies, such as inciting violence or promoting hate speech.

The sensitivity of the model to different types of violations can be customized for specific use cases that may require stricter or more lenient moderation.
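One way to picture this customization is to apply our own thresholds to the per-category scores that the Moderation endpoint returns. The sketch below assumes that approach; the `sample_result` dictionary is hand-written with invented scores, and the `violations` helper and the threshold values are hypothetical.

```python
# Hypothetical, hand-written result in the shape returned by the Moderation
# endpoint: one score between 0 and 1 per category. The scores are invented.
sample_result = {
    "flagged": False,
    "category_scores": {"violence": 0.42, "hate": 0.03, "harassment": 0.10},
}


def violations(result, thresholds, default_threshold=0.5):
    """Return the categories whose score meets or exceeds our own threshold."""
    return sorted(
        category
        for category, score in result["category_scores"].items()
        if score >= thresholds.get(category, default_threshold)
    )


strict = {"violence": 0.2, "hate": 0.2}   # stricter: flag at lower scores
lenient = {"violence": 0.8, "hate": 0.8}  # more lenient: flag only high scores

print(violations(sample_result, strict))   # ['violence']
print(violations(sample_result, lenient))  # []
```

The same piece of content is flagged under the strict settings and passes under the lenient ones, so the threshold is where the use case's tolerance gets encoded.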

For business use cases with frequent requests to the API, it's important to manage usage across the business.

Setting up an organization for the API allows for better management of access, billing, and usage limits to the API.

Users can be part of multiple organizations and attribute requests to specific organizations for billing.

To attribute a request to a specific organization, we only need to add one more line of code.

Like the API key, the organization ID can be set before the request.
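As a rough sketch of what that one extra line amounts to, the snippet below builds the request headers with and without an organization. At the HTTP level the organization travels in the `OpenAI-Organization` header; in pre-1.0 versions of the openai package this corresponds to setting `openai.organization` next to `openai.api_key`. The helper name `auth_headers` and both placeholder values are assumptions for illustration.

```python
import os


def auth_headers(api_key, organization_id=None):
    """Return request headers, optionally attributing usage to an organization."""
    headers = {"Authorization": f"Bearer {api_key}"}
    if organization_id:
        # The one extra line: attribute the request to this organization.
        headers["OpenAI-Organization"] = organization_id
    return headers


# Placeholder values so the sketch runs without real credentials.
headers = auth_headers(
    api_key=os.environ.get("OPENAI_API_KEY", "sk-placeholder"),
    organization_id="org-placeholder",
)
print(sorted(headers))
```

Leaving `organization_id` unset sends the request without the extra header, in which case usage is attributed to the account's default organization.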

API rate limits are another key consideration for companies building features on the OpenAI API. Rate limits are a cap on the frequency and size of API requests.

They are put in place to ensure fair access to the API, prevent misuse, and manage the load on the infrastructure that supports the API.

In many cases, this won't be an issue, but if a feature is exposed to a large user base, or its requests generate large bodies of content, it could be at risk of hitting the rate limits.

Much of this risk can be mitigated by having separate organizations for each business unit or product feature, depending on the number of features built on the OpenAI API, instead of running multiple features under the same organization.

In this example, we've created separate OpenAI organizations for three different AI-powered features: a customer service chatbot, a content recommendation system, and a video transcript generator.

This distributes the requests to reduce the risk of hitting the rate limit. It also removes the single point of failure, so an issue with one organization, such as a billing issue, will only result in the failure of a single feature. Product-separated organizations also provide more granular insights into usage and billing.
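A minimal sketch of this setup, assuming each feature looks up its own organization ID before calling the API. The feature keys, the placeholder organization IDs, and the helper names `organization_for` and `headers_for` are all hypothetical.

```python
# Hypothetical mapping from product feature to its own OpenAI organization.
FEATURE_ORGANIZATIONS = {
    "customer_service_chatbot": "org-chatbot-placeholder",
    "content_recommendation": "org-recsys-placeholder",
    "video_transcript_generator": "org-transcripts-placeholder",
}


def organization_for(feature):
    """Look up the organization that a feature's requests are billed to."""
    try:
        return FEATURE_ORGANIZATIONS[feature]
    except KeyError:
        raise ValueError(f"No organization configured for feature {feature!r}") from None


def headers_for(feature, api_key):
    """Build request headers that attribute usage to the feature's organization."""
    return {
        "Authorization": f"Bearer {api_key}",
        "OpenAI-Organization": organization_for(feature),
    }


for feature in FEATURE_ORGANIZATIONS:
    print(feature, "->", headers_for(feature, "sk-placeholder")["OpenAI-Organization"])
```

Because each feature's requests count against a different organization, a rate limit or billing problem in one entry of the mapping leaves the other two unaffected.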

Let's practice!