WEBVTT

00:00.000 --> 00:01.500
-: Hey, and in this video you're gonna learn

00:01.500 --> 00:03.210
about some of the core features

00:03.210 --> 00:05.400
that OpenAI provides you with as a developer

00:05.400 --> 00:06.630
so that you get an awareness

00:06.630 --> 00:08.520
about all the different types of capabilities

00:08.520 --> 00:11.370
that you're gonna be able to do when using this API.

00:11.370 --> 00:14.520
For example, some of the core language capabilities

00:14.520 --> 00:15.960
are text generations,

00:15.960 --> 00:18.810
so OpenAI can easily generate text,

00:18.810 --> 00:21.270
you can use the responses API

00:21.270 --> 00:22.830
like you can see on the left here,

00:22.830 --> 00:26.190
and the models range from lightweight to advanced.

00:26.190 --> 00:27.570
You can control parameters

00:27.570 --> 00:29.640
like we were showing you earlier in the Playground,

00:29.640 --> 00:31.110
such as the temperature,

00:31.110 --> 00:33.870
the maximum number of tokens that can be generated,

00:33.870 --> 00:35.490
and also the top-p.

00:35.490 --> 00:37.110
There's also content filtering

00:37.110 --> 00:38.910
and safety measures that are built in.

00:38.910 --> 00:42.090
Secondly, you can extract structured outputs from OpenAI.

00:42.090 --> 00:45.030
This allows you to get predictable JSON response.

00:45.030 --> 00:47.580
This is perfect for doing data extraction

00:47.580 --> 00:49.740
and structured information retrieval,

00:49.740 --> 00:52.530
also enforces the output format

00:52.530 --> 00:54.930
and validates that it is in that structure.

00:54.930 --> 00:57.960
This reduces the post-processing effort that you have to do

00:57.960 --> 01:00.540
and it basically means you can get structured data

01:00.540 --> 01:02.130
from just about anything.

01:02.130 --> 01:04.770
Also, you can use ChatGPT's API

01:04.770 --> 01:08.370
or OpenAI's API to analyze images.

01:08.370 --> 01:11.730
You can provide a prompt with the image URL

01:11.730 --> 01:15.660
or attach the image as Base64 encoded string,

01:15.660 --> 01:18.060
and then you can basically get an output back

01:18.060 --> 01:21.270
directly from OpenAI about that image.

01:21.270 --> 01:22.770
You can also generate images

01:22.770 --> 01:27.090
with OpenAI's API using the dall-e-3 model at the moment,

01:27.090 --> 01:28.440
and you can put in a prompt

01:28.440 --> 01:30.840
and the number of generations you want with n 1

01:30.840 --> 01:33.120
and the size as well of the image.

01:33.120 --> 01:35.850
OpenAI also offers text-to-speech,

01:35.850 --> 01:38.070
so generating real life-like speech

01:38.070 --> 01:39.810
directly from a textual prompt

01:39.810 --> 01:42.663
and also speech-to-text, i.e. transcription.

01:43.950 --> 01:47.190
There are some more advanced features of OpenAI's API,

01:47.190 --> 01:49.680
such as function calling or tools,

01:49.680 --> 01:52.200
and you can basically provide a agent

01:52.200 --> 01:54.240
with a get_weather function,

01:54.240 --> 01:56.580
then the agent is capable of understanding

01:56.580 --> 01:58.200
when to call that tool.

01:58.200 --> 02:00.990
And you can see, for example, in the prompt on the left,

02:00.990 --> 02:02.977
we have this query from the user,

02:02.977 --> 02:05.340
"What's the weather like in Paris today?"

02:05.340 --> 02:07.710
That will likely lead the AI choosing

02:07.710 --> 02:11.580
to use a get_weather tool to answer that user's query.

02:11.580 --> 02:14.460
There are more recent models, which are reasoning models,

02:14.460 --> 02:18.000
these are very good for solving complex problems

02:18.000 --> 02:20.490
and I would recommend having a look at the newer models.

02:20.490 --> 02:22.590
So o3 is the most performant model

02:22.590 --> 02:24.210
at this time of recording.

02:24.210 --> 02:25.590
There is also o4-mini,

02:25.590 --> 02:28.050
which is slightly more cost-efficient.

02:28.050 --> 02:30.330
You have a parameter that allows you

02:30.330 --> 02:33.780
to see on the left-hand side here the ability to choose

02:33.780 --> 02:36.840
how much reasoning effort should be gone into.

02:36.840 --> 02:38.730
There is also embeddings,

02:38.730 --> 02:43.260
which is a way to represent words in a numerical format.

02:43.260 --> 02:47.340
This is useful for semantic search and similarity use cases.

02:47.340 --> 02:49.807
And you can see on the left here, we can take an input,

02:49.807 --> 02:52.740
"The quick brown fox jumps over the lazy dog,"

02:52.740 --> 02:55.680
and then basically we'll produce a series of numbers,

02:55.680 --> 02:57.570
also known as an embedding.

02:57.570 --> 02:58.920
Some of the different types of ways

02:58.920 --> 03:00.270
that you might like to use this

03:00.270 --> 03:02.220
is similarity matching,

03:02.220 --> 03:04.350
dynamically bringing content back,

03:04.350 --> 03:06.000
that's similar to a query,

03:06.000 --> 03:09.030
clustering and classification, knowledge bases,

03:09.030 --> 03:12.450
and RAG, also known as retrieval augmented generation.

03:12.450 --> 03:13.950
There are lots of different things you can do

03:13.950 --> 03:16.410
inside of OpenAI beside the core features,

03:16.410 --> 03:17.880
such as fine-tuning,

03:17.880 --> 03:20.580
so training your own models on your specific data,

03:20.580 --> 03:23.700
running evaluations, or commonly known as evals,

03:23.700 --> 03:26.550
systematically evaluating your model performance,

03:26.550 --> 03:28.020
and also distillation,

03:28.020 --> 03:30.630
so creating smaller, more specialized models,

03:30.630 --> 03:34.260
which helps to reduce the inference cost and the latency.

03:34.260 --> 03:36.060
There's also a bunch of developer resources

03:36.060 --> 03:37.830
that are put into this video,

03:37.830 --> 03:39.870
so feel free to have a look at that

03:39.870 --> 03:41.460
in your journey with OpenAI.

03:41.460 --> 03:42.780
In the next video, we're gonna have a look

03:42.780 --> 03:44.310
at how you can get started.

03:44.310 --> 03:47.160
So creating an account, creating an API key,

03:47.160 --> 03:48.810
and then as well as that, you'll also need

03:48.810 --> 03:52.950
to install the SDK with this pip install openai command.

03:52.950 --> 03:54.800
Cool, I'll see you in the next video.