WEBVTT

00:00.880 --> 00:01.480
All right.

00:01.520 --> 00:03.240
Advanced consistent characters.

00:03.240 --> 00:08.480
So how do you take a character in one image and use the exact same character in a different scene?

00:08.480 --> 00:12.960
So this is the same guy in a business suit, or playing tennis.

00:13.000 --> 00:14.480
I have a little trick for doing this.

00:14.480 --> 00:20.280
I used to do this in Midjourney, and I figured out a way to do it in fal recently, so that's what

00:20.280 --> 00:21.560
we're trying to accomplish.

00:21.560 --> 00:24.000
I'll just show you guys how this works.

00:24.720 --> 00:26.520
We're going to run through it one by one.

00:27.000 --> 00:33.960
So first we need to just install the fal client, make sure it's there, and then load the environment

00:33.960 --> 00:37.880
variables so we can get an API key for fal, the FAL_KEY.

00:38.320 --> 00:40.640
So make sure you've done that there.
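
NOTE
A minimal sketch of checking that the key is available before any request is made.
It assumes the key lives in an environment variable named FAL_KEY.
require_key is a hypothetical helper for illustration, not part of the fal client.
```python
import os
def require_key(name="FAL_KEY", env=None):
    """Return the API key stored under `name`, or raise a clear error if it is missing."""
    # Assumption: the key is read from the process environment (for example after loading a .env file).
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your environment or .env file")
    return value
```
Calling require_key() once at the top of the notebook would surface a missing key early, instead of at the first upload.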

00:41.080 --> 00:44.160
And then we just need to bring in the fal client.

00:45.240 --> 00:50.920
So let's import fal_client, import requests, PIL Image, BytesIO.

00:51.520 --> 00:54.400
And then get the IPython display as well.

00:55.480 --> 00:55.880
All right.

00:55.920 --> 01:04.680
And we want to take the original image which is this rhino in a suit.

01:05.650 --> 01:08.650
And then we want to shrink the image.

01:08.690 --> 01:09.090
Oh, actually.

01:09.090 --> 01:09.330
Sorry.

01:09.370 --> 01:15.050
We want to basically create a new image which will double the width.

01:15.090 --> 01:15.330
Right.

01:15.370 --> 01:22.410
So create a new image with double width and white background.

01:24.010 --> 01:24.290
All right.

01:24.330 --> 01:29.130
So this doubles the width and that keeps the original height.

01:29.530 --> 01:32.050
And then we're just creating this new image.

01:32.330 --> 01:36.210
And then we just paste the original image on the left hand side.
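
NOTE
The canvas step described above could be sketched roughly as follows, assuming Pillow is installed.
make_side_by_side_canvas is a hypothetical name; the notebook's own code may differ.
```python
from PIL import Image
def make_side_by_side_canvas(original):
    """Return a white canvas twice as wide as `original`, with `original` pasted on the left."""
    width, height = original.size
    # Double the width, keep the original height, fill with white.
    canvas = Image.new("RGB", (width * 2, height), "white")
    # Paste the source image at the top-left corner so it fills the left half.
    canvas.paste(original.convert("RGB"), (0, 0))
    return canvas
```
The right half stays blank at this point; it is the region the model will be asked to fill in.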

01:37.010 --> 01:43.490
And then we want to just save that and upload that to fal.

01:45.090 --> 01:45.930
So we have that.

01:46.290 --> 01:55.690
And then here what we're doing is we are taking the image size and basically creating a mask on

01:55.690 --> 01:56.610
the right hand side.

01:56.610 --> 01:59.170
So it's going to make the right hand side of the image black.

01:59.530 --> 02:02.810
And it's going to make the left hand side of the image white.

02:03.130 --> 02:12.450
And we basically just need to paste that original image region onto the mask canvas, and

02:12.450 --> 02:14.370
then we can upload that as the mask.

02:15.130 --> 02:16.050
Get the URL.
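
NOTE
A hedged sketch of building a two-tone mask the same size as the doubled canvas, assuming Pillow.
Which colour means "regenerate" depends on the model, so both colours are parameters here.
The defaults (white = regenerate, black = keep) are an assumption; pass them the other way round if your endpoint expects the opposite.
make_right_half_mask is a hypothetical helper name.
```python
from PIL import Image
def make_right_half_mask(width, height, regenerate="white", keep="black"):
    """Return a mask twice as wide as the original: left half `keep`, right half `regenerate`."""
    # Start with the whole canvas marked as the region to regenerate.
    mask = Image.new("RGB", (width * 2, height), regenerate)
    # Cover the original image region (the left half) with the keep colour.
    mask.paste(Image.new("RGB", (width, height), keep), (0, 0))
    return mask
```
The mask and the doubled canvas have identical dimensions, so they line up pixel for pixel when uploaded separately.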

02:16.570 --> 02:18.610
And then we're going to see both these side by side.

02:18.610 --> 02:20.330
So this is all the hard work.

02:20.330 --> 02:24.530
Essentially it's just to create this new image with a section on the right.

02:24.530 --> 02:25.850
And you can't really see this very well.

02:25.850 --> 02:27.690
But this image goes all the way across here.

02:28.050 --> 02:35.250
So it pastes that image on here and then adds this section on the right hand side, which is a white section.

02:35.250 --> 02:37.210
And then it adds a black mask.

02:37.450 --> 02:39.650
So these two are going to be uploaded separately.

02:40.170 --> 02:43.690
And then from there on out it's basically just inpainting.

02:43.690 --> 02:52.210
And the reason why this works is that when you're inpainting, it takes into account the rest

02:52.210 --> 02:53.130
of the image.

02:53.410 --> 02:56.370
It'll stay consistent with that image.

02:56.370 --> 02:57.730
And that's why this works.

02:57.730 --> 03:01.650
Because if you just create something new from the prompt, it hasn't seen the other image.

03:01.850 --> 03:04.610
And even with image to image, it doesn't always work that well.

03:04.770 --> 03:08.610
Whereas when you're inpainting, it has better prompt adherence, is what I find.

03:08.650 --> 03:10.690
So let's see what that looks like.

03:10.890 --> 03:15.330
The prompt is, again, we're just prompting for the right hand side of the image.

03:15.850 --> 03:20.130
We're feeding it the left, and we're saying "on the right", like making it look like a caption.

03:20.170 --> 03:24.130
The photo of the same rhino wearing tennis clothes, carrying a tennis racket, playing tennis on a

03:24.130 --> 03:24.810
tennis court.

03:25.370 --> 03:28.090
And we're using this fill model, which is really good for outpainting.

03:28.130 --> 03:29.730
I think that's super important.

03:29.770 --> 03:31.930
We've got the new width and the new height here.

03:31.930 --> 03:35.370
We're passing that in as the image size, and here we go.
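
NOTE
A rough sketch of assembling the inpainting request from the pieces described above.
The key names (prompt, image_url, mask_url, image_size) are assumptions for illustration; check the fill model's documentation for the exact schema.
build_fill_arguments is a hypothetical helper and makes no network call.
```python
def build_fill_arguments(prompt, image_url, mask_url, width, height):
    """Assemble a request payload for an inpainting (fill) endpoint."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return {
        # Caption-style prompt describing only the right hand side.
        "prompt": prompt,
        # URLs returned by uploading the doubled canvas and the mask separately.
        "image_url": image_url,
        "mask_url": mask_url,
        # The doubled width and original height (assumed key name and shape).
        "image_size": {"width": width, "height": height},
    }
```
The resulting dictionary would be handed to whichever fal client call you use to run the fill model.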

03:35.610 --> 03:39.970
We've got the same rhino, but playing tennis, right?

03:40.010 --> 03:40.890
So really cool.

03:41.090 --> 03:42.850
And yeah, you could do whatever you want.

03:43.010 --> 03:49.290
The really useful thing I found is if you wanted to fine tune on the same character, you could go and

03:49.290 --> 03:54.290
generate a bunch of these, you know, in different scenarios, different situations, and then chop

03:54.290 --> 04:00.770
them all up programmatically, divide the image in half, and then you've got a load of different images

04:00.770 --> 04:02.570
of the same character in different spaces.

04:02.610 --> 04:05.090
And then that could be your fine tuning data.
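
NOTE
Splitting the generated side-by-side images could look something like this, assuming Pillow.
crop_right_half is a hypothetical helper; it assumes the new scene occupies exactly the right half.
```python
from PIL import Image
def crop_right_half(combined):
    """Return the right half of a side-by-side image (the newly generated scene)."""
    width, height = combined.size
    # The crop box is (left, upper, right, lower); start at the horizontal midpoint.
    return combined.crop((width // 2, 0, width, height))
```
Running this over a folder of outputs would leave one image per scenario, each showing the same character, which is the kind of set the fine-tuning idea needs.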

04:05.130 --> 04:07.210
So that's one of the ways I use this.

04:07.210 --> 04:09.370
But yeah, lots of different ways to use this.

04:09.370 --> 04:10.690
It's a good little hack.

04:10.890 --> 04:17.450
Again, I did it in Midjourney initially, but now you can make that programmatic

04:17.450 --> 04:17.650
in.
