In this article, we'll walk you through setting up your own FLUX server using Python. This server will let you generate images from text prompts via a simple API. Whether you're running the server for personal use or deploying it as part of a production application, this guide will help you get started.
FLUX (by Black Forest Labs) has taken the AI image generation world by storm over the past few months. Not only has it beaten Stable Diffusion (the previous open-source king) on many benchmarks, it has also surpassed proprietary models like DALL-E and Midjourney on some metrics.
But how can you use FLUX in one of your own applications? You might consider hosted services like Replicate and others, but these can get very expensive very quickly, and they may not offer the flexibility you need. That's where building your own custom FLUX server comes in handy.
Before diving into the code, let's make sure you have the necessary tools and libraries installed:
torch: The deep learning framework we'll use to run FLUX.
diffusers: Gives us access to the FLUX model.
transformers: A required dependency of diffusers.
sentencepiece: Required to run the FLUX tokenizer.
protobuf: Required to run FLUX.
accelerate: Helps load the FLUX model more efficiently in some cases.
fastapi: The framework for building a web server that can accept image generation requests.
uvicorn: Required to run the FastAPI server.
psutil: Lets us check how much RAM our machine has.

You can install all of these libraries by running the following command:

pip install torch diffusers transformers sentencepiece protobuf accelerate fastapi uvicorn
If you're on a Mac with an M1 or M2 chip, you should set up PyTorch with Metal for optimal performance. Follow the official PyTorch with Metal guide before proceeding.
You'll also need to make sure you have at least 12 GB of VRAM if you plan to run FLUX on a GPU device, or at least 12 GB of RAM to run it on CPU/MPS (which will be slower).
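If you want to catch an underpowered GPU before the model loads, here is a small optional pre-flight check. This snippet is an illustrative addition, not part of the original script, and uses standard PyTorch CUDA APIs:

import torch

# Optional pre-flight check: warn early if the GPU has less than
# the ~12 GB of VRAM that FLUX.1-dev generally needs
if torch.cuda.is_available():
    vram_gb = torch.cuda.get_device_properties(0).total_memory / (1024 ** 3)
    print(f"Detected {vram_gb:.1f} GB of VRAM")
    if vram_gb < 12:
        print("Warning: FLUX will likely run out of memory on this GPU")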
Let's start the script by picking the right device to run inference on, based on the hardware we're using.
device = 'cuda' # can also be 'cpu' or 'mps'

import os

# MPS support in PyTorch is not yet fully implemented
if device == 'mps':
    os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

if device == 'mps' and not torch.backends.mps.is_available():
    raise Exception("Device set to MPS, but MPS is not available")
elif device == 'cuda' and not torch.cuda.is_available():
    raise Exception("Device set to CUDA, but CUDA is not available")
You can specify cpu, cuda (for NVIDIA GPUs), or mps (for Apple's Metal Performance Shaders). The script then checks whether the chosen device is available and raises an exception if it isn't.
Next, we load the FLUX model. We'll load it in fp16 precision, which saves memory without much loss in quality.
At this point, you may be asked to authenticate with HuggingFace, since the FLUX model is gated. To authenticate successfully, you'll need to create a HuggingFace account, go to the model page, accept the terms and conditions, then create a HuggingFace token from your account settings and add it to your machine as the HF_TOKEN environment variable.
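For example, on UNIX-like systems you could export the token in your shell before starting the server (the token value below is a placeholder), or use the huggingface-cli login command instead:

export HF_TOKEN=<your-huggingface-token>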
from diffusers import FlowMatchEulerDiscreteScheduler, FluxPipeline
import psutil

model_name = "black-forest-labs/FLUX.1-dev"
print(f"Loading {model_name} on {device}")

pipeline = FluxPipeline.from_pretrained(
    model_name,
    # Diffusion models are generally trained on fp32, but fp16
    # gets us 99% there in terms of quality, with just half the (V)RAM
    torch_dtype=torch.float16,
    # Ensure we don't load any dangerous binary code
    use_safetensors=True,
    # We are using Euler here, but you can also use other samplers
    scheduler=FlowMatchEulerDiscreteScheduler()
).to(device)
Here, we load the FLUX model using the diffusers library. The model we're using is black-forest-labs/FLUX.1-dev, loaded in fp16 precision.
There is also a speed-distilled model called FLUX Schnell, which offers faster inference but produces less detailed images, as well as the FLUX Pro model, which is closed-source. We'll use the Euler scheduler here, but feel free to experiment with others. You can read more about schedulers here.
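As a hedged sketch of what swapping schedulers might look like after the pipeline is loaded: this assumes your diffusers version ships FlowMatchHeunDiscreteScheduler (FLUX is a flow-matching model, so only flow-matching schedulers are compatible):

from diffusers import FlowMatchHeunDiscreteScheduler

# Swap the default Euler scheduler for a Heun variant, reusing the
# pipeline's existing scheduler config (illustrative, not from the guide)
pipeline.scheduler = FlowMatchHeunDiscreteScheduler.from_config(
    pipeline.scheduler.config
)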
Since image generation can be resource-intensive, it's important to optimize memory usage, especially when running on a CPU or a device with limited memory.

# Recommended if running on MPS or CPU with < 64 GB of RAM
total_memory = psutil.virtual_memory().total
total_memory_gb = total_memory / (1024 ** 3)
if (device == 'cpu' or device == 'mps') and total_memory_gb < 64:
    print("Enabling attention slicing")
    pipeline.enable_attention_slicing()
This code checks the total available memory and enables attention slicing if the system has less than 64 GB of RAM. Attention slicing reduces memory usage during image generation, which is essential for devices with limited resources.
Next, we'll set up the FastAPI server, which will expose an API for generating images.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field, conint, confloat
from fastapi.middleware.gzip import GZipMiddleware
from io import BytesIO
import base64

app = FastAPI()

# We will be returning the image as a base64 encoded string
# which we will want compressed
app.add_middleware(GZipMiddleware, minimum_size=1000, compresslevel=7)
FastAPI is a popular framework for building web APIs with Python. In this case, we're using it to build a server that can accept image generation requests. We also add the GZip middleware to compress responses, which is particularly useful when returning images in base64 format.
In a production environment, you might want to store the generated images in an S3 bucket or other cloud storage and return URLs instead of base64-encoded strings, to benefit from a CDN and other optimizations.
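As a rough sketch of that approach, assuming you use boto3 with AWS credentials already configured; the bucket name and key scheme here are placeholders, not part of the original guide:

import uuid
import boto3

s3 = boto3.client("s3")

def upload_image(png_bytes: bytes) -> str:
    # Store the PNG under a random key and return its URL
    # instead of a base64 payload
    key = f"generated/{uuid.uuid4()}.png"
    s3.put_object(
        Bucket="my-image-bucket",
        Key=key,
        Body=png_bytes,
        ContentType="image/png",
    )
    return f"https://my-image-bucket.s3.amazonaws.com/{key}"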
Now we need to define a model for the requests our API will accept.
class GenerateRequest(BaseModel):
    prompt: str
    seed: conint(ge=0) = Field(..., description="Seed for random number generation")
    height: conint(gt=0) = Field(..., description="Height of the generated image, must be a positive integer and a multiple of 8")
    width: conint(gt=0) = Field(..., description="Width of the generated image, must be a positive integer and a multiple of 8")
    cfg: confloat(gt=0) = Field(..., description="CFG (classifier-free guidance scale), must be a positive integer or 0")
    steps: conint(ge=0) = Field(..., description="Number of steps")
    batch_size: conint(gt=0) = Field(..., description="Number of images to generate in a batch")
This GenerateRequest model defines the parameters required to generate an image. The prompt field is the text description of the image you want to create. The remaining fields cover the image dimensions, the random seed, the CFG scale, the number of inference steps, and the batch size.
Now, let's create the endpoint that will handle image generation requests.
@app.post("/")
async def generate_image(request: GenerateRequest):
    # Validate that height and width are multiples of 8
    # as required by FLUX
    if request.height % 8 != 0 or request.width % 8 != 0:
        raise HTTPException(status_code=400, detail="Height and width must both be multiples of 8")

    # Always calculate the seed on CPU for deterministic RNG
    # For a batch of images, seeds will be sequential like n, n+1, n+2, ...
    generator = [torch.Generator(device="cpu").manual_seed(i) for i in range(request.seed, request.seed + request.batch_size)]

    images = pipeline(
        height=request.height,
        width=request.width,
        prompt=request.prompt,
        generator=generator,
        num_inference_steps=request.steps,
        guidance_scale=request.cfg,
        num_images_per_prompt=request.batch_size
    ).images

    # Convert images to base64 strings
    # (for a production app, you might want to store the
    # images in an S3 bucket and return the URLs instead)
    base64_images = []
    for image in images:
        buffered = BytesIO()
        image.save(buffered, format="PNG")
        img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
        base64_images.append(img_str)

    return {
        "images": base64_images,
    }
This endpoint handles the image generation process. It first validates that the height and width are multiples of 8, as required by FLUX. It then generates images based on the provided prompt and returns them as base64-encoded strings.
Finally, let's add some code to start the server when the script is run.
@app.on_event("startup")
async def startup_event():
    print("Image generation server running")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
This code starts the FastAPI server on port 8000. Thanks to the 0.0.0.0 binding, it is reachable not only at http://localhost:8000 but also from other devices on the same network via the host machine's IP address.
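Alternatively, you can skip the __main__ block and launch the app with the uvicorn CLI; this assumes the script is saved as main.py (a hypothetical file name):

uvicorn main:app --host 0.0.0.0 --port 8000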
Now that your FLUX server is up and running, it's time to test it. You can use curl, a command-line tool for making HTTP requests, to interact with your server:
curl -X POST "http://localhost:8000/" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A futuristic cityscape at sunset",
    "seed": 42,
    "height": 1024,
    "width": 1024,
    "cfg": 3.5,
    "steps": 50,
    "batch_size": 1
  }' | jq -r '.images[0]' | base64 -d > test.png
This command will only work on UNIX-based systems with the curl, jq, and base64 utilities installed. It may also take up to a few minutes to complete, depending on the hardware hosting the FLUX server.
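If those utilities aren't available (or you'd rather test from Python), here is an equivalent sketch using the requests library; note that requests is an extra dependency not listed in the setup above:

import base64
import requests

# Same request body as the curl example above
response = requests.post(
    "http://localhost:8000/",
    json={
        "prompt": "A futuristic cityscape at sunset",
        "seed": 42,
        "height": 1024,
        "width": 1024,
        "cfg": 3.5,
        "steps": 50,
        "batch_size": 1,
    },
)
response.raise_for_status()

# The server returns base64-encoded PNGs; decode the first one to a file
with open("test.png", "wb") as f:
    f.write(base64.b64decode(response.json()["images"][0]))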
Congratulations! You've successfully built your own FLUX server using Python. This setup lets you generate images from text prompts through a simple API. If you're not satisfied with the results of the base FLUX model, you might consider fine-tuning it for even better performance on specific use cases.
You can find the full code used in this guide below:
device = 'cuda' # can also be 'cpu' or 'mps'

import os

# MPS support in PyTorch is not yet fully implemented
if device == 'mps':
    os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

if device == 'mps' and not torch.backends.mps.is_available():
    raise Exception("Device set to MPS, but MPS is not available")
elif device == 'cuda' and not torch.cuda.is_available():
    raise Exception("Device set to CUDA, but CUDA is not available")

from diffusers import FlowMatchEulerDiscreteScheduler, FluxPipeline
import psutil

model_name = "black-forest-labs/FLUX.1-dev"
print(f"Loading {model_name} on {device}")

pipeline = FluxPipeline.from_pretrained(
    model_name,
    # Diffusion models are generally trained on fp32, but fp16
    # gets us 99% there in terms of quality, with just half the (V)RAM
    torch_dtype=torch.float16,
    # Ensure we don't load any dangerous binary code
    use_safetensors=True,
    # We are using Euler here, but you can also use other samplers
    scheduler=FlowMatchEulerDiscreteScheduler()
).to(device)

# Recommended if running on MPS or CPU with < 64 GB of RAM
total_memory = psutil.virtual_memory().total
total_memory_gb = total_memory / (1024 ** 3)
if (device == 'cpu' or device == 'mps') and total_memory_gb < 64:
    print("Enabling attention slicing")
    pipeline.enable_attention_slicing()

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field, conint, confloat
from fastapi.middleware.gzip import GZipMiddleware
from io import BytesIO
import base64

app = FastAPI()

# We will be returning the image as a base64 encoded string
# which we will want compressed
app.add_middleware(GZipMiddleware, minimum_size=1000, compresslevel=7)

class GenerateRequest(BaseModel):
    prompt: str
    seed: conint(ge=0) = Field(..., description="Seed for random number generation")
    height: conint(gt=0) = Field(..., description="Height of the generated image, must be a positive integer and a multiple of 8")
    width: conint(gt=0) = Field(..., description="Width of the generated image, must be a positive integer and a multiple of 8")
    cfg: confloat(gt=0) = Field(..., description="CFG (classifier-free guidance scale), must be a positive integer or 0")
    steps: conint(ge=0) = Field(..., description="Number of steps")
    batch_size: conint(gt=0) = Field(..., description="Number of images to generate in a batch")

@app.post("/")
async def generate_image(request: GenerateRequest):
    # Validate that height and width are multiples of 8
    # as required by FLUX
    if request.height % 8 != 0 or request.width % 8 != 0:
        raise HTTPException(status_code=400, detail="Height and width must both be multiples of 8")

    # Always calculate the seed on CPU for deterministic RNG
    # For a batch of images, seeds will be sequential like n, n+1, n+2, ...
    generator = [torch.Generator(device="cpu").manual_seed(i) for i in range(request.seed, request.seed + request.batch_size)]

    images = pipeline(
        height=request.height,
        width=request.width,
        prompt=request.prompt,
        generator=generator,
        num_inference_steps=request.steps,
        guidance_scale=request.cfg,
        num_images_per_prompt=request.batch_size
    ).images

    # Convert images to base64 strings
    # (for a production app, you might want to store the
    # images in an S3 bucket and return the URLs instead)
    base64_images = []
    for image in images:
        buffered = BytesIO()
        image.save(buffered, format="PNG")
        img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
        base64_images.append(img_str)

    return {
        "images": base64_images,
    }

@app.on_event("startup")
async def startup_event():
    print("Image generation server running")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)