In this blog post, we will walk you through the process of creating a FLUX server using Python. This server will allow you to generate images based on text prompts via a simple API. Whether you're running this server for personal use or deploying it as part of a production application, this guide will help you get started.
FLUX (by Black Forest Labs) has taken the world of AI image generation by storm in the last few months. Not only does it beat Stable Diffusion (the prior open-source king) on many benchmarks, but it also surpasses proprietary models like Dall-E or Midjourney in some metrics.
But how would you go about using FLUX in one of your apps? One might think of using serverless hosts like Replicate and others, but these can get very expensive very quickly, and may not provide the flexibility you need. That's where creating your own custom FLUX server comes in handy.
Before diving into the code, let's make sure you have the necessary tools and libraries set up:
- `torch`: The deep learning framework we will use to run FLUX.
- `diffusers`: Provides access to the FLUX model.
- `transformers`: Required dependency of diffusers.
- `sentencepiece`: Required to run the FLUX tokenizer.
- `protobuf`: Required to run FLUX.
- `accelerate`: Helps load the FLUX model more efficiently in some cases.
- `fastapi`: Framework to create a web server that can accept image generation requests.
- `uvicorn`: Required to run the FastAPI server.
- `psutil`: Allows us to check how much RAM our machine has.

You can install all the libraries by running:

```bash
pip install torch diffusers transformers sentencepiece protobuf accelerate fastapi uvicorn psutil
```
If you're using a Mac with an M1 or M2 chip, you should set up PyTorch with Metal for optimal performance. Follow the official PyTorch with Metal guide before proceeding.
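Once that's set up, you can quickly verify that PyTorch sees the Metal backend (a minimal check, not part of the server script itself):

```python
# Quick sanity check that the Metal (MPS) backend is visible to PyTorch
import torch

print(torch.backends.mps.is_built())      # True if this PyTorch build includes MPS support
print(torch.backends.mps.is_available())  # True if MPS can actually be used on this machine
```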
Also, make sure you have at least 12 GB of VRAM if you plan on running FLUX on a GPU device, or at least 12 GB of RAM for running on CPU/MPS (which will be slower).
Let's start the script by picking the right device to run inference on, depending on the hardware you're using:
```python
device = 'cuda' # can also be 'cpu' or 'mps'

import os

# MPS support in PyTorch is not yet fully implemented
if device == 'mps':
    os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

if device == 'mps' and not torch.backends.mps.is_available():
    raise Exception("Device set to MPS, but MPS is not available")
elif device == 'cuda' and not torch.cuda.is_available():
    raise Exception("Device set to CUDA, but CUDA is not available")
```
You can specify `cpu`, `cuda` (for NVIDIA GPUs), or `mps` (for Apple's Metal Performance Shaders). The script then checks whether the selected device is actually available, and raises an exception if it isn't.
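If you'd rather not hard-code the device, a small variation (a sketch, not part of the original script) can auto-detect the best available backend; note that the MPS fallback variable still has to be set before `torch` is imported:

```python
import os

# Set the MPS fallback before importing torch, as in the script above
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

# Prefer CUDA, then MPS, then fall back to CPU
if torch.cuda.is_available():
    device = 'cuda'
elif torch.backends.mps.is_available():
    device = 'mps'
else:
    device = 'cpu'
```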
Next, we load the FLUX model. We will load the model in fp16 precision, which saves some memory without much loss in quality.
At this point, you may be asked to authenticate with HuggingFace, as the FLUX model is gated. To authenticate successfully, you need to create a HuggingFace account, go to the model page, accept its terms, then generate a HuggingFace token from your account settings and set it on your machine as the `HF_TOKEN` environment variable.
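If you'd rather not manage environment variables, the `huggingface_hub` library (pulled in by diffusers) can authenticate programmatically instead; the token string below is a placeholder:

```python
# Alternative to the HF_TOKEN environment variable: log in programmatically.
# "hf_..." is a placeholder; paste the token generated in your account settings.
from huggingface_hub import login

login(token="hf_...")
```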
```python
from diffusers import FlowMatchEulerDiscreteScheduler, FluxPipeline
import psutil

model_name = "black-forest-labs/FLUX.1-dev"

print(f"Loading {model_name} on {device}")

pipeline = FluxPipeline.from_pretrained(
    model_name,
    # Diffusion models are generally trained on fp32, but fp16
    # gets us 99% there in terms of quality, with just half the (V)RAM
    torch_dtype=torch.float16,
    # Ensure we don't load any dangerous binary code
    use_safetensors=True,
    # We are using Euler here, but you can also use other samplers
    scheduler=FlowMatchEulerDiscreteScheduler()
).to(device)
```
Here, we load the FLUX model using the diffusers library. The model we're using is `black-forest-labs/FLUX.1-dev`, loaded in fp16 precision.

There is also a timestep-distilled model named FLUX Schnell, which offers faster inference but produces less detailed images, as well as a FLUX Pro model, which is closed-source. We'll use the Euler scheduler here, but you may experiment with others; you can read more about schedulers in the diffusers documentation.
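If you do want to experiment, here is a sketch of swapping in a different flow-matching sampler on the already-loaded pipeline (`FlowMatchHeunDiscreteScheduler` is one alternative shipped in recent diffusers releases; check your installed version):

```python
# Sketch: replace the sampler on a loaded pipeline, reusing its configuration
from diffusers import FlowMatchHeunDiscreteScheduler

pipeline.scheduler = FlowMatchHeunDiscreteScheduler.from_config(pipeline.scheduler.config)
```

Since image generation can be resource-intensive, it's important to optimize memory usage, especially when running on CPU or a device with limited memory.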
```python
# Recommended if running on MPS or CPU with < 64 GB of RAM
total_memory = psutil.virtual_memory().total
total_memory_gb = total_memory / (1024 ** 3)
if (device == 'cpu' or device == 'mps') and total_memory_gb < 64:
    print("Enabling attention slicing")
    pipeline.enable_attention_slicing()
```
This code checks the total available memory and enables attention slicing if the system has less than 64 GB of RAM. Attention slicing reduces memory usage during image generation, which is essential for devices with limited resources.
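If attention slicing alone isn't enough, diffusers provides further knobs. One option worth knowing about is model CPU offloading, sketched below; note that it requires `accelerate` and is meant to replace the `.to(device)` call rather than complement it:

```python
# Sketch: more aggressive memory saving for CUDA machines with limited VRAM.
# Call this INSTEAD of `.to(device)`: each model component is moved to the
# GPU only while it is needed, trading some speed for a smaller footprint.
pipeline.enable_model_cpu_offload()
```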
Next, we set up the FastAPI server, which will provide the API for generating images:
```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field, conint, confloat
from fastapi.middleware.gzip import GZipMiddleware
from io import BytesIO
import base64

app = FastAPI()

# We will be returning the image as a base64 encoded string
# which we will want compressed
app.add_middleware(GZipMiddleware, minimum_size=1000, compresslevel=7)
```
FastAPI is a popular framework for building web APIs with Python. In this case, we're using it to create a server that can accept requests for image generation. We're also using GZip middleware to compress the responses, which is particularly useful when returning images in base64 format.
In a production environment, you might want to store the generated images in an S3 bucket or other cloud storage solution and return URLs instead of base64-encoded strings, to take advantage of a CDN and other optimizations.
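As a rough illustration of that approach (a sketch only: `my-flux-images` is a hypothetical bucket, and it assumes `boto3` plus AWS credentials configured in the usual way), the handler could upload each PNG and return its URL:

```python
# Sketch: upload a generated image to S3 and return a URL instead of base64
from io import BytesIO
import uuid
import boto3

s3 = boto3.client("s3")

def upload_image(buffered: BytesIO) -> str:
    # Generate a unique object key for each image
    key = f"generations/{uuid.uuid4()}.png"
    buffered.seek(0)
    s3.upload_fileobj(buffered, "my-flux-images", key,
                      ExtraArgs={"ContentType": "image/png"})
    return f"https://my-flux-images.s3.amazonaws.com/{key}"
```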
Now, we need to define a model for the requests our API will accept:
```python
class GenerateRequest(BaseModel):
    prompt: str
    seed: conint(ge=0) = Field(..., description="Seed for random number generation")
    height: conint(gt=0) = Field(..., description="Height of the generated image, must be a positive integer and a multiple of 8")
    width: conint(gt=0) = Field(..., description="Width of the generated image, must be a positive integer and a multiple of 8")
    cfg: confloat(gt=0) = Field(..., description="CFG (classifier-free guidance scale), must be a positive integer or 0")
    steps: conint(ge=0) = Field(..., description="Number of steps")
    batch_size: conint(gt=0) = Field(..., description="Number of images to generate in a batch")
```
This `GenerateRequest` model defines the parameters required to generate an image. The `prompt` field is the text description of the image you'd like to generate. Other fields include the image dimensions, the number of inference steps, and the batch size.
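Because these fields use pydantic's constrained types, invalid payloads are rejected before they ever reach our handler. A quick illustration (hypothetical values, just to show the behavior):

```python
# Valid: all constraints satisfied
GenerateRequest(prompt="A red fox in the snow", seed=42,
                height=512, width=512, cfg=3.5, steps=20, batch_size=1)

# Raises a pydantic ValidationError, since seed must be >= 0
GenerateRequest(prompt="A red fox in the snow", seed=-1,
                height=512, width=512, cfg=3.5, steps=20, batch_size=1)
```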
Now, let's create the endpoint that will handle image generation requests:
@app.post("/") async def generate_image(request: GenerateRequest): # Validate that height and width are multiples of 8 # as required by FLUX if request.height % 8 != 0 or request.width % 8 != 0: raise HTTPException(status_code=400, detail="Height and width must both be multiples of 8") # Always calculate the seed on CPU for deterministic RNG # For a batch of images, seeds will be sequential like n, n+1, n+2, ... generator = [torch.Generator(device="cpu").manual_seed(i) for i in range(request.seed, request.seed + request.batch_size)] images = pipeline( height=request.height, width=request.width, prompt=request.prompt, generator=generator, num_inference_steps=request.steps, guidance_scale=request.cfg, num_images_per_prompt=request.batch_size ).images # Convert images to base64 strings # (for a production app, you might want to store the # images in an S3 bucket and return the URLs instead) base64_images = [] for image in images: buffered = BytesIO() image.save(buffered, format="PNG") img_str = base64.b64encode(buffered.getvalue()).decode("utf-8") base64_images.append(img_str) return { "images": base64_images, }
This endpoint handles the image generation process. It first validates that the height and width are multiples of 8, as required by FLUX. It then generates images based on the provided prompt and returns them as base64-encoded strings.
Finally, let's add some code to start the server when the script is run:
```python
@app.on_event("startup")
async def startup_event():
    print("Image generation server running")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```
This code starts the FastAPI server on port 8000. Thanks to the `0.0.0.0` host, it is reachable not only at `http://localhost:8000` but also from other devices on the same network, using the host machine's IP address.
Now that your FLUX server is up and running, it's time to test it. You can use `curl`, a command-line tool for making HTTP requests, to interact with your server:
```bash
curl -X POST "http://localhost:8000/" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A futuristic cityscape at sunset",
    "seed": 42,
    "height": 1024,
    "width": 1024,
    "cfg": 3.5,
    "steps": 50,
    "batch_size": 1
  }' | jq -r '.images[0]' | base64 -d > test.png
```
This command will only work on UNIX-based systems with the `curl`, `jq`, and `base64` utilities installed. It may also take up to a few minutes to complete, depending on the hardware hosting the FLUX server.
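If you'd rather test from Python, a minimal client along these lines (assuming the `requests` package is installed) does the same thing as the curl pipeline:

```python
# Minimal Python client for the FLUX server, equivalent to the curl command
import base64
import requests

payload = {
    "prompt": "A futuristic cityscape at sunset",
    "seed": 42,
    "height": 1024,
    "width": 1024,
    "cfg": 3.5,
    "steps": 50,
    "batch_size": 1,
}

response = requests.post("http://localhost:8000/", json=payload, timeout=600)
response.raise_for_status()

# Decode the first base64-encoded image and write it to disk
with open("test.png", "wb") as f:
    f.write(base64.b64decode(response.json()["images"][0]))
```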
Congratulations! You've successfully created a FLUX server using Python. This setup allows you to generate images based on text prompts via a simple API. If you're not satisfied with the results of the base FLUX model, you might consider fine-tuning it for even better performance on specific use cases.
You can find the full code used in this guide below:
```python
device = 'cuda' # can also be 'cpu' or 'mps'

import os

# MPS support in PyTorch is not yet fully implemented
if device == 'mps':
    os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

if device == 'mps' and not torch.backends.mps.is_available():
    raise Exception("Device set to MPS, but MPS is not available")
elif device == 'cuda' and not torch.cuda.is_available():
    raise Exception("Device set to CUDA, but CUDA is not available")

from diffusers import FlowMatchEulerDiscreteScheduler, FluxPipeline
import psutil

model_name = "black-forest-labs/FLUX.1-dev"

print(f"Loading {model_name} on {device}")

pipeline = FluxPipeline.from_pretrained(
    model_name,
    # Diffusion models are generally trained on fp32, but fp16
    # gets us 99% there in terms of quality, with just half the (V)RAM
    torch_dtype=torch.float16,
    # Ensure we don't load any dangerous binary code
    use_safetensors=True,
    # We are using Euler here, but you can also use other samplers
    scheduler=FlowMatchEulerDiscreteScheduler()
).to(device)

# Recommended if running on MPS or CPU with < 64 GB of RAM
total_memory = psutil.virtual_memory().total
total_memory_gb = total_memory / (1024 ** 3)
if (device == 'cpu' or device == 'mps') and total_memory_gb < 64:
    print("Enabling attention slicing")
    pipeline.enable_attention_slicing()

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field, conint, confloat
from fastapi.middleware.gzip import GZipMiddleware
from io import BytesIO
import base64

app = FastAPI()

# We will be returning the image as a base64 encoded string
# which we will want compressed
app.add_middleware(GZipMiddleware, minimum_size=1000, compresslevel=7)

class GenerateRequest(BaseModel):
    prompt: str
    seed: conint(ge=0) = Field(..., description="Seed for random number generation")
    height: conint(gt=0) = Field(..., description="Height of the generated image, must be a positive integer and a multiple of 8")
    width: conint(gt=0) = Field(..., description="Width of the generated image, must be a positive integer and a multiple of 8")
    cfg: confloat(gt=0) = Field(..., description="CFG (classifier-free guidance scale), must be a positive integer or 0")
    steps: conint(ge=0) = Field(..., description="Number of steps")
    batch_size: conint(gt=0) = Field(..., description="Number of images to generate in a batch")

@app.post("/")
async def generate_image(request: GenerateRequest):
    # Validate that height and width are multiples of 8
    # as required by FLUX
    if request.height % 8 != 0 or request.width % 8 != 0:
        raise HTTPException(status_code=400, detail="Height and width must both be multiples of 8")

    # Always calculate the seed on CPU for deterministic RNG
    # For a batch of images, seeds will be sequential like n, n+1, n+2, ...
    generator = [torch.Generator(device="cpu").manual_seed(i) for i in range(request.seed, request.seed + request.batch_size)]

    images = pipeline(
        height=request.height,
        width=request.width,
        prompt=request.prompt,
        generator=generator,
        num_inference_steps=request.steps,
        guidance_scale=request.cfg,
        num_images_per_prompt=request.batch_size
    ).images

    # Convert images to base64 strings
    # (for a production app, you might want to store the
    # images in an S3 bucket and return the URLs instead)
    base64_images = []
    for image in images:
        buffered = BytesIO()
        image.save(buffered, format="PNG")
        img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
        base64_images.append(img_str)

    return {
        "images": base64_images,
    }

@app.on_event("startup")
async def startup_event():
    print("Image generation server running")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```