I built my screen-analysis tool with one goal: I wanted something practical that could read a screen recording, understand it as a workflow, and turn it into automation artifacts (n8n flows, step lists, structured summaries, the whole pipeline).

The expensive part isn't generating JSON or rendering a report. It's the multimodal understanding step: every frame I send to the model costs real money.

Screen recordings are a worst-case input distribution for fixed-rate sampling: long stretches of static UI, then short micro-bursts where the user clicks a couple of times, types a few characters, a dropdown opens, a modal flashes, or a tab swaps.

So I stopped treating frames as "data points" and started allocating them as a budget.

What went wrong first (the failure that forced the redesign)

My first cut was the usual approach: sample every Nth frame. That version failed in two ways, and both showed up in production.

A typical upload on my dashboard is a 6-12 minute recording. For much of that recording, the user is waiting for something to load, reading a page, or leaving the cursor idle. Uniform sampling dutifully spends frames on minutes of nothing. Analysis cost scales with video length, even though information content does not.

Failure #1: it wasted frames on dead air.

The most important incident (and the one that triggered the sampler rewrite) was a recording of an admin workflow where a critical permission dropdown opened and closed quickly. The uniform sampler captured frames before the dropdown and again after it, so the model never saw the permission choice. The output summary was confident and wrong: it reported different settings because it only knew the page state, not what the transient UI had done.
Failure #2: it missed the "blink-and-you-miss-it" UI moments.

Once both problems were visible (paying for static time and missing the real action), I wanted a sampler that spends frames where the video changes and refuses to spend much on static UI.

The core idea: spend frames where the information is

The naive approach is "sample every Nth frame." It feels fair. It's also wrong.

A screen recording is not a movie. It is heavily skewed: long stable states punctuated by short transitions. Uniform sampling produces two bad outcomes:

- You overfund stable UI.
- You underfund the brief changes where the workflow actually lives.

One analogy (used once and kept short): uniform sampling is like giving every department the same budget regardless of need. Easy accounting, poor allocation.

The fix: score the change, segment the timeline, and distribute the keyframe budget across the segments.

Runtime architecture: where the sampler sits

In production, the system is split into two main parts:

- A Python analysis service (deployed as a Cloud Run service) that takes the uploaded video, selects keyframes, runs the multimodal analysis, and produces a structured result payload.
- A Next.js app that receives results via webhook and stores them (powering the dashboard UI).

The sampler sits inside the analyzer, before the expensive model call.

The non-obvious point: the sampler isn't a "nice optimization." It's a control surface. It converts "how long is the video?" into "how much analysis do you want to pay for?"

The analyzer delivers results to the Next.js app through a signed webhook: HMAC-SHA256 over the exact JSON bytes sent, verified with timingSafeEqual on the receiving end.
The important detail here: we send the bytes (data=body_bytes), not a Python dict that the HTTP library re-serializes. That mismatch produces verification failures that look like ghosts.

But the webhook seam is a separate post. What matters here is how the analyzer produces the payload.

Adaptive keyframe sampling: Score → Segment → Allocate → Pick Indices

The sampler is a 4-stage pipeline:

1) Score the change frame by frame (or stride by stride).
2) Segment the timeline into "mostly stable" and "high-change" runs.
3) Allocate the keyframe budget across segments, with guardrails.
4) Pick concrete frame indices inside each segment.

What makes this structure work is that each stage stays simple. Scoring is cheap. Segmentation is linear. Allocation is predictable. Index picking is mechanical.

Stage 1: Scoring (cheap visual change)

I want scoring to stay inexpensive. If scoring costs too much, I've just moved the bill to another part of the pipeline.

The baseline signal that works well on screen recordings is frame difference energy:

- Convert to grayscale.
- Compute the absolute difference between adjacent frames.
- Take the mean of the diff image (optionally normalized).

This catches:

- Scrolling
- Typing (the blinking caret and text updates)
- Dropdowns and modals
- Page transitions
- Hover state changes

It isn't perfect, but it is fast and it correlates well with "something happened".

Stage 2: Building segments (turn a noisy score stream into runs)

Per-frame scores are spiky. I don't want the allocator to chase noise. So I use a simple state machine with hysteresis:

- Keep a rolling average of the score.
- Switch to "hot" when the rolling score rises above the high threshold.
- Switch back to "cold" when it falls below the low threshold.
- Enforce a minimum segment length so you don't produce lots of micro-segments.

This is not academic change-point detection. It's engineering: reasonable stability, little flicker, and predictable output.

Stage 3: Budget allocation with guardrails

Pure proportional allocation is not enough. It fails on rounding and it can starve short segments.

So my allocator has three rules:

- Every segment gets a floor (min_frames_per_segment).
- No segment exceeds a cap (max_frames_per_segment).
- The remaining budget is distributed in proportion to segment utility.

That keeps the allocation weighted toward change while preventing starvation.

Stage 4: Picking indices inside segments

Once a segment has been assigned K frames, I pick K indices spread across the segment:

- Always include the segment start (transitions matter).
- Always include the segment end (final states matter).
- Fill the middle with evenly spaced indices.

If I need more fidelity later, I can bias toward local maxima of the score, but even spacing is a strong baseline and keeps the code straightforward.
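The local-maxima option mentioned for Stage 4 is not part of the script in this post. As a rough sketch of what it could look like, the hypothetical helper below (`pick_peak_biased_indices` is an illustrative name, not something from the original code) keeps the segment start and end, then fills the remaining slots with the strongest local peaks of the change score. It assumes `scores` is the per-step score list and `[start, end)` is a segment inside it.

```python
from typing import List, Sequence


def pick_peak_biased_indices(
    scores: Sequence[float], start: int, end: int, k: int
) -> List[int]:
    """Pick up to k indices in [start, end), favoring local score maxima.

    Hypothetical variant of even spacing: the first and last index of the
    segment are always kept, the rest go to the highest-scoring peaks.
    """
    length = end - start
    if k <= 0 or length <= 0:
        return []
    if k == 1:
        return [start]
    if k >= length:
        # Not enough room to choose: take every index in the segment.
        return list(range(start, end))

    chosen = {start, end - 1}
    interior = range(start + 1, end - 1)

    # A local maximum is at least as high as its left neighbor and
    # strictly higher than its right neighbor.
    peaks = [
        i for i in interior
        if scores[i] >= scores[i - 1] and scores[i] > scores[i + 1]
    ]
    # Strongest peaks first; ties go to the earlier index for determinism.
    peaks.sort(key=lambda i: (-scores[i], i))
    for i in peaks:
        if len(chosen) >= k:
            break
        chosen.add(i)

    # Too few peaks: fall back to the highest-scoring remaining indices.
    if len(chosen) < k:
        rest = sorted(
            (i for i in interior if i not in chosen),
            key=lambda i: (-scores[i], i),
        )
        for i in rest:
            if len(chosen) >= k:
                break
            chosen.add(i)

    return sorted(chosen)


demo_scores = [0.0, 0.1, 0.9, 0.1, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0]
print(pick_peak_biased_indices(demo_scores, 0, 10, 4))
```

With the demo scores, the two peaks at steps 2 and 6 take the interior slots. One trade-off to keep in mind: peak-biased picking can cluster frames around one busy moment, which is part of why even spacing remains a reasonable default.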
Full working implementation (scoring + segmentation + allocation + extraction)

To make this post concrete, here is a Python script that takes an MP4 and writes the selected keyframes to an output directory.

Dependencies: OpenCV (imported as cv2) and NumPy.

Run:

python adaptive_keyframes.py --video input.mp4 --budget 60 --out ./keyframes

File: adaptive_keyframes.py

```python
import argparse
import os
from dataclasses import dataclass
from typing import List, Optional, Tuple

import cv2
import numpy as np


@dataclass
class Segment:
    start: int  # inclusive index (score-step domain)
    end: int  # exclusive index (score-step domain)
    score: float

    @property
    def length(self) -> int:
        return max(0, self.end - self.start)


def frame_diff_score(prev_bgr: np.ndarray, curr_bgr: np.ndarray) -> float:
    """Cheap per-frame change score in [0, 1] (roughly).

    Uses grayscale mean absolute difference normalized by 255.
    """
    prev_gray = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)
    curr_gray = cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(prev_gray, curr_gray)
    return float(diff.mean() / 255.0)


def compute_scores(
    cap: cv2.VideoCapture,
    stride: int = 1,
    max_frames: Optional[int] = None,
) -> Tuple[List[float], int]:
    """Return (scores, total_frames_read).

    scores[i] is the change score between frame i*stride and
    frame (i+1)*stride.
    """
    scores: List[float] = []
    ok, prev = cap.read()
    if not ok:
        return scores, 0

    frames_read = 1
    while True:
        # Skip stride-1 frames between comparisons.
        for _ in range(stride - 1):
            ok = cap.grab()
            if not ok:
                return scores, frames_read
            frames_read += 1
            if max_frames is not None and frames_read >= max_frames:
                return scores, frames_read

        ok, curr = cap.read()
        if not ok:
            return scores, frames_read
        frames_read += 1

        scores.append(frame_diff_score(prev, curr))
        prev = curr

        if max_frames is not None and frames_read >= max_frames:
            return scores, frames_read


def segment_scores(
    scores: List[float],
    window: int = 8,
    hot_thresh: float = 0.030,
    cold_thresh: float = 0.020,
    min_len: int = 12,
) -> List[Segment]:
    """Convert per-step scores into segments with a utility score.

    Uses a rolling mean with hysteresis to avoid segment flicker.
    """
    if not scores:
        return []

    # Rolling mean via cumulative sum.
    x = np.array(scores, dtype=np.float32)
    c = np.cumsum(np.insert(x, 0, 0.0))

    def roll_mean(i: int) -> float:
        j0 = max(0, i - window + 1)
        n = i - j0 + 1
        return float((c[i + 1] - c[j0]) / n)

    segments: List[Segment] = []
    state_hot = False
    seg_start = 0

    for i in range(len(scores)):
        rm = roll_mean(i)
        if state_hot and rm < cold_thresh:
            # Close the hot segment at i+1 and start a cold one.
            # Segments shorter than min_len are merged in the pass below.
            seg_end = i + 1
            segments.append(
                Segment(seg_start, seg_end, float(np.mean(scores[seg_start:seg_end])))
            )
            state_hot = False
            seg_start = seg_end
        elif not state_hot and rm > hot_thresh:
            # Close the cold segment at i+1 and start a hot one.
            seg_end = i + 1
            segments.append(
                Segment(seg_start, seg_end, float(np.mean(scores[seg_start:seg_end])))
            )
            state_hot = True
            seg_start = seg_end

    # Close the tail.
    tail_end = len(scores)
    if tail_end > seg_start:
        tail_score = float(np.mean(scores[seg_start:tail_end]))
        segments.append(Segment(seg_start, tail_end, tail_score))

    # Merge very short segments into their predecessor to keep output stable.
    merged: List[Segment] = []
    for seg in segments:
        if not merged:
            merged.append(seg)
            continue
        if seg.length < min_len:
            prev = merged[-1]
            combined_score = (prev.score * prev.length + seg.score * seg.length) / max(
                1, prev.length + seg.length
            )
            merged[-1] = Segment(prev.start, seg.end, combined_score)
        else:
            merged.append(seg)

    # One more pass: ensure non-empty and strictly increasing.
    cleaned: List[Segment] = []
    for seg in merged:
        if seg.length <= 0:
            continue
        if cleaned and seg.start < cleaned[-1].end:
            seg = Segment(cleaned[-1].end, seg.end, seg.score)
        if seg.length > 0:
            cleaned.append(seg)
    return cleaned


def allocate_frames(
    segments: List[Segment],
    budget: int,
    min_frames_per_segment: int = 1,
    max_frames_per_segment: int = 30,
) -> List[int]:
    """Allocate keyframes to segments using floor + proportional + cap."""
    if budget <= 0 or not segments:
        return []

    n = len(segments)
    min_total = min_frames_per_segment * n

    # If the budget cannot cover the floor, distribute 1-by-1.
    if min_total >= budget:
        alloc = [0] * n
        for i in range(budget):
            alloc[i % n] += 1
        return alloc

    utilities = np.array([max(0.0, s.score) for s in segments], dtype=np.float64)
    total_u = float(utilities.sum())

    alloc = [min_frames_per_segment] * n
    remaining = budget - min_total

    if total_u == 0.0:
        raw = np.full(n, remaining / n, dtype=np.float64)
    else:
        raw = utilities * (remaining / total_u)

    # Add integer parts, respecting caps.
    for i in range(n):
        alloc[i] = min(max_frames_per_segment, alloc[i] + int(raw[i]))
    allocated = sum(alloc)

    # Distribute leftover by fractional parts, respecting caps.
    # Stop early if every segment is already at its cap.
    if allocated < budget:
        frac = raw - np.floor(raw)
        order = np.argsort(-frac)  # descending fractional part
        idx = 0
        while allocated < budget and any(a < max_frames_per_segment for a in alloc):
            i = int(order[idx % n])
            if alloc[i] < max_frames_per_segment:
                alloc[i] += 1
                allocated += 1
            idx += 1

    # Defensive: if we somehow exceeded the budget, trim from lowest utility.
    if allocated > budget:
        order = np.argsort(utilities)  # ascending utility
        idx = 0
        safety = 0
        while allocated > budget and safety < 10_000:
            i = int(order[idx % n])
            if alloc[i] > min_frames_per_segment:
                alloc[i] -= 1
                allocated -= 1
            idx += 1
            safety += 1

    return alloc


def pick_indices_for_segment(seg: Segment, k: int) -> List[int]:
    """Pick up to k indices in [seg.start, seg.end) over the score-step domain.

    Note: scores are defined between frames; we later map these to actual
    frames. Fewer than k indices are returned if the segment is shorter than k.
    """
    if k <= 0 or seg.length <= 0:
        return []
    if k == 1:
        return [seg.start]

    # Evenly spaced across [start, end-1].
    xs = np.linspace(seg.start, seg.end - 1, num=k)
    idxs = sorted({int(round(x)) for x in xs})

    # Try to reach exactly k by filling gaps if rounding collapsed points.
    while len(idxs) < k:
        # Insert midpoints between existing points.
        candidates = []
        for a, b in zip(idxs, idxs[1:]):
            if b - a >= 2:
                candidates.append((a + b) // 2)
        if not candidates:
            # Fall back: walk forward.
            x = idxs[-1]
            if x + 1 < seg.end:
                idxs.append(x + 1)
            else:
                break
        else:
            for c in candidates:
                if c not in idxs and seg.start <= c < seg.end:
                    idxs.append(c)
                if len(idxs) >= k:
                    break
        idxs = sorted(idxs)

    # Trim if we overshot.
    return idxs[:k]


def select_keyframe_indices(
    segments: List[Segment], alloc: List[int], stride: int = 1
) -> List[int]:
    """Return concrete frame indices (0-based) to extract from the video."""
    chosen: List[int] = []
    for seg, k in zip(segments, alloc):
        step_idxs = pick_indices_for_segment(seg, k)
        # Map score-step domain to frame indices.
        # Score i is the diff between frame i*stride and (i+1)*stride;
        # picking frame i*stride is a reasonable representative.
        for si in step_idxs:
            chosen.append(si * stride)
    return sorted(set(chosen))


def extract_frames(video_path: str, frame_indices: List[int], out_dir: str) -> None:
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise RuntimeError(f"Failed to open video: {video_path}")

    frame_set = set(frame_indices)
    max_idx = max(frame_set) if frame_set else -1

    idx = 0
    saved = 0
    while idx <= max_idx:
        ok, frame = cap.read()
        if not ok:
            break
        if idx in frame_set:
            path = os.path.join(out_dir, f"frame_{idx:06d}.jpg")
            ok2 = cv2.imwrite(path, frame)
            if not ok2:
                raise RuntimeError(f"Failed to write: {path}")
            saved += 1
        idx += 1
    cap.release()

    if saved == 0 and frame_indices:
        raise RuntimeError("No frames were saved; check indices and video decoding")


def main() -> None:
    ap = argparse.ArgumentParser()
    ap.add_argument("--video", required=True, help="Path to input video")
    ap.add_argument("--out", required=True, help="Output directory for keyframes")
    ap.add_argument("--budget", type=int, default=60, help="Total keyframes to extract")
    ap.add_argument("--stride", type=int, default=2, help="Compare every Nth frame for scoring")
    ap.add_argument("--window", type=int, default=8, help="Rolling window for segmentation")
    ap.add_argument("--hot", type=float, default=0.030, help="Enter hot segment threshold")
    ap.add_argument("--cold", type=float, default=0.020, help="Exit hot segment threshold")
    args = ap.parse_args()

    cap = cv2.VideoCapture(args.video)
    if not cap.isOpened():
        raise RuntimeError(f"Failed to open video: {args.video}")
    scores, frames_read = compute_scores(cap, stride=args.stride)
    cap.release()

    segments = segment_scores(
        scores, window=args.window, hot_thresh=args.hot, cold_thresh=args.cold
    )
    alloc = allocate_frames(
        segments,
        budget=args.budget,
        min_frames_per_segment=1,
        max_frames_per_segment=max(2, args.budget),
    )
    keyframes = select_keyframe_indices(segments, alloc, stride=args.stride)

    # Keep within a hard limit (rounding/uniqueness can change count).
    if len(keyframes) > args.budget:
        keyframes = keyframes[: args.budget]

    extract_frames(args.video, keyframes, args.out)

    print(f"frames_read={frames_read}")
    print(f"scores={len(scores)} segments={len(segments)}")
    print(f"budget={args.budget} selected={len(keyframes)}")
    if segments:
        hot_share = sum(1 for s in segments if s.score > args.hot) / len(segments)
        print(f"segment_hot_share={hot_share:.2f}")


if __name__ == "__main__":
    main()
```

The script does the practical things:

- It produces a stable segment list.
- It guarantees you never exceed your frame budget.
- It writes deterministic frame files for downstream analysis.

In my production service, the selected frames go into a multimodal call (Gemini via google-generativeai), and the resulting analysis payload is delivered to the Next.js app through the signed webhook.

Practical tuning tips (the things that matter after the first demo)

Once you have the basic shape working, the knobs come down to a handful of real-world issues.

1) Pick a stride that matches your content

If you diff every frame of a 30 FPS input, you will pick up tiny cursor jitter everywhere. That isn't wrong, but it can drown out your "change" signal. A stride of 2-5 is a good starting point for screen recordings. You still catch UI transitions, but you ignore sub-frame noise.

2) Hysteresis prevents segment flicker

Using two thresholds (hot_thresh and cold_thresh) matters. With a single threshold, you will bounce between hot and cold constantly around the cutoff. Hysteresis gives you stable segments and makes allocations predictable.

3) Floors and caps are not optional

Without a floor, short segments round down to zero and you will miss an important micro-burst.
Without a cap, one long "busy" segment can consume your entire budget and you lose context for the rest of the workflow.

4) Always sign the exact bytes you send

If you sign json.dumps(payload) and then send json=payload, the HTTP library may serialize with different key order or spacing than what you signed. Sending the exact bytes (data=body_bytes) eliminates this class of bugs.

This matters here because the analysis result, which carries your carefully chosen keyframes, is the most expensive artifact in the pipeline. If it is rejected because of a signing mismatch, you have wasted the frame budget entirely.

Why this design scales in production

The sampler works because of two properties:

- Cost scales with the number of selected frames, not with video length. Once the budget is set, the expensive multimodal step has an upper bound.
- Runtime is predictable. Scoring is a linear pass; segmentation is a linear pass; allocation is linear with small constant factors.

The bigger point: the sampler lets my system behave the way it should. Instead of accepting that a 12-minute video is simply "more expensive," I choose a frame count. The analyzer spends that budget on the informative parts of the recording and refuses to pay for static UI.

That is the whole trick: deliberate spending beats uniform spending.
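As a footnote to tip 4, here is a minimal sketch of the "sign the exact bytes" rule using only the Python standard library. The function names, the secret and the payload are illustrative assumptions, not the production code; the real receiver in this post is the Next.js app using timingSafeEqual, and `hmac.compare_digest` stands in for it here as the Python constant-time comparison.

```python
import hashlib
import hmac
import json
from typing import Tuple


def sign_payload(payload: dict, secret: bytes) -> Tuple[bytes, str]:
    """Serialize once, sign those exact bytes, and return both.

    The caller should transmit body_bytes unchanged (for example via
    data=body_bytes) together with the signature.
    """
    body_bytes = json.dumps(payload, separators=(",", ":"), sort_keys=True).encode("utf-8")
    signature = hmac.new(secret, body_bytes, hashlib.sha256).hexdigest()
    return body_bytes, signature


def verify_signature(body_bytes: bytes, signature: str, secret: bytes) -> bool:
    """Recompute the HMAC over the received bytes and compare in constant time."""
    expected = hmac.new(secret, body_bytes, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)


secret = b"demo-secret"  # assumption: shared secret configured on both sides
payload = {"videoId": "abc", "keyframes": [0, 42, 310]}  # illustrative payload

body_bytes, signature = sign_payload(payload, secret)

# The exact bytes that were signed verify correctly.
print(verify_signature(body_bytes, signature, secret))

# A re-serialization of the same dict (default spacing, unsorted keys)
# produces different bytes, so the same signature no longer matches.
reserialized = json.dumps(payload).encode("utf-8")
print(verify_signature(reserialized, signature, secret))
```

The second check is the ghost failure described in the post: the data is logically identical, but the bytes differ, so the receiver rejects it.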