
Applying the Stratification Method to Experiment Analysis

by Natalia Ogneva, 2024/04/19

Too Long; Didn't Read

Stratified sampling is a powerful technique for making experiments more efficient and metrics more sensitive in data analysis. By clustering your audience and splitting each cluster with specific weights, you can reduce variance and make results more reliable.



Every experiment involves a trade-off between fast results and metric sensitivity. If the chosen metric has a wide variance, we have to wait a long time before we can trust the experiment's results. Let's look at one method that helps analysts speed up their experiments without giving up too much time or metric sensitivity.


Problem Formulation

Suppose we run a standard experiment to test a new ranking algorithm, with session length as the primary metric. Suppose also that our audience can be roughly split into three groups: 1 million teenagers, 2 million users aged 18-45, and 3 million users aged 45 and above. The response to the new ranking algorithm will differ significantly between these groups, and this wide variation reduces the sensitivity of the metric.


In other words, the population can be split into three strata, described as follows:
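With the group sizes given above, the strata and their weights can be written as follows. The symbols $N_k$, $N$ and $w_k$ are notation assumed for this sketch rather than taken from the original.

```latex
N = N_1 + N_2 + N_3 = 6 \cdot 10^6, \qquad
w_k = \frac{N_k}{N}: \quad
w_1 = \tfrac{1}{6}, \; w_2 = \tfrac{2}{6}, \; w_3 = \tfrac{3}{6}, \qquad
\sum_{k=1}^{3} w_k = 1 .
```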


Suppose the metric within each stratum is normally distributed. The metric for the whole population is then a mixture of these normal distributions.
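As a sketch, assume stratum $k$ makes up a share $w_k$ of the population and has mean $\mu_k$ and standard deviation $\sigma_k$ (assumed notation). The population mean and variance then follow from the law of total variance:

```latex
Y \mid \text{stratum } k \sim \mathcal{N}(\mu_k, \sigma_k^2), \qquad
\mu = \sum_k w_k \mu_k, \qquad
\sigma^2 = \sum_k w_k \sigma_k^2 + \sum_k w_k (\mu_k - \mu)^2 .
```

The first term of $\sigma^2$ is the variance within strata, the second the variance between strata.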

Stratification Method

In a classical experiment design, we split all users from the population at random, without taking the differences between users into account. The resulting sample has the following expected value and variance.
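For a simple random sample of size $n$ drawn with replacement, the standard result for the sample mean can be sketched as below. Here $w_k$, $\mu_k$ and $\sigma_k$ denote the weight, mean and standard deviation of stratum $k$, and $\mu$, $\sigma^2$ the population mean and variance (assumed notation).

```latex
\mathbb{E}[\bar{Y}_{\text{rand}}] = \mu, \qquad
\operatorname{Var}(\bar{Y}_{\text{rand}}) = \frac{\sigma^2}{n}
= \frac{1}{n} \sum_k w_k \sigma_k^2 + \frac{1}{n} \sum_k w_k (\mu_k - \mu)^2 .
```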


Another option is to split users at random inside each stratum, in proportion to the weight of that stratum in the general population.

In this case, the expected value and variance are as follows.
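A sketch of the standard result for proportional allocation, assuming stratum $k$ has weight $w_k$, mean $\mu_k$ and standard deviation $\sigma_k$, receives $n_k$ of the $n$ sampled users, and has sample mean $\bar{Y}_k$ (assumed notation):

```latex
n_k = n \, w_k, \qquad
\bar{Y}_{\text{strat}} = \sum_k w_k \bar{Y}_k, \qquad
\mathbb{E}[\bar{Y}_{\text{strat}}] = \sum_k w_k \mu_k = \mu, \qquad
\operatorname{Var}(\bar{Y}_{\text{strat}}) = \sum_k w_k^2 \frac{\sigma_k^2}{n_k}
= \frac{1}{n} \sum_k w_k \sigma_k^2 .
```

Compared with simple random sampling, the between-strata term $\frac{1}{n} \sum_k w_k (\mu_k - \mu)^2$ no longer contributes.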


The expected value is the same as with the first sampling scheme. The variance, however, is smaller, which means higher metric sensitivity.

Now let's consider Neyman's method. It also splits users at random inside each stratum, but with specific weights that account for how much the metric varies in each stratum.

In this case, the expected value and variance are as follows.
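A sketch of the standard Neyman (optimal) allocation, ignoring the rounding of $n_k$ to integers and using the assumed notation $w_k$, $\mu_k$, $\sigma_k$ for the weight, mean and standard deviation of stratum $k$:

```latex
n_k = n \, \frac{w_k \sigma_k}{\sum_j w_j \sigma_j}, \qquad
\mathbb{E}[\bar{Y}_{\text{opt}}] = \sum_k w_k \mu_k = \mu, \qquad
\operatorname{Var}(\bar{Y}_{\text{opt}}) = \sum_k w_k^2 \frac{\sigma_k^2}{n_k}
= \frac{1}{n} \Bigl( \sum_k w_k \sigma_k \Bigr)^2 .
```

Because $\bigl( \sum_k w_k \sigma_k \bigr)^2 \le \sum_k w_k \sigma_k^2$, this variance never exceeds the proportional one, and the two coincide when all $\sigma_k$ are equal.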

The expected value is asymptotically equal to the expected value in the first case. The variance, however, is much smaller.
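To make the comparison concrete, the minimal sketch below evaluates the textbook variance of the mean estimate under the three schemes. The function name `theoretical_variances` is hypothetical, the weights come from the 1M / 2M / 3M audience split, and the means and standard deviations are made up for illustration.

```python
def theoretical_variances(ws, mus, sigmas, n):
    """Variance of the mean estimate under the three sampling schemes.

    ws, mus, sigmas: per-stratum weights, means and standard deviations.
    n: total sample size (rounding of per-stratum sizes is ignored).
    """
    mu = sum(w * m for w, m in zip(ws, mus))
    # Variance inside the strata and variance between stratum means
    within = sum(w * s ** 2 for w, s in zip(ws, sigmas))
    between = sum(w * (m - mu) ** 2 for w, m in zip(ws, mus))
    weighted_sigma = sum(w * s for w, s in zip(ws, sigmas))
    return {
        "random": (within + between) / n,
        "proportional": within / n,
        "neyman": weighted_sigma ** 2 / n,
    }


# Hypothetical strata parameters, chosen only for illustration
variances = theoretical_variances(
    ws=[1 / 6, 2 / 6, 3 / 6],
    mus=[10.0, 20.0, 30.0],
    sigmas=[2.0, 4.0, 8.0],
    n=600,
)
for name, value in variances.items():
    print(f"{name:>12}: {value:.4f}")
```

With these inputs the random scheme has the largest variance and the Neyman scheme the smallest, matching the ordering described above.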

Empirical Testing

We have shown that this method works in theory. Let's simulate samples and test the stratification method empirically.

Let's consider three cases:

  • all strata have equal means and equal variances,
  • all strata have different means and equal variances,
  • all strata have equal means and different variances.

We will apply all three methods to every case and plot a histogram and a boxplot to compare them.

Code Preparation

First, let's create a Python class that simulates our general population, which consists of three strata.

    import numpy as np
    import scipy.stats as st


    class GeneralPopulation:

        def __init__(self,
                     means: list[float],
                     stds: list[float],
                     sizes: list[int],
                     random_state: int = 15):
            """
            Initializes our General Population and saves the given distributions

            :param means: List of expectations for normal distributions
            :param stds: List of standard deviations for normal distributions
            :param sizes: How many objects will be in each stratum
            :param random_state: Seed fixing randomness, so that repeated runs
                with the same input parameters produce the same population
            """
            # Store the seed before sampling so that _sample can use it
            self.random_state = random_state
            self.strats = [st.norm(mean, std) for mean, std in zip(means, stds)]
            self._sample(sizes)

        def _sample(self, sizes):
            """Creates a general population sample as a mixture of strata

            :param sizes: List with sample sizes of the corresponding normal distributions
            """
            # One seeded generator shared by all strata keeps their draws independent
            rng = np.random.default_rng(self.random_state)
            self.strats_samples = [rv.rvs(size=size, random_state=rng)
                                   for rv, size in zip(self.strats, sizes)]
            self.general_samples = np.hstack(self.strats_samples)
            # total size of the general population
            self.N = self.general_samples.shape[0]

            # number of strata
            self.count_strats = len(sizes)

            # ratios for every stratum in GP
            self.ws = [size / self.N for size in sizes]

            # ME and Std for GP
            self.m = np.mean(self.general_samples)
            self.sigma = np.std(self.general_samples)

            # ME and std for all strata
            self.ms = [np.mean(strat_sample) for strat_sample in self.strats_samples]
            self.sigmas = [np.std(strat_sample) for strat_sample in self.strats_samples]


Next, let's add methods for the three sampling schemes described in the theoretical part.

        def random_subsampling(self, size):
            """Creates a random subset of the entire population

            :param size: subsample size
            """
            return np.random.choice(self.general_samples, size=size)

        def proportional_subsampling(self, size):
            """Creates a subsample in which each stratum is represented
            in proportion to its share of the population

            :param size: subsample size
            """
            self.strats_size_proport = [int(np.floor(size * w)) for w in self.ws]

            # One array of draws per stratum
            return [np.random.choice(strat_sample, size=strat_size)
                    for strat_sample, strat_size
                    in zip(self.strats_samples, self.strats_size_proport)]

        def optimal_subsampling(self, size):
            """Creates a subsample with the optimal (Neyman) number of elements per stratum

            :param size: subsample size
            """
            # Normalizing constant: sum of w_k * sigma_k over all strata
            sum_denom = sum(w * sigma for w, sigma in zip(self.ws, self.sigmas))

            self.strats_size_optimal = [int(np.floor((size * w * sigma) / sum_denom))
                                        for w, sigma in zip(self.ws, self.sigmas)]
            if 0 in self.strats_size_optimal:
                raise ValueError('Strats size is 0, please change variance of smallest strat!')

            # One array of draws per stratum
            return [np.random.choice(strat_sample, size=strat_size)
                    for strat_sample, strat_size
                    in zip(self.strats_samples, self.strats_size_optimal)]
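To see how the two stratified schemes distribute a sample budget, here is a minimal standalone sketch of the same allocation rules. The helper names `proportional_allocation` and `neyman_allocation` are hypothetical, and the standard deviations are made up for illustration.

```python
import math


def proportional_allocation(size, ws):
    """Per-stratum sample sizes proportional to the stratum weights."""
    return [math.floor(size * w) for w in ws]


def neyman_allocation(size, ws, sigmas):
    """Per-stratum sample sizes proportional to weight * standard deviation."""
    denom = sum(w * s for w, s in zip(ws, sigmas))
    return [math.floor(size * w * s / denom) for w, s in zip(ws, sigmas)]


# Weights from the 1M / 2M / 3M audience split; sigmas are hypothetical
ws = [1 / 6, 2 / 6, 3 / 6]
sigmas = [2.0, 4.0, 8.0]

proportional = proportional_allocation(600, ws)
neyman = neyman_allocation(600, ws, sigmas)
print("proportional:", proportional)
print("neyman:      ", neyman)
```

Because both rules round down, the allocated sizes can add up to slightly less than the requested total. The Neyman rule moves observations toward the noisiest stratum, which is where extra data reduces the variance of the estimate the most.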


For the empirical part, we also need a method that simulates the experiment process.

        def run_experiments(self, n_sub, subsampling_method, n_experiments=1000):
            """Conducts a series of experiments and saves the results

            :param n_sub: size of sample
            :param subsampling_method: method for creating a subsample
            :param n_experiments: number of experiment starts
            """
            methods = ('random_subsampling', 'proportional_subsampling', 'optimal_subsampling')
            if subsampling_method not in methods:
                raise ValueError(f'Unknown subsampling method: {subsampling_method}')

            means_s = []

            # Use a small subsample when the population itself is tiny
            if len(self.general_samples) < 100:
                n_sub = 20

            if subsampling_method == 'random_subsampling':
                for _ in range(n_experiments):
                    rc = self.random_subsampling(n_sub)
                    means_s.append(rc.sum() / len(rc))
            else:
                for _ in range(n_experiments):
                    if subsampling_method == 'proportional_subsampling':
                        rc = self.proportional_subsampling(n_sub)
                    else:
                        rc = self.optimal_subsampling(n_sub)

                    # Mean of each stratum subsample
                    strats_mean = [sum(strat) / len(strat) for strat in rc]

                    # Mean for a mixture, weighted by the population shares
                    means_s.append(sum(w_k * mean_k for w_k, mean_k in zip(self.ws, strats_mean)))

            return means_s


Simulation Results

For a general population in which all strata have the same means and variances, the three methods are expected to give more or less equal results.

The case with different means and equal variances is more interesting: stratification reduces the variance dramatically.

In the case with equal means and different variances, we see a variance reduction with Neyman's method.
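The pattern across the three cases can be checked with a small standard-library simulation. This is a sketch under stated assumptions rather than the setup behind the results above: the function `simulate` and all scenario parameters are hypothetical, and users are drawn directly from each stratum's normal distribution instead of from a finite population.

```python
import random
import statistics


def simulate(ws, mus, sigmas, n=300, n_experiments=300, seed=15):
    """Spread (standard deviation) of the mean estimate for each sampling scheme."""
    rng = random.Random(seed)
    strata = range(len(ws))
    denom = sum(w * s for w, s in zip(ws, sigmas))
    # Per-stratum sample sizes; max(1, ...) guards against empty strata
    alloc = {
        "proportional": [max(1, int(n * w)) for w in ws],
        "neyman": [max(1, int(n * w * s / denom)) for w, s in zip(ws, sigmas)],
    }
    estimates = {"random": [], "proportional": [], "neyman": []}
    for _ in range(n_experiments):
        # Simple random sampling: each user falls into a stratum by its weight
        picked = rng.choices(strata, weights=ws, k=n)
        estimates["random"].append(
            statistics.fmean(rng.gauss(mus[k], sigmas[k]) for k in picked)
        )
        # Stratified sampling: fixed size per stratum, weighted mean of stratum means
        for method, sizes in alloc.items():
            strat_means = [
                statistics.fmean(rng.gauss(mus[k], sigmas[k]) for _ in range(sizes[k]))
                for k in strata
            ]
            estimates[method].append(sum(w * m for w, m in zip(ws, strat_means)))
    return {method: statistics.stdev(values) for method, values in estimates.items()}


ws = [1 / 6, 2 / 6, 3 / 6]
# Hypothetical scenario parameters: (means, standard deviations)
scenarios = {
    "equal means, equal variances": ([15.0, 15.0, 15.0], [3.0, 3.0, 3.0]),
    "different means, equal variances": ([15.0, 25.0, 35.0], [3.0, 3.0, 3.0]),
    "equal means, different variances": ([15.0, 15.0, 15.0], [12.0, 3.0, 1.0]),
}
results = {name: simulate(ws, mus, sigmas) for name, (mus, sigmas) in scenarios.items()}
for name, spread in results.items():
    print(name, {method: round(value, 3) for method, value in spread.items()})
```

A smaller spread means a more sensitive metric. With different means, both stratified schemes should come out well below random sampling, and with different variances the Neyman allocation should come out below the proportional one.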

Conclusion

You can now apply the stratification method to reduce metric variance and speed up an experiment: cluster your audience, then split users at random inside each cluster with specific weights!