Using the Stratification Method for Experiment Analysis
by @nataliaogneva
33,138 reads


by Natalia Ogneva, 8 min read, 2024/04/19

Too Long; Didn't Read

Stratified sampling is a powerful technique for boosting experiment efficiency and metric sensitivity in data analysis. By clustering your audience and splitting each cluster with specific weights, you can optimize experiments, reduce variance, and improve the reliability of results.



Any experiment involves a trade-off between fast results and metric sensitivity. If the chosen metric has a high variance, we must wait a long time before the experiment's results can be trusted. Let's consider one method that helps analysts boost their experiments without losing too much time or metric sensitivity.


Problem Formulation

Suppose we run a standard experiment to test a new ranking algorithm, with session length as the primary metric. Suppose also that our audience can be roughly split into three groups: 1 million teenagers, 2 million users aged 18-45, and 3 million users aged 45 and above. The response to a new ranking algorithm can differ significantly between these groups. This wide variation reduces the sensitivity of the metric.


In other words, the population can be divided into three strata, each with its own share of the audience and its own distribution of the metric.
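
One standard way to write this down is as a mixture. The notation here ($K$, $N_k$, $w_k$, $\mu_k$, $\sigma_k^2$) is an assumption of this sketch rather than the article's own; the weights follow from the group sizes given above.

```latex
\begin{align*}
f_Y(y) &= \sum_{k=1}^{K} w_k\, f_k(y), \qquad w_k = \frac{N_k}{N}, \qquad \sum_{k=1}^{K} w_k = 1, \\
\mu &= \mathbb{E}[Y] = \sum_{k=1}^{K} w_k \mu_k, \\
\sigma^2 &= \operatorname{Var}(Y) = \sum_{k=1}^{K} w_k \sigma_k^2 + \sum_{k=1}^{K} w_k (\mu_k - \mu)^2, \\
(w_1, w_2, w_3) &= \left(\tfrac{1}{6}, \tfrac{1}{3}, \tfrac{1}{2}\right) \quad \text{for groups of 1, 2 and 3 million users.}
\end{align*}
```

Here $f_k$ is the density of the metric in stratum $k$, with mean $\mu_k$ and variance $\sigma_k^2$. The second term of $\sigma^2$ is the between-strata spread, which is the part stratification can remove.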


Let's say that every component has a normal distribution. The main metric for the whole population is then a mixture of these normal components, and it is itself normal only when the strata share the same mean and variance.

Stratification Method

In a classical experiment design, we split all users from the population at random, without taking the differences between users into account. The estimate is then the plain sample mean, and its expected value and variance are those of simple random sampling.
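
For reference, the textbook expressions for this design can be sketched as follows. The symbols are assumptions of this sketch: $n$ is the sample size, $w_k$ the population share of stratum $k$, and $\mu_k$, $\sigma_k^2$ its mean and variance; the sample is assumed small relative to the population, so no finite-population correction is applied.

```latex
\begin{align*}
\bar{Y}_{\text{rand}} &= \frac{1}{n} \sum_{i=1}^{n} Y_i, \\
\mathbb{E}\big[\bar{Y}_{\text{rand}}\big] &= \mu = \sum_{k} w_k \mu_k, \\
\operatorname{Var}\big(\bar{Y}_{\text{rand}}\big) &= \frac{\sigma^2}{n}
  = \frac{1}{n} \sum_{k} w_k \sigma_k^2 + \frac{1}{n} \sum_{k} w_k (\mu_k - \mu)^2 .
\end{align*}
```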


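Another way is to split users at random inside every stratum, in proportion to the weight of that stratum in the general population.

In this case the estimate is the weighted mean of the per-stratum sample means, and its expected value and variance change accordingly.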
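
A sketch of the standard result for proportional allocation, under the assumptions that strata are sampled independently and that rounding of the per-stratum sizes $n_k$ is ignored. As before, $w_k$ is the population share of stratum $k$, and $\mu_k$, $\sigma_k^2$ are its mean and variance; these symbols are assumptions of the sketch.

```latex
\begin{align*}
n_k &= n\, w_k, \qquad \hat{Y}_{\text{strat}} = \sum_{k} w_k \bar{Y}_k, \\
\mathbb{E}\big[\hat{Y}_{\text{strat}}\big] &= \sum_{k} w_k \mu_k = \mu, \\
\operatorname{Var}\big(\hat{Y}_{\text{strat}}\big) &= \sum_{k} w_k^2 \frac{\sigma_k^2}{n_k}
  = \frac{1}{n} \sum_{k} w_k \sigma_k^2 .
\end{align*}
```

Compared with plain random sampling, the between-strata term $\frac{1}{n}\sum_k w_k(\mu_k - \mu)^2$ is gone.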


The expected value is the same as with plain random selection. The variance, however, is smaller whenever the strata means differ, which gives higher metric sensitivity.

Now, let's consider Neyman's method. It suggests splitting users at random inside every stratum with specific weights that depend on both the share of the stratum and the spread of the metric inside it.

The expected value and variance of the resulting estimate again have a closed form.
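
A sketch of the standard Neyman (optimal) allocation result, assuming independent sampling within strata and ignoring rounding of $n_k$. The symbols $w_k$ (population share), $\mu_k$ and $\sigma_k$ (mean and standard deviation of stratum $k$) are assumptions of this sketch.

```latex
\begin{align*}
n_k &= n\, \frac{w_k \sigma_k}{\sum_{j} w_j \sigma_j}, \qquad
\hat{Y}_{\text{opt}} = \sum_{k} w_k \bar{Y}_k, \\
\mathbb{E}\big[\hat{Y}_{\text{opt}}\big] &= \sum_{k} w_k \mu_k = \mu, \\
\operatorname{Var}\big(\hat{Y}_{\text{opt}}\big) &= \sum_{k} w_k^2 \frac{\sigma_k^2}{n_k}
  = \frac{1}{n} \Big( \sum_{k} w_k \sigma_k \Big)^{2}
  \le \frac{1}{n} \sum_{k} w_k \sigma_k^2 .
\end{align*}
```

The inequality is Jensen's inequality applied to the weights $w_k$; it is an equality only when all $\sigma_k$ are equal, in which case Neyman allocation coincides with proportional allocation.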

The expected value is asymptotically equal to the expected value in the first case. The variance, however, is smaller still whenever the strata variances differ.
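
To make the "specific weights" concrete, here is a minimal sketch of how Neyman's allocation splits a sample across strata. The function name `neyman_allocation` and the standard deviations used in the example are hypothetical; only the weights 1/6, 1/3, 1/2 come from the group sizes in the problem formulation.

```python
import numpy as np


def neyman_allocation(n, weights, sigmas):
    """Split a total sample size n across strata in proportion to w_k * sigma_k.

    Hypothetical helper for illustration; sizes are rounded down.
    """
    weights = np.asarray(weights, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    shares = weights * sigmas
    if shares.sum() == 0:
        raise ValueError("All strata have zero standard deviation.")
    return np.floor(n * shares / shares.sum()).astype(int)


# Assumed standard deviations 1, 2, 3 for the three audience groups.
sizes = neyman_allocation(600, weights=[1 / 6, 1 / 3, 1 / 2], sigmas=[1.0, 2.0, 3.0])
print(sizes)
```

Strata that are both large and noisy receive the most observations, while a small, homogeneous stratum needs only a few.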

Empirical Test

We have shown the efficiency of this method theoretically. Let's simulate samples and test the stratification method empirically.

Let's consider three cases:

  • all strata have the same means and the same variances;
  • all strata have different means and the same variances;
  • all strata have the same means and different variances.

We will apply all three methods in every case and compare them using histograms and box plots.

Code Preparation

First, let's create a Python class that simulates our general population, made up of three strata.

```python
import numpy as np
import scipy.stats as st


class GeneralPopulation:

    def __init__(self,
                 means: list[float],
                 stds: list[float],
                 sizes: list[int],
                 random_state: int = 15):
        """
        Initializes our General Population and saves the given distributions

        :param means: List of expectations for normal distributions
        :param stds: List of standard deviations for normal distributions
        :param sizes: How many objects will be in each strata
        :param random_state: Parameter fixing randomness. Needed so that when
            conducting experiment repeatedly with the same input parameters,
            the results remained the same
        """
        self.strats = [st.norm(mean, std) for mean, std in zip(means, stds)]
        # Seed NumPy's global generator before sampling; SciPy's rvs() and
        # np.random.choice() both draw from it by default.
        self.random_state = random_state
        np.random.seed(random_state)
        self._sample(sizes)

    def _sample(self, sizes):
        """Creates a general population sample as a mixture of strata

        :param sizes: List with sample sizes of the corresponding normal distributions
        """
        self.strats_samples = [rv.rvs(size) for rv, size in zip(self.strats, sizes)]
        self.general_samples = np.hstack(self.strats_samples)
        self.N = self.general_samples.shape[0]

        # number of strata
        self.count_strats = len(sizes)

        # ratios for every strata in GP
        self.ws = [size / self.N for size in sizes]

        # ME and Std for GP
        self.m = np.mean(self.general_samples)
        self.sigma = np.std(self.general_samples)

        # ME and std for all strata
        self.ms = [np.mean(strat_sample) for strat_sample in self.strats_samples]
        self.sigmas = [np.std(strat_sample) for strat_sample in self.strats_samples]
```


Then, let's add methods for the three sampling schemes described in the theoretical part.

```python
    # Methods of GeneralPopulation (continued)

    def random_subsampling(self, size):
        """Creates a random subset of the entire population

        :param size: subsample size
        """
        rc = np.random.choice(self.general_samples, size=size)
        return rc

    def proportional_subsampling(self, size):
        """Creates a subsample with the number of elements, proportional shares of strata

        :param size: subsample size
        """
        self.strats_size_proport = [int(np.floor(size * w)) for w in self.ws]

        rc = []
        for k in range(len(self.strats_size_proport)):
            rc.append(np.random.choice(self.strats_samples[k],
                                       size=self.strats_size_proport[k]))
        return rc

    def optimal_subsampling(self, size):
        """Creates a subsample with the optimal number of elements relative to strata

        :param size: subsample size
        """
        # Denominator of the Neyman allocation: sum of w_k * sigma_k
        sum_denom = sum(w * sigma for w, sigma in zip(self.ws, self.sigmas))

        self.strats_size_optimal = [int(np.floor((size * w * sigma) / sum_denom))
                                    for w, sigma in zip(self.ws, self.sigmas)]
        if 0 in self.strats_size_optimal:
            raise ValueError('Strats size is 0, please change variance of smallest strat!')

        rc = []
        for k in range(len(self.strats_size_optimal)):
            rc.append(np.random.choice(self.strats_samples[k],
                                       size=self.strats_size_optimal[k]))
        return rc
```


Finally, for the empirical part, we need a method that simulates the experiment process repeatedly.

```python
    # Method of GeneralPopulation (continued)

    def run_experiments(self, n_sub, subsampling_method, n_experiments=1000):
        """Conducts a series of experiments and saves the results

        :param n_sub: size of sample
        :param subsampling_method: method for creating a subsample
        :param n_experiments: number of experiment starts
        """
        means_s = []

        # Fall back to a small subsample for tiny populations
        if len(self.general_samples) < 100:
            n_sub = 20

        if subsampling_method == 'random_subsampling':
            for _ in range(n_experiments):
                rc = self.random_subsampling(n_sub)
                means_s.append(rc.sum() / len(rc))
        elif subsampling_method in ('proportional_subsampling', 'optimal_subsampling'):
            sampler = getattr(self, subsampling_method)
            for _ in range(n_experiments):
                rc = sampler(n_sub)
                strats_mean = [sum(strat) / len(strat) for strat in rc]
                # Mean for a mixture
                means_s.append(sum(w_k * mean_k
                                   for w_k, mean_k in zip(self.ws, strats_mean)))
        else:
            # Previously an unknown name failed later with an unrelated NameError
            raise ValueError(f'Unknown subsampling method: {subsampling_method}')

        return means_s
```


Simulation Results

If we look at a general population in which all strata have the same means and variances, the results of all three methods are expected to be more or less equal.

Different means with equal variances give more interesting results. Using stratification dramatically reduces the variance.

In the case of equal means and different variances, we see a variance reduction with Neyman's method.
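
The pattern across the three cases can be cross-checked against the closed-form variances of the three estimators. The sketch below is an illustration under stated assumptions, not the article's simulation: the function name `estimator_variances`, the sample size of 500, and the means and standard deviations of each case are hypothetical, and rounding of per-stratum sizes is ignored. Only the weights come from the group sizes in the problem formulation.

```python
import numpy as np


def estimator_variances(n, weights, means, sigmas):
    """Theoretical variance of the mean estimate under each sampling scheme."""
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(means, dtype=float)
    s = np.asarray(sigmas, dtype=float)

    pop_mean = np.sum(w * mu)
    within = np.sum(w * s ** 2)                # average within-strata variance
    between = np.sum(w * (mu - pop_mean) ** 2)  # spread of the strata means

    return {
        "random": (within + between) / n,
        "proportional": within / n,
        "optimal": np.sum(w * s) ** 2 / n,
    }


weights = [1 / 6, 1 / 3, 1 / 2]
# Hypothetical (means, standard deviations) for each of the three cases.
cases = {
    "same means, same variances": ([10, 10, 10], [2, 2, 2]),
    "different means, same variances": ([5, 10, 15], [2, 2, 2]),
    "same means, different variances": ([10, 10, 10], [1, 2, 6]),
}

results = {name: estimator_variances(500, weights, m, s)
           for name, (m, s) in cases.items()}

for name, variances in results.items():
    print(name, {k: round(float(v), 4) for k, v in variances.items()})
```

Under these assumptions the three variances coincide in the first case, both stratified schemes beat random sampling in the second, and only the optimal allocation improves on the others in the third.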

Conclusion

Now you can apply the stratification method to reduce metric variance and boost your experiments: cluster your audience, then split users at random inside each cluster with specific weights!