Any experiment involves a trade-off between getting results quickly and keeping the metric sensitive. If the chosen metric has high variance, we have to wait a long time before we can trust the results of the experiment. Let's look at one method that helps analysts speed up their experiments without giving up too much time or metric sensitivity.
Suppose we run a standard experiment to test a new ranking algorithm, with session length as the primary metric. Suppose also that our audience can be split roughly into three groups: 1 million teenagers, 2 million users aged 18-45, and 3 million users aged 45 and above. The response to a new ranking algorithm will differ considerably between these groups. This wide variation reduces the sensitivity of the metric.
In other words, the population can be divided into three strata, described as follows:
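One way to write this setup down, using notation introduced here purely for illustration (stratum sizes $N_k$, weights $w_k$, means $\mu_k$, and standard deviations $\sigma_k$ are assumed symbols, not taken from the text):

```latex
\begin{aligned}
N &= N_1 + N_2 + N_3, \qquad N_1 = 1\,\text{M},\quad N_2 = 2\,\text{M},\quad N_3 = 3\,\text{M},\\
w_k &= \frac{N_k}{N}: \qquad w_1 = \tfrac{1}{6},\quad w_2 = \tfrac{1}{3},\quad w_3 = \tfrac{1}{2},\\
\mathbb{E}[X_k] &= \mu_k, \qquad \operatorname{Var}(X_k) = \sigma_k^2, \qquad k = 1, 2, 3,
\end{aligned}
```

where $X_k$ stands for the session length of a user drawn from stratum $k$.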
Suppose the metric within each stratum is normally distributed. The primary metric over the whole population is then a mixture of these normal distributions.
In a classical experiment design, we split all users from the population at random, without taking the differences between them into account. The resulting sample has the following expected value and variance.
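A standard result for a mixture population, written with assumed notation (stratum weights $w_k$, means $\mu_k$, standard deviations $\sigma_k$, sample size $n$) and ignoring the finite-population correction, would read:

```latex
\begin{aligned}
\mu &= \sum_{k} w_k \mu_k, \qquad
\sigma^2 = \sum_{k} w_k \sigma_k^2 + \sum_{k} w_k (\mu_k - \mu)^2,\\
\mathbb{E}\big[\bar{X}_{\text{random}}\big] &= \mu, \qquad
\operatorname{Var}\big(\bar{X}_{\text{random}}\big) = \frac{\sigma^2}{n}
  = \frac{1}{n}\sum_{k} w_k \sigma_k^2 + \frac{1}{n}\sum_{k} w_k (\mu_k - \mu)^2 .
\end{aligned}
```

The first term is the within-strata variance and the second is the between-strata variance.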
Another approach is to split users at random within each stratum, in proportion to the weight of that stratum in the general population.
In this case, the expected value and variance are as follows.
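As a sketch under the same kind of assumed notation (weights $w_k$, stratum means $\mu_k$, standard deviations $\sigma_k$, total sample size $n$), proportional allocation draws $n_k = n\,w_k$ users from stratum $k$ and combines the stratum means with the population weights:

```latex
\begin{aligned}
\bar{X}_{\text{prop}} &= \sum_{k} w_k \bar{X}_k, \qquad n_k = n\, w_k,\\
\mathbb{E}\big[\bar{X}_{\text{prop}}\big] &= \sum_{k} w_k \mu_k = \mu, \qquad
\operatorname{Var}\big(\bar{X}_{\text{prop}}\big) = \sum_{k} \frac{w_k^2 \sigma_k^2}{n_k}
  = \frac{1}{n}\sum_{k} w_k \sigma_k^2 .
\end{aligned}
```

Compared with the population variance $\sigma^2 = \sum_k w_k \sigma_k^2 + \sum_k w_k (\mu_k - \mu)^2$, the between-strata term is absent.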
The expected value is the same as in the first option. The variance, however, is smaller whenever the strata means differ, which gives the metric higher sensitivity.
Now let's consider Neyman's method. It proposes splitting users at random within each stratum, using specific weights for the stratum sample sizes.
The expected value and variance in this case are as follows.
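In the textbook form of Neyman (optimal) allocation, written with assumed notation (weights $w_k$, standard deviations $\sigma_k$, total sample size $n$), the sample size of each stratum is proportional to $w_k \sigma_k$:

```latex
\begin{aligned}
n_k &= n\,\frac{w_k \sigma_k}{\sum_{j} w_j \sigma_j}, \qquad
\bar{X}_{\text{opt}} = \sum_{k} w_k \bar{X}_k,\\
\mathbb{E}\big[\bar{X}_{\text{opt}}\big] &= \mu, \qquad
\operatorname{Var}\big(\bar{X}_{\text{opt}}\big) = \sum_{k} \frac{w_k^2 \sigma_k^2}{n_k}
  = \frac{1}{n}\Big(\sum_{k} w_k \sigma_k\Big)^{2}
  \le \frac{1}{n}\sum_{k} w_k \sigma_k^2 .
\end{aligned}
```

The final inequality follows from Jensen's inequality and becomes an equality when all $\sigma_k$ are equal.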
Asymptotically, the expected value equals the expected value in the first case. The variance, however, is smaller still whenever the strata have different variances.
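To make the comparison concrete, here is a minimal sketch that evaluates the textbook variance of the sample mean under the three designs. The function name `theoretical_variances` and all input numbers are hypothetical; only the 1:2:3 weight split comes from the audience example.

```python
import numpy as np


def theoretical_variances(ws, mus, sigmas, n):
    """Return the variance of the sample mean under three sampling designs."""
    ws, mus, sigmas = (np.asarray(a, dtype=float) for a in (ws, mus, sigmas))
    mu = np.sum(ws * mus)                    # population mean
    within = np.sum(ws * sigmas ** 2)        # within-strata variance
    between = np.sum(ws * (mus - mu) ** 2)   # between-strata variance
    return {
        "random": (within + between) / n,
        "proportional": within / n,
        "neyman": np.sum(ws * sigmas) ** 2 / n,
    }


# Hypothetical numbers: weights follow the 1M / 2M / 3M audience split.
ws = [1 / 6, 1 / 3, 1 / 2]
variances = theoretical_variances(ws, mus=[15, 25, 35], sigmas=[4, 6, 10], n=600)
for name, value in variances.items():
    print(f"{name:>12}: {value:.4f}")
```

With equal standard deviations the proportional and Neyman values coincide, and with equal means the random and proportional values coincide.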
We have shown the efficiency of this method in theory. Now let's simulate samples and test the stratification method empirically.
Let's consider three cases: strata with equal means and equal variances; strata with different means and equal variances; and strata with equal means and different variances.
We will apply all three methods in every case and plot a histogram and a boxplot to compare them.
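The plotting step could look roughly like the sketch below. The helper `plot_comparison` is a hypothetical name, Matplotlib is assumed to be available, and the demo numbers are arbitrary synthetic placeholders rather than results from the experiments described here.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np


def plot_comparison(results, title):
    """Draw a histogram and a boxplot of sample means for each method.

    `results` maps a method name to a sequence of simulated sample means.
    """
    fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(12, 4))
    for name, means in results.items():
        ax_hist.hist(means, bins=40, alpha=0.5, label=name)
    ax_hist.set_xlabel("sample mean")
    ax_hist.set_ylabel("count")
    ax_hist.legend()
    ax_box.boxplot(list(results.values()))
    ax_box.set_xticks(range(1, len(results) + 1))
    ax_box.set_xticklabels(list(results.keys()))
    fig.suptitle(title)
    return fig


# Placeholder input only: arbitrary synthetic numbers, not experiment output.
rng = np.random.default_rng(0)
demo = {
    "random": rng.normal(0.0, 1.0, 1000),
    "proportional": rng.normal(0.0, 0.6, 1000),
    "optimal": rng.normal(0.0, 0.5, 1000),
}
fig = plot_comparison(demo, "Distribution of sample means (synthetic demo data)")
```

In practice the dictionary would hold the lists of sample means produced by each sampling method, and `fig.savefig(...)` or `plt.show()` would render the figure.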
First, let's create a Python class that simulates our general population, made up of three strata.
```python
import numpy as np
import scipy.stats as st


class GeneralPopulation:

    def __init__(self,
                 means: list[float],
                 stds: list[float],
                 sizes: list[int],
                 random_state: int = 15):
        """
        Initializes our General Population and saves the given distributions

        :param means: List of expectations for normal distributions
        :param stds: List of standard deviations for normal distributions
        :param sizes: How many objects will be in each stratum
        :param random_state: Parameter fixing randomness. Needed so that when
            conducting the experiment repeatedly with the same input parameters,
            the results remain the same
        """
        # Seed NumPy's global generator before sampling; scipy's rvs() and
        # np.random.choice() both draw from it by default.
        self.random_state = random_state
        np.random.seed(self.random_state)
        self.strats = [st.norm(mean, std) for mean, std in zip(means, stds)]
        self._sample(sizes)

    def _sample(self, sizes):
        """Creates a general population sample as a mixture of strata

        :param sizes: List with sample sizes of the corresponding normal distributions
        """
        self.strats_samples = [rv.rvs(size) for rv, size in zip(self.strats, sizes)]
        self.general_samples = np.hstack(self.strats_samples)
        # total number of objects in the general population
        self.N = self.general_samples.shape[0]

        # number of strata
        self.count_strats = len(sizes)

        # share of every stratum in the general population
        self.ws = [size / self.N for size in sizes]

        # mean and std for the general population
        self.m = np.mean(self.general_samples)
        self.sigma = np.std(self.general_samples)

        # mean and std for every stratum
        self.ms = [np.mean(strat_sample) for strat_sample in self.strats_samples]
        self.sigmas = [np.std(strat_sample) for strat_sample in self.strats_samples]
```
Next, let's add methods for the three sampling approaches described in the theoretical part.
```python
    # Methods of GeneralPopulation (continue the class body above)

    def random_subsampling(self, size):
        """Creates a random subset of the entire population

        :param size: subsample size
        """
        rc = np.random.choice(self.general_samples, size=size)
        return rc

    def proportional_subsampling(self, size):
        """Creates a subsample whose stratum sizes are proportional to the strata shares

        :param size: subsample size
        """
        self.strats_size_proport = [int(np.floor(size * w)) for w in self.ws]

        rc = []
        for k in range(len(self.strats_size_proport)):
            rc.append(np.random.choice(self.strats_samples[k],
                                       size=self.strats_size_proport[k]))
        return rc

    def optimal_subsampling(self, size):
        """Creates a subsample with the optimal (Neyman) number of elements per stratum

        :param size: subsample size
        """
        # Normalizing constant: sum over strata of w_k * sigma_k
        sum_denom = sum(w * sigma for w, sigma in zip(self.ws, self.sigmas))

        self.strats_size_optimal = [int(np.floor((size * w * sigma) / sum_denom))
                                    for w, sigma in zip(self.ws, self.sigmas)]
        if 0 in self.strats_size_optimal:
            raise ValueError('Strats size is 0, please change variance of smallest strat!')

        rc = []
        for k in range(len(self.strats_size_optimal)):
            rc.append(np.random.choice(self.strats_samples[k],
                                       size=self.strats_size_optimal[k]))
        return rc
```
For the empirical part, we also need a method that simulates the experiment process.
```python
    # Method of GeneralPopulation (continues the class body above)

    def run_experiments(self, n_sub, subsampling_method, n_experiments=1000):
        """Conducts a series of experiments and saves the results

        :param n_sub: size of sample
        :param subsampling_method: method for creating a subsample
        :param n_experiments: number of experiment starts
        """
        methods = ('random_subsampling', 'proportional_subsampling', 'optimal_subsampling')
        if subsampling_method not in methods:
            raise ValueError(f'Unknown subsampling method: {subsampling_method}')

        means_s = []

        # Fall back to a small subsample for tiny populations
        if len(self.general_samples) < 100:
            n_sub = 20

        for _ in range(n_experiments):
            if subsampling_method == 'random_subsampling':
                rc = self.random_subsampling(n_sub)
                means_s.append(np.mean(rc))
                continue

            if subsampling_method == 'proportional_subsampling':
                rc = self.proportional_subsampling(n_sub)
            else:
                rc = self.optimal_subsampling(n_sub)

            strats_mean = [np.mean(strat_sample) for strat_sample in rc]

            # Mean for a mixture: stratum means weighted by population shares
            means_s.append(sum(w_k * mean_k for w_k, mean_k in zip(self.ws, strats_mean)))

        return means_s
```
For a general population in which all strata have the same means and variances, we expect the results of all three methods to be roughly equal.
The case with different means and equal variances gives more interesting results: stratification reduces the variance dramatically.
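The different-means case can be checked with a compact, self-contained simulation. This is only a sketch with hypothetical strata parameters; it uses plain NumPy rather than the class above and compares simple random sampling with proportional stratified sampling.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical strata: different means, equal standard deviations.
sizes = [10_000, 20_000, 30_000]
means, std = [15.0, 25.0, 35.0], 5.0
strata = [rng.normal(m, std, size) for m, size in zip(means, sizes)]
population = np.concatenate(strata)
weights = np.array(sizes) / sum(sizes)

n, n_experiments = 600, 2000
random_means, stratified_means = [], []
for _ in range(n_experiments):
    # Simple random sample from the whole population.
    random_means.append(rng.choice(population, size=n).mean())
    # Proportional allocation: about n * w_k users drawn from stratum k.
    per_stratum = [rng.choice(s, size=int(round(n * w))).mean()
                   for s, w in zip(strata, weights)]
    stratified_means.append(np.dot(weights, per_stratum))

var_random = np.var(random_means)
var_stratified = np.var(stratified_means)
print(f"random: {var_random:.4f}, stratified: {var_stratified:.4f}")
```

The stratified estimator removes the between-strata component of the variance, so its spread across repeated experiments should come out clearly smaller here.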
In the case with equal means and different variances, the variance reduction comes from Neyman's method.
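The reason is visible in how the sample is split across strata. The sketch below compares the two allocations; the helper name `neyman_allocation` and the numbers are hypothetical, and sizes are floored in the same spirit as `optimal_subsampling`.

```python
import numpy as np


def neyman_allocation(n, ws, sigmas):
    """Split a sample of size n across strata proportionally to w_k * sigma_k."""
    ws = np.asarray(ws, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    shares = ws * sigmas / np.sum(ws * sigmas)
    return np.floor(n * shares).astype(int)


ws = [1 / 6, 1 / 3, 1 / 2]   # hypothetical stratum weights
sigmas = [2.0, 5.0, 12.0]    # hypothetical stratum standard deviations
n = 600

proportional = np.floor(n * np.asarray(ws)).astype(int)
optimal = neyman_allocation(n, ws, sigmas)
print("proportional:", proportional)
print("Neyman:      ", optimal)
```

Neyman allocation moves observations from the low-variance strata to the noisiest stratum, which is where extra data reduces the variance of the overall estimate the most.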
You can now apply stratification to reduce metric variance and speed up your experiment: cluster your audience, then split users at random within each cluster using specific weights!