Outlier Detection: What You Need to Know

by Natalia Ogneva

4 min read · 2024/04/23

Too Long; Didn't Read

Analysts often encounter outliers in data during their work. Decisions are usually based on the sample mean, which is very sensitive to outliers. It is important to manage outliers in order to make the correct decision. Let's consider several simple and fast approaches for working with unusual values.

Analysts often encounter outliers in data during their work, for example in A/B test analysis, when building predictive models, or when tracking trends. Decisions are usually based on the sample mean, which is very sensitive to outliers: a few extreme observations can change its value dramatically. It is therefore crucial to manage outliers in order to make the correct decision.

Let's consider several simple and fast approaches for working with unusual values.

Problem Formulation

Imagine that you need to analyze an experiment using average order value as the primary metric. Let's say that our metric is normally distributed. We also know that the metric's distribution in the test group differs from the one in the control group: the mean of the distribution is 10 in control and 12 in test. The standard deviation in both groups is 3.

However, both samples contain outliers, which skew the sample means and the sample standard deviations.


    import numpy as np

    N = 1000
    mean_1 = 10
    std_1 = 3
    mean_2 = 12
    std_2 = 3

    # Control group: normal data plus 50 high outliers in [20, 30)
    x1 = np.concatenate((np.random.normal(mean_1, std_1, N), 10 * np.random.random_sample(50) + 20))
    # Test group: normal data plus 50 low outliers in [1, 5)
    x2 = np.concatenate((np.random.normal(mean_2, std_2, N), 4 * np.random.random_sample(50) + 1))
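To see how much a small share of outliers distorts the estimates, one can compare the sample moments with and without them. The sketch below regenerates control-group data with the same parameters as above; the seeded generator `rng` and the helper name `describe` are assumptions added for illustration and are not part of the original code.

```python
import numpy as np

# Assumption: a seeded generator so the numbers are reproducible.
rng = np.random.default_rng(42)

N = 1000
clean = rng.normal(10, 3, N)              # control metric without outliers
outliers = 10 * rng.random(50) + 20       # 50 extreme values in [20, 30)
contaminated = np.concatenate((clean, outliers))


def describe(sample):
    """Return (sample mean, sample standard deviation)."""
    return float(np.mean(sample)), float(np.std(sample, ddof=1))


clean_mean, clean_std = describe(clean)
cont_mean, cont_std = describe(contaminated)
print(f"clean:        mean={clean_mean:.2f}, std={clean_std:.2f}")
print(f"contaminated: mean={cont_mean:.2f}, std={cont_std:.2f}")
```

Fewer than 5% of the observations are outliers here, yet both the mean and the standard deviation move away from the true parameters (10 and 3).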

Note that the metric considered here can have outliers on both sides. If your metric can only have outliers on one side, the methods below can easily be adapted for that case.
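As a rough illustration of such an adaptation, a one-sided variant could trim only the tail where outliers are expected. This is a minimal sketch; the function name `trim_upper_tail` and the 95% threshold are hypothetical choices, not something the original article specifies.

```python
import numpy as np


def trim_upper_tail(sample, upper_pct=95):
    """Drop observations above the given percentile; keep the lower tail intact."""
    sample = np.asarray(sample)
    threshold = np.percentile(sample, upper_pct)
    return sample[sample <= threshold]


# Assumption: seeded data with outliers only on the high side.
rng = np.random.default_rng(0)
x = np.concatenate((rng.normal(10, 3, 1000), 10 * rng.random(50) + 20))

x_trimmed = trim_upper_tail(x, upper_pct=95)
print(len(x), len(x_trimmed))
```

The lower tail is left untouched, so no valid small observations are discarded.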

Cut Off Tails

The easiest method is to cut off all observations below the 5th percentile and above the 95th percentile. The downside is that we lose 10% of the data. However, the distributions look better formed, and the sample moments are closer to the distribution moments.


    import numpy as np

    # Keep only observations strictly between the 5th and 95th percentiles
    x1_5pct = np.percentile(x1, 5)
    x1_95pct = np.percentile(x1, 95)
    x1_cutted = x1[(x1 > x1_5pct) & (x1 < x1_95pct)]

    x2_5pct = np.percentile(x2, 5)
    x2_95pct = np.percentile(x2, 95)
    x2_cutted = x2[(x2 > x2_5pct) & (x2 < x2_95pct)]
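The trade-off described above (about 10% of the data lost, moments closer to the true ones) can be checked directly. The following is a sketch under assumptions: the helper `cut_tails`, its parameters, and the seeded data are illustrative and not taken from the original.

```python
import numpy as np


def cut_tails(sample, lower_pct=5, upper_pct=95):
    """Return the observations strictly between two percentiles and the share removed."""
    sample = np.asarray(sample)
    low, high = np.percentile(sample, [lower_pct, upper_pct])
    kept = sample[(sample > low) & (sample < high)]
    share_removed = 1 - len(kept) / len(sample)
    return kept, share_removed


# Assumption: seeded control-group data with the parameters used in the article.
rng = np.random.default_rng(1)
x = np.concatenate((rng.normal(10, 3, 1000), 10 * rng.random(50) + 20))

kept, share_removed = cut_tails(x)
print(f"removed {share_removed:.1%} of observations")
print(f"mean before = {x.mean():.2f}, mean after = {kept.mean():.2f}")
```

Because the cut is symmetric while the outliers sit on one side only, the trimmed mean can still be slightly biased; it is just closer to the true mean than the raw one.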


Another way is to exclude observations outside a specific range. The lower band equals the 25th percentile minus 1.5 times the interquartile range, and the upper band equals the 75th percentile plus 1.5 times the interquartile range. Here, we lose only 0.7% of the data. The distributions look better formed than the initial ones, and the sample moments are even closer to the distribution moments.


    import numpy as np

    # The bands are built from the interquartile range (IQR), as described in the text
    q1_1, q3_1 = np.percentile(x1, [25, 75])
    iqr_1 = q3_1 - q1_1
    low_band_1 = q1_1 - 1.5 * iqr_1
    high_band_1 = q3_1 + 1.5 * iqr_1
    x1_cutted = x1[(x1 > low_band_1) & (x1 < high_band_1)]

    q1_2, q3_2 = np.percentile(x2, [25, 75])
    iqr_2 = q3_2 - q1_2
    low_band_2 = q1_2 - 1.5 * iqr_2
    high_band_2 = q3_2 + 1.5 * iqr_2
    x2_cutted = x2[(x2 > low_band_2) & (x2 < high_band_2)]
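The 0.7% figure appears consistent with what the 1.5 × IQR rule removes from clean, normally distributed data. A small simulation can illustrate this; the helper `iqr_bounds`, the seed, and the sample size are assumptions made for the sketch.

```python
import numpy as np


def iqr_bounds(sample, k=1.5):
    """Return (lower band, upper band) built from the interquartile range."""
    q1, q3 = np.percentile(sample, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Assumption: a large, outlier-free normal sample with the control-group parameters.
rng = np.random.default_rng(2)
clean = rng.normal(10, 3, 100_000)

low, high = iqr_bounds(clean)
share_outside = np.mean((clean < low) | (clean > high))
print(f"bands: ({low:.2f}, {high:.2f})")
print(f"share outside the bands: {share_outside:.2%}")
```

On contaminated samples the share removed will be larger, since the injected outliers are excluded as well.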

Bootstrap

The second method we consider here is the bootstrap. In this approach, the mean is constructed as the mean of subsample means. In our example, the mean in the control group equals 10.35, and in the test group it equals 11.78. This is still a better result compared to additional data processing.

    import numpy as np
    import pandas as pd

    def create_bootstrap_samples(
        sample_list: np.ndarray,
        sample_size: int,
        n_samples: int,
    ) -> pd.Series:
        # create a list for sample means
        sample_means = []
        # loop n_samples times
        for _ in range(n_samples):
            # create a bootstrap sample of sample_size with replacement
            bootstrap_sample = pd.Series(sample_list).sample(n=sample_size, replace=True)
            # calculate the bootstrap sample mean
            sample_mean = bootstrap_sample.mean()
            # add this sample mean to the sample means list
            sample_means.append(sample_mean)
        return pd.Series(sample_means)

    (create_bootstrap_samples(x1, len(x1), 1000).mean(),
     create_bootstrap_samples(x2, len(x2), 1000).mean())
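The same resampling idea can also be written with NumPy alone, which avoids the Python loop. This is a sketch rather than the author's implementation: the function name `bootstrap_means`, the seeds, and the 95% percentile interval are assumptions added for illustration.

```python
import numpy as np


def bootstrap_means(sample, n_samples=1000, seed=0):
    """Return the means of n_samples bootstrap resamples, each as large as the sample."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    # Draw all resampling indices at once: one row per bootstrap sample.
    idx = rng.integers(0, len(sample), size=(n_samples, len(sample)))
    return sample[idx].mean(axis=1)


# Assumption: seeded control-group data with the parameters used in the article.
rng = np.random.default_rng(3)
x = np.concatenate((rng.normal(10, 3, 1000), 10 * rng.random(50) + 20))

means = bootstrap_means(x)
low, high = np.percentile(means, [2.5, 97.5])
print(f"bootstrap mean = {means.mean():.2f}")
print(f"95% percentile interval = ({low:.2f}, {high:.2f})")
```

Besides a point estimate, the spread of the bootstrap means gives a rough idea of how uncertain the estimated mean is.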

Conclusion

Detecting and handling outliers is important for making the right decision. Now you have at least three fast and straightforward approaches that can help you check the data before analysis.

However, it is essential to remember that detected outliers may be genuine unusual values and a sign of the novelty effect. But that is another story :)
