Contents

- Overview
- Setup
- Load the data
- Preprocess the data
- Logistic regression
- Logistic regression fundamentals
- The log loss function
- The gradient descent update rule
- Train the model
- Performance evaluation
- Save the model
- Conclusion

This guide demonstrates how to use the TensorFlow Core low-level APIs to perform binary classification with logistic regression. It uses the Wisconsin Breast Cancer Dataset for tumor classification.

Logistic regression is one of the most popular algorithms for binary classification. Given a set of examples with features, the goal of logistic regression is to output values between 0 and 1, which can be interpreted as the probabilities of each example belonging to a particular class.

Setup

This tutorial uses pandas for reading a CSV file into a DataFrame, seaborn for plotting pairwise relationships in a dataset, scikit-learn for computing a confusion matrix, and matplotlib for creating visualizations.

    pip install -q seaborn

    import tensorflow as tf
    import pandas as pd
    import matplotlib
    from matplotlib import pyplot as plt
    import seaborn as sns
    import sklearn.metrics as sk_metrics
    import tempfile
    import os

    # Preset matplotlib figure sizes.
    matplotlib.rcParams['figure.figsize'] = [9, 6]

    print(tf.__version__)
    # To make the results reproducible, set the random seed value.
    tf.random.set_seed(22)

    2.17.0

Load the data

Next, load the Wisconsin Breast Cancer Dataset from the UCI Machine Learning Repository. This dataset contains various features such as a tumor's radius, texture, and concavity.

    url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data'

    features = ['radius', 'texture', 'perimeter', 'area', 'smoothness', 'compactness',
                'concavity', 'concave_poinits', 'symmetry', 'fractal_dimension']
    column_names = ['id', 'diagnosis']

    # Each of the 10 measurements appears three times: mean, standard error, largest.
    for attr in ['mean', 'ste', 'largest']:
      for feature in features:
        column_names.append(feature + "_" + attr)

Read the dataset into a pandas DataFrame using pandas.read_csv:

    dataset = pd.read_csv(url, names=column_names)
    dataset.info()

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 569 entries, 0 to 568
    Data columns (total 32 columns):
     #   Column                     Non-Null Count  Dtype
    ---  ------                     --------------  -----
     0   id                         569 non-null    int64
     1   diagnosis                  569 non-null    object
     2   radius_mean                569 non-null    float64
     3   texture_mean               569 non-null    float64
     4   perimeter_mean             569 non-null    float64
     5   area_mean                  569 non-null    float64
     6   smoothness_mean            569 non-null    float64
     7   compactness_mean           569 non-null    float64
     8   concavity_mean             569 non-null    float64
     9   concave_poinits_mean       569 non-null    float64
     10  symmetry_mean              569 non-null    float64
     11  fractal_dimension_mean     569 non-null    float64
     12  radius_ste                 569 non-null    float64
     13  texture_ste                569 non-null    float64
     14  perimeter_ste              569 non-null    float64
     15  area_ste                   569 non-null    float64
     16  smoothness_ste             569 non-null    float64
     17  compactness_ste            569 non-null    float64
     18  concavity_ste              569 non-null    float64
     19  concave_poinits_ste        569 non-null    float64
     20  symmetry_ste               569 non-null    float64
     21  fractal_dimension_ste      569 non-null    float64
     22  radius_largest             569 non-null    float64
     23  texture_largest            569 non-null    float64
     24  perimeter_largest          569 non-null    float64
     25  area_largest               569 non-null    float64
     26  smoothness_largest         569 non-null    float64
     27  compactness_largest        569 non-null    float64
     28  concavity_largest          569 non-null    float64
     29  concave_poinits_largest    569 non-null    float64
     30  symmetry_largest           569 non-null    float64
     31  fractal_dimension_largest  569 non-null    float64
    dtypes: float64(30), int64(1), object(1)
    memory usage: 142.4+ KB

Display the first five rows:

    dataset.head()

    [Output: the first five rows of the DataFrame, with columns id, diagnosis, radius_mean, texture_mean, perimeter_mean, area_mean, smoothness_mean, compactness_mean, concavity_mean, concave_poinits_mean, ..., radius_largest, texture_largest, perimeter_largest, area_largest, smoothness_largest, compactness_largest, concavity_largest, concave_poinits_largest, symmetry_largest, fractal_dimension_largest]

Split the dataset into training and test sets using pandas.DataFrame.sample, pandas.DataFrame.drop and pandas.DataFrame.iloc. Make sure to separate the features from the target labels. The test set is used to evaluate how well your model generalizes to unseen data.

    train_dataset = dataset.sample(frac=0.75, random_state=1)
    len(train_dataset)

    427

    test_dataset = dataset.drop(train_dataset.index)
    len(test_dataset)

    142

    # The `id` column can be dropped since each row is unique
    x_train, y_train = train_dataset.iloc[:, 2:], train_dataset.iloc[:, 1]
    x_test, y_test = test_dataset.iloc[:, 2:], test_dataset.iloc[:, 1]

Preprocess the data

This dataset contains the mean, standard error, and largest values for each of the 10 tumor measurements collected per example. The "diagnosis" target column is a categorical variable with 'M' indicating a malignant tumor and 'B' indicating a benign tumor diagnosis. This column needs to be converted into a numerical binary format for model training.

The pandas.Series.map function is useful for mapping binary values to the categories.

The dataset should also be converted to a tensor with the tf.convert_to_tensor function after the preprocessing is complete.

    # Encode the labels: benign -> 0, malignant -> 1
    y_train, y_test = y_train.map({'B': 0, 'M': 1}), y_test.map({'B': 0, 'M': 1})
    x_train, y_train = tf.convert_to_tensor(x_train, dtype=tf.float32), tf.convert_to_tensor(y_train, dtype=tf.float32)
    x_test, y_test = tf.convert_to_tensor(x_test, dtype=tf.float32), tf.convert_to_tensor(y_test, dtype=tf.float32)
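Because the label encoding relies on pandas.Series.map with a dictionary, any label missing from the dictionary becomes NaN rather than raising an error. The following is a minimal sketch of a sanity check that could be run on the mapped labels before converting them to tensors. The `labels` series and its "?" entry are made up for illustration and do not come from the dataset.

```python
import pandas as pd

# Hypothetical labels; "?" stands in for an unexpected category.
labels = pd.Series(["M", "B", "B", "M", "?"])

# Same mapping as in the tutorial: benign -> 0, malignant -> 1.
encoded = labels.map({"B": 0, "M": 1})

# Values absent from the dictionary are mapped to NaN, so count them.
n_unmapped = int(encoded.isna().sum())
print(f"Unmapped labels: {n_unmapped}")
```

If the count is nonzero, it is usually safer to inspect those rows before training than to let NaN values flow into the loss.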
Use seaborn.pairplot to review the joint distribution of a few pairs of mean-based features from the training set and observe how they relate to the target:

    sns.pairplot(train_dataset.iloc[:, 1:6], hue = 'diagnosis', diag_kind='kde');

This pairplot demonstrates that certain features such as radius, perimeter and area are highly correlated. This is expected since the tumor radius is directly involved in the calculation of both perimeter and area. Additionally, note that the distributions for malignant diagnoses appear more right-skewed for many of the features.
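As a rough illustration of why these three features move together, the sketch below idealizes each tumor as a circle. That is an assumption made here for the example, not something the dataset states, and the radii are randomly generated rather than taken from the data. Under that assumption the perimeter is linear in the radius and the area is quadratic in it, so both correlate strongly with the radius.

```python
import numpy as np
import pandas as pd

# Made-up radii (not from the dataset); each tumor is idealized as a circle.
rng = np.random.default_rng(0)
radius = rng.uniform(7.0, 28.0, size=200)

toy = pd.DataFrame({
    "radius": radius,
    "perimeter": 2 * np.pi * radius,  # linear in the radius
    "area": np.pi * radius ** 2,      # quadratic in the radius
})

# DataFrame.corr computes pairwise Pearson correlations by default.
corr = toy.corr()
print(corr.round(3))
```

Strongly correlated inputs like these carry overlapping information, which is worth keeping in mind when interpreting the learned weights.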
Make sure to also check the overall statistics. Note how each feature covers a vastly different range of values.

    train_dataset.describe().transpose()[:10]

                          count          mean           std         min            25%           50%           75%           max
    id                    427.0  2.756014e+07  1.162735e+08  8670.00000  865427.500000  905539.00000  8.810829e+06  9.113205e+08
    radius_mean           427.0  1.414331e+01  3.528717e+00     6.98100      11.695000      13.43000  1.594000e+01  2.811000e+01
    texture_mean          427.0  1.924468e+01  4.113131e+00    10.38000      16.330000      18.84000  2.168000e+01  3.381000e+01
    perimeter_mean        427.0  9.206759e+01  2.431431e+01    43.79000      75.235000      86.87000  1.060000e+02  1.885000e+02
    area_mean             427.0  6.563190e+02  3.489106e+02   143.50000     420.050000     553.50000  7.908500e+02  2.499000e+03
    smoothness_mean       427.0  9.633618e-02  1.436820e-02     0.05263       0.085850       0.09566  1.050000e-01  1.634000e-01
    compactness_mean      427.0  1.036597e-01  5.351893e-02     0.02344       0.063515       0.09182  1.296500e-01  3.454000e-01
    concavity_mean        427.0  8.833008e-02  7.965884e-02     0.00000       0.029570       0.05999  1.297500e-01  4.268000e-01
    concave_poinits_mean  427.0  4.872688e-02  3.853594e-02     0.00000       0.019650       0.03390  7.409500e-02  2.012000e-01
    symmetry_mean         427.0  1.804597e-01  2.637837e-02     0.12030       0.161700       0.17840  1.947000e-01  2.906000e-01

Given the inconsistent ranges, it is beneficial to standardize the data such that each feature has a zero mean and unit variance. This process is called normalization.

    class Normalize(tf.Module):
      def __init__(self, x):
        # Initialize the mean and standard deviation for normalization
        self.mean = tf.Variable(tf.math.reduce_mean(x, axis=0))
        self.std = tf.Variable(tf.math.reduce_std(x, axis=0))

      def norm(self, x):
        # Normalize the input
        return (x - self.mean)/self.std

      def unnorm(self, x):
        # Unnormalize the input
        return (x * self.std) + self.mean

    # Fit the statistics on the training set only, then apply them to both sets
    norm_x = Normalize(x_train)
    x_train_norm, x_test_norm = norm_x.norm(x_train), norm_x.norm(x_test)

Logistic regression

Before building a logistic regression model, it is crucial to understand how the method differs from traditional linear regression.

Logistic regression fundamentals

Linear regression returns a linear combination of its inputs; this output is unbounded. The output of a logistic regression is in the (0, 1) range. For each example, it represents the probability that the example belongs to the positive class.

Logistic regression maps the continuous outputs of traditional linear regression, (-∞, ∞), to probabilities, (0, 1). This transformation is also symmetric, so that flipping the sign of the linear output yields the complement of the original probability (a probability p becomes 1 − p).

Let Y denote the probability of being in class 1 (the tumor is malignant).
The desired mapping can be achieved by interpreting the linear regression output as the log odds of being in class 1 as opposed to class 0:

$$\ln\left(\frac{Y}{1 - Y}\right) = wX + b$$

By setting $wX + b = z$, this equation can then be solved for $Y$:

$$Y = \frac{e^{z}}{1 + e^{z}} = \frac{1}{1 + e^{-z}}$$

The expression $\frac{1}{1 + e^{-z}}$ is known as the sigmoid function $\sigma(z)$. Hence, the equation for logistic regression can be written as $Y = \sigma(wX + b)$.

The dataset in this tutorial deals with a high-dimensional feature matrix. Therefore, the above equation must be rewritten in matrix-vector form as follows:

$$Y = \sigma(Xw + b)$$

where:

- $Y_{m \times 1}$: a target vector
- $X_{m \times n}$: a feature matrix
- $w_{n \times 1}$: a weight vector
- $b$: a bias
- $\sigma$: a sigmoid function applied to each element of the output vector

Start by visualizing the sigmoid function, which transforms the linear output, (-∞, ∞), to fall between 0 and 1. The sigmoid function is available in tf.math.sigmoid.

    x = tf.linspace(-10, 10, 500)
    x = tf.cast(x, tf.float32)
    f = lambda x : (1/20)*x + 0.6
    plt.plot(x, tf.math.sigmoid(x))
    plt.ylim((-0.1,1.1))
    plt.title("Sigmoid function");

The log loss function

The log loss, or binary cross-entropy loss, is the ideal loss function for a binary classification problem with logistic regression. For each example, the log loss quantifies the similarity between a predicted probability and the example's true value. It is given by the following equation:

$$L = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y_i \cdot \log(\hat{y}_i) + (1 - y_i) \cdot \log(1 - \hat{y}_i) \,\right]$$

where:

- $\hat{y}$: a vector of predicted probabilities
- $y$: a vector of true targets

You can use the tf.nn.sigmoid_cross_entropy_with_logits function to compute the log loss.
This function automatically applies the sigmoid activation to the regression output:

    def log_loss(y_pred, y):
      # Compute the log loss function; `y_pred` holds logits, not probabilities
      ce = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=y_pred)
      return tf.reduce_mean(ce)

The gradient descent update rule

The TensorFlow Core APIs support automatic differentiation with tf.GradientTape. If you are curious about the mathematics behind the logistic regression gradient updates, here is a short explanation:

In the above equation for the log loss, recall that each $\hat{y}_i$ can be rewritten in terms of the inputs as $\sigma(X_i w + b)$. The goal is to find a $w$ and $b$ that minimize the log loss:

$$L = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y_i \cdot \log(\sigma(X_i w + b)) + (1 - y_i) \cdot \log(1 - \sigma(X_i w + b)) \,\right]$$

Taking the gradient of $L$ with respect to $w$, you get the following (an $n \times 1$ vector, matching the shape of $w$):

$$\frac{\partial L}{\partial w} = \frac{1}{m} X^{\top}\left(\sigma(Xw + b) - y\right)$$

Taking the gradient of $L$ with respect to $b$, you get the following:

$$\frac{\partial L}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\left(\sigma(X_i w + b) - y_i\right)$$

Now, build the logistic regression model.

    class LogisticRegression(tf.Module):

      def __init__(self):
        self.built = False

      def __call__(self, x, train=True):
        # Initialize the model parameters on the first call
        if not self.built:
          # Randomly generate the weights and the bias term
          rand_w = tf.random.uniform(shape=[x.shape[-1], 1], seed=22)
          rand_b = tf.random.uniform(shape=[], seed=22)
          self.w = tf.Variable(rand_w)
          self.b = tf.Variable(rand_b)
          self.built = True
        # Compute the model output
        z = tf.add(tf.matmul(x, self.w), self.b)
        z = tf.squeeze(z, axis=1)
        # Return logits while training and probabilities otherwise
        if train:
          return z
        return tf.sigmoid(z)

To validate, make sure the untrained model outputs values in the range (0, 1) for a small subset of the training data.

    log_reg = LogisticRegression()

    y_pred = log_reg(x_train_norm[:5], train=False)
    y_pred.numpy()

    array([0.9994985 , 0.9978607 , 0.29620072, 0.01979049, 0.3314926 ],
          dtype=float32)

Next, write an accuracy function to calculate the proportion of correct classifications during training. To retrieve the classifications from the predicted probabilities, set a threshold above which all probabilities are assigned to class 1. This is a configurable hyperparameter that can be set to 0.5 as a default.

    def predict_class(y_pred, thresh=0.5):
      # Return a tensor with `1` if `y_pred` > `thresh`, and `0` otherwise
      return tf.cast(y_pred > thresh, tf.float32)

    def accuracy(y_pred, y):
      # Return the proportion of matches between `y_pred` and `y`
      # `y_pred` holds logits here, so convert them to probabilities first
      y_pred = tf.math.sigmoid(y_pred)
      y_pred_class = predict_class(y_pred)
      check_equal = tf.cast(y_pred_class == y,tf.float32)
      acc_val = tf.reduce_mean(check_equal)
      return acc_val

Train the model

Using mini-batches for training provides both memory efficiency and faster convergence. The tf.data.Dataset API has useful functions for batching and shuffling. The API enables you to build complex input pipelines from simple, reusable pieces.

    batch_size = 64
    train_dataset = tf.data.Dataset.from_tensor_slices((x_train_norm, y_train))
    train_dataset = train_dataset.shuffle(buffer_size=x_train.shape[0]).batch(batch_size)
    test_dataset = tf.data.Dataset.from_tensor_slices((x_test_norm, y_test))
    test_dataset = test_dataset.shuffle(buffer_size=x_test.shape[0]).batch(batch_size)

Now write a training loop for the logistic regression model. The loop uses the log loss function and its gradients with respect to the model's parameters to update those parameters iteratively.
    # Set training parameters
    epochs = 200
    learning_rate = 0.01
    train_losses, test_losses = [], []
    train_accs, test_accs = [], []

    # Set up the training loop and begin training
    for epoch in range(epochs):
      batch_losses_train, batch_accs_train = [], []
      batch_losses_test, batch_accs_test = [], []

      # Iterate over the training data
      for x_batch, y_batch in train_dataset:
        with tf.GradientTape() as tape:
          y_pred_batch = log_reg(x_batch)
          batch_loss = log_loss(y_pred_batch, y_batch)
        batch_acc = accuracy(y_pred_batch, y_batch)
        # Update the parameters with respect to the gradient calculations
        grads = tape.gradient(batch_loss, log_reg.variables)
        for g,v in zip(grads, log_reg.variables):
          v.assign_sub(learning_rate * g)
        # Keep track of batch-level training performance
        batch_losses_train.append(batch_loss)
        batch_accs_train.append(batch_acc)

      # Iterate over the testing data
      for x_batch, y_batch in test_dataset:
        y_pred_batch = log_reg(x_batch)
        batch_loss = log_loss(y_pred_batch, y_batch)
        batch_acc = accuracy(y_pred_batch, y_batch)
        # Keep track of batch-level testing performance
        batch_losses_test.append(batch_loss)
        batch_accs_test.append(batch_acc)

      # Keep track of epoch-level model performance
      train_loss, train_acc = tf.reduce_mean(batch_losses_train), tf.reduce_mean(batch_accs_train)
      test_loss, test_acc = tf.reduce_mean(batch_losses_test), tf.reduce_mean(batch_accs_test)
      train_losses.append(train_loss)
      train_accs.append(train_acc)
      test_losses.append(test_loss)
      test_accs.append(test_acc)
      if epoch % 20 == 0:
        print(f"Epoch: {epoch}, Training log loss: {train_loss:.3f}")

    Epoch: 0, Training log loss: 0.661
    Epoch: 20, Training log loss: 0.418
    Epoch: 40, Training log loss: 0.269
    Epoch: 60, Training log loss: 0.178
    Epoch: 80, Training log loss: 0.137
    Epoch: 100, Training log loss: 0.116
    Epoch: 120, Training log loss: 0.106
    Epoch: 140, Training log loss: 0.096
    Epoch: 160, Training log loss: 0.094
    Epoch: 180, Training log loss: 0.089

Performance evaluation

Observe the changes in your model's loss and accuracy over time.

    plt.plot(range(epochs), train_losses, label = "Training loss")
    plt.plot(range(epochs), test_losses, label = "Testing loss")
    plt.xlabel("Epoch")
    plt.ylabel("Log loss")
    plt.legend()
    plt.title("Log loss vs training iterations");

    plt.plot(range(epochs), train_accs, label = "Training accuracy")
    plt.plot(range(epochs), test_accs, label = "Testing accuracy")
    plt.xlabel("Epoch")
    # Accuracy is stored as a fraction in [0, 1], not as a percentage
    plt.ylabel("Accuracy")
    plt.legend()
    plt.title("Accuracy vs training iterations");

    print(f"Final training log loss: {train_losses[-1]:.3f}")
    print(f"Final testing log Loss: {test_losses[-1]:.3f}")

    Final training log loss: 0.089
    Final testing log Loss: 0.077

    print(f"Final training accuracy: {train_accs[-1]:.3f}")
    print(f"Final testing accuracy: {test_accs[-1]:.3f}")

    Final training accuracy: 0.968
    Final testing accuracy: 0.979

The model demonstrates a high accuracy and a low loss when classifying tumors in the training dataset, and it also generalizes well to the unseen test data. To go one step further, you can explore error rates that give more insight beyond the overall accuracy score. The two most popular error rates for binary classification problems are the false positive rate (FPR) and the false negative rate (FNR).

For this problem, the FPR is the proportion of malignant tumor predictions among tumors that are actually benign. Conversely, the FNR is the proportion of benign tumor predictions among tumors that are actually malignant.
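To make the two error rates concrete, here is a minimal sketch that counts the four outcomes on a handful of made-up labels. The arrays `y_true` and `y_pred` are hypothetical and are not results from the model above; 1 stands for malignant and 0 for benign, matching the encoding used in this tutorial. The FPR divides the false positives by all truly benign examples, and the FNR divides the false negatives by all truly malignant examples.

```python
import numpy as np

# Made-up labels for illustration: 1 = malignant, 0 = benign.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 0])
y_pred = np.array([0, 1, 0, 0, 1, 1, 0, 1, 1, 0])

tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # malignant, predicted malignant
tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # benign, predicted benign
fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # benign, predicted malignant
fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # malignant, predicted benign

fpr = fp / (fp + tn)  # share of benign tumors flagged as malignant
fnr = fn / (fn + tp)  # share of malignant tumors missed

print(f"FPR: {fpr:.2f}, FNR: {fnr:.2f}")
```

In a row-normalized confusion matrix with true labels on the rows, these two rates are the off-diagonal entries.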
Compute a confusion matrix using sklearn.metrics.confusion_matrix, which evaluates the accuracy of the classification, and use matplotlib to display the matrix:

    def show_confusion_matrix(y, y_classes, typ):
      # Compute the confusion matrix and normalize it
      plt.figure(figsize=(10,10))
      confusion = sk_metrics.confusion_matrix(y.numpy(), y_classes.numpy())
      # Divide each row by its total so entries are rates per true class
      confusion_normalized = confusion / confusion.sum(axis=1, keepdims=True)
      axis_labels = range(2)
      ax = sns.heatmap(
          confusion_normalized, xticklabels=axis_labels, yticklabels=axis_labels,
          cmap='Blues', annot=True, fmt='.4f', square=True)
      plt.title(f"Confusion matrix: {typ}")
      plt.ylabel("True label")
      plt.xlabel("Predicted label")

    y_pred_train, y_pred_test = log_reg(x_train_norm, train=False), log_reg(x_test_norm, train=False)
    train_classes, test_classes = predict_class(y_pred_train), predict_class(y_pred_test)

    show_confusion_matrix(y_train, train_classes, 'Training')

    show_confusion_matrix(y_test, test_classes, 'Testing')

Observe the error rate measurements and interpret their significance in the context of this example. In many medical testing studies such as cancer detection, accepting a high false positive rate in order to ensure a low false negative rate is perfectly acceptable and in fact encouraged, since missing a malignant tumor (false negative) is far worse than misclassifying a benign tumor as malignant (false positive).

To control the FPR and the FNR, try changing the threshold hyperparameter before classifying the probability predictions. A lower threshold increases the model's overall chances of classifying a tumor as malignant. This inevitably increases the number of false positives and the FPR, but it also helps to decrease the number of false negatives and the FNR.
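The threshold trade-off can be sketched on a few made-up probabilities. Everything in this example is hypothetical: the `y_true` and `y_prob` arrays are invented for illustration, and `error_rates` is a helper written for this sketch, not part of the tutorial's code. It applies the same `probability > threshold` rule as `predict_class` and reports how the two error rates shift as the threshold moves.

```python
import numpy as np

# Made-up targets and predicted probabilities: 1 = malignant, 0 = benign.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_prob = np.array([0.05, 0.20, 0.40, 0.60, 0.35, 0.55, 0.80, 0.95])

def error_rates(y_true, y_prob, thresh):
    """Return (FPR, FNR) when probabilities above `thresh` count as class 1."""
    y_pred = (y_prob > thresh).astype(int)
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tp = np.sum((y_true == 1) & (y_pred == 1))
    return float(fp / (fp + tn)), float(fn / (fn + tp))

# Lowering the threshold trades false negatives for false positives.
for thresh in (0.3, 0.5, 0.7):
    fpr, fnr = error_rates(y_true, y_prob, thresh)
    print(f"threshold={thresh:.1f}  FPR={fpr:.2f}  FNR={fnr:.2f}")
```

On this toy data, moving the threshold from 0.7 down to 0.3 takes the FNR from 0.50 to 0.00 while the FPR rises from 0.00 to 0.50, which is the direction of the trade-off described above.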
Save the model

Start by making an export module that takes in raw data and performs the following operations:

- Normalization
- Probability prediction
- Class prediction

    class ExportModule(tf.Module):
      def __init__(self, model, norm_x, class_pred):
        # Initialize pre- and post-processing functions
        self.model = model
        self.norm_x = norm_x
        self.class_pred = class_pred

      @tf.function(input_signature=[tf.TensorSpec(shape=[None, None], dtype=tf.float32)])
      def __call__(self, x):
        # Run the `ExportModule` for new data points
        x = self.norm_x.norm(x)
        y = self.model(x, train=False)
        y = self.class_pred(y)
        return y

    log_reg_export = ExportModule(model=log_reg,
                                  norm_x=norm_x,
                                  class_pred=predict_class)

If you want to save the model in its current state, you can do so with the tf.saved_model.save function. To load a saved model and make predictions, use the tf.saved_model.load function.

    models = tempfile.mkdtemp()
    save_path = os.path.join(models, 'log_reg_export')
    tf.saved_model.save(log_reg_export, save_path)

    INFO:tensorflow:Assets written to: /tmpfs/tmp/tmp9k_sar52/log_reg_export/assets

    log_reg_loaded = tf.saved_model.load(save_path)
    # The loaded module normalizes its input, so pass the raw test features
    test_preds = log_reg_loaded(x_test)
    test_preds[:10].numpy()

    array([1., 1., 1., 1., 0., 1., 1., 1., 1., 1.], dtype=float32)

Conclusion

This notebook introduced a few techniques to handle a logistic regression problem. Here are a few more tips that may help:

- The TensorFlow Core APIs can be used to build machine learning workflows with high levels of configurability.
- Analyzing error rates is a great way to gain more insight into a classification model's performance beyond its overall accuracy score.
- Overfitting is another common problem for logistic regression models, though it wasn't a problem for this tutorial.

For more examples of using the TensorFlow Core APIs, check out the guide. If you want to learn more about loading and preparing data, see the tutorials on image data loading or CSV data loading.

Originally published on the TensorFlow website, this article appears here under a new title and is licensed under CC BY 4.0.