Contents: Overview · Setup · Load the data · Preprocess the data · Logistic regression fundamentals · The sigmoid function · The log loss function · The gradient descent update rule · Train the model · Performance evaluation · Save the model · Conclusion

This guide demonstrates how to use the TensorFlow Core low-level APIs to perform binary classification with logistic regression. It uses the Wisconsin Breast Cancer Dataset for tumor classification. Logistic regression is one of the most popular algorithms for binary classification. Given a set of examples with features, the goal of logistic regression is to output values between 0 and 1, which can be interpreted as the probabilities of each example belonging to a particular class.

Setup

This tutorial uses pandas for reading a CSV file into a DataFrame, seaborn for plotting pairwise relationships in a dataset, scikit-learn for computing a confusion matrix, and matplotlib for creating visualizations.

pip install -q seaborn

import tensorflow as tf
import pandas as pd
import matplotlib
from matplotlib import pyplot as plt
import seaborn as sns
import sklearn.metrics as sk_metrics
import tempfile
import os

# Preset matplotlib figure sizes.
matplotlib.rcParams['figure.figsize'] = [9, 6]

print(tf.__version__)
# To make the results reproducible, set the random seed value.
tf.random.set_seed(22)

2.17.0

Load the data

Next, load the Wisconsin Breast Cancer Dataset from the UCI Machine Learning Repository. This dataset contains various features such as a tumor's radius, texture, and concavity.

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data'

features = ['radius', 'texture', 'perimeter', 'area', 'smoothness', 'compactness',
            'concavity', 'concave_poinits', 'symmetry', 'fractal_dimension']
column_names = ['id', 'diagnosis']

for attr in ['mean', 'ste', 'largest']:
  for feature in features:
    column_names.append(feature + "_" + attr)

Read the dataset into a pandas DataFrame using pandas.read_csv:

dataset = pd.read_csv(url, names=column_names)
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 569 entries, 0 to 568
Data columns (total 32 columns):
 #   Column                     Non-Null Count  Dtype
---  ------                     --------------  -----
 0   id                         569 non-null    int64
 1   diagnosis                  569 non-null    object
 2   radius_mean                569 non-null    float64
 3   texture_mean               569 non-null    float64
 4   perimeter_mean             569 non-null    float64
 5   area_mean                  569 non-null    float64
 6   smoothness_mean            569 non-null    float64
 7   compactness_mean           569 non-null    float64
 8   concavity_mean             569 non-null    float64
 9   concave_poinits_mean       569 non-null    float64
 10  symmetry_mean              569 non-null    float64
 11  fractal_dimension_mean     569 non-null    float64
 12  radius_ste                 569 non-null    float64
 13  texture_ste                569 non-null    float64
 14  perimeter_ste              569 non-null    float64
 15  area_ste                   569 non-null    float64
 16  smoothness_ste             569 non-null    float64
 17  compactness_ste            569 non-null    float64
 18  concavity_ste              569 non-null    float64
 19  concave_poinits_ste        569 non-null    float64
 20  symmetry_ste               569 non-null    float64
 21  fractal_dimension_ste      569 non-null    float64
 22  radius_largest             569 non-null    float64
 23  texture_largest            569 non-null    float64
 24  perimeter_largest          569 non-null    float64
 25  area_largest               569 non-null    float64
 26  smoothness_largest         569 non-null    float64
 27  compactness_largest        569 non-null    float64
 28  concavity_largest          569 non-null    float64
 29  concave_poinits_largest    569 non-null    float64
 30  symmetry_largest           569 non-null    float64
 31  fractal_dimension_largest  569 non-null    float64
dtypes: float64(30), int64(1), object(1)
memory usage: 142.4+ KB

Display the first five rows:

dataset.head()

(table output: the first five rows of the DataFrame, showing the id, diagnosis, and 30 feature columns)

Split the dataset into training and test sets using pandas.DataFrame.sample, pandas.DataFrame.drop, and pandas.DataFrame.iloc. Make sure to split the features from the target labels. The test set is used to evaluate your model's generalizability to unseen data.
train_dataset = dataset.sample(frac=0.75, random_state=1)
len(train_dataset)

427

test_dataset = dataset.drop(train_dataset.index)
len(test_dataset)

142

# The `id` column can be dropped since each row is unique
x_train, y_train = train_dataset.iloc[:, 2:], train_dataset.iloc[:, 1]
x_test, y_test = test_dataset.iloc[:, 2:], test_dataset.iloc[:, 1]

Preprocess the data

This dataset contains the mean, standard error, and largest values for each of the 10 tumor measurements collected per example. The "diagnosis" target column is a categorical variable, with 'M' indicating a malignant and 'B' indicating a benign tumor diagnosis. This column needs to be converted into a numerical binary format for model training.

The pandas.Series.map function is useful for mapping binary values to the categories.

The dataset should also be converted to a tensor with the tf.convert_to_tensor function after the preprocessing is complete.

y_train, y_test = y_train.map({'B': 0, 'M': 1}), y_test.map({'B': 0, 'M': 1})
x_train, y_train = tf.convert_to_tensor(x_train, dtype=tf.float32), tf.convert_to_tensor(y_train, dtype=tf.float32)
x_test, y_test = tf.convert_to_tensor(x_test, dtype=tf.float32), tf.convert_to_tensor(y_test, dtype=tf.float32)
Use seaborn.pairplot to review the joint distribution of a few pairs of mean-based features from the training set and observe how they relate to the target:

sns.pairplot(train_dataset.iloc[:, 1:6], hue = 'diagnosis', diag_kind='kde');

This pairplot demonstrates that certain features such as radius, perimeter and area are highly correlated.
This is expected since the tumor radius is directly involved in the computation of both perimeter and area. Additionally, note that malignant diagnoses seem to be more right-skewed for many of the features.

Make sure to also check the overall statistics. Note how each feature covers a vastly different range of values.

train_dataset.describe().transpose()[:10]

                      count         mean           std         min            25%           50%           75%           max
id                    427.0  2.756014e+07  1.162735e+08  8670.00000  865427.500000  905539.00000  8.810829e+06  9.113205e+08
radius_mean           427.0  1.414331e+01  3.528717e+00     6.98100      11.695000      13.43000  1.594000e+01  2.811000e+01
texture_mean          427.0  1.924468e+01  4.113131e+00    10.38000      16.330000      18.84000  2.168000e+01  3.381000e+01
perimeter_mean        427.0  9.206759e+01  2.431431e+01    43.79000      75.235000      86.87000  1.060000e+02  1.885000e+02
area_mean             427.0  6.563190e+02  3.489106e+02   143.50000     420.050000     553.50000  7.908500e+02  2.499000e+03
smoothness_mean       427.0  9.633618e-02  1.436820e-02     0.05263       0.085850       0.09566  1.050000e-01  1.634000e-01
compactness_mean      427.0  1.036597e-01  5.351893e-02     0.02344       0.063515       0.09182  1.296500e-01  3.454000e-01
concavity_mean        427.0  8.833008e-02  7.965884e-02     0.00000       0.029570       0.05999  1.297500e-01  4.268000e-01
concave_poinits_mean  427.0  4.872688e-02  3.853594e-02     0.00000       0.019650       0.03390  7.409500e-02  2.012000e-01
symmetry_mean         427.0  1.804597e-01  2.637837e-02     0.12030       0.161700       0.17840  1.947000e-01  2.906000e-01

Given the inconsistent ranges, it is beneficial to standardize the data such that each feature has a zero mean and unit variance. This process is called normalization.

class Normalize(tf.Module):
  def __init__(self, x):
    # Initialize the mean and standard deviation for normalization
    self.mean = tf.Variable(tf.math.reduce_mean(x, axis=0))
    self.std = tf.Variable(tf.math.reduce_std(x, axis=0))

  def norm(self, x):
    # Normalize the input
    return (x - self.mean)/self.std

  def unnorm(self, x):
    # Unnormalize the input
    return (x * self.std) + self.mean

norm_x = Normalize(x_train)
x_train_norm, x_test_norm = norm_x.norm(x_train), norm_x.norm(x_test)

Logistic regression

Before building a logistic regression model, it is crucial to understand the method's differences compared to traditional linear regression.

Logistic regression fundamentals

Linear regression returns a linear combination of its inputs; this output is unbounded. The output of logistic regression is in the (0, 1) range. For each example, it represents the probability that the example belongs to the positive class.

Logistic regression maps the continuous outputs of traditional linear regression, (-∞, ∞), to probabilities, (0, 1). This transformation is also symmetric, so that flipping the sign of the linear output results in the inverse of the original probability.
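This symmetry can be checked numerically with tf.math.sigmoid; a minimal sketch (the sample logits below are arbitrary):

```python
import tensorflow as tf

# Verify that sigma(-z) = 1 - sigma(z) for a few arbitrary logits,
# i.e. flipping the sign of the linear output inverts the probability.
z = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])
max_diff = tf.reduce_max(tf.abs(tf.math.sigmoid(-z) - (1.0 - tf.math.sigmoid(z))))
print(max_diff.numpy())  # close to 0
```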
Let Y denote the probability of being in class 1 (the tumor is malignant). The desired mapping can be achieved by interpreting the linear regression output as the log odds ratio of being in class 1 as opposed to class 0:

$$\ln\left(\frac{Y}{1-Y}\right) = wX + b$$

By setting wX + b = z, this equation can then be solved for Y:

$$Y = \frac{e^z}{1 + e^z} = \frac{1}{1 + e^{-z}}$$

The expression $\frac{1}{1 + e^{-z}}$ is known as the sigmoid function, σ(z). Hence, the equation for logistic regression can be written as Y = σ(wX + b).

The dataset in this tutorial deals with a high-dimensional feature matrix. Therefore, the above equation must be rewritten in matrix vector form as follows:

$$Y = \sigma(Xw + b)$$

where:
- $Y_{m\times 1}$: a target vector
- $X_{m\times n}$: a feature matrix
- $w_{n\times 1}$: a weight vector
- $b$: a bias
- $\sigma$: a sigmoid function applied to each element of the output vector

Start by visualizing the sigmoid function, which transforms the linear output, (-∞, ∞), to fall between 0 and 1. The sigmoid function is available in tf.math.sigmoid.

x = tf.linspace(-10, 10, 500)
x = tf.cast(x, tf.float32)
f = lambda x : (1/20)*x + 0.6
plt.plot(x, tf.math.sigmoid(x))
plt.ylim((-0.1,1.1))
plt.title("Sigmoid function");

The log loss function

The log loss, or binary cross-entropy loss, is the ideal loss function for a binary classification problem with logistic regression. For each example, the log loss quantifies the similarity between a predicted probability and the example's true value. It is determined by the following equation:

$$L = -\frac{1}{m}\sum_{i=1}^{m}\left[y_i \cdot \log(\hat{y}_i) + (1 - y_i) \cdot \log(1 - \hat{y}_i)\right]$$

where:
- $\hat{y}$: a vector of predicted probabilities
- $y$: a vector of true targets

You can use the tf.nn.sigmoid_cross_entropy_with_logits function to compute the log loss.
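It may help to see that this fused TensorFlow op matches the equation above; a minimal sketch with made-up labels and logits:

```python
import tensorflow as tf

# Made-up true labels and raw model outputs (logits), for illustration only.
y = tf.constant([0.0, 1.0, 1.0, 0.0])
z = tf.constant([-1.2, 0.8, 2.0, -0.3])

# Log loss computed directly from the equation, via probabilities.
p = tf.math.sigmoid(z)
manual = -tf.reduce_mean(y * tf.math.log(p) + (1.0 - y) * tf.math.log(1.0 - p))

# Fused op that works on logits and is more numerically stable.
fused = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=z))

print(abs(manual.numpy() - fused.numpy()))  # agrees to within float precision
```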
This function automatically applies the sigmoid activation to the regression output:

def log_loss(y_pred, y):
  # Compute the log loss function
  ce = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=y_pred)
  return tf.reduce_mean(ce)

The gradient descent update rule

The TensorFlow Core APIs support automatic differentiation with tf.GradientTape. If you are curious about the mathematics behind the logistic regression gradient updates, here is a short explanation:

In the above equation for the log loss, recall that each $\hat{y}_i$ can be rewritten in terms of the inputs as $\sigma(X_i w + b)$. The goal is to find a $w$ and $b$ that minimize the log loss:

$$L = -\frac{1}{m}\sum_{i=1}^{m}\left[y_i \cdot \log(\sigma(X_i w + b)) + (1 - y_i) \cdot \log(1 - \sigma(X_i w + b))\right]$$

Taking the gradient of L with respect to w, you get:

$$\frac{\partial L}{\partial w} = \frac{1}{m}(\sigma(Xw + b) - y)X$$

Taking the gradient of L with respect to b, you get:

$$\frac{\partial L}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\sigma(X_i w + b) - y_i$$

Now, build the logistic regression model.

class LogisticRegression(tf.Module):
  def __init__(self):
    self.built = False

  def __call__(self, x, train=True):
    # Initialize the model parameters on the first call
    if not self.built:
      # Randomly generate the weights and the bias term
      rand_w = tf.random.uniform(shape=[x.shape[-1], 1], seed=22)
      rand_b = tf.random.uniform(shape=[], seed=22)
      self.w = tf.Variable(rand_w)
      self.b = tf.Variable(rand_b)
      self.built = True
    # Compute the model output
    z = tf.add(tf.matmul(x, self.w), self.b)
    z = tf.squeeze(z, axis=1)
    if train:
      return z
    return tf.sigmoid(z)

To verify, make sure the untrained model outputs values in the range of (0, 1) for a small subset of the training data.
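The gradient formulas above can be sanity-checked against tf.GradientTape; a minimal sketch on small synthetic data (the shapes and seed are arbitrary):

```python
import tensorflow as tf

tf.random.set_seed(0)
m, n = 8, 3  # small synthetic problem, for illustration
X = tf.random.normal([m, n])
y = tf.cast(tf.random.uniform([m]) > 0.5, tf.float32)
w = tf.Variable(tf.random.normal([n, 1]))
b = tf.Variable(0.1)

with tf.GradientTape() as tape:
    z = tf.squeeze(X @ w, axis=1) + b
    loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=z))
grad_w, grad_b = tape.gradient(loss, [w, b])

# Analytic gradients from the derivation above.
err = tf.math.sigmoid(z) - y                            # shape [m]
grad_w_analytic = tf.transpose(X) @ tf.expand_dims(err, 1) / m
grad_b_analytic = tf.reduce_mean(err)

print(tf.reduce_max(tf.abs(grad_w - grad_w_analytic)).numpy())  # close to 0
print(abs(grad_b.numpy() - grad_b_analytic.numpy()))            # close to 0
```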
log_reg = LogisticRegression()
y_pred = log_reg(x_train_norm[:5], train=False)
y_pred.numpy()

array([0.9994985 , 0.9978607 , 0.29620072, 0.01979049, 0.3314926 ], dtype=float32)

Next, write an accuracy function that computes the proportion of correct classifications during training. In order to retrieve the classifications from the predicted probabilities, set a threshold for which all probabilities higher than the threshold belong to class 1. This is a configurable hyperparameter that can be set to 0.5 as a default.

def predict_class(y_pred, thresh=0.5):
  # Return a tensor with `1` if `y_pred` > `0.5`, and `0` otherwise
  return tf.cast(y_pred > thresh, tf.float32)

def accuracy(y_pred, y):
  # Return the proportion of matches between `y_pred` and `y`
  y_pred = tf.math.sigmoid(y_pred)
  y_pred_class = predict_class(y_pred)
  check_equal = tf.cast(y_pred_class == y, tf.float32)
  acc_val = tf.reduce_mean(check_equal)
  return acc_val

Train the model

Using mini-batches for training provides both memory efficiency and faster convergence. The tf.data.Dataset API has useful functions for batching and shuffling, and enables you to build complex input pipelines from simple, reusable pieces.

batch_size = 64
train_dataset = tf.data.Dataset.from_tensor_slices((x_train_norm, y_train))
train_dataset = train_dataset.shuffle(buffer_size=x_train.shape[0]).batch(batch_size)
test_dataset = tf.data.Dataset.from_tensor_slices((x_test_norm, y_test))
test_dataset = test_dataset.shuffle(buffer_size=x_test.shape[0]).batch(batch_size)

Now, write a training loop for the logistic regression model. The loop utilizes the log loss function and its gradients with respect to the input in order to iteratively update the model's parameters.
# Set training parameters
epochs = 200
learning_rate = 0.01
train_losses, test_losses = [], []
train_accs, test_accs = [], []

# Set up the training loop and begin training
for epoch in range(epochs):
  batch_losses_train, batch_accs_train = [], []
  batch_losses_test, batch_accs_test = [], []

  # Iterate over the training data
  for x_batch, y_batch in train_dataset:
    with tf.GradientTape() as tape:
      y_pred_batch = log_reg(x_batch)
      batch_loss = log_loss(y_pred_batch, y_batch)
    batch_acc = accuracy(y_pred_batch, y_batch)
    # Update the parameters with respect to the gradient calculations
    grads = tape.gradient(batch_loss, log_reg.variables)
    for g, v in zip(grads, log_reg.variables):
      v.assign_sub(learning_rate * g)
    # Keep track of batch-level training performance
    batch_losses_train.append(batch_loss)
    batch_accs_train.append(batch_acc)

  # Iterate over the testing data
  for x_batch, y_batch in test_dataset:
    y_pred_batch = log_reg(x_batch)
    batch_loss = log_loss(y_pred_batch, y_batch)
    batch_acc = accuracy(y_pred_batch, y_batch)
    # Keep track of batch-level testing performance
    batch_losses_test.append(batch_loss)
    batch_accs_test.append(batch_acc)

  # Keep track of epoch-level model performance
  train_loss, train_acc = tf.reduce_mean(batch_losses_train), tf.reduce_mean(batch_accs_train)
  test_loss, test_acc = tf.reduce_mean(batch_losses_test), tf.reduce_mean(batch_accs_test)
  train_losses.append(train_loss)
  train_accs.append(train_acc)
  test_losses.append(test_loss)
  test_accs.append(test_acc)
  if epoch % 20 == 0:
    print(f"Epoch: {epoch}, Training log loss: {train_loss:.3f}")

Epoch: 0, Training log loss: 0.661
Epoch: 20, Training log loss: 0.418
Epoch: 40, Training log loss: 0.269
Epoch: 60, Training log loss: 0.178
Epoch: 80, Training log loss: 0.137
Epoch: 100, Training log loss: 0.116
Epoch: 120, Training log loss: 0.106
Epoch: 140, Training log loss: 0.096
Epoch: 160, Training log loss: 0.094
Epoch: 180, Training log loss: 0.089

Performance evaluation

Observe the changes in your model's loss and accuracy over time.

plt.plot(range(epochs), train_losses, label = "Training loss")
plt.plot(range(epochs), test_losses, label = "Testing loss")
plt.xlabel("Epoch")
plt.ylabel("Log loss")
plt.legend()
plt.title("Log loss vs training iterations");

plt.plot(range(epochs), train_accs, label = "Training accuracy")
plt.plot(range(epochs), test_accs, label = "Testing accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy (%)")
plt.legend()
plt.title("Accuracy vs training iterations");

print(f"Final training log loss: {train_losses[-1]:.3f}")
print(f"Final testing log Loss: {test_losses[-1]:.3f}")

Final training log loss: 0.089
Final testing log Loss: 0.077

print(f"Final training accuracy: {train_accs[-1]:.3f}")
print(f"Final testing accuracy: {test_accs[-1]:.3f}")

Final training accuracy: 0.968
Final testing accuracy: 0.979

The model demonstrates a high accuracy and a low loss when it comes to classifying tumors in the training dataset, and it also generalizes well to the unseen test data. To go one step further, you can explore error rates that give more insight beyond the overall accuracy score. The two most popular error rates for binary classification problems are the false positive rate (FPR) and the false negative rate (FNR). For this problem, the FPR is the proportion of malignant tumor predictions among tumors that are actually benign. Conversely, the FNR is the proportion of benign tumor predictions among tumors that are actually malignant.
Compute a confusion matrix using sklearn.metrics.confusion_matrix, which evaluates the accuracy of the classification, and use matplotlib to display the matrix:

def show_confusion_matrix(y, y_classes, typ):
  # Compute the confusion matrix and normalize it
  plt.figure(figsize=(10,10))
  confusion = sk_metrics.confusion_matrix(y.numpy(), y_classes.numpy())
  confusion_normalized = confusion / confusion.sum(axis=1, keepdims=True)
  axis_labels = range(2)
  ax = sns.heatmap(
      confusion_normalized, xticklabels=axis_labels, yticklabels=axis_labels,
      cmap='Blues', annot=True, fmt='.4f', square=True)
  plt.title(f"Confusion matrix: {typ}")
  plt.ylabel("True label")
  plt.xlabel("Predicted label")

y_pred_train, y_pred_test = log_reg(x_train_norm, train=False), log_reg(x_test_norm, train=False)
train_classes, test_classes = predict_class(y_pred_train), predict_class(y_pred_test)

show_confusion_matrix(y_train, train_classes, 'Training')
show_confusion_matrix(y_test, test_classes, 'Testing')

In many medical testing scenarios such as cancer detection, having a high false positive rate to ensure a low false negative rate is perfectly acceptable and in fact encouraged, since the risk of missing a malignant tumor diagnosis (false negative) is a lot worse than misclassifying a benign tumor as malignant (false positive).

In order to control for the FPR and FNR, try changing the threshold hyperparameter before classifying the probability predictions. A lower threshold increases the model's overall chances of making a malignant tumor classification. This inevitably increases the number of false positives and the FPR, but it also helps to decrease the number of false negatives and the FNR.
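The effect of the threshold on the two error rates can be illustrated with a small sketch; the probabilities and labels below are made up for illustration and do not come from the model above:

```python
import numpy as np

# Hypothetical predicted probabilities and true labels (1 = malignant).
probs  = np.array([0.05, 0.20, 0.35, 0.48, 0.52, 0.70, 0.85, 0.95])
y_true = np.array([0,    0,    1,    0,    1,    0,    1,    1])

def fpr_fnr(probs, y_true, thresh):
    # FPR: benign tumors predicted malignant; FNR: malignant tumors predicted benign.
    pred = (probs > thresh).astype(int)
    fp = np.sum((pred == 1) & (y_true == 0))
    fn = np.sum((pred == 0) & (y_true == 1))
    return fp / np.sum(y_true == 0), fn / np.sum(y_true == 1)

for t in (0.25, 0.50, 0.75):
    fpr, fnr = fpr_fnr(probs, y_true, t)
    print(f"threshold={t:.2f}  FPR={fpr:.2f}  FNR={fnr:.2f}")
# Lowering the threshold raises the FPR and lowers the FNR, and vice versa.
```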
Save the model

Start by making an export module that takes in raw data and performs the following operations:
- Normalization
- Probability prediction
- Class prediction

class ExportModule(tf.Module):
  def __init__(self, model, norm_x, class_pred):
    # Initialize pre- and post-processing functions
    self.model = model
    self.norm_x = norm_x
    self.class_pred = class_pred

  @tf.function(input_signature=[tf.TensorSpec(shape=[None, None], dtype=tf.float32)])
  def __call__(self, x):
    # Run the `ExportModule` for new data points
    x = self.norm_x.norm(x)
    y = self.model(x, train=False)
    y = self.class_pred(y)
    return y

log_reg_export = ExportModule(model=log_reg, norm_x=norm_x, class_pred=predict_class)

If you want to save the model in its current state, you can do so with tf.saved_model.save. To load a saved model and make predictions, use the tf.saved_model.load function.

models = tempfile.mkdtemp()
save_path = os.path.join(models, 'log_reg_export')
tf.saved_model.save(log_reg_export, save_path)

INFO:tensorflow:Assets written to: /tmpfs/tmp/tmp9k_sar52/log_reg_export/assets

log_reg_loaded = tf.saved_model.load(save_path)
test_preds = log_reg_loaded(x_test)
test_preds[:10].numpy()

array([1., 1., 1., 1., 0., 1., 1., 1., 1., 1.], dtype=float32)

Conclusion

This notebook introduced a few techniques to handle a logistic regression problem. Here are a few more tips that may help:
- The TensorFlow Core APIs can be used to build machine learning workflows with high levels of configurability.
- Analyzing error rates is a great way to gain more insight about a classification model's performance beyond its overall accuracy score.
- Overfitting is another common problem for logistic regression models, though it wasn't a problem for this tutorial.
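If overfitting were to appear, one common mitigation is adding an L2 penalty on the weights to the log loss; a minimal sketch (the function name and the lam coefficient are illustrative, not part of the tutorial's code):

```python
import tensorflow as tf

def regularized_log_loss(y_pred, y, w, lam=0.01):
    # Binary cross-entropy plus an L2 penalty that discourages large weights.
    ce = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=y_pred))
    return ce + lam * tf.reduce_sum(tf.square(w))

# The penalty adds a nonnegative term, so the regularized loss is never
# smaller than the plain log loss (equal only when the weights are all zero).
y = tf.constant([0.0, 1.0])
z = tf.constant([0.3, -0.2])
w = tf.constant([[0.5], [-1.0]])
plain = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=z))
print(float(regularized_log_loss(z, y, w) - plain) > 0.0)  # True
```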
Visit the Overfit and underfit tutorial for more help with this.

For more examples of using the TensorFlow Core APIs, check out the guide. If you want to learn more about loading and preparing data, see the tutorials on image data loading or CSV data loading.

Originally published on the TensorFlow website, this article appears here under a new headline and is licensed under CC BY 4.0.