paint-brush
A Practical Approach to Novel Class Discovery in Tabular Data: Full training procedureby@dataology

A Practical Approach to Novel Class Discovery in Tabular Data: Full training procedure

tldt arrow

Too Long; Didn't Read

The complete training procedure for Novel Class Discovery (NCD) integrates models, hyperparameter optimization, and novel class estimation through k-fold Cross-Validation. Algorithm 1 provides a simplified overview of this complex process for better understanding.
featured image - A Practical Approach to Novel Class Discovery in Tabular Data: Full training procedure
Dataology: Study of Data in Computer Science HackerNoon profile picture

Authors:

(1) Troisemaine Colin, Department of Computer Science, IMT Atlantique, Brest, France., and Orange Labs, Lannion, France;

(2) Reiffers-Masson Alexandre, Department of Computer Science, IMT Atlantique, Brest, France.;

(3) Gosselin Stephane, Orange Labs, Lannion, France;

(4) Lemaire Vincent, Orange Labs, Lannion, France;

(5) Vaton Sandrine, Department of Computer Science, IMT Atlantique, Brest, France.

Abstract and Intro

Related work

Approaches

Hyperparameter optimization

Estimating the number of novel classes

Full training procedure

Experiments

Conclusion

Declarations

References

Appendix A: Additional result metrics

Appendix B: Hyperparameters

Appendix C: Cluster Validity Indices numerical results

Appendix D: NCD k-means centroids convergence study

6 Full training procedure

In the previous sections, we presented the models, the hyperparameter optimization and the estimation procedure of the number of novel classes independently. In this section, these components are brought together to form a complete training procedure. To ensure that no prior knowledge about the novel classes is ever used in this process, the number of novel classes is naturally estimated during the k-fold CV introduced in Section 4. As the whole process is quite complex, we try to summarize it in clear terms in this section and in Algorithm 1.




This paper is available on arxiv under CC 4.0 license.