Machine Learning is at the forefront of advances in Artificial Intelligence, and the field moves fast, with new research coming out every day. This post continues the series of important concepts and notes, from the basics to the advanced, drawn from the book Machine Learning by Tom M. Mitchell.
For Machine Learning Notes 1, please click the link below.
Machine Learning Notes 1 — From Machine Learning by Tom M. Mitchell (hackernoon.com)
Concept learning is a problem of searching through a predefined space of potential hypotheses for the hypothesis that best fits the training examples.
It amounts to inferring a Boolean-valued function from training examples of its input and output.
The inductive learning hypothesis: any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over unobserved examples.
FIND-S: an algorithm to find the most specific hypothesis that fits all the positive training examples.
The algorithm considers only positive instances; negative instances are ignored.
In general, whenever an attribute takes more than one value across the positive instances, its constraint is replaced with the general term <?>, which fits both values.
Algorithm:
1. Initialize h to the most specific hypothesis, one that does not allow any value to fit:
h ← <ϕ, ϕ, ϕ, ϕ, ϕ>
2. For each positive training instance x, and for each attribute constraint aᵢ in h: if x satisfies the constraint, do nothing; otherwise replace aᵢ by the next more general constraint that x satisfies.
3. Output hypothesis h.
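To make the procedure concrete, here is a minimal Python sketch of FIND-S for a conjunctive attribute-value representation. The EnjoySport-style training rows and the 'phi'/'?' markers are illustrative assumptions of this sketch, not code from the book.

```python
# A minimal sketch of FIND-S for conjunctive attribute-value hypotheses.
# 'phi' marks the maximally specific constraint, '?' the maximally general one.

def find_s(examples):
    """Return the most specific hypothesis consistent with the positive examples."""
    n = len(examples[0][0])
    h = ['phi'] * n                  # start with the most specific hypothesis
    for attributes, label in examples:
        if label != 'yes':           # FIND-S ignores negative instances
            continue
        for i, value in enumerate(attributes):
            if h[i] == 'phi':        # first positive example: copy its values
                h[i] = value
            elif h[i] != value:      # conflicting values generalize to '?'
                h[i] = '?'
    return h

# Illustrative EnjoySport-style training data (assumed, not from the post).
examples = [
    (('sunny', 'warm', 'normal', 'strong', 'warm', 'same'), 'yes'),
    (('sunny', 'warm', 'high',   'strong', 'warm', 'same'), 'yes'),
    (('rainy', 'cold', 'high',   'strong', 'warm', 'change'), 'no'),
    (('sunny', 'warm', 'high',   'strong', 'cool', 'change'), 'yes'),
]

print(find_s(examples))  # ['sunny', 'warm', '?', 'strong', '?', '?']
```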
CANDIDATE-ELIMINATION: provides us with both the most specific and the most general hypotheses consistent with the training examples.
It takes both positive and negative instances into consideration.
It returns the general boundary (G) and the specific boundary (S) of the version space.
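Below is a hedged Python sketch of the CANDIDATE-ELIMINATION idea for the same conjunctive representation: positive examples generalize S and prune G, negative examples specialize G and prune S. The helper names and the way attribute domains are passed in are assumptions of this sketch.

```python
def covers(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(c == '?' or c == v for c, v in zip(h, x))

def more_general(h1, h2):
    """True if h1 is more general than or equal to h2 ('phi' is most specific)."""
    return all(a == '?' or a == b or b == 'phi' for a, b in zip(h1, h2))

def candidate_elimination(examples, domains):
    n = len(domains)
    S = {('phi',) * n}                      # specific boundary
    G = {('?',) * n}                        # general boundary
    for x, label in examples:
        if label == 'yes':
            G = {g for g in G if covers(g, x)}
            new_S = set()
            for s in S:
                if covers(s, x):
                    new_S.add(s)
                else:
                    # unique minimal generalization in this representation
                    h = tuple(v if c == 'phi' else (c if c == v else '?')
                              for c, v in zip(s, x))
                    if any(more_general(g, h) for g in G):
                        new_S.add(h)
            S = new_S
        else:
            S = {s for s in S if not covers(s, x)}
            new_G = set()
            for g in G:
                if not covers(g, x):
                    new_G.add(g)
                    continue
                # minimal specializations of g that exclude the negative x
                for i, c in enumerate(g):
                    if c != '?':
                        continue
                    for v in domains[i]:
                        if v != x[i]:
                            h = g[:i] + (v,) + g[i + 1:]
                            if any(more_general(h, s) for s in S):
                                new_G.add(h)
            # keep only the maximally general members
            G = {g for g in new_G
                 if not any(h != g and more_general(h, g) for h in new_G)}
    return S, G

# Reusing the EnjoySport-style `examples` from the FIND-S sketch above:
domains = [sorted({x[i] for x, _ in examples})
           for i in range(len(examples[0][0]))]
S, G = candidate_elimination(examples, domains)
print(S)  # {('sunny', 'warm', '?', 'strong', '?', '?')}
print(G)  # {('sunny', '?', '?', '?', '?', '?'), ('?', 'warm', '?', '?', '?', '?')}
```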
Inductive bias concerns the unobserved instances: the assumptions that determine how the hypothesis behaves on instances it has never seen.
The reason a hypothesis might not work perfectly on the unobserved instances in the test data is simply that it never encountered them during training.
To overcome that we have the UNBIASED LEARNER.
The idea is that the hypothesis space should be able to represent every possible subset of the instance space X, i.e., the power set of X. Such an unbiased learner, however, cannot generalize beyond the observed examples: each training instance merely updates the learned function, V(h) ← h′, where h′ is the hypothesis state after absorbing that instance, and nothing constrains how the unobserved instances are classified.
Decision tree learning is a method for approximating discrete-valued target functions that is robust to noisy data and capable of learning disjunctive expressions; it is a practical method for inductive inference.
A decision tree classifies instances by sorting them down the tree from the root to some leaf node, which provides the classification of the instance. Each node in the tree specifies a test of some attribute of the instance, and each branch descending from that node corresponds to one of the possible values of the attribute.
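As a small illustration, here is one possible way to encode such a tree and sort an instance down it in Python. The nested-tuple node layout is an assumption of this sketch; the tree itself follows Mitchell's PlayTennis example.

```python
# A minimal sketch of classifying an instance by sorting it down a tree.
# Internal nodes are (attribute, branches); branches map values to subtrees;
# leaves are plain classification labels.

tree = ('Outlook', {
    'sunny':    ('Humidity', {'high': 'no', 'normal': 'yes'}),
    'overcast': 'yes',
    'rain':     ('Wind', {'strong': 'no', 'weak': 'yes'}),
})

def classify(node, instance):
    """Follow the branches matching the instance's attribute values to a leaf."""
    while isinstance(node, tuple):          # internal node: test an attribute
        attribute, branches = node
        node = branches[instance[attribute]]
    return node                             # leaf: the classification

print(classify(tree, {'Outlook': 'sunny', 'Humidity': 'normal', 'Wind': 'weak'}))
# -> 'yes'
```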
Information gain is the measure of how well a given attribute separates the training examples according to their target classification.
In order to define information gain precisely, we first define entropy, which characterizes the impurity of an arbitrary collection of examples. For a boolean classification:
Entropy(S) = −p⊕ log₂ p⊕ − p⊖ log₂ p⊖
More generally, when the target attribute can take on c different values:
Entropy(S) = Σᵢ₌₁..c −pᵢ log₂ pᵢ
The measure of the effectiveness of an attribute is its information gain, the expected reduction in entropy caused by partitioning the examples according to that attribute:
Gain(S, A) = Entropy(S) − Σ_{v ∈ Values(A)} (|Sᵥ| / |S|) · Entropy(Sᵥ)
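These two formulas translate directly into code. The sketch below computes entropy and information gain over examples represented as dicts; the data rows and field names (e.g. 'PlayTennis') are illustrative assumptions.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(S) = sum over classes of -p_i * log2(p_i)."""
    total = len(labels)
    return -sum((count / total) * log2(count / total)
                for count in Counter(labels).values())

def information_gain(examples, attribute, target):
    """Gain(S, A) = Entropy(S) - sum_v (|S_v|/|S|) * Entropy(S_v)."""
    labels = [row[target] for row in examples]
    gain = entropy(labels)
    for value in {row[attribute] for row in examples}:
        subset = [row[target] for row in examples if row[attribute] == value]
        gain -= (len(subset) / len(examples)) * entropy(subset)
    return gain

# Illustrative rows in the spirit of the PlayTennis data (assumed values).
data = [
    {'Wind': 'weak',   'PlayTennis': 'yes'},
    {'Wind': 'weak',   'PlayTennis': 'yes'},
    {'Wind': 'strong', 'PlayTennis': 'no'},
    {'Wind': 'weak',   'PlayTennis': 'no'},
]
print(information_gain(data, 'Wind', 'PlayTennis'))  # ~0.311
```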
ID3's inductive bias is a preference for shorter trees over larger ones, and for trees that place attributes with high information gain closest to the root.
Issues: avoiding overfitting the data.
This can be avoided in two major ways: approaches that stop growing the tree early, before it perfectly classifies the training data, and approaches that allow the tree to overfit the data and then post-prune it.
Whichever approach is taken, a key question is how to determine the correct final tree size; a common answer is to use a separate validation set of examples, distinct from the training set, to evaluate the effect of pruning.
Pruning: pruning a decision node consists of removing the subtree rooted at that node, making it a leaf and assigning it the most common classification of the training examples affiliated with that node. Nodes are removed only if the resulting pruned tree performs no worse than the original over the validation set.
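A rough sketch of this reduced-error pruning over the nested-tuple trees used above follows. It assumes each internal node carries a third field, the majority label of the training examples at that node (so a pruned node can become a leaf); that extra field is an assumption of this sketch, not a fixed representation from the book.

```python
# Sketch of reduced-error pruning. Nodes here are
# (attribute, branches, majority_label); a pruned node collapses to its
# majority label.

def classify(node, instance):
    """Sort an instance down a 3-field tree to a leaf label."""
    while isinstance(node, tuple):
        attribute, branches, _majority = node
        node = branches[instance[attribute]]
    return node

def accuracy(tree, validation):
    """Fraction of (instance, label) pairs the tree classifies correctly."""
    return sum(classify(tree, x) == y for x, y in validation) / len(validation)

def internal_paths(node, path=()):
    """Yield the branch-value path to every internal node, root first."""
    if isinstance(node, tuple):
        yield path
        _attribute, branches, _majority = node
        for value, child in branches.items():
            yield from internal_paths(child, path + (value,))

def prune_at(node, path):
    """Return a copy of the tree with the node at `path` turned into a leaf."""
    attribute, branches, majority = node
    if not path:
        return majority                 # collapse subtree to its majority label
    new_branches = dict(branches)
    new_branches[path[0]] = prune_at(branches[path[0]], path[1:])
    return (attribute, new_branches, majority)

def reduced_error_prune(tree, validation):
    """Greedily prune any node whose removal does not hurt validation accuracy."""
    pruned = True
    while pruned and isinstance(tree, tuple):
        pruned = False
        base = accuracy(tree, validation)
        for path in internal_paths(tree):
            candidate = prune_at(tree, path)
            if accuracy(candidate, validation) >= base:
                tree, pruned = candidate, True
                break
    return tree
```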
Remember to give this post some 👏 if you liked it. Follow me for more content.
Visit my github repo:
sahilverma0696 (github.com) — Python and C++ developer, Machine Learning enthusiast. sahilverma0696.github.io