Understanding intent and form to develop a palpable interpretation of what’s rational and irrational.
From almost instinctual guidance, consider what ‘snaps’ your attention, what strangles your gaze.
What’s abnormal isn’t sensible. Our sense of attention fixates on the ‘distinguishable’. This beam of correlation ensures that we can function properly in new environments, extrapolate meaning in unfamiliarity and render art when seemingly handicapped. We can fill in the blanks.
Such insight allows us to bypass the more traditional method of learning: rote learning. Those who seem smarter can generally complete more with less effort, because they can fill in the blanks that others need spelt out. While others merely react to their world, there are some who can march into the unknown and return as conquerors. Clearly rote learning isn’t the only way a machine can learn.
I’m near the final stages of a personalised food-ordering app that’s taken the better part of 2.5 years to develop. Now, as I begin developing peripheral services like customer service and onboarding, I’m trying to automate as much as possible to minimise overheads. Given the nature of the app, it’s likely there’ll be high engagement between users and customer service, so any automation needs to be functional and, at a minimum, on par with a human operator. I need to build a chatbot.
And it needs to really work. In short, I want it to be revolutionary.
I’m familiar with the general ways of teaching a chatbot: broadly, leveraging NLP/NLU and parsing sentiment, sentences, intents, dialogue etc. But I’ve always found that most ML-based techniques are just rote learning on steroids, and I don’t like that.
Time and time again, programmers are told how toddlers can outsmart state-of-the-art ML algorithms and that the only way to outpace this crawler is to nuke their computer with billions of data-points. Upgrade to a new graphics card. Buy a new Intel multi-core processor. Train through a VM on AWS. The list is endless…
Yet the suggestions are still dumb. We should be focusing on understanding the fundamental rules this data obeys in an open context rather than dissecting the individual pen-strokes of the number ‘9’ in MNIST. Sure, low-level insight is needed to interpret stimuli, but the scope of ML needs to grow from simple, closed arenas to the battlefronts of the real world, where information is randomly distributed and spontaneous.
To make the first step into this gunfire, we need to change how machines ‘think’.
The idea: a network of competing logic modules that test their understanding through emulation.
Over the coming weeks I’ll be implementing this idea and once completed, I’ll return with notes.
For now, here are my general goals:
- The goal is to develop a system that can understand the intent of a user’s message without being trained on a corpus spanning millions of examples.
- The system should be able to understand associations of speech and how they precipitate actions. For example, escalating rhetoric can precipitate violent confrontation, whereas deescalation can reverse tension.
- Interpretations construct a growing sense of understanding that’s relevant to the temporal and spatial properties of the conversation. The more we hear, the more we know.
- There’s a hierarchy of information, organised with respect to scope. New information not only helps clarify this hierarchy but also allows for better search policies: because the scope of a conversation can vary, the relevance of information is dynamic and depends on the intent at hand.
As for the implementation:
- Parse a text, such as a book, speech or screenplay, and associate how dialogue leads to actions.
- Exploit the natural structure of the text to represent sequences. Higher-quality writing should provide clearer boundaries between units of meaning.
- Assess the formation and degradation of sequences across the text, with each sequence escalating or deescalating the intents that have been pre-built.
- Intents will then gradually become attributed to particular phrases. Using TF-IDF and cosine similarity, the distinctiveness of each phrase or word can be used to attribute it to an idea.
- Using this hierarchy of TF-IDF scores, a tree of intents can be developed. With this tree-like structure, we can employ a Monte Carlo search algorithm during conversations with users to understand and predict what they want to discuss.
- Training should involve emulating the derived insights and seeing how they diverge in unseen texts. That is, the modules should be able to ‘write’ their own text and then find existing texts that are similar to this written piece. This would be analogous to daydreaming/imagining etc.
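As a rough illustration of the TF-IDF and cosine-similarity step in the list above, here is a minimal sketch in plain Python. It is not the finished system: the intent names, the example phrases and the helper functions (`tokenize`, `tfidf_vectors`, `cosine`, `closest_intent`) are all hypothetical, and the smoothed IDF formula is just one common variant. In the plan above, the phrases attached to each intent would come from parsing texts rather than being typed in by hand.

```python
import math
import re
from collections import Counter

# Hypothetical intents, each described by a handful of phrases (illustrative only).
INTENT_PHRASES = {
    "track_order": "where is my order delivery late driver",
    "refund": "refund money back wrong order charged",
    "menu_help": "menu vegetarian options allergy ingredients",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed idf (one common variant): log((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (tf[t] / len(tokens)) * idf[t] for t in tf})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_intent(message, intent_phrases=INTENT_PHRASES):
    """Return (intent_name, score) for the intent most similar to the message."""
    names = list(intent_phrases)
    vectors, idf = tfidf_vectors([intent_phrases[n] for n in names])
    # Words never seen in any intent carry no signal here, so drop them.
    tf = Counter(t for t in tokenize(message) if t in idf)
    total = sum(tf.values())
    if total == 0:
        return None, 0.0
    query = {t: (c / total) * idf[t] for t, c in tf.items()}
    scores = {n: cosine(query, v) for n, v in zip(names, vectors)}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

With these toy phrases, `closest_intent("I want my money back")` picks the hypothetical `refund` intent, while a message sharing no words with any intent returns `(None, 0.0)`. The same per-intent scores could, in principle, feed the tree of intents described above, but that part is left open here.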
It’s still early days so it’s unknown how this will fare, but nonetheless I’ll be sure to write up the results once I’ve implemented the code.
Thanks for reading!