
Future Work and Conclusion, Limitations, Ethics Statement, and References


Authors:

(1) Kinjal Basu, IBM Research;

(2) Keerthiram Murugesan, IBM Research;

(3) Subhajit Chaudhury, IBM Research;

(4) Murray Campbell, IBM Research;

(5) Kartik Talamadupula, Symbl.ai;

(6) Tim Klinger, IBM Research.

Table of Links

Abstract and 1 Introduction

2 Background

3 Symbolic Policy Learner

3.1 Learning Symbolic Policy using ILP

3.2 Exception Learning

4 Rule Generalization

4.1 Dynamic Rule Generalization

5 Experiments and Results

5.1 Dataset

5.2 Experiments

5.3 Results

6 Related Work

7 Future Work and Conclusion, Limitations, Ethics Statement, and References

7 Future Work and Conclusion

In this paper, we propose EXPLORER, a neuro-symbolic agent that demonstrates how symbolic and neural modules can collaborate in a text-based RL environment. We also present a novel information-gain-based rule generalization algorithm. Our approach not only achieves promising results on the TW-Cooking and TWC games but also produces interpretable and transferable policies. Our experiments show that excessive reliance on the symbolic module and overly aggressive generalization are not always beneficial, so our next objective is to develop an optimal strategy for switching between the neural and symbolic modules to further enhance performance.
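The switching strategy mentioned above can be sketched in a few lines. This is purely an illustrative assumption, not the paper's implementation: the policy callables, the confidence score, and the threshold are all hypothetical names introduced here to make the idea concrete.

```python
# Hypothetical sketch of neural/symbolic module switching (an assumption,
# not EXPLORER's actual mechanism). Each policy is modelled as a callable;
# the symbolic policy returns (action, confidence), where action is None
# when no symbolic rule covers the current state.

def select_action(state, symbolic_policy, neural_policy, conf_threshold=0.7):
    """Use the symbolic module when a rule fires confidently,
    otherwise fall back to the neural module."""
    action, confidence = symbolic_policy(state)
    if action is not None and confidence >= conf_threshold:
        return action            # a symbolic rule covers this state
    return neural_policy(state)  # defer to the neural agent
```

One natural refinement, hinted at by the finding that heavy reliance on either module hurts, is to make `conf_threshold` itself learnable rather than fixed.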

Limitations

One limitation of the EXPLORER model is its computation time, which is longer than that of a purely neural agent. EXPLORER takes more time because it uses an ASP solver and symbolic rules, which involve multiple file-processing tasks. However, the neuro-symbolic agent converges faster during training, which reduces the total number of steps needed and thereby narrows the computation-time gap between the neural and neuro-symbolic agents.
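One generic way to blunt this kind of per-step solver cost, sketched below as an assumption rather than anything the paper does, is to memoize solver answers keyed by the symbolic state, so that revisiting an identical state skips the external ASP solver and its file I/O entirely. The solver call here is a stand-in stub.

```python
# Illustrative memoization sketch (not from the paper): cache answers keyed
# by a hashable frozenset of ground facts, so repeated states avoid the
# expensive write-program-to-file-and-invoke-solver round trip.
from functools import lru_cache

@lru_cache(maxsize=4096)
def solve_cached(state_facts: frozenset) -> tuple:
    # Stand-in for the real ASP solver invocation; here we simply select
    # the "can_*" facts to give the stub a deterministic, testable output.
    return tuple(sorted(f for f in state_facts if f.startswith("can_")))
```

Since `lru_cache` only works on hashable arguments, the agent's state facts must be frozen (e.g. a `frozenset` of ground atoms) before lookup.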

Ethics Statement

In this paper, we propose a neuro-symbolic approach for text-based games that generates interpretable symbolic policies, allowing for transparent analysis of the model’s outputs. Unlike deep neural models, which can exhibit language biases and generate harmful content such as hate speech or racial biases, neuro-symbolic approaches like ours are more effective at identifying and mitigating unethical outputs. The outputs of our model are limited to a list of permissible actions based on a peer-reviewed and publicly available dataset, and we use WordNet, a widely recognized and officially maintained knowledge base for NLP, as our external knowledge source. As a result, the ethical risks associated with our approach are low.

References

Ashutosh Adhikari, Xingdi Yuan, Marc-Alexandre Côté, Mikuláš Zelinka, Marc-Antoine Rondeau, Romain Laroche, Pascal Poupart, Jian Tang, Adam Trischler, and Will Hamilton. 2020a. Learning dynamic belief graphs to generalize on text-based games. Advances in Neural Information Processing Systems, 33:3045–3057.


Ashutosh Adhikari, Xingdi Yuan, Marc-Alexandre Côté, Mikuláš Zelinka, Marc-Antoine Rondeau, Romain Laroche, Pascal Poupart, Jian Tang, Adam Trischler, and William L Hamilton. 2020b. Learning dynamic knowledge graphs to generalize on text-based games.


Leonard Adolphs and Thomas Hofmann. 2019. Ledeepchef: Deep reinforcement learning agent for families of text-based games. ArXiv, abs/1909.01646.


Prithviraj Ammanabrolu and Mark Riedl. 2019. Playing text-adventure games with graph-based deep reinforcement learning. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3557–3565.


Joaquín Arias, Manuel Carro, Zhuo Chen, and Gopal Gupta. 2020. Justifications for goal-directed constraint answer set programming. arXiv preprint arXiv:2009.10238.


Joaquín Arias, Manuel Carro, Elmer Salazar, Kyle Marple, and Gopal Gupta. 2018. Constraint answer set programming without grounding. TPLP, 18(3-4):337–354.


Mattia Atzeni, Shehzaad Zuzar Dhuliawala, Keerthiram Murugesan, and Mrinmaya Sachan. 2022. Casebased reasoning for better generalization in textual reinforcement learning. In International Conference on Learning Representations.


Kinjal Basu, Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Kartik Talamadupula, Tim Klinger, Murray Campbell, Mrinmaya Sachan, and Gopal Gupta. 2022a. A hybrid neuro-symbolic approach for text-based games using inductive logic programming. In Combining Learning and Reasoning: Programming Languages, Formalisms, and Representations.


Kinjal Basu, Elmer Salazar, Huaduo Wang, Joaquín Arias, Parth Padalkar, and Gopal Gupta. 2022b. Symbolic reinforcement learning framework with incremental learning of rule-based policy. Proceedings of ICLP GDE, 22.


Kinjal Basu, Farhad Shakerin, and Gopal Gupta. 2020. Aqua: Asp-based visual question answering. In International Symposium on Practical Aspects of Declarative Languages, pages 57–72. Springer.


Kinjal Basu, Sarat Chandra Varanasi, Farhad Shakerin, Joaquín Arias, and Gopal Gupta. 2021. Knowledge-driven natural language understanding of english text and its applications. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 12554–12563.


Subhajit Chaudhury, Prithviraj Sen, Masaki Ono, Daiki Kimura, Michiaki Tatsubori, and Asim Munawar. 2021. Neuro-symbolic approaches for text-based policy learning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 3073–3078.


Subhajit Chaudhury, Sarathkrishna Swaminathan, Daiki Kimura, Prithviraj Sen, Keerthiram Murugesan, Rosario Uceda-Sosa, Michiaki Tatsubori, Achille Fokoue, Pavan Kapanipathi, Asim Munawar, et al. 2023. Learning symbolic rules over abstract meaning representations for textual reinforcement learning. arXiv preprint arXiv:2307.02689.


Domenico Corapi, Alessandra Russo, and Emil Lupu. 2011. Inductive logic programming in answer set programming. In International conference on inductive logic programming, pages 91–97. Springer.


Marc-Alexandre Côté, Ákos Kádár, Xingdi Yuan, Ben Kybartas, Tavian Barnes, Emery Fine, James Moore, Matthew Hausknecht, Layla El Asri, Mahmoud Adada, Wendy Tay, and Adam Trischler. 2018. Textworld: A learning environment for text-based games. CoRR, abs/1806.11532.


Richard Evans and Edward Grefenstette. 2018. Learning explanatory rules from noisy data. Journal of Artificial Intelligence Research, 61:1–64.


Michael Gelfond and Yulia Kahl. 2014. Knowledge representation, reasoning, and the design of intelligent agents: The answer-set programming approach. Cambridge University Press.


Michael Gelfond and Vladimir Lifschitz. 1988. The stable model semantics for logic programming. In ICLP/SLP, volume 88, pages 1070–1080.


Gopal Gupta, Yankai Zeng, Abhiraman Rajasekaran, Parth Padalkar, Keegan Kimbrell, Kinjal Basu, Farhad Shakerin, Elmer Salazar, and Joaquín Arias. 2023. Building intelligent systems by combining machine learning and automated commonsense reasoning. In Proceedings of the AAAI Symposium Series, volume 2, pages 272–276.


Matthew Hausknecht, Ricky Loynd, Greg Yang, Adith Swaminathan, and Jason D Williams. 2019. Nail: A general interactive fiction agent. arXiv preprint arXiv:1902.04259.


Daiki Kimura, Masaki Ono, Subhajit Chaudhury, Ryosuke Kohita, Akifumi Wachi, Don Joven Agravante, Michiaki Tatsubori, Asim Munawar, and Alexander Gray. 2021. Neuro-symbolic reinforcement learning with first-order logic. arXiv preprint arXiv:2110.10963.


Suraj Kothawade, Vinaya Khandelwal, Kinjal Basu, Huaduo Wang, and Gopal Gupta. 2021. Autodiscern: autonomous driving using common sense reasoning. arXiv preprint arXiv:2110.13606.


Mark Law, Alessandra Russo, and Krysia Broda. 2014. Inductive learning of answer set programs. In European Workshop on Logics in Artificial Intelligence, pages 311–325. Springer.


Vladimir Lifschitz. 2019. Answer set programming. Springer Heidelberg.


Daoming Lyu, Fangkai Yang, Bo Liu, and Steven Gustafson. 2019. Sdrl: interpretable and data-efficient deep reinforcement learning leveraging symbolic planning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 2970–2977.


George A Miller. 1995. Wordnet: a lexical database for english. Communications of the ACM, 38(11):39–41.


Tom Mitchell. 1997. Machine learning. McGraw Hill series in computer science. McGraw-Hill.


Arindam Mitra and Chitta Baral. 2015. Learning to automatically solve logic grid puzzles. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1023–1033.


Stephen Muggleton and Luc De Raedt. 1994. Inductive logic programming: Theory and methods. The Journal of Logic Programming, 19:629–679.


Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Pushkar Shukla, Sadhana Kumaravel, Gerald Tesauro, Kartik Talamadupula, Mrinmaya Sachan, and Murray Campbell. 2021a. Text-based rl agents with commonsense knowledge: New challenges, environments and baselines. In Thirty Fifth AAAI Conference on Artificial Intelligence.


Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Kartik Talamadupula, Mrinmaya Sachan, and Murray Campbell. 2021b. Efficient text-based reinforcement learning by jointly leveraging state and commonsense graph representations. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, volume 2, pages 719–725. Association for Computational Linguistics.


Karthik Narasimhan, Tejas Kulkarni, and Regina Barzilay. 2015. Language understanding for text-based games using deep reinforcement learning. arXiv preprint arXiv:1506.08941.


Dhruva Pendharkar, Kinjal Basu, Farhad Shakerin, and Gopal Gupta. 2022. An asp-based approach to answering natural language questions for texts. Theory and Practice of Logic Programming, 22(3):419–443.


Oliver Ray. 2009. Nonmonotonic abductive inductive learning. Journal of Applied Logic, 7(3):329–340.


Raymond Reiter. 1988. Nonmonotonic reasoning. In Exploring artificial intelligence, pages 439–481. Elsevier.


Tim Rocktäschel and Sebastian Riedel. 2017. End-to-end differentiable proving. arXiv preprint arXiv:1705.11040.


Farhad Shakerin, Elmer Salazar, and Gopal Gupta. 2017. A new algorithm to automate inductive learning of default theories. Theory and Practice of Logic Programming, 17(5-6):1010–1026.


Mohan Sridharan, Ben Meadows, and Rocio Gomez. 2017. What can i not do? towards an architecture for reasoning about and learning affordances. In Proceedings of the International Conference on Automated Planning and Scheduling, volume 27, pages 461–469.


Adam Trischler, Marc-Alexandre Côté, and Pedro Lima. 2019. First TextWorld Problems, the competition: Using text-based games to advance capabilities of AI agents.


Mathieu Tuli, Andrew C Li, Pashootan Vaezipoor, Toryn Q Klassen, Scott Sanner, and Sheila A McIlraith. 2022. Learning to follow instructions in text-based games. arXiv preprint arXiv:2211.04591.


Ruoyao Wang, Peter Jansen, Marc-Alexandre Côté, and Prithviraj Ammanabrolu. 2022. Behavior cloned transformers are neurosymbolic reasoners. arXiv preprint arXiv:2210.07382.


Fangkai Yang, Daoming Lyu, Bo Liu, and Steven Gustafson. 2018. Peorl: Integrating symbolic planning and hierarchical reinforcement learning for robust decision-making. arXiv preprint arXiv:1804.07779.


Tom Zahavy, Matan Haroush, Nadav Merlis, Daniel J Mankowitz, and Shie Mannor. 2018. Learn what not to learn: Action elimination with deep reinforcement learning. In Advances in Neural Information Processing Systems, pages 3562–3573.


Yankai Zeng, Abhiramon Rajasekharan, Parth Padalkar, Kinjal Basu, Joaquín Arias, and Gopal Gupta. 2024. Automated interactive domain-specific conversational agents that understand human dialogs. In International Symposium on Practical Aspects of Declarative Languages, pages 204–222. Springer.


This paper is available on arxiv under CC BY 4.0 DEED license.


