Search your (large) documents in seconds\n----------------------------------------\n\n!(https://hackernoon.com/hn-images/1*kcT9fDLJC4UxUDu2VwRcAg.png)\n\nBack in May 2017, I was in the exam period of my 3rd year at college. Good preparation required me to go through many assignments and quizzes and answer multiple-choice or open questions. Most of the answers could be found in the lecturer’s notes. So, I found myself **reading tens of pages multiple times**, only to find the exact answer somewhere in the notes. And I was not the only one doing this.\n\n!(https://hackernoon.com/hn-images/1*hCiN2ZBEyMaNaFFAjvVnxg.png)\n\nAn ordinary student preparing for exams.\n\nThat experience made me wonder: **Is there a more effective way?**\n\nFast-forward a couple of months, and I decided to kick off my Artificial Intelligence journey. I started [learning](https://hackernoon.com/tagged/learning) how some of the most powerful state-of-the-art algorithms work and, eventually, began documenting my knowledge in a [weekly blog](https://medium.com/@simonnoff).\n\nSeeing the rapid [improvement](https://hackernoon.com/tagged/improvement) in deep learning and natural language processing, I decided to **solve the problem that many students face during their college days.**\n\nArmed with my recent learnings and a passion for creating valuable products, I built [_Asknote_](https://asknote.herokuapp.com/).\n\n!(https://hackernoon.com/hn-images/1*i92fLoDbLATNB0UMogsg7Q.png)\n\n### How does it work?\n\n_Asknote_ is a web application that lets you find the answers you need within any document. Here is how it works:\n\n#### **#1. Enter your email**\n\nFirst, you need to enter your email.\n\n!(https://hackernoon.com/hn-images/1*EvFxr_dDcm7_ibT8KRsHEg.gif)\n\nIt is used both when you add files and when you ask questions. This way the system identifies you, delivering only answers based on your own documents.\n\n_Important: always keep your email in the field. 
Otherwise, you will receive an error message._\n\n#### **#2. Upload your PDF file**\n\nNext, you need to add a PDF file of your choice. Just click _Choose File_, followed by _Upload PDF file_.\n\n!(https://hackernoon.com/hn-images/1*dracoyqfxKBnwKSEh4wl-w.gif)\n\n_Important: after uploading the file, wait until you see a success message on the screen. That confirms your document was processed successfully._\n\n#### **#3. Ask your question**\n\nAfter completing the steps above, just fill in your question and click _Ask_.\n\n!(https://hackernoon.com/hn-images/1*62JuhgUVEdsEj7M8XrID7g.gif)\n\nYou should enter a question for which **you expect the answer to be hidden somewhere in the documents.**\n\nYou will receive the top 5 most relevant answers with their associated page number and file name. This way you can write down the correctly predicted result and read only the respective page.\n\n#### **#4. Repeat**\n\nRepeat steps **#2** and **#3** as many times as you wish. Your answers will be extracted after a careful analysis of all files.\n\n!(https://hackernoon.com/hn-images/1*CfzVsCQMNrqlqtid4YXXkA.png)\n\n### How was it made?\n\nGreat! Now you fully understand how to use _Asknote_.\n\nIn this section, I want to share with you some insights into the process of building this application.\n\n#### #1. Answer anything\n\n!(https://hackernoon.com/hn-images/1*0gaT0zMWzrJxVBr9sbiwGA.png)\n\nThe very first challenge I faced was how to **predict an answer** based on a given passage and a question.\n\nI immediately dived into extensive research to find the most recent AI improvements in this field. I came across an amazing [leaderboard](https://rajpurkar.github.io/SQuAD-explorer/) which ranks the competing algorithms. 
Essentially, each one of them takes a question and a passage, represented as vectors, and outputs the start and end positions of the answer.\n\nDue to similarities in their implementation, I decided to first understand the math behind them. This led me to write an [article](https://towardsdatascience.com/how-the-current-best-question-answering-model-works-8bbacf375e2a) explaining the best model.\n\nFor _Asknote_, I chose to go with the second-best model (by _Allen Institute for Artificial Intelligence_) because its performance is excellent for the use case.\n\n#### #2. Let’s do files, not text\n\n!(https://hackernoon.com/hn-images/1*CACzf4nI3TxLDGlarqaryQ.png)\n\nSo far so good! I was receiving the correct answers and the system was working. But, as a student, _I couldn’t imagine myself copying and pasting chunks of text to search for an answer._ Millennials, like me, can get really impatient when dealing with technology, so everything needs to happen fast and with no more than 2–3 actions.\n\nThe process needed to be simpler.\n\nTo achieve that, I introduced **PDF uploads.** The system accepts your files and analyzes them to deliver the best results in the end. I wanted a reliable and stable solution, so I spent a decent amount of time perfecting the backend to perform the operations concurrently. That improved the file upload experience significantly.\n\nThere are still some drawbacks which I want to point out. I did my best to avoid them but, I guess, they will remain for version 0.1:\n\n* Currently, you can add only PDF files. Any other format will be rejected. _(I am planning on supporting Word docs, webpages or images in the upcoming weeks)_\n* You can upload a single file at a time. To add multiple files, upload them one by one. _(Don’t worry, your answers will be searched for among all of the files)_\n* Images with text, within a PDF, won’t be processed. 
This means that you can’t find answers based on the information inside the images.\n* Large files (100+ pages) may sometimes take longer (5–15 sec). _(I know this is sometimes annoying, so I will work on improving it in the next version)_\n\n#### #3. Different users\n\n!(https://hackernoon.com/hn-images/1*0S2suZANPtGnp33UNsT6Rg.png)\n\nI had the idea of different users at the start of the project, but didn’t know exactly how to execute it. I felt _no need for a whole authentication process,_ because it would be an additional step before experiencing the core functionality.\n\nInstead, I decided to **require an email to distinguish between different users.** The email serves as your unique key, so when you search for an answer, only answers from your own documents are displayed. This makes it possible for you to ask any question anytime, even if you reload or close the page.\n\n#### #4. Answers design\n\n!(https://hackernoon.com/hn-images/1*A2JtKD3na40yzd__5T_uaQ.png)\n\nSince the project leverages a brand-new deep learning technique, which was introduced just a couple of months ago (thus it hasn’t gone through many iterations), _one can sometimes be dissatisfied with the accuracy of the end result._\n\nMy solution is to guide the user towards the right prediction but leave them to choose the answer themselves. 
Instead of presenting the prediction with the highest probability, the system shows the **top 5 most relevant answers together with their associated page number and document name.**\n\n!(https://hackernoon.com/hn-images/1*b0oZYRGA_DlCT72bdQHI4w.png)\n\n### Let’s work together\n\nAs my knowledge of Artificial Intelligence has grown, I have come to realize that collaboration between machines and humans can result in an immense increase in our total productivity, the power which determines economic growth.\n\n_Asknote_ perfectly illustrates that partnership.\n\nThe AI system simplifies one’s search by providing several recommendations, leaving the human to make the final decision. That makes it possible to solve complex problems such as **answering any question in a matter of seconds.**\n\n_The project benefits me a lot. I hope you also find it useful. Excited to hear your thoughts._\n\n#### Thank you for reading. If you enjoyed the article, give it some claps 👏. Hope you have a great day!