In recent years, we have seen great advances in Large Language Models (LLMs). From OpenAI’s GPT-3, which produces remarkably fluent text, to its open-source counterpart BLOOM, one impressive LLM after another has been released. Language tasks that were previously considered unsolvable now barely challenge these models.
All this progress is made possible by the huge amount of data available on the internet and by powerful GPUs. As promising as that sounds, training an LLM is a very expensive process, in terms of both data and hardware requirements. We are talking about AI models with hundreds of billions of parameters, so feeding these models enough data is no easy task. Once you do, however, their performance can be remarkable.
Have you ever wondered what the starting point of computing hardware was? Why did people put time and effort into designing and building the first computers? We can safely assume it was not to entertain people with video games or YouTube videos.
It all started with the goal of coping with the growing amount of information in science. Computers were proposed as a way to manage this growth: they would take care of routine tasks such as storage and retrieval, clearing the path to insight and decision-making in scientific reasoning. Can we really say we have achieved this, when finding the answer to a scientific question on Google is becoming more and more difficult?
Moreover, the sheer number of scientific papers published daily is far beyond what any human being can handle. In May 2022, for example, an average of 516 papers per day were submitted to arXiv. The volume of scientific data is growing beyond our processing capabilities as well.
We do have tools to access and filter this information. When you want to research a topic, the first place you go is Google. Although it rarely gives you the answer directly, Google will point you to the right destination, such as Wikipedia or Stack Overflow. Yes, we can find answers there, but these resources require costly human contributions, and updates can be slow as a result.
What if we had a better tool for accessing and filtering the vast amount of scientific information we have? Search engines can only store and index information; they cannot reason about it. What if Google search could understand the information it stores and answer our questions directly? Well, it is time to meet Galactica.
Unlike search engines, language models can potentially store, combine, and reason about scientific knowledge. They can find links between research articles, surface hidden knowledge, and bring those insights to you. They can also generate useful content by drawing on what they know: a literature review on a specific topic, lecture notes for a course, answers to your questions, or wiki-style articles. All of this is possible with language models.
Galactica is the first step toward a true scientific neural network assistant. The ultimate science aid will be the interface through which we access knowledge: it will deal with the cumbersome information overload while you focus on making decisions with that information.
So how does Galactica work? It is itself a large language model, with billions of parameters trained on billions of tokens. Since Galactica is designed to be a science assistant, the obvious source of training data is research papers. Accordingly, more than 48 million research papers, 2 million code samples, and 8 million lecture notes and textbooks were used to build Galactica’s training corpus, a dataset of 106 billion tokens in total.
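Because Galactica is trained on this scientific corpus, it is driven entirely by plain-text prompts with task-specific markers. As a minimal sketch, assuming the prompt conventions described in the Galactica paper (a trailing `TLDR:` to request a summary, and a `<work>` token to request step-by-step reasoning; the exact spellings should be verified against the released tokenizer), one might construct prompts like this:

```python
# Sketch of Galactica-style prompt construction. The "TLDR:" and "<work>"
# conventions are taken from the Galactica paper; treat them as assumptions
# and verify against the released model and tokenizer before relying on them.

def summarization_prompt(document: str) -> str:
    # Appending "TLDR:" signals the model to produce a short summary
    # of the preceding text.
    return f"{document}\n\nTLDR:"

def reasoning_prompt(question: str) -> str:
    # The <work> token asks the model to emit intermediate, step-by-step
    # working before giving its final answer.
    return f"Question: {question}\n\n<work>"

if __name__ == "__main__":
    print(summarization_prompt("Attention lets a model weigh input tokens."))
    print(reasoning_prompt("What is the derivative of x^2?"))
```

The resulting strings would then be fed to the model (for example, a checkpoint released by Meta) to generate the summary or worked answer.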
Galactica was used to write its own paper, making it one of the first AI models to introduce itself. We believe it will be used to help write more research papers in the near future.
That was a brief rundown of Galactica, Meta’s new AI model designed to help retrieve scientific knowledge. You can try out Galactica for your use cases using the links below.
Check out the paper and the project. All credit for this research goes to the researchers on this project. Also, don’t forget to join our Reddit page and Discord channel, where we share the latest AI research news, cool AI projects, and more.
Ekrem Cetinkaya received his B.Sc. in 2018 and M.Sc. in 2019 from Ozyegin University, Istanbul, Turkey. He wrote his M.Sc. thesis on image denoising using deep convolutional networks. He is currently pursuing a Ph.D. at the University of Klagenfurt, Austria, and works as a researcher on the ATHENA project. His research interests include deep learning, computer vision, and multimedia networking.