ScholarSphere Newsletter #14

Where AI meets Academia

Welcome to the 14th edition of ScholarSphere

“The secret to happiness is freedom… And the secret to freedom is courage.” Thucydides

Welcome to our AI Newsletter—your ultimate guide to the rapidly changing world of AI in academia. If you haven't joined us yet, now's your chance! Click that button, subscribe with your email, and get ready for an exciting journey through all things AI in the academic realm!

Deep Dive into AI: Expand Your Knowledge

What Is Reinforcement Learning? 3 things you need to know
By Mathworks.com

1. What is Reinforcement Learning?

Reinforcement learning (RL) is a machine learning technique in which an agent learns through trial and error in an interactive environment. Unlike supervised learning, where a model is given labeled data (correct answers), an RL agent receives rewards or penalties for its actions. The goal of the agent is to learn a policy, a set of rules for choosing actions, that maximizes its long-term reward.

Figure 1. Three broad categories of machine learning:
unsupervised learning, supervised learning and reinforcement learning.

For example, imagine an RL agent is controlling a robot that is trying to walk. The agent will take different actions, like moving its legs in certain ways. If the robot takes an action that brings it closer to walking, it will receive a positive reward. If the robot takes an action that makes it fall, it will receive a negative reward. Over time, the agent will learn which actions lead to positive rewards and will refine its policy to walk more effectively.

2. How Does Reinforcement Learning Work?

Figure 2. Reinforcement learning in dog training.

There are four main components to an RL system: the agent, the environment, the action space, and the reward function. The agent is the learner that interacts with the environment. The environment is everything the agent can interact with, including other agents. The action space is the set of all possible actions the agent can take. The reward function defines the reward or penalty the agent receives for taking a specific action in a particular state.
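The four components described above can be sketched in a few lines of Python. This is a minimal illustration under assumed details, not code from the original article: the five-cell corridor, the reward values, and every name (`environment_step`, `reward_function`, `random_agent`, `run_episode`) are hypothetical.

```python
import random

# Hypothetical toy setup: a corridor of 5 cells, with the goal in the last cell.
ACTIONS = ("left", "right")   # the action space
N_CELLS = 5
GOAL = N_CELLS - 1

def reward_function(state: int, action: str, next_state: int) -> float:
    """Reward for taking `action` in `state` and landing in `next_state`.
    Assumed values: +1 for reaching the goal, a small penalty otherwise."""
    return 1.0 if next_state == GOAL else -0.1

def environment_step(state: int, action: str) -> tuple[int, float, bool]:
    """The environment: applies the action, returns (next_state, reward, done)."""
    move = 1 if action == "right" else -1
    next_state = min(max(state + move, 0), GOAL)  # walls at both ends
    reward = reward_function(state, action, next_state)
    return next_state, reward, next_state == GOAL

def random_agent(state: int, rng: random.Random) -> str:
    """The agent: an untrained one that ignores the state and acts at random."""
    return rng.choice(ACTIONS)

def run_episode(seed: int = 0, max_steps: int = 200) -> tuple[float, int]:
    """Run one agent-environment interaction loop and return (total reward, steps)."""
    rng = random.Random(seed)
    state, total, steps, done = 0, 0.0, 0, False
    while not done and steps < max_steps:
        action = random_agent(state, rng)
        state, reward, done = environment_step(state, action)
        total += reward
        steps += 1
    return total, steps
```

Calling `run_episode()` plays a single episode with the random agent. A learning algorithm would replace `random_agent` with a policy that improves from the rewards it observes.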

Figure 3. Reinforcement learning in autonomous parking. 

The RL agent uses an algorithm to learn a policy based on the rewards it receives. There are many different RL algorithms, but they all share some common characteristics. They typically involve exploring the environment to try different actions and exploiting the knowledge they have gained to choose actions that will lead to higher rewards.
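To make the explore/exploit trade-off concrete, here is a small sketch using tabular Q-learning with an epsilon-greedy rule. The article does not name a specific algorithm, so Q-learning is our choice for illustration. The six-state corridor, the reward values, and the hyperparameters are all assumptions.

```python
import random

N_STATES, GOAL = 6, 5    # hypothetical corridor: states 0..5, goal at state 5
ACTIONS = (-1, +1)       # step left or step right

def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else -0.01
    return next_state, reward, next_state == GOAL

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; returns the table q[state][action_index]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore: try a random action
            else:
                best = max(q[state])             # exploit: best known action,
                a = rng.choice(                  # breaking ties at random
                    [i for i, v in enumerate(q[state]) if v == best]
                )
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q

def greedy_policy(q):
    """Best-known action for each non-goal state."""
    return [
        ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])]
        for row in q[:GOAL]
    ]
```

After `q = train()`, `greedy_policy(q)` lists the action the agent currently prefers in each state. In this toy corridor, the learned policy should step right (toward the goal) everywhere. Raising `epsilon` makes the agent explore more; lowering it makes the agent rely more on what it has already learned.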

Figure 4. Reinforcement learning workflow.

3. Applications of Reinforcement Learning 

Reinforcement learning has a wide range of applications. In robotics, RL agents can control robots that perform complex tasks, such as walking, grasping objects, or navigating through an environment. In game playing, RL has been used to build AI players such as AlphaGo and AlphaStar, which have defeated top human players in complex games like Go and StarCraft II. In resource management, RL agents can optimize resource allocation in complex systems, such as traffic light control or power grid management.

Figure 5. Training a sample-inefficient learning problem with parallel computing.

As RL algorithms and computing power continue to advance, we can expect to see even more applications of RL in the future. RL has the potential to revolutionize many industries and improve our lives in many ways.

4. Advantages and Disadvantages of Reinforcement Learning 

Reinforcement learning offers several advantages over other machine learning techniques. First, RL agents can learn without the need for labeled data, which can be expensive and time-consuming to collect. Second, RL agents can learn to adapt to new situations and environments. Finally, RL has the potential to achieve superhuman performance in complex tasks.

However, there are also some disadvantages to reinforcement learning. First, RL can be computationally expensive, as it often requires a lot of trial and error. Second, it can be difficult to design a good reward function that accurately reflects the desired behavior of the agent. Finally, RL agents can sometimes learn to exploit loopholes in the reward function, which can lead to unintended consequences.

Despite these challenges, reinforcement learning is a powerful machine learning technique with a wide range of potential applications. As RL research continues to advance, we can expect to see even more innovative and beneficial applications of RL in the future.

To read the full article, click here.
You can also find extra teaching articles on our LinkedIn Page.

Mastering AI: Prompt Perfection

Tree of Thought (ToT) Prompting:
What It Is and How to Use It
By Anita Kirkovska, Vellum.ai

Tree of Thoughts (ToT) is inspired by the way the human mind solves complex reasoning tasks through trial and error. Put simply, the technique guides the LLM to explore different ideas, and to reevaluate them when needed, in order to arrive at the best solution.

Yao et al. (2023) and Long (2023) recently proposed Tree of Thoughts (ToT), a framework that generalizes over chain-of-thought prompting and encourages exploration over thoughts that serve as intermediate steps for general problem solving with language models.

This approach can outperform Chain of Thought (CoT) prompting because it isn’t limited to a single “chain of thought.” However, it requires a lot of coding and custom algorithms to search through the layers of the tree and find the best reasoning path.

How does it work?

ToT creates a tree-like structure of ideas, where each idea is a step towards solving a problem. This approach enables the LLM to self-evaluate the intermediate “thoughts” and decide whether to continue with that path or choose another.

To perform this, the authors of the ToT framework (Shunyu Yao, et al. 2023) augment the LLM with search algorithms like breadth-first search and depth-first search.
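The breadth-first variant can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation. In the real framework an LLM proposes and scores the intermediate thoughts, whereas here `propose` and `evaluate` are hypothetical stand-ins on a toy problem (pick three numbers from a fixed set that sum to a target), so the example runs without any model.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]  # a partial solution, i.e. a chain of "thoughts"

def tree_of_thoughts_bfs(
    root: State,
    propose: Callable[[State], List[State]],
    evaluate: Callable[[State], float],
    depth: int,
    beam_width: int,
) -> State:
    """Level by level: expand every kept state, score the children,
    and keep only the `beam_width` most promising ones."""
    frontier = [root]
    for _ in range(depth):
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        candidates.sort(key=evaluate, reverse=True)
        frontier = candidates[:beam_width]  # prune weak branches
    return max(frontier, key=evaluate)

# --- Hypothetical stand-ins for the LLM calls (toy problem) ---
NUMBERS = (2, 3, 7)
TARGET = 12
STEPS = 3

def propose(state: State) -> List[State]:
    """Stand-in for "ask the LLM for possible next thoughts"."""
    return [state + (n,) for n in NUMBERS]

def evaluate(state: State) -> float:
    """Stand-in for "ask the LLM how promising this partial solution is"."""
    return -abs(TARGET - sum(state))

best = tree_of_thoughts_bfs((), propose, evaluate, depth=STEPS, beam_width=2)
print(best, sum(best))
```

With `beam_width=1` the search collapses to a single chain, much like ordinary step-by-step prompting. A wider beam keeps alternative branches alive so that a weak early step can still be abandoned later.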

Use-cases

The authors demonstrated that in tasks like mathematical reasoning and creative writing, Tree of Thoughts prompting significantly exceeds the performance of other prompting methods. This is certainly interesting. But can we use this framework in a plain and simple prompt? It turns out we can.

Tree of Thoughts (ToT) prompting (with examples)

Tree of Thoughts (ToT) prompting is adapted from the Tree of Thoughts framework: the author tried to fit the whole concept into a single prompt. The example below shows the structure of the prompt. Note that GPT-3.5 doesn't get the right answer for this example with Chain of Thought prompting, but it does with ToT prompting.

Example

Imagine three different experts are answering this question. All experts will write down 1 step of their thinking, then share it with the group. Then all experts will go on to the next step, etc. If any expert realizes they're wrong at any point then they leave. 

The question is...

Bob is in the living room.
He walks to the kitchen, carrying a cup.
He puts a ball in the cup and carries the cup to the bedroom.
He turns the cup upside down, then walks to the garden.
He puts the cup down in the garden, then walks to the garage.

Where is the ball?

The response from GPT-3.5:

Expert 1: The ball is in the living room.
Expert 2: The ball is in the kitchen.
Expert 3: The ball is in the bedroom. 
Expert 1: Bob carries the cup to the bedroom, so the ball must be in the cup. 
Expert 2: Oh, I see my mistake. Yes, the ball is in the cup. 
Expert 3: Agreed, the ball is in the cup in the bedroom. 
Expert 1: Next, Bob turns the cup upside down in the bedroom. 
Expert 2: After that, Bob walks to the garden and puts the cup down. 
Expert 3: Therefore, the ball must have fallen out of the cup when Bob turned it upside down in the bedroom. So, the ball is in the bedroom, not in the cup anymore. 
Expert 1: Oh, you're right. I made a mistake. The ball is in the bedroom, not in the cup. 
Expert 2: Agreed, the ball is in the bedroom. 
Expert 3: Bob then walks to the garage, so the ball remains in the bedroom. It is not in the garden or the garage. 
Expert 1: Absolutely, the ball is still in the bedroom. 
Expert 2: Yes, the ball hasn't moved from the bedroom. 
Expert 3: Therefore, the ball is in the bedroom, not in the garden or the garage. 
All three experts agree that the ball is in the bedroom. 
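The single-prompt pattern shown above is easy to reuse for other questions. Below is a minimal sketch of a helper that wraps any question in the same wording. The names `TOT_TEMPLATE` and `build_tot_prompt` are ours, and the code only builds the prompt text; it does not call any model.

```python
# The wording follows the example prompt; only the question
# (and optionally the number of experts) is substituted.
TOT_TEMPLATE = (
    "Imagine {n} different experts are answering this question. "
    "All experts will write down 1 step of their thinking, "
    "then share it with the group. "
    "Then all experts will go on to the next step, etc. "
    "If any expert realizes they're wrong at any point then they leave.\n\n"
    "The question is...\n\n"
    "{question}"
)

def build_tot_prompt(question: str, n_experts: str = "three") -> str:
    """Return a ToT-style prompt for `question` (hypothetical helper)."""
    return TOT_TEMPLATE.format(n=n_experts, question=question.strip())

prompt = build_tot_prompt(
    "Bob is in the living room.\n"
    "He walks to the kitchen, carrying a cup.\n"
    "He puts a ball in the cup and carries the cup to the bedroom.\n"
    "He turns the cup upside down, then walks to the garden.\n"
    "He puts the cup down in the garden, then walks to the garage.\n\n"
    "Where is the ball?"
)
print(prompt)
```

The resulting string can be pasted into a chat interface or sent through whichever model API you already use.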

Some limited data from the authors shows that this prompting technique does better than Chain of Thought prompting on GPT-3.5, so it’s worth exploring if you want better results from a less expensive model than GPT-4. It’s always useful to compare various techniques with different models and evaluate which one produces the best result for your use case.

For full access to our Prompt Inventory, check here.
Don’t forget to visit our LinkedIn Page

Cutting-Edge AI Insights for Academia

President Joseph E. Aoun speaks at the 2022 President’s Convocation. Aoun wrote an article on the future of AI in higher education that was published in The Chronicle of Higher Education. File photo by Marta Hill.

Article of the Week: Research integrity in the era of artificial intelligence: Challenges and responses by Chen, et al. (2023)

Spotlight on AI Tools for Academic Excellence

Fillout.com : A powerful form, survey, and quiz builder that allows you to create customized forms and collect responses from your audience. It offers integrations with various platforms, such as Notion, Airtable, Salesforce, Google Sheets, and more.

Forms.app : Create forms, collect form submissions, and automate workflows with powerful integrations. Experience the best free form builder. Totally free & no coding is required.

WatermarkRemover : An online tool that uses AI technology to remove watermarks from images. With its powerful algorithm, it can accurately detect and remove translucent watermarks, leaving your images watermark-free.

Vast.ai : A platform for low-cost cloud GPU rental. It describes itself as the market leader in this space and advertises savings of 5-6X on GPU compute through one simple interface.

Tutor AI LearnAnything: An online platform powered by TutorAI, which allows users to instantly learn about any topic they desire. With the help of artificial intelligence, users can access educational content on a wide range of subjects.
