AI as a Partner in Learning

Although some see artificial intelligence (AI) as our potential nemesis, there is great scope for it to become a transformative ally. In the field of education, AI is already beginning to reshape roles, enhance capabilities, and pose new challenges for educators, students, administrators and support staff. Arguably, we are heading towards a future in which AI’s role is that of a collaborative partner in the educational journey. How might that play out in a university context?

Faculty Engagement with AI. Faculty will be reimagining curricula, integrating AI and analytics to mirror real-world business complexities. Adaptive learning technologies will enable personalized teaching that caters to individual student needs and learning paces. AI will become an indispensable research tool.

The Learner’s Journey. AI will revolutionise the learner’s journey, providing tailored educational experiences and support systems that respond to individual learning styles. It will prepare students for an AI-infused business landscape, equipping them with the skills to excel and the ethical grounding to navigate the complex moral terrain of AI in business.

The Administrator’s View. Administrators will need to steer the strategic integration of AI in business education, aligning it with institutional goals while balancing budgets against the potential ROI from AI investments and crafting policies to uphold ethical standards and robust data governance.

The Support Framework. IT departments will be responsible for constructing the infrastructures needed to support AI initiatives. They will play an important role in building institutional knowledge about AI utilization and in safeguarding against cybersecurity threats in this new, dynamic academic landscape.

Career Pathfinding – AI’s Role in Career Services. Career services will evolve to focus on equipping students with the AI competencies demanded in the job market. They will strengthen ties with AI-forward businesses to create new opportunities for students. Career advisers will adapt to guide students through the complexities of an AI-integrated workplace.

The Knowledge Keepers. Librarians will manage AI-driven learning resources and databases. They could become pivotal in fostering information literacy within an AI context, guiding students to discern and utilize AI-generated content responsibly.

Beyond Graduation – Alumni and External AI Engagement. Alumni relations will extend into lifelong learning opportunities, with AI-based programs supporting continuous professional development. AI will enhance alumni networking, fostering stronger connections and engagement through intelligent platforms. Partnerships with industry will be cultivated, leveraging AI to foster collaboration and innovation beyond the campus.

Ethical Considerations. Compliance officers will monitor AI applications for adherence to the required educational standards. This will include identifying and mitigating AI bias, and actively promoting inclusivity. Their oversight will be crucial in aligning AI’s use with the core values and ethical commitments of the institution.

Holistic Support – AI in Student Services. Student services will harness AI tools for the early detection of wellness issues, proactively addressing student needs. AI will also enhance accessibility services, ensuring all students benefit from educational resources. Personalized support, informed by AI insights, will become the new standard in student care.

Marketing AI in Education. Marketing teams will showcase the institution’s commitment to AI in education, highlighting innovative programs and cutting-edge learning environments. Outreach and recruitment strategies will be tailored using AI analytics to attract a diverse, tech-savvy student body. Communicating the advantages of AI will become key in positioning the institution as a leader in future-focused education.

Long-term memory for Generative AI

Large language models (LLMs) such as ChatGPT have embedded within them, and can make use of, the huge amount of information they were fed during training. A user is able to access that embedded knowledge by giving the LLM instructions during the course of a conversation with it. At present, however, the LLM has a limited capacity to remember the details of a conversation; that capacity is determined by the size of its context window.

The context window is the part of the input text that the LLM processes when producing the next word in its response to an instruction. Although it varies across different LLMs, the context window is typically a few thousand words. Once the conversation exceeds the size of the context window, the LLM is unable to make use of everything the user has input over the course of the conversation; it ‘forgets’ things from earlier parts of the conversation. The context window can be increased in size, but doing so increases the amount of processing required to produce a response, and that soon becomes impractical. Researchers at UC Berkeley are exploring one approach to get around this limitation and have explained it in their paper MemGPT: Towards LLMs as Operating Systems.
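The effect of a fixed context window can be sketched in a few lines of Python. The window size here is tiny and the word-level ‘tokens’ are invented for illustration; real models work with windows of thousands of sub-word tokens.

```python
CONTEXT_WINDOW = 8  # illustrative; real windows hold thousands of tokens

def visible_context(tokens):
    # The model only 'sees' the most recent tokens that fit in the window.
    return tokens[-CONTEXT_WINDOW:]

history = ["my", "name", "is", "Alice", "and", "I", "live", "in", "Paris"]
print(visible_context(history))
# "my" has slipped outside the window: the model has 'forgotten' it
```

Everything the user typed earlier still exists in the transcript, but the model simply never receives it again once it falls outside the window.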

In MemGPT, they have given the LLM a memory system similar in principle to that of a personal computer (PC). They call the context window the LLM’s main context and view this as its short-term memory, analogous to a PC’s Random Access Memory (RAM). In addition, MemGPT has been given an external context analogous to a PC’s disk drive or solid state drive. The external context comprises:
recall storage which stores the entire history of events processed by the LLM processor, and
archival storage which serves as a general read-write datastore that can serve as overflow for the main context.
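The two tiers of external context might be sketched as simple Python classes. This is an illustrative reduction of the design described in the paper, not MemGPT’s actual implementation; the class and method names are my own.

```python
from datetime import datetime

class RecallStorage:
    """Holds the entire history of events; searchable by text."""
    def __init__(self):
        self.events = []  # (timestamp, text) pairs

    def append(self, text, when=None):
        self.events.append((when or datetime.now(), text))

    def search(self, query):
        return [text for ts, text in self.events if query.lower() in text.lower()]

class ArchivalStorage:
    """General read-write datastore; overflow space for main context."""
    def __init__(self):
        self.facts = []

    def insert(self, fact):
        self.facts.append(fact)

    def search(self, query):
        return [f for f in self.facts if query.lower() in f.lower()]

# During a conversation, facts about the user go into archival storage...
archive = ArchivalStorage()
archive.insert("User prefers British English spellings")

# ...while recall storage lets the agent find past interactions.
recall = RecallStorage()
recall.append("User asked about the MemGPT paper")
print(recall.search("memgpt"))
```

In MemGPT itself these stores are read and written by the LLM through function calls, and time-based search is also supported; the sketch above shows only text matching.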

During a conversation, archival storage allows MemGPT to store facts, experiences, preferences, etc. about the user, while recall storage allows MemGPT to find past interactions related to a particular query or within a specific time period. For document analysis, archival storage can be used to search over (and add to) an expansive document database.

To achieve the above, MemGPT’s main context is divided into three components:
system instructions set out the logic for how MemGPT’s functions control the interaction with external context;
conversational context holds a first-in-first-out (FIFO) queue of recent event history (e.g., messages between the LLM and the user); and
working context serves as a working memory scratchpad.

System instructions are read-only and pinned to main context (they do not change during the lifetime of the MemGPT agent).
Conversational context is read-only with a special eviction policy (if the queue reaches a certain size, a portion of the front is truncated or compressed via recursive summarization).
Working context is writeable by the LLM processor via function calls.

Combined, the three parts of main context cannot exceed the underlying LLM processor’s maximum context size.
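A sketch of how the three components of main context might fit together, with the eviction policy applied to conversational context. The summarize stub stands in for recursive summarization, which MemGPT performs using the LLM itself; the structure and queue size here are my own illustration.

```python
def summarize(messages):
    # Stub: in MemGPT, recursive summarization is done by the LLM itself.
    return f"[summary of {len(messages)} earlier messages]"

class MainContext:
    def __init__(self, system_instructions, max_queue=4):
        self.system = system_instructions  # read-only, pinned for the agent's lifetime
        self.conversational = []           # FIFO queue of recent events
        self.working = {}                  # scratchpad, writeable via function calls
        self.max_queue = max_queue

    def push_event(self, message):
        self.conversational.append(message)
        if len(self.conversational) > self.max_queue:
            # Evict a portion of the front of the queue, compressing it
            # into a summary rather than discarding it outright.
            evicted = self.conversational[:2]
            self.conversational = [summarize(evicted)] + self.conversational[2:]

    def write_working(self, key, value):
        # In MemGPT, the LLM processor writes here via a function call.
        self.working[key] = value

ctx = MainContext("You are MemGPT. Manage your own memory.")
for i in range(5):
    ctx.push_event(f"message {i}")
print(ctx.conversational[0])  # the oldest messages survive only as a summary
```

The point of the eviction policy is that nothing is truly lost: the evicted messages remain in recall storage, and their gist survives in main context as a summary.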

The growth in ChatGPT’s capability

The capabilities of ChatGPT are increasing at pace. The latest upgrade turns it into a multimodal AI. Instead of being restricted to text-only input and output, ChatGPT can now accept prompts with images or voice as well as text and can output its responses in one of five AI-generated voices. A user can switch seamlessly between text, image and voice prompts within the same conversation.

Browse with Bing enables ChatGPT to search the internet to help answer questions that benefit from recent information.

Advanced Data Analysis (formerly called Code Interpreter) enables ChatGPT to upload and download files, analyze data, do maths, and create and interpret Python code. These are powerful capabilities but there are restrictions which include: no internet access; a limited set of preinstalled packages; maximum upload and runtime limits; state is cleared (along with any generated files or links) when the environment dies.
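For a flavour of the kind of task Advanced Data Analysis runs in its sandbox, here is a small, self-contained Python example: parsing an uploaded CSV file and computing summary statistics. The data is invented for illustration.

```python
import csv
import io
import statistics

# Stand-in for a file a user might upload to Advanced Data Analysis.
raw = """region,sales
North,120
South,95
East,143
West,88
"""

rows = list(csv.DictReader(io.StringIO(raw)))
sales = [int(row["sales"]) for row in rows]

print("mean sales:", statistics.mean(sales))            # 111.5
top = max(rows, key=lambda row: int(row["sales"]))["region"]
print("best-performing region:", top)                   # East
```

ChatGPT writes and executes code of this sort on the user’s behalf; the restrictions listed above apply to the sandbox in which it runs, not to the Python language itself.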

Open Interpreter is an open source project which seeks to overcome the restrictions of Advanced Data Analysis. Open Interpreter runs on your local computer and interacts with ChatGPT. It has full access to the internet, is not restricted by time or file size, and can utilize any code package or library. Thus Open Interpreter combines the power of GPT-4’s Advanced Data Analysis with the flexibility of your local development environment.

Plugins enable ChatGPT to interact with functionality provided by other systems. Examples are:
Wolfram Plugin for ChatGPT gives it access to powerful computation, accurate maths, curated knowledge, real-time data and visualization through Wolfram|Alpha and Wolfram Language.
Show Me ChatGPT Plugin allows users to create and edit diagrams directly within a conversation in ChatGPT. 
There is a growing number of plugins; some are shown here.
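Behind each plugin is a small manifest file (ai-plugin.json) which tells ChatGPT what the plugin does and where its API is described. The shape below reflects the published manifest format, expressed here as a Python dict; the service name and URLs are hypothetical.

```python
# The shape of a ChatGPT plugin manifest (ai-plugin.json).
# The service name and all URLs are hypothetical examples.
manifest = {
    "schema_version": "v1",
    "name_for_human": "Example Diagrams",
    "name_for_model": "example_diagrams",
    "description_for_human": "Create and edit diagrams in a conversation.",
    "description_for_model": ("Use this plugin when the user asks to create "
                              "or edit a diagram."),
    "auth": {"type": "none"},
    "api": {"type": "openapi", "url": "https://example.com/openapi.yaml"},
    "logo_url": "https://example.com/logo.png",
    "contact_email": "support@example.com",
    "legal_info_url": "https://example.com/legal",
}
```

Notably, the description_for_model field is read by ChatGPT itself: it is how the AI decides when a plugin is relevant to the conversation.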

Plugins expand ChatGPT’s capability

ChatGPT has the ability to make use of third-party plugins which give it access to external sources of information. This is useful because it enables the AI to apply its impressive language capabilities to information that was not in its training data and, unlike the training data, which is now two years old, that information can be current.

ScholarAI is a ChatGPT plugin “designed to provide users with access to a database of peer-reviewed articles and academic research”. In this conversation with ChatGPT, I explore a little of what the AI can do when the ScholarAI plugin has been installed. I found that it was able to search for papers on a given subject, summarise the content of a paper, and answer questions about that content. I have not yet investigated the quality of the answers provided.

Plugins can also provide ChatGPT with additional functionality. In an earlier post, I mentioned the prospect of the AI interfacing with Wolfram Alpha. The Wolfram Alpha plugin is one instance of that, and it enables ChatGPT to give correct answers to prompts that require computation. See below for an example. We can be confident that answers obtained from Wolfram Alpha are of high quality.

There are many plugins to choose from. Websites such as whatplugin.ai can help us to find the ones we need.

Continue reading “Plugins expand ChatGPT’s capability”

How are neural networks implemented?

What can ChatGPT tell us about its inner workings? I instructed it to take the role of a professor in the faculty of engineering at University College London and said that I had the role of a third year undergraduate student. A full transcript of the conversation is here.

To begin, I asked the professor for a high level explanation of network theory, telling him to express the ideas in British English and without using jargon, to take it step by step, and to begin by getting my agreement on what would be the steps in the explanation. After orientating this outline more towards AI networks, we experimented with the professor’s ability to reflect on what he had produced – ie to re-evaluate his previous outline. This resulted in a notably better structure for the explanation I sought – see below.

Professor ChatGPT then led me through the subject one step at a time. There were many diversions as I asked him to clarify things. We got as far as Step 2.2 before one of these diversions took me down a path to a good breakpoint: agreement on an understandable explanation of how the concept of an AI’s neuron – a node in its neural network – is implemented in software and hardware and how very large numbers of these nodes can execute simultaneously.

Continue reading “How are neural networks implemented?”

ChatGPT’s temperature

In a previous post I mentioned a parameter called ‘temperature’ which ChatGPT uses to vary its answers from being more focused and deterministic to being more diverse and unpredictable; a kind of creativity, perhaps. This is how the AI explains its ‘temperature’.

Question: Consider the questions I have asked you in this conversation. If I now ask you the same questions in the same sequence, would your answers be different and, if so, might they contradict the answers given first time around?

ChatGPT

As an AI, my responses are generated based on patterns learned from a large dataset during training, and are influenced by the specific wording and context of the prompts given to me. If you ask the exact same questions in the same sequence, you should generally expect to receive the same or very similar answers, assuming the underlying model and its parameters remain unchanged.

However, GPT-4, like previous versions of the model, includes an element of randomness in generating responses. This is intended to make the output more diverse and to avoid always giving the exact same response to the same input. As a result, the specific wording and phrasing of my responses may vary slightly each time you ask the same question.
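The randomness ChatGPT describes is governed by the temperature parameter. A minimal sketch of how temperature reshapes the probability distribution over candidate next words (the logit scores are invented for illustration):

```python
import math

def temperature_softmax(logits, temperature):
    """Turn raw model scores into next-word probabilities, scaled by temperature."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate words

cold = temperature_softmax(logits, 0.2)  # low temperature: focused, near-deterministic
hot = temperature_softmax(logits, 2.0)   # high temperature: flatter, more diverse
print(round(cold[0], 3), round(hot[0], 3))
```

As the temperature falls towards zero, the distribution collapses onto the single most likely word; as it rises, lower-ranked words gain probability, which is the ‘diversity’ ChatGPT refers to above.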

Continue reading “ChatGPT’s temperature”

ChatGPT is not intelligent

As I understand things, ChatGPT uses probabilistic methods applied to a massive amount of data (currently the language content of the internet up to September 2021) to predict which word is most appropriate to follow the preceding words in its output. To kick-start that process you give it an instruction which guides the form of its response. So its answers are a re-packaging of the previously written material on which it has been trained; it does not create new ideas. There is a parameter called ‘temperature’, however, which can vary its answers from being more focused and deterministic to being more diverse and unpredictable; a kind of creativity, perhaps.

At present, therefore, we are the intelligent agents and ChatGPT is simply our assistant. Our assistant can retrieve information quickly and package it in ways that can help us think through the ideas we are pursuing. Also, we can ask it to do things that will help us analyse the matter at hand, for example by taking large stacks of data from several sources, combining them and charting certain characteristics. And when we ask it to identify connections between things, it will sometimes find connections we would not have thought of ourselves. 

Continue reading “ChatGPT is not intelligent”

Synthetic Interactive Persona Agents

In this article by Sunil Ramlochan, a Synthetic Interactive Persona Agent (SIPA) is presented as “offering realistic and effective dialogues that align with real-world human tendencies”. That is, it can take the place of human participants in online conversations. To illustrate what can be achieved, Sunil takes us through the creation of a fictional character: Jack Thompson is from Texas and identifies as a Republican. The task is to:
incorporate the socio-political context of Texas, Republican values and beliefs, and personal characteristics in developing Jack’s persona;
simulate how Jack would think, speak, and act in various scenarios, ensuring his actions reflect his political identity;
make the depiction … respectful and nuanced, avoiding stereotypes while highlighting the individuality and complexity of Jack’s persona.

Jack is then asked questions and his answers shown. At face value, Jack’s answers seem convincing.

Sunil suggests eleven use cases out of many possibilities. Most seem reasonable, for example the creation of synthetic data for use in the training of customer service agents and chatbot agents. Others give pause for thought, for example to simulate the political viewpoints of different demographics for use in political polling. Hence the article ends with thoughts about limitations and ethics and urges us “not to misrepresent synthetic data as actual human responses in any form of reporting or decision-making”. That will prove quite a challenge.

The exponential age

At some point between 2013 and 2015, I argue that we entered the Exponential Age. It is a new period of human affairs catalysed by accelerating technologies in four broad domains: computing, energy, biology and manufacturing.

Azeem Azhar, 7 September 2021

We are walking into a nexus of network adoption effects, at scale of which humanity has never seen before, of numerous technologies.

Raoul Pal in discussion with Nathaniel White 25 April 2023

Computational powers for ChatGPT

Being a Large Language Model neural net, ChatGPT cannot by itself do non-trivial computations nor be relied upon to produce correct data. Recent months, however, have seen ChatGPT being linked with Wolfram|Alpha and the Wolfram Language to give it a powerful computational capability. In his blog post ChatGPT Gets Its “Wolfram Superpowers”!, Stephen Wolfram uses some examples to explain the current scope of this combined capability and to hint at the revolutionary power of its future potential.

Steve Blank’s blog post Playing With Fire – ChatGPT looks at that combined capability from another perspective. He highlights that not only is ChatGPT good at what it was designed to do but that it is demonstrating emergent behaviours (things it was not designed to do) which were not seen in its smaller-scale predecessors. He points out, also, that ChatGPT is beginning to interact with a variety of other applications through an application programming interface. These applications can be used by ChatGPT to enhance its own capabilities. Conversely, the applications can harness ChatGPT’s capabilities for their separate, third-party purposes. These increasingly complex systems will display emergent properties, ie properties that the individual parts of the system do not have on their own but which emerge when the parts interact as a whole. Some of the emergent properties will occur by design, but it is inevitable that there will be some which cannot be predicted.

We are still some way from artificial general intelligence, but that is the direction of travel and we should be concerned that the continued development of this technology is driven by for-profit companies, venture capitalists and autocratic governments without any means of control.