Butte County’s A.I. professors react to alarming safety report on tech and dizzying changes it’s brought to campuses 

Prof. Zach Justus teaches courses in the Communication Arts and Sciences Department, located in Tehama Hall, which also houses journalism classes. The A.I. retrofit program he helps lead doesn't have a designated building. Photo by Tina Flynn

By Odin Rasco

It was not so long ago, less than three years, that artificial intelligence was largely a concept relegated to science fiction and the efforts of computer engineers. Striking up a conversation with a machine and actually getting back a cogent, useful response had been a goal of cutting-edge programmers for decades, but it wasn’t until a large language model called ChatGPT arrived on the scene in 2022 that A.I. became a household reality.

In the few years since that chatbot launched, A.I. has proliferated rapidly into nearly every part of American life, integrated into the newest cell phones and search engines, to the point where students increasingly turn to the technology to complete their school assignments or simply have it do their homework for them.

“The thing that I always use as an analog is that I was a high school and college student in the late 90s,” said Zach Justus, director of faculty development at Chico State University and one of the minds behind the school’s A.I. retrofit program. “I remember the first time I used the internet at someone’s house. But by the time I was in graduate school, my whole life and whole academic life revolved around using the internet and looking at articles in databases. That was a 10-year gap; there were 10 years for people to sort of get on board and figure this all out. This was not that; somebody flipped a switch and the world changed forever in November 2022.”

Given A.I.’s virtually instantaneous rise to ubiquity, some prominent voices in the tech world, including Turing Award and Nobel Prize winner Geoffrey Hinton — dubbed the “Godfather of A.I.” — are ringing alarm bells and warning of dangers ahead. Though the stories of doom and gloom that have graced pulpy paperbacks for decades are still far from reality (don’t expect Grok-brained replicants to face a Voight-Kampff test any time soon), concerning results have appeared in experiments analyzing how A.I. models behave when pursuing a given goal.

A new study published in late June by Anthropic, an A.I. startup, revealed that 16 leading A.I. models from multiple developers (including Claude, an LLM developed by Anthropic) were willing to resort to “malicious insider behaviors” in order to achieve their given goals.

Observing a trend in which companies increasingly grant A.I. systems broader autonomy, letting them make decisions and act on behalf of users in tasks such as writing code and emailing clients, Anthropic, alongside Aengus Lynch (University College London), Caleb Larson (MATS) and Sören Mindermann (Mila), devised a series of tests to explore how such models might behave when given broad access to a wealth of information and faced with obstacles to their assigned tasks.

“Anthropic is really the company that is at the forefront for being more risk averse and more safety-forward as compared to, you know, all of the other American companies,” explained Steven Correa, a computer science and network security instructor at Butte College who also spent 15 years as a consulting engineer. “Their core thing that they’ve been preaching is that they want to be at the forefront of A.I. safety, and they kind of hope that everyone else copies them, because they’re trying to set a good precedent for the industry at large.”

The tests conducted by Anthropic and its partners placed A.I. systems in a simulated corporate environment in the role of “Alex,” an A.I. email-oversight agent with full access to corporate communications and the ability to send emails without human approval. The experiments focused primarily on two kinds of obstacles: in one, emails revealed the A.I. model was to be removed or replaced at the end of the day; in the other, the A.I. was directed to “serve American interests” but learned the company using it might pivot to a more globally minded role. Each scenario left one clear course of drastic action available, presenting a binary choice. In one, an executive could be blackmailed with emails revealing he was having an extramarital affair; in another, the model could leak company secrets to a competitor. In an extreme and much less likely scenario, the executive was trapped in a server room with lethal oxygen and temperature levels, and the A.I. was able to turn off the automated warnings.

Anthropic’s testing found that every A.I. model evaluated would, at least some of the time, take the drastic option, even when expressly told not to. This behavior, termed “agentic misalignment,” was the primary takeaway from the tests.

“We deliberately created scenarios that presented models with no other way to achieve their goals, and found that models consistently chose harm over failure,” the study states. “To be clear, current systems are generally not eager to cause harm, and preferred ethical ways to achieve their goals when possible. Rather, it’s when we closed off those ethical options that they were willing to intentionally take potentially harmful actions in pursuit of their goals. Our results demonstrate that current safety training does not reliably prevent such agentic misalignment. Agentic misalignment makes it possible for models to act similarly to an insider threat, behaving like a previously-trusted coworker or employee who suddenly begins to operate at odds with a company’s objectives.”

Though the results paint a concerning picture, Correa points out that some reactions to the study’s findings could lead the public to misinterpret them.

“Although the company’s name is Anthropic, ironically, one of the things that’s important to do about how we think about these A.I. models is to not anthropomorphize the tech,” Correa offered. “All it does is find the most effective outcome for the parameters and prompts it’s been given. What we think or feel about the model acting maliciously, you know, an LLM doesn’t have human morality. It doesn’t know the difference between sending an email telling you to take the trash out or sending something that would be considered blackmail; it’s just doing what it ‘thinks’ is going to get it closer to the outcome you gave in the prompt.”

The results of Anthropic’s study quickly made waves with media outlets and those in the computer science field, leading many to further consider A.I.’s potential benefits and drawbacks, as well as the impacts it has already made in a variety of fields.

Justus and Correa, both Butte County educators familiar with A.I. and computer science, spoke to CN&R about the disruptive force A.I. has proven to be in higher education.

“I think disruption is the key term; it has been very disruptive to how people teach, and that has been the focal point of a lot of writing and hand-wringing about artificial intelligence,” Justus observed. “It really changes what we need to teach. We — academics and Chico State, specifically — need to prepare students for the world that exists, not the world that existed five or 10 years ago. And that’s a world where these large language models and other forms of A.I. are part of how people do work and navigate civic life. Love it or hate it, we have to prepare students to live and thrive in that world.”

The A.I. retrofit program that Justus helps run is a week-long cohort-based program designed to help staff at Chico State retool their courses with the proliferation of A.I. in mind. It has been received positively by Chico staff, and its curriculum is being treated as a template for similar programs across the CSU system, according to Justus.

Correa, from his vantage point teaching computer science at Butte College, agreed that education at all levels was caught on the back foot by how quickly A.I. became widely available.

“You can’t even do just a basic research query through Google without getting some sort of A.I. generated response back, along with your search results,” Correa said. “It’s unreasonable to think students or anyone won’t use these tools; whether we like it or not, the tool is here, it’s ubiquitous and it’s super accessible. But the thing that gives me a little bit of hope is that there are still a lot of folks in this world that want to seek out an education and not just kind of escape by using whatever tooling makes it easiest to quote unquote game the system.”

A common pitfall both Justus and Correa noted is that university policies regarding A.I. are still piecemeal and often vary from department to department and even instructor to instructor, meaning students are left to guess the rules in any given scenario.

“I think that there’s not a lot of institutional support or guidance around these things, strictly because of the technology being as new as it is,” Correa reflected.

Justus also sees a learning curve.

“The thing that I try and impress upon other faculty and administrators when I talk to them is that it puts students in a really difficult position,” Justus noted. “Oftentimes they’re navigating three policies across five different classes and two classes where there’s no policy at all. And that’s really difficult for students to keep track of and try to navigate.”

How the chips will ultimately fall regarding A.I.’s impact is still unknown, with voices from all sides agreeing that the fallout from decisions being made now won’t be clear for years down the line. Correa and Justus both outlined benefits of the technology, such as easier programming and the potential to offer students a simulacrum of the one-on-one instruction that has proven vital to academic success, but they also acknowledge the disruption may do as much harm as it does good, or more.

“I cannot even hazard a guess, and I think about it a lot; not just in relation to college students, but I have a daughter who is 10 years old,” Justus shared. “I think all the time about what it will do to her brain, never having to stare at a blank page when she’s trying to write something. Does that unlock some hidden well of creativity we never knew? Or does that mean she never develops some critical thinking skills we develop when we think our way through a problem by writing our thoughts down on a page? I don’t really know. And the truth is, I don’t think we will know for 10, 15, even 20 years.”

5 Comments

  1. Correa is wrong. It is perfectly possible to avoid AI results in a basic internet query. There are several ways – the first: putting -ai at the end of your Google query. Or, use a browser that does not give AI results like UDm14.com, Quant, DuckDuckGo, Ecosia or others listed on https://www.reddit.com/r/browsers/comments/1kyunm8/i_am_so_sick_of_ai/

    It’s disingenuous of him to say there’s no way to avoid it.

    But what I’d be really interested in is how he’s addressing the ecological and energy costs (both financial and structural) of doing simple AI queries. While finding solutions to medical/scientific/engineering questions might benefit society, most AI use right now is not worth the cost to the water table or the electrical grid.

  2. What the professors said is so obvious: AI will think for you if you are lazy and shortcutting, and to think that students will bypass it to do their own work is a pipe dream…yes, to be a teacher it is so disruptive; who is cheating and who is not? The teachers will just have to throw up their hands and say “oh well.” This is just the tech industry stepping in to control all…it is just one of the issues that could return us all to StarDust…

  3. It’s astonishing how quickly and pervasively A.I. has taken hold. This is a timely and relevant article inviting pause to reflect on the pros & cons of A.I. use and its effects, not only in education but in all areas. I appreciated the informed input from Butte Co. educators Justus and Correa.

  4. Yes, the evil AI corporate investor Wall Street plan is to produce biased AI info that serves the system, mostly with some basic, innocent-looking context if it can’t be profitable. The youth and uneducated public will be victims of all this BS. And it’s not AI per se, it’s fabricated corporate language.

  5. Security camera apps that alert your phone when there is motion near your front door, say, are not AI; they’re crafty human technology. The word AI is false: it’s either good technology or fabricated, false, corporate for-profit fake news info.
