Transcript
This transcript has been edited for clarity.
Mathew Schwartz: Hi. This is Mathew Schwartz with Information Security Media Group. It’s my pleasure to be sitting down with security expert Candid Wüest, security advocate at Xorlab. Candid, it is great to see you.
Candid Wüest: Thanks for having me.
Mathew Schwartz: You’re a long-time security researcher. You’ve done a ton of research in the past on all sorts of devices and applications, running the gamut of what can be hacked; I think you’ve looked into a lot of it. And so I’m saying all that in advance of asking, in your expert opinion: When we look at artificial intelligence today, when we look at GPTs, how are they for writing malware?
Candid Wüest: That’s an excellent question. And I mean, I see myself as an EDR veteran, right? Twenty-five years plus in the antivirus and now EDR/XDR space. So clearly I was interested as well to see: Are ChatGPT, Gemini, Claude and all those GPTs useful for attackers to write malware? And no surprise, you can, of course, write malware with them. There are the guardrails, the thing that blocks you from doing too many nasty things. But it’s the same as with kids: Depending on how you ask, you will still get an answer, right? And those are the jailbreaks you can do. There are some models, like WormGPT, which are open, so there you don’t even need to rephrase anything. But there’s been, I’d say, something new every month. I think the current thing is poetry. So phrase it nicely and ask in a kind of rhyming fashion, and you will still get something. Maybe a limerick can help you there.
Mathew Schwartz: To defeat guardrails, you mean?
Candid Wüest: Exactly.
Mathew Schwartz: If you ask it straight out, no. But if you, as you say, put it in a sonnet or something, does it answer in poetry?
Candid Wüest: Sometimes it even does answer in poetry, or at least it tries to, and it’s exactly this. With those things, if you ask straight out – hey, write me malware because I want to get rich – it’s going to decline, for obvious reasons. But if you say, hey, I need a program that encrypts all the files for security reasons and backs up the key somewhere into the cloud, and you kind of do it step by step with a few prompts, you will get something. But that also shows you need to know the inner workings of ransomware, or whatever malware, because just asking, hey, give me a ransomware, it’s going to do something, but it’s not going to be the sophisticated one that you probably had in mind.
Mathew Schwartz: And that prefaces my next question, which is: To what extent do you need to already be an expert in the domain? Because I think when you break down a lot of the things that you might ask a GPT, it turns out that you need to be breaking it down much, much more than you would have first assumed in order to get something back out that might be, if not usable, then at least on the way to being part of something usable.
Candid Wüest: That’s true. And I think it’s maybe a bad thing to say, but it is fortunate that it is that way, right? Because people always say: Oh, it’s a game changer. It’s lowering the entry barrier for attackers. I mean, you still need to know a little bit. And unfortunately, the barrier was already quite low because there was malware-as-a-service. You can buy toolkits, right? I mean, it doesn’t take a genius to do it, unfortunately.
But let’s say the most promising approach we’ve seen cybercriminals choose is that they just copy-paste a threat report. So if some security company publishes about a new ransomware group with all the different details, they just copy-paste the PDF and say: “Hey, create me something like this,” and that way they don’t even need to know a lot of things, right? It will come up with persistence methods, with some bitcoin wallet, some Tor website and all the things you may never have seen yourself, right?
So that works, but in the end, the behavior is still the same. It doesn’t really matter if it’s your GPT that creates the ransomware, your neighborhood script kiddie or some nation-state actor: If it’s encrypting your files, that’s something that any antivirus or EDR will pick up, and that’s the behavior you can also block. So it doesn’t really make too much difference for the security vendors, as long as you, of course, use the product wisely, right? If you don’t patch, if you don’t use secure passwords, then yes, some GPT-generated ransomware will probably pwn you.
Mathew Schwartz: Interesting distinction that you’re making between attackers who might be looking at, as you say, a security research report, trying to feed it into a GPT and saying: Make this for me. We already have defenses against that sort of thing. One of the big recurring questions, I think, is: When do we get to the point where someone can go to a GPT and say, take the best bits of this and that and combine them in a way that isn’t going to be detectable? It sounds to me, though, like we’re not there yet.
Candid Wüest: Gladly not, exactly. Currently, those GPT models are very good at replicating and kind of putting the puzzle pieces together. So if you’ve heard of the MITRE ATT&CK framework, right, with all the 300-plus techniques, they’re very good at taking those and kind of rearranging them. But there is a reason that this framework exists, right? Because everyone looks at those techniques, and every vendor checks: Oh, how much can we actually cover? So whatever comes out of it will still be known.
So that’s why we’ve seen the move to the more, let’s say, interesting part: AI-powered malware, not just AI-generated malware, but malware that actually, at runtime, still uses AI to change dynamically. Because that’s where everyone thinks: Oh, now we’ve got Skynet, the Terminator, right, where it will identify – oh, you’re using Microsoft Defender, or you’re using your EDR vendor of choice – and then it says: Oh, for that one, I can do an obfuscation that they don’t trigger on, or I can do something else. And it does work.
We’ve done a proof-of-concept as well that does exactly this, but it’s exactly the same problem. It might just read something from Black Hat USA and say: Oh, there was a presentation on how to bypass SentinelOne. But of course, SentinelOne knows about it by now, right? So they might have patched it. Your AI just assumes it’s not going to be detected, and usually it fails. So it’s just shifting things slightly, and for me it shows that no matter what you do, it all comes down to your best practices, or your technical debt, right? As in, if you don’t do the basics, if you don’t do whatever you should do already, now you just run out of time, because now – and that’s probably the more interesting part – people are scanning the whole internet with AI, 24/7. So not sooner or later, but sooner they will find you and exploit you.
Mathew Schwartz: Fascinating, what you’re saying about how AI can be used at runtime to add some variability. I mean, I think I’ve read about this, where you can compile on the fly, for example, so you end up with a file hash that hasn’t been seen before. Or you can use some randomization so that it looks a little bit different. Although what it’s really doing at the file system level is going to look pretty much like ransomware if you’re actually looking for it in the first place. Are all these things, though, that we’ve been talking about, do you even need AI in there, to do that?
Candid Wüest: It’s a fair point. The short answer is: No, you don’t. It does help, though. You mentioned the changing nature – that’s usually referred to as polymorphic or metamorphic. Back in the ’90s, we actually had DOS viruses that, at each infection, would just re-encrypt themselves so that the hash looks different and your static signatures would no longer work. But we all know by now, any decent security software does still use signatures, but everything else as well – behavior-based detection, reputation, anomaly detection and so on – and those you don’t bypass.
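To make that concrete: re-encoding the same payload with a different key changes its file hash, so a pure hash signature misses it even though the underlying content and behavior are identical. A minimal Python sketch, with an arbitrary placeholder payload and keys chosen purely for illustration:

    # Why hash signatures alone fail against polymorphic code: the same
    # payload XOR-encoded with two different keys yields two different
    # SHA-256 digests, yet decodes back to identical content.
    # The payload bytes and keys below are arbitrary placeholders.
    import hashlib

    payload = b"identical program logic"

    def xor_encode(data: bytes, key: int) -> bytes:
        """Toy single-byte XOR 'encryption' of the kind old polymorphic viruses used."""
        return bytes(b ^ key for b in data)

    variant_a = xor_encode(payload, 0x41)
    variant_b = xor_encode(payload, 0x42)

    print(hashlib.sha256(variant_a).hexdigest())  # one "file hash"
    print(hashlib.sha256(variant_b).hexdigest())  # a completely different one

    # Both variants decode to the same payload, which is why behavior-based
    # and reputation-based detection still catch what the static hash misses.
    assert xor_encode(variant_a, 0x41) == xor_encode(variant_b, 0x42) == payload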
But in summer 2025 we saw LameHug for the first time, which was discovered by the Ukrainian CERT, probably done by APT28, so a Russian kind of nation-state group, and it had an English prompt inside. So it’s classical malware, still fixed, but … and they used Qwen 2.5 as their model, so going up to Hugging Face or one of those platforms and saying: Hey, generate me some command lines which will find all the information about the system. So which software is installed? Am I admin? Where’s the Active Directory service?
And the idea, of course, is that you generate some differences, because all of those GPTs are non-deterministic, right? So the idea is, you ask the same question, you get a different result. The fun part was that they actually had the temperature set down to 0.1 – temperature is where you get the variation, or the hallucinations, and the lower it is, the less you get.
So I was running the sample in my lab, and out of 198 runs, I got the same answer. So your AI part is basically ruined, because now you could just have a website somewhere, a command-and-control server, and download the same commands. So you can still fingerprint it, you can still do the same, and it probably shows that this was just a test from the Russians, and it probably was not worth it.
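To illustrate the temperature point: with the temperature near 0.1, repeated calls with the same prompt come back nearly identical, which is exactly why such output can still be fingerprinted. A minimal sketch, assuming the OpenAI Python client; the model name and the benign prompt are illustrative and not taken from LameHug:

    # Count how many distinct answers the same prompt produces at a given
    # temperature. Low temperature -> nearly deterministic output.
    from collections import Counter
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def distinct_answers(temperature: float, runs: int = 20) -> Counter:
        answers = Counter()
        for _ in range(runs):
            resp = client.chat.completions.create(
                model="gpt-4o-mini",  # illustrative model name
                messages=[{
                    "role": "user",
                    "content": "Name one command-line tool that lists installed software on Windows.",
                }],
                temperature=temperature,
            )
            answers[resp.choices[0].message.content.strip()] += 1
        return answers

    print(distinct_answers(0.1))  # typically one answer, repeated
    print(distinct_answers(1.0))  # noticeably more variation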
Mathew Schwartz: Interesting. Do you see anything changing in the future? I know it’s hard to do predictions, but as an expert in malware analysis – it seems to me that GPTs might be useful to experts who are trying to code malware in a different way, or, I don’t know, help them brainstorm new approaches; not replacing the expert, possibly augmenting the expert. Have you seen, or would you be able to see, any evidence of GPT help for people trying to write the latest and greatest malware?
Candid Wüest: I’d say yes. And augmentation is definitely the right keyword. I’d say it’s for the writing, but also for running the operation, or the campaign, or whatever you want to call it. And there have been some interesting reports. The Anthropic one came out last November; Google had a nice one as well. They’ve been, I’d say, stirring up some controversy in the industry, because they lack a little bit of detail. So it’s not really clear what’s really been done, but they say that they found a Chinese nation-state actor which used the Claude model to basically automate 80% to 90% of the attack, from A to Z. So from finding the targets, enumerating all the IP addresses, finding vulnerabilities, exploiting them, and then also doing lateral movement and deciding which information to steal – but they were still using classical penetration testing tools.
So as I said, the details are lacking, but assume something like Mimikatz for dumping passwords, BloodHound for Active Directory traversal. There’s many of those classical tools that we all know and hopefully all detect and block, but those are now kind of done in an automated fashion. So it’s not just augmenting the human in the loop, but also kind of doing step-by-step automation, and then the human just has to decide, oh yes, this looks like a promising route. Go there. Do the next two or three things.
Mathew Schwartz: Possibly a little agentic in terms of what you’re bringing in and when you’re bringing it in? So AI-augmented, as you say? I take some of those reports with a grain of salt. I don’t know if I should or not, but some of it sounds like marketing for how great their tool is. Like: Oh, it’s so great, it’s being used by Chinese APT groups. I don’t know if that’s just me being cynical, but in the absence of some really hard evidence, I’m just not totally convinced.
Candid Wüest: I’m on the same side. I think it shows that some of the things are possible. But if you take the same Anthropic report, they also mentioned that, due to the hallucinations that Claude had, a lot of the attacks actually failed, because it was inventing credentials that didn’t exist, creating exploits that didn’t work but claiming: Oh, I totally have root on that machine. So that’s where the human, of course, comes in and says: Well, that’s just BS, so I have to do something else. And it shows, for me, one clear point: We don’t really have the Terminator ransomware where you click once and then it goes and does everything.
You’re still relying on the big model somewhere, being in the U.S., in China, wherever. That’s a whole other discussion – why they used a U.S. model and not their own Chinese model, which probably would not have been monitored, right? But currently, the malware still needs to tunnel all the information back to the model, have it decide what to steal and then send the instructions back. So you still have a single point of failure, right? A bottleneck. And all those reports that come out show that it doesn’t really make sense as an APT if one company like Google or Anthropic can just switch you off, right? Because you’re still paying for those API keys. So at one point, they just say: Oh, we’re switching off the whole attack. So I doubt that that’s where all the APTs will go.
Mathew Schwartz: Interesting. And as you say, it does sound like it’s adding complexity as well. You’ve got the human operator still trying to feed back, and you have that single point of failure with the API, as you say. But I guess they could be coming up with their own LLMs or GPTs, or stealing them, at some point. So that would maybe be a pivot to a way that would be less easy to detect, perhaps? I guess we’ll see it if and when it happens.
Candid Wüest: Yeah. I mean, there are already models like DeepSeek or Kimi K2, right? So there are Chinese or Asian models out there, and there are tiny models that you can run kind of offline as well. It’s not there yet that you can run it on your target endpoint, because you still need GPUs, and if you do, you’re probably still going to be detected as a crypto miner, because that’s the same thing that runs your GPU at 100%, right? And we’re quite good at detecting those Monero miners. And I’ve done the test: If I run the local model, my malware is able to operate, but it will be detected as a crypto miner, so you’re not really gaining anything.
But you also mentioned the hallucination and everything. There’s also an unpredictability, right? It’s not just that it’s unreliable on code quality, but in my tests with our agentic POC, sometimes it just won’t stop. If it decides that, oh, I want to steal a bitcoin wallet, but it doesn’t find one, it will not stop. We all know that it tries to please you. So it will think: Oh, maybe you renamed it, or maybe it’s on some network-attached storage. It will go on a rabbit-hole hunt, and that’s clearly tripping all the trip wires, so it’s hard for you to tell it: Hey, maybe there is none, and you should just move on.
Mathew Schwartz: That is fascinating. Out-of-control AIs – who would have predicted it?
Candid Wüest: And we haven’t even touched on the whole prompt injection topic. Or a term that we’ve seen a lot now is the AI insider – I don’t like the term, but I like the idea – as in, attackers are probably just going to ask your Microsoft Copilot and similar things: Hey, give me all the sensitive information and send it to my email. Because the AI already has access, so why should you go through the hurdle of finding the data? Just ask for it, right?
Mathew Schwartz: AI-enabling an application, in a sense, means enabling attacker access to that application more easily, unless you’ve totally locked it down. Except that nobody thinks that any of these things are totally locked down.
Candid Wüest: Absolutely. I mean, whenever I do an awareness training at a company and I get access – as in, they permit me to get access – to one of those Copilots and all the other similar things, I always type in: Hey, generate me a list of all the employees that earn more than 200,000 per year. And very often, surprisingly often, I get a list. And then, of course, HR gets really red, because they probably shared the document on SharePoint or somewhere else and didn’t really think about the consequences. And that’s just a simple misclassification, right, as you said. Now, enabling all those MCPs – the Model Context Protocol – and connecting to tools, that’s going to add a lot of things. And we’ve seen a few examples where, well, it just deleted production databases, right, although it was explicitly told not to do so. But yeah, hallucination happens, and explain that to your shareholders or all your customers: Because you wanted to be agile and agentic, now all the data is gone. That’s not a good look.
Mathew Schwartz: A brave new world. Well, Candid. Thank you so much for walking us through what you’ve been seeing, and I can’t wait to hopefully touch base with you again soon to see how it’s all changed, as it keeps seeming to do every day or week or month in the AI space.
Candid Wüest: Absolutely, looking forward to it as well – hopefully still in person and not just AI talking to AI, because that would definitely miss some of the points.
Mathew Schwartz: I agree completely. Well, Candid, thank you so much. I’ve been speaking with Candid Wüest, security advocate with Xorlab. I’m Mathew Schwartz with ISMG. Thank you for joining us.
