Transcript
Note: This transcript has been edited for clarity.
Mathew Schwartz: Hi, I'm Mathew Schwartz, with Information Security Media Group, and one of the concerns with generative AI is that it might deceive us. But what if we want to use it to deceive others, for good? To discuss that potential, I'm joined by Xavier Bellekens, co-founder and CEO of Lupovis. Xavier, thank you for being in our studio today.
Xavier Bellekens: Hi, Mathew, thank you very much for having me.
Mathew Schwartz: So, Lupovis: this is a deception-as-a-service firm. You have a platform for this, and there is a great blog post that you authored recently posing the question: Can generative AI be used to help you at least experiment with deceiving others, with using cyber deception? So how did you test these kinds of capabilities with deception technology?
Xavier Bellekens: Alright. Well, that's a great question, Mathew. Essentially, we've seen that generative AI is very good at writing text, and also some code. We've recently seen generative AI being used to solve problems that you would usually have to search Stack Overflow for. So we decided to use generative AI to write the code of a decoy, or honeypot, for us, and see what it would come up with.
Mathew Schwartz: How did it do? Does one need to be an expert in this sort of thing - I mean, is there a use case here for your average user? Or do these people, these researchers, know what the right answer looks like, and so there's a little bit of domain expertise that was helping here?
Xavier Bellekens: Essentially, I played the role of a prompt engineer. In order to obtain the right result, you do have to actually put in the right queries. So there's a bit of going back and forth, and sometimes generative AI goes off-track, to say the least.
But there are templates that you can use once you've managed to build a single decoy. You see how the AI reacts and what it's waiting for you to prompt in order to get a very decent output. So what I was doing was understanding how to interact with the AI to get the best output - in some cases reformulating the question, and in other cases building on what the generative AI would provide me.
The way that I thought was most efficient to ask the generative AI was to say: "Look, I would like to build this type of decoy" - let's say an SSH box, right? "I would like you to tell me how I would do it in a couple of steps," and so let the AI come up with the right steps for the task, and then reuse those steps myself and say: "Look, you've said that in step one, this is what we were going to build. Can you provide a code example? And in step two, you've mentioned this - can you provide a code example and merge those together?" And then testing the outcome: is it working? Is it not working? And then feeding back the outcome, so if there is an error, feeding that back. I'd say: "Look, this is what I'm receiving, and this is the output I am expecting."
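(Editor's note: as a rough illustration of the kind of decoy this step-by-step prompting can produce - not Lupovis' actual code - a minimal low-interaction SSH decoy in Python might look like the sketch below. The banner, port and log format are assumptions for the example: it advertises an SSH service, records every connection, and logs whatever the client sends.)

```python
# Illustrative low-interaction SSH decoy (example only, not Lupovis' code).
# Presents an SSH banner, logs every connection and whatever the client
# sends, then closes - enough to attract and record scanners.
import logging
import socket
import threading

logging.basicConfig(
    filename="decoy.log",
    level=logging.INFO,
    format="%(asctime)s %(message)s",
)

BANNER = b"SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.1\r\n"  # plausible-looking banner

def handle(conn: socket.socket, addr: tuple) -> None:
    logging.info("connection from %s:%d", addr[0], addr[1])
    try:
        conn.sendall(BANNER)
        conn.settimeout(10)
        data = conn.recv(1024)  # capture the client's identification string
        logging.info("%s sent: %r", addr[0], data)
    except OSError:
        pass
    finally:
        conn.close()

def main() -> None:
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", 2222))  # unprivileged port; forward 22 -> 2222 when testing
    srv.listen(5)
    while True:
        conn, addr = srv.accept()
        threading.Thread(target=handle, args=(conn, addr), daemon=True).start()

if __name__ == "__main__":
    main()
```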
Mathew Schwartz: So saying: "Can you help me make up the difference there?" Would you need some kind of deception technology platform to be experimenting with this? Or if you've got a Linux box, could you connect this up in an automated manner? If you were getting prompts back from ChatGPT - or another generative AI engine of your choice - what would you need in order to put this into more of a production stance?
Xavier Bellekens: For me, this is a stepping stone for the general public that would like to try deception out and build a decoy themselves: see what happens, understand the technology behind it and maybe answer some of the questions that they may have about the technology. Would I put it in a production environment? Probably not directly. And the reason is that if you're not a deception company, reinventing the wheel might actually create more hassle than coming to a deception company and saying, "well, this is what we are looking for," and being able to deploy at scale and within minutes, right? But at least it allows you to experiment. And if you want to do a very basic proof of concept and understand - is my red team going to detect it or not? - at least you get something, and some results that you can report against, and you have some sort of baseline.
Now, for deception vendors, what you get essentially is the ability to prototype fast. And I'm not saying that this is going to be integrated anytime soon into deception products, but as generative AI evolves, I think we are going to see more and more of that generative AI being included in deception tools, because it's fantastic at generating text, at generating content, maybe graphics. All of that can contribute to maybe not reducing the load, but at least providing that stepping stone and removing some of the effort for the software developers.
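(Editor's note: a minimal sketch of what folding generative AI into deception tooling could look like in practice - here, asking a model to generate plausible filler content for a decoy. It assumes the `openai` Python package, v1 or later, with an OPENAI_API_KEY set in the environment; the model name, prompt and output file are illustrative, not a Lupovis integration.)

```python
# Illustrative sketch: use a generative model to populate a decoy with
# plausible content. Assumes the `openai` package (v1+) and OPENAI_API_KEY
# in the environment; model name and prompt are examples only.
from openai import OpenAI

client = OpenAI()

prompt = (
    "Generate a realistic-looking Linux directory listing for a small "
    "e-commerce site's /var/www, as the raw output of `ls -la`. Invent "
    "file names only; do not include any real data."
)

resp = client.chat.completions.create(
    model="gpt-4",  # illustrative; any capable chat model would do
    messages=[{"role": "user", "content": prompt}],
)

# Drop the generated listing into the decoy's filesystem content.
with open("decoy_www_listing.txt", "w") as f:
    f.write(resp.choices[0].message.content)
```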
Mathew Schwartz: Speaking of effort, one could imagine all sorts of use cases for generative AI, both defensive and offensive. In theory, hackers are going to be saying: "I want you to find me SSH access into the biggest thousand organizations you can come up with and return the working credentials to me. Go." So I guess in theory you're also looking here at fighting fire with fire?
Xavier Bellekens: So far, what we've seen with the likes of ChatGPT and Bard is that there are some barriers put in place to make sure that you're not using them for anything malicious. We know that the barriers can be bypassed, right? We've also seen cases where generative AI is being used by adversaries, and so having the ability, as a blue team and as defenders, to use generative AI against them and increase their workload - making sure that they spend their time on decoys - is a good step forward.
Also, because the technology base is widely available, we are at the same level as adversaries. Usually we play a game of catch-up, right? A vulnerability is found, attackers scan the internet, and we need to go and defend. Here, we've got a piece of technology that is accessible to blue teams, and accessible to red teams and adversarial folks. So it gives us just as many possibilities. I think currently the only barrier we have is imagination, and making sure that we get the right outcome - and this is what we were testing. Essentially, we're like: okay, they are going to use it for malicious intent, right? How do we make sure that we use it for the greater good? How do we make sure that we trick them into attacking decoys? And how can we leverage this fantastic new tool into something that benefits us as defenders, and change that paradigm?
Mathew Schwartz: Speaking of decoys: as you detailed in your blog post, you came up with a lot of different ones, eventually, using all of these various prompts. If I recall correctly, there was a passport database, a CCTV camera or two, maybe a PLC, suggesting that it was some kind of a manufacturing environment. Why did you pick this swath? Was this a proof of concept? And were there any surprises in how these automated or hands-on keyboard attackers attempted to manipulate them?
Xavier Bellekens: I used to be a pentester, a long time ago. I follow some of the groups out there, and I wanted to understand a bit more about the draw of some of those technologies, from very common technologies to something slightly less common. So, for example, would a passport database be of interest if I just left it open on the web? Or CCTV cameras, which we know are vulnerable and being hacked on a daily basis? If I recall correctly, the CCTV and the PLC were the ones that were attacked the most. There's no surprise there, right? We see a lot of mass scanners going around the web looking specifically for CCTVs or for PLCs. A PLC gives access to an operational network. And with a CCTV, there is access to the video feed, and you never know what's on the video feed - I think that's what makes it interesting.
We also saw, interestingly enough, some scanners attacking those decoys and then finding a way in, right on a vulnerability that had been created by ChatGPT, or on a misconfiguration, and then using some other tool. Looking at the way those adversaries would then exploit those specific devices, in many cases we saw that once a mass scanner had found a vulnerability, a human would then connect to the decoy and explore it. And of course, in doing that - whether or not they realize they are on a decoy - the alert has already been given, and this is the very big benefit of deception. You're going to get that high-fidelity alert that says: by the way, a human is on your decoy. This is what they are doing; this is how they are interacting with it.
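(Editor's note: the scanner-then-human pattern Bellekens describes is often separated with simple timing heuristics. The sketch below is a hypothetical illustration, not Lupovis' detection logic: automated scanners tend to send commands in rapid, regular bursts, while a human exploring a decoy pauses irregularly between commands.)

```python
# Hypothetical heuristic for raising a "human on the decoy" alert,
# based on the timing of commands within one session.
import statistics

def looks_human(timestamps: list[float]) -> bool:
    """timestamps: arrival times (in seconds) of commands in one session."""
    if len(timestamps) < 3:
        return False
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean_gap = statistics.mean(gaps)
    jitter = statistics.pstdev(gaps)
    # Humans: gaps of a second or more, with noticeable variation;
    # scanners: sub-second, near-constant gaps. Thresholds are assumptions.
    return mean_gap > 1.0 and jitter > 0.5

# A scanner bursting five probes in half a second ...
print(looks_human([0.0, 0.1, 0.2, 0.3, 0.4]))  # False
# ... versus a person pausing to read output between commands.
print(looks_human([0.0, 2.5, 3.1, 7.8, 9.0]))  # True
```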
So that's essentially what we were doing. And we can see this as being very interesting, because it also shows some of the modus operandi - the tactics, techniques and procedures - or vulnerabilities being executed against those decoys.
Mathew Schwartz: So a great use case there for deception is to give you some early warning. We've seen with ransomware attacks, for example, if you've got some early warning, sometimes you can get the blocks in place before the crypto-locking malware comes out. When you're looking at using ChatGPT to create more realistic-looking decoys - I know you said already that you don't envision putting this into production right away - I would guess we're not at the point where it can replace skilled researchers who can help craft really enticing fake IT infrastructure, CCTV, or, I don't know, puppy photos? Anything to divert the attacker a little bit. Will ChatGPT be doing this in an automated fashion anytime soon?
Xavier Bellekens: You'll need that prompt engineering, and ChatGPT can only go so far. That's one of the big limitations that we hit during that experiment: ChatGPT is not going to provide you with all of the graphics. ChatGPT is going to provide that stepping stone. Generative AI is going to build something that works, that is of interest, and that follows a very specific template based on your needs and requirements. But then there's an extra step if you want to, let's say, personify the decoy. In some cases, one of the prompts that I kept using with ChatGPT was: "Now that we've got a decoy that is fully functioning, what are the 10 steps that you would follow to make this decoy better?"
It's interesting to understand what ChatGPT believes those 10 steps are in order to get more value and trick more adversaries, and you can also make other requests. For example: "I'm looking at deceiving adversaries that use these 15 MITRE ATT&CK techniques. How do I incorporate this into my decoys?" Or: "Here's the list of the decoys that you've built with me, and this is what I want to do. These are the tactics, techniques and procedures that I would like to deceive. What do you recommend?"
Of course, we could totally do that ourselves, right? We have engineers who are there specifically to build decoys. But it's interesting to understand: is there any data that we are not considering? Is there anything that can be provided? Our engineers rely on their own knowledge and all of the information that we collect as an organization. But ChatGPT, with essentially a global outlook up to 2021, still draws on a larger pool of information than we, as a collective mind within Lupovis, can draw on. So it's interesting to see the differences, and sometimes to question those differences and to understand the reasons behind them.
Again, here I would say that if you have an engineering team and a design team, you'll be able to pass on all of the information given by ChatGPT and reuse it to improve the decoys, and draw adversaries toward the decoys in a more, let's say, convincing fashion.
Mathew Schwartz: This sounds like something that you might now revisit on a regular basis, especially as generative AI improves and the data set gets a bit more recent.
Xavier Bellekens: Exactly. That's right. It's an interesting topic because, of course, they are retraining. We've got GPT-3.5 and GPT-4, and I'm expecting GPT-5 and 6, and those will bring new advancements. They will also probably be trained on some of the data that's been fed in - what's been entered into ChatGPT - and much more. We already see a difference between GPT-3 and GPT-4 in terms of the code each is able to provide, so I would expect a leap between GPT-4 and GPT-5, with the outputs and information that we receive becoming even better.
Mathew Schwartz: Fascinating stuff. It is fun to get a sense of how this is being used in the deception realm by organizations such as yourself, and how it will probably continue to be used in the future. Xavier, thank you so much for your time and your insights today.
Xavier Bellekens: Thank you very much, Mathew, for having me.