NYU's Brennan Lodge on Training Your Own Model With Retrieval-Augmented Generation
                
Many cybersecurity organizations hope generative artificial intelligence and large language models will help them secure the enterprise and comply with the latest regulations. But to date, commercial LLMs have some big problems – hallucinations and a lack of timely data, said Brennan Lodge, professor of information technology and data analytics at New York University.
Lodge made the case for organizations building out their own LLM using retrieval-augmented generation, or RAG, which continuously updates the data and categorizes it by the type of information it contains, similar to a card catalog system in a library. The approach also helps address another major problem posed by commercial, off-the-shelf LLMs: a lack of visibility into how they were trained. Lodge’s prototype RAG vector database includes a link to supporting information with every response, to help peer into the “black box” of the LLM.
“At least you have a reference to connect to and you can feel more confident about that system that you manage,” Lodge said. “You own the data and you can use it for cybersecurity-type stuff.”
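To make the idea concrete, here is a minimal sketch of the retrieval step in Python. It is not Lodge's prototype: the `Doc`, `VectorStore` and `build_prompt` names, the category labels and the example.com links are all hypothetical, and a toy word-count vector stands in for the embedding model and vector database a real deployment would use. It shows only how categorized documents can be retrieved and how a source link can travel with every piece of context handed to the model.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Doc:
    text: str
    category: str  # hypothetical label, e.g. "vulnerability" or "regulation"
    source: str    # placeholder link that accompanies every answer


def embed(text: str) -> Counter:
    """Toy word-count vector; an assumption standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []

    def add(self, doc: Doc) -> None:
        # New documents can be added at any time, which keeps the data current.
        self._items.append((embed(doc.text), doc))

    def search(self, query: str, k: int = 2, category: str = None) -> list:
        """Return up to k matching docs, optionally limited to one category."""
        q = embed(query)
        scored = [
            (cosine(q, vec), doc)
            for vec, doc in self._items
            if category is None or doc.category == category
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(question: str, docs: list) -> str:
    """Assemble the text sent to the LLM, with a source link per context item."""
    context = "\n".join(
        f"[{i}] {d.text} (source: {d.source})" for i, d in enumerate(docs, 1)
    )
    return (
        "Answer using only the context below and cite sources by number.\n"
        f"{context}\nQuestion: {question}"
    )


store = VectorStore()
store.add(Doc("Patch the VPN gateway for the remote code execution flaw.",
              "vulnerability", "https://example.com/advisory-1"))
store.add(Doc("New rule requires incident disclosure within four days.",
              "regulation", "https://example.com/rule-2"))

question = "Which VPN flaw needs a patch?"
print(build_prompt(question, store.search(question)))
```

The prompt printed at the end would then be passed to whichever LLM the organization runs. Because each context item carries its link, the analyst can check an answer against the underlying document.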
RAG could help a security analyst sift through threat intelligence feeds, vulnerability alerts or the latest updates to the MITRE framework. Personnel in risk and compliance could use the system to continually monitor and respond to new regulations by creating an “automated gap analysis” of the organization’s existing policies.
In this video interview with Information Security Media Group at Black Hat 2024, Lodge also discussed:
- The strengths and weaknesses of traditional LLMs;
- Use cases for retrieval-augmented generation and options for implementation;
- The role of AI tools in cybersecurity operations.
Lodge, who focuses on information management and data at NYU, also serves as co-founder and CTO of Skidaway. He has worked in the financial services industry for more than 15 years and has held cybersecurity, data science and leadership roles at JPMorgan Chase, the Federal Reserve Bank of New York, Bloomberg and Goldman Sachs.

