A common challenge for government agencies is using unstructured data effectively for analysis. Our government client faces this challenge with filings from EDGAR, the online corporate filing system created by the Securities and Exchange Commission (SEC) to make corporate information more transparent and accessible.
The client wanted to explore AI as a way to make the vast amount of unstructured EDGAR data more manageable and useful for its analysts. However, two main challenges emerged.
The first was the lack of precedent for AI projects in the client's enterprise IT practices and governance policies. The second was the potentially high cost of hosting large language models (LLMs), which have significant computational requirements.
Given these challenges, our client decided to start small by implementing Mistral 7B, a small language model. This choice allowed in-house deployment on an on-premises server, which kept access under government control and required fewer computational resources than a large language model would.
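To give a rough sense of the difference in hosting requirements, the sketch below estimates the memory needed just to hold a model's weights, using the rule of thumb that weight memory is approximately the parameter count multiplied by the bytes per parameter. The figures are assumptions for illustration, not measurements from the client's deployment: Mistral 7B is taken as roughly 7.3 billion parameters, and the 70-billion-parameter model is a hypothetical comparison point. The estimate ignores activations, the key-value cache, and runtime overhead, so real requirements would be higher.

```python
# Rule-of-thumb estimate: weight memory ~= parameter count * bytes per parameter.
# Bytes per parameter for common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Approximate memory for model weights alone, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Assumed parameter counts (illustrative only).
MODELS = {
    "Mistral 7B (assumed ~7.3e9 params)": 7.3e9,
    "hypothetical 70B-parameter model": 70e9,
}

for name, params in MODELS.items():
    for precision in ("fp16", "int4"):
        gb = weight_memory_gb(params, precision)
        print(f"{name} @ {precision}: ~{gb:.1f} GB")
```

Under these assumptions, the smaller model's weights need about 14.6 GB at 16-bit precision, compared with about 140 GB for the hypothetical 70-billion-parameter model, which is the kind of gap that makes a single on-premises server plausible for the former.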