Challenge
State agencies gather and generate massive amounts of data. A Midwest state had approximately 4 petabytes of data siloed in 1,600 databases across 120 separate agencies, boards, and commissions. Despite the vast amount of data, each group used it mainly for its own traditional reporting. No cross-agency sharing or analytics capabilities were in place, which prevented each agency, and the state as a whole, from combining the data into a more holistic picture.
The state's governor recognized the importance of accessing and using the data and tasked an agency with removing the data silos and opening up access across the groups. He maintained that unlocking data was imperative to allow the state to identify and drive meaningful social change and tackle complex problems affecting residents' health, security, and well-being. However, two significant challenges stood in the way of this objective.
The first was privacy. Many of the state agencies are highly regulated and have very strict privacy rules in place. Without proper governance, they couldn't share data while remaining in compliance.
The second was technology. The existing technology landscape was a significant obstacle to creating a data architecture that opened up access throughout the state government. Most agencies were saddled with legacy platforms and didn't have the resources to break long-term contracts or shift to a new, modern data platform.
To meet this objective, the state agency tasked with overseeing the project reached out to New Era to help establish a data sharing platform that would empower the state to maximize its data's potential through analytics.
Solution
Once we assessed the challenges the state agency was facing, we worked with them to develop a robust data sharing platform that would also protect and secure their data. Our data experts used Cloudera's open-source platform distribution and Hadoop's reliable, scalable framework to build a big data platform that allowed the agencies to share resources, tools, and common services to execute their analytical use cases.
To ensure compliance with strict privacy and security regulations, our team's work on the platform included:
- Defining a data governance program and establishing processes and policies, including data lineage, metadata management, quality assessments, and usage monitoring
- Developing deidentification capabilities to protect critical data
- Configuring security services to include authorization, authentication, auditing and monitoring, and encryption
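As an illustration of what a deidentification capability can look like, the following minimal Python sketch replaces direct identifiers with keyed hashes so that records can still be linked across datasets without exposing the raw values. The case study does not say which technique was used on the platform; keyed hashing (HMAC) is an assumption chosen for illustration, and the field names, key, and sample record are hypothetical.

```python
import hashlib
import hmac

# Hypothetical field names; the case study does not describe the actual schema.
DIRECT_IDENTIFIERS = {"name", "ssn"}


def pseudonymize(value: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest so the same input always maps to the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict, key: bytes) -> dict:
    """Replace direct identifiers with tokens and leave all other fields unchanged."""
    return {
        field: pseudonymize(str(value), key) if field in DIRECT_IDENTIFIERS else value
        for field, value in record.items()
    }


# Illustrative values only; a real key would come from a managed secret store.
example_key = b"example-secret-key"
example_record = {"name": "Jane Doe", "ssn": "123-45-6789", "county": "Example County"}
safe_record = deidentify(example_record, example_key)
```

Because the hash is keyed, the same person yields the same token in every dataset processed with that key, while anyone without the key cannot recompute the tokens by hashing guessed identifiers.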
To ensure data could be transferred and shared smoothly between the agencies, our team's work also included:
- Defining the framework to support shared capabilities for data ingestion and analytics
- Building in flexibility so that agencies can integrate the platform with their internal systems
- Defining policies that enable agencies to launch and solicit new projects
- Enabling platform administration services for the ongoing operation and maintenance of the platform
Our team also worked directly with agencies throughout the state, socializing and evangelizing the analytics platform and consulting with them on how to apply it to their use cases. Each agency can operate securely within a flexible and scalable platform that it can customize as its needs change and as the state continues to build out use cases and include additional agencies, boards, and commissions.