Monocle: How Chime creates a proactive security & engineering culture (Part 1)
By David Trejo, a member of Chime’s Security Engineering Team
Over the last couple of years, Chime hired over 900 people. As part of this hiring, Chime formed a dedicated Product Security Team that is responsible for application security, cloud security, design and code reviews, threat modeling, building security frameworks and services, vulnerability management, and maintaining Chime’s bug bounty program. Our current ratio of software engineers to security engineers is ~60 to 1.
While building our security program, we realized that we were playing a bit of catch-up, and that there are unique challenges in building a proactive security culture. In this article we’ll cover how we addressed the following hurdles:
- Choosing where to prioritize investments in security
- Empowering engineers and teams to independently improve the security posture of their code
- Achieving these two goals while preserving a philosophy centered on cross-functional collaboration
Our security philosophy
- We seek to create a positive, dynamic relationship with other teams at Chime so that they not only trust us but see us as vital partners they can always come to for help;
- We build preventative guardrails that protect member data while providing engineers with the flexibility to ship quickly and experiment;
- We avoid an easy or simple “no”; instead, we educate and advise on the risks and the solutions that mitigate them;
- We try to ensure Chimers (Chime employees) avoid tedious busywork, seeking to automate where we can and make work simple and efficient.
Chime members (customers) depend on us to keep their personal and account information secure. The Security team takes this responsibility seriously and invests heavily to keep our members’ information safe.
Results of our security work, so far
We’ve built an internal Rails app, Monocle, which educates service and code owners on their current security posture. To gamify things, the app assigns grades to repositories every night, gives teams instructions on how to raise their scores, and provides leaders with a view of the security posture for both their teams and the company as a whole. Monocle powers strategic engineering and security decisions by pulling together key information from across our tools.
Here’s how our repository security scores have moved over time:
How has our app, Monocle, been received?
The most eager teams fix their issues almost immediately while other teams usually follow up within days at most.
Engineers, security engineers, compliance, internal auditors, and governance teams have all been very pleased with how the app has spurred quick action.
Note: All images have been redacted for employee names, photos, descriptions, team names, and repo names.
How did we do it?
I’ll lay out the key elements of Monocle, explain why they work, and share some [redacted] screenshots.
Everyone enjoys making numbers go up
After announcing Monocle, we saw many teams immediately fix their issues so they could reach an A+ without any prompting from the Security team.
Badges on GitHub READMEs are very common in open source, and they’re effective. What if we could condense the top security action items into a Security Score, and then put that in every production service’s README?
So we added badges to all our production repositories. Engineers see the badge, notice that it’s not at an A+, then click through and follow the instructions to earn more points. This reduces the potential for a negative dynamic between the Security and engineering teams. It can be alarming to get a message out of the blue from Security with a reminder to enable branch protection or use approved base images, and Security team members don’t always want to feel like minders either.
Instead, Monocle lifts everyone out of this dynamic by empowering engineers and putting security issues directly in their hands. It encourages them to improve their Security Score while the Security team remains there to advise and help if needed.
The Security Score
As you can see below, we built a Security Score that reflects the actions needed to improve a repository’s security posture. More important score factors are worth more points, which helps teams prioritize their fixes. Score factors are tailored to the type of service, and each factor comes with instructions that teach teams to fix the issue on their own.
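Monocle's own scoring code isn't shown in this article, so here is a minimal sketch of the idea in Python (Monocle itself is a Rails app). The factor names, point values, and instruction text are hypothetical; only the principle that more important factors carry more points, and that each factor ships with fix instructions, comes from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreFactor:
    """One check on a repository. Names and weights here are hypothetical."""
    name: str
    points: int        # more important factors are worth more points
    instructions: str  # how the team can fix the issue on their own


# Hypothetical factor list; a real one would vary by type of service.
FACTORS = [
    ScoreFactor("branch_protection", 30, "Enable branch protection on the default branch."),
    ScoreFactor("approved_base_image", 40, "Build from a base image approved by Security."),
    ScoreFactor("dependencies_patched", 30, "Merge the pending dependency updates."),
]


def security_score(passed, factors=FACTORS):
    """Return the percentage of available points the repository has earned."""
    total = sum(f.points for f in factors)
    earned = sum(f.points for f in factors if f.name in passed)
    return 100.0 * earned / total if total else 100.0


def action_items(passed, factors=FACTORS):
    """Return failing factors, highest-value first, so teams know what to fix next."""
    failing = [f for f in factors if f.name not in passed]
    return sorted(failing, key=lambda f: f.points, reverse=True)
```

Calling `action_items({"branch_protection"})` would list the base image factor first, because it is worth the most points in this made-up weighting.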
Service Score and room for more
We’ve made it easy to add new types of scores so that each platform team can guide engineers in the direction that serves the company best.
Score dips
If a repository’s score ever dips below 80%, we immediately send a message to the team’s channel with tips on how to get it back up right away, and teams almost always do so once they receive that message. Security team members also auto-join the channel so they can be there to provide further advice or tips if needed.
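A rough sketch of that dip check in Python follows. The 80% threshold is from the text above; everything else is an assumption for illustration, including the function name, the message wording, and the choice to alert only when a score crosses the threshold between two runs (so a team isn't messaged again every night while it works on the fix).

```python
ALERT_THRESHOLD = 80.0  # percent, as described in the article


def dip_alerts(previous, current, threshold=ALERT_THRESHOLD):
    """Return one message per repository whose score fell below the threshold.

    previous and current map repository name -> score (0 to 100).
    Assumption: a repository with no previous score is treated as having
    been healthy, so a low first score also triggers an alert.
    """
    messages = []
    for repo, score in sorted(current.items()):
        before = previous.get(repo, 100.0)
        if score < threshold <= before:
            messages.append(
                f"{repo}: Security Score dropped to {score:.0f}% "
                f"(was {before:.0f}%). Open the repo's score page for fix instructions."
            )
    return messages
```

Each returned string would then be posted to the owning team's channel by whatever chat integration you already use.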
Visibility for leaders
Engineering leaders need to understand the security posture of their teams in real time. Monocle’s homepage makes this possible by showing:
- Services that are receiving top scores and thus doing all the right things and more from a security perspective,
- Services with lagging grades that would benefit from a nudge from leadership to pay a little more attention to proactively resolving issues,
- A list of vulnerabilities with their age, giving a feel for how fast teams make fixes,
- Each service’s container security posture, for example, repositories that may not be using base images built by the Security team,
- Services not promptly patching their dependencies, and
- Services not making best use of security tools or settings.
The homepage also shows the median scores for various teams. This information gives leaders data to support the prioritization of investment in security across their teams.
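As a small illustration of the per-team medians, here is a sketch in Python using the standard library. The input shape (tuples of team, repository, score) and the team names are assumptions, not how Monocle stores its data.

```python
from collections import defaultdict
from statistics import median


def median_score_by_team(repo_scores):
    """Group (team, repo, score) tuples by team and return each team's median score."""
    by_team = defaultdict(list)
    for team, _repo, score in repo_scores:
        by_team[team].append(score)
    return {team: median(scores) for team, scores in by_team.items()}


# Hypothetical data for illustration only.
scores = [
    ("payments", "svc-a", 90),
    ("payments", "svc-b", 70),
    ("payments", "svc-c", 100),
    ("growth", "svc-d", 60),
    ("growth", "svc-e", 80),
]
team_medians = median_score_by_team(scores)
```

A median is less sensitive than a mean to one very low-scoring repository, which is one plausible reason to prefer it for a team-level view.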
We also have another page that visualizes how many repositories are passing or failing each of the score factors, including a graph to show which repositories have the most vulnerabilities.
Here’s a sample dashboard:
Extreme detail for day-to-day security sleuthing
Sometimes you need to know about all repositories with, say, log4j as a dependency or which services use our Java base image. That’s where the Fact Breakdown comes in. This page sums up the information associated with all your repos, so you can locate repositories matching your criteria and act on them.
This page is most useful to framework teams, for keeping dependencies up to date, and for security triage during incidents.
Pictured above are a few of the fact breakdowns. Expanding the item shows the repositories with that attribute.
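One way to think about the Fact Breakdown is as an inverted index: instead of listing facts per repository, you list repositories per fact. The sketch below shows that idea in Python; the data shape, fact names, and repository names are hypothetical and say nothing about Monocle's actual schema.

```python
from collections import defaultdict


def fact_breakdown(repo_facts):
    """Invert {repo: {fact_name: value or list of values}} into
    {(fact_name, value): sorted list of repos with that attribute}."""
    index = defaultdict(set)
    for repo, facts in repo_facts.items():
        for name, values in facts.items():
            if isinstance(values, str):
                values = [values]  # treat a single value like a one-item list
            for value in values:
                index[(name, value)].add(repo)
    return {key: sorted(repos) for key, repos in index.items()}


# Hypothetical facts for illustration only.
facts = {
    "svc-a": {"dependency": ["log4j", "guava"], "base_image": "java"},
    "svc-b": {"dependency": ["log4j"], "base_image": "ruby"},
}
breakdown = fact_breakdown(facts)
log4j_repos = breakdown[("dependency", "log4j")]
```

With an index like this, "which repositories depend on log4j?" becomes a single lookup rather than a scan of every repository during an incident.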
Advice for Security teams
If your team doesn’t have time to build something like Monocle, I’d encourage you to think about where the following intersect:
- Bite-sized, actionable security issues
- that engineering teams can fix themselves
- which leaders want visibility into.
And if you prefer a haiku:
Atomic fixes
doable by engineers
observed by leaders
Anything that fits these three criteria over time is something you’d want to automate. There’s no need to overcomplicate things: a cron job that runs regularly against GitHub and sends messages to Slack channels will have a big impact, and is a great place to start.
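To make that starting point concrete, here is a minimal sketch of such a job in Python, using only the standard library. It is not Monocle's code. It assumes a single check (branch protection), GitHub's REST endpoint for branch protection, and a Slack incoming webhook per team channel. The repository list shape, the function names, and the message text are all hypothetical.

```python
import json
import urllib.error
import urllib.request


def branch_is_protected(owner, repo, branch, token):
    """Ask GitHub's REST API whether a branch has protection enabled.

    Assumption: the token is allowed to read branch protection. A 404 is
    treated as "not protected", although GitHub can also return 404 when
    the token cannot see the repository at all.
    """
    url = f"https://api.github.com/repos/{owner}/{repo}/branches/{branch}/protection"
    request = urllib.request.Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    })
    try:
        with urllib.request.urlopen(request):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def post_to_slack(webhook_url, text):
    """Send a plain-text message to a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(request).close()


def nightly_check(repos, is_protected, notify):
    """Run the check over every repo and notify the owning team's channel.

    repos is a list of dicts with hypothetical keys: owner, name, branch, webhook.
    is_protected and notify are passed in so the loop can be tested offline.
    """
    flagged = []
    for repo in repos:
        if not is_protected(repo["owner"], repo["name"], repo["branch"]):
            notify(
                repo["webhook"],
                f"{repo['name']}: branch protection is off on {repo['branch']}. "
                "Enabling it is a quick fix your team can make.",
            )
            flagged.append(repo["name"])
    return flagged
```

In a real run you would pass `lambda o, r, b: branch_is_protected(o, r, b, token)` and `post_to_slack` into `nightly_check`, and schedule the script with cron. Keeping the network calls separate from the loop also makes it easy to add more checks later.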
What security factors and controls have you found to be the most effective? We’d love to hear from you (@DDTrejo or [security [at] chime.com]). You should also say hello if you’re interested in working with us.
I hope this article has inspired you to nurture the Security culture at your company. If you’re interested in learning about similar tools, check out AllStar, Carta’s talk, or Hygieia.
High five,
David Trejo | Security @ Chime
P.S. Our next article in this series will discuss how we drive change with engineers and leaders through regular reporting, Slack messages, and security training.
P.P.S. We’re hiring! And this is my favorite job ever.