How Disco SRE improved Infrastructure Security at scale

06.01.23 by Rahul Ganesh Pai


Infrastructure security and application security go hand in hand in improving security across teams and organizations. This article describes how the Global Discovery SRE team optimized and improved overall security for GCP-based infrastructure and applications.


As teams and cloud-based resources expanded rapidly, we needed to streamline and strengthen safety and security with state-of-the-art protection mechanisms. Over time, the development teams supported by Global Discovery SRE saw substantial growth in their business and products, which resulted in an exponential increase in the usage of infrastructure resources and applications.

The Global Discovery SRE team, in active collaboration with the Delivery Hero Security team and the Security Operations team, implemented cutting-edge security mechanisms and protections for cloud-based resources and services. Using a well-defined security scoring system with fine-grained topics and milestones, we were able to measure continuous security improvements statistically.
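To make the idea of a topic-and-milestone score more concrete, here is a minimal sketch of one way such a score could be computed. The `Topic` structure, the weights and the formula are assumptions made for illustration only; the article does not describe the actual scoring model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    """One security topic with its milestones (hypothetical structure)."""
    name: str
    weight: float          # assumed relative importance of the topic
    milestones_done: int
    milestones_total: int


def security_score(topics):
    """Return the weighted share of completed milestones as a percentage."""
    total_weight = sum(t.weight for t in topics)
    if total_weight <= 0:
        raise ValueError("at least one topic with a positive weight is required")
    weighted = 0.0
    for t in topics:
        if t.milestones_total <= 0:
            raise ValueError(f"topic {t.name!r} has no milestones")
        weighted += t.weight * t.milestones_done / t.milestones_total
    return round(100 * weighted / total_weight, 1)


# Hypothetical example data.
topics = [
    Topic("automated scans", weight=2.0, milestones_done=3, milestones_total=4),
    Topic("secret management", weight=1.0, milestones_done=1, milestones_total=2),
]
print(security_score(topics))
```

Tracking a number like this over time is one simple way to show whether security is improving from one milestone to the next.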

Automated Scans

Automated scanning for sensitive data, security breaches, code quality issues and anomalies in public-facing components and infrastructure resources enables early vulnerability detection and helps SRE and development teams prevent security breaches.

All public-facing environments exposed via Cloudflare are scanned extensively, and the results are logged for future reference. Using a combination of SonarQube and inspector gadget automated scans, all resources and code repositories are scanned for sensitive data, potential security issues and data leaks.
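As a rough illustration of what scanning for sensitive data means in practice, the sketch below checks text line by line against a few patterns. It is not how SonarQube or our internal tooling works; the rule names and patterns are assumptions chosen for the example, and real scanners ship far larger rule sets.

```python
import re

# Hypothetical rules for illustration only.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "hardcoded_password": re.compile(
        r"\bpassword\s*[:=]\s*['\"][^'\"]+['\"]", re.IGNORECASE
    ),
}


def scan_text(text):
    """Return (line_number, rule_name) for every line that matches a rule."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings


# Made-up sample input; the key below is not a real credential.
sample = 'user = "svc"\npassword = "hunter2"\nkey = AKIAABCDEFGHIJKLMNOP\n'
for lineno, rule in scan_text(sample):
    print(f"line {lineno}: {rule}")
```

Running a check like this in CI on every commit is what gives teams the early warning described above.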

To bolster our automated scans further, the Delivery Hero Security team conducts ASV scans on a quarterly basis and generates reports as part of our continuous security improvement.

Configuration and Secret Management

Securing secrets, configurations and other sensitive information, with fine-grained access granted only to the intended resources or users in production and non-production environments, is vital for any organization. Global Discovery SRE uses Vault, an identity-based secrets and encryption management system, as the single source of truth for all sensitive information and secrets, stored in a centralized location. Database backups, which are subject to retention policies, and passwords are always encrypted. These mechanisms ensure that each team can access only the secrets it requires, on demand, based on the policies and privileges allocated to it.
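The principle behind per-team policies can be sketched as a deny-by-default lookup. The snippet below is a simplified model only, not Vault's policy engine or API; the team names, secret paths and capabilities are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical policies: each team maps to (path pattern, allowed capabilities).
POLICIES = {
    "team-search": [("secret/data/search/*", {"read", "list"})],
    "team-sre": [("secret/data/*", {"read", "list", "create", "update"})],
}


def is_allowed(team, path, capability):
    """Deny by default; allow only if a policy of the team grants the capability."""
    for pattern, capabilities in POLICIES.get(team, []):
        if fnmatchcase(path, pattern) and capability in capabilities:
            return True
    return False


print(is_allowed("team-search", "secret/data/search/api-key", "read"))
print(is_allowed("team-search", "secret/data/payments/api-key", "read"))
```

The important design choice is the default: a team without a matching policy gets nothing, so access has to be granted explicitly.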

Patch management

The SRE team uses GKE-managed Kubernetes clusters with GCP-hardened OS images for deploying applications and software. All GKE node pool virtual machines and control planes are automatically upgraded and patched by GCP. Container images specific to the Disco SRE team are always built and patched with a stable version, automated via our CI/CD tool. All container images are OCI compliant and are scanned for vulnerabilities during the build.

Log management

Log management involves centrally collecting, parsing, storing, analyzing, and disposing of data for purposes of identification, troubleshooting, performance management and security monitoring.

All network data traversing our infrastructure is monitored via GCP and Datadog. All external traffic entering or exiting Cloudflare is scanned and logged. Once the network traffic crosses the NAT Gateway, it is captured again at the nginx ingress controller in the GKE cluster before being directed to the target GCP services. This gives us an end-to-end picture of the network traffic moving through our infrastructure.

Audit logs are enabled for each project platform. They provide a security-relevant chronological record: documentary evidence of the sequence of activities that affected a specific operation at any given time. Once captured, the logs are stored as files in a central location and can be accessed by the SRE team and by development teams with the required access privileges.

Hardening and Network Security

CIS Benchmarks set consensus-driven best practices that help teams implement and manage their cybersecurity defenses. Using GCP-provided services and images, the Global Discovery SRE team has implemented hardened, secure configurations for operating systems and applications.

Securing networks and continuously improving our security mechanisms to adapt to changing technological landscapes and threats has always been a high priority for our team. Cloudflare acts as the first layer of protection for all public-facing environments in our infrastructure. Cloudflare-based DDoS protection, firewall rules and TLS encryption for all incoming external traffic bolster the Layer 7 defense. Once traffic passes Cloudflare, the NAT Gateway translates addresses between the public and private networks.

GCP-based fine-grained network policies and firewall rules further improve internal network protection inside the platform. All components inside the platform are in private subnets and are accessible only by resources in the same VPC and through the corporate-provided VPN.
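The access rule for private components can be modelled as a simple source-address check. This is a sketch under assumptions: the CIDR ranges are hypothetical, and it assumes that a source inside either the VPC range or the VPN range is accepted. In practice the rule is enforced by GCP firewall rules and network policies, not by application code.

```python
import ipaddress

# Hypothetical address ranges for illustration only.
VPC_CIDR = ipaddress.ip_network("10.20.0.0/16")   # assumed platform VPC range
VPN_CIDR = ipaddress.ip_network("10.99.0.0/24")   # assumed corporate VPN range


def may_reach_platform(source_ip):
    """Return True only for sources inside the VPC or the VPN range."""
    addr = ipaddress.ip_address(source_ip)
    return addr in VPC_CIDR or addr in VPN_CIDR


for ip in ("10.20.3.7", "10.99.0.15", "203.0.113.10"):
    print(ip, may_reach_platform(ip))
```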

Database instances, which specifically need to be shielded from public access, are provisioned inside private subnets, with access restricted to allowed users or applications within the platform.

Two-factor authentication is enabled wherever possible as added protection for our global tools, internal tools and back-office tools. WPA2 Enterprise protection is enabled for all wireless network communication, with network segmentation for both corporate and platform networks.

The Global Discovery SRE team actively collaborates with the Security and Security Testing teams, which conduct black-box and white-box penetration tests annually. This helps us continuously improve and maintain our infrastructure, application and network security through early identification of vulnerabilities. All test reports and results are captured and stored for future reference.

How continuous security improvements benefited Dev Teams

Development teams are constantly developing and improving applications, and each application involves different technical and security aspects. The Discovery SRE team takes into consideration all feedback and improvement suggestions received from each development team. Every security advancement is implemented in a unified way that benefits the whole tribe. With continuous security improvements planned strategically, phase by phase, in active collaboration with multiple development and security teams, the Discovery SRE team abstracts away the security complications and encapsulates all required security aspects in a simplified and streamlined manner.

This approach provides multiple benefits.

  • Improved total bandwidth for teams, who can focus on their development tasks without being hindered by security ordeals.
  • Continuous security protection, with early identification and detection of security vulnerabilities and risks and status notifications to each team.
  • Faster warning systems and alerting mechanisms for security breaches and incidents, allowing teams to take immediate action and apply effective remedies.
  • Better security awareness and understanding of security compliance for each team.

Conclusion

In summary, Infrastructure and application security are always in a state of continuous improvement. With the changing technological landscape, cloud based tools and resources, improving and maintaining security at the highest standard has always been a top priority for the Global Discovery SRE team.

Streamlined security mechanisms, tools and standards, transparent processes and continuous security improvements, achieved in active collaboration with multiple Development, Testing and Security teams, help the Global Discovery SRE team to always stay a step ahead in security and compliance.


If you like what you’ve read and you’re someone who wants to work on open, interesting projects in a caring environment, check out our full list of open roles here – from Backend to Frontend and everything in between. We’d love to have you on board for an amazing journey ahead.

Rahul Ganesh Pai
Systems Engineer