About Spoke Phone

Founded in 2016, Spoke is the only approved low-code platform for Twilio’s 235,000 enterprise customers and 9 million developers. Companies of every size and industry use Spoke to transform their businesses across sales, service, marketing, commerce, and more by connecting with customers in a unified way. We build solutions that can revolutionise companies. Join Spoke and discover a future of new opportunities.

Spoke provides integrated communication apps, features, and APIs for Twilio that save months of developer time and cost.

Twilio Customers use Spoke to replace traditional PBX and cloud phone systems with a flexible alternative on Twilio that they control.

Twilio Contact Center Customers use Spoke to connect calls, conversations, and context between contact center agents in Twilio Flex and the rest of the business, without the need for a telco or traditional phone system.

Developers Building on Twilio accelerate projects without building everything themselves, using Spoke’s ready-to-use apps, features, and APIs for Twilio.

With Spoke, customers can now build and deploy the “last mile” on Twilio without any specialist skills, heavy lifting, or ongoing maintenance.

Here is why this job exists

Spoke provides communications freedom for innovative companies that have complex customer journeys. Powered by Twilio, Spoke ensures that companies are never again locked into a one-size-fits-all solution.

We are looking for a Principal Site Reliability Engineer to help take our production infrastructure to the next level as we rapidly expand our services and coverage throughout the world.

Our customer base is growing, and as it grows, so does the demand on our infrastructure. You will be part of our new SRE team, working to maintain, extend, and support the Spoke platform as we expand across the globe.

Our platform is fully serverless, running on AWS Lambda, with our APIs exposed via GraphQL and data stored in PostgreSQL and DynamoDB. We use Terraform to manage our AWS stack and GitHub to manage our codebase, deploying continuously via CircleCI.

We run a flat organisation and don’t follow rigid Scrum, Kanban, or any other specific “agile” process; instead, we rely on conversation and communication to deliver work continuously in an agile, iterative way. We learn from our mistakes and are always improving the way we work and deliver working software.

What the role involves

  • Keeping Spoke’s service up and running, and restoring it quickly when failures occur
  • Working closely with teams and internal partners to ensure that we ship software that meets security, SLA, and performance requirements
  • Collaborating with cross-functional product engineering teams to drive repeatability and reliability in our production infrastructure
  • Refining and sharing anything that makes all teams' lives easier, such as developer tooling, build automation, provisioning, logging, monitoring, and alerting
  • Producing clean, consistent, and well-organised code to automate our infrastructure, builds, deployments, and configurations within our stack
  • Writing code for infrastructure projects, such as data retention, performance and load testing, monitoring and alerting, command-line scripts, and automation
  • Writing, updating, and using documentation, including runbooks/playbooks
  • Automating work including infrastructure needs, testing, failover solutions, failure mitigation, and much more
  • Debugging complex problems across an entire stack and creating solid solutions
  • Designing, implementing, and troubleshooting CI/CD pipelines
  • On-Call Responsibility: You will be one of the main points of contact for alerts and incidents, and responsible for overall reliability and availability

What you will bring

  • 7 years' experience in software engineering, software development, or system operations and administration
  • Excellent communication skills, both verbal and written
  • In-depth knowledge of AWS architecture and security best practices
  • Experience automating infrastructure, testing, and deployments using Terraform, and the ability to explain the Infrastructure as Code paradigm
  • Experience with SQL and NoSQL databases such as PostgreSQL and DynamoDB
  • Experience with Node.js/JavaScript/TypeScript
  • Experience debugging complex problems
  • Experience designing, building, and operating large-scale production systems
  • Experience with automated configuration management
  • An understanding of networking and messaging, especially between services
  • Experience with distributed systems
  • You have impeccable attention to detail, are well organised and self-directed.
  • You are an independent thinker and like to own and solve complex problems.
  • You are willing to wear multiple hats and do what needs to be done, whether or not it’s in your job title.
  • You have experience supporting a system with three-nines (99.9%) reliability requirements.
  • You enjoy instrumenting applications and building monitoring and visualisations.

Good to have

  • Experience with telephony / voice applications
  • A good understanding of telephony and SIP, or a willingness to learn
  • Experience working in a compliance-driven environment (SOC 2, HIPAA, GDPR)
  • Experience with Android, iOS, and Electron applications and build pipelines

Benefits

  • Flexible remote working
  • Health insurance and wellness initiatives
  • Employee share options
  • Promotion from within and cross-functional training