Splunk, a Cisco company, is building a safer and more resilient digital world with an end-to-end full stack platform made for a hybrid, multi-cloud world. Leading enterprises use our unified security and observability platform to keep their digital systems secure and reliable. Our customers love our technology, but it's our caring employees that make Splunk stand out as an amazing career destination. No matter where in the world or what level of the organization, we approach our work with kindness. So bring your work experience, problem-solving skills and talent, of course, but also bring your joy, your passion and all the things that make you, you. Come help organizations be their best, while you reach new heights with a team that has your back.
Join us as we pursue our innovative vision to make machine data accessible, usable, and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we're committed to our work, our customers, having fun, and most meaningfully, to each other's success. Are you passionate about working on critical systems to create tangible customer impact? Would you like the opportunity to work at a growing company that is changing the way that information is used to support business decisions? If this resonates with you, we would love to speak with you.
At Splunk, each and every release of our software is highly scrutinized to meet the demands of our customers. As a Senior Performance Engineer in Test in the Performance, Scalability & Resiliency (PSR) team, you'll have a critical impact on our products' success. Rising to the unique challenges of big data security, scalability, performance, and availability will be your passion and ours too!
You will collaborate with peers, field teams, and customers to understand the use cases and encapsulate them into industry-standard benchmarks. You will work with product management and interface directly with our customers to get direct exposure to the usage patterns we strive to satisfy. You will also drive projects to integrate benchmarking into our continuous integration and test automation frameworks. You bring strong quality ethics and a shift-left mentality, and you collaborate closely with software engineers, designers, architects, and product managers to release innovative, high-quality products.
Role Summary
This role requires a person familiar with software development and standard methodologies around performance certification, benchmarks, CI/CD, and quality engineering practices. You must have experience developing, configuring, deploying, testing, and debugging in cloud-based environments such as Kubernetes and distributed systems, in addition to on-premises application deployments. You will test a multitude of applications and technologies on our platform.
Meet the Products and Technology Team
Want to build security and observability products people love AND work with people as smart (and humble) as you are? Our products and technology team delivers digital resilience at enterprise scale with a self-service Splunk portfolio that offers unified security analytics, full stack observability and real-time visibility of streaming data. Learn more about the team, meet our leaders, and hear from Splunk technologists and engineers at
splunk.com/careers/products-and-technology.
What you'll get to do
Day-to-Day Contributions:
- Define, design, and implement performance and scalability benchmarks for the Cisco-Splunk observability portfolio.
- Identify opportunities for engineering productivity improvements or new directions, and evangelize them successfully.
- Understand the system-wide functionality, then create test plans and automated tests. Collaborate with developers, PMs, and infra/operations engineers to deliver a high-quality product.
- Find and troubleshoot bugs during testing or automation failures. Help development with any vital setup and reproduction of scale-related issues seen in production to promote collaboration and efficiency.
- Help the team estimate software deliverables, often across a multi-sprint timeline.
- Use standard load generation tools such as JMeter, Gatling, Locust, and Apache's ab. You will also need to develop custom tools and applications that generate large volumes of custom data to test the backend services or agents.
- Work efficiently with various profiling tools to identify performance and concurrency bottlenecks, then propose and implement optimizations to improve the Cisco-Splunk observability portfolio.
- Find the root cause of performance bottlenecks with profiling tools such as flame graphs, pprof, pstack, qmlprofiler, and perf.
- Develop and run the automated test pipelines needed to certify software that handles large volumes of data.
- Be willing to learn, adapt, and adopt modern technologies as needed, including software development and test frameworks.
Who You'll Work With
You will be part of the Performance, Scale & Resiliency Engineering team. As part of the team, you will test and certify the entire fleet of microservices and distributed systems using industry-standard load generation tools as well as many custom-built tools. If statements like “hundreds of millions of calls per minute” and “petabytes per minute” don’t scare you, then this is the team you will want to be a part of! You will collaborate with multiple teams and product managers to ensure that Cisco-Splunk observability products are highly performant.
Who You Are
This is a technical role that requires a strong background in developing and testing highly scalable, data-intensive, distributed SaaS applications. You should be able to write high-quality, concurrent, and scalable software. You should have a good working knowledge of cloud-based distributed enterprise applications. The ability to produce and analyze heap dumps, flame graphs, and logs to identify hot spots is a must. You need to have a team-first attitude; this will be critical for success as we test and support hundreds of services.
MINIMUM REQUIREMENTS: (Required per OFCCP compliance)
- Master’s or Bachelor’s degree in Computer Science, or an equivalent engineering degree.
- 10+ years of proven experience.
- Strong coding skills in Java, Python, or Go.
- Active experience building CI/CD for customer-facing products.
- Hands-on experience with and understanding of TestNG and REST APIs.
- Ability to build custom test applications to mimic customer use cases.
- Passionate about debugging complex distributed systems.
- Good background with Kubernetes, including Helm packaging, and Docker.
- Experience with at least one of the AWS, Azure, or GCP cloud services, as well as Kubernetes and Kafka.
Highly desired skills:
- Working knowledge of Git (Bitbucket), Artifactory, CI/CD (TeamCity), and Bash/Makefile.
- Basic knowledge of OpenTelemetry concepts.
- Experience working with observability tools such as Splunk, AppDynamics, Grafana.
- Knowledge of UI performance tools such as WebPageTest or Playwright.