
"DevOps Right Now: Cutting Through Complexity in Uncertain Times"

Below is an article originally written by Lance Allen and Tori Wieldt at PowerToFly Partner Mission, and published on June 23, 2020. Go to Mission's page on PowerToFly to see their open positions and learn more.

Coping With The Complexity Of Modern Applications: The AWS Perspective

The biggest challenge with modern applications is complexity. Modern applications are typically assembled from many small pieces. These small components, nowadays called microservices, were created because we needed applications that could change rapidly. And that's great. But the downside is that it creates tremendously complex applications.

That raises the next challenge: How do you deal with these complex applications?

This is where the topic of DevOps comes into play. Too many organizations think DevOps is just about automating delivery. Automation is important. But it's also important to know why you're automating. The point is not to push a button and send code to production once a quarter. The point is to do this frequently.

It's about how often and how quickly. Automation is a byproduct of your frequency of delivery — how often you're innovating and trying things. It's not automation first for its own sake. It's automation in order to get this innovation flywheel in place. That's the real point.

Let's look at the development transformation at Amazon between 2001 and 2009 to gain some context around the evolution of microservices and, subsequently, the DevOps philosophy.

Development Transformation At Amazon: 2001 - 2009

Like so many other organizations, Amazon started out with a monolithic model. They had one giant application.

In 2001, it had just one big relational database that everybody shared. A development team of several hundred people was using this database, and it became very cumbersome to make changes. Any change meant replacing the whole thing at once.

To deal with this, they did something interesting and unusual for its time. They had a top-down directive to become bottom-up.

From the top down, they decided to incorporate what at the time was called service-oriented architecture. Nowadays, we would call this going to a microservices architecture. But back then, people didn't really talk about microservices very much.

They decided to start building all their software as small components. Why small? Because, alongside the decision to split everything apart, they decided to work in small teams. This is where the well-known idea of two-pizza teams came from: teams small enough that the whole team could be fed with only two pizzas.

This idea of small teams working independently on their own pieces of software became the goal within Amazon. They broke the whole organization from working on one big app made of a hundred pieces into what is now thousands of little companies working somewhat independently. What's interesting is that this approach scaled. But the flip side is that, as it evolved into thousands (and nowadays millions) of microservices, it got to the point where it was more than any one person could comprehend. This is where tooling became absolutely required.

Measuring Reliability Across Application Infrastructure With New Relic

Bringing everything into one tool helps. It doesn't matter if it's logs or OpenTelemetry, or if you're using Prometheus or something else. What's important is that you can aggregate the data and slice and dice it the way you need.

Observability is a term people are talking about a lot now. It's a way to look at the system and know not only what happened, but why it happened, down to the root cause. Monitoring just tells you when something happens. Observability lets you ask why. Most important is what you do with that information: you want to pinpoint issues and resolve them as quickly as possible, and you want to tie the data to business outcomes.

You can take advantage of that data in a lot of different ways. In terms of reactive strategies, you can gain visibility into your systems, establish baselines, set basic alerts, and fix the obvious problems. You can get rid of your outliers and understand "what's normal."

From a proactive strategy, you want to be able to fix the issues before customers notice. You don't want to find out on Twitter that your site's down or people are having a bad experience. You also want to optimize the processes so teams can balance resiliency with speed. There's often a lot of pressure to deliver things more quickly. But if you're not able to roll back things really quickly, there are going to be some problems. So you want to be able to balance those two competing issues.

Finally, being data-driven, you want to be able to constantly improve the customer experience and make sure your systems are resilient.

So what does that look like inside New Relic?

If you start out reactive (the word has a bad connotation, but it's a good place to start), look and see how your applications are doing.

Also look at the user experience. New Relic reports Apdex, a score that compares response times against a target to give you a sense of how users are experiencing your system: whether they're satisfied, tolerating, or frustrated. It's a great place to start to get a sense of how your users are doing.
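To make the score concrete, here is a minimal sketch of the standard Apdex calculation. The function name, the sample response times, and the 0.5 second target are illustrative assumptions, not New Relic's implementation or defaults.

```python
def apdex(response_times, threshold):
    """Return the Apdex score (0.0 to 1.0) for response times in seconds.

    `threshold` is the target time T. A sample at or below T counts as
    satisfied, a sample above T but at or below 4T counts as tolerating,
    and anything slower counts as frustrated.
    """
    if not response_times:
        raise ValueError("need at least one sample")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    # Tolerating samples count for half; frustrated samples count for nothing.
    return (satisfied + tolerating / 2) / len(response_times)


# Hypothetical sample: three fast requests, two slow ones, one very slow one.
samples = [0.2, 0.3, 0.4, 0.9, 1.5, 2.5]
score = apdex(samples, threshold=0.5)
print(f"Apdex: {score:.2f}")
```

A score near 1.0 means almost everyone is within the target; the choice of target is a judgment call that depends on the application.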

Next, look at your error rate. You've got to get rid of some of those errors because you want to smooth things out and remove the big outliers. Then you have a good sense of what normal is for the operations of your systems.

You also want to start setting alerts. Dashboards are great, and there are a lot of things they can do. But you don't have people who can sit around and watch dashboards all day. So it's important to set alerts, and it should be easy to set them. And don't forget to tune them.

If you see the gray area in the above image, that's automatic baselining. New Relic can look at your systems and get a sense of what the band of normal is. And then it's very easy to define thresholds and answer, "How long do you want something to be flapping before you actually page somebody about it?" So it's important to set a baseline of alerts for that.
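The idea of a band of normal plus a "how long before we page" rule can be sketched roughly as follows. This is a simplified stand-in, not New Relic's baselining algorithm: the band here is just the mean plus or minus three standard deviations, and the function names, sample values, and the three-sample window are all assumptions for illustration.

```python
from statistics import mean, stdev


def baseline_band(history, width=3.0):
    """Return (low, high) as the mean +/- `width` standard deviations."""
    mu = mean(history)
    sigma = stdev(history)
    return mu - width * sigma, mu + width * sigma


def should_page(recent, high, sustained_for=3):
    """Page only if the last `sustained_for` samples are all above `high`.

    A single spike that recovers does not page, which keeps a
    flapping metric from waking someone up.
    """
    if len(recent) < sustained_for:
        return False
    return all(value > high for value in recent[-sustained_for:])


# Hypothetical response times in milliseconds from a quiet period.
history = [100, 102, 98, 101, 99, 100, 103, 97]
low, high = baseline_band(history)

one_spike = should_page([100, 120, 101, 99], high)
sustained = should_page([101, 110, 115, 120], high)
print(low, high, one_spike, sustained)
```

Tuning an alert then comes down to adjusting the band width and the sustained window until pages line up with problems people actually care about.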

Moving into the next, more proactive phase, it's important to establish SLOs. The above case features users that wanted to scratch their own itch. So they went into the New Relic platform and built their own application, which is pretty straightforward to do. Looking at the four golden signals (latency, traffic, errors, and saturation), they want to understand how they're doing against their targets. Then, if they need to, they can easily drill down to see what's going on.

One of the biggest parts of setting SLOs is getting all the right people in the room to agree about what the most important thing is. This can be the hard work. Once it's agreed upon, the tool is straightforward and easy to use. But it's really important to do that front-loaded work.
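Once a target is agreed on, checking progress against it is simple arithmetic. The sketch below assumes an availability-style SLO measured as the share of successful requests, and uses the common "error budget" framing, which the article itself does not mention. The function name and all the numbers are hypothetical.

```python
def error_budget_remaining(total_requests, failed_requests, slo_target):
    """Return the fraction of the error budget left in this window.

    `slo_target` is the agreed success ratio, for example 0.999.
    The result is 1.0 when nothing has failed, 0.0 when the budget
    is exactly used up, and negative when the SLO has been missed.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1")
    allowed_failures = (1 - slo_target) * total_requests
    return 1 - failed_requests / allowed_failures


# Hypothetical window: one million requests, 250 failures, 99.9% target.
remaining = error_budget_remaining(1_000_000, 250, slo_target=0.999)
print(f"Error budget remaining: {remaining:.0%}")
```

A team might treat a shrinking budget as a signal to slow releases and a healthy one as room to ship faster, which is one way to balance resiliency with speed.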

It's also important to release often in the proactive stage and understand the impact of every release.

The above dashboard belongs to a New Relic user monitoring a mobile app they own. You can see that they want to know their crash rate, a very important metric. Looking at the Apdex, they can measure the customer experience by seeing how many users are annoyed. It's important to track these things so you can see the impact of each release, and to have a way to roll back if you need to.

Achieving Operational Excellence By Leveraging AWS And New Relic In Your DevOps Journey — Mission Consulting

Mission Consulting works with customers to understand their business goals and objectives, as well as the applications they're running so that we can serve those business needs. We aim to have a strong understanding of your infrastructure and the processes of your environment so we can be of value as a DevOps consulting service provider.

In general, workloads tend to become more complex over time. As new features are released into the application or the product, new infrastructure is provisioned to support them. And as teams grow, the processes continue to evolve.

DevOps is about understanding all that complexity and pairing you with our senior engineers and consultants who can really do that deep technical discovery and understand the application — how it's packaged, how it's built, how it's deployed, how to run it, and how to serve it at a large scale in production.

We use tools like New Relic to instrument and monitor the environment and help with observability. We capture not just metrics, but the right metrics, and that distinction matters. Even a simple architecture could have hundreds of different data points to monitor and alert on, but not every one is necessarily relevant. It's really about determining the right things to monitor and alert on.

Along with the general architecture and roadmapping, we perform hands-on-keyboard work. Our engineers are involved with activities like scripting, creating the automation components that help businesses achieve their goals, setting up CI/CD pipelines and deployment processes, and doing infrastructure-as-code work like templating.

If you have any additional questions about Mission, our managed DevOps service, the New Relic product, or the AWS platform, please feel free to reach out.

