Telegram channel r_devops - Reddit DevOps

Reddit DevOps. #devops Thanks @reddit2telegram and @r_channels

Reddit DevOps

We Don't Need Perfect Code

https://www.youtube.com/watch?v=al8KN5incV8

https://redd.it/1hg82j2
@r_devops

Reddit DevOps

Incident Response Metrics

When I started driving incident response processes, I spent a ton of time building custom reporting and massaging data to report out and show that our incidents were becoming less impactful, resolved faster, or trending in the right direction. Eventually, I built an internal self-service ops platform to streamline it all. The incidents of year 1 differed from those we saw 3 years later, increasing in complexity.

Given everyone's experience, I’m curious about the metrics you track on the operations side. I assume MTTR is one of them, but do you also monitor MTTD (detect)—the time between when an impactful change was executed (the actual start of the incident) and when it was declared? Is anyone creating SLOs and properly rolling the SLIs up to them?

Do you look at DORA metrics, such as change failure rate, specifically changes that lead to incidents? Do you track anything related to team incident readiness?
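
As a rough sketch of how the metrics mentioned above could be computed from incident records: the `Incident` fields and function names below are hypothetical, and MTTR is assumed here to run from the actual start of the incident to its resolution (some teams measure it from declaration instead).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    started_at: datetime    # when the impactful change was executed
    declared_at: datetime   # when the incident was declared
    resolved_at: datetime   # when the incident was resolved
    caused_by_change: bool  # True if a change or deployment triggered it


def mean_time_to_detect(incidents: list) -> timedelta:
    """Average gap between the actual start and the declaration (MTTD)."""
    seconds = mean((i.declared_at - i.started_at).total_seconds() for i in incidents)
    return timedelta(seconds=seconds)


def mean_time_to_resolve(incidents: list) -> timedelta:
    """Average gap between the actual start and the resolution (assumed MTTR definition)."""
    seconds = mean((i.resolved_at - i.started_at).total_seconds() for i in incidents)
    return timedelta(seconds=seconds)


def change_failure_rate(incidents: list, total_changes: int) -> float:
    """Share of all changes that led to an incident (DORA-style change failure rate)."""
    if total_changes <= 0:
        raise ValueError("total_changes must be positive")
    failed = sum(1 for i in incidents if i.caused_by_change)
    return failed / total_changes
```

Keeping `started_at` separate from `declared_at` is what makes MTTD measurable at all; if only the declaration time is recorded, detection delay is invisible.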

I’d love to hear which metrics you find most valuable and if there are others you wish you had.

I'm always looking to learn from folks who’ve been through it.

https://redd.it/1hg369y
@r_devops

Reddit DevOps

Handling Feedback and Building Positive Interactions in DevOps

I’ve been reflecting lately on how I interact with others, especially after realising that I sometimes unintentionally upset people. It’s certainly not my intention, but these moments have got me thinking more about feedback and communication.

In a recent discussion within my group, we talked about how crucial communication is in a DevOps environment. Misunderstandings can arise easily, given the high-pressure nature of our work and the variety of personalities and roles we collaborate with.

One thing I’ve noticed is that everyone has their own opinion and their unique way of doing things, and, let’s be honest, nobody likes being told what to do or feeling like they’re being told they’re bad at their job. This is especially true in DevOps, where collaboration is key, but so is respecting individual expertise and autonomy. Striking the right balance between offering guidance and letting people take ownership can be challenging.

The conversation made me curious to learn more about how others in the wider DevOps community handle feedback and navigate interpersonal dynamics.

Some questions I’d love to explore include:

* How do you handle giving constructive feedback without causing offense?
* What strategies have helped you receive and process feedback constructively, even when it stings?
* Are there any practices you’ve adopted to foster better understanding and collaboration in your teams?

https://redd.it/1hfv4oo
@r_devops

Reddit DevOps

CI/CD pipeline uses Helm to deploy apps without packaging them in an OCI registry

Am I the worst person alive? I have implemented Helm deployment through our CI/CD pipeline: the build phase pulls the Helm charts from the app's repository and makes an artifact from them. When it's time to deploy, the pipeline simply runs a `helm upgrade` command using the artifacted chart. I have seen a lot of people mention that they instead package their Helm charts and push them to an OCI registry, which is fine, but in our case I feel like it's an additional step and an additional dependency (the OCI registry) that we don't really need. Any thoughts?
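
For illustration only, the deploy step described above might be sketched like this. The release name, chart path, and namespace are hypothetical, and the function only builds the `helm upgrade` argument list from an unpacked chart artifact; it does not run anything.

```python
from pathlib import Path
from typing import List, Optional


def helm_upgrade_command(release: str, chart_dir: Path, namespace: str,
                         values_file: Optional[Path] = None) -> List[str]:
    """Build the argv for deploying a chart straight from a local artifact directory."""
    cmd = [
        "helm", "upgrade", "--install",   # install the release if it does not exist yet
        release, str(chart_dir),          # the chart is a local path, not an OCI reference
        "--namespace", namespace,
        "--wait",                         # block until the resources report ready
    ]
    if values_file is not None:
        cmd += ["--values", str(values_file)]
    return cmd
```

In a pipeline, the returned list could be passed to `subprocess.run(cmd, check=True)` after the build artifact has been downloaded. The trade-off raised in the post stays the same: the chart version is tied to the pipeline artifact rather than to a registry tag.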

https://redd.it/1hfueb7
@r_devops

Reddit DevOps

Let’s Smash 2025 Together: Join My DevOps Accountability Group

With the new year just around the corner, I’m setting some big goals for my career, my skills, and my mental wellbeing. But I know from experience that it’s tough to stay motivated on your own. That’s why I’ve started a DevOps Accountability Group—a space for engineers to share goals, track progress, and support each other. It’s not just about tech—we also tackle topics like burnout, productivity, and finding balance in a demanding industry.

Whether you’re working towards certifications, improving your DevOps skills, or just trying to manage the chaos of on-call shifts, this group is here to help you stay on track. If you’re looking for a supportive community to grow with, drop me a comment or a message, and I’ll send you the invite.

Let’s make 2025 the year we grow, achieve, and thrive—together! 🚀

https://redd.it/1hftqmg
@r_devops

Reddit DevOps

Migrating from Elasticsearch to OpenSearch

Hello all

I am working on a project to migrate some Elasticsearch clusters to OpenSearch. I would like to know if anyone here has done this and can recommend strategies for doing it in production for a customer-facing application. So far I've evaluated two options.

Option 1 would require a "read-only" downtime during which any writes to the cluster would error out. This would give us time to migrate all the relevant indexes using the `reindex` operation and to update the code to point to the OS cluster instead of the ES one.

Option 2 is more involved. It would require updating the application code to dual-write for a period of time, then reindexing all the data written before dual-write was enabled, and, once we have confidence, doing a full cutover. Option 2 has a lot of issues: the code changes are more complex, and I don't yet know how to deal with data divergence, e.g. if a write fails in OS but succeeds in ES. However, I am getting word that a zero-downtime approach is strongly preferred.
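
One possible way to reason about the divergence problem in option 2 is sketched below. This is an assumption-laden illustration, not a recommendation from the post: `InMemoryIndex` is a stand-in for a cluster client (not a real Elasticsearch or OpenSearch API), and the idea is simply to keep ES as the source of truth, record any document whose shadow write to OS failed, and replay it later.

```python
from dataclasses import dataclass, field


class InMemoryIndex:
    """Hypothetical stand-in for a search cluster client; fails for chosen ids."""

    def __init__(self, fail_ids=()):
        self.docs = {}
        self.fail_ids = set(fail_ids)

    def index(self, doc_id, doc):
        if doc_id in self.fail_ids:
            raise RuntimeError(f"write failed for {doc_id}")
        self.docs[doc_id] = doc


@dataclass
class DualWriter:
    primary: InMemoryIndex     # existing ES cluster, still the source of truth
    secondary: InMemoryIndex   # new OS cluster, receives shadow writes
    diverged: list = field(default_factory=list)

    def write(self, doc_id, doc):
        # A primary failure propagates to the caller, exactly as before the migration.
        self.primary.index(doc_id, doc)
        try:
            self.secondary.index(doc_id, doc)
        except Exception:
            # Do not fail the user request; remember the id for later repair.
            self.diverged.append(doc_id)

    def repair(self):
        """Replay diverged documents from the primary into the secondary."""
        remaining = []
        for doc_id in self.diverged:
            try:
                self.secondary.index(doc_id, self.primary.docs[doc_id])
            except Exception:
                remaining.append(doc_id)
        self.diverged = remaining
```

In a real system the `diverged` list would need to be durable (a queue or table) rather than in memory, and cutover would wait until it is empty.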

Any advice would be great!

edit: ES 7.10, OS 2.15

https://redd.it/1hfmzdu
@r_devops

Reddit DevOps

I failed the last round of a Platform Engineer job interview because of a database migration question. How bad was my answer?

For clarity, I am a Full-Stack developer, but these days my work is 80% front-end.

I have learnt all DevOps and Cloud-related stuff in my own time by studying and applying it to my own projects. I'm trying to break into a cloud-focused role because it has interested me for a while now.

This round was a whiteboarding session where I was tasked with migrating a legacy system to a cloud provider. We used GCP service names, as we have both worked with this provider.

One of the components of the legacy system was the database. The interviewer said let's assume it's an SQL-based database and it's running on a VM somewhere.

By this point in the interview, I had asked questions and got the following points:

- Developers are manually having to SSH into VMs to make changes and they are struggling to manage.
- The company expects to grow within 12 months, up to 100k DAU. They currently have 10k DAU.

Once we get into the database section, it kind of went like this:

>Me:
For the database, I would recommend a managed SQL service, Cloud SQL.
The advantage of this choice is that it takes a lot of cognitive load off the developers. DB admin tasks like backups become much easier. It is also easier to scale than a self-managed database on a VM.
The disadvantage of this choice is the cost: it will easily be one of the most expensive components of this architecture. But I feel the trade-off is worth it, as managing a database for this scenario can be a full-time job in itself.

>Interviewer:
Ok great, that trade-off makes sense. Let's talk about the actual migration of the data from the current system to GCP. We want minimal downtime, we can maybe afford a maximum of an hour of downtime.

>Me:
Ok, what is the rough amount of data we need to transfer? I'm guessing it's not Petabytes of data?

>Interviewer:
Correct, it's not that level of data, but equally, it's not so little that we can just do a quick and simple export/import

>Me:
Ok, I know that cloud providers do have database migration services available. These aim to do a lot of the work for you. However, I am aware that they only fit very simple scenarios. You have to have a supported db version and a relatively simple schema, so let's assume these services do not fit the bill.

>In that case, I would replicate the legacy database to the cloud-based one. Once replication is done, we can ensure new records are added by adding a dual write strategy at the application level, so both databases are fully up to date. Then, after communicating with users about the expected hour downtime, ideally during a low-traffic period, we can mark the old database as read-only and switch over to the new instance.

>The upside of this approach is that we can get downtime down to under an hour. The downside is that we need to write application-level code purely for this migration, which increases complexity and chances of mistakes, so it's something the developers will need to keep in mind.

And to be honest, the interviewer just OK'd that and we moved on to another question.

However, in the feedback, one of the main points was that I need to expand my knowledge on database migration strategies.

Was my answer straight-up wrong? Or did I fail to expand on the solution more?
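
One step the answer above leaves implicit is confirming that both databases really are in sync before the switch. A minimal sketch of such a pre-cutover check follows, using `sqlite3` purely as a runnable stand-in for the legacy and cloud databases; the helper names are hypothetical and the table and key names are assumed to be trusted input.

```python
import hashlib
import sqlite3


def table_fingerprint(conn: sqlite3.Connection, table: str, key: str):
    """Return (row_count, sha256 over all rows ordered by key) for one table."""
    digest = hashlib.sha256()
    count = 0
    # Identifiers are interpolated directly, so only pass trusted names here.
    for row in conn.execute(f'SELECT * FROM "{table}" ORDER BY "{key}"'):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def in_sync(old: sqlite3.Connection, new: sqlite3.Connection, table: str, key: str) -> bool:
    """True when both databases hold identical rows for the given table."""
    return table_fingerprint(old, table, key) == table_fingerprint(new, table, key)
```

For large tables a full scan like this is expensive, so the comparison would usually be done per key range or on a sample while the old database is read-only.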

https://redd.it/1hfgw4b
@r_devops

Reddit DevOps

In your opinion what's the best way to insert values for the app?

Most shops I know use the Argo Helm plugin, with a values.yaml file in the app repo that Argo reads to deploy the app. Secrets are managed with either rblindfold, sops, or a secrets-injector sidecar.


What's your preferred method?

https://redd.it/1hfcaay
@r_devops

Reddit DevOps

Ways to build a portfolio and practice different DevOps tools

Hi, I have completed a couple of courses on different DevOps tools, and now I want to practice working with those tools. The thing is, I don't know how.

It might sound stupid, but when I was learning something like Python, I just came up with some kind of program I wanted to make and started coding. With Terraform or Docker, I find this approach doesn't work as well, and when it comes to practicing pipelines or Kubernetes, I spend a lot of time finding an app to learn with and not on the tools that matter.

The same goes for building a portfolio to have something to show during a job search. If I want to show that I know Python, I just make an app and put it on GitHub, but when it comes to DevOps, I don't know what I should put there.

So my question is: how do I practice using those DevOps tools? Is making my own projects, similar to programming ones, the way to go, or is there a more efficient way? Should I just use some existing apps to learn and focus on the DevOps stuff?

https://redd.it/1hf47wl
@r_devops

Reddit DevOps

DevOps vs TestManager vs Test Automation as career

Hello,

I am curious about your advice. I have been working in IT for more than 15 years now.
I was a software developer for a few years, then spent around a decade in Test Automation Engineer / Test Lead / Test Manager positions.

For the last 3 years I have been working as a DevOps engineer. I think I am good at what I am doing, but I wouldn't say I am great. In testing, I can say I am great. My personal opinion, though, is that DevOps is a much more dynamic and challenging field, so that is not a surprise.

I enjoy working in both fields. The reason I changed to DevOps wasn't that I hated testing, but that I was afraid that with the emergence of AI there is not really a future in that field. (I am following the number of job posts, and that seems to confirm this.)

I recently went to a few job interviews (mostly out of curiosity). I got 2 DevOps offers with pretty much the same salary I have currently.
However, another company I interviewed with offered 2 positions, a Test Automation Lead and a Test Manager one, for an around 20% higher salary than my current one.
Either position would be much less challenging and super easy to do.

Unfortunately, that still doesn't change the fact that if I want to switch again in the future, I probably won't be able to find anything (that is not crap) in testing, and I would have forgotten enough, or DevOps would have changed enough, that I would be a junior at most.

What do you think? What would you do? Do you agree that DevOps has a much brighter future, or am I just mistaken?


https://redd.it/1hexqcw
@r_devops

Reddit DevOps

Software Engineer Jobs Report 12/11: 1200 new jobs. Every week I scrape the internet for recently posted software engineer jobs. I hand pick the best ones, put them in a list, and share them to help your job search. Here is last week's spreadsheet. DevOps roles included.

Hey friends, every week I search the internet for software engineer jobs that have been recently posted on a company's career page. I collect the jobs, put them in a spreadsheet, and share them with anyone who's looking for their next role. All for free.

The data is sourced by my own web scraping bots, paid sources, free sources, VC sites, and the typical job board sites. I spend an ungodly amount of time on the web so you don't have to!

About me: I am a senior software engineer with a decade of work history and enough job searching experience to know that it's a long game and it's a numbers game.

If there are other roles you'd like to see, let me know in the comments.

To get the nicely formatted spreadsheet, click here.

If you want to read my write up, click here.

If you want to get these in an email, click here.

If you want to see all previous job reports, click here.

Cheers!

https://redd.it/1hey8g6
@r_devops

Reddit DevOps

How do you use Dynatrace effectively?

I am trying to build a dashboard in Dynatrace from the metrics of an application that exports them via Prometheus. Example:

    self.histogram_e2e_time_request = self.histogram_cls(
        name="e2e_request_latency_seconds",
        documentation="Histogram of end to end request latency in seconds.",
        labelnames=labelnames,
        buckets=[1.0, 2.5, 5.0, 10.0, 15.0, 20.0, 30.0, 40.0, 50.0, 60.0])


I am not even able to display the different buckets or the different percentiles, e.g. P99 and P95. Coming from Grafana, this is a huge surprise to me. Can anyone point me in the right direction?
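
For context, the percentiles in question are not stored in a Prometheus histogram; they are estimated from the cumulative bucket counts. The sketch below shows that estimation with linear interpolation inside a bucket, in the spirit of Prometheus's `histogram_quantile`. It is a simplified illustration with made-up sample counts, not Dynatrace functionality.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound,
    with math.inf as the last upper bound.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls beyond the last finite bucket.
                return prev_bound
            if count == prev_count:
                return bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound
```

Because of the even-spread assumption, the estimate can never be more precise than the bucket boundaries, which is why the bucket list in the metric definition matters for P95/P99 accuracy.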

https://redd.it/1hesi5d
@r_devops

Reddit DevOps

What is up with Pulumi pricing? Or are my expectations off?

Hi,

Please let me know if this question doesn't fit the rules of the sub, I can remove it.

I was researching today for alternatives to Terraform. The main issue I have with it is that there is little support for different environments - and I want to be able to just reuse the same file in different environments.

I want each developer to be able to bring up a stack to check their changes really quickly and bring it down when they're done. So each developer can do something like "bring up this stack with name 'abcTest'" without having to write config files from scratch.

One option is Terragrunt, another one is Pulumi. I have seen reviews about the benefits of one vs the other, and I am still trying to understand which one I think would be nicer to work with. But before that, there is one thing which is really bugging me and preventing me from even trying Pulumi out - the pricing.

The price is $0.37 *per resource*. This is mind-bogglingly high, no? If we assume you only have 10-20 resources in prod (bare-bones, small project), then multiply that by staging and dev - that's about $20/month, which sounds reasonable.

But add the experimental stacks that engineers can experiment on (e.g. the 'abcTest' stack), and you pay about $4 per experiment. Every time you want to try a new setup in a new environment, it's a $4 cost. That sounds incredibly high to me.
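
The arithmetic above can be checked with a few lines. The $0.37 figure is the one quoted in this post (treated here as a monthly per-resource price, as the post does), not verified pricing, and the function name is hypothetical.

```python
PRICE_PER_RESOURCE = 0.37  # USD, the figure quoted in the post


def monthly_cost(resources_per_stack: int, stacks: int,
                 price: float = PRICE_PER_RESOURCE) -> float:
    """Total cost for a number of identical stacks, rounded to cents."""
    return round(resources_per_stack * stacks * price, 2)


# prod + staging + dev at 10 to 20 resources each
low = monthly_cost(10, 3)        # 10 * 3 * 0.37
high = monthly_cost(20, 3)       # 20 * 3 * 0.37

# one extra experimental stack such as 'abcTest' with 10 resources
experiment = monthly_cost(10, 1)
```

With these inputs the three environments land between roughly $11 and $22, and a single 10-resource experiment at about $3.70, which matches the "$20/month" and "$4 per experiment" estimates.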

In general I really dislike this form of pricing - "the more you use it, the more you pay" is great for them, but horrible for you. I _want_ to use resources because I want to use infrastructure as code. I don't want to optimize my setup to use as few resources as possible because each resource costs me a lot of money - I want to use the tool as best as I can, without this additional per-resource constraint.

Anyway, are my expectations off? How do you manage hundreds of resources across dozens of environments? Are you paying thousands of dollars for the infra management alone?

https://redd.it/1heq8xl
@r_devops

Reddit DevOps

Tips for Terraform for DevOps interviews

I have been trying to switch to DevOps from a developer role. The main issue I'm facing is that I don't have proper production experience with DevOps.
I know Jenkins, Python, and shell, have learned Docker and Kubernetes, and got the Cloud Practitioner certification.
I am in the process of learning Terraform and would appreciate any tips on where to learn it.
I get stuck when interviewers ask about production situations I have no experience with.
Do you happen to have any tips on how to improve this?
I would also appreciate any sort of training ground (KodeKloud?) to practice this, as EKS is not included in the free tier.

https://redd.it/1hen70e
@r_devops

Reddit DevOps

What's next?

Hello everyone,

I am a junior DevOps engineer looking for guidance. Currently, I am working at a startup where I work with Kubernetes, Terraform, Linux, AWS, etc. I am not very proficient in any of them, but I am trying to improve—it's never too late!

Last week, I obtained the AWS SAA certification. What should I learn next to differentiate myself in the market?

I also see a lot of discussions about how a DevOps engineer should have a background in either development or operations first, but I don't have one. What should I do?

https://redd.it/1he7r4j
@r_devops

Reddit DevOps

Blameless culture?

Does it really exist?
What does it look like in reality?

https://redd.it/1hg65pv
@r_devops

Reddit DevOps

How would you manage/automate this mess?

I believe we use almost every possible way to host SQL Server: on AWS EC2 (as an FCI cluster), on AWS EC2 (as an AG cluster), on Multi-AZ RDS SQL Server, on Single-AZ RDS SQL Server, and we’re currently in the middle of a POC for RDS Custom for SQL Server.

Our application follows a single-tenant database architecture, which means we host around 2000 databases—a number that is always changing:

* We onboard new customers, which means creating new databases.
* Customers terminate their subscriptions, which means we drop their database(s).
* We receive a backup file from the implementation/support team or the customer directly to create a new database (yes, each customer can have anywhere from **1 to 20+ DBs**) or override an existing database with their backup file.

At this point, even a simple restore operation is different across all our servers. For example, to override a database:

* **On an AG cluster**: Remove the database from AG first, put the database in single-user mode, restore it, switch back to multi-user mode, and then add it back to the AG.
* **On an FCI cluster**: Follow the same steps as AG, except for removing/adding it from/to AG.
* **On Single-AZ RDS**: Use a stored procedure. If you’re overriding an existing database, you can add logic to check if the database exists, drop it, and then restore using the stored procedure.
* **On Multi-AZ RDS**: Use the same stored procedure, but if you’re overriding an existing database, you need to drop it using different SQL commands (because it’s Multi-AZ) and then run the stored procedure for the restore operation.
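
One way to tame the differences listed above is to encode them once as data and let every automation task ask for "the steps for this host type". The sketch below only models the step order described in this post; the enum, function, and step names are hypothetical labels, not real SQL Server or RDS commands.

```python
from enum import Enum
from typing import List


class Hosting(Enum):
    EC2_AG = "EC2 AG cluster"
    EC2_FCI = "EC2 FCI cluster"
    RDS_SINGLE_AZ = "Single-AZ RDS"
    RDS_MULTI_AZ = "Multi-AZ RDS"


def override_steps(hosting: Hosting) -> List[str]:
    """Return the ordered steps for overriding an existing database from a backup."""
    # Steps shared by the self-managed EC2 variants.
    core = ["set single-user mode", "restore from backup", "set multi-user mode"]
    if hosting is Hosting.EC2_AG:
        return ["remove database from AG", *core, "add database back to AG"]
    if hosting is Hosting.EC2_FCI:
        return core
    if hosting is Hosting.RDS_SINGLE_AZ:
        return ["drop database if it exists", "restore via stored procedure"]
    if hosting is Hosting.RDS_MULTI_AZ:
        return ["drop database with Multi-AZ-specific commands",
                "restore via stored procedure"]
    raise ValueError(f"unsupported hosting type: {hosting}")
```

Each step label would map to a real implementation per platform; the point of the table is that callers (onboarding, offboarding, customer restores) never branch on the hosting type themselves.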

I’m not even diving into other tasks, like managing users/logins (each customer has their own login, used by our applications), read-only users, or performing some other updates/changes across every single database.

So, how would you manage this mess?
Does anyone know of any third-party automation tools that could help?

https://redd.it/1hg28ff
@r_devops

Reddit DevOps

To version control or not to version control

I was assigned as a project manager to a team that has to migrate an mssql database to a Salesforce instance.

The project used to be in the hands of one developer. I used to ask if everything was progressing fine. For a couple of months he assured me it was. When we actually had to see results, it didn't work out.

Therefore higher management decided to add two developers to help him (and check on the work). I supported that.

In my company they usually don't expect project managers to have a lot of technical knowledge. But I do have some development experience and I wanted to really understand what they were doing, as this migration was by this point a huge mess and I could never challenge them when they said something.

The first thing I noticed was that they weren't using version control for their migration scripts. They use a combination of tsql scripts, "SQL database procedures" and "Talend jobs".

So my first reaction was: we have to put things in version control because our scripts are all over the place and we can't track changes.

However they claimed that version control is not possible for the procedures and the Talend jobs.

Based on a couple of Google searches I realized that isn't entirely true. So when I told them that, they told me (quite annoyed) that it also wouldn't be practical as they have to put the scripts on a secured server and can't easily transfer them from their local computer to the secured server.

My reaction was that we then should find a way to fix that but the team unanimously concluded that version control really wasn't the biggest priority right now. We should urgently fix the bugs in the scripts and do this 'one time' migration. Although it isn't really one time as we have to do it for four offices.

I was really confident that I was right at the start, but I'm starting to hesitate as I feel that the whole team doesn't agree. What's your opinion on this? Or do you need extra info to answer the question?

https://redd.it/1hfvflj
@r_devops

Reddit DevOps

Non-technical founder wanting to learn DevOps (what’s the best approach with all the new AI tools?)

Hey r/devops, I’m a non-technical founder who wants to better understand the DevOps landscape. I’ve got some experience running products and teams, but the hands-on technical side is a bit of a gap for me. With all the recent changes, especially the explosion of AI-driven tools and platforms, what’s the smartest way for someone in my position to get up to speed?

Any recommendations on learning paths, must-read resources, courses, or even communities that focus on the fundamentals? If you were starting today, where would you invest your time and energy to quickly build a solid understanding? Any guidance is greatly appreciated!

https://redd.it/1hfq4kd
@r_devops

Reddit DevOps

Did 'vi' win the editor war?

I was just thinking that I haven't heard anyone talk about emacs for a few years. But all the new people I meet who need an editor that's present on 'any' system are using 'vi'. I think one of them had never even heard of emacs.

https://redd.it/1hflvt6
@r_devops

Reddit DevOps

EDR and build systems - how to get them to play nicely?

Our IT team is trialing some increased protection settings in our EDR system and we *think* it is causing some intermittent build failures. Specifically, even though the EDR system is just in "learning" mode, for the C++ build (using Visual C++ / MSBuild) it looks like the EDR service might be briefly locking a recently generated file when a subsequent build step also tries to read that file.

I expect that convincing our IT to reduce the level of monitoring on our development environments would be a hard argument, so I'm starting by looking at what mitigations I can make within the build processes. I'm trying to find some guidance / best practices on how to configure "security software aware" build processes, and I have not found anything. Does anyone have some resources they can share?
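
One mitigation of the kind asked about here, for build steps that are custom scripts rather than MSBuild's own tasks, is to retry a read when the file is briefly locked. The sketch below assumes the lock surfaces as a `PermissionError` (which is how Python typically reports a Windows sharing violation); the function name and the `opener` parameter are hypothetical, the latter existing only to make the behavior testable.

```python
import time


def read_with_retry(path, attempts=5, delay=0.1, opener=open):
    """Read a file, retrying with exponential backoff if it is temporarily locked."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(path, "rb") as handle:
                return handle.read()
        except PermissionError as error:
            last_error = error
            if attempt < attempts - 1:
                # Back off: delay, 2*delay, 4*delay, ...
                time.sleep(delay * (2 ** attempt))
    raise last_error
```

This only hides short-lived locks; if failures persist after several retries, that is evidence worth bringing to IT when asking for a scoped exclusion on build output directories.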

https://redd.it/1hfd06b
@r_devops

Reddit DevOps

Deployment Issue

Good day! I am still a beginner, and this is my first time deploying a website, specifically a MERN web app. I deployed it on Render. The backend is working fine, but the frontend keeps sending requests to localhost:5000, which is my local backend. I already changed my baseURL (I am using axios for requests, by the way), but it is still sending requests to my localhost. Why do you think that is?

https://redd.it/1hfaqr3
@r_devops

Reddit DevOps

A solution to the problem of cluster-wide CRDs

I’m an average Reddit user, scrolling much more than reading or interacting. Sometimes, however, a post rings a giant red bell. When I stumbled upon "If you could add one feature to K8s, what would it be?", I knew the content would be worth it. The most voted answer is:

> Namespace scoped CRDs

Here's a solution with vCluster: https://blog.frankel.ch/cluster-wide-crds/

https://redd.it/1hexgsn
@r_devops

Reddit DevOps

Learn hands-on operating system concepts with IBM AIX on Coursera

Here's the specialization: https://www.coursera.org/specializations/mastering-operating-systems-with-ibm-aix

It has three courses:

1. AIX Operating System Fundamentals

2. System Administration with IBM AIX: Getting Started

3. System Administration with IBM AIX: Beyond the Basics

Earn an IBM badge at the end, which will help you in your career.

https://redd.it/1heycbr
@r_devops

Reddit DevOps

Is Teleport widely used?

Is it worth it to learn it for upskilling considering its current market?

https://redd.it/1hewj2b
@r_devops

Reddit DevOps

How to handle test environments - locally and on the cloud?

Hi all,


I am new to DevOps, and I have a bunch of questions about the best practices and tools for setting up test environments. Let's take a concrete example: you have one frontend app and one backend app (in the same repo - I am using Bazel for my monorepo tooling).

The backend app is self-contained, the frontend app makes http requests to the backend app to retrieve some information.

Let's say you want to provide the following capabilities:

1. A local test environment. You run the frontend/backend application locally, and you can test your changes. This is easy for the backend. For the frontend, you need to be able to connect to the backend app. My idea would be to use Docker to set up a locally running instance of the backend that the frontend connects to. Ideally, this is completely isolated from the internet - meaning you can run it without any internet connection.


2. A cloud test environment. Since this is meant to help developers test things out, each developer can bring up their own test instances of the frontend/backend (with a specific name, e.g. abcTest) and see the results in the cloud.


I am currently mostly using GCP + Terraform for the backend (Vercel hosting for the frontend). Terraform doesn't provide an easy way to bring up ephemeral environments out of the box - you have to create a new directory with a bunch of files to get it to work. Ideally, one would have a single .tf file - and you just have to pass the name of the environment to bring it up. This doesn't really work with terraform.


I am also exploring terragrunt for 2), which I hope will make it easier to have different ephemeral test environments.


I would appreciate any suggestions! :)

https://redd.it/1herh0p
@r_devops

Reddit DevOps

Thoughts on AWS CodeArtifact

Has anyone used it, and how have you found it? We don't use AWS anywhere, so I wanted to ask whether it's worth onboarding or whether there's some vendor lock-in (which I would like to avoid). It's not on my top 3 list, but I just wanted to know if it’s worth considering.

https://redd.it/1heo07w
@r_devops

Reddit DevOps

C/C++ static analysis

I am currently using Cppcheck, and I am looking for something that gives better results. Is there a better tool out there right now? I tried SonarSource, but that is a lot of work just to get static analysis.

I was able to download their free version and run it over my codebase, only to find out that it won't do C or C++ for free. It was a pain in the arse to get running, and after all that I can't even see if it gives good results. WTF. And a trial requires them to get back to you.

So I am looking for a good MISRA checker / static analysis tool that doesn't take half a day to set up.

https://redd.it/1he8xl2
@r_devops

Reddit DevOps

After graduation how do you get to work on really challenging projects to learn?

All the startups hire for DevOps but they all want experienced people. For the juniors, where can one go to learn?

IMHO, if you want to become a decent DevOps engineer, you need experience working on challenging real projects, and there are not many places to learn, because they look for someone with 5 years of experience.

For the great DevOps engineers in this group, how did you learn? Who opens their doors for junior people to learn and also has challenging projects?

https://redd.it/1he1vbg
@r_devops
