Reddit DevOps. #devops Thanks @reddit2telegram and @r_channels
What are Buildkite and ArgoCD for?
I saw a job posting from a big tech company for a site reliability engineer role that contains the following bullet point:
> Expert knowledge of continuous deployment systems such as Buildkite and ArgoCD
I have set up a lot of continuous delivery mechanisms and have worked with a lot of CI/CD over the past 7-8 years, but I don't know Buildkite and ArgoCD. We have always just used a gitlab-ci.yml, a GitHub workflow, Azure Pipelines or the like, and it works great.
Can someone tell me what the benefits of Buildkite, ArgoCD et al. are? I've googled it of course, but I don't see anything that wouldn't work with GitHub Actions, for example.
https://redd.it/1lhko66
@r_devops
From 0 to 240⭐ in 2 weeks—and then this happened! 🚀
Hey r/devops I launched my side project DevOps: Learn by Doing at the start of the month to curate free, hands-on labs and end-to-end projects. Two weeks later it’s racked up 240+ stars on GitHub—thanks to all of you! 🙌
But the real plot twist? I just got an email from Yevgeniy Brikman (Gruntwork himself) saying he loved the idea so much he’s sending me a print copy of Fundamentals of DevOps and Software Delivery! 😱📚
I definitely didn’t expect this kind of ROI from a humble repo—guess my next KPI is “books received”! 😂
Huge thanks to everyone who starred, shared, or contributed.
LinkedIn post for more details: https://www.linkedin.com/feed/update/urn:li:activity:7342405110272008193/
https://redd.it/1lhfc86
@r_devops
🚀 Launching a New Cloud & DevOps Channel On WhatsApp! Looking for passionate admins to help build and grow a vibrant tech community. ☁️👨‍💻🔥
We're looking for experienced, self-motivated admins who live and breathe Cloud, DevOps, and Open Source culture. If you're passionate about automation, containerization, infrastructure as code, and sharing wisdom, we need you! 🧠💻
Think of it as a selective, high-signal version of the DevOps subreddit — but delivered straight to your WhatsApp, real-time, curated, and community-driven. 🔥
🎯 What this channel brings:
📰 The latest in Cloud & DevOps news
📚 Curated resources from top engineers & open-source projects
💡 Daily tips, tricks & tools from the trenches
🤯 Fun facts & real talk about what it’s really like working in tech
🚫 The dos and don’ts of professional work life no one teaches you
🧘‍♂️ Smart takes on workplace well-being & career longevity
This isn't just another forward-spam group; it's an open-source-style revolution in community learning, where we grow together, stay ahead, and support each other.
🎖️ Become a founding admin. Help us lead this space with purpose, passion, and a bit of bash scripting.
https://redd.it/1lhcy68
@r_devops
Roast my resume
I need a good and thorough roasting of my resume. 100 applications these last couple of months and I only got 3 interviews. I'm not American and don't live in the US, if that matters. I'm applying for local jobs, not for international roles.
This is the link, tear it apart: https://i.imgur.com/Z4UQqk2.jpeg
I wonder if I should even include the projects section in there, I was almost never asked about them during the interviews.
https://redd.it/1lh9hvx
@r_devops
Working on a drop-in replacement for InfluxDB v1 - looking for feedback from DevOps users (I will not promote)
Hi Everyone,
I'm working on a drop-in replacement for InfluxDB v1, aimed at solving some of the frustrations I have had with it over the years, particularly around memory usage, write throughput, cardinality, etc. It's still early days, and I’m trying to gather feedback before carrying on down a specific route.
I’d love to hear from anyone who has used InfluxDB (v1 in particular):
What did you love?
What drove you nuts?
If you moved off of it, why?
What did you switch to?
Key goals I’m pursuing:
Easy migration: reuse the same line protocol and nearly full InfluxQL support
Does not explode on high cardinality queries.
Better long-term storage.
Lower Latency Queries
This isn't a pitch, I will not promote, it's an open call for feedback from the trenches. I’ll eventually open source the project, but right now I want to make sure it’s solving the right problems.
Let me know what you think!
(I used GPT to help write this, words are hard)
https://redd.it/1lh53s2
@r_devops
Is k8s the best way to deploy this?
https://i.postimg.cc/prymfX7p/IMG-20250621-212721.jpg
Is k8s the best way to deploy a microservice-based project? As shown in the image above, each marked folder is a microservice, but these are not in a monorepo. Two of these microservices rely on Postgres and Kafka Docker images. I'd really appreciate your help.
https://redd.it/1lh02xq
@r_devops
Everyone’s Scaling AI. No One’s Fixing the Infra First.
Last month, I posted: “Give me your toughest cloud/SRE issue, I’ll fix it in 48 hours, free.”
The response? 500K+ views. 100+ DMs. 12 real work offers.
I spent the next 2 weeks knee-deep in infrastructure chaos.
Here’s what I ended up fixing:
- ML inference spikes breaking autoscaling logic
- CI runners choking on ephemeral disks (zero alerts)
- Terraform modules quietly overriding prod configs
- S3 buckets serving stale model files after blue/green deploys
- $4.8K GPU bills from idle nodes no one shut down
👀 The pattern? Everyone’s sprinting to scale AI, but treating infra like a TODO comment.
That’s fine… until it breaks prod.
So here’s what I’m doing next:
⚙️ I’m quietly building a system to help AI teams, solo founders, and DevOps folks detect and fix these infra blindspots *before* they explode.
Because let’s be real:
- If you’re a **solo dev or indie hacker**, your YAML is probably held together with hope and duct tape.
- If you're a **SaaS builder shipping AI features**, every minute lost to infra is a minute not building your product.
- If you're a **DevOps/SRE**, you’re probably firefighting more than shipping.
Until that system’s ready, I’m still in the trenches helping folks 1:1 fast, no fluff, just execution.
🧯Got a fire? Stuck on:
- Inference-time API spikes?
- Terraform ghost bugs?
- Broken rollback logic in prod but not staging?
Drop your *toughest* infra pain here. Or DM me.
Let’s fix it.
**What’s the infra fire *you’re putting out this week*?**
https://redd.it/1lh06pp
@r_devops
How would I create my own version of supabase/crunchy data
This is for educational purposes only.
Basically I want to learn how I can self-host Postgres and automate backups, testing, observability, and even moving the Postgres server to a bigger/smaller machine.
https://redd.it/1lgw5ss
@r_devops
Book resources
Hi, I’m an IT system engineer and not a developer. I'm trying to learn K8s in this new role. I’m tasked, with only loose instructions, with cleaning up repos and making small changes.
One of my tickets is to deploy Istio in the ABC repo.
Oh, and we use Kustomize and Rancher Desktop.
The learning resources I’ve paid for are KodeKloud, Udemy, and Whizlabs.
I’ve been going through the KodeKloud “CKA” materials but am finding that they're not helpful for my daily tasks.
I feel so lost in learning.
I’m looking for two books to read on vacation w/o terminal access.
One book for learning, one book for the CKA exam.
My research has led me to the following three books:
Kubernetes in Action
The Kubernetes Book - Nigel Poulton
Certified Kubernetes Administrator (CKA) Study Guide - from O'Reilly, by Muschko
https://redd.it/1lguip9
@r_devops
May I develop a business app in Godot?
I started developing shift-planning software for Windows, but soon realized I need a code-signing certificate in order to use it in my company. I moved everything to JS so I could run it locally in the browser, but there are some limitations when saving files.
Now I have the idea of making this program in Godot to avoid the code-signing certificate.
But I don't know if that is allowed, because I'm not making a game.
https://redd.it/1lgsda3
@r_devops
I’m starting a DevOps Dojo show based on “learning by fixing broken things” what would you love to see?
Hey folks, I’m a DevOps engineer who’s finally starting a YouTube series, but with a twist: instead of polished tutorials, I want to show what really happens, stuff breaks, I troubleshoot, I learn.
Think “debugging in public” meets casual DevOps Dojo. Real-world infra, real errors, honest process.
I’ll cover things like:
- Broken CI/CD pipelines (Jenkins → GitHub Actions)
- Keycloak in CrashLoopBackOff hell
- Terraform misbehaving in AWS
- Secret management gone wrong
- All the dumb mistakes we pretend don’t happen
I want to make this accessible for beginners but still useful for mid/senior folks. Fewer buzzwords, more bash errors and real lessons.
What would you like to see in a show like this?
Any common pain points or “I wish someone walked me through this” moments?
@AlanDevOps
https://redd.it/1lgq717
@r_devops
How do you handle technical skill gaps in a managed services team supporting multiple Azure clients?
Hi everyone,
I work in a managed services company that supports multiple clients’ Azure environments. Our team handles tickets, incidents, and complex challenges, but we’re noticing a gap in technical depth across the team.
I’ve started using automation (emails, Teams, Power Platform) to improve ticket awareness, but I’d love to hear from others:
🔹 How do you address skill gaps in a busy support team?
🔹 What processes or tools have helped you upskill your engineers while still meeting client SLAs?
🔹 Any tips on balancing automation, documentation, and training?
🔹 How do you build a knowledge base that actually works?
Any real-world advice, examples, or lessons learned would be super helpful. Thanks in advance!
https://redd.it/1lgopn0
@r_devops
Would an AWS infrastructure visualizer and security alerts all visualised via an interactive graph for less than 7 dollars a scan be useful?
As the title states, I have built an interactive graph visualizer for AWS infrastructure and security violations. It uses a read-only IAM role to scan all your AWS resources and collect the metadata it needs about your infrastructure. It also runs your run-of-the-mill security misconfiguration rules, plus multi-hop and more complicated threats, for example privilege escalation. That is what you get with Wiz and others, but with mine you pay a fraction of the price: as low as 5 dollars for a one-time scan. It wouldn't have runtime detection, but it can do real-time scanning based on the IAM role.
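To make "multi-hop" concrete, here is a minimal Python sketch of the general idea, not the tool's actual implementation (which the post does not describe). It assumes the scan has already been reduced to a graph whose edges mean "this principal can assume or attach that", and it searches for the shortest chain from a low-privilege identity to an admin policy. The principal names and edges are hypothetical.

```python
from collections import deque


def find_escalation_path(edges, start, target):
    """Breadth-first search for the shortest chain of hops from start to target.

    edges maps a principal to the principals/policies it can reach in one hop
    (for example via sts:AssumeRole or iam:AttachRolePolicy).
    Returns the path as a list, or None if the target is unreachable.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    # Hypothetical scan result: each hop looks harmless on its own.
    edges = {
        "user/dev": ["role/ci-runner"],
        "role/ci-runner": ["role/deployer"],
        "role/deployer": ["policy/AdministratorAccess"],
    }
    path = find_escalation_path(edges, "user/dev", "policy/AdministratorAccess")
    print(" -> ".join(path) if path else "no escalation path")
```

A single-resource misconfiguration rule would pass every node in this example; only walking the graph shows that `user/dev` ends up with admin rights.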
Is this something people would want?
https://redd.it/1lgfndl
@r_devops
How do you not burn out?
I'll try to TL;DR: I'm not in a senior role (I'm under that) and was brought on with no prior DevOps experience, but it is definitely a role supporting dev teams pushing through CI/CD implementation.
It seems that I am now the main point of contact for our applications, of which there are a few. For the most part my senior has migrated them to a more stable state. With no previous DevOps experience, I have been able to swim despite being thrown into the deep end. Now I've run across a few issues which took a LOT longer than I would have liked (days/weeks), and they turned out to be the silliest of things. Although I'm glad they're resolved, I feel mentally exhausted lol. I am unofficially the point of contact for our apps. Any discussion on new implementation of anything has to go through me. I sh*t my pants because half the time I honestly don't know what or how to implement what they are looking for. Imposter syndrome is real. I have been in the role for some time now, but it's all starting to hit me, and I feel like everyone knows I don't know squat lol.
Implementing new infrastructure requires a lot of trial and error, and I may skip or miss things, much to the annoyance of the team I support. I'll most likely take a day or two off in the next few days, or wait till the holiday.
https://redd.it/1kis5mg
@r_devops
What’s the one skill every DevOps engineer should master early on?
If I could go back and tell my younger self one thing, it’d be: learn bash scripting properly. I kept jumping into tools like Docker and Terraform without being solid on the fundamentals, and it slowed me down big time.
Now I use bash daily—for automation, debugging, gluing tools together—and I still learn new tricks every week.
What about you?
If someone’s just getting into DevOps, what’s one skill or habit that pays off long term?
https://redd.it/1kip4w3
@r_devops
The CoinMarketCap attack
My team did a write-up on yesterday's CoinMarketCap attack. Would love your perspective. Client-side attacks are scary and on the rise. It’s obvious that bad actors have figured out that no one really monitors how their application behaves in the user's browser.
https://cside.dev/blog/coinmarketcap-client-side-attack-a-comprehensive-analysis
https://redd.it/1lhfg5c
@r_devops
DevOps folks, are you using AI for infra tasks yet, or is it still too risky?
I’ve seen a few tools now claiming they can help with infrastructure-as-code, Dockerfile optimisation, CI/CD pipeline generation, and even Kubernetes YAML generation using AI prompts.
But I’m still hesitant to trust AI with things that touch production or deployment logic.
Anyone here actually using AI to help with DevOps tasks in a real workflow?
Any tools you trust (or don’t)?
Is it good for boilerplate only, or have you let it touch live infra?
Any close calls or success stories?
https://redd.it/1lhdw7d
@r_devops
Just got invited to a technical interview at Forvia. They seem heavily Windows-focused.
Mission:
Implement, automate, and continuously improve development, integration, and deployment processes (CI/CD), in close collaboration with development and operations teams.
Skills:
Tools: Azure DevOps, Git, Docker, Kubernetes (a plus)
Languages: C#, .NET, PowerShell or Bash scripting
Methods: Continuous Integration, Continuous Deployment, TDD
Environments: Windows Server, MSSQL, Azure Cloud
Profile:
Bachelor’s in Computer Science
Good level of English
Collaborative mindset, rigorous, autonomous
DevOps certification is a plus
How much Windows Server and PowerShell work do you think I will have to do?
I'm more of a Linux user and have never used Azure. I have some experience with AWS.
I really hate Windows.
https://redd.it/1lhbfou
@r_devops
Set up your AWS infra, just by stating the requirements and pushing a button
See how the AI agents tackle the challenge of doing a real Upwork job. The agents set up an EC2 instance and install and run n8n on it, along with a custom domain and SSL certificates. All in under an hour, with zero human intervention.
Short video: https://youtu.be/kCQ2YLDLZ4Y
Full video: https://youtu.be/PKTtNl3Puko
https://redd.it/1lh8d7o
@r_devops
How likely is a career switch from DevOps to Golang Dev?
I'm 30 years old. I started 5 years ago with Linux administration and then jumped to DevOps.
Golang has always been a passion, and I was excited when I landed a job where our stack was half Go, half Node.
But I've never gotten around to seriously coding in Go, and I have no professional experience other than making a few bespoke tools that work in our infrastructure.
Our devs are pretty lazy, so I usually take up the task of profiling and debugging, and every so often I push commits to fix bugs or align the code to our conventions.
So, is a career change at this moment even possible? If yes, how should I go about this? Try to contribute to our Go code, or create my own portfolio?
https://redd.it/1lh4y6l
@r_devops
Dealing with Terraform Drift
I got tired of dealing with drift and I didn't want to pay for Terraform Cloud or other SaaS solutions, so I built a drift detector that gives you a table/HTML page:
tfdrift
I wrote a blog post about it: https://substack.com/@devopsdaily/p-166303218
Just wanted to share with the community, feel free to try it out!
Note: remember to download the binary (or build it, if building Golang locally) with the right GOOS and GOARCH. There are issues with the AWS provider binary depending on which platform the tool binary is built for.
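For readers who want to see the general idea behind this kind of drift report, here is a minimal Python sketch. It is not how tfdrift works internally (the post does not say). It assumes you have already produced a plan with `terraform plan -out=plan.tfplan` and converted it with `terraform show -json plan.tfplan > plan.json`, and it treats every planned change other than `no-op` as drift. The function names and sample resource addresses are made up for illustration.

```python
import json


def summarize_drift(plan):
    """Return (address, actions) for every resource the plan wants to change."""
    rows = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if actions and actions != ["no-op"]:
            rows.append((rc["address"], "/".join(actions)))
    return rows


def render_table(rows):
    """Render the drift rows as a plain-text table."""
    if not rows:
        return "No drift detected."
    width = max(len("RESOURCE"), max(len(addr) for addr, _ in rows))
    lines = [f"{'RESOURCE'.ljust(width)}  ACTION"]
    lines += [f"{addr.ljust(width)}  {action}" for addr, action in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    # Inline sample in the shape of `terraform show -json` output; in practice
    # you would load plan.json from disk with json.load().
    sample = json.loads("""
    {"resource_changes": [
      {"address": "aws_s3_bucket.logs", "change": {"actions": ["no-op"]}},
      {"address": "aws_security_group.web", "change": {"actions": ["update"]}},
      {"address": "aws_instance.worker", "change": {"actions": ["delete", "create"]}}
    ]}
    """)
    print(render_table(summarize_drift(sample)))
```

If you only need a yes/no signal in CI, `terraform plan -detailed-exitcode` exits with code 2 when the plan contains changes, 0 when it does not, and 1 on errors.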
https://redd.it/1lh1ufl
@r_devops
Monitoring data from 2nd/3rd parties, once you have set up monitoring on all your servers
I've just read that there was an attack on CoinMarketCap through a third-party code integration. This is what I've read:
'How It Started: The attack began with a small, seemingly harmless element on CMC’s homepage: a “doodle” image (a decorative graphic, like a holiday-themed logo).'
Was this attack even avoidable? Any DevOps engineers here at larger firms: do you currently do monthly checks on whether all 3rd-party scripts are maintained by reputable firms, etc.? How does this scale?
https://redd.it/1lgyvjk
@r_devops
Is it really true that roles like Cloud Engineer or SysAdmin can lead to a DevOps job later?
Hey everyone, hope y'all are doing well :D
I’ve been learning about DevOps and really like the idea of working in that field — automating things, working with cloud infrastructure, CI/CD, etc. But I keep hearing that it’s hard to land a DevOps job right away, especially as a beginner.
So I started looking into roles that might *lead* to DevOps after gaining some experience, like:
* Cloud Support Associate / Cloud Engineer
* Linux System Administrator
* QA Automation
* IT Support
* Junior Backend Developer
From what I understand, these jobs give you exposure to things like scripting, Linux, cloud platforms, monitoring, and automation, which are all part of DevOps.
But here’s my question:
**Is it** ***actually*** **true that you can move from one of these roles into DevOps eventually?** Or is it just one of those things people say but that doesn’t really happen often?
I’m especially curious about the **Cloud Engineer** role. Is it really one of the best stepping stones into DevOps?
Would love to hear from anyone who made that transition or is on that path right now.
Thanks in advance!
https://redd.it/1lgw9hq
@r_devops
Looking for advice with personal virtual-try-on application project!!
Hey, I’m trying to create a prototype for a VTON (virtual-try-on) application where I want the users to be able to see themselves wearing a garment without full 3D scans or heavy cloth sims. Here’s the rough idea:
1. Predefine 5 poses (front, ¾ right, side, ¾ left, back) using a neutral mannequin or model wearing each item.
2. User enters their height and weight, potentially entering some kind of body scan as well, creating a mannequin model.
3. User uploads a clean selfie, maybe an extra ¾-angle if they’re game, or even more selfies depending on what is required.
4. Extract & warp just their face onto the mannequin’s head in each pose.
5. Blend & color-match so it looks like “them” wearing the piece.
6. Return a small gallery of 5 images in the browser.
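For step 5, one simple option (an assumption, not something this plan has settled on) is per-channel mean and standard deviation matching: shift and scale the face pixels so their color statistics match a reference skin region on the mannequin image. Below is a dependency-free Python sketch on lists of RGB tuples. A real pipeline would more likely do this on image arrays, often in a perceptual color space, and the function names here are hypothetical.

```python
from statistics import mean, pstdev


def match_channel(src, ref):
    """Shift/scale src values so their mean and spread match ref (clamped to 0..255)."""
    s_mean, s_std = mean(src), pstdev(src)
    r_mean, r_std = mean(ref), pstdev(ref)
    scale = r_std / s_std if s_std else 1.0  # flat source: only shift the mean
    return [min(255, max(0, round((v - s_mean) * scale + r_mean))) for v in src]


def match_colors(face_pixels, target_pixels):
    """Color-match a list of (r, g, b) face pixels to a list of target pixels."""
    channels = [
        match_channel([p[c] for p in face_pixels], [p[c] for p in target_pixels])
        for c in range(3)
    ]
    return list(zip(*channels))


if __name__ == "__main__":
    face = [(100, 100, 100), (120, 120, 120)]   # grayish selfie pixels
    target = [(200, 150, 50), (220, 170, 70)]   # warmer mannequin skin region
    print(match_colors(face, target))
```

This only aligns global color statistics; the edge blending between the face and the mannequin head is a separate step.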
I haven’t started coding yet and would love advice on:
Best tools for fast, reliable face-landmark detection + seamless blending
Lightweight libs or tricks for natural edge transitions or matching skin tones/lighting.
Multi-selfie workflows, if I ask for two angles, how to fuse them simply without full 3D reconstruction?
Alternative hacks, anything even simpler (GAN-based face swap, CSS filters, etc.) that still looks believable.
Really appreciate any pointers, example repos, or wild ideas to help me pick the right path before I start with the heavy coding. Thanks!
https://redd.it/1lgu4sg
@r_devops
Built a free AWS cost audit tool (AltCloud.dev) — looking for honest DevOps feedback
Hey folks 👋
I’ve been working with startups and infra-heavy products for ~9 years, and one thing that keeps coming up, especially with smaller teams, is cloud cost visibility (or the lack of it).
So I’ve started building **AltCloud.dev** — a free tool that:
Pulls your AWS cost and usage data
Shows real-time EC2 metrics (usage, idle detection)
Gives recommendations like overprovisioned instances, unused volumes, etc.
It’s very much an MVP right now, but functional and free — and I’d genuinely appreciate feedback from folks who’ve been in the DevOps trenches.
Would love to hear:
Is this useful to your workflow?
What’s missing to make it part of your toolkit?
Would you trust tools like this to suggest migrations or changes?
DMs or comments welcome — also happy to walk through what I’ve built so far if that helps.
Thanks!
https://redd.it/1lgrwdy
@r_devops
How to automate daily KPI emails from AWS CloudWatch using Outlook?
I’m working on a task where I need to fetch daily metrics from AWS CloudWatch for a few deployed models and send an automated status email via Outlook.
The metrics include:
4xx / 5xx Errors
API Latency (max & avg)
CPU and Memory Utilization
Total number of hits
I’ve got a fixed email template for this, and I currently send it manually every day. I want to automate the entire process — from pulling the data from CloudWatch to sending it via Outlook using a specific format.
I'm planning to use Python for this, probably with boto3 for AWS and win32com.client for Outlook email. Has anyone done something similar? Any best practices, sample scripts, or gotchas I should know about?
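A rough sketch of how that boto3 + win32com flow could look, under some loud assumptions: the models are served through API Gateway (so the `AWS/ApiGateway` namespace with the `ApiName` dimension applies), the report covers the previous full UTC day, and the email body is plain text. The query IDs, function names, and template wording are placeholders. CPU and memory utilization are left out because their namespace depends on where the models run.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the models are served through API Gateway. For another service,
# change the namespace, metric names, and dimension.
NAMESPACE = "AWS/ApiGateway"
METRICS = [  # (query id, CloudWatch metric name, statistic)
    ("err4xx", "4XXError", "Sum"),
    ("err5xx", "5XXError", "Sum"),
    ("lat_avg", "Latency", "Average"),
    ("lat_max", "Latency", "Maximum"),
    ("hits", "Count", "Sum"),
]


def build_queries(api_name):
    """Build MetricDataQueries with one full-day (86400 s) datapoint per metric."""
    return [
        {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": NAMESPACE,
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "ApiName", "Value": api_name}],
                },
                "Period": 86400,
                "Stat": stat,
            },
            "ReturnData": True,
        }
        for query_id, metric_name, stat in METRICS
    ]


def fetch_yesterday(api_name):
    """Fetch the previous full UTC day. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the pure helpers work without it

    end = datetime.now(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)
    response = boto3.client("cloudwatch").get_metric_data(
        MetricDataQueries=build_queries(api_name),
        StartTime=end - timedelta(days=1),
        EndTime=end,
    )
    return {
        result["Id"]: (result["Values"][0] if result["Values"] else 0.0)
        for result in response["MetricDataResults"]
    }


def render_body(api_name, values):
    """Fill a fixed plain-text template (placeholder wording)."""
    return (
        f"Daily status for {api_name}\n"
        f"4xx errors: {values['err4xx']:.0f}\n"
        f"5xx errors: {values['err5xx']:.0f}\n"
        f"Latency avg/max (ms): {values['lat_avg']:.1f} / {values['lat_max']:.1f}\n"
        f"Total hits: {values['hits']:.0f}\n"
    )


def send_with_outlook(to, subject, body):
    """Send through the local Outlook profile. Windows only, needs pywin32."""
    import win32com.client

    mail = win32com.client.Dispatch("Outlook.Application").CreateItem(0)  # 0 = mail item
    mail.To = to
    mail.Subject = subject
    mail.Body = body
    mail.Send()
```

One gotcha with this approach: `win32com` drives a desktop Outlook session, so the script has to run on a Windows machine where Outlook is installed and a profile is signed in, for example from Task Scheduler under that user's account.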
I would really appreciate your insights, or any YouTube channel suggestions.
https://redd.it/1lgpfr1
@r_devops
Getting into devops
Hey, so I'm currently in a backend engineer internship, where I'm coding, testing with Postman, building with Jenkins, and using Grafana for testing.
I am enjoying it, but eventually I may want to move into DevOps. Can anyone help me with a good path for learning? And maybe certificates? I was hearing about the Kubernetes certs. Any help would be appreciated.
https://redd.it/1lgn6a3
@r_devops
Using kube-downscaler to reduce Kubernetes costs—my take
If you're running dev/staging clusters or workloads with predictable low-traffic hours, kube-downscaler is a simple win.
It lets you define schedules (via annotations) to scale Deployments down—without interfering with HPA.
I shared my setup, where it fits well, and a few caveats here:
https://blog.abhimanyu-saharan.com/posts/reduce-kubernetes-costs-with-kube-downscaler
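As a rough illustration of the annotation-driven schedules mentioned above: kube-downscaler reads annotations such as `downscaler/uptime` with values like `Mon-Fri 08:00-18:00 Europe/Berlin`. The Python sketch below builds a patch body carrying that annotation and shows, in simplified form, what "inside the uptime window" means. The parser is a simplification for illustration, not kube-downscaler's implementation: it handles a single window, does no time zone conversion (the caller passes a time already in the schedule's zone), and the helper names are hypothetical.

```python
from datetime import datetime, time

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def uptime_patch(spec):
    """Patch body that sets the downscaler/uptime annotation on a Deployment."""
    return {"metadata": {"annotations": {"downscaler/uptime": spec}}}


def parse_window(spec):
    """Split 'Mon-Fri 08:00-18:00 Europe/Berlin' into its parts."""
    days, hours, tz_name = spec.split()
    first_day, last_day = (DAYS.index(d) for d in days.split("-"))
    start, end = (time.fromisoformat(h) for h in hours.split("-"))
    return first_day, last_day, start, end, tz_name


def in_window(spec, local_now):
    """True if local_now (already in the schedule's time zone) is inside the window."""
    first_day, last_day, start, end, _ = parse_window(spec)
    return first_day <= local_now.weekday() <= last_day and start <= local_now.time() < end


if __name__ == "__main__":
    spec = "Mon-Fri 08:00-18:00 Europe/Berlin"
    print(uptime_patch(spec))
    print(in_window(spec, datetime(2025, 6, 23, 9, 0)))   # a Monday morning
    print(in_window(spec, datetime(2025, 6, 21, 9, 0)))   # a Saturday morning
```

Outside the uptime window the workload is scaled down, and it is scaled back up when the window opens again.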
Curious: is anyone using this in production? Or have you paired it with KEDA?
https://redd.it/1kis091
@r_devops
Onprem Application Logging with Slurm?
Hey guys, I'm slightly baffled. I have had a problem thrown at me: getting our Slurm + Apptainer cluster logs stored and accessible somewhere centrally. So far I have just been doing simple logging and storing the logs on an NFS server.
In the cloud, on Azure, I use Log Analytics + Application Insights + OpenTelemetry. But I'm not sure about on-prem: do I just set up a Loki + Grafana container and go for it?
https://redd.it/1kiqutn
@r_devops
Becoming K8s/Openshift expert ?
Hello Fellas,
I'm presently an RHCSA/RHCE. Earlier I wanted to get into DevOps, but I have realised it's better to gain a solid understanding of one tool and become good enough at it. I am working on K8s now and plan to become an OpenShift architect and Kubestronaut. I also hope to gain a basic, fundamental understanding of other tools like Git, CI/CD, etc. Any input on the career growth here? I work as a system admin for Linux/Ansible right now.
https://redd.it/1kimnob
@r_devops