
Telegram channel r_devops - Reddit DevOps


Reddit DevOps. #devops Thanks @reddit2telegram and @r_channels


Reddit DevOps

Tired of copy-pasting AWS CLI / kubectl output into online formatters?

Wrote a quick practical guide on jq: the one terminal command that handles JSON the way grep handles text.

# Only show the names of failed CI jobs
curl -s .../jobs | jq '.jobs[] | select(.conclusion == "failure") | .name'

Covers filtering, reshaping, piping into bash scripts, and more.
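
For scripts that run where jq is not installed, the same filter can be sketched in Python. This is only an illustration: it assumes the response has the shape `{"jobs": [{"name": ..., "conclusion": ...}]}` implied by the jq filter, and the sample payload is made up.

```python
import json


def failed_job_names(payload: str) -> list:
    """Return the names of jobs whose conclusion is "failure"."""
    data = json.loads(payload)
    return [
        job.get("name")
        for job in data.get("jobs", [])
        if job.get("conclusion") == "failure"
    ]


# Made-up sample payload, not real API output.
SAMPLE = (
    '{"jobs": ['
    '{"name": "build", "conclusion": "success"}, '
    '{"name": "test", "conclusion": "failure"}]}'
)

if __name__ == "__main__":
    print(failed_job_names(SAMPLE))
```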

https://medium.com/stackademic/practical-jq-for-developers-parse-json-from-the-terminal-d6caac870d4f?sk=9daddc495b92f13fbb9150ebd5649494

What's your go-to jq one-liner?

https://redd.it/1skcglc
@r_devops


Reddit DevOps

Weekly Self Promotion Thread

Hey r/devops, welcome to our weekly self-promotion thread!

Feel free to use this thread to promote any projects, ideas, or any repos you're wanting to share. Please keep in mind that we ask you to stay friendly, civil, and adhere to the subreddit rules!

https://redd.it/1sk2s6h
@r_devops


Reddit DevOps

What’s the most painful part of working across multi-cloud + Terraform?

Hey everyone, I’m exploring an idea for DevOps / platform / SRE work.

The main problem I’m looking at is the usual bouncing between cloud consoles, Terraform, terminal sessions, and cross-account context.

Curious how people here feel about it:

* What’s the most annoying part of your multi-cloud or Terraform workflow today?
* Where do your current tools fall short?
* What would a tool like this need to do before you’d even try it?
* What would make you immediately say no?
* Is drift/environment comparison actually painful enough to need a dedicated tool?

Would love to hear real workflow pain points more than feature wishlists.

https://redd.it/1sjpa4b
@r_devops


Reddit DevOps

Moving to devops

Sorry if this is not the place to post this. Just looking for some advice.

I’m currently an IT Support Manager. I’ve been doing this for almost 10 years. I wanted to get into something else midway through my career but my wife and I started a family at the time and I just stuck with what I know. A couple of kids later, I’m now looking to move on from my role and hopefully move into something different.

Again, I’m just looking for advice on a good starting point. What areas of focus should I be looking into? Scripting? Networking? Cloud?

Any good books or online courses I should look into? Any homelab or projects I should start doing?

Any advice is welcome!

https://redd.it/1siah0r
@r_devops


Reddit DevOps

FAANG nerds who jumped to SRE

Hey folks,

Need some unsolicited advice (feel free to bash me).

I’m a software engineer with 4 YOE across dev + support/SRE-ish chaos. Stack: Python, .NET, Datadog, Docker, Azure. Recently added Kubernetes (AKS), Terraform, Linux because free time is overrated and I don’t have a life. 🥲

Trying to break into SRE/Platform at FAANG-level, stuck between:

A) Grind NeetCode/LeetCode like my life depends on it

B) Go deep into K8s (CKA-level nerd mode)

I know SRE needs coding and infra, but I don’t have time to suck at both.

People who’ve actually interviewed recently: what matters more to clear the loop?

https://redd.it/1shvsag
@r_devops


Reddit DevOps

Update: moving secret remediation out of CI — pre-commit seems to be the only acceptable boundary

I posted about this a few weeks ago and got strong feedback against CI auto-fix.

The original idea was to automatically fix hardcoded secrets inside CI pipelines.

The feedback was pretty clear: people don’t trust CI modifying code — even if the change is technically safe.

After thinking about it, I agree.

So I changed direction.

Instead of CI auto-fix:

- remediation runs locally (pre-commit / manual)
- CI stays detection-only

The reasoning:

- CI should stay deterministic and non-invasive
- developers are more comfortable reviewing changes before commit
- automatic fixes only make sense when they’re predictable and visible

The constraints stayed the same:

- only simple, structurally safe rewrites (AST-based)
- no guessing or pattern-based hacks
- anything ambiguous is refused
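
To make those constraints concrete, here is a minimal sketch of what "AST-based, refuse anything ambiguous" could look like for Python sources. It is not the tool from the post: the `SECRET_HINTS` name list is a hypothetical stand-in for whatever detector flags a line, and the proposed rewrite (`os.environ[...]`) is just one possible remediation that also assumes the target file imports `os`.

```python
import ast

# Hypothetical stand-in for a real secret detector's verdict.
SECRET_HINTS = ("SECRET", "TOKEN", "PASSWORD", "API_KEY")


def propose_fixes(source: str) -> list:
    """Return (line_number, replacement_line) pairs for unambiguous cases only."""
    lines = source.splitlines()
    fixes = []
    for node in ast.parse(source).body:  # module-level statements only
        if not isinstance(node, ast.Assign) or len(node.targets) != 1:
            continue  # refuse: not a plain single-target assignment
        target, value = node.targets[0], node.value
        if not isinstance(target, ast.Name):
            continue  # refuse: attribute, tuple, or subscript target
        if not any(hint in target.id.upper() for hint in SECRET_HINTS):
            continue  # not flagged as a secret
        if not (isinstance(value, ast.Constant) and isinstance(value.value, str)):
            continue  # refuse: value is not a plain string literal
        line = lines[node.lineno - 1]
        # Column offsets are byte-based, so only trust them on ASCII lines.
        alone_on_line = (
            node.lineno == node.end_lineno
            and node.col_offset == 0
            and line.isascii()
            and not line[node.end_col_offset:].strip()
        )
        if not alone_on_line:
            continue  # refuse: trailing comment, second statement, or multi-line
        fixes.append((node.lineno, f'{target.id} = os.environ["{target.id}"]'))
    return fixes
```

The function only proposes changes and never writes files, which keeps the review step in the developer's hands before commit.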

Now the question is where the boundary should be.

- Is pre-commit the right place for this kind of remediation?
- Or should tools stop entirely at detection and leave fixes fully manual?
- Has anyone actually seen auto-remediation work safely in real pipelines?

Trying to understand what people are actually comfortable running in practice.

https://redd.it/1shbkol
@r_devops


Reddit DevOps

Hey, could anybody help with materials and a roadmap for becoming a strong DevOps engineer?

I have an applied math background and basic hands-on experience with Git, Linux, Docker, Python, and C++. I want to build a serious foundation for DevOps.

I am currently planning to study computer architecture, operating systems, networking, Linux internals, and distributed systems. The books I am considering are Tanenbaum, OSTEP, Top-Down Networking, The Linux Programming Interface, and a distributed systems book by Kleppmann.

Would that be enough for a strong foundation, or are there other fundamentals that matter more for DevOps and production engineering?

https://redd.it/1scwwp8
@r_devops


Reddit DevOps

FinOps question: what do you do when a few pods keep entire nodes alive?

Coming at this from the FinOps side, so apologies if I’m missing something obvious.
When I look at our cluster utilization, a lot of nodes sit around 20–30%. My first reaction is to be pleased, since we should be able to consolidate those and reduce the node count.

But when I bring this up with the DevOps team, the explanation is that some pods are effectively unevictable, so we can’t just drain those nodes.
From what I understand the blockers are things like:


- Pod disruption budgets
- Local storage
- Strict affinities
- Or simply no other node being able to host the pod
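
One way to turn that list into something a proposal can be built on is to report, per node, which pods hold it and why. The sketch below is illustrative only: the `Pod` record and its boolean fields are hypothetical simplifications, and in practice they would have to be filled in from real cluster data.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    """Simplified, hypothetical pod record (not a Kubernetes API object)."""
    name: str
    pdb_allows_eviction: bool = True
    uses_local_storage: bool = False
    has_strict_affinity: bool = False
    fits_on_another_node: bool = True


def eviction_blockers(pod: Pod) -> list:
    """List the reasons from the post that keep this pod from being moved."""
    blockers = []
    if not pod.pdb_allows_eviction:
        blockers.append("pod disruption budget")
    if pod.uses_local_storage:
        blockers.append("local storage")
    if pod.has_strict_affinity:
        blockers.append("strict affinity")
    if not pod.fits_on_another_node:
        blockers.append("no other node can host it")
    return blockers


def pinned_nodes(nodes: dict) -> dict:
    """Map node name -> {pod name: blockers} for nodes held by a blocked pod."""
    report = {}
    for node_name, pods in nodes.items():
        held_by = {p.name: eviction_blockers(p) for p in pods if eviction_blockers(p)}
        if held_by:
            report[node_name] = held_by
    return report
```

A per-blocker breakdown like this makes it easier to ask targeted questions (for example, whether a specific disruption budget is stricter than it needs to be) instead of a blanket request to move pods.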

So in practice a node can be mostly idle, but one or two pods keep it alive.
I understand why the team is hesitant to touch this, but from the FinOps side it’s frustrating to see committed capacity tied up in mostly empty nodes.
How do teams usually deal with this?

Are there strategies to clean these pods so nodes can actually be consolidated later?
I’m trying to figure out what kind of proposal I could bring to the DevOps lead that doesn’t sound like “just move the pods.”

Any suggestions?

https://redd.it/1se1aky
@r_devops


Reddit DevOps

To vex or not to vex?

Management is adamant on fixing all CVEs, even the unfixable and unreachable/un-executable ones. I am wondering if I should just tag them with a VEX statement and move on. What do you fine folks do for these?

https://redd.it/1sfvi9v
@r_devops


Reddit DevOps

Trying to get better at DevOps by working on real problems

Hey everyone,

I’ve been learning DevOps for a while now, but I feel like tutorials only take you so far. I want to get better by actually working on real setups and issues.

If you’re dealing with anything like CI/CD, Docker, Kubernetes, deployments, monitoring, or even small bugs in your setup, feel free to share. I’ll try to work through it and share what I learn. I’m not looking for payment or anything, I just want to learn by doing real stuff instead of only following guides.

Appreciate it 🙂

https://redd.it/1se7dt2
@r_devops


Reddit DevOps

Alternative to NAT Gateway for GitHub Access in Private Subnets

I have a cluster where private subnet traffic goes through a NAT Gateway, but data transfer costs are high, mainly due to fetching resources from GitHub, which cannot be optimized using VPC endpoints.

To reduce costs, I set up an EC2 instance with an Elastic IP and configured it as a proxy.

I then injected HTTP_PROXY and HTTPS_PROXY settings into workloads in the private subnets. This setup works well, even under peak traffic, and has significantly reduced data transfer costs.
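
As an illustration of the injection step, here is a small sketch that builds the environment variables for a workload. The proxy URL and bypass hosts passed in are hypothetical, and `NO_PROXY` is not mentioned in the post: it is included as an assumption, since in-cluster traffic and anything already served by a VPC endpoint would normally be kept off the proxy. Both upper- and lower-case names are emitted because tools differ in which spelling they read.

```python
def proxy_env(proxy_url: str, no_proxy: list) -> dict:
    """Build proxy environment variables in both upper and lower case."""
    bypass = ",".join(no_proxy)
    env = {}
    for key, value in (
        ("HTTP_PROXY", proxy_url),
        ("HTTPS_PROXY", proxy_url),
        ("NO_PROXY", bypass),  # assumption: not part of the original setup
    ):
        env[key] = value
        env[key.lower()] = value
    return env


if __name__ == "__main__":
    # Hypothetical proxy address and bypass list, for illustration only.
    for name, value in proxy_env("http://10.0.0.10:3128", ["localhost", ".svc"]).items():
        print(f"{name}={value}")
```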

For DR, I still keep the NAT Gateway on standby.

Are there any risks or considerations I should be aware of with this approach?

https://redd.it/1sagpn7
@r_devops


Reddit DevOps

AWS Bahrain under attack!

Those who migrated workloads are lucky. For those who haven't started yet or are still in progress, I don't think there's any possibility of recovery in the UAE region.


https://www.wionews.com/world/iran-strikes-bahrain-s-top-telco-hosting-amazon-web-services-marking-1st-direct-hit-on-us-tech-giants-1775046327018


https://redd.it/1s9uukb
@r_devops


Reddit DevOps

your CI/CD pipeline probably ran malware on march 31st between 00:21 and 03:15 UTC. here's how to check.

if your pipelines run npm install (not npm ci) and you don't pin exact versions, you may have pulled axios@1.14.1, a backdoored release that was live for ~2h54m on npm.

every secret injected as a CI/CD environment variable was in scope. that means:

- AWS IAM credentials
- Docker registry tokens
- Kubernetes secrets
- Database passwords
- Deploy keys
- Every $SECRET your pipeline uses to do its job

the malware ran at install time, exfiltrated what it found, then erased itself. by the time your build finished, there was no trace in node_modules.

how to know if you were hit:

# in any repo that uses axios:
grep -A3 '"plain-crypto-js"' package-lock.json

if 4.2.1 appears anywhere, assume that build environment is fully compromised.
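
For checking many repos at once, the grep can be approximated with a small script that parses the lock file instead of matching text. This is a sketch under assumptions: it relies on the usual package-lock.json layouts (a flat `packages` map in newer lock files, a nested `dependencies` tree in older ones), and the function name is made up for illustration.

```python
import json


def find_package_versions(lock: dict, name: str) -> set:
    """Collect every locked version of `name` found in a parsed package-lock.json."""
    versions = set()

    # Newer lock files: flat "packages" map keyed by install path.
    for path, meta in lock.get("packages", {}).items():
        is_match = path == f"node_modules/{name}" or path.endswith(f"/node_modules/{name}")
        if is_match and isinstance(meta, dict) and "version" in meta:
            versions.add(meta["version"])

    # Older lock files: nested "dependencies" tree keyed by package name.
    def walk(deps: dict) -> None:
        for dep_name, meta in deps.items():
            if not isinstance(meta, dict):
                continue
            if dep_name == name and "version" in meta:
                versions.add(meta["version"])
            walk(meta.get("dependencies", {}))

    walk(lock.get("dependencies", {}))
    return versions


if __name__ == "__main__":
    with open("package-lock.json", encoding="utf-8") as fh:
        found = find_package_versions(json.load(fh), "plain-crypto-js")
    print("plain-crypto-js versions:", sorted(found) or "none")
```

A current lock file only shows what is locked now; it says nothing about what an earlier `npm install` pulled during the window, so the build-log check still applies.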

pull your build logs from March 31, 00:21–03:15 UTC. any job that ran npm install in that window on a repo with axios: "^1.x" or similar unpinned range pulled the malicious version.
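
When filtering exported job timestamps, the window check can be sketched as below. It is a rough filter only: it requires timezone-aware timestamps, compares on month, day, and UTC time of day (the post does not state the year, so none is assumed), and looks at a single timestamp per job, so a job that started before 00:21 but ran npm install inside the window would need a separate look.

```python
from datetime import datetime, time, timezone

WINDOW_START = time(0, 21)  # 00:21 UTC, from the post
WINDOW_END = time(3, 15)    # 03:15 UTC, from the post


def in_exposure_window(started_at: datetime) -> bool:
    """True if the timestamp falls on March 31 between 00:21 and 03:15 UTC."""
    if started_at.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    utc = started_at.astimezone(timezone.utc)
    return (utc.month, utc.day) == (3, 31) and WINDOW_START <= utc.time() <= WINDOW_END
```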

what to do: rotate everything in that CI/CD environment. not just the obvious secrets, everything. then lock your dependency versions and switch to npm ci.

Here's a full incident breakdown + IOCs + remediation checklist: https://www.codeant.ai/blogs/axios-npm-supply-chain-attack

Check whether you are safe or were compromised anyway.

https://redd.it/1saa69w
@r_devops


Reddit DevOps

<Generic vague question about obscure DevOps related pain point and asking how others are handling it>

<Details on the issue>

<But not too many details>

<sentence with no auto caps, because I am not a bot, see Mom? I’m a real boy>

How do you deal with it?

https://redd.it/1sarhqy
@r_devops


Reddit DevOps

Would you go from a DevOps role to an L3 support role for a 20% salary hike?

The role is an L3 / production support role. The L2 team forwards tickets to the L3 team, which resolves them by going through the code or looking at the database.

https://redd.it/1sc5zs2
@r_devops


Reddit DevOps

How do you even know what's running in prod anymore

we're a team of 12 shipping 3-4 times a day because cursor and claude have basically doubled our velocity. which is great! but I genuinely cannot tell you right now what version of the payment service is live in prod. I'd have to open github actions, cross reference ECR tags, maybe ping someone on slack.

we have staging, sandbox, and prod. sometimes something gets deployed to staging and just... sits there. weeks later someone asks "hey is the new checkout flow live?" and we do archaeology.

is this just the normal tax for a small team shipping fast or are people actually solving this? we're not big enough for a dedicated platform person. curious what workflows actually work at this scale

https://redd.it/1skaydb
@r_devops


Reddit DevOps

Question to senior DevOps Engineers

How did you upskill when you were a junior or an intern? How did you keep up with seniors and pick up new tech and tools quickly? I am a DevOps intern and want to upskill. Besides POCs and reading blogs and docs, is there any other way or smart trick to upskill faster?

Would love to hear different perspectives from senior engineers.

https://redd.it/1sk0wy6
@r_devops


Reddit DevOps

System Design coming from a purely Systems / Cloud Infra background

I've been preparing for what I think is my 3rd interview for an infrastructure role that includes a system design component. And I have to say, as someone who had heard of leetcode and system design but never actually sat down and practiced it before this, my imposter syndrome has somehow... grown.

Never in my career have I felt the absence of a CS degree more than when I'm being asked to articulate APIs and data models for things like a Dropbox clone, a URL shortener, or a parking lot manager. It's humbling in a way I didn't expect.

That said, there's an upside I didn't anticipate. Learning to think through systems at that level has already changed how I look at the infrastructure I work on every day. I've started noticing places where the architecture could be cleaner or where past decisions might not hold up at scale, and actually being able to reason through why. So even if this role doesn't pan out, I don't think the time was wasted.

Anyone else come from a pure sysadmin / cloud infra background and go through this? Curious if there are any shortcuts other than repetition.

https://redd.it/1sj2mry
@r_devops


Reddit DevOps

InfraLens: A workspace for cloud ops, Terraform context, and less context switching

InfraLens is a desktop workspace for DevOps and cloud operations work.

The core idea is pretty simple: a lot of day-to-day infra work gets fragmented across cloud consoles, Terraform, terminals, and a pile of open tabs. You inspect something in one place, verify it somewhere else, run the next step in a shell, then lose context in the process.

InfraLens is meant to bring more of that flow together:

- cloud resource context
- Terraform-adjacent workflows
- shared operational views
- terminal follow-up in the same workflow

The project started as AWS Lens, but it has now been renamed to InfraLens as it expands beyond AWS into Google Cloud and Azure.

Important note: GCP and Azure support are still in beta. AWS is currently the more mature side of the product, and I’m actively trying to understand what’s missing on the multi-cloud side.

Would love feedback from people here on:

- where your biggest context-switching pain is today
- what you’d actually want from a workspace like this
- what multi-cloud tooling usually gets wrong
- what GCP/Azure workflows or missing features would matter most

Very open to blunt feedback, gaps, and feature requests.


Repo: https://github.com/BoraKostem/InfraLens

https://redd.it/1shwfz5
@r_devops


Reddit DevOps

Stuck in a company with no Git workflow, no PRs, and resistance to change 😭

I joined a company as a DevOps engineer and found their Git workflow is completely broken.

They use a single GitHub account for everything. Developers don’t have their own accounts. Everyone shares access by giving their SSH public key to the boss, who adds it to his account.

There’s no GitHub UI usage, no pull requests, no code reviews, no branch protection. Developers push directly to random branches, and those branches sometimes go straight to production. A senior handles merges and deployments manually.

Many developers (even with years of experience) don’t know basic Git practices like PRs. When I suggested standard improvements (feature → dev → main flow, PR approvals, CI/CD, branch rules), I got resistance. Some don’t want to change, others think this is normal. Even a junior argued that my approach is wrong.

I’m the only one with Docker experience here. Overall engineering practices are outdated.

I discussed this with my boss and suggested a proper setup (including buying the GitHub Team plan), but it was rejected due to cost, despite the company having big international clients.

I feel stuck. Trying to improve things but facing strong resistance, and I can’t leave yet since I don’t have another job offer.

Has anyone been in this situation? How did you handle it?

https://redd.it/1sho7g4
@r_devops


Reddit DevOps

r/DevOps looking for Mods
https://redd.it/1shbt4e
@r_devops


Reddit DevOps

Automation engineer interview

Hey everyone, I have an interview coming up and I’ve been studying a couple of things here and there. I was wondering if anyone could provide some guidance on what exactly to focus on. Here is the job description:

Manage continuous integration and continuous deployment (CI/CD) pipelines.

Automate operational processes to reduce manual intervention and increase efficiency.

Ensure smooth integration between development and operational teams.

Collaborate with developers to design solutions that meet both operational and development needs.

Implement and manage infrastructure as code to ensure consistent and scalable deployments.

Conduct post-deployment reviews to ensure successful implementations.

Continuously improve and optimize DevOps practices to increase efficiency.

Design and implement integration solutions that connect different IT systems and applications.

Ensure data flows efficiently and securely between systems.

Collaborate with other architects and developers to ensure compatibility and scalability.

Develop and maintain documentation for integration processes and protocols.

Works closely with data and automation team to ensure integration facilitates their projects

Qualifications

Knowledge and Skills:

Experience in deployment or support of application software, implementing systems and modules, with experience in multiple full lifecycle implementations. Strong knowledge of Python, Java, C, SQL, and DevOps.

https://redd.it/1sgm52h
@r_devops


Reddit DevOps

Will Datadog bill me twice for APM if I delete and recreate a host?

On the Datadog pricing table, it says that APM starts at $35 per host per month.


Now my question is: what if, during a month, I delete one of my hosts (for example an AWS EC2 instance) and create a new one? Will I be billed twice ($70 for the month), or will they calculate my bill according to the number of hours that I've used each host (so the total would be $35 for the month)?


Thank you

https://redd.it/1sfvw6p
@r_devops


Reddit DevOps

Testing a $6 server under load (1 vCPU / 1GB RAM) - interesting limits with Nginx and Gunicorn

I ran a small load test on a very small DigitalOcean droplet, $6 CAD:

- 1 vCPU / 1 GB RAM
- Nginx -> Gunicorn -> Python app
- k6 for load testing

At ~200 virtual users the server handled ~1700 req/s without issues.

When I pushed to ~1000 VUs the system collapsed to ~500 req/s with a lot of TIME_WAIT connections (~4096) and connection resets.

Two changes made a large difference:

- increasing nginx `worker_connections`
- reducing Gunicorn workers (4 → 3) because the server only had 1 CPU

After that the system stabilized around ~1900 req/s while being CPU-bound.
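
The drop from 4 to 3 workers lines up with the starting point suggested in the Gunicorn documentation, (2 x cores) + 1, which gives 3 on a single vCPU. A minimal sketch of a `gunicorn.conf.py` along those lines follows; it is not the configuration used in the experiment, and the bind address is a hypothetical placeholder.

```python
import multiprocessing


def suggested_workers(cpu_cores: int) -> int:
    """Gunicorn's documented starting point: (2 x cores) + 1."""
    return 2 * cpu_cores + 1


# gunicorn.conf.py style settings; the bind address is a hypothetical placeholder.
bind = "127.0.0.1:8000"
workers = suggested_workers(multiprocessing.cpu_count())
```

The formula is only a starting value; as the post shows, the right number is the one that holds up under a load test on the actual machine.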

It was interesting how much the defaults influenced the results.

Full experiment and metrics are in the video: https://www.youtube.com/watch?v=EtHRR_GUvhc

https://redd.it/1sg03rt
@r_devops


Reddit DevOps

Need suggestions

I have started learning cloud/DevOps. I have completed Linux, networking, and AWS (broke and fixed nginx, S3 permissions, website forbidden errors, checkingigs, etc.). Now I am thinking about getting a course from train with Shubham. Is it worth it, or should I look for other courses?

https://redd.it/1se3khd
@r_devops


Reddit DevOps

What are we using for realtime blocking of remote packages?

Was looking at the landscape for services that block upstream remote packages at an organizational level. I couldn’t really see a winner that spans all package types. We currently use JFrog’s Xray, but it didn’t block the recent axios exploit in time.

Does anyone use JFrog’s Curation subscription or socket.dev? Did it block the recent axios 1.14 package before anyone downloaded it?

https://redd.it/1saoexe
@r_devops


Reddit DevOps

Openclaw agent for devs to create new apps on EKS

Bear with me here. I'm thinking about having an openclaw agent that devs can interact with when they want to add a new app on our EKS cluster. For now it would be for the nonprod cluster only.

Say they can interact with the agent through Slack. They tell the agent what their app will need, like opening port 8080, making a PVC, or making a ConfigMap with given values. Then the agent creates the new app from a Helm template and also creates the CI/CD pipeline from a template. The agent could open a Jira ticket and a PR for us to review before applying the change. It could also document the app in Confluence. I don't see why this would not work. And we make sure the agent only has limited credentials and network access.

When we want to deploy the app on the prod cluster we could do it ourselves for now.

https://redd.it/1sbjpl5
@r_devops


Reddit DevOps

<Generic 'I built this to do some problem that doesnt actually exist' >

<Totally not AI generated problem statement that actually just exposes that OP has 0 clue about how anything works>

<Github link 80% of the time. Usually created 1 or 2 days ago. Completely out of whack when compared to OP's other public repo code which are usually named ~"python||typescript testing". Only shows OP as contributor cause they make the repo with AI first then delete and copy/paste/push >


<Generic asking for feedback section and statement that there is a paid version but you dont need to use it at first>

All credit to /u/Arucious for this one lmao

https://redd.it/1saw5ro
@r_devops


Reddit DevOps

Are certs still worth it in the job market??

I’m about to reenter the job market sadly, I remember certs being all the rage within 2019-2023 at my previous 2 companies back in that time. Hell back then, my company even gave us a 2 week sprint to just get certified & reimbursed us for 2 certifications a year.

I had an AWS cloud practitioner that expired 3 years ago, is it worth getting a newer AWS cert like solutions architect? For work around Ansible, terraform, or kubernetes?? Or one of the azure certs?

Or should I just build shit in my AWS environment and showcase it on my resume? Pretty much have 4 years of experience but the last 7 months might be a gap with the sysadmin contracting gig I had to take

https://redd.it/1sbwkn7
@r_devops


Reddit DevOps

Looking for new r/devops mods

We’re planning to add a few more mods to help with spam and keep things clean.
To apply, fill out this form: https://forms.gle/uWsqcZPUNvtxgi1v7

https://redd.it/1sb0vdd
@r_devops
