r_devops | Unsorted

Telegram channel r_devops - Reddit DevOps


Reddit DevOps. #devops Thanks @reddit2telegram and @r_channels


Reddit DevOps

Any self-hosted/FOSS log fingerprinting/anomaly pipelines?

I'm using Vector to ship my K8s/Spark/Kubernetes Events/Network Flow logs to Victoria Logs. I'd like to detect anomalies in logs and/or know when a new log pattern appears (the latter mainly to help with the former). I realize Victoria Metrics offers anomaly detection on their gold tier, but it's outside of our price range.

I'm coming up blank for anything you'd just drop in there... So far I've found:

https://pyod.readthedocs.io/en/latest
drain3

Bonus points: if I can use the same pipeline for metrics from Victoria Metrics/prometheus compatible source.
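
A minimal sketch of the "new log pattern" idea, assuming a much-simplified stand-in for what a template miner such as drain3 does: mask variable-looking tokens (IPs, hex values, long ids, numbers) to get a template, hash it, and flag hashes not seen before. The masking rules and the names `template_of`, `fingerprint`, and `NewPatternDetector` are hypothetical and are not part of drain3, PyOD, or Vector.

```python
import hashlib
import re

# Hypothetical masking rules; order matters (IPs before plain numbers).
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b[0-9a-fA-F]{8,}\b"), "<ID>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def template_of(line: str) -> str:
    """Replace variable-looking tokens so similar lines collapse together."""
    for pattern, token in MASKS:
        line = pattern.sub(token, line)
    return line.strip()


def fingerprint(line: str) -> str:
    """Short stable hash of the masked template."""
    return hashlib.sha1(template_of(line).encode("utf-8")).hexdigest()[:12]


class NewPatternDetector:
    """Remembers fingerprints and reports the first time each one appears."""

    def __init__(self):
        self.seen = set()

    def is_new(self, line: str) -> bool:
        fp = fingerprint(line)
        if fp in self.seen:
            return False
        self.seen.add(fp)
        return True
```

In a real pipeline the `seen` set would need to be persisted and the masking rules tuned per log source; this only illustrates the fingerprinting step.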

https://redd.it/1t40zwx
@r_devops


Reddit DevOps

Is it just me, or does HCL have basically no enforceable standards beyond formatting?

Working with Terraform / Terragrunt over time, I keep running into the same issues:

- every repo structures HCL differently
- PRs spend time on block order / layout instead of logic
- dependency references that look fine but break later
- terraform fmt only solving whitespace, not structure

It feels like HCL tooling stops at formatting, while in most other ecosystems you have linters that enforce actual structure and conventions.

I tried experimenting with a linter approach for HCL that focuses more on:

- enforcing block order (e.g. include → locals → dependency → inputs)
- validating dependency references and outputs
- detecting duplicates / inconsistent definitions
- optionally auto-fixing some of these issues

Example of the kind of thing I mean:

before:

dependency "vpc" { ... }
locals { ... }
include { ... }


after:

include { ... }
locals { ... }
dependency "vpc" { ... }


The goal here isn’t aesthetics - it’s reducing cognitive load.
Curious how others see it.

When structure is consistent:
- you know where to look for things without scanning the whole file
- diffs become smaller and easier to review
- mistakes like misplaced dependencies or overrides are easier to catch

Right now this is usually enforced (if at all) during PR review, which is slow and subjective.

Pushing it into a linter makes it:
- deterministic
- automatable in CI
- less dependent on team habits

That said, I’m not convinced yet this is better than just keeping things flexible + relying on reviews.
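
A rough sketch of the block-order check described above, under stated assumptions: it uses a regex rather than a real HCL parser, only sees top-level blocks that start at column 0 and end their first line with `{`, and uses the include → locals → dependency → inputs order from the post. The function name `block_order_violations` is hypothetical.

```python
import re

# Order taken from the post; any other block type is ignored by this sketch.
EXPECTED_ORDER = ["include", "locals", "dependency", "inputs"]

# Assumes top-level blocks start at column 0, e.g. `dependency "vpc" {` or `inputs = {`.
BLOCK_START = re.compile(r"^([a-z_]+)\b[^\n]*\{[ \t]*$", re.MULTILINE)


def block_order_violations(source: str) -> list[str]:
    """Return one message per top-level block that appears out of order."""
    rank = {name: i for i, name in enumerate(EXPECTED_ORDER)}
    violations = []
    highest_name = None  # latest-ranked block type seen so far
    for match in BLOCK_START.finditer(source):
        name = match.group(1)
        if name not in rank:
            continue
        if highest_name is not None and rank[name] < rank[highest_name]:
            violations.append(f"'{name}' should come before '{highest_name}'")
        else:
            highest_name = name
    return violations
```

Running this over a file in CI and failing on a non-empty result would give the deterministic check the post is after; auto-fixing would need a proper parser so comments and nested blocks move with their parent.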

https://redd.it/1t3rrxc
@r_devops


Reddit DevOps

Where do you keep your personal scripts?

Talking about scripts you have written to get information or help you do a task at work but don’t necessarily belong in a repo (Like looping aws cli commands through multiple environments to audit fargate versions, audit users in rds databases, kick off force deploys, etc). Not to mention if you leave the company you wouldn’t wanna lose it.

Upload to personal GitHub? Save to a personal note taking app with cloud saves? I’ve got enough scripts now that I’d be devastated if I was let go and lost access to the local files on my work computer. Would be neat to have something with versioning, otherwise I guess I’ll just look at a note taking app with cloud saves

https://redd.it/1t3n2yf
@r_devops


Reddit DevOps

The job market is tough but .. I think it’s tougher on my side

I'm really having a hard time. I can’t send as many job applications as I would like due to my location. Most jobs aren't hiring from Africa. The local market here is also in the pits, way worse than the global market. No opportunities here. The few opportunities I get that accept EMEA end up ghosting me with no replies. Others are straight-up scams. I’m averaging a total of 4 applications a month. I know I need to send more applications to increase my chances, but there are literally no job vacancies that I can send my resume to.

Then I log into this sub and I see posts saying Western countries are outsourcing jobs to third-world markets since we are cheaper. And I’m left wondering: where are these jobs, though? Outsourcing to us, but I can’t find a single place that accepts applications from third-world countries.

https://redd.it/1t3k8tk
@r_devops


Reddit DevOps

I made TUI for easy Terraform work
https://redd.it/1t3cuxj
@r_devops


Reddit DevOps

Weekly Self Promotion Thread

Hey r/devops, welcome to our weekly self-promotion thread!

Feel free to use this thread to promote any projects, ideas, or any repos you're wanting to share. Please keep in mind that we ask you to stay friendly, civil, and adhere to the subreddit rules!

https://redd.it/1t39nww
@r_devops


Reddit DevOps

Is Docker still used in industry or is orchestration the way to go?

For most of the time I've had a home lab, I've used Docker to set up services in my network. I don't see much online for examples of actual businesses using Docker in a significant capacity though. Does anybody still use Docker in the industry or has everything switched over to Kubernetes?

https://redd.it/1t2uq6h
@r_devops


Reddit DevOps

4 YOE DevOps Engineer: can someone review my resume? A senior told me I need 3+ pages to get offers, but I kept it to 2. Any suggestions?

One of my senior colleagues suggested I need at least 3 pages in my resume to get offers. However, from everything I've read online, 1–2 pages is the standard — especially in tech. I kept mine to 2 pages and focused on quality over quantity.

I have 4 years of experience in DevOps, specializing in Kubernetes, cloud infrastructure, and CI/CD automation. Currently working at Infosys as a Senior Associate Consultant and actively looking for new opportunities.

**Key highlights:**

* CKA + CKS certified
* Managed 8 EKS clusters and 200+ nodes
* Maintained 100+ Jenkins pipelines
* Hands-on vulnerability remediation using Wiz (actual CVE IDs in resume)
* Took a DR implementation to production

Would love honest feedback — is 2 pages fine or should I be adding more? Also open to any suggestions on content, formatting, or anything I'm missing!


https://preview.redd.it/cw4oxqa7txyg1.png?width=682&format=png&auto=webp&s=e23ff7f75fbf5678a7a5a51788f600aa21ffa18b



https://redd.it/1t2nu5b
@r_devops


Reddit DevOps

Transitioning to DevOps - does this look like a good plan?

https://preview.redd.it/1csfwfxqywyg1.png?width=944&format=png&auto=webp&s=f89aef501d4a03c2a43c0c8d0264abbd2d593f2b

I am helping an ex-colleague transition to DevOps. Does this look like a good plan?

>Transitioning from a Telecom Engineer to a Senior DevOps Engineer presents a moderate gap, primarily in cloud and containerization skills. With your strong problem-solving abilities and systems integration experience, you are well-positioned to bridge this gap with targeted learning and hands-on practice. Expect a competitive salary increase upon successful transition.

https://redd.it/1t2js86
@r_devops


Reddit DevOps

VS code inserting 'co-authored by copilot', regardless of usage

https://github.com/microsoft/vscode/pull/310226 - I never use Copilot, and marking my work as co-authored feels very wrong. Time to get a new IDE; what good alternative would the group recommend?

https://redd.it/1t2emie
@r_devops


Reddit DevOps

Tired of rebuilding the same on-call pay spreadsheet every month, so I made a thing (PagerDuty)

Every month without fail someone on my team had to manually cross-reference PagerDuty schedules with our incident log, figure out which days were bank holidays (spoiler: England and Scotland have different ones), apply the right stipend rates, count callouts, and format it all into something finance would actually accept.

It took about 1 hour each time, was different every month depending on who did it, and we had at least two payroll disputes in a year because of calculation errors.

Eventually got fed up and built calloutpay.com. It connects to PagerDuty, you put in your rates (weekday/weekend/bank holiday stipend, callout fee, hourly rate if applicable), it saves all this and spits out a formatted XLSX. The only thing you need to change is the date range (handy for monthly submissions). It handles UK regions and US federal holidays automatically, with custom dates for anywhere else.
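
To make the calculation concrete, here is a small sketch of the stipend arithmetic under stated assumptions: one stipend per calendar day depending on day type, plus a flat fee per callout. The function `on_call_pay`, the rate keys, and the example amounts are hypothetical and do not describe how calloutpay.com works; hourly rates and regional holiday calendars are left out.

```python
from datetime import date, timedelta


def on_call_pay(start, end, rates, bank_holidays, callouts):
    """Sum daily stipends for start..end inclusive, plus a flat fee per callout."""
    total = 0.0
    day = start
    while day <= end:
        if day in bank_holidays:
            total += rates["bank_holiday"]
        elif day.weekday() >= 5:  # Saturday=5, Sunday=6
            total += rates["weekend"]
        else:
            total += rates["weekday"]
        day += timedelta(days=1)
    return total + callouts * rates["callout_fee"]


# Example week: Mon 23 Dec 2024 to Sun 29 Dec 2024, with 25 and 26 Dec as bank holidays.
rates = {"weekday": 20, "weekend": 40, "bank_holiday": 60, "callout_fee": 50}
holidays = {date(2024, 12, 25), date(2024, 12, 26)}
week_total = on_call_pay(date(2024, 12, 23), date(2024, 12, 29), rates, holidays, callouts=2)
```

With these made-up rates the week comes to 3 weekdays, 2 bank holidays, 2 weekend days, and 2 callouts, i.e. 60 + 120 + 80 + 100 = 360.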

Not trying to hard sell it, free trial — mainly posting because I suspect we're not the only team doing this manually every month and wanted to see if others have hit the same problem or solved it differently.

Anyone else dealing with this or have a better approach?

https://redd.it/1t20opt
@r_devops


Reddit DevOps

fedit — a deterministic CLI + MCP file editor I built after watching LLMs mangle my Terraform/YAML/nginx configs

**TL;DR:** `fedit` is a small Go CLI that does line-addressable file edits (show / insert / delete / replace / replaceall / map / find / insertafter / insertbefore / write). It also ships an **MCP server** so Claude/Cursor/etc. can call the same ops as tools instead of regenerating whole files. MIT, single binary, no deps. Repo: https://github.com/amalexico/fedit



---

### Why this exists



Like a lot of you, I've been letting LLMs touch my configs — Terraform modules, Helm values, nginx blocks, GitHub Actions workflows, Ansible playbooks. The failure mode is always the same:



- I ask for a 3-line change in a 400-line file.
- The model rewrites the whole file.
- Two unrelated keys get reordered, a comment vanishes, an indent flips from 2→4 spaces somewhere on line 217, and a heredoc gets "helpfully" reformatted.
- `terraform plan` now wants to recreate half my infra. Cool.



The root cause: text generation is non-deterministic and whole-file rewrites have no surgical primitives. So I gave the model surgical primitives.



### What fedit does



Ten operations, all addressable by line number **or** by content match:



```
show          Display lines (whole file or range)
insert        Insert after line N
delete        Delete line N or range
replace       Replace line range
replaceall    Global find-and-replace
write         Overwrite file
map           Structural overview (functions/classes/blocks)
find          Find lines matching substring
insertafter   Insert after a matching line   ← preferred
insertbefore  Insert before a matching line  ← preferred
```



`map` understands 17 languages: Go, HTML, SQL, Python, JS, TS, CSS, Rust, Java, C#, YAML, TOML, Markdown, Ruby, PHP, Dockerfile, Makefile. So an agent can ask "where's the `resource "aws_s3_bucket"` block?" and get an answer without reading the whole file into context.



Every mutation supports `-v` which prints a `=== STATS ===` block (op, file, match, line delta, elapsed) so the agent can verify its own edit landed.
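
For readers unfamiliar with the idea, this is a sketch of what a content-addressed "insert after a matching line" primitive does in general. It is written in Python for illustration and is not fedit's implementation (fedit is a Go binary); the function name `insert_after` and its first-match behavior are assumptions.

```python
def insert_after(lines: list[str], match: str, new_line: str) -> list[str]:
    """Return a copy of lines with new_line inserted after the first line containing match."""
    for index, line in enumerate(lines):
        if match in line:
            # Every other line is carried over untouched, which is the whole point.
            return lines[: index + 1] + [new_line] + lines[index + 1 :]
    raise ValueError(f"no line contains {match!r}")


config = ["server {", "    listen 80;", "}"]
edited = insert_after(config, "listen 80;", "    server_name example.com;")
```

Because the operation only splices one line in, unrelated lines cannot be reordered or reindented, unlike a whole-file regeneration.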



### Demo (4-op workflow)

![demo](https://raw.githubusercontent.com/amalexico/fedit/main/demo.gif)



Recorded with vhs — `demo.tape` is in the repo if you want to reproduce it.



### MCP server mode



`fedit mcp` starts a JSON-RPC 2.0 server on stdin/stdout exposing all 10 ops as MCP tools (`fedit_show`, `fedit_insertafter`, `fedit_replaceall`, etc.). Drop it in your Claude Desktop / Cursor / Continue config and the model edits files through tool calls instead of regenerating them.



There's also a `SKILL.md` (Anthropic Agent Skills format) in the repo so Claude picks up the usage conventions automatically.



### Benchmark



I ran the same 7 editing tasks (T1–T7: insert middleware, delete dead handler, rename across file, swap a YAML key, etc.) across **Claude, ChatGPT, and Gemini**, comparing free-form whole-file edits vs. fedit-mediated edits. Full table is in the README. Short version: free-form rewrites silently corrupted unrelated lines on 4–6 of 7 tasks depending on the model; fedit-mediated edits were byte-exact on all 7 across all three.



Not claiming this is rigorous science — it's a reproducible smoke test. Tape is in the repo, run it yourself.



### Install

```bash
go install github.com/amalexico/fedit@latest
```



Single binary, no runtime deps, MIT license.



### What I'd love feedback on



1. Op set — anything missing for your IaC / config workflows? (someone in r/golang asked for `swap-lines`, considering it)

2. The `map` language list — what should I add? Thinking HCL/Terraform next, given the audience here.

3. MCP tool schemas — if you've wired this kind of thing into an agent before, am I exposing the right surface area?



---



*Disclosure: I used LLM assistance writing parts of fedit and this post. The benchmark,


Reddit DevOps

Root SSH with keys only 👍 or 👎? Why as opposed to another user with sudo without password ability?

Most basic OS hardening recommendations say to disable root login. What is the security risk compared with having another user with passwordless sudo?
Things I can think of: root is an obvious username to try to brute-force, and it is highly risky if compromised.
But the other username I have is obvious too, and it does have sudo ability. So what is the best approach?

https://redd.it/1t1zprm
@r_devops


Reddit DevOps

How do you debug async job failures across multiple steps?

Ran into this recently & it took longer than expected to figure out.
Flow was something like:
Order → Payment → Email → Analytics

The payment step was failing intermittently & retrying in the background. Because of that, downstream jobs were either delayed or never triggered.

The tricky part wasn’t the error itself but figuring out which retry attempt actually failed, whether it eventually succeeded or not, & what downstream jobs were impacted. I ended up jumping between logs and trying to piece everything together manually.

How do you handle this in production? Logs + grep? Tracing tools (OpenTelemetry, etc.)? Internal dashboards? Or is this just one of those things everyone works around?
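
One low-tech option, sketched here under assumptions: if every job attempt emits a structured event carrying the order id, step name, attempt number, and status, the "which attempt failed and what never ran" question becomes a small aggregation. The event schema and the function `summarize` are hypothetical; the step names follow the flow in the post.

```python
from collections import defaultdict

PIPELINE = ["order", "payment", "email", "analytics"]  # flow from the post


def summarize(events: list[dict], order_id: str) -> dict[str, str]:
    """Collapse per-attempt events into one status line per pipeline step."""
    attempts = defaultdict(list)
    for event in events:
        if event["order_id"] == order_id:
            attempts[event["step"]].append(event)
    summary = {}
    for step in PIPELINE:
        tries = sorted(attempts[step], key=lambda e: e["attempt"])
        if not tries:
            summary[step] = "never ran"
            continue
        failed = [e["attempt"] for e in tries if e["status"] == "failed"]
        final = tries[-1]["status"]
        summary[step] = f"{final} after {len(tries)} attempt(s), failed attempts: {failed}"
    return summary
```

Distributed tracing gives the same picture with spans instead of hand-rolled events; the key in both cases is one shared id that every step and every retry carries.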

https://redd.it/1t1z1t9
@r_devops


Reddit DevOps

Built a Jenkins plugin that tracks all 4 DORA metrics natively

Been working on getting DORA metrics visibility for our pipelines without setting up external infrastructure like Prometheus/Grafana or paying for commercial tools. Ended up building a Jenkins plugin that does it all inside Jenkins itself.

It tracks Deployment Frequency, Lead Time for Changes, MTTR, and Change Failure Rate. Also does pipeline rankings (slowest, most failing, flakiest), stage-level analytics, and has a REST API if you want to pull data into other tools.
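
As an illustration of what these metrics boil down to, here is a sketch that computes three of the four from a list of deployment records. It is not the plugin's code; the record fields (`committed_at`, `deployed_at`, `failed`) and the function `dora_summary` are assumptions, and MTTR is left out because it needs incident open/resolve timestamps.

```python
from datetime import datetime
from statistics import median


def dora_summary(deployments: list[dict], window_days: int) -> dict:
    """Compute deployment frequency, change failure rate, and median lead time (hours)."""
    total = len(deployments)
    failures = sum(1 for d in deployments if d["failed"])
    lead_hours = [
        (d["deployed_at"] - d["committed_at"]).total_seconds() / 3600
        for d in deployments
    ]
    return {
        "deployments_per_day": total / window_days,
        "change_failure_rate": failures / total if total else 0.0,
        "median_lead_time_hours": median(lead_hours) if lead_hours else None,
    }
```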

Everything runs on an embedded SQLite database, zero external dependencies. You install it and it starts collecting data from every build automatically.

Just got it officially hosted in the Jenkins project: https://plugins.jenkins.io/pipeline-dora-metrics/

Curious if others have tackled DORA metrics tracking in Jenkins and what approach you went with.

https://redd.it/1t1vixo
@r_devops


Reddit DevOps

Transitioning as a Sysadmin/Engineer to DevOps

I am a Sysadmin/Engineer with 15+ years of experience and am making the decision to switch to Devops.

I have worked closely with Devops teams and understand what they do, however, the bulk of my responsibility with them is to provide them infrastructure, alleviate any networking / firewall issues from our on-prem to cloud, and making sure our infra is dynamic and can scale in the ways that we need.

I've done quite a bit of automation with PowerShell, know some Ruby, and have used Ansible to manage our Linux fleet.

I'm looking to learn more in-depth knowledge with k8s, Terraform, and essentially standard tools a Devops engineer should have in their belt.

Looking for advice from anyone who made the jump from traditional ops or those in the field.

Should I learn Python over Ruby? What tools are standard in the Devops realm? Anything I should be aware of?

https://redd.it/1t3n2l0
@r_devops


Reddit DevOps

Human written reviews for the FinOps tools

Hey guys, where are you checking the reviews for the tools that you want to buy? I feel like reviews on G2 are fake, written for some bonus, because somebody reached out and offered a $30 gift card for a 5-star review.

https://redd.it/1t3ojo7
@r_devops


Reddit DevOps

How do you handle email deliverability issues in your apps? Any tools or tricks that actually work?

Please share your experience with a tool that worked best for you.

https://redd.it/1t3lkeh
@r_devops


Reddit DevOps

K8s from the start or not? ClickHouse or Loki for logs?

Hello guys,

I work at a startup and we’re getting close to going into production. Right now we have a backend, a load balancer, managed PostgreSQL, and managed Redis. We’re planning to use Elasticsearch (or maybe just rely on PostgreSQL for full-text search).

We’ll also have a separate server for logs, and another one for metrics with Grafana for visualization.

I’m not sure if it’s better to start with Kubernetes from day one, or just stick with managed services so I don’t have to deal with managing all of this infrastructure without real production experience.

I’m a backend engineer with good knowledge of cloud, DevOps, and Kubernetes, but I don’t have hands-on production experience yet, and honestly I’m a bit overwhelmed with all the options.

Would appreciate any advice..

https://redd.it/1t3j9p0
@r_devops


Reddit DevOps

Start my journey

Hey, new to this sub, and I've been pretty interested in learning DevOps and getting a job in it. What skills do I need, and to what extent, for a fresher role?

Lmk any advice from people who are learning or working as DevOps engineers.

https://redd.it/1t3c6yv
@r_devops


Reddit DevOps

How to monitor your Kubernetes cluster with the OpenTelemetry Collector using the agent + gateway pattern
https://telflo.com/blog/monitoring-kubernetes-with-the-opentelemetry-collector

https://redd.it/1t35mos
@r_devops


Reddit DevOps

Radar, the “yet another Kubernetes UI” project, now at 1.4k stars after a couple of months

A couple of months ago I posted here about Radar, the OSS Kubernetes UI we had just released after getting frustrated with Lens / FreeLens / Headlamp / Kubernetes Dashboard / k9s.

That post got a lot more attention than we expected, and the repo is now at ~1.4k GitHub stars ⭐. So first: thanks. A lot of the feedback from that thread shaped what we shipped next.

Radar is still fully open source, Apache 2.0. It runs locally as a single Go binary using your existing kubeconfig. No account or cloud dependency required. Still takes about 15 seconds to install and run 😄

Since the original post, we’ve added/improved quite a bit:

- topology with real ownership/resource relationships
- resource browser with logs, exec, port-forward, YAML, etc.
- live event streams using Kubernetes watches
- Helm diff/rollback
- Argo CD + Flux visibility/sync
- traffic flows via Hubble/Cilium, Istio, or Caretta
- OpenCost insights
- image filesystem inspection
- cluster checks
- built-in MCP server for AI agents

Plus a lot of fixes and polish, including plenty from bugs reported here on Reddit.

We’re doing a more proper launch today, if anyone wants to support: https://www.producthunt.com/products/radar-7

But mostly I wanted to come back here because Reddit is where this really got kicked off in the first place.

Would love feedback again: what’s missing, what breaks, what would make this useful as your daily K8s UI?

https://redd.it/1t2q3pj
@r_devops


Reddit DevOps

Does this kind of 4-mode deployment diagram (local dev / CI / staging / prod) make any sense?
https://redd.it/1t2la7x
@r_devops


Reddit DevOps

DevOps/SRE autonomous agent permissions

After the story of Claude destroying someone's entire data with a token it found lying around, how much would you trust a DevOps/SRE agent sitting in the cloud with autonomous tasks like remediating alerts? How much autonomy and which permissions would you give such an agent?

https://x.com/lifeof_jer/status/2048103471019434248

https://redd.it/1t2gadk
@r_devops


Reddit DevOps

AI coding tools are now a CVSS 10.0 CI/CD supply chain vector - patch Gemini CLI and update Cursor

Two critical AI coding tool vulns landed last week with the same root cause: agents that autonomously execute OS operations trust their environment in ways that weren't designed for automated use.

Gemini CLI (CVSS 10.0, no CVE): In headless/CI mode, it automatically trusted workspace folders for config loading - no sandboxing, no explicit consent. Attack vector: submit a PR to any project running Gemini CLI in CI, plant a crafted .gemini/ config, and you get RCE on the CI host before the sandbox initializes. Affected: @google/gemini-cli < 0.39.1 and google-github-actions/run-gemini-cli < 0.1.22. Also fixed in 0.39.1: --yolo mode was ignoring tool allowlists entirely, running run_shell_command on untrusted input without confirmation.

Cursor CVE-2026-26268 (CVSS 8.1): Clone a repo containing an embedded bare repository with a malicious post-checkout hook. Open in Cursor. Ask "explain the codebase." The agent reads AGENTS.md, runs git checkout on the embedded repo - hook fires, RCE, no further interaction required. Fixed in Cursor 2.5. CursorJacking (CVSS 8.2) is a separate issue: any installed extension can read an SQLite database holding API keys and session tokens. Still unpatched.

The Cursor advisory describes the pattern well: "The root cause is not a flaw in Cursor's core product logic, but rather a consequence of a feature interaction in Git, one that becomes exploitable the moment an AI agent starts autonomously executing Git operations inside a repository it doesn't control."

These tools were built for human-supervised use. Agents running the same operations unsupervised expand the trust boundary in ways nobody audited. How are you sandboxing AI coding agents in your CI pipelines?
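
As a first triage step, a CI job could compare pinned tool versions against the fixed versions quoted above. This is a sketch under assumptions: the dictionary keys are short labels chosen here, versions are plain dotted integers (no pre-release suffixes), and `is_affected` is a hypothetical helper, not part of any of the tools mentioned.

```python
# Fixed versions as quoted in the post; the keys are illustrative labels.
PATCHED = {
    "gemini-cli": (0, 39, 1),
    "run-gemini-cli": (0, 1, 22),
}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '0.39.1' or 'v0.39.1' into (0, 39, 1); assumes numeric parts only."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def is_affected(tool: str, version: str) -> bool:
    """True if the given version is older than the fixed version for that tool."""
    return parse_version(version) < PATCHED[tool]
```

A version check only confirms the patch level; it does not replace sandboxing the agent or restricting what the CI token can reach.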

https://redd.it/1t26rnm
@r_devops


Reddit DevOps

tape, and code are mine and reproducible from the repo.*

https://redd.it/1t24q5z
@r_devops


Reddit DevOps

Looking for hands-on DevOps experience — happy to contribute to real projects

Hi everyone,

I’m a QA engineer (~3 YOE) based in Pune, India, transitioning into DevOps. Over the last 6–7 months, I’ve completed 100+ hands-on labs (KodeKloud) and worked with tools like Kubernetes, Docker, Linux, Terraform, AWS, Jenkins, ArgoCD, Python, Grafana, and Prometheus.

I’m looking for opportunities to contribute to real-world projects (personal, open-source or professional) to gain practical experience before applying for DevOps roles. I’m happy to help, learn, and collaborate — compensation isn’t a concern.

On the positive side, my current role is quite balanced and doesn’t demand long hours, which allows me to dedicate consistent time to learning and improving my DevOps skills.

Also open to any advice or guidance from the community.

Thanks in Advance!

https://redd.it/1t22o93
@r_devops


Reddit DevOps

I started a DevOps YouTube channel and would love feedback / ideas on learning content

Hello DevOpsers!

I am a big video learner, and learned lots about devops through videos in the last 10 years or so. There are some nice channels like learndevopswithnana for example. This influenced me to start a devops channel of my own on YouTube, and would love to hear your thoughts on what direction I could go in..

I am big on analogies to explain things. For example trying to explain what cloud networks are using analogies. What are some of the topics you really struggle to understand still in devops?
What do you think are some good niches to go into within DevOps? Another crash course in k8s? Deploying LLMs? Or do you think just going over fundamentals is nice too? Covering lesser-known cloud services?
For more of the product/tech enthusiasts out there, maybe some higher level topics would be cool like understanding the netflix architecture and explaining how this architecture works on a really high level for a less techy person?
or just all of the above? Like don't sweat it


I have made two videos so far, and my dev friends liked them. I also really enjoy learning how to make content, tbh. It's a long game, but I enjoy making the content.


Don't want to post my channel name here - but if you are interested in watching, just DM me.


Thanks in advance everyone!

https://redd.it/1t1yiro
@r_devops


Reddit DevOps

Built something for learning DevOps

Hey everyone,

I’ve been working on a small project called devopsbuddy.in — it’s a beginner-friendly website focused on learning DevOps concepts in a more interactive and simple way.

It’s still a work in progress (definitely not fully polished yet), but I wanted to share it early and get some honest feedback from the community.

The idea is to make DevOps easier to understand with:

- Simple explanations
- Interactive elements (still adding more)
- A structured path for beginners

I’d really appreciate any feedback — whether it’s about UI/UX, content clarity, missing topics, or anything else you think could improve it.

Be as critical as you want — I’m building this to get better and make something genuinely useful.

Here’s the link: https://devopsbuddy.in

Thanks in advance 🙌

https://redd.it/1t1w88v
@r_devops


Reddit DevOps

Did you guys start arguing with AI?

Did you guys start arguing with AI until it gives the right answer?

https://redd.it/1t1r2sg
@r_devops
