Reddit DevOps. #devops Thanks @reddit2telegram and @r_channels
One piece of advice you wish you'd heard sooner?
Mine is pretty basic: it's not worth it to learn a new framework before getting pretty good at one. I wasted a solid year (doing tech support and trying to break into a product team) because I kept changing languages/frameworks/tools. I guess the general advice is 'for the first year, pick a context and stick with it.'
It's a lot easier to learn AWS after you've stuck with Azure for a year solid. It's a lot easier to learn Playwright tests if you have a good grasp of Selenium, rather than switching back and forth as you're first learning.
https://redd.it/1brd69a
@r_devops
The Critical Role of Continuous Integration in Agile Software Development
The guide explores how agile practices make software development easier and faster when developers combine test-driven development (TDD) with continuous integration (CI). It also covers taking CI to the next level with CodiumAI and through deeper integration with practices like Continuous Delivery (CD) and DevOps, enhanced automation, and improved collaboration and efficiency in software teams.
https://redd.it/1branno
@r_devops
Forced from Slack to Teams
My company pulled the plug on Slack after 8 years. We were given two weeks' notice to migrate over 100 integrations and all our alerts.
MS Teams freaked out a couple of times and we've had to delete Teams channels and recreate them to get our integrations to work. Channels feel like Twitter or social media posts. I can't limit notifications as well or set groups to mention.
Is it wrong to quit just because they took away Slack? Anyone else go through this?
https://redd.it/1br7ig1
@r_devops
Certification Exam Nightmare w PSI
Recently I attempted to take a Consul Associate Exam from HashiCorp, delivered through PSI Services. I clicked on the Launch button, and it immediately opened a psiexam:// URL, which the browser cannot understand (Firefox on both macOS and Windows 10). Apparently, there's supposed to be a PSI Secure Browser installed, but there's no documented requirement for this. It seems like it should be installed through the Launch process, but that is not happening.
When I attempted to get support (calling to reach an actual human agent), the exam window expired, and I was marked as absent. So the agent said they could not help me. Apparently, I would have to buy multiple exams, get human tech support during one of these exam windows, and hopefully, they would be able to find a solution by remoting into my system.
I am not sure what to do. I am concerned as I think Linux Foundation uses PSI services as well. :'(
https://redd.it/1br21dv
@r_devops
Build agent (runner) with option to execute as regular user
I searched for such an option but, surprisingly, I wasn't able to find one. GitLab has an open issue for allowing their Windows runners to be installed without elevated access, but that's the only one I found. Has anyone used a Windows build agent that has an option to run as a regular user? Thanks.
https://redd.it/1bqv14l
@r_devops
Is the DevOps role nearly dead?
There used to be a time when DevOps was booming. Now we rarely see any openings for DevOps and SRE.
What are your thoughts?
https://redd.it/1bqwaig
@r_devops
Retiring Ansible?
Hey guys,
So my company heavily uses Ansible. We run tons of k8s but we have our database and a few other legacy services running directly within VMs.
Said VMs are launched from a custom configured AMI but still some configuration is done via Ansible.
In my interviews with other companies, many have moved away from Ansible and use only Terraform.
My question, for those who run VMs outside k8s: how do you configure or set up your VMs using only Terraform?
In later interviews, it came out that many of them no longer ran any VMs outside of k8s, which allowed them to retire Ansible. That said, I'm curious if others have seen or done differently in their own experience.
I have used Terraform for a few projects but not full GitOps where my entire infra is managed by Terraform.
This post is an attempt at discovering a piece of the infra or a usage of Terraform that I don't fully grasp or am not aware of. I don't know what I don't know, I guess lol.
Thanks
https://redd.it/1bqs1o3
@r_devops
Grafana + Prometheus on AKS
Hey,
I'm using Managed Prometheus on AKS, and have installed DCGM in order to scrape GPU data. When port-forwarding DCGM, I can see that it is scraping fine. The Azure docs say that Managed Prometheus will automatically pick up any new scraping. However, when I add metrics in Grafana such as 'DCGM_FI_DEV_GPU_TEMP{instance=~"${instance}", gpu=~"${gpu}"}' (temperature!), I get the 'No Data' screen.
Is there any other way I can verify that this managed prometheus is scraping the GPU data correctly?
Thanks.
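One way to check this outside Grafana is to send the bare metric name to the Prometheus HTTP query API and look at the series that come back. Below is a minimal sketch, assuming a Prometheus-compatible `/api/v1/query` endpoint with bearer-token auth; the base URL and token are placeholders that are not from the post, and obtaining a token for Azure Managed Prometheus is left out.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for a Prometheus-compatible HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def series_labels(response_body: str) -> list[dict]:
    """Return the label set of every series in an instant-query response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return [item["metric"] for item in payload["data"]["result"]]


def fetch_series(base_url: str, promql: str, token: str) -> list[dict]:
    """Run the query over HTTP (bearer-token auth is an assumption here)."""
    request = urllib.request.Request(
        build_query_url(base_url, promql),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return series_labels(response.read().decode("utf-8"))
```

If `fetch_series(base_url, "DCGM_FI_DEV_GPU_TEMP", token)` returns an empty list, the metric is probably not reaching the workspace at all. If it returns series, compare their `instance` and `gpu` labels with the dashboard variables, since a mismatch in those regex filters would also produce 'No Data'.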
https://redd.it/1bqe3du
@r_devops
Help on understanding Dev, Staging and Prod Environments
Hey there,
I'm working on a small app with separate backend and frontend components. The backend consists of two containers: a Java API and a PostgreSQL database. These are deployed on AWS using one EC2 instance to host both containers and an S3 bucket primarily for storing assets like images. The frontend is built on React.
I've divided backend and frontend development, and now I'm figuring out how to manage different environments. Here are my questions:
Development/Test Environment: Testing the backend is straightforward on a local machine by running the containers, but what about S3? Should I simulate storage locally or connect to a dev S3 bucket?
Staging Environment: I'm using Terraform to provision staging and production environments. My plan is to create the staging environment when needed and tear it down after testing. Should I use separate S3 buckets for staging to avoid extra costs? Any recommendations for managing staging efficiently?
Production Environment: Once staging tests are successful, updates and fixes need to go to production. I'll use Terraform to check for updates and then update the code with Ansible. Does this approach make sense? Any recommendations for handling production updates?
Terraform Management: Should staging and production Terraform configurations be separate files? If yes, how do I promote changes from staging to production? Also, do I need Terraform for development/testing?
I know it's a lot, but I want to follow best practices. Any advice would be greatly appreciated. Thanks!
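The S3 question often comes down to making the bucket and endpoint a per-environment setting, so the same backend code can talk to a local emulator in development and to real buckets elsewhere. Here is a rough sketch, shown in Python for brevity although the same shape carries over to a Java API. The `APP_ENV` variable, the bucket names, and the emulator address `http://localhost:4566` (LocalStack's default port) are all assumptions and do not come from the post.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageConfig:
    bucket: str
    endpoint_url: Optional[str]  # None means "use the real AWS endpoint"


# Hypothetical names: one bucket per environment keeps test data out of production.
STORAGE_BY_ENV = {
    "dev": StorageConfig(bucket="myapp-dev-assets", endpoint_url="http://localhost:4566"),
    "staging": StorageConfig(bucket="myapp-staging-assets", endpoint_url=None),
    "prod": StorageConfig(bucket="myapp-prod-assets", endpoint_url=None),
}


def load_storage_config(env: Optional[str] = None) -> StorageConfig:
    """Pick the storage settings for the given environment or APP_ENV (default 'dev')."""
    name = env or os.environ.get("APP_ENV", "dev")
    try:
        return STORAGE_BY_ENV[name]
    except KeyError:
        raise ValueError(f"unknown environment: {name!r}") from None
```

Keeping the mapping in one place means promoting a change from staging to production only changes which entry is selected, not the code that uses it.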
https://redd.it/1bq5v44
@r_devops
Do any of you actually have the capacity to do your jobs correctly? Mostly speaking to more “platform” folks.
In the last year I’ve become very apathetic to process and “rigor”. As long as we can get a change out with confidence that it didn’t break anything, I’ve stopped caring how we get to that point. If something’s broken, I only care that it was fixed. If someone’s misconfigured something, I only care that it’s now configured correctly with no additional thought to how we may be able to build a better process.
I just want to get the job done by any means necessary. We just don’t have the time to do anything else. The team’s been cut to ribbons and our ownership is massive. I want to stay because the upward mobility is good and the job market looks like crap, but it sucks feeling like I’m just keeping the lights on and not improving.
Do any of you actually have the time to do things correctly, automate, and simplify or are you just wading through the muck waiting for your shares to vest?
https://redd.it/1bq59to
@r_devops
how do you test github actions
when adding new content to an existing workflow/action what easy way do you have to test it without waiting tons of time for each run to end?
https://redd.it/1bpz8zu
@r_devops
I lied to HR
Hey, I exaggerated to HR about my experience with Kubernetes in my last job. I made it sound like I was the main person handling all the Kubernetes development and pipelines. Surprisingly, they believed me, I cleared the interview, and they offered me a good job as a Kubernetes developer. But now I'm feeling nervous because I actually only had some experience deploying to EKS, and most of our deployments were on ECS. I have 4 years of experience in DevOps, but I'm worried about handling such a big Kubernetes deployment. I know I shouldn't have exaggerated so much, but I really needed the job. Now I'm not sure what to do because I have to start in 20 days.
https://redd.it/1bpqcs5
@r_devops
Do you own the production infrastructure?
Hey guys,
I've been in DevOps for a few years now across different industries and the roles have been quite different in terms of ownership over operational resources.
At some places we would own the tooling, processes and standards but not necessarily own the resources that are deployed. Other places I've worked at it's basically Cloud SysAdmin using DevOps tooling which means that any infrastructure concerns are owned by DevOps (patching, upgrades, reliability of all servers/dbs/microservices).
So I thought I'd see where everyone else is at and what you think the correct ownership model looks like?
How many production resources are you responsible for? What does that ownership look like?
Thanks all!
https://redd.it/1bpmh1k
@r_devops
Reminder: There are no best practices…
…only things that have worked for other people.
https://redd.it/1bpjq5l
@r_devops
Zitadel Experiences
Has anyone used Zitadel yet? https://zitadel.com/
We are evaluating it both for a b2b (SaaS) service, to abstract away SSO/login for our users, and for b2c, so we can create a generalized service that anyone can log in to.
Curious how people have found the self-hosted option to be:
Easy to operate?
Easy to troubleshoot?
How hard is scaling it? (Looks like it's mostly adding more replicas of the app + scaling the DB vertically)
Any big gotchas you ran into?
Lastly, would you recommend something entirely different? We are in the market for a run it yourself system.
https://redd.it/1bpg4gj
@r_devops
(HELP) Currently I am only logging the requests that come to my Squid proxy server in access.log. I need to enable logging of the responses that come from the destination server. Can anyone help?
How can I enable response body in the squid logformat?
https://redd.it/1brb6za
@r_devops
DevOps Intern Interview
I have an interview with a relatively small company for a DevOps intern position, and they use Terraform. Do you guys know what I would need to know prior to the interview?
https://redd.it/1br9t1e
@r_devops
PSI Nightmares and How to Handle Them
I used Firefox and ran the compatibility checks. This was my third exam with PSI; the first two went smoothly.
More than the exam itself, PSI will teach you patience, and how to think and react when unexpected issues crop up in the exam software.
Run the compatibility checks, grant all permissions, and allow all pop-ups from PSI before the exam. Otherwise use Chrome, which is said to give the best experience with PSI.
Even so, I hit a PSI browser issue mid-exam: with 26 minutes left, the session closed by itself while I was editing a file to launch a YAML object.
I followed one simple rule: don't panic!
I tried to restart the session, but it wouldn't open again.
Don't call tech support: the Linux Foundation has limited technical personnel, especially PSI proctors. Raise a ticket instead. It will take 2-4 days to resolve, but at least there will be a record whose timing matches your exam.
I didn't get my session back, and the check-in process had to be repeated. I had to restart the PC, uninstall the PSI browser, and reinstall it from my downloads to get a session again.
I had to have the proctor end the session, otherwise the questions I had attempted would not have been marked. Thankfully, the proctor didn't end it before the exam timer ran out. I also told the new proctor that I hadn't gotten my old session back. They assured me there is a recording, so PSI will check where the fault lies, and suggested raising a technical support ticket, which I had already done.
My exam was in grading for 3-4 days, so I raised an extra Linux Foundation ticket once the result was more than 24 hours overdue.
Both tickets (the unexpected PSI browser close and grading still pending after 24 hours) were closed once I passed the exam. Had I not passed, I think I would have had a case for a free retake, since the issue happened mid-exam, the session recording shows me making the required file edits, and a ticket had been raised.
Some people face issues at the start of the exam, as a friend of mine did. In that case, as long as the exam hasn't started, you have 2 hours plus the half hour before the required check-in time; use it to raise a support ticket first. Otherwise, if you are marked absent as a no-show, you will need a support ticket to justify the issue you faced. Then troubleshoot: check PSI browser compatibility with a different browser, allow all pop-ups, and don't use a VPN or multiple monitors (PSI checks for these too). Don't keep multiple programs open, and mute notifications from Discord, Reddit, Outlook, and the like so they don't cause trouble mid-exam. Don't use virtualization of any kind or a VM to launch the exam. Have good-bandwidth Wi-Fi and, if affordable, a UPS, so the network doesn't flap mid-exam even if a generator kicks in.
Don't give up, and don't be disheartened by these issues. Be prepared for the next exam: do all the necessary checks, read through the exam documentation beforehand for scenarios that might affect you, and search community posts about similar issues.
https://redd.it/1br6skw
@r_devops
Senior advice
I'm in a weird spot. I manage a Jenkins instance and automation that oversees hundreds of millions of dollars in revenue, thousands of builds, and mobile application deployments. I do ad hoc projects, lots of scripting, lots of code, and application consultation for developers. I have very rudimentary cloud skills, and the closest I got to infrastructure as code was Windows Desired State Configuration. I feel like I could pick all of these skills up very quickly, but I feel like if I ever got laid off I would not be able to get into another DevOps role without them. I mostly just like to learn theory and how to do things; I don't really care about tools as they are just an abstraction of the process.
I have managed Docker but I've never managed Kubernetes. Am I fucked if I get fired?
https://redd.it/1bqzt02
@r_devops
What should I brush up on when it comes to Infrastructure provisioning and automation?
I have an interview coming up for a SWE internship with a team that works on provisioning, and I thought it would be best to ask you all what you think I should touch up on before the interview.
My previous internship involved working with Docker, k8s, and microservices in general. So I am assuming those are things I'll need to refresh.
https://redd.it/1bqv2eq
@r_devops
What do you look for in case studies?
Hey all! I'm slightly new to the dev world and have been tasked with writing a case study for a data management solution. I've been reading through examples but I feel like they all say very similar things, and so I'm finding it difficult to understand what might set them apart to readers.
I'm hearing a lot that case studies are very valuable, but is it simply the fact of having one, no matter how generic, that is the important part, or are there specific things readers want to see?
https://redd.it/1bqszay
@r_devops
Anyone done hub and spoke networking across AWS and Azure
Any recommendations? I’m going to be new to Azure but 10 YoE with AWS.
Is it going to be worth doing TGW and Azure's equivalent? Should I just do site-to-site VPN and call it a day?
Any help is appreciated
https://redd.it/1bqg7f2
@r_devops
Have you ever wanted to monitor whether the API response is correct or not?
Hi,
I'm wondering if anyone has had this thought before: having a third-party service that monitors whether your API responses are working as expected. For example:
1. An array/dictionary should never be empty.
2. A field should never be null
3. Timestamps should always be greater than or equal to the current time.
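The three rules above can be sketched as a small validator that runs against each response. This is a minimal illustration, assuming a JSON payload with hypothetical field names (`items`, `owner`, `expires_at`) and ISO 8601 timestamps. A real monitor would fetch the response on a schedule and alert on any violations.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def check_response(payload: dict[str, Any], now: Optional[datetime] = None) -> list[str]:
    """Return a list of rule violations; an empty list means the response passed."""
    now = now or datetime.now(timezone.utc)
    problems = []

    # Rule 1: an array/dictionary should never be empty.
    if not payload.get("items"):
        problems.append("items is missing or empty")

    # Rule 2: a field should never be null.
    if payload.get("owner") is None:
        problems.append("owner is null")

    # Rule 3: timestamps should be greater than or equal to the current time.
    raw = payload.get("expires_at")
    if raw is None:
        problems.append("expires_at is missing")
    else:
        timestamp = datetime.fromisoformat(raw)
        if timestamp.tzinfo is None:
            # Assumption: timestamps without an offset are in UTC.
            timestamp = timestamp.replace(tzinfo=timezone.utc)
        if timestamp < now:
            problems.append("expires_at is in the past")

    return problems
```

Passing `now` explicitly keeps the check deterministic in tests. Returning a list of messages rather than raising on the first failure lets one run report every broken rule.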
I'm just curious if this is a common need
https://redd.it/1bqdhrx
@r_devops
Savvy: Create and Share Runbooks directly from your terminal
I built [Savvy](https://getsavvy.so) to help developers automatically create and share runbooks directly from your terminal.
**Savvy Record**
Savvy's [open source CLI](https://github.com/getsavvyinc/savvy-cli) works with any bash or zsh shell and can record your commands and uses AI to automatically create an accurate and easy to follow runbook. `savvy record` automatically expands all aliases to ensure anyone can follow the runbook.
`savvy record --ignore-errors` will ignore any command that returns a non-zero exit code. This is great for creating long runbooks without having to worry about any typos you make along the way. Check out our demo here: [https://youtu.be/GzcvGEg6oYc](https://youtu.be/GzcvGEg6oYc)
Savvy makes it really easy to create a runbook from your shell history with `savvy record history`. Simply select your shell commands and watch Savvy's CLI do its magic. Here's a demo of `savvy record history` in action: [https://youtu.be/Nk_NeLjt2Tk](https://youtu.be/Nk_NeLjt2Tk)
**Share and Export Runbooks**
All runbooks are private by default. However, you can easily share runbooks with a private link or make them public. Here's the [public runbook](https://app.getsavvy.so/runbook/rb_7b74b43c5d61bd57/How-To-Validate-Kubernetes-Root-Certificate) I created from the demo above.
Savvy also allows you to export runbooks to markdown with one click. You can paste the runbook in any document editor like Notion, Coda, Slab, Google Docs etc. that supports markdown.
**Get Started**
* Follow our [Quick Start](https://github.com/getsavvyinc/savvy-cli?tab=readme-ov-file#quick-start) to download the CLI and create your first runbook
* Check out our public roadmap at [feedback.getsavvy.so](https://feedback.getsavvy.so)
* Join our [Discord](https://getsavvy.so/discord)
https://redd.it/1bq70kw
@r_devops
Could Platform Engineering Tactics Solve Some of these Common DevOps challenges?
DevOps has revolutionized software development and deployment, but as the complexity of modern cloud-native technologies increases, it has become evident that the current approach has limitations and inefficiencies. As a technology leader myself, it's become increasingly clear to me that the traditional role of DevOps may not survive in the future if we can't overcome our current challenges and struggles to automate. DevOps leaders need to get on board with the latest evolution of DevOps to adapt and overcome some of these challenges in order to keep pace with ever-changing technology demands. The answer is here, and it starts with platform engineering.
Read on for the full blog: https://www.getambassador.io/blog/platform-engineering-solution-common-devops-challenges
https://redd.it/1bq0nxy
@r_devops
Any tips for understanding Terraform from previous colleague?
I've just joined a company, replacing a DevOps engineer who recently left.
I've been told I'll be working on an AWS deployment for a customer and that a very similar deployment was done previously by the ex-colleague.
The company basically wants a carbon copy of his work, with changes where needed. I've been given his Terraform, but I have no clue what the code is doing. I've used Terraform for a good while, so it's not that. It's more that I have no idea why the code is structured the way it is. The folder structure is odd, and PowerShell wrapper scripts are used to run Terraform.
Does anyone have any tips on how to go about deciphering someone else's code?
https://redd.it/1bpuzy2
@r_devops
Is it state of the industry that contracted DevOps is associate level at best?
This may be just a rant, but we've had a contracted DevOps service for a few years now. We'd hoped we were getting expertise and experience by engaging this DevOps company, and would be able to offload our DevOps tooling and support so the dev team could focus more on our core business and products.
I don't think we've gotten any expertise. Yes some services have been put up. Some solutions have been implemented but everything is singular and with no thought to an overall design. The "solutions" are hacked together, fragile things that the person implementing barely understands. I'm learning things on the fly here so I may not have better solutions but I can see the inherent flaws in the designs.
Sometimes I push back or try to discuss our concerns and refine the constraints, but I just don't have the energy to do this for everything. It just feels like we are getting associate-level work, where there's enough knowledge to hack something together and be dangerous but broader strategy and expertise don't yet exist. I feel like I'm spending half as much time managing, learning, and reviewing the work as I would if I did it myself. The hope was to be able to offload work, and I guess it half worked.
Is this a common experience? What are others' experiences with contracted DevOps, on either side (client or contractor)?
https://redd.it/1bpoej7
@r_devops
Can we ban people complaining about people complaining about people asking how to do devops or how to get into devops? This is getting out of hand
Every morning people send me messages asking stupid questions. “Why is datadog inaccessible”, “cloud trail says you changed the only root credentials to AWS and our pipelines are all failing because you configured everything to use root”, these people are pathetic. They don’t even know how to google basic things, so I ignore them. Besides, negative vibes stress me out - that’s a big “ick” from me, fam.
Already tired of dealing with these idiots, I put on my Tom Ford sunglasses (prescription with UV phasing) and sign on to Reddit. I need to relax and read some posts from my fellow engineers.
Instead what do I see! A question that has clearly been asked multiple times. And another one. And ANOTHER one. We are ENGINEERS, not DJ Khaled. This is getting out of hand.
Speaking of out of hand, my hands are literally shaking. My pinky ring begins annoyingly tapping the mahogany wood of my electric convertible standing ergopod ($12,000, covered by work since I accused my boss of sexual harassment when they asked me to come into the office). Do you fools understand the impact of your words? Of course not, you didn’t even use Google. How dare you? You have activated my nuclear trap card now, buddy, strap yourself in.
First of all, you want to get into devops? What is a dev op? You haven’t even searched for it yet. Get away from me, this pyjama jacket is Gucci limited edition (I refuse to shower until 3PM, when I will change into more customary attire) - don’t touch me, you might ruin it with your stinky hands.
Mods, this is getting serious. I have already contacted my local congressman, but he didn’t even know what Ansible was or about r/DevOps, so I explicitly had to threaten him and his family just to get my point across. Finally, he conceded to my superior intellect and agreed to send someone over to talk things through and find a solution.
Meanwhile, mods, I implore you - please no more of these posts, asking to ban other posts, which could be searched, or even posts which are searchable asking to ban other posts. With advancements in AI none of this will be needed soon anyway!!
My good sirs, I tip my hat - and keenly await our democratic meeting of minds through which a solution will, inevitably, be engineered (because of course, we are engineers).
https://redd.it/1bplxt1
@r_devops
How do I avoid having to do DevOps and seeing people complain about asking how to do it? Can we ban DevOps posts?
I’m a software engineer and I don’t like doing DevOps, even though I do it because I work at a small company. I mean, every question has already been asked and answered with a basic search of this sub. If you can’t search this sub, then you should probably look into doing something that you like and make complaining on the sub some other unfortunate non-DevOps person's problem.
Clearly this is satire: I see people complaining about it all the time and I’m not even in this sub.
https://redd.it/1bpk213
@r_devops
On Call worth the anxiety and stress?
Hi guys,
I’ve got an offer for a new role as a “DevOps Engineer” but this new role will involve on-call. The on-call consists of devs as well as cloud team members which is excellent IMO. The on-call payment is decent (I need to confirm exactly how much).
My question is, is the extra money worth the anxiety and stress I might face from a 3am call telling me that production is down? (Obviously this is worst case scenario but I’m assuming most likely a simple service or cronjob hasn’t run)
Has anyone got experience doing on-call in a “DevOps” position?
I’m based in the UK by the way and the team are all over the world.
Thanks
https://redd.it/1bpe85c
@r_devops