
Fateh Khan demonstrates proficiency in a wide array of technologies, including Terraform, Git, Helm charts, and Kubernetes, in his role as DevOps and SRE Engineer Manager at V4You Technologies. He has shown exceptional performance as an Infrastructure SRE, delivering reliable 24x7 infrastructure and application operations, meeting business expectations, and serving as a management escalation point during major issues. His expertise covers automating infrastructure with Terraform, implementing CI/CD pipelines with Git, GitHub Actions, and Jenkins, and maintaining Helm charts for application deployment. He implemented Kubernetes Tanzu to optimize container orchestration, ensuring the security and availability of microservices. He is experienced in Infrastructure as Code (IaC) with Ansible, has hands-on experience with the AWS, GCP, and Azure cloud platforms, and has a strong background in server hardening, networking, and troubleshooting. He is skilled in disaster recovery planning, system administration, automation, and performance tuning in Unix environments, and has designed and implemented disaster recovery plans that ensured business continuity and data integrity in high-pressure environments. He has also led a diverse global team of application reliability, infrastructure, and operations engineers, delivering effective talent management practices and fostering a continuous learning culture.
DevOps and SRE Engineer Manager
V4YOU Technologies
Sr. DevOps Engineer & Release Engineer
Intelly Labs Private Limited
Server Administrator & DevOps Engineer
IDS Logic Pvt. Ltd.
IT Executive
Ryddx Pharmetry (P) Ltd
System Administrator
Mindz Technology
Terraform

Git

Helm Charts

Kubernetes

Ansible

AWS

GCP
Azure

GitHub Actions
Docker

Docker-Compose

Helm

Prometheus
Grafana

Loki

Zabbix

CloudWatch

Vercel

ArgoCD

Nginx

HAProxy

IIS

SQL

NoSQL

SonarQube

ElasticSearch

Varnish

VPN

Proxmox

VMware

Hyper-V

Vagrant

VirtualBox
Hi, my name is Fateh Khan, and I have been working for seven years as a DevOps engineer and SRE manager. Along with that, I have worked with multiple organizations as a senior server administrator and release engineer. I have experience in Kubernetes, where I have worked very closely with GKE and EKS. I have expertise in monitoring and in deploying applications in both containerized and non-containerized ways. I am very proficient in scaling up infrastructure using Terraform and other tooling, including Python-based tools. Other than that, I have experience in GitOps, where I provided administration on Git and managed DevOps practices along with monitoring. I also have experience in databases, where I have administered MySQL and both SQL and NoSQL stores. Thank you.
So Helm, basically, is a package manager that helps you manage your Kubernetes applications: you populate the configuration using YAML values and can install it in any environment you want; we just have to change the input variables using the helm command. The biggest benefit is that we do not have to work with the raw manifests again and again, and Helm takes care of upgrades and rollbacks if there is any issue. Other than that, we can also use Helm in GitOps practices with Argo CD, where every component of the Helm release is managed by Argo CD itself.
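To illustrate the idea of changing only the input variables per environment, here is a minimal sketch of a hypothetical values file and the corresponding Helm commands; the chart name, image, and hostname are illustrative assumptions, not taken from a real project.

# values-prod.yaml for a hypothetical "webapp" chart
replicaCount: 3
image:
  repository: registry.example.com/webapp   # illustrative registry and image
  tag: "1.4.2"
ingress:
  enabled: true
  host: webapp.example.com

# Install or upgrade with environment-specific values, and roll back if needed:
#   helm upgrade --install webapp ./webapp-chart -f values-prod.yaml
#   helm rollback webapp 1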
So if you want to attach a storage class to a stateless application and you are using GKE or EKS, we get the option to use services like persistent disks in GCP and the EBS and EFS services in EKS, where you attach the volume to the pod as a disk using a persistent volume claim. The moment any pod dies and a new pod spins up, the data remains on that same persistent disk and gets attached to the newer pod that becomes available for the deployment. Other than that, it is also possible to attach persistent storage at runtime in the deployment itself: first we have to claim the storage, and then we attach it as a PVC.
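A minimal sketch of the claim-then-attach flow described above, using a hypothetical PVC and Deployment; the storage class, sizes, image, and mount path are illustrative, and on GKE or EKS the class would map to a persistent disk or an EBS/EFS volume.

# PersistentVolumeClaim: request storage from the cluster's provisioner
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: standard        # illustrative; e.g. a GCE PD or EBS-backed class
  resources:
    requests:
      storage: 10Gi
---
# The pod template mounts the claim; a replacement pod re-attaches the same volume
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 1
  selector:
    matchLabels: { app: app }
  template:
    metadata:
      labels: { app: app }
    spec:
      containers:
        - name: app
          image: registry.example.com/app:1.0   # illustrative image
          volumeMounts:
            - name: data
              mountPath: /var/lib/app
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: app-data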
So we can use the Horizontal Pod Autoscaler to scale up the environment if the traffic crosses the defined threshold. We can use metrics there, and we have to define a manifest for the deployment, which works at the level of selectors. Let's say the deployment has the selector label application-1: we will define the HPA with the API version, the kind HorizontalPodAutoscaler, the name, the target reference, and then the metrics. We can define the metrics per CPU level and per RAM level. Other than that, we can also define the capacity we want, i.e. how far the pods should scale up in terms of the number of replicas. So any time the load crosses the defined threshold, it will scale up the deployment itself.
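A rough sketch of what such an HPA manifest could look like; the deployment name, thresholds, and replica bounds are illustrative assumptions.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: application-1-hpa
spec:
  scaleTargetRef:                 # points at the deployment to scale
    apiVersion: apps/v1
    kind: Deployment
    name: application-1
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU crosses 70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80   # and/or when average memory crosses 80%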
So we can use blue-green deployment. Every time a new deployment takes place, in a grouped manner, we have to update the DNS to make the switch happen. First we deploy the new version of the application, and once that application is deployed and we have tested that everything is running fine, we update the DNS to cut the traffic over to it.
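Inside a cluster, the same cutover idea is often done by flipping a Service selector rather than DNS; this is a minimal sketch of that variant, with all names, labels, and the image purely illustrative.

# Two parallel deployments, "blue" (current) and "green" (new), each labelled with a version
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp-green
spec:
  replicas: 3
  selector:
    matchLabels: { app: webapp, version: green }
  template:
    metadata:
      labels: { app: webapp, version: green }
    spec:
      containers:
        - name: webapp
          image: registry.example.com/webapp:2.0   # illustrative new version
---
# The Service initially selects version "blue"; after testing green,
# the selector is switched so traffic cuts over without downtime
apiVersion: v1
kind: Service
metadata:
  name: webapp
spec:
  selector:
    app: webapp
    version: blue          # change to "green" to complete the cutover
  ports:
    - port: 80
      targetPort: 8080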
So, what strategies would you apply to ensure zero downtime during the transition? Zero downtime, again, is nothing but the practice where we deploy the application and transfer the traffic to the newer version. For zero downtime, what we can do is create the same deployment set, the same application deployment, on the AKS side and point the DNS entries over there. Once the DNS is pointed, the application will be running from AKS itself, and we can keep the Tanzu application running until we have verified that everything is fine.
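As a hedged sketch of the DNS side of that cutover, assuming the external-dns controller is running in the new AKS cluster to publish records for Services; the hostname and TTL below are illustrative, and a short TTL keeps the eventual switch away from Tanzu quick to propagate.

apiVersion: v1
kind: Service
metadata:
  name: webapp
  annotations:
    external-dns.alpha.kubernetes.io/hostname: webapp.example.com   # illustrative domain
    external-dns.alpha.kubernetes.io/ttl: "60"
spec:
  type: LoadBalancer     # exposes the AKS copy so DNS can point at it
  selector:
    app: webapp
  ports:
    - port: 443
      targetPort: 8443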
So when we are setting up a Kubernetes pipeline, we have to be sure which application we are deploying, whether it will be packaged as a Helm chart or plain manifests, and whether any GitOps operations are involved or not. If GitOps operations are involved, which tool are we going to use: Spinnaker, Argo CD, GitHub Actions, or Jenkins. Other than that, we also have to look at the deployment, the replica set, and the storage. And if the application is getting deployed and some API has been deprecated or removed on the Kubernetes upgrade side, we also need to add conditions: if the cluster version is this, install this version of the API; if the cluster version is that, install that version of the API, as sketched below. While deploying the application, we also have to make sure the current stable version is running absolutely fine after smoke tests, and that the Helm charts are rendering and running properly. After that, we can proceed with the deployment.
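The version-conditional idea mentioned above can be expressed in a Helm template; a minimal sketch for an Ingress whose API group depends on what the cluster advertises, with the template path and values names as illustrative assumptions.

# templates/ingress.yaml (illustrative): choose the Ingress API group the cluster supports
{{- if .Capabilities.APIVersions.Has "networking.k8s.io/v1/Ingress" }}
apiVersion: networking.k8s.io/v1
{{- else }}
apiVersion: networking.k8s.io/v1beta1
{{- end }}
kind: Ingress
metadata:
  name: {{ .Release.Name }}-ingress
# the rest of the spec would also branch, since the two API versions differ slightly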
Considerations? I'm not sure about this.
So Kubernetes relies on a container runtime environment. Prior to around version 1.21 it was running on Docker, and after that Kubernetes started replacing the Docker mechanism in the cluster; now containerd runs as the default runtime. The deployments are managed through the kube-apiserver, and the node components send their inputs and outputs to the API server. The scheduler is responsible for placing the application on the right node, and the API server is responsible for managing and replacing the current deployments. And etcd is there to contain the key-value data for every deployment that has taken place within the cluster itself.
So, service mesh implementation. It gives you more control on the service side, where you can control the entire traffic flow and the network, basically deciding where a request is allowed to go. It also gives you the entire network diagram, for example through Kiali as a dashboard, if you are using Istio. Other than that, it basically works with service discovery: as long as service discovery is working, Istio keeps running, and the sidecar only starts sending data to the pod after the service has successfully initialized. So the best advantage of using a service mesh technology in Kubernetes clusters is that it allows you to fully control the network: you can declare that if a request is coming from a particular resource, you can block it or allow it for a particular service. These are the best practices and the main features that Istio provides for a service mesh.
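As a hedged illustration of the allow/deny control mentioned above, assuming Istio is the mesh in use, an AuthorizationPolicy could look like this; the namespaces, labels, and service account names are illustrative.

# Allow only requests from the "frontend" service account to reach the payments workload
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: payments-allow-frontend
  namespace: payments
spec:
  selector:
    matchLabels:
      app: payments        # applies to pods carrying this label
  action: ALLOW
  rules:
    - from:
        - source:
            principals: ["cluster.local/ns/frontend/sa/frontend"]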