Incident Manager
Primary skills: Salesforce, ServiceNow
Years of experience: 15+
Minimum Required:
- Familiarity with incident-based ticket management tools.
- Running a NOC, designing and rolling out processes, running regular calls and outage calls, performing root cause analysis (RCA), etc.
- Well versed in L3 support activities: responding to a reported service incident, identifying the cause, and initiating the incident management process.
- Overseeing the incident management process and team members involved in resolving the incident.
- Prioritizing incidents according to their urgency and impact on the business.
- Producing documents that outline incident protocols such as how to handle cybersecurity threats or how to correct server failures.
- Collaborating with the incident management team to ensure that all protocols are diligently followed.
- Logging all incidents and their resolutions to identify recurring malfunctions.
- Adjusting the incident management process as required to ensure its effectiveness.
- Communicating with upper management if major issues are found in the IT system.
- Managing the incident team members by re-assigning workloads and re-scheduling non-urgent tasks.
- Effort estimation and team management, including setting team goals and individual goals with KRAs.
- Configuring and troubleshooting RAID setups using tools like mdadm and smartctl.
- Knowledge of PCI and PCIe, and troubleshooting PCI issues using tools like lspci and lshw.
- Knowledge of standard networking protocols such as Spanning Tree (STP and its variants), LAG, and VLAN (tagged vs. untagged).
- Accessing the Service Processor from the host using tools like RKVM and ipmitool.
- Providing continuous updates on the stage of the porting efforts and realistic ETAs to Customer Engineering.
- Evaluating the existing model libraries to ensure that there are no performance or other regressions in a new software release.