NOC Engineer

Job #: 25-07892
Pay Rate: Not Specified
Job Type: Contractor
Location: Houston, TX
Key Responsibilities:
Monitor public cloud infrastructure (compute, storage, networking, and Kubernetes clusters) using observability tools like Prometheus, Grafana, and internal dashboards.
Identify, triage, and respond to real-time alerts and incidents to prevent or minimize customer impact.
Perform first-level troubleshooting of system issues, including host failures, degraded services, and latency incidents.
Escalate critical issues to CloudOps Engineering, Network Infrastructure, or Security teams following predefined runbooks and escalation paths.
Maintain clear documentation of incidents, resolutions, and system changes in the ticketing system (e.g., Jira, PagerDuty, or internal tooling).
Write and update operational playbooks to standardize response procedures for cloud infrastructure issues.
Collaborate in post-incident reviews with the Network Infrastructure and CloudOps teams to identify root causes and help implement long-term fixes.

Qualifications:
2+ years of experience in a NOC, cloud operations, or system monitoring role, preferably in a public cloud or SaaS environment.
Strong understanding of Linux systems, networking concepts (TCP/IP, DNS, VPN, BGP), and system administration basics.
Experience working with Juniper and Arista network equipment, including basic configuration and troubleshooting.
Familiarity with container orchestration and cloud-native tools (e.g., Kubernetes, Docker) is a plus.
Excellent troubleshooting skills and ability to work calmly in high-pressure, time-sensitive situations.
Strong communication skills with the ability to write clear incident reports and Cloud Operations playbooks.
Experience with cloud services (e.g., Droplets, VPCs, Load Balancers, Spaces) is highly preferred.

Preferred Qualifications:
Certifications in Juniper (e.g., JNCIA, JNCIS) or Cisco (e.g., CCNA) technologies.
Familiarity with Infrastructure-as-Code tools (e.g., Terraform) and CI/CD pipelines.
Prior experience in high-availability cloud environments and large-scale incident management.

Copyright © 2026 TechDigital Group