Artificial intelligence is reshaping every industry, and unlocking its full potential requires infrastructure that’s robust, scalable, secure, and observable. As organizations expand their AI initiatives, managing complex workloads and ensuring consistent performance become mission-critical.
This is where Cisco AI PODs, the foundational building blocks of Cisco Secure AI Factory with NVIDIA, combined with the deep visibility of Splunk Observability Cloud, deliver a powerful solution for building and operating modern AI environments.
Cisco AI PODs: The foundation for AI innovation
Cisco AI PODs are modular, flexible, and scalable AI infrastructure designed to accelerate time to value for AI initiatives. They enable organizations to deploy production-grade AI environments quickly, but to keep those environments running optimally, teams need comprehensive insight into performance and health.
How can you detect issues early, troubleshoot efficiently, and focus on delivering business outcomes instead of spending time on urgent production issues? That’s where observability becomes indispensable.
Splunk Observability: Your eyes and ears inside AI PODs


Splunk Observability Cloud delivers end-to-end visibility across every layer of Cisco AI PODs, from physical infrastructure to Kubernetes to the AI application layer.
It’s not just about data collection. Splunk turns metrics, traces, and logs into actionable insights, helping teams detect, troubleshoot, and resolve issues in seconds.
We’re excited to introduce a new Splunk Dashboard purpose-built for observability across the entire AI POD stack.


What the new Splunk Dashboard brings to Cisco AI PODs
- Unified Kubernetes cluster monitoring – Get a single view of all Kubernetes clusters, including Red Hat OpenShift running on AI PODs.
- Deep host-level insights – Monitor the performance of individual Cisco UCS servers, including CPU, memory, disk, and network utilization.
- AI POD infrastructure dashboard – Track critical metrics like GPU utilization, GPU memory usage, power, and network performance, integrating data from Cisco Intersight and Cisco Nexus.
- Streaming analytics advantage – Leverage Splunk’s real-time streaming analytics to achieve faster detection and near-instant “time to glass.”
While Cisco AI PODs provide modular, scalable infrastructure for enterprise AI, each AI POD can also be monitored individually. This allows teams to gain detailed insight into the unique performance metrics and workloads of a specific deployment. Here are some screens from the Splunk Dashboard for AI PODs to help visualize the monitoring capabilities.
By aggregating the number of input and output tokens processed by the large language model (LLM) running on an AI POD, Splunk is able to calculate an approximate cost for token usage over time.
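To make the token cost idea concrete, here is a minimal Python sketch of the arithmetic involved. It is not Splunk's implementation: the TokenPricing class, the function names, and the per-1,000-token prices are all hypothetical, chosen only to show how aggregated input and output token counts can be turned into an approximate cost per time window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Hypothetical prices in USD per 1,000 tokens; substitute your own rates."""
    input_per_1k: float
    output_per_1k: float


def approximate_cost(input_tokens: int, output_tokens: int, pricing: TokenPricing) -> float:
    """Approximate cost for one aggregation window of token counts."""
    input_cost = (input_tokens / 1000) * pricing.input_per_1k
    output_cost = (output_tokens / 1000) * pricing.output_per_1k
    return input_cost + output_cost


def cost_over_time(windows, pricing: TokenPricing):
    """Map (label, input_tokens, output_tokens) windows to (label, cost) pairs."""
    return [
        (label, approximate_cost(input_tokens, output_tokens, pricing))
        for label, input_tokens, output_tokens in windows
    ]


if __name__ == "__main__":
    # Illustrative numbers only; these are not measurements from an AI POD.
    pricing = TokenPricing(input_per_1k=0.5, output_per_1k=1.5)
    windows = [("09:00", 2000, 1000), ("10:00", 4000, 0)]
    for label, cost in cost_over_time(windows, pricing):
        print(f"{label}: ~${cost:.2f}")
```

Input and output tokens are priced separately here because the two are often weighted differently; if your cost model uses a single rate, set both fields to the same value.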


Splunk also pulls in metrics from Cisco Intersight to provide visibility into active alarms related to the monitored AI POD, along with key UCS metrics such as UCS host power, temperature, and fan speed:


The Nexus dashboard provides insight into the interfaces configured on each Nexus switch, the transmit errors and drops, and the data transferred between storage and compute nodes:


A real-world scenario: Diagnosing LLM latency
Imagine an application running on a Cisco AI POD using an LLM for user queries. Suddenly, response times on the application spike. Here’s how Splunk Observability Cloud helps resolve it in minutes:
- Alert triggered – Splunk detects high response times and raises an alert.
- Trace analysis – The service map highlights that most latency occurs within /v1/chat/completions calls to the LLM.
- Infrastructure view – The AI POD dashboard reveals that only one of the four available GPUs is active and fully utilized.
- Actionable insight – You reconfigure the workload to use all GPUs, immediately restoring performance.
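The infrastructure check in this scenario, where one GPU is saturated while the others sit idle, can be expressed as a simple rule. The Python sketch below is illustrative only: the find_gpu_imbalance function, the GPU identifiers, and the 90% and 5% thresholds are assumptions, not part of the Splunk dashboard or any NVIDIA tooling.

```python
def find_gpu_imbalance(utilization, busy_threshold=90.0, idle_threshold=5.0):
    """Flag a node where some GPUs are saturated while others are idle.

    utilization maps a GPU identifier to its utilization percentage.
    Both thresholds are hypothetical defaults; tune them for your workload.
    """
    busy = sorted(gpu for gpu, pct in utilization.items() if pct >= busy_threshold)
    idle = sorted(gpu for gpu, pct in utilization.items() if pct <= idle_threshold)
    return {
        "busy": busy,
        "idle": idle,
        # Imbalanced only when saturated and idle GPUs coexist on the node.
        "imbalanced": bool(busy) and bool(idle),
    }


if __name__ == "__main__":
    # Mirrors the scenario above: one of four GPUs is doing all the work.
    sample = {"gpu0": 99.0, "gpu1": 0.0, "gpu2": 0.0, "gpu3": 0.0}
    report = find_gpu_imbalance(sample)
    if report["imbalanced"]:
        print(f"Saturated: {report['busy']}; idle: {report['idle']}")
```

A rule like this only points at the symptom. Spreading the workload across all GPUs still depends on how the inference service is deployed and configured.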
The NVIDIA connection: Powering intelligent workloads
Splunk Observability also monitors key NVIDIA AI Enterprise components, including the NVIDIA NIM operator and NVIDIA NIM microservices for LLM inferencing, ensuring the NVIDIA software stack performs at its best.
FedRAMP and government readiness: Splunk’s current path toward achieving FedRAMP Moderate for Splunk Observability
Splunk remains a trusted partner in government digital transformation, empowering agencies to deliver secure, resilient, and intelligent services through cloud and customer-managed solutions. Building on the success of Splunk Cloud Platform, which is authorized at FedRAMP High and DoD Impact Level 5 and listed on the StateRAMP (dba GovRAMP) Authorized Products List, Splunk continues to invest in expanding our FedRAMP program to meet evolving public sector needs. As previously announced, Splunk Observability Cloud has already received “In Process” designation and awaits full authorization to operate at the Moderate level from the FedRAMP Program Management Office. Splunk remains committed to supporting the security and mission success of all our government customers.
Observability: A cornerstone of Cisco Secure AI Factory with NVIDIA
In Cisco Secure AI Factory with NVIDIA, observability is not optional; it’s foundational.
By delivering deep, real-time insights across infrastructure and applications, Splunk Observability Cloud enhances:
- Operational efficiency
- Resource optimization
- Reliability and uptime
- Security posture
This holistic visibility is essential for building, operating, and securing complex AI pipelines at scale.
Conclusion
Cisco AI PODs deliver the robust, scalable infrastructure required for today’s demanding AI workloads. When paired with Splunk Observability Cloud, organizations gain unmatched visibility and control, enabling rapid troubleshooting, optimal performance, and faster innovation.
Splunk Observability forms a core pillar of Cisco Secure AI Factory with NVIDIA, empowering businesses to build and run AI with confidence, speed, and security.
