The outage affecting HUIT Open OnDemand has been resolved, and users are able to log in and access resources. We'll continue to monitor the service closely to ensure stability.
Posted Apr 17, 2025 - 09:46 EDT
Monitoring
Restarting the Slurm controller node and altering its configuration have made it possible to launch interactive apps in HUIT Open OnDemand once again. HUIT will continue to monitor the service to ensure stability before resolving the Major Incident.
Posted Apr 16, 2025 - 19:44 EDT
Identified
The service team believes they've identified a path forward. They continue working to investigate and remediate the root cause.
Posted Apr 16, 2025 - 18:45 EDT
Investigating
After logging in to https://ood.huit.harvard.edu, users are able to load the Open OnDemand dashboard, but interactive apps will not start properly. The terminal app may load, but the Slurm scheduler is unstable, and compute jobs may or may not run.
User data is unaffected and can still be downloaded through the Open OnDemand dashboard.
FAS Academic Technology is troubleshooting this issue and working to restore this service.