Multiple services are unavailable. OIT is investigating.
Incident Report for Brown University
Postmortem

What services were affected? For how long?

From approximately 6:45 p.m. on Sunday, March 24, to 3:30 a.m. on Monday, March 25, most IT services were unavailable, including wired, wireless, and VPN networks, Shibboleth Single Sign-On, and many services hosted within Brown's datacenter.

Who was affected?

All members of the Brown community working on campus, and anyone trying to reach our VPN or most services hosted at Brown, were unable to connect.

What happened?

At approximately 6:45 p.m. on Sunday, March 24, OIT received several automated alerts and customer reports of lost connectivity and service failures. We immediately initiated a service incident response and began troubleshooting. At 7:17 p.m. we issued a public service alert to the community via our Statuspage alert dashboard, and we placed an updated greeting about the outage on the IT Service Center phone line.

Because the problem affected connectivity and authentication, OIT staff were unable to troubleshoot remotely, so we immediately dispatched multiple team members to campus locations. Although we could physically access campus spaces and systems, the underlying problem still made it difficult to log in to those systems.

After some early analysis, we determined that the network infrastructure was not operating correctly, which led to failures in network routing, domain name resolution, and general connectivity. OIT contacted our network vendor to join our response. As we investigated the problem, we found workarounds to access critical systems.

At approximately 2:50 a.m. on Monday, March 25, we determined that one of our core network routers in the Brown datacenter was not passing network traffic correctly. We restarted this router and found that services were restored. We validated multiple services and networks, and resolved our service alert at 3:27 a.m.

OIT did not find any evidence that this problem was caused by malicious actors.

What is OIT doing about it?

We will continue to analyze available logs and the outage timeline, and will work with our vendor to identify any possible root causes of this problem.

In addition, this outage demonstrated that we need to implement a means of accessing critical systems more quickly if a similar outage happens again. OIT will explore any viable technical approaches and will schedule implementation work in the immediate future.

Posted Mar 25, 2024 - 16:48 EDT

Resolved
This incident has been resolved.
Posted Mar 25, 2024 - 16:20 EDT
Monitoring
The incident has been resolved and OIT is monitoring and analyzing root cause with our vendor.
Posted Mar 25, 2024 - 03:27 EDT
Update
We are continuing to investigate this issue.
Posted Mar 25, 2024 - 02:14 EDT
Update
Since approximately 6:45 p.m. on Sunday, March 24, we have been experiencing a major outage of IT services, including campus wired and Wi-Fi networking, Shibboleth and Active Directory sign-on services, and multiple other services that run from our on-campus network, such as file storage and print services. At this time we suspect a low-level systems issue that is affecting the availability of many of our services.

A large team of OIT technical responders has been working non-stop to resolve this issue since it began, and will continue to work until all services are restored.
Posted Mar 25, 2024 - 02:14 EDT
Investigating
Multiple services are unavailable. OIT is investigating.
Posted Mar 24, 2024 - 19:17 EDT
This incident affected: Other (Other), Network Connectivity and Telecommunications (VPN - Virtual Private Network, Wired Network, Wireless network (Brown, Brown-Guest, & Eduroam)), and Information Security and Accounts (Shibboleth Single-Signon Authentication).