NinerNet Communications™
System Status

Server and System Status

NC031: Database overload incident

14 December 2018 05:51:47 +0000

On 13 December (UTC) the database server on server NC031 (the primary web server) failed twice; after the second failure we decided to reboot the server. We believe this to be the result of a marked increase in denial of service and hacking attempts against this server over the last few days.

The database server went down at 01:55. We immediately logged into the server to determine the cause, and restarted the database server at 02:37. Shortly after that the problem recurred, so we did a full reboot of the machine at 03:03. The server was again online and fully functional at 03:06.

The database failure resulted in database-based websites — e.g., WordPress websites — generating “error connecting to database” errors.

This incident highlights a weakness on this server that we intend to address very early in the New Year: our firewall system needs an upgrade to better handle such attacks in the future.

We apologise that this issue occurred. If you have any questions, please contact NinerNet support. Thank you.

NC031: Data centre power failure

19 July 2018 05:09:49 +0000

At 23:04 UTC on 18 July connectivity issues were reported at the New York, US, data centre. The cause of this problem was later identified as a power outage. The data centre is currently reporting that power was restored at 00:38 UTC on 19 July, but our own logs on the server itself indicate that the server was down only between 22:55 and 23:50 UTC on 18 July. We are expecting a full report from the data centre and will post further details here when we receive it.

This affected our primary web server, hosting most clients’ websites, including our own primary website (www.niner.net).

We later soft-rebooted the server at 04:07 UTC on 19 July, and it was back online at 04:08.

This outage affected automated daily back-ups on the server. These have been restored and will next run as scheduled at midnight UTC on 20 July.

We apologise for this outage. The data centres we select are supposed to have redundant power systems to ensure that this kind of event never happens. However, clearly it did in this case and, as mentioned above, we are expecting a follow-up report from the data centre to explain this failure.

If you have any questions or concerns, please contact us. Thank you, and we again apologise for this incident.

NC036: Migration update 25 — Final

18 June 2018 08:54:43 +0000

The migration of all email accounts from server NC027 to server NC036 is complete. In fact, it was successfully completed at 04:00 UTC on 4 June. What followed over the next few days was an unprecedented avalanche of misinformation and red herrings that resulted in our moving the new server to another data centre (a move that took ten times longer than the previous move from the data centre where NC027 was located) where the same “problems” experienced by only some of our clients magically reappeared.

We planned the migration to have absolutely no impact on existing email configurations. We did this by pointing legacy sub-domains of the niner.net domain that named server NC027 — e.g., smtp27.niner.net — to server NC036. At the conclusion of the migration these sub-domains were indeed pointing to the new server. In other words, on Monday morning (4 June) email programs would have thought they were still downloading mail from the same server, not realising (or needing to realise) that they were in fact downloading from a new server.

However, it turned out that a significant minority of email programs were somehow misconfigured with settings that worked on the old server, but stopped working when connecting to the new server. Those clients who were using the correct settings experienced no disruption at all, and when those clients with incorrect settings corrected them on the morning of Monday the 11th, the problems were fixed instantly.

Over the rest of that week (11-15 June) we helped a few clients with some issues unique to how they use email, especially where those practices clashed with current best practices for email transmission. We also dealt with some issues of senders whose mail servers were behaving improperly, causing their emails to be blocked because they looked like spammers. This notably affected email from the ZRA, but their emails are once again flowing unimpeded.

We’re monitoring the spam filtering on the new server. Any message that the server identifies as spam will have “[SPAM]” added to the start of its subject. You can use this to configure your email program or the webmail to deal with spam automatically, by filtering it into your “junk” folder or deleting it entirely. We recommend filtering to the junk folder so that you can catch the occasional legitimate message that is misclassified as spam.
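
As a rough illustration of the filtering rule described above, the following Python sketch files a message in a junk folder when its subject starts with “[SPAM]”. The function name and folder names are hypothetical, and email programs apply this kind of rule through their own filter settings rather than through code; the sketch only shows the logic.

```python
from email import message_from_string

SPAM_PREFIX = "[SPAM]"  # prefix the server adds to the subject of suspected spam


def choose_folder(raw_message: str) -> str:
    """Return the (hypothetical) folder name a message should be filed in."""
    message = message_from_string(raw_message)
    subject = (message["Subject"] or "").strip()
    # File tagged messages under "Junk" rather than deleting them, so that a
    # misclassified legitimate message can still be recovered.
    return "Junk" if subject.startswith(SPAM_PREFIX) else "Inbox"
```

Filing to a junk folder instead of deleting matches the recommendation above: nothing is lost if the server tags a legitimate message by mistake.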

Finally, in recognition of the fact that the emergency migration of the server to a new data centre on 6 June disrupted all clients’ email, and the fact that those clients with misconfigured email programs experienced a week of disruption before the issue was identified, we will be applying a one-week (quarter month) credit to the accounts of all clients hosted on server NC036. We apologise for the difficulties caused, and will apply what was learned this time to future migrations.

Thank you, as always, for your custom and patience.

NC033: Maintenance complete

11 June 2018 00:45:06 +0000

Server NC033 is back online. It was down between 00:37 and 00:43 UTC.

NC033: Maintenance

11 June 2018 00:35:57 +0000

Server NC033 (the primary nameserver) is going down in a few minutes for maintenance related to solving the mail server migration issue.

NC036: Migration update 15 — Possible return of connection issues

6 June 2018 16:35:16 +0000

As happened on Monday, after initial success with setting up the new mail server, we’re again receiving reports from clients in Zambia (so far) who are starting to have their connections to the server dropped by their ISPs. This is incredibly frustrating, certainly for you, of course, but also for us.

If this is happening to you, you can help us help you by submitting what’s called an MTR report. MTR is a network diagnostic tool that gives us quantifiable information we can analyse and (hopefully) act on, showing where the network problem affecting your email lies so that we may possibly contact the organisation responsible for the problem.

If you’re running Windows, please download WinMTR. The file you will download is just a zip file that contains the actual program (32- and 64-bit versions). Read the “README.TXT” file for instructions on which version to use. All you need to do is double-click the appropriate file and the program runs; it’s not actually installed on your computer. Then type nc036.ninernet.net (yes, that’s ninernet.net, NOT niner.net) in the box as prompted, and select the start option to begin generating report data. Once you’ve gathered enough data, please use the copy or export functions to send us the report.

Please also send us the following information:

  • Your IP address.
  • The name of your ISP. If you have access to multiple connections, an MTR report from each ISP would be greatly appreciated.
  • If you have a reliable contact at that ISP, his/her name, position, phone number and email address.

Please email this to our usual email address, or send it via our contact form.
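
For clients on Linux or macOS, where the WinMTR instructions above do not apply, an equivalent report can usually be produced with the command-line `mtr` tool. The Python sketch below assumes `mtr` is already installed and on your PATH. The function names and the choice of 10 cycles are illustrative assumptions, not requirements of ours.

```python
import shutil
import subprocess

MAIL_SERVER = "nc036.ninernet.net"  # note: ninernet.net, not niner.net


def build_mtr_command(host: str = MAIL_SERVER, cycles: int = 10) -> list:
    """Build an mtr invocation that prints a plain-text report and exits."""
    # --report runs non-interactively; --report-cycles sets how many probes
    # are sent to each hop before the summary is printed.
    return ["mtr", "--report", "--report-cycles", str(cycles), host]


def run_report(host: str = MAIL_SERVER, cycles: int = 10) -> str:
    """Run mtr and return its report text, ready to paste into an email."""
    if shutil.which("mtr") is None:
        raise RuntimeError("mtr is not installed or not on the PATH")
    result = subprocess.run(
        build_mtr_command(host, cycles),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `run_report()` returns the report as text, which can be sent to us together with the information listed above. On some systems `mtr` needs elevated privileges to run.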

Thanks.

NC036: Migration update 13

6 June 2018 12:52:23 +0000

We will post a postmortem here in due course, hopefully within 24-48 hours, along with a thousand more apologies. In the meantime we are looking for feedback to ensure that all clients are able to connect to the server and download and send email, as this was not the case on Monday and Tuesday.

NC036: Migration update 12 — server back online

6 June 2018 12:28:04 +0000

The transfer of the mail spools has completed and server NC036 was brought back online at 12:12 UTC.

NC036: Migration update 12

6 June 2018 08:32:10 +0000

The transfer of mail data between the old and new data centres is still underway. The transfer of the same amount of data took 50 minutes over the weekend, and we are now at the 5-hour mark; we could not have predicted that this would take so long.

I can assure you that I understand the frustration that you are feeling with this situation, but given the network problems in southern and central Africa that necessitated this emergency move, we had no choice but to act immediately rather than forcing many clients to do without mail until the weekend.

My best estimate at this point, based on how much data has transferred so far (55%) and how much is left, is that the transfer will complete at approximately 13:00 UTC. Assuming this is the case, the server will be back online and accessible at about 13:30 UTC.
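
An estimate like the one above can be sketched as a simple linear extrapolation from the progress so far. The Python below is only an illustration under stated assumptions: it assumes a constant transfer rate, and the function name and the start time used in the usage note are hypothetical rather than taken from our logs.

```python
from datetime import datetime, timedelta, timezone


def estimate_completion(started: datetime, now: datetime, fraction_done: float) -> datetime:
    """Linearly extrapolate when a transfer will finish from progress so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in the range (0, 1]")
    elapsed = now - started
    # If fraction_done of the data took `elapsed`, the whole transfer is
    # projected to take elapsed / fraction_done at the same rate.
    projected_total = elapsed / fraction_done
    return started + projected_total
```

For example, with an assumed start of 03:30 UTC, a current time of 08:30 UTC and 55% transferred, this projects completion a little after 12:30 UTC. The estimate of approximately 13:00 UTC given above is more conservative, allowing for the rate to vary.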

I sincerely and deeply apologise for this situation.

NC036: Migration update 11

6 June 2018 04:42:04 +0000

The transfer is taking significantly longer than we anticipated, likely due to the greater distance between the two data centres.

Systems at a Glance:


Server NC023, London, United Kingdom (Relay server): Internal
Server NC028, Vancouver, Canada (Monitoring server): Internal
Server NC031, New York, United States of America (Web server): Internal
Server NC033, Toronto, Canada (Primary nameserver): Operational
Server NC034, Lusaka, Zambia (Phone server): Internal
Server NC035, Sydney, Australia (Secondary nameserver): Operational
Server NC036, Amsterdam, Netherlands (Mail server): Operational
Server NC040, Toronto, Canada (Web server): Internal
Server NC041, New York, United States of America (Web server): Operational
Server NC042, Seattle, United States of America (Status website): Operational
