NinerNet Communications™
System Status

Server and System Status

NC031: Web server downtime

14 March 2019 21:34:49 +0000

The database on our primary web server (NC031) went down at 20:20 UTC and was not brought back up until 21:00 UTC. This means that all database-driven websites were down during this 40-minute period and would have displayed the message “Error establishing a database connection”. We are still trying to determine why this happened and why the service was not automatically restarted, as it should have been. Coincidentally (or perhaps not), the server was at the same time under heavy load from hundreds of thousands of requests to a single website, and we have now blocked the source of that traffic.

We sincerely apologise for this inconvenience. If you have any questions, please do contact support.
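For readers who want to spot this kind of single-source traffic on their own sites, here is a minimal sketch of counting requests per client address in a web access log. It assumes the client address is the first whitespace-separated field on each line, as in the common log format. The function name `top_clients` and the sample log lines are invented for illustration and are not taken from our servers.

```python
from collections import Counter


def top_clients(log_lines, limit=3):
    """Return the `limit` busiest client addresses as (address, count) pairs."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            # Assumption: the client address is the first field on the line.
            counts[fields[0]] += 1
    return counts.most_common(limit)


# Invented sample data using documentation-only IP ranges.
sample_log = [
    '203.0.113.7 - - [14/Mar/2019:20:19:58 +0000] "GET /index.php HTTP/1.1" 200 512',
    '203.0.113.7 - - [14/Mar/2019:20:19:58 +0000] "GET /index.php HTTP/1.1" 200 512',
    '198.51.100.4 - - [14/Mar/2019:20:19:59 +0000] "GET /about/ HTTP/1.1" 200 2048',
    '203.0.113.7 - - [14/Mar/2019:20:19:59 +0000] "GET /index.php HTTP/1.1" 200 512',
]

busiest = top_clients(sample_log, limit=1)
```

An address that accounts for a disproportionate share of requests is a candidate for closer inspection. Whether to block it remains a judgment call.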

NC023: Secondary issue resolved

12 March 2019 20:26:33 +0000

The secondary issues referred to previously were not issues at all. A single test out of several failed, but the failure was unrelated to this server.

This issue is now closed. We apologise for any inconvenience this emergency maintenance may have caused while the server was offline between approximately 13:26 and 14:03 UTC.

If you have any questions, please contact support. Thank you for your patience.

NC023: Emergency maintenance

12 March 2019 15:01:42 +0000

Due to issues affecting this server (NC023, the relay server), it has just undergone emergency maintenance. However, on boot there are new issues affecting it.

We are working to resolve these issues and will post updates here. We apologise for this inconvenience.

NC031: Emergency maintenance complete

26 December 2018 12:18:17 +0000

We took server NC031 (the primary web server) down for eleven minutes today to correct an issue related to memory usage. The server was offline between 10:32 and 10:43 UTC.

We will be replacing this server early next month, and migrating all websites to the new server.

Our apologies for this inconvenience. If you have any questions or concerns, please contact NinerNet support. Thank you.

NC031: Database overload incident

14 December 2018 05:51:47 +0000

On 13 December (UTC) the database server on server NC031 (the primary web server) failed twice, the second time resulting in our deciding to reboot the server. We believe this to be the result of a marked increase in denial of service and hacking attempts against this server over the last few days.

The database server went down at 01:55. We immediately logged into the server to determine the cause, and restarted the database server at 02:37. Shortly after that the problem manifested itself again, so we did a full reboot of the machine at 03:03, and the server was again online and fully functional at 03:06.

The database failure resulted in database-based websites — e.g., WordPress websites — generating “error connecting to database” errors.

This incident highlights a weakness on this server that we intend to address very early in the New Year: a necessary upgrade of our firewall system to better handle such attacks in future.

We apologise that this issue occurred. If you have any questions, please contact NinerNet support. Thank you.

NC036: Mail server blocked by Microsoft

14 December 2018 05:33:38 +0000

We are aware that the IP address of server NC036 (the primary mail server) has again been blocked by Microsoft’s mail services, variously known as Outlook.com, MSN, Hotmail, Live.com, etc.

Although we are a member of their Smart Network Data Services programme and Junk Mail Reporting Program, which are supposed to allow us to proactively prevent these kinds of issues, we have been unable to use the service as advertised, or at least as we understand it’s supposed to work. We will continue to attempt to have this server’s IP address removed from their blacklist, and report here when we have success.

In the meantime, outgoing mail to their primary domains (hotmail.ca, hotmail.com, hotmail.co.uk, live.com, msn.com and outlook.com) is being routed through our relay server. If you receive a bounce message similar to the one below in response to an email you’ve sent, the recipient’s address is probably on a private domain hosted by Microsoft of which we are not aware. Please contact us and we will add it to the list of domains for which email is routed through our relay server:

host 901e3cd0af6f44ab11b5a5e8a49da3.pamx1.hotmail.com[104.47.0.33] said: 550
5.7.1 Unfortunately, messages from [178.62.195.26] weren’t sent. Please
contact your Internet service provider since part of their network is on
our block list (S3140). You can also refer your provider to
http://mail.live.com/mail/troubleshooting.aspx#errors.
[HE1EUR01FT033.eop-EUR01.prod.protection.outlook.com] (in reply to MAIL
FROM command)
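As an illustration only, the per-domain relay routing and the bounce message above can be sketched in a few lines of Python. This is not our mail server's configuration. The names `next_hop`, `parse_block_bounce` and `RELAYED_DOMAINS` are hypothetical, and the pattern is written against the wording of this one bounce message, so other wordings may not match.

```python
import re

# Domains listed in this post as being routed through the relay server.
RELAYED_DOMAINS = {
    "hotmail.ca", "hotmail.com", "hotmail.co.uk",
    "live.com", "msn.com", "outlook.com",
}


def next_hop(recipient, relayed=RELAYED_DOMAINS):
    """Return "relay" or "direct" depending on the recipient's domain."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    return "relay" if domain in relayed else "direct"


# Assumption: the bounce names the blocked IP in square brackets and the
# block code in parentheses, as in the example message in this post.
BLOCK_PATTERN = re.compile(
    r"messages from \[(?P<ip>[0-9.]+)\].*?block\s+list\s+\((?P<code>\w+)\)",
    re.DOTALL,
)


def parse_block_bounce(text):
    """Return (blocked_ip, block_code), or None if the text does not match."""
    match = BLOCK_PATTERN.search(text)
    return (match.group("ip"), match.group("code")) if match else None


sample_bounce = (
    "host 901e3cd0af6f44ab11b5a5e8a49da3.pamx1.hotmail.com[104.47.0.33] said: 550\n"
    "5.7.1 Unfortunately, messages from [178.62.195.26] weren’t sent. Please\n"
    "contact your Internet service provider since part of their network is on\n"
    "our block list (S3140)."
)

blocked = parse_block_bounce(sample_bounce)
```

A recipient domain that produces this kind of bounce but returns "direct" from `next_hop` is exactly the case we ask you to report, so that the domain can be added to the relayed list.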

Please remember that all email you send through our mail server must be to recipients with whom you already have a business or personal relationship, and all mass email must be explicitly requested — i.e., Confirmed opt-in (COI) or Double opt-in (DOI) email.

Thanks for your cooperation, and our apologies for this inconvenience.

NC031: Server back online

18 August 2018 01:28:10 +0000

Server NC031 is back online, although we are waiting for word from the data centre that the issue is definitely completely resolved.

NC031: Networking issues at data centre

18 August 2018 01:11:02 +0000

At 00:09 UTC on 18 August we became aware of a networking issue at the data centre where NC031 — the primary web server — is located. This is affecting almost all web hosting clients, including the primary NinerNet website. The web server itself is up and running, as far as we know, but the problem is that traffic to and from the data centre is down.

This issue does not affect any email services.

Data centre staff are working on restoring connectivity, and we will post an update as soon as we are aware that the web server is once again accessible. We apologise for this issue.

NC031: Data centre power failure

19 July 2018 05:09:49 +0000

At 23:04 UTC on 18 July connectivity issues were reported at the New York, US, data centre. The cause of this problem was later identified as a power outage. The data centre is currently reporting that power was restored to the data centre at 00:38 UTC on 19 July, but our own logs on the server itself indicate that it was only down between 22:55 and 23:50 UTC on 18 July. We are expecting a full report from the data centre and will post further details here when we receive that.

This affected our primary web server, hosting most clients’ websites, including our own primary website (www.niner.net).

We later soft-rebooted the server at 04:07 UTC on 19 July, and it was back online at 04:08.

This outage affected automated daily back-ups on the server. These have been restored and will next run as scheduled at midnight UTC on 20 July.

We apologise for this outage. The data centres we select are supposed to have redundant power systems to ensure that this kind of event never happens. However, clearly it did in this case and, as mentioned above, we are expecting a follow-up report from the data centre to explain this failure.

If you have any questions or concerns, please contact us. Thank you, and we again apologise for this incident.

NC036: Migration update 25 — Final

18 June 2018 08:54:43 +0000

The migration of all email accounts from server NC027 to server NC036 is complete. In fact, it was successfully completed at 04:00 UTC on 4 June. What followed over the next few days was an unprecedented avalanche of misinformation and red herrings that resulted in our moving the new server to another data centre (a move that took ten times longer than the previous move from the data centre where NC027 was located) where the same “problems” experienced by only some of our clients magically reappeared.

We planned the migration to have absolutely no impact on existing email configurations. We did this by pointing legacy sub-domains of the niner.net domain that named server NC027 — e.g., smtp27.niner.net — to server NC036. At the conclusion of the migration these sub-domains were indeed pointing to the new server. In other words, on Monday morning (4 June) email programs would have thought they were still downloading mail from the same server, not realising (or needing to realise) that they were in fact downloading from a new server.
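The effect of repointing a legacy sub-domain can be checked by comparing the addresses it resolves to with the addresses of the new server. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the address `192.0.2.10` is a documentation placeholder, not the real address of NC036.

```python
import socket


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses a hostname currently resolves to."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return set(addresses)


def points_to(legacy_name, expected_addresses, resolve=resolve_ipv4):
    """True if the legacy name resolves only to the expected addresses.

    `resolve` can be swapped for a stub so the check can be exercised
    without a live DNS query.
    """
    found = resolve(legacy_name)
    return bool(found) and found <= set(expected_addresses)


# Stub resolver with placeholder data (not real DNS records).
fake_records = {"smtp27.niner.net": {"192.0.2.10"}}
ok = points_to("smtp27.niner.net", {"192.0.2.10"}, resolve=fake_records.__getitem__)
```

Calling `points_to` without the `resolve` argument performs a real DNS lookup, so its result depends on the records in place at the time.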

However, it turned out that a significant minority of email programs were somehow misconfigured with settings that worked on the old server, but stopped working when connecting to the new server. Those clients who were using the correct settings experienced no disruption at all, and when those clients with incorrect settings corrected them on the morning of Monday the 11th, the problems were fixed instantly.

Over the rest of that week (11-15 June) we helped a few clients with some issues unique to how they use email, especially where those practices clashed with current best practices for email transmission. We also dealt with some issues of senders whose mail servers were behaving improperly, causing their emails to be blocked because they looked like spammers. This notably affected email from the ZRA, but their emails are once again flowing unimpeded.

We’re monitoring the spam filtering on the new server. Any message that the server identifies as spam will have “[SPAM]” prefixed to its subject. You can use this to configure your email program or the webmail to deal with spam automatically, by filtering it into your “junk” folder or deleting it entirely. We recommend filtering to the junk folder so that you can catch the occasional legitimate message that is misclassified as spam.
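Most email programs let you build this rule through their own filter settings. For anyone scripting their own mail handling, the rule amounts to a check on the subject prefix, sketched below. The function name `target_folder` and the folder names "Junk" and "Inbox" are assumptions for illustration. Only the “[SPAM]” prefix comes from the server.

```python
from email import message_from_string

SPAM_PREFIX = "[SPAM]"  # prefix the server adds to the subject of suspected spam


def target_folder(raw_message, junk="Junk", inbox="Inbox"):
    """Pick a folder name based on the subject prefix added by the server."""
    subject = message_from_string(raw_message).get("Subject", "")
    return junk if subject.strip().startswith(SPAM_PREFIX) else inbox


# Invented sample messages.
flagged = "Subject: [SPAM] Cheap watches\n\nBuy now."
normal = "Subject: Invoice for June\n\nPlease find attached."

folders = (target_folder(flagged), target_folder(normal))
```

Filing flagged mail into a folder rather than deleting it keeps misclassified legitimate messages recoverable, in line with the recommendation above.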

Finally, in recognition of the fact that the emergency migration of the server to a new data centre on 6 June disrupted all clients’ email, and the fact that those clients with misconfigured email programs experienced a week of disruption before the issue was identified, we will be applying a one-week (quarter month) credit to the accounts of all clients hosted on server NC036. We apologise for the difficulties caused, and will apply what was learned this time to future migrations.

Thank you, as always, for your custom and patience.

Systems at a Glance:


Loc.                                                  System  Status       Ping
London, United Kingdom (Relay server)                 NC023   Internal     Up?
Vancouver, Canada (Monitoring server)                 NC028   Internal     Up?
New York, United States of America (Web server)       NC031   Internal     Up?
Toronto, Canada (Primary nameserver)                  NC033   Operational  Up?
Lusaka, Zambia (Phone server)                         NC034   Internal     Up?
Sydney, Australia (Secondary nameserver)              NC035   Operational  Up?
Amsterdam, Netherlands (Mail server)                  NC036   Operational  Up?
Toronto, Canada (Web server)                          NC040   Internal     Up?
New York, United States of America (Web server)       NC041   Operational  Up?
Seattle, United States of America (Status website)    NC042   Operational  Up?
