NinerNet Communications™
System Status

Server and System Status

NC036: Migration update 17

7 June 2018 09:12:19 +0000

We continue to track the intermittent connections in Zambia. They simply don’t make sense. For example, some MTN customers have no problems connecting, but some do. And some people can connect on MTN, but not Realtime/HAI, or they can connect on Paratus, but not MTN.

But we are slowly managing to narrow things down with a resolution in mind.

We did receive a call from a client who has talked to at least one ISP up on the Copperbelt, and they informed him that they allow some connections but not others, and they allow some connections intermittently such that it works one minute and stops working the next. This is exactly the behaviour our clients are seeing, and it seems to be intentional on the part of at least one Zambian ISP! Now, these are very vague statements, but our client asked us for an email explaining how our system works and is configured that he could send to them. Herewith a copy of our email:

Thanks for your phone call. As I said on the phone, this mail server operates in exactly the same way as the old mail server. There is simply no way to operate a mail server on the Internet that does not conform to the same interoperability standards as every other mail server on the Internet. Sure, there are minor variations on how some things are done internally on all servers, but for server A to talk to server B and deliver an email (or for a personal computer or phone to get that email to server A in the first place) they all have to be talking the same language.

Also, I find it very difficult to understand an ISP saying that they allow some standard behaviour and disallow other standard behaviour. And it’s even more bizarre that they say they allow some behaviour intermittently; what’s the point of that?!

With that editorial out of the way, this is the configuration of both the old and new mail servers:

SOFTWARE:

  • MTA (mail transfer agent, i.e., mail server software, SMTP): Postfix
  • MDA (mail delivery agent, i.e., POP and IMAP): Dovecot
  • Web server (control panel and webmail): Nginx

PORTS (all TLS/SSL):

  • POP: 110/995
  • IMAP: 143/993
  • SMTP: 587
  • Web: 443

This is a 100% standard configuration, and as I’ve said before, is exactly the same as it was on the old server … EXACTLY the same.

Any ISP is welcome to contact me directly, by email or phone, to explain why users on our system should be subject to some sort of arbitrary blocking of anything. And they’re welcome to contact me just to ask questions or for a friendly chat. Everyone in the world (barring repressive dictatorships, which I don’t think Zambia has become just yet) uses these same port numbers and configurations.

Please keep me informed. Thanks.

Craig

NC036: Migration update 16

7 June 2018 05:43:46 +0000

Are you wondering if our mail server is really up or if we’re “having problems”? We could be lying, but this third-party service will uncover our lies:

https://downforeveryoneorjustme.com

Every time we check, mail.niner.net and webmail.niner.net are up. Please check for yourself. In fact, we suggest contacting your ISP and asking them why you cannot reach a server that is alive and well.

You can also check the pop, imap and smtp sub-domains of niner.net, as well as the old pop27, imap27 and smtp27 sub-domains, all of which are working.

We actually do strongly urge you to contact your ISP about the fact that you can only intermittently connect to our mail server. They are the only ones who can help you with your connection to the Internet when it is not working properly.
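If you are comfortable running a script, the following is a minimal sketch of how you might test reachability from your own connection and show the result to your ISP. It only checks whether a TCP connection opens; it does not log in or test TLS. The host name comes from this post and the ports from the list in update 17; the function names and the choice of three attempts per port are assumptions made for the example.

```python
import socket

# Host name from this post; ports from the list in update 17.
MAIL_HOST = "mail.niner.net"
MAIL_PORTS = {"POP": 995, "IMAP": 993, "SMTP": 587, "Web": 443}


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False


def check_all(host=MAIL_HOST, ports=MAIL_PORTS, attempts=3):
    """Try every port several times, because the reported failures are intermittent."""
    results = {}
    for name, port in ports.items():
        results[name] = [can_connect(host, port) for _ in range(attempts)]
    return results
```

Calling `check_all()` returns one list of True/False values per service. A mix of True and False for the same port would be consistent with the intermittent blocking described above, and is worth passing on to your ISP.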

NC036: Migration update 14 — Microsoft blocks

6 June 2018 15:43:33 +0000

It seems that Microsoft blocks every IP address on the Internet by default, except those that mail server administrators like NinerNet have begged repeatedly to have unblocked. Our requests keep being ignored, despite the fact that we are members of both their Smart Network Data Service (SNDS) and their Junk Mail Reporting Program (JMRP), but we will keep trying.

Currently this means that we route mail for Microsoft’s main domains (hotmail.com, outlook.com, msn.com and live.com) through our relay server, which is not blacklisted as it pre-dates their aggressive blocking practices. However, if you send email to a non-Microsoft domain hosted by Outlook/Office365, you will almost certainly receive a bounce message that looks like this (where “exampledomain.com” stands for the Microsoft-hosted domain you sent to):

Remote-MTA: dns; exampledomain-com.mail.protection.outlook.com
Diagnostic-Code: smtp; 550 5.7.606 Access denied, banned sending IP
    [178.62.195.26]. To request removal from this list please visit
    https://sender.office.com/ and follow the directions. For more information
    please go to  http://go.microsoft.com/fwlink/?LinkID=526655 (AS16012609)
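If you receive many such bounces, the useful parts of the message above can be picked out with a short script. This is a rough sketch, assuming the bounce follows the same layout as the sample; the function name `parse_bounce` and the field names it returns are made up for illustration and do not belong to any mail library.

```python
import re

# The sample bounce from this post, including its folded continuation lines.
SAMPLE_BOUNCE = (
    "Remote-MTA: dns; exampledomain-com.mail.protection.outlook.com\n"
    "Diagnostic-Code: smtp; 550 5.7.606 Access denied, banned sending IP\n"
    "    [178.62.195.26]. To request removal from this list please visit\n"
    "    https://sender.office.com/ and follow the directions. For more information\n"
    "    please go to  http://go.microsoft.com/fwlink/?LinkID=526655 (AS16012609)\n"
)


def parse_bounce(text):
    """Extract the remote server, SMTP codes and blocked IP from a bounce report."""
    # Join continuation lines (those starting with whitespace) onto their header.
    unfolded = re.sub(r"\r?\n[ \t]+", " ", text)
    flags = re.MULTILINE | re.IGNORECASE
    result = {"remote_mta": None, "smtp_code": None,
              "enhanced_code": None, "blocked_ip": None}

    mta = re.search(r"^Remote-MTA:\s*dns;\s*(\S+)", unfolded, flags)
    if mta:
        result["remote_mta"] = mta.group(1)

    diag = re.search(
        r"^Diagnostic-Code:\s*smtp;\s*(\d{3})\s+(\d\.\d{1,3}\.\d{1,3})\s+(.*)$",
        unfolded, flags)
    if diag:
        result["smtp_code"] = diag.group(1)
        result["enhanced_code"] = diag.group(2)
        # The blocked address appears in square brackets in the message text.
        ip = re.search(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]", diag.group(3))
        if ip:
            result["blocked_ip"] = ip.group(1)
    return result
```

Running `parse_bounce(SAMPLE_BOUNCE)` on the sample yields the receiving Microsoft server, the 550 reply code, the 5.7.606 status code and the IP address that Microsoft says is banned.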

NC036: Migration update 13

6 June 2018 12:52:23 +0000

We will post a postmortem here in due course, hopefully within 24-48 hours, along with a thousand more apologies, but we are looking for feedback to ensure that all clients are able to connect to the server and download and send email, as this was not the case on Monday and Tuesday.

NC036: Migration update 12 — server back online

6 June 2018 12:28:04 +0000

The transfer of the mail spools has completed and server NC036 was brought back online at 12:12 UTC.

NC036: Migration update 12

6 June 2018 08:32:10 +0000

The transfer of mail data between the old and new data centres is still underway. Given that the transfer of the same amount of data took 50 minutes over the weekend, and we are now at the 5-hour mark, we could not have predicted that this would take so long.

I can assure you that I understand the frustration that you are feeling with this situation, but given the network problems in southern and central Africa that necessitated this emergency move, we had no choice but to act immediately rather than forcing many clients to do without mail until the weekend.

My best estimate at this point, based on how much data has transferred so far (55%) and how much is left, is that the transfer will complete at approximately 13:00 UTC. Assuming this is the case, the server will be back online and accessible at about 13:30 UTC.

I sincerely and deeply apologise for this situation.
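The arithmetic behind an estimate like the one above can be sketched as a simple linear extrapolation. This is an illustration only: it assumes a constant transfer rate and takes the elapsed time to be the 5 hours mentioned in this post. The function name is made up for the example, and this is not necessarily how the 13:00 UTC figure was reached.

```python
from datetime import datetime, timedelta, timezone


def estimate_completion(now, elapsed, fraction_done):
    """Linearly extrapolate when a transfer will finish.

    now           -- the current time
    elapsed       -- how long the transfer has been running (timedelta)
    fraction_done -- share of the data transferred so far, between 0 and 1
    """
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be greater than 0 and at most 1")
    total = elapsed / fraction_done   # projected duration of the whole transfer
    remaining = total - elapsed       # time still needed at the same rate
    return now + remaining


# Figures from this post: 55% transferred at the 5-hour mark, posted at 08:32:10 UTC.
posted = datetime(2018, 6, 6, 8, 32, 10, tzinfo=timezone.utc)
eta = estimate_completion(posted, timedelta(hours=5), 0.55)
```

With these inputs the extrapolation lands at roughly 12:37 UTC, so an estimate of approximately 13:00 UTC leaves a little margin for the rate varying.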

NC036: Migration update 11

6 June 2018 04:42:04 +0000

The transfer is taking significantly longer than we anticipated, likely due to the greater distance between the two data centres.

NC036: Migration update 10

6 June 2018 03:36:47 +0000

We’re almost done. Just waiting for the mail spools to finish transferring, then a few checks (double-checks) to ensure that all is in order, and we’ll re-enable all services. Then more checks to ensure that mail is flowing as it should, and then we await feedback.

NC036: Migration update 9

6 June 2018 01:22:20 +0000

We’ve finished planning this emergency migration of the new mail server (NC036), and will be shutting it down within the next five minutes.

NC036: Migration update 8 — Plan B

5 June 2018 23:32:15 +0000

Over the weekend we successfully migrated all of the email accounts on old server NC027 to new server NC036. Except that for a large swathe of our clients, this migration was NOT a success.

We can tell you unequivocally that the new server is running and running well. It’s doing a much better job than the old server, and we splurged on a high-performance server with additional software to process mail quicker and to do a better job of filtering out spam and viruses. That part is all going great, and I can tell you that I am delighted about that part.

The part that’s not going great is that the data centre in which we placed the new server appears to have some serious networking issues for a large number of clients in southern and central Africa. We could spend the next week troubleshooting this and perhaps find the cause (and then work on addressing the cause), but you and we don’t have the luxury of that much time. Within the next few hours we will send troubleshooting instructions to affected clients, just in case.

What we are going to do to resolve this issue is use the wonders of modern technology to shut down the new server, take an image of it to preserve the time, effort and expense that has gone into creating it, and transfer it to new hardware in a different data centre. Ideally I would like to take a day to set up a test server in that data centre but, again, you can’t afford to have no or limited access to your email for a day.

Fortunately the process of moving to another data centre is quite straightforward, and will not require as much downtime as the full migration did. Copying the image from one data centre to another will be much like physically carrying the server to the new data centre; it’s already set up and configured, it just needs to be plugged in at the other end. The only thing that will not be quicker is that we have to use more traditional methods to transfer your mail spools to the new server. This took 50 minutes on the weekend, but with equally powerful servers at both ends of the transfer it should be a bit quicker this time.

We will shut down the server at 01:00 UTC on Wednesday 6 June. Including the data transfer and some minor reconfiguration, I sincerely hope to have it back online by 03:00 UTC.

Please keep an eye on this status blog, where we will post important updates during the process.

Thank you for your extraordinary patience.


Systems at a Glance:


System  Location                            Role                  Status       Ping
NC023   London, United Kingdom              Relay server          Internal     Up?
NC028   Vancouver, Canada                   Monitoring server     Internal     Up?
NC031   New York, United States of America  Web server            Internal     Up?
NC033   Toronto, Canada                     Primary nameserver    Operational  Up?
NC034   Lusaka, Zambia                      Phone server          Internal     Up?
NC035   Sydney, Australia                   Secondary nameserver  Operational  Up?
NC036   Amsterdam, Netherlands              Mail server           Operational  Up?
NC040   Toronto, Canada                     Web server            Internal     Up?
NC041   New York, United States of America  Web server            Operational  Up?
NC042   Seattle, United States of America   Status website        Operational  Up?
