Server NC031 is back online, although we are waiting for word from the data centre that the issue is definitely completely resolved.
At 00:09 UTC on 18 August we became aware of a networking issue at the data centre where NC031 — the primary web server — is located. This is affecting almost all web hosting clients, including the primary NinerNet website. The web server itself is up and running, as far as we know, but the problem is that traffic to and from the data centre is down.
This issue does not affect any email services.
Data centre staff are working on restoring connectivity, and we will post an update as soon as we are aware that the web server is once again accessible. We apologise for this issue.
At 23:04 UTC on 18 July connectivity issues were reported at the New York, US, data centre. The cause of this problem was later identified as a power outage. The data centre is currently reporting that power was restored to the data centre at 00:38 UTC on 19 July, but our own logs on the server itself indicate that it was only down between 22:55 and 23:50 UTC on 18 July. We are expecting a full report from the data centre and will post further details here when we receive that.
This affected our primary web server, hosting most clients’ websites, including our own primary website (www.niner.net).
We later soft-rebooted the server at 04:07 UTC on 19 July, and it was back online at 04:08.
This outage affected automated daily back-ups on the server. These have been restored and will next run as scheduled, at midnight UTC on 20 July.
We apologise for this outage. The data centres we select are supposed to have redundant power systems to ensure that this kind of event never happens. However, clearly it did in this case and, as mentioned above, we are expecting a follow-up report from the data centre to explain this failure.
If you have any questions or concerns, please contact us to let us know. Thank-you, and we again apologise for this incident.
The migration of all email accounts from server NC027 to server NC036 is complete. In fact, it was successfully completed at 04:00 UTC on 4 June. What followed over the next few days was an unprecedented avalanche of misinformation and red herrings that resulted in our moving the new server to another data centre (a move that took ten times longer than the previous move from the data centre where NC027 was located) where the same “problems” experienced by only some of our clients magically reappeared.
We planned the migration to have absolutely no impact on existing email configurations. We did this by pointing legacy sub-domains of the niner.net domain that named server NC027 — e.g., smtp27.niner.net — to server NC036. At the conclusion of the migration these sub-domains were indeed pointing to the new server. In other words, on Monday morning (4 June) email programs would have thought they were still downloading mail from the same server, not realising (or needing to realise) that they were in fact downloading from a new server.
However, it turned out that a significant minority of email programs were somehow misconfigured with settings that worked on the old server, but stopped working when connecting to the new server. Those clients who were using the correct settings experienced no disruption at all, and when those clients with incorrect settings corrected them on the morning of Monday the 11th, the problems were fixed instantly.
Over the rest of that week (11-15 June) we helped a few clients with some issues unique to how they use email, especially where those practices clashed with current best practices for email transmission. We also dealt with some issues of senders whose mail servers were behaving improperly, causing their emails to be blocked because they looked like spammers. This notably affected email from the ZRA, but their emails are once again flowing unimpeded.
We’re monitoring the spam filtering on the new server. Any message that the server identifies as spam will have “[SPAM]” prefixed to its subject. You can use this to configure your email program or the webmail to deal with spam automatically, by filtering it into your “junk” folder or deleting it entirely. We recommend filtering to the junk folder so that you can catch the occasional legitimate message that is misclassified as spam.
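As a rough illustration of the filtering rule described above, here is a minimal Python sketch that decides which folder a message belongs in based on the “[SPAM]” subject prefix. The function names and folder names are hypothetical, and most people would set up the equivalent rule in their email program or the webmail rather than write code.

```python
from email import message_from_string
from email.message import Message

# Prefix that the post says the server adds to the subject of suspected spam.
SPAM_PREFIX = "[SPAM]"


def is_flagged_spam(msg: Message) -> bool:
    """Return True if the subject starts with the server's spam prefix."""
    subject = str(msg.get("Subject", "") or "")
    return subject.lstrip().startswith(SPAM_PREFIX)


def choose_folder(msg: Message, junk: str = "Junk", inbox: str = "INBOX") -> str:
    """Pick a destination folder; the folder names are assumptions."""
    # Moving to a junk folder instead of deleting keeps misclassified
    # legitimate messages recoverable, as recommended in the post.
    return junk if is_flagged_spam(msg) else inbox


raw = "From: someone@example.com\nSubject: [SPAM] Cheap watches\n\nBody text\n"
print(choose_folder(message_from_string(raw)))
```

The check deliberately matches only a prefix at the start of the subject, so a reply that merely quotes a flagged subject (for example “Re: [SPAM] ...”) is not treated as spam.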
Finally, in recognition of the fact that the emergency migration of the server to a new data centre on 6 June disrupted all clients’ email, and the fact that those clients with misconfigured email programs experienced a week of disruption before the issue was identified, we will be applying a one-week (quarter month) credit to the accounts of all clients hosted on server NC036. We apologise for the difficulties caused, and will apply what was learned this time to future migrations.
Thank-you, as always, for your custom and patience.
We suspect that clients having problems sending or receiving email have very old legacy configuration settings. Please see the “Email server settings” section below for the definitively correct settings.
Over the weekend we took a deep breath and stepped back to re-analyse this problem, and consult with a number of you. Between…
.. we were awash in red herrings to an extent I have never seen in 22 years.
We’ve taken a look at the behaviour of two of the most used email programs (Thunderbird and Outlook) and come to some conclusions about what might be happening:
So, if you’re having problems sending, it will likely be worth your while to check your SMTP (outgoing) settings; if you’re having problems receiving, it will likely be worth your time to check your POP or IMAP (incoming) settings. I wanted to have some screenshots ready for this post, but I’d rather get the words up now and post screenshots shortly afterwards, so here are the settings you need to use:
I can’t emphasise strongly enough how important it is for you to be precise in setting up this configuration. No setting is “close enough”, and your computer is not smart enough to figure it out; it will just tell you there is an error. Although, having said that, I’d like to emphasise that the niner.net sub-domains with “27” in them — i.e., pop27.niner.net, imap27.niner.net and smtp27.niner.net — do still also work, but they will be phased out; do not use them.
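For anyone curious whether a legacy “27” name and its replacement currently lead to the same place, the following Python sketch compares the addresses each name resolves to. It assumes that both names resolve to at least one shared IP address when they point at the same server, which this post does not state explicitly; the non-“27” host names in `PAIRS` are taken from the sub-domains mentioned elsewhere on this page, and the function names are hypothetical.

```python
import socket

# Legacy name -> current name. The pairing is an assumption for illustration.
PAIRS = [
    ("pop27.niner.net", "pop.niner.net"),
    ("imap27.niner.net", "imap.niner.net"),
    ("smtp27.niner.net", "smtp.niner.net"),
]


def resolve(host):
    """Return the set of IP addresses for host, or an empty set on failure."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return {info[4][0] for info in infos}


def same_server(legacy, current, resolver=resolve):
    """True if both names resolve and share at least one address."""
    legacy_ips = resolver(legacy)
    current_ips = resolver(current)
    return bool(legacy_ips & current_ips)
```

Calling `same_server(*pair)` for each entry in `PAIRS` gives a quick yes or no per name. A `False` result only means the names did not resolve to a shared address from where the script was run; it does not by itself show that the email settings are wrong.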
In the case of those email programs that like to railroad you into sending all email through a single SMTP account by default, we suggest that you start with a clean slate there too by deleting all of the saved SMTP accounts (unless you have some on systems that are completely separate from NinerNet) and creating a new one for each of your email accounts. Because your email program may not let you delete the “default” SMTP account, you’ll need to make a new SMTP account the new default, and then delete the old default.
We will post helpful screenshots as soon as possible. In the meantime, please check (and, if necessary, update) your email account settings and ensure that they are correct.
Thank-you.
I have just got off the phone with someone in IT security at MTN head office in Lusaka, and they confirm that they have been blocking our new mail server as part of a wrong-headed plan to prevent MTN users from sending spam. It is likely that the first new mail server was also being actively blocked. He says that our IP addresses will be unblocked within the next ten minutes.
This raises the significant question of whether or not this is now an Africa-wide policy with many other ISPs. Other countries manage to prevent their users from sending spam without holding the keys to a gateway to the Internet, forcing companies like NinerNet to supplicate themselves to big companies like MTN when we find our businesses held hostage.
This is why we sent the questionnaire out yesterday asking you for details on whether or not you are still having problems, and for the details of your ISP. Please reply to those emails so that we may determine which ISPs are actively blocking our servers and take the appropriate action.
We have had this report from a client:
I have now reset my LTE unit in our office to factory default and mails are working again on MTN, weird…We will monitor and see if it goes off again
We continue to track the intermittent connections in Zambia. They simply don’t make sense. For example, some MTN customers have no problems connecting, but some do. And some people can connect on MTN, but not Realtime/HAI, or they can connect on Paratus, but not MTN.
But we are slowly managing to narrow things down with a resolution in mind.
We did receive a call from a client who has talked to at least one ISP up on the Copperbelt, and they informed him that they allow some connections but not others, and they allow some connections intermittently such that it works one minute and stops working the next. This is exactly the behaviour our clients are seeing, and it seems to be intentional on the part of at least one Zambian ISP! Now, these are very vague statements, but our client asked us for an email explaining how our system works and is configured that he could send to them. Herewith a copy of our email:
Thanks for your phone call. As I said on the phone, this mail server operates in exactly the same way as the old mail server. There is simply no way to operate a mail server on the Internet that does not conform to the same interoperability standards as every other mail server on the Internet. Sure, there are minor variations on how some things are done internally on all servers, but for server A to talk to server B and deliver an email (or for a personal computer or phone to get that email to server A in the first place) they all have to be talking the same language.
Also, I find it very difficult to understand an ISP saying that they allow some standard behaviour and disallow other standard behaviour. And it’s even more bizarre that they say they allow some behaviour intermittently; what’s the point of that?!
With that editorial out of the way, this is the configuration of both the old and new mail servers:
SOFTWARE:
- MTA (mail transfer agent, i.e., mail server software, SMTP): Postfix
- MDA (mail delivery agent, i.e., POP and IMAP): Dovecot
- Web server (control panel and webmail): Nginx
PORTS (all TLS/SSL):
- POP: 110/995
- IMAP: 143/993
- SMTP: 587
- Web: 443
This is a 100% standard configuration, and as I’ve said before, is exactly the same as it was on the old server … EXACTLY the same.
Any ISP is welcome to contact me directly, by email or phone, to explain why users on our system should be subject to some sort of arbitrary blocking of anything. And they’re welcome to contact me just to ask questions or for a friendly chat. Everyone in the world (barring repressive dictatorships, which I don’t think Zambia has become just yet) uses these same port numbers and configurations.
Please keep me informed. Thanks.
Craig
Are you wondering if our mail server is really up or if we’re “having problems”? We could be lying, but this third-party service will uncover our lies:
Every time we check, mail.niner.net and webmail.niner.net are up. Please check for yourself. In fact, we suggest contacting your ISP and asking them why you cannot reach a server that is alive and well.
You can also check the pop, imap and smtp sub-domains of niner.net, as well as the old pop27, imap27 and smtp27 sub-domains, all of which are working.
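If you would like to run a similar check from your own connection, the following Python sketch attempts a plain TCP connection to each service. The port numbers are the ones listed in our email above; the pairing of each host name with a port is an assumption made for illustration, and the function names are hypothetical. It tests only whether a connection can be opened, not whether you can log in.

```python
import socket

# Host/port pairs to try. Ports come from the list in the post;
# which host name goes with which port is an assumption.
CHECKS = [
    ("pop.niner.net", 995),
    ("imap.niner.net", 993),
    ("smtp.niner.net", 587),
    ("webmail.niner.net", 443),
]


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def report(checks, probe=can_connect):
    """Return one human-readable result line per (host, port) pair."""
    lines = []
    for host, port in checks:
        status = "reachable" if probe(host, port) else "NOT reachable"
        lines.append(f"{host}:{port} {status}")
    return lines
```

Running `print("\n".join(report(CHECKS)))` prints one line per service. If a service shows as not reachable from your connection while a third-party monitor shows it as up, that result is useful evidence to pass on to your ISP.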
We actually do strongly urge you to contact your ISP about the fact that you can only intermittently connect to our mail server. They are the only ones who can help you with your connection to the Internet when it is not working properly.
As happened on Monday, after initial success with setting up the new mail server, we’re again receiving reports from clients in Zambia (so far) who are starting to have their connections to the server dropped by their ISPs. This is incredibly frustrating, certainly for you, of course, but also for us.
If this is happening to you, you can help us help you by submitting what’s called an MTR report. MTR is a network diagnostic tool that gives us quantifiable information we can analyse and (hopefully) act on, showing where the network problem affecting your email lies so that we may possibly contact the organisation responsible for the problem.
If you’re running Windows, please download WinMTR. The file you will download is just a zip file that contains the actual program (32- and 64-bit versions). Read the “README.TXT” file for instructions on which version to use. All you need to do is double-click the appropriate file and the program runs; it’s not actually installed on your computer. Then type nc036.ninernet.net (yes, that’s ninernet.net, NOT niner.net) in the box as prompted, and select the start option to begin generating report data. Once you’ve gathered enough data, please use the copy or export functions to send us the report.
Please also send us the following information:
Please email this to our usual email address, or send it via our contact form.
Thanks.
Systems at a Glance:
| Loc. | System | Status | Ping |
|---|---|---|---|
| NC023 | Internal | Up? | |
| NC028 | Internal | Up? | |
| NC031 | Internal | Up? | |
| NC033 | Operational | Up? | |
| NC034 | Internal | Up? | |
| NC035 | Operational | Up? | |
| NC036 | Operational | Up? | |
| NC040 | Internal | Up? | |
| NC041 | Operational | Up? | |
| NC042 | Operational | Up? | |