We have finished dealing with the temporary problem on the mail server (NC036). All messages are being processed normally.
Thank-you for your patience.
We are dealing with a temporary problem on the mail server (NC036). Some users might experience temporary interruptions in sending messages. We are actively addressing this problem, and will post an update when the interruption is over.
On the evening of 2 October 2025 (UTC) our mail server (NC036) fell behind on mail delivery. By about 19:00 UTC the mail queue had been manually cleared, along with the huge amount of spam. We apologise for this interruption.
We were contacted by SMTP2GO late in the morning of the 15th. We weren’t given any more specific information about this particular issue, so we have to assume that this is a new, more proactive approach to preventing spam outbreaks. In our experience, a party that has compromised an email account they intend to use to send spam will sometimes send an initial test message from the account, and that message (if the body can be seen) can be recognised and acted on. We don’t have any easily accessible examples, but the few we have seen essentially identify the compromised account and provide the account’s password. The initiator of that message might be an automated process, or it might be a human. Regardless, if that message is seen the intent is fairly obvious, and the recipient then acts on it by using (and sometimes modifying, as we’ve noted before) the account however they intended to use it.
So I’m not very impressed that they immediately shut down our SMTP account based on one suspicious message, but I am thankful that our account with them allows us to reactivate it. We reactivated it after disabling the one NinerNet-hosted compromised email account, and none of our clients (except the owner of the compromised account) were affected. Now that we know that, we won’t overreact if or when this happens again.
It’s important to note here that we can actually communicate with SMTP2GO, whose staff are intelligent, thinking humans who understand how email works and understand that providers like NinerNet do the very best that we can to prevent spam from emanating from our network, but that we are not 100% successful 100% of the time. This stands in stark contrast to providers like Outlook/Hotmail/Microsoft/whatever, Gmail, Yahoo, etc., who all act as if they own the planet and are answerable to nobody … including their own customers whom they inconvenience with their arrogant, scornful and contemptuous attitudes. You don’t actually see that yourself 99% of the time if you’re one of their users, of course (it’s all sunshine and light from their marketing departments), but it’s a painfully well-known fact amongst small providers like NinerNet who are shunted out of the way, downtrodden and treated like trash.
The point of this separate post (rather than a second update to the last post) is to highlight the excellent behaviour of SMTP2GO. It’s not often in this business that we send good vibes to a supplier, but SMTP2GO deserve any good vibes they get. If they provide any services that you need we couldn’t recommend them more strongly.
This announcement is not about our suspending a Windows-based email account, but about the fact that an errant Windows-hosted email account belonging to one of our clients temporarily caused the suspension of an account we maintain with a supplier through which a substantial number of our outbound email messages are delivered every day: our “relay server”. We are still waiting for feedback from our supplier, but this happened after business hours in the Pacific time zone and in the middle of the night in the Central Africa time zone.
In the meantime we have un-suspended our account with our supplier, as we are certain that this suspension was in error. We do not expect this to affect any outbound email today, but we alert you to it in case there is a problem.
However this turns out we will update this post with the outcome, hopefully very soon. Thank-you for your patience.
Update, 2025-07-15: The relay server people (SMTP2GO) have not even acknowledged our ticket yet, so our hopes for a quick update have been dashed. However, the relay server is working as it should as of this moment. As we said, we believe this suspension was in error, so we’re not expecting it to happen again.
We’ve had reports that emails to and from our mail server are taking a couple of hours or more to get through. We’ve been on the mail server and there is no apparent reason for this — other than the huge amounts of spam the mail server processes every day — but we have rebooted it twice now and we’ve initiated a flush of the mail queue. These actions seem to have finally flushed the mail queue, and all delayed messages have been delivered, both locally and to foreign servers.
There’s never a dull day around here when dealing with email.
As predicted immediately after the June mail server issue (that started on 11 June UTC), problems continued and new problems cropped up, delaying this post-mortem. There were two primary results: first, a relatively simple issue on the mail server that would normally have been addressed before anyone even noticed it was not addressed when it should have been; second, our invoices that should have been sent on 15 June have still not gone out in early July! (It’s not unusual for our invoices to go out a couple of days late, but over half a month late is extreme.)
The primary issue on the mail server was that the disk drive that stores our clients’ email was about to fill up. This is a relatively routine occurrence that is addressed with the data centre and on the server in literally minutes in a two-step process: We buy new disk space in our data centre control panel, and then we configure the mail server to use that disk space.
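For illustration only, the kind of check that flags a filling disk before it becomes a problem might look like the sketch below. This is not our actual monitoring: the function names, the path and the 85% threshold are all assumptions made for the example.

```python
import shutil


def disk_usage_fraction(path="."):
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def needs_more_space(used_fraction, threshold=0.85):
    """Return True once usage reaches the (hypothetical) warning threshold."""
    return used_fraction >= threshold


if __name__ == "__main__":
    # "." is a stand-in; a real check would point at the mail storage volume.
    fraction = disk_usage_fraction(".")
    if needs_more_space(fraction):
        print(f"Warning: disk is {fraction:.0%} full; time to buy and configure more space.")
    else:
        print(f"Disk is {fraction:.0%} full; no action needed.")
```

Run on a schedule, a check like this gives enough warning to carry out the two-step process above well before the drive actually fills.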
Concurrent with the mail server issue, following fairly routine maintenance on my desktop machine, I could no longer log into it. This was traced to a configuration choice I had made in the maintenance that resulted in the main drive on my machine filling up; instead of the free space on the drive being overwritten with 1’s and 0’s or random data and being classed as free space, it was overwritten with data that looked like real data that could not be overwritten, deleted or re-classed as empty free space. The result was that the installation drive was full and I could not log in. This denied me access to data on my computer, namely a key that I need to log into the mail server to complete the second step listed in the previous paragraph. Very early on 13 June (UTC) I was able to access the encrypted drive on which the key was stored, log into the mail server and reconfigure it to use the additional space, and the problem on the mail server was fully resolved literally seconds later.
While that addressed the problem on the mail server, that, however, was the last time I was able to access the encrypted drive.
While I had access to the encrypted drive I stupidly saved files to it that I had been saving on flash drives. This wasn’t really “stupid”; it was a completely reasonable decision as the hard drive has far more space than the growing collection of flash drives I was using temporarily, and I had access to the encrypted drive and didn’t expect to lose access. As it turns out, since I no longer have access to the encrypted hard drive and the files that I saved to it were not included in a daily back-up that is run when I log into the machine, they now seem to be lost forever. Those files, while important at the time, do not include any vital business files.
A little earlier than planned I started the replacement of the now-just-outdated operating system on my work machine. For reasons I still can’t explain, the new operating system was so incredibly slow that I could make a sandwich and a cup of tea between clicks. (That’s a lot of sandwiches in an eight- to sixteen-hour work day!) Several days were spent troubleshooting that issue when it suddenly, for absolutely no reason and without any action on my part, started working properly. The next priority was, since I could no longer access the encrypted drive, recovering backed-up files from the most recent daily back-up. I started with recovering vital business files and was able to immediately contact delinquent clients who apparently don’t pay their invoices until they receive a reminder. Then I started restoring all of the remaining files, meaning I could move forward with June’s invoicing. However, the restoration failed part way through, so I have had to give up and start our billing before the middle of July comes!
Our invoices will be dated 30 June 2025, which I realise is a bit disingenuous, but it keeps them dated in June. The more important dates for invoices are the dates on which your services expire; you can pay your invoice as late as you want (keeping in mind what we have said often in the recent past about waiting until the last minute), but you just need to pay it before your service, domain or certificate expires.
As always, we do sincerely apologise for the disruption that has been caused. What we have learned from this are the following:
The third item is already in progress, as we make a second attempt to recover our backed-up data; the fourth will have to happen over an extended period in the future with no goal date and no guarantee of success, but in the meantime the data we recover from our back-ups — that are intact and in place — will be saved in unencrypted form. (Technically this goes against the first point in the “data storage and transmission” section of our privacy policy, but if we cannot access our data, there’s no point in it being encrypted!) The first item will be implemented as part of getting our daily back-ups up and running again, and the second will be implemented where it can be at our earliest convenience.
Thank-you again for noting this information about the steps that we are taking to ensure that we learn from our experiences where our existing systems have failed. Please advise if you have any questions or suggestions.
Update, 2025-07-09: Contrary to what was stated above about June invoices, we have decided not to send invoices in June (if that wasn’t blatantly obvious, now that it is July) and we’ll be sending June’s and July’s invoices in July. Please see our post about this on our corporate blog, especially if you have any products or services scheduled to expire soon in July. Thank-you.
The issue on our primary mail server has finally been resolved, and all messages in the queue have been delivered. As expected, once we had access it only took a few seconds.
We will post a post-mortem in the next couple of days … hopefully. I can’t exaggerate the extent to which numerous unrelated events have piled on top of one another — even in the last few minutes! — to prevent an earlier resolution of this problem, and at this point I can’t predict whether or not more issues will prevent the posting of the post-mortem. However, I’m finally taking a breath, as this issue (amongst other things) is finally resolved.
I do once again extend my heartfelt apology for this incident, and I will do everything in my power to review the cascading failures — all not even related to the mail server itself! — that led to this not being resolved much, much sooner.
Words cannot express my frustration at this point. 🙁
It will be another few hours again before this situation can be resolved. It just cannot go beyond tonight, UTC. By that time my computer will be completely reset with a fully updated operating system installed.
Sorry.
Let me explain the situation we’re in. It’s an illustration of the fact that sometimes too much is, in fact, too much.
My primary workstation stopped working late Wednesday afternoon (UTC). It stopped working because I could not log in after performing a maintenance/security operation that I routinely run, but I ran it in a certain way that was slightly different to how I usually run it with no problems.
At about the same time I received a report from a client about a problem with the mail server. I received it by email (of course) which I read on my phone. I hadn’t seen anything similar before, so I asked him for screenshots. In the meantime I had an idea of what the cause of the problem could be based on monitoring I had done the day before, but without access to my workstation I could not log in and check and fix the problem … which would (and will) take all of about 60 seconds if I am correct. Reports and my experience since have almost confirmed my suspicions.
So, given that it is the middle of the night where I am, I cannot do anything until business hours, which will be about 06:00 local, 13:00 UTC.
My local workstation is, of course, fully backed up, so it’s not a problem of a loss of data. The “problem” is with the additional security on logging into the server which we have purposely put into place in order to protect our infrastructure and your email. Because of that I cannot log into the mail server from the machine I am currently using, and will only have access to the resources I require in the morning, local time.
I cannot apologise enough for this situation that we have caused. We will calculate a credit that will be applied to all invoices of clients who host their email with us.
In the meantime, we apologise but this issue will continue until about 13:00 UTC. At that time I should have access to the server to fully and permanently address the problem. I will post an update here, on the status blog, when this issue is resolved. My humble and sincere apologies once again.
Systems at a Glance:
System | Status |
---|---|
NC023 | Internal |
NC028 | Internal |
NC031 | Internal |
NC033 | Operational |
NC034 | Internal |
NC035 | Operational |
NC036 | Operational |
NC040 | Internal |
NC041 | Operational |
NC042 | Operational |