
  • Clouds got a bit dark this week

    It’s not been a good month for hyperscalers, or beds!

    Two recent cloud outages got me thinking: Microsoft’s global Front Door outage, which impacted not just customer services but Microsoft themselves, and this week’s AWS outage in US East 1, where a simple DNS issue in one service cascaded to take down a huge number of services in the region, even ones not directly related to it – including, most critically, my own bed…


    Yes, I’m one of those with the now somewhat famous smart mattress topper that has made itself stupidly cloud-centric. I don’t even have the ‘AI-powered insights’ subscription; I use it in dumb mode, where all I do is turn it on at night, set the temperature, and turn it off in the morning. I can’t even set a timer without the cloud! And the other morning I couldn’t turn it off at all without pulling the plug…

    When designing highly resilient services, looking at potential points of failure and understanding their impact is crucial. I’ve spoken in the past about how on the Olympics we worked to a 5x 9s SLA, building in layer upon layer of redundancy, removing single points of failure, ensuring contingency at every level, and then testing every single one of them – repeatedly. The same should apply when designing your IT services, with the scale of that effort depending on the criticality of your solutions, and of course the impact on your brand.
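    As a rough illustration of why those layers matter, here is a minimal sketch (in Python, with illustrative availability figures – not the Olympics numbers) of how redundancy changes the composite availability of a chain of components:

    ```python
    # Rough availability arithmetic: why layered redundancy matters.
    # The figures below are illustrative assumptions, not real SLA numbers.

    def series(*availabilities):
        """A chain of dependencies: every component must be up."""
        total = 1.0
        for a in availabilities:
            total *= a
        return total

    def parallel(a, n):
        """n independent redundant copies: at least one must be up."""
        return 1.0 - (1.0 - a) ** n

    chain = series(0.999, 0.999, 0.999)                           # three single instances in a row
    redundant = series(*(parallel(0.999, 2) for _ in range(3)))   # the same chain, as redundant pairs

    print(f"chain of single instances: {chain:.6f}")      # ~0.997002
    print(f"chain of redundant pairs : {redundant:.6f}")  # ~0.999997
    ```

    Three components at three nines each give roughly 99.7% in series; doubling each one up takes the same chain past five nines – that’s the arithmetic behind all those layers.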

    From multi-AZ to multi-region to multi-cloud, there are levels of resilience that can be built into cloud environments to mitigate the risk of outages. And let’s not forget your own data centres – hybrid approaches can also help in such scenarios, for example distributing critical systems so that the cloud is used for scalability while the core remains on-prem (though of course outages in your own data centre are also a thing, not just the realm of the hyperscalers).

    The Front Door outage the other week was global, so no matter how well engineered your solution was, it was going down if you relied on Front Door. But a multi-CDN approach, where your Application Gateways are fronted by more than one CDN, would have stayed up. Increased cost and complexity, versus a higher SLA.
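    Very roughly, the steering logic could look something like this (a sketch with hypothetical endpoints and health-check URLs, not any particular provider’s API): probe each CDN’s edge and publish whichever one is healthy, for example via a low-TTL DNS record or a traffic manager.

    ```python
    # Minimal multi-CDN steering sketch: probe each CDN edge endpoint and decide
    # which one client traffic should point at. Endpoints are hypothetical.
    import urllib.request

    CDN_ENDPOINTS = [
        "https://app.cdn-primary.example.com/healthz",
        "https://app.cdn-secondary.example.com/healthz",
    ]

    def healthy(url, timeout=2.0):
        """Treat any 2xx response within the timeout as healthy."""
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return 200 <= resp.status < 300
        except Exception:
            return False

    def pick_active_cdn():
        """Return the first healthy endpoint, preferring the primary."""
        for endpoint in CDN_ENDPOINTS:
            if healthy(endpoint):
                return endpoint
        return None  # both down: apply your own fail-open/fail-closed policy

    print("active CDN:", pick_active_cdn())
    ```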

    The Amazon outage, which also impacted the AWS control plane, made it hard for people to understand the full impact and decide whether a DR failover was necessary (or indeed whether they were even able to trigger one). And for clients in active multi-region set-ups, it certainly didn’t all work as planned – for some of them at least!

    But for my bed? Is it a mission-critical system? Definitely not (well… perspective is everything!). One could argue the brand damage here could be quite major (saying that, a lot of people are now talking about it, and perhaps are curious…). From a design perspective, though, I think the solution they have built simply makes no sense whatsoever – the unit in my bedroom has, by all accounts, a quad-core ARM processor… yet all it does is connect back to AWS US East 1 for instructions. Indeed, for users who do have the subscription, it apparently sends a full 16GB of data a day!! (Anyone in the Edge AI/ML world would find it quite startling that zero logic is applied at the edge here, when all the device is doing is collating sample points from sensors.)

    I can see no good reason why certain functions can’t simply be processed locally, like turning it on or off – sure, I can also turn it on from the other side of the world… but I’m not sure that’s useful! Limiting functions like a timer is a business decision, but technically it’s also odd. These decisions don’t just inflate costs for the company – they are processing far more data in the cloud than they need to, when edge processing could do plenty of that lifting – and we’ve now seen the impact of an outage on an architecture that, perhaps rightly so, is not highly redundant and multi-region active-active.
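    Purely as an illustration (a hypothetical local-first controller, with no relation to how the actual product works), on/off and a timer could be handled on the unit itself, with cloud sync treated as best-effort:

    ```python
    # Illustrative local-first control loop for a hypothetical smart-bed unit:
    # on/off and the timer run on the device; cloud sync is best-effort.
    import time

    class LocalController:
        def __init__(self):
            self.heating_on = False
            self.off_at = None            # epoch seconds for a local timer
            self.pending_events = []      # queued for cloud sync when reachable

        def turn_on(self, target_temp_c, hours=None):
            self.heating_on = True
            self.off_at = time.time() + hours * 3600 if hours else None
            self.pending_events.append(("on", target_temp_c))

        def turn_off(self):
            self.heating_on = False
            self.off_at = None
            self.pending_events.append(("off", None))

        def tick(self, cloud_reachable):
            """Run once per loop: enforce the local timer, then try to sync."""
            if self.heating_on and self.off_at and time.time() >= self.off_at:
                self.turn_off()
            if cloud_reachable and self.pending_events:
                # push_to_cloud(self.pending_events)  # hypothetical uplink call
                self.pending_events.clear()
    ```

    The point being: an outage in US East 1 would delay the sync of usage data, not your ability to turn the thing off.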

    So… my point? When designing a cloud service, you need to think hard about ‘what would happen if…’ and balance the risks and costs against the very rare chance of a major outage. Yes, it’s been a bad month; major outages are very rare, but they do happen. Being prepared for when they do is crucial, and that needs to start with a fundamental understanding of your business, its applications and users, and the impact on them. A distributed cloud solution can keep your edge working in an outage (and optimise data flows with edge compute!), a resilient hybrid/multi-cloud solution can reduce the impact on your critical services, and it might not be as hard to achieve as you think!

  • So what’s all this about Sovereignty?

    Sovereign Cloud is certainly not a new topic, but one that in recent months has made a lot more noise than usual, especially in Europe.

    I’ve worked in infra and cloud for many years, and sovereign cloud itself isn’t anything that new – indeed with the rise of the hyperscalers in the 2010s, the topic of ‘who has my data/compute’ has always been there, as has the question of sovereignty.

    So let’s start with the ‘simple’ question – what does sovereign mean (to you)? Because in my experience over the last 10 years or so of companies wanting sovereign cloud, it means something different to everyone.

    • No one wants to give up control of their data to someone else, so I need sovereign cloud? Not really!!
    • I want to ensure that my data stays in my country, therefore I need sovereign cloud? Also not really (a bit more nuanced perhaps).
    • I don’t want anyone outside of Europe knowing anything about my estate? The hyperscalers are working to address this one now too.
    • I simply don’t want a non-European company having anything to do with my cloud? That’s a tricky one for the hyperscalers, but they are making moves to address it now, entering the realm traditionally served by companies like OVH.

    All of these are ‘starting points’ a company might have, and as you drill down, some of them have more grounding than others as to why a business thinks it needs sovereign cloud. It’s no secret that I am mainly a Microsoft guy, so this post will be quite Microsoft-centric, although I have worked extensively with AWS, GCP and Alibaba Cloud in my time – and the story isn’t that different for any of the hyperscalers really.

    The next point I want to look at is: if a company is so concerned about sovereignty, why are they looking at public cloud in the first place? Historically, private clouds haven’t offered the breadth of services (especially in areas like AI capabilities), certainly can’t offer the same scalability (well, you can keep buying more hardware, but that takes time), and of course tend to come with significant upfront costs. All of these things are changing, and a recent Gartner study shows that a decent proportion of CIOs are now investing in private cloud, bucking a trend of decreasing investment over the previous years. Private cloud has come a long way from when it started – we just called it virtualisation back then, implementing clusters of ESX 3 or Windows Server 2012 + Hyper-V, consolidating physical infrastructure and trying to get the most out of your servers. I recall one of my first VMware projects, taking racks of servers down to a single blade centre and thinking ‘how cool is this’. I digress though!

    So companies are looking at private cloud, and offerings like Azure Local (and Azure Stack before it), AWS Outposts and Google Distributed Cloud are all trying to let the hyperscalers play in this market (although mostly coming from a hybrid perspective), while VMware has positioned itself in the same space rather than just being seen as a virtualisation platform. I’ve seen a huge pick-up in interest in Azure Local, for example, both as a hybrid solution and as a disconnected solution where a customer wants to use the Azure API they already know, and take advantage of the scale of Microsoft, while remaining entirely disconnected from public Azure.

    But let’s go back to the question – what does someone really mean when they say sovereign? In my experience, with a few guardrails, public cloud actually does satisfy most companies’ needs – at least up till now. The primary concern was always around data residency, and who can have access to your data. But that is a problem long since solved, with guardrails and encryption to a level that would easily satisfy 90%+ of customers. The support for ‘bring your own key’, and more recently ‘bring your own HSM’, has further strengthened that by ensuring you can easily render any data useless as well. Despite the recent noise, Microsoft Cloud for Sovereignty, for example, has existed for years, mostly as a set of policies. Of course, the further down this rabbit hole you go, the more expensive things get!
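    As a toy illustration of that ‘render the data useless’ idea (a generic envelope-encryption sketch using the Python cryptography library, not any particular cloud’s key-management API): data is encrypted with a per-object data key, the data key is wrapped by a key the customer controls, and withholding or destroying that customer key makes everything stored cloud-side unreadable.

    ```python
    # Generic envelope-encryption sketch: the cloud stores only ciphertext and a
    # wrapped data key; the customer-held key (the 'BYOK' part) stays under the
    # customer's control, so revoking it renders the stored data useless.
    # Requires: pip install cryptography
    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    # Key the customer keeps in their own HSM / key vault (illustrative).
    customer_key = AESGCM.generate_key(bit_length=256)

    def encrypt_for_cloud(plaintext):
        data_key = AESGCM.generate_key(bit_length=256)   # per-object data key
        nonce_d, nonce_k = os.urandom(12), os.urandom(12)
        ciphertext = AESGCM(data_key).encrypt(nonce_d, plaintext, None)
        wrapped_key = AESGCM(customer_key).encrypt(nonce_k, data_key, None)
        # Only these values ever need to be stored cloud-side:
        return {"ciphertext": ciphertext, "nonce_d": nonce_d,
                "wrapped_key": wrapped_key, "nonce_k": nonce_k}

    def decrypt_from_cloud(blob):
        # Impossible without customer_key, whatever the provider holds.
        data_key = AESGCM(customer_key).decrypt(blob["nonce_k"], blob["wrapped_key"], None)
        return AESGCM(data_key).decrypt(blob["nonce_d"], blob["ciphertext"], None)

    blob = encrypt_for_cloud(b"commercially sensitive record")
    assert decrypt_from_cloud(blob) == b"commercially sensitive record"
    ```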

    As we look forward, the desire for more European-centric solutions certainly changes the field, with concerns raised that a foreign court could order a company to hand over a European company’s information, or at least to shut its service down if that isn’t possible (and indeed in a well-implemented public sovereign cloud, the hyperscalers can never access your data). That is why all of the major hyperscalers have made announcements in the last few months around how they are going to be ‘more European’ in some way or another. In the Microsoft world, that is the new Data Guardian solution, which ensures a European Sovereign Cloud customer will be exclusively managed and supported in Europe, along with support for your own hardware HSMs. Then they have gone a step further in France and Germany, letting local companies run a subset version of Azure Cloud (like Azure China, for example, or for those of us who were there ten years ago, the original Azure Germany instance!). These offerings aim to be direct competitors to companies like OVH (once they get all their government certifications), but try to offer ‘more’ – an API-compatible cloud that can co-exist with Azure and offer the broader catalogue of Azure services. The question is, would that be enough to tempt someone over who was convinced they needed OVH in the first place?

    For sure we are going to see growth in sovereign cloud demand moving forward, as we enter a new era of trust. I touched on AI being a major reason for cloud, with the costs of entry prohibitive otherwise; what I hope this drive towards more European sovereignty doesn’t lead to is a ‘two-tier’ cloud, where the major hyperscalers offer ‘less’ in their sovereign clouds than in their non-sovereign ones. I don’t see this happening myself, and of course it also represents an opportunity for the smaller European players to become more significant – till now their enterprise use cases have always been niche, fitting more with SMEs who couldn’t invest in their own private cloud.

  • From the archives… It’s getting Cloudy – part 3/3 of my look back

    Originally posted 16th June 2022

    Moving back to the UK and starting in Cloud Engineering was another big change for me – up till this point I hadn’t worked in a formal Agile team, taking a more classic project approach, and I hadn’t done a huge amount on public Azure cloud either! What I did have was a track record of building and growing highly successful teams and delivering complex infrastructure projects.

    Coming into Cloud Services gave me the chance to learn from my team and develop my skills in Agile and Scrum, while helping to bring fresh eyes and knowledge. A key part of all of my journey has been the opportunity to surround myself with great people and learn from them, while hopefully imparting some knowledge back! Here I had another highly multinational team spanning a good portion of the globe to work with, along with the challenges and possibilities of a fully remote team. I guess this gave me a good head start for what would come in 2020, having had 4 years of experience of working from home and managing a team spread across multiple countries and time zones who didn’t get to meet face to face very often!

    It was now that I finally fully immersed myself in cloud, developing my skills from on-prem infra and virtualisation into cloud, as well as containerisation and Kubernetes. Getting to work across a multitude of customers with different requirements, as well as hugely differing views of cloud and what it meant to them, helped hone my own skills in how to view and shape cloud strategy for customers. Here we really embraced the Infrastructure as Code, data-driven approach, building on the knowledge the team brought together from their own previous projects and roles to take it to a new level of automation and repeatability, allowing for rapid delivery of complex, secure cloud environments.

    Over my time in the team my job title changed a few times as I took on wider scopes of responsibility, including working with AWS and GCP. I had the opportunity to engage with a wide variety of customers across multiple industries, learning the challenges each sees with the cloud, and perhaps most importantly the problems they needed solving that cloud might be able to help with. It can be far too easy to look at what we can do with the technology available and lose track of the actual problems we needed to solve in the first place. Starting without clearly defined criteria for the real needs and the problem to be solved can easily result in a solution that just keeps growing in scope and complexity and in the end doesn’t solve the initial problem.

    Taking those skills, I moved to our CTO team, as part of the Manufacturing Industry CTO. Here I got to look at a new industry for me, and the industry- (as well as customer-) specific challenges around Cloud, Edge and IoT. It’s here I started to delve into Digital Platforms and how to take the next step in a digital transformation journey, embracing data and APIs at the heart.

    And that brings me to my final role in Atos – in 2021 I took the opportunity to move back to Major Events to head up their Cloud, Infrastructure and DevOps team. This period of my career lasted just over a year, but I hope in that year I managed to make a big impact, bringing back the knowledge and ideas I had gained. I’ve said this for each part of my journey, but in each case it is true – yet again I had a great team to work with, share ideas with and learn from. It also gave me the chance to work on two further Olympic deliveries – Tokyo 2020 and Beijing 2022.

    For me it was great to see the evolution since Rio 2016 and the continuation of what was started there, with the move not just to cloud, but also the shift in the applications to microservices and containerisation. Beijing also gave me the opportunity to work with a new cloud provider to me, Alibaba Cloud. The data driven, IaC approach we had adopted made it easy to consume cloud, whether it be Alibaba or anyone else, ensuring a secure and standardised approach to deployment in a multi-cloud environment.

    So that’s a quick look at my path through Atos, and some of the technological evolution I’ve seen over that time. As I’ve kept this focused more on the changing technologies, there are a few areas I didn’t touch on, such as my tenures in the Scientific Community and Expert Communities. These bodies gave me the opportunity to work with some of the most brilliant people in Atos, to provide my own thought leadership to our strategy, and to contribute to white papers; another highly formative experience for me, and yet again really based around the people.

    But this brings to an end my 17-year journey in Atos: from configuring a Cisco 7200 router via a console cable to the automated deployment of complex multi-cloud environments. I am hugely grateful to each and every person I have got to work with and learn from in that time, and I hope I have left a bit of a legacy here and there!

  • From the archives… My Olympic Journey – part 2 of my look back

    Originally posted 11 June 2022

    So, as we left off in the last post, I got the chance to start working on the London 2012 Olympics. I packed my bags and headed down to London in September 2009, starting as the first network architect on the project, a role that eventually grew to running a team of 8 direct reports and indirectly looking after tens of venue engineers. I never saw myself as a manager, and it wasn’t something I directly aspired to – however, it also seemed I had a good knack for building fantastic teams and getting the best out of them. The ‘Olympic Experience’ was a hugely formative one for me, working in a critical environment where delays and downtime were not an option. Yet again I was surrounded by hugely talented people I got to learn and grow from, becoming more involved in general infrastructure beyond just networking, and also helping to bring x86 virtualisation to an Olympics for the first time.

    I think one of the key things for me was how the whole team would work together, from the application developers, infrastructure team, venue IT teams, results teams and more – a whole team working as one to deliver. DevOps was a nascent term in 2010, but perhaps something we were doing before it really took off and became more formalised as it is today.

    Working closely with the telecoms and equipment partners BT, Cisco and Acer, my team built and managed a network spanning something like 50 venues and two core data centres consisting of over one thousand network devices, securely supporting thousands of clients and hundreds of servers to deliver the Games.

    This resulted in me being offered the opportunity to take my knowledge to Brazil and the Rio 2016 Games to design, build and run the IT Infrastructure. Again, this might have been a bit before the term ‘private cloud’ took off, but that I guess is what we ended up building! Working in a brand-new country, building a local team, learning the culture was a fantastic experience, with the added bonus of making a set of great friends who have now spread out across the world, making their own amazing careers.

    Building on top of my previous experiences, we delivered a hugely successful infrastructure project: reducing the data centre server count by something like 80% while improving the speed of delivery of new servers through automating end-to-end deployments using a data-driven approach. It was also my first taste of PaaS, looking at how we could take our application stack and run it on a PaaS instead of spinning up VMs and middleware per application; an experience that would be a key part of the next phase of my career and set the ground for the move into containerisation and container orchestration. Also worth mentioning is the continual evolution of security – each Games presents new challenges in how we secure it; I think somewhere out there on the net is a video interview I did discussing the growing security threats and how we have to continue to evolve to deal with them.

    Then there was Rio itself! Living in Rio was very influential on me for sure (not least because that’s where I met the person who would become my husband!). Although I had worked with multi-national teams before, being fully immersed in another culture was very much a key step for me, and the Olympics really does take multi-national teams to a new level – counting on my hands now, I think I had the opportunity to work with people from nearly every continent (Antarctica was missing!) and I’ve lost track of the number of countries. This was a huge opportunity to learn how different cultures work, how successful teams are built and come together, and how to get the very best out of the team.

    As Rio came to an end, I faced a decision of continuing with the Olympics and moving to Tokyo, or returning to the UK. This in the end wasn’t that difficult to decide for a multitude of reasons, but the decision was to move back to London and take up a position in the Cloud Services division, heading up a newly formed Microsoft Cloud Engineering team. But that’s for the next post!

  • From the archives… A look back over the last 17 years – part 1 – starting as a network engineer

    Originally posted 2 June 2022

    In my last post I mentioned I’d put together a little retrospective of my 17 years in IT, so here is part one!

    Back in 2005, I started work for a company called, at the time, Atos Origin. I had applied to the Atos Origin grad scheme in 2004, shortly after I finished uni. After a multi-stage process I was successful in getting a place, starting in September 2005. This gave me a bit of a gap, which I used to put some of my maths knowledge to work doing private tuition for a year before starting (something that gave me huge respect for teachers!).


    I started my Atos career in networking, working for the network project team in Birmingham. I recall one of my first tasks being given a Cisco 7200 router to build, at the time knowing next to nothing about what a Cisco 7200 was! Thankfully, I had a great team around me to teach me the ropes. This router ended up being part of a shared internet service that, as I recall, terminated a 155Mbps ATM link. And as one does with a new toy, we tried to do some speed testing on the link before hooking it up, although we couldn’t get close to maxing it out from our laptops, which only had 100Mbps ethernet ports! And now I am posting this from a 1Gbps symmetric home internet connection…

    The first few years of my career were spent in this team, mostly working on what was a cutting-edge shared networking platform, while also having the odd stint crawling under data centre floors laying cables too! The concept of shared resources is of course the bedrock of today’s cloud, but in 2005 it was still quite novel. I had the opportunity to work across a great range of customers, both on shared platforms and dedicated builds; but more critically, the opportunity to learn from some phenomenally talented people, working on cutting-edge hardware and highly sophisticated, complex architectures. There was also the chance to work with some very legacy kit even back in 2005; I have a clear memory of plugging a device into a Token Ring MAU with the warning ‘if it clicks rapidly don’t panic’!

    I think it was early 2008 when I first got involved with what today would be termed private cloud: a project to virtualise an EOL physical estate onto the newly released VMware ESX 3.5. My role started with the networking, but it also gave me the chance to take my VCP exam. This was my first ‘taste’ of enterprise virtualisation and helped set the roadmap for the rest of my career; seeing the potential of taking tens of physical servers down to a single blade, consolidating racks of devices into a single blade chassis, while improving performance. It also didn’t go unnoticed that it wasn’t just about reducing server counts for their own sake, but also about the carbon footprint – another element that is of course hugely important and increasingly visible and key in decision making.

    In 2009, the opportunity to work on the London 2012 Olympics came up, so I packed my bags, headed down to London, and started the next phase of my journey – one that would end up taking me halfway around the world. But that’s for the next post!

  • From the archives… Microsoft Future Decoded

    Originally posted 31 October 2017

    It’s not far off two years since I wrote a blog post from the Microsoft Cloud Roadshow event in São Paulo. Now I’m a bit closer to home, and more directly involved in Azure in my day-to-day job, heading up Atos’ Managed Azure Cloud engineering and development team. So, what has the best part of two years changed in the Microsoft vision, as seen at this two-day Microsoft event in London, Future Decoded?

    I took a few key takeaways from the keynotes today, and it is still thoroughly exciting to see Microsoft’s vision.

    Day one had a wider set of themes than just Azure Cloud by itself, including a fascinating keynote on Quantum Computing and a comedic Q&A with David Walliams. But we still had some excellent talks on both Cloud and Digital Workplace. The introductory slide below helps to set out that vision, underpinned by the ‘intelligent cloud’.

    A couple of key messages stood out to me in the talk, including one statistic: that within 3 years’ time, 50% of the global workforce will be mobile. An amazing stat really, and Microsoft’s vision is of technology becoming a transparent enabler that allows work to flow between location and device. It’s a world I already live in, working between home, the office, and indeed Microsoft roadshows, using devices interchangeably to pick up work on my laptop, tablet and phone as most convenient and suitable for the occasion. The future vision sees that going steps further, including improvements in Office 365, as well as of course the Surface range.

    When it comes to Microsoft’s enablement of digital transformation and its view of four towers, the first – Empower Employees – is crucial, and links back to the comment above on transparency.

    All this work should, in the end, be about empowering employees. I would say the other three points – engaging customers, optimising operations and transforming products – all fall out of that first step. Back in my previous role working on the Olympics, a key message that would always be repeated was the transparency of technology – generally people don’t care about the complex and substantial IT systems behind watching a competition, as long as they work! IT only becomes visible if it fails. In the digital workplace we are heading towards, in some areas faster and closer than others, technology should take a similar viewpoint. The convenience and simplicity we are often used to at home is coming to the workplace, from enabling geographically diverse teams to communicate and collaborate better, to an expenses system that is as simple and convenient to use as many modern apps, to, as I’ve been doing today, the ability to work seamlessly on a document across my phone, tablet and now laptop to post online. Modern workplace and business applications form two of the four pillars Microsoft discussed.

    They were joined by Applications and Infrastructure, the backbone of Azure Cloud, and Data and AI. These are transformed by the move from virtualisation to PaaS services (specifically mentioned as microservices and containers in the slide below), as well as the convergence of disparate data into a connected data estate. A key point here was the commonly held view of data as a currency.

    The convergence of disparate data is a fairly standard message, but the capabilities of Azure, and Microsoft’s view of AI as an enabler, represent a significant next step in Big Data analytics.

    I had a conversation earlier about the growth of AI in our day-to-day lives over the last few years, whether it be interacting with Alexa, Cortana or Siri, or the ever-improving translations online. It has become such a natural part of daily life, without us really realising that we are living with tools that only twenty years ago were more the realm of Star Trek than our living rooms. Who knows what the future can hold!

    And one last thing to mention, Project Emma. A genuinely amazing and emotional piece of work.

    Tomorrow Future Decoded continues, looking more specifically on Azure.

  • From the archives… Microsoft Cloud Roadshow São Paulo

    Originally posted 24 January 2016

    On Tuesday and Wednesday last week I spent two days at Microsoft’s 2016 Cloud Roadshow.

    Through a series of sessions, Microsoft shared their vision of Cloud and Windows Server 2016 (as well as Windows 10, Office 365 etc). And I must say I left with a fantastic impression of the latest improvements, either live today or coming soon to Azure, and a real desire to start playing with Server 2016.

    I’d like to share some of my thoughts from some of the sessions (and if you happen to follow me on Twitter, you will have had an almost live blog of some of them!)

    For those in Europe, the roadshow will be coming to London on the 29th February, and more details can be found online: https://www.microsoftcloudroadshow.com/cities

    Azure:

    • Azure is running from over 100 data centers worldwide, supported by almost 2 million servers
    • 20% of Azure’s VMs are Linux
    • An Azure data center has a minimum of 30 racks of storage
    • 1,400,000 SQL databases running in Azure
    • 425,000,000 AD users in Azure

    Microsoft Azure Stack:

    Coming with Server 2016 is ‘Azure Stack’. Instead of just Hyper-V and SCVMM, it will be possible to run the same software stack that runs Azure. This will bring the same web-based management tools that you use on Azure to your own private cloud, as well as what looks to be very seamless integration for a Hybrid Cloud environment.

    Microsoft Operations Management Suite:

    The new OMS web-based management tool definitely looks to me to have aspirations to replace your SIEM – it pulls in event data from all Windows servers as well as firewalls, and can present a view of all potential performance, operations and security issues.

    Microsoft Advanced Threat Analytics:

    Another security tool, focused specifically on combining data from Windows servers (via your existing SIEM tool) and providing out-of-the-box correlation of Windows events, looking for suspicious behaviour. In a demo shown at the event it detected a potential hack via stolen credentials and unauthorised escalation of privileges. It can then feed this data back into your SIEM – effectively providing a depth of intelligence behind your event logs beyond what your average SIEM can do.

    Advancements in Active Directory:

    AD (at least Azure AD for now) will be able to provide built-in two-factor authentication, in theory making it easier to bring two-factor authentication to your own applications, directly via AD.

    ‘JEA’ – Just Enough Administration. Delivered through PowerShell only, it is a way to provide highly granular, restricted administration rights to users. MS presented a clear vision of not having people log in to servers at all, with only remote management. Right now JEA applies just to PowerShell, but it could scale out to graphical tools in the future. Linked with this, but with wider scope, is Just in Time rights: a web portal allows users to request limited admin rights to a server, with an approval workflow that grants the rights for a specific time-frame only.

    Server 2016:

    A big topic! There’s a lot of new stuff coming, but here are some of my highlights:

    Nano Server: Nano Server is a new, highly cut-down version of Windows Server that can run in a footprint of around 400MB. In internal MS testing, this smaller surface has resulted in far fewer patches being required, and considerably fewer reboots.

    Nano Server has limited scope, and isn’t designed to run all your legacy apps. However, it does partner very well with Microsoft’s new Containers, or with running infrastructure features such as Hyper-V, DNS servers and, hopefully by release time, AD servers.

    It is 100% remote management only, via PowerShell, Server Manager or, most interestingly, the new Remote Server Management Tool (currently in internal release only at MS). Indeed, locally there is only a basic diagnostic console for checking things like configured IP addresses, in an almost Linux-appliance-like shell.

    Containers: Microsoft does containers now! Running in 2016 TP4, MS Containers are supported on all versions from Nano to Full (although really targeting Nano Server as the ideal platform). Docker is supported for management and deployment as well. There is also something coming to Azure (currently in technical preview too) that to me looked like ‘Containers as a Service’ – in Azure you can deploy a fixed set of servers to run your containers, currently based on Ubuntu Linux but with Windows Server 2016 planned. It deploys a stepping-stone server, three servers running management tools, and a dynamic number of container host VMs.

    Micro-Services: linked to the clear MS vision here with Nano Server and Containers is a drive towards micro-services. MS has a new management platform/framework, Azure Service Fabric, for managing and deploying micro-service-based applications. They have some great plans here, including zero-downtime rolling updates for stateless and stateful micro-service applications.

    RSMT: a new web-based management tool that can manage all your servers, from Azure to internal, Nano Server to full GUI versions. Through the tool, users can access all the normal features they would expect: opening a PowerShell window, Services, Cluster Manager, AD etc.

    Software Defined Networking: this is an area where it looks like a lot of improvements have been made to the network stack, but I would say it still has a way to go – it didn’t strike me as being as mature as NSX, for instance, and even now it still doesn’t provide the functionality we have in our own environment with the Cisco Nexus 1000v virtual distributed switch add-in for 2012 R2.

    Azure Load Balancing is now available in 2016, although this is still limited to L3 load balancing and doesn’t offer any L7 or even L4 features. Interestingly, MS themselves are using the A10 virtual load balancer for all Microsoft Live ID authentications.

    The new software-defined firewall is a big step forward though, and although MS say it won’t replace a hardware edge firewall altogether for big enterprises, I can see it actually doing so for some internal private clouds – a comprehensive set of features was presented, along with some impressive performance figures.

    VM Security: a few new features are introduced, including Shielded Virtual Machines. This is mainly applicable in a multi-tenant environment, and allows a tenant admin to prevent the overall environment administrator from accessing the VMs, or the data on them. It utilises a new Azure Key Store for key material, BitLocker for encrypted disks, and a new host feature to allow VMs to be restricted to specific hosts only. This should be supported in Azure, Azure Stack and Hyper-V in Server 2016.

    Software Defined Storage: some notable improvements here – some features were already appearing in 2012 R2, but are far more mature in 2016. Storage Spaces already exists in 2012 R2, but has a lot of improvements in Azure Stack. It allows you to use local disks in each server to build your CSVs and distribute your VMs across them, removing the need for a SAN or NAS. And the performance figures shown are very impressive – a 4-node cluster, with each node having 2 SSDs and 4 HDDs, was returning 650,000 IOPS, split 70/30 between reads and writes. 2016 brings improved Storage QoS as well.

    There are also improvements in SMB security, deduplication and ReFS performance (I haven’t used ReFS in 2012 R2, but now I’m definitely tempted, given some drastic performance increases on some operations).

    Storage Replica is another interesting new technology, providing block-level replication of volumes. A demonstration showed the automated recovery of VMs onto a different host, with the underlying VHD volume replicated to a different server via Storage Replica.

    StorSimple is also an interesting product, allowing an easily deployable Hybrid Cloud storage approach: a local appliance with replication to cloud storage. It is based on iSCSI on your local private cloud, with internet or ExpressRoute (the dedicated link to Azure data centres) connectivity to replicate all traffic to the cloud for backup, or indeed for a full cloud-based DR environment. They demonstrated some compelling price comparisons based on a 60TB array. The devices themselves are made by Seagate.

    I’ve spent my weekend playing with some VMs, getting a Nano Server working and generally having fun being a bit geeky! And I’m left with a very positive impression, especially considering this is still only TP4, with months to go before the production release!

  • From the archives… Securing the Olympic Games in Rio de Janeiro

    Originally posted 23rd September 2016 (which also happened to be the day I left Rio after living there for 4 years!)

    One of the most ever-present concerns, not just during the Olympic Games but throughout the two years preceding them since our first systems went live for the Volunteer Portal, is ensuring the security of our systems. To make the technology of the Olympics run smoothly, we bring to the table over 25 years of experience of delivery excellence at the Olympic and Paralympic Games, some of the most visible events in the world, but also from doing this for our clients day in, day out.

    The Olympic Games are first and foremost about sport and bringing the world together. Our work should be invisible, silently ticking away behind the scenes, but always ensuring that everyone can keep enjoying the big sporting event. Every day hackers work to come up with new ways to disrupt IT systems, and in return corporations have to keep one step ahead to ensure their systems and data remain secure.

    For Rio we built on that experience to securely deliver the most connected Olympics ever. Our systems processed and delivered more data than at any previous Games, reaching and impacting more people than ever before. The Atos team, in conjunction with our partners, worked tirelessly to ensure that this information was delivered successfully, allowing the world to share in real time in the most connected way yet. 200,000 hours of testing took place, covering thousands of different scenarios, to ensure that when the event started on the 5th August 2016, with the eyes of the world watching, we were ready.

    But it is not just the results that matter. Our Games Management Systems processed 430,000 accreditations, set up effectively in the Rio 2016 partners’ cloud. For the Olympic Games, the pressing IT challenges are to further secure operations, contain costs, and leverage experience and investment across multiple Olympic Games. To meet those challenges, the IOC is committed to continuous improvement and innovation, and to delivering greater benefits from the evolution of technologies and the emergence of new services.

    These accreditation passes not only act as a person’s credentials to access Olympic venues, but also as a visa waiver for entry into the country. Other systems managed the 50,000 volunteers and their working schedules. Security was paramount, and for the first time the accreditation systems were running in the cloud, delivered together with fellow Rio 2016 technology partners.

    Our team of experts, based in the Technical Operations Center in Rio de Janeiro, worked 24/7 throughout the event, keeping a close eye on everything flowing through the network. We anticipated collecting and analysing more data than ever, and ended up analysing over 400 million IT security events during the Rio 2016 Olympic Games and 120 million during the Paralympic Games.

    Using the latest in real-time data analytics, we worked to sort through these millions of IT security events, looking for behaviour that really was suspicious, and filtering them down so that our team of security experts got fed what they needed to see and could make the human call on what really was a risk to our systems and what wasn’t. We brought in data analytics skills to crunch through the vast amounts of data we gathered, drawing out knowledge and patterns to help us keep learning and improving, and stay that one step ahead of the game.
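    Purely as an illustration of that kind of filtering (a toy sketch in Python, with made-up event types and thresholds, nothing like the actual Games tooling), the idea is to collapse a stream of raw events into a short list worth an analyst’s time:

    ```python
    # Toy event-triage sketch: reduce a stream of security events to a handful
    # of aggregated alerts worth a human look. Thresholds are illustrative.
    from collections import Counter

    FAILED_LOGIN_THRESHOLD = 20   # assumed cut-off for 'suspicious' repetition

    def triage(events):
        """events: iterable of dicts like {"type": "failed_login", "source": "10.0.0.5"}."""
        failed_by_source = Counter()
        alerts = []
        for event in events:
            if event["type"] == "failed_login":
                failed_by_source[event["source"]] += 1
            elif event["type"] == "privilege_escalation":
                # Rare and high impact: always surface to an analyst.
                alerts.append(("escalation", event["source"], 1))
        for source, count in failed_by_source.items():
            if count >= FAILED_LOGIN_THRESHOLD:
                alerts.append(("brute_force_suspected", source, count))
        return alerts

    sample = [{"type": "failed_login", "source": "10.0.0.5"}] * 25 \
           + [{"type": "privilege_escalation", "source": "10.0.0.9"}]
    print(triage(sample))   # two alerts out of 26 raw events
    ```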

    So when you were watching the excitement of the opening ceremony, or hearing the roar of the crowd as the athletes went for gold, our team were there behind the scenes, ensuring that results were securely delivered to the world’s media and that the huge workforce got to the right places at the right times with the right access.

  • From the archives… Introducing energy efficient benefits while defining your network

    Originally posted 11th July 2014

    Last week I had a very interesting conversation with a colleague about a previous blog post of mine on energy-efficient coding, and I thought perhaps it was time to revisit the topic from a slightly different angle: how can we use Software Defined Networking to improve energy utilisation in our networks and data centres? In a similar way to improving programs by making sure they are as efficient as possible, or a server farm by maximising its resources, we can also use SDN to ensure our network is running as efficiently as possible.

    In a traditional network each device will sit there using power, with consumption varying a bit as load comes and goes, regardless of whether it is taking an active part in the flow of traffic. A typical top-of-rack switch is measured in hundreds of watts, while a big modular switch can be thousands of watts. And half of this network could be using power while doing almost nothing other than providing redundancy, consuming a sizeable percentage of the power of the active path.

    However, in an SDN, each device no longer has to make decisions on its own – the controller has awareness of the bigger picture, and that opens the door to significant power savings. We can even build in an awareness of trends, allowing the network to pre-empt traffic changes, with learning algorithms making the network more efficient over time.

    With this, we can use SDN to route traffic in the way that is most efficient for power utilisation, rather than tuned purely for performance. By viewing the network as a whole, we can put some paths into a ‘power saving mode’, where they use far less power by powering large parts of themselves down into standby, and ensure the paths we do keep active are utilised at the highest efficiency we can. If load builds too much, we can bring other paths into use, while still trying to do so in the way that makes the most efficient use of power.
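    A very rough sketch of that idea (illustrative topology, capacities and wattages, and no real controller API): a controller that prefers links which are already awake, and only wakes a standby link when the active path runs out of headroom.

    ```python
    # Toy power-aware path selection: prefer links that are already powered up,
    # and only wake a standby link when active capacity runs out.
    # Topology, capacities and wattages are illustrative assumptions.

    links = {
        "path_A": {"capacity_gbps": 40, "load": 0.0, "awake": True,  "watts": 300},
        "path_B": {"capacity_gbps": 40, "load": 0.0, "awake": False, "watts": 300},
    }

    def place_flow(gbps):
        """Pick a link for a new flow, waking a standby link only if needed."""
        # First choice: any already-awake link with spare capacity.
        for name, link in links.items():
            if link["awake"] and link["load"] + gbps <= link["capacity_gbps"]:
                link["load"] += gbps
                return name
        # Otherwise wake a standby link (paying its power cost) and use it.
        for name, link in links.items():
            if not link["awake"] and gbps <= link["capacity_gbps"]:
                link["awake"] = True
                link["load"] += gbps
                return name
        raise RuntimeError("no capacity anywhere")

    for demand in [10, 15, 10, 20]:
        print(f"{demand} Gbps -> {place_flow(demand)}")
    active_watts = sum(l["watts"] for l in links.values() if l["awake"])
    print("links awake:", [n for n, l in links.items() if l["awake"]], f"({active_watts} W)")
    ```

    The second path stays powered down until the first genuinely runs out of capacity – which is the whole point of letting something with a view of the bigger picture make the decision.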

    Some of this may need new hardware, such as efficient standby states that use less power than a ‘ready’ state but can come back online in sub-second time in the event of a failure or a need to shift traffic, and a control plane OS running on the most energy-efficient hardware available, while itself also being coded with energy efficiency in mind.

    When I look to the data centre of the future, I can see a situation where we have applications developed in an energy-efficient manner, balanced onto highly efficient server hardware by algorithms that ensure they run together in the most energy-efficient way, with a network moving traffic around as efficiently as possible – resulting in a data centre that uses a lot less power than today’s. And indeed, all of this balancing of applications, servers and traffic could have an overall control plane managing the whole piece as a single entity, using learning algorithms to get better over time.

  • From the archives… The new Green IT – Energy Efficient Browsing?

    Originally published July 2013

    Over the last couple of weeks I’ve seen the energy efficiency of a web browser mentioned right in the main headlines as a reason to switch. Is this a new direction in Green IT, or simply a marketing headline to try and catch users?

    The first article I saw concerned Internet Explorer 10, and how it used less energy doing everyday tasks in comparison to Chrome and Firefox. My first thought was: how do you accurately measure this in a real-world example and be sure it is the browser making the difference? It so happens there is a lot of detail behind how the stats were gathered, which can be found in the full report.

    The figures in the example are quite small in reality – a couple of watts here or there in normal use. For a single user, switching is certainly not going to make huge savings on their monthly electricity bill. However, at large scale those few watts here and there add up, and suddenly kWh start to disappear off the electricity bills of a large organisation. Indeed, Microsoft used their own calculation, based on the biggest difference in power consumption, to come up with some very large savings of 120 million kWh (or 120 GWh, perhaps) for the USA over the course of a year by switching (although that did seem to imply sitting there watching Flash video non-stop, maybe a waste in the first place)!
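    To put rough numbers on that scaling (purely illustrative assumptions about the per-user saving and usage hours, not the figures from the report):

    ```python
    # Back-of-envelope scaling of a small per-user power saving.
    # All inputs are illustrative assumptions, not the report's own figures.
    watts_saved_per_user = 2        # 'a couple of watts here or there'
    browsing_hours_per_day = 4
    days_per_year = 365
    users = 10_000                  # e.g. a large organisation

    kwh_per_user_year = watts_saved_per_user * browsing_hours_per_day * days_per_year / 1000
    org_kwh_year = kwh_per_user_year * users

    print(f"per user : {kwh_per_user_year:.1f} kWh/year")    # ~2.9 kWh
    print(f"org-wide : {org_kwh_year:,.0f} kWh/year")        # ~29,200 kWh
    ```

    Tiny per user, but multiply it by a whole organisation (or a whole country) and it quickly becomes a headline-sized number.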

    Equally, a few watts here and there on a mobile device such as a tablet or smartphone is far more significant than on a full PC system, and could mean that extra 10 minutes of battery life.

    Is coding for energy efficiency in itself something new? Efficiency of programming was something taught to me and many others back at university, but never from the angle of saving energy. To me, this is just a new way to gauge and measure efficient programming: achieving the desired outcome in the most efficient way. And is it a one-off fad of one browser? Nope! In the recent Apple WWDC keynote, the new version of Safari made similar claims of using less energy.

    I feel this is something we are going to start seeing a lot more of for both consumer and enterprise software. Although for the average home user saving a few watts may not mean much (after all, they probably weren’t thinking that firing up the web browser might consume over 20W of power in the first place, or that going to YouTube might double that), to large organisations this could realise genuine cost savings in power usage (while at the same time flying in the face of BYOD and letting users make use of whichever system they like, perhaps)! And on a global scale, every little watt helps!