MySQL on Azure – Azure loves Open Source

Did you know? Open source software (OSS) platforms and technologies are among the fastest growing workloads on Azure today.

Over the past few years, Microsoft has undergone a major cultural shift towards sharing and collaboration, and this naturally includes engaging with open source communities. Teams across the company are now releasing their work as open source, contributing to a wide variety of open source projects, and using open source tools.

I presume you already know about the following –

.NET Core –

.NET is a free, cross-platform, open source developer platform for building many different types of applications. The community makes 60% of the code contributions. More

Visual Studio Code –

Visual Studio Code was quick to market with cross-platform support, and Microsoft has made numerous contributions to the Electron codebase. More

And of course, Azure –

Popular services based on Linux, Hadoop, Redis and other OSS projects. Microsoft has contributed Azure’s data center designs to the Open Compute Project. More

Azure Database for MySQL –

A new addition to the Azure Relational Database family. A fully managed, enterprise-ready community MySQL database as a service.

The MySQL Community edition helps you easily lift and shift to the cloud, using the languages and frameworks of your choice. It scales in seconds, comes with built-in high availability, and offers secure, compliant, global reach.

Developing a “Hello World” program using C# and Azure Database for MySQL is possible within a few minutes.

  • Log in to the Azure portal and create an “Azure Database for MySQL” server, plus a sample database with a sample table.
  • Create a C# console app.
  • Install the MySql.Data.MySqlClient .NET Core class library in the project using the NuGet package manager.
  • Compile and run (a minimal code sketch follows).
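
The core of the console app is just a connection and a query. Below is a minimal sketch, assuming an illustrative server named mydemoserver, a database named sampledb, and a sample table named inventory with id and name columns – replace the connection details with your own:

```csharp
using System;
using MySql.Data.MySqlClient;

class Program
{
    static void Main()
    {
        // Illustrative values only - replace with your own server, database, and credentials.
        var connectionString =
            "Server=mydemoserver.mysql.database.azure.com;" +
            "Database=sampledb;" +
            "UserID=myadmin@mydemoserver;" +
            "Password=<your-password>;" +
            "SslMode=Required;";

        using (var connection = new MySqlConnection(connectionString))
        {
            connection.Open();
            Console.WriteLine("Connected to Azure Database for MySQL.");

            // Query the sample table created in the portal step above (table/columns are illustrative).
            using (var command = new MySqlCommand("SELECT id, name FROM inventory;", connection))
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine($"{reader.GetInt32(0)} - {reader.GetString(1)}");
                }
            }
        }
    }
}
```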

I know it’s easier said than done, so here is a Git repository for quick reference. Happy coding!

QnA Maker – Documents to an FAQ App

A common challenge in most informational FAQ (Frequently Asked Questions) scenarios is to separate content management from the FAQ design and development, as content owners are usually domain experts who may not be technical. QnA Maker addresses this by providing a QnA management experience.

In Microsoft’s words – Azure QnA Maker allows you to edit, remove, or add QnA pairs with an easy-to-use interface, then publish your knowledge base as an API endpoint for a bot service. It’s simple to test and train the bot using a familiar chat interface, and the active learning feature automatically learns question variations from users over time and adds them to your knowledge base. Use the QnA Maker endpoint to seamlessly integrate with other APIs like the Language Understanding service and Speech APIs to interpret and answer user questions in different ways. More details are available here.

To get my hands dirty, I started developing a minimal app – basically, a Hello World with QnA Maker.

The high-level scope is –

  1. Firstly, I have an “Azure Purchase FAQ.pdf” file which contains questions and answers related to Azure purchasing (it’s a cut-down version for obvious reasons).
  2. Secondly, I uploaded the above file to https://www.qnamaker.ai/ – in QnA Maker terminology, this is the knowledge base creation.
  3. Lastly, I developed a quick console application using C#. On start, the app hits the ‘qnamaker’ endpoint with a question and gets the answer for it.

Honestly, writing this blog post took me more time than developing an end-to-end working solution.

The solution is uploaded to GitHub as a project named “AzurePurchaseFAQ”. Feel free to copy/clone it and try it yourself. To begin, create the QnA Maker service and deploy it with the free/default settings. The sample HTTP request would look like this –

POST /knowledgebases/ee0f5e4a-ac42-4f91-9b47-8161b6c5a409/generateAnswer
Host: https://azurepurchasefaq.azurewebsites.net/qnamaker
Authorization: EndpointKey 4836aac3-9fcf-45ca-9295-d256a50216ec
Content-Type: application/json

{"question":""}
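
In the console app, that request boils down to a single HTTP POST. Here is a minimal sketch using HttpClient, assuming the sample host, knowledge base ID, and endpoint key shown above (replace them with the values from your own QnA Maker service):

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        // Values taken from the sample request above - replace with your own service settings.
        var host = "https://azurepurchasefaq.azurewebsites.net/qnamaker";
        var kbId = "ee0f5e4a-ac42-4f91-9b47-8161b6c5a409";
        var endpointKey = "4836aac3-9fcf-45ca-9295-d256a50216ec";

        Console.Write("Ask a question: ");
        var question = Console.ReadLine();

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Authorization", $"EndpointKey {endpointKey}");

            // Naive JSON body for brevity - a real app should JSON-encode the question text.
            var body = new StringContent($"{{\"question\":\"{question}\"}}", Encoding.UTF8, "application/json");
            var response = await client.PostAsync($"{host}/knowledgebases/{kbId}/generateAnswer", body);

            // The response JSON contains an "answers" array; printing it raw keeps the sketch short.
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}
```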

On running the app, the console window should look like this –


Happy coding, and have fun 🙂

Memory-Optimized Tables – Help with Performance and Scalability

Today, I was doing a code review on one of my projects, which is being developed using many Azure services/technologies.

Being an ‘Internet of Things’ (IoT) scenario, the requirements demand the use of the “Polyglot Persistence” pattern, because the solution needs to store structured/SQL as well as unstructured/NoSQL data. And as we know, for structured/relational data, ‘SQL Azure’ is the default technology choice when you are on Azure and a Microsoft person 🙂

So, while analyzing the stored procedures’ T-SQL code, I observed that many of the SPs use temporary tables for data computation/processing operations to improve overall performance. Using temporary tables, table variables, or table-valued parameters was a reasonable/acceptable practice when I was a programmer 🙂 But I started wondering whether anything new had been added to this approach/pattern to improve it further. With a quick Bing search, I discovered that we really do have something new and better, namely “Memory-Optimized Tables”. This is part of In-Memory OLTP, the premier technology available in SQL Server and Azure SQL Database for optimizing the performance of transaction processing, data ingestion, data load, and transient data scenarios.

As the MS docs say, memory-optimized tables are tables created using CREATE TABLE with the “MEMORY_OPTIMIZED = ON” option. Memory-optimized tables are fully durable by default, and, like transactions on (traditional) disk-based tables, fully durable transactions on memory-optimized tables are fully atomic, consistent, isolated, and durable (ACID). Memory-optimized tables and natively compiled stored procedures support a subset of Transact-SQL. More details.

Hence, if you are exploring options to enhance your SPs/T-SQL code on SQL Azure, then please refer here for performance and scalability considerations.

The detailed scenarios are documented with instructions at Replace global tempdb ##table and Replace session tempdb #table.
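
As a rough illustration (the table and column names below are made up), a global tempdb ##table can often be replaced with a non-durable (SCHEMA_ONLY) memory-optimized table, provided your Azure SQL Database tier supports In-Memory OLTP; session-level #tables are typically replaced with memory-optimized table variables instead:

```sql
-- Instead of: CREATE TABLE ##StagingRows (...)
-- SCHEMA_ONLY durability means the data, like tempdb data, is not persisted to disk.
CREATE TABLE dbo.StagingRows
(
    RowId    INT IDENTITY NOT NULL PRIMARY KEY NONCLUSTERED,
    DeviceId INT           NOT NULL,
    Payload  NVARCHAR(400) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);
```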

So, next time you see CREATE TABLE #temptable and/or CREATE TABLE ##temptable and choose to replace it with the memory-optimized option, make sure you visit this blog to say thank you 🙂

Designing High Availability and Disaster Recovery for IoT/Event Hub

Before we jump directly into the topic, there are a few prerequisites, so make yourself comfortable with them.

As per wiki, High Availability (HA) is a characteristic of a system, which aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period. It is measured as a percentage of uptime in a given year. For details, please refer.

And, Disaster Recovery (DR) involves a set of policies, procedures and tools to enable the recovery or continuation of vital technology infrastructure and systems following a natural or human-induced disaster. Disaster recovery focuses on the IT or technology systems supporting critical business functions. For details, please refer.

Azure, like any other cloud provider, has many built-in platform features that support highly available applications. However, you need to design the application-specific logic (checklist) that absorbs fluctuations in availability, load, and temporary failures in dependent services and hardware, so that the overall solution continues to perform acceptably, as defined by business requirements or application service-level agreements (SLAs). For details, please refer.
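
As a trivial illustration of such "absorbing" logic, transient failures against a dependent service are often handled with a simple retry-and-back-off wrapper along these lines (a minimal sketch; the exception handling and timings are illustrative):

```csharp
using System;
using System.Threading.Tasks;

static class Resilience
{
    // Retries an operation a few times with exponential back-off before giving up.
    public static async Task<T> RetryAsync<T>(Func<Task<T>> operation, int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Wait 1s, 2s, 4s, ... before the next attempt.
                await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt - 1)));
            }
        }
    }
}
```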

Hopefully the above info provides a high-level picture of HA/DR. The rest of the post focuses on a specific scenario in the Internet of Things (IoT) – basically, the headline 🙂

I’m intentionally skipping the conceptual part of HA/DR (its importance, how to measure it, and the different enablers), as enough literature is available on the web.

Designing HA/DR for a solution that uses IoT Hub/Event Hub involves a few considerations –

  • Devices are Smart – The devices should either have logic to differentiate between the primary and secondary region/site, or they should not be declaratively aware of any endpoint at all. One way is for devices to regularly check a concierge service for the current active endpoint. The concierge service can be a web service that is replicated and kept reachable using DNS-redirection techniques (for example, Azure Traffic Manager or AWS Route 53). So, ask yourself: what happens to messages when the cloud endpoint is not available, and is message loss acceptable or not? If yes, then fine; otherwise you need some offline storage/queue at the device end as well (a minimal device-side sketch follows after this list).
  • Device Identities – Generally, the endpoint understands the device identities. If so, all device identities should be geo-replicated/backed up and pushed to the secondary IoT hub before switching the active endpoint for the devices. Accordingly, the concierge service, and ultimately the devices, must be made aware of this change in the endpoint. You also need to develop tools/utilities to quickly upload/push device metadata to the IoT Hub.
  • Delta Identification and Merge – Once the primary region becomes available again, all the state and data that have been created in the secondary site must be migrated back to the primary region. This state and data mostly relates to device identities and application metadata, which must be merged with the primary IoT hub and any other application-specific stores in the primary region.
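
To make the "Devices are Smart" point concrete, below is a minimal device-side sketch (the concierge URL, response format, and hub names are purely hypothetical) that asks a concierge service for the currently active IoT Hub hostname and falls back to the last known value when the service is unreachable:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

class EndpointResolver
{
    // Hypothetical concierge endpoint kept reachable via DNS redirection (e.g. Traffic Manager).
    private const string ConciergeUrl = "https://concierge.contoso.com/api/active-iothub";

    private static string _cachedHostName = "primary-hub.azure-devices.net"; // last known good value

    public static async Task<string> GetActiveIoTHubAsync()
    {
        try
        {
            using (var client = new HttpClient { Timeout = TimeSpan.FromSeconds(5) })
            {
                // In this sketch the concierge returns the active hub hostname as plain text.
                _cachedHostName = (await client.GetStringAsync(ConciergeUrl)).Trim();
            }
        }
        catch (Exception)
        {
            // Concierge unreachable: keep using the cached endpoint and/or queue messages locally.
        }
        return _cachedHostName;
    }
}
```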

How much time it should take to fail over to the secondary site, and later to recover back from it, is solution specific and depends on the solution’s RPOs and RTOs.

The overall approach includes the following considerations in two major areas –

  • Device – IoT Hub
    • A secondary IoT hub
    • Back up identities to a geo-redundant store
    • Device routing logic
    • Merge identities when the primary is back
    • Either an interim message store on the device, or acceptance of message loss
  • Application Components/Stores
    • A secondary app/services instance
    • Enable geo-redundancy for all stores
    • Restoration of data/state from the stores used (SQL & NoSQL)
    • Anything custom

Here is the conceptual architecture diagram, which depicts the proposed solution.

Although the diagram is self-explanatory, feel free to comment/ask about anything.

Big Data Concepts – In 5 Minutes

What is Big Data –

If you are looking for a standard definition, then refer to the obvious source, i.e. the wiki.

As per the wiki, the term has been in use since the 1990s, with some giving credit to “John Mashey” for coining it or at least making it popular. Big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process the data within a tolerable elapsed time. More details are anyway on the wiki.

The definition I prefer is, “When data is too big for OLTP, then it’s Big Data”. Other definitions –

  • When data is in petabytes.
  • 3 Vs (Volume, Velocity, and Variety) or 4 Vs (Volume, Velocity, Variety, and Veracity)

What Scenarios Produce It –

Data is being produced by the web/internet, social networking/media, phone/mobile towers, and many more sources, as mentioned in the diagram below.

The point to be noted is that the notion of big data is not NEW. We always had it; what we haven’t done is STORE IT and ANALYSE IT. This is now possible because of many factors/enablers.

What Enables it –

If you compare today with a few decades ago, you will observe that the entry barriers have been reduced significantly and that the concepts and their enablers have been democratized. For example, nowadays buying compute/storage resources is relatively cheap compared to what it was previously. Also, the technologies/solutions required to make sense of big data are more accessible, thanks to open source initiatives and the serious players behind them in the market. Hence, today we have more and more producers and consumers of data who are interested in it and its analysis.

I’m trying to list a few enablers below, but the true list would be far longer than this. However, it should give you initial food for thought.

What It Enables –

  • Analysis – Sentiment, clickstream, forensic, and other kinds of analysis.
  • Patterns – Buying, Search and Investment.
  • Machine Learning
  • Research – Physics and Healthcare
  • Predictive and preventive maintenance.
  • And many more…Just Bing/Google it

MapReduce – I heard about it somewhere, what’s that –

Developed and perfected inside Google, then published to the public. It’s a 2-pass process – 1) Map and 2) Reduce. More details

Let’s understand it quickly via a picture, as “a picture is worth a thousand words”.

Although the picture is self-explanatory, I will add an explanation if required and requested.
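
If code speaks louder than pictures for you, here is a tiny C# word-count sketch that mimics the two passes: the "map" pass emits (word, 1) pairs and the "reduce" pass groups them by word and sums the counts (in a real system, both passes run distributed across many nodes):

```csharp
using System;
using System.Linq;

class WordCount
{
    static void Main()
    {
        var documents = new[] { "big data is big", "data about data" };

        // Map: emit a (word, 1) pair for every word in every document.
        var mapped = documents
            .SelectMany(doc => doc.Split(' '))
            .Select(word => (Word: word, Count: 1));

        // Reduce: group the pairs by word and sum the counts per word.
        var reduced = mapped
            .GroupBy(pair => pair.Word)
            .Select(group => (Word: group.Key, Count: group.Sum(pair => pair.Count)));

        foreach (var (word, count) in reduced)
        {
            Console.WriteLine($"{word}: {count}");
        }
    }
}
```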

The Azure Architecture Center is available now

The Azure Architecture Center is now available in the documentation section of Azure. It’s open to everyone, with no cost to access the information. The Architecture Center is an extremely valuable resource because it brings –

  • Information for all cloud users ranging from beginners to specialists.
  • Best practices for security, availability, scalability, performance, cost, and manageability.
  • Tested, proven, and verified guidance – not theoretical designs; they have been built, run successfully, and are ready for production.
  • Prepared deployment scripts and diagrams that anyone can use to get started quickly

The main areas the Architecture Center covers are –

  • Application Architecture Guide – This guide presents a structured approach for designing applications on Azure that are scalable, resilient, and highly available.
  • Reference Architectures – Scenarios with related architectures grouped together.
  • Cloud Design Patterns – These design patterns are useful for building reliable, scalable, secure applications in the cloud.

One interesting topic is a special section for customers coming from a competing cloud provider, namely AWS. It helps Amazon Web Services (AWS) experts understand the basics of Microsoft Azure accounts, platform, and services. It also covers key similarities and differences between the AWS and Azure platforms, here.

Lastly, people who are deep into architecture/design work should visit here. It provides resources, including icons, Visio templates, PNG files, and SVG files, that are useful for producing your own architecture diagrams. A direct link to download.

Google Cloud – Developer’s Sneak Peek

As per Gartner, the Internet search and ad giant has entered the top 3 cloud providers.

If you are already familiar with any cloud provider like Azure or AWS, you will find yourself at home. Google Cloud Platform (GCP) is hosted on the same infrastructure used by Google Search and YouTube.

The fundamentals of PaaS, IaaS, compute, storage, networking, and security will help you quickly digest the Google Cloud Platform specifics. However, also refer to Google’s differentiators, as it claims them.

Probably the good news for beginners is that the “Google Cloud Platform Free Tier” is relatively relaxed compared to other cloud providers, IMO. For questions on it, visit here.

Although Google started late, it seems to have strong IaaS and PaaS capabilities. Many of its cloud services are extended capabilities of existing services. An interesting observation is that, to improve developer productivity, GCP offers the App Engine Flexible Environment (managed ‘virtual machines’), which operates between IaaS and PaaS. The App Engine flexible environment is based on Google Compute Engine and automatically scales your app up and down while balancing the load.

Another significant aspect is GCP’s contribution to ‘Open Source’. A few examples –

  • Kubernetes – System for automating deployment, operations, & scaling of containerized apps.
  • Spanner – Scalable, multi-version, globally-distributed, & synchronously-replicated database.
  • Hadoop MapReduce – It will let users run native C/C++ code in their Hadoop environments.
  • Dataflow – Ability to handle batch/stream processing of large data sets.

The way I see it, from an uber perspective, Google has divided its cloud offerings into –

1) Consumer oriented and 2) Developer oriented

For beginners, especially if you are coming from an application architecture/design/development background, the navigation path for Google Cloud could be as follows – more details.